Drug Classification

Classification • Decision Tree + cross-validation + model visualization


Predict which drug a patient should be prescribed from their medical attributes, using a Decision Tree Classifier. The features are age, sex, blood pressure, cholesterol, and sodium-to-potassium ratio.

TL;DR

Decision Tree classifier trained on patient attributes, evaluated via 10-fold CV, and visualized for interpretability.

My role

Handled preprocessing/encoding, trained and evaluated decision trees, compared criteria and max_depth settings, and produced a readable tree visualization.

Tech

Python • Pandas • NumPy • scikit-learn • Matplotlib • Decision Tree • Cross-validation

Dataset

  • Source: drug200.csv (Kaggle notebook reference)
  • Features: Age, Sex, BP (LOW/NORMAL/HIGH), Cholesterol (NORMAL/HIGH), Na_to_K
  • Target: drugA, drugB, drugC, drugX, drugY
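To make the layout above concrete, here is a minimal loading sketch. It is not taken from the project code: the helper name load_drug_data, the target column name "Drug", the F/M coding of Sex, and the three sample rows are all assumptions made up for illustration. With the real file you would pass the path to drug200.csv instead of the in-memory sample.

```python
import io

import pandas as pd

# Assumed column names; "Drug" as the target column is an assumption.
EXPECTED_COLUMNS = ["Age", "Sex", "BP", "Cholesterol", "Na_to_K", "Drug"]


def load_drug_data(source):
    """Read the CSV (path or file-like) and check the expected columns exist."""
    df = pd.read_csv(source)
    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return df


# Tiny in-memory stand-in with made-up rows, only to show the layout.
sample_csv = io.StringIO(
    "Age,Sex,BP,Cholesterol,Na_to_K,Drug\n"
    "23,F,HIGH,HIGH,25.4,drugY\n"
    "47,M,LOW,HIGH,13.1,drugC\n"
    "61,F,NORMAL,NORMAL,10.2,drugX\n"
)
df = load_drug_data(sample_csv)

# Quick sanity checks before any modelling.
print(df.dtypes)
print(df["Drug"].value_counts())
```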

Approach

  • Preprocessing: Encode categorical variables into numeric values.
  • Model: DecisionTreeClassifier (entropy), with max_depth constraints to reduce overfitting.
  • Evaluation: 10-fold cross-validation; report mean accuracy and variability.
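  • Tuning: Compare gini vs entropy vs log_loss and test different max_depth values. In scikit-learn's DecisionTreeClassifier, entropy and log_loss both measure Shannon information gain, so those two are expected to score the same.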

Decision tree visualization

Figure: Trained decision tree for the drug classifier, visualized for interpretability.
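A tree plot of this kind can be produced with scikit-learn's plot_tree. The sketch below is an assumption about how such a figure might be drawn, not the code behind the image above: it trains on synthetic, already-encoded data with made-up labels, and the output file name in the commented line is hypothetical. export_text is included as a plain-text alternative to the plot.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; remove this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text, plot_tree

feature_names = ["Age", "Sex", "BP", "Cholesterol", "Na_to_K"]

# Synthetic, already-encoded stand-in for the real features.
rng = np.random.default_rng(0)
n = 300
X = np.column_stack([
    rng.integers(15, 75, size=n),     # Age
    rng.integers(0, 2, size=n),       # Sex (0/1)
    rng.integers(0, 3, size=n),       # BP (0=LOW, 1=NORMAL, 2=HIGH)
    rng.integers(0, 2, size=n),       # Cholesterol (0=NORMAL, 1=HIGH)
    rng.uniform(6.0, 25.0, size=n),   # Na_to_K
])
# Made-up labels, only so the tree has something to split on.
y = np.where(X[:, 4] > 15, "drugY", np.where(X[:, 2] == 2, "drugA", "drugX"))

clf = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
clf.fit(X, y)

# Draw the fitted tree with feature and class names on each node.
fig, ax = plt.subplots(figsize=(10, 6))
annotations = plot_tree(
    clf,
    feature_names=feature_names,
    class_names=[str(c) for c in clf.classes_],
    filled=True,
    rounded=True,
    ax=ax,
)
fig.tight_layout()
# fig.savefig("drug_tree.png", dpi=150)  # hypothetical output path

# Text rendering of the same tree, handy for logs or quick review.
tree_text = export_text(clf, feature_names=feature_names)
print(tree_text)
```

Keeping max_depth small is what makes the figure readable; a deeper tree usually needs a larger figsize or the text rendering instead.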

Repository

Explore the code on GitHub: Drug Classification