Classification
End-to-end ML modeling process for a classification problem

1. Data reading and preprocessing: missing data, visualization, scaling, text preprocessing
2. Algorithm selection
3. Model evaluation: train-test split, k-fold cross-validation, stratified CV, metrics (accuracy, recall, precision, F1-score, confusion matrix)
4. Hyperparameter tuning: grid search
5. Final model: saving to disk and loading

Algorithms

1. Logistic Regression
2. Linear Discriminant Analysis (LDA)
3. K-Nearest Neighbors (KNN)
4. Naive Bayes (NB)
5. Decision Tree
6. Support Vector Machine (SVM)
7. Random Forest
8. Bagged Decision Trees
9. Extra Trees
10. AdaBoost
11. Gradient Boosting
12. XGBoost
13. Neural Network
14. Voting Ensemble

Examples

1. ISLR college data (binary class) https://www.kaggle.com/ishaanv/ISLR-Auto/data
2. UCI pima indians diabetes data (binary class) https://archive.ics.uci.edu/ml/datasets/pima+indians+diabetes
3. UCI breast cancer data (binary class) https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic)
4. UCI iris data (multi-class) https://archive.ics.uci.edu/ml/datasets/iris
5. UCI wine data (multi-class) https://archive.ics.uci.edu/ml/datasets/wine
6. Kaggle HR data (binary class) https://www.kaggle.com/giripujar/hr-analytics
7. Kaggle titanic data (binary class) https://www.kaggle.com/c/titanic
8. Kaggle otto data (multi-class) https://www.kaggle.com/c/otto-group-product-classification-challenge
9. Kaggle Toxic Comment Classification Challenge (binary class) https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge
10. CrowdFlower Twitter Global Warming Sentiment data (binary class) https://www.crowdflower.com/data-for-everyone/
11. CrowdFlower Twitter Airline Sentiment data (multi-class) https://www.crowdflower.com/data-for-everyone/
12. CrowdFlower Corporate Messaging data (multi-class) https://www.crowdflower.com/data-for-everyone/
13. CrowdFlower Coachella 2015 Twitter sentiment data (multi-class) https://www.crowdflower.com/data-for-everyone/
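
The model evaluation step lists stratified cross-validation and metrics derived from the confusion matrix. As a rough illustration of what those terms mean, here is a minimal standard-library sketch for the binary case. The function names (`confusion_counts`, `classification_metrics`, `stratified_folds`) and the toy labels are hypothetical and are not taken from this repository. In practice a library such as scikit-learn would normally be used for both tasks, and the fold builder below skips shuffling for simplicity.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    pairs = Counter((t == positive, p == positive) for t, p in zip(y_true, y_pred))
    tp = pairs[(True, True)]    # actual positive, predicted positive
    fp = pairs[(False, True)]   # actual negative, predicted positive
    fn = pairs[(True, False)]   # actual positive, predicted negative
    tn = pairs[(False, False)]  # actual negative, predicted negative
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def stratified_folds(labels, k):
    """Assign sample indices to k folds, dealing each class out round-robin
    so every fold keeps roughly the same class proportions (no shuffling)."""
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Toy labels, made up for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = classification_metrics(y_true, y_pred)
folds = stratified_folds(y_true, k=2)

print(confusion_counts(y_true, y_pred))
print(metrics)
print(folds)
```

Accuracy alone can be misleading on imbalanced data, which is why precision, recall and F1 are listed alongside it and why the folds are stratified by class.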