27/05/2019

Introduction

  • Relatively balanced classes

  • Categorical, logical and numeric variables

  • Variable Manipulation: Key Variables

Methodology

RANDOM FOREST

  • Fit using recursive binary splitting
  • Bootstrap aggregation (bagging) vs. the variance problem
  • OOB error estimation
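
A minimal sketch of why bagging yields an out-of-bag (OOB) sample for free, assuming each tree is fitted on a bootstrap sample of the same size as the training set. The row count and tree count below are hypothetical, and no trees are actually fitted; only the resampling step is shown. On average roughly a third of the rows (about (1 - 1/n)^n ≈ 0.37) are left out of each bootstrap sample, and those rows can be used to estimate that tree's error.

```python
import random

def bootstrap_indices(n, rng):
    """Draw n row indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n) for _ in range(n)]

def oob_indices(n, in_bag):
    """Rows never drawn into the bootstrap sample are 'out of bag'."""
    drawn = set(in_bag)
    return [i for i in range(n) if i not in drawn]

rng = random.Random(0)
n_rows = 1000   # hypothetical training-set size
n_trees = 200   # hypothetical number of trees in the forest

oob_fractions = []
for _ in range(n_trees):
    in_bag = bootstrap_indices(n_rows, rng)
    oob_fractions.append(len(oob_indices(n_rows, in_bag)) / n_rows)

# Average share of rows each tree never saw during fitting.
mean_oob_fraction = sum(oob_fractions) / n_trees
print(round(mean_oob_fraction, 3))
```

In a full random forest, each row's OOB prediction would be the majority vote of only those trees whose bootstrap sample excluded it.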

MULTI-CLASS SUPPORT VECTOR MACHINE

  • Intuition
  • Data Preprocessing
  • Model Fitting: Kernel + Regularisation/Tuning Parameters
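
A hedged sketch of two ingredients behind the bullets above. The outline does not say which kernel or which multi-class strategy is used, so the radial basis function (RBF) kernel and the one-vs-one decomposition shown here are assumptions for illustration; the class labels and the `gamma` value are hypothetical.

```python
import math
from itertools import combinations

def rbf_kernel(x, z, gamma):
    """RBF kernel: exp(-gamma * ||x - z||^2). Larger gamma means a narrower kernel."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

# One-vs-one: fit one binary SVM per pair of classes, K * (K - 1) / 2 in total.
classes = ["class_a", "class_b", "class_c"]   # hypothetical labels
class_pairs = list(combinations(classes, 2))

# Similarity of a point with itself is 1; it decays as points move apart.
same = rbf_kernel([1.0, 2.0], [1.0, 2.0], gamma=0.5)
far = rbf_kernel([1.0, 2.0], [4.0, 6.0], gamma=0.5)
print(len(class_pairs), same, far < same)
```

The regularisation parameter (often written C) does not appear in the kernel itself; it controls how heavily margin violations are penalised when each binary classifier is fitted.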

Discussion

RANDOM FOREST

  • Diagnostics: Test Accuracy
  • Role of Variable Manipulation
  • XGBoost as an Alternative
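
For the test-accuracy diagnostic, a minimal sketch of how overall accuracy and per-class recall could be computed from held-out predictions. The label vectors below are made up purely for illustration and are not results from this analysis. With relatively balanced classes, overall accuracy is a reasonable headline figure, and per-class recall shows whether any one class is being missed.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Share of test rows whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def per_class_recall(y_true, y_pred):
    """For each true class, the share of its rows that were predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}

# Hypothetical held-out labels and predictions (illustration only).
y_true = ["a", "b", "c", "a", "b", "c"]
y_pred = ["a", "b", "a", "a", "c", "c"]

test_accuracy = accuracy(y_true, y_pred)
recalls = per_class_recall(y_true, y_pred)
print(test_accuracy, recalls)
```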

MULTI-CLASS SUPPORT VECTOR MACHINE

  • Normalisation
  • Tuning of Kernel Parameters
  • Role of Cross Validation
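
A minimal sketch of how normalisation and cross-validation interact, assuming z-score standardisation and plain k-fold splitting (the outline does not specify either choice). The feature values and fold count are hypothetical. The point illustrated is that the scaling statistics are estimated on the training folds only and then applied to the validation fold, so no information leaks from the held-out rows.

```python
import statistics

def fit_scaler(values):
    """Return (mean, sd) estimated from training values only."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return mean, (sd if sd > 0 else 1.0)  # guard against constant columns

def apply_scaler(values, mean, sd):
    """Standardise values with previously fitted statistics."""
    return [(v - mean) / sd for v in values]

def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds over n rows."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size

feature = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]   # hypothetical numeric feature
folds = list(k_fold_indices(len(feature), 3))

scaled_validation_folds = []
for train_idx, val_idx in folds:
    mean, sd = fit_scaler([feature[i] for i in train_idx])
    scaled_val = apply_scaler([feature[i] for i in val_idx], mean, sd)
    # An SVM with one candidate kernel/regularisation setting would be
    # fitted on the scaled training rows and scored on scaled_val here.
    scaled_validation_folds.append(scaled_val)
```

Repeating this loop for each candidate parameter setting and averaging the validation scores is the usual way cross-validation guides the tuning of kernel parameters.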

Conclusion

  • Random Forest vs. SVM

  • Applicability and Interpretation