IAML Unit 9: Discussion

Announcements

  • Installing keras package
  • Slack media channel
  • Appendix for nested cross validation
    • uses tidymodels packages: parsnip engines plus fit_resamples()/tune_grid(), but not workflows
    • tidymodels wraps things at three levels; we use levels 1-2 (engines and resampling/tuning) but not level 3 (workflows, etc.)

General

  • I’m still confused about what out-of-bag error is and how it would differ from just model error

  • What did you mean in the lectures when you said that the hyperparameters were around the edges?

  • Learn more about the algorithms you use by reading package documentation

Trees

  • When we split the data in a tree on a variable, how is the threshold specified? Does the algorithm determine it internally? (see the formulas after this list)

    • RSS (regression trees)
    • Classification error rate, Gini index, entropy (classification trees; book formulas, reproduced below)
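
For reference, the standard textbook forms of these criteria, where $\hat{p}_{mk}$ is the proportion of training observations in node $m$ from class $k$:

$$
\text{RSS} = \sum_{j=1}^{J} \sum_{i \in R_j} \left( y_i - \hat{y}_{R_j} \right)^2, \qquad
E_m = 1 - \max_k \hat{p}_{mk}, \qquad
G_m = \sum_{k=1}^{K} \hat{p}_{mk} \left( 1 - \hat{p}_{mk} \right), \qquad
D_m = - \sum_{k=1}^{K} \hat{p}_{mk} \log \hat{p}_{mk}
$$

The threshold is determined internally: the algorithm scans candidate cutpoints for each predictor (in practice, midpoints between sorted unique values) and greedily picks the predictor/threshold pair that most reduces the chosen criterion.
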
  • Can we discuss further how tree models handle nonlinear effects and interactions natively?

  • What determines the depth of trees that we set in decision trees and random forest?

  • How do trees handle missing data? Do we ever want to impute missing data in our recipes?
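
rpart can route missing values using surrogate splits, but other engines (e.g., ranger) typically error on NAs, so imputation steps in a recipe are often still needed. A minimal sketch, assuming a hypothetical training frame `dat_trn` with outcome `y`:

```r
library(recipes)

rec <- recipe(y ~ ., data = dat_trn) |>
  step_impute_median(all_numeric_predictors()) |>  # numeric NAs -> column median
  step_impute_mode(all_nominal_predictors())       # categorical NAs -> most common level
```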

  • In what situations might additional feature engineering (e.g., alternative handling of missing values or categorical aggregation) still improve decision tree performance?

    • missing values, categorical aggregation, nzv (near-zero variance) filtering, other dimensionality reduction (recipe sketch below)
    • Why isn't feature engineering required as extensively when it comes to decision trees?
    • trees work well as out-of-the-box statistical algorithms: splits are invariant to monotone transformations and handle nonlinearity and interactions natively
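
A minimal recipe sketch of the categorical aggregation and nzv steps listed above; the threshold and the names `dat_trn`/`y` are hypothetical:

```r
library(recipes)

rec <- recipe(y ~ ., data = dat_trn) |>
  step_other(all_nominal_predictors(), threshold = 0.05) |>  # pool rare levels into "other"
  step_nzv(all_predictors())                                 # drop near-zero-variance predictors
```
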
  • Can you give more examples of how to interpret the decision tree graph?

    • book examples
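
One way to practice beyond the book examples is to plot your own fitted tree. A sketch using the rpart.plot package (data names hypothetical):

```r
library(rpart)
library(rpart.plot)

fit <- rpart(y ~ ., data = dat_trn)

# Read the graph top-down: each internal node shows a split rule
# (observations satisfying the condition go left), and each leaf shows
# the predicted value/class for observations that satisfy every rule
# on the path down to it.
rpart.plot(fit)
```
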

Bagging

  • A further explanation of what a base learner is

  • When and why does bagging improve model performance?

    • What are the benefits of bagging?
    • Reduces variance but has no impact on bias (formula below)
    • Interpretability loss
    • Computational cost
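
The one-line reason, assuming (optimistically) $B$ independent trees each with prediction variance $\sigma^2$:

$$
\operatorname{Var} \left( \frac{1}{B} \sum_{b=1}^{B} \hat{f}^{*b}(x) \right) = \frac{\sigma^2}{B}
$$

Averaging leaves the expected prediction, and therefore the bias, unchanged, which is why bagging reduces variance but not bias.
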
  • Can you talk more about bagging and how it utilizes bootstrapping techniques?
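
A hand-rolled sketch of the bootstrap-then-average idea (illustration only; in practice use ranger, randomForest, or baguette). A training frame `dat_trn` with numeric outcome `y` is assumed and hypothetical:

```r
library(rpart)

set.seed(123)
B <- 100
preds <- matrix(NA_real_, nrow = nrow(dat_trn), ncol = B)

for (b in seq_len(B)) {
  # bootstrap resample: draw n rows with replacement
  boot <- dat_trn[sample(nrow(dat_trn), replace = TRUE), ]

  # base learner: a deep, unpruned tree (low bias, high variance)
  fit <- rpart(y ~ ., data = boot,
               control = rpart.control(cp = 0, minsplit = 2))

  preds[, b] <- predict(fit, newdata = dat_trn)
}

# averaging the B high-variance trees gives the bagged prediction
bagged_pred <- rowMeans(preds)
```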

  • Can you explain more on why we should de-correlate the bagged trees? Do we not need to de-correlate if we are not using bagging?
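
Bootstrapped trees are identically distributed but positively correlated; with pairwise correlation $\rho$, the textbook decomposition becomes:

$$
\operatorname{Var} \left( \frac{1}{B} \sum_{b=1}^{B} \hat{f}^{*b}(x) \right) = \rho \sigma^2 + \frac{1 - \rho}{B} \sigma^2
$$

The $\rho \sigma^2$ term does not shrink as $B$ grows, so adding trees alone cannot remove it; random forests lower $\rho$ by letting each split consider only mtry randomly chosen predictors. Without bagging there is only a single model, so there is no ensemble of correlated predictions to de-correlate in the first place.
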

  • Besides using the same resampling (bootstrapping), how is the inner loop/outer loop with bagging and random forest distinct from nested CV?

    • it's not that similar: bagging's bootstrap loop fits many models that are averaged into one ensemble, while nested CV's inner/outer loops exist to tune hyperparameters and then honestly estimate performance

  • Out-of-Bag Error Estimation: I did not really understand the logic behind this
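
A sketch of the logic: each bootstrap sample leaves out roughly a third of the rows, so every row can be scored by the trees that never trained on it, giving an honest error estimate without a separate test set. Using ranger (names hypothetical):

```r
library(ranger)

fit <- ranger(y ~ ., data = dat_trn, num.trees = 500, seed = 123)

# Each row is predicted only by trees whose bootstrap sample excluded it.
fit$prediction.error  # OOB error: MSE (regression) or misclassification rate (classification)
fit$predictions       # per-row OOB predictions
```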

Random Forest

Understanding mtry
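
mtry is the number of predictors each split is allowed to consider. A small sketch contrasting mtry = p (which reduces the random forest to plain bagging) with the classic sqrt(p) default, again with hypothetical names:

```r
library(ranger)

p <- ncol(dat_trn) - 1  # number of predictors

fit_bag <- ranger(y ~ ., data = dat_trn, mtry = p, seed = 123)               # mtry = p: bagging
fit_rf  <- ranger(y ~ ., data = dat_trn, mtry = floor(sqrt(p)), seed = 123)  # de-correlated trees

c(bagging = fit_bag$prediction.error, rf = fit_rf$prediction.error)  # compare via OOB error
```

Lower mtry forces different trees to split on different predictors, lowering the between-tree correlation $\rho$ from the formula above; common defaults are sqrt(p) for classification and p/3 for regression.
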

Hyperparameters and other cross-method issues

  • How do we decide which stopping rule to use for trees?

    • Role of tree_depth (trees), min_n (both), cost_complexity (trees), mtry (RF), and number of trees (RF) in decision trees and random forests (tuning sketch below)
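
A sketch of tuning these in the course's engine-plus-tune_grid style (no workflows); the data and fold names are hypothetical:

```r
library(tidymodels)

tree_spec <- decision_tree(cost_complexity = tune(),
                           tree_depth      = tune(),
                           min_n           = tune()) |>
  set_engine("rpart") |>
  set_mode("classification")

rf_spec <- rand_forest(mtry  = tune(),
                       trees = 500,        # usually set large rather than tuned
                       min_n = tune()) |>
  set_engine("ranger") |>
  set_mode("classification")

folds <- vfold_cv(dat_trn, v = 10)  # assumes a training set `dat_trn` with outcome `y`

tree_fits <- tune_grid(tree_spec, y ~ ., resamples = folds, grid = 10)
rf_fits   <- tune_grid(rf_spec,   y ~ ., resamples = folds, grid = 10)

show_best(tree_fits, metric = "accuracy")
```
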
  • Can we go over the different advantages and disadvantages of the different tree algorithms like we did in class with QDA, LDA, KNN, Log, and RDA?

    • simple tree (e.g. CART)
    • bagged trees
    • rf and XGBoost