IAML Unit 5: Discussion

Feedback

Course

  • Move application assignments - Thurs? Fri morning?
  • Some of the videos (or portions of them) don’t have subtitles? (All of them should! - confirm?)
  • Transcripts? Timestamps?
  • Mic (lab, discussion?)
  • More questions in slides?
  • Vocab dictionary
  • End with conclusions? Not sure how different from learning objectives
  • Full solution of a problem posted at end of unit?

Lab

  • Less review of previous application assignment
  • More interactive - coding problems? Forward or backward or both?

Announcements

  • Starting Unit 6 (Regularization/Penalized models) at the end of class today; all set!

  • Unit 7 is Mid-term Exam unit

    • No readings, lecture or quiz
    • Application assignment as exam (due at normal application assignment day/time; Weds, March 5th at 8pm OR NEW TIME BASED ON ABOVE?)
    • Lab used for conceptual exam review session (led by me)
    • Conceptual exam during discussion meeting (Thursday March 6th at 11 am)
  • Minimal EDA with new data sets (and the last application assignment)

Resampling methods to BOTH select and evaluate models

The most common scenario requires both model selection and final model evaluation

  • Need data for training, validation, and test set(s)

Test set

All methods (other than nested CV) use a single test set

  • Typically 20-30% of the data
  • The problems associated with a small test set motivate nested resampling (What problems?)
  • Conceptually, set aside the test set first
    • Use initial_split() for all but the single validation set approach (now use initial_validation_split() for that approach); see the sketch below
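
A minimal sketch of these splits (assuming the rsample package; `d` is a placeholder data frame):

```r
library(rsample)

set.seed(12345)

# Set aside the test set first (e.g., 25%); training/validation
# resampling happens later within the remaining 75%
splits <- initial_split(d, prop = 0.75)
d_trn_val <- training(splits)
d_test <- testing(splits)

# Alternative, for the single validation set approach: make all three
# sets at once (60% train / 20% validation / 20% test)
splits_3 <- initial_validation_split(d, prop = c(0.6, 0.2))
d_trn <- training(splits_3)
d_val <- validation(splits_3)
d_test <- testing(splits_3)
```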

Train and Validation set

Describe how each of these methods produces training and validation sets (see the sketch after this list)

  • Single validation set
  • LOOCV
  • K-fold
  • Repeated k-fold
  • Grouped k-fold
  • Bootstrap
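
For reference, a sketch of how each method can be set up (assumes a recent version of rsample; `d` and the grouping variable `group_id` are placeholders):

```r
library(rsample)

set.seed(12345)

# Single validation set: three-way split converted to a one-split rset
val_set <- validation_set(initial_validation_split(d, prop = c(0.6, 0.2)))

loocv <- loo_cv(d)                                        # LOOCV
kfold <- vfold_cv(d, v = 10)                              # k-fold (k = 10)
kfold_rep <- vfold_cv(d, v = 10, repeats = 3)             # repeated k-fold
kfold_grp <- group_vfold_cv(d, group = group_id, v = 10)  # grouped k-fold
boots <- bootstraps(d, times = 100)                       # bootstrap
```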

No performance estimate necessary

When might you just want to know the best model configuration but NOT care about how it performs? (though I think you might always want to know!)

How would you modify the resampling procedure?

In this instance, we might call the held-out sets validation sets when using that terminology

Only one model configuration

When might you have only one model configuration? (is this ever really true?)

How would you modify the resampling procedure?

In this instance, we might call the held-out sets test sets when using that terminology

Bias and Variance - The MOST Fundamental Issues in Inferential Statistics

Describe these two properties of estimates in general

  • Need to think about repeated estimation of something (\(\hat{f}\), \(\hat{DGP}\); \(\hat{accuracy}\), \(\hat{rmse}\))
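
To make "repeated estimation" concrete, here are the standard definitions for a generic estimate \(\hat{\theta}\) of a true quantity \(\theta\):

\[
\text{Bias}(\hat{\theta}) = E[\hat{\theta}] - \theta, \qquad
\text{Var}(\hat{\theta}) = E\left[\left(\hat{\theta} - E[\hat{\theta}]\right)^2\right]
\]

Both expectations are taken over repeated samples of the data used to form \(\hat{\theta}\).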

Describe bias and variance associated with developing models (estimates of the DGP)

  • Examples for our models (\(\hat{f}\), \(\hat{DGP}\))
  • What factors affect each?

Describe bias and variance of our performance estimates

  • Examples for our performance estimates (\(\hat{accuracy}\), \(\hat{rmse}\))
  • What factors affect each?

Connect Bias and Variance to our Resampling Methods

Our resampling methods yield an ESTIMATE of a (held-out/out-of-sample) performance metric

  • We can use it to select among model configurations (validation sets)
  • We can use it to evaluate a single (or best) model configuration (test sets)

Let's consider the bias and variance of this ESTIMATE of a performance metric for each METHOD (a size-inspection sketch follows the list of methods below)

  • Use held-in/held-out terminology
  • Discuss the bias and variance of the performance estimate for a given configuration. Consider implications of:
    • The method
    • Size of the held-in data
    • Size of the held-out data

Our METHODS

  • Single validation set
  • LOOCV
  • K-fold
  • Repeated k-fold
  • Bootstrap
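
One way to ground this discussion (a sketch assuming rsample and purrr; `d` is a placeholder data frame): count the performance estimates each method averages over and inspect the held-in/held-out sizes behind them.

```r
library(rsample)
library(purrr)

set.seed(12345)

resamples <- list(
  loocv = loo_cv(d),
  kfold = vfold_cv(d, v = 10),
  kfold_rep = vfold_cv(d, v = 10, repeats = 3),
  boots = bootstraps(d, times = 100)
)

# Number of resamples averaged into the performance estimate, and the
# held-in/held-out sizes for the first resample (bootstrap held-out
# sizes vary because they are the out-of-bag observations)
map(resamples, \(r) c(
  n_resamples = nrow(r),
  n_held_in = nrow(analysis(r$splits[[1]])),
  n_held_out = nrow(assessment(r$splits[[1]]))
))
```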

Biased Performance Estimates

ALL resampling methods yield performance estimates with some degree of bias when used to evaluate the performance of the final model we will implement, which is trained on ALL the data

WHY?

Optimization bias

What are the implications if you use that same performance estimate to select the best model configuration AND to evaluate that best model configuration (i.e., estimate its performance in new data)?
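
A small, self-contained simulation (hypothetical numbers) makes the optimization bias concrete: even when every configuration has identical true performance, the best validation score among many configurations overestimates the true performance of the selected configuration.

```r
set.seed(12345)

n_sims <- 1000    # simulated selection exercises
n_configs <- 20   # configurations compared in each exercise
true_acc <- 0.70  # every configuration truly has 70% accuracy
n_val <- 100      # held-out observations used to score each configuration

best_scores <- replicate(n_sims, {
  # noisy validation accuracy for each configuration
  val_acc <- rbinom(n_configs, n_val, true_acc) / n_val
  max(val_acc)  # score of the configuration we would select
})

mean(best_scores)  # clearly above 0.70: the selected score is optimistic
```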

Can this source of bias be eliminated?

Nested CV

  • When used?

  • Why needed?

  • Describe how to do it using held-in/held-out terminology in addition to train/val/test (see the structural sketch at the end of this section)

  • Discuss the bias and variance of the performance estimate used to select the best configuration. Consider implications of:

    • The method
    • Size of the held-in data
    • Size of the held-out data
  • Discuss the bias and variance of the performance estimate used to evaluate that best configuration. Consider implications of:

    • The method
    • Size of the held-in data
    • Size of the held-out data
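
A minimal structural sketch with rsample's nested_cv() (`d` is a placeholder; fold counts are illustrative):

```r
library(rsample)

set.seed(12345)

# Outer loop for evaluation, inner loop for selection; each outer
# held-in set gets its own inner resamples
nested <- nested_cv(d,
                    outside = vfold_cv(v = 10),
                    inside = vfold_cv(v = 5))

# For each outer split i:
#   1. Use nested$inner_resamples[[i]] to select the best configuration
#      using only the outer held-in data
#   2. Refit that configuration to the full outer held-in set and
#      evaluate it once on the outer held-out set
#   3. Average the outer held-out estimates across the 10 outer folds
```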

PCA

  • https://jjcurtin.github.io/book_iaml/app_pca.html