Evaluation

Regularization

  • prevent overfitting
  • Regularization Parameter: λ

L2 Norm

  • weight decay (each gradient update shrinks the weights a little)

L1 Norm

  • pushes many weights to exactly zero (sparse weights)
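
The regularized costs written out; a sketch in the course's usual notation, with the caveat that constant factors such as the 1/2 vary by convention:

```latex
% L2-regularized cost ("weight decay"):
J(w, b) = \frac{1}{m} \sum_{i=1}^{m} \mathcal{L}\big(\hat{y}^{(i)}, y^{(i)}\big)
          + \frac{\lambda}{2m} \|w\|_2^2

% L1-regularized cost (tends to zero out weights):
J(w, b) = \frac{1}{m} \sum_{i=1}^{m} \mathcal{L}\big(\hat{y}^{(i)}, y^{(i)}\big)
          + \frac{\lambda}{m} \|w\|_1
```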

Dropout Regularization
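
A minimal sketch of inverted dropout, assuming a layer-activation array a; the function name and keep_prob default are illustrative:

```python
import numpy as np

def dropout_forward(a, keep_prob=0.8):
    # Inverted dropout: randomly zero units during training, then divide
    # by keep_prob so the expected activation is unchanged and no extra
    # rescaling is needed at test time (dropout is simply turned off there).
    mask = np.random.rand(*a.shape) < keep_prob
    return (a * mask) / keep_prob
```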

Early Stopping

  • stop training earlier (see the sketch after this list)
    • prevents overfitting
    • but the cost function is not fully optimized
  • couples the two problems, so they can no longer be tackled independently
    • not overfitting
    • optimizing the cost function
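
A minimal sketch of that loop, assuming hypothetical train_one_epoch and dev_error helpers:

```python
def train_with_early_stopping(train_one_epoch, dev_error,
                              max_epochs=100, patience=5):
    # train_one_epoch() runs one pass over the training set;
    # dev_error() returns the current dev-set error. Both are assumed.
    best_err, stale_epochs = float("inf"), 0
    for _ in range(max_epochs):
        train_one_epoch()
        err = dev_error()
        if err < best_err:
            best_err, stale_epochs = err, 0
        else:
            stale_epochs += 1
            # Stop once dev error has not improved for `patience` epochs:
            # this avoids overfitting but leaves the cost unoptimized.
            if stale_epochs >= patience:
                break
    return best_err
```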

Bias and Variance

Train set error | Dev set error           | Diagnosis
low             | high                    | high variance
high            | high (close to train)   | high bias
high            | much higher than train  | high bias and high variance

high bias

  • underfitting
  • large regularization parameter λ
  • high training error, high cross validation error
solve
  • bigger network
  • NN architecture search
  • train longer
  • add polynomial features
  • decrease the regularization parameter λ

high variance

  • overfitting
  • small regularization parameter λ
  • low training error, high cross validation error
solve
  • more data (e.g., augment existing examples)
    • flip horizontally
    • rotate
    • distort
  • NN architecture search
  • regularization
  • smaller sets of features
  • increase the regularization parameter λ

Training Set

  • as the training set grows, training error increases and cross-validation error decreases (sketch below)
    • curves converging at a high error → high bias (more data won't help)
    • a persistent gap between the curves → high variance (more data helps)
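
A sketch of computing those curves, assuming a hypothetical model object with fit and error methods:

```python
def learning_curve(model, X_train, y_train, X_cv, y_cv, sizes):
    # model.fit / model.error are an assumed interface, not a real library.
    train_errors, cv_errors = [], []
    for m in sizes:
        model.fit(X_train[:m], y_train[:m])          # train on first m examples
        train_errors.append(model.error(X_train[:m], y_train[:m]))
        cv_errors.append(model.error(X_cv, y_cv))    # always the full CV set
    return train_errors, cv_errors
```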

Error Analysis

Learning Approach

  • start with a simple algorithm
  • implement it and test on cross-validation data
  • plot learning curves
    • to decide whether more data or more features are likely to help
  • manually examine the examples that the algorithm made errors on

Skewed Classes

one class appears far more often than the other in the training data

Evaluation

Precision/Recall

On cross-validation data:

  • positive: predict 1
  • negative: predict 0
  • true: predict = actual
  • false: predict != actual

Precision

  • true positives / predicted positives
  • true positives / (true positives + false positives)

Recall

  • true positives / actual positives
  • true positives / (true positives + false negatives)
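
The two definitions translated directly into code; labels and predictions are assumed to be 0/1 sequences:

```python
def precision_recall(y_true, y_pred):
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if p == 0 and t == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```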

Trade-off

  • predict 1 only when the model's output exceeds a threshold p (sketch after this list)
    • if p = 0.9
      • high precision
      • low recall
    • if p = 0.1
      • high recall
      • low precision
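
Sweeping p over a model's scores shows the trade-off, reusing the precision_recall helper defined above (scores and labels here are made up):

```python
def precision_recall_at(scores, y_true, p):
    # Higher p -> fewer positive predictions -> precision up, recall down.
    y_pred = [1 if s > p else 0 for s in scores]
    return precision_recall(y_true, y_pred)

for p in (0.1, 0.5, 0.9):
    print(p, precision_recall_at([0.2, 0.8, 0.95, 0.4], [0, 1, 1, 1], p))
```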

F Score

  • simple average (P + R) / 2
    • not good: a degenerate classifier can still score well
  • F1 score: 2PR / (P + R)
    • the harmonic mean penalizes a very low precision or recall
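
A worked comparison with illustrative numbers: a classifier that predicts 1 for everything gets recall 1 but tiny precision, which the plain average hides and F1 does not:

```latex
\text{average} = \frac{P + R}{2} = \frac{0.02 + 1}{2} = 0.51
\qquad
F_1 = \frac{2PR}{P + R} = \frac{2 \cdot 0.02 \cdot 1}{1.02} \approx 0.039
```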

Evaluation Metrics

  • pick one metric to be optimizing
  • the rest are satisficing
    • they only need to reach a threshold (e.g., optimize accuracy while keeping running time below a limit)

Evaluation Methods

hold-out

previous (Machine Learning)
  • randomly order
  • 70% training set
  • 30% test set
current (Deep Learning / Big Data)
  • more data for training
  • less for dev and test
    • dev set to compare different models
    • test set to get an unbiased estimate of the final model's performance (split sketch below)
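
A sketch of such a split; the 98/1/1 ratio and numpy-array input are illustrative assumptions:

```python
import numpy as np

def train_dev_test_split(data, train_frac=0.98, dev_frac=0.01, seed=0):
    # Randomly order, then slice: almost everything goes to training,
    # while dev and test just need enough examples to compare models.
    idx = np.random.default_rng(seed).permutation(len(data))
    n_train = int(train_frac * len(data))
    n_dev = int(dev_frac * len(data))
    return (data[idx[:n_train]],
            data[idx[n_train:n_train + n_dev]],
            data[idx[n_train + n_dev:]])
```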

cross validation

Leave-One-Out Cross-Validation
  • split the data into n sets of one example each
  • 1 for testing, n-1 for training; repeat n times and average the errors
K-Fold Cross-Validation
  • split the data into k folds
  • 1 fold for testing, k-1 for training; repeat k times and average (sketch below)
    • if k is too small, each model trains on little data, so the error estimate is pessimistic
    • if k is too large, the training folds overlap heavily
      • highly correlated models, so the estimate has high variance
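
A minimal index generator for k-fold cross-validation (LOOCV is the special case k = n); a pure-numpy sketch with illustrative names:

```python
import numpy as np

def kfold_indices(n, k=5, seed=0):
    # Shuffle once, cut into k folds; each fold is held out exactly once.
    idx = np.random.default_rng(seed).permutation(n)
    folds = np.array_split(idx, k)
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx
```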
