# Evaluation

## Regularization

- prevent overfitting by penalizing large weights
- regularization parameter λ controls the strength of the penalty

### L2 Norm

- penalize the sum of squared weights
- also known as weight decay

### L1 Norm

- penalize the sum of absolute weight values
- tends to drive weights to exactly zero (sparse solutions)
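A minimal sketch of both penalties added to an unregularized cost, assuming `weights` is a list of NumPy weight matrices, `lam` is λ, and `m` is the number of training examples (all names are illustrative, not from the original notes):

```python
import numpy as np

def l2_regularized_cost(base_cost, weights, lam, m):
    """Add the L2 penalty (lambda / 2m) * sum of squared weights."""
    penalty = (lam / (2 * m)) * sum(np.sum(w ** 2) for w in weights)
    return base_cost + penalty

def l1_regularized_cost(base_cost, weights, lam, m):
    """Add the L1 penalty (lambda / m) * sum of absolute weights."""
    penalty = (lam / m) * sum(np.sum(np.abs(w)) for w in weights)
    return base_cost + penalty
```

Increasing `lam` makes the penalty dominate, shrinking the weights; decreasing it lets the data term dominate.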

### Dropout Regularization
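A sketch of inverted dropout, the common variant: during training, randomly zero out units with probability `1 - keep_prob` and scale the survivors by `1 / keep_prob` so the expected activation is unchanged; at test time the network runs without dropout. All parameter names here are illustrative:

```python
import numpy as np

def inverted_dropout(activations, keep_prob, rng=None):
    """Randomly zero units and rescale survivors (training time only)."""
    rng = rng or np.random.default_rng()
    mask = rng.random(activations.shape) < keep_prob  # keep with prob keep_prob
    return activations * mask / keep_prob             # rescale to keep expectation
```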

### Early Stopping

- stop training before the dev set error starts to rise
- prevents overfitting
- but the cost function is not fully optimized

- couples two problems that are better solved separately:
  - not overfitting
  - optimizing the cost function
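The stopping rule above can be sketched as a patience loop: keep the epoch with the best dev error seen so far, and stop once it has not improved for `patience` consecutive epochs. `train_step`, `dev_error`, and `patience` are illustrative names, not from the original notes:

```python
def train_with_early_stopping(train_step, dev_error, max_epochs=100, patience=5):
    """Stop when dev error has not improved for `patience` epochs."""
    best, best_epoch, waited = float("inf"), 0, 0
    for epoch in range(max_epochs):
        train_step(epoch)          # one epoch of optimization
        err = dev_error(epoch)     # evaluate on the dev set
        if err < best:
            best, best_epoch, waited = err, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch, best
```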

## Bias and Variance

Train set error | Dev set error | Diagnosis
---|---|---
low | high | high variance
high | high | high bias
high | even higher | high bias and high variance
low | low | low bias and low variance

#### high bias

- underfitting
- regularization parameter λ too large
- high training error, high cross-validation error

##### solve

- bigger network
- NN architecture search
- train longer
- add polynomial features
- decrease λ

#### high variance

- overfitting
- regularization parameter λ too small
- low training error, high cross-validation error

##### solve

- more data, e.g. via augmentation:
  - flip horizontally
  - rotate
  - distort

- NN architecture search
- regularization
- smaller set of features
- increase λ

### Training Set

- as the training set grows, training error increases and cross-validation error decreases

## Error Analysis

### Learning Approach

- start with a simple algorithm
- implement it and test on cross-validation data
- plot learning curves
  - to decide whether more data or more features would help

- manually examine the examples the algorithm made errors on

### Skewed Class

one class appears far more often than the other in the training data

## Evaluation

### Precision/Recall

On cross-validation data:

- positive: predict 1
- negative: predict 0
- true: predict = actual
- false: predict != actual

#### Precision

- true positives / predicted positives
- true positives / (true positives + false positives)

#### Recall

- true positives / actual positives
- true positives / (true positives + false negatives)
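The two formulas above can be computed directly from predicted and actual binary labels; this sketch guards against division by zero when there are no predicted or actual positives:

```python
def precision_recall(predictions, actuals):
    """Compute precision and recall for binary labels (1 = positive)."""
    tp = sum(p == 1 and a == 1 for p, a in zip(predictions, actuals))
    fp = sum(p == 1 and a == 0 for p, a in zip(predictions, actuals))
    fn = sum(p == 0 and a == 1 for p, a in zip(predictions, actuals))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```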

#### Trade-off

- predict 1 if the model's output exceeds threshold p
- if p = 0.9:
  - high precision
  - low recall

- if p = 0.1:
  - high recall
  - low precision

#### F Score

- simple average (P + R) / 2:
  - not good: a degenerate classifier with extreme precision or recall can still score well
- F1 score: 2PR / (P + R)
  - harmonic mean, low if either precision or recall is low
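A minimal sketch of the F1 score, which illustrates why the simple average is misleading: a classifier with precision 1.0 and recall 0.0 averages to 0.5 but gets an F1 of 0.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0 if either is 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```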

### Evaluation Metrics

- pick one metric to optimize
- treat the rest as satisficing metrics
- a satisficing metric only needs to reach a threshold

### Evaluation Methods

#### hold-out

##### previous (Machine Learning)

- randomly order
- 70% training set
- 30% test set

##### current (Deep Learning / Big Data)

- more data for training
- less for dev and test
- dev set to evaluate different models
- test set to get an unbiased estimate of the final model's performance

#### cross validation

##### Leave-One-Out Cross-Validation

- split the data into n sets, where n is the number of examples
- 1 example for testing, n - 1 for training
- repeat for each example and average the errors

##### K-Fold Cross-Validation

- split the data into k folds
- each fold is used once for testing, with the other k - 1 folds for training
- if k is too small, each training set is small, so the error estimate is pessimistically biased
- if k is too large, the training sets overlap heavily, so the estimates are highly correlated and have high variance
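The k-fold split described above can be sketched as an index generator; for simplicity this version assigns leftover examples to the last fold rather than shuffling first:

```python
def k_fold_splits(n_examples, k):
    """Yield (test_indices, train_indices) for each of k folds."""
    indices = list(range(n_examples))
    fold_size = n_examples // k
    for i in range(k):
        start = i * fold_size
        end = start + fold_size if i < k - 1 else n_examples
        yield indices[start:end], indices[:start] + indices[end:]
```

With k = n_examples this reduces to leave-one-out cross-validation.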