Common Problems in Hyperparameter Optimization
Alexandra Johnson (@alexandraj777)
What are Hyperparameters?
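For concreteness, a minimal sketch (assuming scikit-learn; the model and values here are illustrative) of the distinction: hyperparameters are chosen before training, while model parameters are learned from data.

from sklearn.ensemble import GradientBoostingClassifier

# Hyperparameters: chosen *before* training, not learned from data.
model = GradientBoostingClassifier(
    learning_rate=0.1,   # step size shrinkage
    n_estimators=100,    # number of boosting stages
    max_depth=3,         # depth of each tree
)

# The model parameters (the trees themselves) are learned at fit time:
# model.fit(X_train, y_train)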
Hyperparameter Optimization
● Also called hyperparameter tuning, model tuning, or model selection
● Finding "the best" values for the hyperparameters of your model
Better Performance
● +315% accuracy boost for TensorFlow
● +49% accuracy boost for xgboost
● -41% error reduction for a recommender system
#1 Trusting the Defaults
Default Values
● Default values are an implicit choice
● Defaults are not always appropriate for your model
● You may build a classifier that looks like this: [slide image: a classifier trained with default values]
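As a hedged illustration of the point (assuming scikit-learn): calling a constructor with no arguments is still a hyperparameter choice, just one someone else made for you.

from sklearn.svm import SVC

# These two lines build the same model; the first just hides the choice.
implicit = SVC()
explicit = SVC(C=1.0, kernel="rbf", gamma="scale")  # the documented defaults

# Defaults tuned for "typical" data may be far from optimal for yours.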
#2 Using the Wrong Metric
Choosing a Metric
● Balance long-term and short-term goals
● Question underlying assumptions
● Example from Microsoft (see Kohavi et al. in the references)
Choose Multiple Metrics
● Composite metric
● Multi-metric
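A sketch of the composite-metric option (the function name and weight here are illustrative, not recommended values): combine the metrics you care about into a single scalar so any optimizer can consume it, keeping the trade-off weights explicit.

def composite_metric(accuracy, latency_ms, w_latency=0.01):
    """Single scalar balancing accuracy (maximize) against latency (minimize).

    w_latency encodes how much accuracy we trade for 1 ms of speed.
    """
    return accuracy - w_latency * latency_ms

# A 92%-accurate but 50 ms slower model scores below a 90%-accurate fast one:
print(composite_metric(0.92, 50))  # 0.42
print(composite_metric(0.90, 5))   # 0.85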
#3 Overfitting
Metric Generalization
● Cross validation (a sketch follows this list)
● Backtesting
● Regularization terms
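A minimal sketch (assuming scikit-learn) of the first technique, cross validation: report the tuning metric averaged over folds rather than over a single train/test split, so the optimizer cannot overfit one lucky split.

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(max_depth=4, n_estimators=50, random_state=0)

# Report the mean of 5 folds to the optimizer, not one split's score.
scores = cross_val_score(model, X, y, cv=5)
print(scores.mean())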
#4 Too Few Hyperparameters
Optimize All Parameters at Once
Include Feature Parameters
Example: xgboost
● The optimized model always performed better when feature parameters were tuned
● This held no matter which optimization method was used
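A hedged sketch of what "feature parameters" means in practice (assuming a scikit-learn pipeline; the xgboost experiment in the references used its own setup): preprocessing knobs such as the number of TF-IDF features sit in the same search space as model knobs.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline

pipe = Pipeline([
    ("tfidf", TfidfVectorizer()),                # feature extraction
    ("clf", LogisticRegression(max_iter=1000)),  # the model itself
])

# Feature parameters and model parameters live in one search space.
param_distributions = {
    "tfidf__max_features": [1000, 5000, 10000],  # feature parameter
    "tfidf__ngram_range": [(1, 1), (1, 2)],      # feature parameter
    "clf__C": [0.1, 1.0, 10.0],                  # model parameter
}

search = RandomizedSearchCV(pipe, param_distributions, n_iter=10, cv=3)
# search.fit(documents, labels)  # supply your own text corpus here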
#5 Hand Tuning
What is an Optimization Method?
You Are Not an Optimization Method
● Hand tuning is time consuming and expensive
● Algorithms can quickly and cheaply beat expert tuning
Use an Algorithm
● Grid search
● Random search
● Bayesian optimization
#6 Grid Search
No Grid Search

Hyperparameters | Model Evaluations
2               | 100
3               | 1,000
4               | 10,000
5               | 100,000
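The table's arithmetic as a quick sketch: the powers of ten imply 10 candidate values per hyperparameter (an inference from the table, not stated on the slide), so an exhaustive grid needs 10^k model evaluations for k hyperparameters.

from itertools import product

VALUES_PER_HYPERPARAMETER = 10

for k in range(2, 6):
    grid = product(*[range(VALUES_PER_HYPERPARAMETER)] * k)
    n_evaluations = sum(1 for _ in grid)
    print(f"{k} hyperparameters -> {n_evaluations:,} model evaluations")

# 2 hyperparameters -> 100 model evaluations
# ...
# 5 hyperparameters -> 100,000 model evaluations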
#7 Random Search
Random Search
● Theoretically more effective than grid search
● Large variance in results
● No intelligence
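A minimal sketch of random search (assuming scikit-learn; dataset, model, and ranges are illustrative). Note how the budget stays fixed regardless of dimensionality, and how results vary run to run without a fixed seed, which is the variance problem above.

from scipy.stats import loguniform, randint
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = load_digits(return_X_y=True)

param_distributions = {
    "n_estimators": randint(50, 300),
    "max_depth": randint(2, 20),
    "max_features": loguniform(0.1, 1.0),
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions,
    n_iter=20,       # fixed budget, independent of dimensionality
    cv=3,
    random_state=0,  # without this, results vary run to run
)
search.fit(X, y)
print(search.best_params_, search.best_score_)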
Use an Intelligent Method
● Genetic algorithms
● Bayesian optimization
● Particle-based methods
● Convex optimizers
● Simulated annealing
● ...to name a few
SigOpt: Bayesian Optimization Service
Three API calls:
1. Define hyperparameters
2. Receive suggested hyperparameters
3. Report observed performance
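A hedged sketch of the three-call loop. Method names follow the classic SigOpt Python client from around the time of this talk; the API has since evolved, so treat this as illustrative and check current SigOpt docs. The evaluation routine is a hypothetical stand-in.

from sigopt import Connection

def evaluate_model(learning_rate, max_depth):
    """Stand-in for your own train-and-evaluate routine (hypothetical)."""
    ...

conn = Connection(client_token="YOUR_API_TOKEN")

# 1. Define hyperparameters.
experiment = conn.experiments().create(
    name="xgboost tuning",
    parameters=[
        dict(name="learning_rate", type="double", bounds=dict(min=0.001, max=0.5)),
        dict(name="max_depth", type="int", bounds=dict(min=2, max=12)),
    ],
)

for _ in range(30):
    # 2. Receive suggested hyperparameters.
    suggestion = conn.experiments(experiment.id).suggestions().create()
    assignments = suggestion.assignments
    value = evaluate_model(assignments["learning_rate"], assignments["max_depth"])

    # 3. Report observed performance.
    conn.experiments(experiment.id).observations().create(
        suggestion=suggestion.id,
        value=value,
    )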
Thank You!
References - by Section

Intro
● Ian Dewancker. SigOpt for ML: TensorFlow ConvNets on a Budget with Bayesian Optimization.
● Ian Dewancker. SigOpt for ML: Unsupervised Learning with Even Less Supervision Using Bayesian Optimization.
● Ian Dewancker. SigOpt for ML: Bayesian Optimization for Collaborative Filtering with MLlib.

#1 Trusting the Defaults
● Keras recurrent layers documentation.

#2 Using the Wrong Metric
● Ron Kohavi et al. Trustworthy Online Controlled Experiments: Five Puzzling Outcomes Explained.
● Xavier Amatriain. 10 Lessons Learned from Building ML Systems [video at 19:03].
● Image from PhD Comics.
● See also: SigOpt in Depth: Intro to Multicriteria Optimization.

#4 Too Few Hyperparameters
● Image from TensorFlow Playground.
● Ian Dewancker. SigOpt for ML: Unsupervised Learning with Even Less Supervision Using Bayesian Optimization.

#5 Hand Tuning
● On algorithms beating experts: Scott Clark, Ian Dewancker, and Sathish Nagappan. Deep Neural Network Optimization with SigOpt and Nervana Cloud.

#6 Grid Search
● NoGridSearch.com

#7 Random Search
● James Bergstra and Yoshua Bengio. Random Search for Hyper-Parameter Optimization.
● Ian Dewancker, Michael McCourt, Scott Clark, Patrick Hayes, Alexandra Johnson, George Ke. A Stratified Analysis of Bayesian Optimization Methods.

Learn More
● blog.sigopt.com
● sigopt.com/research
