Artificial Intelligence Lab Seminar

2012 Nov 23 at 11:30

DC 2306C (AI Lab)

Grid Search is a Bad Hyperparameter Optimization Algorithm

James Bergstra, Postdoctoral Researcher, Centre for Theoretical Neuroscience, University of Waterloo

Grid search and manual search are the most widely used strategies for hyperparameter optimization. Manual search is well known to produce results that are difficult to reproduce. In this talk, I will argue that grid and manual search are inefficient and ineffective compared with alternatives based on Bayesian optimization and even random search. I draw empirical support from a large previous study that used grid search and manual search to configure neural networks and Deep Belief Networks. Analysis of the response surface (the function from hyperparameters to validation set performance) reveals that for most data sets only a few of the hyperparameters really matter, but critically, different hyperparameters are important on different data sets. This property makes grid search a poor choice for configuring these algorithms for new data sets, and sheds some light on why recent "High Throughput" methods based on random search achieve surprising success: they are not bogged down by irrelevant hyperparameters. In cases where brute force methods are not sufficiently efficient, Bayesian optimization offers a principled, practical, and effective framework for search. I will present recent and ongoing work on improved model selection of Deep Belief Networks and multi-layer visual system models by Bayesian optimization.
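The point about irrelevant hyperparameters can be illustrated with a small sketch. This is not code from the talk; it assumes a toy two-dimensional search space on the unit square in which only the first hyperparameter matters, and all function names are hypothetical. With the same budget of nine trials, a 3x3 grid tries only three distinct values of the important hyperparameter, while random sampling tries nine.

```python
import itertools
import random


def grid_points(values_per_axis, n_dims):
    """Full Cartesian grid over the unit hypercube [0, 1]^n_dims."""
    axis = [i / (values_per_axis - 1) for i in range(values_per_axis)]
    return list(itertools.product(axis, repeat=n_dims))


def random_points(n_points, n_dims, seed=0):
    """Independent uniform samples from [0, 1]^n_dims (seeded for repeatability)."""
    rng = random.Random(seed)
    return [tuple(rng.random() for _ in range(n_dims)) for _ in range(n_points)]


def distinct_values_on_axis(points, axis):
    """Number of different settings tried for one hyperparameter."""
    return len({p[axis] for p in points})


# Same trial budget for both strategies: 3 x 3 = 9 configurations.
grid = grid_points(values_per_axis=3, n_dims=2)
rand = random_points(n_points=len(grid), n_dims=2)

# Assume (for illustration only) that axis 0 is the hyperparameter that matters.
grid_coverage = distinct_values_on_axis(grid, axis=0)
rand_coverage = distinct_values_on_axis(rand, axis=0)
print("grid search tried", grid_coverage, "values of the important hyperparameter")
print("random search tried", rand_coverage, "values of the important hyperparameter")
```

The gap widens as irrelevant dimensions are added: a grid spends its budget repeating the same few values along every axis, whereas each random trial probes a new value of whichever hyperparameter turns out to matter.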

Bio: James Bergstra is a postdoctoral researcher at the University of Waterloo working in the Centre for Theoretical Neuroscience under Chris Eliasmith. His research has focused by turns on visual system models and learning algorithms, hyperparameter optimization, high performance computing, and music information retrieval. He moved to the University of Waterloo from Harvard University, where he worked for a year in David Cox's Computer and Biological Vision lab. He completed doctoral studies at the University of Montreal in July 2011 under the direction of Professor Yoshua Bengio with a dissertation on how to incorporate complex cells into deep learning models. In the course of his graduate work he co-developed Theano, an open source optimizing compiler that can make use of Graphics Processing Units (GPUs) for high-performance computation. He completed a Master's degree in 2006 under the direction of Douglas Eck on algorithms for classifying recorded music by genre.