Neural Architecture Search With Bayesian Optimisation And Optimal Transport

Authors:
Kirthevasan Kandasamy Carnegie Mellon University
Willie Neiswanger Carnegie Mellon University
Jeff Schneider Carnegie Mellon University
Barnabas Poczos Carnegie Mellon University
Eric Xing Petuum Inc. / Carnegie Mellon University

Abstract:

Bayesian Optimisation (BO) refers to a class of methods for global optimisation of a function f which is only accessible via point evaluations. It is typically used in settings where f is expensive to evaluate. A common use case for BO in machine learning is model selection, where it is not possible to analytically model the generalisation performance of a statistical model, and we resort to noisy and expensive training and validation procedures to choose the best model. Conventional BO methods have focused on Euclidean and categorical domains, which, in the context of model selection, only permits tuning scalar hyper-parameters of machine learning algorithms. However, with the surge of interest in deep learning, there is an increasing demand to tune neural network architectures. In this work, we develop NASBOT, a Gaussian process based BO framework for neural architecture search. To accomplish this, we develop a distance metric in the space of neural network architectures which can be computed efficiently via an optimal transport program. This distance might be of independent interest to the deep learning community as it may find applications outside of BO. We demonstrate that NASBOT outperforms other alternatives for architecture search in several cross validation based model selection tasks on multi-layer perceptrons and convolutional neural networks.
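To make the optimal transport idea concrete, the sketch below solves the generic discrete OT linear program between two toy networks: each network is summarised by a per-layer "mass" vector (e.g. normalised unit counts), and a pairwise cost matrix encodes how dissimilar each pair of layers is. This is only an illustrative OT computation under those assumptions, not the paper's actual OTMANN distance; the `ot_distance` function, the toy mass vectors, and the cost matrix are all hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def ot_distance(mass_a, mass_b, cost):
    """Solve the discrete optimal transport LP:
    min_T <T, cost>  s.t.  T @ 1 = mass_a,  T.T @ 1 = mass_b,  T >= 0.
    mass_a and mass_b must have equal total mass (e.g. both sum to 1)."""
    n, m = len(mass_a), len(mass_b)
    c = cost.reshape(-1)  # flatten transport plan T row-major into a vector
    # Equality constraints: n row-sum constraints, then m column-sum constraints.
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1.0   # row i of T sums to mass_a[i]
    for j in range(m):
        A_eq[n + j, j::m] = 1.0            # column j of T sums to mass_b[j]
    b_eq = np.concatenate([mass_a, mass_b])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.fun

# Two hypothetical networks with per-layer mass (normalised unit counts).
mass_a = np.array([0.5, 0.5])
mass_b = np.array([0.3, 0.7])
# Hypothetical pairwise layer dissimilarities (0 = identical layer type).
cost = np.array([[0.0, 1.0],
                 [1.0, 0.0]])
print(ot_distance(mass_a, mass_b, cost))  # 0.2: only 0.2 mass moves at cost 1
```

In a real architecture-search setting, the cost matrix would also penalise structural differences (layer type, position in the computation graph), which is where a purpose-built metric like the paper's departs from this plain LP.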
