AutoML Pipeline Selection: Efficiently Navigating the Combinatorial Space

2020 
Data scientists seeking a good supervised learning model on a dataset have many choices to make: they must preprocess the data, select features, possibly reduce the dimension, select an estimation algorithm, and choose hyperparameters for each of these pipeline components. With new pipeline components comes a combinatorial explosion in the number of choices! In this work, we design a new AutoML system TensorOboe to address this challenge: an automated system to design a supervised learning pipeline. TensorOboe uses low rank tensor decomposition as a surrogate model for efficient pipeline search. We also develop a new greedy experiment design protocol to gather information about a new dataset efficiently. Experiments on large corpora of real-world classification problems demonstrate the effectiveness of our approach.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    47
    References
    4
    Citations
    NaN
    KQI
    []