Prediction of stock price direction using a hybrid GA-XGBoost algorithm with a three-stage feature engineering process

2021 
Abstract For several centuries, the stock market has served one of the most important functions in a laissez-faire economic system by bringing together people, companies, and flows of money. Researchers have conducted numerous studies to predict stock prices, and with the advent of big data and the rapid development of artificial intelligence, a growing number of them apply machine learning or deep learning techniques to stock market prediction. However, accurately predicting the direction of stock prices remains difficult because stock prices are inherently complex, nonlinear, nonstationary, and sometimes too irrational to be predictable. Despite the wealth of available information, previous prediction systems have often overlooked key indicators and the importance of feature engineering. This study proposes a hybrid GA-XGBoost prediction system with an enhanced three-stage feature engineering process consisting of feature set expansion, data preparation, and optimal feature set selection using the hybrid GA-XGBoost algorithm. It experimentally verifies the importance of the feature engineering process in stock price direction prediction by comparing the obtained feature sets with the original dataset, and it improves prediction performance enough to outperform benchmark models. Specifically, the largest gain in accuracy comes from feature expansion, which adds 67 technical indicators to the original historical stock price data. The study also produces a parsimonious optimal feature set using the GA-XGBoost algorithm that achieves the desired performance with substantially fewer features. Consequently, this study empirically demonstrates that successful prediction performance depends largely on a deliberate combination of feature engineering processes with a baseline learning model, striking a balance between the curse of dimensionality and the blessing of dimensionality.
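The feature set selection stage can be illustrated with a minimal sketch, reading "GA" as a genetic algorithm that searches over binary feature masks. This is not the authors' implementation: the function names, the population settings, and the operators (truncation selection, one-point crossover, bit-flip mutation, elitism) are all assumptions chosen for brevity. The fitness function here is a toy stand-in; in a setup like the one the abstract describes, it would plausibly be the validation accuracy of an XGBoost classifier trained on the selected columns.

```python
import random
from typing import Callable, List, Sequence

Mask = List[int]  # 1 = feature kept, 0 = feature dropped


def ga_select(
    n_features: int,
    fitness: Callable[[Sequence[int]], float],
    pop_size: int = 20,
    generations: int = 30,
    mutation_rate: float = 0.05,
    seed: int = 0,
) -> Mask:
    """Search for a high-fitness binary feature mask (hypothetical helper)."""
    rng = random.Random(seed)
    # Random initial population of feature masks.
    population = [
        [rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)
    ]
    best = max(population, key=fitness)
    for _ in range(generations):
        scored = sorted(population, key=fitness, reverse=True)
        parents = scored[: pop_size // 2]  # truncation selection
        children = [scored[0][:]]          # elitism: carry the best mask over
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with a small per-bit probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children
        best = max(population + [best], key=fitness)
    return best


def toy_fitness(mask: Sequence[int]) -> float:
    """Stand-in for model accuracy (assumption): features 0-2 are
    informative, and every kept feature costs a small penalty."""
    gain = sum(1.0 for i in (0, 1, 2) if mask[i])
    return gain - 0.1 * sum(mask)


if __name__ == "__main__":
    best_mask = ga_select(n_features=10, fitness=toy_fitness)
    print(best_mask, toy_fitness(best_mask))
```

The per-feature penalty in the toy fitness mimics the trade-off the abstract points to: extra features can add signal, but a smaller subset that performs equally well is preferred.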