Analyzing driving factors of land values in urban scale based on big data and non-linear machine learning techniques

2020 
Abstract Land value plays a vital role in the real estate market. It is a critical reference for urban planners when reallocating land resources and introducing policies. Studying the factors that influence land value helps explain its spatial-temporal variation and supports the design of effective control policies. This has led many scholars to study the spatial and temporal relationships between land value and its possible influential factors from both macro and micro perspectives. However, most existing studies suffer from linearity assumptions and multicollinearity in their models. Limited feature sets and the lack of a feature selection procedure are two other common limitations. To address these gaps, this paper adopts non-linear machine learning (ML) methods to investigate the factors influencing land value per square foot, based on “big data” for New York City. More than one thousand potential factors are considered, covering land attributes, points of interest, demographics, housing, economy, education, and social characteristics. These are then filtered using a feature selection method, Recursive Feature Elimination (RFE). Six ML algorithms are evaluated and compared: Random Forest (RF), Gradient Boosting Decision Tree (GBDT), Multiple Linear Regression (MLR), Linear Support Vector Regression (SVR), Multilayer Perceptron (MLP) Regression, and K-Nearest Neighbor (KNN) Regression. The best-performing model, with an R-squared value of 0.933, is then used to calculate feature importance. Several influential features are identified, including the number of newsstands and the vacant housing percentage.
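The Recursive Feature Elimination step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not say which estimator drives the ranking, so a plain least-squares model fitted by gradient descent is swapped in here, and the data are synthetic. All function names (`standardize`, `fit_linear`, `rfe`) are hypothetical and chosen for illustration only.

```python
import random

def standardize(rows):
    """Scale each column to zero mean and unit variance."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    stds = [(sum((r[j] - means[j]) ** 2 for r in rows) / n) ** 0.5 or 1.0
            for j in range(p)]
    return [[(r[j] - means[j]) / stds[j] for j in range(p)] for r in rows]

def fit_linear(rows, y, lr=0.1, epochs=300):
    """Least-squares weights via batch gradient descent (y is pre-centered)."""
    n, p = len(rows), len(rows[0])
    w = [0.0] * p
    for _ in range(epochs):
        residuals = [sum(wj * xj for wj, xj in zip(w, r)) - t
                     for r, t in zip(rows, y)]
        grad = [sum(res * r[j] for res, r in zip(residuals, rows)) / n
                for j in range(p)]
        w = [wj - lr * g for wj, g in zip(w, grad)]
    return w

def rfe(rows, y, n_keep):
    """Refit on the surviving features, drop the weakest one, and repeat."""
    rows = standardize(rows)
    y_mean = sum(y) / len(y)
    y = [t - y_mean for t in y]
    active = list(range(len(rows[0])))  # indices of features still in play
    while len(active) > n_keep:
        sub = [[r[j] for j in active] for r in rows]
        w = fit_linear(sub, y)
        # Assumption: |weight| on standardized inputs serves as the importance score.
        weakest = min(range(len(active)), key=lambda k: abs(w[k]))
        del active[weakest]
    return active

# Synthetic stand-in for the real data: only features 0 and 1 drive the target.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(200)]
y = [3.0 * r[0] - 2.0 * r[1] + rng.gauss(0, 0.1) for r in X]

selected = rfe(X, y, n_keep=2)
print("features kept:", selected)
```

The loop refits after every removal, which is what distinguishes RFE from a one-shot ranking: a feature's score can change once a correlated neighbour has been dropped. With a non-linear estimator such as a random forest, the importance score would come from the model's own feature importances rather than from weight magnitudes.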