Analyzing potential tourist behavior using PCA and modified affinity propagation clustering based on Baidu index: Taking Beijing City as an example

2021 
Abstract In recent years, residents planning a trip have often made full use of the Internet to access extensive travel information when choosing a destination. Search engine queries therefore reveal visitors' real-time preferences when planning to visit a destination, and more and more researchers have adopted tourism-related search engine data for tourism prediction. However, few studies use search engine data to conduct cluster analysis that identifies residents' choices regarding a tourism destination. In the present study, 146 keywords related to “Beijing tourism” are obtained from the Baidu index, and principal component analysis (PCA) is applied to reduce the dimensionality of the keyword data. A modified affinity propagation (MAP) clustering algorithm is then used to classify provinces into several groups in order to identify which residents choose to travel to Beijing. The results show that residents of Hebei province are the most likely to travel to Beijing. The clustering results also show that PCA–MAP performs better than other clustering methods such as K-means, linkage, and affinity propagation (AP) in terms of the silhouette coefficient and the Calinski–Harabasz index. We also distinguish residents' choices to travel to Beijing during the peak tourist season from those during the off-season. The residents of Tianjin are inclined to travel to Beijing during the peak tourist season. The residents of Guangdong, Hebei, Henan, Jiangsu, Liaoning, Shanghai, Shandong, and Zhejiang show high interest in traveling to Beijing during both seasons.
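The pipeline described in the abstract (dimensionality reduction with PCA, clustering of provinces, and evaluation with the silhouette coefficient and Calinski–Harabasz index) can be illustrated with a minimal sketch. It rests on several assumptions that are not from the paper: the data are synthetic stand-ins rather than Baidu index values, the 30 rows and the 90% explained-variance threshold are arbitrary, and the clustering step uses scikit-learn's standard affinity propagation, not the authors' modified variant (MAP), whose details the abstract does not give.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.decomposition import PCA
from sklearn.metrics import calinski_harabasz_score, silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in (assumption): 30 "provinces" x 146 "keyword" volumes,
# drawn from three well-separated groups. This is NOT real Baidu index data.
n_keywords = 146
centers = rng.normal(0.0, 5.0, size=(3, n_keywords))
X = np.vstack(
    [c + rng.normal(0.0, 1.0, size=(10, n_keywords)) for c in centers]
)

# Standardize each keyword column, then keep enough principal components
# to explain 90% of the variance (the threshold is an illustrative choice).
X_std = StandardScaler().fit_transform(X)
Z = PCA(n_components=0.90).fit_transform(X_std)

# Standard affinity propagation; the paper's modified version is not shown.
ap = AffinityPropagation(damping=0.9, max_iter=1000, random_state=0)
labels = ap.fit_predict(Z)
n_clusters = len(set(labels))

# Internal validity indices named in the abstract (higher is better).
sil = silhouette_score(Z, labels)
ch = calinski_harabasz_score(Z, labels)

print(f"components kept: {Z.shape[1]}, clusters found: {n_clusters}")
print(f"silhouette: {sil:.3f}, Calinski-Harabasz: {ch:.1f}")
```

Affinity propagation chooses the number of clusters itself, which is why no cluster count is passed in; comparing against K-means or linkage clustering, as the abstract does, would mean computing the same two indices on their label assignments.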