Data Mining Application on Weather Prediction Using Classification Tree, Naïve Bayes and K-Nearest Neighbor Algorithm With Model Testing of Supervised Learning Probabilistic Brier Score, Confusion Matrix and ROC

2020 
Classification is a data mining technique used to predict relationships between data in a dataset. Prediction is performed by assigning data to one of several classes on the basis of certain factors. Classification is a form of supervised learning, in which the training data are already labeled when supplied as input. It is an empirical approach that can be applied to short-term weather prediction. The most widely used classification algorithms are Classification Tree, Naive Bayes and K-Nearest Neighbors. In this study, the author used these three algorithms to predict rain, with the Brier Score, Confusion Matrix and ROC curves as validation measures. The input data are synoptic observations from Kemayoran Meteorological Station, Jakarta (96745), covering 10 years (2006-2015) and comprising 3528 data records and 8 attributes. After data processing, attribute selection and model testing, the Naive Bayes algorithm shows the best accuracy, 77.1%, which falls in the fair classification category, so it has reasonable potential for operational use. The dominant weather attributes in rain formation are average relative humidity (RHavg), minimum temperature (Tmin), maximum temperature (Tmax), average temperature (Tavg) and wind direction (ddd).
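As a minimal sketch of two of the validation measures named above, the Python below computes a Brier Score and a binary confusion matrix for a rain / no-rain forecast. The function names, the 0.5 decision threshold and the sample values are illustrative assumptions; they are not taken from the study and do not reproduce its data or results.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and outcome (1 = rain, 0 = no rain)."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def confusion_matrix(predicted, observed):
    """Count true/false positives and negatives for binary labels (1 = rain)."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for pred, obs in zip(predicted, observed):
        if pred == 1 and obs == 1:
            counts["tp"] += 1
        elif pred == 1 and obs == 0:
            counts["fp"] += 1
        elif pred == 0 and obs == 1:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def accuracy(cm):
    """Fraction of correct predictions: (TP + TN) / total."""
    total = cm["tp"] + cm["fp"] + cm["fn"] + cm["tn"]
    return (cm["tp"] + cm["tn"]) / total


# Hypothetical forecast probabilities of rain and observed outcomes (not study data).
probs = [0.9, 0.2, 0.7, 0.4, 0.1]
observed = [1, 0, 1, 1, 0]

# Assumed decision threshold of 0.5 to turn probabilities into class labels.
predicted = [1 if p >= 0.5 else 0 for p in probs]

cm = confusion_matrix(predicted, observed)
print("Brier score:", brier_score(probs, observed))
print("Confusion matrix:", cm)
print("Accuracy:", accuracy(cm))
```

A lower Brier Score indicates better-calibrated probability forecasts (0 is perfect), while the accuracy derived from the confusion matrix depends on the chosen threshold. An ROC curve would be obtained by sweeping that threshold and recording the resulting true-positive and false-positive rates.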