An online software for decision tree classification and visualization using c4.5 algorithm (ODTC)

2014 
Classification is an important and widely performed data mining task. It is a predictive modelling task, defined as building a model of the target variable as a function of the explanatory variables. Among the many well-established classification techniques, the decision tree is one of the most important and popular in the machine learning domain. A decision tree is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs and utility. C4.5 is a well-known decision tree algorithm used for classifying datasets. It is Quinlan's extension of his own ID3 algorithm for decision tree classification. It induces decision trees and generates rules from datasets that may contain categorical and/or numerical attributes. The rules can be used to predict categorical attribute values for new records. C4.5 performs well both in classifying datasets and in generating useful rules. This paper discusses a web-based software for rule generation and decision tree induction using the C4.5 algorithm. Visualizing the result as a tree structure enhances understanding of the generated rules. The software includes a feature to impute missing values in the data. The input data can be both categorical and numerical. The software can import data files in TXT, XLS and CSV formats. An enhanced waterfall model was used for the software development process. The software will be useful for academicians, researchers and students working in data mining, agriculture and other fields where huge amounts of data are generated.
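To make the tree induction step more concrete, the following is a minimal Python sketch of the gain ratio, the criterion C4.5 uses to choose which attribute to split on. It is an illustration only and is not taken from the ODTC software: the function names `entropy` and `gain_ratio` and the toy `rows` dataset are hypothetical, and the sketch covers categorical attributes only (no numerical thresholds, missing values or pruning).

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Gain ratio of splitting `rows` (a list of dicts) on a categorical attribute."""
    total = len(rows)
    # Group the class labels by the value the attribute takes in each row.
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row[target])
    # Information gain: entropy before the split minus weighted entropy after it.
    before = entropy([row[target] for row in rows])
    after = sum(len(p) / total * entropy(p) for p in partitions.values())
    # Split information penalizes attributes with many distinct values.
    split_info = entropy([row[attribute] for row in rows])
    if split_info == 0:
        return 0.0  # attribute has a single value, so it cannot split the data
    return (before - after) / split_info


# Hypothetical toy dataset, for illustration only.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
]

# Pick the attribute with the highest gain ratio as the root split.
best = max(["outlook", "windy"], key=lambda a: gain_ratio(rows, a, "play"))
print(best)
```

On this toy data `outlook` separates the classes perfectly while `windy` carries no information, so `outlook` is selected. A full C4.5 implementation would apply the same selection recursively to each partition and then derive one rule per root-to-leaf path.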