Online transfer learning with multiple decision trees

2019 
Online learning techniques have been widely used in many fields where instances come one by one. However, in early stage of a data stream, online learning models cannot exhibit good classification accuracy for it cannot collect sufficient instances to learn. For example, a well-known online learning algorithm named as very fast decision tree (VFDT) needs to wait for Hoeffding bound satisfied to split, which leads to poor classification accuracy at the beginning of data stream. Thus, VFDT may not be appropriate for some real applications which demand us a fast and accurate online detection. This situation will become more serious in the scenario of data stream classification with concept drift. This paper attempts to take transfer learning algorithm to make up this shortcoming of VFDT. To achieve this goal, a new decision tree method named as VFDT-D is first proposed to cache instances in its leaf nodes to handle numerical attributes and adapt to a framework of online transfer learning (OTL), and then a measure which considers tree path, classification accuracy and classification confidence is proposed to evaluate the local similarity between source and target domain classifiers. At last, a multiple-source online transfer learning algorithm named as DMOTL is proposed to take VFDT-D as base classifier and use the proposed measure of local similarity to select the optimal source domain classifier to help transfer learning. The extensive experiments on several synthetic and real-world datasets demonstrate the advantage of the proposed algorithm.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    27
    References
    1
    Citations
    NaN
    KQI
    []