Kernel learning method for distance-based classification of categorical data
2014
Kernel-based methods have become popular in machine learning; however, they are typically designed for numeric data. These methods are defined on vector spaces, a structure that categorical data lack. In this paper, we propose a new kind of kernel trick, showing that mapping categorical samples into a kernel space can equivalently be described as assigning a kernel-based weight to each categorical attribute of the input space, so that common distance measures can be employed. A data-driven approach to kernel bandwidth selection is then proposed, based on optimizing the feature weights. We also use the kernel-based distance measure to extend nearest-neighbor classification to categorical data. Experimental results on real-world data sets show that this approach clearly outperforms classification in the original input space.
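The idea of weighting each categorical attribute and then applying an ordinary distance measure inside a nearest-neighbor classifier can be illustrated with a minimal sketch. This is not the paper's formulation: the abstract does not give the kernel-based weight formula or the bandwidth-selection procedure, so the `weights` below are hypothetical hand-picked placeholders, the distance is a plain weighted mismatch (Hamming-style) count, and the function names and toy data are invented for illustration.

```python
from collections import Counter


def weighted_mismatch_distance(x, y, weights):
    """Sum the weights of the attributes on which x and y disagree."""
    return sum(w for a, b, w in zip(x, y, weights) if a != b)


def knn_predict(train_X, train_y, query, weights, k=3):
    """Classify `query` by majority vote among its k nearest training samples."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: weighted_mismatch_distance(pair[0], query, weights),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy categorical data (hypothetical): (color, shape, size) -> label.
train_X = [
    ("red", "round", "small"),
    ("red", "round", "large"),
    ("green", "long", "small"),
    ("yellow", "long", "large"),
]
train_y = ["apple", "apple", "bean", "banana"]

# Placeholder per-attribute weights; in the paper these would be derived
# from the kernel and its learned bandwidths, which are not shown here.
weights = [0.5, 0.4, 0.1]

prediction = knn_predict(train_X, train_y, ("red", "round", "small"), weights, k=3)
```

With these placeholder weights, a mismatch on color or shape moves two samples much further apart than a mismatch on size, which is the effect per-attribute weighting is meant to achieve.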