A divide-and-merge methodology for clustering
2005
We present a divide-and-merge methodology for clustering a set of objects that combines a top-down "divide" phase with a bottom-up "merge" phase. In contrast, previous algorithms either use top-down or bottom-up methods to construct a hierarchical clustering or produce a flat clustering using local search (e.g., k -means). Our divide phase produces a tree whose leaves are the elements of the set. For this phase, we use an efficient spectral algorithm. The merge phase quickly finds an optimal tree-respecting partition for many natural objective functions, e.g., k -means, min-diameter, min-sum, correlation clustering, etc., We present a meta-search engine that uses this methodology to cluster results from web searches. We also give empirical results on text-based data where the algorithm performs better than or competitively with existing clustering algorithms.
Keywords:
- Correlation clustering
- Fuzzy clustering
- FLAME clustering
- Cluster analysis
- k-medians clustering
- Complete-linkage clustering
- Canopy clustering algorithm
- Machine learning
- CURE data clustering algorithm
- Pattern recognition
- Artificial intelligence
- Mathematics
- Theoretical computer science
- Constrained clustering
- Data mining
- Single-linkage clustering
- Data stream clustering
- Computer science
- Correction
- Source
- Cite
- Save
- Machine Reading By IdeaReader
44
References
103
Citations
NaN
KQI