Instability of Hierarchical Cluster Analysis Due to Input Order of the Data: The PermuCLUSTER Solution

2005 
Hierarchical agglomerative cluster analysis (HACA) may yield different solutions under permutations of the input order of the data. This instability is caused by ties, either in the initial proximity matrix or arising during agglomeration. The authors recommend repeating the analysis on a large number of random permutations of the rows and columns of the proximity matrix and selecting a solution with the highest goodness-of-fit. This approach was implemented in an SPSS add-in, PermuCLUSTER, which can perform all HACA methods of SPSS. Analyses of two data sets show that (a) results are affected by input order, (b) instability in one method co-occurs with instability in other methods, and (c) some instability effects are more dramatic because they occur at higher agglomeration levels.
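The permutation strategy described in the abstract can be illustrated with a minimal Python sketch. This is not the PermuCLUSTER implementation. It assumes complete linkage, a naive agglomeration that breaks ties by taking the first minimal pair in scan order, and the cophenetic correlation as the goodness-of-fit measure; the abstract does not say which measure the authors use. The function names and the four-object toy matrix `D` are hypothetical and were chosen so that ties occur both in the initial matrix and during agglomeration.

```python
import math
import random
from itertools import combinations


def agglomerate(D, linkage=max):
    """Naive HACA on a symmetric distance matrix D (list of lists).

    linkage=max gives complete linkage, linkage=min gives single linkage.
    Ties are broken by keeping the first minimal pair in scan order, which
    is what makes the outcome depend on input order. Returns the cophenetic
    matrix: C[i][j] is the height at which objects i and j were merged.
    """
    n = len(D)
    clusters = [[i] for i in range(n)]
    C = [[0.0] * n for _ in range(n)]
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            h = linkage(D[i][j] for i in clusters[a] for j in clusters[b])
            if best is None or h < best[0]:  # strict "<": first tie wins
                best = (h, a, b)
        h, a, b = best
        for i in clusters[a]:
            for j in clusters[b]:
                C[i][j] = C[j][i] = h
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return C


def cophenetic_correlation(D, C):
    """Pearson correlation between input and cophenetic distances."""
    pairs = list(combinations(range(len(D)), 2))
    x = [D[i][j] for i, j in pairs]
    y = [C[i][j] for i, j in pairs]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)


def fit_for_order(D, p, linkage=max):
    """Cluster D with rows and columns reordered by p; return (fit, C).

    C is mapped back to the original object order so that solutions from
    different permutations are comparable.
    """
    n = len(D)
    Dp = [[D[p[r]][p[c]] for c in range(n)] for r in range(n)]
    Cp = agglomerate(Dp, linkage)
    C = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            C[p[r]][p[c]] = Cp[r][c]
    return cophenetic_correlation(D, C), C


def best_over_permutations(D, n_perm=200, linkage=max, seed=0):
    """Try n_perm random input orders and keep the best-fitting solution."""
    rng = random.Random(seed)
    best_fit, best_C, fits = -math.inf, None, set()
    for _ in range(n_perm):
        p = list(range(len(D)))
        rng.shuffle(p)
        fit, C = fit_for_order(D, p, linkage)
        fits.add(round(fit, 9))
        if fit > best_fit:
            best_fit, best_C = fit, C
    return best_fit, best_C, sorted(fits)


# Hypothetical toy data: four objects at positions 0, 1, 2 and 4 on a line,
# so d(0,1) = d(1,2) = 1 is tied in the initial matrix.
D = [[0, 1, 2, 4],
     [1, 0, 1, 3],
     [2, 1, 0, 2],
     [4, 3, 2, 0]]

if __name__ == "__main__":
    best_fit, best_C, fits = best_over_permutations(D)
    print("distinct fit values:", fits)
    print("best fit:", best_fit)
```

On this toy matrix, the input order 0, 1, 2, 3 and the input order 3, 0, 1, 2 lead to different dendrograms with different cophenetic correlations, which is the kind of order dependence the abstract describes. Searching over permutations and keeping a best-fitting solution removes the dependence on whichever order the data happened to arrive in.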