Understanding k-means clustering

Is there a white paper or article about k-means clustering in the CCEA software that covers the “highest reproducibility” approach with different types of starting points? I found the white paper on the ensemble approach and its benefits over “highest reproducibility” k-means, but I am looking for something that would “defend” k-means in general, so to speak, when it is done properly with numeric or attitudinal scale data. Maybe there is an earlier paper, from before Ensemble was added to the software? Or even a paper that compares k-means and latent class clustering and discusses the benefits of each in different situations and with different types of data?
asked Apr 14, 2017 by JKincaid Bronze (1,035 points)
retagged Apr 14, 2017 by Walter Williams

1 Answer

+1 vote
Much of the content from the original CCA white paper is included in this document: http://www.sawtoothsoftware.com/download/techpap/ccea_manual.pdf

CCA is a particularly robust version of k-means: it runs from multiple sets of starting seeds and looks for the answers on which those runs converge. This document does a good job of describing that process.
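
For readers who want to see the general idea in code, here is a minimal sketch of k-means run from several random sets of starting seeds, keeping the run that agrees most with the others. It is not Sawtooth Software's implementation: the function names (kmeans, rand_index, most_reproducible), the use of the Rand index as the agreement measure, the random choice of seeds, and the toy data are all assumptions made for illustration. The manual linked above describes what CCA actually does.

```python
import random
from itertools import combinations

def kmeans(points, k, rng, max_iter=100):
    """Plain Lloyd's k-means from one random set of starting seeds."""
    centers = rng.sample(points, k)  # assumption: seeds are k randomly drawn data points
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def rand_index(a, b):
    """Share of point pairs that two partitions treat the same way."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

def most_reproducible(points, k, n_starts=10, seed=0):
    """Run k-means n_starts times; return the run that best agrees with the rest."""
    rng = random.Random(seed)
    runs = [kmeans(points, k, rng) for _ in range(n_starts)]

    def avg_agreement(i):
        others = [rand_index(runs[i], runs[j]) for j in range(n_starts) if j != i]
        return sum(others) / len(others)

    best = max(range(n_starts), key=avg_agreement)
    return runs[best], avg_agreement(best)

def make_demo_data(seed=1):
    """Hypothetical toy data: three well-separated 2-D blobs of 20 points each."""
    rng = random.Random(seed)
    blobs = [(0, 0), (10, 10), (0, 10)]
    return [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
            for cx, cy in blobs for _ in range(20)]

points = make_demo_data()
labels, score = most_reproducible(points, k=3)
print("average agreement of the chosen run:", round(score, 3))
```

The Rand index is used here because it compares two partitions without having to match cluster labels between runs. A low average agreement for the chosen run would suggest that the cluster structure is not stable for that number of groups.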
answered Apr 14, 2017 by Keith Chrzan Platinum Sawtooth Software, Inc. (102,700 points)