If running factor analysis for MaxDiff items, which scores should I use?

Hi experts,

I would like to run a factor analysis on the items used in a MaxDiff exercise. Could you please advise which scores I should use for the factor analysis: the rescaled scores or the raw scores?

Thank you so much for sharing your knowledge on this.
asked Jul 17, 2012 by ericdee Bronze (1,785 points)
retagged Sep 13, 2012 by Walter Williams

1 Answer

+1 vote
A problem with raw scores is that the range of scores (their magnitude or scale) is quite different for different people. The rescaled scores on the 0-100 scale have a constant sum per respondent (they sum to 100). My inclination is that these normalized scores are more appropriate for cluster or factor analysis than the raw scores.
answered Jul 17, 2012 by Bryan Orme Platinum Sawtooth Software, Inc. (199,115 points)
Hi. But some people say that factor analysis should not be run because the variance-covariance matrix is not invertible. Since I am not a statistician, I am getting confused.
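I think the quick fix for that is to omit one of the variables from the analysis if using factor or principal components analysis. Because the scores sum to a constant for each respondent, any one item is fully determined by the others, and that redundancy is what makes the matrix non-invertible. Cluster analysis will be fine using all scores summing to 100.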
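To make the invertibility point concrete, here is a minimal Python sketch using NumPy. The data are simulated and purely hypothetical (200 respondents, 6 items, positive scores rescaled to sum to 100 per respondent); they are not real MaxDiff output, and the simple proportional rescaling is only a stand-in for whatever rescaling your software applies. It shows that the covariance matrix of constant-sum scores loses one rank, and that dropping one item restores full rank.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 respondents x 6 items, positive values
# rescaled so that each respondent's scores sum to 100.
raw = rng.gamma(shape=2.0, scale=1.0, size=(200, 6))
scores = 100.0 * raw / raw.sum(axis=1, keepdims=True)

# Covariance across items using all 6 columns.
cov_full = np.cov(scores, rowvar=False)
rank_full = np.linalg.matrix_rank(cov_full, tol=1e-8)

# Drop one item (here the last column) and recompute.
reduced = scores[:, :-1]
cov_reduced = np.cov(reduced, rowvar=False)
rank_reduced = np.linalg.matrix_rank(cov_reduced, tol=1e-8)

print("items:", scores.shape[1], "rank with all items:", rank_full)
print("items:", reduced.shape[1], "rank after dropping one:", rank_reduced)
```

With all six items the rank is one less than the number of items, so the matrix cannot be inverted; with one item omitted the reduced matrix has full rank. Which item you drop is a judgment call, since the omitted item's information is still implied by the remaining ones.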
OK, thanks for the clarification.
Since you brought up cluster analysis, I want to clarify: can k-means clustering be run on raw utilities as well, or are you saying it should be used only on rescaled scores?
K-means should NOT be run on the raw utilities. That would be a very bad thing to do (due to potentially huge differences in the scale factor across people). K-means should be run on normalized utilities from MaxDiff or CBC.
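A small Python sketch of why scale differences matter for distance-based methods such as k-means. The three respondents below are made up, and `normalize_range` is a hypothetical stand-in normalization (zero-center each respondent, then scale so the range is 100); it is an assumption for illustration and not necessarily the exact rescaling Sawtooth Software applies.

```python
import numpy as np

def normalize_range(u):
    """Hypothetical normalization: zero-center each row, scale range to 100."""
    centered = u - u.mean(axis=1, keepdims=True)
    spread = centered.max(axis=1, keepdims=True) - centered.min(axis=1, keepdims=True)
    return 100.0 * centered / spread

def dist(x, i, j):
    """Euclidean distance between rows i and j (the metric k-means relies on)."""
    return float(np.linalg.norm(x[i] - x[j]))

base = np.array([1.0, 0.5, 0.0, -0.5, -1.0])
raw = np.vstack([
    base,        # respondent A
    3.0 * base,  # respondent B: same preferences as A, larger scale factor
    -base,       # respondent C: opposite preferences to A
])

raw_ab, raw_ac = dist(raw, 0, 1), dist(raw, 0, 2)

norm = normalize_range(raw)
norm_ab, norm_ac = dist(norm, 0, 1), dist(norm, 0, 2)

print("raw:        A-B =", round(raw_ab, 3), " A-C =", round(raw_ac, 3))
print("normalized: A-B =", round(norm_ab, 3), " A-C =", round(norm_ac, 3))
```

On the raw utilities, A is just as far from B (identical preferences, different scale) as from C (opposite preferences), so a distance-based clustering cannot tell the two situations apart. After normalization, A and B coincide and C stands clearly apart.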

And, FYI, you might want to look into Cluster Ensemble Analysis (our CCEA software), which is just as easy to run as K-means but usually produces better results. If you already have a license to that, it would be a shame not to use it.

Also, it's worth noting that many researchers prefer to directly use Latent Class to estimate the scores and find the segments in a simultaneous process, rather than a 2-stage approach of first estimating scores via HB and secondly submitting those scores to K-means or CCEA.