Segmentation over 3 different MaxDiffs


I am running three different MaxDiff exercises, on three different themes.
I will thus be running the HB analyses separately.

However, I think it is more interesting to have 1 segmentation based on all three themes than to have 3 segmentations, 1 per theme.

Is there an option to do this within the Sawtooth software, or is it easier/more logical to go to another software, say Latent Gold or SPSS?

Many thanks
asked Sep 16, 2021 by Tina Van Regenmortel (415 points)

1 Answer

This is an interesting setup, where you have three different MaxDiff exercises on different themes, all completed by each respondent.  At least that's what I assume from your description: that each respondent completed all three MaxDiff exercises.

As I mentioned in a previous post to a separate question you asked today, the recommended approach is latent class MNL rather than cluster analysis when you have choice data like MaxDiff or CBC.  However, your data setup with three separate MaxDiffs doesn't make that automatic with our software.

I have three ideas, the first being hard, and the second two being easier:

1.  The first idea requires extra data processing and setup to do properly using our standalone latent class module.  Let's imagine each of your MaxDiff exercises has 21 items in it, so that it is coded with k-1 independent variables in the design matrix (20 columns).  You would need to stack the three datasets, so that all three exercises are gathered within a single record for each respondent.  The stacking would be done such that there are 60 total columns in the design matrix.  I suspect what would make this even more challenging is that you'd want to use effects-coding rather than dummy-coding for the k-1 coding, such that the reference levels for all three experiments would be constrained to have zero utility.  That way, when you pad the stacked matrix with the extra zeros where the missing data are, each MaxDiff exercise would be zero-centered in its utilities with respect to itself and with respect to the other two MaxDiff exercises.

I'm guessing you won't want to go this route because of all the data processing involved.
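
To make the stacking in idea 1 more concrete, here is a minimal Python sketch of how a single coded row might be placed into the 60-column stacked design matrix.  It is only an illustration under stated assumptions, not the file format the standalone latent class module expects: the function names are hypothetical, the last item in each exercise is assumed to be the reference level, and rows for "worst" choices are assumed to be the negated item codes.

```python
K = 21                 # items per exercise (the example figure used above)
N_EXERCISES = 3
COLS = K - 1           # 20 effects-coded columns per exercise


def effects_code(item, k=K):
    """Effects-code one item (0-based index).

    Assumption: the last item is the reference level and is coded
    as -1 in every column, so utilities are zero-centered.
    """
    if item == k - 1:
        return [-1] * (k - 1)
    row = [0] * (k - 1)
    row[item] = 1
    return row


def stacked_row(exercise, item, worst=False):
    """Build one 60-column row for an item shown in a given exercise.

    The exercise's 20 columns go in its own block; the other two
    blocks are padded with zeros.  Assumption: rows belonging to a
    "worst" choice are the negated codes.
    """
    block = effects_code(item)
    if worst:
        block = [-x for x in block]
    row = [0] * (COLS * N_EXERCISES)
    start = exercise * COLS
    row[start:start + COLS] = block
    return row
```

For example, `stacked_row(1, 3)` puts a single 1 in column 23 (0-based) and zeros everywhere else, while `stacked_row(2, 20)` puts -1 in the last 20 columns because item 21 is the reference level of the third exercise.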

2. The second idea is much easier and relies on the cluster ensembling provided in our CCEA software.  First, you run three separate Latent Class MNL segmentations using our Latent Class tool.  Because you'll be doing an ensemble consensus step in the end with CCEA, you should cover a variety of solutions, such as the 2-group through 11-group solutions for each MaxDiff exercise.  That gives you 10 candidate solutions (from 2 to 11 groups) for each of the three exercises.  For each latent class run, use the data output in which each respondent is discretely assigned to the class with the highest probability of membership.  So, now you have 30 candidate solutions in total, covering all three exercises, with 10 possible segmentations (the 2- through 11-class solutions) per exercise for each respondent.

Next, you assemble in a .CSV file each respondent on each row, with CaseID followed by the 30 latent class MNL segmentation variable assignments.
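
As a rough illustration of that assembly step, the Python sketch below writes one row per respondent with CaseID followed by the 30 assignment columns.  The exercise labels, column names, header row, and the in-memory `assignments` structure are all assumptions made for the example; check the CCEA documentation for the exact layout the Custom Ensemble File requires.

```python
import csv
import io

EXERCISES = ["md1", "md2", "md3"]   # hypothetical labels for the three MaxDiffs
GROUP_COUNTS = range(2, 12)         # the 2- through 11-group solutions


def ensemble_header():
    """CaseID plus one column per (exercise, number of groups) solution."""
    return ["CaseID"] + [f"{ex}_g{g}" for ex in EXERCISES for g in GROUP_COUNTS]


def write_ensemble_csv(assignments, out):
    """Write one row per respondent to the open text file `out`.

    assignments: {case_id: {(exercise, n_groups): class_number}}
    Assumption: a header row is acceptable in the target file.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(ensemble_header())
    for case_id in sorted(assignments):
        solutions = assignments[case_id]
        writer.writerow(
            [case_id]
            + [solutions[(ex, g)] for ex in EXERCISES for g in GROUP_COUNTS]
        )
```

In practice you would open the output with `open("ensemble.csv", "w", newline="")` (the file name is just an example) and fill `assignments` from the class-membership output of each latent class run.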

Take that .CSV file as a "Custom Ensemble File" (it's described in the CCEA documentation) into the Ensemble Analysis mode of our CCEA software.  You just tell CCEA that you are doing Ensemble Analysis and that you want to use your custom ensemble .CSV file, and you point the software at that .CSV file.  Call tech support if you cannot figure out from the documentation how to do this.  You then run an ensemble analysis on that custom ensemble candidates file, where you ask the CCEA software to give you consensus solutions for, say, 2 through 11 segments.

3.  The quick way to get nearly the same results (but not as good as the two options above) would be to run HB separately on each of the MaxDiff exercises.  Then, use normalized utilities within each exercise, such that the range of utilities is held constant across respondents (such as a range from -50 to +50, or 0 to 100).  Then, take the three sets of utilities for each respondent as basis variables and submit them to cluster analysis, such as Sawtooth Software's Convergent Cluster & Ensemble Analysis (CCEA)...or your favorite cluster package.
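
A minimal Python sketch of the normalization in idea 3 follows.  It uses a simple min-max rescaling to a 0 to 100 range within each exercise, which is one way to hold the range constant and is not necessarily identical to the rescaling options built into Sawtooth Software's tools.  The function names and the handling of a respondent with completely flat utilities are assumptions for illustration.

```python
def rescale_0_100(utilities):
    """Rescale one respondent's utilities for one exercise to span 0 to 100."""
    lo, hi = min(utilities), max(utilities)
    if hi == lo:
        # Assumption: a respondent with flat utilities gets the midpoint.
        return [50.0] * len(utilities)
    return [100.0 * (u - lo) / (hi - lo) for u in utilities]


def basis_variables(utils_by_exercise):
    """Concatenate the rescaled utilities from each exercise.

    utils_by_exercise: one list of HB utilities per MaxDiff exercise,
    all for the same respondent.  The result is that respondent's row
    of basis variables for cluster analysis.
    """
    basis = []
    for utilities in utils_by_exercise:
        basis.extend(rescale_0_100(utilities))
    return basis
```

Each respondent's `basis_variables(...)` row can then be written out alongside CaseID and submitted to whichever cluster package you prefer.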
answered Sep 16, 2021 by Bryan Orme Platinum Sawtooth Software, Inc. (198,515 points)