Latent Class Analysis for MaxDiff


I am starting to analyze my MaxDiff data (Profile Case 2) using Latent Class. I have the option of using Lighthouse Studio or the standalone program; which one do you recommend?

Also, I do not understand the tables titled "Display re-scaled utilities and attribute importances." How are they calculated, and why are there no values for the reference level in this table?

I saw some results reporting the Mean Absolute Error (MAE) as one of the measures of model fit. Can this be calculated from the reported results?

Finally, if I produce the raw data or the rescaled data (zero-centered), I would like to convert the values to a scale where the maximum value is 1 (the value of adding the best 6 levels of the 6 dimensions) and the minimum value is 0 (the value of adding the worst 6 levels of the 6 dimensions). How can I do that, and what are the best reference values to use?
(I am doing this because I will rescale it using another cardinal scale, so that the scale runs from 0 to 1 with 1 expressing full health and 0 expressing death.)
Can this process be applied for the HB results too?

Sorry for the multiple questions

asked Jan 17, 2020 by AMYN Bronze (2,980 points)
edited Jan 17, 2020 by AMYN

1 Answer

+1 vote
Using Lighthouse Studio's latent class or the standalone Latent Class Module mainly depends on whether you need to do something customized (by modifying the .CHO file manually) or not.  The math is the same.  But, if you use the Latent Class standalone system, it is reading in the .CHO file that you export from the Lighthouse Studio system.  The .CHO file is dummy-coded, and the reporting you'll see from the Latent Class standalone doesn't show the utility of the reference level (it isn't aware of the presence of a reference level...it just knows about the columns that were fed to it and are in the design matrix).  You just need to remember that the reference level has a zero utility, since MaxDiff data are coded for analysis using dummy coding, where the reference level is a vector of 0s in the design matrix.
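To make the dummy-coding point concrete, here is a minimal sketch (the 4-item list and the function are hypothetical, not Sawtooth's actual export code) showing why the reference level never appears as a column of its own:

```python
import numpy as np

def dummy_code(item, n_items=4):
    """Return the dummy-coded row for an item (1-indexed).

    Only n_items - 1 columns exist; the last item is the
    reference level and codes to a vector of all zeros."""
    row = np.zeros(n_items - 1)
    if item < n_items:
        row[item - 1] = 1.0
    return row

print(dummy_code(2))  # [0. 1. 0.] -- item 2 has its own column
print(dummy_code(4))  # [0. 0. 0.] -- the reference level is all zeros
```

Because the estimation software only ever sees those three columns, it can only report three utilities; the fourth (reference) utility is zero by construction and must be filled in by hand.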

MAE (Mean Absolute Error) is a measure of fit when you are using a model to predict choice probabilities for held-out data (e.g., held-out choice tasks).  It's rarely the case that MaxDiff practitioners have holdout choice tasks, but it would be possible if the practitioner planned for this ahead of time and included holdout choice tasks in the questionnaire.
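If you do have a holdout task, the MAE calculation is straightforward. Here is a hedged sketch (the utilities and observed shares below are made-up illustration values, not from any real study): compute logit shares of preference for the holdout items from the estimated utilities, then average the absolute differences from the observed choice shares.

```python
import numpy as np

def logit_shares(utilities):
    """Logit (share of preference) probabilities for one holdout task."""
    e = np.exp(utilities)
    return e / e.sum()

# Hypothetical holdout task with 4 items: aggregate utilities from the
# model, and the observed fraction of respondents choosing each item.
pred = logit_shares(np.array([0.8, 0.1, -0.3, -0.6]))
observed = np.array([0.46, 0.24, 0.18, 0.12])

mae = np.mean(np.abs(pred - observed))
print(round(mae, 4))
```

For Latent Class, the predicted shares would typically be the class-specific shares weighted by class sizes; for aggregate logit, a single set of utilities is used as above.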

From other posts you've made, it seems you are using best-worst conjoint (profile case MaxDiff, or Best-Worst Case 2).  In that case, there is only one reference level set to zero utility across all attributes.  One of the interesting outcomes of Best-Worst Case 2 is that the utilities for levels between conjoint attributes can be compared...as opposed to the standard conjoint or CBC case, in which they cannot.  Therefore, adding the worst levels of each attribute within a Best-Worst Case 2 study doesn't mean adding zeros.  It means adding the utilities for the worst levels within each attribute, which are not necessarily zero.

So, if you want the worst product concept to sum to zero across its attribute levels, then you'll need to find the intercept to add to all the utilities of the levels such that the sum of the worst levels across attributes is zero.  If you have multiple latent classes, then you'll want to find this intercept for each of the classes.  If you are doing 1 class (aggregate logit), then there is only one class and one rescaling intercept.

Next, you want the utilities rescaled such that the best levels across all attributes sum to 1.0.  That means you need to take the utilities from the previous step (paragraph directly above) and multiply them by the factor such that the best product has a sum of utilities of 1.0.

Please note that after doing this rescaling it is no longer appropriate to use the logit equation to compute the likelihood of choosing one product concept versus the other!  The scale factor has now been modified from its original scale that had been based on the choice probabilities expressed by respondents in the questionnaire.

This process could be applied using HB utilities, by rescaling at the individual level.
answered Jan 20, 2020 by Bryan Orme Platinum Sawtooth Software, Inc. (201,565 points)
Oops, I just realized that the rescaling I proposed at the bottom of the post above won't work to make the worst product 0 and the best product 1.

Rather, you should compute the best product and worst product's utilities according to the scaling from the latent class utility run or the HB utility run.  Let's say you get -1.5 and +3.5 for the worst possible and best possible product concepts.  

Next, to rescale utilities for these two products (or any products in between) to have a range of 0.0 to 1.0, just figure out what % of the way the new product is between -1.5 and 3.5.  For example, if you get a result of 1.0, then you know that 1.0 is 50% of the way between -1.5 and 3.5.  So, the rescaled utility for that product concept (on the 0.0 to 1.0 scale) is 0.5.
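The corrected "percentage of the way" rescaling above can be sketched as a simple min-max transformation, using Bryan's example values (worst possible product = -1.5, best possible product = +3.5):

```python
def rescale(total_utility, worst, best):
    """Map a product's summed utility onto the 0-1 range, where the
    worst possible product maps to 0 and the best possible to 1."""
    return (total_utility - worst) / (best - worst)

worst, best = -1.5, 3.5
print(rescale(1.0, worst, best))    # 0.5 -- halfway between -1.5 and 3.5
print(rescale(worst, worst, best))  # 0.0
print(rescale(best, worst, best))   # 1.0
```

For HB, the same function would be applied per respondent, using each respondent's own best-possible and worst-possible product utilities as `best` and `worst`.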
Thank you again, Bryan, for your suggestions and answers; they are all helpful.
Commenting on the MAE: I have a holdout task in my questionnaire that all respondents answered but that was not included in the model estimation. I believe this is what you are referring to. How can this be used to calculate the MAE for Latent Class or logit analysis?
For the rescaling, I see that you suggest calculating the utilities of the best and worst products and then finding the location of the other products in between. This way I won't be able to find the individual values of the levels unless I calculate them again as a percentage of the total utility of the product, and that would be a very lengthy process, especially for the HB individual-level values.
I found in a published paper a suggestion of a "linear transformation," which uses the following modification:
[V - (X/6)]/Y
X = the utility of the best product; Y = the utility of the worst product; V = the utility of a certain level; 6 is the number of dimensions in my product.
What do you think about this approach? It would be calculated at the individual level for HB, the class level in LC, and the aggregate level in MNL.
N.B. All the assumptions are correct.
Hi again,

I found that you have addressed all my questions except this one:

Can you explain the tables titled "Display re-scaled utilities and attribute importances"? I am not sure how they are calculated, and why are there no values for the reference level in this table?

If you are using the standalone Latent Class program or the standalone CBC/HB program to re-analyze MaxDiff data, then the data matrix you feed to the software is just a series of variables representing the contrasts of each level with the reference level (the last level) which is assumed to be zero.  The software doesn't understand where this data came from or the nature of the dummy-coding procedure or which level was the reference level.  (The software can accept a data matrix of independent variables from any source.)  So, it just reports to you the utilities associated with the explicit columns in the independent variable matrix.  That's why the reference level (the last level) is missing from the report.  The software doesn't know about the structure of your study and that the last item in your MaxDiff list is actually that reference level that should get a utility of zero.  Thus, you need to fill in that missing item with its zero utility after running the report.

Rescaled utilities are calculated such that the range of scores for each attribute (assuming it is a conjoint design where each attribute has multiple levels) averages 100 points.
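A minimal sketch of that rescaling (the attribute structure below is hypothetical; this assumes the rule as stated, i.e., one multiplicative factor chosen so the per-attribute utility ranges average 100 points):

```python
import numpy as np

def rescale_to_100(utils_by_attr):
    """utils_by_attr: list of arrays, one array of level utilities
    per attribute.  Multiplies all utilities by the single factor
    that makes the average per-attribute range equal 100 points."""
    ranges = [u.max() - u.min() for u in utils_by_attr]
    factor = 100.0 / np.mean(ranges)
    return [u * factor for u in utils_by_attr]

raw = [np.array([0.9, 0.1, -1.0]),  # attribute 1 (range 1.9)
       np.array([0.5, -0.5])]       # attribute 2 (range 1.0)
scaled = rescale_to_100(raw)
new_ranges = [u.max() - u.min() for u in scaled]
print(np.mean(new_ranges))  # 100.0
```

Remember to insert the zero utility for the reference level before computing the range of its attribute; otherwise that attribute's range (and the resulting scale factor) will be wrong.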