Hit rate in Latent Class Analysis

I conducted a latent class analysis for a CBC study. To find the best solution I considered the lowest BIC and CAIC values, which leads to a five-group solution.
Now I want to test the predictive accuracy of this solution using three holdout sets and the hit rate.

Is it appropriate in latent class analysis to calculate the hit rate for each respondent (get individual-level results in the simulator for latent class, compare them with the actual choices of each respondent, then calculate the overall hit rate in Excel)? Does it make sense to calculate the hit rate here?
Or do I have to look at the percentages of actual choices for each segment, compare them with the predicted results in the simulator, and then calculate the MAE?
Is there a rule of thumb for an appropriate value of the MAE or the hit rate for a reasonably good-fitting model?
Thank you very much for your help.
asked Mar 28, 2017 by Sophia

1 Answer

0 votes
Nice work to obtain a latent class solution and find that BIC and CAIC are minimized (lower is better) at a 5-group solution.

Latent class (especially low-dimensionality solutions like a 5-group solution) is not a very good way to obtain good individual-level utilities.  In other words, computing hit rates for held-out tasks at the individual level using low-dimensionality latent class utilities is inferior to using HB utilities for the same purpose.

Our software creates pseudo individual-level utilities for each respondent from latent class runs by taking the weighted average of the group utilities for each respondent, where the weights are each respondent's probability of belonging to each group.  Indeed, you could export those pseudo individual-level utilities from our software to a .CSV file to separately compute hit rates for held out tasks, but results usually will be inferior to HB.
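To make the weighted-average idea concrete, here is a minimal Python sketch with made-up numbers (the segment utilities, membership probabilities, and holdout design are all hypothetical, not from any real study or from our software's actual file formats):

```python
# Hypothetical 2-segment latent class solution with 4 part-worth utilities each.
group_utils = [
    [1.0, -1.0, 0.5, -0.5],   # segment 1
    [-0.5, 0.5, 1.0, -1.0],   # segment 2
]

# Each respondent's probability of belonging to each segment.
membership = [
    [0.9, 0.1],
    [0.2, 0.8],
]

# Pseudo individual-level utilities: the segment utilities averaged with each
# respondent's membership probabilities as weights.
pseudo_utils = [
    [sum(p * group_utils[g][k] for g, p in enumerate(probs)) for k in range(4)]
    for probs in membership
]

# One fixed holdout task: each concept is a 0/1 design row over the 4 utilities.
holdout_design = [
    [1, 0, 1, 0],   # concept 1
    [0, 1, 0, 1],   # concept 2
    [1, 0, 0, 1],   # concept 3
]
actual_choices = [0, 2]  # index of the concept each respondent actually chose

# First-choice rule: predicted choice is the concept with the highest utility.
hits = 0
for utils, actual in zip(pseudo_utils, actual_choices):
    totals = [sum(u * x for u, x in zip(utils, row)) for row in holdout_design]
    predicted = totals.index(max(totals))
    hits += predicted == actual

hit_rate = hits / len(actual_choices)
print(hit_rate)  # 0.5 with these made-up numbers
```

The same hit-rate arithmetic applies to real exported utilities; only the data loading changes.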

Three holdout sets provide some data for holdout validation, true.  But research by Keith Chrzan here at Sawtooth Software shows that only 3 holdouts isn't enough to do a reliable job of comparing different model specifications and identifying the winning one.  Still, three holdouts is certainly enough to obtain some face-validity confidence that the utilities are predicting the held-out tasks pretty well.

Other than computing hit rates at the individual level, there is the approach of using the utilities to predict summary shares of preference for the respondents and comparing those to the summary counts of choices of the (fixed, meaning the same task was asked of all respondents) holdout alternatives across the holdout sets.  Often researchers compute MAE (Mean Absolute Error) or MSE (Mean Squared Error) to quantify how well the predictions fit the holdout tasks.

Latent class isn't great for individual-level hit rates, but it is very appropriate to use market simulators built on the backbone of latent class utilities to predict summary shares of choice for (fixed) held out tasks and to compute MAE as the loss function.  You don't need to build predictions for each separate latent class segment.  Our market simulator just applies the (say) 5-group solution by creating pseudo individual-level utilities as I earlier described, and summarizes the results for the sample across the individual-level share predictions.  Such latent class simulators often match or occasionally exceed the predictive quality of simulators built using HB utilities.
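As an illustration of what such a simulator does under the hood, here is a sketch in Python using a share-of-preference (logit) rule on hypothetical per-concept utilities (the numbers are invented for the example, and a real simulator handles much more, e.g. RFC's attribute-level error):

```python
import math

# Made-up pseudo individual-level utilities of 3 holdout concepts
# for 2 respondents.
concept_utils = [
    [1.4, -1.4, 0.3],
    [0.7, -0.7, -1.1],
]

def logit_shares(utils):
    """Share-of-preference (logit) rule: exponentiate and normalize."""
    exps = [math.exp(u) for u in utils]
    total = sum(exps)
    return [e / total for e in exps]

# Per-respondent shares, then averaged across the sample to get the
# summary predicted shares for the holdout task.
per_respondent = [logit_shares(u) for u in concept_utils]
n = len(per_respondent)
summary_shares = [sum(r[k] for r in per_respondent) / n for k in range(3)]
print([round(100 * s, 1) for s in summary_shares])
```

These summary shares are what get compared against the holdout counting proportions when computing MAE.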

The size of the MAE you'll obtain depends not only on the quality of your model and the conjoint data, but also on the sample size and, critically, the number of alternatives per set (#products in the market simulator).  With large sample sizes (n=600 or larger) and few concepts per task (4 or fewer), I like to see an MAE of around 3 absolute percentage points or less.
answered Mar 28, 2017 by Bryan Orme Platinum Sawtooth Software, Inc. (175,415 points)
Thank you for your detailed answer.
Is this the right way to compute the MAE? I get the following actual and predicted preference shares (3 concepts plus a None option) for one fixed holdout set and the 5-group latent class solution:

Actual shares    Predicted shares
20.63%           20.68%
19.73%           19.97%
39.91%           43.73%
19.73%           15.62%

Then I calculate the absolute differences between the percentages (e.g. |20.68% - 20.63%| for the first row), sum all four differences, and divide by 4 (three alternatives plus the None option). In this case I get about 0.02, or 2 percentage points, as the MAE. Is this correct?
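In Python, my calculation would look like this (using the shares from the table above):

```python
# Actual vs. predicted shares from the holdout task, in percentage points.
actual    = [20.63, 19.73, 39.91, 19.73]
predicted = [20.68, 19.97, 43.73, 15.62]

# MAE = mean of the absolute differences across all four alternatives.
mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
print(round(mae, 3))  # 2.055 percentage points, i.e. roughly 2%
```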
For this simulation I used the Randomized First Choice method. If I use the First Choice method I get a much higher MAE and large differences between the percentages. So is it better to use Randomized First Choice to test the fit?

Can I compare the MAE against a random (naive) model to show the good fit of the model?
I would calculate the difference between 25% (the share each alternative would receive under a random model: 1/4) and each actual share, sum the differences, and divide by 4, then compare the two MAEs. In this case: about 0.07. Thus, the five-group latent class model fits well.
Does this reasoning make sense?

Thanks for your support
Regards, Sophia
Your calculation of the MAE is correct.  But you'd typically have multiple holdout tasks, and you should average the MAE across them.
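For instance, averaging per-task MAE across several holdout tasks could look like this in Python (the shares below are made up purely to illustrate the mechanics):

```python
# Hypothetical (actual, predicted) shares for three fixed holdout tasks,
# in percentage points.
tasks = [
    ([20.0, 30.0, 50.0], [22.0, 28.0, 50.0]),
    ([25.0, 25.0, 50.0], [24.0, 27.0, 49.0]),
    ([10.0, 40.0, 50.0], [13.0, 38.0, 49.0]),
]

def task_mae(actual, predicted):
    # Mean absolute difference across the alternatives in one task.
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Overall fit: average the per-task MAEs across the holdout tasks.
overall_mae = sum(task_mae(a, p) for a, p in tasks) / len(tasks)
print(round(overall_mae, 3))
```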

Different simulation approaches (RFC, share of preference, first choice) have different assumptions and will give different results.  For example, first choice leads to somewhat more extreme probabilities, which produce steeper shares of preference (and that can hurt fit in terms of MAE).  The RFC and share of preference approaches are meant to provide shares of preference on the same scale as the probabilities of choice from counting analysis on the holdouts.

If the holdout tasks look the same as the tasks used for utility estimation and are mixed within the questionnaire (not all appearing at the beginning or the end), then the sharpness or flatness of the predicted shares of preference should be just about right to fit the sharpness or flatness of the holdout probability counts for the holdout alternatives (same respondents, same-looking choice tasks, commensurate probability models).  But if you are using a market simulator to predict some out-of-sample holdout event, such as real market purchases for a real market scenario (different respondents, and probably a different-looking choice scenario with a different number of alternatives than in the CBC questionnaire), then the flatness or steepness of the simulated results might need to be tuned (via the Exponent setting in our software) to obtain a better fit (lower MAE).

Computing a "null MAE" based on a naive model (share = 1/c, where c is the number of concepts in the scenario) is a decent way to show how the MAE you obtained from the model compares to the error one would observe if using a naive (uninformed) model.
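Using the four-concept shares discussed above, that null-model comparison is only a couple of lines of Python:

```python
# Actual holdout shares (percentage points) from the example above.
actual = [20.63, 19.73, 39.91, 19.73]

# Naive model: each of the c alternatives gets share 1/c (here 25%).
c = len(actual)
naive_share = 100.0 / c

# MAE of the naive model against the observed holdout shares.
null_mae = sum(abs(a - naive_share) for a in actual) / c
print(round(null_mae, 3))  # 7.455, much larger than the model's roughly 2.06
```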