Compute Log Likelihood for LC

I am trying to understand how the log likelihood is calculated in latent class analysis (since this is the basis of all the other quality-of-fit measures).

For a single group, I have been able to calculate the Log Likelihood by assuming the (same) group part-worth for each respondent and using steps a) and b) described here: https://sawtoothsoftware.com/forum/24014/compute-rlh-for-hb
As described here: https://www.sawtoothsoftware.com/download/techpap/lclass_manual.pdf (p. 32), the overall log likelihood is obtained by summing the logs of those probabilities over all respondents and questions. This worked fine for the one-group case.
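For readers following along, the one-group calculation described above can be sketched as follows. This is a minimal illustration and not Sawtooth's code: the function names, the data layout (a list of coded concept rows plus the index of the chosen concept), and the toy numbers are all assumptions made for the example.

```python
import math

def task_log_prob(utilities, chosen):
    """Log of the logit probability of the chosen concept in one task."""
    m = max(utilities)  # subtract the max for numerical stability
    log_denom = m + math.log(sum(math.exp(u - m) for u in utilities))
    return utilities[chosen] - log_denom

def log_likelihood(tasks, part_worths):
    """Sum of log choice probabilities over all respondents and tasks.

    tasks: list of (alternatives, chosen) pairs; alternatives holds one
           coded attribute row per concept shown in the task.
    part_worths: one shared list of utilities (the single-group case).
    """
    total = 0.0
    for alternatives, chosen in tasks:
        utilities = [sum(x * b for x, b in zip(row, part_worths))
                     for row in alternatives]
        total += task_log_prob(utilities, chosen)
    return total

# Hypothetical toy data: two tasks, three concepts each, two coded parameters.
toy_tasks = [
    ([[1, 0], [0, 1], [0, 0]], 0),
    ([[1, 1], [0, 0], [1, 0]], 2),
]
toy_ll = log_likelihood(toy_tasks, [0.5, -0.2])
```

With all part-worths set to zero, every concept is equally likely, so each three-concept task contributes log(1/3). That makes a convenient sanity check before comparing against the software's output.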

However, for the two-group case, my results differed from the log likelihood reported by Sawtooth. Could it be that I need to use the pseudo individual-level utilities for each respondent (described here: https://sawtoothsoftware.com/forum/13296/hit-rate-in-latent-class-analysis?show=13296#q13296) instead of the group-level utilities of the group to which a respondent is most likely to belong?

If so, does this make sense with regard to the quality measures (AIC, BIC, etc.)? The purpose of these measures is to see how well the groups capture the underlying preferences. However, if I use a "pseudo"-individual utility, this isn't really the same as the utility of the group, because I would be using different utilities for each respondent...
related to an answer for: Compute RLH for HB
asked Dec 31, 2019 by some1 (175 points)
edited Dec 31, 2019 by some1

1 Answer

The LL for latent class solutions is not computed using the pseudo-individual-level utilities. It also isn't computed by wholly assigning each respondent to a group. It is computed using weighted MNL: the logit estimation is done for each group, with each respondent weighted by the likelihood of belonging to that group. The LLs are then summed across the multiple groups.
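The answer does not spell out how the "likelihood of belonging" to each group is obtained. One common way to get such weights, shown here purely as an assumption and not as confirmed Sawtooth behavior, is Bayes' rule applied to each respondent's per-group likelihood and the relative group sizes. The function name and inputs below are hypothetical.

```python
import math

def membership_weights(resp_log_lik, class_sizes):
    """Posterior probability of one respondent belonging to each group.

    resp_log_lik[g]: sum over the respondent's tasks of the log choice
                     probabilities under group g's utilities.
    class_sizes[g]:  relative size of group g (assumed prior, sums to 1).
    """
    log_post = [math.log(s) + ll for s, ll in zip(class_sizes, resp_log_lik)]
    m = max(log_post)  # subtract the max before exponentiating
    unnorm = [math.exp(lp - m) for lp in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Hypothetical respondent whose choices fit group 1 better than group 2.
weights = membership_weights([-3.2, -5.1], [0.5, 0.5])
```

The weights for a respondent sum to 1, and a respondent who fits both groups equally well simply gets the group sizes back as weights.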
answered May 4, 2020 by Bryan Orme Platinum Sawtooth Software, Inc. (191,015 points)
Dear Bryan, many thanks!
If I understand you right:
1. "logit estimation is done for each group": so in the case of two groups, I would apply steps a) and b) described here (https://sawtoothsoftware.com/forum/24014/compute-rlh-for-hb) twice: once to calculate L_xji_1 for each respondent "i" and task "j", assuming the utilities of group "1", and a second time to calculate L_xji_2 for each respondent "i" and task "j", assuming the utilities of group "2"?
2. "each respondent is weighted by the likelihood of belonging in each group... LLs are summed across the multiple groups": so I would calculate the total LL as sum of probability weighted root likelihoods over all respondents "i" and all tasks "j":
Sum[log(L_xji_1) * wi_1 + log(L_xji_2) * wi_2]
where wi_1 is the likelihood of respondent i belonging to group 1?
Is that what you mean, or did I get you wrong?
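As a sketch, the formula proposed above can be written out like this. It implements exactly the expression in this comment; the thread does not confirm that it reproduces the value the software reports, and the function name, array layout, and numbers are assumptions for illustration.

```python
import math

def weighted_log_likelihood(log_lik, weights):
    """Sum over respondents i, tasks j and groups g of w[i][g] * log L[i][j][g].

    log_lik[i][j][g]: log probability of respondent i's choice in task j
                      under group g's utilities (log(L_xji_g) above).
    weights[i][g]:    likelihood of respondent i belonging to group g (wi_g).
    """
    total = 0.0
    for resp_log_lik, resp_weights in zip(log_lik, weights):
        for task_log_lik in resp_log_lik:
            total += sum(w * ll for w, ll in zip(resp_weights, task_log_lik))
    return total

# Hypothetical numbers: 2 respondents, 2 tasks, 2 groups.
L = [
    [[0.6, 0.2], [0.5, 0.3]],
    [[0.1, 0.7], [0.2, 0.8]],
]
log_L = [[[math.log(p) for p in task] for task in resp] for resp in L]
w = [[0.9, 0.1], [0.2, 0.8]]
total_ll = weighted_log_likelihood(log_L, w)
```

If every weight is 0 or 1, the expression collapses to the LL obtained by assigning each respondent wholly to one group, which is a useful check on an implementation.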
I think that's correct, though I need to get our programmer to double-check our code. If you cannot replicate the software's results, we could do that. And, of course, recognize that RLH is the geometric mean of the likelihoods, whereas LL is the sum of the natural logs of the likelihoods.
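The relationship between the two measures can be illustrated with a short sketch for the unweighted case. Because the geometric mean of the likelihoods equals the exponential of the mean log likelihood, RLH = exp(LL / N), where N is the number of choices. The function name and probabilities below are made up for the example.

```python
import math

def rlh_from_ll(ll, n_choices):
    """Root likelihood: geometric mean of the per-choice likelihoods."""
    return math.exp(ll / n_choices)

# Hypothetical likelihoods of four observed choices.
probs = [0.6, 0.5, 0.7, 0.8]
ll = sum(math.log(p) for p in probs)           # log likelihood
rlh = rlh_from_ll(ll, len(probs))              # via the LL
geo_mean = math.prod(probs) ** (1 / len(probs))  # direct geometric mean
```

Both routes give the same number, so a mismatch in LL shows up as a mismatch in RLH as well.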
Dear Bryan,

Apologies for the typo: step 2 in my comment from 6th May should read "calculate the total LL as sum of probability weighted LOG-LIKELIHOODS over all respondents...". The rest of the formula should be correct.

I have implemented these steps in my calculation. However, I still cannot replicate the results of the 2-group solution.

If you could cross-check the steps with one of your programmers, that would be much appreciated!

If it helps, I can also share my sample data and evaluation script with you (after cleaning it up to remove unnecessary code and confidential data). Just let me know!

Many thanks in advance, and best wishes :-)!
Please email walt@sawtoothsoftware.com to have him review your question and check for any inconsistency between what you describe and what we do in our code.