Benchmark for PctCert in Latent Class CBC


In my Latent Class analysis, I get PctCert values of around 30 for several group solutions. This seems rather low to me. Does anyone have a benchmark recommendation, or an idea of what could be causing this?

Thanks in advance
asked Dec 17, 2019 by Hans

2 Answers

0 votes
The Latent Class technical paper (https://www.sawtoothsoftware.com/download/techpap/lctech.pdf) has a section on page 9 (A Numeric Example) in which known utilities are used to test how well they are recovered. It also provides some examples of Percent Certainty and the other fit statistics. I don't do a lot of LC work, though, and don't feel I have a good grasp of recommendations for real data sets. Perhaps someone else will be able to chime in with those.
answered Dec 17, 2019 by Brian McEwan Platinum Sawtooth Software, Inc. (52,430 points)
0 votes
Percent Certainty is also called Pseudo R-squared. This fit statistic is dramatically affected by the number of classes you request. If you request just two classes, the % Certainty will be relatively low. If you request more classes, especially beyond 10 (for argument's sake), the fit will be relatively high.
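
For readers who want to see what the statistic measures, here is a minimal sketch. It assumes the usual McFadden-style definition of pseudo R-squared: the share of the gap between a null model (every concept in a task equally likely) and a perfect fit (log-likelihood of zero) that the fitted model closes. The function names and the sample size, task count, and log-likelihood values are hypothetical illustrations, not output from any Sawtooth Software tool or from the study discussed here.

```python
import math

def null_log_likelihood(n_tasks: int, n_concepts: int) -> float:
    """Log-likelihood of a chance model: each concept has probability 1/n_concepts."""
    return n_tasks * math.log(1.0 / n_concepts)

def percent_certainty(ll_model: float, ll_null: float) -> float:
    """Share of the distance from the null fit to a perfect fit (LL = 0)
    that the model covers. Assumed definition, see the note above."""
    if ll_null >= 0:
        raise ValueError("null log-likelihood must be negative")
    return (ll_model - ll_null) / (0.0 - ll_null)

# Hypothetical study: 300 respondents x 12 tasks, 4 concepts per task.
ll_null = null_log_likelihood(n_tasks=300 * 12, n_concepts=4)

# Hypothetical fitted log-likelihood, chosen so the model closes 30% of the gap.
ll_model = 0.70 * ll_null

pct_cert = percent_certainty(ll_model, ll_null)
print(f"PctCert = {pct_cert:.2f}")
```

Under this definition, adding classes can only raise the fitted log-likelihood toward zero, which is why the statistic tends to climb with the number of classes.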

If you haven't already done so, I recommend using HB to assist you in cleaning the data of random responders, speeders, etc. prior to running latent class.
answered Dec 17, 2019 by Bryan Orme Platinum Sawtooth Software, Inc. (183,340 points)
Hi Bryan, thanks for your answer. I would like to have 3-5 classes, but the PctCert stays more or less the same at around 30%. Could you please elaborate a bit more on how to use HB for data cleaning? Which criteria should I use to detect bad responses that should not be used for LC?
Sure, please look at the short article I wrote on the subject at: https://www.linkedin.com/pulse/identifying-consistency-cutoffs-identify-bad-respondents-orme/
Hi Bryan,
thanks for your answer. Despite doing what you proposed, the PctCert is still around 0.3. Could you please tell me whether you think this is an issue?
From what I know, values for Pseudo R-squared between 0.3 and 0.4 are considered good.
Thanks in advance
Without having other data sets to compare to, it's hard to know if PctCert of 0.3 is bad or good.

So, I just looked at 6 CBC datasets that were collected back in the mid-1990s that we use for comparison.  Here are the PctCert scores for Latent Class MNL, 4-group solutions:

Tropt - 0.30
Study1 - 0.33
Study2 - 0.37
Study3 - 0.35
Phone - 0.37
Diaper - 0.28

So, a sampling of 6 professional CBC data sets from the 1990s (when respondents probably were more conscientious than they are today) shows a range of PctCert of 0.28 to 0.37.  I have about 25 such datasets, and I just randomly picked 6 of them.

Your study falls within that range.  So, from my perspective there isn't evidence that you should think your fit is any worse than other reasonable CBC datasets.
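
The comparison above can be reproduced with a small sketch. The six PctCert values are the ones listed in this answer; the variable names and the 0.30 figure for the asker's study (taken from the "around 0.3" reported earlier in the thread) are assumptions for illustration.

```python
# PctCert for 4-group Latent Class MNL solutions, as listed in the answer above.
benchmarks = {
    "Tropt": 0.30,
    "Study1": 0.33,
    "Study2": 0.37,
    "Study3": 0.35,
    "Phone": 0.37,
    "Diaper": 0.28,
}

# Approximate value reported by the asker ("around 0.3"); an assumption.
your_pct_cert = 0.30

low = min(benchmarks.values())
high = max(benchmarks.values())
within_range = low <= your_pct_cert <= high

print(f"Benchmark range: {low:.2f} to {high:.2f}")
print(f"Study at {your_pct_cert:.2f} within range: {within_range}")
```

Six datasets are a small sample, so the range is a rough reference point rather than a formal cutoff.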