Priors to optimize Model

Dear community,
I ran a grid search over about 30 different prior settings for my estimation model, comparing the hit rates obtained with each setting.

Are there further ways to improve the model?

The setting giving the best hit rates was:
prior degrees of freedom: 20
prior variance: 0.2
How should I interpret this? What are possible reasons for the difference from the recommended setting (d.f. = 5, prior variance = 0.2) stated in the white paper on priors?
The CBC has 342 analyzed respondents.
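The grid search described above can be sketched as follows. Note that `estimate_and_hit_rate` is a hypothetical stand-in for whatever produces a holdout hit rate for a given prior setting (in practice, an HB estimation run plus holdout scoring):

```python
from itertools import product

def grid_search(df_values, pvar_values, estimate_and_hit_rate):
    """Try every (prior d.f., prior variance) pair and keep the best.

    estimate_and_hit_rate: hypothetical callable (df, pvar) -> holdout
    hit rate; in a real workflow this wraps an HB run and jack-knife
    holdout scoring.  Returns (best hit rate, df, pvar).
    """
    best = None
    for df, pvar in product(df_values, pvar_values):
        hr = estimate_and_hit_rate(df, pvar)
        if best is None or hr > best[0]:
            best = (hr, df, pvar)
    return best
```

For example, `grid_search([2, 5, 20], [0.2, 1.0], f)` evaluates all six combinations and returns the one with the highest hit rate.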

When comparing the hit rate to either the default settings or the settings recommended by the white paper: should I only report the difference in hit rate (percentage points), or is there a measure of whether the difference is significant?
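On the significance question: since both settings are scored on the same respondents and holdout tasks, one common approach (an illustrative sketch, not a Sawtooth-endorsed procedure) is a paired McNemar test on the per-prediction hit/miss indicators. It only needs the two aligned 0/1 hit vectors:

```python
from math import comb

def mcnemar_exact(hits_a, hits_b):
    """Exact two-sided McNemar test on paired 0/1 hit indicators.

    hits_a, hits_b: lists of 0/1, one entry per holdout prediction,
    aligned so index i is the same respondent/task under both models.
    Only the discordant pairs (one model hits, the other misses) carry
    information about which model predicts better.
    """
    b = sum(1 for a, bb in zip(hits_a, hits_b) if a == 1 and bb == 0)
    c = sum(1 for a, bb in zip(hits_a, hits_b) if a == 0 and bb == 1)
    n = b + c
    if n == 0:
        return 1.0  # the models never disagree
    # exact two-sided binomial p-value under H0: P(discordant favors A) = 0.5
    k = min(b, c)
    p_one_sided = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * p_one_sided)
```

With 342 respondents and several holdout tasks each, even a one-point hit-rate gap may or may not reach significance, which is exactly what this kind of test reveals.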
asked Jul 20, 2018 by botmar (395 points)

1 Answer

0 votes
I assume you are referring to our white paper, and to the recommendations for priors settings based on number of attributes and sample size.  I also assume you are using our CBC/HB Model Explorer tool to automate the many HB runs and jack-knife sampling of holdout tasks.

There is quite a bit of variability in the best degrees of freedom, and even in the best prior variance, across data sets, so the recommendations in the white paper are generalities. Each CBC study has specific characteristics due to the amount of heterogeneity in preferences across people and to respondent engagement/error.

The difference between degrees of freedom of 5 and degrees of freedom of 20 is not very big in my experience.

A big question is whether the hit rates changed very much between the default settings (d.f. = 5, prior variance = 1) and what turned out to be the "optimal" setting according to the jack-knife search.

That said, based on our experience, if you have a lot of attributes in full-profile CBC, a prior variance of 0.2 can work better in general for prediction quality than a prior variance of 1.0.
answered Jul 20, 2018 by Bryan Orme Platinum Sawtooth Software, Inc. (177,015 points)
I wasn't able to get the CBC/HB Model Explorer tool to work properly, so I made many analysis runs in Lighthouse Studio.
The study consists of 8 attributes, 342 respondents, 10 random tasks, 3 fixed tasks, and 4 concepts + None per task.
With default settings:
hit rate: 60.09%; average pct. cert. & RLH higher; avg. variance and parameter RMS lower
D.F. = 20, P.Var. = 0.2: 61.38%
White paper suggestion (D.F. = 5, P.Var. = 0.3): 60.61%, with similar values compared to the "optimal" solution
Does the better hit rate outweigh the worse pct. cert. & RLH (0.12-0.15 difference)?
Pct Cert & RLH are internal measures of fit to the choice tasks used in the utility estimation.  The goal is NOT to maximize those.  For example, you can use D.F. 1000 and Prior Variance 10 and see that the internal fit stats will go up but your holdout hit rates will go down.  That's called overfitting.
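To make the distinction concrete, here is a minimal sketch (assuming simple per-task concept utilities under a standard logit model) of the two kinds of measures: RLH is the geometric mean of the logit probabilities of the chosen concepts, while the first-choice hit rate just checks whether the highest-utility concept was chosen. Loose priors can push RLH up on the estimation tasks while the hit rate on held-out tasks falls:

```python
import math

def first_choice_hit_rate(task_utils, choices):
    """Fraction of tasks where the highest-utility concept was chosen.

    task_utils: one list of concept utilities per (respondent, task);
    choices: the observed chosen-concept index for each entry.
    """
    hits = sum(1 for u, c in zip(task_utils, choices)
               if max(range(len(u)), key=u.__getitem__) == c)
    return hits / len(choices)

def rlh(task_utils, choices):
    """Root likelihood: geometric mean of chosen-concept logit probabilities."""
    log_p = 0.0
    for u, c in zip(task_utils, choices):
        log_p += u[c] - math.log(sum(math.exp(x) for x in u))
    return math.exp(log_p / len(choices))
```

Computing RLH on the estimation tasks but the hit rate on holdout tasks is what separates internal fit from predictive validity, and why maximizing the former can hurt the latter.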