Linear Coding in HB Estimation: Specification of values with equal distances in levels

Dear Sawtooth Team,

I conducted a CBC analysis in which one of the attributes is "numeric" with 5 levels and equal distances.

I used the hierarchical Bayes (HB) model in Sawtooth Software to estimate a single coefficient for the "numeric" attribute, having specified the level values as single digits before running the model.

I was wondering why it is important to specify values that correspond to the levels the participants see.

E.g.: Imagine the numeric values range from 10.99 to 15.99 with equal distances of 1. Hence, to code it in single digits, the values for the HB estimation would be 0.1099 - 0.1599.
To compare different models, I also specified the values as 1-5 (since the distances between the levels are equal). This leads to a better percent certainty and root likelihood, but very different coefficients.

Another variable takes levels from 100 to 200 with equal distances of 20. Hence, I would code the values as 1, 1.2, 1.4, ... However, when I again specify the values as 1-5, the percent certainty and root likelihood indicate a better fit.

Should I still stick to your suggestion and specify values that correspond to the levels the participants see, or focus on the model with the better fit?

Thank you.
Best regards,
E.S.
asked May 19, 2021 by meschusti (190 points)

1 Answer

+1 vote
This is a very involved question.  First, two points:

1) Obtaining highest fit in terms of RLH or Pct. Cert. is not the goal for HB and shouldn't be your goal either.  HB purposefully looks for a compromise solution that fits each respondent's data reasonably well while also obtaining individual-level estimates that seem to have a high likelihood of coming from a multivariate-normally distributed population (with variances according to the priors set in the software).  

If we were just interested in obtaining the highest individual-level internal fit to the data, we'd ignore the latter part of the hierarchical model (the upper-level model) and would just run individual-level logit.  You can approximate this by setting the prior variance in the HB settings to be huge, such as 100, and the prior degrees of freedom to be huge, such as 5000.  You'll see your fit VASTLY improves...but you'll be overfitting, and these individual-level utilities will not have as good a predictive fit to new observations (e.g., out-of-sample choices).

2) The default priors (prior variances and covariances) we've set in our HB software work well as long as the beta coefficients are in the expected range.  We're using a prior variance of 1.0, which we've found works well for beta parameters when the design matrix is coded as effects-coded or dummy-coded (involving 1s, 0s, and -1s).  However, if you code a quantitative attribute in a way that leads to very different magnitude of beta than those we're assuming, then the priors we've set aren't quite right.  Of course, you could just adjust your prior variances to account for doing very different things with your X matrix coding...alternatively, you could just code your X values to keep things well behaved with our default priors.  We've recommended the latter in our documentation.
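To illustrate why the coding scale interacts with a fixed prior variance, here is a minimal sketch in Python. It is not Sawtooth code; the function name, price levels, and beta value are hypothetical. In a logit model a linear attribute contributes beta times x to utility, so dividing the coded values by a constant multiplies the beta needed to reproduce the same choice probabilities by that constant (and the variance of beta across respondents by its square).

```python
import math

def choice_probs(beta, xs):
    """Multinomial logit shares for alternatives that differ only in one linear attribute."""
    expu = [math.exp(beta * x) for x in xs]
    total = sum(expu)
    return [e / total for e in expu]

# Hypothetical price levels as shown to respondents, and a hypothetical beta.
shown = [100.0, 140.0, 180.0]
beta_shown = -0.02

# Recode by dividing by 100 (1.0, 1.4, 1.8): beta must be 100 times larger
# to describe exactly the same preferences.
scale = 100.0
recoded = [x / scale for x in shown]
beta_recoded = beta_shown * scale

p_shown = choice_probs(beta_shown, shown)
p_recoded = choice_probs(beta_recoded, recoded)

# Same predicted shares, very different coefficient magnitude.
assert all(abs(a - b) < 1e-12 for a, b in zip(p_shown, p_recoded))
print(beta_shown, beta_recoded)
```

The two codings describe identical preferences, but a prior variance of 1.0 is a very different statement about a beta near -0.02 than about a beta near -2.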

I've found that if you keep your quantitative (linear) coding for variables in the X matrix in the range of around single digits in differences, then convergence tends to be good (given our default priors).  Best convergence happens if the range of your quantitative coding of an attribute (for the X matrix) is around 1 or 2 units and if the values are zero-centered.  However, if you code things from (say) -100 to +100 for a quantitative attribute in the X matrix, convergence will suffer and parameters may be biased.  Likewise, if you code things from -0.01 to +0.01, you'll have trouble with convergence.  That's because in either of these two examples, the expected variance of the resulting beta is too far from what the priors assume.
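As a rough sketch of that recoding (in Python, with a hypothetical helper name; centering on the mean of the levels is an assumption made here, not something the software is confirmed to do), the values shown to respondents can be zero-centered and rescaled to a range of about 2 units while keeping their relative spacing:

```python
def recode_levels(levels, target_range=2.0):
    """Zero-center level values (at their mean) and rescale so that
    max - min equals target_range. Relative spacing is preserved."""
    span = max(levels) - min(levels)
    if span == 0:
        raise ValueError("levels must not all be equal")
    mean = sum(levels) / len(levels)
    return [(v - mean) * target_range / span for v in levels]

# Equally spaced price levels as shown to respondents.
print(recode_levels([100, 120, 140, 160, 180]))
# -> [-1.0, -0.5, 0.0, 0.5, 1.0]

# Unequally spaced levels keep their unequal spacing after recoding,
# which a plain 1, 2, 3, 4 coding would not do.
uneven = recode_levels([100, 120, 150, 200])
```

For equally spaced levels this is just a linear transformation of a 1-5 coding; the distinction only matters when the spacing shown to respondents is unequal.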
answered May 19, 2021 by Bryan Orme Platinum Sawtooth Software, Inc. (198,315 points)
edited May 19, 2021 by Bryan Orme
Dear Bryan,

Thank you for the great answer and the interesting insights on the fit of HB models. Just out of curiosity, I will try your HB settings (“prior variance in the HB settings to be huge, such as 100 and the prior degrees of freedom to be huge, such as 5000”) to see how the fit can be “artificially” improved.

Thank you for clarifying why it is important to code the variables in the X matrix with differences of around single digits and why it is recommended to stick with the default prior variance.

However, what I still do not understand is why it is important to code the variables so that they “metrically correspond to the quantities shown to respondents”.
If 5 price levels (as seen by the respondents) increase with equal distances of e.g. $20 (levels: 100, 120, 140, 160, 180), I would code them as 1, 1.2, 1.4, ..., 1.8 or as zero-centered values (e.g. -4, -2, 0, 2, 4). Wouldn't it be sufficient to just code the variables as 1, 2, 3, 4, 5 to capture the equal distances and suit the default priors?

Thank you very much.
Best regards,
E.S.
Failure to zero-center the values means that a utility intercept needs to spill over to the "None" alternative (making it less intuitive to interpret the utility coefficients, though the predictions will be the same).  Absent having a None alternative, convergence is ever so slightly impaired if (for example) using 1, 2, 3, 4, 5 instead of -2, -1, 0, 1, 2.
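A small sketch of the "spill over" effect, in Python with hypothetical utilities (none of these numbers come from the thread): shifting a linear coding by a constant adds the same amount to every product's utility, so the None utility has to move by that amount for the predicted shares to stay the same.

```python
import math

def shares(utilities):
    """Logit choice shares for a list of total utilities."""
    expu = [math.exp(u) for u in utilities]
    total = sum(expu)
    return [e / total for e in expu]

beta = -0.5            # hypothetical linear coefficient
none_centered = 0.2    # hypothetical None utility under zero-centered coding

centered = [-2, 0, 2]                   # codes of three products in one task
shift = 3
uncentered = [c + shift for c in centered]   # 1, 3, 5

# Zero-centered coding: three products plus the None alternative.
p_centered = shares([beta * x for x in centered] + [none_centered])

# Uncentered coding: every product utility moves by beta * shift, so the
# None utility must absorb the same constant to give identical shares.
none_uncentered = none_centered + beta * shift
p_uncentered = shares([beta * x for x in uncentered] + [none_uncentered])

assert all(abs(a - b) < 1e-12 for a, b in zip(p_centered, p_uncentered))
print(none_centered, none_uncentered)
```

The predictions match, but the None coefficient under the uncentered coding mixes the respondent's preference for None with an arbitrary constant from the coding, which is what makes it harder to interpret.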
Thank you for the clarification.
...