# Interpret alpha draws of HB estimation

Dear Sawtooth community,

I conducted an ACBC study and estimated the utilities with HB. Lighthouse Studio conveniently reports the part-worths (raw and zero-centered) and relative attribute importances, which are averages across the individual respondents.

Now I would like to report the following:

1.  95% confidence interval for attribute levels.
Following the Bayesian approach, I take the alpha file, sort the utilities (after convergence) for each attribute level from low to high, and take the 2.5th and 97.5th percentiles as the interval bounds. That part is easy.
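A minimal sketch of that percentile step, assuming the alpha file has already been loaded into a draws-by-levels array (the function name and column layout here are illustrative, not Lighthouse's own API):

```python
import numpy as np

def credible_interval(alpha_draws, level_col, lo=2.5, hi=97.5):
    """Percentile-based 95% interval for one attribute level,
    computed over the post-convergence alpha draws (rows)."""
    draws = np.asarray(alpha_draws, dtype=float)[:, level_col]
    return np.percentile(draws, [lo, hi])
```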

However, the confidence interval is based on the raw values reported in the alpha file. I would like to report part-worths using ZCD, since I will report the zero-centered utilities as the main findings anyway. That means the confidence interval should be on the same scale (ZCD, not raw).

How do I transform the (raw) utilities from the alpha file onto the same ZCD scale that Lighthouse uses for the summary sheets? How do I proceed here?

PS: When I sum the average part-worths within an attribute (across all draws after convergence), I get 0 (well, almost 0; a figure with many decimal places). So the raw figures somehow sum to zero... Aren't they actually zero-centered already? This is confusing...

2. Standard deviations:
In Bayesian statistics, is it common to report standard deviations? If yes, would I also need the zero-centered ones? (This ties into question 1 above.)

3. Significance of attributes and attribute levels
If I want to follow the Bayesian approach, I have to think a bit differently and count the percentage of alpha draws (for every level) with the same sign. Is that correct?
So I would not declare an attribute level significant based on the Frequentist 95% convention, but just report it as is: e.g., if 92% of the alpha draws have a negative sign, I can say that with 92% confidence the level is "significant". Is that right?

In terms of attribute significance: I read in another thread that it does not make sense to calculate this, not just because it takes some effort but because in the end it is all relative and highly dependent on the level ranges and so on.
Also, I have attributes with 3 and 4 levels. That means, by nature, I would expect one of the levels to hover around zero simply because of the number of levels. So it would be no surprise if such an attribute level did not have >95% of draws with the same sign, but that doesn't mean the level is not significant, right?

1) I wouldn't recommend reporting zero-centered diffs. They were intended as a way to get all respondents on the "same scale," but if you dig into the math, zero-centered diffs amount to a rescaled weighted average where the least predictable respondents get the highest weights and the most consistent ones get the lowest weights. Then everything is scaled upwards.

The math is this, though:
1) For each individual (or draw, or whatever the row is), determine the range (max minus min) for each attribute.
2) Add those ranges together (RangeSum). Your scale factor for that row is 100 * (# attributes) / RangeSum.
3) Multiply all utilities in that row by the scale factor
4) Repeat for all rows
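The four steps above can be sketched in Python. This is a minimal illustration, assuming the draws are already in a rows-by-part-worths array and that `attr_levels` (a hypothetical helper structure, not a Lighthouse export) maps each attribute to the column indices of its levels:

```python
import numpy as np

def zcd_rescale(draws, attr_levels):
    """Rescale each row so that the attribute ranges sum to
    100 * (number of attributes), per the ZCD convention."""
    draws = np.asarray(draws, dtype=float)
    out = np.empty_like(draws)
    n_attrs = len(attr_levels)
    for i, row in enumerate(draws):
        # Step 1-2: range per attribute, summed into RangeSum
        range_sum = sum(row[cols].max() - row[cols].min()
                        for cols in attr_levels.values())
        # Step 2: scale factor for this row
        scale = 100.0 * n_attrs / range_sum
        # Step 3: apply it to every utility in the row
        out[i] = row * scale
    return out  # Step 4: all rows done
```

Applying this to the alpha draws (rather than to individual respondents) puts the posterior means and the percentile bounds on the same ZCD scale as the summary sheets.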

2) I don't think it's common to report standard deviations for Bayesian models. If the posteriors are approximately normal, it's probably fine, but if they aren't, I would avoid them.

3) "Significance" is specifically a Frequentist term, not a Bayesian one. But in a more informal sense, yes, it's just a matter of counting the signs on the test you are performing.

For testing a whole attribute, I can't say with confidence that there isn't some test for it, but I don't know what it would be. I would probably reference the most extreme % of counts with the same sign as the indication of the attribute's significance. Or maybe look at pairwise contrasts of levels?
answered Apr 29, 2021 by Bronze (3,920 points)
That's very helpful, Kenneth, thanks a lot! It's just confusing to see so many papers where researchers run HB estimations but then revert to Frequentist statistics to describe the results...
For a long time I thought that raw = not zero-centered and ZCD = zero-centered. The terminology was just a bit confusing to me.
Raw utilities in Lighthouse are indeed zero-centered, just not on the same scale as ZCD.
So to sum up: if I just want to report the utilities (no market simulation), I'm fine with the raw values. But when it comes to comparing utilities across segments (e.g., clustering), ZCD should be applied.
Kenneth, just to follow up on reporting utilities based on zero-centered diffs vs. raw (zero-centered, but not on the ZCD scale): if I report utilities NOT based on ZCD, the interpretation of the part-worths becomes a bit challenging, doesn't it?

For example if I have
Level 1: 0.5
Level 2: 1
Level 3: -1.5

If these are zero-centered (but NOT on the ZCD scale; in other words, the "raw" values that Lighthouse reports for ACBC), then I cannot say that Level 2 is twice as important as Level 1, right?
Found this in the Lighthouse manual:

"As with Raw Logit scores, an item with a score of 2.0 is higher (more important or more preferred) than an item with a score of 1.0.  But, we cannot say that the item with a score of 2.0 is twice as preferred as an item with a score of 1.0."

And also:

"Zero-Centered Raw Scores  These are weights that directly follow from the MNL (multinomial logit) procedure employed within the HB engine.  The items can have positive or negative weights and are zero-centered (the "average" item has a weight of 0).  These weights are on an interval scale, which does not support ratio operations.  In other words, you cannot state that an item with a score of 2.0 is twice as important (or preferred) as an item with a score of 1.0"
So if I have the average utility over all posterior alpha draws for every attribute level, would I need to transform them in order to make ratio statements (e.g., Level 2 is twice as important as Level 1)?
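For what it's worth, applying the ZCD rescaling from the answer above to the three example levels (assuming a single attribute, so RangeSum is just that attribute's range) gives:

```python
import numpy as np

levels = np.array([0.5, 1.0, -1.5])          # example raw zero-centered part-worths
range_sum = levels.max() - levels.min()      # one attribute: RangeSum = 2.5
scale = 100.0 * 1 / range_sum                # scale factor = 40
zcd = levels * scale                         # -> [20., 40., -60.]
# Per the manual quote, the result is still an interval scale:
# 40 vs. 20 does not mean "twice as preferred", because the zero
# point is arbitrary. Rescaling does not create a ratio scale.
```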