If you use HB to estimate individual-level utilities, the standard approach in Sawtooth Software is to collapse the many post-convergence draws for each respondent into a single point estimate of the utilities. This is a practitioner's shortcut that makes the utilities faster and easier to work with during analysis, since there is just one record per respondent rather than hundreds or thousands. If you have the patience and the sophistication to use the many draws per respondent in analysis (such as market simulations), that is considered truer to HB, and you may get very slightly better results.
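To make the difference concrete, here is a minimal sketch in Python/NumPy. It is not Sawtooth Software's own code: the `draws` array, the `design` matrix, and the `logit_shares` helper are all hypothetical, and the draws are random numbers standing in for saved HB draws. It collapses the draws to point estimates, then compares a logit share-of-preference simulation run on the point estimates with one run on every draw.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for saved HB draws (after convergence):
# respondents x draws x utility parameters. Real draws would be loaded from file.
n_resp, n_draws, n_params = 50, 200, 4
draws = rng.normal(size=(n_resp, n_draws, n_params))

# Point estimates: average each respondent's draws into one utility vector.
point_estimates = draws.mean(axis=1)  # shape (n_resp, n_params)

# Hypothetical scenario: 3 products coded against the 4 parameters.
design = np.array([[1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [1, 1, 0, 0]], dtype=float)  # shape (n_products, n_params)

def logit_shares(utilities, design):
    """Logit shares of preference along the last axis (one row per product set)."""
    v = utilities @ design.T                 # total utility of each product
    v = v - v.max(axis=-1, keepdims=True)    # subtract the max for numerical stability
    e = np.exp(v)
    return e / e.sum(axis=-1, keepdims=True)

# Shortcut: simulate once per respondent on the point estimates.
shares_from_points = logit_shares(point_estimates, design).mean(axis=0)

# Draw-based: simulate on every draw, then average over draws and respondents.
shares_from_draws = logit_shares(draws, design).mean(axis=(0, 1))

print("shares from point estimates:", shares_from_points.round(3))
print("shares from draws:          ", shares_from_draws.round(3))
```

Because the logit rule is nonlinear, the two sets of shares generally differ somewhat. With real HB draws the gap is often small, which is why the point-estimate shortcut is so widely used.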

Taking the mean of the draws throws away the granular uncertainty, along with the additional covariance structure in the individual-level draws. HB also smooths (shrinks) each respondent's parameters toward the population means. So taking those point estimates and treating them as if they were independent estimates for respondents, suitable for frequentist statistical testing (such as t-tests and F-tests), is not statistically proper.

If you use HB to estimate the individual-level utilities, then Bayesian tests (such as those described in the references you cite) are more appropriate than frequentist tests on the point estimates.
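As a rough illustration of what a draw-based comparison can look like, here is a minimal Python/NumPy sketch. It is only one common form of Bayesian comparison and not necessarily the procedure in the references you cite. The data are synthetic, and the `group` split and the 0.5 shift are assumptions made up for the example. For each saved draw it computes the difference in mean utility between two respondent groups, then summarizes the distribution of that difference across draws.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for HB draws of ONE utility parameter: respondents x draws.
n_resp, n_draws = 60, 500
group = np.arange(n_resp) < 30  # hypothetical segment flag (first 30 respondents)

# Assumed effect for illustration only: the flagged group is shifted up by 0.5.
draws = rng.normal(size=(n_resp, n_draws)) + 0.5 * group[:, None]

# Within each draw, compute the difference between the two group means.
diff_per_draw = draws[group].mean(axis=0) - draws[~group].mean(axis=0)

# Share of draws in which the flagged group has the higher mean utility.
prob_positive = (diff_per_draw > 0).mean()

# Central 95% interval of the difference across draws.
interval = np.percentile(diff_per_draw, [2.5, 97.5])

print("share of draws with a positive difference:", prob_positive)
print("95% interval for the difference:", interval.round(3))
```

The share of draws with a positive difference is read directly as a posterior probability, rather than as a p-value from a t-test on the point estimates.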