Using holdouts is much more common when folks run CBC/HB models - I rarely see them used with MaxDiff.
To test the hit rate, use your HB utilities to predict, for each respondent, which item has the highest utility and which the lowest. For those respondents for whom, say, the predicted best item was selected as best in the holdout task, that's a hit. Dividing the number of hits by the number of respondents gives you the hit rate. Of course you could do the same for the worst choices, and you could report the two hit rates separately or average them for an overall hit rate.
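As a quick sketch of that calculation (toy utilities and holdout picks made up for illustration, not real data):

```python
import numpy as np

# Hypothetical HB utilities: 3 respondents x 4 items.
utilities = np.array([
    [0.9, 0.1, -0.5, -0.6],
    [-0.2, 1.1, 0.3, -1.2],
    [0.4, -0.8, 0.25, 0.15],
])
# Each respondent's actual best and worst picks in one holdout MaxDiff task.
holdout_best = np.array([0, 1, 2])
holdout_worst = np.array([3, 3, 1])

# Predicted best/worst = highest/lowest-utility item per respondent.
pred_best = utilities.argmax(axis=1)
pred_worst = utilities.argmin(axis=1)

best_hit_rate = (pred_best == holdout_best).mean()    # hits / respondents
worst_hit_rate = (pred_worst == holdout_worst).mean()
overall_hit_rate = (best_hit_rate + worst_hit_rate) / 2
```

With several holdout tasks you would just pool the hits across tasks before dividing.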
The reason folks use them more often for CBC models is that prediction and simulation are much more important in CBC than they are with MaxDiff. If you have enough holdout tasks (or, better yet, an entire holdout sample) then you can use the holdouts to test alternative model formulations: does my model with a linear price function predict better than a model with a categorical price function? Does a model with an interaction included improve on one without? Does a model with monotonic constraints on some ordered attributes predict better than an unconstrained model? And so on.
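That kind of model comparison boils down to scoring each candidate model's first-choice predictions against the same holdout task. A minimal sketch, with two hypothetical sets of utilities for the alternatives in one holdout CBC task:

```python
import numpy as np

# Hypothetical utilities from two competing model formulations,
# 3 respondents x 3 alternatives in a single holdout task.
model_a = np.array([[1.0, 0.2, -0.3],
                    [0.1, 0.8, 0.4],
                    [-0.5, 0.3, 0.9]])
model_b = np.array([[0.2, 1.0, -0.3],
                    [0.1, 0.8, 0.4],
                    [0.9, 0.3, -0.5]])
# The alternative each respondent actually chose in the holdout task.
observed = np.array([0, 1, 2])

# First-choice rule: predict the highest-utility alternative, then
# score each model by its holdout hit rate.
acc_a = (model_a.argmax(axis=1) == observed).mean()
acc_b = (model_b.argmax(axis=1) == observed).mean()
```

Whichever formulation predicts the held-out choices better is the one you'd prefer, since it was scored on tasks it never saw during estimation.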