Express MaxDiff Sample size


I would like to define a solid sample size formula for Express MaxDiff.

I'm wondering if the following would be sufficient, or whether we need a bit more (e.g. +20%) to ensure robust results given the typical amount of missing data in the model:

General Population:
n = total_number_of_items / number_of_items_per_respondent x 200

Target Groups:
n = total_number_of_items / number_of_items_per_respondent x 150

total_number_of_items = 100
number_of_items_per_respondent = 40
number of exposures of each item per respondent = 3
number of items per task = 5
number of tasks per respondent = 24
sample size (GP) = 100 / 40 * 200 = 500
sample size (TG) = 100 / 40 * 150 = 375
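
For reference, the proposed formulas can be sketched as follows (variable names are mine):

```python
total_items = 100
items_per_respondent = 40

# Each respondent covers a "block" of 40 of the 100 items, so it takes
# total_items / items_per_respondent = 2.5 respondents to cover all items once.
# The proposed heuristic asks for 200 completes per block (general population)
# and 150 per block (target groups).
blocks = total_items / items_per_respondent

n_gp = blocks * 200
n_tg = blocks * 150
print(n_gp, n_tg)  # 500.0 375.0
```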

Kindly share your recommendation, ideally with a rationale or a link to a paper with further details.


asked Mar 26, 2019 by Piotr (185 points)

1 Answer

0 votes
The rationale for having each item shown a minimum of 500 times, and preferably 1000 times, was laid out by Rich Johnson (our founder) back in the 1990s when describing quick-and-dirty sample size calculations for CBC.  But the same logic applies to MaxDiff.

The reasoning behind the target of 1000 exposures of each level of an attribute in a choice experiment was that, in aggregate analysis (and Express MaxDiff typically involves aggregate analysis), it allowed one to obtain approximately a +/-3% margin of error on the counts (win rate) proportions.  500 exposures allowed one to get about a +/-4% margin of error.  MaxDiff is a different animal than CBC analysis, for which this original quick-and-dirty sample size calculation was proposed...but it is a handy approach as a first cut at sample size thinking.
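
Those margin-of-error figures follow from the standard binomial formula at the worst case p = 0.5; a quick sketch (function name is mine):

```python
import math

def margin_of_error(exposures, p=0.5, z=1.96):
    """95% margin of error for a counts proportion, worst case at p = 0.5."""
    return z * math.sqrt(p * (1 - p) / exposures)

print(round(margin_of_error(1000), 3))  # 0.031 -> roughly +/-3% at 1000 exposures
print(round(margin_of_error(500), 3))   # 0.044 -> roughly +/-4% at 500 exposures
```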

We discuss this quick-and-dirty counts-based sample size rule in our book, "Becoming an Expert in Conjoint Analysis".  It is also referenced in our book, "Getting Started in Conjoint Analysis".

But, a question for you: why use Express MaxDiff?  We tend to prefer Sparse MaxDiff for situations involving a large number of items.  It has repeatedly proven, in our simulations and in our checks with real respondents, to perform better than Express MaxDiff.

And, if the principal goal is to identify and measure the top few items, then Bandit MaxDiff is about 2.5x more effective (efficient) than Express or Sparse MaxDiff when dealing with 80 items, and 4x more efficient when dealing with 120 or more items.
answered Mar 26, 2019 by Bryan Orme Platinum Sawtooth Software, Inc. (200,340 points)
Hi Bryan

Thank you very much for the prompt reply.

Our reason for using Express MaxDiff is that, on one hand, we at times need to measure >60 items, and on the other, we don't want to stretch the MaxDiff module length excessively (which can be a risk with Sparse MaxDiff). We are investigating the possibility of implementing Bandit MaxDiff for our needs, but for now we might need to use Express MaxDiff.

Based on what you wrote, I should modify the formula and the example I shared to obtain the minimum and preferable sample sizes as follows.

n = 500 x number_of_exposures_per_respondent / ( total_number_of_items / number_of_items_per_respondent)

n = 1000 x number_of_exposures_per_respondent / ( total_number_of_items / number_of_items_per_respondent)

The example
total_number_of_items = 100
number_of_items_per_respondent = 40
number of exposures of each item per respondent = 3
number of items per task = 5
number of tasks per respondent = 24

minimum sample size (GP) = 500 * 3 / (100 / 40) = 600
preferable sample size (GP) = 1000 * 3 / (100 / 40) = 1200

For TG I would assume 75% of the above values.
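
One way to sanity-check these numbers is to compute the expected exposures per item directly, assuming each item lands in a given respondent's subset with probability number_of_items_per_respondent / total_number_of_items (a sketch; variable names are mine):

```python
total_items = 100
items_per_respondent = 40
exposures_per_respondent = 3  # times each seen item appears for one respondent

def exposures_per_item(n):
    """Expected total exposures of each item across a sample of n respondents."""
    return n * exposures_per_respondent * items_per_respondent / total_items

print(exposures_per_item(600))   # 720.0  -> clears the 500-exposure minimum
print(exposures_per_item(1200))  # 1440.0 -> clears the 1000-exposure target

# The sample size that exactly hits 1000 exposures per item:
print(1000 * total_items / (items_per_respondent * exposures_per_respondent))  # ~833.3
```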

Please correct or confirm.

Thank you

I think you have a misconception that Sparse MaxDiff needs to show each item to each person at least 1x.  That's not the case.  You can show as few MaxDiff questions as you like in Sparse MaxDiff (each respondent doesn't see all the items, only a subset)...you'd just need to compensate by increasing the sample size.  Just make sure to check the box in our software to permit (individual respondent) designs lacking connectivity when doing Sparse MaxDiff.

The idea with your quick-and-dirty formulas is that across all respondents and choice sets, each item should appear ideally 1000 total times.  ...Unless you are using Bandit MaxDiff, and then we have sample size guidelines in the appendix of https://www.sawtoothsoftware.com/download/techpap/Bandit_MaxDiff_2018.pdf

Based on what I saw in the thread https://sawtoothsoftware.com/forum/20936/how-sparse-can-sparse-max-diff-be, there is no evidence of Sparse MaxDiff working well with fewer than 1 exposure per item per respondent, which is also why I assumed that 1 is the minimum value.

If you can share some examples of Sparse MaxDiff with fewer than 1 exposure per item per respondent (as well as recommendations about bottom-line values, e.g. 0.5 or 0.25), I would be very grateful.


When I do Sparse MaxDiff, I'm usually not doing HB but aggregate logit, and then there is no lower limit to the fraction of items shown to each person.

In the limit (the limits of our software), imagine 2000 items were to be included in a MaxDiff study.  Imagine just 1 choice set shown per respondent, with 5 items per set.  Imagine we wanted each item shown at least 1000 times across the sample to stabilize our estimates of preferences for the population.  It would take 400 respondents such that each item was shown 1 time on average across the sample.  So, it would require 400,000 total respondents such that each item appeared 1000 times.  400,000 would then be the target sample size.  And, the results should be excellent to draw inferences about the total population (assuming we were dealing with a total population of 400,000 or more and we had accomplished random sampling with no non-response bias).
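
The arithmetic in this example can be wrapped in a small helper (a sketch; the function name is mine):

```python
def required_sample(total_items, sets_per_respondent, items_per_set, target_exposures):
    """Respondents needed so each item appears target_exposures times on average."""
    # Item showings contributed by one respondent:
    showings_per_respondent = sets_per_respondent * items_per_set
    # Respondents needed for each item to be shown once on average:
    respondents_per_exposure = total_items / showings_per_respondent
    return respondents_per_exposure * target_exposures

# Extreme example above: 2000 items, 1 choice set of 5 items per respondent.
print(required_sample(2000, 1, 5, 1))     # 400.0 respondents per average exposure
print(required_sample(2000, 1, 5, 1000))  # 400000.0 for 1000 exposures per item
```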

Sparse MaxDiff is wonderfully powerful if you are assuming aggregate logit estimation and massive sample sizes.  Like Archimedes' lever.