Calculate design-efficiency manually

Hello, everybody,

I would like to calculate the D-efficiency of different designs manually (e.g. with R). Unfortunately, I don't get the same results as with Lighthouse Studio.

How did I proceed? I used the design matrix of, e.g., a (3,4,3,4) design with 12 choice tasks of 3 alternatives each and 4 blocks (versions). In total, the design matrix X has 144 rows and 4 columns. To account for the effects coding, I expanded the matrix to 10 columns (main effects only, i.e. 2+3+2+3 = 10). For example, factor 1 with level 1 became (1,0), level 2 became (0,1), and level 3 became (-1,-1). After that I multiplied the transpose of X (10x144) by X, giving X'X. The next step was calculating the determinant, |X'X|. The last step was taking the p-th root, i.e. |(X'X)|^(1/p) with p=10.
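To make the steps concrete, here is a minimal sketch of that calculation (I use Python/numpy here rather than R; the design matrix is filled with random level draws as a stand-in for the real plan, so the resulting number is only illustrative):

```python
import numpy as np

def effects_code(levels, n_levels):
    """Effects-code one factor: level i -> indicator row; last level -> all -1."""
    out = np.zeros((len(levels), n_levels - 1))
    for row, lv in enumerate(levels):
        if lv == n_levels:
            out[row, :] = -1.0      # reference level, e.g. level 3 -> (-1, -1)
        else:
            out[row, lv - 1] = 1.0  # e.g. level 1 -> (1, 0), level 2 -> (0, 1)
    return out

rng = np.random.default_rng(0)
n_rows = 144                        # 12 tasks x 3 alternatives x 4 blocks
factors = [3, 4, 3, 4]              # levels per attribute

# Stand-in design: random level draws (a real design would come from the plan).
X = np.hstack([effects_code(rng.integers(1, k + 1, n_rows), k) for k in factors])

p = X.shape[1]                      # 2 + 3 + 2 + 3 = 10 effects-coded columns
D_eff = np.linalg.det(X.T @ X) ** (1 / p)
```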

When doing this I get a D-efficiency of 66.13658. But when I use Lighthouse Studio with 4 respondents, so one respondent for each version, the D-Efficiency is 14.60480.

I suspect that the difference could have something to do with the number of respondents and the number of parameters estimated.

The output of Lighthouse Studio shows 14 parameters instead of 10, since the base levels are also reported as parameter estimates. Now I wonder whether Lighthouse Studio calculates the 10th or the 14th root in my example. But even when calculating the 14th root, the result is 19.96712.

Furthermore, my approach (probably) implicitly assumes that one respondent fills all 48 choice sets instead of 4 respondents with 12 choice sets each. But for overall D-efficiency, that should not make a difference. Is that right?

Kuhfeld et al. (1994), “Efficient Experimental Design with Marketing Research Applications”, also calculate the intra-block efficiency. I assume that in this case every block has its own design matrix X, so in my case 4 design matrices?

I suppose I can’t see the forest for the trees. Could you please tell me what I’m missing?

Thanks in advance!
asked Dec 17, 2019 by Nico Bronze (900 points)

1 Answer

+1 vote
Hi, Nico.  

I don't calculate D-efficiency in R, but I do use Ngene in addition to Lighthouse Studio.  I've found that the two agree if I use enough respondents for Lighthouse Studio.  So I might run Lighthouse Studio with 500 respondents, and then calculate the answer as if each block had been answered just once.  Like R apparently does, Ngene calculates D-efficiency as if one respondent answers all blocks.
answered Dec 17, 2019 by Keith Chrzan Platinum Sawtooth Software, Inc. (104,650 points)
Hi, Keith,

thank you for your quick answer.
I still don't really understand how Lighthouse Studio calculates D-efficiency. Is the D-efficiency calculated for each block (=version) and then summed, or is the design matrix used as a whole, i.e. the rows of the different questionnaire versions simply stacked one below the other?
In R I do the latter, i.e. with 4 versions the design matrix X consists of the following versions:
Version 1
Version 2
Version 3
Version 4

With 8 respondents, the whole thing twice:
Version 1
Version 2
Version 3
Version 4
Version 1
Version 2
Version 3
Version 4

And so on. But in all cases, X is a single matrix, and so D-efficiency is calculated from (X’X).
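A quick numeric check of this stacking (with random numbers standing in for the effects-coded design): repeating all blocks once more multiplies |X'X| by 2^p, so the overall D-efficiency exactly doubles with twice the respondents, and design comparisons are unaffected.

```python
import numpy as np

rng = np.random.default_rng(1)
p = 10
# four blocks of 12 tasks x 3 alternatives = 36 rows each (stand-in values)
versions = [rng.standard_normal((36, p)) for _ in range(4)]

X4 = np.vstack(versions)        # 4 respondents, one per version
X8 = np.vstack(versions * 2)    # 8 respondents: every version answered twice

d4 = np.linalg.det(X4.T @ X4) ** (1 / p)
d8 = np.linalg.det(X8.T @ X8) ** (1 / p)
# X8'X8 = 2 * X4'X4, hence d8 = 2 * d4: adding respondents scales the
# overall D-efficiency uniformly
```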

Lighthouse Studio calculates D-efficiency from random response data, using the method described in this paper that Bryan Orme and I wrote many years ago, here:  https://sawtoothsoftware.com/support/technical-papers/design-of-conjoint-experiments/an-overview-and-comparison-of-design-strategies-for-choice-based-conjoint-analysis-2000
Hey Keith,

Thanks again for your answer!
Even after reading the paper, I do not arrive at the same results. The efficiencies of my designs differ only marginally (<1%). I am reassured, however, that my results match those of Jack Horne’s R package ‘choiceDes’, which also calculates the so-called "overall D-efficiency".

As you said, the differences might have something to do with Lighthouse Studio using data from simulated respondents. I'll try it again myself as soon as I have time.

I believe you are correct about the simulated respondents which is why I usually calculate the efficiency using a large number of respondents.
Hello, Keith,

I have calculated the D-efficiency with simulated respondents and it works perfectly, even in R. What surprised me is that all the R packages, even those developed specifically for choice experiments, only calculate the so-called "overall D-efficiency". This D-efficiency, however, comes from the field of conjoint analysis, not from choice-based conjoint: it treats each stimulus as a measurement point in statistical space and does not take into account that stimuli are compared within choice sets. Not to mention the consideration of blocks/versions.

What do I mean, exactly?
Imagine we have a design with 12 choice sets per respondent, each consisting of 3 stimuli, and 4 blocks (called versions in Lighthouse Studio). So the design matrix has 144 rows (=12*3*4). The D-efficiency then results from |(X'X)|^(1/p), with p as the number of parameters to be estimated. As you may have noticed, this approach considers neither the number of blocks nor the number of stimuli per choice set. For example, the D-efficiency of a design with 24 choice sets per respondent, one block and 6 stimuli per choice set is exactly the same, since X is identical, with 144 rows (=24*1*6).
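This invariance can be checked directly (random stand-in values for the effects-coded rows): regrouping or reordering the same 144 rows leaves X'X unchanged, so the overall D-efficiency is identical no matter how the rows are assigned to choice sets or blocks.

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.standard_normal((144, 10))   # 144 stimuli, 10 effects-coded columns
p = X.shape[1]

d_orig = np.linalg.det(X.T @ X) ** (1 / p)
X_perm = X[rng.permutation(144)]     # same rows, regrouped into new choice sets
d_perm = np.linalg.det(X_perm.T @ X_perm) ** (1 / p)
# the grouping into choice sets is invisible to |X'X|
```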

What is “the take away message”?
Be careful when you read the term "D-efficiency." For choice experiments, a large number of respondents should be simulated and the D-efficiency should be calculated based on this data (more precisely: based on the covariance matrix of the multinomial logit estimate).
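As a sketch, the covariance-based calculation itself is one line: D-efficiency = det(V)^(-1/p). The matrix V below is made up purely for illustration; in practice it comes from the MNL fit on the simulated choice data.

```python
import numpy as np

# Hypothetical covariance matrix of the MNL estimates (p = 3 parameters);
# the numbers are illustrative, not from a real fit.
V = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.05, 0.01],
              [0.00, 0.01, 0.03]])
p = V.shape[0]

D_eff = np.linalg.det(V) ** (-1 / p)   # equivalently det(V^-1)^(1/p)
```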

I hope my remarks help "confused laymen", because I was confused at first, too. :-)
Thanks, Nico.  And really, it's even a bit worse than that.  D-efficiency in logit models depends on the utilities, so the best we can do in estimating the D-efficiency of a model before fielding depends on how accurately we've estimated the utilities (and their distribution, if we're planning on running HB or a mixed logit as our analysis engine).  Our software, as I believe R does, calculates the null D-efficiency, i.e. the D-efficiency for a model which has all zeros for its utilities.
You're absolutely right.

One final question: Does it make sense to calculate the D-efficiency per block (version)? In this case, one MNL model per block or respondent would be estimated. Of course, this only works if the number of parameters to be estimated is less than or equal to the number of choice sets. I have already seen this approach in classical conjoint analysis, but not yet in CBC.
I don't know when I would use block-specific D-efficiencies, unless I thought there might be a really bad block that I might not want to use.  But I really don't see folks look at or report block-specific D-efficiencies.
As I said before, I have only seen this approach in classical conjoint analysis.

Purely out of interest, I have calculated the D-efficiencies (more precisely: null D-efficiencies) both aggregated across all respondents and separately for each respondent. Of course only for main effects, since interaction effects cannot be estimated for such small blocks. Since there is no orthogonal array for my (3,4,3,4) design, I first selected the 48 cards that maximize D-efficiency using the Fedorov algorithm and then applied established methods for creating choice sets, more precisely: Shifting, Mix&Match1 and Mix&Match2. I also used the methods of Cook & Nachtsheim, Sawtooth Random, Sawtooth Complete Enumeration, Sawtooth Shortcut and Sawtooth Balanced Overlap, for which the available cards were not limited to 48. In those cases, the 144 cards of the full factorial design were used for creating the 4 blocks with 12 choice sets each.

In the following, you can see the null D-efficiencies with aggregated MNL and 500 respondents:
Shifting: 2573
Mix&Match1: 1832
Mix&Match2: 1811
Cook & Nachtsheim: 1852
Sawtooth Random: 1976
Sawtooth Complete Enumeration: 2571
Sawtooth Shortcut: 2467
Sawtooth Balanced Overlap: 2179

In contrast to this, now the null D-efficiencies in disaggregated analysis, i.e. the D-efficiency is calculated per block/respondent and then summed over all respondents:
Shifting: 413
Mix&Match1: 320
Mix&Match2: 286
Cook & Nachtsheim: 360
Sawtooth Random: 352
Sawtooth Complete Enumeration: 602
Sawtooth Shortcut: 446
Sawtooth Balanced Overlap: 448

The relative differences in D-efficiency change between the two approaches. Especially with hierarchical Bayesian analysis, I could imagine that the results are better with Complete Enumeration than with Shifting, although the aggregated approach suggests a tie in terms of design efficiency. But these are only first thoughts; probably the differences are marginal and have little practical relevance.
This is a very interesting result regarding the difference between the two D-efficiencies for each design method.

Quick question about your Shifting and Mix and Match strategies - did you build them from an initial efficient design or an orthogonal array?  I might guess with the kind of differences you're showing that it was an orthogonal array, but even then the differences are (to me) surprisingly large.
To my knowledge there is no orthogonal array for my (3,4,3,4) design. Otherwise, I would have used it as a starting point, as in "Chrzan, Orme (2000) - An Overview and Comparison of Design Strategies for CBCA".  Therefore, I selected 48 of the 144 cards of the full factorial design using the Fedorov algorithm (see: http://reliawiki.org/index.php/Optimal_Custom_Designs).
Note: The Fedorov algorithm is only optimal for conjoint designs, not necessarily for choice-based conjoint.

Here is a comparison between an orthogonal array with 9 cards and a Fedorov output, also with 9 cards, for a (3,3,3,3) design:

Orthogonal Array
determinant = 0.3766103
diagonality = 1.0

Fedorov Output
determinant = 0.6136858
diagonality = 0.866

So you can see that the algorithm works fine.

I use the Fedorov output under the assumption that the stimuli that maximize the D-efficiency of classical conjoint analysis also provide a lot of information for choice-based conjoint. So those 48 of the 144 stimuli are my starting point for Shifting, Mix&Match1 and Mix&Match2.
Thank you for clarifying!
Hi Nico,
I see that you calculated the D-efficiency of a CBC manually using R (which I'm trying to do right now) and compared it to the value Lighthouse Studio gives us. I calculated the design efficiency using |(X'X)|^(1/p). From this discussion, I understood that for a choice experiment the D-efficiency calculated from simulated respondents makes more sense. So can you kindly help me understand how you simulated the data and calculated the D-efficiency in R?
I calculate the D-efficiency as described above and also by Sawtooth software.

Regarding your second question with the simulated respondents, you have two options:

(1) You use a random number generator with numbers from 1 to nChoices, where nChoices is the number of stimuli per choice set. In this case, the respondents are random respondents without a given utility function.

(2) You specify a utility function, perhaps even with heterogeneity between respondents. An example of this can be found in the blog of Sarrias, who created the R-Package 'gmnl': https://rpubs.com/msarrias1986/316032
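As a rough sketch of both options (in Python rather than R; the design values and part-worths below are made up purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
n_resp, n_tasks, n_alts, p = 500, 12, 3, 2   # 500 simulated respondents

# Option (1): random choosers -- each respondent picks an alternative
# uniformly at random in every choice task, with no utility function.
random_choices = rng.integers(1, n_alts + 1, size=(n_resp, n_tasks))

# Option (2): utility-driven choosers with Gumbel errors (the MNL assumption).
beta = np.array([0.5, -0.3])                 # hypothetical true part-worths
X = rng.standard_normal((n_resp, n_tasks, n_alts, p))   # stand-in design codes
utility = X @ beta + rng.gumbel(size=(n_resp, n_tasks, n_alts))
utility_choices = utility.argmax(axis=-1) + 1           # chosen alternative
```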

I hope this helps you.
Hi Nico,

Thanks for your reply.

I've tried the first method you suggested. I generated responses using a random number generator and then used the packages gmnl and mlogit to estimate the coefficients (betas) and calculate the D-efficiency based on the variance-covariance matrix of this estimate. But I'm facing the following issue:

- I run into an error saying the matrix is computationally singular, whereas Lighthouse gives a D-efficiency score for the same design. I really would like to understand what I'm doing wrong.

I would also like to get more clarity on null D-efficiency. I understand it to mean the following:
it is the case where the utilities are assumed to be zero, so the choice probabilities are the same for all profiles.

Is that right?
And since we chose the respondents' answers randomly with equal probabilities, do we call the above method of D-efficiency calculation "null D-efficiency"? Or is the null D-efficiency calculated in some other way?

Thanks in advance. It would really help me a lot if you can clarify the above.
Hi Nishanth,

I also use the mlogit package for determining D-efficiency. The model results are stored in the variable curr.modelresults. That is the part of my code that does the trick:

# variance-covariance matrix of the estimated coefficients
curr.vcov <- vcov(curr.modelresults, what = c("coefficient", "errors", "rpar"))
# D-efficiency = det(V)^(-1/p), with p the number of estimated parameters
curr.Deff <- det(curr.vcov)^(-1/length(curr.modelresults$coefficients))

Hope that helps. And please do not forget the Gumbel-distributed error term when your simulated respondents make their choices.

Your interpretation is right.

Best wishes