# Calculate design-efficiency manually

Hello, everybody,

I would like to calculate the D-efficiency of different designs manually (e.g. with R). Unfortunately, I don't get the same results as with Lighthouse Studio.

How did I proceed? I used the design matrix of, for example, a (3,4,3,4) design with 12 choice tasks of 3 alternatives each and 4 blocks (versions). In total, the design matrix X has 144 rows and 4 columns. To take the effects coding into account, I expanded the matrix to 10 columns (main effects only, i.e. 2+3+2+3 = 10). For example, for factor 1, level 1 became (1,0), level 2 became (0,1) and level 3 became (-1,-1). After that I multiplied the transpose of X (10x144) by X to get X'X. The next step was calculating the determinant, |X'X|. The last step was taking the pth root, i.e. |X'X|^(1/p) with p=10.
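The steps above can be sketched in base R as follows. This is only an illustration: the full factorial built with `expand.grid` is a hypothetical stand-in for the actual 144-row design (it happens to have the same number of rows), and the names `design`, `effects_code`, `d_raw` and `d_per_row` are made up for the example. Note that the scaling `100 * |X'X|^(1/p) / N` used by Kuhfeld et al. (1994) is defined for an orthogonally coded X, so values from an effects-coded X are only comparable with other values computed under the same coding.

```r
# Hypothetical stand-in design: the full factorial of a (3,4,3,4) design (144 rows).
levels_per_factor <- c(3, 4, 3, 4)
design <- expand.grid(lapply(levels_per_factor, seq_len))

# Effects coding: contr.sum(k) maps level k to a row of -1s,
# e.g. for k = 3: level 1 -> (1,0), level 2 -> (0,1), level 3 -> (-1,-1).
effects_code <- function(x, k) contr.sum(k)[x, , drop = FALSE]
X <- do.call(cbind, unname(Map(effects_code, design, levels_per_factor)))

p <- ncol(X)                   # 2 + 3 + 2 + 3 = 10 main-effect columns
info <- crossprod(X)           # X'X, i.e. t(X) %*% X (transpose, not inverse)
d_raw <- det(info)^(1 / p)     # p-th root of |X'X|
d_per_row <- d_raw / nrow(X)   # same quantity scaled by the number of rows
```

`crossprod(X)` is used instead of `t(X) %*% X` only because it is the idiomatic way to form X'X in R; both give the same matrix.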

When I do this, I get a D-efficiency of 66.13658. But when I use Lighthouse Studio with 4 respondents, i.e. one respondent for each version, the D-efficiency is 14.60480.

I suspect that the difference could have something to do with the number of respondents and the number of parameters estimated.

The output of Lighthouse Studio shows 14 parameters instead of 10, since the base levels are also parameter estimates. Now I wonder whether Lighthouse Studio in my example calculates the 10th or 14th root? But even when calculating the 14th root the result is 19.96712.
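The 14th-root figure can be reproduced from the 10th-root figure alone, because both are roots of the same determinant. A quick check in R (the variable names are made up for the example):

```r
d10 <- 66.13658            # 10th root of |X'X| reported above
det_xtx <- d10^10          # implied determinant |X'X|
d14 <- det_xtx^(1 / 14)    # 14th root of the same determinant
d14
```

Since only the exponent changes, switching between p=10 and p=14 cannot by itself turn 66.13658 into 14.60480.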

Furthermore, my approach (probably) implicitly assumes that one respondent fills all 48 choice sets instead of 4 respondents with 12 choice sets each. But for overall D-efficiency, that should not make a difference. Is that right?
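For the linear-model quantity X'X, this can be checked directly: stacking the blocks into one matrix gives the same X'X as summing the X'X of each block. A minimal sketch, where the random -1/0/1 matrix is a hypothetical stand-in for a real effects-coded design and all names are made up for the example:

```r
set.seed(1)
n_blocks <- 4
rows_per_block <- 36           # 12 tasks x 3 alternatives
p <- 10

# Hypothetical coded design: random entries stand in for real effects-coded columns.
X <- matrix(sample(c(-1, 0, 1), n_blocks * rows_per_block * p, replace = TRUE),
            ncol = p)
block <- rep(seq_len(n_blocks), each = rows_per_block)

# X'X of the stacked design (one respondent answering all 48 choice sets)
info_pooled <- crossprod(X)

# X'X computed separately per block (one respondent per block), then summed
info_by_block <- lapply(split(seq_len(nrow(X)), block),
                        function(idx) crossprod(X[idx, , drop = FALSE]))
info_summed <- Reduce(`+`, info_by_block)

all.equal(info_pooled, info_summed)
```

The per-block matrices in `info_by_block` are also what a block-by-block calculation would start from, if each block is treated as having its own design matrix.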

In "Kuhfeld et al. (1994) - Efficient Experimental Design with Marketing Research Applications", the intra-block efficiency is also calculated. I assume that in this case every block has its own design matrix X, so in my case there would be 4 design matrices?

I suppose that I can't see the forest for the trees. Could you please tell me what I'm missing?

Thanks in advance!
asked Dec 17, 2019

## 1 Answer

+1 vote
Hi, Nico.

I don't calculate D-efficiency in R, but I do use Ngene in addition to Lighthouse Studio. I've found that the two agree if I use enough respondents in Lighthouse Studio. So I might run Lighthouse Studio with 500 respondents, and then calculate the answer as if each block had been answered just once. As R appears to do, Ngene calculates D-efficiency as if one respondent answers all blocks.
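To illustrate why the number of respondents has to be scaled out before comparing values: for the linear measure |X'X|^(1/p), repeating the whole design r times multiplies the result by exactly r, since |rA| = r^p |A|. The sketch below shows only that algebra. It says nothing about how Lighthouse Studio or Ngene compute their figures internally, and the random matrix is a hypothetical stand-in for a real coded design.

```r
set.seed(2)
p <- 10
# Hypothetical coded design with 144 rows
X <- matrix(sample(c(-1, 0, 1), 144 * p, replace = TRUE), ncol = p)

d_once <- det(crossprod(X))^(1 / p)           # design answered once

r <- 500                                      # number of replications
X_rep <- X[rep(seq_len(nrow(X)), times = r), ]
d_rep <- det(crossprod(X_rep))^(1 / p)        # design answered r times

all.equal(d_rep / r, d_once)                  # dividing by r recovers d_once
```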
answered Dec 17, 2019 by Platinum (103,225 points)
Thank you for clarifying!
Hi Nico,
I see that you calculated the D-efficiency of a CBC design manually using R (which I'm trying to do right now) and compared it with the value that Lighthouse Studio gives. I calculated the design efficiency using |(X'X)|^(1/p). From this discussion, I understood that for a choice experiment the D-efficiency calculated from simulated respondents makes more sense. Could you kindly help me understand how you simulated the data and calculated the D-efficiency using R?
I calculate the D-efficiency as described above and also by Sawtooth software.

Regarding your second question with the simulated respondents, you have two options:

(1) You use a random number generator with numbers from 1 to nChoices, where nChoices is the number of stimuli per choice set. In this case, the respondents are random respondents without a given utility function.

(2) You specify a utility function, perhaps even with heterogeneity between respondents. An example of this can be found in the blog of Sarrias, who created the R-Package 'gmnl': https://rpubs.com/msarrias1986/316032
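Option (1) can be sketched in base R as below. The sizes (4 respondents, 12 tasks, 3 alternatives) follow the example in the question; the data-frame layout and column names are assumptions made for the illustration, not a format required by any particular package.

```r
set.seed(42)
n_resp <- 4        # one respondent per version
n_tasks <- 12      # choice tasks per respondent
n_choices <- 3     # alternatives per choice task (nChoices above)

# One row per respondent x task, with a uniformly random choice from 1..n_choices
tasks <- expand.grid(task = seq_len(n_tasks), resp = seq_len(n_resp))
tasks$choice <- sample.int(n_choices, nrow(tasks), replace = TRUE)

# Long format: one row per alternative, flagged TRUE for the chosen one.
# merge() without shared column names returns the Cartesian product.
long <- merge(tasks, data.frame(alt = seq_len(n_choices)))
long$chosen <- long$alt == long$choice
long <- long[order(long$resp, long$task, long$alt), ]
head(long)
```

The `chosen` column would then be joined to the coded design rows before estimation.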

I hope this helps you.
Hi Nico,

Thanks for your reply.

I've tried the first method you suggested. I generated responses using a random number generator and then used the packages gmnl and mlogit to estimate the coefficients (betas) and calculate the D-efficiency from the variance-covariance matrix of the estimates. But I'm facing the following issue:

- I run into an error saying the system is computationally singular, whereas Lighthouse Studio gives a D-efficiency score for the same design. I would really like to understand what I'm doing wrong.

I would also like more clarity on null D-efficiency. I understand it as follows:
It is the case where the utilities are assumed to be zero, which means the choice probabilities are the same for all profiles.

Is that right?
And since we have picked the respondents' choices randomly with equal probabilities, do we call the above method of calculating D-efficiency the null D-efficiency? Or is the null D-efficiency calculated in some other way?
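For reference, the zero-utility case can also be evaluated without simulating choices, using the multinomial logit information matrix. With all utilities zero, each of the J alternatives in a choice set has probability 1/J, and the information contributed by a set reduces to the within-set centered cross-product divided by J. The sketch below is one standard analytic route and is not claimed to be what Lighthouse Studio does; the random matrix is a hypothetical stand-in for a coded design and the names are made up for the example.

```r
set.seed(7)
n_sets <- 48     # 4 blocks x 12 tasks
J <- 3           # alternatives per choice set
p <- 10

# Hypothetical coded design: one row per alternative, grouped by choice set
X <- matrix(sample(c(-1, 0, 1), n_sets * J * p, replace = TRUE), ncol = p)
set_id <- rep(seq_len(n_sets), each = J)

# MNL information at beta = 0: sum over sets of Xs' (diag(pr) - pr pr') Xs,
# which with pr = 1/J equals crossprod(centered Xs) / J.
info_null <- matrix(0, p, p)
for (s in seq_len(n_sets)) {
  Xs <- X[set_id == s, , drop = FALSE]
  Xc <- sweep(Xs, 2, colMeans(Xs))      # center columns within the choice set
  info_null <- info_null + crossprod(Xc) / J
}

# Same form as det(vcov)^(-1/p), since vcov is the inverse of the information
d_null <- det(info_null)^(1 / p)
```

Simulated equal-probability choices approximate this quantity only through the estimated covariance matrix, so the two numbers need not match exactly for a small number of respondents.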

Thanks in advance. It would really help me a lot if you can clarify the above.
Hi Nishanth,

I also use the mlogit package for determining D-efficiency. The model results are stored in the variable curr.modelresults. This is the part of my code that does the trick:

curr.vcov <- vcov(curr.modelresults, what = c("coefficient", "errors", "rpar"))
curr.Deff <- det(curr.vcov)^(-1/length(curr.modelresults$coefficients))

Hope that helps. And please do not forget the Gumbel-distributed error term when your simulated respondents make their choices.
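A minimal base-R sketch of choices made with a Gumbel error term is shown below. The design matrix and the utility vector `beta` are hypothetical placeholders; setting `beta` to zeros reproduces the random-respondent case, while any other values give respondents with a utility function. Standard Gumbel draws are generated by inverse transform as `-log(-log(u))` with `u` uniform on (0,1).

```r
set.seed(123)
n_sets <- 48
J <- 3
p <- 10

# Hypothetical coded design and hypothetical part-worth utilities
X <- matrix(sample(c(-1, 0, 1), n_sets * J * p, replace = TRUE), ncol = p)
beta <- rep(0, p)                       # zeros = null case; replace as needed
set_id <- rep(seq_len(n_sets), each = J)

# Total utility = deterministic part + standard Gumbel error
gumbel <- -log(-log(runif(nrow(X))))
utility <- as.vector(X %*% beta) + gumbel

# Within each choice set, the alternative with the highest utility is chosen
chosen <- ave(utility, set_id, FUN = function(u) u == max(u)) == 1
table(tapply(chosen, set_id, sum))      # every set should have one choice
```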

Your interpretation is right.

Best wishes
Nico