Should we discount RLH in Dual Response None models?

Dual Response None modelling uses the same respondent twice: the forced-choice response and the binary None response are treated as two different observations from the same respondent.
But modelling the "None of these" response is typically a much easier task, resulting in much higher hit rates and a higher RLH (if modeled separately).

Now this easier part suddenly becomes 50% of the model in Dual Response None modelling. Wouldn't it inflate the overall RLH unfairly?

If yes, should we be discounting this RLH somehow in order to properly compare the RLH of traditional None models to the RLH of Dual Response models?

Thanks in advance
asked Apr 9, 2019 by furoley Bronze (885 points)

1 Answer

0 votes
Good points.

When dual-response None is being used, the choice task is coded twice in the design, but only when the respondent says "No, I really wouldn't buy this product". When the respondent says "Yes, I would buy", the task is coded just once to capture all the information. So, the number of choice tasks used in the actual coding and estimation depends on the respondent's answers.

RLH for CBC-type studies is not so straightforward as a technique for identifying "bad" respondents. That's because a respondent who simplifies in order to complete the survey very rapidly can get a very high RLH. For example, always choosing "None" or "No, I wouldn't buy" in the dual-response format will result in a high RLH. Likewise, a respondent who always picks the lowest-priced product in every choice task will get a high RLH, and so on.
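The chance-level arithmetic behind the original question can be sketched in a few lines. RLH is the geometric mean of the likelihoods of the observed choices, so for a flat-utility random responder it is the geometric mean of 1/(number of alternatives) across the coded tasks. The task and concept counts below are assumptions chosen only for illustration:

```python
import math

def chance_rlh(task_sizes):
    """Geometric mean of chance-level choice probabilities.

    task_sizes: number of alternatives in each coded task.
    A purely random responder with flat utilities picks each alternative
    with probability 1/size, so RLH is the geometric mean of 1/size.
    """
    logs = [math.log(1.0 / s) for s in task_sizes]
    return math.exp(sum(logs) / len(logs))

# Traditional None: 12 tasks, 4 concepts + None = 5 alternatives each
print(round(chance_rlh([5] * 12), 3))     # 0.2

# Dual-response None where the respondent said "No" every time:
# each task is coded twice -- a 4-way forced choice plus a binary buy/no-buy
print(round(chance_rlh([4, 2] * 12), 3))  # 0.354
```

Blending binary observations (chance 1/2) into the likelihood raises the chance-level floor from 0.20 toward 0.35 in this toy setup, which is exactly the inflation the question raises; the random-robot baseline described below measures it for your actual design.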

However, for any given CBC survey setup, it is interesting to generate, say, 1000 random-responding people and to estimate HB utilities for them.  This will give you a distribution of RLH scores that come from truly random-responding cases.  A few of those random-responders will get lucky and get a reasonably high RLH.  But, most of these respondents will show relatively low RLH and viewing this distribution of RLH scores will give you a good feel for what it means to be a random responder in terms of RLH.  

The average RLH for random responders will differ for each CBC questionnaire, depending on its characteristics: #concepts per task, #tasks, #attributes, #levels, and whether dual-response None or a traditional None is used. So, this practice of generating 1000 random responders and analyzing via HB needs to be repeated each time you create a CBC survey with different characteristics.
answered Apr 11, 2019 by Bryan Orme Platinum Sawtooth Software, Inc. (175,415 points)
Thank you, Bryan.
Thanks for the reminder of the method "add a random-response (or shuffled) sample of 500-1000, re-analyse via HB, filter out the rest, and apply the 95th percentile on RLH", which gives you the true project-specific RLH level for randomness. Now we can reference or index the actually achieved RLH against that derived "randomness level" RLH.

The problem is that I know someone who took a very stretched design with very sparse data and a modest sample size, and who boosted their model's RLH by applying the "dual response none" framework MULTIPLE times, once for each point of the follow-up scale. In the end they had a combined dataset roughly 10 times the size of the original sample, and they achieved a skyrocketing RLH no one else could achieve by regular means.

In other words, someone reasoned: if Sawtooth Software thinks it is okay to take real forced-choice data, which typically gives a more modest, realistic RLH level, and combine it with "dual response none" data, which typically results in an elevated RLH, so that the combination yields an elevated RLH overall, then why not do the latter part not just once but several times, boosting RLH more and more?

"Do we have 10 points in a follow-up question? Then let's model it as if there were 11 samples: 1 of forced choices + 10 dual-response ones, one for each point on the scale." We would get an average RLH of something like 95% no matter how bad the design and how sparse the data were.
But would the suggested method of referencing RLH against the randomness-level RLH bring this back to reality?
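The same chance-level arithmetic shows what a purely random responder would score under a "1 forced + 10 binary" expansion (the 4-concept forced choice is an assumption for illustration):

```python
import math

# Chance-level RLH when each choice task is expanded into 1 forced choice
# among 4 concepts plus 10 binary follow-up observations.
probs = [1 / 4] + [1 / 2] * 10
rlh = math.exp(sum(math.log(p) for p in probs) / len(probs))
print(round(rlh, 2))  # 0.47
```

So even a coin-flipping respondent lands near 0.47 once binary observations dominate the likelihood, which is why the reference level must be recomputed under the same expanded coding rather than compared against a conventional benchmark.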

I am not sure it will. I am just asking.
Indeed, high RLH is not the end goal.  The end goal is to be able to predict with high accuracy new choices made by people out-of-sample (such as real buyers in the real world).  

In our discrete choice experiments, one can artificially boost RLH by a number of means and get worse out-of-sample predictions due to overfitted models and artificially high RLHs. For example, using HB, just go into the advanced settings and increase the Prior Variance from its default 1.0 to something like 100. That will pump up the RLH significantly, causing each respondent's utilities to fit each respondent's choices better. Less information will be "borrowed" from the population via the helpful aspect of appropriate Bayesian smoothing. And the out-of-sample predictions will suffer.

Running random "robots" through the questionnaire and analyzing via HB will give you a sense of what random respondents would achieve in terms of fit.  That's a good thing to examine if you are interested in identifying real respondents who seem to have been answering the same questionnaire essentially randomly.
Thank you again, Bryan.
Thank you for a good reminder of how RLH depends on the priors selected for the lower- vs. upper-level model (and also on the design, the number of options presented, overfitting, etc.).

If random robots are the best approach, can we get it in the software?
There is a really simple trick that would fit all designs and situations: take whatever real design, data, and model is submitted for analysis, shuffle the versions (or screens) for half of the sample, then report RLH or PCT for each half separately and index or reference one against the other.
It could be as simple as a checkbox in any model's settings.
I know you have a long plan for the developers...
Just an idea.

Thanks again
Great news for you: random robots (for CBC, MaxDiff, or ACBC projects) are already available in the Lighthouse Studio software as an automatic option.

First, clean out your Test data area on your hard drive (in case you've already generated some test records) by clicking Test + Reset Data...

Next, click Test + Generate Data...

On the dialog that appears, specify the number of random-answering respondents you wish to generate (by default it is 100, but you could increase it to something like 300, which would probably be sufficient).

Click Generate.

This generates 300 (or however many you requested) random-responding robotic respondents locally on your hard drive. This may take several minutes and you'll see a progress indicator.

After you have generated those random-responding robots, just click Test + Download Data to move those respondents (as if they were real respondents) into your dataset on your hard drive for your project.

Last, click Analysis + Analysis Manager. Then, click Add to add a new utility run, select HB as the "Analysis Type", and click Run.

This computes HB utilities for the CBC, ACBC, or MaxDiff project for your random respondents. In the HB report you will find a tab that shows the individual-level raw utility scores and their RLHs. Take those data to something like Excel (or your favorite analysis package) and sort the random responders from low to high RLH. Examine the median RLH for random responders, and also look at the 95th percentile (the top 5% RLH that random responders can achieve).

Once you obtain real respondents, if you set your cutoff RLH to the top 5% RLH level achieved by random responders, then you are roughly 95% confident that a random responder who takes your survey will fail this cutoff level and be identified as such.
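The percentile step can also be sketched in Python instead of Excel. The beta-distributed scores below are simulated stand-ins for the robots' RLH values, not real HB output:

```python
import numpy as np

# One RLH score per random-responding robot from the HB report
# (simulated here purely for illustration)
rng = np.random.default_rng(0)
rlh = rng.beta(4, 8, size=300)  # stand-in for the robots' RLH scores

# Top 5% RLH achieved by random responders: the suggested cutoff
cutoff = np.percentile(rlh, 95)
print(f"flag real respondents with RLH at or below {cutoff:.3f}")
```

Real respondents scoring at or below this cutoff are, by construction, in the range that roughly 95% of random responders fail to exceed.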

After you're done playing with random responders, click Test + Reset Data again to clean them out of your data file prior to collecting real respondents and appending them to the dataset on your hard drive.
Thank you again, Bryan, for your great advice.
I should be ashamed for missing it when this new functionality came in. Sorry.

One question though: previously you recommended APPENDING random-response data to real-response data and analysing the combined data together via HB modelling.
I thought (and still feel) that this is a superior approach compared to analysing the random responses separately. That is because a lot of inter-borrowing would happen between the random and non-random people in the sample, leading the non-random sample to be somewhat flatter in its results, but also leading the random sample to look a bit more like a meaningful population average.
I would not debate whether this is a good or bad thing for the random-sample estimates per se. But this is exactly what would happen to any real random or confused clicker hidden somewhere in a real data sample. He/she will not be analyzed alone or in a group of similar random clickers, so their estimates would show some assimilation toward the rest of the sample.
And if I want to catch these more accurately via their RLHs, affected that way by the rest of the sample, shouldn't I also allow the random-response sample to be affected by the real-response sample?
Yes, you make a good point. RLH is affected not only by how consistently a respondent chooses, but also by how much that respondent's preference vector (the utilities) resembles the multivariate distribution of preference vectors across the sample. So, if you have real respondents, you can append random-responding respondents to the dataset, and this may give an even more complete view of what a mixture of random responders and consistent human responders looks like in terms of the estimated RLH fit scores.