I have 2 attributes which, based on theory, could interact. The Interaction Explorer gave a highly significant LL test on this interaction, which would be expected to yield a gain of 3% in percent certainty. I'm not sure whether this is a "reasonably large" improvement -- ## is it ##? The other significant 2-way interactions each led to a gain of <0.4% in percent certainty.
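
For reference, a minimal sketch of how the two quantities above relate, assuming percent certainty is defined as the share of the distance from the chance-level log-likelihood to a perfect fit (LL = 0) that the model covers. The log-likelihoods, task count, and number of alternatives are invented for illustration and are not from my study.

```python
import math

def null_log_likelihood(n_tasks, n_alternatives):
    """Log-likelihood of a chance model: every alternative equally likely."""
    return n_tasks * math.log(1.0 / n_alternatives)

def percent_certainty(ll_model, ll_null):
    """Fraction of the way from the null LL to a perfect fit (LL = 0)."""
    return (ll_model - ll_null) / (0.0 - ll_null)

def two_ll(ll_main, ll_interaction):
    """2 x log-likelihood difference, the statistic behind the LL test."""
    return 2.0 * (ll_interaction - ll_main)

# Hypothetical values, for illustration only.
ll_null = null_log_likelihood(n_tasks=1000, n_alternatives=4)
ll_main = -900.0   # assumed main-effects log-likelihood
ll_int = -858.4    # assumed log-likelihood with the interaction added

gain = percent_certainty(ll_int, ll_null) - percent_certainty(ll_main, ll_null)
print(f"2LL = {two_ll(ll_main, ll_int):.1f}, gain in pct certainty = {gain:.3f}")
```

The 2LL statistic grows with the number of choice tasks, so an interaction can be highly significant while the gain in percent certainty stays small. That is part of why I'm unsure how to judge the size of the gain.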

According to Sawtooth material, "It can be demonstrated that interaction effects can be revealed through choice simulations using main-effect models that do not directly model interaction terms, if the source of the interactions between attributes is principally due to differences in preference among groups or individuals."
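
To make the quoted claim concrete, here is a toy sketch (all utilities invented, with two hypothetical equal-sized segments) in which every respondent has purely additive main-effects utilities, yet the aggregated shares respond to price differently by brand. It illustrates the idea only and is not Sawtooth's own demonstration.

```python
import math

def prob_first(u_first, u_second):
    """Logit probability of choosing the first of two products."""
    return 1.0 / (1.0 + math.exp(u_second - u_first))

# Two hypothetical segments of equal size, main effects only (no interaction term).
SEGMENTS = [
    # Prefers brand X and is barely price sensitive.
    {"brand": {"X": 2.0, "Y": 0.0}, "price": {"low": 0.2, "high": -0.2}},
    # Prefers brand Y and is very price sensitive.
    {"brand": {"X": 0.0, "Y": 2.0}, "price": {"low": 2.0, "high": -2.0}},
]

def share_x(price_x, price_y):
    """Average share of brand X across segments for the given price levels."""
    probs = []
    for seg in SEGMENTS:
        u_x = seg["brand"]["X"] + seg["price"][price_x]
        u_y = seg["brand"]["Y"] + seg["price"][price_y]
        probs.append(prob_first(u_x, u_y))
    return sum(probs) / len(probs)

base_x = share_x("low", "low")
drop_x = base_x - share_x("high", "low")   # share X loses when X raises its price
drop_y = share_x("low", "high") - base_x   # share Y loses when Y raises its price

print(f"X loses {drop_x:.3f}, Y loses {drop_y:.3f} for the same price increase")
```

In this sketch the same price increase costs brand Y much more share than brand X, a brand-by-price pattern at the aggregate level that arises only because the segments differ.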

## How would I most effectively/efficiently evaluate that qualifier, i.e., that the source of the interaction is principally individual variation, so that I'm justified in presenting simulations from a main-effects model even though the interaction search found significant interactions?

The following are thoughts I have:

## 1) Compare average RLH between the Interaction and Main Effects models? (The two models gave 0.74 and 0.78 -- is that a 'significant' gain?) ##
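
A minimal sketch of the comparison in 1), assuming RLH is the geometric mean of the predicted probabilities of the alternatives each respondent actually chose, and that per-respondent RLH values are available for both models. The function names and inputs are hypothetical.

```python
import math

def rlh(chosen_probs):
    """Root likelihood: geometric mean of the probabilities of the chosen alternatives."""
    return math.exp(sum(math.log(p) for p in chosen_probs) / len(chosen_probs))

def mean_paired_difference(rlh_model_a, rlh_model_b):
    """Average per-respondent RLH change from model A to model B (same respondent order)."""
    diffs = [b - a for a, b in zip(rlh_model_a, rlh_model_b)]
    return sum(diffs) / len(diffs)

# Hypothetical example: one respondent, three tasks with four alternatives each.
print(rlh([0.60, 0.80, 0.70]))  # compare against the chance level of 1/4 = 0.25
```

Because this is in-sample fit, a model with more parameters will tend to show a higher RLH regardless of whether it predicts better, so I assume a holdout-based comparison would be more convincing than the raw difference.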

## 2) Compare model differences in SD of average utilities for each level? (again, what's "a reasonable difference"?) ##

## 3) Evaluate utilities and SDs of the interaction levels? (What's "large"?) Notice and interpret any particular interaction levels that jump out from the others (i.e., the ones "driving" the interaction effect)? ##

## 4) Run all simulations for BOTH models and see whether adding the interaction had an impact on preference shares? (And if there was some discrepancy in trends, how do you decide which model to retain?) ##
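
A minimal sketch of the comparison in 4), assuming individual-level total utilities for each product in a scenario have already been computed under each model and that shares come from a share-of-preference (logit) rule averaged over respondents. The function names are hypothetical and this is not the simulator's own code.

```python
import math

def share_of_preference(total_utils):
    """total_utils: one list per respondent with the total utility of each product."""
    n_products = len(total_utils[0])
    shares = [0.0] * n_products
    for resp in total_utils:
        top = max(resp)  # subtract the max for numerical stability
        exp_u = [math.exp(u - top) for u in resp]
        denom = sum(exp_u)
        for j, e in enumerate(exp_u):
            shares[j] += e / denom
    return [s / len(total_utils) for s in shares]

def max_abs_share_diff(shares_a, shares_b):
    """Largest absolute share difference between two models for one scenario."""
    return max(abs(a - b) for a, b in zip(shares_a, shares_b))

# Hypothetical utilities for two respondents and three products under each model.
main_effects = share_of_preference([[1.0, 0.5, 0.0], [0.2, 1.1, 0.3]])
interaction = share_of_preference([[1.2, 0.4, 0.0], [0.1, 1.3, 0.3]])
print(max_abs_share_diff(main_effects, interaction))
```

Running this over every planned scenario would show whether the interaction changes any share by more than an amount that matters for the decisions the simulations are meant to support.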

## 5) Most CIs for the interaction levels do not cover 0, but should this be a criterion for whether the interaction model is better? ##

Further, I'm ultimately concerned about precision. The 2-way interaction yields 154 levels (constructed lists were used for both attributes; a max of 5-6 levels per attribute were brought in per respondent); the expected sample size will be at most 600.

## Does the increase in parameters for this sample size reduce the reliability/value-added of an interaction model? Or is an interaction justified based on theory (that there is an interaction at the construct level as opposed to simple individual heterogeneity [again...how does one tell the two apart])? ##

## If I DO retain the interaction model, how would I best present average utilities (by way of a preamble before showing preference shares)? From other methods courses, I understand that main effects aren't usually interpreted when interaction effects are present. Is that applicable here? ##

I plan to present simulations separately for 4 segments.

## Is a segment-wise presentation for a *main effects model* reasonable for "showing the interaction", even if some source of interaction is at the construct level? ##

Thank you for reading and supporting!