At Stitch Fix, when we want to understand the impact of some treatment or exposure, we often run an experiment. However, there are some questions we cannot answer via experimentation because it is operationally infeasible or because it doesn’t feel client-right. For example, we wouldn’t experimentally require stylists to style at specific times of day (most of our stylists work flexible, self-scheduled hours) or subject clients to experimental inventory conditions. In those cases, we rely on observational analyses. The key challenge in this setting is that the treatment has not been randomly assigned, which means that we may have issues with confounding (for more on causal structures, see this previous blog post). Our goal is to learn something true and be as precise as we can; a t-test is not going to cut the mustard here.
One common approach to the challenge of confounding is to fit a regression model for the outcome, including the treatment and the confounders as covariates, and read the treatment effect off the fitted model. Unfortunately, for this regression estimator to do its job, you have to specify the model correctly; if the model is misspecified, you may believe you are fully controlling for confounders when you are not[1]. Another way to tackle the problem is inverse probability of treatment weighting (IPTW), where you focus your regression efforts on learning the probability of treatment given the confounders. This modeling task suffers from the same risk of misspecification, and the resulting estimator isn’t terribly efficient. We rarely have a clear understanding of the potentially complex relationships between covariates, treatment/exposure, and outcomes, and yet many of us make unilateral model specifications daily.
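To make the two approaches concrete, here is a minimal sketch of both estimators on simulated data with a single confounder. This is illustrative only (the data, the linear/logistic model choices, and scikit-learn are assumptions for the example, not Stitch Fix code); in this toy setup the naive difference in means is biased, while both the regression-adjusted and IPTW estimates recover the true effect because the models happen to be correctly specified.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)

# Simulated data: the confounder x drives both treatment assignment and outcome.
n = 10_000
x = rng.normal(size=n)
p_treat = 1 / (1 + np.exp(-x))               # treatment is more likely when x is high
t = rng.binomial(1, p_treat)
y = 1.0 * t + 2.0 * x + rng.normal(size=n)    # true treatment effect is 1.0

# Approach 1: outcome regression, including the confounder as a covariate.
outcome_model = LinearRegression().fit(np.column_stack([t, x]), y)
ate_regression = outcome_model.coef_[0]

# Approach 2: inverse probability of treatment weighting (IPTW).
# Model the probability of treatment given x, then weight each unit by the
# inverse of the probability of the treatment it actually received.
propensity_model = LogisticRegression().fit(x.reshape(-1, 1), t)
e_hat = propensity_model.predict_proba(x.reshape(-1, 1))[:, 1]
weights = t / e_hat + (1 - t) / (1 - e_hat)
ate_iptw = (
    np.average(y[t == 1], weights=weights[t == 1])
    - np.average(y[t == 0], weights=weights[t == 0])
)

print(f"naive difference in means: {y[t == 1].mean() - y[t == 0].mean():.2f}")
print(f"regression-adjusted ATE:   {ate_regression:.2f}")
print(f"IPTW ATE:                  {ate_iptw:.2f}")
```

Both estimators lean entirely on the analyst getting one model right: the outcome model in the first case, the treatment model in the second. That is exactly the fragility the rest of this post is concerned with.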