In my last post, I talked about using future data collection as a quasi-experimental validation of hypotheses about nonresponse. I thought I'd follow up on that a bit more.
We often have this controversy when discussing nonresponse bias: if I can adjust for some variable, then why do I need to bother making sure I get good response rates across the range of values that variable can take on? Just adjust for it.
That view relies on some assumptions. We assume that no matter what response rate I end up at, the same model applies; in other words, that the missing data depend only on that variable at every response rate I could choose (Missing at Random). But the missing data might depend on that variable alone at some response rates and not at others.
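To make that concrete, here's a minimal simulation sketch (in Python, with entirely made-up numbers and variable names) of a case where early response depends only on an auxiliary variable, but later response depends on the survey outcome itself. An adjustment on the auxiliary variable would work at the lower response rate and fail at the higher one:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical population: one auxiliary variable x known for everyone,
# and a survey outcome y correlated with x.
n = 100_000
x = rng.normal(size=n)
y = 2.0 + 1.5 * x + rng.normal(size=n)

# Early response depends only on x (MAR holds at this stage).
early = rng.random(n) < 1 / (1 + np.exp(-x))
# Later converts respond based on y itself (MAR breaks at this stage).
late = ~early & (rng.random(n) < 1 / (1 + np.exp(-y)))

for label, resp in [("early responders only", early),
                    ("early + late responders", early | late)]:
    # An adjustment on x corrects x-related selection; any residual
    # bias within x shows up as a nonzero mean residual among respondents.
    resid = (y - (2.0 + 1.5 * x))[resp].mean()
    print(f"{label}: response rate {resp.mean():.2f}, "
          f"residual bias {resid:+.3f}")
```

In a run like this, the residual bias is near zero at the early stage and clearly nonzero once the late responders are added, even though the adjustment variable never changed.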
In most situations, we're going to make some assumptions about the missingness for adjustment purposes. We can't test those assumptions with the data we actually observe. So no one can ever prove you wrong.
I like the idea of forming a hypothesis at an interim point in the data collection. We can make this hypothesis very specific by predicting values for the missing cases. Then we complete some additional interviews and compare our predictions for those cases to the newly observed data. Do the new data confirm our hypothesis? Do we revise our predictions for the remaining cases now that we have additional data? In this setup, we can at least partially check our assumptions as we go.
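Here's a rough sketch of what that workflow might look like, using an ordinary regression as a stand-in for whatever adjustment model is actually in play. The data, the 40% interim response rate, and the 15% conversion rate are all invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical frame: auxiliary variable x known for every case,
# outcome y observed only for interim respondents.
rng = np.random.default_rng(7)
n = 5_000
x = rng.normal(size=(n, 1))
y = 2.0 + 1.5 * x[:, 0] + rng.normal(size=n)
responded = rng.random(n) < 0.4          # interim response indicator

# Step 1: fit the adjustment model on interim respondents and record
# explicit predictions for every missing case. This is the hypothesis.
model = LinearRegression().fit(x[responded], y[responded])
pred_missing = model.predict(x[~responded])

# Step 2: additional fieldwork converts some of the nonrespondents.
converted = ~responded & (rng.random(n) < 0.15)
observed_new = y[converted]
predicted_new = model.predict(x[converted])

# Step 3: compare predictions to the newly observed values. A
# systematic gap is evidence against the interim missingness model.
gap = (observed_new - predicted_new).mean()
print(f"mean prediction error on converted cases: {gap:+.3f}")
```

If the gap is small, the interim assumptions survive this round; if it isn't, we have a concrete signal to revise the model and re-predict the remaining missing cases before the next wave of effort.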