In my last post, I talked about using future data collection as a quasi-experimental validation of hypotheses about nonresponse. I thought I'd follow up on that a bit more.

We often have this controversy when discussing nonresponse bias: if I can adjust for some variable, then why do I need to bother making sure I get good response rates across the range of values that variable can take on? Just adjust for it.

That view relies on some assumptions. We assume that the same model applies no matter what response rate I end up at. In other words, the missing data depend only on that variable (Missing at Random) at every response rate I could choose. But that need not hold: the missing data might depend only on that variable at some response rates and not at others.
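
To make that concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the population, the two response models (`early` and `late`), and the helper `adjusted_mean` are hypothetical, and I use expected respondent counts rather than random draws so the numbers are exact. The point is only that the same weighting-class adjustment on x can be unbiased under one response model and biased under another.

```python
from collections import defaultdict

def adjusted_mean(units, propensity):
    """Weighting-class estimate of the mean of y, adjusting for x.

    units: list of (x, y) pairs for the full (hypothetical) population.
    propensity: function (x, y) -> response probability (an assumption).
    Uses expected respondent totals, so there is no sampling noise.
    """
    pop_count = defaultdict(float)    # population size in each class of x
    resp_weight = defaultdict(float)  # expected number of respondents per class
    resp_total = defaultdict(float)   # expected sum of y among respondents per class
    for x, y in units:
        p = propensity(x, y)
        pop_count[x] += 1
        resp_weight[x] += p
        resp_total[x] += p * y
    n = len(units)
    # Respondent mean within each class, weighted by the class's population share.
    return sum(pop_count[x] / n * resp_total[x] / resp_weight[x] for x in pop_count)

# Hypothetical population: two classes of x, binary y within each class.
population = (
    [("a", 1)] * 30 + [("a", 0)] * 70
    + [("b", 1)] * 60 + [("b", 0)] * 40
)
true_mean = sum(y for _, y in population) / len(population)

def early(x, y):
    # Lower response rate: response depends only on x (MAR given x).
    return 0.2 if x == "a" else 0.4

def late(x, y):
    # Higher response rate: within a class, response also depends on y.
    return (0.5 if x == "a" else 0.7) + (0.2 if y == 1 else 0.0)

early_est = adjusted_mean(population, early)
late_est = adjusted_mean(population, late)

print(f"true mean:          {true_mean:.3f}")
print(f"adjusted, early:    {early_est:.3f}")
print(f"adjusted, late:     {late_est:.3f}")
```

In this toy setup the adjustment recovers the true mean under the first response model and overshoots under the second, even though the second has the higher response rate. The direction could just as easily be reversed with different made-up propensities; which case holds in a real survey is exactly what we can't see from the respondent data alone.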

In most situations, we're going to make some assumptions about the missingness for adjustment purposes. We can't test those assumptions. So no one can ever prove you wrong.

I like the idea that we have a hypothesis at an interim point in the…
