Do you really believe that?

I had an interesting discussion with someone at a conference recently. We had given a presentation that included some discussion of how response rates are poor predictors of when nonresponse bias will occur. We showed a slide from Groves and Peytcheva (2008).

Afterwards, I was speaking with someone who was not a survey methodologist. She asked me if I really believed that response rates didn't matter. I was a little taken aback. But as we talked some more, it became clear that she thought we were arguing in favor of low response rates. I found it interesting that the argument could be perceived that way.

To my mind, the argument wasn't about whether we should be trying to lower response rates. It was about which tools we should be using to diagnose the problem. In the past, the response rate was used as a summary statistic for discussing nonresponse. But the evidence from Groves and Peytcheva calls into question the utility of that single statistic. My conclusion is that we need to work harder to diagnose the risks of nonresponse bias. We need to view a constellation of statistics, developed under a variety of assumptions.
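To see why the response rate alone can mislead, here is a small simulation of the Groves and Peytcheva point: two surveys with the same response rate can have very different nonresponse bias, depending on how strongly the survey variable is related to the response propensity. The data-generating model below is a hypothetical illustration of mine, not taken from their paper.

```python
import math
import random

random.seed(42)
N = 100_000

def simulate(rho):
    """Simulate one survey. rho controls how strongly the survey
    variable y is related to the factor driving response propensity.
    Returns (response_rate, bias_of_respondent_mean)."""
    total_y = resp_y = 0.0
    n_resp = 0
    for _ in range(N):
        common = random.gauss(0, 1)
        y = rho * common + math.sqrt(1 - rho ** 2) * random.gauss(0, 1)
        p = 1 / (1 + math.exp(-common))   # response propensity
        total_y += y
        if random.random() < p:           # did this person respond?
            resp_y += y
            n_resp += 1
    return n_resp / N, resp_y / n_resp - total_y / N

rate_a, bias_a = simulate(0.0)   # y unrelated to propensity
rate_b, bias_b = simulate(0.8)   # y strongly related to propensity
# Both response rates come out near 50%, but only the second survey
# shows a substantial bias in the respondent mean.
```

Both surveys report essentially the same response rate, yet the bias in the respondent mean differs sharply, which is exactly why a single summary statistic is not enough.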


My dissertation was entitled "Adaptive Survey Design to Reduce Nonresponse Bias." I had been working for several years on "responsive designs" before that. As I was preparing my dissertation, I really saw "adaptive" design as a subset of responsive design.

Since then, I've seen both terms used in different places. As both terms are relatively new, there is likely to be confusion about the meanings. I thought I might offer my understanding of the terms, for what it's worth.

The term "responsive design" was coined by Groves and Heeringa (2006), so I think their definition is the one that should be used. They defined "responsive design" in the following way:

1. Pre-identify a set of design features that affect cost and error tradeoffs.
2. Identify indicators for these costs and errors. Monitor these during data collection.
3. Alter the design features based on pre-identified decision rules based on the indi…
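The steps above amount to a monitoring loop: track an indicator during data collection and switch design features when a pre-identified rule fires. Here is a minimal sketch of that loop; the indicator (cumulative response rate) and the decision rule (switch to an incentive protocol below a threshold) are hypothetical examples of mine, not from Groves and Heeringa (2006).

```python
def monitor(daily_counts, threshold=0.10):
    """Return the design feature to use next, given a list of
    (responses, attempts) counts observed so far.

    daily_counts : list of (responses, attempts) tuples, one per day.
    threshold    : pre-identified cutoff for the decision rule.
    """
    responses = sum(r for r, _ in daily_counts)
    attempts = sum(a for _, a in daily_counts)
    rate = responses / attempts if attempts else 0.0
    # Pre-identified decision rule: alter the design feature when
    # the monitored indicator crosses the threshold.
    return "incentive" if rate < threshold else "standard"

# Example: 9 responses from 200 attempts -> rate 0.045, rule fires.
next_protocol = monitor([(5, 100), (4, 100)])
```

In a real responsive design the indicators would target costs and errors, and the phases would be planned in advance; the point here is only the shape of the monitor-and-alter logic.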

Let $\mathbf{X_{ij}}$ denote a $k_j \times 1$ vector of demographic variables for the $i^{th}$ person and $j^{th}$ call. The data records are calls. There may be zero, one, or multiple calls to a household in each window. The outcome variable is an indicator for whether contact was achieved on the call. This contact indicator is denoted $R_{ijl}$ for the $i^{th}$ person on the $j^{th}$ call in the $l^{th}$ window. Then for each of the four call windows denoted $l$, a separate model is fit where each household is assum…
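One way to read this setup is as a separate logistic regression of the contact indicator on the demographic covariates within each call window. The sketch below fits such per-window models; the simulated data, the single covariate, and the plain gradient-ascent fitter are illustrative assumptions of mine, not the estimation method actually used.

```python
import math
import random

def fit_logistic(X, y, lr=0.1, steps=2000):
    """Fit P(R=1 | x) = 1 / (1 + exp(-(b0 + b.x))) by gradient
    ascent on the log-likelihood. Returns [intercept, slopes...]."""
    k = len(X[0])
    b = [0.0] * (k + 1)
    n = len(X)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            z = b[0] + sum(bj * xj for bj, xj in zip(b[1:], xi))
            p = 1 / (1 + math.exp(-z))
            err = yi - p            # gradient of the log-likelihood
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        b = [bj + lr * g / n for bj, g in zip(b, grad)]
    return b

random.seed(1)
# Simulate calls in four windows: contact depends on one demographic
# covariate, with a different baseline contact rate per window.
models = {}
for l, base in enumerate([-1.0, -0.5, 0.0, 0.5]):
    X = [[random.gauss(0, 1)] for _ in range(500)]
    y = [1 if random.random() < 1 / (1 + math.exp(-(base + 0.8 * x[0])))
         else 0 for x in X]
    models[l] = fit_logistic(X, y)   # one model per call window l
```

Fitting the windows separately lets each window have its own baseline contactability and its own covariate effects, which is the point of indexing the models by $l$.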