On the mutability of response probabilities

I am still thinking about the estimation and use of response propensities during data collection. One tactic that may be used is to identify low propensity cases and truncate effort on them. This is a cost saving measure that makes sense if truncating the effort doesn't lead to a change in estimates.

I do have a couple of concerns about this tactic. First, each step back may seem quite small. But if we take this action repeatedly, we may end up with a cumulative change in the estimate that is significant. One way to check this is to continue the truncated effort for a subsamples of cases.

Second, and more abstractly, I am concerned that our estimates of response propensities will become reified in our minds.  That is, a low propensity case is always a low propensity case and there is nothing to do about that. In fact, the propensity is always conditional upon the design under which it is estimated. We ought to be looking for design features that change those probabilities. Preferably, design features that change low prob cases to high prob. I think this is the idea behind "phase capacity" in responsive design.

Popular posts from this blog

My dissertation was entitled "Adaptive Survey Design to Reduce Nonresponse Bias." I had been working for several years on "responsive designs" before that. As I was preparing my dissertation, I really saw "adaptive" design as a subset of responsive design.

Since then, I've seen both terms used in different places. As both terms are relatively new, there is likely to be confusion about the meanings. I thought I might offer my understanding of the terms, for what it's worth.

The term "responsive design" was developed by Groves and Heeringa (2006). They coined the term, so I think their definition is the one that should be used. They defined "responsive design" in the following way:

1. Preidentify a set of design features that affect cost and error tradeoffs.
2. Identify indicators for these costs and errors. Monitor these during data collection.
3. Alter the design features based on pre-identified decision rules based on the indi…

Let $\mathbf{X_{ij}}$ denote a $k_j \times 1$ vector of demographic variables for the $i^{th}$ person and $j^{th}$ call. The data records are calls. There may be zero, one, or multiple calls to household in each window. The outcome variable is an indicator for whether contact was achieved on the call. This contact indicator is denoted $R_{ijl}$ for the $i^{th}$ person on the $j^{th}$ call to the $l^{th}$ window. Then for each of the four call windows denoted $l$, a separate model is fit where each household is assum…