Defining phases

I have been working on a presentation on two-phase sampling. I went back to an old example from an RDD CATI survey we did several years ago. In that survey, we defined phase 1 using effort level. The first 8 calls were phase 1. A subsample of cases was selected to receive 9+ calls.
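
To make the design concrete, here is a minimal Python sketch of the phase 2 selection step. It assumes a simple random subsample and a one-in-three rate, neither of which is stated above. The function name `select_phase2` and the case counts are hypothetical. The one firm point is that each subsampled case carries a weight equal to the inverse of the subsampling rate, so that it stands in for the cases that were dropped.

```python
import random

def select_phase2(case_ids, rate, seed=0):
    """Subsample cases unresolved after phase 1 and attach phase 2 weights.

    Assumes simple random sampling without replacement (an illustrative
    choice). Returns {case_id: weight} for the selected cases only.
    """
    rng = random.Random(seed)
    n = round(len(case_ids) * rate)
    if n == 0:
        raise ValueError("subsampling rate is too low to select any case")
    sampled = rng.sample(case_ids, n)
    # Weight is the inverse of the realized subsampling rate.
    weight = len(case_ids) / n
    return {case_id: weight for case_id in sampled}

# Hypothetical numbers: 1,000 cases still unresolved after 8 calls,
# of which roughly one third are retained for 9+ calls.
unresolved = list(range(1000))
phase2 = select_phase2(unresolved, rate=1 / 3)
```

The weights of the retained cases sum back to the number of cases that were eligible for phase 2, which is a quick sanity check on any selection routine like this.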

It was nice in that it was easy to define the phase boundary, and that meant it was easy to program. But the efficiency of the phased approach relied upon there being differences in costs across the phases. In this case, that means we assume that cases in phase 2 require similar levels of effort to be completed. This is like assuming a propensity model with calls as the only predictor.

Of course, we usually have more data than that. We probably could create more homogeneity in phase 2 by using additional information to estimate response probabilities. I saw Andy Peytchev give a presentation where they implemented this idea. Even just the paradata would help. As an example, consider two cases:

We've called this case 8 times. No one has ever answered. No answering machine.
We've called this case 8 times. We spoke to a person 3 times and scheduled an appointment that was subsequently missed.

I'd bet our chances are better with the second case.
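
One rough way to check that intuition is to look at past cases that reached the same call count and compare completion rates by what happened on those calls. The sketch below is in Python and uses simple cells instead of a fitted propensity model, which is a deliberate simplification. The function name `cell_propensities`, the two paradata flags, and every record in `history` are made up for illustration. They are not results from the survey described here.

```python
from collections import defaultdict

def cell_propensities(history):
    """Empirical completion rate within each paradata cell.

    history: iterable of (ever_contacted, had_appointment, completed)
    tuples, one per case that reached the phase boundary.
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [completes, cases]
    for ever_contacted, had_appointment, completed in history:
        cell = (ever_contacted, had_appointment)
        counts[cell][0] += int(completed)
        counts[cell][1] += 1
    return {cell: done / total for cell, (done, total) in counts.items()}

# Invented records for cases that all received 8 calls.
history = (
    [(False, False, True)] * 1 + [(False, False, False)] * 9  # never any contact
    + [(True, True, True)] * 4 + [(True, True, False)] * 6    # contact plus appointment
)
propensity = cell_propensities(history)
```

With estimates like these, phase 2 could be defined on estimated propensity instead of call count alone, and a logistic regression on the same paradata would be the natural next step once there are more predictors than a few cells can handle.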

If we did build a propensity model and used the results, then we would just need to be careful that the estimates near the boundary are consistent across the field period (see my previous post on this topic).

Survey Methods Musings
