I have been working on a presentation on two-phase sampling. I went back to an old example from an RDD CATI survey we did several years ago. In that survey, we defined phase 1 using effort level. The first 8 calls were phase 1. A subsample of cases was selected to receive 9+ calls. It was nice in that it was easy to define the phase boundary. And that meant that it was easy to program. But, the efficiency of the phased approach relied upon their being differences in costs across the phases. Which, in this case, means that we assume that cases in phase two require similar levels of effort to be completed. This is like assuming a propensity model with calls as the only predictor. Of course, we usually have more data than that. We probably could create more homogeneity in phase 2 by using additional information to estimate response probabilities. I saw Andy Peytchev give a presentation where they implemented this idea. Even just the paradata would help. As an example, consider two...
Blogging about survey methods, responsive design, and all things survey related.