Tuesday, February 19, 2013

Estimating Daily Contact Models in Real-Time

A couple of years ago, I was running an experiment on a telephone survey. The results are described here. As part of the process, I estimated a multi-level logistic regression model on a daily basis. I had some concern that early estimates of the coefficients and the resulting probabilities (which were the main interest) could be biased: the more easily interviewed cases are usually completed early in the field period, so the "sample" used for the estimates is disproportionately composed of easy responders. To mitigate this risk, I used data from prior waves of the survey (which include both early and late responders) when estimating the model. The estimates also controlled for level of effort (number of calls) by including all call records and estimating household-level contact rates.
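The pooling idea can be sketched in a few lines. This is a simplified, single-level stand-in for the multi-level model in the post, with made-up coefficients and simulated call records: each field day, the current wave's accumulated records are pooled with prior-wave records (which include both early and late responders) before re-fitting.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_logistic(X, y, iters=1000, lr=0.5):
    """Plain logistic regression fit by gradient ascent on the log-likelihood."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w += lr * X.T @ (y - p) / len(y)
    return w

# Hypothetical true contact model: intercept -0.5, covariate effect 1.0.
true_w = np.array([-0.5, 1.0])

# Call records from prior waves (early and late responders alike).
n_prior = 2000
X_prior = np.column_stack([np.ones(n_prior), rng.normal(size=n_prior)])
y_prior = rng.binomial(1, 1.0 / (1.0 + np.exp(-X_prior @ true_w)))

# Re-fit daily as the current wave's call records accumulate,
# always pooling them with the prior-wave records.
daily_estimates = []
X_cur = np.empty((0, 2))
y_cur = np.empty(0)
for day in range(10):
    X_new = np.column_stack([np.ones(100), rng.normal(size=100)])
    y_new = rng.binomial(1, 1.0 / (1.0 + np.exp(-X_new @ true_w)))
    X_cur = np.vstack([X_cur, X_new])
    y_cur = np.concatenate([y_cur, y_new])
    w_hat = fit_logistic(np.vstack([X_prior, X_cur]),
                         np.concatenate([y_prior, y_cur]))
    daily_estimates.append(w_hat)
```

Because the prior-wave records dominate the pooled data early on, the daily coefficient estimates start near the truth rather than being driven by the first few days' easy responders.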

During the experiment, I monitored the estimated coefficients on a daily basis. They were remarkably stable over time.

Of course, nothing says it had to turn out this way. I have found examples that are less stable than this. But I wanted to start out with the good news...

Monday, February 11, 2013

Interesting Experiment

I recently read an article about a very interesting experiment. Luiten and Schouten report on an experiment to improve Statistics Netherlands' Survey of Consumer Sentiment. Their task was to improve the representativity of the survey (measured by the R-indicator) without increasing costs and without lowering the response rate. This sounds like a difficult task. We can debate the merits of lowering response rates in "exchange" for improved representativity. But who can argue with increasing representativity without major increases in costs or decreases in response rates?
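For readers unfamiliar with the measure: the R-indicator is defined as R = 1 - 2·S(ρ̂), where S(ρ̂) is the standard deviation of the estimated response propensities. When everyone has the same propensity, R is 1 (perfectly representative response); the more propensities vary across the sample, the lower R falls. A minimal sketch with hypothetical propensities:

```python
import numpy as np

def r_indicator(propensities):
    """R-indicator: R = 1 - 2 * S(rho_hat), where S is the standard
    deviation of the estimated response propensities."""
    return 1.0 - 2.0 * np.std(propensities)

# Hypothetical estimated propensities for two designs.
uniform = np.full(1000, 0.5)                           # everyone equally likely to respond
skewed = np.r_[np.full(500, 0.8), np.full(500, 0.2)]   # easy vs. hard subgroups

r_uniform = r_indicator(uniform)   # 1.0: perfectly representative
r_skewed = r_indicator(skewed)     # 1 - 2*0.3 = 0.4: much less representative
```

In practice, the propensities would come from a logistic regression of response on frame and paradata variables, not be known constants as they are here.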

The experiment has a number of features all built with the goal of meeting these constraints. One of the things that makes their paper so interesting is that each of the design features is "tailored" to the specifics of the sampled units. For those of you who like the suspense of a good survey experiment, spoiler alert: they managed to meet their objectives.

Saturday, February 2, 2013

Level-of-Effort Paradata and Nonresponse Adjustments

We recently finished the development of nonresponse adjustments for a large survey. We spent a lot of time modelling response probabilities and the key variables from the survey. One of our more interesting findings was that the number of calls (modeled in a number of different ways) was not predictive of the key variables but was highly predictive of response. In the end, we decided not to include this predictor. It could only add noise.
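This kind of check is easy to illustrate. In the simulated data below (entirely made up, not the survey's data), the number of calls strongly predicts response but is unrelated to a key variable among respondents; an adjustment variable with this pattern adds variance to the weights without reducing bias:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Hypothetical call records: number of calls per case.
n_calls = rng.poisson(3, n) + 1

# Response is strongly driven by the number of calls...
p_response = 1.0 / (1.0 + np.exp(-(1.5 - 0.4 * n_calls)))
response = rng.binomial(1, p_response)

# ...but the key survey variable (say, a health or income measure)
# is simulated independently of the call count.
key_var = rng.normal(50, 10, n)

# Calls predict response, but not the key variable among respondents.
corr_with_response = np.corrcoef(n_calls, response)[0, 1]
resp = response == 1
corr_with_key = np.corrcoef(n_calls[resp], key_var[resp])[0, 1]
```

Here `corr_with_response` is clearly negative (more calls, less likely to respond), while `corr_with_key` hovers near zero, which is the empirical pattern that led us to drop the predictor.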

But this raises a question in my mind. There might be (at least) three sources of this noise:

1) The number of calls it takes to reach someone (as a proxy for contactability) is unrelated to the key variables. Perhaps busier people simply do not differ from less busy people on the key statistics (health, income, wealth, etc.).

2) The number of calls it takes to reach someone is effectively random with respect to contactability. Interviewers make all kinds of non-random choices about whom to call and when. These choices create a mismatch between contactability and the number of calls.

3) Interviewers record the number of calls incorrectly (see the recent article by Biemer et al.). This measurement error adds noise.

In the end, we didn't need to distinguish among these three potential sources. But understanding these potential sources of error is important. If source 3 were the problem, then we would need to understand how to improve reporting on the number of calls. My guess is that source 1 applies only to some variables; therefore, understanding sources 2 and 3 will be important.
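Sources 2 and 3 can be thought of as attenuation: even if calls were a perfect inverse measure of contactability, adding scheduling noise and recording error dilutes the correlation. A toy simulation (all quantities hypothetical) makes the point:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10000

# Latent contactability: how hard each household truly is to reach.
contactability = rng.normal(0, 1, n)

# An idealized call count driven purely by contactability.
ideal_calls = -contactability

# Source 2: non-random interviewer scheduling choices.
scheduling_noise = rng.normal(0, 1, n)
# Source 3: recording error in the call counts themselves.
recording_error = rng.normal(0, 1, n)

observed_calls = ideal_calls + scheduling_noise + recording_error

corr_ideal = np.corrcoef(contactability, ideal_calls)[0, 1]        # -1.0
corr_observed = np.corrcoef(contactability, observed_calls)[0, 1]  # about -0.58
```

With equal-variance noise from each source, the observed correlation shrinks from -1 to roughly -1/sqrt(3). If something like this is happening, the recorded call counts can predict response (which depends on the calls actually made) while carrying little usable signal about the households themselves.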