We recently finished developing nonresponse adjustments for a large survey. We spent a lot of time modeling response probabilities and the key variables from the survey. One of our more interesting findings was that the number of calls (modeled in a number of different ways) was not predictive of the key variables but was highly predictive of response. In the end, we decided not to include this predictor. It could only add noise.
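For concreteness, here is a minimal sketch of the two checks just described (this is not our production code, and the column names `n_calls`, `responded`, and `income` are hypothetical): a logistic regression of response on the call count, and a regression of a key variable on the call count among respondents.

```python
# Minimal sketch of the two checks described above, assuming a frame-level
# DataFrame with hypothetical columns: n_calls (recorded call attempts),
# responded (0/1 response indicator), and income (a key variable observed
# only for respondents).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def check_call_count_predictiveness(df: pd.DataFrame) -> None:
    # Check 1: is the number of calls predictive of response?
    # (log1p is just one of the "number of different ways" to model it)
    response_fit = smf.logit("responded ~ np.log1p(n_calls)", data=df).fit(disp=False)
    print(response_fit.summary())

    # Check 2: is the number of calls predictive of a key variable?
    # Necessarily estimated on respondents only, since the key variable
    # is unobserved for nonrespondents.
    respondents = df[df["responded"] == 1]
    outcome_fit = smf.ols("income ~ np.log1p(n_calls)", data=respondents).fit()
    print(outcome_fit.summary())
```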
But this raises a question in my mind. There might be (at least) three sources of the noise:
1) The number of calls it takes to reach someone (as a proxy for contactability) is unrelated to the key variables. Maybe we could speculate that busier people are not different from less busy people on the key statistics (health, income, wealth, etc.).
2) The number of calls it takes to reach someone is effectively random. Interviewers make all kinds of choices about when and whom to call that have nothing to do with the case itself. These choices create a mismatch between contactability and the recorded number of calls.
3) Interviewers record the number of calls incorrectly (see the recent article by Biemer et al.). Measurement error adds noise.
In the end, we didn't need to distinguish among these three potential sources, but understanding them is important. If option 3 were the problem, then we would need to understand how to improve reporting on the number of calls. My guess is that issue 1 only occurs for some variables; therefore, understanding 2 and 3 will be important. A toy simulation of how 2 and 3 can obscure a real relationship is sketched below.
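In this simulation (all parameters assumed), latent contactability is genuinely related to a key variable, but interviewer discretion and misrecorded counts add noise to the recorded number of calls, attenuating the observable correlation.

```python
# Toy simulation (all parameters assumed) of sources 2 and 3: even when
# latent contactability is genuinely related to a key variable, interviewer
# discretion and misreported call counts can wash that relationship out of
# the recorded number of calls.
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

contactability = rng.normal(size=n)                  # latent ease of contact
key_var = 0.3 * contactability + rng.normal(size=n)  # key variable, modestly related

true_calls = 4.0 - 2.0 * contactability           # stand-in: harder to reach -> more calls
interviewer_noise = rng.normal(scale=3.0, size=n) # source 2: nonrandom calling decisions
reporting_error = rng.normal(scale=3.0, size=n)   # source 3: miscounted/misrecorded calls
recorded_calls = true_calls + interviewer_noise + reporting_error

for label, calls in [("true calls", true_calls), ("recorded calls", recorded_calls)]:
    r = np.corrcoef(calls, key_var)[0, 1]
    print(f"corr({label}, key variable) = {r:.3f}")
```

The correlation with the key variable is clearly visible for the true call counts but is substantially attenuated for the recorded counts, even though nothing about the underlying relationship changed.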
Interesting! Thanks, James.
How did you measure the relationship between contactability and key variables? I presume you did this only for respondents, so there is some missing data. Isn't there a fourth option, that the relationship between contactability and key variables exists, but only for nonrespondents? Meaning that: some nonrespondents would have responded if they had had one additional attempt, others would have needed five additional attempts, others maybe 100. Perhaps there is a correlation between this number and your key variables. What do you think, would a strong relationship among the nonrespondents be likely? Could it be strong enough to change your opinion on adjustment?
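A toy simulation (all numbers assumed) makes this fourth option concrete: if the key variable is flat among respondents but rises with the number of attempts that would have been needed among nonrespondents, a respondent-only check finds nothing, yet the respondent mean is biased.

```python
# Toy simulation (assumed numbers) of the "fourth option": the call
# count-key variable relationship is absent among respondents but present
# among the nonrespondents we never observe, so a respondent-only check
# sees nothing even though the respondent mean is biased.
import numpy as np

rng = np.random.default_rng(7)
n = 200_000

calls_needed = rng.integers(1, 51, size=n)  # attempts needed to obtain a response
responded = calls_needed <= 10              # fieldwork stops after 10 attempts

key_var = np.where(
    responded,
    rng.normal(loc=50.0, size=n),                       # flat among respondents
    rng.normal(loc=50.0 + 0.5 * calls_needed, size=n),  # rises with needed effort
)

print("respondent corr :", np.corrcoef(calls_needed[responded], key_var[responded])[0, 1])
print("respondent mean :", key_var[responded].mean())
print("full-sample mean:", key_var.mean())  # what the survey is trying to estimate
```

Here the correlation among respondents is essentially zero, but the respondent mean falls well below the full-sample mean, since the relationship lives entirely among the unobserved nonrespondents.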