One issue we have been discussing is how to construct indicators for the risk of nonresponse bias. Some indicators use observed information (largely sampling frame data) to assess whether respondents and nonrespondents are similar. The R-Indicator is one example, but it is not the only one; there are also several sample balance indicators. These indicators rest on an implicit model: the observed characteristics are related to the survey data, so controlling for them will also control the potential for nonresponse bias.
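To make the first type of indicator concrete, here is a minimal sketch of the R-Indicator, which is usually defined as R = 1 - 2 S(rho), where S(rho) is the standard deviation of the estimated response propensities. The sketch assumes an unweighted sample and estimates propensities as simple response rates within cells of a single frame variable; the function names and the toy data are hypothetical, and in practice the propensities would more often come from a logistic regression on several frame variables.

```python
from collections import defaultdict
from math import sqrt


def cell_propensities(cells, responded):
    """Estimate each case's response propensity as the response rate in its cell."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [respondents, sample size]
    for cell, r in zip(cells, responded):
        counts[cell][0] += int(r)
        counts[cell][1] += 1
    return [counts[c][0] / counts[c][1] for c in cells]


def r_indicator(propensities):
    """R = 1 - 2 * S(rho), with S the sample standard deviation (unweighted)."""
    n = len(propensities)
    mean = sum(propensities) / n
    var = sum((p - mean) ** 2 for p in propensities) / (n - 1)
    return 1 - 2 * sqrt(var)


# Hypothetical sample of eight cases with one frame variable.
cells = ["urban"] * 4 + ["rural"] * 4
responded = [1, 1, 1, 0, 1, 0, 0, 0]

rho = cell_propensities(cells, responded)  # 0.75 for urban, 0.25 for rural
r = r_indicator(rho)
print(round(r, 3))
```

A value of 1 means every case has the same estimated propensity; values move toward 0 as the propensities spread out across the frame variable.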
Another type of indicator uses the observed data, including the observed survey data, together with a model to fill in the missing survey data. The goal is to predict whether nonresponse bias is likely to occur. Here, the model is explicit.
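As a rough illustration of this second type of indicator, the sketch below fills in each nonrespondent's missing survey value with the respondent mean of its cell, then compares the respondent mean with the filled-in full-sample mean. The cell-mean model, the function name, and the data are all assumptions made for the example; any explicit prediction model could take the place of the cell means.

```python
from collections import defaultdict


def estimated_bias(resp_cells, resp_y, nonresp_cells):
    """Respondent mean minus the full-sample mean after cell-mean fill-in."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cell, y in zip(resp_cells, resp_y):
        sums[cell] += y
        counts[cell] += 1
    cell_mean = {c: sums[c] / counts[c] for c in sums}

    # Explicit model: a nonrespondent's value is predicted by its cell mean.
    # Assumes every nonrespondent cell contains at least one respondent.
    filled = [cell_mean[c] for c in nonresp_cells]

    resp_mean = sum(resp_y) / len(resp_y)
    full_mean = (sum(resp_y) + sum(filled)) / (len(resp_y) + len(filled))
    return resp_mean - full_mean


# Hypothetical data: rural cases respond less often and report higher values.
resp_cells = ["urban", "urban", "urban", "rural"]
resp_y = [10.0, 12.0, 14.0, 20.0]
nonresp_cells = ["urban", "rural", "rural", "rural"]

bias = estimated_bias(resp_cells, resp_y, nonresp_cells)
print(bias)
```

The estimate is only as good as the model: it says the respondent mean is too low here because the model assumes nonrespondents look like respondents within the same cell.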
A question that applies to both approaches is this: if you can predict the survey variables from the sampling frame data, why bother addressing imbalances on those frame variables during data collection? One answer is that, empirically, addressing them does lead to reductions in bias.
I do think there is an opportunity to approach this issue in another way. That is, could we attack the model uncertainty directly? Could we investigate regions of the covariate space where the predictions are weaker, i.e. more uncertain? I tried to use regression diagnostics in this way. The model plays an important role here, so this approach might be sensitive to model selection. Still, it would be good to know more about the conditions under which this approach can be useful.
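One way to picture "regions where the predictions are more uncertain" is the standard error of a prediction from a regression fit on respondents, evaluated at the covariate values of nonrespondents. The sketch below does this for a simple linear regression with one covariate, using the textbook leverage formula h = 1/n + (x0 - xbar)^2 / Sxx. This is only one possible diagnostic, not necessarily the one I used, and the variable names and data are hypothetical.

```python
from math import sqrt


def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares on the respondents."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in resid) / (n - 2)  # residual variance estimate
    return {"n": n, "xbar": xbar, "sxx": sxx,
            "slope": slope, "intercept": intercept, "s2": s2}


def prediction_se(fit, x0):
    """Standard error of a new prediction at x0: sqrt(s2 * (1 + leverage))."""
    leverage = 1 / fit["n"] + (x0 - fit["xbar"]) ** 2 / fit["sxx"]
    return sqrt(fit["s2"] * (1 + leverage))


# Hypothetical respondents: age from the frame, y from the survey.
resp_age = [25, 30, 35, 40, 45, 50]
resp_y = [2.1, 2.9, 3.2, 4.1, 4.4, 5.2]

# Hypothetical nonrespondents: only the frame variable is known.
nonresp_age = [38, 72]

fit = fit_simple_ols(resp_age, resp_y)
se = {age: prediction_se(fit, age) for age in nonresp_age}
for age, value in se.items():
    print(age, round(value, 3))
```

The 72-year-old sits far outside the range of respondent ages, so the prediction there is less certain than for the 38-year-old. Cases like that could be candidates for extra effort during data collection, subject to the caveat about model selection.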