I've been thinking about the harmful effects of using the response rate as a data quality indicator. It has been a key -- if not THE key -- indicator of data quality for a while. One of the big unknowns is the extent to which the pervasive use of the response rate as a data quality indicator has distorted the design of surveys. In other words, has the pursuit of high response rates led to undesirable effects?
It is easy to say that we should be more focused on bias, but harder to do. Generally, we don't know the bias due to nonresponse. So if we are going to do something to reduce bias, we need a "proxy" indicator. For example, we could impute values for the nonrespondents and compare the resulting full-sample estimate to the respondent-only estimate. This requires that the bias of an unweighted mean be related to things that we observe and that we specify the right model.
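To make the imputation idea concrete, here is a minimal sketch (purely illustrative, not from any particular survey) of estimating the nonresponse bias of an unweighted respondent mean by imputing values for nonrespondents from an auxiliary variable observed for the whole sample. The simulated data, variable names, and linear imputation model are all assumptions; the estimate is only as good as the model, which is exactly the caveat above.

```python
# Illustrative sketch: imputation-based proxy for nonresponse bias.
# All data here are simulated; the linear model is an assumption.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)

n = 1000
x = rng.normal(size=n)                  # auxiliary variable observed for everyone
y = 2.0 + 1.5 * x + rng.normal(size=n)  # survey variable, observed only for respondents
respond = rng.random(n) < 1 / (1 + np.exp(-x))  # response propensity depends on x

# Unweighted respondent mean -- the estimate whose bias we want to gauge.
resp_mean = y[respond].mean()

# Fit an imputation model on respondents, then impute the nonrespondents.
model = LinearRegression().fit(x[respond].reshape(-1, 1), y[respond])
y_imputed = y.copy()
y_imputed[~respond] = model.predict(x[~respond].reshape(-1, 1))

# Proxy estimates of the full-sample mean and of the nonresponse bias.
full_mean_hat = y_imputed.mean()
bias_hat = resp_mean - full_mean_hat

print(f"respondent mean:            {resp_mean:.3f}")
print(f"imputed full-sample mean:   {full_mean_hat:.3f}")
print(f"estimated nonresponse bias: {bias_hat:.3f}")
```

If the auxiliary variable isn't actually related to the survey variable, or the model is misspecified, this "estimated bias" can be badly off -- which is the risk discussed next.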
No matter which indicator we select, we need some sort of assumptions to motivate this "proxy" indicator. Those assumptions could be wrong. When we are wrong, do we do more damage than good? On any particular survey, this could be the case. That is, following the "proxy" indicator could actually lead to a worse bias.
At the moment, we need more research to see if we can find indicators that, on average, when used to guide data collection, actually reduce nonresponse bias across many surveys. This probably means gold standard studies that are pursued solely for methodological purposes. If we don't do that research and start tuning our data collection practices to other indicators, we may run the risk of "throwing the baby out with the bath water" and developing methods that actually increase biases.