
Showing posts from February, 2012

Response Rates as a Reward Function

I recently saw a presentation by Melanie Calinescu and Barry Schouten on adaptive survey design. They have been using optimization techniques to design mixed-mode surveys. In their optimization problems, they seek to maximize a measure of sample balance (the R-Indicator) for a fixed cost by allocating subgroups of the population (for example, <35 years of age and 35+) to the modes at different rates. The modes in their example are web and face-to-face. The older group is more responsive in both modes, so it is allocated to web at a higher rate. You can read their paper here to see the very interesting setup and results.

In the presentation, they showed what happens when you use the response rate as the thing that you are seeking to maximize. In some of the lower budgets, the optimal allocation was to simply ignore the younger group. You could not get a higher response rate by doing anything other than using all your resources on the older group. Once y…
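To make the low-budget result concrete, here is a minimal sketch of that kind of allocation problem. It is not the setup from the Calinescu and Schouten paper: the sample sizes, costs, and response propensities are hypothetical numbers chosen for illustration, the search is a coarse grid rather than a proper optimizer, and the R-indicator is computed in a simplified form as 1 minus twice the standard deviation of the group-level propensities.

```python
from itertools import product
from math import sqrt

# All numbers below are hypothetical, chosen only for illustration.
SAMPLE_SIZE = {"young": 500, "old": 500}
COST = {"web": 1.0, "f2f": 10.0}  # cost per sampled case, by mode
PROPENSITY = {  # assumed response propensity by group and mode
    "young": {"web": 0.10, "f2f": 0.40},
    "old": {"web": 0.30, "f2f": 0.60},
}
STEPS = 10  # allocation shares are searched in tenths


def group_options():
    """All (web, f2f) shares in tenths; the remainder is not approached."""
    return [(w / STEPS, f / STEPS)
            for w in range(STEPS + 1)
            for f in range(STEPS + 1 - w)]


def evaluate(alloc):
    """Return (cost, response_rate, r_indicator) for alloc[group] = (web, f2f)."""
    total = sum(SAMPLE_SIZE.values())
    cost = 0.0
    rho = {}  # expected response propensity per group under this allocation
    for g, (web, f2f) in alloc.items():
        cost += SAMPLE_SIZE[g] * (web * COST["web"] + f2f * COST["f2f"])
        rho[g] = web * PROPENSITY[g]["web"] + f2f * PROPENSITY[g]["f2f"]
    mean = sum(SAMPLE_SIZE[g] * rho[g] for g in rho) / total
    var = sum(SAMPLE_SIZE[g] * (rho[g] - mean) ** 2 for g in rho) / total
    return cost, mean, 1.0 - 2.0 * sqrt(var)


def best_by_response_rate(budget):
    """Grid search for the feasible allocation with the highest response rate."""
    best = None
    groups = list(SAMPLE_SIZE)
    for combo in product(group_options(), repeat=len(groups)):
        alloc = dict(zip(groups, combo))
        cost, rate, r_ind = evaluate(alloc)
        if cost <= budget + 1e-9 and (best is None or rate > best[1]):
            best = (alloc, rate, r_ind)
    return best


if __name__ == "__main__":
    for budget in (500, 1000):
        alloc, rate, r_ind = best_by_response_rate(budget)
        print(budget, alloc, round(rate, 3), round(r_ind, 3))
```

With these made-up inputs, the smaller budget is spent entirely on sending the older group to web and the younger group is not approached at all, which mirrors the behavior described in the presentation. The younger group only enters the design once the budget is large enough to cover everyone in the cheaper mode.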

Call Record Problems

A couple of years ago I did an experiment where I recommended times to call sampled units in a face-to-face survey based on an area probability cluster sample. The recommendations were based on estimates from multi-level logistic regression models. The interviewers ignored the recommendations.

In meetings with the interviewers, several said that they didn't follow the recommendations since they call every case on every trip to an area segment. The call records certainly didn't reflect that claim. But it got me thinking that maybe the call records don't reflect everything that happens.

Biemer, Chen, and Wang (2011) reported a survey of interviewers in which the interviewers said that they do not always create a call record for a call. They reported that sometimes they would not record a call in order to keep a case alive (since the number of calls on any case was limited) or because they just drove by the sampled unit and saw that no one was home. Biemer, Chen, and Wang …

Are we ready for a new reward function?

I've been thinking about the harmful effects of using the response rate as a data quality indicator. It has been a key -- if not THE key -- indicator of data quality for a while. One of the big unknowns is the extent to which the pervasive use of the response rate as a data quality indicator has malformed the design of surveys. In other words, has the pursuit of high response rates led to undesirable effects?

It is easy to say that we should be more focused on bias, but harder to do. Generally, we don't know the bias due to nonresponse. So if we are going to do something to reduce bias, we need a "proxy" indicator. For example, we could impute values to estimate the bias. This requires that the bias of an unweighted mean be related to things that we observe and that we specify the right model.
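One common way to write down the quantity at stake is the approximation below for the nonresponse bias of the unweighted respondent mean under a stochastic response model. The expression and its symbols are not from the post itself and are introduced here only for illustration: $\rho$ is a unit's response propensity, $y$ is the survey variable, and $\bar{\rho}$ is the mean propensity.

```latex
\[
  \operatorname{Bias}(\bar{y}_r) \;\approx\; \frac{\operatorname{Cov}(\rho, y)}{\bar{\rho}}
\]
```

Neither the covariance nor the propensities are observed for nonrespondents, which is why any bias-focused indicator has to lean on a model that ties them to things we do observe.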

No matter which indicator we select, we need some sort of assumptions to motivate this "proxy" indicator. Those assumptions could be wrong. When we are wrong, do we m…