
Responsive Design and Sampling Variability II

Just continuing the thought from the previous post...

Some examples of controlling the variability don't make much sense. For instance, there is no real difference between a response rate of 69% and one of 70%, except in the largest of samples. Yet there is often a "face validity" claim that there is a big difference, in that 70% is an important line to cross.
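
To put a rough number on that (a back-of-envelope sketch in Python, with made-up sample sizes), the standard error of an estimated response rate shows how large a sample it takes before a one-point gap stands out from sampling noise:

```python
import math

# Rough illustration only: the standard error of an estimated response rate p
# based on n sampled cases is sqrt(p * (1 - p) / n).
def se_response_rate(p, n):
    return math.sqrt(p * (1 - p) / n)

for n in (500, 2000, 10000, 50000):
    se = se_response_rate(0.70, n)
    print(f"n={n:>6}: SE = {se * 100:.2f} percentage points")

# At n = 2,000 the standard error is about one percentage point, so a gap of
# 69% vs. 70% is well within sampling noise; only a very large sample makes
# that one-point difference stand out.
```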

However, for survey costs, it can make a big difference if the budgeted amount is $1,000,000 and the actual cost is $1,015,000. Although this is roughly the same proportionate difference as the response rates, going over budget can have many negative consequences. In this case, controlling the variability can be critical. Although the costs might be "noise" in some sense, they are real.

Recent posts

Responsive design and sampling variability

At the Joint Statistical Meetings, I went to a session on responsive and adaptive design. One of the speakers, Barry Schouten, contrasted responsive and adaptive designs. One of the contrasts was that responsive design was concerned with controlling short-term fluctuations in outcomes such as response rates.

This got me thinking. I think the idea is that responsive design will respond to the current data, which includes some sampling error. In fact, it's possible that sampling error could be the sole driver of responsive design interventions in some cases. I don't think this is usually the case, but it certainly is part of what responsive designs might do.

At first, this seemed like a bad feature. One could imagine that all responsive design interventions should account for sampling error -- for instance, decision rules that only trigger once a difference reaches a specified level of statistical significance. We've implemented some rules like that.
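
As a rough illustration of what such a rule could look like (a hypothetical sketch, not our actual implementation), one could gate an intervention on a two-proportion z-test comparing response rates under two protocols:

```python
from math import sqrt, erf

def normal_cdf(z):
    return 0.5 * (1 + erf(z / sqrt(2)))

# Hypothetical decision rule: intervene only if the observed response-rate gap
# between two protocols is unlikely to be sampling error alone.
def should_intervene(resp_a, n_a, resp_b, n_b, alpha=0.05):
    p_a, p_b = resp_a / n_a, resp_b / n_b
    pooled = (resp_a + resp_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - normal_cdf(abs(z)))
    return p_value < alpha

# Example: 320/500 vs. 280/500 responders under the two protocols.
print(should_intervene(320, 500, 280, 500))
```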

On the other hand, sometimes controlling sampling …

Learning from paradata

Susan Murphy's work on dynamic treatment regimes had a big impact on me as I was working on my dissertation. I was very excited about the prospect of learning from the paradata. I did a lot of work on trying to identify the best next step based on analysis of the history of a case. Two examples were 1) choosing the lag before the next call and the incentive, and 2) the timing of the next call.

At this point, I'm a little less sure of the utility of the approach for those settings. In those settings, where I was looking at call record paradata, I think the paradata are not at all correlated with most survey outcomes. So it's difficult to identify strategies that will do anything but improve efficiency. That is, changes in strategies based on analysis of call records aren't very likely to change estimates.

Still, I think there are some areas where the dynamic treatment regime approach can be useful. The first is mode switching. Modes are powerful, and offering them in se…

What if something unexpected happens?

I recently finished teaching a short course on responsive survey design. We had some interesting discussions. One of the things that we emphasized was the pre-planned nature of responsive design. We contrast responsive design with ad hoc changes that a survey might make in response to unanticipated problems. The reasoning is that ad hoc changes are often done under pressure and, therefore, are likely to be less than optimal -- that is, they might be implemented too late, cost too much, or better options might not be considered. Further, it's hard to replicate the results when decisions are made this way.

Some of the students seemed uneasy about this definition. In part, I think this was because there was a sort of implication that one shouldn't make ad hoc changes. That really wasn't our message. Our point was that to be responsive design, it needs to be pre-planned. We didn't mean that if unanticipated problems arise, it would be better to do nothing. In this sense, r…

Centralization vs Local Control in Face-to-Face Surveys

A key question that face-to-face surveys must answer is how to balance local control against the need for centralized direction. This is an interesting issue to me. I've worked on face-to-face surveys for a long time now, and I have had discussions about this issue with many people.

"Local control" means that interviewers make the key decisions about which cases to call and when to call them. They have local knowledge that helps them to optimize these decisions. For example. if they see people at home, they know that is a good time to make an attempts. They learn people's work schedules, etc. This has been the traditional practice. This may be because before computers, there was no other option.

The "centralized" approach says that the central office can summarize the data across many call attempts, cases, and interviewers and come up with  an optimal policy. This centralized control might serve some quality purpose, as in our efforts here to promote more ba…

Every Hard-to-Interview Respondent is Difficult in their Own Way...

The title of this post is a paraphrase of a saying coined by Tolstoi. "Happy families are all alike; every unhappy family is unhappy in its own way." I'm stealing the concept to think about survey respondents. 

To simplify discussion, I'll focus on two extremes. Some people are easy respondents. No matter what we do, no matter how poorly conceived, they will respond. Other people are difficult respondents. I would argue that these latter respondents are heterogeneous with respect to the impact of different survey designs on them. That is, they might be more likely to respond under one design relative to another. Further, the most effective design will vary from person to person within this difficult group.

It sounds simple enough, but we don't often carry this idea into practice. For example, we often estimate a single response propensity, label a subset with low estimated propensities as difficult, and then give them all some extra thing (often more money). 
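
Concretely, that common practice often looks something like this (a toy sketch with made-up frame variables; not a recommendation):

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Made-up sampling-frame data with a response outcome from a prior wave.
frame = pd.DataFrame({
    "urban":         [1, 0, 1, 1, 0, 0, 1, 0],
    "prior_refusal": [0, 1, 0, 1, 1, 0, 0, 1],
    "age_head":      [34, 61, 45, 29, 52, 70, 38, 41],
    "responded":     [1, 0, 1, 0, 0, 1, 1, 0],
})

# Fit a single response-propensity model.
X, y = frame[["urban", "prior_refusal", "age_head"]], frame["responded"]
model = LogisticRegression().fit(X, y)
frame["propensity"] = model.predict_proba(X)[:, 1]

# Every case below the cutoff gets the same extra treatment (often more money),
# regardless of *why* it is difficult -- which is exactly the point at issue.
frame["extra_incentive"] = frame["propensity"] < 0.4
print(frame[["propensity", "extra_incentive"]])
```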

I susp…

Survey Data and Big Data... or is it Big Data and Survey Data

It seems like survey folks have thought about the use of big data mostly as a problem of linking big data to survey data. This is certainly a very useful thing to do. The model starts from the survey data and adds big data. This reduces the burden on respondents and may improve the accuracy of the data.

But I am also having conversations that start from big data and then fill the gaps with survey data. For instance, in looking for suitable readings on using big data and survey data, I found several interesting articles from folks working with big data who use survey data to validate the logical inferences they make from the data, as with this study of travel based upon GPS data, or to understand missing data in electronic health records, as with this study.

Now I'm also hearing discussion of how surveys might be triggered by events in the big data. The survey can answer the "why" question. Why the change? This makes for an interesting idea. The big data are the st…