Friday, May 29, 2015

From average response rate to personalized protocol

Survey methods research's first efforts to understand nonresponse started by looking at response rates. The focus was on finding methods that raised response rates. This approach might be useful when everyone has response propensities close to the average. The deterministic formulation of nonresponse bias may even reflect this sort of assumption.

Researchers have since looked at subgroup response rates. That is also interesting, but assuming that these rates are a fixed characteristic leaves us helpless.

Now, it seems that we have begun working with the assumption that there is heterogeneous response to treatments and that we should, therefore, tailor the protocol and manipulate response propensities.
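To make the idea of a tailored protocol concrete, here is a minimal sketch, assuming estimated response propensities are available per case. The cutoffs, treatment names, and cases are invented for illustration, not drawn from any actual study design.

```python
# Hypothetical sketch: assign a follow-up treatment based on an
# estimated response propensity. Cutoffs and treatments are invented.

def assign_protocol(propensity):
    """Map an estimated response propensity to a (made-up) treatment."""
    if propensity < 0.2:
        return "interviewer visit + incentive"
    elif propensity < 0.5:
        return "phone follow-up"
    return "standard mailing"

# Invented cases with estimated propensities
cases = {"case_1": 0.12, "case_2": 0.35, "case_3": 0.80}
for case_id, p in cases.items():
    print(case_id, "->", assign_protocol(p))
```

In practice the propensities themselves would come from a response model, and the treatment assignments would ideally be evaluated experimentally rather than set by fixed cutoffs like these.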

I think this development has a parallel in clinical trials, where there is a new emphasis on personalized medicine.

We still have important questions to resolve. For example, what are we trying to maximize?

Is the "long survey" dead?

A colleague sent me a link to a blog arguing that the "long survey" is dead. The blog takes the point of view that anything over 20 minutes is long. There's also a link to another blog that presents data from SurveyMonkey surveys showing that the longer the questionnaire, the less time is spent on each question. They don't really control for question length, etc., but it's still suggestive.

In my world, 20 minutes is still a short survey. But the point is still taken. There has been some research on the effect of announced survey length on response rates. There is probably a need for more.

Still, it might be time to start thinking of alternatives to improve response to long surveys. The most common is to offer a higher incentive, and thereby counteract the burden of the longer survey. Another alternative is to shorten the survey. This doesn't work if your questions are the ones getting tossed. Of course, substituting big data for elements of surveys is another option that is being explored.

Matrix sampling is another useful approach that is little used. It seems like you could do a power analysis for each item, each scale, each model using data from a survey and then subsample content that is overpowered. That takes a lot of work -- by central office staff -- but it might save more respondent (and interviewer) time than it costs.
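The per-item power analysis described above can be sketched roughly as follows. This is a simplified illustration using a normal-approximation power calculation for a two-sample comparison; the item names, minimum effect sizes, and sample size are all invented for the example.

```python
# Hypothetical sketch: flag "overpowered" survey items that could be
# administered to a subsample. Uses a normal-approximation formula for
# a two-sided test at alpha = 0.05 (z = 1.96) with 80% power (z = 0.84).

def required_n(effect_size, z_alpha=1.96, z_power=0.84):
    """Approximate n per group needed to detect a standardized mean
    difference (Cohen's d) in a two-sample comparison."""
    return 2 * ((z_alpha + z_power) / effect_size) ** 2

def subsample_fraction(effect_size, actual_n):
    """Fraction of the sample that actually needs to receive the item."""
    return min(1.0, required_n(effect_size) / actual_n)

# Invented items: (name, smallest effect size we care about detecting)
items = [("trust_gov", 0.20), ("health_status", 0.10), ("rare_event", 0.05)]
actual_n = 5000  # respondents per comparison group (invented)

for name, d in items:
    print(f"{name}: ask {subsample_fraction(d, actual_n):.0%} of sample")
```

A real version would repeat this for each scale and model of interest and take the maximum required subsample per item, but even this toy version shows how some content could be cut back sharply while other items need the full sample.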

Another option is to split up interview sessions across time and modes. This seems like it will become a more attractive design. A series of short surveys, completed over some amount of time.

It's probably worth exploring all of these options.

Friday, May 8, 2015

Selection Effects

This seems to come up frequently in a number of different ways. We talk a lot about nonresponse and how it may be a selective process such that it produces biases. We might try to model this process in order to correct these biases.

Online panels and 'big data' like Twitter have their own selection processes. It seems that it would be important to understand these processes. Can they be captured with simple demographics? If not, what else do we need to know?

I think we have done some work on survey nonresponse. I'm not sure what is known about online panels or Twitter relative to this question.

Friday, May 1, 2015

Adaptive Design in Panel Surveys

I enjoyed Peter Lugtig's blog post on using adaptive design in panel surveys. I was thinking about this again today. One thing that would be interesting to look at is viewing the problem of panel surveys as one of maximizing the information gathered.

I feel like we view panel studies as a series of cross-sectional studies where we want to maximize the response rate at each wave. This might create non-optimal designs. For instance, it might be more useful to have the first and the last waves measured, rather than the first and second waves. From an imputation perspective, the first situation (first and last waves observed) makes it easier to impute the missing data, since the missing middle wave can be interpolated rather than extrapolated.
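A toy example of the imputation point, with invented numbers: when the first and last waves are observed, the middle wave can be interpolated; when only the first two waves are observed, the last wave must be extrapolated, which is typically riskier.

```python
# Toy illustration (invented values): compare linear interpolation of a
# missing middle wave against linear extrapolation of a missing last wave.

true_waves = [10.0, 14.0, 20.0]  # a respondent's true values at waves 1-3

# Pattern A: waves 1 and 3 observed -> interpolate wave 2
interp = (true_waves[0] + true_waves[2]) / 2

# Pattern B: waves 1 and 2 observed -> extrapolate wave 3 linearly
extrap = true_waves[1] + (true_waves[1] - true_waves[0])

err_a = abs(interp - true_waves[1])  # interpolation error
err_b = abs(extrap - true_waves[2])  # extrapolation error
print("interpolation error:", err_a, "| extrapolation error:", err_b)
```

The specific errors depend entirely on the invented trajectory, of course; the general point is just that the missingness pattern, not only the number of completed waves, affects how recoverable the data are.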

The problem of maximizing information across waves is more complicated than maximizing response at each wave. The former is a sequential decision-making problem, like those studied by Susan Murphy under the name "adaptive treatment regimes." It might be the case that a lower response rate in early waves leads to higher overall information -- if it leads to more data later. It's certainly a complicated problem, but one worth considering.

For example, would postponing refusal conversion across several waves increase the probability of responding to more waves? A recent article by Burton and colleagues looked at the effect of refusal conversion on panel composition. People tended to stay in after being converted, but eventually dropped out. This is a useful evaluation of refusal conversion. It might also be useful to examine whether delaying refusal conversion increases the number of waves of response.