

Surveys and Other Sources of Data

Linking surveys and other sources of data is not a new idea; it has been around for a long time. It's useful in many situations, for example when respondents would have a difficult time supplying the information themselves (such as exact income).

Much of the previous research on linkage has focused either on the ability to link data, possibly in a probabilistic fashion, or on the biases associated with the willingness to consent to linkage.

It seems that new questions are emerging with the pervasiveness of data generated by devices, especially smartphones. I read an interesting article by Melanie Revilla and colleagues about trying to collect data from a tracking application that people install on their devices. They examine how the "meter," as they call the application, might incompletely cover the sample. For example, persons might have multiple devices and only install it on some of them. Or, persons might share devices and not i…

Survey Modes and Recruitment

I've been struggling with the concept of "mode preference." It's a term we use to describe the idea that respondents might have preferences for a mode and that if we can identify or predict those preferences, then we can design a better survey (i.e. by giving people their preferred mode).

In practice, I worry that people don't actually prefer modes. If you ask people what mode they might prefer, they usually say the mode in which the question is asked. In other settings, the response to that sort of question is only weakly predictive of actual behavior.

I'm not sure the distinction between stated and revealed preferences is going to advance the discussion much either. The problem is that the language builds in an assumption that people actually have a preference. Most people don't think about survey modes. Most don't consider modes abstractly in the way methodologists might. In fact, these choices are likely probabilistic functions that hinge on the…

Response Rates and Responsive Design

A recent article by Brick and Tourangeau re-examines the data from a paper by Groves and Peytcheva (2008). The original analyses from Groves and Peytcheva were based upon 959 estimates with known variables measured on 59 surveys with varying response rates. They found very little correlation between the response rate and the bias on those 959 estimates.

Brick and Tourangeau view the problem as a multi-level one: 59 clusters (i.e. surveys) containing the 959 estimates. For each survey, they created a composite score based on all of that survey's bias estimates. Their results were somewhat sensitive to how the composite score was created. They do present several different ways of doing this -- simple mean, mean weighted by sample size, mean weighted by the number of estimates. Each of these study-level composite bias scores is more strongly correlated with the response rate than the individual estimates were. They conclude: "This strongly suggests that nonresponse bias is partly a function of study-level characteristics; th…
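To make the idea of a study-level composite concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the toy records are invented (they are not data from either paper), the function names are hypothetical, and the sample-size weighting is only one plausible reading of "mean weighted by sample size." It collapses estimate-level bias into one score per survey and then correlates that score with the response rate.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy records: (survey_id, response_rate, bias, sample_size).
# Invented numbers for illustration only; not taken from either paper.
records = [
    ("A", 0.30, 0.08, 500),
    ("A", 0.30, 0.02, 1500),
    ("B", 0.50, 0.05, 1000),
    ("B", 0.50, 0.03, 1000),
    ("C", 0.70, 0.01, 2000),
    ("C", 0.70, 0.03, 2000),
]

def composite_scores(records, weighted=False):
    """Return {survey_id: (response_rate, composite_bias)}.

    weighted=False gives the simple mean of a survey's bias estimates;
    weighted=True weights each estimate by its sample size (an assumption
    about how such a weighting might be done).
    """
    grouped = defaultdict(list)
    for survey, rate, bias, n in records:
        grouped[survey].append((rate, bias, n))
    scores = {}
    for survey, rows in grouped.items():
        weights = [n if weighted else 1 for _, _, n in rows]
        total = sum(w * bias for w, (_, bias, _) in zip(weights, rows))
        scores[survey] = (rows[0][0], total / sum(weights))
    return scores

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

simple = composite_scores(records)
weighted = composite_scores(records, weighted=True)

# Study-level correlation between response rate and composite bias.
rates = [rate for rate, _ in simple.values()]
biases = [score for _, score in simple.values()]
r = pearson(rates, biases)
print(simple, weighted, r)
```

The two composites already disagree for the toy survey "A," which is a small-scale version of the sensitivity to the choice of composite noted above.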

Mechanisms of Mode Choice

Following up, yet again, on posts about how people choose modes. In particular, it does seem that different subgroups are likely to respond to different modes at different rates. The caveat, of course, is that it's not just the mode that matters, but also how you get there.

We do have some evidence about subgroups that are likely to choose a mode. Haan, Ongena, and Aarts examine an experiment where respondents to a survey are given a choice of modes. They found that full-time workers and young adults were more likely to choose web over face-to-face.

The situation is an experimental one that might not be very similar to many surveys: face-to-face and telephone recruitment, followed by a choice between a face-to-face and a web survey. But at least the design allows them to look at who might make different choices.

It would be good to have more data on persons making the choice in order to better understand the choice. For example, information about how much they use the internet might be us…

The dose matters too...

Just a follow-up from my previous post on mixed-mode surveys. I think that one of the things that gets overlooked in discussions of mixed-mode designs is the dosage of each mode that is applied. For example, how many contact attempts under each mode? It's pretty clear that this matters. In general, more effort leads to higher response rates and less effort leads to lower response rates.

But, it seems that sometimes when we talk about mixed-mode studies, we forget about the dose. We wrote about this idea in Chapter 4 of our new book on adaptive survey design. I think it would be useful to keep this in mind when describing mixed-mode studies. It might be these other features, i.e. not the mode itself, that account for differences between mixed-mode studies. At least in part.

Is there such a thing as "mode"?

Ok. The title is a provocative question. But it's one that I've been thinking about recently. A few years ago, I was working on a lit review for a mixed-mode experiment that we had done. I found that the results were inconsistent on an important aspect of mixed-mode studies -- the sequence of modes.

Puzzled by this, I went back and tried to write down more information about the design of each of the experiments that I was reviewing. I started to notice a pattern. Many mixed-mode surveys offered "more" of the first mode. For example, in a web-mail study, there might be three mailings with the mail survey and one mailed request for a web survey. This led me to think of "dosage" as an important attribute of mixed-mode surveys.

I'm starting to think there is much more to it than that. The context matters a lot -- the dosage of the mode, what it may require to complete that mode, the survey population, etc. All of these things matter.

Still, we ofte…

Should exceptions be allowed in survey protocol implementation?

I used to work on a CATI system (DOS-based) that allowed supervisors to release cases for calling through an override mechanism. That is, the calling algorithm had certain rules that kept cases out of the calling queue at certain times. The main rule was that if a case had been called and the result was a "ring-no-answer," then the system wouldn't allow it to be called (i.e. placed in the calling queue) until 4 hours had passed. But supervisors could override this and release cases for calling on a case-by-case basis. This was handy -- when sample ran out, supervisors could release more cases that didn't fall within the calling parameters. This kept interviewers busy dialing.
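The rule and its override can be sketched in a few lines of Python. This is not the logic of the actual CATI system, which I can't reproduce here; the `Case` fields, the `"RNA"` outcome code, and the `supervisor_release` flag are all hypothetical names chosen for illustration. Only the four-hour hold after a ring-no-answer comes from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hold period after a ring-no-answer, as described in the post.
RNA_HOLD = timedelta(hours=4)

@dataclass
class Case:
    case_id: str
    last_outcome: Optional[str] = None       # "RNA" = ring-no-answer (hypothetical code)
    last_call_at: Optional[datetime] = None  # time of the most recent call attempt
    supervisor_release: bool = False         # hypothetical override flag

def eligible_for_queue(case: Case, now: datetime) -> bool:
    """Return True if the case may be placed in the calling queue."""
    # A supervisor override bypasses the calling rules entirely.
    if case.supervisor_release:
        return True
    # Otherwise, hold the case until 4 hours have passed since a ring-no-answer.
    if case.last_outcome == "RNA" and case.last_call_at is not None:
        return now - case.last_call_at >= RNA_HOLD
    return True

now = datetime(2024, 1, 1, 18, 0)
recent_rna = Case("001", "RNA", now - timedelta(hours=1))
released = Case("002", "RNA", now - timedelta(hours=1), supervisor_release=True)
print(eligible_for_queue(recent_rna, now), eligible_for_queue(released, now))
```

Note that the two cases have identical call histories but different eligibility, so the protocol each one actually received depends on a decision that lives outside the calling rules.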

Recently, I've started to think about the other side of such practices. That is, it is more difficult to specify the protocol that should be applied when these exceptions are allowed. Obviously, if the protocol is not calling a case less than four hours after a ring-no-answer, then the software explicitl…