
Showing posts from 2017

Surveys and Other Sources of Data

Linking surveys to other sources of data is not a new idea; it has been around for a long time, and it's useful in many situations, for example when respondents would have a difficult time supplying the information themselves (such as exact income). Much of the previous research on linkage has focused either on the ability to link data, possibly in a probabilistic fashion, or on the biases associated with the willingness to consent to linkage. New questions seem to be emerging with the pervasiveness of data generated by devices, especially smartphones. I read an interesting article by Melanie Revilla and colleagues about trying to collect data from a tracking application that people install on their devices. They examine how the "meter," as they call the application, might incompletely cover the sample. For example, persons might have multiple devices and only install it on some of them. Or, persons might share devices and no

Survey Modes and Recruitment

I've been struggling with the concept of "mode preference." It's a term we use to describe the idea that respondents might have preferences for a mode and that, if we can identify or predict those preferences, then we can design a better survey (i.e., by giving people their preferred mode). In practice, I worry that people don't actually prefer modes. If you ask people what mode they might prefer, they usually say the mode in which the question is asked. In other settings, the response to that sort of question is only weakly predictive of actual behavior. I'm not sure the distinction between stated and revealed preferences is going to advance the discussion much either. The problem is that the language builds in an assumption that people actually have a preference. Most people don't think about survey modes, and few consider modes abstractly in the way methodologists might. In fact, these choices are likely probabilistic functions that hinge on

Response Rates and Responsive Design

A recent article by Brick and Tourangeau re-examines the data from a paper by Groves and Peytcheva (2008). The original analyses from Groves and Peytcheva were based upon 959 estimates of bias on known variables measured in 59 surveys with varying response rates. They found very little correlation between the response rate and the bias across those 959 estimates. Brick and Tourangeau view the problem as a multilevel one: 59 clusters (i.e., surveys) containing the 959 estimates. For each survey, they created a composite score based on all of that survey's bias estimates. Their results were somewhat sensitive to how the composite score was created, and they present several different ways of doing this -- a simple mean, a mean weighted by sample size, and a mean weighted by the number of estimates. Each of these study-level composite bias scores is more strongly correlated with the response rate than the estimate-level biases were. They conclude: "This strongly suggests that nonresponse bias is partly a function of study-level characteristics;
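To make the composite-score idea concrete, here is a minimal sketch -- not the authors' actual code -- that collapses estimate-level absolute biases to one score per survey and then correlates those study-level composites with the survey response rates. The column names (survey_id, abs_bias, n_sample, response_rate) are assumptions for illustration.

```python
# Minimal sketch: study-level composite bias scores, then correlation with
# the response rate across surveys. Column names are hypothetical.
import numpy as np
import pandas as pd

def study_level(est: pd.DataFrame) -> pd.DataFrame:
    """One row per survey: simple-mean and sample-size-weighted composites."""
    grouped = est.groupby("survey_id")
    return pd.DataFrame({
        "simple_mean": grouped["abs_bias"].mean(),
        "wtd_by_n": grouped.apply(
            lambda g: np.average(g["abs_bias"], weights=g["n_sample"])),
        "n_estimates": grouped.size(),
        "response_rate": grouped["response_rate"].first(),
    })

# scores = study_level(estimates)
# Correlation of each composite with the response rate across the 59 surveys:
# print(scores[["simple_mean", "wtd_by_n"]].corrwith(scores["response_rate"]))
```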

Mechanisms of Mode Choice

Following up, yet again, on posts about how people choose modes. In particular, it does seem that different subgroups are likely to respond to different modes at different rates -- with the caveat, of course, that it's not just the mode but also how you get there that matters. We do have some evidence about which subgroups are likely to choose a given mode. Haan, Ongena, and Aarts examine an experiment in which respondents to a survey are given a choice of modes. They found that full-time workers and young adults were more likely to choose web over face-to-face. The situation is an experimental one that might not be very similar to many surveys: face-to-face and telephone recruitment into a choice between a face-to-face or a web survey. But at least the design allows them to look at who might make different choices. It would be good to have more data on the persons making the choice in order to understand it better. For example, information about how much they use the internet might
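As a sketch of how such choice data could be analyzed, one could model the mode choice directly. This is an illustration with hypothetical variable names (chose_web, works_full_time, age, uses_internet_daily), not the analysis from the Haan, Ongena, and Aarts study.

```python
# Hypothetical sketch: logistic regression of choosing web over face-to-face
# on respondent characteristics. Variable names are assumptions.
import pandas as pd
import statsmodels.formula.api as smf

def fit_choice_model(choice_data: pd.DataFrame):
    """One row per respondent offered the choice; chose_web is 0/1."""
    model = smf.logit(
        "chose_web ~ works_full_time + age + uses_internet_daily",
        data=choice_data)
    return model.fit()

# choice_data = pd.read_csv("mode_choice.csv")   # hypothetical file
# result = fit_choice_model(choice_data)
# print(result.summary())
```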

The dose matters too...

Just a follow-up from my previous post on mixed-mode surveys. I think one of the things that gets overlooked in discussions of mixed-mode designs is the dosage of each mode that is applied -- for example, how many contact attempts are made under each mode. It's pretty clear that this matters: in general, more effort leads to higher response rates and less effort to lower response rates. But it seems that when we talk about mixed-mode studies, we sometimes forget about the dose. We wrote about this idea in Chapter 4 of our new book on adaptive survey design. I think it would be useful to keep this in mind when describing mixed-mode studies. It might be these other features, i.e., not the mode itself, that account, at least in part, for differences between mixed-mode studies.

Is there such a thing as "mode"?

OK, the title is a provocative question, but it's one that I've been thinking about recently. A few years ago, I was working on a literature review for a mixed-mode experiment that we had done. I found that the results were inconsistent on an important aspect of mixed-mode studies -- the sequence of modes. Puzzled by this, I went back and tried to write down more information about the design of each of the experiments I was reviewing. I started to notice a pattern: many mixed-mode surveys offered "more" of the first mode. For example, in a web-mail study, there might be three mailings for the mail survey and only one mailed request for the web survey. This led me to think of "dosage" as an important attribute of mixed-mode surveys. I'm starting to think there is much more to it than that. The context matters a lot -- the dosage of the mode, what it may require to complete that mode, the survey population, etc. All of these things matter. Still, we

Should exceptions be allowed in survey protocol implementation?

I used to work on a CATI system (DOS-based) that allowed supervisors to release cases for calling through an override mechanism. That is, the calling algorithm had certain rules that kept cases out of the calling queue at certain times. The main rule was that if a number had been called and the result was a "ring-no-answer," the system wouldn't allow it to be called again (i.e., placed in the calling queue) until four hours had passed. But supervisors could override this and release cases for calling on a case-by-case basis. This was handy -- when sample ran out, supervisors could release more cases that didn't fall within the calling parameters, which kept interviewers busy dialing. Recently, I've started to think about the other side of such practices: it is more difficult to specify the protocol that should be applied when these exceptions are allowed. Obviously, if the protocol is not calling a case less than four hours after a ring-no-answer, then the software explicit
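A minimal sketch of the kind of rule described -- not the actual CATI system -- where a recent ring-no-answer keeps a case out of the queue for four hours unless a supervisor explicitly releases it:

```python
# Sketch of a queue-eligibility rule with a supervisor override.
from datetime import datetime, timedelta

RNA_HOLD = timedelta(hours=4)

def eligible_to_call(last_result: str,
                     last_attempt: datetime,
                     now: datetime,
                     supervisor_override: bool = False) -> bool:
    """Return True if the case may be placed in the calling queue."""
    if supervisor_override:
        return True  # the exception the post is questioning
    if last_result == "ring-no-answer" and now - last_attempt < RNA_HOLD:
        return False
    return True

# A ring-no-answer from one hour ago is held back...
# eligible_to_call("ring-no-answer", now - timedelta(hours=1), now)        # False
# ...unless a supervisor releases it.
# eligible_to_call("ring-no-answer", now - timedelta(hours=1), now, True)  # True
```

The override is what makes the protocol hard to write down: the rule above is explicit, but the conditions under which supervisors invoke the exception usually are not.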

Future of Responsive and Adaptive Design

A special issue of the Journal of Official Statistics on responsive and adaptive design recently appeared. I was an associate editor for the issue and helped draft an editorial that raised issues for future research in this area. The last chapter of our book on Adaptive Survey Design also lays out a set of open questions. I think one of the more important areas of research is identifying targeted design strategies. This differs from current procedures, which often sequence the same protocol across all cases -- for example, everyone gets web, then those who haven't responded to web get mail. The targeted approach, on the other hand, would find a subgroup amenable to web and another amenable to mail. This is a difficult task, as most design features have been explored with respect to the entire population, and we know less about subgroups. Further, we often have very little information with which to define these groups. We may not even have basic household or person
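A minimal sketch of the targeted idea, under assumed inputs: suppose we had per-case predicted response propensities for web and for mail (say, from models fit on prior waves or frame data). Instead of giving every case the same web-then-mail sequence, each case starts in the mode it is predicted to be most amenable to. The field names here are hypothetical.

```python
# Sketch of targeted first-mode assignment from assumed per-mode propensities.
def assign_first_mode(p_web: float, p_mail: float) -> str:
    """Start each case in the mode with the higher predicted propensity."""
    return "web" if p_web >= p_mail else "mail"

cases = [
    {"id": 1, "p_web": 0.45, "p_mail": 0.30},
    {"id": 2, "p_web": 0.15, "p_mail": 0.35},
]
for c in cases:
    c["first_mode"] = assign_first_mode(c["p_web"], c["p_mail"])
# Case 1 starts in web, case 2 in mail; nonrespondents would then be
# followed up in the other mode.
```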

Responsive Design and Sampling Variability II

Just continuing the thought from the previous post... Some examples of controlling the variability don't make much sense. For instance, there is no real difference between a response rate of 69% and one of 70%, except in the largest of samples. Yet there is often a "face validity" claim that there is a big difference -- that 70% is an important line to cross. For survey costs, however, it can be a big difference if the budgeted amount is $1,000,000 and the actual cost is $1,015,000. Although this is roughly the same proportionate difference as the response rates, going over a budget can have many negative consequences. In this case, controlling the variability can be critical. Although the costs might be "noise" in some sense, they are real.
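A quick arithmetic check of that comparison:

```python
# The relative gap between a 69% and a 70% response rate is about the same
# size as the gap between a $1,000,000 budget and a $1,015,000 actual cost.
rr_gap = (0.70 - 0.69) / 0.69                     # ~0.0145, about 1.4%
cost_gap = (1_015_000 - 1_000_000) / 1_000_000    # 0.015, exactly 1.5%
print(f"response rate gap: {rr_gap:.1%}, cost gap: {cost_gap:.1%}")
```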

Responsive design and sampling variability

At the Joint Statistical Meetings, I went to a session on responsive and adaptive design. One of the speakers, Barry Schouten, contrasted responsive and adaptive designs. One of the contrasts was that responsive design is concerned with controlling short-term fluctuations in outcomes such as response rates. This got me thinking. I think the idea is that responsive design responds to the current data, which includes some sampling error. In fact, it's possible that sampling error could be the sole driver of responsive design interventions in some cases. I don't think this is usually the case, but it certainly is part of what responsive designs might do. At first, this seemed like a bad feature. One could imagine that all responsive design interventions should include a feature that accounts for sampling error -- for instance, decision rules that only trigger when a difference attains a level of statistical significance. We've implemented some rules like that. On the other hand, sometimes controlling samp
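As a minimal sketch of what such a decision rule could look like -- an assumption about the general form, not a description of any particular system's rule -- an intervention for a subgroup might only be triggered when its response rate is significantly below the rate for the rest of the sample, so that pure sampling error is less likely to drive the change:

```python
# Sketch: trigger an intervention only when the subgroup's response rate is
# significantly lower than the rate for the rest of the sample.
from statsmodels.stats.proportion import proportions_ztest

def should_intervene(resp_sub, n_sub, resp_rest, n_rest, alpha=0.05):
    """Two-proportion z-test; intervene only on a significant shortfall."""
    stat, pval = proportions_ztest(
        count=[resp_sub, resp_rest], nobs=[n_sub, n_rest],
        alternative="smaller")
    return pval < alpha

# 80/400 responding in the subgroup vs. 450/1600 in the rest of the sample:
# should_intervene(80, 400, 450, 1600)
```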

Learning from paradata

Susan Murphy's work on dynamic treatment regimes had a big impact on me as I was working on my dissertation. I was very excited about the prospect of learning from the paradata, and I did a lot of work on trying to identify the best next step based on analysis of the history of a case. Two examples were (1) choosing the lag before the next call and the incentive, and (2) the timing of the next call. At this point, I'm a little less sure of the utility of the approach in those settings, where I was looking at call record paradata; I think the paradata are not at all correlated with most survey outcomes. So it's difficult to identify strategies that will do anything but improve efficiency. That is, changes in strategies based on analysis of call records aren't very likely to change estimates. Still, I think there are some areas where the dynamic treatment regime approach can be useful. The first is mode switching. Modes are powerful, and offering them i
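For the call-timing example, here is a hypothetical sketch of the kind of rule such an analysis might produce -- an illustration of the general approach of conditioning the next step on a case's history, not the dissertation's actual estimator:

```python
# Sketch: pick the next call window from a case's call-record paradata.
from collections import defaultdict

WINDOWS = ["weekday_day", "weekday_eve", "weekend"]

def next_call_window(call_history):
    """call_history: list of (window, contacted) tuples for one case."""
    attempts, contacts = defaultdict(int), defaultdict(int)
    for window, contacted in call_history:
        attempts[window] += 1
        contacts[window] += int(contacted)
    # Prefer untried windows, then the window with the best contact rate.
    untried = [w for w in WINDOWS if attempts[w] == 0]
    if untried:
        return untried[0]
    return max(WINDOWS, key=lambda w: contacts[w] / attempts[w])

# next_call_window([("weekday_day", False), ("weekday_day", False),
#                   ("weekday_eve", True)])   # -> "weekend" (untried)
```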

What if something unexpected happens?

I recently finished teaching a short course on responsive survey design. We had some interesting discussions. One of the things we emphasized was the pre-planned nature of responsive design. We contrast responsive design with ad hoc changes that a survey might make in response to unanticipated problems. The reasoning is that ad hoc changes are often made under pressure and are therefore likely to be less than optimal -- that is, they might be implemented too late, cost too much, or fail to consider better options. Further, it's hard to replicate the results when decisions are made this way. Some of the students seemed uneasy about this definition. In part, I think this was because of a perceived implication that one shouldn't make ad hoc changes. That really wasn't our message. Our point was that, to be responsive design, a change needs to be pre-planned. We didn't mean that if unanticipated problems arise, it would be better to do nothing. In this sense,

Centralization vs Local Control in Face-to-Face Surveys

A key question that face-to-face surveys must answer is how to balance local control against the need for centralized direction. This is an interesting issue to me. I've worked on face-to-face surveys for a long time now, and I have had discussions about this issue with many people. "Local control" means that interviewers make the key decisions about which cases to call and when to call them. They have local knowledge that helps them optimize these decisions. For example, if they see people at home, they know that is a good time to make an attempt. They learn people's work schedules, and so on. This has been the traditional practice, perhaps because, before computers, there was no other option. The "centralized" approach says that the central office can summarize data across many call attempts, cases, and interviewers and come up with an optimal policy. This centralized control might serve some quality purpose, as in our efforts here to promote more

Every Hard-to-Interview Respondent is Difficult in their Own Way...

The title of this post is a paraphrase of Tolstoy's famous line: "Happy families are all alike; every unhappy family is unhappy in its own way." I'm stealing the concept to think about survey respondents. To simplify the discussion, I'll focus on two extremes. Some people are easy respondents: no matter what we do, no matter how poorly conceived, they will respond. Other people are difficult respondents. I would argue that these latter respondents are heterogeneous with respect to the impact of different survey designs on them. That is, they might be more likely to respond under one design relative to another. Further, the most effective design will vary from person to person within this difficult group. It sounds simple enough, but we don't often carry this idea into practice. For example, we often estimate a single response propensity, label a subset with low estimated propensities as difficult, and then give them all some extra thing (often more money).
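A minimal sketch of that common practice, just to make the critique concrete; the column names (responded, plus whatever frame variables are available) are assumptions:

```python
# Sketch of the one-size-fits-all step: one propensity model, one cutoff,
# one extra treatment for everyone flagged as difficult.
import pandas as pd
from sklearn.linear_model import LogisticRegression

def flag_difficult(frame: pd.DataFrame, predictors, cutoff_quantile=0.25):
    """Return the frame with estimated propensities and a 'difficult' flag."""
    model = LogisticRegression(max_iter=1000)
    model.fit(frame[predictors], frame["responded"])
    frame = frame.copy()
    frame["propensity"] = model.predict_proba(frame[predictors])[:, 1]
    cutoff = frame["propensity"].quantile(cutoff_quantile)
    frame["difficult"] = frame["propensity"] <= cutoff
    return frame

# flagged = flag_difficult(sample, ["age", "urbanicity", "prior_wave_resp"])
# flagged.loc[flagged["difficult"], "treatment"] = "extra incentive"
```

The post's argument is that everything after the flag -- the single extra incentive -- ignores the heterogeneity within the difficult group.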

Survey Data and Big Data... or is it Big Data and Survey Data

It seems like survey folks have thought about the use of big data mostly as a problem of linking big data to survey data. This is certainly a very useful thing to do. The model starts from the survey data and adds the big data, which reduces the burden on respondents and may improve the accuracy of the data. But I am also having conversations that start from the big data and then fill the gaps with survey data. For instance, in looking for suitable readings on using big data and survey data, I found several interesting articles from folks working with big data who use survey data to validate the inferences they make from those data, as with a study of travel based upon GPS data, or to understand missing data in electronic health records, as with another study. Now I'm also hearing discussion of how surveys might be triggered by events in the big data. The survey can answer the "why" question: why the change? This makes for an interesting idea. The big data are the
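A small, hypothetical sketch of what an event-triggered survey could look like -- the change rule and the send_invitation function are assumptions, not a reference to any existing system:

```python
# Sketch: watch a big-data measure and, on a large change for a person,
# queue a short survey asking the "why" question.
def detect_change(previous_value: float, current_value: float,
                  threshold: float = 0.5) -> bool:
    """Flag a large relative change between two periods."""
    if previous_value == 0:
        return current_value != 0
    return abs(current_value - previous_value) / abs(previous_value) >= threshold

def maybe_trigger_survey(person_id: str, prev: float, curr: float,
                         send_invitation) -> None:
    """If the measure changed a lot, ask the person why."""
    if detect_change(prev, curr):
        send_invitation(person_id, questionnaire="why_did_this_change")

# maybe_trigger_survey("p123", prev=10.0, curr=2.0,
#                      send_invitation=some_invitation_function)  # hypothetical
```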

What is the right periodicity?

It seems that intensive measurement is on the rise. There are a number of things that are difficult to recall accurately over longer periods of time, for which it might be preferable to ask the question more frequently with a shorter reference period -- for example, the number of alcoholic drinks consumed each day. More accurate measurements might be achieved if the question were asked daily about the previous 24-hour period. But what is the right period of time? And how do you determine that? This might be an interesting question. The studies I've seen tend to guess at the correct periodicity. I think determining it would probably require some experimentation, including experimentation in the lab. There are a couple of interesting wrinkles to this problem. 1. How do you set the periodicity when you measure several things that might each have a different periodicity? Ask the questions at the most frequent periodicity? 2. How does non

Slowly Declining Response Rates are the Worst!

I have seen this issue on several different projects, so I'm not calling out anyone in particular, but I keep running into it. Repeated cross-sectional surveys are the most glaring example, though I think it happens in other places as well. The issue is that, with a slow decline, it's difficult to diagnose the source of the problem. If everything is just a little bit more difficult (i.e., contacting persons, convincing people to list a household, finding the selected person, convincing them to do the survey, and so on), then it's difficult to identify solutions. One consequence is that we keep adding a little more effort each time to try to counteract the decline: a few more calls, a slightly longer field period. We don't then search for qualitatively different solutions. That's not to say we shouldn't make the small changes. Rather, they might need to be combined with longer-term planning for larger changes. That