Friday, August 22, 2014

Probability Sampling

In light of the recent kerfuffle over probability versus non-probability sampling, I've been thinking about some of the issues involved with this distinction. Here are some thoughts that I use to order the discussion in my own head:

1. The research method has to be matched to the research question. This includes cost versus quality considerations. Focus groups, for example, are useful methods even though participants are not typically recruited using probability sampling. Non-probability samples can provide useful data, and sometimes they are exactly what the question calls for.

2. A role for methodologists in the process is to test and improve faulty methods. Methodologists have been studying errors due to nonresponse for a while, and we have a large body of research on using models to reduce nonresponse bias. As research moves into new arenas, methodologists have a role to play there too. While we may (er... sort of) understand how to adjust for nonresponse, do we know how to adjust for an unknown probability of getting into an online panel? That's not my area, but it is certainly worth studying. Election polling is probably the most developed field in this respect.
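To make the nonresponse-adjustment idea concrete, here is a minimal sketch of a weighting-class adjustment on entirely synthetic data. The variable names, the two-class frame variable, and the response mechanism are all invented for illustration; the point is only the mechanics of weighting by the inverse of the observed class response rate.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical frame variable X (two weighting classes) and an
# outcome Y that depends on X.
x = rng.integers(0, 2, size=n)
y = 10 + 5 * x + rng.normal(0, 2, n)

# Response depends on X, so the raw respondent mean of Y is biased.
p_resp = np.where(x == 1, 0.8, 0.4)
resp = rng.random(n) < p_resp

# Weighting-class adjustment: weight = 1 / observed class response rate.
rates = np.array([resp[x == k].mean() for k in (0, 1)])
w = 1.0 / rates[x]

unadjusted = y[resp].mean()
adjusted = np.average(y[resp], weights=w[resp])
full = y.mean()
print(unadjusted, adjusted, full)
```

Because response here depends only on X, which we observe, the adjusted mean lands close to the full-sample mean. The open question in the post is exactly what happens when the selection mechanism (say, joining an online panel) depends on things we do not observe.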

Related to this, how do we know when something is bad? That's a tough question, but methodologists ought to lead the way in developing methods to evaluate it.

3. At least some probability surveys are still needed. For example, in election polling, likely voter models are an important ingredient of estimates. Such models can be tested and developed on panel surveys like the National Election Study, and then applied to other samples. Mick Couper reviews uses of surveys in the age of "big data." Yes, we do still need them.

Friday, August 15, 2014

The Dual Criteria for a Useful Survey Design Feature

I've been working on a review of patterns of nonresponse to a large survey on which I worked. In my original plan, I looked at things that are related to response, and then I looked at things that are related to key statistics produced by the survey. "Things" include design features (e.g. number of calls, refusal conversions, etc.) and paradata or sampling frame data (e.g. Census Region, interviewer observations about the sampled unit, etc.).

We found that there were some things that heavily influenced response (e.g. calls) but did not influence the key statistics. That's good: having more or less of such a feature matters for sampling error, but it doesn't seem to matter with respect to nonresponse bias.

There were also some that influenced the key statistics but not response, for example, the interviewer observations we have for the study. Response rates are similar across subgroups defined by these observations. As a result, I won't have to rely on large weights to get to unbiased estimates. Or, looked at another way, I empirically tested what the estimates would have looked like had I relied on that assumption at an earlier phase of the survey process.

And of course, there were some that predicted neither. And there were none that strongly predicted both response and key statistics.

This result seems good to me. Why? No variable turned out to be highly predictive of response. If one had, we would need to rely upon strong assumptions (i.e. large nonresponse adjustments) to justify claims of unbiasedness. But we can still predict some of the key statistics. That relationship might be confounded, but it still seems good that we have some useful predictors of the key statistics.
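The two checks described above can be sketched in a few lines. Everything here is synthetic and hypothetical: a "calls" design feature that drives response but not the outcome, and an "interviewer observation" that predicts the outcome but not response, which is the favorable pattern the post describes. Real analyses would use regression models with the actual frame and paradata, not simple correlations.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

calls = rng.poisson(3, n)             # hypothetical design feature
obs = rng.normal(0, 1, n)             # hypothetical interviewer observation
y = 2.0 * obs + rng.normal(0, 1, n)   # key statistic depends on obs only

# Response depends on calls only (logistic in number of calls).
resp = rng.random(n) < 1 / (1 + np.exp(-(calls - 3)))

def corr(a, b):
    return float(np.corrcoef(a, b)[0, 1])

# Check each variable twice: against response, and against the key
# statistic among respondents.
print("calls:", corr(calls, resp), corr(calls[resp], y[resp]))
print("obs:  ", corr(obs, resp), corr(obs[resp], y[resp]))
```

In this setup, calls predict response but not Y, and the observation predicts Y but not response, so no variable is strongly predictive of both, mirroring the finding in the post.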

In any event, organizing the analysis along these lines was helpful for me. I didn't develop a single-number characterization of the quality of our data, but I did tell a somewhat coherent story that I believe provides convincing evidence that our process produces good quality data.

Friday, August 8, 2014

"Go Big, or Go Home."

I just got back from JSM, where I participated in a session on adaptive design. Mick Couper served as a discussant for the session. The title of this blog post is one of the points from his talk. He said that innovative, adaptive methods need to show substantial results; otherwise, they won't be convincing. As he pointed out, part of the problem is that we are often tinkering with marginal changes on existing surveys. These kinds of changes need to be low risk, that is, they can't cause damage to the results and should only help. However, such changes are often limited in what they can accomplish. His point was that making the big changes that show big effects may require taking some risk.

This made sense to me. It would be nice to have some methodological studies that aren't constrained by the needs of an existing survey. I suppose this could be a separate, large sample with the same content as an existing survey. However, I wonder if this is a chicken or egg type of problem. Do we need the small, restricted studies that show marginal benefits in order to justify the large, purely methodological studies? Or do we need the large, methodological study before more surveys will consider these kinds of design innovations?

My feeling is that we have enough evidence that large, purely methodological surveys are justified and even a logical next step.

Friday, August 1, 2014

Better to Adjust with Weights, or Adjust Data Collection?

My feeling is that this is a big question facing our field. In my view, we need both of these to be successful.

The argument runs something like this. If you are going to use those variables (frame data and paradata) for your nonresponse adjustments, then why bother using them to alter your data collection? Wouldn't it be cheaper to just use them in your adjustment strategy?

There are several arguments that can be used when facing these kinds of questions. The main point I want to make here is that I believe this is an empirical question. Let's call X my frame variable and Y the survey outcome variable. If I assume that the relationship between X and Y is the same no matter what the response rate is for categories of X, then, sure, it might be cheaper to adjust. But that assumption doesn't seem to hold very often. And whether it holds is an empirical question.

There are two ways to examine this question. [Well, whenever someone says definitively there are "two ways of doing something," in my head, I'm thinking "at least two ways."] First, use existing data and simulate adjusted estimates at different response rates. Second, run an experiment. Compare the two methods. I think we actually need both of these things. It is an important question. We might as well be thorough in our research aimed at understanding it.
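The first approach, simulating adjusted estimates at different response rates from existing data, can be sketched as follows. The data-generating process here is invented for illustration: response depends on the frame variable X and also, more weakly, on Y itself, so a weighting-class adjustment on X alone does not remove all of the bias, and the residual bias varies with the response rate.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
x = rng.integers(0, 2, n)                 # frame variable X
y = 10 + 4 * x + rng.normal(0, 2, n)      # outcome Y

# Response propensity depends on X and (weakly) on Y itself, so
# adjusting on X alone leaves some nonresponse bias.
logit = -1 + 1.5 * x + 0.1 * (y - y.mean())
p = 1 / (1 + np.exp(-logit))

full_mean = y.mean()
biases = {}
for target in (0.3, 0.5, 0.7):
    # Scale propensities toward an overall target response rate.
    ps = np.clip(p * target / p.mean(), 0, 1)
    resp = rng.random(n) < ps
    # Weighting-class adjustment on X.
    rates = np.array([resp[x == k].mean() for k in (0, 1)])
    w = 1.0 / rates[x]
    adj = np.average(y[resp], weights=w[resp])
    biases[target] = adj - full_mean
    print(f"target rate {target}: adjusted bias = {biases[target]:+.3f}")
```

The pattern this sketch produces, adjusted estimates that remain biased and that change with the response rate, is exactly the empirical possibility the post raises: the X-Y relationship among respondents is not the same at every response rate, so adjustment alone is not a substitute for changing data collection.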