Saturday, March 28, 2015

Responsive Design Phases

In Groves and Heeringa's original formulation, responsive design proceeds in phases. They define these phases as:

"A design phase is a time period of a data collection during which the same set of sampling frame, mode of data collection, sample design, recruitment protocols and measurement conditions are extant." (page 440).

These responsive design phases are different from the two-phase sampling defined by Hansen and Hurwitz. Hansen and Hurwitz assumed 100% response, so there was no nonresponse bias. Their two-phase sampling was all about minimizing variance for a fixed budget.

Groves and Heeringa, on the other hand, live in a world where nonresponse does occur.  They seek to control it through phases that recruit complementary groups of respondents. The goal is that the nonresponse biases from each phase will cancel each other out. The focus on bias is a new feature relative to Hansen and Hurwitz. 

A question in my mind about the phases is how the phase boundaries should be defined. In Groves and Heeringa, they are points in time. Even saying which points in time is difficult. Groves and Heeringa suggest the use of the concept "phase capacity":

"Phase capacity is the stable condition of an estimate in a specific design phase, i.e. a limiting value of an estimate that a particular set of design features produces." (p. 445).

Deciding when this has occurred is an interesting statistical problem in its own right. There are a couple of articles on stopping rules which may be relevant for formalizing these definitions of phase boundaries.
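As a toy illustration of what such a stopping rule might look like, the sketch below declares phase capacity once a daily-updated estimate stops moving. The tolerance, window length, and numbers are all hypothetical choices of mine, not anything taken from Groves and Heeringa or from the stopping-rules literature.

```python
# A minimal sketch of a "phase capacity" check: declare the current design
# phase complete when the daily-updated estimate changes by less than a
# tolerance for several consecutive days. The threshold and window are
# illustrative, not recommended values.

def phase_capacity_reached(daily_estimates, tol=0.005, window=3):
    """Return True when the last `window` day-to-day changes are all below tol."""
    if len(daily_estimates) < window + 1:
        return False
    n = len(daily_estimates)
    changes = [abs(daily_estimates[i] - daily_estimates[i - 1])
               for i in range(n - window, n)]
    return all(c < tol for c in changes)

# A hypothetical series of daily estimates that appears to be leveling off:
print(phase_capacity_reached([0.50, 0.47, 0.455, 0.452, 0.451, 0.4505]))
```

A real rule would need to account for the sampling variability of the daily estimates, which is exactly what makes this an interesting statistical problem.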

I'm interested in designs where phase boundaries may be something other than a point in time. In my dissertation, I tried to show that the adaptive treatment regimes approach might be applied to surveys. These regimes adapt the treatments to the baseline characteristics and history of previous treatments. Could this be thought of as an extension of responsive design? I think it might be, if the concept of phases can be extended to include boundaries identified at the case level.

Friday, March 20, 2015

What is Current Standard Practice for Surveys?

In clinical trials, they have the concept that there is an "existing standard of care." New treatments are compared experimentally to this treatment. I suppose that clinical trials have some issues where informed persons can disagree about the existing standard of care, but there is at least some consensus.

I'm wondering what we have as an existing standard practice in the administration of surveys. As I think about running experiments, the contrast is usually to the other thing we would normally do. But that can be ill-defined. For instance, when running experiments in our telephone facility, it was difficult to describe current practice precisely, as it involved the expert knowledge of managers adjusting parameters of the calling algorithm.

As further evidence that it's difficult to precisely define the essential survey conditions, there are several articles on "house effects," in which the same survey with nominally the same specification yields different results depending upon the vendor.

This can be an issue for experiments and for the generalizability of new survey methods. One difficult solution is to provide a detailed specification of the survey conditions. A possibly easier solution is to replicate results across many settings.

Saturday, March 14, 2015

Survey Methods Training

Survey Practice devoted its entire current issue to a discussion of training in survey methodology. It is a very useful review of what is currently done, along with suggestions for the future.

As they observe, survey methodology is a broad discipline that draws upon a diverse set of research fields. I expect that increasing this diversity would be positive. That is, there are a number of fields of study that would find applications for their methods in survey research.

A couple of key examples are operations research and computer science. Operations research could help us think more rigorously about designing data collection to optimize specified quantities. That doesn't mean we have to pursue a single goal. But it would help, or maybe force, us to quantify the vague trade-offs we usually deal in. The paper by Greenberg and Stokes is an early example; the paper by Calinescu and colleagues is a recent one.

Computer science is another such field. Researchers studying reinforcement learning seek to optimize complex, multi-stage decision problems. These methods have been used to optimize adaptive treatment regimes, and I think they may be a natural fit for some survey design problems, for example, the design of mixed-mode surveys. Hopefully, we can direct ourselves toward such a future.


Friday, March 6, 2015

Reflecting the Uncertainty in Design Parameters

I've been thinking about responsive design and uncertainty. I know that when we teach sample design, we often treat design parameters as if they were known. For example, if I do an optimal allocation for a stratified estimate, I assume that I know the population element variances for each stratum. The same thing could be said about response rates, which relate to the expected final sample size.
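To make the point concrete, here is a minimal sketch of Neyman (optimal) allocation, in which the allocation depends directly on the stratum element standard deviations we pretend to know. The population figures are made up purely for illustration.

```python
# Illustrative Neyman (optimal) allocation for a stratified sample:
# n_h is proportional to N_h * S_h. The stratum sizes N and element
# standard deviations S below are invented numbers; in practice the S_h
# are exactly the design parameters we rarely know with certainty.

def neyman_allocation(N, S, n_total):
    """Allocate n_total across strata in proportion to N_h * S_h."""
    weights = [Nh * Sh for Nh, Sh in zip(N, S)]
    total = sum(weights)
    return [round(n_total * w / total) for w in weights]

# Example: two strata, the second smaller but much more variable.
print(neyman_allocation(N=[8000, 2000], S=[1.0, 3.0], n_total=700))  # [400, 300]
```

If the assumed S_h are wrong, the allocation is no longer optimal, which is precisely the uncertainty problem this post is about.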

Many years ago, the uncertainty might have been small about many of these parameters. But responsive design became a "thing" largely because this uncertainty seemed to be growing. The question then becomes, how do we acknowledge and even incorporate this uncertainty into our designs? Especially responsive designs.

It seems that the Bayesian approach is a natural fit for this kind of problem. Although I can't find a copy online, I recall a paper that Kristen Olson and Trivellore Raghunathan presented at JSM in 2005. They suggested using a Bayesian approach to update estimates of the sample size required to reach a targeted number of interviews when you are uncertain about the various layers of response and eligibility rates.

This is a really nifty idea. I think it has broader application than just setting the sample size. There are a lot of parameters, even cost parameters, about which we lack certainty. The approach might be very helpful even in thinking through the consequences of this uncertainty (e.g., worst-case scenarios, best-case scenarios, etc.).
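My rough reconstruction of the general idea (not Olson and Raghunathan's actual method) is a Beta-Binomial update: put a Beta prior on the response rate, update it as sampled cases resolve, and recompute the sample size needed to hit the interview target. The prior, the observed counts, and the target below are all hypothetical.

```python
# Sketch of Bayesian updating for a required sample size under an
# uncertain response rate, using a Beta prior and Binomial outcomes.
# All numbers are invented for illustration.

def update_response_rate(prior_a, prior_b, interviews, nonresponses):
    """Posterior Beta parameters after observing case outcomes."""
    return prior_a + interviews, prior_b + nonresponses

def required_sample(target_interviews, a, b):
    """Sample size needed at the posterior-mean response rate."""
    rate = a / (a + b)
    return int(round(target_interviews / rate))

# Prior: we expect roughly a 50% response rate (Beta(5, 5)).
# Early fieldwork resolves 100 cases: 30 interviews, 70 nonresponses.
a, b = update_response_rate(5, 5, interviews=30, nonresponses=70)
print(required_sample(1000, a, b))  # posterior mean rate is about 0.32
```

Because the posterior is a full distribution rather than a point estimate, the same machinery could be used to look at best- and worst-case scenarios by evaluating the required sample size at quantiles of the posterior instead of its mean.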