
Posts

Showing posts with the label Reinforcement Learning

Learning from paradata

Susan Murphy's work on dynamic treatment regimes had a big impact on me as I was working on my dissertation. I was very excited about the prospect of learning from the paradata, and I did a lot of work on trying to identify the best next step for a case based on an analysis of its history. Two examples were 1) choosing the lag before the next call and the incentive, and 2) the timing of the next call. At this point, I'm a little less sure of the utility of the approach in those settings. Where I was looking at call record paradata, the paradata are not at all correlated with most survey outcomes, so it's difficult to identify strategies that will do anything but improve efficiency. That is, changes in strategies based on analysis of call records aren't very likely to change estimates. Still, I think there are some areas where the dynamic treatment regime approach can be useful. The first is mode switching. Modes are powerful, and offering them i...
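Here is a minimal Python sketch of what a stage-specific decision rule in the spirit of a dynamic treatment regime might look like for the call-scheduling examples above. The features, thresholds, lags, and incentive amounts are hypothetical illustrations, not rules estimated from any data.

```python
# A hypothetical stage-specific decision rule: map a case's call-record
# history to the next action (lag before the next call, incentive offered).
# All thresholds and values here are illustrative assumptions.

def next_call_action(history):
    """history: list of dicts like {"contact": bool}."""
    n_attempts = len(history)
    any_contact = any(rec["contact"] for rec in history)
    if any_contact:
        # Prior contact: call back quickly, no incentive change.
        return {"lag_days": 1, "incentive": 0}
    if n_attempts >= 3:
        # Repeated noncontact: back off and raise the incentive.
        return {"lag_days": 7, "incentive": 10}
    return {"lag_days": 3, "incentive": 0}

print(next_call_action([{"contact": False}] * 3))
# -> {"lag_days": 7, "incentive": 10}
```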

Survey Data and Big Data

I had an opportunity to revisit an article by Burns and colleagues that looks at using data from smartphones (they have a nice appendix of all the data they can get from each phone) to predict things that might trigger episodes of depression. Of course, the data don't contain any specific measures of depression. In order to get those, the researchers had to use... surveys. Once they had those, they could find the associations with the sensor data from the phone. Then they could deliver interventions through the phone. There are 38 sensors on the phone, and the phone delivers data quite frequently. So even with a small number of phones (n=8 in this trial), quite a large amount of data was generated. A bigger trial would have even more data. So this seems like a big data application. And, in this case, the "organic" data from the phone need some "designed" (i.e. survey) data in order to be useful. This is also interesting in that the smartphone is delivering a...

Training for Paradata

Paradata are messy data. I've been working with paradata for a number of years, and I find that there are all kinds of issues. The data aren't always designed with the analyst in mind; they are usually a by-product of a process. The interviewers aren't focused (and rightly so) on generating high-quality paradata. In many situations, they sacrifice the quality of the paradata in order to obtain an interview. The good thing about paradata is that analysis of paradata is usually done in order to inform specific decisions. How should we design the next survey? What is the problem with this survey? The analysis is effective if the decisions seem correct in retrospect. That is, if the predictions generated by the analysis lead to good decisions. If students were interested in learning about paradata analysis, then I would suggest that they gain exposure to methods in statistics, machine learning, operations research, and the emerging category "data science." It seems ...

Myopic Calling Strategies

I'm interested in sequential decision-making problems. In these problems, there is a tension between exploration and exploitation. Exploitation is when you take actions with more certainty about the rewards; the goal is to get the maximum reward from the next action given what is currently known. Exploration is when you take actions with less certainty; the goal is to discover the rewards of actions about which little is known. A strategy that always exploits is called myopic, since it always tries to maximize the reward of the current action without any view to long-term gains. Calling algorithms certainly face this tension. For example, evenings might be the best time on average to contact households. If I know nothing else, then that would be my guess about when to place the next call. But it would be foolish to stay with that option if it continues to fail. If I have failures in that call window, I might explore another call window to try and see if the re...
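As a concrete (and purely illustrative) sketch, here is an epsilon-greedy version of this trade-off in Python. The call windows and the value of epsilon are assumptions; setting epsilon to zero recovers the myopic strategy described above.

```python
import random

# Epsilon-greedy choice of a call window: exploit the best-looking window
# most of the time, explore a random window otherwise. Window names and
# epsilon are illustrative assumptions.

WINDOWS = ["morning", "afternoon", "evening"]
attempts = {w: 0 for w in WINDOWS}
contacts = {w: 0 for w in WINDOWS}

def contact_rate(w):
    return contacts[w] / attempts[w] if attempts[w] else 0.0

def choose_window(epsilon=0.1):
    if random.random() < epsilon:
        return random.choice(WINDOWS)      # explore
    return max(WINDOWS, key=contact_rate)  # exploit

def record_call(window, contacted):
    attempts[window] += 1
    contacts[window] += int(contacted)
```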

Timing of the Mode Switch

I just got back from JSM where I presented the results of an experiment that varied the timing of the mode switch in a web-telephone survey. I'm not going to talk about the results of the experiment in this post, just the premise. The concern that motivated the experiment was the possibility that longer delays before switching modes could have adverse effects on response rates. This could happen for several reasons. If there is pre-notification, then the effect of the prenote on response to the second mode might be reduced by longer delays before switching. And if the first mode is annoying in some way, it can diminish the effectiveness of the second mode. The latter case is particularly interesting to me. It points to the ways that different treatment sequences can have different levels of effectiveness. We saw an impact like this in an experiment we ran with two sequences of modes for a screening survey. The two sequences functioned about the same in terms of respo...

Personalized Survey Design

In my last post, I talked about personalized medicine. I found out this week that in personalized medicine, there is a distinction between targeted and tailored treatments. Targeted treatments are aimed at specified subgroups of the population, while tailored protocols are individual-specific treatments that may start from a targeted treatment but use within-patient variation to "tune" treatments over time. I wonder whether tailored protocols of this kind are possible for surveys. Panel surveys are one area where this may be possible. But it seems that the panel would have to have many waves or repetitions; there might not be enough measured variation with only a few waves. What's a few? Let's say fewer than 10 or 20. It seems like these methods might have an application in surveys that use frequent measurement and/or a relatively long period of time. For example, imagine a survey that collected data weekly for 2 or 3 years. O...

From average response rate to personalized protocol

Survey methods research's first efforts to understand nonresponse started by looking at response rates. The focus was on finding methods that raised response rates. This approach might be useful when everyone has a response propensity close to the average; the deterministic formulation of nonresponse bias may even reflect this sort of assumption. Researchers have since looked at subgroup response rates. Also interesting, but assuming that these rates are a fixed characteristic leaves us helpless. Now, it seems that we have begun working with the assumption that there is heterogeneous response to treatments and that we should, therefore, tailor the protocol and manipulate response propensities. I think this development has a parallel in clinical trials, where there is a new emphasis on personalized medicine. We still have important questions to resolve. For example, what are we trying to maximize?

Adaptive Design in Panel Surveys

I enjoyed Peter Lugtig's blog post on using adaptive design in panel surveys. I was thinking about this again today. One thing that would be interesting to look at is viewing the problem of panel surveys as maximizing the information gathered. I feel like we view panel studies as a series of cross-sectional studies where we want to maximize the response rate at each wave. This might create non-optimal designs. For instance, it might be more useful to have the first and the last waves measured, rather than the first and second waves. From an imputation perspective, in the former situation (first and last waves) it is easier to impute the missing data. The problem of maximizing information across waves is more complicated than maximizing response at each wave. The former is a sequential decision-making problem, like those studied by Susan Murphy as "adaptive treatment regimes." It might be the case that a lower response rate in early waves might lead...
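To make the imputation point concrete, here is a toy numerical sketch (all values made up): a missing middle wave can be interpolated between observed endpoints, while a missing last wave would have to be extrapolated.

```python
# Made-up five-wave panel: observing waves 1 and 5 lets us interpolate
# wave 3; observing only waves 1 and 2 would force an extrapolation.

observed = {1: 10.0, 5: 18.0}  # hypothetical measurements
w = 3
y_hat = observed[1] + (observed[5] - observed[1]) * (w - 1) / (5 - 1)
print(y_hat)  # 14.0, bracketed by the two observed values
```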

Responsive Design Phases

In Groves and Heeringa's original formulation, responsive design proceeds in phases. They define these phases as: "A design phase is a time period of a data collection during which the same set of sampling frame, mode of data collection, sample design, recruitment protocols and measurement conditions are extant" (page 440). These responsive design phases are different from the two-phase sampling defined by Hansen and Hurwitz. Hansen and Hurwitz assumed 100% response, so there was no nonresponse bias; their two-phase sampling was all about minimizing variance for a fixed budget. Groves and Heeringa, on the other hand, live in a world where nonresponse does occur. They seek to control it through phases that recruit complementary groups of respondents, with the goal that the nonresponse biases from each phase will cancel each other out. The focus on bias is a new feature relative to Hansen and Hurwitz. A question in my mind about the phases is how the...

Survey Methods Training

Survey Practice devoted the entire current issue to a discussion of training in survey methodology. This is a very useful review of what is currently done, with suggestions for the future. As they observe, survey methodology is a broad discipline that draws upon a diverse set of fields of research. I expect that increasing this diversity would be positive. That is, there are a number of fields of study that would find applications for their methods in survey research. A couple of key examples are operations research and computer science. Operations research could help us think more rigorously about designing data collection to optimize specified quantities. That doesn't mean we have to pursue one goal, but it would help, or maybe force, us to quantify the vague trade-offs we usually deal in. The paper by Greenberg and Stokes is an early example; the paper by Calinescu and colleagues is a recent one. Computer science is another such field. Researc...

Happy Halloween!

OK. This actually is a survey-related post. I read this short article about an experiment where some kids got a candy bar and other kids got a candy bar and a piece of gum. The latter group was less happy. This seems counter-intuitive, but in the latter group, the "trajectory" of the quality of treats is getting worse. It turns out that this is a phenomenon that other psychologists have studied. This might be a potential mechanism to explain why sequence matters in some mixed-mode studies, assuming that other factors aren't confounding the issue.

Optimal Resource Allocation and Surveys

I just got back from Amsterdam, where I heard the defense of a very interesting dissertation. You can find the full dissertation here. One of the chapters is already published and several others are forthcoming. The dissertation uses optimization techniques to design surveys that maximize the R-indicator while controlling measurement error for a fixed budget. I find this to be very exciting research, as it brings together two fields in new and interesting ways. I'm hoping that further research will be spurred by this work.
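For readers unfamiliar with the R-indicator: Schouten, Cobben, and Bethlehem define it as R = 1 - 2*S(rho), where S(rho) is the standard deviation of the response propensities. A minimal unweighted sketch in Python, with made-up propensities (the real indicator is computed with estimated propensities and design weights):

```python
import statistics

# Unweighted sketch of the R-indicator: R = 1 - 2 * S(rho), where rho are
# response propensities. Values near 1 indicate balanced response.

def r_indicator(propensities):
    return 1 - 2 * statistics.pstdev(propensities)

print(r_indicator([0.5, 0.5, 0.5, 0.5]))  # 1.0: perfectly balanced
print(r_indicator([0.1, 0.9, 0.1, 0.9]))  # 0.2: very unbalanced
```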

Were we already adaptive?

I spent a few posts cataloging design features that could be considered adaptive. No one labelled them that way in the past. But if we were already doing it, why do we need the new label? I think there are at least two answers to that: 1. Thinking about these features allows us to bring in the complexity of surveys. Surveys are multiple-phase activities, where the actions at earlier phases may impact outcomes at later phases. This makes it difficult to design experiments. In clinical trials, some have labelled this phenomenon "practice misalignments." They note that trials that focus on single-phase, fixed-dose treatments are not well aligned with how doctors actually treat patients. The same thing may happen for surveys. When something doesn't work, we don't usually just give up. We try something else. 2. It gives us a concept to think about these practices. It is an organizing principle that can help identify common features, useful experimental me...

Panel Studies as a Place to Explore New Designs

I really enjoyed this paper by Peter Lynn on targeting cases for different recruitment protocols. He makes a solid case for treating cases unequally, with the goal of equalizing response probabilities across subgroups. It also includes several examples from panel surveys. I strongly agree that panel surveys are fertile ground for trying out new kinds of designs. They have great data, and there is a chain of interactions between the survey organization and the panel member. This is more like the adaptive treatment setting that Susan Murphy and colleagues have been exploring, and it makes panel surveys a natural place for bringing together ideas about adaptive treatment regimes and survey design.

Again on Refusal Conversions

This isn't a technique that gets much attention. I can think of three articles on the topic. One (Fuse and Xie, 2007) investigates refusal conversions in telephone surveys and collects information (observations) from interviewers. I just googled another (Beullens, et al., 2010) that investigates the effects of the time between the initial refusal and the first conversion attempt. There is a third article (Burton, et al. 2006) on refusal conversions in panel studies. This one adds another element, in that a key consideration is whether refusers who are converted will remain in the panel in subsequent waves. This problem seems to fit really well into the sequential decision-making framework. The decision is at which waves, for any given case that refuses, you should try a refusal conversion. You might, for instance, optimize the expected number of responses (completed interviews) over a certain number of waves. Or, you might maximize other measures of data qual...
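To make the framing concrete, here is a minimal finite-horizon dynamic-programming sketch in Python of the decision "attempt a conversion at this wave or wait." The horizon, conversion probability, retention probability, and attempt cost are all hypothetical.

```python
# Toy sequential decision problem: a case has refused; at each remaining
# wave we either attempt a conversion (at some cost) or wait. The objective
# is the expected number of completed interviews. Numbers are assumptions.

T = 8            # remaining waves
P_CONVERT = 0.3  # probability a conversion attempt succeeds
P_RETAIN = 0.8   # probability a converted case responds again next wave
COST = 0.5       # cost of an attempt, in "interview" units

def value_converted(t):
    """Expected interviews from wave t onward once a case is converted."""
    if t > T:
        return 0.0
    return 1.0 + P_RETAIN * value_converted(t + 1)

def value_refused(t):
    """Optimal value for a still-refusing case at wave t."""
    if t > T:
        return 0.0
    wait = value_refused(t + 1)
    attempt = (-COST + P_CONVERT * value_converted(t)
               + (1 - P_CONVERT) * value_refused(t + 1))
    return max(attempt, wait)

print(value_refused(1))  # with these numbers, attempting early pays off
```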

Are Refusal Conversions "Adaptive?"

I have two feelings about talking about adaptive or responsive designs. The first feeling is that these are new concepts, so we need to invent new methods to implement them. The second feeling is that although these are new concepts, we can point to actual things that we have always (or for a long time) done and say, "that's an example of this new concept" that existed before the concept had been formalized. I think refusal conversions are a good example. We never really applied the same protocol to all cases. Some cases got a tailored or adaptive design feature. The rule is something like this: if the case refuses to complete the interview, then change the interviewer, make another attempt, and offer a higher incentive. I'm trying to think systematically about these kinds of examples. Some are trivial ("if there is no answer on the first call attempt, then make a second attempt"). But others may not be. The more of these we can root out, the more we can...
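One way to start rooting them out is simply to write the tacit rules down as explicit decision rules. A minimal sketch, with hypothetical field names and values:

```python
# The refusal-conversion rule from the post, and the trivial callback rule,
# written as an explicit decision rule. Fields and values are hypothetical.

def next_protocol(case):
    if case["last_outcome"] == "refusal":
        # Change the interviewer, try again, raise the incentive.
        return {"interviewer": "different", "incentive": case["incentive"] + 10}
    if case["last_outcome"] == "no_answer":
        # Trivial rule: just make another attempt.
        return {"interviewer": "same", "incentive": case["incentive"]}
    return None  # no change to the current protocol

print(next_protocol({"last_outcome": "refusal", "incentive": 0}))
```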

Adaptive Interventions

I was at a very interesting workshop today on adaptive interventions. Most of the folks at the workshop design interventions for chronic conditions and are used to testing their interventions using randomized trials. Much of the discussion was about heterogeneity of treatment effects. In fact, much of their research is based on the premise that individualized treatments should do better than giving everyone the same treatment. Of course, the average treatment might be the best course for everyone, but they have certainly found applications where this is not true, and it seems that many more could be found. I started to think about applications in the survey realm. We do have the concept of tailoring, which began in our field with research into survey introductions. But do we use it much? I have two feelings on this question. No, there aren't many examples like the article I linked to above. We usually test interventions (design features like incentives, letters, etc.) on the wh...

Exploration vs exploitation

Once more on this theme that I discussed on this blog several times last year. This is a central problem for the field of research known as reinforcement learning. I'd recommend taking a look at Sutton and Barto's book if you are interested. It's not too technical and can be understood by someone without a background in machine learning. As I mentioned in my last post, I think learning in the survey environment is a tough problem. The paper that proposed the upper confidence bound rule said it works well for short-run problems, but the short run they envisioned was something like 100 trials. In the survey setting, there aren't repeated rewards. We're usually looking for one interview. You might think of gaining contact as another reward, but still. We're usually limited to a relatively small number of attempts (trials). We also often have poor estimates of response and contact probabilities to start with. Given that reward structure, poor prior informatio...
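For readers who want the flavor of such a rule, here is a minimal sketch of UCB1 (one well-known version of the upper confidence bound rule, from Auer, Cesa-Bianchi, and Fischer). I'm assuming call windows as the arms and contacts as the rewards.

```python
import math

# UCB1: choose the arm maximizing mean reward plus sqrt(2 ln N / n_i),
# where N is the total number of trials and n_i the trials on arm i.
# Arms here stand in for call windows; rewards are contacts.

def ucb1_choose(successes, trials):
    for arm, n in enumerate(trials):
        if n == 0:
            return arm  # try each arm once before comparing indices
    total = sum(trials)
    scores = [s / n + math.sqrt(2 * math.log(total) / n)
              for s, n in zip(successes, trials)]
    return scores.index(max(scores))

print(ucb1_choose(successes=[2, 3, 6], trials=[10, 10, 10]))  # -> 2
```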

Contact Strategies: Strategies for the Hard-to-Reach

One of the issues with looking at average contact rates (like with the heat map from a few posts ago) is that it's only helpful for average cases. Some cases are easy to contact no matter what strategy you use, and others are easy to contact when you try a reasonable strategy (i.e., calling during a window with a high average contact rate). But what is the best strategy for the hard-to-reach cases? I've proposed a solution that tries to estimate the best time to call using the accruing data. Other algorithms might explore the options more quickly, for instance, by choosing the window with the highest upper bound on a confidence interval. It might be interesting to try these approaches, particularly for studies that place limits on the number of calls that can be made. The lower the limit, the more exploration may pay off.
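A minimal sketch of that upper-bound rule, using a normal-approximation binomial interval (the windows and counts are made up). Windows with few calls get wide intervals, so they keep getting tried until the data rule them out.

```python
import math

# Pick the call window whose estimated contact rate has the highest upper
# confidence bound. Counts are (contacts, calls) and are illustrative.

def upper_bound(contacts, calls, z=1.96):
    if calls == 0:
        return 1.0  # untried windows get the benefit of the doubt
    p = contacts / calls
    return p + z * math.sqrt(p * (1 - p) / calls)

windows = {"morning": (1, 12), "afternoon": (3, 10), "evening": (2, 3)}
best = max(windows, key=lambda w: upper_bound(*windows[w]))
print(best)  # "evening": few calls, so a wide interval and a high bound
```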

Optimization of Survey Design

I recently pointed out this article by Calinescu and colleagues that uses techniques (specifically Markov Decision Process models) from operations research to design surveys. One of the citations from Calinescu et al. is to this article, which I had never seen, about using nonlinear programming techniques to solve allocation problems in stratified sampling. I was excited to find these articles. I think these methods have the promise of being very useful for planning survey designs. If nothing else, posing the problems in the way these articles do at least forces us to apply a rigorous definition of the survey design. It would be good if more folks with these types of skills (operations research, machine learning, and related fields) could be attracted to work on survey problems.
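To give a flavor of the nonlinear-programming approach (a generic textbook formulation, not the specific models in either article), here is a minimal sketch: choose stratum sample sizes to minimize the variance of a stratified mean subject to a budget constraint. All the stratum parameters are made up.

```python
import numpy as np
from scipy.optimize import minimize

# Toy stratified-allocation problem: minimize the variance of the stratified
# mean, sum(W_h^2 * S_h^2 / n_h), subject to a budget on interviewing costs.
# Population shares, stratum SDs, costs, and the budget are assumptions.

W = np.array([0.5, 0.3, 0.2])     # stratum population shares
S = np.array([10.0, 20.0, 40.0])  # stratum standard deviations
c = np.array([1.0, 2.0, 5.0])     # cost per interview by stratum
budget = 500.0

def variance(n):
    return np.sum(W**2 * S**2 / n)

result = minimize(
    variance,
    x0=np.full(3, 50.0),
    bounds=[(2, None)] * 3,
    constraints=[{"type": "ineq", "fun": lambda n: budget - c @ n}],
    method="SLSQP",
)
print(np.round(result.x, 1))  # close to the textbook optimum allocation
```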