

Showing posts from May, 2014

Tracking, Again

Last week, I mentioned an experiment that we ran with changing the order of tracking steps. I noted that the overall result was that the original, expert-chosen order worked better than the new, proposed order.

In this example, the costs weren't all that different. But I could imagine situations where there are big differences in the costs between the different steps. In that case, the order could have big cost implications.

I'm also thinking that a common situation is where you have lots of cheap (and somewhat ineffective) steps and one expensive (and effective) step. I'm wondering if it would be possible to identify cases that should skip the cheap treatments and go right to the expensive treatment, purely as a cost-saving measure. It would have to result in the same chance of locating the person. In other words, the skipped steps would have to carry the same or less information than the costly step. My hunch is that such situations actually exist. The trick is finding th…
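The cost argument here can be sketched with a toy expected-cost calculation. The step costs and per-step location probabilities below are made up purely for illustration:

```python
def expected_cost(steps):
    """Expected cost per case of running steps in order,
    stopping as soon as the person is located.
    steps: list of (cost, p_locate) tuples."""
    total, p_not_found = 0.0, 1.0
    for cost, p in steps:
        total += p_not_found * cost   # pay for a step only if still unfound
        p_not_found *= (1 - p)
    return total

def p_located(steps):
    """Overall probability of locating a person across all steps."""
    p_not_found = 1.0
    for _cost, p in steps:
        p_not_found *= (1 - p)
    return 1 - p_not_found

cheap_first = [(1, 0.10), (1, 0.10), (50, 0.60)]  # many cheap, one expensive
skip_cheap  = [(50, 0.60)]                        # straight to the big step

print(expected_cost(cheap_first), p_located(cheap_first))
print(expected_cost(skip_cheap), p_located(skip_cheap))
```

Note that in this made-up example the skip-ahead sequence is both cheaper per case and less likely to locate the person; skipping is only a pure cost saving when the skipped steps add no chance of location beyond the expensive step.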

Tracking: Does Sequence Matter?

I've wanted to run an experiment like this for a while. When we do tracking here, we do one of two things. The first is to run a standard protocol: a series of "tracking steps" carried out in a specific order chosen by experts, generally with the cheapest steps first on the list. The other is to let the tracking team decide which order to run the steps in.

The problem is that you can't evaluate the effectiveness of each step, because each step deals with a different subgroup (i.e., those who weren't found by the previous step). I only know of one experiment that varied the order of steps.

Well, I finally found an opportunity that wasn't too objectionable. I got them to vary the order. We recently finished the survey and found that... the original order worked better. The glass-half-full view: it did make a difference which order you used, and the experts did choose the better one.

More on Measurement Error

I'm still thinking about this problem. For me, it's much simpler conceptually to think of this as a missing data problem. Andy Peytchev's paper makes this point. If I have the "right" structure for my data, then I can use imputation to address both nonresponse and measurement error.

If the measurement error is induced differently across different modes, then I need to have some cases that receive measurements in both modes. That way, I can measure differences between modes and use covariates to predict when those differences occur.
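As a rough sketch of that idea (with entirely made-up data and a single hypothetical covariate), one could fit the between-mode difference on the dual-mode cases and then predict it for cases observed in only one mode:

```python
# Minimal sketch: model the mode difference using cases measured in
# BOTH modes, then predict it from a covariate. All data are simulated.
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)            # covariate (e.g., a social-desirability score)
# Simulated difference between the two mode measurements for each case
diff = 0.5 + 0.8 * x + rng.normal(scale=0.2, size=n)

# Fit diff ~ x on the dual-mode cases (ordinary least squares)
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, diff, rcond=None)

# Predict the mode effect for a case observed in only one mode
x_new = 1.0
predicted_diff = beta[0] + beta[1] * x_new
print(predicted_diff)
```

The point of the sketch is just the design requirement in the text: without some cases measured in both modes, the coefficients relating covariates to the mode difference are not estimable.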

The covariates, as I discussed last week, should help identify which cases are susceptible to measurement error. There is some work on measuring whether someone is likely to be influenced by social desirability. I'm thinking that will be relevant for this situation. That sounds sort of like, "so you don't want to tell me the truth about x, but at least you will tell me that you don't want to tell me that." Or something like …

Covariates of Measurement Error

I've been working on some mixed-mode problems where nonresponse and measurement error are confounded. I recently read an interesting article on using adjustment models to disentangle the two sources of error. The article is by Vannieuwenhuyze, Loosveldt, and Molenberghs. They suggest that you can make adjustments for measurement error if you have things that predict when those errors occur. They give specific examples: things that measure social conformity and other hypothesized mechanisms that lead to response error.

This was very interesting to read about. I suppose that, just as with nonresponse, the predictors of this error -- in order to be useful -- need to predict both when those errors occur and the survey outcome variables themselves. This is a new and difficult task... but one worth solving given the push to use mixed-mode designs.

Proxy Y's

My last post was a bit of crankiness about the term "nonresponse bias." There is a bit of terminology, on the other hand, that I do like -- "Proxy Y's." We used this term in a paper a while ago.

The thing that I like about this term is that it puts the focus on the prediction of Y. Based on the paper by Little and Vartivarian (2005), this seemed like a more useful thing to have, and we spent time looking for things that could fit the bill.

If we have something like this, the difference between respondents and the full sample might be a good proxy for bias in the actual Y's. I'm not backtracking here -- it's still not "nonresponse bias" in my book. It's just a proxy for it.
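A tiny sketch of what that comparison looks like, with made-up values for a proxy variable that is available for everyone on the frame:

```python
# Toy sketch: compare respondents to the full sample on a proxy Y
# known for all cases. The values below are invented for illustration.
import numpy as np

proxy_y   = np.array([3.0, 5.0, 4.0, 6.0, 2.0, 5.0, 7.0, 4.0])
responded = np.array([1, 0, 1, 0, 1, 1, 0, 1], dtype=bool)

full_mean = proxy_y.mean()
resp_mean = proxy_y[responded].mean()

# If the proxy is well correlated with the real Y, this gap is a rough
# indicator of nonresponse bias -- a proxy for it, not the bias itself.
print(resp_mean - full_mean)
```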

The paper we wrote found that good proxy Y's are hard to find. Still, it's worth looking. And, as I said, the term keeps us focused on finding these elusive measures.