I'm working on a paper on this topic. One of the things I've been looking at is the accuracy of predictions from models that use data collected during the field period. I think of this as a missing data problem. The daily models can yield estimates that are biased. For example, an estimate based on today's data might overestimate the number of interviews tomorrow. This can happen if my estimate of the number of interviews to expect on the third call is based on a select set of cases that responded more easily (compared to the cases that haven't yet received a third call).
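To make the selection problem concrete, here is a minimal simulation sketch. It assumes (purely for illustration) that each case has a latent contact propensity and that easier-to-reach cases work through their call attempts faster, so a mid-field estimate is computed over the easy subset only. The case counts and propensity range are hypothetical, not from the survey in the paper.

```python
import random

random.seed(42)

# Hypothetical setup: each case has a latent contact propensity.
N = 10_000
cases = [random.uniform(0.05, 0.5) for _ in range(N)]

# Assume cases with higher propensity move through earlier call
# attempts faster, so by today only the easiest half has received
# a third call. An estimate based on them is over a select subset.
cases_sorted = sorted(cases, reverse=True)
reached_third_call = cases_sorted[: N // 2]

est_today = sum(reached_third_call) / len(reached_third_call)
true_mean = sum(cases) / N
print(f"estimate from observed cases: {est_today:.3f}")
print(f"true average propensity:      {true_mean:.3f}")
```

The estimate from the observed subset comes out well above the true average propensity, which is the direction of bias described above: predictions for tomorrow's interviews built on today's easy responders are too optimistic.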
One of the examples in the paper comes from contact propensity models I built for a monthly telephone survey a few years ago. Since the survey is monthly, I could use data from prior months. Getting the right set of prior data (or, in a Bayesian perspective, priors) is important. I found that the prior months' data had a contact rate of 9.4%. The current month had a contact rate of 10.9%, but my estimates for the current month sat below that due to the weight of the prior data. Ouch.
I'm thinking that a Bayesian setup for this problem will actually work much better. I can calibrate the priors so that, past a critical tipping point, the current month's data plays the greater role.
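One way to sketch that calibration is a beta-binomial model where the prior months' rate is encoded as a Beta prior with a chosen "effective sample size" n0: a large n0 weights the prior heavily, a small n0 lets the current month dominate sooner. The 9.4% and 10.9% rates are from the example above, but the call volumes and the two n0 values are illustrative assumptions, not the survey's actual numbers.

```python
# Beta-binomial sketch: the prior months' contact rate (9.4%) enters
# as a Beta(prior_rate * n0, (1 - prior_rate) * n0) prior, where n0
# is the prior's effective sample size. Calibrating n0 controls how
# quickly current data overtakes the prior.

def posterior_rate(prior_rate, n0, successes, trials):
    a = prior_rate * n0 + successes
    b = (1 - prior_rate) * n0 + (trials - successes)
    return a / (a + b)  # posterior mean contact rate

prior_rate = 0.094                       # prior months' contact rate
cur_trials, cur_successes = 5_000, 545   # 10.9% observed this month (hypothetical counts)

heavy = posterior_rate(prior_rate, 50_000, cur_successes, cur_trials)
light = posterior_rate(prior_rate, 1_000, cur_successes, cur_trials)

print(f"heavy prior (n0=50,000): {heavy:.4f}")  # dragged toward 9.4%
print(f"light prior (n0=1,000):  {light:.4f}")  # close to 10.9%
```

With the heavy prior the posterior stays near 9.5%, reproducing the "weight of the prior data" problem; with the lighter prior it moves to about 10.7%. The tipping point is then just the field day at which the current month's trial count swamps whatever n0 was chosen.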