Since the results of the experiment on call scheduling were good (with the experimental method having a slight edge over the current protocol), I've been allowed to test the experimental method against other contenders. The experimental method is described in a prior post.
This month, I'm testing the experimental method, which uses the predicted value (MLE) of the contact probability across the four calling windows, against another method that uses the Upper Confidence Bound (UCB) of the predicted probability. The UCB method quite often assigns a different calling window than the experimental method does.
The UCB method is designed to attack your uncertainty about a case. Lai ("Adaptive Allocation and the Multi-Armed Bandit Problem," 1987) proposed the method. Other than the fact that our context (calling households to complete surveys) is a relatively short process (i.e., few pulls on the multi-armed bandit), the multi-armed bandit analogy fits quite well.
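To make the contrast concrete, here is a minimal sketch of how the two rules can disagree on which window to call next. The window labels, the call counts, and the UCB1-style exploration bonus are all illustrative assumptions, not the actual scoring used in the experiment:

```python
import math

# Hypothetical call history for one case: (contacts, attempts) per calling window.
# These labels and counts are made up for illustration only.
history = {
    "weekday_day":     (1, 6),
    "weekday_evening": (2, 4),
    "weekend_day":     (0, 1),
    "weekend_evening": (1, 3),
}

def mle(contacts, attempts):
    """Point estimate (MLE) of the contact probability for a window."""
    return contacts / attempts if attempts > 0 else 0.0

def ucb(contacts, attempts, total_attempts):
    """UCB1-style upper bound: the MLE plus an exploration bonus that
    shrinks as a window accumulates attempts."""
    if attempts == 0:
        return float("inf")  # an untried window is maximally uncertain
    bonus = math.sqrt(2.0 * math.log(total_attempts) / attempts)
    return mle(contacts, attempts) + bonus

total = sum(attempts for _, attempts in history.values())

mle_pick = max(history, key=lambda w: mle(*history[w]))
ucb_pick = max(history, key=lambda w: ucb(*history[w], total))

print("MLE picks:", mle_pick)   # exploits the best point estimate
print("UCB picks:", ucb_pick)   # may prefer a less-tried, more uncertain window
```

With these made-up counts, the MLE rule keeps calling the window with the best observed contact rate, while the UCB rule is pulled toward the barely-tried window, which is the exploration behavior described below.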
In my dissertation, I ran simulations suggesting that the UCB approach might beat the MLE approach over the long run but fall behind early. In other words, it learns early and then exploits that learning later. Those data were simulated, though. We'll see how it works on the real thing...