I've been continuing the call-scheduling experiments. January was the first month in which response rates differed by treatment group; in earlier months, the rates for the two arms (control and experimental) were similar. But similar overall rates don't necessarily mean the two methods obtain the same result.
When I look at response rates by phase, even in prior months, the experimental method appears to have a higher response rate in the calls prior to a refusal and a lower response rate in the calls after a refusal (even though both sets of calls are now governed by the algorithm). The following tables show the results from December and January (AAPOR RR2 by refusal status, not the overall response rate):
In the end, I will want to compare the respondents from the two groups and by refusal status to see if they do exhibit differences, especially on survey outcome variables.
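For anyone who wants to reproduce this kind of breakdown, here is a minimal sketch of how RR2 by treatment arm and refusal status could be tabulated. It is not the code used for the tables above. It assumes each case carries a final disposition code in AAPOR's shorthand (I, P, R, NC, O, UH, UO) and has already been assigned to an arm and a phase; the function names, the phase labels "pre_refusal" and "post_refusal", and the tuple layout are all hypothetical.

```python
from collections import Counter


def aapor_rr2(counts):
    """AAPOR RR2 = (I + P) / ((I + P) + (R + NC + O) + (UH + UO)).

    counts maps disposition codes to case counts. Codes outside the
    formula (for example, ineligible cases) are ignored.
    """
    interviews = counts.get("I", 0) + counts.get("P", 0)
    nonrespondents = counts.get("R", 0) + counts.get("NC", 0) + counts.get("O", 0)
    unknown = counts.get("UH", 0) + counts.get("UO", 0)
    denominator = interviews + nonrespondents + unknown
    return interviews / denominator if denominator else float("nan")


def rr2_by_arm_and_phase(cases):
    """Tally dispositions per (arm, phase) cell and return RR2 for each cell.

    cases is an iterable of (arm, phase, disposition) tuples. This layout
    is an assumption made for illustration, not the experiment's data model.
    """
    tallies = {}
    for arm, phase, disposition in cases:
        tallies.setdefault((arm, phase), Counter())[disposition] += 1
    return {cell: aapor_rr2(counts) for cell, counts in tallies.items()}


if __name__ == "__main__":
    # Made-up cases, only to show the shape of the input and output.
    example_cases = [
        ("control", "pre_refusal", "I"),
        ("control", "pre_refusal", "R"),
        ("experiment", "pre_refusal", "I"),
        ("experiment", "post_refusal", "NC"),
    ]
    for cell, rate in sorted(rr2_by_arm_and_phase(example_cases).items()):
        print(cell, round(rate, 3))
```

Keeping the rate formula separate from the tallying makes it easy to swap in a different AAPOR rate later, or to add a respondent-level comparison on top of the same cells.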