I've been thinking about the evaluation of the risk of nonresponse bias a bit lately. Imputation seems to be a natural way to evaluate those risks. In my setup, I impute the unit nonresponders. Then I can use imputation to evaluate the value of the data that I observed (a retrospective view) and to predict the value of different parts of the data that I did not/have not yet observed (a prospective view).
Allow me to use a little notation. Y_a is a matrix of observed data collected under protocol a. Y_b is a matrix of observed data collected under protocol b. Y_m is the matrix of data for the nonresponders. It's missing. I could break Y_m into two pieces: Y_m1 and Y_m2.
1) Retrospective. I can delete data that I observed and impute the values plus all the other missing values (i.e. the unit nonresponse). I can impute Y_b and Y_m conditional on Y_a. I can also impute Y_m conditional on Y_b and Y_a. It might be interesting to compare the estimates from these two procedures to see if protocol b has added much.
2) Prospective. In this case I can use a nested imputation procedure to predict the impact of each piece on the fraction of missing information (see Harel and Stratton, 2009). If I impute Y_m1 conditional on Y_a and Y_b, and then Y_m2 conditional on Y_a, Y_b, and Y_m1, I can then break the estimated FMI into components due to Y_m1 and Y_m2. In this way I can predict which cases are more valuable in the sense of contributing more to the information.
Allow me to use a little notation. Y_a is a matrix of observed data collected under protocol a. Y_b is a matrix of observed data collected under protocol b. Y_m is the matrix of data for the nonresponders. It's missing. I could break Y_m into two pieces: Y_m1 and Y_m2.
1) Retrospective. I can delete data that I observed and impute the values plus all the other missing values (i.e. the unit nonresponse). I can impute Y_b and Y_m conditional on Y_a. I can also impute Y_m conditional on Y_b and Y_a. It might be interesting to compare the estimates from these two procedures to see if protocol b has added much.
2) Prospective. In this case I can use a nested imputation procedure to predict the impact of each piece on the fraction of missing information (see Harel and Stratton, 2009). If I impute Y_m1 conditional on Y_a and Y_b, and then Y_m2 conditional on Y_a, Y_b, and Y_m1, I can then break the estimated FMI into components due to Y_m1 and Y_m2. In this way I can predict which cases are more valuable in the sense of contributing more to the information.
Comments
Post a Comment