Paradata are messy data. I've been working with paradata for a number of years, and I keep finding all kinds of issues. The data aren't always designed with the analyst in mind. They are usually a by-product of a process. Interviewers aren't focused on generating high-quality paradata, and rightly so. In many situations, they sacrifice the quality of the paradata in order to obtain an interview.
The good thing about paradata is that the analysis is usually done to inform specific decisions. How should we design the next survey? What is the problem with this survey? The analysis is effective if the decisions seem correct in retrospect, that is, if the predictions generated by the analysis lead to good decisions.
If students were interested in learning about paradata analysis, then I would suggest that they gain exposure to methods in statistics, machine learning, operations research, and an emerging category, "data science." It seems like exposure to methods from these areas would strengthen a person's ability to manage the messy data, find patterns in the data, and inform decisions based on the results. While we're certainly making strides forward in our ability to work with these data, a new generation with the right training will be able to carry this work further.