Posts

Showing posts from 2020

Total Data Quality

In an earlier post, I suggested that survey methodologists are "data quality specialists." Our focus on "total survey error" (TSE) is, in many ways, the central defining concept of our field. This focus on data quality could be an important contribution that survey methodologists make to the emerging field of data science. But in order to make that contribution, we may need to test the fit of the TSE concept on evaluations of non-survey data. One of the sources of error that we examine in surveys is "nonresponse." Does this concept apply to other sources of data? Certainly other sources of data have missing data. But nonresponse is a specific mechanism where we sample a unit and then request data, but the unit fails to supply the data. How does this concept apply to other sources of data? I wouldn't say that Twitter data suffer from "nonresponse" due to the fact that not everyone has a Twitter account or even that not every