Thursday, August 25, 2016

Combining surveys with other sources of data

The term "big data" was meant to cover a wide variety of types of data. Surveys were left out of the definition. Bob Groves attempted to remedy this by coining the terms "organic" and "designed" data. These terms were meant to capture the strengths and weaknesses of big data, on the one hand, and survey data, on the other. Organic data are not generated for research purposes, but they are usually inexpensive to obtain (though not necessarily cheap to analyze). Survey data are designed for research but are often expensive to obtain.

I'm finding that these terms might get in the way of thinking about some actual problems. For instance, travel survey researchers are looking at combining survey data with GPS data. GPS data can be large and complex, i.e., "big data." Yet features of these data are designed by the researchers in travel studies: they ask persons to carry a GPS device or download an app to their smartphones. These data are certainly outside of what would traditionally be thought of as a survey. At the same time, they are more useful when combined with survey data, as in this study.

Combining survey data with other sources of data, often measured by new technologies in ways that generate large amounts of data, is an interesting area for surveys, and one that might not fit neatly into current categories.