Kapur and Herrera’s article on improving data quality does an excellent job of describing some of the problems and dilemmas faced in data collection. The piece begins with the authors’ views on the problems that currently exist in data sets. They measure data quality using a framework of validity, coverage, and accuracy:
Validity refers to the relationship between theoretical concepts and collected information; coverage refers to the completeness of data sets; and accuracy refers to the correctness or avoidance of errors in data sets (366).
The rest of the article is an assessment of the actors in the data collection process. Kapur and Herrera identify problems in data sets along the three dimensions of their framework (validity, coverage, and accuracy), and they attempt to account for these problems by examining the incentives and capabilities of each actor in the data collection process.
For example, respondents to surveys or questionnaires have particular incentives for answering questions a certain way. These include “opportunity costs, fear of punishment, political support, and material gain” (372). A respondent’s capabilities may include the time available to fill out a survey, knowledge of the material being asked about, and level of physical or mental health.
Through the authors’ analysis of each actor’s incentives and capabilities, the reader comes to understand the problems that can arise in data collection. Research that is conducted, and data that are collected, with these issues in mind can only improve in quality, whatever the area of study.