Paradata is administrative data captured during the conduct of a survey, opinion, or marketing research project. Paradata may include any and all information linked to a survey project and its survey cases, but generally excludes the survey variables themselves. Common examples of paradata include characteristics and behaviors of interviewers, the outcomes and details of respondent contact attempts, respondents' time to completion, and the dates and times of data collection.

Jim O’Reilly (Westat) notes that the origin of paradata dates to the “early 1980s as trace files of fields entered, keys pressed, and timings during computer-assisted interviews.”[1] Since that time, the use of paradata has expanded tremendously, though many consider its potential to remain relatively untapped. Common current uses include examining and correcting survey error, validating respondents, evaluating interviewer performance, and determining the cost efficiencies of survey administration alternatives, among other purposes.

Although paradata is commonly used in marketing research, the term “paradata” itself is not often referenced. For example, researchers working with online panels use paradata for validation purposes, such as examining respondents' time to completion. Mick Couper (University of Michigan) points out that paradata has the following characteristics:

  • It is more useful at, or near, the end of the process than at the beginning
  • It is better suited to post-survey evaluation than to development
  • It fits within a continuous quality improvement framework
  • It identifies where problems occur, but not why
  • It identifies areas on which to focus attention using more expensive and time-consuming methods[2]
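The validation use mentioned above — screening online-panel respondents by their time to completion — can be sketched as follows. This is a minimal illustration, not any cited study's method; the function name, the 30%-of-median cutoff, and all timing figures are assumptions chosen for the example.

```python
from statistics import median

def flag_speeders(durations, cutoff_fraction=0.3):
    """Flag respondents whose completion time falls below a fraction of the
    median duration -- a simple 'speeder' heuristic for panel validation.
    `durations` maps respondent ID -> completion time in seconds."""
    med = median(durations.values())
    cutoff = med * cutoff_fraction
    return {rid for rid, secs in durations.items() if secs < cutoff}

# Illustrative timings: r3 finished far faster than the others.
times = {"r1": 610, "r2": 590, "r3": 95, "r4": 640}
print(flag_speeders(times))  # → {'r3'}
```

In practice a researcher would combine such a timing flag with other paradata (e.g., straight-lining or contact-attempt records) before excluding a case.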

Case Example 1: Researchers at the Urban Institute examined paradata from the National Survey of America’s Families to determine the effect of interviewer skill and experience on data quality.[3] They analyzed the relationship between interviewer characteristics (cooperation rate statistics, whether the interviewer participated in the first round of the survey, etc.) and outcomes such as responses to sensitive survey questions and rates of missing questionnaire data. No conclusive relationship was found; interviewers with more skill and/or experience did not appear to outperform other interviewers in terms of data quality.
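An analysis along the lines of Case Example 1 can be sketched as a simple correlation between an interviewer characteristic and a data-quality indicator. The figures below are invented for illustration and do not come from the NSAF study; a real analysis would use the study's own paradata and more formal modeling.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical paradata: each interviewer's cooperation rate paired with the
# share of items left missing in that interviewer's completed interviews.
coop_rate = [0.82, 0.64, 0.91, 0.70, 0.75]
missing_share = [0.03, 0.05, 0.02, 0.06, 0.04]

r = pearson(coop_rate, missing_share)
# A strongly negative r would suggest that interviewers who secure more
# cooperation also obtain more complete data; a value near zero would echo
# the study's inconclusive finding.
print(round(r, 2))
```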

Case Example 2: Researchers at the US Census Bureau implemented an early warning system to alert management of possible interviewer falsification and other problems based upon

[1] O’Reilly, Jim. 2009. Paradata and Blaise: A Review of Recent Applications and Research.

[2] Couper, Mick P. 2009. The Role of Paradata in Measuring and Reducing Measurement Error in Surveys. NCRM Network for Methodological Innovation 2009: The Use of Paradata in UK Social Surveys.

[3] Safir, Adam, Black, Tamara, and Steinbach, Rebecca. 2001. Using Paradata to Examine the Effects of Interviewer Characteristics on Survey Response and Data Quality.