By Melanie Courtright, Member, IA Standards Committee.
Scope and Definition Considerations
Interest in and use of synthetic data have evolved significantly over the past two years. Synthetic respondents, simulated personas, and AI-generated datasets are now common in conference discussions, product development, and professional dialogue.
This shift makes it timely to revisit how synthetic data and synthetic participants intersect with the Insights Association Code of Standards and Ethics. The intent is not to slow innovation or endorse it uncritically, but to ensure that adoption is deliberate, transparent, and aligned with the principles that underpin research integrity and public trust.
For purposes of applying the Code, two foundational realities are relevant:
- Synthetic data and participants are often derived from, or informed by, data originating from real individuals
- Outputs generated from synthetic data or participants may be used to inform business decisions in ways similar to traditional research outputs
As a result, both participant protection principles and research integrity standards continue to apply.
IA Code Section 2: Primary Data Collection and Consent
The Code requires that researchers obtain informed consent for the collection and use of participant data, including when that data is used in ways that differ materially from its original purpose.
Application to Synthetic Data
When real participant data is used to develop, train, or inform synthetic models:
- This may constitute a material change from the original purpose of data collection
- Participants may not reasonably expect their data to contribute to simulated data and respondents, their opinions to be included in training data sets, or their data to enable future modeling applications
Guidance
- Consent language should clearly account for secondary and future uses of data where applicable
- If data may be used in modeling or simulation, this should be communicated in clear and understandable terms
- Where consent does not explicitly cover these uses, researchers should carefully assess whether the data is appropriate for inclusion
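One way to put this guidance into practice is to screen records by consent scope before they reach a modeling pipeline. The following is a minimal sketch, not a prescribed method: the `ParticipantRecord` type, the `consented_uses` field, and the `"synthetic_modeling"` label are all hypothetical names chosen for illustration, and a real consent record would likely carry more detail.

```python
from dataclasses import dataclass

# Hypothetical label for the consent scope that covers modeling or simulation.
MODELING_USE = "synthetic_modeling"


@dataclass(frozen=True)
class ParticipantRecord:
    """Illustrative record: an ID plus the uses the participant agreed to."""
    participant_id: str
    consented_uses: frozenset = frozenset()


def split_by_consent(records, required_use=MODELING_USE):
    """Separate records whose consent explicitly covers `required_use`
    from records that need a manual appropriateness review."""
    eligible = [r for r in records if required_use in r.consented_uses]
    needs_review = [r for r in records if required_use not in r.consented_uses]
    return eligible, needs_review


records = [
    ParticipantRecord("p1", frozenset({"survey_analysis", MODELING_USE})),
    ParticipantRecord("p2", frozenset({"survey_analysis"})),
]
eligible, needs_review = split_by_consent(records)
```

Records in `needs_review` are not automatically excluded here; the sketch only routes them to the careful assessment described in the guidance.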
IA Code Sections 3, 4, and 6: Artificial Intelligence, Data Protection, and Privacy
The Code requires that personal data be protected against unauthorized access, disclosure, and re-identification.
Application to Synthetic Data
Synthetic models may introduce risk if:
- Source data contains identifiable or sensitive information
- Model outputs inadvertently reproduce or reveal elements of original participant data
Guidance
- Data used in all synthetic modeling, including synthetic participants, digital twins, and generated datasets, should be anonymized and minimized prior to use
- Researchers should evaluate whether model outputs could enable re-identification, directly or indirectly
- Appropriate security controls should be maintained throughout the data lifecycle
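As an illustration of an indirect re-identification check, the sketch below flags synthetic records that reproduce a combination of attributes held by only a few people in the source data. This is a simple k-anonymity-style heuristic under stated assumptions, not a complete privacy assessment: the function names, the choice of quasi-identifier columns, and the threshold `k` are all hypothetical and would need to be set by the research team.

```python
from collections import Counter


def rare_combinations(source_rows, quasi_identifiers, k=5):
    """Return attribute combinations shared by fewer than k source records."""
    counts = Counter(
        tuple(row[q] for q in quasi_identifiers) for row in source_rows
    )
    return {combo for combo, n in counts.items() if n < k}


def flag_risky_outputs(synthetic_rows, source_rows, quasi_identifiers, k=5):
    """Return synthetic rows that match a rare combination in the source data."""
    rare = rare_combinations(source_rows, quasi_identifiers, k)
    return [
        row for row in synthetic_rows
        if tuple(row[q] for q in quasi_identifiers) in rare
    ]


# Illustrative data only; the columns and values are made up.
source = [
    {"age_band": "30-39", "region": "TX"},
    {"age_band": "30-39", "region": "TX"},
    {"age_band": "60-69", "region": "VT"},
]
synthetic = [
    {"age_band": "30-39", "region": "TX"},
    {"age_band": "60-69", "region": "VT"},
]
risky = flag_risky_outputs(synthetic, source, ["age_band", "region"], k=2)
```

A flagged row does not prove that an individual can be identified; it marks an output that deserves closer review before release.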
IA Code Section 9: Research Integrity and Methodological Soundness
The Code requires that research adhere to accepted methodological standards, and that emerging methods be evaluated for validity and reliability.
Application to Synthetic Data
Synthetic approaches may:
- Generate outputs without direct observation of real participants
- Produce responses that are plausible, but not grounded in empirical observation
Guidance
- Researchers should assess whether synthetic outputs are fit for their intended purpose
- Synthetic data should not be presented as equivalent to observed human responses without appropriate validation
- Methodological limitations should be clearly understood, documented, and considered in application
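One possible form of validation is to compare how synthetic and human respondents answer the same closed-ended question. The sketch below computes the total variation distance between the two answer distributions, where 0 means identical shares and 1 means no overlap. It is an assumption that this metric suits a given study; the function names are hypothetical, and any acceptance threshold would need to be justified and documented by the researcher.

```python
from collections import Counter


def answer_shares(answers):
    """Convert a list of categorical answers into a share per option."""
    if not answers:
        raise ValueError("answers must not be empty")
    counts = Counter(answers)
    total = len(answers)
    return {option: n / total for option, n in counts.items()}


def total_variation_distance(human_answers, synthetic_answers):
    """Half the sum of absolute differences in answer shares (range 0 to 1)."""
    human = answer_shares(human_answers)
    synthetic = answer_shares(synthetic_answers)
    options = set(human) | set(synthetic)
    return 0.5 * sum(
        abs(human.get(o, 0.0) - synthetic.get(o, 0.0)) for o in options
    )


# Illustrative answers only.
human = ["yes", "yes", "no", "no"]
synthetic = ["yes", "yes", "yes", "no"]
distance = total_variation_distance(human, synthetic)
```

A small distance on one question does not establish equivalence in general; it is one piece of evidence to record alongside the method's known limitations.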
IA Code Sections 1, 4, 8, and 10: Artificial Intelligence, Transparency and Disclosure
The Code emphasizes honesty, transparency, and accurate representation of research methods and findings.
Application to Synthetic Data
Risks of misunderstanding increase when:
- The nature of the data source is not clearly communicated
- Outputs are presented without distinguishing between human-derived and synthetic inputs
Guidance
- The use of synthetic participants should be fully disclosed to research buyers and stakeholders
- Reporting should clearly indicate whether findings are based on:
  - Human participants
  - Synthetic participants or digital twins
  - A combination of both
- Methodological descriptions should provide sufficient detail to support informed interpretation
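The reporting distinction above can be made routine by deriving the disclosure label from the sample counts rather than writing it by hand. The following is a minimal sketch under that assumption; the function name `data_source_label` and its parameters are hypothetical, and the label wording simply mirrors the three categories listed in the guidance.

```python
def data_source_label(n_human, n_synthetic):
    """Return the disclosure label for a finding, given how many human and
    synthetic participants contributed to it."""
    if n_human < 0 or n_synthetic < 0:
        raise ValueError("participant counts must not be negative")
    if n_human == 0 and n_synthetic == 0:
        raise ValueError("a finding needs at least one participant")
    if n_synthetic == 0:
        return "Human participants"
    if n_human == 0:
        return "Synthetic participants or digital twins"
    return "A combination of both"


label = data_source_label(n_human=400, n_synthetic=1200)
```

Attaching the label to each reported finding, together with the underlying counts, gives research buyers the detail they need to interpret the result.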
IA Code Section 11: Professionalism and Public Trust
The Code requires researchers to act with integrity and avoid practices that could undermine confidence in the profession.
Application to Synthetic Data
Public trust may be affected if:
- Synthetic data is used in ways that imply representation of real individuals
- The distinction between observed behavior and modeled behavior is not clearly maintained
Guidance
- Synthetic data should not be used in a manner that misrepresents the source of insight
- Researchers should consider how synthetic approaches may influence confidence in findings and the broader research process
- The welfare and expectations of human participants should remain central, even when direct interaction is not taking place
Summary
The IA Code of Standards and Ethics provides a durable framework for evaluating emerging methodologies, including synthetic data and synthetic participants. While research tools and technologies continue to evolve, the core obligations remain consistent:
- Obtain appropriate consent
- Protect participant data
- Ensure methodological integrity
- Maintain transparency in methods and reporting
- Uphold public trust
Synthetic data does not change these responsibilities. It reinforces the need to apply them with the same level of rigor, discipline, and accountability required of any other research approach.
ABOUT THIS SERIES: The Insights Association Code of Standards & Ethics sets the principles that guide ethical and professional market research, insights, and analytics. But how do those standards apply in everyday practice? In this series, members of IA’s Standards Committee bring the Code to life through practical examples, showing how it guides responsible research and decision-making across the industry.
About the Author
Melanie Courtright, Chief Strategy Officer at Sago, is a member of the Insights Association Standards Committee and former CEO of the association.