How Suzy’s Biotic is rewriting the rules of data quality

Apr 3, 2025

In an industry built on trust and precision, data quality is everything. Yet as online research continues to scale and generative AI becomes more accessible than ever, so too does the risk of sophisticated fraudulent respondents entering studies undetected and compromising results. While researchers have developed methods to detect and manage bad data, these reactive approaches often come too late—after time and money have already been spent. At Suzy, we believe the solution lies not in catching fraud after the fact, but in preventing it from happening in the first place. That’s why we built Biotic: to protect research integrity by addressing fraud across the entire respondent lifecycle, long before a question is ever answered.

The cost of bad data

The purpose of market research is to deliver reliable data that supports confident decision-making. When fraud is detected in a dataset, the next step for many researchers and agencies is to simply pay for more sample to offset the bad data. This cost is absorbed by someone along the way, whether it’s the researchers or agencies conducting the research or the panel provider that supplies the participants. And while it may not be the brand absorbing the financial cost of additional sample, they will always incur the cost of additional fielding time and delays in results. 

Beyond the financial and time costs for individual studies, fraud threatens the validity of the entire market research industry. When bots infiltrate a panel, they undermine trust in every dataset that panel provides—raising fundamental questions about the reliability and role of market research as a whole.

Rethinking “removal rate” as a success metric

According to the Insights Association, the average removal rate of online sample responses has surged to over 43%—but what does that really tell us? Contrary to what some might believe, removal rate alone isn't a measure of data quality or research success. A low removal rate may suggest a clean dataset, but it could also mean that no effective checks were in place to catch bad responses. Conversely, a high removal rate might indicate major underlying issues with the panel, raising questions about whether it should have been used at all. 

The real issue is not how many responses were removed—it's why they were removed and whether the measures that prompted their removal allowed any bad quality responses to slip through undetected. Without proper detection mechanisms, a low removal rate is meaningless. And with insufficient context, a high one can be misleading.

Traditional research agencies and panel providers are very much incentivized to complete fielding quickly, which can lead to shortcuts in quality assurance. In many cases, removal rates are not disclosed at all, or when they are, they’re stripped of the context brands need to properly interpret them. 

In a DIY environment, this challenge is even more acute. On self-serve platforms, the burden of data cleaning shifts to the brand, requiring internal teams to manually identify and remove poor-quality responses—an added layer of work that compromises the very speed and agility DIY platforms promise.

Suzy’s approach, powered by Biotic, turns this model on its head. Our system is designed for fraud prevention, not just detection. We don’t rely on removal rates as a KPI. Instead, we aim to prevent bad data from ever entering your dataset by applying rigorous front-end quality controls—ensuring your results are reliable from the start.

Here’s a summary of the three approaches to research quality management:


| | Suzy (with Biotic) | Agencies | Brands (DIY Platforms) |
| --- | --- | --- | --- |
| Proactive vs. Reactive | Proactive fraud prevention | Reactive fraud detection | Reactive fraud detection |
| Timing of Quality Checks | Before, during, and after responses are collected | After responses are collected | After responses are collected |
| Breadth of Visibility | Fully integrated quality checks across both panel and platform | Siloed detection systems | Siloed detection systems |
| Responsible Party for Data Cleaning | Handled by Suzy via Biotic | Agency or panel vendor responsibility | Brand’s internal team responsible |
| Potential Impact on Fielding | Minimized due to upfront protections | Delays due to refielding and manual data cleaning | Delays due to manual data cleaning |

Fraud detection vs. fraud prevention

Fraud detection and fraud prevention are often used interchangeably, but they represent two fundamentally different approaches to ensuring data quality. Fraud detection methods are reactive and typically operate independently of one another: panel providers use detection tools like digital fingerprinting, IP matching, and email domain validation, while research platforms check for straightlining, speeding, or gibberish responses. These methods not only operate in silos—with little to no connection between panel and research tool checks—but some also come into play only after a respondent begins participating in research studies. In this model, fraud detection functions as a point-in-time band-aid, requiring agencies and brands to manually remove bad data, replace respondents, and often extend fieldwork timelines as a result.

Rather than simply detecting fraud after the fact, Biotic proactively prevents it before it ever reaches your data.

How Biotic works: Real-time fraud prevention at every touchpoint

Biotic proactively prevents fraud: it continuously analyzes patterns of behavior across dozens of detection methods and leverages machine learning to identify and trap bots before they have the chance to enter research data. Because Suzy owns its own panel, Biotic is uniquely positioned with access to the entire research ecosystem and the full participant lifecycle, from signup to survey participation to incentives and everywhere in between. Biotic can identify how individual and collective respondent behavior changes over time, making it possible to spot anomalies and evolving patterns that more siloed systems would miss. This ecosystem-wide view allows Biotic to intervene before a fraudulent respondent even reaches a survey.

At onboarding: Screening out bad actors before they begin

Before a respondent is ever approved to participate in research, Biotic performs a comprehensive series of checks, including:

  • IP blocking and email domain blocking: Prevents users with known bad IPs or suspicious email domains (e.g., disposable or temporary emails) from entering the system.
  • Email verification: Requires all new signups to verify their email addresses, ensuring the address is valid and in use.
  • Double verification (SMS) and telesign: Adds an extra layer of protection by confirming each user is a unique individual and checking phone numbers for fraud risk.
  • Onboarding questions: Includes education and attention scripts that reinforce expectations and assess whether respondents are actively engaged and understand the importance of their feedback.
  • CAPTCHA and Azure Vision: Confirms human presence with CAPTCHA tests, plus photo upload verification using Microsoft’s Azure Vision AI to validate identity.
  • New accounts review: Assesses overall account activity for larger patterns of coordinated fraud or unusual sign-up behavior.
  • Device risk blocking (Verisoul): Flags and blocks signups from devices associated with high-risk, spoofed, or previously banned activity.

Only once a user clears these onboarding checkpoints are they allowed to participate in live research.
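This sequential gate can be sketched in a few lines of Python. The checks, blocklists, and field names below are hypothetical stand-ins for the systems listed above (Telesign, Verisoul, Azure Vision, and so on), not Suzy's actual implementation:

```python
# Illustrative sketch: onboarding modeled as a sequential gate.
# A signup is approved for live research only if every check passes.
# All data values and blocklists here are made-up examples.

DISPOSABLE_DOMAINS = {"mailinator.com", "tempmail.io"}  # example blocklist
BLOCKED_IPS = {"203.0.113.7"}                           # example blocklist

def check_email_domain(signup):
    return signup["email"].split("@")[-1] not in DISPOSABLE_DOMAINS

def check_ip(signup):
    return signup["ip"] not in BLOCKED_IPS

def check_email_verified(signup):
    return signup.get("email_verified", False)

def check_sms_verified(signup):
    return signup.get("sms_verified", False)

ONBOARDING_CHECKS = [check_email_domain, check_ip,
                     check_email_verified, check_sms_verified]

def approve_for_research(signup):
    """A signup clears onboarding only if every check passes."""
    return all(check(signup) for check in ONBOARDING_CHECKS)
```

The point of the gate structure is that any single failed check keeps the account out of live research, regardless of how the other checks score.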

During surveys: Real-time monitoring and anomaly detection

Once a respondent begins a survey, Biotic monitors their responses in real time, layering behavior-based detection with historical benchmarks to catch signs of fraud as it happens:

  • Email domain and IP monitoring: Ongoing monitoring for VPN usage, email address changes, and risky patterns associated with fraud networks.
  • Open-end duplication and relevancy review: A live review process by Suzy’s data quality team to catch duplicate or irrelevant open-end answers before they enter the dataset.
  • Speeding and straightlining: Flags respondents who rush through surveys or consistently choose the same answer option—often hallmarks of disengaged or automated behavior.
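As a rough sketch (the threshold and inputs here are illustrative assumptions, not Biotic's production logic), speeding and straightlining checks can be expressed as simple rules against a historical benchmark:

```python
# Illustrative sketch of two in-survey quality rules. The 30% threshold
# is a hypothetical example, not a real Biotic parameter.

SPEED_THRESHOLD = 0.3  # assumed: under 30% of the median completion time

def flag_speeding(duration_seconds, median_duration_seconds):
    """Flag respondents who finish implausibly faster than the benchmark."""
    return duration_seconds < SPEED_THRESHOLD * median_duration_seconds

def flag_straightlining(grid_answers):
    """Flag grid questions where every item received the same answer option."""
    return len(grid_answers) > 1 and len(set(grid_answers)) == 1
```

In practice such rules would be layered with many other signals rather than used alone, since a fast but attentive respondent should not be removed on speed alone.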

Between surveys: Continuous quality assurance

Biotic doesn’t stop working once a survey ends. In between research engagements, Biotic continues to validate respondent quality with passive and active defenses:

  • Biotic trap questions: Strategically placed questions that bots or inattentive users are likely to fail, but that human participants will easily pass.
  • Incentive monitoring: Tracks redemption behaviors across the panel to flag suspicious patterns like rapid cash-out attempts or bot-farm-like incentive harvesting.
  • Biotic honeypot: If a respondent fails a trap question or triggers suspicious behavior, they are silently diverted into a honeypot—a controlled testing environment where they no longer participate in real research. This allows Biotic to observe and learn from their behavior without signaling that they’ve been caught.

Because these bad actors don’t know they’ve been flagged, they don’t know how to adjust their behavior—giving Biotic the upper hand. And by the time they might figure out what triggered their flag (if ever), Biotic has already evolved to outpace them.
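The silent-diversion logic described above can be sketched as follows; the function and field names are hypothetical, not Suzy's actual code:

```python
# Illustrative sketch of honeypot diversion. A flagged respondent keeps
# receiving surveys, but from a quarantined pool that never feeds real
# research data, so nothing signals that they have been caught.

def record_trap_result(respondent, passed_trap):
    """Failing a trap question flags the account without notifying it."""
    if not passed_trap:
        respondent["flagged"] = True
    return respondent

def route_respondent(respondent, live_surveys, honeypot_surveys):
    """Return the survey pool a respondent should draw from."""
    if respondent.get("flagged"):
        return honeypot_surveys  # silently diverted: same experience, isolated data
    return live_surveys
```

The key design choice is that flagging and routing are decoupled: the respondent's experience is identical either way, which is what keeps the diversion invisible.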

A living, learning system built for quality

Biotic’s fraud prevention engine isn’t just rule-based—it’s adaptive by design. Fraud in research evolves constantly, which is why Biotic evolves with it. Using a wave-based fraud identification model, Biotic continuously learns from the behaviors seen in each round of survey data. When new fraud patterns emerge, Biotic doesn’t wait to catch up—it evolves in real time. 

Just last week, Biotic began surfacing “fuzzy duplicate” responses that weren’t exact duplicates but were suspiciously similar—mirroring structure, tone, and content in ways that suggested automation or coordinated abuse, bypassing traditional deduplication, grammar, and relevance checks.

For example:

  • Fuzzy Duplicate 1: "A L'Oreal brand eyeliner, I bought it at Walmart, I loved its quality."
  • Fuzzy Duplicate 2: "I bought the L'Oreal brand eyeliner, I bought it at Walmart, I really liked that purchase because I was able to calmly choose the one that was right for me."

This emerging pattern prompted the development of fuzzy duplicate detection, an enhancement that evaluates semantic similarity and syntactic patterns to catch near-duplicates that would otherwise slip through. The result? Once Biotic began checking for this pattern of fraud on March 21 (see the turquoise bars in the stacked bar chart below), we were able to keep thousands of fuzzy duplicate responses out of customer data from that day on.
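As a rough illustration of one signal such a check might use (this is a sketch, not Biotic's actual model), near-duplicates like the pair above tend to reuse whole phrases even when the full texts differ, and shared long word n-grams catch exactly that:

```python
# Illustrative near-duplicate check: flag a pair of open-ends that share
# any run of n consecutive words. The threshold n=5 is an assumption.

import re

def word_ngrams(text, n):
    """All n-word sequences after lowercasing and stripping punctuation."""
    words = re.findall(r"[a-z']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def is_fuzzy_duplicate(a, b, n=5):
    """True if the two texts share any n-word phrase."""
    return bool(word_ngrams(a, n) & word_ngrams(b, n))

# The two example responses from the article share the phrase
# "L'Oreal brand eyeliner, I bought it at Walmart" and are flagged.
dup1 = "A L'Oreal brand eyeliner, I bought it at Walmart, I loved its quality."
dup2 = ("I bought the L'Oreal brand eyeliner, I bought it at Walmart, "
        "I really liked that purchase because I was able to calmly choose "
        "the one that was right for me.")
```

A production system would go further, weighing semantic similarity (paraphrases with no shared phrasing) alongside syntactic overlap like this, but the n-gram signal shows why exact-match deduplication misses these pairs.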

These iterative innovations are made possible by Biotic’s layered approach, which combines automated machine learning models with human oversight. Suzy’s dedicated data quality team plays a critical role, manually reviewing edge cases and feeding insights back into the system to continuously refine its capabilities.

The outcome is a living, learning defense system—one that protects research not just after fraud occurs, but before it ever has the chance. By staying ahead of emerging threats and integrating new fraud signals on an ongoing basis, Biotic ensures the data that powers business decisions remains authentic, reliable, and worthy of the trust placed in it.

One system. Any source. Only at Suzy.

Biotic is source-agnostic. It works whether a sample comes from Suzy’s proprietary panel (Crowdtap) or one of our trusted third-party panel partners. But access to Biotic is exclusive to the Suzy platform. That means whether you're sourcing consumers directly or blending our proprietary panel with third-party panel sources, you can rely on the same rigorous fraud protection without lifting a finger.

Asking the right questions

To ensure your research partners are prioritizing data quality, ask them these critical questions:

  1. How do you detect and prevent fraudulent respondents in real time?
  2. What are the newest fraud patterns you’ve seen?
  3. How often do you update your fraud detection methodologies to adapt to new threats?
  4. What safeguards do you have in place to prevent bot activity and duplicate respondents?
  5. Do you rely on manual review, automated AI detection, or a combination of both?
  6. Do you charge clients extra for additional sample to compensate for poor-quality responses?
  7. How do you compensate for fielding time delays due to poor quality?
  8. How do you work with clients to continuously improve data quality throughout a project?

At Suzy, we built Biotic to ensure the answer to those questions is always proactive, detailed, and transparent. Fraud isn’t going away. But with Biotic, neither is your confidence in the data you rely on to lead.

Fraud is everywhere—but Suzy’s Biotic is built to beat it.
