The elephant in the room: A history of data quality in market research
Explore the history of data quality in market research and see what’s next.
By: Steve Seiferheld, Senior Director, Market Research, and Jessica Corbett, Associate Director, Product Marketing
Earlier in my (Steve’s) career, I oversaw a mall intercept research study, where people at malls were recruited to an onsite facility to participate in a taste test. While there, I observed a couple of women who had come in together for a different study and were joking about which answers they should pick. It was an early-career lesson: any research can suffer from poor-quality data or from bad actors who participate for the wrong motivations--for the incentive rather than the opportunity to give honest feedback to brands.
While survey research will never be fully immune to data quality challenges, research methods have evolved significantly over the years to mitigate biases wherever possible--and as those methods evolved, new challenges emerged, followed by new solutions, and so on. We now find ourselves at a particularly pivotal point, and the path forward is not straightforward. To better understand how today’s push and pull of challenges and solutions differs from what the industry has faced in the past, let’s look at a brief history of quality issues in survey research.
The rise of online survey research
The birth of the internet brought about a new era of data collection, and while this new era offered convenience and reach, it, like its predecessors, also faced its own unique set of challenges. In the early days, online surveys often excluded certain demographics, such as older or less tech-savvy individuals, leading to concerns about sample representativeness. App-based surveys were limited to respondents who had downloaded and actively used the app, which similarly resulted in skewed samples.
Additionally, the desire to complete surveys quickly and maximize incentives led to new data quality issues, like speeding, straight-lining, and gibberish responses. Researchers implemented quality checks, like logic traps and red herring questions, to filter out low-quality respondents. While these checks were effective at weeding out careless human respondents, they were less successful against the next wave of challenges for the survey research industry: organized bad actors, bots, and click farms.
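To make these classic checks concrete, here is a minimal sketch of how speeding, straight-lining, and red herring screening might be scored. All thresholds, field names, and the sample response are illustrative assumptions, not any vendor's actual rules.

```python
# Hypothetical survey quality checks; thresholds are assumptions.

def is_speeder(duration_seconds, median_duration, ratio=0.33):
    """Flag respondents who finish far faster than the median respondent."""
    return duration_seconds < median_duration * ratio

def is_straightliner(grid_answers):
    """Flag grid responses where every row received the same answer."""
    return len(grid_answers) > 1 and len(set(grid_answers)) == 1

def fails_red_herring(answer, expected):
    """Red herring: an embedded instruction such as
    'select Strongly disagree for this row'."""
    return answer != expected

# A made-up response that trips all three checks.
response = {
    "duration": 95,           # seconds to complete
    "grid": [3, 3, 3, 3, 3],  # same choice on every grid row
    "attention": "Agree",     # instruction said "Strongly disagree"
}

flags = [
    is_speeder(response["duration"], median_duration=300),
    is_straightliner(response["grid"]),
    fails_red_herring(response["attention"], expected="Strongly disagree"),
]
low_quality = any(flags)  # → True: all three checks trip
```

In practice, panels typically combine several such signals before removing a respondent, since any single check can produce false positives.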
Starting in the mid-2010s, internet research came under attack by bad actors running click farms: organized operations employing hundreds or even thousands of low-paid workers, typically in markets where a dollar stretches much further than in the USA. Click farms gave real humans the means to qualify for as many surveys as possible, move through them as quickly as possible, and maximize incentives. Over the years, bad actors began employing programmatic bots, automating and scaling their fraudulent efforts.
In response, panel companies began implementing two-factor authentication and IP and digital fingerprinting. Survey tools began employing solutions like red herring questions, speed moderation, open-end reviews, and reCAPTCHA v3. But because panel companies and survey tools are often separate entities in a fragmented ecosystem, their measures rarely speak to one another, leaving space for bots to evolve and evade them. Panel companies and survey tools began promoting an “acceptable threshold” of low-quality responses to be routinely cleaned from brands’ data--and for a while, these practices sufficed.
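A toy illustration of the fingerprinting idea: hash a handful of device signals into an identifier, then flag repeat entries. Real fingerprinting uses many more signals and fuzzier matching; everything here, including the signal list, is an assumption for illustration.

```python
# Simplified digital-fingerprint deduplication sketch (illustrative only).
import hashlib

def fingerprint(ip, user_agent, screen, timezone):
    """Hash a few device signals into a stable identifier."""
    raw = "|".join([ip, user_agent, screen, timezone])
    return hashlib.sha256(raw.encode()).hexdigest()

seen = set()

def is_duplicate(fp):
    """Return True if this fingerprint has entered the panel before."""
    if fp in seen:
        return True
    seen.add(fp)
    return False

fp1 = fingerprint("203.0.113.7", "Mozilla/5.0", "1920x1080", "UTC-5")
first_entry = is_duplicate(fp1)   # False: new device, allowed in
second_entry = is_duplicate(fp1)  # True: same device again, flagged
```

The weakness the article points to is visible even here: the panel's fingerprint set and the survey tool's speed checks live in separate systems, so neither sees the other's flags.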
And this brings us to today.
When ChatGPT was released to the public in November 2022, generative AI became easily accessible to bad actors, who could leverage it to power the bots in their click farms. It also meant the existing quality assurance measures employed by panel companies and survey tools were no longer sufficient. AI-powered bots can easily navigate red herring questions (go on, give it a try!), and they can also be instructed not to straightline questions, to move slowly through surveys, and to mimic human speech in their open-ended responses.
Because of this newfound sophistication of AI-powered bots, brands have been forced to choose between poor-quality data and severely delayed timelines, as survey tools offer only the subpar remedy of cleaning out as much as 80% of the data and refielding studies over and over again.
Some experts suggest that the industry should raise awareness of market research best practices and improve the respondent experience to attract and retain more human respondents. But there’s one big issue with those solutions: none of them will get rid of the bots.
Our role in market research is to make it too time-consuming and too expensive for bots to attack our industry. AI-powered bots and bad actors need to be answered with AI-powered innovation, and that’s why we created Biotic.
Biotic is a patent-pending technology that identifies, isolates, and continually learns from bots that attempt to infiltrate our panel. Because of Suzy’s vertically integrated platform, Biotic can analyze the end-to-end respondent experience, from signup to survey to incentives, and everywhere in between, looking for suspicious patterns of behavior that most panel companies and survey tools can’t see due to their disconnected nature.
Instead of banning the bots from our panel, Biotic traps them in a honeypot. Once they’re isolated in the honeypot, they’re fed perpetual challenge questions and blocked from receiving incentives and real research questions. This serves two purposes: 1) bad actors managing the bots are unaware that they’ve been caught, making it more difficult for them to evolve; 2) Biotic can monitor their behavior and continually learn from them to make its engine more effective over time. And best of all--this process never impacts our human respondents, ensuring their experience remains fun and free from burdensome trap questions.
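The honeypot pattern described above can be sketched as a simple routing decision. This is a hypothetical illustration of the general technique, not Suzy's implementation--Biotic is patent-pending and its actual detection logic is not public.

```python
# Illustrative honeypot routing for a survey panel (all names hypothetical).

class Panel:
    def __init__(self):
        self.honeypot = set()  # suspected bots, silently quarantined

    def flag_as_bot(self, respondent_id):
        # Quarantine instead of banning: the bot operator sees no
        # rejection, so there is no signal to adapt against.
        self.honeypot.add(respondent_id)

    def next_question(self, respondent_id):
        if respondent_id in self.honeypot:
            # Quarantined bots receive perpetual challenge questions,
            # no incentives, and no real research questions, while
            # their behavior is logged for further learning.
            return "challenge_question"
        # Human respondents see only real survey content.
        return "real_survey_question"

panel = Panel()
panel.flag_as_bot("bot-123")
print(panel.next_question("bot-123"))    # challenge_question
print(panel.next_question("human-456"))  # real_survey_question
```

The design choice worth noting is the second branch: because the quarantine check happens before any survey logic, flagged accounts never touch real questions, and unflagged humans never see the extra challenges.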
Since Biotic’s inception, we’ve blocked a significant number of bots from entering surveys and appearing in our customers’ results. However, while we’ve seen significant success with Biotic so far, the journey is far from over.
As AI models (and bots) become more sophisticated (looking at you, ChatGPT 5), the industry will have to evolve just as fast. With the introduction of Biotic, we’re honoring our commitment to innovation, rising to the challenge of AI, and paving the way to a new era in data quality, just as researchers have done time and time again.