What We Learned Auditing Our First 100 Studies


Over the course of auditing the first 100 studies run through Blanc’s fraud and quality layer, some uncomfortable truths emerged—and they kept repeating. Across industries, geographies, and sample providers, the patterns of failure looked remarkably similar, as did the few interventions that actually changed outcomes. This is a look at those patterns: where research kept going wrong, the surprising ways fraud shows up, and the three fixes that reliably moved the needle.

Pattern 1: “Clean Enough” Wasn’t Even Close

The most common starting belief was that existing checks were “good enough.” Teams had trap questions, speeding rules, and basic duplicate filters in place, so they assumed fraud was a marginal issue. Once audits were done, that assumption collapsed.

In many of these 100 studies, low-quality or fraudulent responses sat in the 15–30% range, even when internal QA processes reported only 3–5% removals. That gap came from what those QA processes were actually built to catch: obvious speeders, empty open-ends, and blatant straight-lining. What they missed were professional respondents who knew how to slow down, vary answers, and pass basic traps while still misrepresenting who they were.
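For context, the "existing checks" in most of these studies amounted to a handful of rules like the sketch below. The thresholds and field names here are illustrative assumptions rather than any specific team's configuration, but they show why a respondent who simply paces themselves and varies their answers sails through.

```python
# Minimal sketch of the "basic" QA filters most teams already had in place.
# Thresholds and field names are illustrative assumptions, not Blanc's rules.

def is_speeder(duration_seconds: float, median_seconds: float) -> bool:
    """Flag completes faster than a third of the median interview length."""
    return duration_seconds < median_seconds / 3

def has_empty_open_ends(open_ends: list[str]) -> bool:
    """Flag respondents whose open-ended answers are blank or near-blank."""
    return all(len(text.strip()) < 5 for text in open_ends)

def is_straight_liner(grid_answers: list[int]) -> bool:
    """Flag grids answered with a single repeated scale point."""
    return len(set(grid_answers)) == 1

def basic_qa_flags(respondent: dict, median_seconds: float) -> list[str]:
    flags = []
    if is_speeder(respondent["duration"], median_seconds):
        flags.append("speeder")
    if has_empty_open_ends(respondent["open_ends"]):
        flags.append("empty_open_ends")
    if is_straight_liner(respondent["grid"]):
        flags.append("straight_lining")
    return flags

# A professional respondent who paces themselves and varies grid answers
# passes all three checks, which is exactly the gap the audits exposed.
example = {"duration": 540.0, "open_ends": ["Good value for money."], "grid": [4, 3, 4, 5, 3]}
print(basic_qa_flags(example, median_seconds=600.0))  # -> []
```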

The downstream effect was clear. Segments looked stable on paper but failed to explain real-world behavior. Campaigns targeting “high-value” audiences underperformed; A/B test results contradicted themselves across waves. Once data was re-run on a cleaned sample, cluster boundaries shifted, segment sizes changed, and sometimes the “hero” audience stopped being the hero at all.

Pattern 2: Self-Reported Demographics Were Treated as Ground Truth

A second pattern was over-reliance on self-reported demographics and profile attributes. In almost every audit, key targeting variables—age, income, region, role, industry—were accepted at face value. There were quotas, but no verification.

In practice, that meant respondents could (and did) reshape themselves to fit higher-paying or easier-to-qualify segments. People aged themselves up to appear as senior decision-makers, moved countries with a dropdown, or added children to qualify for parenting studies. Over time, audits revealed the same respondent claiming multiple inconsistent profiles across different surveys in the same ecosystem.

This mattered most in segmentation and roadmap studies. Product and marketing teams were building strategies based on the assumption that “25–34, high income, metro” really meant that, when in reality a significant fraction of those respondents did not match the claimed profile. Once behavior and response history were used to cross-check profiles, a notable share of the “most valuable” segment simply disappeared.
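A lightweight way to surface this is to compare what the same identity claims across studies. The sketch below assumes a shared respondent identifier and a handful of self-reported attributes; the field names are hypothetical, and a production version would weigh plausible changes (people do move or have children) differently from impossible ones, such as a birth year shifting by a decade.

```python
# Minimal sketch of a cross-study profile consistency check, assuming a shared
# respondent identifier and self-reported attributes captured per study.
from collections import defaultdict

def profile_inconsistencies(history: list[dict]) -> dict:
    """Group the claimed values of key attributes by respondent and return
    those respondents who reported conflicting answers across studies."""
    claims = defaultdict(lambda: defaultdict(set))
    for record in history:
        for attr in ("birth_year", "country", "has_children"):
            claims[record["respondent_id"]][attr].add(record[attr])
    return {
        rid: {attr: values for attr, values in attrs.items() if len(values) > 1}
        for rid, attrs in claims.items()
        if any(len(values) > 1 for values in attrs.values())
    }

history = [
    {"respondent_id": "r42", "birth_year": 1990, "country": "US", "has_children": False},
    {"respondent_id": "r42", "birth_year": 1978, "country": "DE", "has_children": True},
]
print(profile_inconsistencies(history))  # -> r42 flagged on all three attributes
```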

Pattern 3: Open-Ends Looked Rich but Repeated Themselves

Open-ended questions were often treated as the qualitative “soul” of a study—places where respondents voiced nuance, emotion, and detail. In the audits, they turned out to be one of the biggest red flags.

Over and over, the same patterns appeared:

  • Highly polished, generic answers repeated with minor wording changes.

  • Copy-paste blocks that showed up across many respondents.

  • Overly “on-brand” language that echoed the question text rather than lived experience.

On manual read, much of this looked plausible. At scale, though, it became clear that large chunks of open-ends were either copied, templated, or generated by tools designed to sound human without being grounded in real experience. When open-ends were clustered semantically, these repeated patterns formed dense islands of near-identical sentiment.
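A minimal version of that clustering step looks something like the sketch below. It uses TF-IDF cosine similarity as a lightweight stand-in for the semantic embeddings the audits relied on, which is enough to surface templated answers that differ only by minor wording changes; the threshold is an illustrative assumption.

```python
# Minimal sketch of near-duplicate detection in open-ends. The audits used
# semantic embeddings; TF-IDF cosine similarity is a lighter stand-in that
# still surfaces templated answers with minor wording changes.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def near_duplicate_pairs(open_ends: list[str], threshold: float = 0.8) -> list[tuple[int, int, float]]:
    """Return index pairs of open-ends whose similarity exceeds the threshold."""
    vectors = TfidfVectorizer().fit_transform(open_ends)
    sims = cosine_similarity(vectors)
    pairs = []
    for i in range(len(open_ends)):
        for j in range(i + 1, len(open_ends)):
            if sims[i, j] >= threshold:
                pairs.append((i, j, round(float(sims[i, j]), 2)))
    return pairs

answers = [
    "I really love the brand because it fits my lifestyle perfectly.",
    "I really love this brand because it fits my lifestyle perfectly!",
    "Honestly it was fine but delivery took ages and support never replied.",
]
print(near_duplicate_pairs(answers))  # the first two answers form a near-identical pair
```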

The danger was not just that some responses were fake; it was that these fake narratives pulled themes in a particular direction—more positive, more generic, more “clean” than real customers tended to be. Once those were stripped out, the remaining open-ends often painted a rougher, more contradictory, but far more useful picture.

Pattern 4: Incentive Mechanics Quietly Encouraged Abuse

Another recurring theme was how incentives shaped the sample without anyone really looking. Across the 100 audits, studies with richer rewards, longer field periods, or loose entry criteria tended to attract disproportionate numbers of professional respondents and fraudsters.

Certain behaviors kept appearing:

  • A small group of repeat participants capturing a large share of incentive payouts over time.

  • Sudden spikes in completes from the same devices or IP ranges when incentives were increased.

  • High-complete “power users” whose response histories were inconsistent with any realistic consumer.

This wasn’t just about “too many speeders.” It was about a structure where the fastest way to maximize earnings was to bend or break the rules. Without caps and proper identity-level tracking, incentive budgets were effectively subsidizing fraud.
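Two of those signals, payout concentration and device concentration, reduce to simple counting, roughly as sketched below. The field names and cutoffs are illustrative assumptions; the point is that neither check requires anything more exotic than tallying payouts per participant and completes per device fingerprint.

```python
# Minimal sketch of two concentration checks behind the patterns above:
# share of payouts captured by the top participants, and completes per
# device fingerprint. Field names and cutoffs are illustrative assumptions.
from collections import Counter

def top_share_of_payouts(payouts: dict[str, float], top_n: int = 10) -> float:
    """Fraction of total incentive spend captured by the top_n earners."""
    ranked = sorted(payouts.values(), reverse=True)
    return sum(ranked[:top_n]) / sum(ranked)

def heavy_devices(completes: list[str], max_per_device: int = 3) -> dict[str, int]:
    """Device fingerprints with more completes than a plausible household."""
    counts = Counter(completes)
    return {device: n for device, n in counts.items() if n > max_per_device}

payouts = {"p1": 420.0, "p2": 15.0, "p3": 12.0, "p4": 9.0}
device_log = ["fp_a"] * 7 + ["fp_b", "fp_c"]
print(top_share_of_payouts(payouts, top_n=1))  # one participant takes ~92% of spend
print(heavy_devices(device_log))               # {'fp_a': 7}
```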

Pattern 5: Vendor Reporting Was Too Shallow

Across different suppliers, the same reporting pattern appeared: top-line quality metrics looked comforting but revealed very little. Vendors would report “X% quality fails removed” or “Y% attention checks passed” without exposing raw behavioral signals, device patterns, or cross-study consistency.

In many of the 100 audits, once independent checks were layered on top of vendor filters, a second, larger layer of issues surfaced—duplicate identities spread across panels, mismatched profiles, coordinated timing patterns, and clear signatures of external farms.

This didn’t mean every vendor was negligent or deceptive; it did mean that most quality frameworks stopped at the survey boundary and treated each project as an isolated event. Fraud, however, operated across panels, across time, and across clients. Without that broader view, traditional safeguards kept missing the real problem.

Three Fixes That Moved the Needle

Across all these audits, three interventions consistently changed outcomes—regardless of industry, questionnaire, or sample size.

1. Make “Verified Humans” the Core Metric

The single biggest shift happened when teams stopped optimizing for “number of completes” and started optimizing for “number of verified human completes.” That required:

  • Locking down key demographics so they couldn’t shift arbitrarily over time.

  • Building a minimal identity layer (device, behavior, and profile) that persisted across studies.

  • Treating obviously inconsistent profiles as high-risk, not “just noise.”

Once that lens was applied, fraud rates became visible rather than abstract. Teams could see exactly how many responses had to be removed and what that meant for budget, timelines, and segment stability. Importantly, it also changed internal conversations—leaders began asking “How many verified humans are behind this insight?” rather than “How large is the n?”
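As a rough illustration of what that metric shift looks like in practice, the sketch below reports verified human completes instead of raw completes. The verification criteria used here (a stable profile, a previously seen device, no fraud flags) are assumptions for the example, not Blanc's actual scoring model.

```python
# Minimal sketch of reporting "verified human completes" instead of raw n.
# The verification criteria (stable profile, known device, no fraud flags)
# are illustrative assumptions, not Blanc's scoring model.
from dataclasses import dataclass

@dataclass
class Complete:
    respondent_id: str
    profile_consistent: bool   # key demographics stable across studies
    device_known: bool         # identity layer has seen this device before
    fraud_flags: int           # behavioral/content flags raised this session

def is_verified_human(c: Complete) -> bool:
    return c.profile_consistent and c.device_known and c.fraud_flags == 0

def verified_human_rate(completes: list[Complete]) -> float:
    verified = sum(is_verified_human(c) for c in completes)
    return verified / len(completes)

sample = [
    Complete("r1", True, True, 0),
    Complete("r2", True, False, 0),   # new device: counted, but not yet verified
    Complete("r3", False, True, 2),
]
print(f"{verified_human_rate(sample):.0%} of completes are verified humans")  # 33%
```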

2. Combine Behavioral Patterns with Content Patterns

Another consistent win came from blending behavioral signals (timing, navigation, device patterns) with content signals (open-end structure, semantic repetition). Each type alone caught some issues; together, they caught far more.

Behavioral patterns helped identify:

  • Bots and farms pacing too consistently.

  • Devices or IPs responsible for suspiciously high volumes.

  • Sessions with implausible completion times or navigation.

Content patterns helped identify:

  • Repeated or templated open-ends.

  • Overly polished or generic language clusters.

  • Sets of respondents whose answers were “different enough” to dodge duplicate checks but obviously derived from the same template.

When these two layers agreed—suspicious behavior and suspicious content—the likelihood of genuine fraud was very high. Using that joint signal, teams could safely remove problematic clusters without over-pruning genuine respondents.
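The decision rule itself can stay simple once the two layers exist. The sketch below is a hypothetical version: remove only when both layers flag a respondent, queue single-layer flags for review, and keep everyone else. The specific signals and thresholds are assumptions, not a prescribed rule set.

```python
# Minimal sketch of the joint-signal idea: only remove respondents flagged by
# BOTH the behavioral layer and the content layer. Signal names and cutoffs
# are illustrative assumptions about what each layer might emit.
def behavioral_risk(session: dict) -> bool:
    """True when timing or volume looks machine-like or farmed."""
    return session["duration_seconds"] < 120 or session["completes_from_ip"] > 5

def content_risk(session: dict) -> bool:
    """True when open-ends look templated or near-duplicated."""
    return session["max_open_end_similarity"] > 0.9

def decision(session: dict) -> str:
    b, c = behavioral_risk(session), content_risk(session)
    if b and c:
        return "remove"      # both layers agree: very likely fraud
    if b or c:
        return "review"      # single-layer flag: inspect, don't auto-prune
    return "keep"

session = {"duration_seconds": 95, "completes_from_ip": 8, "max_open_end_similarity": 0.97}
print(decision(session))  # -> remove
```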

3. Move from One-Off Cleanup to Always-On Defense

The final pattern was about process, not just technology. Many of the first 100 studies came in for audit as “one-off emergencies”: a big launch had gone wrong, a tracker had started drifting, or leadership had lost confidence. Cleanup helped, but the real transformation came when teams moved from emergency audits to ongoing defense.

That shift included:

  • Setting fraud and verification KPIs for every new study, not just special ones.

  • Standardizing minimum quality requirements across vendors (a minimal threshold sketch follows this list).

  • Running recurring reviews of panel health, not just project-level checks.

  • Making it acceptable—expected, even—for insights leaders to pause or reject a dataset that didn’t meet those standards.
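In practice, those standardized requirements often ended up as a short, shared threshold set that every delivered dataset was scored against. The sketch below is hypothetical; the metric names and cutoffs are assumptions meant to show the shape of such a contract, not recommended values.

```python
# Hypothetical example of a standardized minimum-quality contract applied to
# every study and vendor; metric names and thresholds are assumptions.
MINIMUM_QUALITY_REQUIREMENTS = {
    "max_fraud_rate": 0.05,            # flagged completes / total completes
    "min_verified_human_rate": 0.90,   # verified completes / total completes
    "max_duplicate_rate": 0.02,        # cross-panel duplicate identities
    "max_templated_open_end_rate": 0.05,
}

def dataset_passes(metrics: dict[str, float]) -> bool:
    """Accept a delivered dataset only if every threshold is met."""
    return (
        metrics["fraud_rate"] <= MINIMUM_QUALITY_REQUIREMENTS["max_fraud_rate"]
        and metrics["verified_human_rate"] >= MINIMUM_QUALITY_REQUIREMENTS["min_verified_human_rate"]
        and metrics["duplicate_rate"] <= MINIMUM_QUALITY_REQUIREMENTS["max_duplicate_rate"]
        and metrics["templated_open_end_rate"] <= MINIMUM_QUALITY_REQUIREMENTS["max_templated_open_end_rate"]
    )

delivery = {"fraud_rate": 0.07, "verified_human_rate": 0.88,
            "duplicate_rate": 0.01, "templated_open_end_rate": 0.03}
print(dataset_passes(delivery))  # -> False: this dataset would be paused or rejected
```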

Once quality became a continuous practice instead of an occasional reaction, fraud rates trended down, segment stability improved, and the gap between research findings and real-world performance narrowed. Over several quarters, many teams saw a tangible change: fewer “mystery flops,” more predictable campaign outcomes, and a steady rebuilding of trust in research as a decision-making input.

What It All Adds Up To

Auditing 100 studies didn’t reveal a few exotic edge cases; it revealed ordinary, repeatable failure modes in how data quality is handled. The core lesson was simple: fraud and low-quality responses are not random noise; they are systematic, adaptive, and tightly linked to how studies are designed, incentivized, and verified.

The good news is that the fixes are also systematic. When organizations define “verified humans” as their core unit of truth, combine behavioral and content signals, and treat quality as an ongoing discipline instead of a crisis response, the character of their research changes. Segments stop shifting with every wave. Product roadmaps and campaigns start matching reality. Leadership meetings transition from defending the data to acting on it.

That is what “what we learned” really means: not just that fraud is worse than most teams think, but that the path to confident, trusted insights is clearer—and more repeatable—than it looks once the right foundations are in place.

Let’s connect and uncover something insightful together.