Why synthetic personas miss real insights, David Dobrin

Written by Simon Spyer | Jun 13, 2026 3:43:54 PM

Most AI-generated research sounds convincing because it's designed to. Synthetic personas give plausible answers to sensible questions, built from web data and training that covers every demographic slice you can imagine. The problem is that plausible responses from synthesised data tell you nothing about actual customers.

David Dobrin, founder of OriginalVoices.ai, cuts through the persona theatre. His platform builds digital twins from real customer conversations, support calls, interviews and first-party data. Not statistical approximations, not web-scraped profiles; actual people, turned into queryable APIs that respond based on what they actually said.

The difference matters commercially. Generic personas optimise for average responses. Customer twins surface the edge cases, the objections, the moments where your messaging breaks down with real people who actually buy.

Why API-first design changes everything about research

Traditional research delivers static reports. Twelve weeks to plan, eight weeks to field, four weeks to analyse. By the time findings reach the marketing team, the agenda has moved on.

Dobrin's approach flips this.

Customer twins become APIs you can query in real time. Need to test messaging against specific segments? Query the API. Want to understand objections to a new feature? Ask the digital twin network directly. The research becomes live infrastructure, not a quarterly deliverable.

This shifts the economics entirely. Instead of commissioning studies, you maintain customer intelligence that responds to business questions as they emerge. Marketing teams get immediate feedback on creative, positioning, and campaign strategy. No waiting. No additional budget approvals.

Same models, different context wins

Everyone has access to the same foundation models: ChatGPT, Claude, Gemini. The commoditisation is already here. Competitive advantage comes from the context layer you feed those models, not the models themselves.

Web scraping gives you surface patterns. First-party data gives you transaction history. But customer conversations give you reasoning. Why someone bought. What almost stopped them. How they actually describe your product to others.

Dobrin identifies three context layers:

  • web data (lowest value),

  • proprietary data (medium value),

  • and human insight (highest value).

Most AI implementations stop at the first two. The third layer, actual customer voices, is where differentiation lives.
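To make the ranking concrete, here is a minimal sketch of how a context window might be assembled so that human insight is placed first and web data is the first thing dropped when space runs out. The numeric priorities, the Snippet type, the build_context function and the character budget are all illustrative assumptions; the conversation gives the ordering of the layers, not an implementation.

```python
from dataclasses import dataclass

# Hypothetical priorities: the article ranks the three layers
# but does not assign numbers to them.
LAYER_PRIORITY = {"human_insight": 3, "proprietary": 2, "web": 1}


@dataclass
class Snippet:
    layer: str  # one of the keys in LAYER_PRIORITY
    text: str


def build_context(snippets, max_chars=400):
    """Put higher-value layers first, then fill a character budget."""
    # sorted() is stable, so snippets within a layer keep their input order.
    ranked = sorted(snippets, key=lambda s: LAYER_PRIORITY[s.layer], reverse=True)
    chosen, used = [], 0
    for snippet in ranked:
        if used + len(snippet.text) > max_chars:
            continue  # skip anything that would exceed the budget
        chosen.append(snippet)
        used += len(snippet.text)
    return "\n".join(f"[{s.layer}] {s.text}" for s in chosen)


# Invented example data, for illustration only.
example = [
    Snippet("web", "Category reviews often mention price."),
    Snippet("proprietary", "Segment A reorders every 30 days."),
    Snippet("human_insight", "I nearly cancelled because setup took a week."),
]
context = build_context(example, max_chars=90)
```

With the tight budget in the example, the customer quote and the proprietary fact fit and the web snippet is left out, which mirrors the value ordering above.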

This is why customer twins outperform synthetic personas. They're not built on demographic assumptions. They're built on recorded conversations with people who actually made purchase decisions. The model gets context that matters commercially.

Call centre data is customer intelligence, not cost centre overhead

Support calls are a strategy goldmine. Every conversation contains unscripted customer language, real objections, authentic descriptions of value. Most companies treat this as compliance data to store and forget.

Dobrin sees it differently. Call transcripts become training data for customer twins. Support conversations surface the gaps between marketing promises and customer reality. The language customers actually use to describe problems. The objections that don't appear in focus groups.

This creates a feedback loop between customer experience and marketing strategy. Campaign messaging gets tested against actual customer language before launch. Creative gets validated against real objections recorded in support calls. The marketing team operates with customer intelligence that updates continuously.

The implementation is straightforward. Existing call recording systems integrate via API. Transcripts train customer twins that respond based on actual conversations. Marketing queries the twins for campaign feedback, messaging validation, creative testing.
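As a rough illustration of that flow, the toy sketch below ingests a call transcript, keeps only what the customer said, and answers a marketing question by returning the recorded lines that overlap with it. It is not OriginalVoices.ai's API or method: the CustomerTwin class, the "Customer:" transcript format and the keyword matching are all assumptions chosen to keep the example self-contained, and a production system would use a language model rather than word overlap.

```python
import re
from dataclasses import dataclass, field


def tokens(text):
    """Lowercase word set used for simple keyword overlap."""
    return set(re.findall(r"[a-z']+", text.lower()))


@dataclass
class CustomerTwin:
    """Hypothetical twin that can only answer with things the customer said."""
    customer_id: str
    utterances: list = field(default_factory=list)

    def add_transcript(self, transcript):
        # Assumed format: one "Speaker: text" turn per line.
        for line in transcript.splitlines():
            speaker, _, said = line.partition(":")
            if speaker.strip().lower() == "customer" and said.strip():
                self.utterances.append(said.strip())

    def query(self, question, top_k=2):
        # Rank recorded utterances by how many words they share with the question.
        q = tokens(question)
        scored = [(len(q & tokens(u)), u) for u in self.utterances]
        scored = [(score, u) for score, u in scored if score > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [u for _, u in scored[:top_k]]


# Invented transcript, for illustration only.
call = """Agent: How can I help?
Customer: The pricing page confused me, I nearly cancelled.
Customer: Setup was quick once I found the guide."""

twin = CustomerTwin("cust-001")
twin.add_transcript(call)
answers = twin.query("What do you think of our pricing?")
```

The design choice worth noting is that the twin returns an empty list when nothing in the transcripts matches, rather than producing a plausible-sounding guess.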

Validation loops keep AI grounded in reality

Customer twins risk the same hallucination problems as any AI system. The fix is continuous validation against actual customer behaviour. Purchase data, support interactions and survey responses all feed back into the twin to keep responses accurate.

Without validation loops, even first-party-trained models drift toward plausible fiction. With validation, the twins become more accurate over time, reflecting how customer attitudes and language evolve.

This creates measurable customer intelligence. Marketing decisions get tested against twins, then validated against actual customer response. The gap between prediction and reality becomes a feedback signal that improves the system.
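One simple way to express that feedback signal is to compare what the twins predicted with what customers actually did, segment by segment. The sketch below assumes both are expressed as rates between 0 and 1; the function names, the segment labels and the 10-point tolerance are hypothetical, not figures from the conversation.

```python
def prediction_gap(predicted, observed):
    """Absolute gap per segment between twin-predicted and observed rates."""
    shared = predicted.keys() & observed.keys()
    if not shared:
        raise ValueError("no overlapping segments to compare")
    gaps = {seg: abs(predicted[seg] - observed[seg]) for seg in shared}
    mean_gap = sum(gaps.values()) / len(gaps)
    return gaps, mean_gap


def segments_to_refresh(gaps, tolerance=0.10):
    """Segments whose twins drifted beyond an assumed tolerance."""
    return sorted(seg for seg, gap in gaps.items() if gap > tolerance)


# Invented numbers, for illustration only.
predicted = {"new": 0.30, "loyal": 0.60}   # twin-predicted response rates
observed = {"new": 0.10, "loyal": 0.55}    # rates measured after launch

gaps, mean_gap = prediction_gap(predicted, observed)
stale = segments_to_refresh(gaps)
```

Segments that exceed the tolerance are the ones whose twins most need fresh conversations, purchase data or survey responses fed back in.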

Watch the full conversation on The Precision Brief to see how OriginalVoices.ai turns customer conversations into queryable intelligence that marketing teams can actually use.