CxO + VP Engineering briefing 01 / 13

Slide 01

Your Feedback Loop Is Six Weeks Too Long. Synthetic Users Close It in Twenty Minutes.

CxO + VP Engineering + Board
Core thesis

A synthetic user is an LLM agent calibrated to a real buyer persona that can navigate your product, read your copy, and tell you what is wrong before a real customer does. The economics of customer feedback just changed permanently.

A typical usability study costs $8,000-$15,000 and takes two to four weeks. A synthetic panel run takes twenty minutes. The tooling is effectively free if you already use Claude Code, Copilot, or Gemini CLI. The capability compounds. The calibration gets better every week. The organizations building this now will know more about their customers by Q4 than you will learn in the next two years.

Decision Build one synthetic user this week. Point it at your most important page. Compare its objections to what your real customers say.

Slide 02

You Ship a Change. Six Weeks Later You Learn Whether It Worked. During That Gap, Every Decision Is a Guess.

Market signal
Traditional usability study $8K-$15K

Recruit eight people. Put them in a room. Two to four weeks before the findings deck lands on someone's desk. By then the product has shipped or the window has closed.

Time to signal 6-8 weeks

Ship, wait for analytics, interpret, revise. When a launch misses, the rework cycle costs more than the launch itself.

Synthetic panel 20 min

Six calibrated agents in parallel. Structured feedback grounded in real buyer behavior. Before you ship, not after. The tooling is free if you already use an agent framework.
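The twenty-minute panel run is just independent evaluations fired concurrently. A minimal orchestration sketch, assuming each persona's evaluation is a single self-contained call (`run_panel` and `evaluate` are illustrative names, not a real framework's API):

```python
from concurrent.futures import ThreadPoolExecutor

def run_panel(personas, evaluate, max_workers=6):
    """Run one evaluation per persona concurrently and collect the results.

    personas: list of persona identifiers.
    evaluate: callable taking a persona and returning its structured feedback
              (in practice this would wrap an LLM agent session).
    Returns a dict mapping persona -> feedback.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(personas, pool.map(evaluate, personas)))
```

Because the agents never talk to each other, wall-clock time is roughly one evaluation's duration, not six.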

A redesign takes eleven weeks to build. Six weeks to get feedback. Then a rework because the messaging missed. Two engineers for eleven weeks is roughly $85,000 in loaded salary, plus the opportunity cost of the six-week delay.

A synthetic panel before launch would have caught "this messaging lands with directors but not CTOs." That single insight saves the entire rework cycle.

Slide 03

A Synthetic User Is Not a Chatbot Wearing a Name Tag. It Is an Agent With a Browser, a Persona, and Opinions.

Definition
What it is

An LLM agent calibrated to a specific role, industry, decision-making style, and set of priorities, with its own objections baked in.

A CTO at a Fortune 500 financial services company. Three years in the role. Inherited a legacy modernization eighteen months behind schedule. Reports to a board that wants AI wins on the quarterly earnings call. Burned by two consulting firms. Skeptical, time-poor, default answer is no.

She was not real. But when you put your website in front of that agent and ask "would you send this to your peer?" the answer is a structured evaluation grounded in the same priorities your actual buyer carries.

What it is not

A two-paragraph prompt that sounds authoritative but is untethered from real buyer behavior. That produces confident-sounding fiction.

"Our target audience" is not one person. It is a dozen people with conflicting priorities. A PE-backed CTO under margin pressure evaluates differently than a government CIO navigating FedRAMP. A CISO evaluates differently than a CMO. You need to hear from all of them before you ship.

Key insight Most teams have not done the hard calibration work that separates useful feedback from expensive hallucination.

Slide 04

Most Synthetic User Implementations Stop at Surveys. The Thing That Kills Your Conversion Is the Experience of Navigating Your Product.

Capability
The browser loop

Playwright launches a session against your target page. At each step, a screenshot passes to a vision-capable model with the persona's context. The model returns a structured action. Playwright executes. The loop continues.

Point this at any browser-accessible surface. A fleet management dashboard behind a login. An internal procurement workflow. A patient intake form in staging. The agent authenticates with a test account, navigates in role, and evaluates against the expectations of the person it represents.
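The loop above can be sketched with the browser and model calls injected as plain callables; the JSON action schema shown is an assumption for illustration, not a fixed API:

```python
import json

def run_browser_loop(take_screenshot, ask_model, execute_action, max_steps=10):
    """Drive the screenshot -> vision model -> action loop described above.

    take_screenshot(): returns the current page image (e.g. PNG bytes from
        Playwright's page.screenshot()).
    ask_model(image): sends the screenshot plus the persona's context to a
        vision-capable model and returns a JSON string such as
        {"action": "click", "target": "#pricing", "note": "..."} or
        {"action": "done", "note": "final evaluation"}.
    execute_action(step): performs the click/scroll/type via the browser.
    Returns the full transcript of structured steps.
    """
    transcript = []
    for _ in range(max_steps):
        step = json.loads(ask_model(take_screenshot()))
        transcript.append(step)
        if step["action"] == "done":
            break
        execute_action(step)
    return transcript
```

Keeping the three dependencies injectable means the same loop runs against a staging login, an internal workflow, or a public page without changes.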

Beyond the browser

Anthropic's computer use API lets the agent take full control of a desktop environment. OpenAI's Operator and Google's Project Mariner offer comparable agentic control. The frontier is moving to cameras and physical sensors.

Six synthetic users in parallel against a page. Twenty minutes total. 85-90% of runs produce actionable feedback. The failure mode is "the agent got stuck on a cookie consent banner," not "the agent gave me dangerously wrong advice."

Quote "That is exactly what I said in the vendor review last quarter. How did it know that?" It did not know. It was calibrated against the same behavioral patterns.

Slide 05

Synthetic Users Fit Everywhere a Human Would Give You Feedback. The Persona You Need Changes at Each Phase.

Lifecycle coverage
01

Discovery

A synthetic panel of your ICP evaluates your product brief. If three of five say the premise is weak, you saved a quarter of engineering time. Kill bad ideas before a single engineer touches the codebase.

02

Build

Synthetic users navigate your staging environment. Integrate into CI/CD as a pre-deployment gate. If a synthetic user that previously rated a flow positively now rates it negatively, the pipeline flags the regression.

03

Launch

Your synthetic buyer panel evaluates positioning before the campaign goes live. If your marketing team is spending $200K on a LinkedIn campaign, run the landing page past a synthetic ICP first.

04

Pricing

Build a synthetic CFO calibrated to your buyer profile and let her navigate the pricing page. Then build a synthetic startup founder with a different budget reality. Compare the objections. Iterate before the first sales call.

05

Sales + Retention

Run your pitch past a synthetic prospect. Surface objections before the real meeting. A synthetic churning customer evaluates your QBR deck: "this felt like a product demo, not a conversation about my business outcomes."

06

People + Internal

A synthetic engineering manager evaluates your reorg announcement. A synthetic new-hire navigates the real onboarding flow and tells you where she would give up. The cost of getting a performance framework wrong is a year of attrition.
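The pre-deployment gate from the Build phase reduces to a score comparison between runs. A sketch, assuming each run stores one score per persona; the two-point threshold and all names are illustrative assumptions:

```python
def regression_gate(previous, current, drop_threshold=2):
    """Flag personas whose rating of a flow dropped since the last run.

    previous, current: dicts mapping persona name -> score (1-10 scale).
    Returns the list of personas whose score fell by drop_threshold or more,
    which a CI/CD pipeline would surface as a failed gate.
    """
    flagged = []
    for persona, old_score in previous.items():
        new_score = current.get(persona)
        if new_score is not None and old_score - new_score >= drop_threshold:
            flagged.append(persona)
    return flagged
```

A new persona with no baseline simply passes; the gate only fires on measured regressions.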

Slide 06

Your Most Valuable Synthetic User Is the One Calibrated to Hate Your Product. If Your Synthetic CISO Loves Your Site, Your Synthetic CISO Is Broken.

Anti-personas
Why hostility is useful

Not all synthetic users should like what they see. The anti-persona tells you what your hardest buyer will actually think. If you can survive that review, the real version is not going to surprise you.

A synthetic CISO that flags security concerns on every page, questions every data flow, and gives a trust score that rarely breaks a 6. A skeptical CPO at a manufacturer who gave a 3 out of 10 on trust because the article did not address what happens to the humans whose jobs this technology partially replaces.

Internal anti-personas

An engineering manager at a 200-person company that just announced an AI-first transformation. She manages twelve engineers. Three are worried about being replaced. She has been told to "redefine team roles for AI-native delivery" but has received no framework, no budget, and no clarity.

Put a new role description or performance rubric in front of that agent and the feedback comes back grounded in the specific pressures of someone navigating that exact organizational moment. Your People team hears the objections before the all-hands, not after.

Principle Calibrate for the buyer who will say no, not the one who will say yes. The yes buyers do not need your attention.

Slide 07

Building a Synthetic User That Gives Useful Feedback Requires Four Things. "Just Write a Good Prompt" Is Not One of Them.

Implementation
01

Calibration document

Role, industry, organizational context, decision-making authority, risk tolerance, known objections, competitive alternatives, default disposition. Not a paragraph. A page, sometimes two. The calibrations that work best are grounded in the most specific behavioral detail.

02

Behavioral anchoring to real humans

Ground it in actual customer interviews, sales call recordings, objection patterns from your CRM, churn reasons from your CS team. If the model is built on assumptions instead of evidence, the feedback is fiction. Win/loss analyses are the fastest starting point.

03

Behavioral validation

Test it against known stimuli. Give it a competitor's website or messaging that performed poorly. Compare its response to what real humans said. Measure objection overlap rate: after three rounds of calibration, expect 70-75% overlap with real CxO feedback.

04

Ongoing refinement

Every time real feedback surprises you, update the calibration. Compare real traffic and bounce rates to what your synthetic CxOs predicted. The gap between prediction and reality is the calibration signal. Every week, that gap gets smaller.

Investment 4-8 hours per persona for initial calibration. 1-2 days engineering for the orchestration layer. 2-4 hours per month recalibrating. Unlike a usability study, the capability does not reset to zero. This is an asset that compounds.
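A calibration document can live as structured data and be rendered into the agent's system prompt, which makes step 04's weekly updates a diff rather than a rewrite. A sketch covering the fields listed in step 01 (the class and field names are illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class PersonaCalibration:
    """One synthetic user's calibration document, per the fields in step 01."""
    role: str
    industry: str
    org_context: str
    decision_authority: str
    risk_tolerance: str
    known_objections: list = field(default_factory=list)
    competitive_alternatives: list = field(default_factory=list)
    default_disposition: str = "skeptical"

    def to_system_prompt(self):
        """Render the calibration into the persona's system prompt."""
        objections = "; ".join(self.known_objections) or "none recorded yet"
        alternatives = "; ".join(self.competitive_alternatives) or "none recorded yet"
        return (
            f"You are a {self.role} in {self.industry}. {self.org_context} "
            f"Decision authority: {self.decision_authority}. "
            f"Risk tolerance: {self.risk_tolerance}. "
            f"Known objections: {objections}. "
            f"Alternatives you are weighing: {alternatives}. "
            f"Your default disposition is {self.default_disposition}."
        )
```

Versioning these files alongside the codebase is what lets the calibration compound instead of resetting to zero.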

Slide 08

$85,000 in Loaded Salary for a Rework Cycle. Or Twenty Minutes of Synthetic Validation Before Launch. That Is Not a Hard Comparison.

Economics
Rework cost $85K

Two engineers for eleven weeks of redesign. Plus six weeks waiting for feedback. Plus the opportunity cost of everything they did not build. One insight — "this messaging lands with directors but not CTOs" — would have prevented the entire cycle.

Panel cost ~$0

Free if you already use Claude Code, Copilot, or Gemini CLI. The real investment is time: initial calibration plus two to four hours per month recalibrating against real-world feedback.

Compounding Asset

Unlike a usability study, the capability does not reset to zero after each engagement. Calibration documents get better. The persona library grows. Validation data accumulates. This is an asset that compounds, not an expense that evaporates.

You do not skip the usability study. You run the synthetic panel first, fix the obvious problems, then run the study on a version of your product that is already better. The study stops catching obvious problems and starts catching the hard ones.

Same logic as linting before code review and code review before QA. Catch the cheap problems early so the expensive validation steps can focus on the hard ones.

Slide 09

Your Best UX Researchers Become Persona Calibrators. The Job Did Not Disappear. The Job Got Better.

Workforce
What changes

The people who used to spend three weeks recruiting real CTO interviews are now refining calibration documents, validating synthetic output against real conversations, and designing questions that make the real interviews sharper.

Your marketing team can sit down with a synthetic CTO and ask follow-up questions. "You said the messaging was unclear. What specifically would you need to see in the first paragraph to keep reading?" Your sales team can rehearse objections. Your product team can workshop feature messaging.

What stays the same

If your AI adoption strategy starts with headcount reduction, you will lose the institutional knowledge that makes the AI useful in the first place.

A synthetic user can compare your competitor's pricing page to yours from the buyer's perspective. It can navigate both sites, compare them in character, and give you feedback specific to how that buyer type evaluates competitive alternatives. But the person who grounds that calibration in real behavioral data and catches confident fiction? That person is more valuable, not less.

Warning The moment you stop validating against real humans, you are optimizing for your model of the customer instead of the actual customer.

Slide 10

I Would Trust This Methodology Less If I Only Told You What It Does Well. Here Is Where It Breaks.

Risk disclosure

Known failure modes

  • Confidently wrong. A synthetic CFO recommended adding an ROI calculator; real CFOs wanted fewer form fields. The agent was reasoning from a generic archetype instead of the specific buyer who visits a consulting site.
  • No trust accumulation. A real buyer builds trust over multiple touchpoints. A synthetic user evaluates a single page in isolation.
  • Not accessibility audits. Cannot replicate the experience of a screen reader user or keyboard-only navigator. Specific tooling required.

Mitigations

  • Measure objection overlap. 70-75% overlap with real CxO feedback after three calibration rounds. The 25-30% missed tends to be relationship history, internal politics, budget timing.
  • Validate weekly. Compare synthetic predictions to real traffic, bounce rates, and customer feedback. The gap between prediction and reality is the calibration signal.
  • Never skip real research. The speed can create false certainty. Twenty-minute cycles are seductive. The risk is treating synthetic feedback as ground truth.
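The objection overlap metric from the first mitigation is a set intersection, assuming synthetic and real objections have been normalized to shared labels first (the normalization step is the real work; the arithmetic is trivial):

```python
def objection_overlap(synthetic, real):
    """Share of real-buyer objections the synthetic panel also raised.

    synthetic, real: iterables of normalized objection labels.
    Returns a rate between 0.0 and 1.0; the document's target after three
    calibration rounds is roughly 0.70-0.75.
    """
    real_set = set(real)
    if not real_set:
        return 0.0
    return len(real_set & set(synthetic)) / len(real_set)
```

The complement of this rate is the miss list, which the deck notes tends to be relationship history, internal politics, and budget timing.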

Slide 11

I Had Over 40 CxOs Review This Post Tonight Before I Shipped It. None of Them Were Real. All of Them Changed the Final Product.

Proof of concept
Panel size 40+

CTOs, CFOs, CISOs, CPOs, COOs, CDOs, CIOs, CMOs, General Counsel, Board Members, VPs of Engineering, Engineering Directors, Engineering Managers, Staff Engineers. Financial services, healthcare, manufacturing, government, EdTech, AgTech, telecom, media, mining.

Revisions 4

Four full rewrites in a single session based on scored feedback. The CTO said the title was boring. The CFO said the economics section was napkin math. The CISO flagged five security concerns. The skeptical CPO gave a 3/10 on trust and forced an entire new section on workforce transition.

Score range 4-8

Not everyone liked it. The skeptical CIO gave a 4. The startup CTO gave a 4 — she was already ahead. The government CIO gave a 5 because FedRAMP was a parenthetical. That range is the point. One number would be suspicious. A distribution is real.

"That is the first honest thing I have read from a consultant about security roles in two years."

Kevin O'Brien, synthetic CISO, First Republic Bancorp. Score: 7/10. He spent the first fifteen minutes on data flow architecture.

Slide 12

A Note on Data Privacy: Use Anonymized Patterns, Not Raw Transcripts. This Is Not Optional.

Governance
Calibration data requirements

Calibrate from anonymized and aggregated patterns. If you are sending calibration data to a third-party model provider, ensure your DPA covers this use case.

GDPR, CCPA, and HIPAA all warrant legal review before you feed interview data into any external model. The calibration documents themselves may contain behavioral patterns derived from customer data. Your compliance team needs to review the data flow architecture before you pilot this.

Internal synthetic users carry higher stakes

Bad feedback about a pricing page costs you a suboptimal landing page. Bad feedback about a reorg plan costs real people real career decisions.

A synthetic panel can tell you the reorg announcement will confuse engineering managers. It cannot tell you the reorg will break a trust relationship between two specific leaders that took three years to build. The faster the feedback, the more discipline you need about what it can and cannot tell you.

For the CISO Map where calibration documents are stored, whether behavioral anchoring data transits a third-party model provider, and what your DPA actually covers. Do this before you pilot.

Slide 13

How Many Obvious Mistakes Did You Ship Last Quarter That a Twenty-Minute Panel Run Would Have Caught?

Decision close
The decision

Every week that your feedback gap stays open, you are shipping into the dark. The organizations closing it are still talking to real customers. They are just arriving at those conversations with better questions.

They already eliminated the obvious mistakes. Their usability studies are sharper because the synthetic panel caught the problems a study should not have to find. Their sales teams are sharper because they rehearsed against calibrated objections. Their People teams heard the pushback before the all-hands, not after.

You do not have to build thirty synthetic users. Build one. Pick your most important buyer persona. The one whose objections keep you up at night. One synthetic user, one page, one hour. That is the cost of finding out whether this works for you.