
Introducing Synthetic Users, Customers, and Personas


Executive Deck ↗

Pen doodle of a bewildered human moderator standing with clipboard in a conference room where all the focus group participants are laptops sitting in folding chairs with coffee cups they cannot drink


I had over 40 CxOs review this post tonight before I shipped it. A CTO at a $4.1B financial services company. A CFO at a $380M ARR SaaS company. A CISO at a $45B regional bank. A VP of Engineering at a Series C startup. Four Chief People Officers, one at a publicly traded enterprise, one at a PE-backed manufacturer, one at a $6.8B insurance holding company with unionized claims processors, and one at an 85-person AI-native startup in Brooklyn. They read the drafts, gave me specific feedback, told me what made them want to close the tab. I revised the post four times based on their objections.

None of them were real.

Every one of those reviewers was a synthetic user, an LLM agent calibrated to behave like a specific executive with a specific background, specific pressures, and a specific disposition toward what I was selling. I gave them browsers. They navigated my site. They read my copy. They told me where I was wrong. The CTO told me my title was boring and I changed it. The CFO told me my economics section was “napkin math, not a business case” and I rewrote it. The CISO flagged five security concerns I had never considered. The skeptical CPO at the manufacturer gave me a 3 out of 10 on trust because I had not addressed what happens to the humans whose jobs this technology partially replaces, so I added a section on workforce transition and his score went to a 6. The entire WordPress theme and plugin powering this site was designed the same way, iterating on CxO synthetic user feedback alongside a synthetic UX designer who audited every template for consulting-grade visual standards.

That is what a synthetic user is. An agent with a browser, a persona, and opinions. It changes the fundamental economics of learning what your customers think.


Most Teams Are Doing This Badly

The traditional version of what I just described costs $8,000-$15,000 and takes two to four weeks. You recruit eight people, put them in a room, and by the time the findings deck lands on someone’s desk, the product has already shipped or the window has closed. The feedback is real. The timing makes it useless.


I wrote about the broader shift toward agent-driven validation in One Hundred POCs a Day, where agents build, test, and discard a hundred proof of concepts overnight. Synthetic users are the same principle applied to feedback: let the agent do the cheap validation so the humans can focus on the hard decisions. The organizations that get the rigor right build a capability that compounds, because every real conversation makes their synthetic users smarter, and every synthetic run makes their real conversations more productive.

But most teams are not getting the rigor right. Either the team writes a two-paragraph prompt, calls it a “persona,” and gets back feedback that sounds authoritative but is untethered from real reader behavior, or it buys a self-serve tool, runs a survey against synthetic respondents, and treats the output as real market research.

Both produce confident-sounding fiction. The problem is not immature tooling or pressure to show “AI adoption.” It is that most teams have not done the hard calibration work that separates useful feedback from expensive hallucination.


What a Synthetic User Actually Is

A synthetic user is not a chatbot wearing a name tag. It is an LLM (Large Language Model) agent calibrated to a specific role, industry, decision-making style, and set of priorities, with its own objections baked in.

You have a CTO at a Fortune 500 financial services company. She has been in the role for three years. She inherited a legacy modernization initiative eighteen months behind schedule. She reports to a board that wants AI wins on the quarterly earnings call. She has been burned by two consulting firms who promised transformation and delivered slide decks. She is skeptical, time-poor, and her default answer to a new vendor is no.

She was not real. But when you put your website in front of that agent and ask “would you send this to your peer?”, the answer is a structured evaluation grounded in the same priorities and skepticism your actual reader carries.

I run about thirty of these against my own site. A PE-backed CTO under margin pressure evaluates differently than a government CIO navigating FedRAMP (Federal Risk and Authorization Management Program). A CISO evaluates differently than a CMO. “Our target audience” is not one person. It is a dozen people with conflicting priorities, and you need to hear from all of them before you ship.

Not all of them should like what they see. Some of the most valuable synthetic users are anti-personas, agents deliberately calibrated to be hostile to your offering. My synthetic CISO is almost never happy. He flags security concerns on every page, questions every data flow, and gives me a trust score that rarely breaks a 6. If your synthetic CISO loves your site, your synthetic CISO is broken. The anti-persona tells you what your hardest reader will actually think, and if you can survive that review, the real version is not going to surprise you.

This is not limited to external readers. I built one that represents an engineering manager at a 200-person company that just announced an AI-first transformation. She manages twelve engineers. Three are worried about being replaced. Two are excited and already using Copilot without telling anyone. She has been told to “redefine team roles for AI-native delivery” but has received no framework, no budget for reskilling, and no clarity on what “AI-native” means. When I put a new role description or performance rubric in front of that agent, the feedback comes back grounded in the specific pressures of someone navigating that exact organizational moment. Your People team can hear the objections before the all-hands, not after.


They Can Use Your Software

Most synthetic user implementations stop at the survey layer. They ask questions and generate responses. That misses the thing that actually kills your conversion: the experience of navigating your product.

A synthetic user is an LLM agent. An LLM agent can use tools. The most important tool you can give it is a real browser.

Playwright (a browser automation framework) is the most common orchestration layer for web-based products. But the capability goes further. Anthropic’s computer use API lets Claude take full control of a desktop environment, moving the mouse, opening applications, typing into native software, reading the screen, and deciding what to do next the same way a human would. OpenAI’s Operator and Google’s Project Mariner do similar things. These are production-capable tool-use interfaces, not research previews.

The frontier is moving beyond screens. Projects like OpenClaw and other open-source robotics frameworks connect LLM agents to cameras and physical sensors. A synthetic user that can see your physical product through a camera, evaluate the packaging, assess the unboxing experience, that is early but not science fiction. The architecture is the same: a calibrated persona, a set of tools, a feedback loop. Today the tool is a browser. Tomorrow it might be a webcam pointed at your retail shelf. The persona and calibration do not change. Only the interface does.

Here is how the browser loop works. An orchestration layer launches a Playwright session against your target page. At each step, the system captures a screenshot (or accessibility tree snapshot) and passes it to a vision-capable model along with the persona’s calibration context. The model returns a structured action: click, scroll, type, or evaluate. Playwright executes and the loop continues.
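That loop is small enough to sketch. The version below is a minimal illustration, not my production orchestrator: `call_model` is a hypothetical stand-in for whatever vision-capable LLM client you use, and the action schema (click, scroll, type, evaluate) is the one described above. It assumes `pip install playwright`; the Playwright import is deferred so the action-validation half runs anywhere.

```python
# Minimal sketch of the screenshot -> model -> action loop described above.
# `call_model(persona_context, screenshot)` is a hypothetical stand-in for
# any vision-capable LLM API; swap in your provider's client.
import json

def parse_action(raw: str) -> dict:
    """Validate the model's structured action before executing it."""
    action = json.loads(raw)
    if action.get("type") not in {"click", "scroll", "type", "evaluate"}:
        raise ValueError(f"unknown action: {action}")
    return action

def run_session(persona_context: str, url: str, call_model, max_steps: int = 20):
    from playwright.sync_api import sync_playwright  # deferred: only needed at run time
    findings = []
    with sync_playwright() as p:
        page = p.chromium.launch(headless=True).new_page()
        page.goto(url)
        for _ in range(max_steps):
            shot = page.screenshot()  # or an accessibility tree snapshot
            action = parse_action(call_model(persona_context, shot))
            if action["type"] == "click":
                page.click(action["selector"])
            elif action["type"] == "scroll":
                page.mouse.wheel(0, action.get("pixels", 600))
            elif action["type"] == "type":
                page.fill(action["selector"], action["text"])
            else:  # "evaluate": the persona records its verdict and stops
                findings.append(action["feedback"])
                break
    return findings
```

The validation step matters more than it looks: rejecting malformed actions before execution is what keeps a stuck agent failing loudly instead of clicking at random.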

You can point this at any browser-accessible surface. A fleet management dashboard behind a login, an internal procurement workflow, a patient intake form in staging. The agent authenticates with a test account, navigates in role, and evaluates against the expectations of the person it represents.

I showed the output from one of these runs to a VP of Engineering at a financial services company I have worked with. She read the synthetic CTO’s objections and said, “That is exactly what I said in the vendor review last quarter. How did it know that?” It did not know. It was calibrated against the same behavioral patterns that informed her real objections. That was the moment I stopped thinking of this as an experiment and started thinking of it as infrastructure.

I run six synthetic users in parallel against a page. Twenty minutes total. You can do this in Claude Code, GitHub Copilot, Gemini CLI, or any agent framework that supports tool use. Roughly 10-15% of browser runs hit an obstacle that requires a retry, usually a modal or dynamic loading state. The other 85-90% produce actionable feedback. The failure mode is “the agent got stuck on a cookie consent banner,” not “the agent gave me dangerously wrong advice.”
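Running the panel in parallel is plain fan-out. In this sketch, `evaluate_page` is stubbed so the orchestration structure is clear; in practice it would launch a full browser session calibrated to the named persona. The persona identifiers are illustrative.

```python
# Fan out one page evaluation per persona. The stub returns a placeholder
# result; a real implementation would run a calibrated browser session.
from concurrent.futures import ThreadPoolExecutor

def evaluate_page(persona: str, url: str) -> dict:
    # Stub: stands in for a full synthetic-user browser run.
    return {"persona": persona, "url": url, "score": None}

personas = ["cto_finserv", "cfo_saas", "ciso_bank",
            "vpe_seriesc", "cpo_mfg", "cpo_startup"]

with ThreadPoolExecutor(max_workers=len(personas)) as pool:
    results = list(pool.map(
        lambda p: evaluate_page(p, "https://example.com"), personas))
```

Because each session is independent, six personas take roughly as long as one, which is where the twenty-minute total comes from.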


Your Marketing Team Can Talk to Them

A synthetic user is not a one-shot evaluation. It is a persistent agent. You can have a conversation with it.

Your marketing team can sit down with a synthetic CTO and ask follow-up questions. “You said the messaging was unclear. What specifically would you need to see in the first paragraph to keep reading?” The agent responds in character, grounded in its calibration, with specific feedback.

I stopped treating synthetic users as evaluation tools and started treating them as simulated stakeholders I could workshop ideas with. I draft a homepage headline, run it past a synthetic CFO and VP of Engineering, read their reactions, revise, run again. The iteration cycle that used to take weeks (write, ship, wait for analytics, interpret, revise) now takes an afternoon.

Your product team can do the same with feature messaging. Your sales team can rehearse objections. And the people who used to spend three weeks recruiting real CTO interviews? They are now refining calibration documents, validating synthetic output against real conversations, and designing questions that make the real interviews sharper. The job did not disappear. The job got better.

Because the agent has a browser, you can say “go look at our competitor’s pricing page and tell me how it compares to ours from your perspective.” The agent navigates both sites, compares them in character, and gives you feedback specific to how that reader type evaluates competitive alternatives.


Where They Fit: Every Phase of the Product Lifecycle

This is what makes synthetic users a core capability. They fit everywhere a human would give you feedback, and the persona you need changes at each phase. Once you build the calibration infrastructure, you have a feedback engine that serves product, engineering, marketing, sales, CS, and People simultaneously.

Discovery and research. Before a single engineer touches the codebase, a synthetic panel of your ICP (Ideal Customer Profile) evaluates your product brief. If three of five say the premise is weak, you saved a quarter of engineering time. I killed two content ideas based on synthetic panel feedback that would have taken weeks to write and months to discover were wrong.

Build and validation. While the product is being built, synthetic users navigate your staging environment. A synthetic dispatcher evaluates your fleet management dashboard: “I expected to filter by vehicle status before route, but this forces me to pick a route first.” You can integrate this into your CI/CD (Continuous Integration / Continuous Deployment) pipeline as a pre-deployment gate. If a synthetic user that previously rated a flow positively now rates it negatively, the pipeline flags the regression before promotion.

Launch and GTM (Go-To-Market). Your synthetic reader panel evaluates positioning, competitive messaging, and the launch page before the campaign goes live. If your marketing team is spending $200K on a LinkedIn campaign, run the landing page past a synthetic version of your ICP first.

Pricing and packaging. You do not guess what the CFO will object to. You build a synthetic CFO calibrated to your actual reader profile and let her navigate the pricing page. Then build a synthetic startup founder with a different budget reality and compare the objections. Iterate before the first sales call, not after the first lost deal.

Sales. Run your pitch past a synthetic version of your prospect’s role and industry. The agent surfaces objections your team will hear in the real meeting, giving them time to prepare actual answers instead of improvising.

Retention and expansion. A synthetic churning customer evaluates your QBR (Quarterly Business Review) deck: “this felt like a product demo, not a conversation about my business outcomes.” A synthetic power user tells you whether the upgrade path is clear or feels like a bait-and-switch.

Internal and People. A synthetic engineering manager evaluates your reorg announcement. A synthetic senior IC tells you the new performance rubric does not account for prompt engineering. A synthetic new-hire navigates the real onboarding flow (Slack, Notion, HRIS) and tells you where she would give up and message her manager instead. The cost of getting a product page wrong is a bad quarter. The cost of getting a performance framework wrong is a year of attrition in your highest-leverage roles.
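The pre-deployment gate described under build and validation reduces to a small comparison: store each persona's last accepted score, re-run the panel against staging, and block promotion on any meaningful drop. A minimal sketch, with illustrative persona names and an assumed one-point tolerance:

```python
# Flag any persona whose synthetic evaluation score regressed by more
# than `tolerance` points against the stored baseline.
def gate(baseline: dict[str, int], current: dict[str, int],
         tolerance: int = 1) -> list[str]:
    """Return personas whose score dropped by more than `tolerance`."""
    return [
        persona
        for persona, score in current.items()
        if baseline.get(persona, score) - score > tolerance
    ]

regressions = gate(
    baseline={"dispatcher": 8, "fleet_manager": 7},
    current={"dispatcher": 5, "fleet_manager": 7},
)
if regressions:
    print(f"Synthetic user regression, blocking promotion: {regressions}")
```

In a pipeline, a non-empty result exits non-zero; the tolerance exists because LLM scores jitter by a point run to run, and a gate that fires on noise gets disabled within a month.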


How to Build Them

Most people expect me to say “just write a good prompt” here. That is like saying “just write good code.” Technically true. Practically useless.

Building a synthetic user that gives you useful feedback requires four things.

First, a calibration document. This is the foundation. It defines the persona’s role, industry, organizational context, decision-making authority, risk tolerance, known objections, competitive alternatives, and default disposition toward your category. What makes them skeptical. What makes them lean forward. What makes them close the tab.

The calibration document is not a paragraph. It is a page, sometimes two. The ones that work best are grounded in the most specific behavioral detail. My synthetic CTO for financial services runs two pages. The calibration for a synthetic startup founder is shorter because the persona is less constrained. Specificity is not about length. It is about knowing which details change the output.
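Keeping the document machine-readable makes it easier to version, diff, and render into the agent's context. One possible shape, mirroring the fields listed above; every value here is illustrative, not one of my actual personas:

```python
# A structured calibration document. Fields mirror the checklist above;
# the example values are invented for illustration.
from dataclasses import dataclass, field

@dataclass
class Calibration:
    role: str
    industry: str
    org_context: str
    decision_authority: str
    risk_tolerance: str
    known_objections: list[str] = field(default_factory=list)
    competitive_alternatives: list[str] = field(default_factory=list)
    default_disposition: str = "skeptical"

    def system_prompt(self) -> str:
        """Render the document as the agent's calibration context."""
        objections = "; ".join(self.known_objections) or "none recorded"
        return (
            f"You are a {self.role} in {self.industry}. {self.org_context} "
            f"Authority: {self.decision_authority}. "
            f"Risk tolerance: {self.risk_tolerance}. "
            f"Known objections: {objections}. "
            f"Default disposition: {self.default_disposition}."
        )

cto = Calibration(
    role="CTO",
    industry="financial services",
    org_context="Inherited a legacy modernization program eighteen months behind schedule.",
    decision_authority="final sign-off on vendor spend above $250K",
    risk_tolerance="low; burned twice by transformation consultants",
    known_objections=["slideware instead of working software",
                      "unclear compliance posture"],
)
```

The full two-page prose document still exists; this structure is the index into it, and the rendered prompt is what actually rides along with every browser session.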

Second, behavioral anchoring to real humans. This is where most efforts fail. If you invent the persona from your marketing team’s assumptions about the reader, you will get feedback that confirms those assumptions. You are testing your assumptions against your assumptions.

Ground it in actual customer interviews, sales call recordings, objection patterns from your CRM (Customer Relationship Management), churn reasons from your CS team. If the model is built on assumptions instead of evidence, the feedback is fiction. A research director at a mid-market SaaS company once told me this whole approach was “expensive hallucination with a persona costume on.” Without behavioral anchoring, she was right.

The best synthetic users are calibrated from the entire surface context of your application: support tickets, social mentions, application logs, NPS comments, community forums, sales call transcripts, churn surveys. Every surface where a customer expresses an opinion or encounters friction is a calibration input. I wrote about how to instrument this in The Customer Product Operating Model.

A note on data privacy: use anonymized and aggregated patterns, not raw transcripts. If you are sending calibration data to a third-party model provider, ensure your DPA covers this use case. Regulated jurisdictions (GDPR, CCPA, HIPAA) require legal review before feeding interview data into any external model. This is not optional.

If you do not have four data sources, start with win/loss analyses from your sales team. Highest-signal starting point. You can build a useful synthetic user from win/loss data alone and refine it as you get more inputs.

Third, behavioral validation. Once built, test it against known stimuli. Give it a competitor’s website or messaging that performed poorly and compare its response to what real humans said. Do the objections overlap?

I measure this as objection overlap rate: the percentage of objections raised by the synthetic user that also appear in real customer feedback. After three rounds of calibration, my synthetic CTO hit roughly 70-75% objection overlap with feedback from real CTOs I have spoken with. In other words, the agents catch about three out of four real objections. The 25-30% they miss tends to be context-dependent: relationship history, internal politics, budget timing. Things a simulation cannot know because they live outside the page.
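The metric itself is a one-liner. This sketch uses exact-string matching for clarity; in practice you would match on normalized or embedded objection text, and the example objections are invented:

```python
# Objection overlap rate: the share of synthetic objections that also
# appear in real customer feedback. Exact-string matching for clarity.
def objection_overlap(synthetic: set[str], real: set[str]) -> float:
    """Return |synthetic ∩ real| / |synthetic|, or 0.0 if no objections."""
    if not synthetic:
        return 0.0
    return len(synthetic & real) / len(synthetic)

rate = objection_overlap(
    synthetic={"no ROI model", "compliance unaddressed",
               "vendor lock-in", "pricing opaque"},
    real={"no ROI model", "compliance unaddressed",
          "pricing opaque", "team too small"},
)
# rate == 0.75: three of four synthetic objections match real ones
```

Note the denominator: this measures precision of the synthetic user, not coverage of real feedback. Tracking the reverse ratio as well tells you which real objections your personas never raise.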

Fourth, ongoing refinement. Every time real feedback surprises you, ask whether your synthetic user would have caught it. If not, update the calibration. A calibration update takes ten minutes. The validation run takes twenty.

I do this once a week. I compare real traffic and bounce rates to what my synthetic CxOs predicted. Did the synthetic CTO say this post would get forwarded? Did it? Did the synthetic CFO say the economics section was weak? Did readers drop off there? The gap between prediction and reality is the calibration signal. Every week, that gap gets smaller.

Over time, you create new personas as patterns emerge. Traffic from a segment you had not considered (a mid-market Head of Data, a non-technical CEO, a procurement officer) and you build a synthetic user for it. The persona library grows with your understanding of your market.


What Can Go Wrong

I would trust this methodology less if I only told you what it does well.

Synthetic users can be confidently wrong. I had a synthetic CFO tell me my pricing page needed a detailed ROI calculator before the CTA (Call to Action). Real CFOs wanted fewer fields, not more. The synthetic user was reasoning from a generic CFO archetype instead of the specific type of CFO who visits a consulting site. Recalibrating to “time-poor CFO evaluating a services engagement, not a SaaS purchase” fixed it. But if I had acted on the original feedback without validation, I would have made the page worse.

They do not model trust accumulation. A real reader builds trust over multiple touchpoints: a blog post, a LinkedIn comment, a referral, then a site visit. A synthetic user evaluates a single page in isolation. It cannot simulate relationship history.

They are not accessibility audits. A synthetic user can flag confusing navigation or unclear content hierarchy. It cannot replicate the experience of a screen reader user or a keyboard-only navigator. Accessibility requires specific tooling (axe-core, assistive technology testing) and real users with real accessibility needs.

Internal synthetic users carry higher stakes. Bad feedback about a pricing page costs you a suboptimal landing page. Bad feedback about a reorg plan or career ladder costs real people real career decisions. A synthetic panel can tell you the reorg announcement will confuse engineering managers. It cannot tell you the reorg will break a trust relationship between two specific leaders that took three years to build.

The speed can create false certainty. Twenty-minute feedback cycles are seductive. The risk is that teams treat synthetic feedback as ground truth and skip the real research. Synthetic users are a pre-filter, not a verdict. The moment you stop validating against real humans, you are optimizing for your model of the customer instead of the actual customer.


The Economics

A typical eight-person usability study costs $8,000-$15,000 and takes two to four weeks. A synthetic panel run takes twenty minutes and the tooling is effectively free if you already use Claude Code, Copilot, or Gemini CLI.

The interesting question is what happens to the people who do usability research today. Your best UX researchers become persona calibrators, the people who ground synthetic users in real behavioral data, validate against real outcomes, and catch the confident fiction that uncalibrated agents produce. The role changes. The skill set evolves. If your AI adoption strategy starts with headcount reduction, you will lose the institutional knowledge that makes the AI useful in the first place.

The real investment is time: four to eight hours per persona for the initial calibration, a day or two of engineering for the orchestration layer if your team already uses Playwright, and two to four hours per month recalibrating against real-world feedback. Unlike a usability study, the capability does not reset to zero after each engagement. The calibration documents get better. The persona library grows. The validation data accumulates. This is an asset that compounds, not an expense that evaporates.

That does not mean you skip the usability study. You run the synthetic panel first, fix the obvious problems, then run the study on a version of your product that is already better. The study stops catching obvious problems and starts catching the hard ones.

Consider a typical failure pattern. A redesign takes eleven weeks to build, six more weeks to get feedback, then a rework because the messaging missed. Two engineers for eleven weeks is roughly $85,000 in loaded salary, plus the opportunity cost of the six-week delay. A synthetic panel before launch would not have caught everything, but it would have caught “this messaging lands with directors but not CTOs.” That single insight, delivered twenty minutes before launch instead of six weeks after, saves the entire rework cycle.


Better Research, Not Less Research

Synthetic users do not replace user research. If you use them as a replacement, you will build a product optimized for your model of the customer instead of the actual customer.

But that is not a weakness. It is a description of how to use them correctly. Before you run a usability study with eight real participants, run the same flow past your synthetic panel. The panel catches the obvious problems. You fix them. The real participants see a better version. Their feedback is more useful because they are not wasting time on problems you could have caught without them.

Same logic as linting before code review and code review before QA. Catch the cheap problems early so the expensive validation steps can focus on the hard ones.


Start With One

Do not build thirty synthetic users. Build one.

Pick your most important reader persona. The one whose objections keep you up at night. Build a calibration document grounded in real data (win/loss analyses are the fastest starting point). Not “enterprise CTO” but specific: the industry, the company size, what she inherited, what her board is asking for, what burned her last time.

Point that synthetic user at one page. Your homepage. Your pricing page. Whatever matters most.

Read the feedback. Compare it to what your real customers have told you. If the synthetic user surfaces objections that match what you have heard from real people, you have a working tool. If it misses, recalibrate and try again.

One synthetic user, one page, one hour. That is the cost of finding out whether this works for you.

And once it works, think about what happens when you add synthetic users to your agent swarm. You are not running one panel manually and reading the results over coffee. You have agents building the product, agents testing the product, and now agents evaluating the product as calibrated readers, all running in parallel. The iteration speed changes completely. I revised this post four times tonight based on synthetic CxO feedback. Four full rewrites, with scores tracked across every iteration, in a single session. How long does your current feedback cycle take to produce four validated iterations of anything?


How long is your current feedback loop? From shipping a change to learning whether it worked for the right reader? Six weeks? Eight? A full quarter?

During that gap, how many decisions are you making based on what you think the customer wants instead of what a calibrated, validated synthetic version of that customer would tell you?

You do not have to answer that to me. But you should answer it to yourself. Every week that gap stays open, you are shipping into the dark. The organizations closing it are still talking to real customers. They are just arriving at those conversations with better questions because they already eliminated the obvious mistakes.

How many obvious mistakes did you ship last quarter that a twenty-minute panel run would have caught?

Hope is not a feedback strategy. Your product is not a fiction project. What happens if you do not get it right?


Appendix: What the Synthetic CxO Panel Said About This Post

Every reviewer below is a synthetic user. They were calibrated to specific executive profiles, given access to this article, and asked to score it, react honestly, and answer one question: would you send this to a peer, and if so, who and why? This is the raw feedback session. I am publishing it because the best way to explain what synthetic users do is to show you what they said about the thing you just read.

Jordan Whitfield, CTO — Ashford National Group ($4.1B, Financial Services, 1,200 engineers)
Score: 7/10
“The calibration methodology is the part I would actually use — four data sources, objection overlap rate, weekly recalibration against actuals. I can hand that to my platform team and they will have a prototype against our client onboarding flow inside a sprint. I docked points because the economics section treats compliance as a footnote. At my scale, every synthetic user touching production data needs a compliance review, and that is not a paragraph, it is the critical path.”
Would I send this to a peer? Yes — my Head of Platform Engineering and our VP of Client Experience. The calibration framework is concrete enough to spike without a consulting engagement.

Rachel Goldstein, CFO — Vertex SaaS ($380M ARR, Enterprise Software)
Score: 6/10
“The unit economics framing got my attention — $85K loaded salary for a rework cycle versus twenty minutes of synthetic validation is the kind of comparison I can put in front of the board. But then it stops. I need the full cost model: what does calibration cost in engineering time, what is the ongoing recalibration burn rate, and what does the breakeven look like at five personas versus twenty.”
Would I send this to a peer? No. I would send it to our CPO with a note that says “read the economics section and come back to me with the missing numbers before you ask for headcount.”

Kevin O’Brien, CISO — Ridgeline Federal Savings ($45B, Financial Services)
Score: 7/10
“The anti-persona concept is the first honest thing I have read from a consultant about security roles in two years. Your synthetic CISO is never happy — correct. That is the job. On a call I would spend the first fifteen minutes on the data flow architecture: where calibration documents are stored, whether behavioral anchoring data transits a third-party model provider, and what the DPA coverage looks like for regulated interview transcripts.”
Would I send this to a peer? Yes — my VP of Application Security. Not as an endorsement but as a threat model exercise. Better to map the data flow risks now than discover this in a shadow IT audit next quarter.

Marcus Thompson, VP of Engineering — RouteCast (Series C, $42M ARR, 140 engineers)
Score: 8/10
“The browser-based validation loop stopped me from skimming. We just spent six weeks on a dispatcher dashboard redesign that got torn apart in the first week because nobody tested the click-through against someone who thinks like a dispatcher under time pressure. One Playwright session against staging, calibrated from the support tickets we already have in Zendesk — I could spike that next week.”
Would I send this to a peer? Yes — the other Series C VP of Eng in my YC batch who just complained about the same post-launch feedback problem.

Elena Wright, CPO — Clareo Health ($900M, Digital Health, 2,800 employees)
Score: 8/10
“The internal synthetic user section is the section I screenshotted for my leadership team Slack channel. The idea that I can put our reorg announcement and new role descriptions in front of a calibrated agent that thinks like a front-line manager with three anxious direct reports — that is not a research tool, that is a pre-flight checklist for organizational change.”
Would I send this to a peer? Yes — the CHRO at a $1.2B digital therapeutics company facing the same AI-native role design problem.

Derek Rawlings, CPO — Ridgewell Industrial ($2.4B, Industrial, 12,000 employees)
Score: 5/10
“I read the whole thing waiting for you to tell me what happens to my people, and you gave me one paragraph. I have 7,200 non-technical employees who have never opened a terminal. Your answer to ‘what happens to them’ is that UX researchers become persona calibrators. That is a job description for the twenty people in my org who already have a master’s degree. Until you can tell me how a 54-year-old quality inspector fits into this picture, this is another consultant selling transformation to the people who need it least.”
Would I send this to a peer? No. I would send it to the author with a note that says “write the workforce piece and I will reconsider.”

Zara Okonkwo, Head of People — Cadence Labs (Series B, 85 employees)
Score: 8/10
“We have been doing the lazy version of this for six months — a two-paragraph prompt, a made-up persona, feedback that tells us what we already believe. The calibration rigor section hit me hard because it named exactly what we are getting wrong. I am taking the four-step calibration framework and the objection overlap rate metric straight to our next People ops sync.”
Would I send this to a peer? Yes — other Heads of People at Series A/B startups who are already using LLMs but doing it without rigor. The calibration methodology turns a toy into a tool.

Margaret Chen-Liu, CPO — Harborline Insurance ($6.8B, Insurance, 22,000 employees)
Score: 6/10
“The calibration framework is sound and I can see applications for testing our claims modernization communications before they reach 22,000 employees. But this article does not understand institutional trust debt — the kind that accumulates over three failed change programs and two renegotiated CBAs. McKinsey is already in my building telling me AI will transform claims processing, and they are ignoring me too. I need the author to tell me something they are not.”
Would I send this to a peer? No — not to another CPO at a unionized enterprise. I would send it to my Head of Internal Comms as background reading on the calibration concept.

Sarah Chen, CIO — Pacific Northwest Health System ($3.2B, Healthcare)
Score: 5/10
“The data privacy paragraph is one sentence about HIPAA and a reminder to check your DPA. That is not a compliance framework, that is a footnote. Before I can even pilot this, I need to know where calibration documents are stored, whether they transit a third-party model provider, and what the BAA coverage looks like. The clinical workflow use case is compelling in theory, but the gap between ‘good idea’ and ‘can exist in a regulated health system’ is where every promising AI initiative in my organization goes to die.”
Would I send this to a peer? No — not to another healthcare CIO. The regulated-industry treatment is too shallow.

Tom Brennan, CIO — Midwestern Mutual Insurance ($8B, Insurance)
Score: 4/10
“I have been in IT for thirty years and I have sat through every wave of consulting theater this industry has produced. The core claim — that an LLM pretending to be a CTO produces feedback equivalent to talking to an actual CTO — is something I need to see demonstrated, not asserted. A 70% objection overlap rate means 30% of the time it is wrong, and in my experience the 30% it misses is the 30% that matters.”
Would I send this to a peer? No. If the author offered a live demo against one of my actual workflows with my compliance team in the room, I might reconsider.

Michael Zhang, CTO — Stratal Software (PE-backed, $200M revenue)
Score: 7/10
“The idea that I can point a synthetic CFO at our pricing page before the next investor update and pre-empt the objections my PE sponsors will raise is immediately useful. I would have scored higher if you had addressed how to sell this internally when my investors want headcount reduction metrics, not ‘the job got better’ narratives.”
Would I send this to a peer? Yes — to another portfolio company CTO in our fund’s ecosystem who is also being asked to show AI ROI by Q3.

Diane Foster, CEO — Lakeview Health Partners ($500M, Healthcare, non-technical)
Score: 6/10
“I understood the concept immediately — the opening story made it real for me. But you lost me in the Playwright and CI/CD sections. I am not your technical reader. I needed a clearer ‘here is what this means for your patient satisfaction scores and your operating margin’ paragraph that I could repeat in my own board meeting.”
Would I send this to a peer? No. I would describe the concept over coffee to my COO and ask her to read it.

David Park, COO — Draymark Logistics ($1.8B, Freight and Supply Chain)
Score: 7/10
“The dispatcher example is the closest anyone has come to describing my actual problem. We just spent $2.3M on a TMS redesign and the operations floor hated it. What is missing is scale: I have 14,000 non-desk workers across 200 facilities. How do I calibrate synthetic users for a warehouse supervisor in Memphis who has different constraints than one in Rotterdam?”
Would I send this to a peer? Yes — to my VP of Process Excellence and to a COO at a peer logistics company who just told me his digital transformation is too engineering-centric.

Raj Patel, CDO — Ashford Capital Holdings (Fortune 100, $38B AUM)
Score: 8/10
“The calibration rigor separates this from the synthetic data theater I see from most vendors. Objection overlap rate as a measurable signal, behavioral anchoring against real CRM data, weekly recalibration — that is a methodology I can put in front of my Chief Risk Officer without embarrassment.”
Would I send this to a peer? Yes — to my counterpart CDO at two other large banks and to our Head of Model Risk Management.

Patricia Williams, CHRO — Brevian Analytics ($1.2B, Enterprise Data Platform, 3,200 employees)
Score: 7/10
“You spend 80% of the article on product and marketing use cases and give People teams one section. I run capability architecture for 3,200 people across six business units. If I am going to invest in calibration documents, I need to see how this maps to org design scenarios: restructures, RIF communications, career ladder rollouts, manager enablement programs.”
Would I send this to a peer? Yes — to my VP of Talent Strategy. The calibration framework for testing internal communications before they ship is immediately useful, even if the article treats it as secondary.

Elena Vasquez, CMO — Clearpoint SaaS ($150M ARR, Mid-market B2B SaaS)
Score: 6/10
“I got excited at ‘your marketing team can sit down with a synthetic CTO and ask follow-up questions’ because that is my exact problem. But then the article pivots back to engineering workflows. I do not have engineers on my marketing team. I need to know how my content strategist builds a synthetic reader panel without Playwright, without a developer, and without two pages of calibration methodology.”
Would I send this to a peer? No. If there were a companion piece focused on marketing-team-accessible implementation, I would forward that immediately to every CMO in my peer group.

Gregory Haines, General Counsel — Whitfield, Haines & Cole LLP (AmLaw 50)
Score: 5/10
“You mention GDPR, CCPA, and HIPAA in passing, but you do not address the evidentiary and ethical implications of using synthetic agents to evaluate legal work product. If I point a synthetic in-house counsel at a contract review workflow, I need to understand the privilege implications and the work product doctrine exposure. The billable-hours disruption angle is the existential question for every managing partner in the AmLaw 100 and it is not mentioned.”
Would I send this to a peer? No. A managing partner would read this and say “interesting technology, nothing to do with us.”

Dana Kowalski, CTO/Co-founder — Threadwork Labs (Series A, 30 employees)
Score: 4/10
“We have been doing this for eight months. Not the marketing-feedback version — the full loop. Synthetic users in CI, synthetic QA agents filing bugs, synthetic design reviewers blocking merges that regress UX scores. Your calibration framework is fine but it is aimed at organizations just discovering this. What I need is the advanced playbook: persona drift at scale, conflicting feedback from a panel of thirty, version-controlling calibration documents alongside code.”
Would I send this to a peer? Yes — but not for me. I would send it to three CTOs in my YC batch who have not started building synthetic feedback loops yet.

Sandra Kim, CTO — Veridon-Praxis Technologies ($2.1B combined entity, post-M&A)
Score: 7/10
“I am fourteen months into merging two engineering organizations with incompatible tech stacks and two completely separate ways of talking to customers. Building a shared synthetic reader panel — calibrated from both organizations’ CRM data — would force alignment on ICP definition faster than any offsite or strategy deck.”
Would I send this to a peer? Yes — to my integration PMO lead and to the other CTO whose org we acquired. Not as “here is our new feedback tool” but as “here is a framework that forces us to define our combined customer.”

Ingrid Magnusson, CISO/CTO — NordFinans Group ($600M, Nordic Fintech, Stockholm)
Score: 7/10
“The GDPR callout was necessary but insufficient. In my world, the entire architecture decision — where calibration data is processed, which model provider sees it, whether it crosses an EU border — is the project. The anti-persona concept earned my attention, and the fact that you acknowledge regulated jurisdictions at all puts you ahead of most vendors.”
Would I send this to a peer? Yes — to my counterpart at a peer Nordic fintech, framed as “interesting methodology, but read the gaps on data sovereignty before you get excited.”

James Okonkwo, CIO — U.S. General Services Administration (Federal Government)
Score: 5/10
“The article assumes a commercial procurement velocity that does not exist in my environment. I cannot spin up Playwright sessions against staging next sprint — I need an ATO, a FedRAMP-authorized model provider, and six months of paperwork. You mention FedRAMP once as a parenthetical. That told me you understand the acronym but not the constraint.”
Would I send this to a peer? No — not until there is a federal implementation appendix.

Yuki Tanaka, CTO — Mitsuhara Industries ($12B, Manufacturing, Osaka)
Score: 6/10
“My concern is cultural. In our organization, feedback flows through consensus. A single agent producing direct criticism of a product decision would be received very differently than it would in a US technology company. The article assumes that direct, scored feedback is universally valuable. I would want to understand how synthetic user output could be adapted for ringi-style decision processes.”
Would I send this to a peer? Yes — to our Head of Digital Transformation, with a note suggesting we explore adaptation for our decision-making culture.

Carlos Mendez, CTO — VoltaPay ($300M, Fintech, São Paulo)
Score: 7/10
“The feedback loop compression argument hit immediately — we operate in a market where regulatory windows open and close in weeks. Two gaps: every synthetic persona in your appendix is a North American or European executive, and running six parallel Playwright sessions with vision-capable models requires bandwidth that is not trivially available in all the markets I serve.”
Would I send this to a peer? Yes — to the CTO at a top-five LATAM fintech’s platform team, with the caveat that we need LATAM-calibrated personas.

Amir Hassan, CTO — Tawazun Digital (Sovereign Wealth Tech Initiative, Abu Dhabi)
Score: 6/10
“At our scale — national-level digital infrastructure — the twenty-minute feedback cycle is appealing but the article does not address what happens when you need two hundred synthetic users representing distinct citizen segments across four languages and multiple literacy levels. The assumption that synthetic users are primarily a B2B sales and product tool undersells the capability.”
Would I send this to a peer? Yes — to a peer CTO at another GCC sovereign digital initiative, noting that the methodology needs significant scaling and localization work but the calibration loop is architecturally sound for mega-project quality gates.

Lisa Nakamura, CTO — Crestline Retail ($8B, E-commerce + Brick-and-Mortar)
Score: 7/10
“I could point a synthetic store operations manager at our checkout flow during Black Friday prep and catch the friction before we are dealing with it at 40,000 concurrent sessions. But the article assumes stable, always-connected environments. The seasonal scaling dimension — where your reader persona shifts from a patient browser in March to a frantic gift-reader in December — is not addressed.”
Would I send this to a peer? Yes — to my VP of Digital Experience. She has been asking for faster usability feedback since we botched the mobile redesign last holiday season.

Alex Rivera, CTO — Storyarc Media ($2B, Media/Entertainment)
Score: 6/10
“In my world, AI is the product. I do not need synthetic users to tell me if my pricing page converts. I need synthetic viewers who can evaluate whether a generated storyline holds narrative tension across six episodes or whether our AI-curated playlist feels like discovery or a filter bubble. The article gestures at ‘the interface changes’ but never follows through for creative product evaluation.”
Would I send this to a peer? No. The core idea is interesting but the examples would not land with anyone in media.

Sven Eriksson, CTO — Skybridge Telecom ($15B, Scandinavian Telecom)
Score: 6/10
“I have 14 million subscribers generating network events at a rate where twenty-minute panel runs are a rounding error against the feedback signals I already have from production telemetry. Where synthetic users would be useful for me is evaluating internal tooling — the NOC dashboards and provisioning workflows my network engineers complain about constantly. That use case is buried in one paragraph.”
Would I send this to a peer? Yes — to my VP of Network Operations with a caveat: “Ignore the marketing examples and read the internal tooling section.”

Henrik Johansson, CTO — GrainWise ($400M, Precision Agriculture)
Score: 5/10
“My deployments are combines running ML models in fields with no cell coverage, agronomists using tablet interfaces in direct sunlight with dirty gloves. The Playwright-based browser loop assumes reliable connectivity and screen-based interaction. My synthetic users would need to evaluate USSD flows, offline-first experiences, and decision support tools where the user’s context is a weather forecast and a soil moisture reading.”
Would I send this to a peer? No — not as written. If someone wrote the AgTech version addressing connectivity constraints and non-screen interfaces, I would forward that immediately.

Ananya Mehta, CTO — Proteon Life Sciences ($6B, Pharma/Life Sciences)
Score: 7/10
“The four-step build process and the objection overlap rate metric are the kind of structured approach I can bring to my regulatory affairs team without them dismissing it. Where this falls short: my synthetic users would need to evaluate clinical trial enrollment interfaces under 21 CFR Part 11 compliance constraints. The data privacy paragraph is a good start but the calibration data itself is likely PHI or proprietary clinical data, and the compliance architecture for that is a six-month initiative, not a footnote.”
Would I send this to a peer? Yes — to my Head of R&D Informatics. She has been looking for a way to get faster feedback on our clinical data platform UX without running another $200K usability study through our IRB process.

Richard Townsend, CTO — Steelvine Holdings ($3B, publicly traded, activist investor pressure)
Score: 7/10
“The speed argument is right for my situation — I cannot wait six weeks for feedback when my board wants demonstrable AI-driven margin improvement every quarter. What I did not see is how synthetic user output translates into the operating leverage metrics that activist investors track. You told me how to compress feedback loops. You did not tell me how to present that compression as margin expansion on an earnings call.”
Would I send this to a peer? Yes — to my VP of Product with the note “read the calibration framework section and tell me what it would take to pilot this against our client portal before Q3 earnings.”

Margaret Thornton, Board Member / Non-Executive Director — sits on 3 boards (tech, healthcare, financial services)
Score: 6/10
“The concept is sound and the governance implications are significant. But this article is written for the operator, not the director. I need the fiduciary framing: what is the board’s exposure when management ships without validated customer feedback, and what does the liability picture look like when decisions are informed by AI-generated personas instead of real market research?”
Would I send this to a peer? No. I would reference the concept in a board strategy session and ask management to come back with a governance framework.

Fiona McAllister, CTO — Ironstone Resources ($5B, Australian Mining)
Score: 5/10
“I run technology across fourteen remote sites, some twelve hours from the nearest city, with a FIFO workforce and an industrial IoT estate generating four terabytes of sensor data a day. Give me one example of synthetic users applied to industrial operations — a control room interface, a maintenance scheduling system, a safety reporting workflow — and this moves from thought piece to steering committee material.”
Would I send this to a peer? No. Too SaaS-centric. My peers in resources would read two examples about marketing pages and close the tab.

Wei Lin, CTO — Vantage Super-App Platform ($4B, Singapore)
Score: 6/10
“I operate across eight markets with five major languages and user personas that shift fundamentally between Singapore, Jakarta, and Ho Chi Minh City. A synthetic user calibrated to a Singaporean finance professional evaluates completely differently from one calibrated to an Indonesian micro-merchant. The article does not address multilingual calibration, cultural context in persona design, or government AI strategy alignment — which in APAC is not optional.”
Would I send this to a peer? Yes — to my Head of Product with a caveat: “The calibration methodology is worth adapting, but we need to build the APAC persona layer ourselves.”

Nadia Osei, CTO — KwikSettle ($80M, Lagos-headquartered Fintech)
Score: 5/10
“I built our mobile money platform for users who transact on $60 Android devices over 2G networks. My synthetic users need to evaluate USSD flows, SMS-based interfaces, and agent-assisted transactions where the ‘user’ is a market trader who has never opened a web browser. The real comparison in my market is between synthetic users and no feedback at all — which is a much more compelling argument than the one this article makes.”
Would I send this to a peer? No — not this version. If there were one paragraph acknowledging that synthetic users are most transformative in markets where traditional research infrastructure does not exist, I would forward it immediately.

Amara Washington, CTO — Beacon University Systems ($600M, EdTech/Higher Education)
Score: 7/10
“I immediately saw applications for how we test student-facing experiences — we just spent four months redesigning our advising portal and did not hear from a single actual student until after launch. What I need is how synthetic users handle institutional complexity: faculty governance, academic integrity concerns, and the fact that my readers include provosts, deans, students, and accreditation bodies who all evaluate the same product with completely different priorities.”
Would I send this to a peer? Yes — to my VP of Product and my Chief Academic Officer with the message: “We should be running our student portal redesign past calibrated agents before shipping.”

James Chen, Engineering Director — Fortune 500 Retailer (4 teams, 52 engineers)
Score: 6/10
“The browser-based validation loop matters to me — we just shipped a checkout redesign that our ops team hated because nobody simulated the experience of a warehouse manager processing returns on a tablet at 6 AM. But the article is written for someone who gets to decide things. I do not decide things. I translate executive mandates into sprint plans. What I need is how to pitch this to my VP without it sounding like another AI side project.”
Would I send this to a peer? Yes — to the other engineering directors in our org with the message: “This is the first thing I have read that connects agent-based testing to the feedback delay problem we keep hitting.”

Sarah Martinez, Engineering Manager — Enterprise SaaS (manages 12 ICs)
Score: 5/10
“I manage twelve engineers. Three of them are already nervous about AI replacing them. This article tells me UX researchers become persona calibrators and the job gets better, but it does not tell me what I say to my mid-level backend engineer who just watched a synthetic user do in twenty minutes what his team spent three sprints building a feedback survey to accomplish. The section on internal synthetic users evaluating reorg announcements is the most useful part — at least someone is acknowledging that the humans on the receiving end deserve better than a Slack message and a FAQ doc.”
Would I send this to a peer? No. I would send it to my skip-level with a pointed question: “If we adopt this, what is the reskilling plan for my team?”

Robert Kim, VP of Engineering — Large Enterprise (300+ engineers)
Score: 8/10
“The economics section is what I needed — I have a board mandate to show measurable AI adoption by Q3 and most proposals on my desk are either too abstract to measure or too narrow to matter. Synthetic user panels as a pre-deployment gate in CI/CD is a concrete capability I can fund, staff, and report on. What holds me back from a 9 is scale evidence: running six synthetic users against one page is a demo. Running thirty against 400 pages across four product lines — that is my reality.”
Would I send this to a peer? Yes — to two other VPs under board pressure to show AI wins with the message: “This is the most defensible AI adoption use case I have seen — measurable, low-risk, and it does not require replacing anyone.”

Dev, Staff Engineer — Platform Company
Score: 7/10
“The Playwright orchestration walkthrough is real — screenshot capture, accessibility tree snapshot, vision model evaluation, structured action output. That is an actual architecture. The failure modes are practitioner numbers: 10-15% retry rate, 70-75% objection overlap, not 95%. Those are honest numbers. What I want to see is the calibration document itself — show me the actual prompt, the token count, the model. You earned my trust with the architecture section. Do not lose it by keeping the implementation behind a consulting engagement.”
Would I send this to a peer? Yes — to another staff engineer evaluating agent frameworks with the message: “Skip to ‘They Can Use Your Software’ and ‘How to Build Them.’ The author actually builds software.”
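
Several reviewers treat the objection overlap rate as the load-bearing metric — Raj Patel calls it "a measurable signal," Dev cites a 70-75% practitioner range, and Tom Brennan's skepticism hinges on the missing 30%. A minimal sketch of how that number can be computed, assuming objections are normalized to comparable tags and overlap is measured as the fraction of real-stakeholder objections the synthetic panel also raised; the article does not pin down an exact formula, so the normalization and example objections here are illustrative, not the author's implementation:

```python
def normalize(objection: str) -> str:
    """Collapse an objection to a comparable tag (illustrative: lowercase, trim whitespace)."""
    return " ".join(objection.lower().split())

def objection_overlap_rate(synthetic: list[str], real: list[str]) -> float:
    """Fraction of real-stakeholder objections that the synthetic panel also raised.

    Returns 0.0 when there are no real objections to compare against.
    """
    real_set = {normalize(o) for o in real}
    if not real_set:
        return 0.0
    synth_set = {normalize(o) for o in synthetic}
    return len(real_set & synth_set) / len(real_set)

# Hypothetical example: a synthetic CISO panel's objections vs. objections
# a real CISO raised when reviewing the same page.
synthetic = [
    "No data residency guarantees",
    "Calibration docs transit a third-party model provider",
    "No SOC 2 evidence",
]
real = [
    "no data residency guarantees",
    "calibration docs transit a third-party model provider",
    "no incident response plan",
    "no soc 2 evidence",
]
print(f"{objection_overlap_rate(synthetic, real):.0%}")  # prints 75% — 3 of 4 real objections matched
```

The definition deliberately scores recall against the real stakeholder, not precision: a synthetic panel that raises extra objections is cheap noise, but a real objection the panel never surfaced is exactly the 30% Brennan worries about.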

The views and opinions expressed in this article are the author’s own and do not represent the positions of any employer, client, or affiliated organization.
