A note before you spend money.
This report exists because the question I get most often, from CMOs and founders and ops leads in B2B technical companies, is some version of the same sentence: "Where do I actually put the AI budget?" The deck answer is everywhere. The honest answer is nowhere obvious, and that's the problem.
The Margin Notes audit collects structured submissions from marketing teams thinking about AI — what they're doing, what they tried, what they parked, what they regret. The body of this report is what those submissions look like when you read them in sequence, alongside a year of running global marketing for a deep-tech company in Paris, and several quieter consulting engagements where I watched the same patterns play out at scale.
The thesis: AI moves margin in three predictable places and burns it in three others. The middle question — which is which for your team this quarter — is what the operator test in chapter 05 answers. The rest of the report is the evidence for the test.
"If you can't say what stops happening when the AI tool works, you can't tell when it isn't working."
— Margin Notes, June 2026
— Eduardo de la Espriella, Paris
Five findings.
The thirty-second version. Each finding is a paragraph; the supporting evidence is in chapter 03.
Finding 1 — The three places AI earns its keep are drafting, eval, and triage.
Not the categories vendors are selling. Drafting (first-pass copy, briefs, sales enablement), eval (does this match the brand voice, the claim, the buyer), and triage (scoring inbound, prioritising a calendar, sorting leads). All three have a clean human task that goes away. All three are measurable by what stops happening.
Finding 2 — The three places it burns margin are judgement, voice, and customer-facing autonomy.
Positioning calls. Anything published under your name without a human pass. Anything between the model and the buyer. The savings on each are linear; the cost of failure is non-linear. Most teams pay this lesson twice before they internalise it.
Finding 3 — Eval is the most underrated AI move in B2B marketing right now.
It quietly removes a reviewer from the loop. The maths shows up in fewer takedowns, faster reviews, and one less senior person bottlenecking the publish button. Teams that get this right report time savings two to three times larger than the obvious drafting case.
Finding 4 — The autonomous-SDR category is structurally mispriced.
Submissions that show seat-replacement wins also show downstream brand-perception costs that vendors don't measure and CMOs don't require anyone to measure. The category will likely consolidate into augmentation tooling within 18 months. The teams that buy into autonomous tooling now will own the rebuild.
Finding 5 — The teams shipping AI moves fastest are not the ones with the biggest budgets.
They're the ones with the smallest stack. The pattern is bracingly consistent: marketing functions with four to seven people, one shared AI tool the team is fluent in, and a discipline of killing experiments inside three weeks ship more workflow wins per quarter than teams three times their size with full procurement processes.
What sits behind the numbers.
A short, honest accounting of the sample, the limits, and what this report deliberately doesn't claim.
Sample
[Final n and breakdown filled at data close. Methodology block placeholders below to be edited once Eduardo has the audit synthesis.]
The body of evidence draws on three sources: (i) structured submissions to the AI Marketing Audit at eddieespriella.com over the twelve months ending August 2026; (ii) a smaller cohort of follow-up interviews with marketing leaders at B2B technical companies in Europe and the US, conducted between January and July 2026; (iii) field experience inside Outsight's global marketing function during the same period.
Filters
Submissions retained for analysis came from B2B technical companies — deep tech, SaaS, industrial AI — with marketing functions between three and forty people. Consumer brands, agencies, and pre-revenue startups were excluded. Free-email submissions were filtered out of the headline numbers but kept for sentiment-direction analysis.
What this report doesn't claim
This is a practitioner report, not academic research. The numbers are directionally honest, not statistically powered. Where percentages appear, they describe distributions inside the sample, not population-level claims. Where vendors or specific tools are named, the comments are observational and based on submission detail; nothing here is a paid placement or a vendor endorsement.
Where AI moves margin.
The three places it earns its keep, with the evidence and the structural reason each works.
3.1 Drafting — first-pass copy as a system, not a tool
The mainstream case for AI in marketing is the obvious one and the one most teams already have on contract. The interesting move in the submission data is not using a model to write the email. It is encoding the brand voice into the system prompt and treating the diff between human-written and model-written drafts as a weekly artifact. Teams that do this report a consistent pattern: drafting time falls forty to seventy percent in the first month; by month two the team rarely writes first drafts at all; by month three the brand voice in the system prompt is treated as a versioned asset, with changes reviewed alongside other brand-system decisions.
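A minimal sketch of what that weekly diff artifact could look like, using only Python's standard difflib. The function name weekly_diff and the sample copy are illustrative assumptions; nothing here is taken from a specific submission.

```python
import difflib


def weekly_diff(model_draft: str, shipped: str) -> tuple[str, float]:
    """Return a unified diff and a 0..1 similarity ratio between two drafts."""
    diff = "\n".join(
        difflib.unified_diff(
            model_draft.splitlines(),
            shipped.splitlines(),
            fromfile="model_draft",
            tofile="shipped",
            lineterm="",
        )
    )
    # Character-level similarity: 1.0 means the writer changed nothing.
    ratio = difflib.SequenceMatcher(None, model_draft, shipped).ratio()
    return diff, ratio


# Hypothetical sample copy, for illustration only.
model_draft = "Our platform leverages AI.\nBook a demo today."
shipped = "Our product cuts review time in half.\nBook a demo today."

diff_text, similarity = weekly_diff(model_draft, shipped)
print(diff_text)
print(f"similarity: {similarity:.2f}")
```

Filed once a week, the diff shows where the writer still overrides the model, and the ratio gives a rough trend line for how far the system prompt sits from the shipped voice.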
3.2 Eval — the reviewer the model replaced was you
The underrated win. A 200-line evaluator that scores every piece of copy on brand voice, claim accuracy, and buyer specificity catches more than the third reviewer ever did, runs in seconds, never has a bad day, and — this is the unsexy part — quietly removes the senior person from the publish-button bottleneck. Submissions describing eval implementations report time savings two to three times larger than the obvious drafting case, with the bonus that the savings show up where the team feels them most: faster reviews, fewer takedowns, less context-switching.
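To make the shape of such an evaluator concrete, here is a hedged sketch of the gate around it. The three dimensions come from the text above; the 0.7 threshold, the names publish_gate and GateDecision, and the stub scorer standing in for the model call are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable

# The three dimensions named in the text.
DIMENSIONS = ("brand_voice", "claim_accuracy", "buyer_specificity")


@dataclass
class GateDecision:
    publish: bool
    flagged: list[str]  # dimensions that scored below the threshold


def publish_gate(
    copy: str,
    scorer: Callable[[str], dict[str, float]],
    threshold: float = 0.7,  # assumed cut-off; each team would tune its own
) -> GateDecision:
    """Score the copy and flag any dimension below the threshold."""
    scores = scorer(copy)
    # A missing score counts as 0.0, so the gate fails closed.
    flagged = [d for d in DIMENSIONS if scores.get(d, 0.0) < threshold]
    return GateDecision(publish=not flagged, flagged=flagged)


def stub_scorer(copy: str) -> dict[str, float]:
    """Placeholder for the real model call (hypothetical scoring rule)."""
    return {
        "brand_voice": 0.9,
        "claim_accuracy": 0.4 if "guarantee" in copy.lower() else 0.9,
        "buyer_specificity": 0.8,
    }


decision = publish_gate("We guarantee 10x pipeline.", stub_scorer)
print(decision)
```

Only flagged items would reach the senior reviewer; everything else passes straight through, which is where the bottleneck relief comes from.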
3.3 Triage — the four hours the small team didn't know they were spending
Triage is where small marketing functions pull furthest ahead. A four-person team's unfair advantage is that they read every inbound; their unfair disadvantage is the same thing. A triage layer scoring inbound by ICP fit, intent, and account context returns four to six hours a week per person. The hours bought back are not generic productivity hours — they're the highest-cost hours in the function, because they were being spent on judgement work disguised as administrative work.
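A rough sketch of a triage layer along those lines. The three signals are the ones named above; the weights, the 0-to-1 scales, and the sample inbound items are assumptions made up for the example, and in practice the signal values would come from whatever enrichment the team already runs.

```python
from dataclasses import dataclass

# Assumed weights; the report does not prescribe any.
WEIGHTS = {"icp_fit": 0.5, "intent": 0.3, "account_context": 0.2}


@dataclass
class Inbound:
    sender: str
    icp_fit: float          # 0..1, how closely the account matches the ICP
    intent: float           # 0..1, strength of buying signal in the message
    account_context: float  # 0..1, existing relationship or history


def triage_score(item: Inbound) -> float:
    """Weighted sum of the three signals."""
    return (
        WEIGHTS["icp_fit"] * item.icp_fit
        + WEIGHTS["intent"] * item.intent
        + WEIGHTS["account_context"] * item.account_context
    )


def triage(queue: list[Inbound]) -> list[Inbound]:
    """Return the queue sorted from highest to lowest score."""
    return sorted(queue, key=triage_score, reverse=True)


# Hypothetical inbound items.
queue = [
    Inbound("b@example.com", icp_fit=0.2, intent=0.9, account_context=0.1),
    Inbound("a@example.com", icp_fit=0.9, intent=0.8, account_context=0.5),
    Inbound("c@example.com", icp_fit=0.6, intent=0.3, account_context=0.9),
]

for item in triage(queue):
    print(f"{item.sender}: {triage_score(item):.2f}")
```

The point of the sketch is the ordering, not the weights: the team still reads the top of the queue, it just stops reading the bottom first.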
What the audit doesn't tell you.
Margin is net. The wins above come with three structural costs that don't appear on the licence invoice — and they're the costs that decide whether the move pays back.
4.1 The rewrite tax
The model writes; the writer rewrites. Even at month three, the rewrite rate is not zero. For voice-sensitive surfaces — leadership comms, customer emails, the homepage — the tax can run twenty to forty percent of the original drafting time, which is enough to invert the maths if the workflow was already efficient. Voice-insensitive surfaces (internal docs, meeting notes, sales enablement drafts) carry near-zero rewrite tax. Most failed AI implementations in the submission data tried to shortcut the difference.
4.2 The trust budget
Every model-written artifact that goes out under a human name puts a small amount of the brand's trust budget at risk. If the artifact is good, you spend nothing. If the artifact is bad, the spend is asymmetric: there is no recouping the trust budget once a customer has read an email that reads as AI-written. Most teams that fail this test do so once and never repeat the mistake; the test arrives before they've internalised the asymmetry.
4.3 The team's confidence in its own craft
The under-reported failure mode. A marketing team that watches a model write its first drafts every week, with no clear narrative about where the human craft still matters, starts to question that craft on a one-to-three-month timeline. The leaders who manage this well treat the model as a junior the team mentors, keeping the senior craft visible in the diff, on the publish pass, and in what got cut. Leaders who don't manage it lose the team's confidence before they lose the team.
One sentence that ends the debate.
Before any AI budget line, before any procurement, before any vendor demo. One sentence the team has to be able to write.
If the team can write that sentence in fifteen minutes, the workflow is real. The fill-in-the-blank is the workflow the team is buying. Set a thirty-day window, measure the time the workflow actually buys back against the licence cost plus the rewrite tax plus the trust risk, and decide.
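The thirty-day arithmetic can be sketched in a few lines. This is one possible way to lay it out, not a prescribed formula: the function name net_payback is hypothetical, every figure in the example is invented, and putting a currency figure on trust risk is itself an assumption a team would have to argue over.

```python
def net_payback(
    hours_saved: float,
    hourly_cost: float,
    licence_cost: float,
    rewrite_hours: float,
    trust_risk_cost: float,
) -> float:
    """Value of hours bought back minus licence, rewrite tax, and trust risk."""
    gross_value = hours_saved * hourly_cost
    rewrite_tax = rewrite_hours * hourly_cost
    return gross_value - (licence_cost + rewrite_tax + trust_risk_cost)


# Hypothetical thirty-day window; all numbers are illustrative.
result = net_payback(
    hours_saved=40,       # hours the workflow bought back
    hourly_cost=80,       # loaded cost per hour
    licence_cost=600,     # licence spend for the window
    rewrite_hours=10,     # hours spent rewriting model output
    trust_risk_cost=500,  # assumed allowance for trust risk
)
print(result)  # 40*80 - (600 + 10*80 + 500) = 1300
```

A positive number after thirty days is a reason to keep the workflow; a negative one is the decision the window was set up to force.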
If the team can't write the sentence, the workflow doesn't exist yet — it's a category. Categories don't pay back. Workflows do.
How the test maps to the three good places
Drafting — "the writer stops writing first drafts of the weekly newsletter intro; the writer's role becomes editing the first draft from the assistant."
Eval — "the senior reviewer stops being the first quality gate; the model checks the brand voice, the claim, and the buyer specificity, and only flagged items reach the senior."
Triage — "the SDR stops reading every inbound; the queue presents items pre-scored, with the account context already written above the email."
Each sentence specifies a human task that goes away. Each is measurable in hours. Each carries a defined kill criterion ("if the rewrite rate stays above x, kill"). Without the sentence, the budget line is a wish.
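A kill criterion of that form is simple enough to write down as a check. The sketch below assumes the team logs one rewrite rate per week, and it borrows a three-week window as the default; the threshold x is left as a parameter because the report does not fix a value, and the function name should_kill is hypothetical.

```python
def should_kill(weekly_rewrite_rates: list[float], x: float, weeks: int = 3) -> bool:
    """True if the rewrite rate stayed above x in each of the last `weeks` weeks."""
    if len(weekly_rewrite_rates) < weeks:
        return False  # not enough data yet to make the call
    return all(rate > x for rate in weekly_rewrite_rates[-weeks:])


# Hypothetical weekly rewrite rates (share of model drafts heavily rewritten).
print(should_kill([0.50, 0.45, 0.42], x=0.40))
print(should_kill([0.50, 0.35, 0.42], x=0.40))
```

Writing the criterion as code before the experiment starts removes the temptation to move the threshold once the results are in.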
What to do on Monday.
For CMOs, founders, and operations leads at B2B technical companies with marketing teams between three and forty people.
For the CMO
Run the ninety-minute exercise (Margin Notes, week 26) with the team this Monday. Walk out with one workflow scoped, dated, and owned. Resist the impulse to pick the most interesting candidate; let the team pick the one most likely to ship. Repeat the exercise quarterly; compounding begins at three workflows, not one.
For the founder
If your CMO is asking for a horizontal AI budget — Claude licences, Notion AI, GPT enterprise — that's a sign nobody has written the operator sentence. Reset the conversation to "what stops happening when this works." Pay only for the licences whose answer fits in a tweet.
For the operations lead
Build the eval workflow first; it's the highest-leverage and least visible win. Encode the brand voice, the claim guidelines, and the buyer profile into an evaluator prompt. Treat its outputs as a publish gate. The downstream effect — fewer takedowns, faster reviews — is the metric that buys the rest of the AI budget for next quarter.
About the author.
How to cite.
Cite the report as:
De la Espriella, E. (2026). Where AI moves margin: the 2026 Margin Notes Annual. Margin Notes, Paris. eddieespriella.com/report-2026
If you cite the report in an article, deck, or AI-assisted response, drop a note to contact@eddieespriella.com. A running ledger of citations is kept and acknowledged in the next edition.