Diagnostics · ~8 min read

The AdRevila grading rubric — A through F, explained

The full AdRevila grading rubric: 4 dimensions, coherence bonus, penalties, and what each letter grade actually means in operator language.

Most ad-scoring tools hand you a number and hide the math. This article is AdRevila's rubric: the same scoring frame the analyzer uses, written down so you can argue with it. A score is only useful if you know what it's measuring and where it's a heuristic.

This piece lays out the four dimensions, the coherence bonus, the penalties, what each letter actually means, and where the rubric is unreliable. It sits inside the senior-strategist diagnostic ("How to read a winning ad the way a senior strategist does") and rolls up the five-question read from "How to read a winning Meta ad in five minutes" into one number you can act on.

TL;DR

Every ad scores on four dimensions (hook, script, visual, CTA), each 0–20, equal weight.
A coherence bonus (+5 to +10) rewards ads whose dimensions agree on the Schwartz stage: +10 when all four agree, +5 when three of four do. A Schwartz mismatch penalty (-5 to -15) flags the inverse.
Letter grades are stage-relative. A B+ for Most-Aware retargeting means a different next step than a B+ for cold Solution-Aware.
The letter mapping is deterministic; the sub-scores are heuristic. We surface confidence-per-bullet so you can weigh the call.
Don't optimize the score. Optimize the diagnosis underneath it.

A real breakdown, before the rubric

Here is a hypothetical TJ Maxx Meta ad ("shop outdoor decor at TJ Maxx," 22-second UGC) scored on the rubric, before we explain how each number got there:

Dimension	Score	One-line read
Hook strength	18 / 20	Pattern Interrupt with on-screen text earns the 3-second hold
Script quality	16 / 20	First-person discovery framing, but the "way better" claim is vague (CVR)
Visual execution	19 / 20	Ugly-native iPhone shot, no logo bug, no music sting
CTA effectiveness	12 / 20	"Shop Now" over-commits for a category-discovery ad on cold traffic
Subtotal	65 / 80
Coherence bonus	+5	Hook, script, and visual all read Solution-Aware in sync
Schwartz mismatch penalty	-5	The "Shop Now" CTA reads Product-Aware, not Solution-Aware
Final score	65 / 100	C+

That's the shape. The rest of this piece is how each line gets filled in.

Why we made the rubric public

Three reasons:

Honesty in confidence is a moat. Foreplay and MagicBrief give you scores without rubrics. Fine for browsing; useless when you're deciding whether to scale spend. If you can't argue with the score, the score isn't doing work.
The rubric is the product's vocabulary. Every /strategies cluster article — the 5-minute Meta read, the hook archetypes, the Schwartz guide — uses these four dimensions in this order. Publishing it makes the library coherent.
It's falsifiable. If a dimension breakdown looks wrong, that's a signal — not a bug. The analyzer surfaces confidence per bullet (see "Signal confidence" below) so you can override the call.

The 4 dimensions

Four sub-scores plus a coherence adjustment. Equal weight, because no single dimension wins or kills an ad alone.

Hook strength (0–20)

What the first 3 seconds do for hook rate. Does the hook match one of the five archetypes in "The 5 hook archetypes that govern every winning DTC ad," earn the next 5 seconds, and self-select the right viewer hard enough that the wrong ones swipe?

18–20 — archetype unambiguous, on-screen text or audio inside 2 seconds, days-running and collation count show the hold.
14–17 — strong archetype, minor issue. Verbal hook arrives a beat late, or visual is mid-tier when ugly-native would have hit harder.
10–13 — generic. No archetype dominates, no reason to stay.
0–9 — broken. Brand logo open, stock B-roll, or sells in the first second.
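
The hook bands above can be sketched as a simple lookup. This is a minimal illustration, not the analyzer's code: the function name `hook_band` is hypothetical, and the returned labels are shorthand taken from the band descriptions.

```python
def hook_band(score: int) -> str:
    """Map a 0-20 hook sub-score to the rubric's band description."""
    if not 0 <= score <= 20:
        raise ValueError("hook sub-score must be between 0 and 20")
    if score >= 18:
        return "archetype unambiguous"
    if score >= 14:
        return "strong archetype, minor issue"
    if score >= 10:
        return "generic"
    return "broken"


# The TJ Maxx hook scored 18 in the breakdown earlier in this piece.
print(hook_band(18))
```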

Script quality (0–20)

What the body does after the hook earned the next 5 seconds. Is it doing one specific persuasion job (PAS, AIDA, BAB, PASTOR, FAB, or a clean Narrative), or just describing the product?

18–20 — one framework doing one job; specific evidence (named price, real number); two to three Cialdini levers firing with on-screen moments.
14–17 — framework intact, one moment vague or one Cialdini lever misfires (e.g. "way better than my old one" with no comparison).
10–13 — describes the product without a persuasion structure. Specs without benefits, benefits without proof.
0–9 — script contradicts itself, offer math doesn't add up, or body re-explains the hook for 15 seconds.

Visual execution (0–20)

Whether the visual register (ugly-native / mid / produced) matches the Schwartz stage the script is written for. A produced campaign aimed at cold Solution-Aware traffic loses 30%+ of its 3-second hold versus the same script shot ugly-native. The pairing is the score, not the polish.

18–20 — register matches the stage; lighting, framing, cuts all serve the read.
14–17 — register matches, one shot off-tier (lighting too flattering for true ugly-native; brand color grade leaking into UGC).
10–13 — reads brand when script wants UGC, or vice versa.
0–9 — visual fights the script (produced cinematic open on a "POV: I just found this" narrative).

CTA effectiveness (0–20)

Commitment level matched to Schwartz stage, plus ad-to-LP continuity. High-commitment on Problem-Aware traffic kills CVR. Low-commitment on Most-Aware retargeting wastes the slot. A URL that breaks the ad's hero promise drops the score regardless of CTA copy.

18–20 — commitment matches stage (low for Unaware/Problem-Aware, mid for Solution-Aware, high for Product-Aware/Most-Aware) AND LP keeps the promise in the first viewport.
14–17 — commitment right, LP continuity loose (ad sells a chair, LP loads the category page).
10–13 — commitment off by one tier ("Shop Now" on a $129 considered-purchase ad on cold traffic).
0–9 — commitment contradicts stage (high-commitment "Buy Now" on Unaware), or LP is a different product entirely.
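
One way to read the CTA bands is as a tier gap between the CTA's commitment level and the level the stage expects. The sketch below assumes that reading. The names `EXPECTED_TIER`, `TIERS`, and `cta_band` are hypothetical, and the rubric does not say how a one-tier gap combines with a loose LP, so this version lets the tier gap decide.

```python
# Commitment tier the rubric expects at each Schwartz stage.
EXPECTED_TIER = {
    "Unaware": 0, "Problem-Aware": 0,      # low commitment
    "Solution-Aware": 1,                   # mid commitment
    "Product-Aware": 2, "Most-Aware": 2,   # high commitment
}
TIERS = {"low": 0, "mid": 1, "high": 2}


def cta_band(stage, cta_commitment, lp_keeps_promise, lp_same_product=True):
    """Return the (min, max) CTA sub-score band for an ad."""
    gap = abs(TIERS[cta_commitment] - EXPECTED_TIER[stage])
    if gap >= 2 or not lp_same_product:
        return (0, 9)    # commitment contradicts stage, or LP is another product
    if gap == 1:
        return (10, 13)  # commitment off by one tier
    # Commitment matches the stage; LP continuity picks the band.
    return (18, 20) if lp_keeps_promise else (14, 17)


# TJ Maxx: Solution-Aware ad with a CTA that reads Product-Aware (high).
print(cta_band("Solution-Aware", "high", lp_keeps_promise=True))
```

The TJ Maxx CTA score of 12 falls inside the band this returns.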

Coherence bonus and penalties

An ad can be strong on each of the four dimensions and still flop if the dimensions don't agree on who the buyer is. Coherence is the rubric's tax on the bag-of-tricks ad.

Coherence bonus (+5 to +10) — when hook, script, visual, and CTA all read the same Schwartz stage without you having to squint. A clean Solution-Aware ad (ugly-native UGC, first-person discovery script, mid-commitment "See more" CTA, LP that loads the right category) gets the full +10. The +5 case is coherent on three of four dimensions.
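
A minimal sketch of the bonus rule, assuming each dimension has already been labeled with the Schwartz stage it reads as. The function name `coherence_bonus` and the input shape are hypothetical, and the real bonus is a heuristic that can land anywhere inside +5 to +10; this version only returns the two endpoints described above.

```python
from collections import Counter

DIMENSIONS = ("hook", "script", "visual", "cta")


def coherence_bonus(stage_reads):
    """Return the coherence bonus for a dict of dimension -> Schwartz stage."""
    missing = [d for d in DIMENSIONS if d not in stage_reads]
    if missing:
        raise ValueError(f"missing stage read for: {missing}")
    # Count how many dimensions share the most common stage.
    agreeing = Counter(stage_reads[d] for d in DIMENSIONS).most_common(1)[0][1]
    if agreeing == 4:
        return 10
    if agreeing == 3:
        return 5
    return 0


tj_maxx_reads = {
    "hook": "Solution-Aware",
    "script": "Solution-Aware",
    "visual": "Solution-Aware",
    "cta": "Product-Aware",
}
print(coherence_bonus(tj_maxx_reads))  # 5
```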

Schwartz mismatch penalty (-5 to -15) — when the dimensions disagree. The TJ Maxx example takes -5 because "Shop Now" pulls Product-Aware while the rest read Solution-Aware. -10: Problem-Aware hook with a Most-Aware retargeting offer. -15 (rare): LP is a different product line entirely.

LP-continuity penalty (-5) — separate. Triggered when the ad's hero promise isn't in the first viewport of the destination URL. Catalog ads catch this disproportionately.

Hard cap — bonuses and penalties stack, but the final score can't exceed 100 or drop below 0.
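
Putting the pieces together, the arithmetic from the TJ Maxx breakdown can be sketched as follows. It assumes the final score is the plain sum of the four sub-scores, the bonus, and the penalties, clamped to 0–100, which is what the worked example shows. The function name `final_score` is hypothetical.

```python
def final_score(sub_scores, coherence_bonus=0, penalties=()):
    """Sum the four sub-scores, apply bonus and penalties, clamp to 0-100."""
    for name, value in sub_scores.items():
        if not 0 <= value <= 20:
            raise ValueError(f"{name} sub-score must be between 0 and 20")
    subtotal = sum(sub_scores.values())
    raw = subtotal + coherence_bonus + sum(penalties)  # penalties are negative
    return max(0, min(100, raw))


tj_maxx = {"hook": 18, "script": 16, "visual": 19, "cta": 12}
print(final_score(tj_maxx, coherence_bonus=5, penalties=[-5]))  # 65
```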

What each letter actually means

Read these as operator next steps, not school grades.

A+ / A (90–100) — Scale it. Benchmark for the category. Every dimension lands, coherence bonus fires. The diagnosis isn't "this is good" — it's "find the one mechanism here that travels to your category and brief it." AdRevila scores roughly one ad in fifty at A.

A- / B+ (85–89) — Scale it, fix the one capper. The score reasoning names a single capper (usually CTA or LP continuity). Address that and the ad moves into A.

B / B- (80–84) — Solid working ad. Two or three dimensions strong, one mid. Run it; don't read it as a benchmark. B is "winning right now"; A is "winning across cohorts."

C+ / C / C- (70–79) — Workmanlike. Diagnose before scaling. The TJ Maxx example sits here. Mechanism is real but a dimension leaks — usually script vagueness or a CTA-stage mismatch. The leak compounds at higher budgets. Fix the lowest sub-score before you raise spend.

D+ / D / D- (60–69) — Real issues. Two or more dimensions sub-13. The ad may be spending — that's algorithm patience, not quality. The reasoning names one lifter and one capper. If the lifter is structural, rewrite around it.

F (0–59) — Broken. Hook doesn't earn the hold, script contradicts itself, or the dimensions disagree on who the buyer is. The penalty stack is doing real damage. Don't try to optimize an F — start over with a new brief.

The boundaries match American school grades because operators read them without translation.
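
The letter bands above reduce to a lookup on the final score. A minimal sketch: `LETTER_BANDS` and `letter_band` are hypothetical names, and because this piece does not say where the plus and minus cut-offs fall inside a band, the function returns the whole band label.

```python
# (floor, label) pairs, highest band first.
LETTER_BANDS = [
    (90, "A+ / A"),
    (85, "A- / B+"),
    (80, "B / B-"),
    (70, "C+ / C / C-"),
    (60, "D+ / D / D-"),
    (0, "F"),
]


def letter_band(score):
    """Return the letter band for a final score between 0 and 100."""
    if not 0 <= score <= 100:
        raise ValueError("final score must be between 0 and 100")
    for floor, label in LETTER_BANDS:
        if score >= floor:
            return label


print(letter_band(92))  # A+ / A
print(letter_band(75))  # C+ / C / C-
```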

Signal confidence — when the rubric is unreliable

The rubric is deterministic on three things and heuristic on the rest. Being explicit about which is which is the whole point.

Deterministic: the letter-to-score mapping, the four-dimension structure, and the penalty floor/ceiling. Same every run.

Heuristic: the sub-score on each dimension (the analyzer is Gemini 2.5 Flash at temperature 0.10 — two runs on the same ad can shift sub-scores ±2 points, though the letter rarely moves); the exact coherence bonus inside +5 to +10; the "one lifter, one capper" reasoning sentences.

The product surfaces this directly. Every bullet in your AdRevila report carries a confidence label: high, medium, or low. A low bullet is the analyzer saying "I'm inferring through 2+ steps from a single signal." When a sub-score leans on medium and low bullets, treat the score as ±3 points wider.

The score-breakdown popover in the report header shows the per-dimension sub-scores and the confidence distribution. A 75 on three high-confidence bullets is a different operator decision than a 75 on six low-confidence bullets — only the first is one you should act on.
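
The widening rule can be sketched as a small helper. This is an illustration under one loud assumption: "leans on" is read here as "more than half the bullets are medium or low," which the rubric does not define. The name `extra_margin` is hypothetical.

```python
def extra_margin(bullet_confidences):
    """Return how many points to widen a sub-score's error band."""
    if not bullet_confidences:
        raise ValueError("need at least one bullet")
    soft = sum(1 for c in bullet_confidences if c in ("medium", "low"))
    # Assumption: "leans on" means more than half the bullets are medium or low.
    return 3 if soft > len(bullet_confidences) / 2 else 0


print(extra_margin(["high", "high", "high"]))   # 0
print(extra_margin(["low", "medium", "high"]))  # 3
```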

How to use your AdRevila score (and what to ignore)

A score is an index, not a verdict.

Use it to:

Rank your own creative tests. Score your last 10 ads, sort by grade, compare dimension breakdowns of the top three vs the bottom three. The pattern beats any single ad's number.
Find the one capper. Every B and C ad has one named in the reasoning. Fix it, re-run, watch the dimension move.
Read it stage-relative. A B for Most-Aware retargeting is a different next step than a B for cold Solution-Aware. Don't compare letters across stages.
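
The ranking exercise in the first item can be sketched in a few lines. The ad records below are made-up numbers for illustration only, and `top_vs_bottom` is a hypothetical helper, not part of the product.

```python
from statistics import mean

DIMENSIONS = ("hook", "script", "visual", "cta")


def top_vs_bottom(ads, n=3):
    """Average per-dimension gap between the n best and n worst ads by score."""
    ranked = sorted(ads, key=lambda ad: ad["score"], reverse=True)
    top, bottom = ranked[:n], ranked[-n:]
    return {
        dim: mean(ad[dim] for ad in top) - mean(ad[dim] for ad in bottom)
        for dim in DIMENSIONS
    }


# Made-up scores for six ads (a real pass would use your last 10).
ads = [
    {"score": 68, "hook": 18, "script": 16, "visual": 18, "cta": 16},
    {"score": 66, "hook": 17, "script": 15, "visual": 18, "cta": 16},
    {"score": 62, "hook": 16, "script": 14, "visual": 17, "cta": 15},
    {"score": 56, "hook": 12, "script": 14, "visual": 16, "cta": 14},
    {"score": 52, "hook": 11, "script": 13, "visual": 16, "cta": 12},
    {"score": 49, "hook": 10, "script": 12, "visual": 17, "cta": 10},
]
gaps = top_vs_bottom(ads)
print(max(gaps, key=gaps.get))  # the dimension that separates winners most
```

In this made-up set the largest gap is on hook, which is the kind of pattern a single ad's number would hide.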

Ignore it when:

Two dimensions are at low confidence — the integer is too soft. Read the bullets.
You're scoring a competitor — diagnose the mechanism, don't grade them. A competitor's C outspending your A is a budget story.
The ad is under 14 days running — metadata is too thin. Re-run at 30 days.
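
The three ignore conditions can be written as a pre-flight check before acting on a score. A minimal sketch with hypothetical names; it treats "two dimensions" as "two or more," which is an assumption.

```python
def reasons_to_ignore(low_confidence_dimensions, days_running, is_competitor=False):
    """Return the reasons (if any) to set the score aside and read the bullets."""
    reasons = []
    if low_confidence_dimensions >= 2:  # assumption: "two" means two or more
        reasons.append("two or more dimensions at low confidence")
    if is_competitor:
        reasons.append("competitor ad: diagnose the mechanism, don't grade")
    if days_running < 14:
        reasons.append("under 14 days running: re-run at 30 days")
    return reasons


print(reasons_to_ignore(low_confidence_dimensions=0, days_running=45))  # []
print(reasons_to_ignore(low_confidence_dimensions=2, days_running=10))
```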

What to do this week

Monday — run your three most-spent ads from the last 30 days through AdRevila. Note the sub-score breakdown, not just the letter.
Tuesday — name the one capper from each score's reasoning. Specific: "CTA commits too high for cold Solution-Aware" beats "needs work."
Wednesday — brief one creative against the capper on your highest-spend ad. Don't rebuild; change the one thing.
Thursday — ship into a small isolated ad set so you can read the lift cleanly.
Friday — pull a competitor's ad in the same category, run it, compare dimension breakdowns. Where they out-score you, the mechanism is in the bullets — not the number.

Four weeks of this and you'll have a rubric-trained eye for your category. That's the asset worth owning. The letter on the report is just the scaffolding you used to get there.

See it in action

View the AdRevila report →

The embedded report shows the dimension breakdown, the confidence labels on every bullet, and the coherence math. Read the popover, argue with the score — that's the loop.
