Reading the proof-point

Phase 2 ends with a single block of numbers. It is the one thing you have to read before you approve the persist step. This lesson walks through every field, what counts as good, and what to do when something is off.

The shape of the block

After scripts/validate.sh finishes, Phase 2 prints a summary that looks like this:

Validation complete.
- 14 props verified against source (Button: 6, TextInput: 4, Checkbox: 2, InputWrapper: 2)
- 47 tokens grep-resolved (color: 28, space: 12, type: 7)
- 0 assets in scope this run
- 6 foundation-rules extracted (5 cited, 1 [VERIFY])
- Wiring extracted from acme/marketing@app/layout.tsx (next-app, 28 lines, 1 CSS file lifted, 12 tokens consumed, 12 covered)
- TOKEN_COVERAGE=PASS
- CITATION_VERIFICATION=PASS (CITES_CHECKED=prose:31 claimed:33 skipped=upstream:0 repo:2 url:6)
- 0 hallucinations
- 3 open [VERIFY] markers:
  1. Button.md:42 - loading-state prop name not confirmed in types file
  2. InputWrapper.md:18 - validation slot signature absent from public types; inferred from docs
  3. tokens.md:74 - `--mantine-color-blue-6` cited by foundation URL but no grep-resolve in @mantine/core@7.x
 
Approve to persist? (Reply "go" to write to .claude/skills/acme-ui/.)

Not every line is always present. Foundation and wiring lines appear only when those sources were in scope at Phase 1. The fields that are always there: props, tokens, assets, hallucinations, the [VERIFY] tally.

Field by field

N props verified against source

The deterministic typecheck passed for N props across the extracted components. Each prop in your skill's per-component files was rendered as a typed assignment and compiled against the DS's published types. Negative claims ("never accepts a foo") were rendered as @ts-expect-error lines.

Read it as: "every prop the skill claims actually exists, and every prop the skill says is impossible is in fact rejected."
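A minimal sketch of what a typed assignment and a negative claim can look like. The `ButtonProps` interface and the `foo` prop here are hypothetical stand-ins; a real run compiles against the DS's published types, and the actual generated file may be shaped differently.

```typescript
// Hypothetical stand-in for a design system's published prop types.
// A real check would import these from the DS package instead.
interface ButtonProps {
  variant?: "filled" | "outline";
  disabled?: boolean;
  onClick?: () => void;
}

// Positive claim: "Button accepts variant and disabled".
// This line compiles only if both props exist with compatible types.
const positive: ButtonProps = { variant: "filled", disabled: true };

// Negative claim: "Button never accepts a foo prop".
// The directive makes the compiler fail if the next line does NOT error.
// @ts-expect-error foo is not a prop on ButtonProps
const negative: ButtonProps = { foo: "bar" };

console.log(positive.variant, Object.keys(negative).length);
```

If the design system later adds a `foo` prop, the `@ts-expect-error` line stops erroring and the compile fails, which is how a stale negative claim gets caught.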

M tokens grep-resolved

M token names cited in the extraction were grep-found in the DS's source token file. A token cited but not resolved would show up in the [VERIFY] tally below.

Read it as: "every token name in the skill is one the package actually ships."
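The grep-resolve idea can be sketched as a set-membership check. This assumes the token source declares tokens as CSS custom properties (`--name: value`); a real token file may be JSON or TypeScript, and `unresolvedTokens` is a hypothetical helper, not part of the validator.

```typescript
// Returns the cited token names that are not declared in the token source text.
function unresolvedTokens(cited: string[], tokenSource: string): string[] {
  const declared: Record<string, boolean> = {};
  // Match "--some-name" only when it is followed by a colon (a declaration,
  // not a var(--some-name) usage).
  const pattern = /(--[A-Za-z0-9_-]+)\s*:/g;
  let m: RegExpExecArray | null = pattern.exec(tokenSource);
  while (m !== null) {
    const declaredName = m[1];
    if (declaredName !== undefined) {
      declared[declaredName] = true;
    }
    m = pattern.exec(tokenSource);
  }
  return cited.filter((token) => declared[token] !== true);
}

// Illustrative token source and citations (names are made up).
const tokenFile = ":root { --acme-color-fg: #111; --acme-space-3: 12px; }";
const unresolved = unresolvedTokens(
  ["--acme-color-fg", "--acme-space-3", "--acme-color-fg-muted"],
  tokenFile,
);
console.log(unresolved);
```

Anything left in the returned list is a candidate for a [VERIFY] marker rather than a resolved token.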

K assets grep-resolved

K icon, logo, or illustration names were grep-found in the asset package's exports. 0 assets in scope this run means no asset package was passed at Phase 1, which is fine.

F foundation-rules extracted (X cited, Y `[VERIFY]`)

Appears only when at least one [docs:foundation] URL was accepted at Phase 1. F is the number of prose foundation rules pulled (token-pairing, mode-aware, contrast-minimum, semantic-role, fallback-element). X is the count fully cited to a CSS custom property the package ships. Y is the count carrying a [VERIFY] marker, usually because the docs name a variable that the installed package does not ship.

Watch for: if you accepted a foundation URL and this line reads 0 foundation-rules extracted, the problem is in Phase 1 (wrong URL, wrong page), not in Phase 2. Go back and re-pick the URL.

Wiring extracted from `<ref>@<entry>`

Appears only when an [example:project] reference project was passed at Phase 1. The fields in parentheses are, in order: the auto-detected framework (Vite, Next.js App, Next.js Pages, or CRA; next-app in the sample above), the line count of the lifted root entry file, the count of companion CSS files pulled verbatim from depth-3 imports, and the tokens consumed and tokens covered counts. The last two must match.

Watch for: framework=unknown means auto-detection failed and the produced Setup section will carry a [VERIFY] in place of the framework name.

`TOKEN_COVERAGE=<verdict>`

Asserts every var(--X) consumed by the lifted exemplars resolves through one of the lifted CSS @import lines. Three verdicts:

| Verdict | Meaning | What to do |
| --- | --- | --- |
| PASS | Every consumed token has a covering import. | Proceed. |
| NOOP | Zero `var(--X)` consumed. Typical for Tailwind-style apps. | Proceed. |
| FAIL | Lifted imports do not cover every consumed token. Per-var `MISSING:` rows print above. | Either close the gap (add the missing imports to the scratch by hand and re-run), or go back to discovery (you may have picked the wrong reference project). |

FAIL blocks the wait-for-approval gate. The block is deliberate. A skill that fails token coverage will produce code that references CSS variables nothing defines.
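The three verdicts reduce to a small decision rule. The sketch below is an illustration under assumptions, not the validator's implementation: `tokenCoverage` is a hypothetical function, and both inputs are assumed to be already-extracted lists of custom property names.

```typescript
type Verdict = "PASS" | "NOOP" | "FAIL";

interface CoverageResult {
  verdict: Verdict;
  missing: string[]; // consumed tokens with no covering definition
}

// consumed: var(--X) names used by the lifted exemplars.
// covered: names defined through the lifted CSS imports.
function tokenCoverage(consumed: string[], covered: string[]): CoverageResult {
  if (consumed.length === 0) {
    // Nothing consumes a CSS variable, so there is nothing to cover.
    return { verdict: "NOOP", missing: [] };
  }
  const missing = consumed.filter((token) => covered.indexOf(token) === -1);
  return { verdict: missing.length === 0 ? "PASS" : "FAIL", missing };
}

// Illustrative input: one covered token, two uncovered ones.
const coverage = tokenCoverage(
  ["--acme-color-fg", "--acme-color-fg-muted", "--acme-space-3"],
  ["--acme-color-fg"],
);
for (const token of coverage.missing) {
  console.log(`MISSING: var(${token}) not covered`);
}
console.log(`TOKEN_COVERAGE=${coverage.verdict}`);
```

Note that NOOP is decided before any comparison happens, which is why an app with no `var(--X)` usage never reaches FAIL.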

`CITATION_VERIFICATION=<verdict>`

Every file:line citation in the extraction was re-read mechanically. PASS means every cite still points to text that supports the claim. FAIL means at least one cite drifted, was unsupported, or was unregistered. The CITES_CHECKED breakdown shows how many were prose, claimed, or skipped (because the cited file lives in an upstream package, a hosted repo, or behind a URL).

Watch for: a high skipped=upstream count means most cites point at code you do not have on disk. The skill is still cited, but the citations were not re-read this run.
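If you want to track the skipped counts across runs, the breakdown can be pulled apart mechanically. This sketch assumes the line always has the exact shape shown in the sample summary; `parseCitesChecked` is a hypothetical helper, and the real output format may vary.

```typescript
interface CitesChecked {
  prose: number;
  claimed: number;
  skippedUpstream: number;
  skippedRepo: number;
  skippedUrl: number;
}

// Parses the CITES_CHECKED breakdown, or returns null if the shape differs.
function parseCitesChecked(line: string): CitesChecked | null {
  const m =
    /CITES_CHECKED=prose:(\d+) claimed:(\d+) skipped=upstream:(\d+) repo:(\d+) url:(\d+)/.exec(line);
  if (m === null) {
    return null;
  }
  return {
    prose: Number(m[1]),
    claimed: Number(m[2]),
    skippedUpstream: Number(m[3]),
    skippedRepo: Number(m[4]),
    skippedUrl: Number(m[5]),
  };
}

const cites = parseCitesChecked(
  "- CITATION_VERIFICATION=PASS (CITES_CHECKED=prose:31 claimed:33 skipped=upstream:0 repo:2 url:6)",
);
if (cites !== null) {
  const skippedTotal = cites.skippedUpstream + cites.skippedRepo + cites.skippedUrl;
  console.log(`skipped this run: ${skippedTotal}`);
}
```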

0 hallucinations

The total number of fabricated facts caught by the checks above. If this number is non-zero, the skill is making claims that cannot be grounded anywhere, and the run should not be approved.

Read it as: "nothing in the skill was invented from memory." The target is always zero. A non-zero count is a stop sign.

The `[VERIFY]` tally

Every fact the agent could not fully ground gets a [VERIFY] marker inline at the point of extraction. The Phase 2 summary lists them with file path, line number, and a one-line reason. The list is the most important part of the proof-point block. Each marker needs a decision before you approve.

| Decision | When to pick it |
| --- | --- |
| Accept as known limitation | The gap is real and acceptable. The DS genuinely does not expose the thing in source. The rule is still useful as guidance even without a citation. |
| Re-read source | The agent missed something on first pass. You know the prop or token exists, just not where the agent looked. Send it back to widen the search. |
| Drop the rule | The rule does not actually hold. The agent generalised from a docs page that turned out to be wrong. Tell it to remove the rule and re-run. |

An undecided [VERIFY] at the end of Phase 3 is a defect. A decided one is not.
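One way to keep track of decisions while reviewing is to record them next to each marker and list whatever is still open. The types and the `undecided` helper below are hypothetical bookkeeping for the reviewer, not something the tool emits; the decisions attached to the sample markers are made up for illustration.

```typescript
type Decision = "accept" | "re-read" | "drop";

interface VerifyMarker {
  file: string;
  line: number;
  reason: string;
  decision?: Decision; // absent until the reviewer has decided
}

// Returns the markers that still need a decision.
function undecided(markers: VerifyMarker[]): VerifyMarker[] {
  return markers.filter((marker) => marker.decision === undefined);
}

const markers: VerifyMarker[] = [
  { file: "Button.md", line: 42, reason: "loading-state prop name not confirmed in types file", decision: "re-read" },
  { file: "InputWrapper.md", line: 18, reason: "validation slot signature absent from public types", decision: "accept" },
  { file: "tokens.md", line: 74, reason: "token cited by foundation URL but not grep-resolved" },
];

const pending = undecided(markers);
for (const marker of pending) {
  console.log(`${marker.file}:${marker.line} still needs a decision`);
}
```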

Three shapes you will see in practice

Clean approve

14 props verified, 47 tokens grep-resolved,
0 assets in scope, no foundation in scope,
no wiring in scope, 0 hallucinations,
0 open [VERIFY] markers

Reply go. Phase 3 will write the skill.

Approve with notes

14 props verified, 47 tokens grep-resolved,
6 foundation-rules extracted (5 cited, 1 [VERIFY]),
TOKEN_COVERAGE=PASS, 0 hallucinations,
2 open [VERIFY] markers

Read each [VERIFY]. Decide: accept, re-read, drop. If you accept all of them, reply go. The accepted markers will appear in the Phase 3 closing tally.

Send back to iterate

14 props verified, 47 tokens grep-resolved,
0 foundation-rules extracted (0 cited, 0 [VERIFY]),
TOKEN_COVERAGE=FAIL,
MISSING: var(--acme-color-fg-muted) not covered
MISSING: var(--acme-space-3) not covered
1 hallucinations,
4 open [VERIFY] markers

Do not approve. Each failing line has its own fix:

- The foundation URL was wrong. Restart Phase 1 and pick a better URL.
- The reference project does not cover the tokens the exemplars use. Restart Phase 1 and pick a different reference project, or add the missing imports manually in scratch and re-run.
- The hallucination needs to be tracked down. Read the [VERIFY] list to find which fact the agent could not ground, then re-run with that source clarified.

The approval question

The summary closes with "Approve to persist?" Reply go only when you can answer yes to all three: hallucinations are zero, every [VERIFY] has a decision, and TOKEN_COVERAGE is not FAIL.
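The three conditions can be written down as a checklist function. This is a sketch of the reviewer's own reasoning, assuming the numbers have already been read off the summary; `ProofPoint` and `approvalBlockers` are hypothetical names.

```typescript
interface ProofPoint {
  hallucinations: number;
  undecidedVerify: number; // [VERIFY] markers without a decision yet
  tokenCoverage: "PASS" | "NOOP" | "FAIL" | null; // null when no wiring was in scope
}

// Returns the reasons not to reply "go"; an empty list means all three hold.
function approvalBlockers(p: ProofPoint): string[] {
  const reasons: string[] = [];
  if (p.hallucinations !== 0) {
    reasons.push(`${p.hallucinations} hallucination(s) reported`);
  }
  if (p.undecidedVerify > 0) {
    reasons.push(`${p.undecidedVerify} [VERIFY] marker(s) without a decision`);
  }
  if (p.tokenCoverage === "FAIL") {
    reasons.push("TOKEN_COVERAGE=FAIL");
  }
  return reasons;
}

const cleanRun: ProofPoint = { hallucinations: 0, undecidedVerify: 0, tokenCoverage: null };
const sendBack: ProofPoint = { hallucinations: 1, undecidedVerify: 4, tokenCoverage: "FAIL" };

console.log(approvalBlockers(cleanRun).length === 0 ? "reply go" : "do not approve");
for (const reason of approvalBlockers(sendBack)) {
  console.log(`blocked: ${reason}`);
}
```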

What to take away

- The proof-point block is the one thing you read before approving Phase 3.
- 0 hallucinations is non-negotiable.
- TOKEN_COVERAGE=FAIL blocks the gate on purpose. Fix the gap or pick a different reference project.
- Every [VERIFY] needs a decision (accept, re-read, drop). An undecided one becomes a defect in the persisted skill.
- Iteration is cheap. Scratch is gitignored, so nothing lands in your repo until you reply go.

Primary source: .claude/skills/extract-ds-skill/SKILL.md (Phase 2 worked example) and references/validate.md (full proof-point contract, TOKEN_COVERAGE verdicts, citation-verification).