CAV-RFC-001
Draft · Version 0.1.0. Three gated pillars, two supporting signals, and one north-star outcome, each with a formula, a deterministic methodology, reference agent profiles, and a CI gate. The v0.1 thresholds are seed values, to be baselined first and then tightened.
The metrics
| Metric | Role | Definition |
|---|---|---|
| CRR — Content Recovery Ratio | Gated | tokens(extract(raw HTML)) / tokens(extract(rendered HTML)). Equal token counts with different text count as a failure, not a pass. |
| SSD — Semantic Signal Density | Gated | 0.5 × signal_ratio + 0.5 × structured_coverage. Chrome stripped before measuring; coverage scored against the page-type preset. |
| ARR — Action Resolution Rate | Gated | resolved_actions / declared_actions against the accessibility-tree snapshot, diffed against a committed golden file. |
| TC — Token Cost | Supporting | cl100k_base token count of the agent representation. |
| TTFUT — Time to First Useful Token | Supporting | Wall-clock to the first chunk of post-boilerplate content. |
| AF — Answer Fidelity | North star · eval-gated | Can a constrained LLM answer canonical questions from the page alone? Temperature 0, ≥3 runs, majority agreement. |
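
The formulas in the table can be sketched in a few lines of Python. This is an illustrative reading, not the reference implementation: the function names are hypothetical, the whitespace tokenizer is a stand-in (the spec names cl100k_base for TC), and scoring a mismatch or an empty page as 0.0 is an assumption about how "failure" is encoded.

```python
from collections import Counter


def count_tokens(text: str) -> int:
    # Stand-in tokenizer (whitespace split); an assumption for this sketch.
    return len(text.split())


def crr(raw_text: str, rendered_text: str) -> float:
    """Content Recovery Ratio: tokens(raw extract) / tokens(rendered extract)."""
    rendered = count_tokens(rendered_text)
    if rendered == 0:
        return 0.0  # assumption: an empty rendered extract scores 0
    raw = count_tokens(raw_text)
    # Equal counts with different text is a failure, not a pass.
    # Encoding that failure as 0.0 is an assumption of this sketch.
    if raw == rendered and raw_text.split() != rendered_text.split():
        return 0.0
    return raw / rendered


def ssd(signal_ratio: float, structured_coverage: float) -> float:
    """Semantic Signal Density: equal-weight blend of the two components."""
    return 0.5 * signal_ratio + 0.5 * structured_coverage


def arr(declared: set, resolved: set) -> float:
    """Action Resolution Rate: resolved_actions / declared_actions."""
    if not declared:
        return 1.0  # assumption: nothing declared means nothing to resolve
    return len(declared & resolved) / len(declared)


def majority_answer(answers: list):
    """Return the answer given by a strict majority of runs, else None."""
    if len(answers) < 3:
        raise ValueError("AF calls for at least 3 runs")
    answer, votes = Counter(answers).most_common(1)[0]
    return answer if votes * 2 > len(answers) else None
```

For example, `crr("a b c", "a b c d")` evaluates to 0.75, while `crr("a b c", "a b x")` evaluates to 0.0 because the counts match but the text does not.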
Thresholds (v0.1)
| Metric | Good | Needs Work | Poor |
|---|---|---|---|
| CRR | ≥ 0.95 | ≥ 0.80 | < 0.80 |
| SSD | ≥ 0.60 | ≥ 0.40 | < 0.40 |
| ARR | = 1.00 | ≥ 0.90 | < 0.90 |
| TC | < 4,000 | < 8,000 | ≥ 8,000 |
| AF | ≥ 0.95 | ≥ 0.80 | < 0.80 |
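
One way to turn the threshold table into a CI check is sketched below. The band boundaries come from the table, read top down (the first matching band wins). The names `THRESHOLDS`, `classify`, and `gate`, and the choice to fail the build only when a gated pillar lands in "Poor", are assumptions made for illustration; the spec's own gate may differ.

```python
# metric: (good_bound, needs_work_bound, higher_is_better)
THRESHOLDS = {
    "CRR": (0.95, 0.80, True),
    "SSD": (0.60, 0.40, True),
    "ARR": (1.00, 0.90, True),   # "Good" is exactly 1.00; ARR cannot exceed 1
    "TC": (4000, 8000, False),   # token count: lower is better
    "AF": (0.95, 0.80, True),
}

# The three gated pillars from the metrics table.
GATED = ("CRR", "SSD", "ARR")


def classify(metric: str, value: float) -> str:
    """Map a metric value to its v0.1 band."""
    good, needs_work, higher_is_better = THRESHOLDS[metric]
    if higher_is_better:
        if value >= good:
            return "Good"
        if value >= needs_work:
            return "Needs Work"
        return "Poor"
    if value < good:
        return "Good"
    if value < needs_work:
        return "Needs Work"
    return "Poor"


def gate(scores: dict) -> tuple:
    """Return (passed, failing_metrics).

    Assumption: only a "Poor" band on a gated pillar fails the build.
    """
    failures = [m for m in GATED if classify(m, scores[m]) == "Poor"]
    return (not failures, failures)
```

Calling `gate({"CRR": 0.97, "SSD": 0.35, "ARR": 1.0})` reports a failure on SSD, since 0.35 falls below the 0.40 bound.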
Is this validated?
Yes. The metrics are calibrated against a downstream outcome, not asserted. CRR, the cheapest pillar to compute, reliably predicts whether a model can recover facts from a page: across 46 pages it separates readable pages from invisible ones at ROC AUC = 0.95, with synthetic canaries confirming that the outcome measures page-reading and not prior knowledge (priors-leak 0.00). The rank correlation is only moderate (Spearman ρ ≈ 0.5), reflecting a bimodal corpus: pages are either legible or invisible, with little middle ground.
The canonical specification, including measurement edge cases and reference agent profiles, is published here, with human and machine views on one page: CAV-RFC-001.
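
For readers unfamiliar with the statistic, ROC AUC is the probability that a randomly chosen readable page receives a higher CRR than a randomly chosen invisible one, with ties counting half. The sketch below computes it by direct pairwise comparison. The function name and the toy values are made up for illustration and are not the 46-page corpus.

```python
def roc_auc(scores, labels):
    """Pairwise ROC AUC: P(score of a positive > score of a negative)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count half
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): CRR per page, and whether facts were recovered.
toy_crr = [0.98, 0.97, 0.10, 0.05, 0.90, 0.20]
toy_readable = [1, 1, 0, 0, 0, 1]
toy_auc = roc_auc(toy_crr, toy_readable)
```

An AUC of 1.0 means CRR ranks every readable page above every invisible one, and 0.5 means it does no better than chance.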