Motivation
#23 and #24 are valid-input highlighter bugs, yet the scope-gap metric reported neither of them. The metric is not broken: fed the two inputs directly, grading catches #23 and the differential catches #24. The corpus (yaml-test-suite 351 + issue12 10 = 416) simply contains no shape that triggers them. The metric is corpus-bound: "100% / monogramWrong 0" only means "no error on the 416 inputs it has seen". Hand-adding shapes, which is how #23/#24 were found, is sampling; it can't prove no other shape is missed (Dijkstra: testing shows the presence of bugs, not their absence).
But Monogram has a lever a normal highlighter doesn't: the source IS a grammar. The same yaml.ts / typescript.ts object the parser, highlighter and tree-sitter derive from is also a generator — walk it (alt=branch, seq=concat, many/opt=repeat, token=sample) and it emits guaranteed-legal inputs. That replaces "hope the corpus contains the shape" with systematic coverage.
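A minimal sketch of "the grammar is also a generator", over a toy combinator IR. The Rule type, the Choose callback and generate are hypothetical stand-ins, not the real src/api.ts combinators; the cap on many() (maxRepeat) is likewise an assumption to keep output finite.

```typescript
// Hypothetical toy IR; Monogram's real combinators may look different.
type Rule =
  | { kind: "token"; samples: string[] } // terminal: pick one sample text
  | { kind: "seq"; items: Rule[] }       // concat
  | { kind: "alt"; options: Rule[] }     // branch
  | { kind: "opt"; item: Rule }          // zero or one
  | { kind: "many"; item: Rule };        // zero or more (capped)

// choose(n) returns an index in [0, n); it drives every decision of the walk.
type Choose = (n: number) => number;

function generate(rule: Rule, choose: Choose, maxRepeat = 2): string[] {
  switch (rule.kind) {
    case "token":
      return [rule.samples[choose(rule.samples.length)]];
    case "seq": {
      const out: string[] = [];
      for (const item of rule.items) out.push(...generate(item, choose, maxRepeat));
      return out;
    }
    case "alt":
      return generate(rule.options[choose(rule.options.length)], choose, maxRepeat);
    case "opt":
      return choose(2) === 1 ? generate(rule.item, choose, maxRepeat) : [];
    case "many": {
      const count = choose(maxRepeat + 1); // 0..maxRepeat repetitions
      const out: string[] = [];
      for (let i = 0; i < count; i++) out.push(...generate(rule.item, choose, maxRepeat));
      return out;
    }
  }
}

// Illustrative grammar: "[" (item ("," item)*)? "]"
const itemRule: Rule = { kind: "token", samples: ["a", "b"] };
const commaItem: Rule = { kind: "seq", items: [{ kind: "token", samples: [","] }, itemRule] };
const listRule: Rule = {
  kind: "seq",
  items: [
    { kind: "token", samples: ["["] },
    { kind: "opt", item: { kind: "seq", items: [itemRule, { kind: "many", item: commaItem }] } },
    { kind: "token", samples: ["]"] },
  ],
};

const takeFirst: Choose = () => 0;     // always the first branch, zero repeats
const takeLast: Choose = (n) => n - 1; // always the last branch, max repeats

const smallest = generate(listRule, takeFirst).join(""); // "[]"
const largest = generate(listRule, takeLast).join("");   // "[b,b,b]"
```

Every output is legal by construction because each step only follows a production. Swapping the deterministic choosers for a seeded random one would give the fuzzing mode.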
Part 1 — the method
(a) Grammar-based input generation. One generic generator over the shared combinator IR (all 7 languages import the same src/api.ts combinators → one driver covers every language). Two modes:
Bounded-exhaustive enumeration: every derivation of depth ≤ N. Exhaustive for depth ≤ N by construction; the small-scope hypothesis is the argument that a small N already exposes most bugs. This turns "which shapes get tested" from imagination into grammar × bound. (E.g. - - a\n - b\n- c, the #24 shape (nested compact sequence item swallowed by plain-scalar fold), is just SeqItem's value being another BlockSequence + many(Newline, SeqItem): a depth-3 derivation the enumerator produces without anyone thinking of it.)
Grammar fuzzing — random production choices for deeper/larger structures.
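A sketch of the bounded-exhaustive mode, assuming a toy IR with named rules. Rule, Grammar, cross and enumerate are hypothetical names, not Monogram's API. Depth is spent on each rule reference, and many() is capped at maxRepeat so the result stays finite.

```typescript
// Hypothetical toy IR with named rules; not the real src/api.ts types.
type Rule =
  | { kind: "token"; text: string }
  | { kind: "seq"; items: Rule[] }
  | { kind: "alt"; options: Rule[] }
  | { kind: "opt"; item: Rule }
  | { kind: "many"; item: Rule }
  | { kind: "ref"; name: string };

type Grammar = { [name: string]: Rule };

// Every way to append one string from `parts` to one string from `prefixes`.
function cross(prefixes: string[], parts: string[]): string[] {
  const out: string[] = [];
  for (const a of prefixes) for (const b of parts) out.push(a + b);
  return out;
}

// All strings derivable from `rule` using at most `depth` nested rule references.
function enumerate(g: Grammar, rule: Rule, depth: number, maxRepeat: number): string[] {
  switch (rule.kind) {
    case "token":
      return [rule.text];
    case "ref": {
      if (depth <= 0) return []; // budget exhausted: this branch yields nothing
      const target = g[rule.name];
      if (!target) throw new Error("unknown rule: " + rule.name);
      return enumerate(g, target, depth - 1, maxRepeat);
    }
    case "alt": {
      const out: string[] = [];
      for (const option of rule.options) out.push(...enumerate(g, option, depth, maxRepeat));
      return out;
    }
    case "seq": {
      let acc = [""];
      for (const item of rule.items) acc = cross(acc, enumerate(g, item, depth, maxRepeat));
      return acc;
    }
    case "opt":
      return [""].concat(enumerate(g, rule.item, depth, maxRepeat));
    case "many": {
      const parts = enumerate(g, rule.item, depth, maxRepeat);
      const out = [""];
      let layer = [""];
      for (let i = 0; i < maxRepeat; i++) {
        layer = cross(layer, parts);
        out.push(...layer);
      }
      return out;
    }
  }
}

const tok = (text: string): Rule => ({ kind: "token", text });
const ref = (name: string): Rule => ({ kind: "ref", name });

// Illustrative grammar: a value is an atom or a bracketed sequence of values.
const nested: Grammar = {
  Value: { kind: "alt", options: [tok("a"), ref("Seq")] },
  Seq: {
    kind: "seq",
    items: [
      tok("["),
      ref("Value"),
      { kind: "many", item: { kind: "seq", items: [tok(","), ref("Value")] } },
      tok("]"),
    ],
  },
};

const shapes = enumerate(nested, ref("Seq"), 4, 1);
// 12 strings, including "[[a],a]": a nested sequence followed by a sibling.
```

The nested-then-sibling shape appears without anyone listing it, which is the bracket analogue of the nested sequence item example above. The number of shapes grows quickly with depth and maxRepeat, so both bounds need to stay small.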
The only per-language cost is structural-token materialization: token-stream languages (TS/JS/TSX/JSX) join tokens directly; markup (HTML/Vue) needs a tag stack + raw-text body; YAML needs an indent stack mirroring the lexer to turn Indent/Dedent/Newline into real indentation. This hook has the same shape as gen-lexer's per-language config — it stays agnostic. (Ambiguity — JSX-vs-generic, regex-vs-divide, ASI — is a parsing problem; when generating you pick a derivation, so it never arises. The generator only honours the context combinators already in the grammar: sameLine, regexContext, not(), exclude.)
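For the YAML case, the materialization hook could be sketched as below. StructTok, materialize and the two-space step are assumptions for illustration; the real lexer's token names and indentation rules may differ, and joining same-line tokens with a single space is a simplification.

```typescript
// Hypothetical structural-token shape; the real lexer's tokens may differ.
type StructTok =
  | { kind: "Indent" }
  | { kind: "Dedent" }
  | { kind: "Newline" }
  | { kind: "Text"; text: string };

function spaces(n: number): string {
  return new Array(n + 1).join(" ");
}

// Turn Indent/Dedent/Newline into real indentation using an indent stack.
function materialize(tokens: StructTok[], step = 2): string {
  const stack: number[] = [0]; // indentation columns of the currently open blocks
  let out = "";
  let atLineStart = true;
  for (const t of tokens) {
    switch (t.kind) {
      case "Indent":
        stack.push(stack[stack.length - 1] + step);
        break;
      case "Dedent":
        if (stack.length === 1) throw new Error("Dedent without matching Indent");
        stack.pop();
        break;
      case "Newline":
        out += "\n";
        atLineStart = true;
        break;
      case "Text":
        // First token on a line gets the current indentation; later ones a single space.
        out += atLineStart ? spaces(stack[stack.length - 1]) : " ";
        out += t.text;
        atLineStart = false;
        break;
    }
  }
  return out;
}

const yamlTokens: StructTok[] = [
  { kind: "Text", text: "a:" }, { kind: "Newline" },
  { kind: "Indent" },
  { kind: "Text", text: "b:" }, { kind: "Text", text: "1" }, { kind: "Newline" },
  { kind: "Dedent" },
  { kind: "Text", text: "c:" }, { kind: "Text", text: "2" }, { kind: "Newline" },
];

const yamlText = materialize(yamlTokens); // "a:\n  b: 1\nc: 2\n"
```

The walker stays generic; only this function knows that YAML structure is expressed as leading whitespace.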
(b) Judging. Generated inputs feed:
Round-trip: generate(structure) → parse → recover structure. Needs NO external oracle; validates parser + tree-sitter internal consistency for all 7 languages.
Highlighter internal consistency: #23 and #24 are exactly this kind of inconsistency, so this check auto-surfaces the whole class without anyone naming the shape.
Neutral oracle (yaml package / tsc / parse5) for highlighter correctness where internal consistency isn't enough.
Differential vs official grammars (already in scope-gap), now on generated inputs instead of a fixed corpus.
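The round-trip judge can be sketched on a toy structure. Tree, render and parseTree are hypothetical; parseTree stands in for the real parser, and atoms are assumed to be non-empty runs of lowercase letters.

```typescript
// A toy structure: an atom or a list of structures.
type Tree = string | Tree[];

// generate(structure): print the structure as text.
function render(t: Tree): string {
  return typeof t === "string" ? t : "[" + t.map((child) => render(child)).join(",") + "]";
}

// parse: a tiny recursive-descent parser standing in for the real one.
function parseTree(src: string): Tree {
  let pos = 0;
  function value(): Tree {
    if (src.charAt(pos) === "[") {
      pos++; // consume "["
      const items: Tree[] = [];
      if (src.charAt(pos) === "]") { pos++; return items; }
      while (true) {
        items.push(value());
        if (src.charAt(pos) === ",") { pos++; continue; }
        if (src.charAt(pos) === "]") { pos++; return items; }
        throw new Error("expected , or ] at " + pos);
      }
    }
    const start = pos;
    while (pos < src.length && /[a-z]/.test(src.charAt(pos))) pos++;
    if (pos === start) throw new Error("expected atom at " + pos);
    return src.slice(start, pos);
  }
  const result = value();
  if (pos !== src.length) throw new Error("trailing input at " + pos);
  return result;
}

// recover structure: the parsed tree must equal the one we started from.
function roundTrips(t: Tree): boolean {
  return JSON.stringify(parseTree(render(t))) === JSON.stringify(t);
}

const sample: Tree = ["a", ["b", "c"], []];
const ok = roundTrips(sample); // true
```

No reference implementation is consulted: the generated structure is its own expected answer, which is why this check scales to every language the walker covers.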
Part 2 — the test-suite cleanup it enables (to land alongside)
The suite is 80 files / ~11k lines. Most is essential matrix — 7 languages × 4 targets (parser / TextMate / Monarch / tree-sitter) × several metrics — and is not compressible without dropping coverage. Adopting the generator lets us retire/reshape three slices:
A. Delete dev-only scratch / superseded probes (~500 lines, ~9 files; confirm each isn't a CI gate first): parser-gap.ts (254 — superseded by run-conformance + src-coverage-ts, reads an external /tmp path), yaml-diag.ts / yaml-poc.ts (self-labeled "Throwaway"), diag.ts / prof.ts / ts-ast.ts / bench-vs-ts.ts / bench-vs-ts-agg.ts / classify-ts.ts (dev diagnostics).
B. Per-language adapters → data-driven (~400–550 lines, 13 files → ~4): the thin scope-gap-{ts,js,jsx,tsx,html,yaml,vue} and src-coverage-{6} adapters are mostly boilerplate (corpus path, grammar path, scopeName, oracle pick). Collapse to one driver + a config table (the oracle files are already separate), invoked as scope-gap <lang>. This is the same agnostic/data-driven principle the engines already follow — pushing the harness from ~80% to 100% agnostic. Caveat: the html / yaml adapters are thicker and may not fully fold in; keep per-language entry points via a parameter.
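A sketch of the "one driver + a config table" shape. The interface, field names and corpus placeholders are hypothetical and only mirror the four things the adapters are said to vary (corpus path, grammar path, scopeName, oracle pick); the values are illustrative, not the repo's.

```typescript
// Hypothetical config shape; not the repo's actual adapter fields.
interface ScopeGapConfig {
  corpusPath: string;  // where inputs come from (placeholder values below)
  grammarPath: string; // grammar source file
  scopeName: string;   // TextMate root scope
  oracle: string;      // which neutral oracle to consult
}

const scopeGapTable: { [lang: string]: ScopeGapConfig } = {
  ts: { corpusPath: "<ts corpus>", grammarPath: "typescript.ts", scopeName: "source.ts", oracle: "tsc" },
  yaml: { corpusPath: "<yaml corpus>", grammarPath: "yaml.ts", scopeName: "source.yaml", oracle: "yaml" },
};

// The per-language entry point survives as a parameter: scope-gap <lang>.
function resolveConfig(lang: string): ScopeGapConfig {
  const cfg = scopeGapTable[lang];
  if (!cfg) {
    throw new Error("unknown language: " + lang + " (known: " + Object.keys(scopeGapTable).join(", ") + ")");
  }
  return cfg;
}

const yamlConfig = resolveConfig("yaml");
```

Adding a language becomes a table row rather than a new file. Thicker adapters such as html / yaml might need an optional hook field on top of this.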
C. Corpus loaders → generator source. The corpus role in scope-gap / src-coverage becomes a generator. The hand-written *-issue-cases and yaml-issue12-regressions stay as named regressions: generation replaces their discovery function, not their documentation/gate value.
Honest boundary — what generation does NOT replace
Alignment to external references (*-conformance vs tsc, src-coverage's official-branch metric, scope-gap's vs-official differential, repo-compat): round-trip only proves the derivation chain is self-consistent — never that the parser matches tsc's semantic boundary. These stay, unchanged.
Negative / error tests (verify-rejects, the reject half of conformance): generation emits only legal inputs; rejecting illegal ones needs a separate mutation axis.
*-smoke, perf-bench, issue-table, coverage-table, agnostic (structural / perf / README): untouched.
Net: not fewer lines — coverage upgraded from corpus sampling to bounded-exhaustive, the consistency harnesses reused as the spine, scratch deleted, and per-language boilerplate folded into one data-driven driver. What's actually replaced is the corpus's "representativeness bet" (no longer betting yaml-test-suite happens to contain a #23/#24 shape) and the ongoing labour of hand-writing new cases to chase coverage.
Acceptance (incremental)
A generic grammar walker emitting legal inputs for any Monogram grammar (bounded-exhaustive to depth N) + the per-language structural-token materialization hook.
Round-trip gate: generated inputs parse for all 7 languages.
Generated inputs reproduce the known highlighter bugs (#23 and #24).
scope-gap / src-coverage adapters folded into a data-driven driver, with the per-language entry point preserved as a parameter.
Related: #23, #24 (the bugs that motivated this).