The YAML TextMate highlighter is flat (derived from the token stream), so it retries the same top-level patterns at every token boundary. #docstart / #docend sit in that top-level loop, and their regexes have no line-start anchor:
"docstart": { "match": "---(?=[\\t ]|\\r|\\n|$)", "name": "entity.other.document.begin.yaml" },
"docend": { "match": "\\.\\.\\.(?=[\\t ]|\\r|\\n|$)", "name": "entity.other.document.end.yaml" }
So a --- / ... that opens a value (or a sequence-item value) is scoped as a document marker, even though a YAML directives-end / document-end marker is only meaningful at column 0.
Repro — all three are valid YAML (yaml package reports 0 errors)
| input | yaml oracle | Monogram | official RedCMD |
| --- | --- | --- | --- |
| note: --- not a marker | {note: "--- not a marker"} | --- → entity.other.document.begin.yaml ✗ | string.unquoted.plain.out.yaml ✓ |
| x: ... bar | {x: "... bar"} | ... → entity.other.document.end.yaml ✗ | string ✓ |
| - --- x | ["--- x"] | --- → entity.other.document.begin.yaml ✗ | string ✓ |
A --- in the middle of a value (a: b --- c → {a: "b --- c"}) is already correct — the leading plain scalar consumes it. Only a value-leading --- / ... is mis-scoped.
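The three repros can be checked against the quoted patterns in isolation. The sketch below is only an approximation: it uses JS `RegExp` with the sticky flag as a stand-in for Oniguruma, and `matchesAt` is a hypothetical helper, not part of gen-tm. The columns are where each value starts.

```typescript
// JS RegExp stand-ins for the #docstart / #docend patterns quoted above.
// Assumption: the sticky flag approximates "try this pattern at this position".
const docStart = new RegExp('---(?=[\\t ]|\\r|\\n|$)', 'y');
const docEnd = new RegExp('\\.\\.\\.(?=[\\t ]|\\r|\\n|$)', 'y');

// Hypothetical helper: does `re` match when tried exactly at `column`?
function matchesAt(re: RegExp, line: string, column: number): boolean {
  re.lastIndex = column;
  return re.test(line);
}

// [line, pattern, column where the value starts]
const repros: Array<[string, RegExp, number]> = [
  ['note: --- not a marker', docStart, 6],
  ['x: ... bar', docEnd, 3],
  ['- --- x', docStart, 2],
];

// One boolean per repro: true means the unanchored marker pattern fires mid-line.
const results = repros.map(([line, re, column]) => matchesAt(re, line, column));
console.log(results);
```

Every entry matching at a non-zero column is the mis-scoping described above: nothing in the pattern itself ties it to the start of the line.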
Root cause
In yaml.ts the markers are tokens with no positional condition:
const DocStart = token(seq('---', docMarkerEnd), { scope: 'entity.other.document.begin' });
const DocEnd = token(seq('...', docMarkerEnd), { scope: 'entity.other.document.end' });
The parser constrains them structurally: DocStart / DocEnd are only accepted in Stream / document positions, so the CST is correct. But gen-tm emits the token pattern into every context's pattern list, dropping that structural constraint. The official (RedCMD) grammar avoids the problem by placing its (equally unanchored) marker pattern only inside #document, which is included once at stream top-level; a value goes through #block-node's plain rules, which don't include the marker.
This is the flat-highlighter analogue of a position constraint the parser gets for free from grammar structure.
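A toy model may make the failure mode concrete. The rule list below is hypothetical and far simpler than the generated grammar (the `key` and `plain` patterns are invented for illustration); it only mimics the behaviour described above, where the same flat list is retried at every token boundary.

```typescript
type Rule = { name: string; re: RegExp };

// Toy flat rule list, tried in order at every token boundary.
// Hypothetical: only `docstart` resembles the real pattern.
const rules: Rule[] = [
  { name: 'docstart', re: new RegExp('---(?=[\\t ]|$)', 'y') },
  { name: 'key', re: new RegExp('[A-Za-z]+:(?=[\\t ]|$)', 'y') },
  { name: 'plain', re: new RegExp('[^\\s:#-][^:#]*', 'y') },
];

// Returns "rule@column" for each token found on a single line.
function tokenize(line: string): string[] {
  const out: string[] = [];
  let pos = 0;
  while (pos < line.length) {
    if (line[pos] === ' ' || line[pos] === '\t') { pos++; continue; }
    let matched = false;
    for (const rule of rules) {
      rule.re.lastIndex = pos;
      const m = rule.re.exec(line);
      if (m && m[0].length > 0) {
        out.push(`${rule.name}@${pos}`);
        pos += m[0].length;
        matched = true;
        break;
      }
    }
    if (!matched) pos++; // skip characters no toy rule covers
  }
  return out;
}

console.log(tokenize('note: --- not a marker'));
console.log(tokenize('a: b --- c'));
```

In this model the value-leading `---` is picked up by `docstart` because the list is retried right after the key, while in `a: b --- c` the `plain` rule starts at `b` and swallows the rest of the line before `docstart` gets another chance.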
Fix direction
Constrain the document markers to line start (column 0). A YAML --- / ... marker is only valid at the start of a line, so anchoring #docstart / #docend (or gating the DocStart / DocEnd tokens on line-start in gen-tm) should be safe.
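One possible shape of the anchored patterns, as a sketch rather than the actual change: a leading `^` is added to each regex. This assumes `^` in the emitted grammar behaves as a line-start anchor (TextMate tokenizes one line at a time); JS `RegExp` with the sticky flag stands in for Oniguruma, and `matchesAt` is a hypothetical helper.

```typescript
// Anchored variants of the quoted patterns (assumed form, not the shipped fix).
const anchoredDocStart = new RegExp('^---(?=[\\t ]|\\r|\\n|$)', 'y');
const anchoredDocEnd = new RegExp('^\\.\\.\\.(?=[\\t ]|\\r|\\n|$)', 'y');

// Hypothetical helper: does `re` match when tried exactly at `column`?
function matchesAt(re: RegExp, line: string, column: number): boolean {
  re.lastIndex = column;
  return re.test(line);
}

const checks = {
  // The three repros: the marker text starts after column 0.
  valueLeadingStart: matchesAt(anchoredDocStart, 'note: --- not a marker', 6),
  valueLeadingEnd: matchesAt(anchoredDocEnd, 'x: ... bar', 3),
  seqItemStart: matchesAt(anchoredDocStart, '- --- x', 2),
  // Legitimate markers at column 0.
  lineStartStart: matchesAt(anchoredDocStart, '--- # new document', 0),
  lineStartEnd: matchesAt(anchoredDocEnd, '...', 0),
};

console.log(checks);
```

With the anchor in place the three repro positions should no longer match, while markers at column 0 still do. Whether the anchor belongs in the emitted regex or in a gen-tm gate on the DocStart / DocEnd tokens is left open here.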
Acceptance
- The three repros scope --- / ... as string content, not document markers.
- Legitimate line-start --- / ... document separators still scope as entity.other.document.*.
- node src/cli.ts yaml.ts keeps the parser CST and the other six grammars byte-identical; scope-gap:yaml does not regress.
Related: #24