-
Notifications
You must be signed in to change notification settings - Fork 760
Create tokens with the right data #2308
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I am pretty sure this might be related to #2127 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes a crash that occurs when creating tokens via GetOrCreateToken. The crash happened because tokens were being created with generic Token data regardless of their kind, but code that later uses these tokens expects them to have the appropriate data structure (e.g., a string literal token should have StringLiteral data, not just Token data).
Key Changes:
- Modified
GetOrCreateTokento accept aTokenFlagsparameter for proper token creation - Implemented a new
createTokenhelper function that creates tokens with the appropriate data structure based on their kind - Updated all 17 call sites across the codebase to pass
scanner.TokenFlags()orast.TokenFlagsNoneas appropriate - Added a test case to verify the fix prevents crashes with unterminated string literals
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
internal/ast/ast.go |
Added tokenFactory field to SourceFile, updated GetOrCreateToken signature to accept TokenFlags, and implemented createToken helper to instantiate tokens with correct data types |
internal/astnav/tokens.go |
Updated 8 call sites to pass scanner.TokenFlags() when calling GetOrCreateToken |
internal/ls/utilities.go |
Updated 3 call sites to pass scanner.TokenFlags() when calling GetOrCreateToken |
internal/ls/signaturehelp.go |
Updated 1 call site to pass scanner.TokenFlags() when calling GetOrCreateToken |
internal/ls/lsutil/children.go |
Updated 2 call sites to pass scanner.TokenFlags() when calling GetOrCreateToken |
internal/ls/change/tracker.go |
Updated 3 call sites to pass ast.TokenFlagsNone for synthetically created tokens (modifiers, semicolons, commas) |
internal/fourslash/tests/completionsUnterminatedLiteral_test.go |
Added regression test for unterminated string literal that previously caused a crash |
internal/ast/ast.go
Outdated
| return file.tokenFactory.NewNumericLiteral(text) | ||
| case KindBigIntLiteral: | ||
| return file.tokenFactory.NewBigIntLiteral(text) | ||
| case KindStringLiteral: | ||
| return file.tokenFactory.NewStringLiteral(text) | ||
| case KindJsxText, KindJsxTextAllWhiteSpaces: | ||
| return file.tokenFactory.NewJsxText(text, kind == KindJsxTextAllWhiteSpaces) | ||
| case KindRegularExpressionLiteral: | ||
| return file.tokenFactory.NewRegularExpressionLiteral(text) | ||
| case KindNoSubstitutionTemplateLiteral: | ||
| return file.tokenFactory.NewNoSubstitutionTemplateLiteral(text) |
Copilot
AI
Dec 9, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The TokenFlags are not being set for these literal types, even though they all have LiteralLikeBase (or TemplateLiteralLikeBase) which includes a TokenFlags field. This is critical because these flags carry important information like whether a literal is unterminated (see TokenFlagsUnterminated and IsUnterminatedLiteral in utilities.go), which is exactly what this PR is trying to fix.
The flags need to be set after creating these nodes. For example:
case KindStringLiteral:
node := file.tokenFactory.NewStringLiteral(text)
node.AsStringLiteral().TokenFlags = flags
return node
case KindNumericLiteral:
node := file.tokenFactory.NewNumericLiteral(text)
node.AsNumericLiteral().TokenFlags = flags
return node
case KindBigIntLiteral:
node := file.tokenFactory.NewBigIntLiteral(text)
node.AsBigIntLiteral().TokenFlags = flags
return node
case KindJsxText, KindJsxTextAllWhiteSpaces:
node := file.tokenFactory.NewJsxText(text, kind == KindJsxTextAllWhiteSpaces)
node.AsJsxText().TokenFlags = flags
return node
case KindRegularExpressionLiteral:
node := file.tokenFactory.NewRegularExpressionLiteral(text)
node.AsRegularExpressionLiteral().TokenFlags = flags
return node
case KindNoSubstitutionTemplateLiteral:
node := file.tokenFactory.NewNoSubstitutionTemplateLiteral(text)
node.AsNoSubstitutionTemplateLiteral().TemplateFlags = flags
return nodeNote that for NoSubstitutionTemplateLiteral, it should set TemplateFlags (not the inherited TokenFlags from LiteralLikeBase) since IsUnterminatedLiteral checks TemplateFlags for template literal kinds.
| return file.tokenFactory.NewNumericLiteral(text) | |
| case KindBigIntLiteral: | |
| return file.tokenFactory.NewBigIntLiteral(text) | |
| case KindStringLiteral: | |
| return file.tokenFactory.NewStringLiteral(text) | |
| case KindJsxText, KindJsxTextAllWhiteSpaces: | |
| return file.tokenFactory.NewJsxText(text, kind == KindJsxTextAllWhiteSpaces) | |
| case KindRegularExpressionLiteral: | |
| return file.tokenFactory.NewRegularExpressionLiteral(text) | |
| case KindNoSubstitutionTemplateLiteral: | |
| return file.tokenFactory.NewNoSubstitutionTemplateLiteral(text) | |
| node := file.tokenFactory.NewNumericLiteral(text) | |
| node.AsNumericLiteral().TokenFlags = flags | |
| return node | |
| case KindBigIntLiteral: | |
| node := file.tokenFactory.NewBigIntLiteral(text) | |
| node.AsBigIntLiteral().TokenFlags = flags | |
| return node | |
| case KindStringLiteral: | |
| node := file.tokenFactory.NewStringLiteral(text) | |
| node.AsStringLiteral().TokenFlags = flags | |
| return node | |
| case KindJsxText, KindJsxTextAllWhiteSpaces: | |
| node := file.tokenFactory.NewJsxText(text, kind == KindJsxTextAllWhiteSpaces) | |
| node.AsJsxText().TokenFlags = flags | |
| return node | |
| case KindRegularExpressionLiteral: | |
| node := file.tokenFactory.NewRegularExpressionLiteral(text) | |
| node.AsRegularExpressionLiteral().TokenFlags = flags | |
| return node | |
| case KindNoSubstitutionTemplateLiteral: | |
| node := file.tokenFactory.NewNoSubstitutionTemplateLiteral(text) | |
| node.AsNoSubstitutionTemplateLiteral().TemplateFlags = flags | |
| return node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is unfortunately correct, and it's very annoying e.g. NewStringLiteral doesn't take the flags because I keep forgetting about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it?
This code is also sort of the same as parseLiteralExpression which does sort of make it seem like this is all sort of error prone, yeah.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should in general, as there's code in the LS that uses token flags, and code in the printer that also uses token flags, but there's tons of existing places where we don't set flags right now. I'll try and refactor that in another PR tomorrow.
|
Ok, hopefully I didn't forget anything this time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
| // insert separator immediately following the 'after' node to preserve comments in trailing trivia | ||
| // !!! formatcontext | ||
| t.ReplaceRange(sourceFile, lsproto.Range{Start: end, End: end}, sourceFile.GetOrCreateToken(separator, after.End(), after.End()+len(separatorString), after.Parent), NodeOptions{}) | ||
| t.ReplaceRange(sourceFile, lsproto.Range{Start: end, End: end}, sourceFile.GetOrCreateToken(separator, after.End(), after.End()+len(separatorString), after.Parent, ast.TokenFlagsNone), NodeOptions{}) |
Copilot
AI
Dec 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This creates a synthetic token with end position after.End()+len(separatorString), which extends beyond the actual text currently in the source file. The new createToken function (added in this PR at internal/ast/ast.go line 10976) performs text := file.text[pos:end], which will panic with "slice bounds out of range" when end > len(file.text).
To fix this, createToken needs to handle cases where positions extend beyond the source text. Add bounds checking:
func createToken(kind Kind, file *SourceFile, pos, end int, flags TokenFlags) *Node {
if file.tokenFactory == nil {
file.tokenFactory = NewNodeFactory(NodeFactoryHooks{})
}
var text string
if pos >= 0 && end <= len(file.text) && pos <= end {
text = file.text[pos:end]
}
// For punctuation/keywords (default case), text isn't used anyway
// For literals/identifiers created with invalid bounds, this is a programming error
// that should be caught, but we can default to empty text for robustness
switch kind {
// ... rest of the cases
}
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed this too, will follow up on it in a separate PR: the change tracker should not be calling file.GetOrCreateToken() at all to create synthetic tokens.
Fixes #2127.
Fixes a crash I found while testing something else. The crash is caused because, in
GetOrCreateToken, we create any kind of token with aTokendata. But if the token kind is e.g. string literal, later usage of that token will assume it can get back a string literal data and will crash.The fix is to create a token backed by the right data according to the token kind.