Create tokens with the right data #2308

gabritto · 2025-12-09T20:48:15Z

Fixes #2127.
Fixes a crash I found while testing something else. The crash is caused because, in GetOrCreateToken, we create any kind of token with a Token data. But if the token kind is e.g. string literal, later usage of that token will assume it can get back a string literal data and will crash.

The fix is to create a token backed by the right data according to the token kind.

internal/ast/ast.go

DanielRosenwasser · 2025-12-09T21:13:05Z

I am pretty sure this might be related to #2127

internal/ast/ast.go

Copilot

Pull request overview

This PR fixes a crash that occurs when creating tokens via GetOrCreateToken. The crash happened because tokens were being created with generic Token data regardless of their kind, but code that later uses these tokens expects them to have the appropriate data structure (e.g., a string literal token should have StringLiteral data, not just Token data).

Key Changes:

Modified GetOrCreateToken to accept a TokenFlags parameter for proper token creation
Implemented a new createToken helper function that creates tokens with the appropriate data structure based on their kind
Updated all 17 call sites across the codebase to pass scanner.TokenFlags() or ast.TokenFlagsNone as appropriate
Added a test case to verify the fix prevents crashes with unterminated string literals

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Show a summary per file

File	Description
`internal/ast/ast.go`	Added `tokenFactory` field to `SourceFile`, updated `GetOrCreateToken` signature to accept `TokenFlags`, and implemented `createToken` helper to instantiate tokens with correct data types
`internal/astnav/tokens.go`	Updated 8 call sites to pass `scanner.TokenFlags()` when calling `GetOrCreateToken`
`internal/ls/utilities.go`	Updated 3 call sites to pass `scanner.TokenFlags()` when calling `GetOrCreateToken`
`internal/ls/signaturehelp.go`	Updated 1 call site to pass `scanner.TokenFlags()` when calling `GetOrCreateToken`
`internal/ls/lsutil/children.go`	Updated 2 call sites to pass `scanner.TokenFlags()` when calling `GetOrCreateToken`
`internal/ls/change/tracker.go`	Updated 3 call sites to pass `ast.TokenFlagsNone` for synthetically created tokens (modifiers, semicolons, commas)
`internal/fourslash/tests/completionsUnterminatedLiteral_test.go`	Added regression test for unterminated string literal that previously caused a crash

Copilot · 2025-12-09T23:39:45Z

internal/ast/ast.go

+		return file.tokenFactory.NewNumericLiteral(text)
+	case KindBigIntLiteral:
+		return file.tokenFactory.NewBigIntLiteral(text)
+	case KindStringLiteral:
+		return file.tokenFactory.NewStringLiteral(text)
+	case KindJsxText, KindJsxTextAllWhiteSpaces:
+		return file.tokenFactory.NewJsxText(text, kind == KindJsxTextAllWhiteSpaces)
+	case KindRegularExpressionLiteral:
+		return file.tokenFactory.NewRegularExpressionLiteral(text)
+	case KindNoSubstitutionTemplateLiteral:
+		return file.tokenFactory.NewNoSubstitutionTemplateLiteral(text)


The TokenFlags are not being set for these literal types, even though they all have LiteralLikeBase (or TemplateLiteralLikeBase) which includes a TokenFlags field. This is critical because these flags carry important information like whether a literal is unterminated (see TokenFlagsUnterminated and IsUnterminatedLiteral in utilities.go), which is exactly what this PR is trying to fix.

The flags need to be set after creating these nodes. For example:

case KindStringLiteral: node := file.tokenFactory.NewStringLiteral(text) node.AsStringLiteral().TokenFlags = flags return node case KindNumericLiteral: node := file.tokenFactory.NewNumericLiteral(text) node.AsNumericLiteral().TokenFlags = flags return node case KindBigIntLiteral: node := file.tokenFactory.NewBigIntLiteral(text) node.AsBigIntLiteral().TokenFlags = flags return node case KindJsxText, KindJsxTextAllWhiteSpaces: node := file.tokenFactory.NewJsxText(text, kind == KindJsxTextAllWhiteSpaces) node.AsJsxText().TokenFlags = flags return node case KindRegularExpressionLiteral: node := file.tokenFactory.NewRegularExpressionLiteral(text) node.AsRegularExpressionLiteral().TokenFlags = flags return node case KindNoSubstitutionTemplateLiteral: node := file.tokenFactory.NewNoSubstitutionTemplateLiteral(text) node.AsNoSubstitutionTemplateLiteral().TemplateFlags = flags return node

Note that for NoSubstitutionTemplateLiteral, it should set TemplateFlags (not the inherited TokenFlags from LiteralLikeBase) since IsUnterminatedLiteral checks TemplateFlags for template literal kinds.

Suggested change

return file.tokenFactory.NewNumericLiteral(text)

case KindBigIntLiteral:

return file.tokenFactory.NewBigIntLiteral(text)

case KindStringLiteral:

return file.tokenFactory.NewStringLiteral(text)

case KindJsxText, KindJsxTextAllWhiteSpaces:

return file.tokenFactory.NewJsxText(text, kind == KindJsxTextAllWhiteSpaces)

case KindRegularExpressionLiteral:

return file.tokenFactory.NewRegularExpressionLiteral(text)

case KindNoSubstitutionTemplateLiteral:

return file.tokenFactory.NewNoSubstitutionTemplateLiteral(text)

node := file.tokenFactory.NewNumericLiteral(text)

node.AsNumericLiteral().TokenFlags = flags

return node

case KindBigIntLiteral:

node := file.tokenFactory.NewBigIntLiteral(text)

node.AsBigIntLiteral().TokenFlags = flags

return node

case KindStringLiteral:

node := file.tokenFactory.NewStringLiteral(text)

node.AsStringLiteral().TokenFlags = flags

return node

case KindJsxText, KindJsxTextAllWhiteSpaces:

node := file.tokenFactory.NewJsxText(text, kind == KindJsxTextAllWhiteSpaces)

node.AsJsxText().TokenFlags = flags

return node

case KindRegularExpressionLiteral:

node := file.tokenFactory.NewRegularExpressionLiteral(text)

node.AsRegularExpressionLiteral().TokenFlags = flags

return node

case KindNoSubstitutionTemplateLiteral:

node := file.tokenFactory.NewNoSubstitutionTemplateLiteral(text)

node.AsNoSubstitutionTemplateLiteral().TemplateFlags = flags

return node

This is unfortunately correct, and it's very annoying e.g. NewStringLiteral doesn't take the flags because I keep forgetting about it.

Should it?

This code is also sort of the same as parseLiteralExpression which does sort of make it seem like this is all sort of error prone, yeah.

I think it should in general, as there's code in the LS that uses token flags, and code in the printer that also uses token flags, but there's tons of existing places where we don't set flags right now. I'll try and refactor that in another PR tomorrow.

gabritto · 2025-12-10T00:01:47Z

Ok, hopefully I didn't forget anything this time.

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Copilot · 2025-12-10T00:20:38Z

internal/ls/change/tracker.go

 	// insert separator immediately following the 'after' node to preserve comments in trailing trivia
 	// !!! formatcontext
-	t.ReplaceRange(sourceFile, lsproto.Range{Start: end, End: end}, sourceFile.GetOrCreateToken(separator, after.End(), after.End()+len(separatorString), after.Parent), NodeOptions{})
+	t.ReplaceRange(sourceFile, lsproto.Range{Start: end, End: end}, sourceFile.GetOrCreateToken(separator, after.End(), after.End()+len(separatorString), after.Parent, ast.TokenFlagsNone), NodeOptions{})


This creates a synthetic token with end position after.End()+len(separatorString), which extends beyond the actual text currently in the source file. The new createToken function (added in this PR at internal/ast/ast.go line 10976) performs text := file.text[pos:end], which will panic with "slice bounds out of range" when end > len(file.text).

To fix this, createToken needs to handle cases where positions extend beyond the source text. Add bounds checking:

func createToken(kind Kind, file *SourceFile, pos, end int, flags TokenFlags) *Node { if file.tokenFactory == nil { file.tokenFactory = NewNodeFactory(NodeFactoryHooks{}) } var text string if pos >= 0 && end <= len(file.text) && pos <= end { text = file.text[pos:end] } // For punctuation/keywords (default case), text isn't used anyway // For literals/identifiers created with invalid bounds, this is a programming error // that should be caught, but we can default to empty text for robustness switch kind { // ... rest of the cases } }

I noticed this too, will follow up on it in a separate PR: the change tracker should not be calling file.GetOrCreateToken() at all to create synthetic tokens.

gabritto added 2 commits December 9, 2025 18:55

properly create non-tree tokens

f4a1d35

add mutex for node factory

90a9475

gabritto requested review from andrewbranch, Copilot and jakebailey and removed request for Copilot December 9, 2025 20:48

Copilot started reviewing on behalf of gabritto December 9, 2025 20:48 View session

jakebailey reviewed Dec 9, 2025

View reviewed changes

internal/ast/ast.go Outdated Show resolved Hide resolved

jakebailey reviewed Dec 9, 2025

View reviewed changes

internal/ast/ast.go Outdated Show resolved Hide resolved

jakebailey reviewed Dec 9, 2025

View reviewed changes

internal/ast/ast.go Outdated Show resolved Hide resolved

CR

0058c38

Copilot AI review requested due to automatic review settings December 9, 2025 23:34

Copilot started reviewing on behalf of gabritto December 9, 2025 23:35 View session

jakebailey approved these changes Dec 9, 2025

View reviewed changes

gabritto enabled auto-merge December 9, 2025 23:38

Copilot AI reviewed Dec 9, 2025

View reviewed changes

gabritto disabled auto-merge December 9, 2025 23:41

properly set token flags

32220f4

gabritto requested a review from Copilot December 10, 2025 00:01

Merge branch 'main' into gabritto/fix_token

229741e

Copilot started reviewing on behalf of gabritto December 10, 2025 00:02 View session

jakebailey approved these changes Dec 10, 2025

View reviewed changes

gabritto enabled auto-merge December 10, 2025 00:15

Copilot AI reviewed Dec 10, 2025

View reviewed changes

gabritto added this pull request to the merge queue Dec 10, 2025

Merged via the queue into main with commit 543240f Dec 10, 2025
22 checks passed

gabritto deleted the gabritto/fix_token branch December 10, 2025 00:37

gabritto mentioned this pull request Dec 10, 2025

Don't use file GetOrCreateToken for change tracker #2320

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Create tokens with the right data #2308

Create tokens with the right data #2308

gabritto commented Dec 9, 2025 •

edited

Loading

Uh oh!

Uh oh!

DanielRosenwasser commented Dec 9, 2025 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Dec 9, 2025

Uh oh!

gabritto Dec 9, 2025

Uh oh!

jakebailey Dec 10, 2025

Uh oh!

gabritto Dec 10, 2025

Uh oh!

gabritto commented Dec 10, 2025

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Dec 10, 2025

Uh oh!

gabritto Dec 10, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

-		return file.tokenFactory.NewNumericLiteral(text)
-	case KindBigIntLiteral:
-		return file.tokenFactory.NewBigIntLiteral(text)
-	case KindStringLiteral:
-		return file.tokenFactory.NewStringLiteral(text)
-	case KindJsxText, KindJsxTextAllWhiteSpaces:
-		return file.tokenFactory.NewJsxText(text, kind == KindJsxTextAllWhiteSpaces)
-	case KindRegularExpressionLiteral:
-		return file.tokenFactory.NewRegularExpressionLiteral(text)
-	case KindNoSubstitutionTemplateLiteral:
-		return file.tokenFactory.NewNoSubstitutionTemplateLiteral(text)
+		node := file.tokenFactory.NewNumericLiteral(text)
+		node.AsNumericLiteral().TokenFlags = flags
+		return node
+	case KindBigIntLiteral:
+		node := file.tokenFactory.NewBigIntLiteral(text)
+		node.AsBigIntLiteral().TokenFlags = flags
+		return node
+	case KindStringLiteral:
+		node := file.tokenFactory.NewStringLiteral(text)
+		node.AsStringLiteral().TokenFlags = flags
+		return node
+	case KindJsxText, KindJsxTextAllWhiteSpaces:
+		node := file.tokenFactory.NewJsxText(text, kind == KindJsxTextAllWhiteSpaces)
+		node.AsJsxText().TokenFlags = flags
+		return node
+	case KindRegularExpressionLiteral:
+		node := file.tokenFactory.NewRegularExpressionLiteral(text)
+		node.AsRegularExpressionLiteral().TokenFlags = flags
+		return node
+	case KindNoSubstitutionTemplateLiteral:
+		node := file.tokenFactory.NewNoSubstitutionTemplateLiteral(text)
+		node.AsNoSubstitutionTemplateLiteral().TemplateFlags = flags
+		return node

Create tokens with the right data #2308

Create tokens with the right data #2308

Conversation

gabritto commented Dec 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

DanielRosenwasser commented Dec 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Dec 9, 2025

Choose a reason for hiding this comment

Uh oh!

gabritto Dec 9, 2025

Choose a reason for hiding this comment

Uh oh!

jakebailey Dec 10, 2025

Choose a reason for hiding this comment

Uh oh!

gabritto Dec 10, 2025

Choose a reason for hiding this comment

Uh oh!

gabritto commented Dec 10, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

Uh oh!

gabritto Dec 10, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

gabritto commented Dec 9, 2025 •

edited

Loading

DanielRosenwasser commented Dec 9, 2025 •

edited

Loading