Nodejs provider: GetDocumentUris counts files twice when Location and workspaceFolders are both set #1035

@tsanders-rh

Summary

The nodejs provider's GetDocumentUris() method in symbol_cache_helper.go counts files twice during provider preparation when both InitConfig.Location and ProviderSpecificConfig["workspaceFolders"] are set to the same directory.

Impact

  • Provider preparation progress shows 2x the actual file count (e.g., 1604 instead of 802 files)
  • Symbol cache operations process the same files multiple times
  • Performance degradation during the preparation phase, because the same directory is scanned more than once
  • Confusing user experience with incorrect progress reporting

Root Cause

File: external-providers/generic-external-provider/pkg/server_configurations/nodejs/symbol_cache_helper.go

Method: nodejsSymbolSearchHelper.GetDocumentUris() (lines 29-73)

Bug: The method adds workspace folders to additionalPaths without checking if they duplicate the primaryPath (which comes from Location).

func (h *nodejsSymbolSearchHelper) GetDocumentUris(conditionsByCap ...provider.ConditionsByCap) []uri.URI {
    primaryPath := h.config.Location  // Line 30: Set from InitConfig.Location
    if after, ok := strings.CutPrefix(primaryPath, fmt.Sprintf("%s://", uri.FileScheme)); ok {
        primaryPath = after
    }
    additionalPaths := []string{}
    if val, ok := h.config.ProviderSpecificConfig["workspaceFolders"].([]string); ok {
        for _, path := range val {
            if after, prefixOk := strings.CutPrefix(path, fmt.Sprintf("%s://", uri.FileScheme)); prefixOk {
                path = after
            }
            if primaryPath == "" {
                primaryPath = path  // Line 41: Only skips if primaryPath is empty
                continue
            }
            additionalPaths = append(additionalPaths, path)  // Line 44: BUG - adds duplicate!
        }
    }

    // FileSearcher will scan both BasePath and AdditionalPaths
    searcher := provider.FileSearcher{
        BasePath:        primaryPath,        // Scans /path/to/source
        AdditionalPaths: additionalPaths,    // Also scans /path/to/source (duplicate!)
        // ...
    }
    // ... rest of function
}

Problem: When Location = "/path/to/source" and workspaceFolders = ["/path/to/source"], the code:

  1. Sets primaryPath = "/path/to/source" (from Location)
  2. Adds /path/to/source to additionalPaths (from workspaceFolders)
  3. FileSearcher scans the same directory twice
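
The three steps above can be reproduced in isolation. The following is a minimal sketch, not the provider's code: `collectPaths` is a hypothetical stand-in that copies only the path-selection logic from the excerpt, and it assumes the scheme prefix built from `uri.FileScheme` is `file://`.

```go
package main

import (
	"fmt"
	"strings"
)

// collectPaths mirrors the path-selection logic of the excerpt in isolation.
// It is a simplified, hypothetical stand-in, not the provider's actual method.
func collectPaths(location string, workspaceFolders []string) (string, []string) {
	const filePrefix = "file://" // assumed value of "<uri.FileScheme>://"
	primaryPath := strings.TrimPrefix(location, filePrefix)
	additionalPaths := []string{}
	for _, path := range workspaceFolders {
		path = strings.TrimPrefix(path, filePrefix)
		if primaryPath == "" {
			// The only case in which a workspace folder is not appended.
			primaryPath = path
			continue
		}
		// No comparison against primaryPath, so a matching folder is kept.
		additionalPaths = append(additionalPaths, path)
	}
	return primaryPath, additionalPaths
}

func main() {
	// Location and workspaceFolders point at the same directory.
	base, extra := collectPaths("/path/to/source", []string{"file:///path/to/source"})
	fmt.Println(base, extra) // prints "/path/to/source [/path/to/source]"

	// Empty Location: the first workspace folder becomes the primary path.
	base, extra = collectPaths("", []string{"file:///path/to/source"})
	fmt.Println(base, extra) // prints "/path/to/source []"
}
```

In the first call the same directory ends up in both return values, which is the state that leads FileSearcher to scan it twice.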

Reproduction

Test Case 1: tackle2-ui repository

config := provider.InitConfig{
    Location: "/Users/tsanders/Workspace/tackle2-ui",  // 802 JS/TS files
    ProviderSpecificConfig: map[string]interface{}{
        "workspaceFolders": []interface{}{"file:///Users/tsanders/Workspace/tackle2-ui"},
    },
}

Expected: 802 files
Actual: 1604 files (802 × 2)

Test Case 2: Empty Location (workaround)

config := provider.InitConfig{
    Location: "",  // Empty!
    ProviderSpecificConfig: map[string]interface{}{
        "workspaceFolders": []interface{}{"file:///Users/tsanders/Workspace/tackle2-ui"},
    },
}

Result: 802 files ✅ (correct because lines 41-42 set primaryPath from the first workspace folder and skip adding it to additionalPaths)
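
One detail may matter when reproducing either test case. The excerpt under Root Cause asserts `workspaceFolders` as `[]string`, while the configs above build it as `[]interface{}`. In plain Go, a type assertion to `[]string` does not succeed on a `[]interface{}` value, so the value is presumably converted somewhere before it reaches `GetDocumentUris()` (this is an assumption and has not been verified against the provider). The sketch below shows the difference and a hypothetical helper, `toStringSlice`, that accepts both shapes.

```go
package main

import "fmt"

// toStringSlice accepts either []string or []interface{} holding only strings.
// Hypothetical helper for illustration; it is not part of analyzer-lsp.
func toStringSlice(v interface{}) ([]string, bool) {
	switch vals := v.(type) {
	case []string:
		return vals, true
	case []interface{}:
		out := make([]string, 0, len(vals))
		for _, item := range vals {
			s, ok := item.(string)
			if !ok {
				return nil, false
			}
			out = append(out, s)
		}
		return out, true
	default:
		return nil, false
	}
}

func main() {
	cfg := map[string]interface{}{
		"workspaceFolders": []interface{}{"file:///path/to/source"},
	}
	// A direct assertion to []string fails for a []interface{} value.
	_, direct := cfg["workspaceFolders"].([]string)
	folders, converted := toStringSlice(cfg["workspaceFolders"])
	fmt.Println(direct, converted, folders) // prints "false true [file:///path/to/source]"
}
```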

Proposed Fix

Option 1: Deduplicate paths (Recommended)

func (h *nodejsSymbolSearchHelper) GetDocumentUris(conditionsByCap ...provider.ConditionsByCap) []uri.URI {
    primaryPath := h.config.Location
    if after, ok := strings.CutPrefix(primaryPath, fmt.Sprintf("%s://", uri.FileScheme)); ok {
        primaryPath = after
    }
    additionalPaths := []string{}
    if val, ok := h.config.ProviderSpecificConfig["workspaceFolders"].([]string); ok {
        for _, path := range val {
            if after, prefixOk := strings.CutPrefix(path, fmt.Sprintf("%s://", uri.FileScheme)); prefixOk {
                path = after
            }
            if primaryPath == "" {
                primaryPath = path
                continue
            }
            // FIX: Skip workspace folders that duplicate primaryPath
            if path == primaryPath {
                continue
            }
            additionalPaths = append(additionalPaths, path)
        }
    }
    // ... rest of function
}
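
The exact-match comparison in Option 1 covers the reported case. A slightly broader variant is sketched below, under the assumption that paths differing only by a trailing slash, and folders repeated within `workspaceFolders`, should also count as duplicates. The names `normalize` and `uniquePaths` are hypothetical and the `file://` prefix is assumed; the sketch also skips empty entries, which the original code does not do.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// normalize strips an optional file:// prefix and cleans the path so that
// "/src", "/src/" and "/src/." compare equal. Hypothetical helper.
func normalize(p string) string {
	p = strings.TrimPrefix(p, "file://")
	if p == "" {
		return "" // filepath.Clean("") would return ".", so keep empty as empty
	}
	return filepath.Clean(p)
}

// uniquePaths returns the primary path and the additional paths with
// duplicates of the primary path and of each other removed.
func uniquePaths(location string, workspaceFolders []string) (string, []string) {
	primaryPath := normalize(location)
	seen := map[string]bool{}
	if primaryPath != "" {
		seen[primaryPath] = true
	}
	additionalPaths := []string{}
	for _, folder := range workspaceFolders {
		path := normalize(folder)
		if path == "" {
			continue // assumption: empty entries carry no information
		}
		if primaryPath == "" {
			primaryPath = path
			seen[path] = true
			continue
		}
		if seen[path] {
			continue // duplicate of the primary path or an earlier folder
		}
		seen[path] = true
		additionalPaths = append(additionalPaths, path)
	}
	return primaryPath, additionalPaths
}

func main() {
	base, extra := uniquePaths("/path/to/source",
		[]string{"file:///path/to/source/", "/other", "/other/."})
	fmt.Println(base, extra) // prints "/path/to/source [/other]"
}
```

This still treats a workspace folder nested inside the primary path, or a symlink to it, as distinct. Whether those cases should be collapsed as well depends on the answer to the first question for maintainers below.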

Option 2: Document behavior and require callers to avoid duplicates

Add documentation to GetDocumentUris() clarifying that callers must not set both Location and workspaceFolders to the same value.

Recommendation: Option 1 is preferred because it's more defensive and prevents user errors.

Workaround (for downstream consumers)

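Until this is fixed, consumers can work around the issue by leaving Location empty for the nodejs provider, so that the first workspace folder becomes the primary path: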

// For nodejs provider only: Leave Location empty to prevent duplicate counting
location := util.SourceMountPath
if providerName == util.NodeJSProvider {
    location = "" // workspaceFolders is already set in providerSpecificConfig
}

providerConfig := provider.Config{
    InitConfig: []provider.InitConfig{
        {
            Location: location,  // Empty for nodejs to avoid duplicate
            ProviderSpecificConfig: map[string]interface{}{
                "workspaceFolders": []interface{}{fmt.Sprintf("file://%s", util.SourceMountPath)},
            },
        },
    },
}

Related Issues

This issue was discovered while implementing provider preparation progress reporting in kantra (konveyor/kantra#XXX).

Questions for Maintainers

  1. Is this the intended behavior? Should workspaceFolders be additive to Location, or is one meant to override the other?

  2. Does this affect other providers? The generic provider (Go, Python) doesn't seem to have this issue, but should we check them too?

  3. Breaking change concerns? Fixing this might change file counts for existing users. Should we add a deprecation period or config flag?

Environment

  • analyzer-lsp version: v0.9.0-alpha.1.0.20251205151422-c5b85678c415
  • Provider: nodejs (generic-external-provider)
  • OS: macOS (but affects all platforms)

Additional Context

The nodejs provider is the only provider that processes workspaceFolders in this way. Other providers (Go, Python) don't set workspaceFolders in their config, so they don't exhibit this issue.

Metadata

  • Assignees: none
  • Labels: needs-kind, needs-priority, needs-triage
  • Status: ✅ Done