Skip to content

Conversation

@tsanders-rh
Copy link
Contributor

@tsanders-rh tsanders-rh commented Dec 6, 2025

Summary

Adds real-time progress reporting for provider preparation phase, showing file processing progress for all providers in both hybrid and containerless modes.

Changes

This PR consists of 6 commits implementing provider preparation progress support:

Feature Implementation (Commits 1-5)

  1. Initial progress reporter setup - Added progress display handler for provider preparation stage
  2. PrepareProgressReporter configuration - Set PrepareProgressReporter only on InitConfig level to avoid duplicate progress events
  3. Nodejs workaround - Added temporary workaround to prevent duplicate file counting in nodejs provider (fixed in commit 6)
  4. Progress formatting fix - Fixed newline handling to prevent blank lines after prepare progress
  5. Documentation - Added reference to analyzer-lsp#1035 explaining nodejs workaround

Cleanup (Commit 6)

  1. Remove workaround - Updated analyzer-lsp to version with fix (analyzer-lsp#1036), removed nodejs file counting workaround

Key Features

  • ✅ Real-time file processing progress for all providers
  • ✅ Works in both hybrid and containerless modes
  • ✅ Thread-safe implementation using channels
  • ✅ Consistent progress bar formatting with checkmarks
  • ✅ Proper cleanup with context cancellation
  • ✅ Respects --no-progress flag

Testing

Comprehensive testing completed:

  • ✅ Hybrid mode + source analysis (tackle2-ui: 802 files)
  • ✅ Containerless mode + source analysis (tackle2-ui: 802 files)
  • ✅ Error handling during provider preparation
  • --no-progress flag behavior

Verification

Before (bug): nodejs provider counted 1604 files (duplicate counting)
After (fixed): nodejs provider counts 802 files correctly

Dependencies

Example Output

Running source analysis...
  ✓ Created volume
  ✓ Started provider containers
  ✓ Initialized providers (builtin, nodejs)
  ✓ Started rules engine
  ✓ Preparing nodejs provider 100% |█████████████████████████| 802/802 files
  ✓ Loaded 231 rules
  ✓ Processing rules 100% |█████████████████████████| 231/231 rules

Related Issues

Code Review

All items from code review addressed:

  • ✅ Thread-safe implementation
  • ✅ Consistent API across modes
  • ✅ Proper error handling
  • ✅ Clear comments
  • ✅ Comprehensive testing
  • ✅ No security issues

Ready to merge.

Summary by CodeRabbit

  • New Features

    • Progress reporting now starts earlier, is shown during provider preparation and initialization, and is consistently used throughout analysis.
    • Progress bar prefix updated with a clearer checkmark indicator.
  • Bug Fixes

    • Removed duplicate progress setup and extra blank lines in rule-loading/execution output for cleaner terminal display.
    • Minor error-message formatting clarified.
  • Chores

    • Toolchain and dependency versions updated.

✏️ Tip: You can customize this high-level summary in your review settings.

This change integrates the prepare progress reporting feature from
analyzer-lsp PR #1027, enabling visual progress bars during provider
initialization showing file processing status.

Key changes:
- Updated analyzer-lsp dependency to include prepare progress support
- Moved setupProgressReporter() earlier in both hybrid and containerless modes
- Added progressReporter parameter to provider setup functions
- Configured PrepareProgressReporter for all provider configs
- Removed duplicate progress reporter initialization

The progress reporter lifecycle now spans the entire analysis:
1. Created before provider setup
2. Used during provider preparation (new)
3. Used during rule execution (existing)
4. Cleaned up via deferred cancel

This provides users with real-time feedback during provider preparation,
showing percentage completion, visual bars, and file counts.

Signed-off-by: tsanders <[email protected]>
Add case for progress.StageProviderPrepare in setupProgressReporter to
display provider preparation progress bars. Without this handler, prepare
progress events were being sent by providers but not displayed to users.

This completes the prepare progress reporting feature by ensuring that
progress events during the provider preparation phase are rendered as
progress bars showing file processing status.

Example output:
  Preparing nodejs provider  46% |███████████░░░░░░░░░░░░░░| 546/1184

Note: Requires provider images built with analyzer-lsp that includes
PR #1027 (GRPC provider preparation progress reporting).

Signed-off-by: tsanders <[email protected]>
Only set PrepareProgressReporter on InitConfig level, not on the main
Config level, to avoid potential duplicate progress events.

Also improved the prepare progress display to show the provider name
and message (e.g., "Preparing nodejs provider") instead of the hardcoded
"Processing rules" label.

Note: File counts displayed during prepare (e.g., 1604 files for tackle2-ui)
come directly from the provider and may include additional files beyond source
code, such as type definitions from node_modules. This is expected behavior
from the analyzer-lsp provider.

Example output:
  Preparing nodejs provider  46% |███████████░░░░░░░░░░░░░░| 746/1604

Signed-off-by: tsanders <[email protected]>
Discovered that nodejs provider's GetDocumentUris was counting files twice
because it uses BOTH InitConfig.Location (as primaryPath) and
ProviderSpecificConfig["workspaceFolders"] (as additionalPaths), even when
they point to the same directory.

Root cause: analyzer-lsp nodejs GetDocumentUris() adds all workspace folders
to additionalPaths without checking if they match the primaryPath, resulting
in FileSearcher scanning the same directory twice.

Workaround: For nodejs provider in hybrid mode, set workspaceFolders in
ProviderSpecificConfig but leave InitConfig.Location empty. This way
GetDocumentUris uses the first workspace folder as primaryPath (via the
'continue' logic when primaryPath is empty) and doesn't add it again to
additionalPaths.

This fix reduces the file count from 1604 (802×2) to 802 for tackle2-ui
analysis, matching the correct count.

Future: Should file an issue in analyzer-lsp to fix GetDocumentUris to check
for duplicate paths before adding to additionalPaths.

Signed-off-by: tsanders <[email protected]>
The prepare progress bar uses \r (carriage return) to update in place without
printing a final newline. This caused the 'Loaded X rules' message to appear
on the same line as the final prepare progress update.

Added shownPrepareProgress flag to track when prepare progress was displayed,
and print a newline before the 'Loaded rules' message to ensure proper
formatting and alignment.

Signed-off-by: tsanders <[email protected]>
Add link to upstream issue tracking the GetDocumentUris file double-counting
bug. This documents why we use the workaround and will help us remember to
remove it once the upstream bug is fixed.

Signed-off-by: tsanders <[email protected]>
@coderabbitai
Copy link

coderabbitai bot commented Dec 6, 2025

Walkthrough

Progress reporting is created earlier and threaded into provider initialization across containerless, hybrid, and standard analysis modes. Provider setup functions now accept a progress.ProgressReporter and attach PrepareProgressReporter adapters to InitConfig entries; provider-prepare progress rendering and minor UI/message tweaks were added. Go/Docker dependencies were also updated.

Changes

Cohort / File(s) Summary
Containerless provider setup
cmd/analyze-bin.go
Import progress; create progress reporter early in RunAnalysisContainerless; pass reporter to setupJavaProvider and setupBuiltinProvider; update their signatures to accept progress.ProgressReporter; attach PrepareProgressReporter to InitConfig entries; remove duplicate reporter creation.
Hybrid provider setup
cmd/analyze-hybrid.go
Create progress reporter earlier in RunAnalysisHybridInProcess; update setupNetworkProvider and setupBuiltinProviderHybrid signatures to accept progress.ProgressReporter; propagate reporter into provider setup and attach PrepareProgressReporter to InitConfig entries; minor comment typo fix.
Standard progress UI
cmd/analyze.go
Add handling for progress.StageProviderPrepare to render provider-preparation progress on stderr; adjust rule-loading progress messages to avoid extra blank lines while preserving cumulative totals.
Tests
cmd/analyze_hybrid_inprocess_test.go
Update test calls to setupBuiltinProviderHybrid to pass the new fourth argument (progress reporter, here nil).
Container error formatting
pkg/container/container.go
Change error wrapping from fmt.Errorf(errBytes.String()) to fmt.Errorf("%s", errBytes.String()) in two error paths.
Build & deps
Dockerfile, go.mod
Bump builder image to golang:1.24.0; set module Go version to 1.24.0; add/update multiple dependencies including github.com/konveyor/analyzer-lsp/..., otel/grpc/x/* modules, and others.

Sequence Diagram(s)

sequenceDiagram
    participant CLI as Analyzer CLI
    participant PR as progress.ProgressReporter
    participant Prep as Provider Preparer (setup*)
    participant Prov as Provider (Java/Builtin/Network)
    participant Rules as Rule Executor

    CLI->>PR: create reporter (early)
    CLI->>Prep: call setup*Provider(..., PR)
    Prep->>PR: attach PrepareProgressReporter to InitConfig entries
    Prep->>Prov: initialize provider with InitConfig
    Prov-->>PR: report provider-prepare progress (StageProviderPrepare)
    CLI->>Rules: execute rules (uses PR for StageRuleParsing/Execution)
    Rules-->>PR: report rule parsing/execution progress
    PR-->>CLI: render progress bars and final newline
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Areas to focus on:

  • Verify all call sites updated to new provider setup signatures (containerless, hybrid, tests).
  • Confirm reporter lifecycle (creation, cancellation, newline on completion) is correct and leak-free.
  • Ensure PrepareProgressReporter attachment to InitConfig does not mutate semantics unexpectedly.
  • Check compatibility of dependency and Go version upgrades with build/test pipeline.

Possibly related PRs

Suggested reviewers

  • eemcmullan
  • shawn-hurley
  • dymurray

Poem

🐰 I hopped in with a cheerful tap,
Threaded progress through every lap,
Bars ticked bright, providers woke to play,
Checks and lines led the startup way,
A tiny hop — analysis on its map 🥕

Pre-merge checks and finishing touches

❌ Failed checks (3 warnings)
Check name Status Explanation Resolution
Linked Issues check ⚠️ Warning The PR implements progress reporting for provider preparation but does not address the root cause of issue #1035 (duplicate file counting in nodejs provider). The PR works around the duplicate counting but does not fix the underlying GetDocumentUris deduplication logic. The PR should include the deduplication fix in GetDocumentUris() to properly address issue #1035, or clarify that the fix is being handled in a separate analyzer-lsp PR and reference those issue numbers explicitly.
Out of Scope Changes check ⚠️ Warning The PR includes out-of-scope changes: Go version bumps (1.23.9→1.24.0), dependency updates, error formatting fixes in container.go (fmt.Errorf changes), and minor typo fixes that are unrelated to provider progress reporting. Separate dependency updates and formatting fixes into dedicated PRs. Keep this PR focused on provider preparation progress reporting implementation in cmd/analyze*.go files.
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: adding progress reporting for provider preparation, which is the core objective of all code modifications in the PR.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 259d5b3 and 50f516d.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (4)
  • cmd/analyze-bin.go (10 hunks)
  • cmd/analyze-hybrid.go (12 hunks)
  • cmd/analyze.go (1 hunks)
  • go.mod (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
cmd/analyze-hybrid.go (2)
pkg/provider/python.go (1)
  • PythonProvider (9-11)
pkg/util/util.go (2)
  • NodeJSProvider (75-75)
  • SourceMountPath (31-31)
🪛 OSV Scanner (2.3.0)
go.mod

[HIGH] 9-9: github.com/getkin/kin-openapi 0.108.0: Improper Handling of Highly Compressed Data (Data Amplification) in github.com/getkin/kin-openapi/openapi3filter

(GO-2025-3533)


[HIGH] 9-9: github.com/getkin/kin-openapi 0.108.0: Improper Handling of Highly Compressed Data (Data Amplification) in github.com/getkin/kin-openapi/openapi3filter

(GHSA-wq9g-9vfc-cfq9)

🔇 Additional comments (17)
cmd/analyze-hybrid.go (7)

25-25: LGTM: Progress package import.

The progress package import is correctly added to support the new progress reporting functionality.


289-296: LGTM: Progress reporter wiring in network provider setup.

The progress reporter is correctly configured at the InitConfig level using the adapter pattern. The comment about avoiding duplicate progress events is helpful context.


433-440: LGTM: Consistent progress reporter setup for builtin provider.

The progress reporter setup follows the same pattern as network providers, maintaining consistency across the codebase.


660-665: LGTM: Early progress reporter creation.

Creating the progress reporter before provider preparation ensures progress updates are captured from the start. The deferred cancel ensures proper cleanup.


675-675: LGTM: Progress reporter passed to provider setup.

The progress reporter is correctly threaded through provider initialization paths, enabling live progress updates during setup.

Also applies to: 722-722


843-843: Minor improvement: Typo fix.

Typo corrected in comment.


238-254: Clarify workspaceFolders behavior across providers.

The comments suggest that the duplicate file counting fix (analyzer-lsp#1036) only applies to the NodeJS provider, but the issue description indicates the fix is in the shared GetDocumentUris() method. If the deduplication logic is in the shared code path, why is it safe to set workspaceFolders for NodeJS (line 254) but not for Go (line 238) and Python (line 245)?

Consider either:

  1. Setting workspaceFolders for all providers consistently if the fix applies to all
  2. Updating the comments to clarify why NodeJS is different (e.g., if NodeJS uses a different code path)

Run the following script to check if Go and Python providers use the same GetDocumentUris logic:

#!/bin/bash
# Check if Go and Python providers share the same GetDocumentUris implementation
rg -nP -C5 'func.*GetDocumentUris' --type=go
go.mod (2)

11-11: LGTM: analyzer-lsp updated with duplicate file counting fix.

The analyzer-lsp dependency is updated to include the fix for analyzer-lsp#1036, which resolves the duplicate file counting issue in GetDocumentUris. The matching java-external-provider indirect dependency is correctly added.

Also applies to: 139-139


3-3: > Likely an incorrect or invalid review comment.

cmd/analyze-bin.go (6)

30-30: LGTM: Progress package import.

Import is correctly added for progress reporting functionality.


78-78: LGTM: Consistent progress bar prefix.

The checkmark prefix maintains visual consistency with other progress messages throughout the application.


203-208: LGTM: Early progress reporter creation for containerless mode.

The progress reporter is created before provider setup, ensuring all initialization progress is captured. Cleanup is properly handled with defer.


219-219: LGTM: Progress reporter integration in provider setup.

Progress reporter is correctly passed to both Java and builtin provider setup functions, enabling progress tracking during initialization.

Also applies to: 235-235


744-750: LGTM: Progress reporter wiring in Java provider setup.

The progress reporter is correctly configured at the InitConfig level using the adapter pattern, consistent with the hybrid mode implementation.


788-794: LGTM: Consistent progress reporter setup for builtin provider.

The progress reporter setup follows the same pattern as the Java provider, maintaining consistency across the codebase.

cmd/analyze.go (2)

1516-1533: LGTM: Provider preparation progress rendering.

The progress display for provider preparation is well-implemented with:

  • Clear visual feedback (progress bar with checkmark prefix)
  • File count tracking
  • Newline printed on completion for clean output
  • Defensive condition (>=) to handle edge cases

1537-1537: LGTM: Improved progress message formatting.

Removing the extra newline after "Loaded X rules" messages improves output formatting by eliminating unnecessary blank lines.

Also applies to: 1545-1545

Removes the nodejs provider workaround that set Location to empty string
to avoid duplicate file counting. This workaround is no longer needed as
the root cause has been fixed in analyzer-lsp#1036.

Changes:
- Remove nodejs-specific Location="" workaround
- Simplify provider configuration logic
- Update comments to reference merged fix

Fixes: konveyor/analyzer-lsp#1035
Related: konveyor/analyzer-lsp#1036
Signed-off-by: tsanders <[email protected]>
@tsanders-rh tsanders-rh force-pushed the feat/prepare-progress-support branch from 61fa87b to 704f937 Compare December 6, 2025 12:10
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
cmd/analyze-hybrid.go (1)

662-665: Consider clarifying the progressCancel lifecycle.

The current code defers progressCancel (line 663) and also calls it explicitly (line 879), which means it may be called twice in the normal flow. While calling a context cancel function twice is safe (it's idempotent), the pattern could be clearer.

Consider one of these approaches:

Option 1: Set to nil after explicit call

 if progressMode.IsEnabled() {
     progressCancel() // This closes the Events() channel
+    progressCancel = nil // Prevent defer from calling again
     <-progressDone   // Wait for goroutine to finish
 }

Option 2: Remove the explicit call and rely solely on defer (if the timing allows)

The current implementation is correct and safe, so this is a low-priority refinement.

Also applies to: 878-881

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cd3c4c8 and 704f937.

📒 Files selected for processing (1)
  • cmd/analyze-hybrid.go (12 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
cmd/analyze-hybrid.go (3)
cmd/settings.go (1)
  • Config (26-35)
pkg/provider/python.go (1)
  • PythonProvider (9-11)
pkg/util/util.go (2)
  • NodeJSProvider (75-75)
  • SourceMountPath (31-31)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build & test from commit
🔇 Additional comments (6)
cmd/analyze-hybrid.go (6)

25-25: LGTM! Progress reporting infrastructure properly integrated.

The import addition and function signature changes correctly establish the progress reporting infrastructure. The progressReporter parameter is consistently propagated through setupNetworkProvider and setupBuiltinProviderHybrid, and all call sites (lines 675, 722) are updated accordingly.

Also applies to: 208-208, 405-405


289-296: LGTM! Progress reporter properly integrated into provider initialization.

The progress reporter is correctly set on each InitConfig entry using the adapter pattern. The nil checks and comment clarifying the InitConfig-level configuration (to avoid duplicate events) demonstrate good defensive programming and clear intent.

Also applies to: 433-440


675-675: LGTM! Progress reporter correctly used throughout the analysis flow.

The progress reporter is properly passed through the provider setup chain (setupNetworkProvider, setupBuiltinProviderHybrid) and rule execution (RunRulesWithOptions). The comment at line 872 helpfully clarifies that the reporter was created earlier, making the flow easy to follow.

Also applies to: 722-722, 872-876


843-843: LGTM! Typo fixed.

Good catch correcting "prpare" to "prepare" in the comment.


238-254: Verify analyzer-lsp dependency version includes the deduplication fix.

The comments reference the fix in analyzer-lsp#1036 that prevents duplicate file counting when both Location and workspaceFolders point to the same path. Confirm that go.mod has been updated to include the version containing this fix (version released on or after December 6, 2025).


660-665: Verify setupProgressReporter function implementation.

The progress reporter is correctly created early in the flow before provider preparation. However, setupProgressReporter is not defined in this file and verification of its existence and signature could not be completed due to repository access limitations.

Match go.mod requirement of Go 1.24.0 (needed by updated dependencies).

Signed-off-by: tsanders <[email protected]>
- Update setupBuiltinProviderHybrid test calls to include progressReporter parameter
- Fix non-constant format strings in container error handling

Signed-off-by: tsanders <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
cmd/analyze_hybrid_inprocess_test.go (1)

48-48: LGTM: Test signature correctly updated.

The additional nil parameter correctly matches the new setupBuiltinProviderHybrid signature that now accepts a progressReporter. Passing nil is appropriate for these tests, which focus on provider setup rather than progress reporting behavior.

Consider adding a dedicated test that exercises the progress reporting functionality with a non-nil progress reporter to verify the integration works end-to-end.

Also applies to: 82-82

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f12eb9e and bf366f4.

📒 Files selected for processing (2)
  • cmd/analyze_hybrid_inprocess_test.go (2 hunks)
  • pkg/container/container.go (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-08T15:06:52.056Z
Learnt from: abrugaro
Repo: konveyor/kantra PR: 524
File: cmd/dump-rules.go:0-0
Timestamp: 2025-08-08T15:06:52.056Z
Learning: In konveyor/kantra cmd/dump-rules.go (Go/Cobra), the dump-rules command should not use package-level variables for flags (output, overwrite). Flags must be defined as local variables inside NewDumpRulesCommand to avoid shared mutable state across commands/tests.

Applied to files:

  • pkg/container/container.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build & test from commit
🔇 Additional comments (1)
pkg/container/container.go (1)

292-296: LGTM: Defensive format string improvement.

Adding the "%s" format string to fmt.Errorf prevents potential issues if errBytes.String() contains format specifiers (e.g., %s, %d). This is a best practice for error wrapping.

Note: These changes appear unrelated to the PR's stated objective of adding provider preparation progress reporting, but they are valid improvements.

Also applies to: 318-322

@codecov
Copy link

codecov bot commented Dec 6, 2025

Codecov Report

❌ Patch coverage is 2.27273% with 43 lines in your changes missing coverage. Please review.
✅ Project coverage is 36.05%. Comparing base (259d5b3) to head (bf366f4).

Files with missing lines Patch % Lines
cmd/analyze-bin.go 0.00% 14 Missing ⚠️
cmd/analyze.go 0.00% 14 Missing ⚠️
cmd/analyze-hybrid.go 7.14% 12 Missing and 1 partial ⚠️
pkg/container/container.go 0.00% 2 Missing ⚠️

❌ Your patch check has failed because the patch coverage (2.27%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #646      +/-   ##
==========================================
- Coverage   36.24%   36.05%   -0.19%     
==========================================
  Files          34       34              
  Lines        4277     4299      +22     
==========================================
  Hits         1550     1550              
- Misses       2555     2576      +21     
- Partials      172      173       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@tsanders-rh
Copy link
Contributor Author

Screen.Recording.2025-12-06.at.8.07.17.AM.mov

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Nodejs provider: GetDocumentUris counts files twice when Location and workspaceFolders are both set

1 participant