@script3r
Owner

Improve file discovery performance for large repositories by adding a Git fast path and parallelizing directory traversal.

This PR introduces two main optimizations. First, for Git repositories it uses `git ls-files` to quickly list tracked and unignored files, so that ignore rules are honored much as they are in ripgrep. Second, for non-Git repositories, or when the Git fast path is not applicable, it switches to a parallel directory walker (the `ignore` crate) with early glob filtering and fewer metadata calls to minimize I/O overhead.
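
A rough sketch of what a Git fast path of this kind might look like, using only the standard library. The function names `parse_ls_files_output` and `git_list_files` are hypothetical, and the exact `git ls-files` flags are an assumption (the PR text does not say which ones it passes). Here `--cached --others --exclude-standard` is used to list tracked files plus untracked files that are not ignored, and `-z` makes the output NUL-separated so unusual file names survive parsing.

```rust
use std::path::{Path, PathBuf};
use std::process::Command;

/// Split NUL-separated `git ls-files -z` output into relative paths.
/// Non-UTF-8 bytes are replaced lossily, which is a simplification.
fn parse_ls_files_output(stdout: &[u8]) -> Vec<PathBuf> {
    stdout
        .split(|&b| b == 0)
        .filter(|chunk| !chunk.is_empty())
        .map(|chunk| PathBuf::from(String::from_utf8_lossy(chunk).into_owned()))
        .collect()
}

/// Try the Git fast path for `root`.
/// Returns `None` when git is missing or `root` is not inside a repository,
/// which signals the caller to fall back to walking the directory tree.
fn git_list_files(root: &Path) -> Option<Vec<PathBuf>> {
    let output = Command::new("git")
        .arg("-C")
        .arg(root)
        // Assumed flags: tracked files plus untracked-but-unignored files.
        .args(["ls-files", "-z", "--cached", "--others", "--exclude-standard"])
        .output()
        .ok()?;
    if !output.status.success() {
        return None;
    }
    // Paths are reported relative to `root`, so join them back on.
    Some(
        parse_ls_files_output(&output.stdout)
            .into_iter()
            .map(|rel| root.join(rel))
            .collect(),
    )
}
```

A caller would try `git_list_files(root)` first and only start a directory walk when it returns `None`. Note that `--cached` reports what is in the index, so a file deleted from the working tree but not yet staged would still appear; a real implementation would need to tolerate that.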



@cursor

cursor bot commented Sep 15, 2025

Cursor Agent can help with this pull request. Just @cursor in comments and I'll start working on changes in this branch.

cursoragent and others added 9 commits September 15, 2025 12:44
…quential

- Replace ignore crate's build_parallel().run() with sequential builder.build()
- The parallel walker doesn't guarantee thread completion before returning
- Sequential discovery followed by parallel processing with rayon is more reliable
- Eliminates the hanging thread issue that prevented app termination
- Maintains performance through parallel file processing
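
The pattern this commit message describes, sequential discovery followed by parallel processing, can be sketched roughly as below. This is not the PR's code: to stay dependency-free it swaps in `std::fs::read_dir` for the `ignore` crate's sequential walker (so no ignore rules are applied) and `std::thread::scope` for rayon. The names `discover_files` and `total_size_parallel` are hypothetical, and summing file sizes stands in for whatever per-file work the tool actually does.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::thread;

/// Sequentially collect every regular file under `root`.
/// Simplification: no ignore rules or glob filters are applied here.
fn discover_files(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut files = Vec::new();
    let mut pending = vec![root.to_path_buf()];
    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            // On most platforms file_type() needs no extra metadata call.
            let file_type = entry.file_type()?;
            if file_type.is_dir() {
                pending.push(entry.path());
            } else if file_type.is_file() {
                files.push(entry.path());
            }
        }
    }
    Ok(files)
}

/// Process the discovered files on up to `workers` threads and return
/// their combined size in bytes. Files that cannot be read count as 0.
fn total_size_parallel(files: &[PathBuf], workers: usize) -> u64 {
    let workers = workers.max(1);
    // Ceiling division, clamped to 1 so `chunks` never receives 0.
    let chunk_len = ((files.len() + workers - 1) / workers).max(1);
    // Scoped threads are all joined before `scope` returns, so no worker
    // can outlive this call and keep the process alive.
    thread::scope(|scope| {
        let handles: Vec<_> = files
            .chunks(chunk_len)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk
                        .iter()
                        .filter_map(|path| fs::metadata(path).ok())
                        .map(|meta| meta.len())
                        .sum::<u64>()
                })
            })
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().unwrap_or(0))
            .sum::<u64>()
    })
}
```

Because discovery finishes before any worker starts, the file list is complete and owned by the caller, and the only threads involved are ones whose lifetime is bounded by the processing call.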
…anning-and-add-progress-613b

Optimize slow directory scanning and add progress
@script3r script3r closed this Sep 15, 2025
@script3r script3r deleted the cursor/optimize-large-repo-file-listing-performance-e60f branch September 21, 2025 17:21

3 participants