fix(worker): Add tmp shard cleanup on indexing failure #805
brendan-kellam merged 5 commits into main
When zoekt-git-index fails during repository indexing, it can leave behind .tmp shard files that accumulate over time and fill up disk space. This is especially problematic for large repos that repeatedly fail to index.

Changes:
- Add a cleanupTempShards() function to zoekt.ts that removes temporary shard files (files with .tmp in their name) for a specific repository
- Call cleanupTempShards() in repoIndexManager.ts when indexGitRepository fails, before re-throwing the error

This ensures that even if a repository consistently fails to index, the temporary files created during each attempt are cleaned up.

Co-authored-by: michael <michael@sourcebot.dev>
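The cleanup described above could look roughly like the following. This is a minimal sketch, not the code merged in this PR: the function names, the `indexDir` parameter, and the `shardPrefix` convention for tying shard files to one repository are all assumptions made for illustration, as are the file names in the usage note.

```typescript
import { readdir, rm } from "node:fs/promises";
import { join } from "node:path";

// Hypothetical predicate: a file counts as a temporary shard for a repository
// when it starts with that repository's (assumed) shard prefix and has ".tmp"
// somewhere in its name.
export function isTempShardFor(fileName: string, shardPrefix: string): boolean {
    return fileName.startsWith(shardPrefix) && fileName.includes(".tmp");
}

// Hypothetical sketch of a best-effort cleanup. Failures are logged and
// swallowed so that cleanup never hides the original indexing error.
export async function cleanupTempShardsSketch(
    indexDir: string,
    shardPrefix: string,
): Promise<string[]> {
    const removed: string[] = [];
    try {
        const entries = await readdir(indexDir);
        for (const name of entries) {
            if (!isTempShardFor(name, shardPrefix)) {
                continue;
            }
            // force: true avoids an error if the file has already disappeared.
            await rm(join(indexDir, name), { force: true });
            removed.push(name);
        }
    } catch (error) {
        console.warn(`Failed to clean up temp shards in ${indexDir}:`, error);
    }
    return removed;
}
```

With this sketch, a call such as `cleanupTempShardsSketch("/data/index", "1_my-repo")` would delete a file like `1_my-repo_v16.00000.zoekt.tmp123` and leave completed shards untouched. Matching on a per-repository prefix keeps one failing repository from deleting in-progress shards that belong to another.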
Walkthrough
Adds a best-effort cleanup of temporary Zoekt shard files on repository indexing failure.
Sequence Diagram(s)
sequenceDiagram
autonumber
participant Manager as RepoIndexManager
participant Zoekt as zoekt/indexer
participant FS as Filesystem
Manager->>Zoekt: indexGitRepository(repo)
alt success
Zoekt-->>Manager: success
else failure
Zoekt-->>Manager: throws error
Manager->>Zoekt: cleanupTempShards(repo)
Zoekt->>FS: readdir(INDEX_CACHE_DIR)
FS-->>Zoekt: list of files
Zoekt->>FS: rm(matching .tmp shards, force: true)
FS-->>Zoekt: deletion results
Zoekt-->>Manager: cleanup result (logged)
Manager-->>Manager: rethrow error
end
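The failure branch of the diagram above can be sketched as a small wrapper. This is an illustrative sketch under assumptions, not the code in repoIndexManager.ts: `indexWithCleanup` and its two callback parameters are hypothetical names standing in for the indexing call and the cleanup call.

```typescript
// Hypothetical wrapper: run the indexing step, and on failure run a
// best-effort cleanup before rethrowing the original error.
export async function indexWithCleanup<T>(
    index: () => Promise<T>,
    cleanup: () => Promise<void>,
): Promise<T> {
    try {
        return await index();
    } catch (error) {
        try {
            await cleanup();
        } catch (cleanupError) {
            // A cleanup failure is only logged; the indexing error is what
            // the caller needs to see.
            console.warn("Temp shard cleanup failed:", cleanupError);
        }
        throw error;
    }
}
```

Rethrowing the original error after cleanup keeps the caller's failure handling unchanged; the cleanup is purely a side effect of the failure path.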
Estimated code review effort: 2 (Simple), ~12 minutes
Code review: No issues found. Checked for bugs and CLAUDE.md compliance.
Clean up temporary Zoekt shard files on indexing failure to prevent disk space exhaustion.
When `zoekt-git-index` fails during repository indexing, it leaves behind `.tmp` shard files. These accumulate over time, especially for repos that repeatedly fail to index, leading to disk space issues. This PR adds logic to automatically remove these temporary files immediately after an indexing operation fails.

Fixes #804