feat: add two-phase progress bar for large model operations#474
feat: add two-phase progress bar for large model operations#474
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces an --include flag for the modelfile generate command, allowing users to explicitly include files or directories (such as hidden files) that are normally skipped. The path filtering logic has been migrated to use the doublestar library, enabling support for recursive glob patterns. Additionally, the PR enhances progress bar tracking by adding 'Hashing' and 'Checking' phases and improves thread safety for progress messages using atomic values. Feedback suggests simplifying the hashing implementation in builder.go by using io.Copy directly with the hash object instead of a TeeReader and io.Discard.
| tee := io.TeeReader(wrappedReader, hash) | ||
| size, err := io.Copy(io.Discard, tee) |
There was a problem hiding this comment.
The use of io.TeeReader with io.Copy and io.Discard to compute the hash and get the size is functionally correct, but unnecessarily complex. Since hash.Hash implements io.Writer, you can simplify this by copying directly from the wrappedReader to the hash instance. This is more idiomatic and easier to read.
| tee := io.TeeReader(wrappedReader, hash) | |
| size, err := io.Copy(io.Discard, tee) | |
| size, err := io.Copy(hash, wrappedReader) |
| flags.StringVarP(&generateConfig.Provider, "provider", "p", "", "explicitly specify the provider for short-form URLs (huggingface, modelscope)") | ||
| flags.StringVar(&generateConfig.DownloadDir, "download-dir", "", "custom directory for downloading models (default: system temp directory)") | ||
| flags.StringArrayVar(&generateConfig.ExcludePatterns, "exclude", []string{}, "specify glob patterns to exclude files/directories (e.g. *.log, checkpoints/*)") | ||
| flags.StringArrayVar(&generateConfig.IncludePatterns, "include", []string{}, |
There was a problem hiding this comment.
This change seems not regarding this PR.
There was a problem hiding this comment.
Good catch — rebased the branch to drop those unrelated commits. This file is no longer modified in the PR.
…ions - Change progressBar.msg from string to atomic.Value to fix data race between Complete() and mpb render goroutine - Change Add() from RLock-check-RUnlock-Abort-Lock-write to full write lock to eliminate TOCTOU race on same-name bars - Add Reset() method as semantic alias for Add() to support phase transitions (Hashing -> Building, Checking -> Pushing) - Add 13 unit tests: 6 functional, 3 concurrency (-race), 4 error/idempotency Signed-off-by: Zhao Chen <zhaochen.zju@gmail.com> Signed-off-by: Zhao Chen <winters.zc@antgroup.com>
Add OnHashFunc type and OnHash field to Hooks struct. Default OnHash passes reader through unchanged. Add WithOnHash option function. Signed-off-by: Zhao Chen <zhaochen.zju@gmail.com> Signed-off-by: Zhao Chen <winters.zc@antgroup.com>
…ss tracking - computeDigestAndSize now accepts onHash callback to wrap reader with progress tracking before SHA256 hashing via TeeReader - BuildLayer computes relPath once and passes it to both onHash (via closure) and OutputLayer, ensuring bar name consistency across phases - Add 3 tests: onHash called with correct size, wrapped reader preserves digest correctness, reader failure propagation Signed-off-by: Zhao Chen <zhaochen.zju@gmail.com> Signed-off-by: Zhao Chen <winters.zc@antgroup.com>
- Add WithOnHash hook calling tracker.Add("Hashing layer") for digest
computation phase
- Change WithOnStart from tracker.Add to tracker.Reset("Building layer")
for storage write phase transition
Signed-off-by: Zhao Chen <zhaochen.zju@gmail.com>
Signed-off-by: Zhao Chen <winters.zc@antgroup.com>
- Remove prompt parameter from pushIfNotExist, phases now hardcoded - Phase 1: show "Checking blob" with nil reader during dst.Exists() - Phase 2: reset to "Pushing blob" with content reader for upload - Add pb.Abort on Exists() and PullBlob() error paths - Manifest/config pushes unchanged (small payloads, no phases needed) Signed-off-by: Zhao Chen <zhaochen.zju@gmail.com> Signed-off-by: Zhao Chen <winters.zc@antgroup.com>
Simplify computeDigestAndSize by copying directly to the sha256 writer instead of using io.TeeReader + io.Discard. hash.Hash already implements io.Writer, making the tee unnecessary. Addresses review feedback on PR #474. Signed-off-by: Zhao Chen <winters.zc@antgroup.com>
1f183ab to
70d8013
Compare
Summary
bar.msgbetweenComplete()and mpb render goroutine (string→atomic.Value)Add()lock pattern (RLock→RUnlock→Lock→ singleLock)Changes
internal/pb/pb.goResetmethodinternal/pb/pb_test.gopkg/backend/build/hooks/hooks.goOnHashcallback for digest computationpkg/backend/build/builder.goonHashintocomputeDigestAndSizewithTeeReaderpkg/backend/build/builder_test.gocomputeDigestAndSizepkg/backend/processor/base.goOnHash+Resetin processor hookspkg/backend/push.goTest plan
go test ./internal/pb/... -race -v— 13 tests pass, race detector cleango test ./pkg/backend/build/... -v— all builder tests passgo test ./pkg/backend/processor/... -v— all processor tests passgo test ./pkg/... -count=1— full package suite passgo build ./...— compiles successfully