feat(producer): optional targetChunkFrames to bound per-chunk frames#1332
Conversation
miguel-heygen
left a comment
There was a problem hiding this comment.
Clean feature addition. Math is correct across all cases; test coverage is solid.
Traced the formula manually:
- Short video (1466f, target 300, cap 16):
autoSizeParallel=5,effectiveChunkSize=294 ≤ 300✓ - Long video (54000f, target 1600, cap 64):
autoSizeParallel=34,effectiveChunkSize=1589 ≤ 1600✓ - Extreme clamp (216000f, target 1600, cap 64): clamps to 64 chunks,
effectiveChunkSize=3375 > 1600(documented behavior) ✓ - No-op when omitted:
autoSizeParallel === maxParallelChunks, identical to prior path ✓
P3 nits:
-
Inconsistent upper-bound validation:
validateDistributedRenderConfigcheckstargetChunkFrames > MAX_CHUNK_SIZE, butresolveChunkPlanonly callsassertPositiveInteger(no upper-bound check).chunkSizehas the same pattern — upper-bound only in the config validator — so this is consistent with the existing convention, but direct callers ofresolveChunkPlanbypass the ceiling. Document-only risk since the config validator gates the public API. -
Silent no-op when
chunkSizeandtargetChunkFramesare both set: documented in the JSDoc ("Ignored whenchunkSizeis set"), but the validator doesn't warn. A user who mistakenly sets both will silently getchunkSizebehavior with no feedback. A validator warning would save confusion — not required now but worth a follow-up. -
No dedicated test for
targetChunkFrames >= totalFrames(single-chunk collapse): e.g., 100 frames, target 300 → should produce 1 chunk. The "empty/inverted slice" grid gets close (37 frames, target 100 → 1 chunk, no slice assertion fails), but no explicitchunkCount === 1assertion for this edge. Fine for now.
Approving.
vanceingalls
left a comment
There was a problem hiding this comment.
Review of #1332 — optional targetChunkFrames to bound per-chunk frames
Verdict: APPROVE. Tight design, additive change, sensible defaults, generous test grid. One nit on a doc-vs-behavior interaction the author may want to acknowledge.
What I checked
- Walked
resolveChunkPlanagainst the grid the test asserts (50/660/1466/12793/54000/216000 × parallels 8/16/64 × targets 100/300/1600). - Confirmed
assertPositiveInteger, theMIN_CHUNK_SIZEfloor, and the trailing-slice-tightening pass interact correctly with the new branch. - Verified validator bound (
MAX_CHUNK_SIZE = 3600) is semantically the right one —targetChunkFramescaps frames-per-chunk, same unit and meaning aschunkSize. - Traced the new optional field through
validateDistributedRenderConfigand theplan()call site. Adapters that JSON-pass-throughevent.Config(packages/aws-lambda,packages/gcp-cloud-run) propagate it transparently because they spread{...event.Config}rather than enumerating fields. - Confirmed
SerializableDistributedRenderConfigisOmit<DistributedRenderConfig, ...>so the new optional field is automatically included for the JSON-wire shape; existing serialized configs without the field still parse (optional,=== undefinedgate).
Findings
[nit] Doc claims "bounds per-chunk render time" but MIN_CHUNK_SIZE = 10 silently overrides on very small targets.
The validator accepts targetChunkFrames >= 1. Trace: with totalFrames = 50, targetChunkFrames = 5, maxParallelChunks = 16:
autoSizeParallel = min(16, max(1, ceil(50/5))) = 10resolvedChunkSize = max(MIN_CHUNK_SIZE=10, ceil(50/10)=5) = 10- final result: 5 chunks of 10 frames each — silently 2× over the requested ceiling.
In practice operators set targetChunkFrames in the hundreds-to-thousands range (timeout-bounding for cloud-run/lambda), so this is academic. But the doc on DistributedRenderConfig.targetChunkFrames reads "chunking targets the fewest chunks whose per-chunk frame count stays at or below this bound" — strictly false when target < MIN_CHUNK_SIZE. Two ways to close: (a) validator additionally rejects targetChunkFrames < MIN_CHUNK_SIZE with a message pointing at the floor, or (b) the doc-comment names the interaction (something like "the auto-size floor MIN_CHUNK_SIZE still applies, so targetChunkFrames < MIN_CHUNK_SIZE is effectively pinned at the floor"). I'd take (a) — quieter contract, no surprises for callers wiring up a config builder.
Things that look right (not for change)
autoSizeParallel = min(maxParallelChunks, max(1, ceil(totalFrames / targetChunkFrames)))is the cleanest expression of the clamp; the innerMath.max(1, …)is technically dead givenassertPositiveIntegerupstream but cheap and defensive against future refactors.- Precedence (
chunkSizewins overtargetChunkFrames) is documented, tested, and matches operator mental model — explicit per-chunk size already pins the dimension. - Validator order matches the existing pattern: positive-int check, then upper bound. Same error shape as
chunkSize/maxParallelChunks. - The "never emits an empty or inverted slice across a grid" test is the right correctness invariant. The 216000-frame "stays at the cap, exceeds target" test is a particularly nice addition — it pins the documented escape hatch behavior rather than silently changing it later.
targetChunkFrames=2000, totalFrames=1466(target > totalFrames) collapses to a single chunk of 1466 frames; covered by the no-op test on the principle and by the 1466/300 short-video case.- Schema: optional field, gated by
!== undefined, default branch is byte-identical to prior behavior (test confirms). The PR description's "existing serialized configs are unaffected" is provably true.
— Review by Vai
|
Folded in CLI exposure + docs so the knob is usable end-to-end, not library-only (commit 7cf9b21):
Testing: CLI package |
Summary
Adds an optional
targetChunkFramesknob toDistributedRenderConfigthat caps frames-per-chunk in the auto-sized chunking path. Today the two chunking knobs are alternatives:maxParallelChunkscaps the chunk count (so per-chunk frames — and thus per-chunk render time — grow unboundedly with video length), andchunkSizepins a fixed size (so short videos can't collapse to fewer chunks). Neither lets a caller say "size chunks to ~N frames each, but never more than M chunks."targetChunkFramesis that third option:maxParallelChunks, to keep each chunk's render time under a downstream per-chunk timeout.This is the one-pass equivalent of the "plan, observe frame count, re-plan at a higher fan-out" dance a caller otherwise has to do in orchestration when a chunk-count cap leaves a long video with chunks too big to finish in time.
Default behavior is unchanged
targetChunkFramesis optional and only affects the auto-sized path (chunkSize === undefined):undefined) →autoSizeParallel === maxParallelChunks→ the auto-sized chunk size is identical to prior behavior. A test asserts the 3-arg and 4-arg-undefinedresults are byte-for-byte equal across a grid of inputs.chunkSizeset →targetChunkFramesis a no-op (an explicit fixed size already pins per-chunk frames;chunkSizewins). A test asserts equality with thechunkSize-only result.resolveChunkPlanand new optional field on the config type — existing 3-arg callers and existing serialized configs are unaffected.Validation mirrors the existing
chunkSize/maxParallelChunkschecks (positive integer,<= MAX_CHUNK_SIZE) in the sharedvalidateDistributedRenderConfig, which both the AWS Lambda and GCP Cloud Run adapters already delegate to — so the bound applies everywhere with no adapter changes.Testing
tsc --noEmitclean; oxlint + oxfmt clean.resolveChunkPlansuite: 18 pass, including 7 new cases — omitted-is-a-no-op (grid), short-collapse, long-video bounding, tier-cap clamp (extreme length stays at the cap),chunkSize-wins precedence, non-positive/non-integer rejection, and an empty/inverted-slice grid overtargetChunkFrames.validateConfigsuites: 51 pass (confirming the additive field doesn't disturb existing callers).Context
This is the clean primitive behind a chunking policy HeyGen currently implements as a post-plan re-plan in its orchestration; once this ships and the producer build is deployed, that re-plan collapses to passing
targetChunkFrames+maxParallelChunkson a single plan call. Useful standalone for any distributed-render caller who wants to bound per-chunk runtime on long videos.🤖 Generated with Claude Code