fix: resolve camera preview and export pipeline issues#1679
fix: resolve camera preview and export pipeline issues#1679hidumou wants to merge 1 commit intoCapSoftware:mainfrom
Conversation
| let layout = infer_camera_buffer_layout( | ||
| frame.pixel_format, | ||
| frame.width as usize, | ||
| frame.height as usize, | ||
| original_data.len(), | ||
| ) | ||
| .ok_or_else(|| { | ||
| windows::core::Error::new( | ||
| windows::core::HRESULT(-1), | ||
| format!( | ||
| "Failed to infer camera buffer layout for {:?} {}x{} ({} bytes)", | ||
| frame.pixel_format, | ||
| frame.width, | ||
| frame.height, | ||
| original_data.len() | ||
| ), | ||
| ) | ||
| })?; |
There was a problem hiding this comment.
MJPEG cameras will now hard-fail in
upload_mf_buffer_to_texture
infer_camera_buffer_layout always returns None for PixelFormat::MJPEG and PixelFormat::H264, so this function will now return an error immediately for any camera that streams in MJPEG format. Before this PR, the function continued (incorrectly, producing garbage), but now it fails outright.
camera_frame_to_ffmpeg correctly short-circuits MJPEG at the top of the function before reaching the layout inference. upload_mf_buffer_to_texture has no equivalent guard.
If any MJPEG cameras are used for the live preview path (which goes through texture upload), the preview will break silently. Consider adding a similar early-return guard:
// MJPEG/H264 frames cannot be uploaded as raw pixel textures
if matches!(
frame.pixel_format,
cap_camera_windows::PixelFormat::MJPEG | cap_camera_windows::PixelFormat::H264
) {
return Err(windows::core::Error::new(
windows::core::HRESULT(-1),
"MJPEG/H264 cannot be uploaded as a raw texture".to_string(),
));
}If MJPEG was already broken in this path before (the old bytes_per_pixel = _ => 2 was clearly wrong for compressed data), this is acceptable fail-fast behavior — but it should be documented explicitly.
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/recording/src/output_pipeline/win.rs
Line: 1850-1867
Comment:
**MJPEG cameras will now hard-fail in `upload_mf_buffer_to_texture`**
`infer_camera_buffer_layout` always returns `None` for `PixelFormat::MJPEG` and `PixelFormat::H264`, so this function will now return an error immediately for any camera that streams in MJPEG format. Before this PR, the function continued (incorrectly, producing garbage), but now it fails outright.
`camera_frame_to_ffmpeg` correctly short-circuits MJPEG at the top of the function before reaching the layout inference. `upload_mf_buffer_to_texture` has no equivalent guard.
If any MJPEG cameras are used for the live preview path (which goes through texture upload), the preview will break silently. Consider adding a similar early-return guard:
```rust
// MJPEG/H264 frames cannot be uploaded as raw pixel textures
if matches!(
frame.pixel_format,
cap_camera_windows::PixelFormat::MJPEG | cap_camera_windows::PixelFormat::H264
) {
return Err(windows::core::Error::new(
windows::core::HRESULT(-1),
"MJPEG/H264 cannot be uploaded as a raw texture".to_string(),
));
}
```
If MJPEG was already broken in this path before (the old `bytes_per_pixel = _ => 2` was clearly wrong for compressed data), this is acceptable fail-fast behavior — but it should be documented explicitly.
How can I resolve this? If you propose a fix, please make it concise.| fn infer_camera_buffer_layout( | ||
| pixel_format: cap_camera_windows::PixelFormat, | ||
| width: usize, | ||
| height: usize, | ||
| data_len: usize, | ||
| ) -> Option<CameraBufferLayout> { | ||
| use cap_camera_windows::PixelFormat; | ||
|
|
||
| if width == 0 || height == 0 || data_len == 0 { | ||
| return None; | ||
| } | ||
|
|
||
| let total_rows = match pixel_format { | ||
| PixelFormat::NV12 | ||
| | PixelFormat::NV21 | ||
| | PixelFormat::YUV420P | ||
| | PixelFormat::YV12 | ||
| | PixelFormat::P010 => height + (height / 2), | ||
| PixelFormat::YUYV422 | ||
| | PixelFormat::UYVY422 | ||
| | PixelFormat::GRAY16 | ||
| | PixelFormat::RGB565 | ||
| | PixelFormat::ARGB | ||
| | PixelFormat::RGB32 | ||
| | PixelFormat::RGB24 | ||
| | PixelFormat::BGR24 | ||
| | PixelFormat::GRAY8 => height, | ||
| PixelFormat::MJPEG | PixelFormat::H264 => return None, | ||
| }; | ||
|
|
||
| if total_rows == 0 || data_len % total_rows != 0 { | ||
| return None; | ||
| } | ||
|
|
||
| let primary_stride = data_len / total_rows; | ||
|
|
||
| match pixel_format { | ||
| PixelFormat::NV12 | PixelFormat::NV21 => { |
There was a problem hiding this comment.
Stride inference may fail for devices with asymmetric UV plane alignment
infer_camera_buffer_layout derives primary_stride from data_len / total_rows, which assumes both the luma and chroma planes in planar formats (YUV420P, YV12) use strides consistent with a single stride value. Specifically, for YUV420P it assumes secondary_stride = primary_stride / 2.
This assumption holds for the vast majority of Windows camera drivers (MF aligns all planes to the same width multiple), but a device that provides a Y plane with stride 1936 and U/V planes with stride 1024 (a different alignment unit) would fail the data_len % total_rows != 0 check and cause the frame to be dropped with an error.
The resulting error is logged clearly, so failed frames are visible. This is a reasonable trade-off given that such cameras are rare, but it's worth noting as a known limitation of the stride inference approach.
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/recording/src/output_pipeline/win.rs
Line: 1149-1186
Comment:
**Stride inference may fail for devices with asymmetric UV plane alignment**
`infer_camera_buffer_layout` derives `primary_stride` from `data_len / total_rows`, which assumes both the luma and chroma planes in planar formats (YUV420P, YV12) use strides consistent with a single stride value. Specifically, for YUV420P it assumes `secondary_stride = primary_stride / 2`.
This assumption holds for the vast majority of Windows camera drivers (MF aligns all planes to the same width multiple), but a device that provides a Y plane with stride 1936 and U/V planes with stride 1024 (a different alignment unit) would fail the `data_len % total_rows != 0` check and cause the frame to be dropped with an error.
The resulting error is logged clearly, so failed frames are visible. This is a reasonable trade-off given that such cameras are rare, but it's worth noting as a known limitation of the stride inference approach.
How can I resolve this? If you propose a fix, please make it concise.| if let (Some(pts), Some(dts)) = (self.packet.pts(), self.packet.dts()) | ||
| && pts < dts | ||
| { | ||
| self.packet.set_pts(Some(dts)); | ||
| } |
There was a problem hiding this comment.
Second
pts < dts guard is correct but worth a comment
The existing DTS monotonicity block (lines 76–86) already adjusts PTS when DTS is bumped. This new block catches the distinct case where the encoder emits a packet with pts < dts without any DTS ordering violation (i.e., the first block didn't fire). This is a real scenario with libx264 at certain configurations.
A short comment would help future readers understand why two separate guards exist:
// Secondary guard: encoder may emit pts < dts even when DTS is
// already monotonic (e.g. during stream init or B-frame drain).
if let (Some(pts), Some(dts)) = (self.packet.pts(), self.packet.dts())
&& pts < dts
{
self.packet.set_pts(Some(dts));
}Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/enc-ffmpeg/src/base.rs
Line: 88-92
Comment:
**Second `pts < dts` guard is correct but worth a comment**
The existing DTS monotonicity block (lines 76–86) already adjusts PTS when DTS is bumped. This new block catches the distinct case where the encoder emits a packet with `pts < dts` without any DTS ordering violation (i.e., the first block didn't fire). This is a real scenario with libx264 at certain configurations.
A short comment would help future readers understand why two separate guards exist:
```rust
// Secondary guard: encoder may emit pts < dts even when DTS is
// already monotonic (e.g. during stream init or B-frame drain).
if let (Some(pts), Some(dts)) = (self.packet.pts(), self.packet.dts())
&& pts < dts
{
self.packet.set_pts(Some(dts));
}
```
How can I resolve this? If you propose a fix, please make it concise.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
|
Hey @hidumou, was this an issue you have personally experienced? |
Summary
This PR fixes three issues in the desktop recording/export pipeline:
Failed to export recording: Encode: Invalid argumentbecause encoded packets could reach the MP4 muxer with invalid timestamp ordering (pts < dts).Changes
pts < dts.Root Cause
For the 4K60 export failure, logs showed non-monotonic timestamp ordering during software H.264 export. The encoder/muxer path could produce packets where
ptsbecame smaller thandts, which causes FFmpeg/MP4 muxing to fail withInvalid argument.For the Windows camera artifact issue, camera frames were previously treated as tightly packed buffers, but real device buffers can contain row padding/stride. That caused corrupted image interpretation in export/output paths.
Validation
cargo fmt --allcargo check -p cap-enc-ffmpeg -p cap-exportcargo check --manifest-path apps/desktop/src-tauri/Cargo.tomlAdditionally, the failing local 4K60 export project was re-run and progressed well past the previous early failure range without immediately reproducing
Encode: Invalid argument.Greptile Summary
This PR addresses three distinct bugs in the recording/export pipeline: camera preview re-open on same-device re-selection, Windows camera buffer stride handling, and MP4 muxing failures from
pts < dtsordering violations during 4K60 software H.264 export.lib.rs): Correctly re-opens the preview window when the same camera is already active by unwrappingskip_camera_windowonce and reusing it in both the early-return path and the standard path. Clean and correct.base.rs): Adds a secondpts < dtsguard after the existing DTS monotonicity block. This correctly catches the case where the encoder emitspts < dtswithout any DTS ordering violation (first block doesn't fire). Paired with thebf=0change inh264.rs, this should eliminateInvalid argumenterrors from the MP4 muxer.h264.rs):bf=0removes B-frames from the high-throughput software path, eliminating the primary source of PTS reordering.b-adapt=0is redundant oncebf=0is set but harmless.win.rs): The largest change introducesinfer_camera_buffer_layoutto derive per-frame buffer stride fromdata_len / total_rows, then threads this stride throughcopy_plane,convert_uyvy_to_yuyv, andflip_camera_buffer_into. The approach is sound for standard Windows MF devices. One behavioral change to be aware of:upload_mf_buffer_to_texturenow returns an error immediately for MJPEG/H264 frames (previously continued with incorrect raw-pixel assumptions), which is better fail-fast behavior but could affect MJPEG camera live preview if that path was previously relied upon.Confidence Score: 4/5
upload_mf_buffer_to_texture— should be confirmed before merging, as that path now returns a hard error instead of silently processing garbage data.upload_mf_buffer_to_texturefor MJPEG camera compatibility.Important Files Changed
skip_camera_windowfrom Option earlier to allow it to be reused in both the early-return path (camera already active, window closed) and the normal path. Clean fix for the camera preview re-open regression.pts < dtsguard after the existing DTS monotonicity fix. The new guard correctly covers the case where DTS was not bumped but the encoder still emitted a packet with pts < dts. The fix is sound but may need documentation to explain why two separate guards are needed.bf=0and changesb-adaptfrom 1→0 for the high-throughput software path. Settingbf=0disables B-frames entirely, which eliminates a source of pts/dts reordering.b-adapt=0is redundant oncebf=0is set, but harmless.infer_camera_buffer_layoutheuristic derives stride fromdata_len / total_rows. Works correctly for common semi-planar (NV12/NV21) and packed formats.upload_mf_buffer_to_texturenow returns an error for MJPEG/H264 frames (previously processed incorrectly as raw pixel data), which is better-fail-fast behavior but a behavioral change.Flowchart
%%{init: {'theme': 'neutral'}}%% flowchart TD A[Camera Frame Arrives] --> B{PixelFormat?} B -->|MJPEG| C[decode_mjpeg_frame\nFFmpeg MJPEG decoder] B -->|Other| D[infer_camera_buffer_layout\nderives stride from data_len / total_rows] D -->|None: MJPEG/H264 or\nbad divisibility| E[Return Error] D -->|Some layout| F{UYVY422?} F -->|Yes| G[convert_uyvy_to_yuyv\nstride-aware + flip support] F -->|No| H[Use raw data\nwith layout stride] G --> I[copy_plane with\nnew src_stride param] H --> I I --> J[ffmpeg::frame::Video returned\nto encoder pipeline] K[Packet encoded] --> L{DTS monotonic?} L -->|dts ≤ last_dts| M[Bump DTS to last_dts + 1\nAdjust PTS if pts < new DTS] L -->|OK| N{pts < dts?} M --> N N -->|Yes| O[Set pts = dts] N -->|No| P[write_interleaved to MP4] O --> PPrompt To Fix All With AI
Reviews (1): Last reviewed commit: "fix: resolve camera preview and export p..." | Re-trigger Greptile
(2/5) Greptile learns from your feedback when you react with thumbs up/down!