🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18734
Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 3 Unrelated Failures
As of commit df72cc1 with merge base fcccda3:
- NEW FAILURE: the following job has failed
- FLAKY: the following job failed but was likely due to flakiness present on trunk
- BROKEN TRUNK: the following jobs failed but were present on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from 0ad610d to c745a39.
```
--qlinear_encoder_group_size 32 \
--qlinear 8da4w \
--qlinear_group_size 32 \
--vulkan_force_fp16 \
```
Can't we use the `--dtype` flag?
`--dtype fp16`: inputs and outputs are also cast to fp16. From the caller's perspective, input/output is fp16.

`--vulkan_force_fp16`: inputs and outputs are still fp32. The Vulkan backend automatically converts inputs to fp16 within the delegate and converts outputs back to fp32. From the caller's perspective, input/output is fp32.

`--vulkan_force_fp16` is a bit simpler for client code since callers don't have to handle the conversion to/from fp32, so I defaulted to that.
Another thing: for the export_parakeet_tdt.py script, `--dtype fp16` hits a guard:

export_parakeet_tdt.py: error: fp16 is not yet supported

I wasn't sure whether this was because the runner binary doesn't handle fp16 input/output yet, so I opted for `--vulkan_force_fp16` instead.
Would you prefer enabling usage of the --dtype flag for fp16 inference?
I also updated the text to clarify the special properties of the `--vulkan_force_fp16` flag that aren't covered by `--dtype`.
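To make the caller-visible difference concrete, here is a small sketch (plain NumPy, not ExecuTorch code) of how the two flags change what the client sees. The function names and the `x * 2` "model" are hypothetical stand-ins; only the dtype behavior mirrors the description above.

```python
import numpy as np

def run_with_dtype_fp16(x_fp32: np.ndarray) -> np.ndarray:
    # With --dtype fp16 the exported graph takes and returns fp16 tensors,
    # so the caller must cast inputs and deal with fp16 outputs itself.
    x = x_fp32.astype(np.float16)
    y = x * np.float16(2)            # stand-in for model inference
    return y                         # fp16 from the caller's perspective

def run_with_vulkan_force_fp16(x_fp32: np.ndarray) -> np.ndarray:
    # With --vulkan_force_fp16 the graph I/O stays fp32; the delegate
    # casts to fp16 internally and back to fp32 on the way out.
    x = x_fp32.astype(np.float16)    # happens inside the delegate
    y = (x * np.float16(2)).astype(np.float32)  # delegate returns fp32
    return y                         # fp32 from the caller's perspective

x = np.ones(4, dtype=np.float32)
print(run_with_dtype_fp16(x).dtype)         # float16
print(run_with_vulkan_force_fp16(x).dtype)  # float32
```

So with `--vulkan_force_fp16` the client code is unchanged relative to the fp32 model; the precision trade-off lives entirely inside the delegate.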
Summary:

Add Vulkan backend documentation to the Parakeet README covering export commands, quantization options, build instructions, and runner examples.

Guard `quantized_ops_lib` and `custom_ops` link targets with `if(TARGET ...)` in CMakeLists.txt. These targets don't exist in Vulkan-only or XNNPACK-only builds, causing a hard CMake configure error from `target_link_options()`. This matches the existing pattern used for `optimized_native_cpu_ops_lib`.

Validated on Samsung S24 (Adreno 750), 8da4w quantization, test_audio.wav (7.2s):

| Metric        | XNNPACK (686 MB) | Vulkan (781 MB) | Vulkan fp16 (550 MB) |
|---------------|------------------|-----------------|----------------------|
| Inference     | 0.56s            | 0.46s           | 0.32s                |
| Encoder speed | 188 tok/s        | 275 tok/s       | 360 tok/s            |
| Decoder speed | 657 tok/s        | 373 tok/s       | 746 tok/s            |

Authored by Claude (Anthropic)
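The `if(TARGET ...)` guard described above might look roughly like the following. This is a sketch, not the actual CMakeLists.txt diff: the target names come from the summary, but the exact `target_link_options()` arguments used in the real file are assumed here.

```cmake
# Sketch: only apply link options to op libraries that exist in this
# build configuration, so Vulkan-only or XNNPACK-only builds don't fail
# at configure time. Mirrors the existing optimized_native_cpu_ops_lib
# pattern described in the PR summary.
if(TARGET quantized_ops_lib)
  list(APPEND runner_link_libraries quantized_ops_lib)
endif()

if(TARGET custom_ops)
  list(APPEND runner_link_libraries custom_ops)
endif()
```

Without the guard, `target_link_options()` on a nonexistent target is a hard CMake configure error rather than a warning, which is why the check has to happen at configure time.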