[Parakeet] Add Vulkan backend documentation and fix CMake build#18734

Open
SS-JIA wants to merge 1 commit into main from pr18734
Conversation

@SS-JIA SS-JIA commented Apr 7, 2026

Summary:
Add Vulkan backend documentation to the Parakeet README covering export
commands, quantization options, build instructions, and runner examples.

Guard the `quantized_ops_lib` and `custom_ops` link targets with
`if(TARGET ...)` in CMakeLists.txt. These targets don't exist in
Vulkan-only or XNNPACK-only builds, causing a hard CMake configure
error from `target_link_options()`. This matches the existing pattern
used for `optimized_native_cpu_ops_lib`.
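
A sketch of the guard pattern described above (the runner target name is hypothetical and the real CMakeLists.txt context differs; the same guard applies to the `target_link_options()` call mentioned above):

```cmake
# Only link the optional op libraries when their targets exist; in a
# Vulkan-only or XNNPACK-only build they are never defined, and calling
# target_link_options() / target_link_libraries() on a missing target
# is a hard configure error.
if(TARGET quantized_ops_lib)
  target_link_libraries(parakeet_runner PRIVATE quantized_ops_lib)
endif()

if(TARGET custom_ops)
  target_link_libraries(parakeet_runner PRIVATE custom_ops)
endif()
```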

Validated on Samsung S24 (Adreno 750), 8da4w quantization, test_audio.wav (7.2s):

| Metric         | XNNPACK (686 MB) | Vulkan (781 MB) | Vulkan fp16 (550 MB) |
|----------------|------------------|-----------------|----------------------|
| Inference      | 0.56s            | 0.46s           | 0.32s                |
| Encoder speed  | 188 tok/s        | 275 tok/s       | 360 tok/s            |
| Decoder speed  | 657 tok/s        | 373 tok/s       | 746 tok/s            |

Authored by Claude (Anthropic)

@pytorch-bot pytorch-bot bot added the module: vulkan Issues related to the Vulkan delegate and code under backends/vulkan/ label Apr 7, 2026
pytorch-bot bot commented Apr 7, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18734

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 3 Unrelated Failures

As of commit df72cc1 with merge base fcccda3:

NEW FAILURE - The following job has failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 7, 2026
github-actions bot commented Apr 7, 2026

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@SS-JIA SS-JIA force-pushed the pr18734 branch 2 times, most recently from 0ad610d to c745a39 Compare April 7, 2026 12:36
--qlinear_encoder_group_size 32 \
--qlinear 8da4w \
--qlinear_group_size 32 \
--vulkan_force_fp16 \
Contributor
We can't use the `--dtype` flag?

Contributor Author
- `--dtype fp16`: inputs and outputs are also cast to fp16, so from the caller's perspective, input/output is fp16.
- `--vulkan_force_fp16`: inputs and outputs remain fp32. The Vulkan backend automatically converts inputs to fp16 within the delegate and converts outputs back to fp32, so from the caller's perspective, input/output is fp32.

`--vulkan_force_fp16` is a bit simpler for client code since the caller doesn't have to handle the conversion to/from fp16, so I defaulted to that.
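
The difference in caller contracts can be illustrated with a small sketch (NumPy stands in for the model I/O; the variable names are hypothetical):

```python
import numpy as np

audio = np.zeros(16000, dtype=np.float32)  # caller's audio buffer

# With --dtype fp16 the exported model's interface is fp16, so the
# caller must cast inputs down and outputs back up itself:
inp_fp16 = audio.astype(np.float16)
out_fp16 = inp_fp16 * 2          # stand-in for model execution
result = out_fp16.astype(np.float32)

# With --vulkan_force_fp16 the interface stays fp32; the delegate
# casts internally, so the caller passes fp32 straight through:
result2 = audio * 2              # stand-in; no caller-side casts
```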

Another thing: for the `export_parakeet_tdt.py` script, `--dtype fp16` currently hits a guard:

export_parakeet_tdt.py: error: fp16 is not yet supported

I wasn't sure whether this is because the runner binary doesn't handle fp16 input/output yet, so I opted for `--vulkan_force_fp16` instead.

Would you prefer enabling usage of the --dtype flag for fp16 inference?

Contributor Author

Also updated the text to clarify the special properties of the `--vulkan_force_fp16` flag that wouldn't be covered by `--dtype`.
