Accelerate SVE128 SBGEMM/BGEMM by fadara01 · Pull Request #5667 · OpenMathLib/OpenBLAS

fadara01 · 2026-03-05T13:52:28Z

This accelerates SBGEMM/BGEMM by extending the existing 8x4 kernel to 8x8 (unrolling N by 8).
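For readers unfamiliar with the tile notation, here is a minimal scalar C sketch of what an 8x8 register tile means. It is not the kernel from this PR: the function name `bf16_microkernel_8x8`, the packed panel layouts, the column-major C layout and the truncating bf16 conversion are all assumptions made for illustration, and it uses no SVE intrinsics. The only point it makes is that with N unrolled by 8, each A value loaded for a given k is reused across 8 columns of B instead of 4.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* bfloat16 stored as the upper 16 bits of an IEEE-754 binary32 value. */
typedef uint16_t bf16_t;

static float bf16_to_f32(bf16_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Truncating conversion (no rounding); an assumption, enough for a sketch. */
static bf16_t f32_to_bf16_trunc(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return (bf16_t)(bits >> 16);
}

enum { MR = 8, NR = 8 }; /* tile is MR rows by NR columns */

/*
 * Hypothetical micro-kernel: C(0..7, 0..7) += A_panel * B_panel.
 *   a: packed MR x K panel, element (i, k) at a[k * MR + i]
 *   b: packed K x NR panel, element (k, j) at b[k * NR + j]
 *   c: column-major fp32 tile, element (i, j) at c[i + j * ldc]
 */
static void bf16_microkernel_8x8(int K, const bf16_t *a, const bf16_t *b,
                                 float *c, int ldc) {
    assert(K >= 0 && ldc >= MR);
    float acc[MR][NR] = {{0.0f}};

    for (int k = 0; k < K; ++k) {
        /* Load one column of the A panel once per k ... */
        float av[MR];
        for (int i = 0; i < MR; ++i)
            av[i] = bf16_to_f32(a[k * MR + i]);

        /* ... and reuse it for all NR = 8 columns of B. */
        for (int j = 0; j < NR; ++j) {
            float bv = bf16_to_f32(b[k * NR + j]);
            for (int i = 0; i < MR; ++i)
                acc[i][j] += av[i] * bv;
        }
    }

    /* Accumulate the fp32 tile into C. */
    for (int j = 0; j < NR; ++j)
        for (int i = 0; i < MR; ++i)
            c[i + j * ldc] += acc[i][j];
}
```

Setting `NR` to 4 in this sketch gives the 8x4 shape, which needs twice as many passes over the same A panel to cover 8 columns of C. Whether that reuse is the main source of the measured speedup is not something the sketch can show.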

I'm not sure whether it's a good idea to delete the previous 8x4 kernel.

Here are the speedups on a single Neoverse-V2 core (SVE128) compared to the previous state:

  M=N=K=64: SBGEMM 1.164x (16.42%), BGEMM 1.133x (13.30%)
  M=N=K=128: SBGEMM 1.220x (22.02%), BGEMM 1.186x (18.56%)
  M=N=K=256: SBGEMM 1.241x (24.08%), BGEMM 1.235x (23.54%)
  M=N=K=512: SBGEMM 1.240x (23.95%), BGEMM 1.227x (22.75%)
  M=N=K=1024: SBGEMM 1.251x (25.11%), BGEMM 1.232x (23.23%)
  M=N=K=2048: SBGEMM 1.235x (23.47%), BGEMM 1.246x (24.64%)

Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>

fadara01 · 2026-03-05T13:53:51Z

Hi @martin-frbg - could you please have a look?

(This currently copies the 8x4 kernel and extends it to 8x8 - please let me know if it's a good idea to remove the 8x4 kernel)

fadara01 wants to merge 1 commit into OpenMathLib:develop from fadara01:accelerate_sve128_sbgemm
