[DO NOT MERGE] increase number of blocks on spyre cards #556

yannicks1 · 2025-11-13T10:20:07Z

changes:

increasing number of blocks from 2080 to 8192

Note: do not merge yet until...

Signed-off-by: Yannick Schnider <[email protected]>

github-actions · 2025-11-13T10:20:16Z

👋 Hi! Thank you for contributing to vLLM support on Spyre.
Just a reminder: Make sure that your code passes all the linting checks, otherwise your PR won't be able to be merged. To do so, first install the linting requirements, then run format.sh and commit the changes. This can be done with uv directly:

uv sync --frozen --group lint --active --inexact

Or this can be done with pip:

uv pip compile --group lint > requirements-lint.txt
pip install -r requirements-lint.txt
bash format.sh

Now you are good to go 🚀

joerunde · 2025-11-13T15:45:41Z

@yannicks1 I posted this on our internal issue as well, but we need some way to check the version of the spyre runtime stack so that we can set these values appropriately. We wouldn't want a newer version of vllm-spyre to set these expanded limits when it's installed alongside an older version of the spyre runtime that doesn't support them.

(This would go away if we actually had APIs to call to get this data which we originally thought we would, but here we are 🤷 )

yannicks1 · 2025-11-14T10:21:16Z

yeah, this makes sense! did not mean to merge this as is, but rather to have a branch to test this.
we can add the checks here once we know how to do it.

yannicks1 · 2025-11-20T15:13:01Z

note: tkv x batchsize constraint will not be increased. number of blocks will be increased (for prefix caching)

maxdebayser · 2025-12-11T19:06:17Z

tests/models/test_granite.py

        )

-        assert granite_3_8b_config.cache_config.num_gpu_blocks_override == 2080
+        assert granite_3_8b_config.cache_config.num_gpu_blocks_override == 8192


As Travis mentioned, you could check the torch_sendnn version to see whether it already has support for 8K. The version can be found at torch_sendnn._version.__version__. However, I think the code should use 8192 as the default and downgrade to 2080 only if an old version of torch_sendnn is found, rather than the other way around: if someone is hacking on a local environment with editable installs directly from local git repos, the version information might be wrong.
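A minimal sketch of the default-then-downgrade logic suggested above, not the code that landed in this PR. The threshold MIN_VERSION_FOR_8K is a hypothetical placeholder, since the thread does not say which torch_sendnn release first supports 8192 blocks, and the helper names are illustrative. Treating a failed import as "keep the default" is also an assumption.

```python
import re

DEFAULT_NUM_BLOCKS = 8192  # new limit proposed in this PR
LEGACY_NUM_BLOCKS = 2080   # previous limit

# Hypothetical threshold: the first torch_sendnn release that supports
# 8192 blocks is not stated in this thread.
MIN_VERSION_FOR_8K = (1, 0, 0)


def parse_version(version):
    """Return the leading numeric components as a 3-tuple, or None if unparseable."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", str(version))
    if match is None:
        return None
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def num_blocks_for(version):
    """Use 8192 by default; downgrade to 2080 only for a known-old version."""
    if version is None:
        return DEFAULT_NUM_BLOCKS
    parsed = parse_version(version)
    if parsed is None:
        # Unparseable version strings (e.g. from a local editable install)
        # keep the default rather than forcing a downgrade.
        return DEFAULT_NUM_BLOCKS
    if parsed < MIN_VERSION_FOR_8K:
        return LEGACY_NUM_BLOCKS
    return DEFAULT_NUM_BLOCKS


def detect_torch_sendnn_version():
    """Read torch_sendnn._version.__version__, or return None if unavailable."""
    try:
        from torch_sendnn._version import __version__
    except Exception:
        # Assumption: a missing or broken install keeps the default limit.
        return None
    return __version__


num_blocks = num_blocks_for(detect_torch_sendnn_version())
```

Keeping the decision in num_blocks_for, separate from the import, makes the downgrade rule testable without torch_sendnn installed.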

increasing tkv x batch limit to 512k

9d5f04d

Signed-off-by: Yannick Schnider <[email protected]>

bump number of blocks

07be061

Signed-off-by: Yannick Schnider <[email protected]>

yannicks1 added 3 commits November 20, 2025 15:05

Merge branch 'main' into ysc-bump-spyre-limits

b179085

revert volumetric constraint

27b85d6

Signed-off-by: Yannick Schnider <[email protected]>

no diffs

86a72e3

Signed-off-by: Yannick Schnider <[email protected]>

yannicks1 changed the title ~~[DO NOT MERGE] bumping spyre card limits~~ [DO NOT MERGE] increase number of blocks on spyre cards Nov 20, 2025

maxdebayser reviewed Dec 11, 2025

yannicks1 added 2 commits December 11, 2025 22:38

Merge branch 'main' into ysc-bump-spyre-limits

b436c94

add version check

281a44d

Signed-off-by: Yannick Schnider <[email protected]>

yannicks1 marked this pull request as ready for review December 11, 2025 23:05

yannicks1 requested review from nikolaospapandreou, prashantgupta24, rafvasq, sducouedic and tdoublep as code owners December 11, 2025 23:05

yannicks1 requested review from joerunde and tjohnson31415 December 11, 2025 23:05

fallback for import error

d6e3590

Signed-off-by: Yannick Schnider <[email protected]>

yannicks1 requested a review from maxdebayser December 12, 2025 10:01
