Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
sayakpaul left a comment
Thanks a lot for prioritizing it.
docs/source/en/optimization/fp16.md (Outdated)
The kernel skill basically lets users get an agent to write custom kernels for a model and hardware. It's not specific to the attention processor; it also covers other modules such as RMSNorm. Should we make that clearer?
lmk if this is clearer!
docs/source/en/optimization/fp16.md (Outdated)
It wasn't just RMSNorm; other modules are implemented with custom kernels as well.
I mention RMSNorm only as an example for the benchmark results below
Thanks! I would also like to see what @burtenshaw thinks about this.
- adds a kernels section in the Accelerate inference docs with the results
- adds a section in the Attention backends docs which demonstrates support for loading attention kernels with `set_attention_backend`
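
For reference, a minimal sketch of what loading an attention backend with `set_attention_backend` looks like. The `"flash"` backend name and the FLUX checkpoint are illustrative assumptions, not taken from this PR or its docs:

```python
# Minimal sketch of switching attention backends in diffusers.
# Assumes a recent diffusers release; the backend name and checkpoint
# below are illustrative, not taken from this PR.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

# Route the transformer's attention through a kernel-backed backend.
pipe.transformer.set_attention_backend("flash")

image = pipe(
    "a photo of an astronaut riding a horse",
    num_inference_steps=28,
).images[0]
```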