Currently we use chunked fla for both prefilling and decoding. However, using recurrent fla for decoding should give better performance, since decoding typically processes one token per step and gains nothing from chunking. We should use chunked fla only for prefilling and switch to recurrent fla for decoding.
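To illustrate the proposed split, here is a minimal NumPy sketch of plain (ungated, unnormalized) causal linear attention. It is not the fla library's API: the function names `chunked_linear_attention` and `recurrent_step`, the `chunk_size` parameter, and the state layout are all assumptions made for illustration. The point is only that the chunked form can prefill the prompt and hand its final state to a per-token recurrent update, with both paths producing the same outputs.

```python
import numpy as np


def chunked_linear_attention(q, k, v, chunk_size=4, state=None):
    """Chunked form (hypothetical helper), suited to prefilling.

    q, k: (T, d_k), v: (T, d_v).
    Returns the outputs (T, d_v) and the final state (d_k, d_v).
    """
    T, d_k = q.shape
    d_v = v.shape[1]
    S = np.zeros((d_k, d_v)) if state is None else state.copy()
    out = np.empty((T, d_v))
    for start in range(0, T, chunk_size):
        end = min(start + chunk_size, T)
        qc, kc, vc = q[start:end], k[start:end], v[start:end]
        # Contribution of all earlier chunks, summarized by the state.
        inter = qc @ S
        # Causal attention inside the chunk (diagonal included).
        mask = np.tril(np.ones((end - start, end - start)))
        intra = ((qc @ kc.T) * mask) @ vc
        out[start:end] = inter + intra
        # Fold this chunk into the running state.
        S = S + kc.T @ vc
    return out, S


def recurrent_step(q_t, k_t, v_t, state):
    """Recurrent form (hypothetical helper), suited to decoding.

    Updates the state with one token, then reads out one output vector.
    """
    state = state + np.outer(k_t, v_t)
    return q_t @ state, state


# Prefill a prompt with the chunked form, then decode token by token.
rng = np.random.default_rng(0)
T_prefill, T_decode, d_k, d_v = 10, 5, 8, 6
q = rng.standard_normal((T_prefill + T_decode, d_k))
k = rng.standard_normal((T_prefill + T_decode, d_k))
v = rng.standard_normal((T_prefill + T_decode, d_v))

_, state = chunked_linear_attention(q[:T_prefill], k[:T_prefill], v[:T_prefill])
decoded = []
for t in range(T_prefill, T_prefill + T_decode):
    o_t, state = recurrent_step(q[t], k[t], v[t], state)
    decoded.append(o_t)
decoded = np.stack(decoded)

# Reference: run the chunked form over the full sequence.
reference, _ = chunked_linear_attention(q, k, v)
print(np.allclose(decoded, reference[T_prefill:]))
```

In this sketch each decoding step costs one outer product and one vector-matrix product, with no per-chunk masking or matmul setup. Whether that translates into a speedup for the real kernels would need to be confirmed by benchmarking.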