When the LLM returns HTTP 429 (Too Many Requests), we should retry the request automatically, and the user should be able to cancel the retry at any time.
A good default is to retry up to 10 times, doubling the wait between attempts (2s, 4s, 8s, ...).
It also makes sense to keep the model name readily available, so that if this is not enough we can eventually handle different HTTP error codes for different models.
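
To make the proposal concrete, here is a minimal Python sketch of the retry loop, not a definitive implementation. Everything named in it is an assumption for illustration: `call_with_retry`, `backoff_delays`, `RetryCancelled`, and the `(status, body)` response shape are hypothetical, and `send` stands in for whatever function actually performs the LLM request. Cancellation is modeled with a `threading.Event`; the `retry_statuses` parameter is one possible hook for per-model error codes later.

```python
import threading
from typing import Callable, Iterable, List, Tuple

# Hypothetical response shape: (HTTP status code, response body).
Response = Tuple[int, str]


class RetryCancelled(Exception):
    """Raised when the caller cancels while we are waiting to retry."""


def backoff_delays(max_retries: int = 10, initial_delay: float = 2.0) -> List[float]:
    """Return the wait before each retry, doubling each time (2s, 4s, 8s, ...)."""
    return [initial_delay * 2 ** i for i in range(max_retries)]


def call_with_retry(
    send: Callable[[], Response],
    cancel: threading.Event,
    max_retries: int = 10,
    initial_delay: float = 2.0,
    retry_statuses: Iterable[int] = (429,),
) -> Response:
    """Call send(), retrying with exponential backoff on the given statuses.

    Makes at most max_retries + 1 attempts. If every attempt is rate limited,
    the last response is returned so the caller can report it.
    """
    statuses = set(retry_statuses)
    for delay in backoff_delays(max_retries, initial_delay):
        response = send()
        if response[0] not in statuses:
            return response
        # Event.wait returns True as soon as the event is set, so a
        # cancellation interrupts the sleep instead of waiting it out.
        if cancel.wait(timeout=delay):
            raise RetryCancelled("retry cancelled by caller")
    return send()
```

A caller would create one `threading.Event` per request and call `cancel.set()` from the UI's cancel handler. With the defaults, the final wait is 1024 seconds, so a real implementation would probably want to cap the delay; that cap is not part of the original proposal.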