Skip to content

feat: auto failover APIs with LK Cloud#686

Open
davidzhao wants to merge 13 commits into
mainfrom
dz/region-failover
Open

feat: auto failover APIs with LK Cloud#686
davidzhao wants to merge 13 commits into
mainfrom
dz/region-failover

Conversation

@davidzhao

Copy link
Copy Markdown
Member

retries in alternative datacenters on 5xx and transport failures

also removed legacy camel-case, which was not needed since we switched to protobuf-es

retries in alternative datacenters on 5xx and transport failures
@davidzhao davidzhao requested review from anunaym14 and lukasIO June 27, 2026 21:42
@changeset-bot

changeset-bot Bot commented Jun 27, 2026

Copy link
Copy Markdown

🦋 Changeset detected

Latest commit: d092da9

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages
Name Type
livekit-server-sdk Patch
agent-dispatch Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@davidzhao davidzhao requested a review from 1egoman June 27, 2026 21:42
Add auto failover APIs for LK Cloud in livekit-server-sdk.
@davidzhao

Copy link
Copy Markdown
Member Author

CI depends on livekit/livekit#4627

devin-ai-integration[bot]

This comment was marked as resolved.

return { origins, ttl };
}

function parseMaxAge(cacheControl: string | null): number {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: would be nice to have some unit tests for this (can add this in a follow up)

* cached list when available, otherwise an empty list. Forwards `headers` so a
* valid token — and any test directives — reach the discovery endpoint.
*/
export async function regionOrigins(origin: URL, headers: unknown): Promise<string[]> {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be worth wrapping this in a origin keyed mutex so that we ensure only one regionOrigins request is processed at a time.

For the ones that are queued up afterwards they should be resolved immediately after as long as there's at least some ttl set for the cache.

devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

@devin-ai-integration devin-ai-integration Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 new potential issue.

Open in Devin Review

Comment on lines +221 to +223
const timeout = options.waitUntilAnswered
? (options.timeout ?? DEFAULT_RINGING_TIMEOUT_SECONDS)
: options.timeout;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 WhatsApp call acceptance times out before ringing finishes when user sets a custom ring duration

The HTTP request timeout for waiting WhatsApp calls ignores the user-supplied ringing duration (options.timeout ?? DEFAULT_RINGING_TIMEOUT_SECONDS at ConnectorClient.ts:221-222) and always defaults to 30 seconds, so a call whose ringing window is set longer (e.g. 60 s) will be aborted by the SDK before the phone can be answered.

Impact: Users who set a custom ringing timeout and wait for the call to be answered will see spurious timeout errors.

Timeout computation differs between WhatsApp and SIP paths

The SIP path in SipClient.ts:762-764 correctly uses dialRequestTimeout(opts.timeout, opts.ringingTimeout) (from dialTimeout.ts:30-36), which computes Math.max(timeout ?? floor, floor) where floor = ringingTimeout + 2. This guarantees the HTTP request outlasts the ringing window by at least 2 seconds.

The WhatsApp path at ConnectorClient.ts:221-222 ignores options.ringingTimeout entirely:

const timeout = options.waitUntilAnswered
  ? (options.timeout ?? DEFAULT_RINGING_TIMEOUT_SECONDS)
  : options.timeout;
  • If user sets ringingTimeout: 60 but not timeout, the HTTP timeout is 30 s while the server may wait 60 s.
  • Even with defaults (both 30 s), there is no 2 s margin — the request can abort just as the call is answered.

The AcceptWhatsAppCallOptions.timeout JSDoc at ConnectorClient.ts:91-96 explicitly promises "is raised, if needed, to stay above ringingTimeout" but the implementation does not fulfil that contract.

Suggested change
const timeout = options.waitUntilAnswered
? (options.timeout ?? DEFAULT_RINGING_TIMEOUT_SECONDS)
: options.timeout;
const timeout = options.waitUntilAnswered
? dialRequestTimeout(options.timeout, options.ringingTimeout)
: options.timeout;
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants