ci: skip agents with broken upstream URLs instead of failing the whole update run #276
Open
stefandevo wants to merge 1 commit into agentclientprotocol:main
… failing

When an upstream release stops publishing one of an agent's declared binary archives (e.g. vtcode 0.105.5 dropped its Windows zip), the hourly auto-update workflow currently dies in the registry-build URL-validation step. That blocks updates for *every* other agent until a human edits the offending agent.json.

Make the apply step resilient instead:

- Hoist url_exists() and validate_distribution_urls() into registry_utils.py so update_versions and build_registry share the same check.
- After update_versions writes a new agent.json, re-validate its distribution URLs; if any 404, restore the original file content byte-for-byte and report the agent as "skipped" rather than failed.
- Track skipped updates separately from real I/O failures so the workflow exits 0 when the only problem is a broken upstream binary.
- Surface skipped agents in both the JSON output and the human summary so it's obvious which agents are stuck and why.
- Add unit tests covering the revert path and the moved helpers.

Refs: failing run https://github.com/agentclientprotocol/registry/actions/runs/25275167557
Summary
The hourly Update Agent Versions workflow has been failing every run for several days (e.g. run 25275167557) because of a single agent, `vtcode`, whose upstream stopped publishing a Windows binary archive at the new version: `vtcode` 0.96.14 (currently in `main`) ships a Windows zip; 0.105.5 doesn't. So `update_versions.py --apply` rewrites `vtcode/agent.json` to point at a non-existent file, `build_registry.py` URL-validates it, the whole step exits 1, and no other agent gets updated until a human intervenes.

This PR makes the apply step resilient to single-agent breakage: when a planned update's new URLs aren't reachable, that one agent is rolled back to its previous content and reported as skipped, while the rest of the updates proceed.
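For readers who want a concrete picture of the rollback, here is a minimal sketch of the write, revalidate, and revert sequence. It is not the code in this PR: `apply_with_rollback` and the injected `find_broken_urls` callable are hypothetical names, and the real `apply_update` may be structured differently. The validator is passed in so the sketch makes no assumption about the `agent.json` schema or about how URLs are probed.

```python
import json
from pathlib import Path
from typing import Callable


def apply_with_rollback(
    agent_json: Path,
    new_content: bytes,
    find_broken_urls: Callable[[dict], list[str]],
) -> dict:
    """Write new_content, then revert if any distribution URL is unreachable.

    find_broken_urls is a hypothetical stand-in for the shared URL check:
    it receives the parsed agent.json and returns the URLs that failed.
    """
    # Keep the exact original bytes so the revert is byte-for-byte.
    original = agent_json.read_bytes()

    agent_json.write_bytes(new_content)

    broken = find_broken_urls(json.loads(new_content))
    if broken:
        # Broken upstream URL: restore the file and report a skip, not a failure.
        agent_json.write_bytes(original)
        return {"status": "skipped", "reason": "unreachable URLs: " + ", ".join(broken)}

    return {"status": "updated"}
```

Reading and restoring raw bytes, rather than re-serializing parsed JSON, is what keeps a skipped agent's file identical to what was in `main`, so a skip produces no diff.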
Changes
- Hoist `url_exists()` and `validate_distribution_urls()` from `build_registry.py` into `registry_utils.py` so both scripts share the same check (no behavior change for `build_registry.py`).
- The `update_versions.py` apply path now revalidates URLs after writing. If any URL fails, the original `agent.json` is restored byte-for-byte and the agent is recorded as skipped. Hard I/O errors still fail the run; broken upstream URLs no longer do.
- The JSON output gains a `skipped` array alongside `updates`/`errors`/`up_to_date`, so the workflow's downstream consumers (commit message, verify-agents) only see updates that actually landed.
- Add unit tests for the moved helpers in `registry_utils`, and tests for the new revert-and-skip path in `apply_update`. Also smoke-tested live against the real broken vtcode 0.105.5 Windows URL.

Behavior after this change
For the failing run: macOS, Linux, and the 19 other agents that have updates would all land in `main`; `vtcode` would stay at 0.96.14 (whose URLs all still work) and appear in the summary as skipped.

When upstream republishes the Windows binary (or the agent owner removes the `windows-x86_64` target), `vtcode` will resume auto-updating with no further action; no manual quarantine entry is needed.

Test plan
- `pytest .github/workflows/tests/`: 98 passed (93 baseline + 5 new)
- Ran `apply_update` against the real `vtcode/agent.json` with live network, confirmed the file is reverted to its original 0.96.14 content and the skip reason names the missing Windows zip
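As a rough illustration of the skipped-versus-failed split described under Changes, the sketch below groups per-agent outcomes into the four output arrays and derives the exit code from `errors` alone. The function name `build_report`, the input tuple shape, and the `agent`/`detail` fields are assumptions made for this example; only the array names `updates`, `skipped`, `errors`, and `up_to_date` come from the PR description.

```python
import json


def build_report(results):
    """Group (agent_id, status, detail) tuples and choose an exit code.

    The tuple shape and status strings are hypothetical; the four report
    keys match the arrays named in the PR description.
    """
    report = {"updates": [], "skipped": [], "errors": [], "up_to_date": []}
    bucket_for = {
        "updated": "updates",
        "skipped": "skipped",
        "error": "errors",
        "up_to_date": "up_to_date",
    }
    for agent_id, status, detail in results:
        report[bucket_for[status]].append({"agent": agent_id, "detail": detail})

    # Only hard failures fail the run; skipped agents leave the exit code at 0.
    exit_code = 1 if report["errors"] else 0
    return report, exit_code


def render(report):
    """Serialize the report for downstream workflow steps."""
    return json.dumps(report, indent=2)
```

Keeping skipped agents out of `updates` is what lets downstream steps such as the commit message list only the changes that actually landed.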