-
Notifications
You must be signed in to change notification settings - Fork 6
feat(ocap-kernel): implement distributed garbage collection protocol #814
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Coverage Report
File Coverage
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
grypez
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Upgrades from vat gc -> endpoint gc
LGTM, but probably @FUDCo should look, too. The ping-pong prevention works for two kernels; is there a case of three kernels that can get BOYD-locked?
| const crankResult = await remote.deliverBringOutYourDead(); | ||
| expect(mockRemoteComms.sendRemoteMessage).toHaveBeenCalledWith( | ||
| mockRemotePeerId, | ||
| expect.any(String), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe worth us writing a plugin like expect.serialized({ key: value }).
FUDCo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remarkable how much of this just consisted of renaming things. Very nice.
…779) Implement the DGC protocol so that `deliverBringOutYourDead()` on a RemoteHandle sends a BOYD wire message to the remote kernel, which schedules a local reap, runs GC, and sends back drops/retires. - Widen `scheduleReap` to accept `EndpointId` (vat or remote) - Update GC action parsing to use `insistEndpointId` - Add `bringOutYourDead` delivery type to RemoteHandle - Add ping-pong prevention flag to avoid infinite BOYD loops - Add unit tests for BOYD protocol and GC endpoint widening - Add e2e tests for distributed GC in remote-comms Co-Authored-By: Claude Opus 4.6 <[email protected]>
Move the #remoteGcRequested in-memory state change after the savepoint commit, consistent with the transactional message processing pattern from #811. Also consolidate test assertions to use toStrictEqual on parsed objects per project convention, and fix return value expectation. Co-Authored-By: Claude Opus 4.6 <[email protected]>
|
7f4aeac to
dba940f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
FUDCo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Latest update seems to do exactly what was promised in the TODO comment :-)
I think the issue that Cursor Bugbot flagged is real, so if you need to do another turn of changes to address that, I'm happy to do another quick review round after.
FUDCo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Latest update seems to do exactly what was promised in the TODO comment :-)
I think the issue that Cursor Bugbot flagged is real, so if you need to do another turn of changes to address that, I'm happy to do another quick review round after.
…d store cache
The DGC e2e tests had two compounding bugs: they called scheduleReap('r0')
but the first allocated remote is r1, and they used a separate KernelStore
instance whose cache was isolated from the kernel's internal store.
Add reapRemotes() to RemoteManager and Kernel (mirroring reapVats) and
use it in the e2e tests instead of bypassing the kernel with an external
store. Remove getGCActions() and nextRemoteId assertions that suffered
from the same stale cache issue.
Also reset #remoteGcRequested flag in handlePeerRestart() so BOYD is
correctly sent to a new incarnation after peer restart.
Co-Authored-By: Claude Opus 4.6 <[email protected]>
5297c39 to
2e895a7
Compare
FUDCo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good now...
Closes #779
Summary
Implements the DGC (Distributed Garbage Collection) protocol for remote kernel communication (Issue #779).
deliverBringOutYourDead()onRemoteHandlenow sends a['bringOutYourDead']wire message to the remote kernel instead of being a no-op#remoteGcRequested) prevents infinite BOYD loops between kernelsChanges
GC store widening (
gc.ts,garbage-collection.ts):scheduleReapto acceptEndpointId(union ofVatId | RemoteId) instead of justVatIdaddGCActionsandprocessGCActionSetto useinsistEndpointIdinstead ofinsistVatIdvatId/actionsByVattoendpointId/actionsByEndpointfor clarityBOYD protocol (
RemoteHandle.ts):BringOutYourDeadDeliveryto theDeliveryParamsunion typedeliverBringOutYourDead()to send BOYD over the wire via#sendRemoteCommand'bringOutYourDead'case to#handleRemoteDeliverthat callsscheduleReap#remoteGcRequestedflag: set after the savepoint commit (consistent with the transactional message processing pattern from Complete Ken protocol implementation #811), when BOYD was triggered by an incoming remote request the subsequent local BOYD is suppressedTests:
Test plan
🤖 Generated with Claude Code
Note
Medium Risk
Adds a new cross-kernel
bringOutYourDeadcontrol message and widens GC scheduling from vats to all endpoints, which can affect remote message flow and garbage-collection behavior if mis-triggered, but changes are mostly additive and well-covered by new unit/e2e tests.Overview
Implements distributed garbage collection across kernels by turning
RemoteHandle.deliverBringOutYourDead()into a real wire-leveldelivermessage (['bringOutYourDead']) that causes the receiving kernel toscheduleReapits remote endpoint and run GC, with a#remoteGcRequestedflag to prevent infinite BOYD ping-pong and to reset on peer restart.Widens GC plumbing from vat-only to endpoint-wide operation by treating GC actions and reap scheduling as
EndpointId(vat or remote) rather thanVatId, and addsKernel.reapRemotes()/RemoteManager.reapRemotes()debugging/test hooks to trigger remote reaping.Adds extensive coverage: new RemoteHandle unit tests for BOYD send/receive, seq/ack/persistence, ping-pong prevention, RemoteManager tests for reap scheduling, GC store tests for remote endpoint IDs, and nodejs e2e scenarios validating BOYD exchange and continued bidirectional messaging after DGC.
Written by Cursor Bugbot for commit ae4b9cc. This will update automatically on new commits. Configure here.