Skip to content

Fix async release before HTLC decode#4548

Open
valentinewallace wants to merge 2 commits intolightningdevkit:mainfrom
valentinewallace:2026-04-async-held-htlc-ordering-v2
Open

Fix async release before HTLC decode#4548
valentinewallace wants to merge 2 commits intolightningdevkit:mainfrom
valentinewallace:2026-04-async-held-htlc-ordering-v2

Conversation

@valentinewallace
Copy link
Copy Markdown
Contributor

Handle ReleaseHeldHtlc messages that arrive before the sender-side LSP has even queued the held HTLC for onion decoding. Unlike #4106, which covers releases arriving after the HTLC is in decode_update_add_htlcs but before it reaches pending_intercepted_htlcs, this preserves releases that arrive one step earlier and would otherwise be dropped as HTLC not found.

Supersedes #4542

@valentinewallace valentinewallace added this to the 0.3 milestone Apr 8, 2026
@valentinewallace valentinewallace self-assigned this Apr 8, 2026
@valentinewallace valentinewallace added the weekly goal Someone wants to land this this week label Apr 8, 2026
@ldk-reviews-bot
Copy link
Copy Markdown

ldk-reviews-bot commented Apr 8, 2026

👋 Thanks for assigning @TheBlueMatt as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@ldk-claude-review-bot
Copy link
Copy Markdown
Collaborator

ldk-claude-review-bot commented Apr 8, 2026

After thoroughly reviewing every file and hunk in this PR diff — tracing all race condition windows, lock ordering, state machine transitions, and serialization correctness — I found no new issues.

Review Summary

No issues found.

Prior comment status

All prior review comments are resolved or no longer applicable in this version of the PR:

  • channelmanager.rs:7534 (intercept_forward bypass for released_in_transit) — Not applicable: this version uses a simpler approach where hold_htlc is cleared at the channel level via release_pending_inbound_held_htlc, so when should_hold_htlc() returns false the HTLC naturally falls through to the else if intercept_forward check at line 7521.
  • channel.rs:8157 (stale Self::is_inbound_htlc_released reference) — Resolved: no such stale reference in current code.
  • channel.rs:8171 ("HTlC" typo) — Resolved: no such typo in current code.
  • channelmanager.rs:7504 ("arrived arrived" double word) — Resolved: no duplicate word in current code.

Architecture assessment

The approach is clean and correct:

  1. release_pending_inbound_held_htlc (channel.rs:8116): Searches monitor_pending_update_adds first (covers post-Committed clones), then pre-Committed states in pending_inbound_htlcs. The Committed state is intentionally not matched because by then the clone exists in monitor_pending_update_adds.

  2. Moved push_decode_update_add_htlcs (channelmanager.rs:10665): Now called under peer_state lock, closing the race window where an HTLC was invisible to all three release-handler checks.

  3. Lock ordering: Consistent — peer_state → decode_update_add_htlcs in the new code. The release handler releases peer_state before acquiring decode_update_add_htlcs (no deadlock). The serialization code now acquires decode_update_add_htlcs after peer_states, matching the new ordering.

  4. Serialization scoping (channelmanager.rs:17956-17962): Correctly releases the lock early when the map is empty. Safe because total_consistency_lock write lock prevents concurrent modifications.

Copy link
Copy Markdown
Contributor

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly looks good, CI is just failing due to some broken doc links AFAICT.

Also validated this still works in conjunction with #4463 / for lightningdevkit/ldk-node#817.

@valentinewallace valentinewallace force-pushed the 2026-04-async-held-htlc-ordering-v2 branch 2 times, most recently from ae776b9 to bd632fb Compare April 9, 2026 18:10
@valentinewallace
Copy link
Copy Markdown
Contributor Author

Addressed feedback with diff (plus an extra push for a whitespace fix).

tnull
tnull previously approved these changes Apr 9, 2026
Copy link
Copy Markdown
Contributor

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just some doc nits.

Feel free to land as is though, assuming CI passes. (mod Claude's comments, maybe)

@valentinewallace valentinewallace force-pushed the 2026-04-async-held-htlc-ordering-v2 branch from bd632fb to f764872 Compare April 9, 2026 18:15
@tnull tnull requested a review from TheBlueMatt April 9, 2026 18:18
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 9, 2026

Codecov Report

❌ Patch coverage is 72.22222% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.01%. Comparing base (5704e8e) to head (2ebc372).
⚠️ Report is 12 commits behind head on main.

Files with missing lines Patch % Lines
lightning/src/ln/channel.rs 47.36% 9 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4548      +/-   ##
==========================================
+ Coverage   86.99%   87.01%   +0.01%     
==========================================
  Files         163      163              
  Lines      108635   108682      +47     
  Branches   108635   108682      +47     
==========================================
+ Hits        94511    94572      +61     
+ Misses      11647    11631      -16     
- Partials     2477     2479       +2     
Flag Coverage Δ
fuzzing 40.23% <22.22%> (-0.07%) ⬇️
tests 86.10% <72.22%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

);
if pending_add.forward_info.routing.should_hold_htlc() {
let hold_htlc = if pending_add.forward_info.routing.should_hold_htlc() {
// Check whether a ReleaseHeldHtlc arrived after the manager pulled a copy of this
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feel like it should be fixed by taking an additional lock in handle_release_held_htlc rather than checking for it here?

Copy link
Copy Markdown
Contributor Author

@valentinewallace valentinewallace Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe so -- did you see this race condition? #4542 (comment) Maybe we don't support that kind of multithreaded usage anyway?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, if we try to acquire the decode_update_add_htlcs lock prior to taking the channel lock, which I think would fix a different race you may be pointing out here, it currently results in a lock order violation.

Copy link
Copy Markdown
Contributor Author

@valentinewallace valentinewallace Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok fixed in a nicer way that will avoid similar race conditions in the future, see the first commit

@valentinewallace valentinewallace force-pushed the 2026-04-async-held-htlc-ordering-v2 branch from f764872 to 1231cb8 Compare April 9, 2026 23:37
valentinewallace and others added 2 commits April 9, 2026 20:56
This avoids race conditions where we're unable to properly update an HTLC's
state because we need to update its state in the ChannelManager, but the HTLC
is stuck in transit from the Channel to
ChannelManager::decode_update_add_htlcs. Now the HTLC will atomically go from
the Channel to the ChannelManager decode queue under the same lock.
Handle `ReleaseHeldHtlc` messages that arrive before the sender-side LSP
has even queued the held HTLC for onion decoding. Unlike lightningdevkit#4106, which
covers releases arriving after the HTLC is in `decode_update_add_htlcs`
but before it reaches `pending_intercepted_htlcs`, this preserves
releases that arrive one step earlier and would otherwise be dropped as
HTLC not found.

Co-Authored-By: HAL 9000
Co-Authored-By: Elias Rohrer <dev@tnull.de>
@valentinewallace valentinewallace force-pushed the 2026-04-async-held-htlc-ordering-v2 branch from 1231cb8 to 2ebc372 Compare April 10, 2026 00:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

weekly goal Someone wants to land this this week

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

5 participants