Skip to content

Exit quiescence when splice_init and tx_init_rbf are rejected#4495

Open
jkczyz wants to merge 5 commits intolightningdevkit:mainfrom
jkczyz:2026-03-splicing-rbf-follow-ups
Open

Exit quiescence when splice_init and tx_init_rbf are rejected#4495
jkczyz wants to merge 5 commits intolightningdevkit:mainfrom
jkczyz:2026-03-splicing-rbf-follow-ups

Conversation

@jkczyz
Copy link
Copy Markdown
Contributor

@jkczyz jkczyz commented Mar 18, 2026

Summary

  • Fix quiescence not being exited when tx_init_rbf or splice_init is rejected with Abort (e.g., insufficient RBF feerate, negotiation in progress), which left the channel stuck in a quiescent state
  • Refactor both handlers to return InteractiveTxMsgError, reusing the same pattern already used by the interactive TX message handlers, with a shared handle_interactive_tx_msg_err helper in channelmanager

Test plan

  • Existing test_splice_rbf_insufficient_feerate updated to verify quiescence is properly exited after tx_abort
  • Existing test_splice_feerate_too_high updated to verify quiescence is properly exited after splice_init rejection
  • All splicing tests pass

🤖 Generated with Claude Code

Based on #4494.

@ldk-reviews-bot
Copy link
Copy Markdown

ldk-reviews-bot commented Mar 18, 2026

👋 Thanks for assigning @wpaulino as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@jkczyz jkczyz changed the title Exit quiescence when splice_init and tx_init_rbf are rejected Exit quiescence when splice_init and tx_init_rbf are rejected Mar 18, 2026
@jkczyz jkczyz force-pushed the 2026-03-splicing-rbf-follow-ups branch from a5d670c to 57aad34 Compare March 18, 2026 18:57
@jkczyz jkczyz self-assigned this Mar 19, 2026
@jkczyz jkczyz mentioned this pull request Mar 19, 2026
36 tasks
@jkczyz jkczyz force-pushed the 2026-03-splicing-rbf-follow-ups branch 3 times, most recently from 8cdc480 to 959b553 Compare March 31, 2026 00:16
@codecov
Copy link
Copy Markdown

codecov bot commented Mar 31, 2026

Codecov Report

❌ Patch coverage is 61.79775% with 34 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.22%. Comparing base (8d00139) to head (959b553).
⚠️ Report is 111 commits behind head on main.

Files with missing lines Patch % Lines
lightning/src/ln/channelmanager.rs 42.85% 30 Missing and 2 partials ⚠️
lightning/src/ln/channel.rs 92.30% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4495      +/-   ##
==========================================
- Coverage   86.24%   86.22%   -0.02%     
==========================================
  Files         160      161       +1     
  Lines      107909   108651     +742     
  Branches   107909   108651     +742     
==========================================
+ Hits        93061    93681     +620     
- Misses      12212    12339     +127     
+ Partials     2636     2631       -5     
Flag Coverage Δ
tests 86.22% <61.79%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@jkczyz jkczyz force-pushed the 2026-03-splicing-rbf-follow-ups branch from 959b553 to 69a1a1c Compare March 31, 2026 22:28
@jkczyz jkczyz marked this pull request as ready for review April 2, 2026 17:03
@ldk-reviews-bot ldk-reviews-bot requested a review from wpaulino April 2, 2026 17:04
Comment on lines +13970 to +13974
fn quiescent_negotiation_err(&mut self, err: ChannelError) -> InteractiveTxMsgError {
let exited_quiescence =
if matches!(err, ChannelError::Abort(_)) { self.exit_quiescence() } else { false };
InteractiveTxMsgError { err, splice_funding_failed: None, exited_quiescence }
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In splice_init, resolve_queued_contribution (which can return Abort(FeeRateTooHigh)) is called before validate_splice_init checks is_quiescent(). If a misbehaving peer sends splice_init during the STFU handshake (before quiescence is established), and the holder has a queued contribution with a low max_feerate, the Abort error would flow into quiescent_negotiation_errexit_quiescence(), which has debug_assert!(!is_local_stfu_sent()) / debug_assert!(!is_remote_stfu_sent()).

Those asserts would fire because the STFU flags are still set (handshake incomplete). In release builds, exit_quiescence would harmlessly return false, but in debug/test builds this would panic.

Same issue applies to tx_init_rbf at line 12762 (and line 12780).

Consider guarding with an is_quiescent() check before calling exit_quiescence:

Suggested change
fn quiescent_negotiation_err(&mut self, err: ChannelError) -> InteractiveTxMsgError {
let exited_quiescence =
if matches!(err, ChannelError::Abort(_)) { self.exit_quiescence() } else { false };
InteractiveTxMsgError { err, splice_funding_failed: None, exited_quiescence }
}
fn quiescent_negotiation_err(&mut self, err: ChannelError) -> InteractiveTxMsgError {
let exited_quiescence = if matches!(err, ChannelError::Abort(_))
&& self.context.channel_state.is_quiescent()
{
self.exit_quiescence()
} else {
false
};
InteractiveTxMsgError { err, splice_funding_failed: None, exited_quiescence }
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, though I took the opportunity to have validate_splice_init and validate_ tx_init_rbf no longer call validate_splice_contributions and FundingScope::for_splice. Rather they are called at those call sites. This let's us check if we are quiescent before checking fee rates when calling resolve_queued_contribution.

@ldk-claude-review-bot
Copy link
Copy Markdown
Collaborator

ldk-claude-review-bot commented Apr 2, 2026

I've completed a thorough review of the entire PR diff, examining every file and hunk for bugs, security issues, and logic errors.

Verification performed:

  1. quiescent_negotiation_err debug_assert safety: validate_splice_init (line 12697) and validate_tx_init_rbf (line 12967) both check is_quiescent() first, returning WarnAndDisconnect for non-quiescent channels. Since these are called before resolve_queued_contribution and other Abort-producing paths, the debug_assert!(is_quiescent()) at line 14254 can never fire on a non-quiescent channel. My prior review concern is fully resolved by the reordering.

  2. Reordering safety: resolve_queued_contribution takes &self (line 12822), so it's side-effect-free and reordering it after validation is safe.

  3. handle_interactive_tx_msg_err extraction: Faithful extraction of the inline code from internal_tx_msg, with identical semantics. Lock ordering (acquiring pending_events while peer_state is held) matches existing patterns in the codebase (e.g., internal_tx_abort at line 12191).

  4. No Close errors on these paths: All error returns from validate_splice_init, validate_tx_init_rbf, resolve_queued_contribution, and validate_splice_contributions produce only WarnAndDisconnect or Abort — never Close. So from_chan_no_close is used correctly.

  5. validate_splice_contributions migration: Moved from validate_splice_init/validate_tx_init_rbf to their callers with identical parameters and error wrapping (WarnAndDisconnect).

  6. exit_quiescence visibility: Correctly made non-test-only since quiescent_negotiation_err needs it in production.

  7. exited_quiescence flow: Properly triggers check_free_peer_holding_cells via handle_error (line 4613-4626), matching the existing tx_abort handling pattern.

  8. Tests: Cover the key scenarios — quiescence exit on rejection, STFU re-proposal when pending action exists, holding cell freeing, and misbehaving peer detection.

No issues found.

Comment on lines 12688 to 12690
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that we'll now abort an in-progress RBF if the counterparty misbehaves by sending us tx_init_rbf. The spec isn't really clear on what to do here. There's a similar case in validate_splice_init where we'll disconnect if we already have a pending splice (even if negotiation is still in progress). The spec does indicate this should be done. So maybe both of these are fine?

When tx_init_rbf is rejected with ChannelError::Abort (e.g.,
insufficient RBF feerate, negotiation in progress, feerate too high),
the error is converted to a tx_abort message but quiescence is never
exited and holding cells are never freed. This leaves the channel stuck
in a quiescent state.

Fix this by intercepting ChannelError::Abort before try_channel_entry!
in internal_tx_init_rbf, calling exit_quiescence on the channel, and
returning the error with exited_quiescence set so that handle_error
frees holding cells. Also make exit_quiescence available in non-test
builds by removing its cfg gate.

Update tests to use the proper RBF initiation flow (with tampered
feerates) so that handle_tx_abort correctly echoes the abort and exits
quiescence, rather than manually crafting tx_init_rbf messages that
leave node 0 without proper negotiation state.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jkczyz jkczyz force-pushed the 2026-03-splicing-rbf-follow-ups branch from d275551 to 3c55d3c Compare April 2, 2026 23:16
@jkczyz
Copy link
Copy Markdown
Contributor Author

jkczyz commented Apr 2, 2026

Rebased

&self.logger,
);
if let Err(ChannelError::Abort(_)) = &init_res {
funded_channel.exit_quiescence();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm I'm not sure we want to use this since the debug assertions there are reachable, but the mark_response_received call makes me realize we may need to do this elsewhere to avoid an unintentional disconnect (maybe write a test to confirm?).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm I'm not sure we want to use this since the debug assertions there are reachable,

The fixup commit makes sure that we don't return ChannelError::Abort until we know we are quiescent, so I think those asserts are unreachable from here?

but the mark_response_received call makes me realize we may need to do this elsewhere to avoid an unintentional disconnect (maybe write a test to confirm?).

Like when processing splice_init, splice_ack, tx_init_rbf, tx_ack_rbf, or interactive-tx messages?

One thing I noticed when exploring this as since we don't have a FundedChannel for v2 establishment, we use UNFUNDED_CHANNEL_AGE_LIMIT_TICKS to timeout instead. So updating sent_message_awaiting_response on the ChannelContext has no effect there. We'd effectively have two different mechanisms for timing out. Are we ok with that? Or should we try to use the UNFUNDED_CHANNEL_AGE_LIMIT_TICKS for splice funding, too?

);

// Node 1 handles the echo (no-op since it already aborted).
nodes[1].node.handle_tx_abort(node_id_0, &tx_abort_echo);
Copy link
Copy Markdown
Contributor

@wpaulino wpaulino Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should include a payment such that the outbound HTLC gets sent out after the abort as a result of freeing the holding cell.

@jkczyz jkczyz requested a review from wpaulino April 3, 2026 21:23
jkczyz and others added 4 commits April 3, 2026 16:34
Add a payment during quiescence in test_splice_rbf_insufficient_feerate
to verify that outbound HTLCs queued in the holding cell are freed when
quiescence is exited by the tx_abort exchange.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The same bug fixed in the prior commit for tx_init_rbf also exists in
internal_splice_init: when splice_init triggers FeeRateTooHigh in
resolve_queued_contribution, the ChannelError::Abort goes through
try_channel_entry! without exiting quiescence.

Apply the same fix: intercept ChannelError::Abort before
try_channel_entry!, call exit_quiescence, and return the error with
exited_quiescence set.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The prior two commits manually intercepted ChannelError::Abort in the
channelmanager handlers for splice_init and tx_init_rbf to exit
quiescence before returning, since the channel methods didn't signal
this themselves. The interactive TX message handlers already solved this
by returning InteractiveTxMsgError which bundles exited_quiescence into
the error type.

Apply the same pattern: change splice_init and tx_init_rbf to return
InteractiveTxMsgError, adding a quiescent_negotiation_err helper on
FundedChannel that exits quiescence for Abort errors and passes through
other variants unchanged. Extract handle_interactive_tx_msg_err in
channelmanager to deduplicate the error handling across internal_tx_msg,
internal_splice_init, and internal_tx_init_rbf.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
validate_splice_init and validate_tx_init_rbf check preconditions
including quiescence. Move them before resolve_queued_contribution
so that a misbehaving peer sending splice_init or tx_init_rbf before
quiescence is rejected early. This also moves validate_splice_contributions
and FundingScope construction into the callers since they depend on the
resolved contribution.

Have validate_tx_init_rbf return the last candidate's funding pubkeys
so the caller can construct FundingScope without re-accessing
pending_splice.

Add a debug_assert in quiescent_negotiation_err that Abort errors only
occur when the channel is quiescent, since exit_quiescence requires it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jkczyz jkczyz force-pushed the 2026-03-splicing-rbf-follow-ups branch from 3c55d3c to e4b1c6b Compare April 3, 2026 21:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

4 participants