Conversation
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #17735 +/- ##
============================================
+ Coverage 40.55% 40.58% +0.03%
Complexity 2574 2574
============================================
Files 5179 5179
Lines 349896 349989 +93
Branches 44727 44741 +14
============================================
+ Hits 141890 142041 +151
+ Misses 208006 207948 -58 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
CRZbulabula
left a comment
There was a problem hiding this comment.
Both the capacity-accounting and the rollback-on-DataNode-failure fixes look correct. Comments below are organized by code location and focus on the timing of state capture in SetTTLProcedure, the cost of the consensus read used to capture it, and a handful of readability/test improvements.
| final int tTlRuleCapacity = CommonDescriptor.getInstance().getConfig().getTTlRuleCapacity(); | ||
| if (getTTLCount() >= tTlRuleCapacity) { | ||
| final int newTTLRuleCount = calculateNewTTLRuleCount(plan); | ||
| if (newTTLRuleCount > 0 && ttlCache.getTtlCount() + newTTLRuleCount > tTlRuleCapacity) { |
There was a problem hiding this comment.
Capacity check looks correct. Short-circuiting on newTTLRuleCount > 0 is what lets testUpdateExistingTTLWhenCurrentStateIsAlreadyOversize pass — an update of an existing rule no longer trips the limit even when the current count is already over capacity.
One diagnostic suggestion: when the limit is hit, include both tTlRuleCapacity and ttlCache.getTtlCount() + newTTLRuleCount in the error message. It makes it much easier to tell at a glance whether the cluster genuinely filled the table, or whether the capacity was tightened by a config change.
| } | ||
|
|
||
| private int getNewTTLRuleCount(String[] pathNodes) { | ||
| return ttlCache.getLastNodeTTL(pathNodes) == TTLCache.NULL_TTL ? 1 : 0; |
There was a problem hiding this comment.
getNewTTLRuleCount is slightly mis-named — it returns a 0/1 indicator, not a count. Consider renaming to isNewTTLRule (return boolean) and accumulating boolean ? 1 : 0 in calculateNewTTLRuleCount. Reads more naturally at the call site.
| private static final long TTL_NOT_EXIST = Long.MIN_VALUE; | ||
|
|
||
| private SetTTLPlan plan; | ||
| private long previousTTL = TTL_NOT_EXIST; |
There was a problem hiding this comment.
Worth a one-line comment explaining why this sentinel is Long.MIN_VALUE rather than TTLCache.NULL_TTL (-1). They have different meanings on the rollback path: TTL_NOT_EXIST means "no TTL was set before this procedure", while NULL_TTL is the explicit "unset" marker that ConfigPlanExecutor interprets to route to unsetTTL. Conflating the two would corrupt rollback behavior.
| } | ||
| protected void setConfigNodeTTL(final ConfigNodeProcedureEnv env) { | ||
| capturePreviousTTLState(env); | ||
| final TSStatus res = writeConfigNodePlan(env, plan); |
There was a problem hiding this comment.
(medium) previousTTL can be overwritten with the new value after a crash.
capturePreviousTTLState and writeConfigNodePlan both run inside one executeFromState(SET_CONFIGNODE_TTL) invocation, but the procedure framework only persists previousTTLStateCaptured / previousTTL on state transitions. If the ConfigNode crashes after the consensus write succeeds but before setNextState(UPDATE_DATANODE_CACHE) is durable, recovery replays SET_CONFIGNODE_TTL: previousTTLStateCaptured is still false, so we re-run capture — and getAllTTL() now returns the new value, which gets recorded as the "previous" TTL. If the DataNode update then fails, the rollback restores the new value over itself, losing the original.
Suggestion: split capture into its own pre-state (e.g. CAPTURE_PREVIOUS_TTL → SET_CONFIGNODE_TTL → UPDATE_DATANODE_CACHE) so the captured snapshot is persisted with procedure state before the consensus write.
| res = new TSStatus(TSStatusCode.EXECUTE_STATEMENT_ERROR.getStatusCode()); | ||
| res.setMessage(e.getMessage()); | ||
| } | ||
| protected void setConfigNodeTTL(final ConfigNodeProcedureEnv env) { |
There was a problem hiding this comment.
These methods (setConfigNodeTTL, updateDataNodeTTL, writeConfigNodePlan, sendTTLRequest) are made protected only so that TestingSetTTLProcedure in the test file can override them — no production subclass needs them. Consider making them package-private and putting the test class in the same package; that keeps the public/protected surface narrower.
| rollbackFailure = e; | ||
| } | ||
| try { | ||
| rollbackDataNodeTTL(env); |
There was a problem hiding this comment.
Two suggestions on the rollback error-aggregation strategy:
- Only the first exception survives; the second is logged but lost. Use
rollbackFailure.addSuppressed(e)so a postmortem can still see both. - Add a method-level comment clarifying the policy: "best-effort — if one side fails, the other is still attempted; the earliest exception is what propagates." Otherwise readers might expect fail-fast semantics.
| final int length = ReadWriteIOUtils.readInt(byteBuffer); | ||
| final int position = byteBuffer.position(); | ||
| this.plan = (SetTTLPlan) ConfigPhysicalPlan.Factory.create(byteBuffer); | ||
| byteBuffer.position(position + length); |
There was a problem hiding this comment.
This byteBuffer.position(position + length) is necessary (not redundant) — worth a comment so a future reader doesn't "clean it up".
Reason: serializeToByteBuffer() returns ByteBuffer.wrap(publicBAOS.getBuf(), 0, size()). ReadWriteIOUtils.write(ByteBuffer, OutputStream) then writes byteBuffer.capacity() (the full backing array length, which can be greater than the actual data size) followed by the whole array, padding included. Without forcing the position back to position + length, the new boolean + 2*long trailing fields would be read from those padding zero bytes.
| final int position = byteBuffer.position(); | ||
| this.plan = (SetTTLPlan) ConfigPhysicalPlan.Factory.create(byteBuffer); | ||
| byteBuffer.position(position + length); | ||
| if (byteBuffer.remaining() >= 17) { |
There was a problem hiding this comment.
17 is a magic number (1 byte boolean + 2 * 8 byte long). Extract a constant:
private static final int ROLLBACK_STATE_BYTES = Byte.BYTES + Long.BYTES * 2;
...
if (byteBuffer.remaining() >= ROLLBACK_STATE_BYTES) { ... }Makes it survive any future addition of rollback fields without forgetting to update the guard.
| return this.isGeneratedByPipe == that.isGeneratedByPipe | ||
| && this.previousTTL == that.previousTTL | ||
| && this.previousDatabaseWildcardTTL == that.previousDatabaseWildcardTTL | ||
| && this.previousTTLStateCaptured == that.previousTTLStateCaptured |
There was a problem hiding this comment.
Including previousTTL / previousDatabaseWildcardTTL / previousTTLStateCaptured in equals/hashCode is consistent with the serialization change, but please confirm no caller dedups procedures by equals on the wait queue. If something does, two procedures with the same plan but different captured state would now be considered distinct, which could change idempotency behavior. If there's no such caller, ignore this comment.
| (SetTTLProcedure) ProcedureFactory.getInstance().create(buffer); | ||
| assertSerializedProcedure( | ||
| deserializedProcedure, "root.db", 2000L, true, true, 500L, 600L, false); | ||
| } |
There was a problem hiding this comment.
Please add a backward-compatibility deserialization test: construct an "old format" byte stream by hand — [procedureType:short][super state...][length:int][plan bytes] with no trailing 17 bytes — feed it through deserialize, and assert previousTTLStateCaptured == false, previousTTL == Long.MIN_VALUE, previousDatabaseWildcardTTL == Long.MIN_VALUE. The current round-trip test only exercises new-write/new-read; without an old-format fixture, a rolling upgrade where an in-flight pre-PR procedure is persisted and then resumed on a post-PR ConfigNode is not covered.



Description
This PR fixes two correctness issues in ConfigNode TTL handling.
SetTTLProceduredid not rollback TTL state when ConfigNode metadata had been updated but DataNode TTL cacheupdate failed.
Details
1. Fix TTL rule capacity accounting
TTLInfonow checks the number of newly introduced TTL rules instead of simply checking the current total rule count.This also fixes the database TTL case: setting TTL on a database effectively writes both the database path and its
**wildcard path, so capacity validation now accounts for both entries.As a result:
2. Add rollback for
SetTTLProcedureSetTTLProcedurenow captures the previous TTL state before writing the new value to ConfigNode.If the procedure fails while updating DataNode TTL cache, it will rollback:
For database TTL, the rollback also restores the wildcard TTL entry (
db.**) separately.The rollback state is serialized with the procedure so that recovery/replay can still restore the previous TTL state
correctly.
Tests
Added/updated tests for:
This PR has:
for an unfamiliar reader.
for code coverage.
Key changed/added classes (or packages if there are too many classes) in this PR