HDDS-15005. Delete EC replica when container state is DELETING in SCM #10255

Open
sarvekshayr wants to merge 2 commits into apache:master from sarvekshayr:HDDS-15005

@sarvekshayr
Contributor

What changes were proposed in this pull request?

When an EC container is in the DELETING state in SCM and a non-empty EC replica reports back to SCM, SCM should delete that replica; otherwise it will remain orphaned. This can happen when an old DN that held a replica is brought back.
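
As a rough illustration of the rule described above, here is a minimal sketch of the decision in isolation. The class, enums, and method name are hypothetical stand-ins invented for this example; they are not the actual SCM report-handler code.

```java
// Hypothetical, simplified model; none of these types are the real Ozone classes.
final class DeletingReplicaSketch {
  enum ReplicationType { RATIS, EC }
  enum ContainerState { CLOSED, DELETING, DELETED }

  // Returns true when a reported replica should be deleted under the proposed rule:
  // the container is DELETING in SCM, it is an EC container, and the replica still holds data.
  static boolean shouldDeleteReportedReplica(
      ContainerState state, ReplicationType type, boolean replicaIsEmpty) {
    return state == ContainerState.DELETING
        && type == ReplicationType.EC
        && !replicaIsEmpty;
  }
}
```

Under this sketch, a non-empty EC replica reported for a DELETING container yields `true`, while an empty replica, or a replica of a container in any other state, yields `false`.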

What is the link to the Apache JIRA?

HDDS-15005

How was this patch tested?

Added unit and integration tests.

@sarvekshayr sarvekshayr marked this pull request as ready for review May 13, 2026 15:12
Contributor

@sreejasahithi sreejasahithi left a comment

Thanks @sarvekshayr for the patch

}
// HDDS-12421: fall-through to case DELETING
case DELETING:
if (replicationType.equals(HddsProtos.ReplicationType.EC) && !replicaIsEmpty) {
Contributor

I'm not sure if (!replicaIsEmpty) is needed here.

@sodonnel, could you take a look at this patch?

Contributor Author

Both EmptyContainerHandler and DeletingContainerHandler will take care of cleaning up empty replicas.

Contributor

This change makes sense. However, it is kind of annoying that it is in the ReportHandler rather than the RM deleting handler. There is existing logic in this handler in the DELETING branch of the switch statement for both Ratis and EC, so it makes sense to put it here.

A container transitions to DELETING when all its reported replicas are empty. It then transitions to DELETED when all the replicas are gone. If a non-empty replica appears after the container has transitioned to DELETING, it will block the DELETING to DELETED transition, because the RM code will not remove a replica that is non-empty.
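
The lifecycle described above can be sketched as a tiny state machine. This is only an illustrative model under stated assumptions: the class and method names are hypothetical, each replica is reduced to a single "is empty" flag, and the CLOSED case assumes at least one replica has been reported.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical, simplified model of the lifecycle described above.
// These names are illustrative and are not the real SCM classes.
final class ContainerLifecycleSketch {
  enum State { CLOSED, DELETING, DELETED }

  // Each list element is one reported replica: true means the replica is empty.
  static State next(State current, List<Boolean> replicaIsEmpty) {
    switch (current) {
      case CLOSED:
        // Assumption: at least one replica is reported, and every one of them is empty.
        boolean allEmpty = !replicaIsEmpty.isEmpty()
            && replicaIsEmpty.stream().allMatch(Boolean::booleanValue);
        return allEmpty ? State.DELETING : State.CLOSED;
      case DELETING:
        // The container only becomes DELETED once no replicas remain at all.
        return replicaIsEmpty.isEmpty() ? State.DELETED : State.DELETING;
      default:
        return current;
    }
  }

  // Models cleanup that removes only empty replicas; non-empty ones survive.
  static List<Boolean> removeEmptyReplicas(List<Boolean> replicaIsEmpty) {
    return replicaIsEmpty.stream()
        .filter(empty -> !empty)
        .collect(Collectors.toList());
  }
}
```

In this model, a DELETING container with one empty and one non-empty replica still has the non-empty replica after `removeEmptyReplicas`, so `next` keeps returning `DELETING`. That is the stuck state the comment describes.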

What is also interesting is that if the non-empty replica appears before the container goes from CLOSED to DELETING, it would block the container from going to DELETING and hence from clearing out the other empty replicas. In that case there should be an over-replicated index, but I am not sure which RM handler would remove it.
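
To make the over-replicated index case concrete, here is a small sketch that finds EC replica indexes reported more than once. It assumes each replica is identified only by its replica index; the class and method are hypothetical and say nothing about which RM handler would act on the result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Ozone: finds replica indexes that appear more than once.
final class OverReplicatedIndexSketch {
  static List<Integer> overReplicatedIndexes(List<Integer> reportedIndexes) {
    // Count how many replicas were reported for each index, in ascending index order.
    Map<Integer, Integer> counts = new TreeMap<>();
    for (int index : reportedIndexes) {
      counts.merge(index, 1, Integer::sum);
    }
    // Any index with more than one replica is over-replicated.
    List<Integer> result = new ArrayList<>();
    for (Map.Entry<Integer, Integer> entry : counts.entrySet()) {
      if (entry.getValue() > 1) {
        result.add(entry.getKey());
      }
    }
    return result;
  }
}
```

For example, if indexes 1 through 5 are each reported once and a returning DN reports index 3 again, the method returns a list containing only 3.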

@ChenSammi ChenSammi requested a review from sodonnel May 20, 2026 03:45
@peterxcli peterxcli self-requested a review May 20, 2026 04:12