Skip to content

ROX-33199: unlink inode tracking#429

Open
ovalenti wants to merge 4 commits intomainfrom
ovalenti/unlink_inode_tracking
Open

ROX-33199: unlink inode tracking#429
ovalenti wants to merge 4 commits intomainfrom
ovalenti/unlink_inode_tracking

Conversation

@ovalenti
Copy link
Copy Markdown
Contributor

Description

When a file is unlinked, the corresponding inode should be removed from the kernel inode map and also from the inode->path map in userland.

Checklist

  • Investigated and inspected CI test results
  • Updated documentation accordingly

Automated testing

  • Added unit tests
  • Added integration tests
  • Added regression tests

If any of these don't apply, please comment below.

Testing Performed

TODO(replace-me)
Use this space to explain how you tested your PR, or, if you didn't test it, why you did not do so. (Valid reasons include "CI is sufficient" or "No testable changes")
In addition to reviewing your code, reviewers must also review your testing instructions, and make sure they are sufficient.

For more details, ref the Confluence page about this section.

@ovalenti ovalenti self-assigned this Mar 25, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 25, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9a358ede-80d5-4e25-8abd-308cfa9606e6

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch ovalenti/unlink_inode_tracking

Comment @coderabbitai help to get the list of available commands and usage tips.

@ovalenti ovalenti force-pushed the ovalenti/unlink_inode_tracking branch from 2b6011b to 2606352 Compare March 25, 2026 16:39
@ovalenti ovalenti force-pushed the ovalenti/unlink_inode_tracking branch from 2606352 to 807b2a9 Compare March 27, 2026 10:29
@ovalenti ovalenti marked this pull request as ready for review March 27, 2026 14:02
@ovalenti ovalenti requested a review from a team as a code owner March 27, 2026 14:02
Comment on lines 129 to +136
pub fn is_creation(&self) -> bool {
matches!(self.file, FileData::Creation(_))
}

pub fn is_unlink(&self) -> bool {
matches!(self.file, FileData::Unlink(_))
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There has to be an idiomatic way of dispatching to a handling function depending on the type ; a match could be a way, but the is_creation() function wasn't going in that direction.

@Molter73 , what would you advise ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One difficulty is that the 'unlink' handler has to be called "after" the host path is set, and the 'creation' handler has to be called "before" the host path.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do a match at the call site, but that would require us to make the file field public in the event and it might not be something we want to do.

Another alternative could be to move the check itself to the associated handle_* method, but that means we need to make that call for all events, regardless of the type they are. This might be the way to go if it doesn't impact performance too much

With the current approach we could still expand usability of the method, imagine in the future we need to call the method protected by this check for rename events as well, we can simply rename the method and add the additional check to something like this:

    pub fn is_path_removed(&self) -> bool {
        matches!(self.file, FileData::Unlink(_) | FileData::Rename(_))
    }

For the time being, I wouldn't worry too much about it, we can revisit it once the code evolves a bit more to see what pattern emerges.

Comment on lines +251 to +262
with open(original_file, 'w') as f:
f.write('This is a test')

# Create two hardlinks in the unmonitored directory
hardlink_file1 = os.path.join(ignored_dir, 'hardlink1.txt')
os.link(original_file, hardlink_file1)

hardlink_file2 = os.path.join(ignored_dir, 'hardlink2.txt')
os.link(original_file, hardlink_file2)

os.remove(hardlink_file1)
os.remove(hardlink_file2)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be flaky. The inode of the original file may not have landed in the probe's map before we start deleting.

I will split the test to wait for the CREATION event before going on with the unlinks.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in fb7d100

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shows that we have a bit of a problem in our code, if a file is created and removed we will miss the host path for it, I wonder if that is related to a check we do on the filesystem and if we should just ignore that and proceed to add the host path regardless of whether we see it in the filesystem or not 🤔

I think I originally put that check in place because I was worried inodes for a file that was removed would linger if we missed the unlink event, but I'm fairly certain the scan should take care of cleaning up.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if that is related to a check we do on the filesystem

Do you mean that you suspect that this is adding to the delay before the host path is added to the inode map ?

Even if we try to add the information as soon as possible the race-condition is there, and I don't see any way we could avoid it completely.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You shouldn't have a race condition though:

  • The call to create the file is done.
  • The file_open LSM hook is called and an event is put in the ringbuffer.
  • The call to create the file finishes and immediately begins the operation for removing it.
  • The path_unlink LSM hook is called and an event is put in the ringbuffer.

The fact we have a single ringbuffer ensures userspace will see the two events in order so we should have the host path when the unlink event comes around.

I was under the incorrect impression this check would be in the path of the file creation event, but it is not, so I don't really see where the race condition would be. I'll try to debug this a bit with a fresher head on Monday.

@ovalenti ovalenti force-pushed the ovalenti/unlink_inode_tracking branch from 87bffaf to fb7d100 Compare March 27, 2026 14:26
Copy link
Copy Markdown
Contributor

@Molter73 Molter73 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, have a couple small comments, but overall we should be good to merge soon

Comment on lines +230 to +232
if self.inode_map.borrow_mut().remove(inode).is_some() {
self.metrics.scan_inc(ScanLabels::InodeRemoved);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to have a metric for when the removal fails? It shouldn't happen, but it would be nice to know if it does.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it would happen if a customer monitors files with more than one link. I agree that it would be good to know if we are in this case ; as the behavior of fact would probably be quite erratic.

Comment on lines +251 to +262
with open(original_file, 'w') as f:
f.write('This is a test')

# Create two hardlinks in the unmonitored directory
hardlink_file1 = os.path.join(ignored_dir, 'hardlink1.txt')
os.link(original_file, hardlink_file1)

hardlink_file2 = os.path.join(ignored_dir, 'hardlink2.txt')
os.link(original_file, hardlink_file2)

os.remove(hardlink_file1)
os.remove(hardlink_file2)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shows that we have a bit of a problem in our code, if a file is created and removed we will miss the host path for it, I wonder if that is related to a check we do on the filesystem and if we should just ignore that and proceed to add the host path regardless of whether we see it in the filesystem or not 🤔

I think I originally put that check in place because I was worried inodes for a file that was removed would linger if we missed the unlink event, but I'm fairly certain the scan should take care of cleaning up.

event.set_old_host_path(host_path);
}

// Special handling for unlink events
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What makes the handling special?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I want to express is that "this is the specific things we have to do because this event is a deletion".

Ok(())
}

/// Special handling for unlink events.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What makes the handling special?

Comment on lines +269 to +272
# Create a guard file to ensure all events have been processed
guard_file = os.path.join(monitored_dir, 'guard.txt')
with open(guard_file, 'w') as f:
f.write('guard')
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably a pattern we could adopt directly in the wait_events method.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not in this PR obviously

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants