Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions fact-ebpf/src/bpf/main.c
Original file line number Diff line number Diff line change
Expand Up @@ -94,6 +94,9 @@ int BPF_PROG(trace_path_unlink, struct path* dir, struct dentry* dentry) {
return 0;
}

// We only support files with one link for now
inode_remove(&inode_key);

submit_unlink_event(&m->path_unlink,
path->path,
inode_to_submit,
Expand Down
4 changes: 4 additions & 0 deletions fact/src/event/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -130,6 +130,10 @@ impl Event {
matches!(self.file, FileData::Creation(_))
}

pub fn is_unlink(&self) -> bool {
matches!(self.file, FileData::Unlink(_))
}

/// Unwrap the inner FileData and return the inode that triggered
/// the event.
///
Expand Down
17 changes: 17 additions & 0 deletions fact/src/host_scanner.rs
Original file line number Diff line number Diff line change
Expand Up @@ -220,6 +220,18 @@ impl HostScanner {
Ok(())
}

/// Special handling for unlink events.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What makes the handling special?

///
/// This method removes the inode from the userland inode->path map.
/// The probe already cleared the kernel inode map.
fn handle_unlink_event(&self, event: &Event) {
let inode = event.get_inode();

if self.inode_map.borrow_mut().remove(inode).is_some() {
self.metrics.scan_inc(ScanLabels::InodeRemoved);
}
Comment on lines +230 to +232
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to have a metric for when the removal fails? It shouldn't happen, but it would be nice to know if it does.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it would happen if a customer monitors files with more than one link. I agree that it would be good to know if we are in this case ; as the behavior of fact would probably be quite erratic.

}

/// Periodically notify the host scanner main task that a scan needs
/// to happen.
///
Expand Down Expand Up @@ -277,6 +289,11 @@ impl HostScanner {
event.set_old_host_path(host_path);
}

// Special handling for unlink events
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What makes the handling special?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I want to express is that "this is the specific things we have to do because this event is a deletion".

if event.is_unlink() {
self.handle_unlink_event(&event);
}

let event = Arc::new(event);
if let Err(e) = self.tx.send(event) {
self.metrics.events.dropped();
Expand Down
52 changes: 52 additions & 0 deletions tests/test_path_unlink.py
Original file line number Diff line number Diff line change
Expand Up @@ -227,3 +227,55 @@ def test_unmonitored_mounted_dir(test_container, test_file, server):
file=fut, host_path=test_file)

server.wait_events([event])


def test_probe_inode_map(monitored_dir, ignored_dir, server):
"""
TODO[ROX-33222]: This test won't work when hardlinks are handled properly.

This test demonstrates that the current implementation removes the inode
from the kernel map correctly, as a second unmonitored hardlink deletion
is not noticed.

Args:
monitored_dir: Temporary directory path that is monitored by fact.
ignored_dir: Temporary directory path that is NOT monitored by fact.
server: The server instance to communicate with.
"""
process = Process.from_proc()

# File Under Test - original file in monitored directory
original_file = os.path.join(monitored_dir, 'original.txt')

# Create the original file
with open(original_file, 'w') as f:
f.write('This is a test')

# Create two hardlinks in the unmonitored directory
hardlink_file1 = os.path.join(ignored_dir, 'hardlink1.txt')
os.link(original_file, hardlink_file1)

hardlink_file2 = os.path.join(ignored_dir, 'hardlink2.txt')
os.link(original_file, hardlink_file2)

e = Event(process=process, event_type=EventType.CREATION,
file=original_file, host_path=original_file)

server.wait_events([e])

os.remove(hardlink_file1)
os.remove(hardlink_file2)
Comment on lines +251 to +267
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be flaky. The inode of the original file may not have landed in the probe's map before we start deleting.

I will split the test to wait for the CREATION event before going on with the unlinks.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in fb7d100

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shows that we have a bit of a problem in our code, if a file is created and removed we will miss the host path for it, I wonder if that is related to a check we do on the filesystem and if we should just ignore that and proceed to add the host path regardless of whether we see it in the filesystem or not 🤔

I think I originally put that check in place because I was worried inodes for a file that was removed would linger if we missed the unlink event, but I'm fairly certain the scan should take care of cleaning up.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if that is related to a check we do on the filesystem

Do you mean that you suspect that this is adding to the delay before the host path is added to the inode map ?

Even if we try to add the information as soon as possible the race-condition is there, and I don't see any way we could avoid it completely.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You shouldn't have a race condition though:

  • The call to create the file is done.
  • The file_open LSM hook is called and an event is put in the ringbuffer.
  • The call to create the file finishes and immediately begins the operation for removing it.
  • The path_unlink LSM hook is called and an event is put in the ringbuffer.

The fact we have a single ringbuffer ensures userspace will see the two events in order so we should have the host path when the unlink event comes around.

I was under the incorrect impression this check would be in the path of the file creation event, but it is not, so I don't really see where the race condition would be. I'll try to debug this a bit with a fresher head on Monday.


# Create a guard file to ensure all events have been processed
guard_file = os.path.join(monitored_dir, 'guard.txt')
with open(guard_file, 'w') as f:
f.write('guard')
Comment on lines +269 to +272
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably a pattern we could adopt directly in the wait_events method.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not in this PR obviously


events = [
Event(process=process, event_type=EventType.UNLINK,
file=hardlink_file1, host_path=original_file),
Event(process=process, event_type=EventType.CREATION,
file=guard_file, host_path=guard_file),
]

server.wait_events(events)
Loading