seccomp: block AF_VSOCK sockets#575
Conversation
|
✅ A new PR has been created in buildah to vendor these changes: containers/buildah#6623 |
| Names: []string{ | ||
| "socket", | ||
| }, | ||
| Action: ActAllow, | ||
| Args: []*Arg{ | ||
| { | ||
| Index: 0, | ||
| Value: unix.AF_NETLINK, | ||
| Op: OpNotEqual, | ||
| }, | ||
| }, | ||
| Excludes: Filter{ | ||
| Caps: []string{"CAP_AUDIT_WRITE"}, | ||
| }, | ||
| }, |
There was a problem hiding this comment.
What I don't understand why this is not causing the EPERM to get overwritten by ActAllow here.
But my testing does confirm this patch works as I would expect so it is my lack of understanding the seccomp rules ordering.
|
The krun runtime uses vsock for VM guest communication. Would this PR conflict with that in any way? |
yeah I was wondering about that too, I am not familiar with the krun details but taking the profile from this PR still works AFAICT So we should not break krun if that test is good enough? I also checked with a volume mount, is there anything specific to test the vsock communication? |
|
I can test with other libkrun variants that use vsock more heavily (like libkrun-nitro for AWS Nitro Enclaves, where all outside communication is done through vsock). Would you mind marking this as draft until I'm able to do so? |
|
the seccomp profile is applied by crun, if it is applied for krun as well I guess we could try to special case it to allow vsock for only that. Or modify the seccomp generation on the podman side so we can create a custom rule per runtime? Of course regardless this would get even more complicated to the already complicated rule syntax. |
|
Could we open the socket before we apply the seccomp profile? I don't think that krun should depend on this to not be blocked, we need to address it in the runtime |
|
can we merge? |
|
@giuseppe what is the path forward for krun? I don’t think we need to block on having that implemented but we should at least have some idea of what we are getting into, and a issue filed somewhere to check/follow up/update. |
|
The OCI runtime should be able to work with such seccomp setting. We can apply the filter later, or not apply it (for the runtime itself). It is not realistic to expect that users won't block this syscall because the runtime can't deal with it |
Block the socket() syscall with AF_VSOCK to prevent container escapes via VM sockets. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
58c47b9 to
eaec878
Compare
Block the socket() syscall with AF_VSOCK to prevent container escapes via VM sockets.