overlayfs regression - internal getxattr operations without sepolicy checking
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux-aws (Ubuntu) | Confirmed | Undecided | Unassigned |
Xenial | Invalid | Undecided | Unassigned |
Bionic | Fix Released | Undecided | Unassigned |
Eoan | Fix Committed | Undecided | Unassigned |
Focal | Fix Released | Undecided | Unassigned |
linux-azure (Ubuntu) | Fix Released | Undecided | Unassigned |
Xenial | Fix Released | Undecided | Unassigned |
Bionic | Fix Committed | Undecided | Unassigned |
Eoan | Fix Released | Undecided | Unassigned |
Focal | Fix Released | Undecided | Unassigned |
linux-azure-4.15 (Ubuntu) | Invalid | Undecided | Unassigned |
Xenial | Invalid | Undecided | Unassigned |
Bionic | Fix Released | Undecided | Unassigned |
Eoan | Invalid | Undecided | Unassigned |
Focal | Invalid | Undecided | Unassigned |
Bug Description
Bug description and repro:
Run the following commands on host instances:
Prepare the overlayfs directories:
$ cd /tmp
$ mkdir -p base/dir1/dir2 upper olwork merged
$ touch base/dir1/dir2/file
$ chown -R 100000:100000 base upper olwork merged
Verify that the directory is owned by user 100000:
$ ls -al merged/
total 8
drwxr-xr-x 2 100000 100000 4096 Nov 1 07:08 .
drwxrwxrwt 16 root root 4096 Nov 1 07:08 ..
We use lxc-usernsexec to start a new shell as user 100000.
$ lxc-usernsexec -m b:0:100000:1 -- /bin/bash
$$ ls -al merged/
total 8
drwxr-xr-x 2 root root 4096 Nov 1 07:08 .
drwxrwxrwt 16 nobody nogroup 4096 Nov 1 07:08 ..
Notice that the ownership of . and .. has changed because the new shell is running as the remapped user.
Now mount the overlayfs as an unprivileged user in the new shell. This is the key step that triggers the bug.
$$ mount -t overlay -o lowerdir=
$$ ls -al merged/
-rw-r--r-- 1 root root 0 Nov 1 07:09 merged/
We can see the file in the base layer from the mount directory. Now trigger the bug:
$$ rm -rf merged/dir1/dir2/
$$ mkdir merged/dir1/dir2
$$ ls -al merged/dir1/dir2
total 12
drwxr-xr-x 2 root root 4096 Nov 1 07:10 .
drwxr-xr-x 1 root root 4096 Nov 1 07:10 ..
As expected, the file does not show up in the newly created dir2. However, it reappears after we remount the filesystem (or do anything else that evicts the cached dentry, such as attempting to delete the parent directory):
$$ umount merged
$$ mount -t overlay -o lowerdir=
$$ ls -al merged/dir1/dir2
total 12
drwxr-xr-x 1 root root 4096 Nov 1 07:10 .
drwxr-xr-x 1 root root 4096 Nov 1 07:10 ..
-rw-r--r-- 1 root root 0 Nov 1 07:09 file
$$ exit
$
This is a recent kernel regression. I tried the steps above on an older kernel (4.4.0-1072-aws) and could not reproduce the problem.
I looked through the Linux source code and found where the regression comes from. The issue lies in how overlayfs checks the "opaque" flag on the underlying upper-layer filesystem. It checks the "trusted.
In 4.4: https:/
static bool ovl_is_
{
    int res;
    char val;
    struct inode *inode = dentry->d_inode;
    if (!S_ISDIR(
        return false;
    res = inode->
    if (res == 1 && val == 'y')
        return true;
    return false;
}
In 4.15: https:/
static bool ovl_is_
{
    return ovl_check_
}
bool ovl_check_
{
    int res;
    char val;
    if (!d_is_dir(dentry))
        return false;
    res = vfs_getxattr(
    if (res == 1 && val == 'y')
        return true;
    return false;
}
The 4.4 version simply uses the internal i_node callback inode->
See https:/
ssize_t
vfs_getxattr(struct dentry *dentry, const char *name, void *value, size_t size)
{
    struct inode *inode = dentry->d_inode;
    int error;
    error = xattr_permissio
    if (error)
        return error;
    error = security_
    if (error)
        return error;
    if (!strncmp(name, XATTR_SECURITY_
            XATTR_
        const char *suffix = name + XATTR_SECURITY_
        int ret = xattr_getsecuri
        /*
         * Only overwrite the return value if a security module
         * is actually active.
         */
        if (ret == -EOPNOTSUPP)
            goto nolsm;
        return ret;
    }
nolsm:
    return __vfs_getxattr(
}
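The difference between the two code paths can be sketched in userspace. This is only a model under stated assumptions, not kernel code: struct model_inode, model_raw_getxattr and model_checked_getxattr are hypothetical names, the attribute name "trusted.example" is a placeholder, and the permission gate is reduced to a single boolean.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-in for an inode that stores a single xattr. */
struct model_inode {
    const char *xattr_name;   /* name of the one stored attribute */
    const char *xattr_value;  /* its value bytes */
    size_t xattr_len;         /* number of value bytes */
};

/* Models the raw inode callback: no permission or LSM checks at all. */
ssize_t model_raw_getxattr(const struct model_inode *inode,
                           const char *name, void *value, size_t size)
{
    if (strcmp(name, inode->xattr_name) != 0)
        return -ENODATA;
    if (size < inode->xattr_len)
        return -ERANGE;
    memcpy(value, inode->xattr_value, inode->xattr_len);
    return (ssize_t)inode->xattr_len;
}

/* Models the checked path: a permission gate runs before the raw read. */
ssize_t model_checked_getxattr(const struct model_inode *inode,
                               const char *name, void *value, size_t size,
                               bool caller_is_privileged)
{
    /* Assumption: reads of trusted.* by unprivileged callers see -ENODATA. */
    if (strncmp(name, "trusted.", strlen("trusted.")) == 0 &&
        !caller_is_privileged)
        return -ENODATA;
    return model_raw_getxattr(inode, name, value, size);
}
```

In this model the raw getter always returns the stored byte, while the checked getter hides it from an unprivileged caller, which is the behavioural change between the two kernel versions described here.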
In 4.15, ovl_is_opaquedir is reached through the following call chain:
ovl_is_opaquedir <-
ovl_lookup_single() <-
ovl_lookup_layer <-
ovl_lookup
ovl_lookup is the entry point for directory listing in overlayfs. Importantly, it assumes the filesystem mounter's credentials while performing all internal lookup operations:
struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
                          unsigned int flags)
{
    old_cred = ovl_override_
    // perform lookups
    // ....
    revert_
}
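The save, switch, and revert pattern in ovl_lookup can be illustrated with a small userspace model. Everything here is an assumption made for illustration: struct model_cred, model_current_cred, model_override_creds, model_revert_creds and model_lookup_sees_privilege are hypothetical names, and a single boolean stands in for the capability that the kernel would actually test.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical credential record; the kernel's struct cred is far richer. */
struct model_cred {
    unsigned int uid;
    bool privileged;  /* stand-in for the capability checked by the kernel */
};

/* Stand-in for the credentials of the task that is currently running. */
const struct model_cred *model_current_cred;

/* Install new credentials and hand back the previous ones for later revert. */
const struct model_cred *model_override_creds(const struct model_cred *new_cred)
{
    const struct model_cred *old = model_current_cred;
    model_current_cred = new_cred;
    return old;
}

/* Restore the credentials that were active before the override. */
void model_revert_creds(const struct model_cred *old)
{
    model_current_cred = old;
}

/* Runs the lookup body as the mounter and reports the privilege it saw. */
bool model_lookup_sees_privilege(const struct model_cred *mounter)
{
    const struct model_cred *old = model_override_creds(mounter);
    bool seen = model_current_cred->privileged;  /* checks use the mounter */
    model_revert_creds(old);
    return seen;
}
```

The point of the sketch is that any permission check made between the override and the revert is evaluated against the mounter, not against the task that triggered the lookup.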
The "credential switching" logic also does not exist in the 4.4 kernel: https:/
That means, on 4.15, overlayfs uses the file system mounter's credential to fetch the "trusted.
See https:/
static int xattr_permissio
{
    ....
    /*
     * The trusted.* namespace can only be accessed by privileged users.
     */
    if (!strncmp(name, XATTR_TRUSTED_
        if (!capable(
            return (mask & MAY_WRITE) ? -EPERM : -ENODATA;
        return 0;
    }
    ....
}
When this call fails, overlayfs assumes the upper directory is not "opaque" and merges the contents of the lower directory into the result.
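The consequence of the failed read can be modelled in a few lines. This is a sketch under simplifying assumptions, not overlayfs code: struct model_upper_dir, model_read_flag, model_is_opaque and model_visible_entries are hypothetical names, privilege is a single boolean, and the upper and lower directories are assumed to hold entries with distinct names.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one directory in the upper layer. */
struct model_upper_dir {
    bool has_opaque_flag;  /* the flag is stored on disk with value 'y' */
};

/* Returns 1 and fills *val on success, -ENODATA if absent or hidden. */
int model_read_flag(const struct model_upper_dir *dir,
                    bool reader_privileged, char *val)
{
    if (!reader_privileged)
        return -ENODATA;  /* the permission gate hides trusted.* reads */
    if (!dir->has_opaque_flag)
        return -ENODATA;
    *val = 'y';
    return 1;
}

/* Same test as the quoted code: opaque only for exactly one byte 'y'. */
bool model_is_opaque(const struct model_upper_dir *dir, bool reader_privileged)
{
    char val = 0;
    int res = model_read_flag(dir, reader_privileged, &val);
    return res == 1 && val == 'y';
}

/* Number of entries visible in the merged view of the directory. */
size_t model_visible_entries(const struct model_upper_dir *dir,
                             size_t upper_entries, size_t lower_entries,
                             bool reader_privileged)
{
    if (model_is_opaque(dir, reader_privileged))
        return upper_entries;              /* lower layer is masked */
    return upper_entries + lower_entries;  /* lower layer shows through */
}
```

With an opaque, empty upper dir2 and one file in the lower dir2, the model reports zero visible entries for a privileged reader and one for an unprivileged reader, which matches the file reappearing in the reproduction steps.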
There's a proposed patch to fix this issue: https:/
The patch calls __vfs_getxattr directly to fetch the opaque flag. Because __vfs_getxattr skips the permission checks, the flag can still be read even though the rest of the lookup runs under the mounter's credentials.
However, the patch hasn't been merged to the upstream linux kernel as of today (see https:/
Changed in linux-azure-4.15 (Ubuntu Xenial):
status: New → Invalid
Changed in linux-azure-4.15 (Ubuntu Bionic):
status: New → Fix Committed
Changed in linux-azure-4.15 (Ubuntu Eoan):
status: New → Invalid
Changed in linux-azure-4.15 (Ubuntu Focal):
status: New → Invalid
Changed in linux-azure (Ubuntu Eoan):
status: New → Fix Committed
Changed in linux-azure (Ubuntu Bionic):
status: New → Fix Committed
Changed in linux-azure (Ubuntu Xenial):
status: New → Fix Committed
Changed in linux-aws (Ubuntu Xenial):
status: New → Invalid
Changed in linux-aws (Ubuntu Bionic):
status: New → In Progress
Changed in linux-aws (Ubuntu Eoan):
status: New → In Progress
Changed in linux-aws (Ubuntu Focal):
status: New → In Progress
summary:
- [linux-azure] overlayfs regression - internal getxattr operations without sepolicy checking
+ overlayfs regression - internal getxattr operations without sepolicy checking
Changed in linux-aws (Ubuntu Eoan):
status: In Progress → Fix Committed
Changed in linux-aws (Ubuntu Bionic):
status: In Progress → Fix Committed
Changed in linux-aws (Ubuntu Focal):
status: In Progress → Fix Committed
Hi, Joe.
Are you aware of any real use case that is affected by this? Do you think it can wait until the patchset is merged upstream?