Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/…
…mic/linux.git
Mark Brown committed Feb 23, 2022
2 parents a911f73 + a691b98 commit 8ff4b45
Showing 34 changed files with 627 additions and 13 deletions.
50 changes: 50 additions & 0 deletions Documentation/admin-guide/sysctl/fs.rst
@@ -48,6 +48,7 @@ Currently, these files are in /proc/sys/fs:
- suid_dumpable
- super-max
- super-nr
- trusted_for_policy


aio-nr & aio-max-nr
@@ -382,3 +383,52 @@ Each "watch" costs roughly 90 bytes on a 32bit kernel, and roughly 160 bytes
on a 64bit one.
The current default value for max_user_watches is the 1/25 (4%) of the
available low memory, divided for the "watch" cost in bytes.


trusted_for_policy
------------------

An interpreter can call :manpage:`trusted_for(2)` with a
``TRUSTED_FOR_EXECUTION`` usage to check that opened regular files are expected
to be executable. If the file is not identified as executable, then the
syscall returns -EACCES. This may allow a script interpreter to check
executable permission before reading commands from a file, or a dynamic linker
to only load executable shared objects. One interesting use case is to enforce
a "write xor execute" policy through interpreters.

The ability to restrict code execution must be thought of as a system-wide
policy, which first starts by restricting mount points with the ``noexec``
option. This option is also automatically applied to special filesystems such
as /proc. This prevents files on such mount points from being directly executed
by the kernel or mapped as executable memory (e.g. libraries). With script
interpreters using :manpage:`trusted_for(2)`, the executable permission can
then be checked before reading commands from files. This makes it possible to
enforce the ``noexec`` policy at the interpreter level, and thus propagate this
security policy to scripts. To be fully effective, these interpreters also
need to handle the other ways to execute code: command line parameters (e.g.,
option ``-e`` for Perl), module loading (e.g., option ``-m`` for Python),
stdin, file sourcing, environment variables, configuration files, etc.
Depending on the threat model, it may be acceptable to allow some script
interpreters (e.g. Bash) to interpret commands from stdin, whether it is a TTY
or a pipe, because this may not be enough to (directly) perform syscalls.

There are two complementary security policies: enforce the ``noexec`` mount
option, and enforce executable file permission. These policies are handled by
the ``fs.trusted_for_policy`` sysctl (writable only with ``CAP_SYS_ADMIN``) as
a bitmask:

1 - Mount restriction: checks that the mount options for the underlying VFS
mount do not prevent execution.

2 - File permission restriction: checks that the file is marked as
executable for the current process (e.g., POSIX permissions, ACLs).

Note that as long as a policy is enforced, checking any non-regular file with
:manpage:`trusted_for(2)` returns -EACCES (e.g. TTYs, pipe), even when such a
file is marked as executable or is on an executable mount point.

Code samples can be found in
tools/testing/selftests/interpreter/trust_policy_test.c and interpreter patches
(for the original O_MAYEXEC) are available at
https://github.com/clipos-archive/clipos4_portage-overlay/search?q=O_MAYEXEC .
See also an overview article: https://lwn.net/Articles/820000/ .
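The bitmask described above can be illustrated with a small user-space sketch. This is not part of the commit: the macro names, the struct, and `decode_policy()` are hypothetical and exist only for this example. The bit values (1 for the mount restriction, 2 for the file permission restriction) and the accepted range of 0 to 3 are taken from the text and from the sysctl table in fs/open.c.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Bit values as documented for fs.trusted_for_policy. The names are
 * hypothetical and used only in this sketch.
 */
#define SKETCH_POLICY_MOUNT 1 /* bit 0: honour noexec mount points */
#define SKETCH_POLICY_FILE  2 /* bit 1: require executable file permission */

/* Which of the two complementary checks a given sysctl value enables. */
struct policy_view {
	bool mount_check;
	bool file_check;
};

/* Split a fs.trusted_for_policy value (0 to 3) into its two flags. */
static struct policy_view decode_policy(int value)
{
	struct policy_view view;

	/* The sysctl handler only accepts values between 0 and 3. */
	assert(value >= 0 && value <= 3);
	view.mount_check = (value & SKETCH_POLICY_MOUNT) != 0;
	view.file_check = (value & SKETCH_POLICY_FILE) != 0;
	return view;
}
```

Under these assumptions, a value of 3 enables both checks, and a value of 0 enables neither, in which case the syscall falls back to a plain read permission check.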
1 change: 1 addition & 0 deletions arch/alpha/kernel/syscalls/syscall.tbl
@@ -490,3 +490,4 @@
558 common process_mrelease sys_process_mrelease
559 common futex_waitv sys_futex_waitv
560 common set_mempolicy_home_node sys_ni_syscall
561 common trusted_for sys_trusted_for
1 change: 1 addition & 0 deletions arch/arm/tools/syscall.tbl
@@ -464,3 +464,4 @@
448 common process_mrelease sys_process_mrelease
449 common futex_waitv sys_futex_waitv
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
451 common trusted_for sys_trusted_for
2 changes: 1 addition & 1 deletion arch/arm64/include/asm/unistd.h
@@ -38,7 +38,7 @@
#define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE + 5)
#define __ARM_NR_COMPAT_END (__ARM_NR_COMPAT_BASE + 0x800)

-#define __NR_compat_syscalls 451
+#define __NR_compat_syscalls 452
#endif

#define __ARCH_WANT_SYS_CLONE
2 changes: 2 additions & 0 deletions arch/arm64/include/asm/unistd32.h
@@ -907,6 +907,8 @@ __SYSCALL(__NR_process_mrelease, sys_process_mrelease)
__SYSCALL(__NR_futex_waitv, sys_futex_waitv)
#define __NR_set_mempolicy_home_node 450
__SYSCALL(__NR_set_mempolicy_home_node, sys_set_mempolicy_home_node)
#define __NR_trusted_for 451
__SYSCALL(__NR_trusted_for, sys_trusted_for)

/*
* Please add new compat syscalls above this comment and update
1 change: 1 addition & 0 deletions arch/ia64/kernel/syscalls/syscall.tbl
@@ -371,3 +371,4 @@
448 common process_mrelease sys_process_mrelease
449 common futex_waitv sys_futex_waitv
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
451 common trusted_for sys_trusted_for
1 change: 1 addition & 0 deletions arch/m68k/kernel/syscalls/syscall.tbl
@@ -450,3 +450,4 @@
448 common process_mrelease sys_process_mrelease
449 common futex_waitv sys_futex_waitv
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
451 common trusted_for sys_trusted_for
1 change: 1 addition & 0 deletions arch/microblaze/kernel/syscalls/syscall.tbl
@@ -456,3 +456,4 @@
448 common process_mrelease sys_process_mrelease
449 common futex_waitv sys_futex_waitv
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
451 common trusted_for sys_trusted_for
1 change: 1 addition & 0 deletions arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -389,3 +389,4 @@
448 n32 process_mrelease sys_process_mrelease
449 n32 futex_waitv sys_futex_waitv
450 n32 set_mempolicy_home_node sys_set_mempolicy_home_node
451 n32 trusted_for sys_trusted_for
1 change: 1 addition & 0 deletions arch/mips/kernel/syscalls/syscall_n64.tbl
@@ -365,3 +365,4 @@
448 n64 process_mrelease sys_process_mrelease
449 n64 futex_waitv sys_futex_waitv
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
451 n64 trusted_for sys_trusted_for
1 change: 1 addition & 0 deletions arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -438,3 +438,4 @@
448 o32 process_mrelease sys_process_mrelease
449 o32 futex_waitv sys_futex_waitv
450 o32 set_mempolicy_home_node sys_set_mempolicy_home_node
451 o32 trusted_for sys_trusted_for
1 change: 1 addition & 0 deletions arch/parisc/kernel/syscalls/syscall.tbl
@@ -448,3 +448,4 @@
448 common process_mrelease sys_process_mrelease
449 common futex_waitv sys_futex_waitv
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
451 common trusted_for sys_trusted_for
1 change: 1 addition & 0 deletions arch/powerpc/kernel/syscalls/syscall.tbl
@@ -530,3 +530,4 @@
448 common process_mrelease sys_process_mrelease
449 common futex_waitv sys_futex_waitv
450 nospu set_mempolicy_home_node sys_set_mempolicy_home_node
451 common trusted_for sys_trusted_for
1 change: 1 addition & 0 deletions arch/s390/kernel/syscalls/syscall.tbl
@@ -453,3 +453,4 @@
448 common process_mrelease sys_process_mrelease sys_process_mrelease
449 common futex_waitv sys_futex_waitv sys_futex_waitv
450 common set_mempolicy_home_node sys_set_mempolicy_home_node sys_set_mempolicy_home_node
451 common trusted_for sys_trusted_for sys_trusted_for
1 change: 1 addition & 0 deletions arch/sh/kernel/syscalls/syscall.tbl
@@ -453,3 +453,4 @@
448 common process_mrelease sys_process_mrelease
449 common futex_waitv sys_futex_waitv
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
451 common trusted_for sys_trusted_for
1 change: 1 addition & 0 deletions arch/sparc/kernel/syscalls/syscall.tbl
@@ -496,3 +496,4 @@
448 common process_mrelease sys_process_mrelease
449 common futex_waitv sys_futex_waitv
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
451 common trusted_for sys_trusted_for
1 change: 1 addition & 0 deletions arch/x86/entry/syscalls/syscall_32.tbl
@@ -455,3 +455,4 @@
448 i386 process_mrelease sys_process_mrelease
449 i386 futex_waitv sys_futex_waitv
450 i386 set_mempolicy_home_node sys_set_mempolicy_home_node
451 i386 trusted_for sys_trusted_for
1 change: 1 addition & 0 deletions arch/x86/entry/syscalls/syscall_64.tbl
@@ -372,6 +372,7 @@
448 common process_mrelease sys_process_mrelease
449 common futex_waitv sys_futex_waitv
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
451 common trusted_for sys_trusted_for

#
# Due to a historical design error, certain syscalls are numbered differently
1 change: 1 addition & 0 deletions arch/xtensa/kernel/syscalls/syscall.tbl
@@ -421,3 +421,4 @@
448 common process_mrelease sys_process_mrelease
449 common futex_waitv sys_futex_waitv
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
451 common trusted_for sys_trusted_for
133 changes: 133 additions & 0 deletions fs/open.c
@@ -33,6 +33,8 @@
#include <linux/dnotify.h>
#include <linux/compat.h>
#include <linux/mnt_idmapping.h>
#include <linux/sysctl.h>
#include <uapi/linux/trusted-for.h>

#include "internal.h"

@@ -481,6 +483,137 @@ SYSCALL_DEFINE2(access, const char __user *, filename, int, mode)
return do_faccessat(AT_FDCWD, filename, mode, 0);
}

#define TRUST_POLICY_EXEC_MOUNT BIT(0)
#define TRUST_POLICY_EXEC_FILE BIT(1)

static int sysctl_trusted_for_policy __read_mostly;

#ifdef CONFIG_SYSCTL
static struct ctl_table open_sysctls[] = {
	{
		.procname	= "trusted_for_policy",
		.data		= &sysctl_trusted_for_policy,
		.maxlen		= sizeof(int),
		.mode		= 0600,
		.proc_handler	= proc_dointvec_minmax_sysadmin,
		.extra1		= SYSCTL_ZERO,
		.extra2		= SYSCTL_THREE,
	},
	{ }
};

static int __init init_fs_open_sysctls(void)
{
	register_sysctl_init("fs", open_sysctls);
	return 0;
}
fs_initcall(init_fs_open_sysctls);
#endif /* CONFIG_SYSCTL */

/**
* sys_trusted_for - Check that a FD is trusted for a specific usage
*
* @fd: File descriptor to check.
* @usage: Identify the user space usage (defined by enum trusted_for_usage)
* intended for the file descriptor (only TRUSTED_FOR_EXECUTION for
* now).
*
* @flags: Must be 0.
*
* This system call enables user space to ask the kernel: is this file
* descriptor's content trusted to be used for this purpose? The set of @usage
* currently only contains TRUSTED_FOR_EXECUTION, but others may follow (e.g.
* configuration, sensitive data). If the kernel identifies the file
* descriptor as trustworthy for this usage, this call returns 0 and the caller
* should then take this information into account.
*
* The execution usage means that the content of the file descriptor is trusted
* according to the system policy to be executed by user space, which means
* that it interprets the content or (tries to) map it as executable memory.
*
* A simple system-wide security policy can be set by the system administrator
* through a sysctl configuration consistent with the mount points or the file
* access rights: Documentation/admin-guide/sysctl/fs.rst
*
* @flags could be used in the future to do complementary checks (e.g.
* signature or integrity requirements, origin of the file).
*
* Possible returned errors are:
*
* - EINVAL: unknown @usage or unknown @flags;
* - EBADF: @fd is not a file descriptor for the calling thread;
* - EACCES: the requested usage is denied (and user space should enforce it).
*/
SYSCALL_DEFINE3(trusted_for, const int, fd, const int, usage, const u32, flags)
{
	int mask, err = -EACCES;
	struct fd f;
	struct inode *inode;

	if (flags)
		return -EINVAL;

	/* Only handles execution for now. */
	if (usage != TRUSTED_FOR_EXECUTION)
		return -EINVAL;
	mask = MAY_EXEC;

	f = fdget(fd);
	if (!f.file)
		return -EBADF;
	inode = file_inode(f.file);

	/*
	 * For compatibility reasons, without a defined security policy, we
	 * must map the execute permission to the read permission. Indeed,
	 * from user space point of view, being able to execute data (e.g.
	 * scripts) implies to be able to read this data.
	 */
	if ((mask & MAY_EXEC)) {
		/*
		 * If there is a system-wide execute policy enforced, then
		 * forbids access to non-regular files and special superblocks.
		 */
		if ((sysctl_trusted_for_policy & (TRUST_POLICY_EXEC_MOUNT |
						TRUST_POLICY_EXEC_FILE))) {
			if (!S_ISREG(inode->i_mode))
				goto out_fd;
			/*
			 * Denies access to pseudo filesystems that will never
			 * be mountable (e.g. sockfs, pipefs) but can still be
			 * reachable through /proc/self/fd, or memfd-like file
			 * descriptors, or nsfs-like files.
			 *
			 * According to the selftests, SB_NOEXEC seems to be
			 * only used by proc and nsfs filesystems.
			 */
			if ((f.file->f_path.dentry->d_sb->s_flags &
						(SB_NOUSER | SB_KERNMOUNT | SB_NOEXEC)))
				goto out_fd;
		}

		if ((sysctl_trusted_for_policy & TRUST_POLICY_EXEC_MOUNT) &&
				path_noexec(&f.file->f_path))
			goto out_fd;
		/*
		 * For compatibility reasons, if the system-wide policy doesn't
		 * enforce file permission checks, then replaces the execute
		 * permission request with a read permission request.
		 */
		if (!(sysctl_trusted_for_policy & TRUST_POLICY_EXEC_FILE))
			mask &= ~MAY_EXEC;
		/* To be executed *by* user space, files must be readable. */
		mask |= MAY_READ;
	}

	err = inode_permission(file_mnt_user_ns(f.file), inode,
			mask | MAY_ACCESS);

out_fd:
	fdput(f);
	return err;
}
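The decision flow of the syscall above can be modelled in plain user-space C, which may help when reasoning about which policy value denies which kind of file. This is a simplified sketch and not kernel code: `struct model_file`, the `MODEL_*` macros, and `model_trusted_for_exec()` are hypothetical names, and the final `inode_permission()` call is reduced to two booleans (readable, executable) as an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Same bit layout as TRUST_POLICY_EXEC_MOUNT / TRUST_POLICY_EXEC_FILE. */
#define MODEL_POLICY_MOUNT 1
#define MODEL_POLICY_FILE  2

/* Hypothetical summary of what the kernel would learn from the fd. */
struct model_file {
	bool regular;     /* S_ISREG() holds for the inode */
	bool special_sb;  /* superblock has SB_NOUSER, SB_KERNMOUNT or SB_NOEXEC */
	bool noexec_path; /* path_noexec() would return true */
	bool may_read;    /* caller has read permission */
	bool may_exec;    /* caller has execute permission */
};

/* Returns 0 if execution usage would be allowed, -EACCES otherwise. */
static int model_trusted_for_exec(int policy, const struct model_file *f)
{
	assert(f != NULL);

	/* Any enforced policy rejects non-regular files and special superblocks. */
	if (policy & (MODEL_POLICY_MOUNT | MODEL_POLICY_FILE)) {
		if (!f->regular || f->special_sb)
			return -EACCES;
	}

	/* Mount restriction: the file must not sit on a noexec mount. */
	if ((policy & MODEL_POLICY_MOUNT) && f->noexec_path)
		return -EACCES;

	/* File restriction: execute permission is only required with bit 1. */
	if ((policy & MODEL_POLICY_FILE) && !f->may_exec)
		return -EACCES;

	/* In every case the file must be readable to be interpreted. */
	if (!f->may_read)
		return -EACCES;

	return 0;
}
```

In this model, a readable regular file without the execute bit passes with policy 0 or 1 and is denied with policy 2 or 3, which mirrors how the kernel code clears MAY_EXEC from the mask when the file permission bit is not set.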

SYSCALL_DEFINE1(chdir, const char __user *, filename)
{
struct path path;
2 changes: 1 addition & 1 deletion fs/proc/proc_sysctl.c
@@ -26,7 +26,7 @@ static const struct file_operations proc_sys_dir_file_operations;
static const struct inode_operations proc_sys_dir_operations;

/* shared constants to be used in various sysctls */
-const int sysctl_vals[] = { -1, 0, 1, 2, 4, 100, 200, 1000, 3000, INT_MAX, 65535 };
+const int sysctl_vals[] = { -1, 0, 1, 2, 4, 100, 200, 1000, 3000, INT_MAX, 65535, 3 };
EXPORT_SYMBOL(sysctl_vals);

const unsigned long sysctl_long_vals[] = { 0, 1, LONG_MAX };
1 change: 1 addition & 0 deletions include/linux/syscalls.h
@@ -458,6 +458,7 @@ asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len);
asmlinkage long sys_faccessat(int dfd, const char __user *filename, int mode);
asmlinkage long sys_faccessat2(int dfd, const char __user *filename, int mode,
int flags);
asmlinkage long sys_trusted_for(int fd, int usage, u32 flags);
asmlinkage long sys_chdir(const char __user *filename);
asmlinkage long sys_fchdir(unsigned int fd);
asmlinkage long sys_chroot(const char __user *filename);
3 changes: 3 additions & 0 deletions include/linux/sysctl.h
@@ -51,6 +51,7 @@ struct ctl_dir;

/* this is needed for the proc_dointvec_minmax for [fs_]overflow UID and GID */
#define SYSCTL_MAXOLDUID ((void *)&sysctl_vals[10])
#define SYSCTL_THREE ((void *)&sysctl_vals[11])

extern const int sysctl_vals[];

@@ -69,6 +70,8 @@ int proc_dobool(struct ctl_table *table, int write, void *buffer,
int proc_dointvec(struct ctl_table *, int, void *, size_t *, loff_t *);
int proc_douintvec(struct ctl_table *, int, void *, size_t *, loff_t *);
int proc_dointvec_minmax(struct ctl_table *, int, void *, size_t *, loff_t *);
int proc_dointvec_minmax_sysadmin(struct ctl_table *, int, void *, size_t *,
loff_t *);
int proc_douintvec_minmax(struct ctl_table *table, int write, void *buffer,
size_t *lenp, loff_t *ppos);
int proc_dou8vec_minmax(struct ctl_table *table, int write, void *buffer,
5 changes: 4 additions & 1 deletion include/uapi/asm-generic/unistd.h
@@ -886,8 +886,11 @@ __SYSCALL(__NR_futex_waitv, sys_futex_waitv)
#define __NR_set_mempolicy_home_node 450
__SYSCALL(__NR_set_mempolicy_home_node, sys_set_mempolicy_home_node)

#define __NR_trusted_for 451
__SYSCALL(__NR_trusted_for, sys_trusted_for)

#undef __NR_syscalls
-#define __NR_syscalls 451
+#define __NR_syscalls 452

/*
* 32 bit systems traditionally used different
18 changes: 18 additions & 0 deletions include/uapi/linux/trusted-for.h
@@ -0,0 +1,18 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
#ifndef _UAPI_LINUX_TRUSTED_FOR_H
#define _UAPI_LINUX_TRUSTED_FOR_H

/**
 * enum trusted_for_usage - Usage for which a file descriptor is trusted
 *
 * Argument of trusted_for(2).
 */
enum trusted_for_usage {
	/**
	 * @TRUSTED_FOR_EXECUTION: Check that the data read from a file
	 * descriptor is trusted to be executed or interpreted (e.g. scripts).
	 */
	TRUSTED_FOR_EXECUTION = 1,
};

#endif /* _UAPI_LINUX_TRUSTED_FOR_H */
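A user-space caller would likely reach this syscall through syscall(2), since no libc wrapper is shown in this commit. The following is a minimal sketch under stated assumptions: `trusted_for_sketch()` and `SKETCH_TRUSTED_FOR_EXECUTION` are hypothetical names, and the call is only issued when the build headers define `__NR_trusted_for`, which is expected only for headers generated from a tree that carries this change. The raw number 451 is deliberately not hard-coded, because on other kernels that number may belong to a different syscall.

```c
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Mirrors TRUSTED_FOR_EXECUTION = 1 from the UAPI header in this commit. */
#define SKETCH_TRUSTED_FOR_EXECUTION 1

/*
 * Hypothetical wrapper: returns 0 if the kernel trusts the fd for the
 * given usage, or -1 with errno set (EACCES, EBADF, EINVAL, or ENOSYS
 * when the headers used for the build do not know the syscall).
 */
static int trusted_for_sketch(int fd, int usage, unsigned int flags)
{
#ifdef __NR_trusted_for
	return (int)syscall(__NR_trusted_for, fd, usage, flags);
#else
	(void)fd;
	(void)usage;
	(void)flags;
	errno = ENOSYS;
	return -1;
#endif
}
```

An interpreter could call `trusted_for_sketch(fd, SKETCH_TRUSTED_FOR_EXECUTION, 0)` after opening a script and before reading commands from it, refusing to proceed on EACCES. How to treat ENOSYS (fall back to running the script, or refuse) is a policy decision left to the caller.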
9 changes: 0 additions & 9 deletions kernel/printk/sysctl.c
@@ -11,15 +11,6 @@

static const int ten_thousand = 10000;

static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write,
void *buffer, size_t *lenp, loff_t *ppos)
{
if (write && !capable(CAP_SYS_ADMIN))
return -EPERM;

return proc_dointvec_minmax(table, write, buffer, lenp, ppos);
}

static struct ctl_table printk_sysctls[] = {
{
.procname = "printk",