proc_sys(5) Linux Man Page - System Grab Bag

Name

/proc/sys/ - system information, and sysctl pseudo-filesystem

Description

/proc/sys/ This directory (present since Linux 1.3.57) contains a number of files and subdirectories corresponding to kernel variables. These variables can be read and in some cases modified using the /proc filesystem, and the (deprecated) sysctl(2) system call.

String values may be terminated by either '\0' or '\n'.

Integer and long values may be written either in decimal or in hexadecimal notation (e.g., 0x3FFF). When writing multiple integer or long values, these may be separated by any of the following whitespace characters: ' ', '\t', or '\n'. Using other separators leads to the error EINVAL.
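The notation rules above can be sketched in Python as follows. The helper name parse_sysctl_ints is hypothetical, and the code only mirrors the documented rules (decimal or 0x-prefixed hexadecimal tokens separated by space, tab, or newline); it is not the kernel's actual parser.

```python
def parse_sysctl_ints(text):
    """Parse integers separated by space, tab, or newline.

    Each token may be decimal or 0x-prefixed hexadecimal. Any other
    separator (a comma, say) makes int() raise ValueError, which plays
    the role of EINVAL in this sketch.
    """
    values = []
    for token in text.replace("\t", " ").replace("\n", " ").split(" "):
        if not token:
            continue  # skip empty strings produced by repeated separators
        is_hex = token.lower().lstrip("+-").startswith("0x")
        values.append(int(token, 16 if is_hex else 10))
    return values


print(parse_sysctl_ints("0x3FFF 10\t20\n"))  # [16383, 10, 20]
```

A string such as "1,2" raises ValueError here, in the same spirit as the EINVAL error described above.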

/proc/sys/abi/ (since Linux 2.4.10) This directory may contain files with application binary information. See the Linux kernel source file Documentation/sysctl/abi.rst (or Documentation/sysctl/abi.txt before Linux 5.3) for more information.

/proc/sys/debug/ This directory may be empty.

/proc/sys/dev/ This directory contains device-specific information (e.g., dev/cdrom/info). On some systems, it may be empty.

/proc/sys/fs/ This directory contains the files and subdirectories for kernel variables related to filesystems.

/proc/sys/fs/aio-max-nr and /proc/sys/fs/aio-nr (since Linux 2.6.4) aio-nr is the running total of the number of events specified by io_setup(2) calls for all currently active AIO contexts. If aio-nr reaches aio-max-nr, then io_setup(2) will fail with the error EAGAIN. Raising aio-max-nr does not result in the preallocation or resizing of any kernel data structures.

/proc/sys/fs/binfmt_misc Documentation for files in this directory can be found in the Linux kernel source in the file Documentation/admin-guide/binfmt-misc.rst (or in Documentation/binfmt_misc.txt on older kernels).

/proc/sys/fs/dentry-state (since Linux 2.2) This file contains information about the status of the directory cache (dcache). The file contains six numbers: nr_dentry, nr_unused, age_limit (age in seconds), want_pages (pages requested by system), and two dummy values.

nr_dentry is the number of allocated dentries (dcache entries). This field is unused in Linux 2.2.

nr_unused is the number of unused dentries.

age_limit is the age in seconds after which dcache entries can be reclaimed when memory is short.

want_pages is nonzero when the kernel has called shrink_dcache_pages() and the dcache isn't pruned yet.
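As an illustration of the six-field layout just described, here is a minimal Python sketch. The names DentryState and parse_dentry_state are hypothetical, the two trailing fields are treated as the dummy values this page mentions, and the sample string is invented for the example rather than taken from a real system.

```python
from collections import namedtuple

# Field order follows the description above; dummy1/dummy2 are the
# two dummy values.
DentryState = namedtuple(
    "DentryState",
    "nr_dentry nr_unused age_limit want_pages dummy1 dummy2",
)


def parse_dentry_state(text):
    """Split the contents of dentry-state into its six integer fields."""
    fields = [int(field) for field in text.split()]
    if len(fields) != 6:
        raise ValueError("expected six fields, got %d" % len(fields))
    return DentryState(*fields)


# Invented sample contents, for illustration only.
sample = "81234\t60321\t45\t0\t0\t0\n"
state = parse_dentry_state(sample)
print("dentries in use:", state.nr_dentry - state.nr_unused)
```

On a live system the same function could be fed the result of reading /proc/sys/fs/dentry-state.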

/proc/sys/fs/dir-notify-enable This file can be used to disable or enable the dnotify interface described in fcntl(2) on a system-wide basis. A value of 0 in this file disables the interface, and a value of 1 enables it.

/proc/sys/fs/dquot-max This file shows the maximum number of cached disk quota entries. On some (2.4) systems, it is not present. If the number of free cached disk quota entries is very low and you have a very large number of simultaneous system users, you might want to raise the limit.

/proc/sys/fs/dquot-nr This file shows the number of allocated disk quota entries and the number of free disk quota entries.

/proc/sys/fs/epoll/ (since Linux 2.6.28) This directory contains the file max_user_watches, which can be used to limit the amount of kernel memory consumed by the epoll interface. For further details, see epoll(7).

/proc/sys/fs/file-max This file defines a system-wide limit on the number of open files for all processes. System calls that fail when encountering this limit fail with the error ENFILE. (See also setrlimit(2), which can be used by a process to set the per-process limit, RLIMIT_NOFILE, on the number of files it may open.) If you get lots of error messages in the kernel log about running out of file handles (open file descriptions) (look for "VFS: file-max limit <number> reached"), try increasing this value:

echo 100000 > /proc/sys/fs/file-max

Privileged processes (CAP_SYS_ADMIN) can override the file-max limit.

/proc/sys/fs/file-nr This (read-only) file contains three numbers: the number of allocated file handles (i.e., the number of open file descriptions; see open(2)); the number of free file handles; and the maximum number of file handles (i.e., the same value as /proc/sys/fs/file-max). If the number of allocated file handles is close to the maximum, you should consider increasing the maximum. Before Linux 2.6, the kernel allocated file handles dynamically, but it didn't free them again. Instead the free file handles were kept in a list for reallocation; the "free file handles" value indicates the size of that list. A large number of free file handles indicates that there was a past peak in the usage of open file handles. Since Linux 2.6, the kernel does deallocate freed file handles, and the "free file handles" value is always zero.
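To judge whether the allocated count is "close to the maximum", one could compute the ratio of the first and third fields. The sketch below assumes the three-number layout described above; the function name file_handle_usage and the sample string are hypothetical.

```python
def file_handle_usage(file_nr_text):
    """Return allocated file handles as a fraction of the maximum.

    The input is the contents of file-nr: allocated, free, maximum.
    """
    allocated, _free, maximum = (int(field) for field in file_nr_text.split())
    return allocated / maximum


# Invented sample contents, for illustration only.
usage = file_handle_usage("9000\t0\t100000\n")
print("file handle usage: %.1f%%" % (usage * 100))
```

What counts as "close" is a local policy decision; this page does not define a threshold.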

/proc/sys/fs/inode-max (only present until Linux 2.2) This file contains the maximum number of in-memory inodes. This value should be 3–4 times larger than the value in file-max, since stdin, stdout and network sockets also need an inode to handle them. When you regularly run out of inodes, you need to increase this value.

Starting with Linux 2.4, there is no longer a static limit on the number of inodes, and this file is removed.

/proc/sys/fs/inode-nr This file contains the first two values from inode-state.

/proc/sys/fs/inode-state This file contains seven numbers: nr_inodes, nr_free_inodes, preshrink, and four dummy values (always zero).

nr_inodes is the number of inodes the system has allocated. nr_free_inodes represents the number of free inodes.

preshrink is nonzero when nr_inodes > inode-max and the system needs to prune the inode list instead of allocating more; since Linux 2.4, this field is a dummy value (always zero).

/proc/sys/fs/inotify/ (since Linux 2.6.13) This directory contains files max_queued_events, max_user_instances, and max_user_watches, that can be used to limit the amount of kernel memory consumed by the inotify interface. For further details, see inotify(7).

/proc/sys/fs/lease-break-time This file specifies the grace period that the kernel grants to a process holding a file lease (fcntl(2)) after it has sent a signal to that process notifying it that another process is waiting to open the file. If the lease holder does not remove or downgrade the lease within this grace period, the kernel forcibly breaks the lease.

/proc/sys/fs/leases-enable This file can be used to enable or disable file leases (fcntl(2)) on a system-wide basis. If this file contains the value 0, leases are disabled. A nonzero value enables leases.

/proc/sys/fs/mount-max (since Linux 4.9) The value in this file specifies the maximum number of mounts that may exist in a mount namespace. The default value in this file is 100,000.

/proc/sys/fs/mqueue/ (since Linux 2.6.6) This directory contains files msg_max, msgsize_max, and queues_max, controlling the resources used by POSIX message queues. See mq_overview(7) for details.

/proc/sys/fs/nr_open (since Linux 2.6.25) This file imposes a ceiling on the value to which the RLIMIT_NOFILE resource limit can be raised (see getrlimit(2)). This ceiling is enforced for both unprivileged and privileged processes. The default value in this file is 1048576. (Before Linux 2.6.25, the ceiling for RLIMIT_NOFILE was hard-coded to the same value.)

/proc/sys/fs/overflowgid and /proc/sys/fs/overflowuid These files allow you to change the value of the fixed UID and GID. The default is 65534. Some filesystems support only 16-bit UIDs and GIDs, although in Linux UIDs and GIDs are 32 bits. When one of these filesystems is mounted with writes enabled, any UID or GID that would exceed 65535 is translated to the overflow value before being written to disk.
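The translation rule can be pictured with a small Python sketch. The function name id_for_16bit_fs and the constant names are hypothetical; the numbers 65535 and 65534 come from the description above.

```python
MAX_16BIT_ID = 65535        # largest ID a 16-bit field can hold
DEFAULT_OVERFLOW_ID = 65534  # default value of overflowuid/overflowgid


def id_for_16bit_fs(uid_or_gid, overflow_id=DEFAULT_OVERFLOW_ID):
    """Return the ID that would be stored on a filesystem with 16-bit IDs."""
    if uid_or_gid > MAX_16BIT_ID:
        return overflow_id  # ID does not fit, so the overflow value is used
    return uid_or_gid


print(id_for_16bit_fs(1000))    # 1000
print(id_for_16bit_fs(100000))  # 65534
```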

/proc/sys/fs/pipe-max-size (since Linux 2.6.35) See pipe(7).

/proc/sys/fs/pipe-user-pages-hard (since Linux 4.5) See pipe(7).

/proc/sys/fs/pipe-user-pages-soft (since Linux 4.5) See pipe(7).

/proc/sys/fs/protected_fifos (since Linux 4.19) The value in this file is, or can be set to, one of the following:

0 Writing to FIFOs is unrestricted.

1 Don't allow O_CREAT open(2) on FIFOs that the caller doesn't own in world-writable sticky directories, unless the FIFO is owned by the owner of the directory.

2 As for the value 1, but the restriction also applies to group-writable sticky directories.

The intent of the above protections is to avoid unintentional writes to an attacker-controlled FIFO when a program expected to create a regular file.

/proc/sys/fs/protected_hardlinks (since Linux 3.6) When the value in this file is 0, no restrictions are placed on the creation of hard links (i.e., this is the historical behavior before Linux 3.6). When the value in this file is 1, a hard link can be created to a target file only if one of the following conditions is true:

The calling process has the CAP_FOWNER capability in its user namespace and the file UID has a mapping in the namespace.

The filesystem UID of the process creating the link matches the owner (UID) of the target file (as described in credentials(7), a process's filesystem UID is normally the same as its effective UID).

All of the following conditions are true:

the target is a regular file;

the target file does not have its set-user-ID mode bit enabled;

the target file does not have both its set-group-ID and group-executable mode bits enabled; and

the caller has permission to read and write the target file (either via the file's permissions mask or because it has suitable capabilities).

The default value in this file is 0. Setting the value to 1 prevents a longstanding class of security issues caused by hard-link-based time-of-check, time-of-use races, most commonly seen in world-writable directories such as /tmp. The common method of exploiting this flaw is to cross privilege boundaries when following a given hard link (i.e., a root process follows a hard link created by another user). Additionally, on systems without separated partitions, this stops unauthorized users from "pinning" vulnerable set-user-ID and set-group-ID files against being upgraded by the administrator, or linking to special files.
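The conditions listed for protected_hardlinks can be read as a single predicate. The following Python sketch restates them; the LinkTarget class, its field names, and may_create_hard_link are hypothetical, and the has_cap_fowner flag is assumed to already account for the requirement that the file UID has a mapping in the namespace. It models the documented rules only, not the kernel implementation.

```python
from dataclasses import dataclass


@dataclass
class LinkTarget:
    owner_uid: int               # owner (UID) of the target file
    is_regular: bool             # target is a regular file
    setuid: bool                 # set-user-ID mode bit enabled
    setgid_and_group_exec: bool  # set-group-ID and group-executable both set
    caller_can_read_write: bool  # caller may read and write the target


def may_create_hard_link(protected_hardlinks, fsuid, has_cap_fowner, target):
    """Apply the protected_hardlinks rules described above."""
    if protected_hardlinks == 0:
        return True   # historical behavior: no restrictions
    if has_cap_fowner:
        return True   # CAP_FOWNER, with the file UID mapped (assumed)
    if fsuid == target.owner_uid:
        return True   # caller's filesystem UID owns the target
    # Otherwise all four "safe source" conditions must hold.
    return (
        target.is_regular
        and not target.setuid
        and not target.setgid_and_group_exec
        and target.caller_can_read_write
    )


suid_root_file = LinkTarget(0, True, True, False, True)
print(may_create_hard_link(1, 1000, False, suid_root_file))  # False
```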

/proc/sys/fs/protected_regular (since Linux 4.19) The value in this file is, or can be set to, one of the following:

0 Writing to regular files is unrestricted.

1 Don't allow O_CREAT open(2) on regular files that the caller doesn't own in world-writable sticky directories, unless the regular file is owned by the owner of the directory.

2 As for the value 1, but the restriction also applies to group-writable sticky directories.

The intent of the above protections is similar to protected_fifos, but allows an application to avoid writes to an attacker-controlled regular file, where the application expected to create one.

/proc/sys/fs/protected_symlinks (since Linux 3.6) When the value in this file is 0, no restrictions are placed on following symbolic links (i.e., this is the historical behavior before Linux 3.6). When the value in this file is 1, symbolic links are followed only in the following circumstances:

the filesystem UID of the process following the link matches the owner (UID) of the symbolic link (as described in credentials(7), a process's filesystem UID is normally the same as its effective UID);

the link is not in a sticky world-writable directory; or

the symbolic link and its parent directory have the same owner (UID).

A system call that fails to follow a symbolic link because of the above restrictions returns the error EACCES in errno.

The default value in this file is 0. Setting the value to 1 avoids a longstanding class of security issues based on time-of-check, time-of-use races when accessing symbolic links.
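The three circumstances listed for protected_symlinks are alternatives, so the check reduces to a logical OR. A minimal Python sketch of that reading follows; the function and parameter names are hypothetical, and the code reflects only the rules as stated on this page.

```python
def may_follow_symlink(protected_symlinks, fsuid, link_uid, dir_uid,
                       dir_is_sticky_world_writable):
    """Return True if the link may be followed under the rules above.

    A False result corresponds to the system call failing with EACCES.
    """
    if protected_symlinks == 0:
        return True  # historical behavior: no restrictions
    return (
        fsuid == link_uid                    # follower owns the link
        or not dir_is_sticky_world_writable  # link is outside such a directory
        or link_uid == dir_uid               # link and directory share an owner
    )


# A link owned by UID 1001 in a sticky world-writable directory owned by
# root, followed by a process whose filesystem UID is 1000:
print(may_follow_symlink(1, 1000, 1001, 0, True))  # False
```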

/proc/sys/fs/suid_dumpable (since Linux 2.6.13) The value in this file is assigned to a process's "dumpable" flag in the circumstances described in prctl(2). In effect, the value in this file determines whether core dump files are produced for set-user-ID or otherwise protected/tainted binaries. The "dumpable" setting also affects the ownership of files in a process's /proc/pid directory, as described above.

Three different integer values can be specified:

0 (default) This provides the traditional (pre-Linux 2.6.13) behavior. A core dump will not be produced for a process which has changed credentials (by calling seteuid(2), setgid(2), or similar, or by executing a set-user-ID or set-group-ID program) or whose binary does not have read permission enabled.

1 ("debug") All processes dump core when possible. (Reasons why a process might nevertheless not dump core are described in core(5).) The core dump is owned by the filesystem user ID of the dumping process and no security is applied. This is intended for system debugging situations only: this mode is insecure because it allows unprivileged users to examine the memory contents of privileged processes.

2 ("suidsafe") Any binary which normally would not be dumped (see "0" above) is dumped readable by root only. This allows the user to remove the core dump file but not to read it. For security reasons core dumps in this mode will not overwrite one another or other files. This mode is appropriate when administrators are attempting to debug problems in a normal environment.

Additionally, since Linux 3.6, /proc/sys/kernel/core_pattern must either be an absolute pathname or a pipe command, as detailed in core(5). Warnings will be written to the kernel log if core_pattern does not follow these rules, and no core dump will be produced.

For details of the effect of a process's "dumpable" setting on ptrace access mode checking, see ptrace(2).

/proc/sys/fs/super-max This file controls the maximum number of superblocks, and thus the maximum number of mounted filesystems the kernel can have. You need to increase super-max only if you need to mount more filesystems than the current value allows.

/proc/sys/fs/super-nr This file contains the number of filesystems currently mounted.

/proc/sys/kernel/ This directory contains files controlling a range of kernel parameters, as described below.

/proc/sys/kernel/acct This file contains three numbers: highwater, lowwater, and frequency. If BSD-style process accounting is enabled, these values control its behavior. If free space on the filesystem where the log lives goes below lowwater percent, accounting suspends. If free space gets above highwater percent, accounting resumes. frequency determines how often the kernel checks the amount of free space (value is in seconds). Default values are 4, 2, and 30. That is, suspend accounting if 2% or less space is free; resume it if 4% or more space is free; consider information about the amount of free space valid for 30 seconds.
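The highwater and lowwater values form a hysteresis band: between the two thresholds, accounting stays in whatever state it was already in. The Python sketch below illustrates this using the default values; the function name accounting_active is hypothetical, and the comparisons follow the "2% or less" and "4% or more" wording of the worked example above.

```python
def accounting_active(currently_active, free_percent, highwater=4, lowwater=2):
    """Return whether accounting is active after one free-space check."""
    if currently_active and free_percent <= lowwater:
        return False  # too little free space: suspend
    if not currently_active and free_percent >= highwater:
        return True   # enough free space again: resume
    return currently_active  # inside the band: state is unchanged


state = True
for free in (5, 3, 2, 3, 4):
    state = accounting_active(state, free)
    print(free, state)
```

With the defaults, a drop to 2% suspends accounting, and it stays suspended at 3% until free space reaches 4% again.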

/proc/sys/kernel/auto_msgmni (Linux 2.6.27 to Linux 3.18) From Linux 2.6.27 to Linux 3.18, this file was used to control recomputing of the value in /proc/sys/kernel/msgmni upon the addition or removal of memory or upon IPC namespace creation/removal. Echoing "1" into this file enabled msgmni automatic recomputing (and triggered a recomputation of msgmni based on the current amount of available memory and number of IPC namespaces). Echoing "0" disabled automatic recomputing. (Automatic recomputing was also disabled if a value was explicitly assigned to /proc/sys/kernel/msgmni.) The default value in auto_msgmni was 1.

Since Linux 3.19, the content of this file has no effect (because msgmni defaults to near the maximum value possible), and reads from this file always return the value "0".

/proc/sys/kernel/cap_last_cap (since Linux 3.2) See capabilities(7).

/proc/sys/kernel/cap-bound (from Linux 2.2 to Linux 2.6.24) This file holds the value of the kernel "capability bounding set" (expressed as a signed decimal number). This set is ANDed against the capabilities permitted to a process during execve(2). Starting with Linux 2.6.25, the system-wide capability bounding set disappeared, and was replaced by a per-thread bounding set; see capabilities(7).

/proc/sys/kernel/core_pattern See core(5).

/proc/sys/kernel/core_pipe_limit See core(5).

/proc/sys/kernel/core_uses_pid See core(5).

/proc/sys/kernel/ctrl-alt-del This file controls the handling of Ctrl-Alt-Del from the keyboard. When the value in this file is 0, Ctrl-Alt-Del is trapped and sent to the init(1) program to handle a graceful restart. When the value is greater than zero, Linux's reaction to a Vulcan Nerve Pinch (tm) will be an immediate reboot, without even syncing its dirty buffers. Note: when a program (like dosemu) has the keyboard in "raw" mode, the Ctrl-Alt-Del is intercepted by the program before it ever reaches the kernel tty layer, and it's up to the program to decide what to do with it.

/proc/sys/kernel/dmesg_restrict (since Linux 2.6.37) The value in this file determines who can see kernel syslog contents. A value of 0 in this file imposes no restrictions. If the value is 1, only privileged users can read the kernel syslog. (See syslog(2) for more details.) Since Linux 3.4, only users with the CAP_SYS_ADMIN capability may change the value in this file.

/proc/sys/kernel/domainname and /proc/sys/kernel/hostname These files can be used to set the NIS/YP domainname and the hostname of your box in exactly the same way as the commands domainname(1) and hostname(1), that is:

# echo 'darkstar' > /proc/sys/kernel/hostname
# echo 'mydomain' > /proc/sys/kernel/domainname

has the same effect as

# hostname 'darkstar'
# domainname 'mydomain'

Note, however, that the classic darkstar.frop.org has the hostname "darkstar" and DNS (Internet Domain Name Server) domainname "frop.org", not to be confused with the NIS (Network Information Service) or YP (Yellow Pages) domainname. These two domain names are in general different. For a detailed discussion see the hostname(1) man page.

/proc/sys/kernel/hotplug This file contains the pathname for the hotplug policy agent. The default value in this file is /sbin/hotplug.

/proc/sys/kernel/htab-reclaim (before Linux 2.4.9.2) (PowerPC only) If this file is set to a nonzero value, the PowerPC htab (see kernel file Documentation/powerpc/ppc_htab.txt) is pruned each time the system hits the idle loop.

/proc/sys/kernel/keys/ This directory contains various files that define parameters and limits for the key-management facility. These files are described in keyrings(7).

/proc/sys/kernel/kptr_restrict (since Linux 2.6.38) The value in this file determines whether kernel addresses are exposed via /proc files and other interfaces. A value of 0 in this file imposes no restrictions. If the value is 1, kernel pointers printed using the %pK format specifier will be replaced with zeros unless the user has the CAP_SYSLOG capability. If the value is 2, kernel pointers printed using the %pK format specifier will be replaced with zeros regardless of the user's capabilities. The initial default value for this file was 1, but the default was changed to 0 in Linux 2.6.39. Since Linux 3.4, only users with the CAP_SYS_ADMIN capability can change the value in this file.

/proc/sys/kernel/l2cr (PowerPC only) This file contains a flag that controls the L2 cache of G3 processor boards. If 0, the cache is disabled. Enabled if nonzero.

/proc/sys/kernel/modprobe This file contains the pathname for the kernel module loader. The default value is /sbin/modprobe. The file is present only if the kernel is built with the CONFIG_MODULES (CONFIG_KMOD in Linux 2.6.26 and earlier) option enabled. It is described by the Linux kernel source file Documentation/kmod.txt (present only in Linux 2.4 and earlier).

/proc/sys/kernel/modules_disabled (since Linux 2.6.31) A toggle value indicating if modules are allowed to be loaded in an otherwise modular kernel. This toggle defaults to off (0), but can be set true (1). Once true, modules can be neither loaded nor unloaded, and the toggle cannot be set back to false. The file is present only if the kernel is built with the CONFIG_MODULES option enabled.

/proc/sys/kernel/msgmax (since Linux 2.2) This file defines a system-wide limit specifying the maximum number of bytes in a single message written on a System V message queue.

/proc/sys/kernel/msgmni (since Linux 2.4) This file defines the system-wide limit on the number of message queue identifiers. See also /proc/sys/kernel/auto_msgmni.

/proc/sys/kernel/msgmnb (since Linux 2.2) This file defines a system-wide parameter used to initialize the msg_qbytes setting for subsequently created message queues. The msg_qbytes setting specifies the maximum number of bytes that may be written to the message queue.

/proc/sys/kernel/ngroups_max (since Linux 2.6.4) This is a read-only file that displays the upper limit on the number of a process's group memberships.

/proc/sys/kernel/ns_last_pid (since Linux 3.3) See pid_namespaces(7).

/proc/sys/kernel/ostype and /proc/sys/kernel/osrelease These files give substrings of /proc/version.

/proc/sys/kernel/overflowgid and /proc/sys/kernel/overflowuid These files duplicate the files /proc/sys/fs/overflowgid and /proc/sys/fs/overflowuid.

/proc/sys/kernel/panic This file gives read/write access to the kernel variable panic_timeout. If this is zero, the kernel will loop on a panic; if nonzero, it indicates that the kernel should autoreboot after this number of seconds. When you use the software watchdog device driver, the recommended setting is 60.

/proc/sys/kernel/panic_on_oops (since Linux 2.5.68) This file controls the kernel's behavior when an oops or BUG is encountered. If this file contains 0, then the system tries to continue operation. If it contains 1, then the system delays a few seconds (to give klogd time to record the oops output) and then panics. If the /proc/sys/kernel/panic file is also nonzero, then the machine will be rebooted.

/proc/sys/kernel/pid_max (since Linux 2.5.34) This file specifies the value at which PIDs wrap around (i.e., the value in this file is one greater than the maximum PID). PIDs greater than this value are not allocated; thus, the value in this file also acts as a system-wide limit on the total number of processes and threads. The default value for this file, 32768, results in the same range of PIDs as on earlier kernels. On 32-bit platforms, 32768 is the maximum value for pid_max. On 64-bit systems, pid_max can be set to any value up to 2^22 (PID_MAX_LIMIT, approximately 4 million).

/proc/sys/kernel/powersave-nap (PowerPC only) This file contains a flag. If set, Linux-PPC will use the "nap" mode of powersaving, otherwise the "doze" mode will be used.

/proc/sys/kernel/printk See syslog(2).

/proc/sys/kernel/pty (since Linux 2.6.4) This directory contains two files relating to the number of UNIX 98 pseudoterminals (see pts(4)) on the system.

/proc/sys/kernel/pty/max This file defines the maximum number of pseudoterminals.

/proc/sys/kernel/pty/nr This read-only file indicates how many pseudoterminals are currently in use.

/proc/sys/kernel/random/ This directory contains various parameters controlling the operation of the file /dev/random. See random(4) for further information.

/proc/sys/kernel/random/uuid (since Linux 2.4) Each read from this read-only file returns a randomly generated 128-bit UUID, as a string in the standard UUID format.

/proc/sys/kernel/randomize_va_space (since Linux 2.6.12) Select the address space layout randomization (ASLR) policy for the system (on architectures that support ASLR). Three values are supported for this file:

0 Turn ASLR off. This is the default for architectures that don't support ASLR, and when the kernel is booted with the norandmaps parameter.

1 Make the addresses of mmap(2) allocations, the stack, and the VDSO page randomized. Among other things, this means that shared libraries will be loaded at randomized addresses. The text segment of PIE-linked binaries will also be loaded at a randomized address. This value is the default if the kernel was configured with CONFIG_COMPAT_BRK.

2 (Since Linux 2.6.25) Also support heap randomization. This value is the default if the kernel was not configured with CONFIG_COMPAT_BRK.

/proc/sys/kernel/real-root-dev This file is documented in the Linux kernel source file Documentation/admin-guide/initrd.rst (or Documentation/initrd.txt before Linux 4.10).

/proc/sys/kernel/reboot-cmd (SPARC only) This file seems to be a way to give an argument to the SPARC ROM/Flash boot loader. Maybe to tell it what to do after rebooting?

/proc/sys/kernel/rtsig-max (Up to and including Linux 2.6.7; see setrlimit(2)) This file can be used to tune the maximum number of POSIX real-time (queued) signals that can be outstanding in the system.

/proc/sys/kernel/rtsig-nr (Up to and including Linux 2.6.7.) This file shows the number of POSIX real-time signals currently queued.

/proc/sys/kernel/sched_autogroup_enabled (since Linux 2.6.38) See sched(7).

/proc/sys/kernel/sched_child_runs_first (since Linux 2.6.23) If this file contains the value zero, then, after a fork(2), the parent is first scheduled on the CPU. If the file contains a nonzero value, then the child is scheduled first on the CPU. (Of course, on a multiprocessor system, the parent and the child might both immediately be scheduled on a CPU.)

/proc/sys/kernel/sched_rr_timeslice_ms (since Linux 3.9) See sched_rr_get_interval(2).

/proc/sys/kernel/sched_rt_period_us (since Linux 2.6.25) See sched(7).

/proc/sys/kernel/sched_rt_runtime_us (since Linux 2.6.25) See sched(7).

/proc/sys/kernel/seccomp/ (since Linux 4.14) This directory provides additional seccomp information and configuration. See seccomp(2) for further details.

/proc/sys/kernel/sem (since Linux 2.4) This file contains 4 numbers defining limits for System V IPC semaphores. These fields are, in order:

SEMMSL The maximum semaphores per semaphore set.

SEMMNS A system-wide limit on the number of semaphores in all semaphore sets.

SEMOPM The maximum number of operations that may be specified in a semop(2) call.

SEMMNI A system-wide limit on the maximum number of semaphore identifiers.

/proc/sys/kernel/sg-big-buff This file shows the size of the generic SCSI device (sg) buffer. You can't tune it just yet, but you could change it at compile time by editing include/scsi/sg.h and changing the value of SG_BIG_BUFF. However, there shouldn't be any reason to change this value.

/proc/sys/kernel/shm_rmid_forced (since Linux 3.1) If this file is set to 1, all System V shared memory segments will be marked for destruction as soon as the number of attached processes falls to zero; in other words, it is no longer possible to create shared memory segments that exist independently of any attached process.

The effect is as though a shmctl(2) IPC_RMID is performed on all existing segments as well as all segments created in the future (until this file is reset to 0). Note that existing segments that are attached to no process will be immediately destroyed when this file is set to 1. Setting this option will also destroy segments that were created, but never attached, upon termination of the process that created the segment with shmget(2).

Setting this file to 1 provides a way of ensuring that all System V shared memory segments are counted against the resource usage and resource limits (see the description of RLIMIT_AS in getrlimit(2)) of at least one process.

Because setting this file to 1 produces behavior that is nonstandard and could also break existing applications, the default value in this file is 0. Set this file to 1 only if you have a good understanding of the semantics of the applications using System V shared memory on your system.

/proc/sys/kernel/shmall (since Linux 2.2) This file contains the system-wide limit on the total number of pages of System V shared memory.

/proc/sys/kernel/shmmax (since Linux 2.2) This file can be used to query and set the run-time limit on the maximum (System V IPC) shared memory segment size that can be created. Shared memory segments up to 1 GB are now supported in the kernel. This value defaults to SHMMAX.

/proc/sys/kernel/shmmni (since Linux 2.4) This file specifies the system-wide maximum number of System V shared memory segments that can be created.

/proc/sys/kernel/sysctl_writes_strict (since Linux 3.16) The value in this file determines how the file offset affects the behavior of updating entries in files under /proc/sys. The file has three possible values:

-1 This provides legacy handling, with no printk warnings. Each write(2) must fully contain the value to be written, and multiple writes on the same file descriptor will overwrite the entire value, regardless of the file position.

0 (default) This provides the same behavior as for -1, but printk warnings are written for processes that perform writes when the file offset is not 0.

1 Respect the file offset when writing strings into /proc/sys files. Multiple writes will append to the value buffer. Anything written beyond the maximum length of the value buffer will be ignored. Writes to numeric /proc/sys entries must always be at file offset 0 and the value must be fully contained in the buffer provided to write(2).

/proc/sys/kernel/sysrq This file controls the functions allowed to be invoked by the SysRq key. By default, the file contains 1 meaning that every possible SysRq request is allowed (in older kernel versions, SysRq was disabled by default, and you were required to specifically enable it at run-time, but this is not the case any more). Possible values in this file are:

0 Disable sysrq completely

1 Enable all functions of sysrq

>1 Bit mask of allowed sysrq functions, as follows:

    2  Enable control of console logging level
    4  Enable control of keyboard (SAK, unraw)
    8  Enable debugging dumps of processes etc.
   16  Enable sync command
   32  Enable remount read-only
   64  Enable signaling of processes (term, kill, oom-kill)
  128  Allow reboot/poweroff
  256  Allow nicing of all real-time tasks

This file is present only if the CONFIG_MAGIC_SYSRQ kernel configuration option is enabled. For further details see the Linux kernel source file Documentation/admin-guide/sysrq.rst (or Documentation/sysrq.txt before Linux 4.10).
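A value greater than 1 can be decoded by testing each bit of the mask in turn. The Python sketch below does this using the bit values listed above; the names SYSRQ_FUNCTIONS and decode_sysrq are hypothetical, and the labels are shortened from the descriptions in the list.

```python
# Bit values and (abbreviated) descriptions taken from the list above.
SYSRQ_FUNCTIONS = {
    2: "control of console logging level",
    4: "control of keyboard (SAK, unraw)",
    8: "debugging dumps of processes",
    16: "sync command",
    32: "remount read-only",
    64: "signaling of processes",
    128: "reboot/poweroff",
    256: "nicing of all real-time tasks",
}


def decode_sysrq(value):
    """Return the list of SysRq functions enabled by a sysrq value."""
    if value == 0:
        return []                              # sysrq disabled completely
    if value == 1:
        return list(SYSRQ_FUNCTIONS.values())  # everything enabled
    return [name for bit, name in SYSRQ_FUNCTIONS.items() if value & bit]


# 176 = 128 + 32 + 16
print(decode_sysrq(176))
```

For the value 176, the sketch reports the sync, remount read-only, and reboot/poweroff functions.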

/proc/sys/kernel/version This file contains a string such as:

#5 Wed Feb 25 21:49:24 MET 1998

The "#5" means that this is the fifth kernel built from this source base and the date following it indicates the time the kernel was built.

/proc/sys/kernel/threads-max (since Linux 2.3.11) This file specifies the system-wide limit on the number of threads (tasks) that can be created on the system.

Since Linux 4.1, the value that can be written to threads-max is bounded. The minimum value that can be written is 20. The maximum value that can be written is given by the constant FUTEX_TID_MASK (0x3fffffff). If a value outside of this range is written to threads-max, the error EINVAL occurs.

The value written is checked against the available RAM pages. If the thread structures would occupy too much (more than 1/8th) of the available RAM pages, threads-max is reduced accordingly.
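The range check described for Linux 4.1 and later can be sketched as below. The function name check_threads_max is hypothetical, ValueError stands in for the EINVAL error, and the subsequent reduction based on available RAM is deliberately left out because it depends on kernel-internal sizes not given here.

```python
MIN_THREADS = 20             # minimum value that can be written
FUTEX_TID_MASK = 0x3FFFFFFF  # maximum value that can be written


def check_threads_max(value):
    """Validate a value before it would be written to threads-max."""
    if not MIN_THREADS <= value <= FUTEX_TID_MASK:
        raise ValueError("EINVAL: %d is outside [%d, %d]"
                         % (value, MIN_THREADS, FUTEX_TID_MASK))
    return value


print(check_threads_max(20))
print(check_threads_max(FUTEX_TID_MASK))
```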

/proc/sys/kernel/yama/ptrace_scope (since Linux 3.5) See ptrace(2).

/proc/sys/kernel/zero-paged (PowerPC only) This file contains a flag. When enabled (nonzero), Linux-PPC will pre-zero pages in the idle loop, possibly speeding up get_free_pages.

/proc/sys/net This directory contains networking-related files and subdirectories. Explanations for some of the files under this directory can be found in tcp(7) and ip(7).

/proc/sys/net/core/bpf_jit_enable See bpf(2).

/proc/sys/net/core/somaxconn This file defines a ceiling value for the backlog argument of listen(2); see the listen(2) manual page for details.

/proc/sys/proc This directory may be empty.

/proc/sys/sunrpc This directory supports Sun remote procedure call for network filesystem (NFS). On some systems, it is not present.

/proc/sys/user (since Linux 4.9) See namespaces(7).

/proc/sys/vm/ This directory contains files for memory management tuning, buffer, and cache management.

/proc/sys/vm/admin_reserve_kbytes (since Linux 3.10) This file defines the amount of free memory (in KiB) on the system that should be reserved for users with the capability CAP_SYS_ADMIN.

The default value in this file is the minimum of [3% of free pages, 8MiB] expressed as KiB. The default is intended to provide enough for the superuser to log in and kill a process, if necessary, under the default overcommit 'guess' mode (i.e., 0 in /proc/sys/vm/overcommit_memory).

Systems running in "overcommit never" mode (i.e., 2 in /proc/sys/vm/overcommit_memory) should increase the value in this file to account for the full virtual memory size of the programs used to recover (e.g., login(1), ssh(1), and top(1)). Otherwise, the superuser may not be able to log in to recover the system. For example, on x86-64 a suitable value is 131072 (128MiB reserved).

Changing the value in this file takes effect whenever an application requests memory.
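The default described above is simple arithmetic. As a rough sketch (the function name default_admin_reserve_kb is hypothetical, and the amount of free memory is assumed to be supplied by the caller in KiB):

```sh
# Print min(3% of free memory, 8 MiB) in KiB, the documented default
# for admin_reserve_kbytes. "free_kb" is the free memory in KiB.
default_admin_reserve_kb() {
    free_kb=$1
    three_percent=$(( free_kb * 3 / 100 ))
    cap_kb=$(( 8 * 1024 ))      # 8 MiB expressed in KiB
    if [ "$three_percent" -lt "$cap_kb" ]; then
        printf '%s\n' "$three_percent"
    else
        printf '%s\n' "$cap_kb"
    fi
}
```

With 100000 KiB free, 3% is 3000 KiB, which is below the cap; with 1000000 KiB free, the 8192 KiB cap applies.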

/proc/sys/vm/compact_memory (since Linux 2.6.35) When 1 is written to this file, all zones are compacted such that free memory is available in contiguous blocks where possible. The effect of this action can be seen by examining /proc/buddyinfo.

Present only if the kernel was configured with CONFIG_COMPACTION.
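To compare /proc/buddyinfo before and after compaction, it can help to reduce each line to a single number. The sketch below assumes the usual line layout ("Node N, zone NAME" followed by one count of free blocks per order, starting at order 0); the function name buddyinfo_free_pages is hypothetical.

```sh
# Print the total number of free pages described by one line of
# /proc/buddyinfo. The count in column k is the number of free
# blocks of 2^k pages.
buddyinfo_free_pages() {
    # Split the line into words and drop "Node", "N,", "zone", NAME.
    set -- $1
    shift 4
    total=0
    weight=1
    for count in "$@"; do
        total=$(( total + count * weight ))
        weight=$(( weight * 2 ))
    done
    printf '%s\n' "$total"
}
```

Compaction should leave this total roughly unchanged while moving counts toward the higher-order columns.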

/proc/sys/vm/drop_caches (since Linux 2.6.16) Writing to this file causes the kernel to drop clean caches, dentries, and inodes from memory, causing that memory to become free. This can be useful for memory management testing and performing reproducible filesystem benchmarks. Because writing to this file causes the benefits of caching to be lost, it can degrade overall system performance.

To free pagecache, use:

echo 1 > /proc/sys/vm/drop_caches

To free dentries and inodes, use:

echo 2 > /proc/sys/vm/drop_caches

To free pagecache, dentries, and inodes, use:

echo 3 > /proc/sys/vm/drop_caches

Because writing to this file is a nondestructive operation and dirty objects are not freeable, the user should run sync(1) first.
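The advice above (run sync first, then write 1, 2, or 3) could be wrapped in a small helper. This is a sketch only: the function name drop_caches and its optional second argument (a target path, defaulting to the real file) are assumptions made for illustration, and writing to the real file requires privilege.

```sh
# Flush dirty data with sync(1), then write "value" (1, 2, or 3)
# to the drop_caches file. The optional second argument overrides
# the target path, which is convenient for dry runs.
drop_caches() {
    value=$1
    target=${2:-/proc/sys/vm/drop_caches}
    case $value in
        1|2|3) ;;
        *) echo "drop_caches: expected 1, 2, or 3" >&2; return 1 ;;
    esac
    sync
    printf '%s\n' "$value" > "$target"
}
```

For example, drop_caches 3 frees pagecache, dentries, and inodes after syncing.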

/proc/sys/vm/sysctl_hugetlb_shm_group (since Linux 2.6.7) This writable file contains a group ID that is allowed to allocate memory using huge pages. If a process has a filesystem group ID or any supplementary group ID that matches this group ID, then it can make huge-page allocations without holding the CAP_IPC_LOCK capability; see memfd_create(2), mmap(2), and shmget(2).

/proc/sys/vm/legacy_va_layout (since Linux 2.6.9) If nonzero, this disables the new 32-bit memory-mapping layout; the kernel will use the legacy (2.4) layout for all processes.

/proc/sys/vm/memory_failure_early_kill (since Linux 2.6.32) Controls how to kill processes when an uncorrected memory error (typically a 2-bit error in a memory module) that cannot be handled by the kernel is detected in the background by hardware. In some cases (like the page still having a valid copy on disk), the kernel will handle the failure transparently without affecting any applications. But if there is no other up-to-date copy of the data, it will kill processes to prevent any data corruption from propagating.

The file has one of the following values:

1 Kill all processes that have the corrupted-and-not-reloadable page mapped as soon as the corruption is detected. Note that this is not supported for a few types of pages, such as kernel internally allocated data or the swap cache, but works for the majority of user pages.

0 Unmap the corrupted page from all processes and kill a process only if it tries to access the page.

The kill is performed using a SIGBUS signal with si_code set to BUS_MCEERR_AO. Processes can handle this if they want to; see sigaction(2) for more details.

This feature is active only on architectures/platforms with advanced machine check handling and depends on the hardware capabilities.

Applications can override the memory_failure_early_kill setting individually with the prctl(2) PR_MCE_KILL operation.

Present only if the kernel was configured with CONFIG_MEMORY_FAILURE.

/proc/sys/vm/memory_failure_recovery (since Linux 2.6.32) Enable memory failure recovery (when supported by the platform).

1 Attempt recovery.

0 Always panic on a memory failure.

Present only if the kernel was configured with CONFIG_MEMORY_FAILURE.

/proc/sys/vm/oom_dump_tasks (since Linux 2.6.25) Enables a system-wide task dump (excluding kernel threads) to be produced when the kernel performs an OOM-killing. The dump includes the following information for each task (thread, process): thread ID, real user ID, thread group ID (process ID), virtual memory size, resident set size, the CPU that the task is scheduled on, oom_adj score (see the description of /proc/pid/oom_adj), and command name. This is helpful to determine why the OOM-killer was invoked and to identify the rogue task that caused it.

If this contains the value zero, this information is suppressed. On very large systems with thousands of tasks, it may not be feasible to dump the memory state information for each one. Such systems should not be forced to incur a performance penalty in OOM situations when the information may not be desired.

If this is set to nonzero, this information is shown whenever the OOM-killer actually kills a memory-hogging task.

The default value is 0.

/proc/sys/vm/oom_kill_allocating_task (since Linux 2.6.24) This enables or disables killing the OOM-triggering task in out-of-memory situations.

If this is set to zero, the OOM-killer will scan through the entire tasklist and select a task based on heuristics to kill. This normally selects a rogue memory-hogging task that frees up a large amount of memory when killed.

If this is set to nonzero, the OOM-killer simply kills the task that triggered the out-of-memory condition. This avoids a possibly expensive tasklist scan.

If /proc/sys/vm/panic_on_oom is nonzero, it takes precedence over whatever value is used in /proc/sys/vm/oom_kill_allocating_task.

The default value is 0.

/proc/sys/vm/overcommit_kbytes (since Linux 3.14) This writable file provides an alternative to /proc/sys/vm/overcommit_ratio for controlling the CommitLimit when /proc/sys/vm/overcommit_memory has the value 2. It allows the amount of memory overcommitting to be specified as an absolute value (in kB), rather than as a percentage, as is done with overcommit_ratio. This allows for finer-grained control of CommitLimit on systems with extremely large memory sizes.

Only one of overcommit_kbytes or overcommit_ratio can have an effect: if overcommit_kbytes has a nonzero value, then it is used to calculate CommitLimit, otherwise overcommit_ratio is used. Writing a value to either of these files causes the value in the other file to be set to zero.

/proc/sys/vm/overcommit_memory This file contains the kernel virtual memory accounting mode. Values are:

0: heuristic overcommit (this is the default)

1: always overcommit, never check

2: always check, never overcommit

In mode 0, calls of mmap(2) with MAP_NORESERVE are not checked, and the default check is very weak, leading to the risk of getting a process "OOM-killed".

In mode 1, the kernel pretends there is always enough memory, until memory actually runs out. One use case for this mode is scientific computing applications that employ large sparse arrays. Before Linux 2.6.0, any nonzero value implies mode 1.

In mode 2 (available since Linux 2.6), the total virtual address space that can be allocated (CommitLimit in /proc/meminfo) is calculated as

CommitLimit = (total_RAM - total_huge_TLB) *
	      overcommit_ratio / 100 + total_swap

where:

total_RAM is the total amount of RAM on the system;

total_huge_TLB is the amount of memory set aside for huge pages;

overcommit_ratio is the value in /proc/sys/vm/overcommit_ratio; and

total_swap is the amount of swap space.

For example, on a system with 16 GB of physical RAM, 16 GB of swap, no space dedicated to huge pages, and an overcommit_ratio of 50, this formula yields a CommitLimit of 24 GB.

Since Linux 3.14, if the value in /proc/sys/vm/overcommit_kbytes is nonzero, then CommitLimit is instead calculated as:

CommitLimit = overcommit_kbytes + total_swap

See also the description of /proc/sys/vm/admin_reserve_kbytes and /proc/sys/vm/user_reserve_kbytes.
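Both CommitLimit formulas above can be checked with integer shell arithmetic. The following sketch uses a hypothetical function name, commit_limit_kb, takes all quantities in kB, and treats a missing or zero fifth argument as "overcommit_kbytes not set"; the kernel's own computation works in pages, so results may differ slightly through rounding.

```sh
# Print CommitLimit in kB.
# Arguments: total_RAM, total_huge_TLB, overcommit_ratio (percent),
# total_swap, and optionally overcommit_kbytes (default 0).
commit_limit_kb() {
    total_ram_kb=$1
    total_huge_tlb_kb=$2
    overcommit_ratio=$3
    total_swap_kb=$4
    overcommit_kbytes=${5:-0}
    if [ "$overcommit_kbytes" -ne 0 ]; then
        # A nonzero overcommit_kbytes replaces the ratio-based term.
        printf '%s\n' $(( overcommit_kbytes + total_swap_kb ))
    else
        printf '%s\n' $(( (total_ram_kb - total_huge_tlb_kb) * overcommit_ratio / 100 + total_swap_kb ))
    fi
}
```

Taking 16 GB as 16777216 kB, commit_limit_kb 16777216 0 50 16777216 prints 25165824, that is, the 24 GB from the example above.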

/proc/sys/vm/overcommit_ratio (since Linux 2.6.0) This writable file defines a percentage by which memory can be overcommitted. The default value in the file is 50. See the description of /proc/sys/vm/overcommit_memory.

/proc/sys/vm/panic_on_oom (since Linux 2.6.18) This enables or disables a kernel panic in an out-of-memory situation.

If this file is set to the value 0, the kernel's OOM-killer will kill some rogue process. Usually, the OOM-killer is able to kill a rogue process and the system will survive.

If this file is set to the value 1, then the kernel normally panics when out-of-memory happens. However, if a process limits allocations to certain nodes using memory policies (mbind(2) MPOL_BIND) or cpusets (cpuset(7)) and those nodes reach memory exhaustion status, one process may be killed by the OOM-killer. No panic occurs in this case: because other nodes' memory may be free, this means the system as a whole may not have reached an out-of-memory situation yet.

If this file is set to the value 2, the kernel always panics when an out-of-memory condition occurs.

The default value is 0. The values 1 and 2 are intended for failover in clustered systems; select either according to your failover policy.

/proc/sys/vm/swappiness The value in this file controls how aggressively the kernel will swap memory pages. Higher values increase aggressiveness, lower values decrease aggressiveness. The default value is 60.

/proc/sys/vm/user_reserve_kbytes (since Linux 3.10) Specifies an amount of memory (in KiB) to reserve for user processes. This is intended to prevent a user from starting a single memory-hogging process, such that they cannot recover (kill the hog). The value in this file has an effect only when /proc/sys/vm/overcommit_memory is set to 2 ("overcommit never" mode). In this case, the system reserves an amount of memory that is the minimum of [3% of current process size, user_reserve_kbytes].

The default value in this file is the minimum of [3% of free pages, 128MiB] expressed as KiB.

If the value in this file is set to zero, then a user will be allowed to allocate all free memory with a single process (minus the amount reserved by /proc/sys/vm/admin_reserve_kbytes). Any subsequent attempts to execute a command will result in "fork: Cannot allocate memory".

Changing the value in this file takes effect whenever an application requests memory.
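The amount reserved in "overcommit never" mode, as described above, is the smaller of two quantities. A minimal sketch, assuming both are supplied in KiB and using the hypothetical function name user_reserve_kb:

```sh
# Print min(3% of the current process size, user_reserve_kbytes),
# with both inputs and the result in KiB.
user_reserve_kb() {
    process_size_kb=$1
    user_reserve_kbytes=$2
    three_percent=$(( process_size_kb * 3 / 100 ))
    if [ "$three_percent" -lt "$user_reserve_kbytes" ]; then
        printf '%s\n' "$three_percent"
    else
        printf '%s\n' "$user_reserve_kbytes"
    fi
}
```

For a process of 1000000 KiB and a setting of 131072, the reserve is 30000 KiB; for a process of 10000000 KiB, the setting of 131072 KiB is the limit.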

/proc/sys/vm/unprivileged_userfaultfd (since Linux 5.2) This (writable) file exposes a flag that controls whether unprivileged processes are allowed to employ userfaultfd(2). If this file has the value 1, then unprivileged processes may use userfaultfd(2). If this file has the value 0, then only processes that have the CAP_SYS_PTRACE capability may employ userfaultfd(2). The default value in this file is 1.
