 69361eef90
			
		
	
	
	69361eef90
	
	
	
		
			
			This taint flag will be set if the system has ever entered a softlockup state. Similar to TAINT_WARN it is useful to know whether or not the system has been in a softlockup state when debugging. [akpm@linux-foundation.org: apply the taint before calling panic()] Signed-off-by: Josh Hunt <johunt@akamai.com> Cc: Jason Baron <jbaron@akamai.com> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
		
			
				
	
	
		
			853 lines
		
	
	
	
		
			30 KiB
			
		
	
	
	
		
			Text
		
	
	
	
	
	
			
		
		
	
	
			853 lines
		
	
	
	
		
			30 KiB
			
		
	
	
	
		
			Text
		
	
	
	
	
	
| Documentation for /proc/sys/kernel/*	kernel version 2.2.10
 | |
| 	(c) 1998, 1999,  Rik van Riel <riel@nl.linux.org>
 | |
| 	(c) 2009,        Shen Feng<shen@cn.fujitsu.com>
 | |
| 
 | |
| For general info and legal blurb, please look in README.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| This file contains documentation for the sysctl files in
 | |
| /proc/sys/kernel/ and is valid for Linux kernel version 2.2.
 | |
| 
 | |
| The files in this directory can be used to tune and monitor
 | |
| miscellaneous and general things in the operation of the Linux
 | |
| kernel. Since some of the files _can_ be used to screw up your
 | |
| system, it is advisable to read both documentation and source
 | |
| before actually making adjustments.
 | |
| 
 | |
| Currently, these files might (depending on your configuration)
 | |
| show up in /proc/sys/kernel:
 | |
| 
 | |
| - acct
 | |
| - acpi_video_flags
 | |
| - auto_msgmni
 | |
| - bootloader_type	     [ X86 only ]
 | |
| - bootloader_version	     [ X86 only ]
 | |
| - callhome		     [ S390 only ]
 | |
| - cap_last_cap
 | |
| - core_pattern
 | |
| - core_pipe_limit
 | |
| - core_uses_pid
 | |
| - ctrl-alt-del
 | |
| - dmesg_restrict
 | |
| - domainname
 | |
| - hostname
 | |
| - hotplug
 | |
| - hung_task_panic
 | |
| - hung_task_check_count
 | |
| - hung_task_timeout_secs
 | |
| - hung_task_warnings
 | |
| - kexec_load_disabled
 | |
| - kptr_restrict
 | |
| - kstack_depth_to_print       [ X86 only ]
 | |
| - l2cr                        [ PPC only ]
 | |
| - modprobe                    ==> Documentation/debugging-modules.txt
 | |
| - modules_disabled
 | |
| - msg_next_id		      [ sysv ipc ]
 | |
| - msgmax
 | |
| - msgmnb
 | |
| - msgmni
 | |
| - nmi_watchdog
 | |
| - osrelease
 | |
| - ostype
 | |
| - overflowgid
 | |
| - overflowuid
 | |
| - panic
 | |
| - panic_on_oops
 | |
| - panic_on_unrecovered_nmi
 | |
| - panic_on_stackoverflow
 | |
| - pid_max
 | |
| - powersave-nap               [ PPC only ]
 | |
| - printk
 | |
| - printk_delay
 | |
| - printk_ratelimit
 | |
| - printk_ratelimit_burst
 | |
| - randomize_va_space
 | |
| - real-root-dev               ==> Documentation/initrd.txt
 | |
| - reboot-cmd                  [ SPARC only ]
 | |
| - rtsig-max
 | |
| - rtsig-nr
 | |
| - sem
 | |
| - sem_next_id		      [ sysv ipc ]
 | |
| - sg-big-buff                 [ generic SCSI device (sg) ]
 | |
| - shm_next_id		      [ sysv ipc ]
 | |
| - shm_rmid_forced
 | |
| - shmall
 | |
| - shmmax                      [ sysv ipc ]
 | |
| - shmmni
 | |
| - softlockup_all_cpu_backtrace
 | |
| - stop-a                      [ SPARC only ]
 | |
| - sysrq                       ==> Documentation/sysrq.txt
 | |
| - sysctl_writes_strict
 | |
| - tainted
 | |
| - threads-max
 | |
| - unknown_nmi_panic
 | |
| - watchdog_thresh
 | |
| - version
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| acct:
 | |
| 
 | |
| highwater lowwater frequency
 | |
| 
 | |
| If BSD-style process accounting is enabled these values control
 | |
| its behaviour. If free space on filesystem where the log lives
 | |
| goes below <lowwater>% accounting suspends. If free space gets
 | |
| above <highwater>% accounting resumes. <Frequency> determines
 | |
| how often do we check the amount of free space (value is in
 | |
| seconds). Default:
 | |
| 4 2 30
 | |
| That is, suspend accounting if there left <= 2% free; resume it
 | |
| if we got >=4%; consider information about amount of free space
 | |
| valid for 30 seconds.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| acpi_video_flags:
 | |
| 
 | |
| flags
 | |
| 
 | |
| See Doc*/kernel/power/video.txt, it allows mode of video boot to be
 | |
| set during run time.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| auto_msgmni:
 | |
| 
 | |
| Enables/Disables automatic recomputing of msgmni upon memory add/remove
 | |
| or upon ipc namespace creation/removal (see the msgmni description
 | |
| above). Echoing "1" into this file enables msgmni automatic recomputing.
 | |
| Echoing "0" turns it off. auto_msgmni default value is 1.
 | |
| 
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| bootloader_type:
 | |
| 
 | |
| x86 bootloader identification
 | |
| 
 | |
| This gives the bootloader type number as indicated by the bootloader,
 | |
| shifted left by 4, and OR'd with the low four bits of the bootloader
 | |
| version.  The reason for this encoding is that this used to match the
 | |
| type_of_loader field in the kernel header; the encoding is kept for
 | |
| backwards compatibility.  That is, if the full bootloader type number
 | |
| is 0x15 and the full version number is 0x234, this file will contain
 | |
| the value 340 = 0x154.
 | |
| 
 | |
| See the type_of_loader and ext_loader_type fields in
 | |
| Documentation/x86/boot.txt for additional information.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| bootloader_version:
 | |
| 
 | |
| x86 bootloader version
 | |
| 
 | |
| The complete bootloader version number.  In the example above, this
 | |
| file will contain the value 564 = 0x234.
 | |
| 
 | |
| See the type_of_loader and ext_loader_ver fields in
 | |
| Documentation/x86/boot.txt for additional information.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| callhome:
 | |
| 
 | |
| Controls the kernel's callhome behavior in case of a kernel panic.
 | |
| 
 | |
| The s390 hardware allows an operating system to send a notification
 | |
| to a service organization (callhome) in case of an operating system panic.
 | |
| 
 | |
| When the value in this file is 0 (which is the default behavior)
 | |
| nothing happens in case of a kernel panic. If this value is set to "1"
 | |
| the complete kernel oops message is send to the IBM customer service
 | |
| organization in case the mainframe the Linux operating system is running
 | |
| on has a service contract with IBM.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| cap_last_cap
 | |
| 
 | |
| Highest valid capability of the running kernel.  Exports
 | |
| CAP_LAST_CAP from the kernel.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| core_pattern:
 | |
| 
 | |
| core_pattern is used to specify a core dumpfile pattern name.
 | |
| . max length 128 characters; default value is "core"
 | |
| . core_pattern is used as a pattern template for the output filename;
 | |
|   certain string patterns (beginning with '%') are substituted with
 | |
|   their actual values.
 | |
| . backward compatibility with core_uses_pid:
 | |
| 	If core_pattern does not include "%p" (default does not)
 | |
| 	and core_uses_pid is set, then .PID will be appended to
 | |
| 	the filename.
 | |
| . corename format specifiers:
 | |
| 	%<NUL>	'%' is dropped
 | |
| 	%%	output one '%'
 | |
| 	%p	pid
 | |
| 	%P	global pid (init PID namespace)
 | |
| 	%u	uid
 | |
| 	%g	gid
 | |
| 	%d	dump mode, matches PR_SET_DUMPABLE and
 | |
| 		/proc/sys/fs/suid_dumpable
 | |
| 	%s	signal number
 | |
| 	%t	UNIX time of dump
 | |
| 	%h	hostname
 | |
| 	%e	executable filename (may be shortened)
 | |
| 	%E	executable path
 | |
| 	%<OTHER> both are dropped
 | |
| . If the first character of the pattern is a '|', the kernel will treat
 | |
|   the rest of the pattern as a command to run.  The core dump will be
 | |
|   written to the standard input of that program instead of to a file.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| core_pipe_limit:
 | |
| 
 | |
| This sysctl is only applicable when core_pattern is configured to pipe
 | |
| core files to a user space helper (when the first character of
 | |
| core_pattern is a '|', see above).  When collecting cores via a pipe
 | |
| to an application, it is occasionally useful for the collecting
 | |
| application to gather data about the crashing process from its
 | |
| /proc/pid directory.  In order to do this safely, the kernel must wait
 | |
| for the collecting process to exit, so as not to remove the crashing
 | |
| processes proc files prematurely.  This in turn creates the
 | |
| possibility that a misbehaving userspace collecting process can block
 | |
| the reaping of a crashed process simply by never exiting.  This sysctl
 | |
| defends against that.  It defines how many concurrent crashing
 | |
| processes may be piped to user space applications in parallel.  If
 | |
| this value is exceeded, then those crashing processes above that value
 | |
| are noted via the kernel log and their cores are skipped.  0 is a
 | |
| special value, indicating that unlimited processes may be captured in
 | |
| parallel, but that no waiting will take place (i.e. the collecting
 | |
| process is not guaranteed access to /proc/<crashing pid>/).  This
 | |
| value defaults to 0.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| core_uses_pid:
 | |
| 
 | |
| The default coredump filename is "core".  By setting
 | |
| core_uses_pid to 1, the coredump filename becomes core.PID.
 | |
| If core_pattern does not include "%p" (default does not)
 | |
| and core_uses_pid is set, then .PID will be appended to
 | |
| the filename.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| ctrl-alt-del:
 | |
| 
 | |
| When the value in this file is 0, ctrl-alt-del is trapped and
 | |
| sent to the init(1) program to handle a graceful restart.
 | |
| When, however, the value is > 0, Linux's reaction to a Vulcan
 | |
| Nerve Pinch (tm) will be an immediate reboot, without even
 | |
| syncing its dirty buffers.
 | |
| 
 | |
| Note: when a program (like dosemu) has the keyboard in 'raw'
 | |
| mode, the ctrl-alt-del is intercepted by the program before it
 | |
| ever reaches the kernel tty layer, and it's up to the program
 | |
| to decide what to do with it.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| dmesg_restrict:
 | |
| 
 | |
| This toggle indicates whether unprivileged users are prevented
 | |
| from using dmesg(8) to view messages from the kernel's log buffer.
 | |
| When dmesg_restrict is set to (0) there are no restrictions. When
 | |
| dmesg_restrict is set set to (1), users must have CAP_SYSLOG to use
 | |
| dmesg(8).
 | |
| 
 | |
| The kernel config option CONFIG_SECURITY_DMESG_RESTRICT sets the
 | |
| default value of dmesg_restrict.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| domainname & hostname:
 | |
| 
 | |
| These files can be used to set the NIS/YP domainname and the
 | |
| hostname of your box in exactly the same way as the commands
 | |
| domainname and hostname, i.e.:
 | |
| # echo "darkstar" > /proc/sys/kernel/hostname
 | |
| # echo "mydomain" > /proc/sys/kernel/domainname
 | |
| has the same effect as
 | |
| # hostname "darkstar"
 | |
| # domainname "mydomain"
 | |
| 
 | |
| Note, however, that the classic darkstar.frop.org has the
 | |
| hostname "darkstar" and DNS (Internet Domain Name Server)
 | |
| domainname "frop.org", not to be confused with the NIS (Network
 | |
| Information Service) or YP (Yellow Pages) domainname. These two
 | |
| domain names are in general different. For a detailed discussion
 | |
| see the hostname(1) man page.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| hotplug:
 | |
| 
 | |
| Path for the hotplug policy agent.
 | |
| Default value is "/sbin/hotplug".
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| hung_task_panic:
 | |
| 
 | |
| Controls the kernel's behavior when a hung task is detected.
 | |
| This file shows up if CONFIG_DETECT_HUNG_TASK is enabled.
 | |
| 
 | |
| 0: continue operation. This is the default behavior.
 | |
| 
 | |
| 1: panic immediately.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| hung_task_check_count:
 | |
| 
 | |
| The upper bound on the number of tasks that are checked.
 | |
| This file shows up if CONFIG_DETECT_HUNG_TASK is enabled.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| hung_task_timeout_secs:
 | |
| 
 | |
| Check interval. When a task in D state did not get scheduled
 | |
| for more than this value report a warning.
 | |
| This file shows up if CONFIG_DETECT_HUNG_TASK is enabled.
 | |
| 
 | |
| 0: means infinite timeout - no checking done.
 | |
| Possible values to set are in range {0..LONG_MAX/HZ}.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| hung_task_warnings:
 | |
| 
 | |
| The maximum number of warnings to report. During a check interval
 | |
| if a hung task is detected, this value is decreased by 1.
 | |
| When this value reaches 0, no more warnings will be reported.
 | |
| This file shows up if CONFIG_DETECT_HUNG_TASK is enabled.
 | |
| 
 | |
| -1: report an infinite number of warnings.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| kexec_load_disabled:
 | |
| 
 | |
| A toggle indicating if the kexec_load syscall has been disabled. This
 | |
| value defaults to 0 (false: kexec_load enabled), but can be set to 1
 | |
| (true: kexec_load disabled). Once true, kexec can no longer be used, and
 | |
| the toggle cannot be set back to false. This allows a kexec image to be
 | |
| loaded before disabling the syscall, allowing a system to set up (and
 | |
| later use) an image without it being altered. Generally used together
 | |
| with the "modules_disabled" sysctl.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| kptr_restrict:
 | |
| 
 | |
| This toggle indicates whether restrictions are placed on
 | |
| exposing kernel addresses via /proc and other interfaces.
 | |
| 
 | |
| When kptr_restrict is set to (0), the default, there are no restrictions.
 | |
| 
 | |
| When kptr_restrict is set to (1), kernel pointers printed using the %pK
 | |
| format specifier will be replaced with 0's unless the user has CAP_SYSLOG
 | |
| and effective user and group ids are equal to the real ids. This is
 | |
| because %pK checks are done at read() time rather than open() time, so
 | |
| if permissions are elevated between the open() and the read() (e.g via
 | |
| a setuid binary) then %pK will not leak kernel pointers to unprivileged
 | |
| users. Note, this is a temporary solution only. The correct long-term
 | |
| solution is to do the permission checks at open() time. Consider removing
 | |
| world read permissions from files that use %pK, and using dmesg_restrict
 | |
| to protect against uses of %pK in dmesg(8) if leaking kernel pointer
 | |
| values to unprivileged users is a concern.
 | |
| 
 | |
| When kptr_restrict is set to (2), kernel pointers printed using
 | |
| %pK will be replaced with 0's regardless of privileges.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| kstack_depth_to_print: (X86 only)
 | |
| 
 | |
| Controls the number of words to print when dumping the raw
 | |
| kernel stack.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| l2cr: (PPC only)
 | |
| 
 | |
| This flag controls the L2 cache of G3 processor boards. If
 | |
| 0, the cache is disabled. Enabled if nonzero.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| modules_disabled:
 | |
| 
 | |
| A toggle value indicating if modules are allowed to be loaded
 | |
| in an otherwise modular kernel.  This toggle defaults to off
 | |
| (0), but can be set true (1).  Once true, modules can be
 | |
| neither loaded nor unloaded, and the toggle cannot be set back
 | |
| to false.  Generally used with the "kexec_load_disabled" toggle.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| msg_next_id, sem_next_id, and shm_next_id:
 | |
| 
 | |
| These three toggles allows to specify desired id for next allocated IPC
 | |
| object: message, semaphore or shared memory respectively.
 | |
| 
 | |
| By default they are equal to -1, which means generic allocation logic.
 | |
| Possible values to set are in range {0..INT_MAX}.
 | |
| 
 | |
| Notes:
 | |
| 1) kernel doesn't guarantee, that new object will have desired id. So,
 | |
| it's up to userspace, how to handle an object with "wrong" id.
 | |
| 2) Toggle with non-default value will be set back to -1 by kernel after
 | |
| successful IPC object allocation.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| nmi_watchdog:
 | |
| 
 | |
| Enables/Disables the NMI watchdog on x86 systems. When the value is
 | |
| non-zero the NMI watchdog is enabled and will continuously test all
 | |
| online cpus to determine whether or not they are still functioning
 | |
| properly. Currently, passing "nmi_watchdog=" parameter at boot time is
 | |
| required for this function to work.
 | |
| 
 | |
| If LAPIC NMI watchdog method is in use (nmi_watchdog=2 kernel
 | |
| parameter), the NMI watchdog shares registers with oprofile. By
 | |
| disabling the NMI watchdog, oprofile may have more registers to
 | |
| utilize.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| numa_balancing
 | |
| 
 | |
| Enables/disables automatic page fault based NUMA memory
 | |
| balancing. Memory is moved automatically to nodes
 | |
| that access it often.
 | |
| 
 | |
| Enables/disables automatic NUMA memory balancing. On NUMA machines, there
 | |
| is a performance penalty if remote memory is accessed by a CPU. When this
 | |
| feature is enabled the kernel samples what task thread is accessing memory
 | |
| by periodically unmapping pages and later trapping a page fault. At the
 | |
| time of the page fault, it is determined if the data being accessed should
 | |
| be migrated to a local memory node.
 | |
| 
 | |
| The unmapping of pages and trapping faults incur additional overhead that
 | |
| ideally is offset by improved memory locality but there is no universal
 | |
| guarantee. If the target workload is already bound to NUMA nodes then this
 | |
| feature should be disabled. Otherwise, if the system overhead from the
 | |
| feature is too high then the rate the kernel samples for NUMA hinting
 | |
| faults may be controlled by the numa_balancing_scan_period_min_ms,
 | |
| numa_balancing_scan_delay_ms, numa_balancing_scan_period_max_ms,
 | |
| numa_balancing_scan_size_mb, and numa_balancing_settle_count sysctls.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| numa_balancing_scan_period_min_ms, numa_balancing_scan_delay_ms,
 | |
| numa_balancing_scan_period_max_ms, numa_balancing_scan_size_mb
 | |
| 
 | |
| Automatic NUMA balancing scans tasks address space and unmaps pages to
 | |
| detect if pages are properly placed or if the data should be migrated to a
 | |
| memory node local to where the task is running.  Every "scan delay" the task
 | |
| scans the next "scan size" number of pages in its address space. When the
 | |
| end of the address space is reached the scanner restarts from the beginning.
 | |
| 
 | |
| In combination, the "scan delay" and "scan size" determine the scan rate.
 | |
| When "scan delay" decreases, the scan rate increases.  The scan delay and
 | |
| hence the scan rate of every task is adaptive and depends on historical
 | |
| behaviour. If pages are properly placed then the scan delay increases,
 | |
| otherwise the scan delay decreases.  The "scan size" is not adaptive but
 | |
| the higher the "scan size", the higher the scan rate.
 | |
| 
 | |
| Higher scan rates incur higher system overhead as page faults must be
 | |
| trapped and potentially data must be migrated. However, the higher the scan
 | |
| rate, the more quickly a tasks memory is migrated to a local node if the
 | |
| workload pattern changes and minimises performance impact due to remote
 | |
| memory accesses. These sysctls control the thresholds for scan delays and
 | |
| the number of pages scanned.
 | |
| 
 | |
| numa_balancing_scan_period_min_ms is the minimum time in milliseconds to
 | |
| scan a tasks virtual memory. It effectively controls the maximum scanning
 | |
| rate for each task.
 | |
| 
 | |
| numa_balancing_scan_delay_ms is the starting "scan delay" used for a task
 | |
| when it initially forks.
 | |
| 
 | |
| numa_balancing_scan_period_max_ms is the maximum time in milliseconds to
 | |
| scan a tasks virtual memory. It effectively controls the minimum scanning
 | |
| rate for each task.
 | |
| 
 | |
| numa_balancing_scan_size_mb is how many megabytes worth of pages are
 | |
| scanned for a given scan.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| osrelease, ostype & version:
 | |
| 
 | |
| # cat osrelease
 | |
| 2.1.88
 | |
| # cat ostype
 | |
| Linux
 | |
| # cat version
 | |
| #5 Wed Feb 25 21:49:24 MET 1998
 | |
| 
 | |
| The files osrelease and ostype should be clear enough. Version
 | |
| needs a little more clarification however. The '#5' means that
 | |
| this is the fifth kernel built from this source base and the
 | |
| date behind it indicates the time the kernel was built.
 | |
| The only way to tune these values is to rebuild the kernel :-)
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| overflowgid & overflowuid:
 | |
| 
 | |
| if your architecture did not always support 32-bit UIDs (i.e. arm,
 | |
| i386, m68k, sh, and sparc32), a fixed UID and GID will be returned to
 | |
| applications that use the old 16-bit UID/GID system calls, if the
 | |
| actual UID or GID would exceed 65535.
 | |
| 
 | |
| These sysctls allow you to change the value of the fixed UID and GID.
 | |
| The default is 65534.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| panic:
 | |
| 
 | |
| The value in this file represents the number of seconds the kernel
 | |
| waits before rebooting on a panic. When you use the software watchdog,
 | |
| the recommended setting is 60.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| panic_on_unrecovered_nmi:
 | |
| 
 | |
| The default Linux behaviour on an NMI of either memory or unknown is
 | |
| to continue operation. For many environments such as scientific
 | |
| computing it is preferable that the box is taken out and the error
 | |
| dealt with than an uncorrected parity/ECC error get propagated.
 | |
| 
 | |
| A small number of systems do generate NMI's for bizarre random reasons
 | |
| such as power management so the default is off. That sysctl works like
 | |
| the existing panic controls already in that directory.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| panic_on_oops:
 | |
| 
 | |
| Controls the kernel's behaviour when an oops or BUG is encountered.
 | |
| 
 | |
| 0: try to continue operation
 | |
| 
 | |
| 1: panic immediately.  If the `panic' sysctl is also non-zero then the
 | |
|    machine will be rebooted.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| panic_on_stackoverflow:
 | |
| 
 | |
| Controls the kernel's behavior when detecting the overflows of
 | |
| kernel, IRQ and exception stacks except a user stack.
 | |
| This file shows up if CONFIG_DEBUG_STACKOVERFLOW is enabled.
 | |
| 
 | |
| 0: try to continue operation.
 | |
| 
 | |
| 1: panic immediately.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| perf_cpu_time_max_percent:
 | |
| 
 | |
| Hints to the kernel how much CPU time it should be allowed to
 | |
| use to handle perf sampling events.  If the perf subsystem
 | |
| is informed that its samples are exceeding this limit, it
 | |
| will drop its sampling frequency to attempt to reduce its CPU
 | |
| usage.
 | |
| 
 | |
| Some perf sampling happens in NMIs.  If these samples
 | |
| unexpectedly take too long to execute, the NMIs can become
 | |
| stacked up next to each other so much that nothing else is
 | |
| allowed to execute.
 | |
| 
 | |
| 0: disable the mechanism.  Do not monitor or correct perf's
 | |
|    sampling rate no matter how CPU time it takes.
 | |
| 
 | |
| 1-100: attempt to throttle perf's sample rate to this
 | |
|    percentage of CPU.  Note: the kernel calculates an
 | |
|    "expected" length of each sample event.  100 here means
 | |
|    100% of that expected length.  Even if this is set to
 | |
|    100, you may still see sample throttling if this
 | |
|    length is exceeded.  Set to 0 if you truly do not care
 | |
|    how much CPU is consumed.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| 
 | |
| pid_max:
 | |
| 
 | |
| PID allocation wrap value.  When the kernel's next PID value
 | |
| reaches this value, it wraps back to a minimum PID value.
 | |
| PIDs of value pid_max or larger are not allocated.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| ns_last_pid:
 | |
| 
 | |
| The last pid allocated in the current (the one task using this sysctl
 | |
| lives in) pid namespace. When selecting a pid for a next task on fork
 | |
| kernel tries to allocate a number starting from this one.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| powersave-nap: (PPC only)
 | |
| 
 | |
| If set, Linux-PPC will use the 'nap' mode of powersaving,
 | |
| otherwise the 'doze' mode will be used.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| printk:
 | |
| 
 | |
| The four values in printk denote: console_loglevel,
 | |
| default_message_loglevel, minimum_console_loglevel and
 | |
| default_console_loglevel respectively.
 | |
| 
 | |
| These values influence printk() behavior when printing or
 | |
| logging error messages. See 'man 2 syslog' for more info on
 | |
| the different loglevels.
 | |
| 
 | |
| - console_loglevel: messages with a higher priority than
 | |
|   this will be printed to the console
 | |
| - default_message_loglevel: messages without an explicit priority
 | |
|   will be printed with this priority
 | |
| - minimum_console_loglevel: minimum (highest) value to which
 | |
|   console_loglevel can be set
 | |
| - default_console_loglevel: default value for console_loglevel
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| printk_delay:
 | |
| 
 | |
| Delay each printk message in printk_delay milliseconds
 | |
| 
 | |
| Value from 0 - 10000 is allowed.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| printk_ratelimit:
 | |
| 
 | |
| Some warning messages are rate limited. printk_ratelimit specifies
 | |
| the minimum length of time between these messages (in jiffies), by
 | |
| default we allow one every 5 seconds.
 | |
| 
 | |
| A value of 0 will disable rate limiting.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| printk_ratelimit_burst:
 | |
| 
 | |
| While long term we enforce one message per printk_ratelimit
 | |
| seconds, we do allow a burst of messages to pass through.
 | |
| printk_ratelimit_burst specifies the number of messages we can
 | |
| send before ratelimiting kicks in.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| randomize_va_space:
 | |
| 
 | |
| This option can be used to select the type of process address
 | |
| space randomization that is used in the system, for architectures
 | |
| that support this feature.
 | |
| 
 | |
| 0 - Turn the process address space randomization off.  This is the
 | |
|     default for architectures that do not support this feature anyways,
 | |
|     and kernels that are booted with the "norandmaps" parameter.
 | |
| 
 | |
| 1 - Make the addresses of mmap base, stack and VDSO page randomized.
 | |
|     This, among other things, implies that shared libraries will be
 | |
|     loaded to random addresses.  Also for PIE-linked binaries, the
 | |
|     location of code start is randomized.  This is the default if the
 | |
|     CONFIG_COMPAT_BRK option is enabled.
 | |
| 
 | |
| 2 - Additionally enable heap randomization.  This is the default if
 | |
|     CONFIG_COMPAT_BRK is disabled.
 | |
| 
 | |
|     There are a few legacy applications out there (such as some ancient
 | |
|     versions of libc.so.5 from 1996) that assume that brk area starts
 | |
|     just after the end of the code+bss.  These applications break when
 | |
|     start of the brk area is randomized.  There are however no known
 | |
|     non-legacy applications that would be broken this way, so for most
 | |
|     systems it is safe to choose full randomization.
 | |
| 
 | |
|     Systems with ancient and/or broken binaries should be configured
 | |
|     with CONFIG_COMPAT_BRK enabled, which excludes the heap from process
 | |
|     address space randomization.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| reboot-cmd: (Sparc only)
 | |
| 
 | |
| ??? This seems to be a way to give an argument to the Sparc
 | |
| ROM/Flash boot loader. Maybe to tell it what to do after
 | |
| rebooting. ???
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| rtsig-max & rtsig-nr:
 | |
| 
 | |
| The file rtsig-max can be used to tune the maximum number
 | |
| of POSIX realtime (queued) signals that can be outstanding
 | |
| in the system.
 | |
| 
 | |
| rtsig-nr shows the number of RT signals currently queued.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| sg-big-buff:
 | |
| 
 | |
| This file shows the size of the generic SCSI (sg) buffer.
 | |
| You can't tune it just yet, but you could change it on
 | |
| compile time by editing include/scsi/sg.h and changing
 | |
| the value of SG_BIG_BUFF.
 | |
| 
 | |
| There shouldn't be any reason to change this value. If
 | |
| you can come up with one, you probably know what you
 | |
| are doing anyway :)
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| shmall:
 | |
| 
 | |
| This parameter sets the total amount of shared memory pages that
 | |
| can be used system wide. Hence, SHMALL should always be at least
 | |
| ceil(shmmax/PAGE_SIZE).
 | |
| 
 | |
| If you are not sure what the default PAGE_SIZE is on your Linux
 | |
| system, you can run the following command:
 | |
| 
 | |
| # getconf PAGE_SIZE
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| shmmax:
 | |
| 
 | |
| This value can be used to query and set the run time limit
 | |
| on the maximum shared memory segment size that can be created.
 | |
| Shared memory segments up to 1Gb are now supported in the
 | |
| kernel.  This value defaults to SHMMAX.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| shm_rmid_forced:
 | |
| 
 | |
| Linux lets you set resource limits, including how much memory one
 | |
| process can consume, via setrlimit(2).  Unfortunately, shared memory
 | |
| segments are allowed to exist without association with any process, and
 | |
| thus might not be counted against any resource limits.  If enabled,
 | |
| shared memory segments are automatically destroyed when their attach
 | |
| count becomes zero after a detach or a process termination.  It will
 | |
| also destroy segments that were created, but never attached to, on exit
 | |
| from the process.  The only use left for IPC_RMID is to immediately
 | |
| destroy an unattached segment.  Of course, this breaks the way things are
 | |
| defined, so some applications might stop working.  Note that this
 | |
| feature will do you no good unless you also configure your resource
 | |
| limits (in particular, RLIMIT_AS and RLIMIT_NPROC).  Most systems don't
 | |
| need this.
 | |
| 
 | |
| Note that if you change this from 0 to 1, already created segments
 | |
| without users and with a dead originative process will be destroyed.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| sysctl_writes_strict:
 | |
| 
 | |
| Control how file position affects the behavior of updating sysctl values
 | |
| via the /proc/sys interface:
 | |
| 
 | |
|   -1 - Legacy per-write sysctl value handling, with no printk warnings.
 | |
|        Each write syscall must fully contain the sysctl value to be
 | |
|        written, and multiple writes on the same sysctl file descriptor
 | |
|        will rewrite the sysctl value, regardless of file position.
 | |
|    0 - (default) Same behavior as above, but warn about processes that
 | |
|        perform writes to a sysctl file descriptor when the file position
 | |
|        is not 0.
 | |
|    1 - Respect file position when writing sysctl strings. Multiple writes
 | |
|        will append to the sysctl value buffer. Anything past the max length
 | |
|        of the sysctl value buffer will be ignored. Writes to numeric sysctl
 | |
|        entries must always be at file position 0 and the value must be
 | |
|        fully contained in the buffer sent in the write syscall.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| softlockup_all_cpu_backtrace:
 | |
| 
 | |
| This value controls the soft lockup detector thread's behavior
 | |
| when a soft lockup condition is detected as to whether or not
 | |
| to gather further debug information. If enabled, each cpu will
 | |
| be issued an NMI and instructed to capture stack trace.
 | |
| 
 | |
| This feature is only applicable for architectures which support
 | |
| NMI.
 | |
| 
 | |
| 0: do nothing. This is the default behavior.
 | |
| 
 | |
| 1: on detection capture more debug information.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| tainted:
 | |
| 
 | |
| Non-zero if the kernel has been tainted.  Numeric values, which
 | |
| can be ORed together:
 | |
| 
 | |
|    1 - A module with a non-GPL license has been loaded, this
 | |
|        includes modules with no license.
 | |
|        Set by modutils >= 2.4.9 and module-init-tools.
 | |
|    2 - A module was force loaded by insmod -f.
 | |
|        Set by modutils >= 2.4.9 and module-init-tools.
 | |
|    4 - Unsafe SMP processors: SMP with CPUs not designed for SMP.
 | |
|    8 - A module was forcibly unloaded from the system by rmmod -f.
 | |
|   16 - A hardware machine check error occurred on the system.
 | |
|   32 - A bad page was discovered on the system.
 | |
|   64 - The user has asked that the system be marked "tainted".  This
 | |
|        could be because they are running software that directly modifies
 | |
|        the hardware, or for other reasons.
 | |
|  128 - The system has died.
 | |
|  256 - The ACPI DSDT has been overridden with one supplied by the user
 | |
|         instead of using the one provided by the hardware.
 | |
|  512 - A kernel warning has occurred.
 | |
| 1024 - A module from drivers/staging was loaded.
 | |
| 2048 - The system is working around a severe firmware bug.
 | |
| 4096 - An out-of-tree module has been loaded.
 | |
| 8192 - An unsigned module has been loaded in a kernel supporting module
 | |
|        signature.
 | |
| 16384 - A soft lockup has previously occurred on the system.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| unknown_nmi_panic:
 | |
| 
 | |
| The value in this file affects behavior of handling NMI. When the
 | |
| value is non-zero, unknown NMI is trapped and then panic occurs. At
 | |
| that time, kernel debugging information is displayed on console.
 | |
| 
 | |
| NMI switch that most IA32 servers have fires unknown NMI up, for
 | |
| example.  If a system hangs up, try pressing the NMI switch.
 | |
| 
 | |
| ==============================================================
 | |
| 
 | |
| watchdog_thresh:
 | |
| 
 | |
| This value can be used to control the frequency of hrtimer and NMI
 | |
| events and the soft and hard lockup thresholds. The default threshold
 | |
| is 10 seconds.
 | |
| 
 | |
| The softlockup threshold is (2 * watchdog_thresh). Setting this
 | |
| tunable to zero will disable lockup detection altogether.
 | |
| 
 | |
| ==============================================================
 |