************************************************************************** * Copyright 1990-1998, Silicon Graphics, Inc. * All Rights Reserved. * * This is UNPUBLISHED PROPRIETARY SOURCE CODE of Silicon Graphics, Inc.; * the contents of this file may not be disclosed to third parties, copied or * duplicated in any form, in whole or in part, without the prior written * permission of Silicon Graphics, Inc. * * RESTRICTED RIGHTS LEGEND: * Use, duplication or disclosure by the Government is subject to restrictions * as set forth in subdivision (c)(1)(ii) of the Rights in Technical Data * and Computer Software clause at DFARS 252.227-7013, and/or in similar or * successor clauses in the FAR, DOD or NASA FAR Supplement. Unpublished - * rights reserved under the Copyright Laws of the United States. ************************************************************************** * *#ident "$Revision: 1.278 $" * This is a read-only file. User-specified tunables are stored * in var/sysgen/stune file. * * kernel * * tunables that turn on/off features * * nosuidshells: to allow setuid shells set to 0 * restricted_chown = 1 bsd style chown(2), only super-user can give away files * restricted_chown = 0 sysV style chown(2), non super-user can give away files * posix_tty_default = 0 ==> run our default line discipline and settings * posix_tty_default = 1 ==> match tty parameters/POSIX test assumptions * use_old_serialnum = 1 ==> Force the kernel to use the old method of * calculating a 32-bit serial number for sysinfo -s. * This variable only affects Onyx/Challenge L/XL systems. * reboot_on_panic = -1 ==> Use machine dependent reboot after panic semantics. * reboot_on_panic = 0 ==> Do not reboot the system after a panic (wait for * user to hit the system reset button). * reboot_on_panic = 1 ==> Automatically reboot the system after a panic. * restrict_fastprof = 1 ==> Don't allow users to do fast (1ms) user level * profiling. 
* restrict_fastprof = 0 ==> Allow users to do fast (1ms) user level profiling. * ip26_allow_ucmem = 0 ==> Accessing system memory uncached on Power Indigo2 * and Indigo2 10000 is illegal and will cause a * system crash. * ip26_allow_ucmem = 1 ==> Accessing system memory on Power Indigo2 and * Indigo2 10000 is always legal. This comes at a * large memory access performance hit. * reset_limits_on_exec = 1 ==> Reset rlimits on exec of processes that are * setuid to root to prevent unprivileged processes from * enforcing resource limitations on setuid/setgid procs. * reset_limits_on_exec = 0 ==> Don't reset limits on execs of setuid procs. * Warning: Allowing non-root user to enforce * resource limitations on setuid/setgid to root * program can compromise system security. Do * not set this option to zero unless you are * sure all setuid/setgid to root programs on * your system can recover from problem caused * by resource limit. * tty_auto_strhold = 1 ==> automatically sets STRHOLD on ttys/ptys whenever * the line discipline is in canonical & echo mode and * automatically clears STRHOLD otherwise. * tty_auto_strhold = 0 ==> STRHOLD on ptys/ttys is entirely under user control. * add_kthread_stack = 0 ==> no additional kthread stack for kernel daemons * add_kthread_stack != 0 ==> add add_kthread_stack bytes to kernel daemon stack * xpg4_sticky_dir = 1 ==> write access to a file does not imply it is * removable in a directory with the sticky bit set * xpg4_sticky_dir = 0 ==> write access to a file implies that it is * removable in a directory with the sticky bit set * mload_auto_rtsyms = 0 ==> Disable auto-loading of kernel's run-time symbol * table for dynamic driver/module loading. * disable_ip25_check = 1 ==> Disable boot time checking for valid (SUPPORTED) * SCC/R10k configurations. * disable_r10k_log = 0 ==> Disable boot time logging of R10k config params. * such as clk divisors etc. 
* spec_udma_war_on = 1 ==> speculative store userdma workaround
* async_udma_copy = 1 ==> proactively copy shared userdma pages
* enable_pci_BIST = 0 ==> Causes the Built In Self Test on PCI boards to
*        be run if the board supports it. Default is not to
*        run tests.
* ip32_pci_enable_prefetch = 0 ==> Do not enable the PCI prefetch buffers
*        in MACE.
* ip32_pci_enable_prefetch = 1 ==> Enable the PCI prefetch buffers in MACE.
* ip32_pci_flush_prefetch = 0 ==> If prefetch is disabled, then
*        pciio_flush_buffers() is disabled.
* ip32_pci_flush_prefetch = 1 ==> If prefetch is disabled,
*        pciio_flush_buffers() will still flush the
*        prefetch buffers.
* ignore_sysctlr_intr = 1 ==> Completely ignore system controller (O200/O2000)
* powerfail_cleanio = 1 ==> Reset all hub I/O immediately at AC power loss
*        Should be enabled for database apps (O200/O2000)
* intr_revive_delay ==> seconds until the MSC interrupts are reenabled
* xbox_sysctlr_poll_interval ==> seconds between consecutive polls of the
*        xbox system controller status
* clean_shutdown_time ==> Max seconds allowed for clean shutdown (init 0)
*        after system power-off is requested
* ignore_xsc => Ignore the xbox system controller
* snerrste_pages_per_cube ==> number of pages allocated for error state at
*        system reset per cube. A cube is 16 nodes
* xpc_override = 0 ==> Default.
*        1 ==> Override restrictions required for XPC.
* origin_use_hardware_bzero = 0 ==> Default
*        0 ==> Use the processor to zero pages.
*        1 ==> Use the Origin hardware block transfer engine to zero pages.
* use_old_flock = 0 ==> Default
*        0 ==> Use new flock implementation which wakes up only those
*              waiting for a file-and-record lock which actually can
*              get the lock being awaited. This rewritten flock
*              substantially improves performance when file and
*              record locks have high contention
*        1 ==> Use the old flock implementation which awakes all
*              waiters.
This is to allow a fall back in case problems * are suspected in the new flock implementation. This * systune variable and the old flock implementation will * be removed in a subsequent 6.5.x release * ensure_no_rawio_conflicts = 1 ==> Default * 0 ==> Allow the kernel to concurrently perform user space DMA for * network sends and user space DMA for other raw I/O on the same * page. Running in this mode can possibly cause the system to * panic if this scenario is exercised by applications. * 1 ==> Prevent the kernel from concurrently performing user space DMA * for network sends and user space DMA for other raw I/O on the * same page. This ensures that this scenario will be performed * safely when encountered. * dbe_recovery = 1 ==> Default * 0 ==> Do not attempt to recover from double bit errors during page_zero. * 1 ==> Attempt to recover from double bit errors during page_zero. * switch: static * name default minimum maximum nosuidshells 1 restricted_chown 0 posix_tty_default 0 use_old_serialnum 0 reboot_on_panic -1 restrict_fastprof 0 reset_limits_on_exec 1 ip26_allow_ucmem 0 add_kthread_stack 0 xpg4_sticky_dir 1 mload_auto_rtsyms 1 disable_ip25_check 0 prod_assertion_level 0 0 100 disable_r10k_log 0 r10k_halt_1st_cache_err 0 r10k_cache_debug 0 perfcnt_arch_swtch_sig 0 perfcnt_arch_swtch_msg 0 enable_upanic 0 spec_udma_war_on 1 warbits_override -1 enable_pci_BIST 0 ip32_pci_enable_prefetch 0 ip32_pci_flush_prefetch 1 ignore_sysctlr_intr 0 powerfail_cleanio 0 intr_revive_delay 300 powerfail_routerwar 1 xbox_sysctlr_poll_interval 60 5 3600 clean_shutdown_time 120 ignore_xsc 0 0 1 snerrste_pages_per_cube 1 1 4 xpc_override 0 origin_use_hardware_bzero 0 use_old_flock 0 cpulimit_gracetime 0 0 86400 switch: run * name default minimum maximum tty_auto_strhold 0 async_udma_copy 1 ensure_no_rawio_conflicts 1 dbe_recovery 1 * miscellaneous static tuneables * * io4ia_userdma_war enable the error checking for concurrent user * dma operations into the same physical memory * 
* ignore_conveyor_override override the code that ignores directives * to put the io in conveyor belt mode on SN0 * misc: static * name default minimum maximum io4ia_userdma_war 1 ignore_conveyor_override 0 0 1 * * io4ia_war_enable Enable io4ia_war on all configurations (without * checking if this configuration needs this). * To be used ONLY for testing purposes. It should * be used only to enable io4ia_war on systems for * testing. Useful only on Challenge/Onyx platforms. * * name default minimum maximum io4ia_war_enable 0 * tuneables for potential IO devices causing Bpush problems * default minimum maximum * Enable/disable Bpush WAR. bpush_war_enable 1 0 1 * value 0x26 corresponds to PCI Shoebox. bpush_source1 0x26 0x26 0x26 * Following are place-holders for future devices that could require bpush war bpush_source2 0 0 0 bpush_source3 0 0 0 bpush_source4 0 0 0 * miscellaneous dynamic tuneables * * r4k_div_patch = 1 turn on exec patch code for binaries that have been * processed with r4kpp for divide in branch delay slot * problem that occurs on R4000 SC rev 2.2 and 3.0 parts. 
* corepluspid = 1 name core file as "core.pid" * panic_on_sbe = 1 special factory debugging mode * sbe_log_errors = 1 log single bit errors to SYSLOG (logging is turned * off by default in Irix 6.5.5+) * sbe_mem_verbose set verbosity of sbe reporting; see flag definitions * in sys/SN/SN0/memerror.h * sbe_mfr_override = 1 overrides default action of disabling sbe's * if rate of sbe's exceeds predetermined limit * sbe_report_cons = 1 report single bit errors to console also * sbe_maxout maximum number of single bit errors before * sbe monitoring is disabled * sbe_max_per_page maximum number of single bit errors before offending * page is marked bad * ecc_recover_enable = 0 ==> don't attempt to recover from multibit errors * > 0 ==> attempt to recover from some cases of multibit * errors, but no more than 32 errors every * "ecc_recover_enable" seconds (default 60 secs) * munlddelay = 5 timeout for auto-unload for loadable modules * dump_level = 0 dump only putbuf during panic * 1 also dump static kernel pages * 2 also dump dynamic kernel pages * 3 also dump buffer cache pages * 4 also dump free pages * * * dump_hub_info is only effective on SN0 platforms and * accepts the following values: * * dump_hub_info = 0 do not dump information from the hub; * = 1 dump the register information that is contained within * the hub; * = 2 dump information from the hub: directory words high * and low and the protection and reference counter word, * associated with each page which is dumped. * * heartbeat_enable = 0 ==> No heartbeat monitoring (avail. O200/O2000) * = 1 ==> Send hbt to MSCs once a minute. On missed hbt, * MSC sends NMI to force dump, and then resets system. * * mmap_async_write = * 0 ==> don't allow kernel to take references to read-only * MAP_SHARED non-anonymous memory in the write path. * This is the default and reflects standard UNIX * semantics. 
*        1 ==> allow kernel to take references to read-only MAP_SHARED
*              non-anonymous memory in order to avoid having to copy
*              data in the write path. Normally this only happens for
*              anonymous memory (/dev/zero, bss, stack, etc.) which
*              is safe to do because we mark that as copy-on-write and
*              no one outside the share group can get at anonymous
*              memory. The extension of this semantic to read-only
*              MAP_SHARED memory should be used with caution and
*              knowledge of its potential effects. Normally, once
*              write() has returned UNIX semantics guarantee that
*              whatever data was written has now been scheduled for
*              delivery and nothing can change that data. With this
*              option, a write() will return immediately after
*              scheduling the referenced file data for delivery. If
*              the file is then modified before the file data has
*              been transferred, then the *modified* data will be
*              written instead of the file data at the time of the
*              write() call ... The advantage of using this
*              semantic is that it enormously reduces the amount of
*              data copying in applications like web serving, etc.
*              which can result in fairly substantial savings.
* gather_craylink_routerstats =
*              Gather craylink router statistics (like sequence
*              number errors, checkbit errors etc).
*              This variable is off by default.
*
*              This should be turned on, for output from 'linkstat'
*              to be useful.
*
*              This variable can be turned on at runtime.
* r4k_corrupt_scache_data = 0 factory test code option, may corrupt data
* ip32_pio_predelay = 5 Delay in microseconds before pio read on IP32.
* ip32_pio_postdelay = 5 Delay in microseconds after pio read on IP32.
*
* crime_rev    Variable set by system based on crime rev
*              The min/max are set to make this variable Readonly.
*
* cpu_prid     The processor id of the CPU (IP32 only).
*              The min/max are set to make this variable Readonly.
*
* max_shared_reg_refcnt Maximum reference count on shared regions.
MAP_SHARED
*              regions are normally shared on fork, however if a
*              process has too many children, then there can be
*              contention for the region lock. The tradeoff is
*              the extra overhead needed to duplicate the region
*              if it's not going to be shared. A value of 1 causes
*              the MAP_SHARED regions to never be shared on fork.
* peer_peer_dma This tunable SHOULD NOT BE TURNED ON unless the user
*              has a pci card in the o2 pci slot from which a
*              dma is needed to another io device bypassing the
*              host memory.
* hub_intr_wakeup_cnt Default value is 1, which will work for most drivers.
*              Some unusual drivers may depend on the interrupt
*              routine being called once for every interrupt.
*              This behavior is never guaranteed, but these drivers
*              may work better if you set this to a larger value,
*              but never more than 0x7000.
* pci_user_dma_max_pages Allow the user to specify maximum dma area
*
* mrlock_starve_threshold Number of times to attempt to acquire an mrlock
*              before falling back to a guaranteed locking
*              algorithm. A value of 0 results in the original
*              locking algorithm. Non-zero values normally give
*              better performance by reducing the average delay
*              on contended locks, although the maximum delay may be
*              larger.
misc: run * name default minimum maximum r4k_div_patch 0 corepluspid 0 panic_on_sbe 0 sbe_log_errors 0 sbe_mem_verbose 0 sbe_mfr_override 0 sbe_report_cons 0 sbe_maxout 600 sbe_max_per_page 60 ecc_recover_enable 60 module_unld_delay 5 0 dump_level 4 dump_hub_info 2 0 2 heartbeat_enable 0 craylnk_bypass_disable 0 disable_sbe_polling 0 0 1 rpparm_maxburst_val 0x24 0x1 0x3ff mmap_async_write 0 0 1 r10k_intervene 1 r10k_progress_nmi 0 r10k_check_count 30000 r10k_progress_diff 200000000 0 0x7fffffffffffffff ll gather_craylink_routerstats 1 0 1 ip32_pio_predelay 5 0 ip32_pio_postdelay 5 0 crime_rev 0 0x100 0xff cpu_prid 0 0x100 0xff max_shared_reg_refcnt 4 1 peer_peer_dma 0 counter_intr_panic 0 hub_intr_wakeup_cnt 1 1 0x7000 pci_user_dma_max_pages 128 1 fpcsr_fs_bit 1 0 1 mrlock_starve_threshold 3 0 10 * * * Tuneables to enable hardware specific parameter on Origin200/Origin2000. * * Rev of bridge supporting prefetch. pcibr_prefetch_enable_rev 3 * Rev of bridge supporting write gathering. pcibr_wg_enable_rev 4 * Turn off panic for Rev B bridge on LLP retry * (which was put in to control silent data corruption) bridge_rev_b_data_check_disable 0 * This is used only by O200/O2000 kernels full_fru_analysis 1 * Minimum confidence required for disabling a fru fru_disable_confidence 90 * Disable specific BTE operations depending on CPU count. * 0 count enables disable_bte 0 disable_bte_pbcopy 64 disable_bte_pbzero 64 disable_bte_poison_range 0 * SN0 specific, should not be changed ! 
*
iio_icmr_precise        1
iio_icmr_c_cnt          6
iio_bte_crb_count       2
iio_ictp                0x800
iio_icto                0xff
la_trigger_nasid1       -1
la_trigger_nasid2       -1
la_trigger_val          0xdeadbeefbeefcafe 0 0xffffffffffffffff ll

misc: static
r4k_corrupt_scache_data 0
l2_sbe_reset_sec        60
* systune l2_sbe_check
*
* 0x0000 no test
* 0x0001 R10K
* 0x0002 R12K
* 0x0004 R12KS
* 0x0008 R14K
* 0x0010 reserved CPU type
* 0x0020 reserved CPU type
* 0x0040 reserved CPU type
* 0x0080 reserved CPU type
*
* 0x0100 verbose output
l2_sbe_check            0x0000

* debug enable/disable static tunables
*
* utrace_bufsize Utraces are a lightweight tracing mechanism used to
*              collect kernel debugging information. Selects the number
*              of 48-byte utrace entries stored for each CPU. Setting
*              utrace_bufsize to zero disables trace collection.
*              Note: currently, only buffer sizes of zero and 2048
*              (NUMB_KERNEL_TSTAMPS in sys/rtmon.h) entries are
*              supported.
debug: static
* name                  default  minimum  maximum
utrace_bufsize          0        0        0x7fffffff

* debug enable/disable dynamic tunables
*
* kmem_do_poison This tunable is always present, but has no effect
*              unless the kernel memory allocator debugging module
*              has been installed and configured into the running
*              kernel.
*
*              Since the purpose of the facility is to immediately
*              halt the system upon detection of certain types of
*              errors involving allocated memory, SGI strongly
*              suggests that memory poisoning only be used with
*              the assistance of your SGI customer support
*              representative.
*
*              Turning this option on will affect performance. The
*              extent to which performance is degraded will depend
*              on the application load. Currently the maximum
*              observed degradation has been 15%.
*
*              To install and configure kernel memory debugging,
*              the eoe.sw.kdebug package must be installed if it
*              hasn't already. In the irix.sm file, the kmem_debug
*              module is substituted for the kmem module as described
*              by the comments in that file. Lastly, build a new
*              kernel using autoconfig.
*
*              Errors involving allocated memory, for example using
*              a pointer to a previously allocated memory block after
*              the block has been freed, are notoriously hard to debug.
*              Often the first symptom arises long after the event and
*              in a totally unrelated area. Memory poisoning attempts
*              to detect these sorts of errors closer to their source
*              by writing over free()'ed memory and allocations of
*              small memory chunks with the return address of the
*              caller with a 1 or'ed in the lower bit (in order to
*              cause BUS errors if one of the words is used as a
*              pointer). The point of this is to try to catch
*              accesses to freed memory and code that assumes that
*              newly allocated memory is zero'ed.
*
* kmem_make_zones_private_min
* kmem_make_zones_private_max
*
*              As with the kmem_do_poison tunable above,
*              these two tunables are always present, but
*              have no effect unless the kernel memory
*              allocator debugging module has been installed
*              and configured into the running kernel.
*
*              The kernel memory allocator debugging module
*              is installed with eoe.sw.kdebug. To configure
*              the module into the kernel see the notes in
*              irix.sm for kmem.a and kmem_debug.a.
*
*              These two tunables may be set to the size
*              in bytes of the range of zones to be split.
*              The size of the zone to be split for debugging
*              will usually be known from a stack trace, or
*              from the output of the idbg or icrash zone
*              commands. If an entire zone is split, the
*              bounds tunables must be set to the upper
*              and lower range of the zone. For example, if
*              you have determined that a problem is occurring
*              with some user of the 48 byte zone, you will need
*              to specify the range 33 to 48.
*
*              The range split may be extended across multiple
*              zones, but there is limited table space for zones.
*              Depending on the configuration there will be
*              space for ~20-30 new zones. The number of new
*              zones created by splitting can be determined from
*              the output of the idbg zone command.
*
* utrace_mask  Each bit controls the collection of a particular
*              category of information. See the RTMON_xxx macros in
*              sys/rtmon.h.
*
* power_button_changed_to_crash_dump change the power button to
*              take crash dump in case the system 'hangs'.
*              The soft-power intr condition that the RTC on
*              IP32 and IP22 platforms provide is polled,
*              when this variable is enabled, from the
*              r4kcounter_intr (SR_IP7) handler.
*
debug: run
* name                              default  minimum  maximum
kmem_do_poison                      1        0        1
kmem_make_zones_private_min         0        0        16384
kmem_make_zones_private_max         0        0        16384
utrace_mask                         0xfff3f  0        0x7fffffffffffffff ll
power_button_changed_to_crash_dump  0
* r12k_bdiag bits <26:23> defines ghistory in R12K
*            bit <27> disables the branch target address cache
r12k_bdiag                          0        0        0x0f800000

*
* tunables that set the limit
*
* ncargs is max # bytes of arguments passed during an exec(2)
* shlbmax : Maximum number of libraries that can be
*           attached to a process at one time.
* maxwatchpoints: maximum watchpoints per process
* nprofile: number of disjoint text spaces to be profiled
* maxsymlinks is the maximum number of symlinks expanded in a pathname.
* reserve_ncallout: number of reserved callouts
* maxup: the maximum number of processes per user, should always be smaller
*        than nproc
*
* reserve_ncallout auto-config: max(5,numcpus)
* maxup limit: nproc - 20
* idbgmaxfuncs = 1200 maximum number of dynamically loaded idbg functions
limits: run
* name          default  minimum  maximum
ncargs          20480    5120     262144
shlbmax         8        3        32
maxwatchpoints  100      1        1000
nprofile        100      100      10000
maxsymlinks     30       0        50
maxup           0        15       30000
idbgmaxfuncs    1200

*
* tunables for resource limit
* 'cur' may be changed via any shell or setrlimit
* Limits specified as 0x7fffffffffffffff imply no limit.
*
* Note: rlimit_nofile_max must not be set to an unreasonably large value
* since many daemons/programs use rlimit_nofile_max as an indication
* of how many file descriptors to close when they want to close them all.
*
resource: static
* name              default             minimum  maximum
rlimit_cpu_cur      0x7fffffffffffffff  0        0x7fffffffffffffff ll
rlimit_cpu_max      0x7fffffffffffffff  0        0x7fffffffffffffff ll
rlimit_fsize_cur    0x7fffffffffffffff  0        0x7fffffffffffffff ll
rlimit_fsize_max    0x7fffffffffffffff  0        0x7fffffffffffffff ll
rlimit_data_cur     0                   0        0x7fffffffffffffff ll
rlimit_data_max     0                   0        0x7fffffffffffffff ll
rlimit_stack_cur    0x04000000          0        0x7fffffffffffffff ll
rlimit_stack_max    0x20000000          0        0x7fffffffffffffff ll
rlimit_core_cur     0x7fffffffffffffff  0        0x7fffffffffffffff ll
rlimit_core_max     0x7fffffffffffffff  0        0x7fffffffffffffff ll
rlimit_vmem_cur     0                   0        0x7fffffffffffffff ll
rlimit_vmem_max     0                   0        0x7fffffffffffffff ll
rlimit_rss_cur      0                   0        0x7fffffffffffffff ll
rlimit_rss_max      0x20000000          0        0x7fffffffffffffff ll
rlimit_nofile_cur   200                 40       0x7fffffffffffffff ll
rlimit_nofile_max   2500                0        0x7fffffffffffffff ll
rlimit_pthread_cur  0x400               1        0xffff ll
rlimit_pthread_max  0x400               1        0xffff ll

*
* tunables that depend on nproc
*
* nproc: maximum number of processes
* ndquot: maximum number of file system quota structures
* ncallout: initial # of callouts -- used to implement timeout calls
* ncsize: directory-name lookup cache size
* nclrus: number of dnlc name lists
* ngroups_max: maximum number of groups to which user can belong
*
* Current nproc auto-config calculation: 40 plus
*        4 processes per MB for the 1st GB
*        2 processes per MB for the next GB
*        1 process per MB for 2-4 GB,
*        1 process per 2 MB for 4-8 GB,
*        1 process per 4 MB for 8-16 GB,
*        maximum nproc: 32767
* ndquot auto-config: 200 + 2*nproc
* ncsize auto-config: 200 + 2*nproc
* ncallout auto-config: nproc/2
*
* NOTE: Even though nproc can be set to a value greater than the auto-
* configuration value, the amount of physical memory and swap space
* may be insufficient to support that value.
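*
* Worked example (a sketch, not part of the kernel): the auto-config rules
* above can be restated in Python to estimate the values chosen when nproc,
* ndquot, ncsize and ncallout are left at 0. The function names, the use of
* integer division, and the assumption that memory beyond 16 GB adds no
* further processes are illustrative guesses; the comment above does not
* specify them, and the minimum/maximum columns of the table are not applied.
*

```python
# Memory tiers from the comment above, as
# (tier size in MB, numerator, denominator): processes = MB * num // den.
NPROC_TIERS = [
    (1024, 4, 1),  # 4 processes per MB for the 1st GB
    (1024, 2, 1),  # 2 processes per MB for the next GB
    (2048, 1, 1),  # 1 process per MB for 2-4 GB
    (4096, 1, 2),  # 1 process per 2 MB for 4-8 GB
    (8192, 1, 4),  # 1 process per 4 MB for 8-16 GB
]
NPROC_BASE = 40
NPROC_MAX = 32767


def nproc_auto(mem_mb):
    """Estimate the auto-configured nproc for mem_mb megabytes of memory.

    ASSUMPTION: memory above 16 GB contributes nothing further.
    """
    nproc = NPROC_BASE
    remaining = mem_mb
    for tier_mb, num, den in NPROC_TIERS:
        used = min(remaining, tier_mb)
        nproc += used * num // den
        remaining -= used
        if remaining <= 0:
            break
    return min(nproc, NPROC_MAX)


def derived_auto(nproc):
    """Values that the comment above derives from nproc."""
    return {
        "ndquot": 200 + 2 * nproc,
        "ncsize": 200 + 2 * nproc,
        "ncallout": nproc // 2,  # ASSUMPTION: integer division
    }
```

*
* Under these assumptions a 1 GB system gets nproc = 40 + 4*1024 = 4136.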
*
numproc: static
* name       default  minimum  maximum
nproc        0        30       32767     /* auto-config */
ndquot       0        268      6200      /* auto-config */
ncallout     0        20       5000      /* auto-config */
ncsize       0        268      1000000   /* auto-config */
ngroups_max  16       0        32
nclrus       0        1        128       /* auto-config */
nlog         0        1        1000000   /* auto-config */

*
* The following is for STREAMS.
*
streams: static
* name       default  minimum  maximum
nstrpush     8        5        10
nstrintr     1024     32       4096
strmsgsz     0x8000
strctlsz     1024
strpmonmax   4        0        1024

streams: run
* name       default  minimum  maximum
strholdtime  50       0        1000

*
* cpu actions -- interprocessor communication blocks
* nactions: number of action block, autoconfigure if 0
*
* nactions auto-config: max((maxcpus+60), (maxcpus*(maxcpus/2)))
*
actions: static
* name       default  minimum  maximum
nactions     0        60       200       /* auto-config */

*
* tunables for queued signals
*
* maxsigq: used by sigqsetup() - Posix.4 required at least 32
*          since svr4 SA_SIGINFO signals also share this, make maxsigq >= NSIG+32
*          results in
*          sysconf(SIGQUEUE_MAX) = maxsigq - NSIG
*
signals: run
* name       default  minimum
maxsigq      128      32

*
* tunables for timer
*
* fasthz: profiling/fast itimer clock speed.
*
* itimer_on_clkcpu = 1 10 ms itimer request is queued on the clock processor
* itimer_on_clkcpu = 0 0 ms itimer request is queued on the running processor
* If a process does a setitimer then uses gettimeofday() to compare the
* accuracy of the itimer delivery then itimer_on_clkcpu should be set.
* If on the other hand, itimer request is used to implement a user frequency
* based scheduler then itimer_on_clkcpu should be 0.
*
* timetrim: The clock is adjusted by this signed number of nanoseconds
*           every second. It is limited to 3 milliseconds or 0.3% in clock.c.
*           Timed(1M) and timeslave(1M) put suggested values in /usr/adm/SYSLOG.
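*
* Worked example (a sketch, not taken from clock.c): since timetrim is a
* signed number of nanoseconds applied every second, a measured drift in
* seconds per day can be converted into a candidate value as below. The
* function name is hypothetical, and the sign convention (a positive
* timetrim speeds the clock up) is an assumption that the comment above
* does not confirm; prefer the values that timed(1M) and timeslave(1M)
* log to SYSLOG.
*

```python
SECONDS_PER_DAY = 86400
NS_PER_SECOND = 1_000_000_000
TIMETRIM_LIMIT_NS = 3_000_000  # 3 milliseconds per second, i.e. 0.3%


def timetrim_for_drift(gain_seconds_per_day):
    """Return a timetrim value (ns per second) that cancels a measured drift.

    gain_seconds_per_day > 0 means the clock runs fast.
    ASSUMPTION: a positive timetrim speeds the clock up, so a fast clock
    needs a negative value.
    """
    trim = round(-gain_seconds_per_day * NS_PER_SECOND / SECONDS_PER_DAY)
    # Clamp to the 3 ms/second limit stated in the comment above.
    return max(-TIMETRIM_LIMIT_NS, min(TIMETRIM_LIMIT_NS, trim))
```

*
* A clock gaining 8.64 seconds per day is fast by 100 microseconds per
* second, so under these assumptions the sketch yields -100000.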
*
timer: static
* name            default  minimum  maximum
fasthz            1000     500      2500
itimer_on_clkcpu  0
timetrim          0

*
* tunables for memory size
*
* maxpmem: the maximum physical memory to use, if maxpmem = 0,
*          then use all available physical memory.
* syssegsz: max pages of dynamic system memory,
* maxdmasz: max dma transfer (in pages), must be no more than syssegsz / 2
*           and less than maxpmem. It is set to what it would be plus 1
*           so that file system direct I/O can report a reasonably aligned
*           maximum I/O size (maxdmasz - 1).
* maxpglst: maximum number of pages that can be held in each
*           of the pager's pageout queues
*
* syssegsz auto-config: min(max(KB(mem)/2,0x2000),KSEGSIZE)
*
* Note: KSEGSIZE is system-specific, and defined (in bytes) in
*       . Thus, the maximum value given here
*       is just a WAG.
*
* maxdmasz auto-config: syssegsz/2
* maxpglst auto-config: min(max(KB(mem)/2048,100)+(numcpus-1)*16,1000)
*
* scache_pool_size: Amount of memory always kept in reserve for use by the
*           paging daemon. The value is the number of Kbytes
*           reserved which is always rounded up to next page
*           boundary. WARNING: this parameter should not be
*           changed lightly. Setting it too low will result in
*           memory deadlocks. Setting it too high wastes memory.
*           If the system panics with the message
*           "scache... out of memory",
*           then this tuneable should be increased. Otherwise,
*           it should be left alone.
*
memsize: static
* name            default  minimum  maximum
maxpmem           0        1024
syssegsz,32       0        0x2000   0x20000     /* auto-config */
syssegsz,64       0        0x2000   0x10000000  /* auto-config */
maxdmasz,32       1025     1024     syssegsz    /* auto-config */
maxdmasz,64       257      256      syssegsz    /* auto-config */
maxpglst          0        50       1000        /* auto-config */
scache_pool_size  32       8                    /* in Kbytes */

*
* Tuneable Paging parameters
*
* gpgslo: If freemem < gpgslo, then start to steal pages from processes.
* gpgshi: Once we start to steal pages, don't stop until freemem > gpgshi.
*         gpgshi defaults to 1/12 of memory; gpgslo defaults to gpgshi/2.
* gpgsmsk: Mask used by getpages to determine whether a page is stealable.
* maxsc: The maximum number of pages which will be swapped out in a single
*        operation.
* maxfc: The maximum number of pages which will be saved up and freed at once.
* maxdc: The maximum number of pages which will be saved up and written to
*        disk (mapped files) at once.
* maxumem: Obsolete - see RLIMIT_VMEM
* minarmem: The minimum available resident (not swappable) memory to maintain in
*        order to avoid deadlock.
* minasmem: The minimum available swappable memory to maintain in order to avoid
*        deadlock.
* maxlkmem: The maximum amount of lockable pages per process
* tlbdrop: Number of ticks before a procs wired entries are flushed
* rsshogfrac: Fraction of memory RSS hogs can use
* rsshogslop: # pages excess of RSS before slow down process
* dwcluster: Maximum number of delayed write pages to cluster in each push.
* bdflushr: bdflushr is used to compute what fraction (nbuf/bdflushr) of the
*        buffers are examined by bdflush when it wakes up to
*        determine whether the delayed-write, aged buffers should be
*        flushed to disk.
* bdflush_interval: Interval at which bdflush runs, in 10ms ticks.
* autoup: The age a delayed-write buffer must be, in seconds,
*        before bdflush will write it out.
* vfs_syncr: The rate at which vfs_syncr is run, in seconds.
* min_file_pages: The minimum number of file pages to keep in the cache
*        when memory gets low. It is autoconfigured to
*        3% of the system's memory if it is 0. When setting,
*        remember that the page size is 4k in 32 bit kernels
*        and 16k in 64 bit kernels.
* min_free_pages: When the number of file pages is above min_file_pages,
*        this is the minimum number of free pages.
*        The default value for min_free_pages is gpgshi * 2 for
*        systems with less than 600M of memory. For larger memory
*        systems, the default is gpgshi * 4.
* min_bufmem: The minimum number of metadata pages to keep in the cache
*        when memory gets low.
It is autoconfigured to 2% of the
*        system's memory if it is 0.
* shaked_interval: The number of seconds between runs of the shaked
*        daemon when memory is low.
*
* zone_accum_limit: Percentage of memory (in a node on a NUMA system) that can
*        be accumulated before shaked is kicked off to shake the zone
*        to free memory back into the global pool. It is set to 30%,
*        which means that if the amount of free memory kept in zones
*        exceeds 30%, we kick shaked.
*
* cwcluster: Maximum number of commit write pages to cluster in each push
paging: run
* name             default  minimum  maximum
gpgslo             0
gpgshi             0        30
gpgsmsk            2        0        7
maxsc              0        8        maxpglst
maxfc              0        50       maxpglst
maxdc              0        1        maxpglst
bdflushr           5        1        31536000
bdflush_interval   100      10
vfs_syncr          30       1        31536000
minarmem           0
minasmem           0
maxlkmem           2000
tlbdrop            100
rsshogfrac         75       0        100
rsshogslop         20
dwcluster          64
autoup             10       1        30
min_file_pages,32  0        0        500000
min_file_pages,64  0        0        1000000
min_free_pages     0        0        50000000
min_bufmem,32      0        0        500000
min_bufmem,64      0        0        1000000
shaked_interval    1        1        3600
zone_accum_limit   30       0        100
cwcluster          64

*
* Enables an optimization for mmap() of /dev/zero. This allows adjacent
* mmap invocations to just grow the previous address space segment instead
* of creating a new one. This is useful for X servers. This can be turned off
* if a large parallel application wants to initialize adjacent mmap /dev/zero
* segments in parallel.
*
enable_devzero_opt 1

*
* This tuneable enables a fast path in the kernel async io.
* The fast path ensures that we use k0seg address for aiocb buffers
* that lie fully within a page.
*
enable_kaio_fast_path 1

*
* tunables for buffer cache
*
* nbuf: number of buffers in disk buffer cache.
autoconfigure if 0 * * nbuf auto-config: * 32-bit kernels: min(100+KB(mem)/160,6000) * 64-bit kernels: min(100+KB(mem)/160,600000) bufcache: static * name default minimum maximum nbuf,32 0 75 6000 nbuf,64 0 75 600000 * * tunables for module loader * * bdevsw_extra: number of extra entries for bdevsw * cdevsw_extra: number of extra entries for cdevsw * fmodsw_extra: number of extra entries for fmodsw * vfssw_extra: number of extra entries for vfssw * mload: static * name default minimum maximum bdevsw_extra 21 1 254 cdevsw_extra 23 3 254 fmodsw_extra 20 0 vfssw_extra 5 0 * * tunables for extended accounting features * * accounting controls * do_procacct = 1 perform SVR4 process accounting * = 0 do not perform SVR4 process accounting (this overrides * the acct(2) call) * do_extpacct = 1 perform extended process accounting * = 0 do not perform extended process accounting * do_sessacct = 1 perform array session accounting on process exit * = 0 do not perform array session accounting * sessaf: session accounting record format * spilen: length of service provider information * miser_rss_account = 1 do proper miser accounting (costs time/space) * = 0 use inaccurate miser accounting (saves time/space) * * process aggregate id settings * dflt_paggid: the default process aggregate id for special system * sessions and other sessions that bypass ordinary * session handle assignment * min_local_paggid: the first proc. aggregate id that can be assigned * by the kernel * max_local_paggid: the largest proc aggregate id that can be assigned * by the kernel before wrapping back to min_local_paggid * incr_paggid: default increment for local paggid's (or the local * portion of global paggid's) * asmachid: default machine ID for generating global paggid's. * In an array/cluster configuration, no two machines * should have the same machine ID. If 0 is specified, * the kernel will only generate local paggid's. * asarrayid: default array ID for generating global paggid's. 
*
* project ID settings
*       dfltprid: the default project ID for special system sessions and other
*               sessions that bypass ordinary project ID assignment, as well
*               as for any users that do not have their own default project ID.
*
extacct: static
* name                  default  minimum  maximum
do_procacct             1
do_extpacct             0
do_sessacct             0
sessaf                  1        1        2
spilen                  64       0        1024
dfltprid                0        0        0x7fffffffffffffff  ll
dflt_paggid             0        0        0x7fffffffffffffff  ll
min_local_paggid        1        1        0xffffff00          ll
max_local_paggid        65535    255      0xffffffff          ll
incr_paggid             1        1        0xffffffffffffffff  ll
asmachid                0        0        0x7fff
asarrayid               0xffff   0        0xffff
miser_rss_account       1
*
* SGI internal use
*
*       histmax: semaphore history
*       conbuf_cpusz: console buffer sizes, per cpu
*       putbuf_cpusz: put buffer sizes, per cpu
*       errbuf_cpusz: error dump buffer sizes, per cpu
*       conbuf_maxsz: maximum size of console buffer for the system
*       putbuf_maxsz: maximum size of put buffer for the system
*       errbuf_maxsz: maximum size of error dump buffer for the system
*       dumplo: starting default offset in dumpdev to dump kernel when it crashes
*
internal: static
* name                  default  minimum  maximum
histmax                 0
conbuf_maxsz            16384
putbuf_maxsz            16384
errbuf_maxsz            16384
conbuf_cpusz            2048
putbuf_cpusz            2048
errbuf_cpusz            2048
dumplo                  0
*
* SN0 HW settings
* ** Do not modify **
hub_core_arb_ctrl       0xfe
hub_2_4_iio_icmr_c_cnt  4
hub_2_4_iio_icmr_p_cnt  0
hub_2_4_iio_bte_crb_cnt 2
* tuneables for using very large pages
*
* Large pages must be allocated early on and reserved if they
* are to be used. nlpages_X refers to the number of large
* pages of size X. When these are set, the kernel will attempt
* to allocate these counts of pages of the appropriate size.
*
*
large_pages: static
* name                  default  minimum  maximum
nlpages_64k             0        0
nlpages_256k            0        0
nlpages_1m              0        0
nlpages_4m              0        0
nlpages_16m             0        0
* tunables for using large pages.
* These tunables define high-water marks for various page sizes.
* They are specified as a percentage of total memory in the system.
* The coalescing daemon will use these values in deciding the number of
* large pages it has to coalesce for a particular page size.
* Thus, for example, if percent_totalmem_64k_pages is set to 20, the coalescing
* mechanism will try to coalesce 20% of memory into 64k pages.
* percent_totalmem_16k_pages is looked at only if the system is configured to run
* with 4k page size. If the tunables are set to 0, the coalescing
* mechanism will be idle. The coalescing daemon will be started if
* the tunables are set dynamically on a running system.
* The tuneable large_pages_enable turns on the large page feature.
* To turn on the feature, set the value to 1; otherwise set it to 0.
* It is not turned on for workstations by default. It is turned on for servers
* by default. Coalesced will not run if large_pages_enable is 0.
* name                              default  minimum  maximum
large_pages_enable,NEED_LPAGES      1
large_pages_enable,NOT_NEED_LPAGES  0
lpage_watermarks: run
percent_totalmem_16k_pages          0        0        100
percent_totalmem_64k_pages          20       0        100
percent_totalmem_256k_pages         0        0        100
percent_totalmem_1m_pages           0        0        100
percent_totalmem_4m_pages           0        0        100
percent_totalmem_16m_pages          0        0        100
* The number of seconds between runs of coalesced when it cannot coalesce
* enough large pages. It is set to 5 min. Coalesced will run continuously
* if some process is sleeping waiting for large pages.
coalesced_run_period                300      60
* Normally coalesced will avoid doing any work until large pages are actually
* in demand, and then it will only do enough work to satisfy that demand.
* Turning on precoalescing will make coalesced put together all the large
* pages it can whenever it runs (when demand cannot be immediately met and
* according to the run period). Precoalescing can be activated by setting
* coalesced_precoalesce to non-zero.
coalesced_precoalesce               0
* Tuneables for Maximum simultaneous VME DMA transfer size on Challenge.
* Defines the size, in megabytes, of the DMA transfers that can be active
* for each VME bus in the system. E.g.
* a value of 64 implies that the system
* will allocate sufficient resources to have up to 64 Mbytes of DMA
* active on each VME bus in the system. The system needs to allocate
* a sufficient number of mapping table entries to support the required
* transfer size. Each entry in memory is 4 bytes wide, and maps
* 4k bytes of contiguous data. With a page size of 16k, one (16k)
* page is needed for 16M of simultaneous active DMA transfers.
*
* Value for nvme32_dma should be a proper power of 2. If not, it will
* be bumped to the next power of 2 before use. It must not exceed 512 or
* else it will be forced to 512. Notice that values larger than 64 may
* impact VME throughput on some systems.
*
vme_dma: static
* name                  default  minimum  maximum
nvme32_dma              64       32       512
*
* Tuneables for shm
*       shmmax - max size in bytes
*               If shmmax value is 0, it will be adjusted to be 80% of the
*               system memory size at boot time.
*               If the default value specifies a number, kernel will obey that.
*       shmmin - min size in bytes
*       shmmni - max # active segments in system
*               If shmmni default value is 0,
*               it will be adjusted at boot time to be the same value as nproc.
*               If the default value specifies a number, kernel will obey that.
*       sshmseg - max segments per process
*
shm: static
* name                  default  minimum  maximum
shmmax,32               0        0x1000   0x7fffffff
shmmin,32               1        1        1
shmmax,64               0        0x1000   0x7fffffffffffffff  ll
shmmin,64               1        1        1                   ll
shmmni                  0        5        100000
sshmseg                 300      1        10000
*
* Tuneables for the I/O subsystem
*       hwgraph_num_dev - number of supported devices in the hardware graph
*       xxl_hwgraph_num_dev - used on large systems which may have a larger
*               number of devices
*       graph_vertex_per_group - number of hardware graph nodes per vertex group.
*               Each group shares a single lock; reducing the count reduces the
*               lock contention, at the expense of 128 bytes per group.
*               graph_vertex_per_group must be a power of two.
*       io_init_node_threads - the number of nodes on which to initialize IO at
*               once. 0 means do all nodes at once.
*       io_init_async - whether or not to start extra threads to initialize
*               IO devices controlled by a node. 0 = no extra threads, 1 = use
*               extra threads
*
* Certain very large machines can hang due to hardware graph lock
* contention while probing IO devices at boot time, making it
* difficult to boot the machine to change the tuneables to fix the
* problem. Therefore, on machines with more than 32 processors, if
* io_init_node_threads is zero, these variables are set to the
* following values, which have worked well on large test machines:
*       io_init_node_threads = 16
*       io_init_async = 0
*       graph_vertex_per_group = 16 (only if it had the default value)
*
io: static
* name                  default  minimum  maximum
hwgraph_num_dev         16384    4096     262144
xxl_hwgraph_num_dev     49152    4096     262144
graph_vertex_per_group  128      1        128
io_init_node_threads    0        0        1024
io_init_async           1        0        1
*
* sthread priorities
*
sthread: static
* name                  default  minimum  maximum
console_pri             255      90       255
du_conpoll_pri          255      90       255
msc_shutdown_pri        109      90       109
msc_heartbeat_pri       109      90       109
reaper_pri              100      90       109
netthread_pri           98       90       109
klogthrd_pri            98       90       109
batchd_pri              98       90       109
pdflush_pri             98       90       109
rmonqd_pri              98       90       109
vhand_pri               98       90       109
shaked_pri              98       90       109
exim_pri                96       90       109
bpsqueue_pri            94       90       109
xlvd_pri                94       90       109
vfs_sync_pri            92       90       109
bdflush_pri             92       90       109
xfsd_pri                92       90       109
onesec_pri              90       90       109
unmountd_pri            90       90       109
async_call_pri          90       90       109
sched_pri               90       90       109
coalesced_pri           90       90       109
tiled_pri               90       90       109
cmsd_pri                90       90       109
recovery_pri            90       90       109
tpisockd_pri            90       90       109
io_init_pri             90       90       109
partition_pri           90       90       109
cdl_attach_pri          90       90       109
munldd_pri              90       90       109
dpipe_pri               90       90       109
xbox_poll_pri           90       90       109
cl_async_pri            90       90       109
*
* interrupt thread default priorities
*
ithread: static
audio_intr_pri          235      200      239
tserialio_intr_pri      235      200      239
video_intr_pri          230      200      239
graphics_intr_pri       225      200      239
external_intr_pri       220      200      239
serial_intr_pri         215      200      239
parallel_intr_pri       215      200      239
network_intr_pri        210      200      239
scsi_intr_pri           205      200      239
disk_intr_pri           205      200      239
tape_intr_pri           205      200      239
default_intr_pri        200      200      239
default_timeout_pri     1        1        255
*
* Kernel Synchronization Package Tunables:
*
*       mrlock_num_mria Number of mria (mrlock inheritance array) in
*                       overflow pool. Each array can record 8 mrlocks.
*                       Every thread in the system by default has one mria
*                       allocated to it. When the number of mrlocks held by a
*                       thread exceeds 8, an mria is allocated from the
*                       overflow pool. Currently, only vhand (the paging
*                       daemon) uses mria's from this pool.
*
ksync: static
* name                  default  minimum  maximum
mrlock_num_mria         64       64       1024
*
* enable/disable cpr
*
ckpt: run
* name                  default  minimum  maximum
ckpt_enabled            1
*
* miser/miser_cpuset control flags
*
* cpuset_nobind: boolean; if true (1) then processes under the
*                scheduling control of miser_cpuset are not permitted
*                to bind themselves to any particular CPU.
miser: static
* name                  default  minimum  maximum
cpuset_nobind           0        0        1