linux-pinenote

Author	SHA1	Message	Date
Jing Min Zhao	48bfee5fad	[NETFILTER]: H.323 helper: move some function prototypes to ip_conntrack_h323.h Move prototypes of NAT callbacks to ip_conntrack_h323.h. Because the use of typedefs as arguments, some header files need to be moved as well. Signed-off-by: Jing Min Zhao <zhaojingmin@users.sourceforge.net> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:25:35 -07:00
Patrick McHardy	972d1cb142	[NETFILTER]: Add helper functions for mass hook registration/unregistration Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:25:32 -07:00
Jordan Hargrave	b20367a6c2	[PATCH] x86_64: Fix drift with HPET timer enabled If the HPET timer is enabled, the clock can drift by ~3 seconds a day. This is due to the HPET timer not being initialized with the correct setting (still using PIT count). If HZ changes, this drift can become even more pronounced. HPET patch initializes tick_nsec with correct tick_nsec settings for HPET timer. Vojtech comments: "It's not entirely correct (it assumes the HPET ticks totally exactly), but it's significantly better than assuming the PIT error there." Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-09 11:53:53 -07:00
Andi Kleen	a8062231d8	[PATCH] x86_64: Handle empty PXMs that only contain hotplug memory The node setup code would try to allocate the node metadata in the node itself, but that fails if there is no memory in there. This can happen with memory hotplug when the hotplug area defines an so far empty node. Now use bootmem to try to allocate the mem_map in other nodes. And if it fails don't panic, but just ignore the node. To make this work I added a new __alloc_bootmem_nopanic function that does what its name implies. TBD should try to use nearby nodes here. Currently we just use any. It's hard to do it better because bootmem doesn't have proper fallback lists yet. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-09 11:53:16 -07:00
Andi Kleen	9d99aaa31f	[PATCH] x86_64: Support memory hotadd without sparsemem Memory hotadd doesn't need SPARSEMEM, but can be handled by just preallocating mem_maps. This only needs some untangling of ifdefs to enable the necessary code even without SPARSEMEM. Originally from Keith Mannthey, hacked by AK. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-09 11:53:16 -07:00
Jeff Garzik	79fa1b677b	Merge branch 'upstream'	2006-04-04 08:45:13 -04:00
Albert Lee	95de719adc	[PATCH] libata: convert ATAPI_ENABLE_DMADIR to module parameter Convert the ATAPI_ENABLE_DMADIR compile time option needed by some SATA-PATA bridge to runtime module parameter. Signed-off-by: Albert Lee <albertcc@tw.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2006-04-04 08:44:24 -04:00
Jeff Garzik	862cff6378	Merge branch 'upstream'	2006-04-04 08:42:17 -04:00
Jeff Garzik	c16226a1c7	Merge branch 'master'	2006-04-04 08:41:29 -04:00
Linus Torvalds	d69636157a	Merge branch 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block * 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block: [PATCH] splice: fix page stealing LRU handling. [PATCH] splice: page stealing needs to wait_on_page_writeback() [PATCH] splice: export generic_splice_sendpage [PATCH] splice: add a SPLICE_F_MORE flag [PATCH] splice: add comments documenting more of the code [PATCH] splice: improve writeback and clean up page stealing [PATCH] splice: fix shadow[] filling logic	2006-04-02 14:22:06 -07:00
Jens Axboe	3e7ee3e7b3	[PATCH] splice: fix page stealing LRU handling. Originally from Nick Piggin, just adapted to the newer branch. You can't check PageLRU without holding zone->lru_lock. The page release code can get away with it only because the page refcount is 0 at that point. Also, you can't reliably remove pages from the LRU unless the refcount is 0. Ever. Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-02 23:11:04 +02:00
Jens Axboe	b2b39fa478	[PATCH] splice: add a SPLICE_F_MORE flag This lets userspace indicate whether more data will be coming in a subsequent splice call. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-02 23:05:41 +02:00
Jens Axboe	4f6f0bd2ff	[PATCH] splice: improve writeback and clean up page stealing By cleaning up the writeback logic (killing write_one_page() and the manual set_page_dirty()), we can get rid of ->stolen inside the pipe_buffer and just keep it local in pipe_to_file(). This also adds dirty page balancing logic and O_SYNC handling. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-02 23:04:46 +02:00
Linus Torvalds	63589ed078	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial * git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (48 commits) Documentation: fix minor kernel-doc warnings BUG_ON() Conversion in drivers/net/ BUG_ON() Conversion in drivers/s390/net/lcs.c BUG_ON() Conversion in mm/slab.c BUG_ON() Conversion in mm/highmem.c BUG_ON() Conversion in kernel/signal.c BUG_ON() Conversion in kernel/signal.c BUG_ON() Conversion in kernel/ptrace.c BUG_ON() Conversion in ipc/shm.c BUG_ON() Conversion in fs/freevxfs/ BUG_ON() Conversion in fs/udf/ BUG_ON() Conversion in fs/sysv/ BUG_ON() Conversion in fs/inode.c BUG_ON() Conversion in fs/fcntl.c BUG_ON() Conversion in fs/dquot.c BUG_ON() Conversion in md/raid10.c BUG_ON() Conversion in md/raid6main.c BUG_ON() Conversion in md/raid5.c Fix minor documentation typo BFP->BPF in Documentation/networking/tuntap.txt ...	2006-04-02 12:58:45 -07:00
Linus Torvalds	b043b673dc	Merge master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb * master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (49 commits) V4L/DVB (3667b): cpia2: fix function prototype V4L/DVB (3702): Make msp3400 routing defines more consistent V4L/DVB (3700): Remove obsolete commands from tvp5150.c V4L/DVB (3697): More msp3400 and bttv fixes V4L/DVB (3696): Previous change for cx2341X boards broke the remote support V4L/DVB (3693): Fix msp3400c and bttv stereo/mono/bilingual detection/handling V4L/DVB (3692): Keep experimental SLICED_VBI defines under an #if 0 V4L/DVB (3689): Kconfig: fix VP-3054 Secondary I2C Bus build configuration menu dependencies V4L/DVB (3673): Fix budget-av CAM reset V4L/DVB (3672): Fix memory leak in dvr open V4L/DVB (3671): New module parameter 'tv_standard' (dvb-ttpci driver) V4L/DVB (3670): Fix typo in comment V4L/DVB (3669): Configurable dma buffer size for saa7146-based budget dvb cards V4L/DVB (3653h): Move usb v4l docs into Documentation/video4linux V4L/DVB (3667a): Fix SAP + stereo mode at msp3400 V4L/DVB (3666): Remove trailing newlines V4L/DVB (3665): Add new NEC uPD64031A and uPD64083 i2c drivers V4L/DVB (3663): Fix msp3400c wait time and better audio mode fallbacks V4L/DVB (3662): Don't set msp3400c-non-existent register V4L/DVB (3661): Add wm8739 stereo audio ADC i2c driver ...	2006-04-02 12:53:57 -07:00
Linus Torvalds	9c8680e2cf	Merge master.kernel.org:/pub/scm/linux/kernel/git/dtor/input * master.kernel.org:/pub/scm/linux/kernel/git/dtor/input: (26 commits) Input: add support for Braille devices Input: synaptics - limit rate to 40pps on Toshiba Protege M300 Input: gamecon - add SNES mouse support Input: make modalias code respect allowed buffer size Input: convert /proc handling to seq_file Input: limit attributes' output to PAGE_SIZE Input: gameport - fix memory leak Input: serio - fix memory leak Input: zaurus keyboard driver updates Input: i8042 - fix logic around pnp_register_driver() Input: ns558 - fix logic around pnp_register_driver() Input: pcspkr - separate device and driver registration Input: atkbd - allow disabling on X86_PC (if EMBEDDED) Input: atkbd - disable softrepeat for dumb keyboards Input: atkbd - fix complaints about 'releasing unknown key 0x7f' Input: HID - fix duplicate key mapping for Logitech UltraX remote Input: use kzalloc() throughout the code Input: fix input_free_device() implementation Input: initialize serio and gameport at subsystem level Input: uinput - semaphore to mutex conversion ...	2006-04-02 12:49:19 -07:00
Linus Torvalds	bacd3add08	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [NET]: Fully fix the memory leaks in sys_accept(). [NETFILTER]: iptables 32bit compat layer [NETFILTER]: {ip,nf}_conntrack_netlink: fix expectation notifier unregistration [NETFILTER]: fix ifdef for connmark support in nf_conntrack_netlink [NETFILTER]: x_tables: unify IPv4/IPv6 multiport match [NETFILTER]: x_tables: unify IPv4/IPv6 esp match [NET]: Fix dentry leak in sys_accept(). [IPSEC]: Kill unused decap state structure [IPSEC]: Kill unused decap state argument [NET]: com90xx kmalloc fix [TG3]: Update driver version and reldate. [TG3]: Revert "Speed up SRAM access"	2006-04-02 12:47:12 -07:00
Linus Torvalds	29e350944f	splice: add SPLICE_F_NONBLOCK flag It doesn't make the splice itself necessarily nonblocking (because the actual file descriptors that are spliced from/to may block unless they have the O_NONBLOCK flag set), but it makes the splice pipe operations nonblocking. Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-02 12:46:35 -07:00
Jeff Garzik	029f5468b5	Merge branch 'upstream' Conflicts: drivers/scsi/libata-core.c drivers/scsi/pdc_adma.c drivers/scsi/sata_mv.c drivers/scsi/sata_nv.c drivers/scsi/sata_promise.c drivers/scsi/sata_qstor.c drivers/scsi/sata_sx4.c drivers/scsi/sata_vsc.c include/linux/libata.h	2006-04-02 10:30:40 -04:00
Tejun Heo	ece1d63619	[PATCH] libata: separate out libata-eh.c A lot of EH codes are about to be added to libata. Separate out libata-eh.c. ata_scsi_timed_out(), ata_scsi_error(), ata_qc_timeout(), ata_eng_timeout(), ata_eh_qc_complete() and ata_eh_qc_retry() are moved. No code is changed by this patch. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2006-04-02 10:09:21 -04:00
Tejun Heo	2719736779	[PATCH] libata: add ATA_QCFLAG_IO Add a new qc flag ATA_QCFLAG_IO. This flag gets set for normal IO commands originating from SCSI midlayer. This information will be used by EH to determine transfer speed reconfiguration. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2006-04-02 10:09:20 -04:00
Tejun Heo	ea1dd4e130	[PATCH] libata: clear only affected flags during ata_dev_configure() ata_dev_configure() should not clear dynamic device flags determined elsewhere. Lower eight bits are reserved for feature flags, define ATA_DFLAG_CFG_MASK and clear only those bits before configuring device. Without this patch, ATA_DFLAG_PIO gets turned off during revalidation making PIO mode unuseable. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2006-04-02 10:09:19 -04:00
Tejun Heo	198e0fed9e	[PATCH] libata: rename ATA_FLAG_PORT_DISABLED to ATA_FLAG_DISABLED Rename ATA_FLAG_PORT_DISABLED to ATA_FLAG_DISABLED for consistency. (ATA_FLAG_* are always about ports). Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2006-04-02 10:09:19 -04:00
Tejun Heo	949b38af40	[PATCH] libata: clean up constants * Reorder ATA_DFLAG_* such that feature flags determined by ata_dev_configure() are on lower bits. Reserve lower eight bits for this purpose and allocate dynamic flags from bit 8. * Reorder ATA_FLAG_* such that feature flags determined during driver initiailization are on bits 0:15, dynamic flags on 16:23 and LLDD specific flags on 24:31. * Kill trailing white space and lower-case an one line comment for consistency. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2006-04-02 10:09:19 -04:00
Tejun Heo	c43c555c3a	[PATCH] libata: ATA_FLAG_IN_EH is not used, kill it Kill unused flag ATA_FLAG_IN_EH. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2006-04-02 10:09:19 -04:00
Tejun Heo	14d2bac187	[PATCH] libata: improve ata_bus_probe() Improve ata_bus_probe() such that configuration failures are handled better. Each device is given ATA_PROBE_MAX_TRIES chances, but any non-transient error (revalidation failure with -ENODEV, configuration failure with -EINVAL...) disables the device directly. Any IO error results in SATA PHY speed down and ata_set_mode() failure lowers transfer mode. The last try always puts a device into PIO-0. After each failure, the whole port is reset to make sure that the controller and all the devices are in a known and stable state. The reset also applies SATA SPD configuration if necessary. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2006-04-02 10:02:57 -04:00
Tejun Heo	1c3fae4d7e	[PATCH] libata: implement ap->sata_spd_limit and helpers ap->sata_spd_limit contrains SATA PHY speed of the port. It is initialized to the configured value prior to probing thus preserving BIOS configured value. hardreset is responsible for applying SPD limit and sata_std_hardreset() is updated to do that. SATA SPD limit will be used to enhance failure handling during probing and later by EH. This patch also normalizes some comments around affected code. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2006-04-02 10:02:57 -04:00
Tejun Heo	002c8054fa	[PATCH] libata: implement ata_dev_absent() For the time being we cannot use ata_dev_present() as it was renamed to ata_dev_enabled() but we still need presence test. Implement negation of the test. Conveniently, the negated result is needed in more places. This is suggested by Jeff Garzik. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2006-04-02 10:02:57 -04:00
Martin Waitz	a580290c3e	Documentation: fix minor kernel-doc warnings This patch updates the comments to match the actual code. Signed-off-by: Martin Waitz <tali@admingilde.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2006-04-02 13:59:55 +02:00
Hans Verkuil	9bc7400a9d	V4L/DVB (3692): Keep experimental SLICED_VBI defines under an #if 0 The sliced VBI defines added in videodev2.h are removed since requires more discussion. Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2006-04-02 04:56:04 -03:00
Samuel Thibault	b9ec4e109d	Input: add support for Braille devices - Add KEY_BRL_* input keys and K_BRL_* keycodes; - Add emulation of how braille keyboards usually combine braille dots to the console keyboard driver; - Add handling of unicode U+28xy diacritics. Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2006-04-02 00:10:28 -05:00
Dmitry Torokhov	95d465fd75	Manual merge with Linus. Conflicts: arch/powerpc/kernel/setup-common.c drivers/input/keyboard/hil_kbd.c drivers/input/mouse/hil_ptr.c	2006-04-02 00:08:05 -05:00
Krzysztof Halasa	58a7ce6442	[PATCH] Goramo PCI200SYN WAN driver subsystem ID patch Goramo finally got PCI subsystem ID for their PCI200SYN card. The attached patch adds support for it - cards with old EEPROM data will emit a warning with URL for update tool. Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2006-04-01 14:32:52 -05:00
Jeff Garzik	6e07e16404	Merge branch 'upstream'	2006-04-01 14:29:12 -05:00
Tejun Heo	e1211e3fa7	[PATCH] libata: implement ata_dev_enabled and disabled() This patch renames ata_dev_present() to ata_dev_enabled() and adds ata_dev_disabled(). This is to discern the state where a device is present but disabled from not-present state. This disctinction is necessary when configuring transfer mode because device selection timing must not be violated even if a device fails to configure. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2006-04-01 12:33:45 -05:00
Dmitry Mishin	2722971cbe	[NETFILTER]: iptables 32bit compat layer This patch extends current iptables compatibility layer in order to get 32bit iptables to work on 64bit kernel. Current layer is insufficient due to alignment checks both in kernel and user space tools. Patch is for current net-2.6.17 with addition of move of ipt_entry_{match\| target} definitions to xt_entry_{match\|target}. Signed-off-by: Dmitry Mishin <dim@openvz.org> Acked-off-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-01 02:25:19 -08:00
Yasuyuki Kozakai	a89ecb6a2e	[NETFILTER]: x_tables: unify IPv4/IPv6 multiport match This unifies ipt_multiport and ip6t_multiport to xt_multiport. As a result, this addes support for inversion and port range match to IPv6 packets. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-01 02:22:54 -08:00
Yasuyuki Kozakai	dc5ab2faec	[NETFILTER]: x_tables: unify IPv4/IPv6 esp match This unifies ipt_esp and ip6t_esp to xt_esp. Please note that now a user program needs to specify IPPROTO_ESP as protocol to use esp match with IPv6. This means that ip6tables requires '-p esp' like iptables. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-01 02:22:30 -08:00
Kalin KOZHUHAROV	8ba8e95ed1	Fix comments: s/granuality/granularity/ I was grepping through the code and some `grep ganularity -R .` didn't catch what I thought. Then looking closer I saw the term "granuality" used in only four places (in comments) and granularity in many more places describing the same idea. Some other facts: dictionary.com does not know such a word define:granuality on google is not found (and pages for granuality are mostly related to patches to the kernel) it has not been discussed as a term on LKML, AFAICS (=Can Search) To be consistent, I think granularity should be used everywhere. Signed-off-by: Kalin KOZHUHAROV <kalin@thinrope.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2006-04-01 01:41:22 +02:00
Linus Torvalds	4b75679f60	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [NET]: Allow skb headroom to be overridden [TCP]: Kill unused extern decl for tcp_v4_hash_connecting() [NET]: add SO_RCVBUF comment [NET]: Deinline some larger functions from netdevice.h [DCCP]: Use NULL for pointers, comfort sparse. [DECNET]: Fix refcount	2006-03-31 12:52:30 -08:00
Adrian Bunk	a244e1698a	[PATCH] fs/namei.c: make lookup_hash() static As announced, lookup_hash() can now become static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-31 12:19:01 -08:00
Antonino A. Daplas	a536093a2f	[PATCH] fbcon: Fix big-endian bogosity in slow_imageblit() The monochrome->color expansion routine that handles bitmaps which have (widths % 8) != 0 (slow_imageblit) produces corrupt characters in big-endian. This is caused by a bogus bit test in slow_imageblit(). Fix. This patch may deserve to go to the stable tree. The code has already been well tested in little-endian machines. It's only in big-endian where there is uncertainty and Herbert confirmed that this is the correct way to go. It should not introduce regressions. Signed-off-by: Antonino Daplas <adaplas@pol.net> Acked-by: Herbert Poetzl <herbert@13thfloor.at> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-31 12:19:00 -08:00
Richard Purdie	6ca017658b	[PATCH] backlight: Backlight Class Improvements Backlight class attributes are currently easy to implement incorrectly. Moving certain handling into the backlight core prevents this whilst at the same time makes the drivers simpler and consistent. The following changes are included: The brightness attribute only sets and reads the brightness variable in the backlight_properties structure. The power attribute only sets and reads the power variable in the backlight_properties structure. Any framebuffer blanking events change a variable fb_blank in the backlight_properties structure. The backlight driver has only two functions to implement. One function is called when any of the above properties change (to update the backlight brightness), the second is called to return the current backlight brightness value. A new attribute "actual_brightness" is added to return this brightness as determined by the driver having combined all the above factors (and any driver/device specific factors). Additionally, the backlight core takes care of checking the maximum brightness is not exceeded and of turning off the backlight before device removal. The corgi backlight driver is updated to reflect these changes. Signed-off-by: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Antonino Daplas <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-31 12:19:00 -08:00
Eric W. Biederman	3e7e241f8c	[PATCH] dcache: Add helper d_hash_and_lookup It is very common to hash a dentry and then to call lookup. If we take fs specific hash functions into account the full hash logic can get ugly. Further full_name_hash as an inline function is almost 100 bytes on x86 so having a non-inline choice in some cases can measurably decrease code size. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-31 12:19:00 -08:00
Eric W. Biederman	92476d7fc0	[PATCH] pidhash: Refactor the pid hash table Simplifies the code, reduces the need for 4 pid hash tables, and makes the code more capable. In the discussions I had with Oleg it was felt that to a large extent the cleanup itself justified the work. With struct pid being dynamically allocated meant we could create the hash table entry when the pid was allocated and free the hash table entry when the pid was freed. Instead of playing with the hash lists when ever a process would attach or detach to a process. For myself the fact that it gave what my previous task_ref patch gave for free with simpler code was a big win. The problem is that if you hold a reference to struct task_struct you lock in 10K of low memory. If you do that in a user controllable way like /proc does, with an unprivileged but hostile user space application with typical resource limits of 1000 fds and 100 processes I can trigger the OOM killer by consuming all of low memory with task structs, on a machine wight 1GB of low memory. If I instead hold a reference to struct pid which holds a pointer to my task_struct, I don't suffer from that problem because struct pid is 2 orders of magnitude smaller. In fact struct pid is small enough that most other kernel data structures dwarf it, so simply limiting the number of referring data structures is enough to prevent exhaustion of low memory. This splits the current struct pid into two structures, struct pid and struct pid_link, and reduces our number of hash tables from PIDTYPE_MAX to just one. struct pid_link is the per process linkage into the hash tables and lives in struct task_struct. struct pid is given an indepedent lifetime, and holds pointers to each of the pid types. The independent life of struct pid simplifies attach_pid, and detach_pid, because we are always manipulating the list of pids and not the hash table. In addition in giving struct pid an indpendent life it makes the concept much more powerful. Kernel data structures can now embed a struct pid * instead of a pid_t and not suffer from pid wrap around problems or from keeping unnecessarily large amounts of memory allocated. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-31 12:19:00 -08:00
Eric W. Biederman	8c7904a00b	[PATCH] task: RCU protect task->usage A big problem with rcu protected data structures that are also reference counted is that you must jump through several hoops to increase the reference count. I think someone finally implemented atomic_inc_not_zero(&count) to automate the common case. Unfortunately this means you must special case the rcu access case. When data structures are only visible via rcu in a manner that is not determined by the reference count on the object (i.e. tasks are visible until their zombies are reaped) there is a much simpler technique we can employ. Simply delaying the decrement of the reference count until the rcu interval is over. What that means is that the proc code that looks up a task and later wants to sleep can now do: rcu_read_lock(); task = find_task_by_pid(some_pid); if (task) { get_task_struct(task); } rcu_read_unlock(); The effect on the rest of the kernel is that put_task_struct becomes cheaper and immediate, and in the case where the task has been reaped it frees the task immediate instead of unnecessarily waiting an until the rcu interval is over. Cleanup of task_struct does not happen when its reference count drops to zero, instead cleanup happens when release_task is called. Tasks can only be looked up via rcu before release_task is called. All rcu protected members of task_struct are freed by release_task. Therefore we can move call_rcu from put_task_struct into release_task. And we can modify release_task to not immediately release the reference count but instead have it call put_task_struct from the function it gives to call_rcu. The end result: - get_task_struct is safe in an rcu context where we have just looked up the task. - put_task_struct() simplifies into its old pre rcu self. This reorganization also makes put_task_struct uncallable from modules as it is not exported but it does not appear to be called from any modules so this should not be an issue, and is trivially fixed. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-31 12:18:59 -08:00
Andrew Morton	158d9ebd19	[PATCH] resurrect __put_task_struct This just got nuked in mainline. Bring it back because Eric's patches use it. Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-31 12:18:59 -08:00
Con Kolivas	d425b274ba	[PATCH] sched: activate SCHED BATCH expired To increase the strength of SCHED_BATCH as a scheduling hint we can activate batch tasks on the expired array since by definition they are latency insensitive tasks. Signed-off-by: Con Kolivas <kernel@kolivas.org> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-31 12:18:59 -08:00
Con Kolivas	3dee386e14	[PATCH] sched: cleanup task_activated() The activated flag in task_struct is used to track different sleep types and its usage is somewhat obfuscated. Convert the variable to an enum with more descriptive names without altering the function. Signed-off-by: Con Kolivas <kernel@kolivas.org> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-31 12:18:58 -08:00
Jack Steiner	db1b1fefc2	[PATCH] sched: reduce overhead of calc_load Currently, count_active_tasks() calls both nr_running() & nr_interruptible(). Each of these functions does a "for_each_cpu" & reads values from the runqueue of each cpu. Although this is not a lot of instructions, each runqueue may be located on different node. Depending on the architecture, a unique TLB entry may be required to access each runqueue. Since there may be more runqueues than cpu TLB entries, a scan of all runqueues can trash the TLB. Each memory reference incurs a TLB miss & refill. In addition, the runqueue cacheline that contains nr_running & nr_uninterruptible may be evicted from the cache between the two passes. This causes unnecessary cache misses. Combining nr_running() & nr_interruptible() into a single function substantially reduces the TLB & cache misses on large systems. This should have no measureable effect on smaller systems. On a 128p IA64 system running a memory stress workload, the new function reduced the overhead of calc_load() from 605 usec/call to 324 usec/call. Signed-off-by: Jack Steiner <steiner@sgi.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-31 12:18:58 -08:00

... 16 17 18 19 20 ...

3734 commits