linux-pinenote

Author	SHA1	Message	Date
Hans de Goede	d4cde88c1c	p54pci: fix regression from prevent stuck rx-ring on slow system This patch fixes a recently introduced use-after-free regression from "p54pci: prevent stuck rx-ring on slow system". Hans de Goede reported a use-after-free regression: >BUG: unable to handle kernel paging request at 6b6b6b6b >IP: [<e122284a>] p54p_check_tx_ring+0x84/0xb1 [p54pci] >pde = 00000000 >Oops: 0000 [#1] SMP >EIP: 0060:[<e122284a>] EFLAGS: 00010286 CPU: 0 >EIP is at p54p_check_tx_ring+0x84/0xb1 [p54pci] >EAX: 6b6b6b6b EBX: df10b170 ECX: 00000003 EDX: 00000001 >ESI: dc471500 EDI: d8acaeb0 EBP: c098be9c ESP: c098be84 > DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 >Process swapper (pid: 0, ti=c098a000 task=c09ccfe0 task.ti=c098a000) >Call Trace: > [<e1222b02>] ? p54p_tasklet+0xaa/0xb5 [p54pci] > [<c0440568>] ? tasklet_action+0x78/0xcb > [<c0440ed3>] ? __do_softirq+0xbc/0x173 Quote from comment #17: "The problem is the innocent looking moving of the tx processing to after the rx processing in the tasklet. Quoting from the changelog: This patch does it the same way, except that it also prioritize rx data processing, simply because tx routines can* wait. This is causing an issue with us referencing already freed memory, because some skb's we transmit, we immediately receive back, such as those for reading the eeprom (*) and getting stats. What can happen because of the moving of the tx processing to after the rx processing is that when the tasklet first runs after doing a special skb tx (such as eeprom) we've already received the answer to it. Then the rx processing ends up calling p54_find_and_unlink_skb to find the matching tx skb for the just received special rx skb and frees the tx skb. Then after the processing of the rx skb answer, and thus freeing the tx skb, we go process the completed tx ring entires, and then dereference the free-ed skb, to see if it should free free-ed by p54p_check_tx_ring()." Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=583623 Bug-Identified-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-26 14:21:15 -04:00
Stefan Schmidt	93c0c8b4a5	ieee802154: Fix oops during ieee802154_sock_ioctl Trying to run izlisten (from lowpan-tools tests) on a device that does not exists I got the oops below. The problem is that we are using get_dev_by_name without checking if we really get a device back. We don't in this case and writing to dev->type generates this oops. [Oops code removed by Dmitry Eremin-Solenikov] If possible this patch should be applied to the current -rc fixes branch. Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org> Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-26 11:20:32 -07:00
Eric Dumazet	4a4771a58e	net: use sk_sleep() Commit `aa395145` (net: sk_sleep() helper) missed three files in the conversion. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-26 11:18:45 -07:00
Jiri Pirko	be9e969d79	pppoe: use pppoe_pernet instead of directly net_generic Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-26 11:18:44 -07:00
Jiri Pirko	0db3f0f49a	phonet: use phonet_pernet instead of directly net_generic As in for example pppoe introduce phonet_pernet and use it instead of calling net_generic directly. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-26 11:18:44 -07:00
Hans de Goede	0250ececdf	p54pci: fix bugs in p54p_check_tx_ring Hans de Goede identified a bug in p54p_check_tx_ring: there are two ring indices. 1 => tx data and 3 => tx management. But the old code had a constant "1" and this resulted in spurious dma unmapping failures. Cc: stable@kernel.org Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=583623 Bug-Identified-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-26 14:18:42 -04:00
Denis Turischev	fcf1dd7e68	watchdog: sbc_fitpc2_wdt: fixed I/O operations order There are fitpc2 compatible boards that hang with existent i/o operations order. Solution is to switch between writing to data and command ports. Signed-off-by: Denis Turischev <denis@compulab.co.il> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>	2010-04-26 18:17:34 +00:00
Andre Detsch	dc8bf1b1a6	tg3: Fix INTx fallback when MSI fails tg3: Fix INTx fallback when MSI fails MSI setup changes the value of irq_vec in struct tg3 *tp. This attribute must be taken into account and restored before we try to do a new request_irq for INTx fallback. In powerpc, the original code was leading to an EINVAL return within request_irq, because the driver was trying to use the disabled MSI virtual irq number instead of tp->pdev->irq. Signed-off-by: Andre Detsch <adetsch@br.ibm.com> Acked-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-26 11:15:49 -07:00
Guenter Roeck	86913315de	Watchdog: sb_wdog.c: Fix sibyte watchdog initialization Watchdog configuration register and timer count register were interchanged, causing wrong values to be written into both registers. This caused watchdog triggered resets even if the watchdog was reset in time. Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>	2010-04-26 18:14:03 +00:00
Juuso Oikarinen	0c86980817	mac80211: Fix sta->last_tx_rate setting with no-op rate control devices The sta->last_tx_rate is traditionally updated just before transmitting a frame based on information from the rate control algorithm. However, for hardware drivers with IEEE80211_HW_HAS_RATE_CONTROL this is not performed, as the rate control algorithm is not executed, and because the used rate is not known before the frame has actually been transmitted. This causes atleast a fixed 1Mb/s to be reported to user space. A few other instances of code also rely on this information. Fix this by setting the sta->last_tx_rate in tx_status handling. There, look for last rates entry set by the driver, and use that as value for sta->last_tx_rate. Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-26 14:11:40 -04:00
Alexander Kurz	83bf6f11e8	pcmcia: fix matching rules for pseudo-multi-function cards Prevent PCMCIA_DEV_ID_MATCH_FUNC_ID from grabbing PFC-cards: I changed the code, so that the first matching struct pcmcia_device_id _PFC_ entry will mark the card has_pfc, preventing PCMCIA_DEV_ID_MATCH_FUNC_ID to match. [linux-pcmcia@lists.infradead.org: re-order commit message] Signed-off-by: Alexander Kurz <linux@kbdbabel.org> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>	2010-04-26 20:09:07 +02:00
Rafał Miłecki	5af5542885	ssb: Fix order of definitions and some text space indents Signed-off-by: Rafał Miłecki <zajec5@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-26 13:51:09 -04:00
Rafał Miłecki	0a182fd88f	ssb: Use relative offsets for SPROM Signed-off-by: Rafał Miłecki <zajec5@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-26 13:51:08 -04:00
Rafał Miłecki	ea2db495f9	ssb: Look for SPROM at different offset on higher rev CC Our offset handling becomes even a little more hackish now. For some reason I do not understand all offsets as inrelative. It assumes base offset is 0x1000 but it will work for now as we make offsets relative anyway by removing base 0x1000. Should be cleaner however. Signed-off-by: Rafał Miłecki <zajec5@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-26 13:51:08 -04:00
John W. Linville	d53cdbb94a	ssb: do not read SPROM if it does not exist Attempting to read registers that don't exist on the SSB bus can cause hangs on some boxes. At least some b43 devices are 'in the wild' that don't have SPROMs at all. When the SSB bus support loads, it attempts to read these (non-existant) SPROMs and causes hard hangs on the box -- no console output, etc. This patch adds some intelligence to determine whether or not the SPROM is present before attempting to read it. This avoids those hard hangs on those devices with no SPROM attached to their SSB bus. The SSB-attached devices (e.g. b43, et al.) won't work, but at least the box will survive to test further patches. :-) Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Rafał Miłecki <zajec5@gmail.com> Cc: Larry Finger <Larry.Finger@lwfinger.net> Cc: Michael Buesch <mb@bu3sch.de>	2010-04-26 13:50:54 -04:00
Dave Chinner	dd77ef924c	xfs: more swap extent fixes for dynamic fork offsets A new xfsqa test (226) with a prototype xfs_fsr change to try to handle dynamic fork offsets better triggers an assertion failure where the inode data fork is in btree format, yet there is room in the inode for it to be in extent format. The two inodes look like: before: ino 0x101 (target), num_extents 11, Max in-fork extents 6, broot size 40, fork offset 96 before: ino 0x115 (temp), num_extents 5, Max in-fork extents 3, broot size 40, fork offset 56 after: ino 0x101 (target), num_extents 5, Max in-fork extents 6, broot size 40, fork offset 96 after: ino 0x115 (temp), num_extents 11, Max in-fork extents 3, broot size 40, fork offset 56 Basically the target inode ends up with 5 extents in btree format, but it had space for 6 extents in extent format, so ends up incorrect. Notably here the broot size is the same, and that is where the kernel code is going wrong - the btree root will fit, so it lets the swap go ahead. The check should not allow the swap to take place if the number of extents while in btree format is less than the number of extents that can fit in the inode in extent format. Adding that check will prevent this swap and corruption from occurring. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2010-04-26 12:38:51 -05:00
Patrick McHardy	cb6a4e461f	net: ipmr: add support for dumping routing tables over netlink The ipmr /proc interface (ip_mr_cache) can't be extended to dump routes from any tables but the main table in a backwards compatible fashion since the output format ends in a variable amount of output interfaces. Introduce a new netlink interface to dump multicast routes from all tables, similar to the netlink interface for regular routes. Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-04-26 16:22:50 +02:00
Patrick McHardy	25239cee7e	net: rtnetlink: decouple rtnetlink address families from real address families Decouple rtnetlink address families from real address families in socket.h to be able to add rtnetlink interfaces to code that is not a real address family without increasing AF_MAX/NPROTO. This will be used to add support for multicast route dumping from all tables as the proc interface can't be extended to support anything but the main table without breaking compatibility. This partialy undoes the patch to introduce independant families for routing rules and converts ipmr routing rules to a new rtnetlink family. Similar to that patch, values up to 127 are reserved for real address families, values above that may be used arbitrarily. Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-04-26 16:13:54 +02:00
Patrick McHardy	3d0c9c4eb2	net: fib_rules: mark arguments to fib_rules_register const and __net_initdata fib_rules_register() duplicates the template passed to it without modification, mark the argument as const. Additionally the templates are only needed when instantiating a new namespace, so mark them as __net_initdata, which means they can be discarded when CONFIG_NET_NS=n. Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-04-26 16:02:04 +02:00
Jens Axboe	e6d086d83c	btrfs: convert to using bdi_setup_and_register() It's now a provided helper, so get rid of the internal setup and btrfs atomic_t bdi enumerator. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2010-04-26 10:27:54 +02:00
Krzysztof Helt	867f1845c5	ALSA: es968: fix wrong PnP dma index There is only one dma for the ESS ES968 based board. Its index is 0 and not 1. This make the es968 card working. Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl> Signed-off-by: Takashi Iwai <tiwai@suse.de>	2010-04-26 09:05:44 +02:00
Guennadi Liakhovetski	83515bc7df	SH: fix error paths in DMA driver If channel allocation is failing, mark the channel unused and give PM a chance to power down the hardware. Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2010-04-26 16:04:48 +09:00
Magnus Damm	e3a4317e1d	sh: sh7751 pci controller io port fix This patch updates the sh7751 pci code to handle io ports correctly. The code is based on the sh7788x implementation. Tested on a R2D-1 board with CONFIG_8139TOO_PIO=y. Signed-off-by: Magnus Damm <damm@opensource.se> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2010-04-26 16:02:25 +09:00
Magnus Damm	43f5988c18	sh: Fix maximum number of SCIF ports in R2D defconfigs Update the R2D defconfigs to bump up the maximum number of SCIF ports on the system. Fixes a broken serial console regression added by `cd5f107628`. Reported-by: Shin-ichiro KAWASAKI <kawasaki@juno.dti.ne.jp> Signed-off-by: Magnus Damm <damm@opensource.se> Tested-by: Alexandre Courbot <alex@dcl.info.waseda.ac.jp> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2010-04-26 16:02:17 +09:00
Guennadi Liakhovetski	c2fe3092e5	SH: fix TS field shift calculation for DMA drivers CHCR_TS_HIGH_SHIFT is defined as a shift of TS high bits in CHCR register, relative to low bits. The TS_INDEX2VAL() macro has to take this into account. Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2010-04-26 16:02:09 +09:00
Herbert Xu	180ce7e810	crypto: authenc - Add EINPROGRESS check When Steffen originally wrote the authenc async hash patch, he correctly had EINPROGRESS checks in place so that we did not invoke the original completion handler with it. Unfortuantely I told him to remove it before the patch was applied. As only MAY_BACKLOG request completion handlers are required to handle EINPROGRESS completions, those checks are really needed. This patch restores them. Reported-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2010-04-26 09:14:05 +08:00
Linus Torvalds	b91ce4d14a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: ipv6: Fix inet6_csk_bind_conflict() e100: Fix the TX workqueue race	2010-04-25 16:28:56 -07:00
Eric Dumazet	6443bb1fc2	ipv6: Fix inet6_csk_bind_conflict() Commit `fda48a0d7a` (tcp: bind() fix when many ports are bound) introduced a bug on IPV6 part. We should not call ipv6_addr_any(inet6_rcv_saddr(sk2)) but ipv6_addr_any(inet6_rcv_saddr(sk)) because sk2 can be IPV4, while sk is IPV6. Reported-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-25 15:09:42 -07:00
Linus Torvalds	202f2bb070	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: Issue the discard operation before releasing the blocks to be reused ext4: Fix buffer head leaks after calls to ext4_get_inode_loc() ext4: Fix possible lost inode write in no journal mode	2010-04-25 10:01:51 -07:00
Jiri Pirko	b3c981d2bb	netns: rename unregister_pernet_subsys parameter Stay consistent with other functions and with comment also and name pernet_operations parameter properly. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-25 00:49:56 -07:00
Jörn Engel	5129a469a9	Catch filesystems lacking s_bdi noop_backing_dev_info is used only as a flag to mark filesystems that don't have any backing store, like tmpfs, procfs, spufs, etc. Signed-off-by: Joern Engel <joern@logfs.org> Changed the BUG_ON() to a WARN_ON(). Note that adding dirty inodes to the noop_backing_dev_info is not legal and will not result in them being flushed, but we already catch this condition in __mark_inode_dirty() when checking for a registered bdi. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2010-04-25 08:54:42 +02:00
Changli Gao	8c52d509e8	rps: optimize rps_get_cpu() optimize rps_get_cpu(). don't initialize ports when we can get the ports. one memory access for ports than two. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-24 22:50:10 -07:00
Alan Cox	401da6aea3	e100: Fix the TX workqueue race Nothing stops the workqueue being left to run in parallel with close or a few other operations. This causes double unmaps and the like. See kerneloops.org #1041230 for an example Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-24 21:09:29 -07:00
Stephen Hemminger	bf73130d7f	sky2: add support for receive hashing Sky2 hardware supports hardware receive hash calculation. Now that Receive Packet Steering is available, add support to enable it. This version does not depend on CONFIG_RPS. Also set_flags rejects all values except RXHASH, so driver won't have to change next time somebody adds a new one. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-24 20:04:12 -07:00
Phillip Lougher	e0d1f70010	squashfs: fix potential buffer over-run on 4K block file systems Sizing the buffer based on block size is incorrect, leading to a potential buffer over-run on 4K block size file systems (because the metadata block size is always 8K). This bug doesn't seem have triggered because 4K block size file systems are not default, and also because metadata blocks after compression tend to be less than 4K. Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>	2010-04-25 02:09:05 +01:00
Phillip Lougher	370ec3d1ed	squashfs: add missing buffer free Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>	2010-04-25 02:09:05 +01:00
Phillip Lougher	1cb08e9738	squashfs: fix warn_on when root inode is corrupted Fix warn_on triggered by mounting a fsfuzzer corrupted file system, where the root inode has been corrupted. Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk> Reported-by: Steve Grubb <sgrubb@redhat.com>	2010-04-25 01:49:17 +01:00
Linus Torvalds	ddc9b34c3b	Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] use max load in conservative governor [CPUFREQ] fix a lockdep warning	2010-04-24 11:35:21 -07:00
Linus Torvalds	8e500ff8df	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (22 commits) gianfar: Fix potential oops during OF address translation fsl_pq_mdio: Fix kernel oops during OF address translation tcp: bind() fix when many ports are bound rdma: potential ERR_PTR dereference rtnetlink: potential ERR_PTR dereference net: ipv6 bind to device issue ipv6: allow to send packet after receiving ICMPv6 Too Big message with MTU field less than IPV6_MIN_MTU drivers/net/usb: Add new driver ipheth cxgb3: fix linkup issue X25 fix dead unaccepted sockets KS8851: NULL pointer dereference if list is empty net: 3c574_cs fix stats.tx_bytes counter xfrm6: ensure to use the same dev when building a bundle can: Fix possible NULL pointer dereference in ems_usb.c net: Fix an RCU warning in dev_pick_tx() ipv6: Fix tcp_v6_send_response transport header setting. bridge: add a missing ntohs() 8139too: Fix a typo in the function name. mac80211: pass HT changes to driver when off channel mac80211: remove bogus TX agg state assignment ...	2010-04-24 11:34:17 -07:00
Linus Torvalds	383bee6b54	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: Ensure we re-enable devices on resume x86/PCI: parse additional host bridge window resource types PCI: revert broken device warning PCI aerdrv: use correct bit defines and add 2ms delay to aer_root_reset x86/PCI: ignore Consumer/Producer bit in ACPI window descriptions	2010-04-24 11:32:12 -07:00
Linus Torvalds	b39c8be6d5	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86: eeepc-laptop: add missing sparse_keymap_free eeepc-wmi: Build fix asus: don't modify bluetooth/wlan on boot dell-wmi: Fix memory leak eeepc-wmi: add backlight support eeepc-wmi: use a platform device as parent device of all sub-devices eeepc-wmi: add an eeepc_wmi context structure	2010-04-24 11:31:57 -07:00
Phillip Lougher	df37bd156d	initramfs: handle unrecognised decompressor when unpacking The unpack routine fails to handle the decompress_method() returning unrecognised decompressor (compress_name == NULL). This results in the routine looping eventually oopsing on an out of bounds memory access. Note this bug is usually hidden, only triggering on trailing junk after one or more correct compressed blocks. The case of the compressed archive being complete junk is (by accident?) caught by the if (state != Reset) check because state is initialised to Start, but not updated due to the decompressor not having been called. Obviously if the junk is trailing a correctly decompressed buffer, state == Reset from the previous call to the decompressor. Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk> Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-04-24 11:31:26 -07:00
Dan Carpenter	22eccdd7d2	ksm: check for ERR_PTR from follow_page() The follow_page() function can potentially return -EFAULT so I added checks for this. Also I silenced an uninitialized variable warning on my version of gcc (version 4.3.2). Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Izik Eidus <ieidus@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-04-24 11:31:26 -07:00
Dmitry Torokhov	453dc65931	VMware Balloon driver This is a standalone version of VMware Balloon driver. Ballooning is a technique that allows hypervisor dynamically limit the amount of memory available to the guest (with guest cooperation). In the overcommit scenario, when hypervisor set detects that it needs to shuffle some memory, it instructs the driver to allocate certain number of pages, and the underlying memory gets returned to the hypervisor. Later hypervisor may return memory to the guest by reattaching memory to the pageframes and instructing the driver to "deflate" balloon. We are submitting a standalone driver because KVM maintainer (Avi Kivity) expressed opinion (rightly) that our transport does not fit well into virtqueue paradigm and thus it does not make much sense to integrate with virtio. There were also some concerns whether current ballooning technique is the right thing. If there appears a better framework to achieve this we are prepared to evaluate and switch to using it, but in the meantime we'd like to get this driver upstream. We want to get the driver accepted in distributions so that users do not have to deal with an out-of-tree module and many distributions have "upstream first" requirement. The driver has been shipping for a number of years and users running on VMware platform will have it installed as part of VMware Tools even if it will not come from a distribution, thus there should not be additional risk in pulling the driver into mainline. The driver will only activate if host is VMware so everyone else should not be affected at all. Signed-off-by: Dmitry Torokhov <dtor@vmware.com> Cc: Avi Kivity <avi@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-04-24 11:31:26 -07:00
Anton Blanchard	b8af67e268	fs/block_dev.c: fix performance regression in O_DIRECT\|O_SYNC writes to block devices We are seeing a large regression in database performance on recent kernels. The database opens a block device with O_DIRECT\|O_SYNC and a number of threads write to different regions of the file at the same time. A simple test case is below. I haven't defined DEVICE since getting it wrong will destroy your data :) On an 3 disk LVM with a 64k chunk size we see about 17MB/sec and only a few threads in IO wait: procs -----io---- -system-- -----cpu------ r b bi bo in cs us sy id wa st 0 3 0 16170 656 2259 0 0 86 14 0 0 2 0 16704 695 2408 0 0 92 8 0 0 2 0 17308 744 2653 0 0 86 14 0 0 2 0 17933 759 2777 0 0 89 10 0 Most threads are blocking in vfs_fsync_range, which has: mutex_lock(&mapping->host->i_mutex); err = fop->fsync(file, dentry, datasync); if (!ret) ret = err; mutex_unlock(&mapping->host->i_mutex); commit `148f948ba8` (vfs: Introduce new helpers for syncing after writing to O_SYNC file or IS_SYNC inode) offers some explanation of what is going on: Use these new helpers for syncing from generic VFS functions. This makes O_SYNC writes to block devices acquire i_mutex for syncing. If we really care about this, we can make block_fsync() drop the i_mutex and reacquire it before it returns. Thanks Jan for such a good commit message! As well as dropping i_mutex, Christoph suggests we should remove the call to sync_blockdev(): > sync_blockdev is an overcomplicated alias for filemap_write_and_wait on > the block device inode, which is exactly what we did just before calling > into ->fsync The patch below incorporates both suggestions. With it the testcase improves from 17MB/s to 68M/sec: procs -----io---- -system-- -----cpu------ r b bi bo in cs us sy id wa st 0 7 0 65536 1000 3878 0 0 70 30 0 0 34 0 69632 1016 3921 0 1 46 53 0 0 57 0 69632 1000 3921 0 0 55 45 0 0 53 0 69640 754 4111 0 0 81 19 0 Testcase: #define _GNU_SOURCE #include <stdio.h> #include <pthread.h> #include <unistd.h> #include <stdlib.h> #include <string.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #define NR_THREADS 64 #define BUFSIZE (64 * 1024) #define DEVICE "/dev/mapper/XXXXXX" #define ALIGN(VAL, SIZE) (((VAL)+(SIZE)-1) & ~((SIZE)-1)) static int fd; static void doit(void arg) { unsigned long offset = (long)arg; char b, buf; b = malloc(BUFSIZE + 1024); buf = (char )ALIGN((unsigned long)b, 1024); memset(buf, 0, BUFSIZE); while (1) pwrite(fd, buf, BUFSIZE, offset); } int main(int argc, char argv[]) { int flags = O_RDWR\|O_DIRECT; int i; unsigned long offset = 0; if (argc > 1 && !strcmp(argv[1], "O_SYNC")) flags \|= O_SYNC; fd = open(DEVICE, flags); if (fd == -1) { perror("open"); exit(1); } for (i = 0; i < NR_THREADS-1; i++) { pthread_t tid; pthread_create(&tid, NULL, doit, (void )offset); offset += BUFSIZE; } doit((void )offset); return 0; } Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Jan Kara <jack@suse.cz> Cc: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-04-24 11:31:26 -07:00
Hans Verkuil	98d5ce0d00	lib/vsprintf.c: add missing EXPORT_SYMBOL(simple_strtoll) Add a missing EXPORT_SYMBOL. I must be the first person that wants to use this function :-) Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-04-24 11:31:26 -07:00
Amit Kucheria	81fa08f25b	w1: fix omap 1-wire driver compilation Fixes the following error: drivers/w1/masters/omap_hdq.c: In function 'hdq_wait_for_flag': drivers/w1/masters/omap_hdq.c:137: error: implicit declaration of function 'schedule_timeout_uninterruptible' drivers/w1/masters/omap_hdq.c: In function 'hdq_write_byte': drivers/w1/masters/omap_hdq.c:177: error: 'TASK_UNINTERRUPTIBLE' undeclared (first use in this function) drivers/w1/masters/omap_hdq.c:177: error: (Each undeclared identifier is reported only once drivers/w1/masters/omap_hdq.c:177: error: for each function it appears in.) drivers/w1/masters/omap_hdq.c:177: error: implicit declaration of function 'schedule_timeout' drivers/w1/masters/omap_hdq.c: In function 'hdq_isr': drivers/w1/masters/omap_hdq.c:221: error: 'TASK_NORMAL' undeclared (first use in this function) drivers/w1/masters/omap_hdq.c: In function 'omap_hdq_break': drivers/w1/masters/omap_hdq.c:316: error: 'TASK_UNINTERRUPTIBLE' undeclared (first use in this function) Signed-off-by: Amit Kucheria <amit.kucheria@canonical.com> Acked-by: Tony Lindgren <tony@atomide.com> Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-04-24 11:31:25 -07:00
Oleg Nesterov	31f2b0ebc0	rmap: anon_vma_prepare() can leak anon_vma_chain If find_mergeable_anon_vma() succeeds but another thread installs ->anon_vma before we take ptl, then allocated == NULL but avc should be freed. Change the code to check avc != NULL to detect this case. Also, a couple of whitespace changes to make the critical section more visible. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Pete Zaitcev <zaitcev@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-04-24 11:31:25 -07:00
David Howells	93b4a44f3a	keys: fix an RCU warning Fix the following RCU warning: =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- security/keys/request_key.c:116 invoked rcu_dereference_check() without protection! This was caused by doing: [root@andromeda ~]# keyctl newring fred @s 539196288 [root@andromeda ~]# keyctl request2 user a a 539196288 request_key: Required key not available Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-04-24 11:31:25 -07:00
Albin Tonnerre	ccdb40048b	lib: fix the use of LZO to decompress initramfs images This patch fixes 2 issues with the LZO decompressor: - It doesn't handle the case where a block isn't compressed at all. In this case, calling lzo1x_decompress_safe will fail, so we need to just use memcpy() instead (the upstream LZO code does something similar) - Since commit `54291362d2` ("initramfs: add missing decompressor error check") , the decompressor return code is checked in the init/initramfs.c The LZO decompressor didn't return the expected value, causing the initramfs code to falsely believe a decompression error occured Signed-off-by: Albin Tonnerre <albin.tonnerre@free-electrons.com> Tested-by: bert schulze <spambemyguest@googlemail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-04-24 11:31:25 -07:00

... 21 22 23 24 25 ...

192521 commits