Commit graph

234237 commits

Author SHA1 Message Date
Alexandre Bounine
fe41947e1a rapidio: fix sysfs config attribute to access 16MB of maint space
Fixes sysfs config attribute to allow access to entire 16MB maintenance
space of RapidIO devices.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Thomas Moll <thomas.moll@sysgo.com>
Cc: Micha Nelissen <micha@neli.hopto.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-25 15:07:37 -08:00
Alexander Gordeev
99b0d365e5 pps: initialize ts_real properly
Initialize ts_real.flags to fix compiler warning about possible
uninitialized use of this field.

Signed-off-by: Alexander Gordeev <lasaine@lvk.cs.msu.su>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Rodolfo Giometti <giometti@linux.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-25 15:07:37 -08:00
Hugh Dickins
e5598f8bf5 memcg: more mem_cgroup_uncharge() batching
It seems odd that truncate_inode_pages_range(), called not only when
truncating but also when evicting inodes, has mem_cgroup_uncharge_start
and _end() batching in its second loop to clear up a few leftovers, but
not in its first loop that does almost all the work: add them there too.

Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-25 15:07:37 -08:00
Andi Kleen
8eac563c1c thp: fix interleaving for transparent hugepages
The THP code didn't pass the correct interleaving shift to the memory
policy code.  Fix this here by adjusting for the order.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Acked-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-25 15:07:37 -08:00
Jan Kara
7137c6bd45 aio: fix race between io_destroy() and io_submit()
A race can occur when io_submit() races with io_destroy():

 CPU1						CPU2
io_submit()
  do_io_submit()
    ...
    ctx = lookup_ioctx(ctx_id);
						io_destroy()
    Now do_io_submit() holds the last reference to ctx.
    ...
    queue new AIO
    put_ioctx(ctx) - frees ctx with active AIOs

We solve this issue by checking whether ctx is being destroyed in AIO
submission path after adding new AIO to ctx.  Then we are guaranteed that
either io_destroy() waits for new AIO or we see that ctx is being
destroyed and bail out.

Cc: Nick Piggin <npiggin@kernel.dk>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-25 15:07:37 -08:00
Nick Piggin
3bd9a5d734 aio: fix rcu ioctx lookup
aio-dio-invalidate-failure GPFs in aio_put_req from io_submit.

lookup_ioctx doesn't implement the rcu lookup pattern properly.
rcu_read_lock does not prevent refcount going to zero, so we might take
a refcount on a zero count ioctx.

Fix the bug by atomically testing for zero refcount before incrementing.

[jack@suse.cz: added comment into the code]
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-25 15:07:37 -08:00
Namhyung Kim
29723fccc8 mm: fix dubious code in __count_immobile_pages()
When pfn_valid_within() failed 'iter' was incremented twice.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-25 15:07:37 -08:00
Lei Xu
a2d6d2fa90 drivers/rtc/rtc-ds3232.c: fix time range difference between linux and RTC chip
In linux rtc_time struct, tm_mon range is 0~11, tm_wday range is 0~6,
while in RTC HW REG, month range is 1~12, day of the week range is 1~7,
this patch adjusts difference of them.

The efect of this bug was that most of month will be operated on as the
next month by the hardware (When in Jan it maybe even worse).  For
example, if in May, software wrote 4 to the hardware, which handled it as
April.  Then the logic would be different between software and hardware,
which would cause weird things to happen.

Signed-off-by: Lei Xu <B33228@freescale.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Jack Lan <jack.lan@freescale.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-25 15:07:37 -08:00
Timo Warns
294f6cf486 ldm: corrupted partition table can cause kernel oops
The kernel automatically evaluates partition tables of storage devices.
The code for evaluating LDM partitions (in fs/partitions/ldm.c) contains
a bug that causes a kernel oops on certain corrupted LDM partitions.  A
kernel subsystem seems to crash, because, after the oops, the kernel no
longer recognizes newly connected storage devices.

The patch changes ldm_parse_vmdb() to Validate the value of vblk_size.

Signed-off-by: Timo Warns <warns@pre-sense.de>
Cc: Eugene Teo <eugeneteo@kernel.sg>
Acked-by: Richard Russon <ldm@flatcap.org>
Cc: Harvey Harrison <harvey.harrison@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-25 15:07:36 -08:00
Mel Gorman
2876592f23 mm: vmscan: stop reclaim/compaction earlier due to insufficient progress if !__GFP_REPEAT
should_continue_reclaim() for reclaim/compaction allows scanning to
continue even if pages are not being reclaimed until the full list is
scanned.  In terms of allocation success, this makes sense but potentially
it introduces unwanted latency for high-order allocations such as
transparent hugepages and network jumbo frames that would prefer to fail
the allocation attempt and fallback to order-0 pages.  Worse, there is a
potential that the full LRU scan will clear all the young bits, distort
page aging information and potentially push pages into swap that would
have otherwise remained resident.

This patch will stop reclaim/compaction if no pages were reclaimed in the
last SWAP_CLUSTER_MAX pages that were considered.  For allocations such as
hugetlbfs that use __GFP_REPEAT and have fewer fallback options, the full
LRU list may still be scanned.

Order-0 allocation should not be affected because RECLAIM_MODE_COMPACTION
is not set so the following avoids the gfp_mask being examined:

        if (!(sc->reclaim_mode & RECLAIM_MODE_COMPACTION))
                return false;

A tool was developed based on ftrace that tracked the latency of
high-order allocations while transparent hugepage support was enabled and
three benchmarks were run.  The "fix-infinite" figures are 2.6.38-rc4 with
Johannes's patch "vmscan: fix zone shrinking exit when scan work is done"
applied.

  STREAM Highorder Allocation Latency Statistics
                 fix-infinite     break-early
  1 :: Count            10298           10229
  1 :: Min             0.4560          0.4640
  1 :: Mean            1.0589          1.0183
  1 :: Max            14.5990         11.7510
  1 :: Stddev          0.5208          0.4719
  2 :: Count                2               1
  2 :: Min             1.8610          3.7240
  2 :: Mean            3.4325          3.7240
  2 :: Max             5.0040          3.7240
  2 :: Stddev          1.5715          0.0000
  9 :: Count           111696          111694
  9 :: Min             0.5230          0.4110
  9 :: Mean           10.5831         10.5718
  9 :: Max            38.4480         43.2900
  9 :: Stddev          1.1147          1.1325

Mean time for order-1 allocations is reduced.  order-2 looks increased but
with so few allocations, it's not particularly significant.  THP mean
allocation latency is also reduced.  That said, allocation time varies so
significantly that the reductions are within noise.

Max allocation time is reduced by a significant amount for low-order
allocations but reduced for THP allocations which presumably are now
breaking before reclaim has done enough work.

  SysBench Highorder Allocation Latency Statistics
                 fix-infinite     break-early
  1 :: Count            15745           15677
  1 :: Min             0.4250          0.4550
  1 :: Mean            1.1023          1.0810
  1 :: Max            14.4590         10.8220
  1 :: Stddev          0.5117          0.5100
  2 :: Count                1               1
  2 :: Min             3.0040          2.1530
  2 :: Mean            3.0040          2.1530
  2 :: Max             3.0040          2.1530
  2 :: Stddev          0.0000          0.0000
  9 :: Count             2017            1931
  9 :: Min             0.4980          0.7480
  9 :: Mean           10.4717         10.3840
  9 :: Max            24.9460         26.2500
  9 :: Stddev          1.1726          1.1966

Again, mean time for order-1 allocations is reduced while order-2
allocations are too few to draw conclusions from.  The mean time for THP
allocations is also slightly reduced albeit the reductions are within
varianes.

Once again, our maximum allocation time is significantly reduced for
low-order allocations and slightly increased for THP allocations.

  Anon stream mmap reference Highorder Allocation Latency Statistics
  1 :: Count             1376            1790
  1 :: Min             0.4940          0.5010
  1 :: Mean            1.0289          0.9732
  1 :: Max             6.2670          4.2540
  1 :: Stddev          0.4142          0.2785
  2 :: Count                1               -
  2 :: Min             1.9060               -
  2 :: Mean            1.9060               -
  2 :: Max             1.9060               -
  2 :: Stddev          0.0000               -
  9 :: Count            11266           11257
  9 :: Min             0.4990          0.4940
  9 :: Mean        27250.4669      24256.1919
  9 :: Max      11439211.0000    6008885.0000
  9 :: Stddev     226427.4624     186298.1430

This benchmark creates one thread per CPU which references an amount of
anonymous memory 1.5 times the size of physical RAM.  This pounds swap
quite heavily and is intended to exercise THP a bit.

Mean allocation time for order-1 is reduced as before.  It's also reduced
for THP allocations but the variations here are pretty massive due to
swap.  As before, maximum allocation times are significantly reduced.

Overall, the patch reduces the mean and maximum allocation latencies for
the smaller high-order allocations.  This was with Slab configured so it
would be expected to be more significant with Slub which uses these size
allocations more aggressively.

The mean allocation times for THP allocations are also slightly reduced.
The maximum latency was slightly increased as predicted by the comments
due to reclaim/compaction breaking early.  However, workloads care more
about the latency of lower-order allocations than THP so it's an
acceptable trade-off.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Acked-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-25 15:07:36 -08:00
Matti J. Aaltonen
ac3c830419 drivers/nfc/pn544.c: add missing regulator
The regulator framework is used for power management.  The regulators are
only named in the driver code, the actual control stuff is in the board
file for each architecture or use case.

The PN544 chip has three regulators that can be controlled or not -
depending on the architecture where the chip is being used.  So some of
the regulators may not be controllable.  In our current case the third
regulator, which was missing from the code, went unnoticed because we
didn't need to control it.  To be as general as possible - in this respect
- the driver needs to list all regulators.  Then the board file can be
used to actually set the usage.

Signed-off-by: Matti J. Aaltonen <matti.j.aaltonen@nokia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-25 15:07:36 -08:00
Matti J. Aaltonen
d73fa4b914 drivers/nfc/Kconfig: use full form of the NFC acronym
Spell out the NFC acronym when it's shown for the first time.

Signed-off-by: Matti J. Aaltonen <matti.j.aaltonen@nokia.com>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-25 15:07:36 -08:00
FUJITA Tomonori
fba99fa38b swiotlb: fix wrong panic
swiotlb's map_page wrongly calls panic() when it can't find a buffer fit
for device's dma mask.  It should return an error instead.

Devices with an odd dma mask (i.e.  under 4G) like b44 network card hit
this bug (the system crashes):

   http://marc.info/?l=linux-kernel&m=129648943830106&w=2

If swiotlb returns an error, b44 driver can use the own bouncing
mechanism.

Reported-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Tested-by: Arkadiusz Miskiewicz <arekm@maven.pl>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-25 15:07:36 -08:00
Harry Wei
f8407f26b4 MAINTAINERS: add Chinese documentation maintainer
I have translated some kernel documentation so I wish to maintain the
Chinese documentation in our kernel directories.

Signed-off-by: Harry Wei <harryxiyou@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Greg KH <greg@kroah.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-25 15:07:36 -08:00
Greg Thelen
a879bf582d mm: grab rcu read lock in move_pages()
The move_pages() usage of find_task_by_vpid() requires rcu_read_lock() to
prevent free_pid() from reclaiming the pid.

Without this patch, RCU warnings are printed in v2.6.38-rc4 move_pages()
with:

  CONFIG_LOCKUP_DETECTOR=y
  CONFIG_PREEMPT=y
  CONFIG_LOCKDEP=y
  CONFIG_PROVE_LOCKING=y
  CONFIG_PROVE_RCU=y

Previously, migrate_pages() went through a similar transformation
replacing usage of tasklist_lock with rcu read lock:

  commit 55cfaa3cbd
  Author: Zeng Zhaoming <zengzm.kernel@gmail.com>
  Date:   Thu Dec 2 14:31:13 2010 -0800

      mm/mempolicy.c: add rcu read lock to protect pid structure

  commit 1e50df39f6
  Author: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
  Date:   Thu Jan 13 15:46:14 2011 -0800

      mempolicy: remove tasklist_lock from migrate_pages

Signed-off-by: Greg Thelen <gthelen@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Zeng Zhaoming <zengzm.kernel@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-25 15:07:36 -08:00
Davide Libenzi
22bacca48a epoll: prevent creating circular epoll structures
In several places, an epoll fd can call another file's ->f_op->poll()
method with ep->mtx held.  This is in general unsafe, because that other
file could itself be an epoll fd that contains the original epoll fd.

The code defends against this possibility in its own ->poll() method using
ep_call_nested, but there are several other unsafe calls to ->poll
elsewhere that can be made to deadlock.  For example, the following simple
program causes the call in ep_insert recursively call the original fd's
->poll, leading to deadlock:

 #include <unistd.h>
 #include <sys/epoll.h>

 int main(void) {
     int e1, e2, p[2];
     struct epoll_event evt = {
         .events = EPOLLIN
     };

     e1 = epoll_create(1);
     e2 = epoll_create(2);
     pipe(p);

     epoll_ctl(e2, EPOLL_CTL_ADD, e1, &evt);
     epoll_ctl(e1, EPOLL_CTL_ADD, p[0], &evt);
     write(p[1], p, sizeof p);
     epoll_ctl(e1, EPOLL_CTL_ADD, e2, &evt);

     return 0;
 }

On insertion, check whether the inserted file is itself a struct epoll,
and if so, do a recursive walk to detect whether inserting this file would
create a loop of epoll structures, which could lead to deadlock.

[nelhage@ksplice.com: Use epmutex to serialize concurrent inserts]
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Nelson Elhage <nelhage@ksplice.com>
Reported-by: Nelson Elhage <nelhage@ksplice.com>
Tested-by: Nelson Elhage <nelhage@ksplice.com>
Cc: <stable@kernel.org>		[2.6.34+, possibly earlier]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-25 15:07:36 -08:00
Linus Torvalds
6366213ee3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6:
  regulator, mc13xxx: Remove pointless test for unsigned less than zero
  regulator: Fix warning with CONFIG_BUG disabled
2011-02-25 14:04:44 -08:00
Linus Torvalds
4660ba63f1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
  Btrfs: fix fiemap bugs with delalloc
  Btrfs: set FMODE_EXCL in btrfs_device->mode
  Btrfs: make btrfs_rm_device() fail gracefully
  Btrfs: Avoid accessing unmapped kernel address
  Btrfs: Fix BTRFS_IOC_SUBVOL_SETFLAGS ioctl
  Btrfs: allow balance to explicitly allocate chunks as it relocates
  Btrfs: put ENOSPC debugging under a mount option
2011-02-25 14:03:39 -08:00
Linus Torvalds
958ede7f1b Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86 quirk: Fix polarity for IRQ0 pin2 override on SB800 systems
  x86/mrst: Fix apb timer rating when lapic timer is used
  x86: Fix reboot problem on VersaLogic Menlow boards
2011-02-25 14:02:33 -08:00
Jelle Martijn Kok
d40358509e RTC: fix typo in drivers/rtc/rtc-at91sam9.c
The member of the rtc_class_ops struct is called alarm_irq_enable and
not alarm_irq_enabled

CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jelle Martijn Kok <jmkok@youcom.nl>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-25 14:00:56 -08:00
Hagen Paul Pfeifer
5aca1a9e88 net: handle addr_type of 0 properly
addr_type of 0 means that the type should be adopted from from_dev and
not from __hw_addr_del_multiple(). Unfortunately it isn't so and
addr_type will always be considered. Fix this by implementing the
considered and documented behavior.

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-25 13:58:54 -08:00
Tony Lindgren
02fa9f0451 Merge branch 'patches_for_2.6.38rc' of git://git.pwsan.com/linux-2.6 into devel-fixes 2011-02-25 12:27:14 -08:00
Jussi Kivilinna
63453c05da rndis_wlan: use power save only for BCM4320b
BCM4320a breaks when enabling power save (bug 29732). So disable power save
for anything but BCM4320b that is known to work.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-02-25 15:21:51 -05:00
Jan Puk
c86664e5a2 carl9170: add Airlive X.USB a/b/g/n USBID
"AirLive X.USB now works perfectly under a Linux
environment!"

Cc: <stable@kernel.org>
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-02-25 15:21:50 -05:00
Stanislaw Gruszka
385918cc6a ath9k: correct ath9k_hw_set_interrupts
Commit 4df3071ebd "ath9k_hw: optimize
interrupt mask changes", changed ath9k_hw_set_interrupts function to
enable interrupts regardless of function argument, what could possibly
be wrong. Correct that behaviour and check "ints" arguments before
enabling interrupts, also disable interrupts if ints do not have
ATH9K_INT_GLOBAL flag set.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-02-25 15:21:50 -05:00
John W. Linville
79ae79c9aa Merge branch 'wireless-2.6' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-2.6 2011-02-25 15:14:33 -05:00
Santosh Shilimkar
51c404b2c5 omap4: prcm: Fix the CPUx clockdomain offsets
CPU0 and CPU1 clockdomain is at the offset of 0x18 from the LPRM base.
The header file has set it wrongly to 0x0. Offset 0x0 is for CPUx power
domain control register

Fix the same.

The autogen scripts is fixed thanks to Benoit Cousson

With the old value, the clockdomain code would access the
*_PWRSTCTRL.POWERSTATE field when it thought it was accessing the
*_CLKSTCTRL.CLKTRCTRL field.  In the worst case, this could cause
system power management to behave incorrectly.

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Cc: Paul Walmsley <paul@pwsan.com>
Cc: Rajendra Nayak <rnayak@ti.com>
Cc: Benoit Cousson <b-cousson@ti.com>
[paul@pwsan.com: added second paragraph to commit message]
Signed-off-by: Paul Walmsley <paul@pwsan.com>
2011-02-25 12:45:05 -07:00
Linus Torvalds
c1bc3beb06 Merge branch 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
* 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
  usb: musb: core: set has_tt flag
  USB: xhci: mark local functions as static
  USB: xhci: fix couple sparse annotations
  USB: xhci: rework xhci_print_ir_set() to get ir set from xhci itself
  USB: Reset USB 3.0 devices on (re)discovery
  xhci: Fix an error in count_sg_trbs_needed()
  xhci: Fix errors in the running total calculations in the TRB math
  xhci: Clarify some expressions in the TRB math
  xhci: Avoid BUG() in interrupt context
2011-02-25 11:14:44 -08:00
Linus Torvalds
638691a7a4 Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md:
  md: Fix - again - partition detection when array becomes active
  Fix over-zealous flush_disk when changing device size.
  md: avoid spinlock problem in blk_throtl_exit
  md: correctly handle probe of an 'mdp' device.
  md: don't set_capacity before array is active.
  md: Fix raid1->raid0 takeover
2011-02-25 11:13:26 -08:00
Anton Blanchard
0a93ea2e89 RxRPC: Allocate tokens with kzalloc to avoid oops in rxrpc_destroy
With slab poisoning enabled, I see the following oops:

  Unable to handle kernel paging request for data at address 0x6b6b6b6b6b6b6b73
  ...
  NIP [c0000000006bc61c] .rxrpc_destroy+0x44/0x104
  LR [c0000000006bc618] .rxrpc_destroy+0x40/0x104
  Call Trace:
  [c0000000feb2bc00] [c0000000006bc618] .rxrpc_destroy+0x40/0x104 (unreliable)
  [c0000000feb2bc90] [c000000000349b2c] .key_cleanup+0x1a8/0x20c
  [c0000000feb2bd40] [c0000000000a2920] .process_one_work+0x2f4/0x4d0
  [c0000000feb2be00] [c0000000000a2d50] .worker_thread+0x254/0x468
  [c0000000feb2bec0] [c0000000000a868c] .kthread+0xbc/0xc8
  [c0000000feb2bf90] [c000000000020e00] .kernel_thread+0x54/0x70

We aren't initialising token->next, but the code in destroy_context relies
on the list being NULL terminated. Use kzalloc to zero out all the fields.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-25 11:12:37 -08:00
Anton Blanchard
f129ccc923 afs: Fix oops in afs_unlink_writeback
I'm seeing the following oops when testing afs:

  Unable to handle kernel paging request for data at address 0x00000008
  ...
  NIP [c0000000003393b0] .afs_unlink_writeback+0x38/0xc0
  LR [c00000000033987c] .afs_put_writeback+0x98/0xec
  Call Trace:
  [c00000000345f600] [c00000000033987c] .afs_put_writeback+0x98/0xec
  [c00000000345f690] [c00000000033ae80] .afs_write_begin+0x6a4/0x75c
  [c00000000345f790] [c00000000012b77c] .generic_file_buffered_write+0x148/0x320
  [c00000000345f8d0] [c00000000012e1b8] .__generic_file_aio_write+0x37c/0x3e4
  [c00000000345f9d0] [c00000000012e2a8] .generic_file_aio_write+0x88/0xfc
  [c00000000345fa90] [c0000000003390a8] .afs_file_write+0x10c/0x178
  [c00000000345fb40] [c000000000188788] .do_sync_write+0xc4/0x128
  [c00000000345fcc0] [c000000000189658] .vfs_write+0xe8/0x1d8
  [c00000000345fd70] [c000000000189884] .SyS_write+0x68/0xb0
  [c00000000345fe30] [c000000000008564] syscall_exit+0x0/0x40

afs_write_begin hits an error and calls afs_unlink_writeback. In there
we do list_del_init on an uninitialised list.

The patch below initialises ->link when creating the afs_writeback struct.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-25 11:12:37 -08:00
Lucian Adrian Grijincu
c486da3439 sysctl: ipv6: use correct net in ipv6_sysctl_rtcache_flush
Before this patch issuing these commands:

  fd = open("/proc/sys/net/ipv6/route/flush")
  unshare(CLONE_NEWNET)
  write(fd, "stuff")

would flush the newly created net, not the original one.

The equivalent ipv4 code is correct (stores the net inside ->extra1).
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-25 11:01:56 -08:00
Ian Campbell
b056b6a014 xen: suspend: remove xen_hvm_suspend
It is now identical to xen_suspend, the differences are encapsulated
in the suspend_info struct.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-02-25 16:43:14 +00:00
Ian Campbell
55fb4acef7 xen: suspend: pull pre/post suspend hooks out into suspend_info
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-02-25 16:43:13 +00:00
Ian Campbell
07af38102f xen: suspend: move arch specific pre/post suspend hooks into generic hooks
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-02-25 16:43:13 +00:00
Ian Campbell
82043bb60d xen: suspend: refactor non-arch specific pre/post suspend hooks
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-02-25 16:43:12 +00:00
Ian Campbell
03c8142bd2 xen: suspend: add "arch" to pre/post suspend hooks
xen_pre_device_suspend is unused on ia64.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-02-25 16:43:12 +00:00
Ian Campbell
36b401e2c2 xen: suspend: pass extra hypercall argument via suspend_info struct
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-02-25 16:43:11 +00:00
Ian Campbell
ceb1802947 xen: suspend: refactor cancellation flag into a structure
Will add extra fields in subsequent patches.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-02-25 16:43:11 +00:00
Ian Campbell
bd1c0ad284 xen: suspend: use HYPERVISOR_suspend for PVHVM case instead of open coding
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-02-25 16:43:11 +00:00
Ian Campbell
a8b7458363 xen: switch to new schedop hypercall by default.
Rename old interface to sched_op_compat and rename sched_op_new to
simply sched_op.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-02-25 16:43:10 +00:00
Ian Campbell
8e15597fa4 xen: use new schedop interface for suspend
Take the opportunity to comment on the semantics of the PV guest
suspend hypercall arguments.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-02-25 16:43:10 +00:00
Ian Campbell
552717231e xen: do not respond to unknown xenstore control requests
The PV xenbus control/shutdown node is written by the toolstack as a
request to the guest to perform a particular action (shutdown, reboot,
suspend etc). The guest is expected to acknowledge that it will
complete a request by clearing the control node.

Previously it would acknowledge any request, even if it did not know
what to do with it. Specifically in the case where CONFIG_PM_SLEEP is
not enabled the kernel would acknowledge a suspend request even though
it was not actually going to do anything.

Instead make the kernel only acknowledge requests if it is actually
going to do something with it. This will improve the toolstack's
ability to diagnose and deal with failures.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-02-25 16:43:09 +00:00
Stefano Stabellini
e057a4b6e0 xen: fix compile issue if XEN is enabled but XEN_PVHVM is disabled
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2011-02-25 16:43:06 +00:00
Stefano Stabellini
99bbb3a84a xen: PV on HVM: support PV spinlocks and IPIs
Initialize PV spinlocks on boot CPU right after native_smp_prepare_cpus
(that switch to APIC mode and initialize APIC routing); on secondary
CPUs on CPU_UP_PREPARE.

Enable the usage of event channels to send and receive IPIs when
running as a PV on HVM guest.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2011-02-25 16:43:06 +00:00
Stefano Stabellini
53d5522cad xen: make the ballon driver work for hvm domains
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2011-02-25 16:43:05 +00:00
Stefano Stabellini
c80a420995 xen-blkfront: handle Xen major numbers other than XENVBD
This patch makes sure blkfront handles correctly virtual device numbers
corresponding to Xen emulated IDE and SCSI disks: in those cases
blkfront translates the major number to XENVBD and the minor number to a
low xvd minor.

Note: this behaviour is different from what old xenlinux PV guests used
to do: they used to steal an IDE or SCSI major number and use it
instead.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
2011-02-25 16:43:05 +00:00
Stefano Stabellini
cff520b9c2 xen: do not use xen_info on HVM, set pv_info name to "Xen HVM"
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
2011-02-25 16:43:04 +00:00
Stefano Stabellini
702d4eb9b3 xen: no need to delay xen_setup_shutdown_event for hvm guests anymore
Now that xenstore_ready is used correctly for PV on HVM guests too, we
don't need to delay the initialization of xen_setup_shutdown_event
anymore.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
2011-02-25 16:43:03 +00:00
Miklos Szeredi
8d56addd70 fuse: fix truncate after open
Commit e1181ee6 "vfs: pass struct file to do_truncate on O_TRUNC
opens" broke the behavior of open(O_TRUNC|O_RDONLY) in fuse.  Fuse
assumed that when called from open, a truncate() will be done, not an
ftruncate().

Fix by restoring the old behavior, based on the ATTR_OPEN flag.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2011-02-25 14:44:58 +01:00