Commit graph

190031 commits

Author SHA1 Message Date
Jean Delvare
8acf07c5a7 hwmon: (it87) Properly handle wrong sensor type requests
Currently, if someone tries to set the thermal sensor type to an
unsupported value, subsequent accesses to the chip may temporarily
show the sensor in question as disabled. Use a temporary variable
and only update the cached value on success, to prevent such
confusion.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
2010-04-14 16:14:09 +02:00
Jean Delvare
a00afb97e2 hwmon: (it87) Don't arbitrarily enable temperature channels
Temperature channels can be used in 2 different modes (thermistor and
thermal diode) and we don't know which one, if any, is correct for
every given board. So don't arbitrarily choose one. Instead, leave the
temperature channels untouched. They can be configured from user-space
if needed anyway.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
2010-04-14 16:14:09 +02:00
Jean Delvare
c7a78d2c2e hwmon: (sht15) Properly handle the case CONFIG_REGULATOR=n
When CONFIG_REGULATOR isn't set, regulator_get_voltage() returns 0.
Properly handle this case by not trusting the value.

Reported-by: Jerome Oufella <jerome.oufella@savoirfairelinux.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Jonathan Cameron <jic23@cam.ac.uk>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: stable@kernel.org
2010-04-14 16:14:08 +02:00
Jerome Oufella
328a2c22ab hwmon: (sht15) Fix sht15_calc_temp interpolation function
I discovered two issues.
First the previous sht15_calc_temp() loop did not iterate through the
temppoints array since the (data->supply_uV > temppoints[i - 1].vdd)
test is always true in this direction.

Also the two-points linear interpolation function was returning biased
values due to a stray division by 1000 which shouldn't be there.

[JD: Also change the default value for d1 from 0 to something saner.]

Signed-off-by: Jerome Oufella <jerome.oufella@savoirfairelinux.com>
Acked-by: Jonathan Cameron <jic23@cam.ac.uk>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: stable@kernel.org
2010-04-14 16:14:07 +02:00
Takashi Iwai
3d83e577a8 ALSA: hda - Avoid invalid "Independent HP" control for VIA codecs
Some VIA codecs have no multiple source selection for headphone pins,
thus it's useless (and wrong) to create "Independent HP" control on them.

This patch adds the check of connections to skip the control in such a
case.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2010-04-14 14:36:23 +02:00
Takashi Iwai
b331439dfd ALSA: hda - Fix control element allocations in VIA codec parser
The commit 5b0cb1d850
    ALSA: hda - add more NID->Control mapping
breaks the control element allocation by returning a wrong value.
Let's fix it.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2010-04-14 14:35:11 +02:00
Rusty Russell
091ebf07a2 lguest: stop using KVM hypercall mechanism
This is a partial revert of 4cd8b5e2a1 "lguest: use KVM hypercalls";
we revert to using (just as questionable but more reliable) int $15 for
hypercalls.  I didn't revert the register mapping, so we still use the
same calling convention as kvm.

KVM in more recent incarnations stopped injecting a fault when a guest
tried to use the VMCALL instruction from ring 1, so lguest under kvm
fails to make hypercalls.  It was nice to share code with our KVM
cousins, but this was overreach.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Matias Zabaljauregui <zabaljauregui@gmail.com>
Cc: Avi Kivity <avi@redhat.com>
2010-04-14 21:43:56 +09:30
Rusty Russell
5094aeafbb lguest: workaround cmpxchg8b_emu by ignoring cli in the guest.
It's only used by cmpxchg8b_emu (see db677ffa5f for the gory
details), and fixing that to be paravirt aware would be more work than
simply ignoring it (and AFAICT only help lguest).  This makes lguest
work on machines which have cmpxchg8b, for kernels compiled for older
processors.

(We can't emulate it properly: the popf which expects to restore interrupts
does not trap).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: virtualization@lists.osdl.org
2010-04-14 21:43:54 +09:30
Dmitry Torokhov
afb567e3fd Revert "Input: wacom - merge out and in prox events"
This reverts commit 776943fd6f as it
causes issues with ISDv4 E3 touchscreens:

	https://bugzilla.kernel.org/show_bug.cgi?id=15670

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2010-04-13 23:08:58 -07:00
Linus Torvalds
2ba3abd818 Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6
* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
  PM / Hibernate: user.c, fix SNAPSHOT_SET_SWAP_AREA handling
2010-04-13 17:49:48 -07:00
Linus Torvalds
0fdfe5ad28 Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFSv4: fix delegated locking
  NFS: Ensure that the WRITE and COMMIT RPC calls are always uninterruptible
  NFS: Fix a race with the new commit code
  NFS: Ensure that writeback_single_inode() calls write_inode() when syncing
  NFS: Fix the mode calculation in nfs_find_open_context
  NFSv4: Fall back to ordinary lookup if nfs4_atomic_open() returns EISDIR
2010-04-13 15:10:16 -07:00
Sage Weil
a6a5349d17 ceph: use separate class for ceph sockets' sk_lock
Use a separate class for ceph sockets to prevent lockdep confusion.
Because ceph sockets only get passed kernel pointers, there is no
dependency from sk_lock -> mmap_sem.  If we share the same class as other
sockets, lockdep detects a circular dependency from

	mmap_sem (page fault) -> fs mutex -> sk_lock -> mmap_sem

because dependencies are noted from both ceph and user contexts.  Using
a separate class prevents the sk_lock(ceph) -> mmap_sem dependency and
makes lockdep happy.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-04-13 14:07:07 -07:00
Yehuda Sadeh
e1e4dd0caa ceph: reserve one more caps space when doing readdir
We were missing space for the directory cap.  The result was a BUG at
fs/ceph/caps.c:2178.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
2010-04-13 12:28:54 -07:00
Sage Weil
fc837c8f04 ceph: queue_cap_snap should always queue dirty context
This simplifies the calling convention, and fixes a bug where we queue a
capsnap with a context other than i_head_snapc (the one that matches the
dirty pages).  The result was a BUG at fs/ceph/caps.c:2178 on writeback
completion when a capsnap matching the writeback snapc could not be found.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-04-13 12:28:31 -07:00
Linus Torvalds
44d2d371d2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc64: Add some more commentary to __raw_local_irq_save()
  sparc64: Fix memory leak in pci_register_iommu_region().
  sparc64: Add kmemleak annotation to sun4v_build_virq()
  sparc64: Support kmemleak.
  sparc64: Add function graph tracer support.
  sparc64: Give a stack frame to the ftrace call sites.
  sparc64: Use a seperate counter for timer interrupts and NMI checks, like x86.
  sparc64: Remove profiling from some low-level bits.
  sparc64: Kill unnecessary static on local var in ftrace_call_replace().
  sparc64: Kill CONFIG_STACK_DEBUG code.
  sparc64: Add HAVE_FUNCTION_TRACE_MCOUNT_TEST and tidy up.
  sparc64: Adjust __raw_local_irq_save() to cooperate in NMIs.
  sparc64: Use kstack_valid() in die_if_kernel().
2010-04-13 11:34:05 -07:00
Linus Torvalds
465de2ba71 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (25 commits)
  smc91c92_cs: define multicast_table as unsigned char
  can: avoids a false warning
  e1000e: stop cleaning when we reach tx_ring->next_to_use
  igb: restrict WoL for 82576 ET2 Quad Port Server Adapter
  virtio_net: missing sg_init_table
  Revert "tcp: Set CHECKSUM_UNNECESSARY in tcp_init_nondata_skb"
  iwlwifi: need check for valid qos packet before free
  tcp: Set CHECKSUM_UNNECESSARY in tcp_init_nondata_skb
  udp: fix for unicast RX path optimization
  myri10ge: fix rx_pause in myri10ge_set_pauseparam
  net: corrected documentation for hardware time stamping
  stmmac: use resource_size()
  x.25 attempts to negotiate invalid throughput
  x25: Patch to fix bug 15678 - x25 accesses fields beyond end of packet.
  bridge: Fix IGMP3 report parsing
  cnic: Fix crash during bnx2x MTU change.
  qlcnic: fix set mac addr
  r6040: fix r6040_multicast_list
  vhost-net: fix vq_memory_access_ok error checking
  ath9k: fix double calls to ath_radio_enable
  ...
2010-04-13 11:32:48 -07:00
Ingo Molnar
2b2f862ee6 Merge branch 'iommu/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into x86/urgent 2010-04-13 13:24:54 +02:00
Ken Kawasaki
a6d37024de smc91c92_cs: define multicast_table as unsigned char
smc91c92_cs:
  * define multicast_table as unsigned char
  * remove unnecessary "#ifndef final_version"

Signed-off-by: Ken Kawasaki <ken_kawasaki@spring.nifty.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-13 03:03:16 -07:00
Eric Dumazet
4ffa87012e can: avoids a false warning
At this point optlen == sizeof(sfilter) but some compilers are dumb.

Reported-by: Németh Márton <nm127@freemail.h
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-13 03:03:14 -07:00
Terry Loftin
dac876193c e1000e: stop cleaning when we reach tx_ring->next_to_use
Tx ring buffers after tx_ring->next_to_use are volatile and could
change, possibly causing a crash.  Stop cleaning when we hit
tx_ring->next_to_use.

Signed-off-by: Terry Loftin <terry.loftin@hp.com>
Acked-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-13 03:03:13 -07:00
Stefan Assmann
d5aa22520d igb: restrict WoL for 82576 ET2 Quad Port Server Adapter
Restrict Wake-on-LAN to first port on 82576 ET2 quad port NICs, as it is
only supported there.

Signed-off-by: Stefan Assmann <sassmann@redhat.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-13 03:03:12 -07:00
Takashi Iwai
96d9e9c039 Merge branch 'fix/misc' into topic/misc 2010-04-13 11:14:43 +02:00
David S. Miller
c011f80ba0 sparc64: Add some more commentary to __raw_local_irq_save()
Suggested by Peter Zijlstra

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-13 01:50:43 -07:00
Philby John
b68b58fd6a ALSA: aaci - Fix alignment faults on ARM Cortex introduced by commit 29a4f2d3
The commit 29a4f2d3 used writel() at offset 0x26 which is
half-word aligned causing unaligned exceptions on a
Cortex-A8. The original patch solved the "aaci-pl041 fpga:04:
ac97 read back fail" issue on a soft reset. Reading from any
arbitrary aaci register seems to solve this issue.

Signed-off-by: Philby John <pjohn@mvista.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2010-04-13 09:46:55 +02:00
David S. Miller
9343af084c Merge branch 'master' of /home/davem/src/GIT/linux-2.6/
Conflicts:
	lib/Kconfig.debug
2010-04-13 00:28:45 -07:00
David S. Miller
e182c77cc2 sparc64: Fix memory leak in pci_register_iommu_region().
Found by kmemleak.

If request_resource() fails, we leak the struct resource we
allocated to represent the IOMMU mapping area.

This actually happens on sun4v machines because the IOMEM area is only
reported sans the IOMMU region, unlike all previous systems.  I'll
need to fix that at some point, but for now fix the leak.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-12 23:46:18 -07:00
David S. Miller
25ad403f67 sparc64: Add kmemleak annotation to sun4v_build_virq()
The only reference we store to this memory is in the form of a
physical address, so kmemleak can't see it.

Add a kmemleak_not_leak() annotation.

It's probably useful to be able to look at a dump of these things
either via debugfs or similar, and thus we could at some point store
them in some kind of table and therefore get rid of this annotation.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-12 23:46:17 -07:00
David S. Miller
8b8d8e2840 sparc64: Support kmemleak.
Only missing thing was an _sdata marker in vmlinux.lds.S

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-12 23:46:17 -07:00
David S. Miller
9960e9e894 sparc64: Add function graph tracer support.
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-12 22:37:26 -07:00
David S. Miller
a71d1d6bb1 sparc64: Give a stack frame to the ftrace call sites.
It's the only way we'll be able to implement the function
graph tracer properly.

A positive is that we no longer have to worry about the
linker over-optimizing the tail call, since we don't
use a tail call any more.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-12 22:37:15 -07:00
David S. Miller
daecbf58a5 sparc64: Use a seperate counter for timer interrupts and NMI checks, like x86.
This keeps us from having to use kstat_irqs_cpu() from the NMI handler,
the former of which is a profiled function.

Instead we use a currently empty slot in the cpu_data

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-12 22:37:07 -07:00
David S. Miller
f8e8a8e8cb sparc64: Remove profiling from some low-level bits.
These include the timer implementation, perf events support, and the
performance counter register (pcr) programming layer.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-12 22:36:19 -07:00
David S. Miller
d96478d5a2 sparc64: Kill unnecessary static on local var in ftrace_call_replace().
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-12 22:36:10 -07:00
David S. Miller
ddacd0bc70 sparc64: Kill CONFIG_STACK_DEBUG code.
The generic stack tracer does this job just as well.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-12 22:36:03 -07:00
David S. Miller
63b7549573 sparc64: Add HAVE_FUNCTION_TRACE_MCOUNT_TEST and tidy up.
Check function_trace_stop at ftrace_caller

Toss mcount_call and dummy call of ftrace_stub, unnecessary.

Document problems we'll have if the final kernel image link
ever turns on relaxation.

Properly size 'ftrace_call' so it looks right when inspecting
instructions under gdb et al.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-12 22:35:24 -07:00
David S. Miller
0c25e9e6cb sparc64: Adjust __raw_local_irq_save() to cooperate in NMIs.
If we are in an NMI then doing a plain raw_local_irq_disable() will
write PIL_NORMAL_MAX into %pil, which is lower than PIL_NMI, and thus
we'll re-enable NMIs and recurse.

Doing a simple:

	%pil = %pil | PIL_NORMAL_MAX

does what we want, if we're already at PIL_NMI (15) we leave it at
that setting, else we set it to PIL_NORMAL_MAX (14).

This should get the function tracer working on sparc64.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-12 22:21:52 -07:00
David S. Miller
cb256aa604 sparc64: Use kstack_valid() in die_if_kernel().
This gets rid of a local function (is_kernel_stack()) which tries to
do the same thing, yet poorly in that it doesn't handle IRQ stacks
properly.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-12 22:16:22 -07:00
Shirley Ma
0e413f22e4 virtio_net: missing sg_init_table
Add missing sg_init_table for sg_set_buf in virtio_net which
induced in defer skb patch.

Reported-by: Thomas Müller <thomas@mathtm.de>
Tested-by: Thomas Müller <thomas@mathtm.de>
Signed-off-by: Shirley Ma <xma@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-12 22:00:34 -07:00
Linus Torvalds
0d0fb0f9c5 Linux 2.6.34-rc4 2010-04-12 18:41:35 -07:00
Linus Torvalds
64a8920fab Merge branch 'anonvma'
* anonvma:
  anonvma: when setting up page->mapping, we need to pick the _oldest_ anonvma
  anon_vma: clone the anon_vma chain in the right order
  vma_adjust: fix the copying of anon_vma chains
  Simplify and comment on anon_vma re-use for anon_vma_prepare()
2010-04-12 18:39:58 -07:00
Linus Torvalds
50b88c46f0 Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm: (21 commits)
  ARM: Fix ioremap_cached()/ioremap_wc() for SMP platforms
  ARM: 6043/1: AT91 slow-clock resume: Don't wait for a disabled PLL to lock
  ARM: 6031/1: fix Thumb-2 decompressor
  ARM: 6029/1: ep93xx: gpio.c: local functions should be static
  ARM: 6028/1: ARM: add MAINTAINERS for U300
  ARM: 6024/1: bcmring: fix missing down on semaphore in dma.c
  MXC: mach_armadillo5x0: Add USB Host support.
  ARM mach-mx3: duplicated include
  ARM mach-mx3: duplicated include
  imx31: add watchdog device on litekit board.
  imx3: Add watchdog platform device support
  MXC: mach-mx31_3ds: add support for freescale mc13783 power management device.
  MXC: mach-mx31_3ds: Add SPI1 device support.
  MXC: mach-mx31_3ds: Add support for on board NAND Flash.
  MXC: mach-mx31_3ds: Update variable names over recent mach name modification.
  imx31: fix parent clock for rtc
  i.MX51: remove NFC AXI static mapping
  i.MX51: determine silicon revision dynamically
  i.MX51: map TZIC dynamically
  i.MX51: Use correct clock for gpt
  ...
2010-04-12 18:37:34 -07:00
Linus Torvalds
d6cf853d4d Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
  Btrfs: make sure the chunk allocator doesn't create zero length chunks
  Btrfs: fix data enospc check overflow
2010-04-12 18:37:04 -07:00
Linus Torvalds
6a945f38be Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6:
  quota: Fix possible dq_flags corruption
  quota: Hide warnings about writes to the filesystem before quota was turned on
  ext3: symlink must be handled via filesystem specific operation
  ext2: symlink must be handled via filesystem specific operation
2010-04-12 18:36:49 -07:00
Linus Torvalds
50fc88cb03 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6:
  udf: add speciffic ->setattr callback
  udf: potential integer overflow
2010-04-12 18:36:34 -07:00
Linus Torvalds
4505a49389 Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: (36 commits)
  MIPS: Calculate proper ebase value for 64-bit kernels
  MIPS: Alchemy: DB1200: Remove custom wait implementation
  MIPS: Big Sur: Make defconfig more useful.
  MIPS: Fix __vmalloc() etc. on MIPS for non-GPL modules
  MIPS: Sibyte: Fix M3 TLB exception handler workaround.
  MIPS: BCM63xx: Fix build failure in board_bcm963xx.c
  MIPS: uasm: Add OR instruction.
  MIPS: Sibyte: Apply M3 workaround only on affected chip types and versions.
  MIPS: BCM63xx: Initialize gpio_out_low & out_high to current value at boot.
  MIPS: BCM63xx: Register SSB SPROM fallback in board's first stage callback
  MIPS: BCM63xx: Fix typo in cpu-feature-overrides file.
  MIPS: BCM63xx: Add support for second uart.
  MIPS: BCM63xx: Fix double gpio registration.
  MIPS: BCM63xx: Add DWVS0 board
  MIPS: BCM63xx: Add the RTA1025W-16 BCM6348-based board to suppported boards.
  MIPS: BCM63xx: Fix BCM6338 and BCM6345 gpio count
  MIPS: libgcc.h: Checkpatch cleanup
  MIPS: Loongson-2F: Flush the branch target history in BTB and RAS
  MIPS: Move signal trampolines off of the stack.
  MIPS: Preliminary VDSO
  ...
2010-04-12 18:36:11 -07:00
Linus Torvalds
fedfb947b2 Merge branch 'for-2.6.34' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.34' of git://linux-nfs.org/~bfields/linux:
  svcrdma: RDMA support not yet compatible with RPC6
2010-04-12 18:34:56 -07:00
Linus Torvalds
44fa2b4bee Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
  nilfs2: fix typo "numer" -> "number" in alloc.c
  nilfs2: Remove an uninitialization warning in nilfs_btree_propagate_v()
  nilfs2: fix a wrong type conversion in nilfs_ioctl()
2010-04-12 18:34:25 -07:00
Linus Torvalds
ea90002b0f anonvma: when setting up page->mapping, we need to pick the _oldest_ anonvma
Otherwise we might be mapping in a page in a new mapping, but that page
(through the swapcache) would later be mapped into an old mapping too.
The page->mapping must be the case that works for everybody, not just
the mapping that happened to page it in first.

Here's the scenario:

 - page gets allocated/mapped by process A. Let's call the anon_vma we
   associate the page with 'A' to keep it easy to track.

 - Process A forks, creating process B. The anon_vma in B is 'B', and has
   a chain that looks like 'B' -> 'A'. Everything is fine.

 - Swapping happens. The page (with mapping pointing to 'A') gets swapped
   out (perhaps not to disk - it's enough to assume that it's just not
   mapped any more, and lives entirely in the swap-cache)

 - Process B pages it in, which goes like this:

        do_swap_page ->
          page = lookup_swap_cache(entry);
         ...
          set_pte_at(mm, address, page_table, pte);
          page_add_anon_rmap(page, vma, address);

   And think about what happens here!

   In particular, what happens is that this will now be the "first"
   mapping of that page, so page_add_anon_rmap() used to do

        if (first)
                __page_set_anon_rmap(page, vma, address);

   and notice what anon_vma it will use? It will use the anon_vma for
   process B!

   What happens then? Trivial: process 'A' also pages it in (nothing
   happens, it's not the first mapping), and then process 'B' execve's
   or exits or unmaps, making anon_vma B go away.

   End result: process A has a page that points to anon_vma B, but
   anon_vma B does not exist any more.  This can go on forever.  Forget
   about RCU grace periods, forget about locking, forget anything like
   that.  The bug is simply that page->mapping points to an anon_vma
   that was correct at one point, but was _not_ the one that was shared
   by all users of that possible mapping.

Changing it to always use the deepest anon_vma in the anonvma chain gets
us to the safest model.

This can be improved in certain cases: if we know the page is private to
just this particular mapping (for example, it's a new page, or it is the
only swapcache entry), we could pick the top (most specific) anon_vma.

But that's a future optimization. Make it _work_ reliably first.

Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Borislav Petkov <bp@alien8.de> [ "What do you know, I think you fixed it!" ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-12 17:54:13 -07:00
Linus Torvalds
646d87b481 anon_vma: clone the anon_vma chain in the right order
We want to walk the chain in reverse order when cloning it, so that the
order of the result chain will be the same as the order in the source
chain.  When we add entries to the chain, they go at the head of the
chain, so we want to add the source head last.

Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Borislav Petkov <bp@alien8.de> [ "No, it still oopses" ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-12 17:54:12 -07:00
Linus Torvalds
287d97ac03 vma_adjust: fix the copying of anon_vma chains
When we move the boundaries between two vma's due to things like
mprotect, we need to make sure that the anon_vma of the pages that got
moved from one vma to another gets properly copied around.  And that was
not always the case, in this rather hard-to-follow code sequence.

Clarify the code, and fix it so that it copies the anon_vma from the
right source.

Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Borislav Petkov <bp@alien8.de> [ "Yeah, not so much this one either" ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-12 17:54:11 -07:00