Commit graph

179350 commits

Author SHA1 Message Date
Wu Fengguang
31d3d3484f HWPOISON: limit hwpoison injector to known page types
__memory_failure()'s workflow is

	set PG_hwpoison
	//...
	unset PG_hwpoison if didn't pass hwpoison filter

That could kill unrelated process if it happens to page fault on the
page with the (temporary) PG_hwpoison. The race should be big enough to
appear in stress tests.

Fix it by grabbing the page and checking filter at inject time.  This
also avoids the very noisy "Injecting memory failure..." messages.

- we don't touch madvise() based injection, because the filters are
  generally not necessary for it.
- if we want to apply the filters to h/w aided injection, we'd better to
  rearrange the logic in __memory_failure() instead of this patch.

AK: fix documentation, use drain all, cleanups

CC: Haicheng Li <haicheng.li@intel.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:59 +01:00
Wu Fengguang
7c116f2b0d HWPOISON: add fs/device filters
Filesystem data/metadata present the most tricky-to-isolate pages.
It requires careful code review and stress testing to get them right.

The fs/device filter helps to target the stress tests to some specific
filesystem pages. The filter condition is block device's major/minor
numbers:
        - corrupt-filter-dev-major
        - corrupt-filter-dev-minor
When specified (non -1), only page cache pages that belong to that
device will be poisoned.

The filters are checked reliably on the locked and refcounted page.

Haicheng: clear PG_hwpoison and drop bad page count if filter not OK
AK: Add documentation

CC: Haicheng Li <haicheng.li@intel.com>
CC: Nick Piggin <npiggin@suse.de>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:59 +01:00
Wu Fengguang
138ce286eb HWPOISON: return 0 to indicate success reliably
Return 0 to indicate success, when
- action result is RECOVERED or DELAYED
- no extra page reference

Note that dirty swapcache pages are kept in swapcache, so can have one
more reference count.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:58 +01:00
Wu Fengguang
d95ea51e3a HWPOISON: make semantics of IGNORED/DELAYED clear
Change semantics for
- IGNORED: not handled; it may well be _unsafe_
- DELAYED: to be handled later; it is _safe_

With this change,
- IGNORED/FAILED mean (maybe) Error
- DELAYED/RECOVERED mean Success

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:58 +01:00
Wu Fengguang
847ce401df HWPOISON: Add unpoisoning support
The unpoisoning interface is useful for stress testing tools to
reclaim poisoned pages (to prevent OOM)

There is no hardware level unpoisioning, so this
cannot be used for real memory errors, only for software injected errors.

Note that it may leak pages silently - those who have been removed from
LRU cache, but not isolated from page cache/swap cache at hwpoison time.
Especially the stress test of dirty swap cache pages shall reboot system
before exhausting memory.

AK: Fix comments, add documentation, add printks, rename symbol

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:58 +01:00
Wu Fengguang
8d22ba1b74 HWPOISON: detect free buddy pages explicitly
Most free pages in the buddy system have no PG_buddy set.
Introduce is_free_buddy_page() for detecting them reliably.

CC: Nick Piggin <npiggin@suse.de>
CC: Mel Gorman <mel@linux.vnet.ibm.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:58 +01:00
Wu Fengguang
95d01fc664 HWPOISON: remove the free buddy page handler
The buddy page has already be handled in the very beginning.
So remove redundant code.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:58 +01:00
Wu Fengguang
dc2a1cbf7d HWPOISON: introduce delete_from_lru_cache()
Introduce delete_from_lru_cache() to
- clear PG_active, PG_unevictable to avoid complains at unpoison time
- move the isolate_lru_page() call back to the handlers instead of the
  entrance of __memory_failure(), this is more hwpoison filter friendly

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:58 +01:00
Wu Fengguang
71f72525df HWPOISON: comment dirty swapcache pages
AK: Improve comment

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:58 +01:00
Wu Fengguang
db0480b3a6 HWPOISON: comment the possible set_page_dirty() race
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:58 +01:00
Wu Fengguang
1668bfd5be HWPOISON: abort on failed unmap
Don't try to isolate a still mapped page. Otherwise we will hit the
BUG_ON(page_mapped(page)) in __remove_from_page_cache().

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:58 +01:00
Andi Kleen
82ba011b90 HWPOISON: Turn ref argument into flags argument
Now that "ref" is just a boolean turn it into
a flags argument. First step is only a single flag
that makes the code's intention more clear, but more
may follow.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:57 +01:00
Wu Fengguang
bd1ce5f91f HWPOISON: avoid grabbing the page count multiple times during madvise injection
If page is double referenced in madvise_hwpoison() and __memory_failure(),
remove_mapping() will fail because it expects page_count=2. Fix it by
not grabbing extra page count in __memory_failure().

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:57 +01:00
Wu Fengguang
a7560fc80f HWPOISON: return ENXIO on invalid page number
Use a different errno than the usual EIO for invalid page numbers.
This is mainly for better reporting for the injector.

This also avoids calling action_result() with invalid pfn.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:57 +01:00
Wu Fengguang
9b9a29ecd7 HWPOISON: remove the anonymous entry
(PG_swapbacked && !PG_lru) pages should not happen.
Better to treat them as unknown pages.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:57 +01:00
Wu Fengguang
0e9052eb98 page-types: add standard GPL license header
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:57 +01:00
Andi Kleen
588f9ce6ca HWPOISON: Be more aggressive at freeing non LRU caches
shake_page handles more types of page caches than lru_drain_all()

- per cpu page allocator pages
- per CPU LRU

Stops early when the page became free.

Used in followon patches.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:57 +01:00
Andi Kleen
7bc98b97ed HWPOISON: Add Andi Kleen as hwpoison maintainer to MAINTAINERS
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16 12:19:57 +01:00
Hidetoshi Seto
918aae42aa ACPI: fix for lapic_timer_propagate_broadcast()
I got following warning on ia64 box:
  In function 'acpi_processor_power_verify':
  642: warning: passing argument 2 of 'smp_call_function_single' from
  incompatible pointer type

This smp_call_function_single() was introduced by a commit
f833bab87f:

 > @@ -162,8 +162,9 @@
 >               pr->power.timer_broadcast_on_state = state;
 >  }
 >
 > -static void lapic_timer_propagate_broadcast(struct acpi_processor *pr)
 > +static void lapic_timer_propagate_broadcast(void *arg)
 >  {
 > +       struct acpi_processor *pr = (struct acpi_processor *) arg;
 >         unsigned long reason;
 >
 >         reason = pr->power.timer_broadcast_on_state < INT_MAX ?
 > @@ -635,7 +636,8 @@
 >                 working++;
 >         }
 >
 > -       lapic_timer_propagate_broadcast(pr);
 > +       smp_call_function_single(pr->id, lapic_timer_propagate_broadcast,
 > +                                pr, 1);
 >
 >         return (working);
 >  }

The problem is that the lapic_timer_propagate_broadcast() has 2 versions:
One is real code that modified in the above commit, and the other is NOP
code that used when !ARCH_APICTIMER_STOPS_ON_C3:

  static void lapic_timer_propagate_broadcast(struct acpi_processor *pr) { }

So I got warning because of !ARCH_APICTIMER_STOPS_ON_C3.

We really want to do nothing here on !ARCH_APICTIMER_STOPS_ON_C3, so
modify lapic_timer_propagate_broadcast() of real version to use
smp_call_function_single() in it.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-16 04:13:19 -05:00
Igor Grinberg
61e0ac03c8 [ARM] pxa/cm-x300: add PWM backlight support
Signed-off-by: Igor Grinberg <grinberg@compulab.co.il>
Signed-off-by: Mike Rapoport <mike@compulab.co.il>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
2009-12-16 16:49:22 +08:00
Eric Miao
5c3804629e revert "[ARM] pxa/cm-x300: add PWM backlight support"
Commit db205463fd was incorrectly applied,
and reverted here to re-apply.

Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
2009-12-16 16:45:06 +08:00
Len Brown
f02f465b1c Merge branch 'dock' into release
Conflicts:
	drivers/acpi/dock.c

Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-16 03:33:28 -05:00
Andrew Morton
f67538f81e acpi_pad: squish warning
drivers/acpi/acpi_pad.c: In function 'power_saving_thread':
drivers/acpi/acpi_pad.c:103: warning: 'preferred_cpu' may be used uninitialized in this function

Cc: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-16 03:21:22 -05:00
Jens Axboe
b568be627a block: temporarily disable discard granularity
Commit 86b3728141 adds a check for
misaligned stacking offsets, but it's buggy since the defaults are 0.
Hence all dm devices that pass in a non-zero starting offset will
be marked as misaligned amd dm will complain.

A real fix is coming, in the mean time disable the discard granularity
check so that users don't worry about dm reporting about misaligned
devices.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-12-16 09:16:41 +01:00
Alex Chiang
747479a3fb ACPI: dock: minor whitespace and style cleanups
Removed some stray whitespaces
Added whitespace when needed for legibility
Removed unneeded curly braces
Removed useless void casts
Removed unnecessary local variable initialization
Renamed variables to help out with 80-column fixes

Signed-off-by: Alex Chiang <achiang@hp.com>
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-16 03:03:12 -05:00
Alex Chiang
fe06fba292 ACPI: dock: add struct dock_station * directly to platform device data
Instead of adding a (struct dock_station **) to our dock device's
platform data, we can add the (struct dock_station *) directly.

This change saves us some ugly casting and improves readability.

The cost of making this change is an extra 290 bytes of stack usage,
but this is an infrequently called code-path and unlikely to cause
the kernel to blow up.

Signed-off-by: Alex Chiang <achiang@hp.com>
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-16 03:03:11 -05:00
Alex Chiang
9751cb721e ACPI: dock: dock_add - hoist up platform_device_register_simple()
Move the call to platform_device_register_simple so that we do it
before allocating and initializing our struct dock_station.

Signed-off-by: Alex Chiang <achiang@hp.com>
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-16 03:03:11 -05:00
Alex Chiang
c6f1905ea9 ACPI: dock: remove global 'dock_device_name'
We only use it in one spot, so it probably gets optimized out, but there's
still no need to use a global variable for this.

Signed-off-by: Alex Chiang <achiang@hp.com>
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-16 03:03:10 -05:00
Sheng Yang
5df974009f x86: Add IA32_TSC_AUX MSR and use it
Clean up write_tsc() and write_tscp_aux() by replacing
hardcoded values.

No change in functionality.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
LKML-Reference: <1260942485-19156-4-git-send-email-sheng@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 09:02:42 +01:00
Alex Chiang
f69cfdd24a ACPI: dock: combine add|alloc_dock_dependent_device (v2)
There's no real need to have a separate allocation step when adding
a dock dependent device.

Combining the two functions is both logical and helps with legibility.

Signed-off-by: Alex Chiang <achiang@hp.com>
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-16 03:02:10 -05:00
Arnaldo Carvalho de Melo
d599db3fc5 perf report: Generalize perf_session__fprintf_hists()
Pull it out of builtin-report - further changes will be made and it
will then be reusable in 'perf diff' as well.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260914682-29652-4-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 08:53:50 +01:00
Arnaldo Carvalho de Melo
c410a33887 perf symbols: Move symbol filtering to event__preprocess_sample()
So that --dsos, --comm, --symbols can bem used in more tools,
like in perf diff:

$ perf record -f find / > /dev/null
$ perf record -f find / > /dev/null
$ perf diff --dsos /lib64/libc-2.10.1.so | head -5
   1        +22392124     /lib64/libc-2.10.1.so   _IO_vfprintf_internal
   2         +6410655     /lib64/libc-2.10.1.so   __GI_memmove
   3    +1   +9192692     /lib64/libc-2.10.1.so   _int_malloc
   4    -1  -15158605     /lib64/libc-2.10.1.so   _int_free
   5           +45669     /lib64/libc-2.10.1.so   _IO_new_file_xsputn
$

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260914682-29652-3-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 08:53:49 +01:00
Arnaldo Carvalho de Melo
655000e7c7 perf symbols: Adopt the strlists for dso, comm
Will be used in perf diff too.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260914682-29652-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 08:53:49 +01:00
Arnaldo Carvalho de Melo
75be6cf487 perf symbols: Make symbol_conf global
This simplifies a lot of functions, less stuff to be done by
tool writers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1260914682-29652-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 08:53:48 +01:00
Roland Dreier
14f369d1d6 Merge branches 'amso1100', 'cma', 'cxgb3', 'ehca', 'ipath', 'ipoib', 'iser', 'misc', 'mlx4' and 'nes' into for-next 2009-12-15 23:39:25 -08:00
Frank Zago
48617f862f RDMA/cxgb3: Fix error paths in post_send and post_recv
Always set bad_wr when an immediate error is detected.  Return ENOMEM
for queue full instead of EINVAL to match other drivers.

Signed-off-by: Frank Zago <fzago@systemfabricworks.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-15 23:39:10 -08:00
Len Brown
b6202832b4 Merge branch 'wmi' into release 2009-12-16 02:21:25 -05:00
Len Brown
1a544d28dd Merge branch 'ipmi' into release 2009-12-16 02:20:58 -05:00
Len Brown
689a8ab32f Merge branch 'dell-wmi' into release 2009-12-16 02:20:43 -05:00
Len Brown
1fc22fad1f Merge branch 'debug-aml' into release 2009-12-16 02:19:59 -05:00
Len Brown
8033c314b9 Merge branch 'bugzilla-14782' into release 2009-12-16 02:19:55 -05:00
Len Brown
8fa79e08f5 Merge branch 'ost' into release
Conflicts:
	include/acpi/processor.h

Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-16 02:18:36 -05:00
Magnus Damm
503914cf4a net: sh_eth alignment fix for sh7724 using NET_IP_ALIGN V2
Fix sh_eth for sh7724 by adding NET_IP_ALIGN support V2.
Without this patch the receive data is misaligned.

Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-15 23:18:27 -08:00
Gurucharan Shetty
ca55398043 ixgbe: allow tx of pre-formatted vlan tagged packets
When the 82598 is fed 802.1q packets, it chokes with
an error of the form:

ixgbe: eth0: ixgbe_tx_csum: partial checksum but proto=81!

As the logic there was not smart enough to look into
the vlan header to pick out the encapsulated protocol.

There are times when we'd like to send these packets
out without having to configure a vlan on the interface.
Here we check for the vlan tag and allow the packet to
go out with the correct hardware checksum.

This patch is a clone of a previously submitted patch by
Arthur Jones <ajones@riverbed.com> for igb (Commit -
fa4a7ef36e).

Signed-off-by: Gurucharan Shetty <gshetty@riverbed.com>
Signed-off-by: Arthur Jones <ajones@riverbed.com>
Acked-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-15 23:18:26 -08:00
Mallikarjuna R Chilakala
734e979f25 ixgbe: Fix 82598 premature copper PHY link indicatation
Modified patch with Dave's comments to replace mdelay with proper msleep.
Fix 82598 copper link issue, where the phy prematurely indicates link
before it is ready to process packets. The new function looks for phy
link and indicates that, when it is available. If phy is not ready
within few seconds of MAC indicating link, the function will return
failure which translates to link down indication.

Signed-off-by:  Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-15 23:18:26 -08:00
Mallikarjuna R Chilakala
eb985f09b2 ixgbe: Fix tx_restart_queue/non_eop_desc statistics counters
Fix the restart_queue and non_eop_desc counters from being
double-counted.  They are cumulative in each ring, so we don't want to
add them to the cumulative result in the adapter's master counter.
Otherwise, the stats will be inaccurate

Signed-off-by:  Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-15 23:18:25 -08:00
Ben Skeggs
37cb3e0885 drm/nouveau: fix bug causing pinned buffers to lose their NO_EVICT flag
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2009-12-16 17:06:13 +10:00
Ben Skeggs
65ec01a940 drm/nv50: fix suspend/resume delays without firmware present
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2009-12-16 17:06:05 +10:00
Ben Skeggs
0735f62e11 drm/nouveau: prevent all channel creation if accel not available
Previously, if there was no firmware available, the DRM would just
disable channel creation from userspace, but still use a single
channel for its own purposes.

With a bit of care it should actually be possible to do this, due
to the DRM's very limited use of the engine.  It currently doesn't
work correctly however, resulting in corrupted fbcon and hangs on
a number of cards.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2009-12-16 17:05:57 +10:00
Ben Skeggs
3c8868d3db drm/nv50: fix two potential suspend/resume oopses
This avoids touching the dummy channel 0/127 we have on nv50.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2009-12-16 17:05:50 +10:00