Commit graph

313792 commits

Author SHA1 Message Date
Anton Blanchard
d136e27326 powerpc: tracing: Avoid tracepoint duplication with DECLARE_EVENT_CLASS
irq_entry, irq_exit, timer_interrupt_entry and timer_interrupt_exit
all do the same thing so use DECLARE_EVENT_CLASS to avoid duplicating
everything 4 times.

This saves quite a lot of space in both instruction text and data:

   text    data     bss     dec     hex filename
   9265   19622      16   28903    70e7 arch/powerpc/kernel/irq.o
   6817   19019      16   25852    64fc arch/powerpc/kernel/irq.o

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:41 +10:00
Anton Blanchard
641bd53a61 powerpc: Enable jump label support
When looking through some instruction traces I noticed our tracepoint
checks were inline. It turns out we don't have CONFIG_JUMP_LABEL
enabled.

By enabling CONFIG_JUMP_LABEL we replace a load/compare/branch with
a nop at every tracepoint call. For example in do_IRQ:

CONFIG_JUMP_LABEL disabled:
        stdx 3,11,9
        lwz 0,8(29)
        cmpwi 7,0,0
        bne- 7,.L124
        bl .irq_enter

CONFIG_JUMP_LABEL enabled:
        stdx 3,11,9
        nop
        bl .irq_enter

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:40 +10:00
Deepthi Dharwar
16aaaff684 powerpc/pseries/cpuidle: Replace pseries_notify_cpuidle_add call with notifier
The following patch is to remove the pseries_notify_add_cpu() call
and replace it by a hot plug notifier.

This would prevent cpuidle resources being released and allocated each
time cpu comes online on pseries.

The earlier design was causing a lockdep problem
in start_secondary as reported on this thread
	-https://lkml.org/lkml/2012/5/17/2

This applies on 3.4-rc7

Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:40 +10:00
Nishanth Aravamudan
25ebc45b93 powerpc/pseries/iommu: remove default window before attempting DDW manipulation
An upcoming release of firmware will add DDW extensions, in particular
an API to "reset" the DMA window to the original configuration (32-bit,
2GB in size). With that API available, we can safely remove the default
window, increasing the resources available to firmware for creation of
larger windows for the slot in question -- if we encounter an error, we
can use the new API to reset the state of the slot.

Further, this same release of firmware will make it a hard requirement
for OSes to release the existing window before any other windows will be
shown as available, to avoid conflicts in addressing between the two
windows.

In anticipation of these changes, always remove the default window
before we do any DDW manipulations.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:39 +10:00
Steven Rostedt
65b8c7226e powerpc/ftrace: Use patch_instruction instead of probe_kernel_write()
The patch_instruction() interface is made to modify kernel text. It is
safer to use that then the probe_kernel_write() when modifying kernel
code.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:39 +10:00
Steven Rostedt
b6e3796834 powerpc: Have patch_instruction detect faults
For ftrace to use the patch_instruction code, it needs to check for
faults on write. Ftrace updates code all over the kernel, and we need to
know if code is updated or not due to protections that are placed on
some portions of the kernel. If ftrace does not detect a fault, it will
error later on, and it will be much more difficult to find the problem.

By changing patch_instruction() to detect faults, then ftrace will be
able to make use of it too.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:38 +10:00
Steven Rostedt
ee456bb346 powerpc/ftrace: Have PPC skip updating with stop_machine()
PowerPC does not have the synchronization issues that x86 has with
modifying code on one CPU while another CPU is executing it.
The other CPU will either see the old or new code without any
issues, unlike x86 which may issue a GPF.

Instead of calling the heavy stop_machine, just update the code.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:38 +10:00
Tony Breeds
b81f18e55e powerpc/boot: Only build board support files when required.
Currently we build all board files regardless of the final zImage
target.  This is sub-optimal (in terms on compilation) and leads to
problems in one platform needlessly causing failures for other
platforms.

Use the Kconfig variables to selectively construct this board files to
build.

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:37 +10:00
Huang Shijie
77ac32ad2b ARM: imx6q: add clocks for gpmi-nand
Add clocks for gpmi-nand.

Signed-off-by: Huang Shijie <shijie8@gmail.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
2012-07-03 11:39:57 +08:00
Linus Torvalds
9d4056aa9e Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull a couple more powerpc fixes from Benjamin Herrenschmidt:
 "Here are two more fixes that I "missed" when scrubbing patchwork last
  week which are worth still having in 3.5."

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/kvm: sldi should be sld
  powerpc/xmon: Use cpumask iterator to avoid warning
2012-07-02 19:52:25 -07:00
Andy Lutomirski
09b243577b security: document no_new_privs
Document no_new_privs.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-07-03 12:35:36 +10:00
Marc Kleine-Budde
610578a3ab ARM: imx: enable flexcan on imx25, imx35, imx53, imx6q
Aas the flexcan driver has DT binding for some time now. This patch makes
the flexcan driver available in the can drivers menu if a SOC with flexcan
is enabled.

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
2012-07-03 10:31:55 +08:00
NeilBrown
5f066c632f md/raid5: fix refcount problem when blocked_rdev is set.
commit 43220aa0f2
    md/raid5: fix a hang on device failure.

fixed a hang, but introduced a refcounting in-balance so
that if the presence of bad-blocks ever caused an rdev to
be 'blocked' we would increment the refcount on the rdev and
never decrement it.

So added the needed rdev_dec_pending when md_wait_for_blocked_rdev
is not called.

Reported-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-03 12:13:29 +10:00
majianpeng
7c2c57c9a9 md:Add blk_plug in sync_thread.
Add blk_plug in sync_thread will increase the performance of sync.
Because sync_thread did not blk_plug,so when raid sync, the bio merge
not well.

Testing environment:
SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI
Controller.
OS:Linux xxx 3.5.0-rc2+ #340 SMP Tue Jun 12 09:00:25 CST 2012
x86_64 x86_64 x86_64 GNU/Linux.
RAID5: four ST31000524NS disk.

Without blk_plug:recovery speed about 63M/Sec;
Add blk_plug:recovery speed about 120M/Sec.

Using blktrace:
blktrace -d /dev/sdb -w 60  -o -|blkparse -i -

without blk_plug:
Total (8,16):
 Reads Queued:      309811,     1239MiB	 Writes Queued:           0,        0KiB
 Read Dispatches:   283583,     1189MiB	 Write Dispatches:        0,        0KiB
 Reads Requeued:         0		 Writes Requeued:         0
 Reads Completed:   273351,     1149MiB	 Writes Completed:        0,        0KiB
 Read Merges:        23533,    94132KiB	 Write Merges:            0,        0KiB
 IO unplugs:             0        	 Timer unplugs:           0

add blk_plug:
Total (8,16):
 Reads Queued:      428697,     1714MiB	 Writes Queued:           0,        0KiB
 Read Dispatches:     3954,     1714MiB	 Write Dispatches:        0,        0KiB
 Reads Requeued:         0		 Writes Requeued:         0
 Reads Completed:     3956,     1715MiB	 Writes Completed:        0,        0KiB
 Read Merges:       424743,     1698MiB	 Write Merges:            0,        0KiB
 IO unplugs:             0        	 Timer unplugs:        3384

The ratio of merge will be markedly increased.

Signed-off-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-03 12:12:26 +10:00
majianpeng
1850753d2e md/raid5: In ops_run_io, inc nr_pending before calling md_wait_for_blocked_rdev
In ops_run_io(), the call to md_wait_for_blocked_rdev will decrement
nr_pending so we lose the reference we hold on the rdev.
So atomic_inc it first to maintain the reference.

This bug was introduced by commit  73e92e51b7
    md/raid5.  Don't write to known bad block on doubtful devices.

which appeared in 3.0, so patch is suitable for stable kernels since
then.

Cc: stable@vger.kernel.org
Signed-off-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-03 12:11:54 +10:00
majianpeng
6c0544e255 md/raid5: Do not add data_offset before call to is_badblock
In chunk_aligned_read() we are adding data_offset before calling
is_badblock.  But is_badblock also adds data_offset, so that is bad.

So move the addition of data_offset to after the call to
is_badblock.

This bug was introduced by commit 31c176ecdf
     md/raid5: avoid reading from known bad blocks.
which first appeared in 3.0.  So that patch is suitable for any
-stable kernel from 3.0.y onwards.  However it will need minor
revision for most of those (as the comment didn't appear until
recently).

Cc: stable@vger.kernel.org
Signed-off-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-03 12:09:57 +10:00
NeilBrown
5cfb22a1f8 md/raid5: prefer replacing failed devices over want-replacement devices.
If a RAID5 has both a failed device and a device marked as
'WantReplacement', then we should preferentially replace the failed
device.
However the current code replaces whichever is found first.
So split into 2 loops, check fail failed/missing first, and only check
for WantReplacement if nothing is failed or missing.

Reported-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-03 11:46:53 +10:00
NeilBrown
fc448a18ae md/raid10: Don't try to recovery unmatched (and unused) chunks.
If a RAID10 has an odd number of chunks - as might happen when there
are an odd number of devices - the last chunk has no pair and so is
not mirrored.  We don't store data there, but when recovering the last
device in an array we retry to recover that last chunk from a
non-existent location.  This results in an error, and the recovery
aborts.

When we get to that last chunk we should just stop - there is nothing
more to do anyway.

This bug has been present since the introduction of RAID10, so the
patch is appropriate for any -stable kernel.

Cc: stable@vger.kernel.org
Reported-by: Christian Balzer <chibi@gol.com>
Tested-by: Christian Balzer <chibi@gol.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-07-03 10:37:30 +10:00
Alex Williamson
326cf0334b KVM: Sanitize KVM_IRQFD flags
We only know of one so far.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-07-02 21:10:30 -03:00
Alex Williamson
f36992e312 KVM: Add missing KVM_IRQFD API documentation
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-07-02 21:10:30 -03:00
Alex Williamson
d4db2935e4 KVM: Pass kvm_irqfd to functions
Prune this down to just the struct kvm_irqfd so we can avoid
changing function definition for every flag or field we use.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-07-02 21:10:30 -03:00
Arnd Bergmann
fdc0867884 Merge branch 'imx/sparse-irq' of git://git.linaro.org/people/shawnguo/linux-2.6 into next/irq
From Shawn Guo <shawn.guo@linaro.org>, this makes it possible to use
sparse irqs with mach-imx.

* 'imx/sparse-irq' of git://git.linaro.org/people/shawnguo/linux-2.6:
  ARM: imx: enable SPARSE_IRQ for imx platform
  ARM: fiq: change FIQ_START to a variable
  tty: serial: imx: remove the use of MXC_INTERNAL_IRQS
  ARM: imx: remove unneeded mach/irq.h inclusion
  i2c: imx: remove unneeded mach/irqs.h inclusion
  ARM: imx: add a legacy irqdomain for mx31ads
  ARM: imx: add a legacy irqdomain for 3ds_debugboard
  ARM: imx: pass gpio than irq number into mxc_expio_init
  ARM: imx: leave irq_base of wm8350_platform_data uninitialized
  dma: ipu: remove the use of ipu_platform_data
  ARM: imx: move irq_domain_add_legacy call into avic driver
  ARM: imx: move irq_domain_add_legacy call into tzic driver
  gpio/mxc: move irq_domain_add_legacy call into gpio driver
  ARM: imx: eliminate macro IRQ_GPIOx()
  ARM: imx: eliminate macro IOMUX_TO_IRQ()
  ARM: imx: eliminate macro IMX_GPIO_TO_IRQ()

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-07-02 23:18:19 +02:00
Arnd Bergmann
d19550e5b8 Merge branch 'fixes' of git://github.com/hzhuang1/linux into fixes
* 'fixes' of git://github.com/hzhuang1/linux:
  ARM: pxa: hx4700: Fix basic suspend/resume
2012-07-02 23:14:29 +02:00
Arnd Bergmann
c911a3169e Merge branch 'prima2/drivers' of git://gitorious.org/sirfprima2-kernel/sirfprima2-kernel into next/pinctrl
* 'prima2/drivers' of git://gitorious.org/sirfprima2-kernel/sirfprima2-kernel:
  PINCTRL: SiRF: add GPIO and GPIO irq support in CSR SiRFprimaII

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Linus Walleij <inus.walleij@linaro.org>
2012-07-02 23:05:04 +02:00
Thierry Reding
0944799156 ARM: tegra: Provide clock for only one PWM controller
A subsequent patch will add a generic PWM API driver for the Tegra PWFM
controller, supporting all four PWM devices with a single PWM chip. The
device will be named tegra-pwm and only one clock needs to be provided.

Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
Signed-off-by: Stephen Warren <swarren@nvidia.com>
2012-07-02 15:01:34 -06:00
Simon Que
bb1dccfc63 ARM: tegra: Fix PWM clock programming
PWM clock source registers in Tegra 2 have different clock source selection bit
fields than other registers.  PWM clock source bits in CLK_SOURCE_PWM_0 register
are located at bit field bit[30:28] while others are at bit field bit[31:30] in
their respective clock source register.

This patch updates the clock programming to correctly reflect that, by adding a
flag to indicate the alternate bit field format and checking for it when
selecting a clock source (parent clock).

Signed-off-by: Bill Huang <bilhuang@nvidia.com>
Signed-off-by: Simon Que <sque@chromium.org>
Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
Signed-off-by: Stephen Warren <swarren@nvidia.com>
2012-07-02 15:01:18 -06:00
Arnd Bergmann
df7cb45585 A series about interrupt controller cleanup. AT91 AIC is moving to
fasteoi type of handler and sparse IRQ.
 The Device Tree support is added to take into account priority
 and external IRQ.
 In addition to that, the new AIC5 IP is introduced.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQEcBAABAgAGBQJP8ZmgAAoJEAf03oE53VmQyuAH/AzZnSa7CduDKooiYuNmlOHQ
 i9j9TD1jqmDQFSdprC4zGXgf9c5OEcmFrtUVrRagl9vvtnmoB97TDHH8RTFdnCe1
 KhoGaJ8IGjalqUMt0h7+IQfYqmnj1zwblhnHoLqe7aBV+ucrThwEDRSutSzG++O6
 JVTKv102H9CGJ2gVUTUxu/KO/soYJog2JWOY9GHfVjqfMekiBR2Yz4I8v57KXzr/
 joMYpQwPzvP5Agv8RAZ2dFhV6lp/Qqm8DuMOI1DR6PZy6N1Eatimm34cNNosCUT3
 A7rH4W84qRR7LaCmusNu2Z2mOskVNLNzvSGNsFTcHVBjeW5Io4pHnQG+YhRN/Gk=
 =fxik
 -----END PGP SIGNATURE-----

Merge tag 'at91-for-next-cleanup' of git://github.com/at91linux/linux-at91 into next/cleanup

Nicolas Ferre <nicolas.ferre@atmel.com> writes:

   A series about interrupt controller cleanup. AT91 AIC is moving to
   fasteoi type of handler and sparse IRQ.
   The Device Tree support is added to take into account priority
   and external IRQ.
   In addition to that, the new AIC5 IP is introduced.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

* tag 'at91-for-next-cleanup' of git://github.com/at91linux/linux-at91:
  ARM: at91: add AIC5 support
  ARM: at91: remove mach/irqs.h
  ARM: at91: sparse irq support
  ARM: at91: at91 based machines specify their own irq handler at run time
  ARM: at91: remove static irq priorities for sam9x5
  ARM: at91: add of irq priorities support
  ARM: at91: aic add dt support for external irqs
  ARM: at91: aic can use fast eoi handler type
  ARM: at91: fix at91_aic_write macro
  ARM: at91: remove two unused headers
2012-07-02 22:53:37 +02:00
Dmitry Kasatkin
c7de7adc18 ima: remove unused cleanup functions
IMA cannot be used as module and does not need __exit functions.
Removed them.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2012-07-02 16:43:30 -04:00
Dmitry Kasatkin
0ea4f8ae41 ima: free securityfs violations file
On ima_fs_init() error, free securityfs violations file.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
2012-07-02 16:43:30 -04:00
Mimi Zohar
08e1b76ae3 ima: use full pathnames in measurement list
The IMA measurement list contains filename hints, which can be
ambigious without the full pathname.  This patch replaces the
filename hint with the full pathname, simplifying for userspace
the correlating of file hash measurements with files.

Change log v1:
- Revert to short filenames, when full pathname is longer than IMA
  measurement buffer size. (Based on Dmitry's review)

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2012-07-02 16:43:29 -04:00
Sarah Sharp
0d9f78a92e xhci: Fix hang on back-to-back Set TR Deq Ptr commands.
The Microsoft LifeChat 3000 USB headset was causing a very reproducible
hang whenever it was plugged in.  At first, I thought the host
controller was producing bad transfer events, because the log was filled
with errors like:

xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD

However, it turned out to be an xHCI driver bug in the ring expansion
patches.  The bug is triggered When there are two ring segments, and a
TD that ends just before a link TRB, like so:

 ______________                     _____________
|              |              ---> | setup TRB B |
 ______________               |     _____________
|              |              |    |  data TRB B |
 ______________               |     _____________
| setup TRB A  | <-- deq      |    |  data TRB B |
 ______________               |     _____________
| data TRB A   |              |    |             | <-- enq, deq''
 ______________               |     _____________
| status TRB A |              |    |             |
 ______________               |     _____________
|  link TRB    |---------------    |  link TRB   |
 _____________  <--- deq'           _____________

TD A (the first control transfer) stalls on the data phase.  That halts
the ring.  The xHCI driver moves the hardware dequeue pointer to the
first TRB after the stalled transfer, which happens to be the link TRB.

Once the Set TR dequeue pointer command completes, the function
update_ring_for_set_deq_completion runs.  That function is supposed to
update the xHCI driver's dequeue pointer to match the internal hardware
dequeue pointer.  On the first call this would work fine, and the
software dequeue pointer would move to deq'.

However, if the transfer immediately after that stalled (TD B in this
case), another Set TR Dequeue command would be issued.  That would move
the hardware dequeue pointer to deq''.  Once that command completed,
update_ring_for_set_deq_completion would run again.

The original code would unconditionally increment the software dequeue
pointer, which moved the pointer off the ring segment into la-la-land.
The while loop would happy increment the dequeue pointer (possibly
wrapping it) until it matched the hardware pointer value.

The while loop would also access all the memory in between the first
ring segment and the second ring segment to determine if it was a link
TRB.  This could cause general protection faults, although it was
unlikely because the ring segments came from a DMA pool, and would often
have consecutive memory addresses.

If nothing in that space looked like a link TRB, the deq_seg pointer for
the ring would remain on the first segment.  Thus, the deq_seg and the
software dequeue pointer would get out of sync.

When the next transfer event came in after the stalled transfer, the
xHCI driver code would attempt to convert the software dequeue pointer
into a DMA address in order to compare the DMA address for the completed
transfer.  Since the deq_seg and the dequeue pointer were out of sync,
xhci_trb_virt_to_dma would return NULL.

The transfer event would get ignored, the transfer would eventually
timeout, and we would mistakenly convert the finished transfer to no-op
TRBs.  Some kernel driver (maybe xHCI?) would then get stuck in an
infinite loop in interrupt context, and the whole machine would hang.

This patch should be backported to kernels as old as 3.4, that contain
the commit b008df60c6 "xHCI: count free
TRBs on transfer ring"

Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Cc: Andiry Xu <andiry.xu@amd.com>
Cc: stable@vger.kernel.org
2012-07-02 12:51:25 -07:00
Stanislaw Ledwon
8bea2bd37d usb: Add support for root hub port status CAS
The host controller port status register supports CAS (Cold Attach
Status) bit. This bit could be set when USB3.0 device is connected
when system is in Sx state. When the system wakes to S0 this port
status with CAS bit is reported and this port can't be used by any
device.

When CAS bit is set the port should be reset by warm reset. This
was not supported by xhci driver.

The issue was found when pendrive was connected to suspended
platform. The link state of "Compliance Mode" was reported together
with CAS bit. This link state was also not supported by xhci and
core/hub.c.

The CAS bit is defined only for xhci root hub port and it is
not supported on regular hubs. The link status is used to force
warm reset on port. Make the USB core issue a warm reset when port
is in ether the 'inactive' or 'compliance mode'. Change the xHCI driver
to report 'compliance mode' when the CAS is set. This force warm reset
on the root hub port.

This patch should be backported to stable kernels as old as 3.2, that
contain the commit 10d674a82e "USB: When
hot reset for USB3 fails, try warm reset."

Signed-off-by: Stanislaw Ledwon <staszek.ledwon@linux.intel.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Acked-by: Andiry Xu <andiry.xu@amd.com>
Cc: stable@vger.kernel.org
2012-07-02 12:51:24 -07:00
Chris Mason
b6305567e7 Btrfs: run delayed directory updates during log replay
While we are resolving directory modifications in the
tree log, we are triggering delayed metadata updates to
the filesystem btrees.

This commit forces the delayed updates to run so the
replay code can find any modifications done.  It stops
us from crashing because the directory deleltion replay
expects items to be removed immediately from the tree.

Signed-off-by: Chris Mason <chris.mason@fusionio.com>
cc: stable@kernel.org
2012-07-02 15:39:19 -04:00
Josef Bacik
7fd1a3f73f Btrfs: hold a ref on the inode during writepages
We can race with unlink and not actually be able to do our igrab in
btrfs_add_ordered_extent.  This will result in all sorts of problems.
Instead of doing the complicated work to try and handle returning an error
properly from btrfs_add_ordered_extent, just hold a ref to the inode during
writepages.  If we cannot grab a ref we know we're freeing this inode anyway
and can just drop the dirty pages on the floor, because screw them we're
going to invalidate them anyway.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2012-07-02 15:39:18 -04:00
Josef Bacik
bdb7d303b3 Btrfs: fix tree log remove space corner case
The tree log stuff can have allocated space that we end up having split
across a bitmap and a real extent.  The free space code does not deal with
this, it assumes that if it finds an extent or bitmap entry that the entire
range must fall within the entry it finds.  This isn't necessarily the case,
so rework the remove function so it can handle this case properly.  This
fixed two panics the user hit, first in the case where the space was
initially in a bitmap and then in an extent entry, and then the reverse
case.  Thanks,

Reported-and-tested-by: Shaun Reich <sreich@kde.org>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2012-07-02 15:39:18 -04:00
Liu Bo
6bf02314d9 Btrfs: fix wrong check during log recovery
When we're evicting an inode during log recovery, we need to ensure that the inode
is not in orphan state any more, which means inode's run_time flags has _no_
BTRFS_INODE_HAS_ORPHAN_ITEM.  Thus, the BUG_ON was triggered because of a wrong
check for the flags.

Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2012-07-02 15:39:17 -04:00
Alexander Block
d3a94048c9 Btrfs: use _IOR for BTRFS_IOC_SUBVOL_GETFLAGS
We used the wrong ioctl macro for the getflags ioctl before.
As we don't have the set/getflags ioctls in the user space ioctl.h
at the moment, it's safe to fix it now.

Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-07-02 15:39:17 -04:00
Ilya Dryomov
2b6ba629b5 Btrfs: resume balance on rw (re)mounts properly
This introduces btrfs_resume_balance_async(), which, given that
restriper state was recovered earlier by btrfs_recover_balance(),
resumes balance in btrfs-balance kthread.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2012-07-02 15:39:17 -04:00
Ilya Dryomov
68310a5e42 Btrfs: restore restriper state on all mounts
Fix a bug that triggered asserts in btrfs_balance() in both normal and
resume modes -- restriper state was not properly restored on read-only
mounts.  This factors out resuming code from btrfs_restore_balance(),
which is now also called earlier in the mount sequence to avoid the
problem of some early writes getting the old profile.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2012-07-02 15:39:16 -04:00
Josef Bacik
c3473e8300 Btrfs: fix dio write vs buffered read race
Miao pointed out there's a problem with mixing dio writes and buffered
reads.  If the read happens between us invalidating the page range and
actually locking the extent we can bring in pages into page cache.  Then
once the write finishes if somebody tries to read again it will just find
uptodate pages and we'll read stale data.  So we need to lock the extent and
check for uptodate bits in the range.  If there are uptodate bits we need to
unlock and invalidate again.  This will keep this race from happening since
we will hold the extent locked until we create the ordered extent, and then
teh read side always waits for ordered extents.  There was also a race in
how we updated i_size, previously we were relying on the generic DIO stuff
to adjust the i_size after the DIO had completed, but this happens outside
of the extent lock which means reads could come in and not see the updated
i_size.  So instead move this work into where we create the extents, and
then this way the update ordered i_size stuff works properly in the endio
handlers.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
2012-07-02 15:36:23 -04:00
Stefan Behrens
597a60fade Btrfs: don't count I/O statistic read errors for missing devices
It is normal behaviour of the low level btrfs function btrfs_map_bio()
to complete a bio with -EIO if the device is missing, instead of just
preventing the bio creation in an earlier step.
This used to cause I/O statistic read error increments and annoying
printk_ratelimited messages. This commit fixes the issue.

Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Reported-by: Carey Underwood <cwillu@cwillu.com>
2012-07-02 15:36:23 -04:00
Paul E. McKenney
9d2ad24306 rcu: Make RCU_FAST_NO_HZ respect nohz= boot parameter
If the nohz= boot parameter disables nohz, then RCU_FAST_NO_HZ needs to
also disable itself.  This commit therefore checks for tick_nohz_enabled
being zero, disabling rcu_prepare_for_idle() if so.  This commit assumes
that tick_nohz_enabled can change at runtime: If this is not the case,
then a simpler approach suffices.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-07-02 12:34:43 -07:00
Paul E. McKenney
4fa3b6cb1b rcu: Fix qlen_lazy breakage
Commit d8169d4c (Make __kfree_rcu() less dependent on compiler choices)
created a macro out of an inline function in order to avoid build
breakage for certain combinations of gcc flags.  Unfortunately, it also
converted a kfree_call_rcu() to a call_rcu(), which made the rcu_data
structure's ->qlen_lazy field lose counts.  This commit therefore changes
the call_rcu() back to kfree_call_rcu().

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-07-02 12:34:43 -07:00
Paul E. McKenney
e84c48ae30 rcu: Round FAST_NO_HZ lazy timeout to nearest second
Currently, if several CPUs in the same package have all lazy RCU
callbacks, their wakeups will be uncorrelated.  If all the CPUs are in the
same power domain (as is often the case), this will result in unnecessary
power-ups of the package.  This commit therefore uses round_jiffies()
to round the timeouts to a second boundary, increasing the odds that
they can be coalesced with each other or with other timeouts.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-07-02 12:34:42 -07:00
Paul E. McKenney
28f8555364 rcu: The rcu_needs_cpu() function is not a quiescent state
The TINY_PREEMPT_RCU() function rcu_preempt_needs_cpu(), which is called
from rcu_needs_cpu(), assumes that it is in a quiescent state with respect
to the CPU.  This is no longer the case.  This commit therefore updates
rcu_preempt_needs_cpu() to make it aware that it is not running in a
quiescent state.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: Pascal Chapperon <pascal.chapperon@wanadoo.fr>
2012-07-02 12:34:42 -07:00
Paul E. McKenney
bf1304e9cd rcu: Dump only the current CPU's buffers for idle-entry/exit warnings
Problems in RCU idle entry and exit are almost always confined to the
offending CPU.  This commit therefore switches ftrace_dump() from
DUMP_ALL to DUMP_ORIG.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: Pascal Chapperon <pascal.chapperon@wanadoo.fr>
2012-07-02 12:34:42 -07:00
Paul E. McKenney
cf01537ecf rcu: Add check for CPUs going offline with callbacks queued
If a CPU goes offline with callbacks queued, those callbacks might be
indefinitely postponed, which can result in a system hang.  This commit
therefore inserts warnings for this condition.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-07-02 12:34:25 -07:00
Paul E. McKenney
95f0c1de3e rcu: Disable preemption in rcu_blocking_is_gp()
It is time to optimize CONFIG_TREE_PREEMPT_RCU's synchronize_rcu()
for uniprocessor optimization, which means that rcu_blocking_is_gp()
can no longer rely on RCU read-side critical sections having disabled
preemption.  This commit therefore disables preemption across
rcu_blocking_is_gp()'s scan of the cpu_online_mask.

(Updated from previous version to fix embarrassing bug spotted by
Wu Fengguang.)

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-07-02 12:34:25 -07:00
Carsten Emde
1c17e4d443 rcu: Prevent uninitialized string in RCU CPU stall info
An uninitialized string may be displayed at the end of the rcu_preempt
detected stall info such as

0: (1 GPs behind) idle=075/140000000000000/0 =8?^D=8?^D
                                             ^^^^^^^^^^
if CONFIG_RCU_FAST_NO_HZ is not defined.

This trivial patch clears the string in this case.

Signed-off-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-07-02 12:34:25 -07:00
Paul E. McKenney
d7118175cc rcu: Fix rcu_is_cpu_idle() #ifdef in TINY_RCU
The rcu_is_cpu_idle() function is used if CONFIG_DEBUG_LOCK_ALLOC,
but TINY_RCU defines it only when CONFIG_PROVE_RCU.  This causes
build failures when CONFIG_DEBUG_LOCK_ALLOC=y but CONFIG_PROVE_RCU=n.
This commit therefore adjusts the #ifdefs for rcu_is_cpu_idle() so
that it is defined when CONFIG_DEBUG_LOCK_ALLOC=y.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-07-02 12:34:25 -07:00