Commit graph

601186 commits

Author SHA1 Message Date
Carlos Maiolino
ffd40ef697 xfs: introduce metadata IO error class
Now we have the basic infrastructure, add the first error class so
we can build up the infrastructure in a meaningful way. Add the
metadata async write IO error class and sysfs entry, and introduce a
default configuration that matches the existing "retry forever"
behavior for async write metadata buffers.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-05-18 11:01:00 +10:00
Linus Torvalds
0b7962a6c4 Merge branch 'for-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata updates from Tejun Heo:
 "Trivial changes except for special case timeout bumping.

  I have two more libata branches which depend on SCSI and dmaengine
  tree respectively.  I'll send pull requests for them once the
  prerequisite trees are pulled in"

* 'for-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
  libata-scsi: use %*ph to dump small buffers
  treewide: Fix typos in libata.xml
  libata-core: Allow longer timeout for drive spinup from PUIS
  libata: Fixup awkward whitespace in warning by removing line continuation.
2016-05-17 18:00:39 -07:00
Carlos Maiolino
192852be8b xfs: configurable error behavior via sysfs
We need to be able to change the way XFS behaves in error
conditions depending on the type of underlying storage. This is
necessary for handling non-traditional block devices with extended
error cases, such as thin provisioned devices that can return ENOSPC
as an IO error.

Introduce the basic sysfs infrastructure needed to define and
configure error behaviours. This is done to be generic enough to
extend to configuring behaviour in other error conditions, such as
ENOMEM, which also has different desired behaviours according to
machine configuration.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-05-18 10:58:51 +10:00
Brian Foster
9bdd9bd69b xfs: buffer ->bi_end_io function requires irq-safe lock
Reports have surfaced of a lockdep splat complaining about an
irq-safe -> irq-unsafe locking order in the xfs_buf_bio_end_io() bio
completion handler. This only occurs when I/O errors are present
because bp->b_lock is only acquired in this context to protect
setting an error on the buffer. The problem is that this lock can be
acquired with the (request_queue) q->queue_lock held. See
scsi_end_request() or ata_qc_schedule_eh(), for example.

Replace the locked test/set of b_io_error with a cmpxchg() call.
This eliminates the need for the lock and thus the lock ordering
problem goes away.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-05-18 10:56:41 +10:00
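The lock-free "record only the first error" idea described in 9bdd9bd69b above can be sketched in userspace C. This is only an illustration under stated assumptions: C11 `atomic_compare_exchange_strong()` stands in for the kernel's cmpxchg(), and `struct demo_buf` and `demo_buf_set_error()` are hypothetical names, not the XFS code.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the buffer; only the error field matters here. */
struct demo_buf {
	_Atomic int io_error;	/* 0 means "no error recorded yet" */
};

/*
 * Record an I/O error without taking a lock. The compare-exchange only
 * succeeds while the field still holds 0, so the first non-zero error
 * wins and later errors leave the field unchanged.
 */
static void demo_buf_set_error(struct demo_buf *bp, int error)
{
	int expected = 0;

	if (error)
		atomic_compare_exchange_strong(&bp->io_error, &expected, error);
}
```

Because no lock is taken, a completion handler written this way has no lock-ordering relationship with whatever locks its caller already holds.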
Daniel Lezcano
e7387da520 cpuidle: Fix cpuidle_state_is_coupled() argument in cpuidle_enter()
Commit 0b89e9aa28 (cpuidle: delay enabling interrupts until all
coupled CPUs leave idle) rightfully fixed a regression by letting
the coupled idle state framework handle local interrupt enabling
when the CPU is exiting an idle state.

The current code checks if the idle state is coupled and, if so, it
will let the coupled code enable interrupts. This way, it can
decrement the ready-count before handling the interrupt. This
mechanism prevents the other CPUs from waiting for a CPU which is
handling interrupts.

But the check is done against the state index returned by the back
end driver's ->enter functions which could be different from the
initial index passed as parameter to the cpuidle_enter_state()
function.

 entered_state = target_state->enter(dev, drv, index);

 [ ... ]

 if (!cpuidle_state_is_coupled(drv, entered_state))
	local_irq_enable();

 [ ... ]

If the 'index' is referring to a coupled idle state but the
'entered_state' is *not* coupled, then the interrupts are enabled
again. All CPUs blocked on the sync barrier may busy loop longer
if the CPU has interrupts to handle before decrementing the
ready-count. That consumes more energy than it saves.

Fixes: 0b89e9aa28 (cpuidle: delay enabling interrupts until all coupled CPUs leave idle)
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: 3.15+ <stable@vger.kernel.org> # 3.15+
[ rjw: Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-18 02:48:37 +02:00
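The difference between checking the requested index and checking the entered state, as discussed in e7387da520 above, can be modelled with a small standalone C sketch. Everything here is hypothetical (the `demo_*` names, the two-entry state table, and a back end that always falls back to state 0); it is not the cpuidle code, and the commit message itself does not show the final patch.

```c
#include <stdbool.h>

/* Hypothetical idle state table: state 0 is not coupled, state 1 is. */
struct demo_state {
	bool coupled;
};

static const struct demo_state demo_states[] = { { false }, { true } };

static bool demo_state_is_coupled(int idx)
{
	return demo_states[idx].coupled;
}

/* Hypothetical back end that always falls back to the shallow state 0. */
static int demo_enter(int index)
{
	(void)index;
	return 0;
}

/*
 * Returns true if the core would enable interrupts itself rather than
 * leaving that to the coupled-state code. With check_requested set, the
 * test uses the index that was requested; otherwise it uses the index
 * the back end reports having entered.
 */
static bool demo_core_enables_irqs(int index, bool check_requested)
{
	int entered_state = demo_enter(index);
	int checked = check_requested ? index : entered_state;

	return !demo_state_is_coupled(checked);
}
```

With a coupled state requested (index 1), checking `entered_state` makes the core enable interrupts early, while checking the requested index leaves interrupt enabling to the coupled-state code.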
Linus Torvalds
6f88b5be84 regulator: Fix build warnings from regulator_can_change_voltage()
Cut down on noise for mainstream users of the API and people doing build
 testing by dropping the deprecated flag from regulator_can_change_voltage()
 as it triggers even on the EXPORT_SYMBOL_GPL() which affects all builds
 rather than just the remaining drivers with calls to it (for which fixes
 are currently pending).
 
 The function remains deprecated and is expected to be removed entirely
 in v4.8.

Merge tag 'regulator-fix-can-change-voltage' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fix from Mark Brown:
 "Fix build warnings from regulator_can_change_voltage()

  Cut down on noise for mainstream users of the API and people
  doing build testing by dropping the deprecated flag from
  regulator_can_change_voltage() as it triggers even on the
  EXPORT_SYMBOL_GPL() which affects all builds rather than just
  the remaining drivers with calls to it (for which fixes are
  currently pending).

  The function remains deprecated and is expected to be removed
  entirely in v4.8"

* tag 'regulator-fix-can-change-voltage' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: Silence build warnings from regulator_can_change_voltage()
2016-05-17 17:47:31 -07:00
Linus Torvalds
1eccc6e152 This is the bulk of GPIO changes for kernel cycle v4.7:
Core infrastructural changes:
 
 - Support for natively single-ended GPIO driver stages. This
   means that if the hardware has registers to configure open
   drain or open source configuration, we use that rather than
   (as we did before) try to emulate it by switching the line
   to an input to get high impedance. This is also documented
   thoroughly in Documentation/gpio/driver.txt for those of you
   who did not understand one word of what I just wrote.
 
 - Start to do away with the unnecessarily complex and
   unintelligible ARCH_REQUIRE_GPIOLIB and
   ARCH_WANT_OPTIONAL_GPIOLIB, another evolutional artifact from
   the time when the GPIO subsystem was unmaintained. Archs can
   now just select GPIOLIB and be done with it, cleanups to
   arches will trickle in for the next kernel. Some minor archs
   ACKed the changes immediately so these are included in this
   pull request.
 
 - Advancing the use of the data pointer inside the GPIO device
   for storing driver data by switching the PowerPC, Super-H
   Unicore and a few other subarches or subsystem drivers in
   ALSA SoC, Input, serial, SSB, staging etc to use it.
 
 - The initialization now reads the input/output state of the
   GPIO lines, so that each GPIO descriptor knows - if this
   callback is implemented - whether the line is input or
   output. This also reflects nicely in userspace "lsgpio".
 
 - It is now possible to name GPIO producer names, line names,
   from the device tree. (Platform data has been supported for
   a while.) I bet we will get a similar mechanism for ACPI
   one of these days. This makes it possible to get sensible
   producer names for e.g. GPIO rails in "lsgpio" in userspace.
 
 New drivers:
 
 - New driver for the Loongson1.
 
 - The XLP driver now supports Broadcom Vulcan ARM64.
 
 - The IT87 driver now supports IT8620 and IT8628.
 
 - The PCA953X driver now supports Galileo Gen2.
 
 Driver improvements:
 
 - MCP23S08 was switched to use the gpiolib irqchip helpers and
   now also supports level-triggered interrupts.
 
 - 74x164 and RCAR now support the .set_multiple() callback
 
 - AMDPT was converted to use generic GPIO.
 
 - TC3589x, TPS65218, SX150X, F7188X, MENZ127, VX855, WM831X, WM8994
   support the new single ended callback for open drain
   and in some cases open source.
 
 - Implement the .get_direction() callback for a few more drivers
   like PL061, Xgene.
 
 Cleanups:
 
 - Paul Gortmaker combed through the drivers and de-modularized
   those that are not really modules.
 
 - Move the GPIO poweroff DT bindings to the power subdir where
   they belong.
 
 - Rename gpio-generic.c to gpio-mmio.c, which is much more to the
   point. That's what it is handling, nothing more, nothing less.

Merge tag 'gpio-v4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO updates from Linus Walleij:
 "This is the bulk of GPIO changes for kernel cycle v4.7:

  Core infrastructural changes:

   - Support for natively single-ended GPIO driver stages.

     This means that if the hardware has registers to configure open
     drain or open source configuration, we use that rather than (as we
     did before) try to emulate it by switching the line to an input to
     get high impedance.

     This is also documented thoroughly in Documentation/gpio/driver.txt
     for those of you who did not understand one word of what I just
     wrote.

   - Start to do away with the unnecessarily complex and unintelligible
     ARCH_REQUIRE_GPIOLIB and ARCH_WANT_OPTIONAL_GPIOLIB, another
     evolutional artifact from the time when the GPIO subsystem was
     unmaintained.

     Archs can now just select GPIOLIB and be done with it, cleanups to
     arches will trickle in for the next kernel.  Some minor archs ACKed
     the changes immediately so these are included in this pull request.

   - Advancing the use of the data pointer inside the GPIO device for
     storing driver data by switching the PowerPC, Super-H Unicore and
     a few other subarches or subsystem drivers in ALSA SoC, Input,
     serial, SSB, staging etc to use it.

   - The initialization now reads the input/output state of the GPIO
     lines, so that each GPIO descriptor knows - if this callback is
     implemented - whether the line is input or output.  This also
     reflects nicely in userspace "lsgpio".

   - It is now possible to name GPIO producer names, line names, from
     the device tree.  (Platform data has been supported for a while).
     I bet we will get a similar mechanism for ACPI one of these days.
     This makes it possible to get sensible producer names for e.g.
     GPIO rails in "lsgpio" in userspace.

  New drivers:

   - New driver for the Loongson1.

   - The XLP driver now supports Broadcom Vulcan ARM64.

   - The IT87 driver now supports IT8620 and IT8628.

   - The PCA953X driver now supports Galileo Gen2.

  Driver improvements:

   - MCP23S08 was switched to use the gpiolib irqchip helpers and now
     also supports level-triggered interrupts.

   - 74x164 and RCAR now support the .set_multiple() callback

   - AMDPT was converted to use generic GPIO.

   - TC3589x, TPS65218, SX150X, F7188X, MENZ127, VX855, WM831X, WM8994
     support the new single ended callback for open drain and in some
     cases open source.

   - Implement the .get_direction() callback for a few more drivers like
     PL061, Xgene.

  Cleanups:

   - Paul Gortmaker combed through the drivers and de-modularized those
     that are not really modules.

   - Move the GPIO poweroff DT bindings to the power subdir where they
     belong.

   - Rename gpio-generic.c to gpio-mmio.c, which is much more to the
     point.  That's what it is handling, nothing more, nothing less"

* tag 'gpio-v4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (126 commits)
  MIPS: do away with ARCH_[WANT_OPTIONAL|REQUIRE]_GPIOLIB
  gpio: zevio: make it explicitly non-modular
  gpio: timberdale: make it explicitly non-modular
  gpio: stmpe: make it explicitly non-modular
  gpio: sodaville: make it explicitly non-modular
  pinctrl: sh-pfc: Let gpio_chip.to_irq() return zero on error
  gpio: dwapb: Add ACPI device ID for DWAPB GPIO controller on X-Gene platforms
  gpio: dt-bindings: add wd,mbl-gpio bindings
  gpio: of: make it possible to name GPIO lines
  gpio: make gpiod_to_irq() return negative for NO_IRQ
  gpio: xgene: implement .get_direction()
  gpio: xgene: Enable ACPI support for X-Gene GFC GPIO driver
  gpio: tegra: Implement gpio_get_direction callback
  gpio: set up initial state from .get_direction()
  gpio: rename gpio-generic.c into gpio-mmio.c
  gpio: generic: fix GPIO_GENERIC_PLATFORM is set to module case
  gpio: dwapb: add gpio-signaled acpi event support
  gpio: dwapb: convert device node to fwnode
  gpio: dwapb: remove name from dwapb_port_property
  gpio/qoriq: select IRQ_DOMAIN
  ...
2016-05-17 17:39:42 -07:00
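The open-drain handling described in the GPIO pull above (use a hardware open-drain register when one exists, otherwise emulate by switching the line to an input to get high impedance) can be sketched as follows. This is a toy model only: `struct demo_line` and `demo_set_open_drain()` are hypothetical and do not correspond to gpiolib's data structures or callbacks.

```c
#include <stdbool.h>

enum demo_dir { DEMO_DIR_INPUT, DEMO_DIR_OUTPUT };

/* Hypothetical model of a single GPIO line. */
struct demo_line {
	bool has_open_drain_reg;	/* hardware supports open drain natively */
	bool open_drain_enabled;	/* native open-drain mode bit */
	enum demo_dir dir;		/* current direction */
	bool level;			/* level driven when dir is output */
};

/*
 * Set the logical value of an open-drain line. A logical 0 actively
 * pulls the line low; a logical 1 releases it so that an external
 * pull-up can raise it.
 */
static void demo_set_open_drain(struct demo_line *line, bool value)
{
	if (line->has_open_drain_reg) {
		/* Native path: let the hardware handle the release. */
		line->open_drain_enabled = true;
		line->dir = DEMO_DIR_OUTPUT;
		line->level = value;
		return;
	}

	if (value) {
		/* Emulation: an input is high impedance, which releases the line. */
		line->dir = DEMO_DIR_INPUT;
	} else {
		/* Emulation: actively drive the line low. */
		line->dir = DEMO_DIR_OUTPUT;
		line->level = false;
	}
}
```

The emulated path never drives the line high, which is the property that makes open drain safe on a shared wire.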
Pankaj Gupta
3834abb4e6 cpufreq: simplified goto out in cpufreq_register_driver()
Simplify the goto out path in cpufreq_register_driver() to improve
code readability.

Signed-off-by: Pankaj Gupta <pankaj.gupta@spreadtrum.com>
Signed-off-by: Sanjeev Yadav <sanjeev.yadav@spreadtrum.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-18 02:34:41 +02:00
Linus Torvalds
dcc4c2f61c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Pull HID updates from Jiri Kosina:
 "No biggies this time:

   - micro-optimization of implement() in HID core parses, from Dmitry
     Torokhov

   - thingm driver cleanups from Heiner Kallweit

   - fine-graining detection of distance and tilt axes in wacom driver
     from Jason Gerecke

   - New hid-asus driver, currently supporting X205TA and VivoBook
     E200HA, from Yusuke Fujimaki"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  HID: wacom: Add fuzz factor to distance and tilt axes
  HID: usbhid: quirks for Corsair RGB keyboard & mice (K70R, K95RGB, M65RGB, K70RGB, K65RGB)
  HID: thingm: remove not needed error message
  HID: thingm: set new flag LED_HW_PLUGGABLE
  HID: thingm: factor out duplicated code to thingm_init_led
  HID: simplify implement() a bit
  HID: asus: add support for VivoBook E200HA
  HID: hidraw: silence an uninitialized variable warning
  HID: roccat: silence an uninitialized variable warning
  HID: Asus X205TA keyboard driver
  HID: hidraw: switch to using memdup_user
2016-05-17 17:34:33 -07:00
Rafael J. Wysocki
45482c703b cpufreq: governor: CPUFREQ_GOV_STOP never fails
None of the cpufreq governors currently in the tree will ever fail
an invocation of the ->governor() callback with the event argument
equal to CPUFREQ_GOV_STOP (unless invoked with incorrect arguments
which doesn't matter anyway) and it is rather difficult to imagine
a valid reason for such a failure.

Accordingly, rearrange the code in the core to make it clear that
this call never fails.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-05-18 02:28:29 +02:00
Rafael J. Wysocki
36be3418eb cpufreq: governor: CPUFREQ_GOV_POLICY_EXIT never fails
None of the cpufreq governors currently in the tree will ever fail
an invocation of the ->governor() callback with the event argument
equal to CPUFREQ_GOV_POLICY_EXIT (unless invoked with incorrect
arguments which doesn't matter anyway) and it wouldn't really
make sense to fail it, because the caller won't be able to handle
that failure in a meaningful way.

Accordingly, rearrange the code in the core to make it clear that
this call never fails.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-05-18 02:27:32 +02:00
Rafael J. Wysocki
c749c64f45 intel_pstate: Simplify conditional in intel_pstate_set_policy()
One of the if () statements in intel_pstate_set_policy() causes
another if () to be evaluated if the condition is true and it
doesn't do anything else, so merge the two if () statements into
one.

No functional changes.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2016-05-18 02:26:33 +02:00
Linus Torvalds
0b86c75db6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching
Pull livepatching updates from Jiri Kosina:

 - removal of our own implementation of architecture-specific relocation
   code, leveraging existing code in the module loader to perform
   arch-dependent work, from Jessica Yu.

   The relevant patches have been acked by Rusty (for module.c) and
   Heiko (for s390).

 - live patching support for ppc64le, which is a joint work of Michael
   Ellerman and Torsten Duwe.  This is coming from a topic branch that is
   shared between livepatching.git and ppc tree.

 - addition of livepatching documentation from Petr Mladek

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
  livepatch: make object/func-walking helpers more robust
  livepatch: Add some basic livepatch documentation
  powerpc/livepatch: Add live patching support on ppc64le
  powerpc/livepatch: Add livepatch stack to struct thread_info
  powerpc/livepatch: Add livepatch header
  livepatch: Allow architectures to specify an alternate ftrace location
  ftrace: Make ftrace_location_range() global
  livepatch: robustify klp_register_patch() API error checking
  Documentation: livepatch: outline Elf format and requirements for patch modules
  livepatch: reuse module loader code to write relocations
  module: s390: keep mod_arch_specific for livepatch modules
  module: preserve Elf information for livepatch modules
  Elf: add livepatch-specific Elf constants
2016-05-17 17:11:27 -07:00
Linus Torvalds
16bf834805 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (21 commits)
  gitignore: fix wording
  mfd: ab8500-debugfs: fix "between" in printk
  memstick: trivial fix of spelling mistake on management
  cpupowerutils: bench: fix "average"
  treewide: Fix typos in printk
  IB/mlx4: printk fix
  pinctrl: sirf/atlas7: fix printk spelling
  serial: mctrl_gpio: Grammar s/lines GPIOs/line GPIOs/, /sets/set/
  w1: comment spelling s/minmum/minimum/
  Blackfin: comment spelling s/divsor/divisor/
  metag: Fix misspellings in comments.
  ia64: Fix misspellings in comments.
  hexagon: Fix misspellings in comments.
  tools/perf: Fix misspellings in comments.
  cris: Fix misspellings in comments.
  c6x: Fix misspellings in comments.
  blackfin: Fix misspelling of 'register' in comment.
  avr32: Fix misspelling of 'definitions' in comment.
  treewide: Fix typos in printk
  Doc: treewide : Fix typos in DocBook/filesystem.xml
  ...
2016-05-17 17:05:30 -07:00
Linus Torvalds
a7fd20d1c4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Highlights:

   1) Support SPI based w5100 devices, from Akinobu Mita.

   2) Partial Segmentation Offload, from Alexander Duyck.

   3) Add GMAC4 support to stmmac driver, from Alexandre TORGUE.

   4) Allow cls_flower stats offload, from Amir Vadai.

   5) Implement bpf blinding, from Daniel Borkmann.

   6) Optimize _ASYNC_ bit twiddling on sockets, unless the socket is
      actually using FASYNC these atomics are superfluous.  From Eric
      Dumazet.

   7) Run TCP more preemptibly, also from Eric Dumazet.

   8) Support LED blinking, EEPROM dumps, and rxvlan offloading in mlx5e
      driver, from Gal Pressman.

   9) Allow creating ppp devices via rtnetlink, from Guillaume Nault.

  10) Improve BPF usage documentation, from Jesper Dangaard Brouer.

  11) Support tunneling offloads in qed, from Manish Chopra.

  12) aRFS offloading in mlx5e, from Maor Gottlieb.

  13) Add RFS and RPS support to SCTP protocol, from Marcelo Ricardo
      Leitner.

  14) Add MSG_EOR support to TCP, this allows controlling packet
      coalescing on application record boundaries for more accurate
      socket timestamp sampling.  From Martin KaFai Lau.

  15) Fix alignment of 64-bit netlink attributes across the board, from
      Nicolas Dichtel.

  16) Per-vlan stats in bridging, from Nikolay Aleksandrov.

  17) Several conversions of drivers to ethtool ksettings, from Philippe
      Reynes.

  18) Checksum neutral ILA in ipv6, from Tom Herbert.

  19) Factorize all of the various marvell dsa drivers into one, from
      Vivien Didelot

  20) Add VF support to qed driver, from Yuval Mintz"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1649 commits)
  Revert "phy dp83867: Fix compilation with CONFIG_OF_MDIO=m"
  Revert "phy dp83867: Make rgmii parameters optional"
  r8169: default to 64-bit DMA on recent PCIe chips
  phy dp83867: Make rgmii parameters optional
  phy dp83867: Fix compilation with CONFIG_OF_MDIO=m
  bpf: arm64: remove callee-save registers use for tmp registers
  asix: Fix offset calculation in asix_rx_fixup() causing slow transmissions
  switchdev: pass pointer to fib_info instead of copy
  net_sched: close another race condition in tcf_mirred_release()
  tipc: fix nametable publication field in nl compat
  drivers: net: Don't print unpopulated net_device name
  qed: add support for dcbx.
  ravb: Add missing free_irq() calls to ravb_close()
  qed: Remove a stray tab
  net: ethernet: fec-mpc52xx: use phy_ethtool_{get|set}_link_ksettings
  net: ethernet: fec-mpc52xx: use phydev from struct net_device
  bpf, doc: fix typo on bpf_asm descriptions
  stmmac: hardware TX COE doesn't work when force_thresh_dma_mode is set
  net: ethernet: fs-enet: use phy_ethtool_{get|set}_link_ksettings
  net: ethernet: fs-enet: use phydev from struct net_device
  ...
2016-05-17 16:26:30 -07:00
Andreas Gruenbacher
e0d46f5c6e btrfs: Switch to generic xattr handlers
The btrfs_{set,remove}xattr inode operations check for a read-only root
(btrfs_root_readonly) before calling into generic_{set,remove}xattr.  If
this check is moved into __btrfs_setxattr, we can get rid of
btrfs_{set,remove}xattr.

This patch applies to mainline, I would like to keep it together with
the other xattr cleanups if possible, though.  Could you please review?

Thanks,
Andreas

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-17 19:17:09 -04:00
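The refactoring idea in e0d46f5c6e above, moving the read-only check out of per-operation wrappers and into the shared helper, can be illustrated with a toy C model. The names `struct demo_root` and `demo_setxattr()` are hypothetical, the xattr counter is placeholder bookkeeping, and treating a NULL value as "remove" is an assumption made for this sketch rather than a statement about the btrfs code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical filesystem root with a toy xattr counter. */
struct demo_root {
	bool readonly;
	int nr_xattrs;
};

/*
 * Shared helper used for both set and remove. Because the read-only
 * check lives here, callers no longer need their own wrappers just to
 * perform that check before calling in.
 */
static int demo_setxattr(struct demo_root *root, const char *name,
			 const char *value)
{
	(void)name;	/* the name is not modelled in this sketch */

	if (root->readonly)
		return -EROFS;

	if (value)
		root->nr_xattrs++;		/* set */
	else if (root->nr_xattrs > 0)
		root->nr_xattrs--;		/* remove */

	return 0;
}
```

With the check in one place, every path that reaches the helper is covered, including any that would have bypassed the old wrappers.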
Andreas Gruenbacher
2b88fc21ca ubifs: Switch to generic xattr handlers
Ubifs internally uses special inodes for storing xattrs. Those inodes
had NULL {get,set,remove}xattr inode operations before this change, so
xattr operations on them would fail. The super block's s_xattr field
would also apply to those special inodes. However, the inodes are not
visible outside of ubifs, and so no xattr operations will ever be
carried out on them anyway.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-17 19:16:23 -04:00
Linus Torvalds
b80fed9595 - based on Jens' 'for-4.7/core' to have DM thinp's discard support use
bio_inc_remaining() and the block core's new async
   __blkdev_issue_discard() interface
 
 - make DM multipath's fast code-paths lockless, using lockless_dereference,
   to significantly improve large NUMA performance when using blk-mq.  The
   m->lock spinlock contention was a serious bottleneck.
 
 - a few other small code cleanups and Documentation fixes

Merge tag 'dm-4.7-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper updates from Mike Snitzer:

 - based on Jens' 'for-4.7/core' to have DM thinp's discard support use
   bio_inc_remaining() and the block core's new async __blkdev_issue_discard()
   interface

 - make DM multipath's fast code-paths lockless, using lockless_dereference,
   to significantly improve large NUMA performance when using blk-mq.
   The m->lock spinlock contention was a serious bottleneck.

 - a few other small code cleanups and Documentation fixes

* tag 'dm-4.7-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm thin: unroll issue_discard() to create longer discard bio chains
  dm thin: use __blkdev_issue_discard for async discard support
  dm thin: remove __bio_inc_remaining() and switch to using bio_inc_remaining()
  dm raid: make sure no feature flags are set in metadata
  dm ioctl: drop use of __GFP_REPEAT in copy_params()'s __vmalloc() call
  dm stats: fix spelling mistake in Documentation
  dm cache: update cache-policies.txt now that mq is an alias for smq
  dm mpath: eliminate use of spinlock in IO fast-paths
  dm mpath: move trigger_event member to the end of 'struct multipath'
  dm mpath: use atomic_t for counting members of 'struct multipath'
  dm mpath: switch to using bitops for state flags
  dm thin: Remove return statement from void function
  dm: remove unused mapped_device argument from free_tio()
2016-05-17 16:13:00 -07:00
Linus Torvalds
24b9f0cf00 Merge branch 'for-4.7/drivers' of git://git.kernel.dk/linux-block
Pull block driver updates from Jens Axboe:
 "On top of the core pull request, this is the drivers pull request for
  this merge window.  This contains:

   - Switch drivers to the new write back cache API, and kill off the
     flush flags.  From me.

   - Kill the discard support for the STEC pci-e flash driver.  It's
     trivially broken, and apparently unmaintained, so it's safer to
     just remove it.  From Jeff Moyer.

   - A set of lightnvm updates from the usual suspects (Matias/Javier,
     and Simon), and fixes from Arnd, Jeff Mahoney, Sagi, and Wenwei
     Tao.

   - A set of updates for NVMe:

        - Turn the controller state management into a proper state
          machine.  From Christoph.

        - Shuffling of code in preparation for NVMe-over-fabrics, also
          from Christoph.

        - Cleanup of the command prep part from Ming Lin.

        - Rewrite of the discard support from Ming Lin.

        - Deadlock fix for namespace removal from Ming Lin.

        - Use the now exported blk-mq tag helper for IO termination.
          From Sagi.

        - Various little fixes from Christoph, Guilherme, Keith, Ming
          Lin, Wang Sheng-Hui.

   - Convert mtip32xx to use the now exported blk-mq tag iter function,
     from Keith"

* 'for-4.7/drivers' of git://git.kernel.dk/linux-block: (74 commits)
  lightnvm: reserved space calculation incorrect
  lightnvm: rename nr_pages to nr_ppas on nvm_rq
  lightnvm: add is_cached entry to struct ppa_addr
  lightnvm: expose gennvm_mark_blk to targets
  lightnvm: remove mgt targets on mgt removal
  lightnvm: pass dma address to hardware rather than pointer
  lightnvm: do not assume sequential lun alloc.
  nvme/lightnvm: Log using the ctrl named device
  lightnvm: rename dma helper functions
  lightnvm: enable metadata to be sent to device
  lightnvm: do not free unused metadata on rrpc
  lightnvm: fix out of bound ppa lun id on bb tbl
  lightnvm: refactor set_bb_tbl for accepting ppa list
  lightnvm: move responsibility for bad blk mgmt to target
  lightnvm: make nvm_set_rqd_ppalist() aware of vblks
  lightnvm: remove struct factory_blks
  lightnvm: refactor device ops->get_bb_tbl()
  lightnvm: introduce nvm_for_each_lun_ppa() macro
  lightnvm: refactor dev->online_target to global nvm_targets
  lightnvm: rename nvm_targets to nvm_tgt_type
  ...
2016-05-17 16:03:32 -07:00
Linus Torvalds
a4d1dbed0e Merge branch 'for-4.7/core' of git://git.kernel.dk/linux-block
Pull core block layer updates from Jens Axboe:
 "This is the core block IO changes for this merge window.  Nothing
  earth shattering in here, it's mostly just fixes.  In detail:

   - Fix for a long standing issue where wrong ordering in blk-mq caused
     order_to_size() to spew a warning.  From Bart.

   - Async discard support from Christoph.  Basically just splitting our
     sync interface into a submit + wait part.

   - Add a cleaner interface for flagging whether a device has a write
     back cache or not.  We've previously overloaded blk_queue_flush()
     with this, but let's make it more explicit.  Drivers cleaned up and
     updated in the drivers pull request.  From me.

   - Fix for a double check for whether IO accounting is enabled or not.
     From Michael Callahan.

   - Fix for the async discard from Mike Snitzer, reinstating the early
     EOPNOTSUPP return if the device doesn't support discards.

   - Also from Mike, export bio_inc_remaining() so dm can drop its
     private copy of it.

   - From Ming Lin, add support for passing in an offset for request
     payloads.

   - Tag function export from Sagi, which will be used in NVMe in the
     drivers pull.

   - Two blktrace related fixes from Shaohua.

   - Propagate NOMERGE flag when making a request from a bio, also from
     Shaohua.

   - An optimization to not parse cgroup paths in blk-throttle, if we
     don't need to.  From Shaohua"

* 'for-4.7/core' of git://git.kernel.dk/linux-block:
  blk-mq: fix undefined behaviour in order_to_size()
  blk-throttle: don't parse cgroup path if trace isn't enabled
  blktrace: add missed mask name
  blktrace: delete garbage for message trace
  block: make bio_inc_remaining() interface accessible again
  block: reinstate early return of -EOPNOTSUPP from blkdev_issue_discard
  block: Minor blk_account_io_start usage cleanup
  block: add __blkdev_issue_discard
  block: remove struct bio_batch
  block: copy NOMERGE flag from bio to request
  block: add ability to flag write back caching on a device
  blk-mq: Export tagset iter function
  block: add offset in blk_add_request_payload()
  writeback: Fix performance regression in wb_over_bg_thresh()
2016-05-17 15:29:49 -07:00
Kees Cook
9f8036643d doc: self-protection: provide initial details
This document attempts to codify the intent around kernel self-protection
along with discussion of both existing and desired technologies, with
attention given to the rationale behind them, and the expectations of
their usage.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
[jc: applied fixes suggested by Randy]
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2016-05-17 16:24:52 -06:00
Linus Torvalds
c2e7b20705 Merge branch 'work.preadv2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs cleanups from Al Viro:
 "More cleanups from Christoph"

* 'work.preadv2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  nfsd: use RWF_SYNC
  fs: add RWF_DSYNC and RWF_SYNC
  ceph: use generic_write_sync
  fs: simplify the generic_write_sync prototype
  fs: add IOCB_SYNC and IOCB_DSYNC
  direct-io: remove the offset argument to dio_complete
  direct-io: eliminate the offset argument to ->direct_IO
  xfs: eliminate the pos variable in xfs_file_dio_aio_write
  filemap: remove the pos argument to generic_file_direct_write
  filemap: remove pos variables in generic_file_read_iter
2016-05-17 15:05:23 -07:00
Chris Mason
c315ef8d9d Merge branch 'for-chris-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/fdmanana/linux into for-linus-4.7
Signed-off-by: Chris Mason <clm@fb.com>
2016-05-17 14:43:19 -07:00
Linus Torvalds
c52b76185b Merge branch 'work.const-path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull 'struct path' constification update from Al Viro:
 "'struct path' is passed by reference to a bunch of Linux security
  methods; in theory, there's nothing to stop them from modifying the
  damn thing and LSM community being what it is, sooner or later some
  enterprising soul is going to decide that it's a good idea.

  Let's remove the temptation and constify all of those..."

* 'work.const-path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  constify ima_d_path()
  constify security_sb_pivotroot()
  constify security_path_chroot()
  constify security_path_{link,rename}
  apparmor: remove useless checks for NULL ->mnt
  constify security_path_{mkdir,mknod,symlink}
  constify security_path_{unlink,rmdir}
  apparmor: constify common_perm_...()
  apparmor: constify aa_path_link()
  apparmor: new helper - common_path_perm()
  constify chmod_common/security_path_chmod
  constify security_sb_mount()
  constify chown_common/security_path_chown
  tomoyo: constify assorted struct path *
  apparmor_path_truncate(): path->mnt is never NULL
  constify vfs_truncate()
  constify security_path_truncate()
  [apparmor] constify struct path * in a bunch of helpers
2016-05-17 14:41:03 -07:00
Linus Torvalds
681750c046 Merge branch 'for-cifs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull cifs xattr updates from Al Viro:
 "This is the remaining parts of the xattr work - the cifs bits"

* 'for-cifs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  cifs: Switch to generic xattr handlers
  cifs: Fix removexattr for os2.* xattrs
  cifs: Check for equality with ACL_TYPE_ACCESS and ACL_TYPE_DEFAULT
  cifs: Fix xattr name checks
2016-05-17 14:35:45 -07:00
Linus Torvalds
820c687b70 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull UDF fixes from Jan Kara:
 "A fix for UDF crash on corrupted media and one UDF header fixup"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: Export superblock magic to userspace
  udf: Prevent stack overflow on corrupted filesystem mount
2016-05-17 14:25:02 -07:00
Chris Mason
a88336d13c Merge branch 'for-chris-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.7
2016-05-17 14:24:44 -07:00
Linus Torvalds
dba1e98731 Some jfs logging cleanups from Joe Perches

Merge tag 'jfs-4.7' of git://github.com/kleikamp/linux-shaggy

Pull jfs updates from Dave Kleikamp:
 "Some jfs logging cleanups from Joe Perches"

* tag 'jfs-4.7' of git://github.com/kleikamp/linux-shaggy:
  jfs: Coalesce some formats
  jfs: Remove unnecessary line continuations and terminating newlines
  jfs: Remove terminating newlines from jfs_info, jfs_warn, jfs_err uses
2016-05-17 14:15:18 -07:00
Kees Cook
cb6fd68fdd exec: clarify reasoning for euid/egid reset
This section of code initially looks redundant, but is required. This
improves the comment to explain more clearly why the reset is needed.

Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-17 13:56:53 -07:00
Jeff Layton
1b3c6d07e2 pnfs: make pnfs_layout_process more robust
It can return NULL if layoutgets are blocked currently. Fix it to return
-EAGAIN in that case, so we can properly handle it in pnfs_update_layout.

Also, clean up and simplify the error handling -- eliminate "status" and
just use "lseg".

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:13 -04:00
Jeff Layton
183d9e7b11 pnfs: rework LAYOUTGET retry handling
There are several problems in the way a stateid is selected for a
LAYOUTGET operation:

We pick a stateid to use in the RPC prepare op, but that makes
it difficult to serialize LAYOUTGETs that use the open stateid. That
serialization is done in pnfs_update_layout, which occurs well before
the rpc_prepare operation.

Between those two events, the i_lock is dropped and reacquired.
pnfs_update_layout can find that the list has lsegs in it and not do any
serialization, but then later pnfs_choose_layoutget_stateid ends up
choosing the open stateid.

This patch changes the client to select the stateid to use in the
LAYOUTGET earlier, when we're searching for a usable layout segment.
This way we can do it all while holding the i_lock the first time, and
ensure that we serialize any LAYOUTGET call that uses a non-layout
stateid.

This also means a rework of how LAYOUTGET replies are handled, as we
must now get the latest stateid if we want to retransmit in response
to a retryable error.

Most of those errors boil down to the fact that the layout state has
changed in some fashion. Thus, what we really want to do is to re-search
for a layout when it fails with a retryable error, so that we can avoid
reissuing the RPC at all if possible.

While the LAYOUTGET RPC is async, the initiating thread always waits for
it to complete, so it's effectively synchronous anyway. Currently, when
we need to retry a LAYOUTGET because of an error, we drive that retry
via the rpc state machine.

This means that once the call has been submitted, it runs until it
completes. So, we must move the error handling for this RPC out of the
rpc_call_done operation and into the caller.

In order to handle errors like NFS4ERR_DELAY properly, we must also
pass a pointer to the sliding timeout, which is now moved to the stack
in pnfs_update_layout.

The complicating errors are -NFS4ERR_RECALLCONFLICT and
-NFS4ERR_LAYOUTTRYLATER, as those involve a timeout after which we give
up and return NULL back to the caller. So, there is some special
handling for those errors to ensure that the layers driving the retries
can handle that appropriately.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:12 -04:00
Jeff Layton
83026d80a1 pnfs: lift retry logic from send_layoutget to pnfs_update_layout
If we get back something like NFS4ERR_OLD_STATEID, that will be
translated into -EAGAIN, and the do/while loop in send_layoutget
will drive the call again.

This is not quite what we want, I think. An error like that is a
sign that something has changed. That something could have been a
concurrent LAYOUTGET that would give us a usable lseg.

Lift the retry logic into pnfs_update_layout instead. That allows
us to redo the layout search, and may spare us from having to issue
an RPC.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:12 -04:00
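
The idea in the commit above (retry in the caller so that the layout search is redone before another RPC goes out) can be sketched as follows. This is a minimal userspace illustration, not the kernel code: struct layout_state, find_cached_lseg(), send_layoutget_once() and update_layout() are hypothetical names, and the fake RPC simply pretends that a concurrent LAYOUTGET installed a usable segment whenever it returns -EAGAIN.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified layout segment. */
struct lseg {
    int id;
};

/* Hypothetical per-file layout state used only for this sketch. */
struct layout_state {
    struct lseg *cached;     /* usable segment found by the search, if any */
    int rpc_calls;           /* number of simulated LAYOUTGET RPCs issued */
    int eagain_left;         /* how many times the fake RPC fails with -EAGAIN */
    struct lseg from_other;  /* segment installed by a "concurrent" LAYOUTGET */
    struct lseg from_rpc;    /* segment returned by our own successful RPC */
};

/* Layout search: return a usable cached segment or NULL. */
static struct lseg *find_cached_lseg(struct layout_state *st)
{
    return st->cached;
}

/*
 * Fake single-shot LAYOUTGET. On -EAGAIN it simulates the state change
 * that caused the error: another LAYOUTGET has installed a segment.
 */
static int send_layoutget_once(struct layout_state *st, struct lseg **out)
{
    st->rpc_calls++;
    if (st->eagain_left > 0) {
        st->eagain_left--;
        st->cached = &st->from_other;
        return -EAGAIN;
    }
    st->cached = &st->from_rpc;
    *out = st->cached;
    return 0;
}

/* Caller-level retry: redo the search first, issue an RPC only if needed. */
static struct lseg *update_layout(struct layout_state *st)
{
    struct lseg *lseg;
    int err;

    for (;;) {
        lseg = find_cached_lseg(st);
        if (lseg)
            return lseg;         /* search succeeded, no RPC required */

        err = send_layoutget_once(st, &lseg);
        if (err == 0)
            return lseg;
        if (err != -EAGAIN)
            return NULL;         /* non-retryable error: give up */
        /* retryable error: loop and search again before resending */
    }
}
```

With one simulated -EAGAIN, the second pass through the loop finds the segment left by the other LAYOUTGET, so only a single RPC is ever sent. A retry loop inside the send function would have reissued the call instead.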
Jeff Layton
d03ab29dbb pnfs: fix bad error handling in send_layoutget
Currently, the code will clear the fail bit if we get back a fatal
error. I don't think that's correct -- we want to clear that bit
if we do not get a fatal error.

Fixes: 0bcbf039f6 (nfs: handle request add failure properly)
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:11 -04:00
Jeff Layton
95e2b7e95d flexfiles: add kerneldoc header to nfs4_ff_layout_prepare_ds
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:11 -04:00
Jeff Layton
094069f1d9 flexfiles: remove pointless setting of NFS_LAYOUT_RETURN_REQUESTED
Setting just the NFS_LAYOUT_RETURN_REQUESTED flag doesn't do anything,
unless there are lsegs that are also being marked for return. At the
point where that happens this flag is also set, so these set_bit calls
don't do anything useful.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:11 -04:00
Jeff Layton
6d597e1750 pnfs: only tear down lsegs that precede seqid in LAYOUTRETURN args
LAYOUTRETURN is "special" in that servers and clients are expected to
work with old stateids. When the client sends a LAYOUTRETURN with an old
stateid in it then the server is expected to only tear down layout
segments that were present when that seqid was current. Ensure that the
client handles its accounting accordingly.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:10 -04:00
Jeff Layton
3982a6a2d0 pnfs: keep track of the return sequence number in pnfs_layout_hdr
When we want to selectively do a LAYOUTRETURN, we need to specify a
stateid that represents the most recent layout acquisition that is to be
returned.

When we mark a layout stateid to be returned, we update the return
sequence number in the layout header with that value, if it's newer
than the existing one. Then, when we go to do a LAYOUTRETURN on
layout header put, we overwrite the seqid in the stateid with the
saved one, and then zero it out.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:10 -04:00
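
The bookkeeping described in the commit above can be sketched roughly as below. This is an illustrative userspace model under stated assumptions, not the kernel implementation: struct stateid, struct layout_hdr, seqid_is_newer(), mark_for_return() and prepare_layoutreturn() are hypothetical names, a saved value of 0 is assumed to mean "nothing recorded", and the wraparound-safe comparison is an assumption about how "newer" is decided.

```c
#include <stdint.h>

/* Hypothetical, simplified stateid: only the sequence number matters here. */
struct stateid {
    uint32_t seqid;
};

/* Hypothetical stand-in for the layout header. */
struct layout_hdr {
    struct stateid stateid;  /* current layout stateid */
    uint32_t return_seq;     /* saved seqid for LAYOUTRETURN, 0 = none */
};

/* Assumed wraparound-safe test for "a is newer than b". */
static int seqid_is_newer(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) > 0;
}

/* Record the seqid of a layout marked for return, keeping the newest one. */
static void mark_for_return(struct layout_hdr *lo, uint32_t seq)
{
    if (lo->return_seq == 0 || seqid_is_newer(seq, lo->return_seq))
        lo->return_seq = seq;
}

/*
 * Build the stateid to send in LAYOUTRETURN: copy the current stateid,
 * overwrite its seqid with the saved one, then zero the saved value.
 */
static void prepare_layoutreturn(struct layout_hdr *lo, struct stateid *out)
{
    *out = lo->stateid;
    if (lo->return_seq != 0) {
        out->seqid = lo->return_seq;
        lo->return_seq = 0;
    }
}
```

Marking seqids 3, 5 and 4 for return leaves 5 saved, which is the value that ends up in the outgoing stateid. The header's own stateid is left untouched, and the saved value is reset so that a later return starts clean.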
Jeff Layton
6675528380 pnfs: record sequence in pnfs_layout_segment when it's created
In later patches, we're going to teach the client to be more selective
about how it returns layouts. This means keeping a record of what the
stateid's seqid was at the time that the server handed out a layout
segment.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:09 -04:00
Jeff Layton
ee26bdd680 pnfs: don't merge new ff lsegs with ones that have LAYOUTRETURN bit set
Otherwise, we'll end up returning layouts that we've just received if
the client issues a new LAYOUTGET prior to the LAYOUTRETURN.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:09 -04:00
Tom Haynes
446ca21953 pNFS/flexfiles: When initing reads or writes, we might have to retry connecting to DSes
If we are initializing reads or writes and can not connect to a DS, then
check whether or not IO is allowed through the MDS. If it is allowed,
reset to the MDS. Else, fail the layout segment and force a retry
of a new layout segment.

Signed-off-by: Tom Haynes <loghyr@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:08 -04:00
Tom Haynes
3b13b4b311 pNFS/flexfiles: When checking for available DSes, conditionally check for MDS io
Whenever we check to see if we have the needed number of DSes for the
action, we may also have to check to see whether IO is allowed to go to
the MDS or not.

[jlayton: fix merge conflict due to lack of localio patches here]

Signed-off-by: Tom Haynes <loghyr@primarydata.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:08 -04:00
Trond Myklebust
75bf47ebf6 pNFS/flexfile: Fix erroneous fall back to read/write through the MDS
This patch fixes a problem whereby the pNFS client falls back to doing
reads and writes through the metadata server even when the layout flag
FF_FLAGS_NO_IO_THRU_MDS is set.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:07 -04:00
Trond Myklebust
cca588d6c8 NFS: Reclaim writes via writepage are opportunistic
No need to make them a priority any more, or to make them succeed.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:07 -04:00
Trond Myklebust
abf4e13cc1 NFSv4: Use the right stateid for delegations in setattr, read and write
When we're using a delegation to represent our open state, we should
ensure that we use the stateid that was used to create that delegation.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:07 -04:00
Trond Myklebust
93b717fd81 NFSv4: Label stateids with the type
In order to more easily distinguish what kind of stateid we are dealing
with, introduce a type that can be used to label the stateid structure.

The label will be useful both for debugging and when dealing with
operations like SETATTR, READ and WRITE that can take several different
types of stateid as arguments.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:06 -04:00
Trond Myklebust
9a8f6b5ea2 SUNRPC: Ensure get_rpccred() and put_rpccred() can take NULL arguments
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:06 -04:00
Trond Myklebust
f538d0ba5b pNFS: Fix a leaked layoutstats flag
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:05 -04:00
Chuck Lever
6e14a92c36 xprtrdma: Remove qplock
Clean up.

After "xprtrdma: Remove ro_unmap() from all registration modes",
there are no longer any sites that take rpcrdma_ia::qplock for read.
The one site that takes it for write is always single-threaded. It
is safe to remove it.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:05 -04:00
Chuck Lever
b2dde94bfa xprtrdma: Faster server reboot recovery
In a cluster failover scenario, it is desirable for the client to
attempt to reconnect quickly, as an alternate NFS server is already
waiting to take over for the down server. The client can't see that
a server IP address has moved to a new server until the existing
connection is gone.

For fabrics and devices where it is meaningful, set a definite upper
bound on the amount of time before it is determined that a
connection is no longer valid. This allows the RPC client to detect
connection loss in a timely manner, then perform a fresh resolution
of the server GUID in case it has changed (cluster failover).

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:04 -04:00
Chuck Lever
0b043b9fb5 xprtrdma: Remove ro_unmap() from all registration modes
Clean up: The ro_unmap method is no longer used.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 15:48:04 -04:00