Commit graph

573726 commits

Author SHA1 Message Date
Linus Torvalds
e9f57ebcba Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs updates from Miklos Szeredi:
 "This contains several bug fixes and a new mount option
  'default_permissions' that allows read-only exported NFS
  filesystems to be used as lower layer"

* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: check dentry positiveness in ovl_cleanup_whiteouts()
  ovl: setattr: check permissions before copy-up
  ovl: root: copy attr
  ovl: move super block magic number to magic.h
  ovl: use a minimal buffer in ovl_copy_xattr
  ovl: allow zero size xattr
  ovl: default permissions
2016-01-21 12:20:46 -08:00
Linus Torvalds
5c89e9ea7e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse updates from Miklos Szeredi:
 "This adds SEEK_HOLE and SEEK_DATA support in lseek"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: add support for SEEK_HOLE and SEEK_DATA in lseek
2016-01-21 12:14:24 -08:00
Linus Torvalds
d43421565b PCI changes for the v4.5 merge window:
Enumeration
     Simplify config space size computation (Bjorn Helgaas)
     Avoid iterating through ROM outside the resource window (Edward O'Callaghan)
     Support PCIe devices with short cfg_size (Jason S. McMullan)
     Add Netronome vendor and device IDs (Jason S. McMullan)
     Limit config space size for Netronome NFP6000 family (Jason S. McMullan)
     Add Netronome NFP4000 PF device ID (Simon Horman)
     Limit config space size for Netronome NFP4000 (Simon Horman)
     Print warnings for all invalid expansion ROM headers (Vladis Dronov)
 
   Resource management
     Fix minimum allocation address overwrite (Christoph Biedl)
 
   PCI device hotplug
     acpiphp_ibm: Fix null dereferences on null ibm_slot (Colin Ian King)
     pciehp: Always protect pciehp_disable_slot() with hotplug mutex (Guenter Roeck)
     shpchp: Constify hpc_ops structure (Julia Lawall)
     ibmphp: Remove unneeded NULL test (Julia Lawall)
 
   Power management
     Make ASPM sysfs link_state_store() consistent with link_state_show() (Andy Lutomirski)
 
   Virtualization
     Add function 1 DMA alias quirk for Lite-On/Plextor M6e/Marvell 88SS9183 (Tim Sander)
 
   MSI
     Remove empty pci_msi_init_pci_dev() (Bjorn Helgaas)
     Mark PCIe/PCI (MSI) IRQ cascade handlers as IRQF_NO_THREAD (Grygorii Strashko)
     Initialize MSI capability for all architectures (Guilherme G. Piccoli)
     Relax msi_domain_alloc() to support parentless MSI irqdomains (Liu Jiang)
 
   ARM Versatile host bridge driver
     Remove unused pci_sys_data structures (Lorenzo Pieralisi)
 
   Broadcom iProc host bridge driver
     Hide CONFIG_PCIE_IPROC (Arnd Bergmann)
     Do not use 0x in front of %pap (Dmitry V. Krivenok)
     Update iProc PCIe device tree binding (Ray Jui)
     Add PAXC interface support (Ray Jui)
     Add iProc PCIe MSI device tree binding (Ray Jui)
     Add iProc PCIe MSI support (Ray Jui)
 
   Freescale i.MX6 host bridge driver
     Use gpio_set_value_cansleep() (Fabio Estevam)
     Add support for active-low reset GPIO (Petr Štetiar)
 
   HiSilicon host bridge driver
     Add support for HiSilicon Hip06 PCIe host controllers (Gabriele Paoloni)
 
   Intel VMD host bridge driver
     Export irq_domain_set_info() for module use (Keith Busch)
     x86/PCI: Allow DMA ops specific to a PCI domain (Keith Busch)
     Use 32 bit PCI domain numbers (Keith Busch)
     Add driver for Intel Volume Management Device (VMD) (Keith Busch)
 
   Qualcomm host bridge driver
     Document PCIe devicetree bindings (Stanimir Varbanov)
     Add Qualcomm PCIe controller driver (Stanimir Varbanov)
     dts: apq8064: add PCIe devicetree node (Stanimir Varbanov)
     dts: ifc6410: enable PCIe DT node for this board (Stanimir Varbanov)
 
   Renesas R-Car host bridge driver
     Add support for R-Car H3 to pcie-rcar (Harunobu Kurokawa)
     Allow DT to override default window settings (Phil Edworthy)
     Convert to DT resource parsing API (Phil Edworthy)
     Revert "PCI: rcar: Build pcie-rcar.c only on ARM" (Phil Edworthy)
     Remove unused pci_sys_data struct from pcie-rcar (Phil Edworthy)
     Add runtime PM support to pcie-rcar (Phil Edworthy)
     Add Gen2 PHY setup to pcie-rcar (Phil Edworthy)
     Add gen2 fallback compatibility string for pci-rcar-gen2 (Simon Horman)
     Add gen2 fallback compatibility string for pcie-rcar (Simon Horman)
 
   Synopsys DesignWare host bridge driver
     Simplify control flow (Bjorn Helgaas)
     Make config accessor override checking symmetric (Bjorn Helgaas)
     Ensure ATU is enabled before IO/conf space accesses (Stanimir Varbanov)
 
   Miscellaneous
     Add of_pci_get_host_bridge_resources() stub (Arnd Bergmann)
     Check for PCI_HEADER_TYPE_BRIDGE equality, not bitmask (Bjorn Helgaas)
     Fix all whitespace issues (Bogicevic Sasa)
     x86/PCI: Simplify pci_bios_{read,write} (Geliang Tang)
     Use to_pci_dev() instead of open-coding it (Geliang Tang)
     Use kobj_to_dev() instead of open-coding it (Geliang Tang)
     Use list_for_each_entry() to simplify code (Geliang Tang)
     Fix typos in <linux/msi.h> (Thomas Petazzoni)
     x86/PCI: Clarify AMD Fam10h config access restrictions comment (Tomasz Nowicki)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWoQIuAAoJEFmIoMA60/r8ckYP/0ZrkANeN1SB5cQVi2k7aceq
 kQb1Hk6ifxohJvgpJ/iwmVCHoApyeBfUBfrC+fUpIC2f7ncPsE5HNyjqpAWzFzj2
 sYWwY029yjBQ9g4mPhkvjBXfha+lNtLthWc+Xxcat5pdcyG63Dg4SfJKWm2ZYnbN
 0GJzyRZXIwAMnNf0KIr61Aqru0nXeHvi5wblyJ08UZ7AcNzCtB0wKLmE3S6SeZVF
 f2fry35zcGu+TFvQ1hAYemfl3XyDBJ87nPiKzJAwYSaKcWPFWt+72PBDPO6X9squ
 6prm4nmAgeG2Oo4Zu0fbkDlB2bEsWUc14/xT0i5Wfs35vcwzF+S1zirJAtVqoNir
 NgC7fSbEHbsS7FZOz0rBOBIvIkbb6NdfLFuZqUFv0X1M5bRFywjo8lZRfAYoGJzK
 Mmus0uKbklx5m6RT5adf9+Plev1YJT6XZW9XrDpGnxrwRyPjHmyvuTWsYkumxY7Q
 CE5Wr3p7q2I2+MtrQVv2D9Nzsb+4zQ6BgHrd2vwR/IxTsfdXLU7+B691wkUDX8No
 UKFxBd0FiVCn+srG96u7lWQvdoUqoNCogTZSVzGR5gFBv3zAN9gi8HS7NbV558Mg
 Io3Xw+6dcbG33uvWdU6jHEDLMQsohZcp05Q5esCgRQNV4cGJbPxBDtOZEO/ezvW4
 FAI7lfgYTFiQK3NzE3Ng
 =z9mQ
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.5-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "PCI changes for the v4.5 merge window:

  Enumeration:
   - Simplify config space size computation (Bjorn Helgaas)
   - Avoid iterating through ROM outside the resource window (Edward O'Callaghan)
   - Support PCIe devices with short cfg_size (Jason S. McMullan)
   - Add Netronome vendor and device IDs (Jason S. McMullan)
   - Limit config space size for Netronome NFP6000 family (Jason S. McMullan)
   - Add Netronome NFP4000 PF device ID (Simon Horman)
   - Limit config space size for Netronome NFP4000 (Simon Horman)
   - Print warnings for all invalid expansion ROM headers (Vladis Dronov)

  Resource management:
   - Fix minimum allocation address overwrite (Christoph Biedl)

  PCI device hotplug:
   - acpiphp_ibm: Fix null dereferences on null ibm_slot (Colin Ian King)
   - pciehp: Always protect pciehp_disable_slot() with hotplug mutex (Guenter Roeck)
   - shpchp: Constify hpc_ops structure (Julia Lawall)
   - ibmphp: Remove unneeded NULL test (Julia Lawall)

  Power management:
   - Make ASPM sysfs link_state_store() consistent with link_state_show() (Andy Lutomirski)

  Virtualization
   - Add function 1 DMA alias quirk for Lite-On/Plextor M6e/Marvell 88SS9183 (Tim Sander)

  MSI:
   - Remove empty pci_msi_init_pci_dev() (Bjorn Helgaas)
   - Mark PCIe/PCI (MSI) IRQ cascade handlers as IRQF_NO_THREAD (Grygorii Strashko)
   - Initialize MSI capability for all architectures (Guilherme G. Piccoli)
   - Relax msi_domain_alloc() to support parentless MSI irqdomains (Liu Jiang)

  ARM Versatile host bridge driver:
   - Remove unused pci_sys_data structures (Lorenzo Pieralisi)

  Broadcom iProc host bridge driver:
   - Hide CONFIG_PCIE_IPROC (Arnd Bergmann)
   - Do not use 0x in front of %pap (Dmitry V. Krivenok)
   - Update iProc PCIe device tree binding (Ray Jui)
   - Add PAXC interface support (Ray Jui)
   - Add iProc PCIe MSI device tree binding (Ray Jui)
   - Add iProc PCIe MSI support (Ray Jui)

  Freescale i.MX6 host bridge driver:
   - Use gpio_set_value_cansleep() (Fabio Estevam)
   - Add support for active-low reset GPIO (Petr Štetiar)

  HiSilicon host bridge driver:
   - Add support for HiSilicon Hip06 PCIe host controllers (Gabriele Paoloni)

  Intel VMD host bridge driver:
   - Export irq_domain_set_info() for module use (Keith Busch)
   - x86/PCI: Allow DMA ops specific to a PCI domain (Keith Busch)
   - Use 32 bit PCI domain numbers (Keith Busch)
   - Add driver for Intel Volume Management Device (VMD) (Keith Busch)

  Qualcomm host bridge driver:
   - Document PCIe devicetree bindings (Stanimir Varbanov)
   - Add Qualcomm PCIe controller driver (Stanimir Varbanov)
   - dts: apq8064: add PCIe devicetree node (Stanimir Varbanov)
   - dts: ifc6410: enable PCIe DT node for this board (Stanimir Varbanov)

  Renesas R-Car host bridge driver:
   - Add support for R-Car H3 to pcie-rcar (Harunobu Kurokawa)
   - Allow DT to override default window settings (Phil Edworthy)
   - Convert to DT resource parsing API (Phil Edworthy)
   - Revert "PCI: rcar: Build pcie-rcar.c only on ARM" (Phil Edworthy)
   - Remove unused pci_sys_data struct from pcie-rcar (Phil Edworthy)
   - Add runtime PM support to pcie-rcar (Phil Edworthy)
   - Add Gen2 PHY setup to pcie-rcar (Phil Edworthy)
   - Add gen2 fallback compatibility string for pci-rcar-gen2 (Simon Horman)
   - Add gen2 fallback compatibility string for pcie-rcar (Simon Horman)

  Synopsys DesignWare host bridge driver:
   - Simplify control flow (Bjorn Helgaas)
   - Make config accessor override checking symmetric (Bjorn Helgaas)
   - Ensure ATU is enabled before IO/conf space accesses (Stanimir Varbanov)

  Miscellaneous:
   - Add of_pci_get_host_bridge_resources() stub (Arnd Bergmann)
   - Check for PCI_HEADER_TYPE_BRIDGE equality, not bitmask (Bjorn Helgaas)
   - Fix all whitespace issues (Bogicevic Sasa)
   - x86/PCI: Simplify pci_bios_{read,write} (Geliang Tang)
   - Use to_pci_dev() instead of open-coding it (Geliang Tang)
   - Use kobj_to_dev() instead of open-coding it (Geliang Tang)
   - Use list_for_each_entry() to simplify code (Geliang Tang)
   - Fix typos in <linux/msi.h> (Thomas Petazzoni)
   - x86/PCI: Clarify AMD Fam10h config access restrictions comment (Tomasz Nowicki)"

* tag 'pci-v4.5-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (58 commits)
  PCI: Add function 1 DMA alias quirk for Lite-On/Plextor M6e/Marvell 88SS9183
  PCI: Limit config space size for Netronome NFP4000
  PCI: Add Netronome NFP4000 PF device ID
  x86/PCI: Add driver for Intel Volume Management Device (VMD)
  PCI/AER: Use 32 bit PCI domain numbers
  x86/PCI: Allow DMA ops specific to a PCI domain
  irqdomain: Export irq_domain_set_info() for module use
  PCI: host: Add of_pci_get_host_bridge_resources() stub
  genirq/MSI: Relax msi_domain_alloc() to support parentless MSI irqdomains
  PCI: rcar: Add Gen2 PHY setup to pcie-rcar
  PCI: rcar: Add runtime PM support to pcie-rcar
  PCI: designware: Make config accessor override checking symmetric
  PCI: ibmphp: Remove unneeded NULL test
  ARM: dts: ifc6410: enable PCIe DT node for this board
  ARM: dts: apq8064: add PCIe devicetree node
  PCI: hotplug: Use list_for_each_entry() to simplify code
  PCI: rcar: Remove unused pci_sys_data struct from pcie-rcar
  PCI: hisi: Add support for HiSilicon Hip06 PCIe host controllers
  PCI: Avoid iterating through memory outside the resource window
  PCI: acpiphp_ibm: Fix null dereferences on null ibm_slot
  ...
2016-01-21 11:52:16 -08:00
Linus Torvalds
859e762544 pwm: Changes for v4.5-rc1
This set of changes contains a new driver for OMAP (using the dual-mode
 timers) as well as an assortment of fixes all across the board.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJWoOb9AAoJEN0jrNd/PrOh6e8P/2fIW49qBk1bSyzWZ8nI9ifx
 rUz1hDpFaOwvOOp+JZ/PCF8l11hr99CpsfZm8tWp3J2lLsahDv362Bp235lOrZk2
 aQ4MaKUAvfzcqmCkeoXFj1SJLgUacerpdWBklTQJpyqoT6bYoyY89fXtaVELI4jF
 V65aSBKF2jh2BlEvmKiMa666xd2jAux59kLRhajJB4MUr2NY5aTmPa9oHZVGa1Qi
 QXEjLp5elGnnIpzN4f7ZuIkgCOceB35HVJoRgGq068YZO2PZDNXn4GSXval85hRZ
 eBHX0btkQ8itQ9IJz1gPtgOxnQuZO5k70bfNSMsG4CDJfNjwMAibFs7UfThnnPZK
 aS7N5SMXt6Yt5H6OSGDR66emeyciYy/+8/0aToscuu1NiTxAUkGN//kEv4IPABn8
 DMWm78PvaxVlZzXLKUkWYPgT7kjvnFQRzopQjmMtY15FUU28cpd+CZqlMlGJ4tCh
 EIA6g6v16BnYyfNI8ZcffoTsfJ7B7uKUnEMiD/gGJ5OrYl8GRT/dvpsX51/hkiQS
 o/vD2aokpEcnY/1mHSQAWdEuwcV9UVi3pplfhRLk1OkJylBc19JUbqbRY9Mx5p5S
 yhgGMZ5tf0jraevr1qy5h0pZXZkQlbykZT2ljDW+BhbpgHyHInF0Rkib+FzUWcb1
 LBsqAkqpBcGKhx4vUKvy
 =YH/K
 -----END PGP SIGNATURE-----

Merge tag 'pwm/for-4.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm

Pull pwm updates from Thierry Reding:
 "This set of changes contains a new driver for OMAP (using the
  dual-mode timers) as well as an assortment of fixes all across the
  board"

* tag 'pwm/for-4.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
  pwm: Mark all devices as "might sleep"
  pwm: omap-dmtimer: Potential NULL dereference on error
  pwm: add HAS_IOMEM dependency to PWM_FSL_FTM
  pwm: Add PWM driver for OMAP using dual-mode timers
  pwm: rcar: Improve accuracy of frequency division setting
  pwm: lpc32xx: return ERANGE, if requested period is not supported
  pwm: lpc32xx: fix and simplify duty cycle and period calculations
  pwm: lpc32xx: make device usable with common clock framework
  pwm: lpc32xx: correct number of PWM channels from 2 to 1
  dt: lpc32xx: pwm: update documentation of LPC32xx PWM device
  dt: lpc32xx: pwm: correct LPC32xx PWM device node example
  pwm: fsl-ftm: Fix clock enable/disable when using PM
  pwm: lpss: Rework the sequence of programming PWM_SW_UPDATE
  pwm: lpss: Select core part automatically
  pwm: lpss: Update PWM setting for Broxton
  pwm: bcm2835: Fix email address specification
  pwm: bcm2835: Prevent division by zero
  pwm: bcm2835: Calculate scaler in ->config()
  pwm: lpss: Remove ->free() callback
2016-01-21 11:45:02 -08:00
Linus Torvalds
96461fdb3a CRIS changes for 4.5
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iEYEABECAAYFAlagoW4ACgkQ31LbvUHyf1d1hwCfejdx5Ql4odS6y2GklMthWK7b
 wdUAnjL45Z6Ky1PTMxaSG+7VWIXrBYIJ
 =rawY
 -----END PGP SIGNATURE-----

Merge tag 'cris-for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/jesper/cris

Pull CRIS updates from Jesper Nilsson:
 "Just some fixups for section mismatches from Guenter"

* tag 'cris-for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/jesper/cris:
  cris: Fix section mismatches in architecture startup code
  cris: debugport: Fix section mismatches
2016-01-21 11:33:36 -08:00
Linus Torvalds
278e5acae1 Add KGDB support.
zImage fix.
 various cleanup
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJWoGJGAAoJEEdC8EELKDmcaQYP/32M9SkO6BQP/FeegmBjUnbW
 JLNA4g7eatmVc31LuC7TvnijJ3MHzcCGh9fe0VxnIY5x1Qr5aewapemMK1z7zoCD
 2nXi36jRWkxxGr7OnThr3h26N4TzBmb0a/0N0nhRNTS5GkpDIeakX1M7bsQ/9WDj
 Ex4uaH8U+NiwbsDeBXVCzb2aOUY0zF7Siw4iu7gJKNBZN2epuWaPvnMfxVKhM4L3
 05GIo7me1bYukAtmHUGJ0MoiSzIRq5zilOgriiNA3wMbsBmbHsiQA8MJ7d5JPEK+
 ZgvK3AuAj4q7CdUY4BvQmTj2ACUD6e8iyBvKs+2U1hrrsnIjJThtXr2kpXMxNrJw
 imQldzsc/nIuqriUH6z5y3Z7yxnZTQlGWW9G/7G90nSrQH56q0e+/JMwO3pkYZGC
 ZfvBUdgLLLnvLx6JUoJifLZBErqkbdBsOar8X8NHVyv23lx8Bfqk99BWGsXUfhYJ
 HopI9ggvsiasGFh/VyYr9HqP8IwJVMoHmf/iHtZO/eYslyWmmdYiRP9DY+Ldx9CF
 UnLfin0Nipuh5kV1j6uSfbghjnt0Bp7J6XiKSmdtI3DQ16PWi5w0lD88A+dBneql
 5zWlheQuZL+s1c71wpSBX4nYEJmLDX7PRiqiRe1BD/4Empu2ePmBeszmWu22UFEi
 akLnKG68+ZPi4EU8cUOs
 =7t5+
 -----END PGP SIGNATURE-----

Merge tag 'for-4.5' of git://git.osdn.jp/gitroot/uclinux-h8/linux

Pull h8300 updates from Yoshinori Sato:
 - Add KGDB support
 - zImage fix
 - various cleanup

* tag 'for-4.5' of git://git.osdn.jp/gitroot/uclinux-h8/linux:
  h8300: System call entry enable interrupt.
  h8300: show_stack cleanup
  h8300: Restraint of warning.
  h8300: Add KGDB support.
  irqchip: renesas-h8s: Replace ctrl_outw/ctrl_inw with writew/readw
  h8300: signal stack fix
  h8300: Add LZO compression
  h8300: zImage alignment fix
  clk: h8300: Remove "sh73a0-" part from compatible value
  h8300: zImage alignment fix
2016-01-21 11:27:34 -08:00
Ilya Dryomov
7e01726a68 libceph: remove outdated comment
MClientMount{,Ack} are long gone.  The receipt of bare monmap doesn't
actually indicate a mount success as we are yet to authenticate at that
point in time.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-01-21 19:36:09 +01:00
Ilya Dryomov
f6cdb2928d libceph: kill off ceph_x_ticket_handler::validity
With it gone, no need to preserve ceph_timespec in process_one_ticket()
either.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2016-01-21 19:36:09 +01:00
Ilya Dryomov
187d131dd9 libceph: invalidate AUTH in addition to a service ticket
If we fault due to authentication, we invalidate the service ticket we
have and request a new one - the idea being that if a service rejected
our authorizer, it must have expired, despite mon_client's attempts at
periodic renewal.  (The other possibility is that our ticket is too new
and the service hasn't gotten it yet, in which case invalidating isn't
necessary but doesn't hurt.)

Invalidating just the service ticket is not enough, though.  If we
assume a failure on mon_client's part to renew a service ticket, we
have to assume the same for the AUTH ticket.  If our AUTH ticket is
bad, we won't get any service tickets no matter how hard we try, so
invalidate AUTH ticket along with the service ticket.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2016-01-21 19:36:09 +01:00
Ilya Dryomov
6abe097db5 libceph: fix authorizer invalidation, take 2
Back in 2013, commit 4b8e8b5d78 ("libceph: fix authorizer
invalidation") tried to fix authorizer invalidation issues by clearing
validity field.  However, nothing ever consults this field, so it
doesn't force us to request any new secrets in any way and therefore we
never get out of the exponential backoff mode:

    [  129.973812] libceph: osd2 192.168.122.1:6810 connect authorization failure
    [  130.706785] libceph: osd2 192.168.122.1:6810 connect authorization failure
    [  131.710088] libceph: osd2 192.168.122.1:6810 connect authorization failure
    [  133.708321] libceph: osd2 192.168.122.1:6810 connect authorization failure
    [  137.706598] libceph: osd2 192.168.122.1:6810 connect authorization failure
    ...

AFAICT this was the case at the time 4b8e8b5d78 was merged, too.

Using timespec solely as a bool isn't nice, so introduce a new have_key
flag, specifically for this purpose.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2016-01-21 19:36:08 +01:00
Ilya Dryomov
f6330cc1f0 libceph: clear messenger auth_retry flag if we fault
Commit 20e55c4cc7 ("libceph: clear messenger auth_retry flag when we
authenticate") got us only half way there.  We clear the flag if the
second attempt succeeds, but it also needs to be cleared if that
attempt fails, to allow for the exponential backoff to kick in.
Otherwise, if ->should_authenticate() thinks our keys are valid, we
will busy loop, incrementing auth_retry to no avail:

    process_connect ffff880079a63830 got BADAUTHORIZER attempt 1
    process_connect ffff880079a63830 got BADAUTHORIZER attempt 2
    process_connect ffff880079a63830 got BADAUTHORIZER attempt 3
    process_connect ffff880079a63830 got BADAUTHORIZER attempt 4
    process_connect ffff880079a63830 got BADAUTHORIZER attempt 5
    ...

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2016-01-21 19:36:08 +01:00
Ilya Dryomov
67645d7619 libceph: fix ceph_msg_revoke()
There are a number of problems with revoking a "was sending" message:

(1) We never make any attempt to revoke data - only kvecs contibute to
con->out_skip.  However, once the header (envelope) is written to the
socket, our peer learns data_len and sets itself to expect at least
data_len bytes to follow front or front+middle.  If ceph_msg_revoke()
is called while the messenger is sending message's data portion,
anything we send after that call is counted by the OSD towards the now
revoked message's data portion.  The effects vary, the most common one
is the eventual hang - higher layers get stuck waiting for the reply to
the message that was sent out after ceph_msg_revoke() returned and
treated by the OSD as a bunch of data bytes.  This is what Matt ran
into.

(2) Flat out zeroing con->out_kvec_bytes worth of bytes to handle kvecs
is wrong.  If ceph_msg_revoke() is called before the tag is sent out or
while the messenger is sending the header, we will get a connection
reset, either due to a bad tag (0 is not a valid tag) or a bad header
CRC, which kind of defeats the purpose of revoke.  Currently the kernel
client refuses to work with header CRCs disabled, but that will likely
change in the future, making this even worse.

(3) con->out_skip is not reset on connection reset, leading to one or
more spurious connection resets if we happen to get a real one between
con->out_skip is set in ceph_msg_revoke() and before it's cleared in
write_partial_skip().

Fixing (1) and (3) is trivial.  The idea behind fixing (2) is to never
zero the tag or the header, i.e. send out tag+header regardless of when
ceph_msg_revoke() is called.  That way the header is always correct, no
unnecessary resets are induced and revoke stands ready for disabled
CRCs.  Since ceph_msg_revoke() rips out con->out_msg, introduce a new
"message out temp" and copy the header into it before sending.

Cc: stable@vger.kernel.org # 4.0+
Reported-by: Matt Conner <matt.conner@keepertech.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Tested-by: Matt Conner <matt.conner@keepertech.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2016-01-21 19:36:08 +01:00
Geliang Tang
10bcee149f libceph: use list_for_each_entry_safe
Use list_for_each_entry_safe() instead of list_for_each_safe() to
simplify the code.

Signed-off-by: Geliang Tang <geliangtang@163.com>
[idryomov@gmail.com: nuke call to list_splice_init() as well]
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-01-21 19:36:08 +01:00
Yan, Zheng
99c88e6900 ceph: use i_size_{read,write} to get/set i_size
Cap message from MDS can update i_size. In that case, we don't
hold i_mutex. So it's unsafe to directly access inode->i_size
while holding i_mutex.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-01-21 19:36:08 +01:00
Yan, Zheng
5be0389dac ceph: re-send AIO write request when getting -EOLDSNAP error
When receiving -EOLDSNAP from OSD, we need to re-send corresponding
write request. Due to locking issue, we can send new request inside
another OSD request's complete callback. So we use worker to re-send
request for AIO write.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-01-21 19:36:08 +01:00
Yan, Zheng
c8fe9b17d0 ceph: Asynchronous IO support
The basic idea of AIO support is simple, just call kiocb::ki_complete()
in OSD request's complete callback. But there are several special cases.

when IO span multiple objects, we need to wait until all OSD requests
are complete, then call kiocb::ki_complete(). Error handling in this case
is tricky too. For simplify, AIO both span multiple objects and extends
i_size are not allowed.

Another special case is check EOF for reading (other client can write to
the file and extend i_size concurrently). For simplify, the direct-IO/AIO
code path does do the check, fallback to normal syn read instead.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-01-21 19:36:07 +01:00
Minfei Huang
458c4703ae ceph: Avoid to propagate the invalid page point
The variant pagep will still get the invalid page point, although ceph
fails in function ceph_update_writeable_page.

To fix this issue, Assigne the page to pagep until there is no failure
in function ceph_update_writeable_page.

Signed-off-by: Minfei Huang <mnfhuang@gmail.com>
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-01-21 19:36:07 +01:00
Yan, Zheng
f9cac5ac08 ceph: fix double page_unlock() in page_mkwrite()
ceph_update_writeable_page() unlocks the page on errors, so
page_mkwrite() should not unlock the page again.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-01-21 19:36:07 +01:00
Markus Elfring
1761b22966 rbd: delete an unnecessary check before rbd_dev_destroy()
The rbd_dev_destroy() function tests whether its argument is NULL
and then returns immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-01-21 19:36:07 +01:00
Geliang Tang
17ddc49b9c libceph: use list_next_entry instead of list_entry_next
list_next_entry has been defined in list.h, so I replace list_entry_next
with it.

Signed-off-by: Geliang Tang <geliangtang@163.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-01-21 19:36:07 +01:00
Yaowei Bai
79a3ed2e98 ceph: ceph_frag_contains_value can be boolean
This patch makes ceph_frag_contains_value return bool to improve
readability due to this particular function only using either one or
zero as its return value.

No functional change.

Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-01-21 19:36:07 +01:00
Yaowei Bai
eade1fe75f ceph: remove unused functions in ceph_frag.h
These functions were introduced in commit 3d14c5d2b ("ceph: factor
out libceph from Ceph file system"). Howover, there's no user of
these functions since then, so remove them for simplicity.

Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-01-21 19:36:07 +01:00
David Sterba
14e46e0495 btrfs: synchronize incompat feature bits with sysfs files
The files under /sys/fs/UUID/features get out of sync with the actual
incompat bits set for the filesystem if they change after mount (eg. the
LZO compression).

Synchronize the feature bits with the sysfs files representing them
right after we set/clear them.

Signed-off-by: David Sterba <dsterba@suse.com>
2016-01-21 18:54:41 +01:00
Alexander Shishkin
45c815f06b perf: Synchronously free aux pages in case of allocation failure
We are currently using asynchronous deallocation in the error path in
AUX mmap code, which is unnecessary and also presents a problem for users
that wish to probe for the biggest possible buffer size they can get:
they'll get -EINVAL on all subsequent attemts to allocate a smaller
buffer before the asynchronous deallocation callback frees up the pages
from the previous unsuccessful attempt.

Currently, gdb does that for allocating AUX buffers for Intel PT traces.
More specifically, overwrite mode of AUX pmus that don't support hardware
sg (some implementations of Intel PT, for instance) is limited to only
one contiguous high order allocation for its buffer and there is no way
of knowing its size without trying.

This patch changes error path freeing to be synchronous as there won't
be any contenders for the AUX pages at that point.

Reported-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/1453216469-9509-1-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-21 18:54:27 +01:00
Stephane Eranian
0e1eb0a1f5 perf/x86: add Intel SkyLake uncore IMC PMU support
This patch enables the uncore_imc PMU for Intel
SkyLake Desktop processors (Core i7-6700, model 94).

It is possible to compute memory read/write bandwidth
using:

  $ perf stat -a -e uncore_imc/data_reads/,uncore_imc/data_writes/ ....

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: kan.liang@intel.com
Link: http://lkml.kernel.org/r/1452151546-8853-1-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-21 18:54:26 +01:00
Peter Zijlstra
63b6da39bb perf: Fix perf_event_exit_task() race
There is a race against perf_event_exit_task() vs
event_function_call(),find_get_context(),perf_install_in_context()
(iow, everyone).

Since there is no permanent marker on a context that its dead, it is
quite possible that we access (and even modify) a context after its
passed through perf_event_exit_task().

For instance, find_get_context() might find the context still
installed, but by the time we get to perf_install_in_context() it
might already have passed through perf_event_exit_task() and be
considered dead, we will however still add the event to it.

Solve this by marking a ctx dead by setting its ctx->task value to -1,
it must be !0 so we still know its a (former) task context.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-21 18:54:25 +01:00
Peter Zijlstra
c97f473643 perf: Add more assertions
Try to trigger warnings before races do damage.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-21 18:54:25 +01:00
Peter Zijlstra
fae3fde651 perf: Collapse and fix event_function_call() users
There is one common bug left in all the event_function_call() users,
between loading ctx->task and getting to the remote_function(),
ctx->task can already have been changed.

Therefore we need to double check and retry if ctx->task != current.

Insert another trampoline specific to event_function_call() that
checks for this and further validates state. This also allows getting
rid of the active/inactive functions.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-21 18:54:24 +01:00
Peter Zijlstra
32132a3d0d perf: Specialize perf_event_exit_task()
The perf_remove_from_context() usage in __perf_event_exit_task() is
different from the other usages in that this site has already
detached and scheduled out the task context.

This will stand in the way of stronger assertions checking the (task)
context scheduling invariants.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-21 18:54:24 +01:00
Peter Zijlstra
39a4364076 perf: Fix task context scheduling
There is a very nasty problem wrt disabling the perf task scheduling
hooks.

Currently we {set,clear} ctx->is_active on every
__perf_event_task_sched_{in,out}, _however_ this means that if we
disable these calls we'll have task contexts with ->is_active set that
are not active and 'active' task contexts without ->is_active set.

This can result in event_function_call() looping on the ctx->is_active
condition basically indefinitely.

Resolve this by changing things such that contexts without events do
not set ->is_active like we used to. From this invariant it trivially
follows that if there are no (task) events, every task ctx is inactive
and disabling the context switch hooks is harmless.

This leaves two places that need attention (and already had
accumulated weird and wonderful hacks to work around, without
recognising this actual problem).

Namely:

 - perf_install_in_context() will need to deal with installing events
   in an inactive context, meaning it cannot rely on ctx-is_active for
   its IPIs.

 - perf_remove_from_context() will have to mark a context as inactive
   when it removes the last event.

For specific detail, see the patch/comments.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-21 18:54:23 +01:00
Peter Zijlstra
63e30d3e52 perf: Make ctx->is_active and cpuctx->task_ctx consistent
For no apparent reason and to great confusion the rules for
ctx->is_active and cpuctx->task_ctx are different. This means that its
not always possible to find all active (task) contexts.

Fix this such that if ctx->is_active gets set, we also set (or verify)
cpuctx->task_ctx.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-21 18:54:23 +01:00
Peter Zijlstra
25432ae96a perf: Optimize perf_sched_events() usage
It doesn't make sense to take up-to _4_ references on
perf_sched_events() per event, avoid doing this.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-21 18:54:22 +01:00
Peter Zijlstra
aee7dbc45f perf: Simplify/fix perf_event_enable() event scheduling
Like perf_enable_on_exec(), perf_event_enable() event scheduling has problems
respecting the context hierarchy when trying to schedule events (for
example, it will try and add a pinned event without first removing
existing flexible events).

So simplify it by using the new ctx_resched() call which will DTRT.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-21 18:54:22 +01:00
Peter Zijlstra
8833d0e286 perf: Use task_ctx_sched_out()
We have a function that does exactly what we want here, use it. This
reduces the amount of cpuctx->task_ctx muckery.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-21 18:54:21 +01:00
Peter Zijlstra
3e349507d1 perf: Fix perf_enable_on_exec() event scheduling
There are two problems with the current perf_enable_on_exec() event
scheduling:

  - the newly enabled events will be immediately scheduled
    irrespective of their ctx event list order.

  - there's a hole in the ctx->lock between scheduling the events
    out and putting them back on.

Esp. the latter issue is a real problem because a hole in event
scheduling leaves the thing in an observable inconsistent state,
confusing things.

Fix both issues by first doing the enable iteration and at the end,
when there are newly enabled events, reschedule the ctx in one go.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-21 18:54:20 +01:00
Peter Zijlstra
5947f6576e perf: Remove stale comment
The comment here is horribly out of date, remove it.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-21 18:54:20 +01:00
Peter Zijlstra
70a0165752 perf: Fix cgroup scheduling in perf_enable_on_exec()
There is a comment that states that perf_event_context_sched_in() will
also switch in the cgroup events, I cannot find it does so. Therefore
all the resulting logic goes out the window too.

Clean that up.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-21 18:54:19 +01:00
Peter Zijlstra
7e41d17753 perf: Fix cgroup event scheduling
There appears to be a problem in __perf_event_task_sched_in() wrt
cgroup event scheduling.

The normal event scheduling order is:

	CPU pinned
	Task pinned
	CPU flexible
	Task flexible

And since perf_cgroup_sched*() only schedules the cpu context, we must
call this _before_ adding the task events.

Note: double check what happens on the ctx switch optimization where
the task ctx isn't scheduled.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-21 18:54:19 +01:00
Peter Zijlstra
c994d61367 perf: Add lockdep assertions
Make various bugs easier to see.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-21 18:54:18 +01:00
David Sterba
444e751698 btrfs: sysfs: introduce helper for syncing bits with sysfs files
The files under /sys/fs/UUID/features get out of sync with the actual
incompat bits set for the filesystem if they change after mount. We're
going to sync them and need a helper to do that.

Signed-off-by: David Sterba <dsterba@suse.com>
2016-01-21 18:50:40 +01:00
David Sterba
3b5bb73bd8 btrfs: sysfs: add free-space-tree bit attribute
The incompat bit representing the newly added free space tree feature is
missing. Right now it will be listed only among features supported by
the module, not per-fs.

Signed-off-by: David Sterba <dsterba@suse.com>
2016-01-21 18:36:46 +01:00
Leon Romanovsky
34356f64ac IB/mlx5: Unify CQ create flags check
The create_cq() can receive creation flags which were used
differently by two commits which added create_cq extended
command and cross-channel. The merged code caused to not
accept any flags at all.

This patch unifies the check into one function and one return
error code.

Fixes: 972ecb8213 ("IB/mlx5: Add create_cq extended command")
Fixes: 051f263098 ("IB/mlx5: Add driver cross-channel support")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-21 12:05:37 -05:00
majd@mellanox.com
ad5f8e964c IB/mlx5: Expose Raw Packet QP to user space consumers
Added Raw Packet QP modify functionality which will enable user
space consumers to use it.

Since Raw Packet QP is built of SQ and RQ sub-objects, therefore
Raw Packet QP state changes are implemented by changing the state
of the sub-objects.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-21 12:01:09 -05:00
majd@mellanox.com
427c1e7bcd {IB, net}/mlx5: Move the modify QP operation table to mlx5_ib
When modifying a QP, the desired operation was determined in
the mlx5_core using a transition table that takes the current
state, the final state, and returns the desired operation.

Since this logic will be used for Raw Packet QP, move the
operation table to the mlx5_ib.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-21 12:01:09 -05:00
majd@mellanox.com
75850d0bce IB/mlx5: Support setting Ethernet priority for Raw Packet QPs
When the user changes the Address Vector(AV) in the modify QP, he
provides an SL. This SL should be translated to Ethernet Priority
by taking the 3 LSB bits, and modify the QP's TIS according to this
Ethernet priority.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-21 12:01:09 -05:00
majd@mellanox.com
6d2f89df04 IB/mlx5: Add Raw Packet QP query functionality
Since Raw Packet QP is composed of RQ and SQ, the IB QP's
state is derived from the sub-objects. Therefore we need
to query each one of the sub-objects, and decide on the
IB QP's state.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-21 12:01:09 -05:00
majd@mellanox.com
0fb2ed66a1 IB/mlx5: Add create and destroy functionality for Raw Packet QP
This patch adds support for Raw Packet QP for the mlx5 device.

Raw Packet QP, unlike other QP types, has no matching mlx5_core_qp
object but rather it is built of RQ/SQ/TIR/TIS/TD mlx5_core object.

Since the SQ and RQ work-queue (WQ) buffers are not contiguous like
other QPs, we allocate separate buffers in the user-space and pass
the address of each one of them separately to the kernel.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-21 12:01:09 -05:00
majd@mellanox.com
19098df2da IB/mlx5: Refactor mlx5_ib_qp to accommodate other QP types
Extract specific IB QP fields to mlx5_ib_qp_trans structure.
The mlx5_core QP object resides in mlx5_ib_qp_base, which all QP types
inherit from. When we need to find mlx5_ib_qp using mlx5_core QP
(event handling and co), we use a pointer that resides in
mlx5_ib_qp_base.

In addition, we delete all redundant fields that weren't used anywhere
in the code:
-doorbell_qpn
-sq_max_wqes_per_wr
-sq_spare_wqes

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-21 12:01:09 -05:00
majd@mellanox.com
146d2f1af3 IB/mlx5: Allocate a Transport Domain for each ucontext
Transport Domain groups several TIS and TIR object. By grouping
these object, it defines wheather local loopback packets that
are sent from the TIS objects in the group are received by the
TIR objects in the same group.

Allocate a Transport Domain(TD) for each user context to be used
in the future by Raw Packet QP for Self-Loopback Control.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-21 12:01:09 -05:00
majd@mellanox.com
a14c2d4bee net/mlx5_core: Warn on unsupported events of QP/RQ/SQ
When an event arrives on QP/RQ/SQ, check whether it's supported,
and print a warning message otherwise.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-21 12:01:09 -05:00