Commit graph

5,305 commits

Author SHA1 Message Date
Jiri Slaby
5a8de47b3c netfilter: bridge: define INT_MIN & INT_MAX in userspace
With 4.19, programs like ebtables fail to build when they include
"linux/netfilter_bridge.h". It is caused by commit 94276fa8a2 which
added a use of INT_MIN and INT_MAX to the header:
: In file included from /usr/include/linux/netfilter_bridge/ebtables.h:18,
:                  from include/ebtables_u.h:28,
:                  from communication.c:23:
: /usr/include/linux/netfilter_bridge.h:30:20: error: 'INT_MIN' undeclared here (not in a function)
:   NF_BR_PRI_FIRST = INT_MIN,
:                     ^~~~~~~

Define these constants by including "limits.h" when !__KERNEL__ (the
same way as for other netfilter_* headers).

Fixes: 94276fa8a2 ("netfilter: bridge: Expose nf_tables bridge hook priorities through uapi")
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Máté Eckl <ecklm94@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-10-25 10:25:22 +02:00
Wei Wang
2e991629bc virtio-balloon: VIRTIO_BALLOON_F_PAGE_POISON
The VIRTIO_BALLOON_F_PAGE_POISON feature bit is used to indicate if the
guest is using page poisoning. Guest writes to the poison_val config
field to tell host about the page poisoning value that is in use.

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-10-24 20:57:55 -04:00
Wei Wang
86a559787e virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
Negotiation of the VIRTIO_BALLOON_F_FREE_PAGE_HINT feature indicates the
support of reporting hints of guest free pages to host via virtio-balloon.
Currenlty, only free page blocks of MAX_ORDER - 1 are reported. They are
obtained one by one from the mm free list via the regular allocation
function.

Host requests the guest to report free page hints by sending a new cmd id
to the guest via the free_page_report_cmd_id configuration register. When
the guest starts to report, it first sends a start cmd to host via the
free page vq, which acks to host the cmd id received. When the guest
finishes reporting free pages, a stop cmd is sent to host via the vq.
Host may also send a stop cmd id to the guest to stop the reporting.

VIRTIO_BALLOON_CMD_ID_STOP: Host sends this cmd to stop the guest
reporting.
VIRTIO_BALLOON_CMD_ID_DONE: Host sends this cmd to tell the guest that
the reported pages are ready to be freed.

Why does the guest free the reported pages when host tells it is ready to
free?
This is because freeing pages appears to be expensive for live migration.
free_pages() dirties memory very quickly and makes the live migraion not
converge in some cases. So it is good to delay the free_page operation
when the migration is done, and host sends a command to guest about that.

Why do we need the new VIRTIO_BALLOON_CMD_ID_DONE, instead of reusing
VIRTIO_BALLOON_CMD_ID_STOP?
This is because live migration is usually done in several rounds. At the
end of each round, host needs to send a VIRTIO_BALLOON_CMD_ID_STOP cmd to
the guest to stop (or say pause) the reporting. The guest resumes the
reporting when it receives a new command id at the beginning of the next
round. So we need a new cmd id to distinguish between "stop reporting" and
"ready to free the reported pages".

TODO:
- Add a batch page allocation API to amortize the allocation overhead.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Liang Li <liang.z.li@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-10-24 20:57:55 -04:00
Linus Torvalds
01aa9d518e This is a fairly typical cycle for documentation. There's some welcome
readability improvements for the formatted output, some LICENSES updates
 including the addition of the ISC license, the removal of the unloved and
 unmaintained 00-INDEX files, the deprecated APIs document from Kees, more
 MM docs from Mike Rapoport, and the usual pile of typo fixes and
 corrections.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJbztcuAAoJEI3ONVYwIuV6nTAP/0Be+5dNPGJmSnb/RbkwBuBV
 zAFVUj2sx4lZlRmWRZ0r7AOef2eSw3IvwBix/vnmllYCVahjp+BdRbhXQAijjyeb
 FWWjOH50/J+BaxSthAINiLRLvuoe0D/M08OpmXQfRl5q0S8RufeV3BDtEABx9j2n
 IICPGTl8LpPUgSMA4cw8zPhHdauhZpbmL2mGE9LXZ27SJT4S8lcHMwyPU1n5S+Jd
 ChEz5g9dYr3GNxFp712pkI5GcVL3tP2nfoVwK7EuGf1tvSnEnn2kzac8QgMqorIh
 xB2+Sh4XIUCbHYpGHpxIniD+WI4voNr/E7STQioJK5o2G4HTuxLjktvTezNF8paa
 hgNHWjPQBq0OOCdM/rsffONFF2J/v/r7E3B+kaRg8pE0uZWTFaDMs6MVaL2fL4Ls
 DrFhi90NJI/Fs7uB4sriiviShAhwboiSIRXJi4VlY/5oFJKHFgqes+R7miU+zTX3
 2qv0k4mWZXWDV9w1piPxSCZSdRzaoYSoxEihX+tnYpCyEcYd9ovW/X1Uhl/wCWPl
 Ft+Op6rkHXRXVfZzTLuF6PspZ4Udpw2PUcnA5zj5FRDDBsjSMFR31c19IFbCeiNY
 kbTIcqejJG1WbVrAK4LCcFyVSGxbrr281eth4rE06cYmmsz3kJy1DB6Lhyg/2vI0
 I8K9ZJ99n1RhPJIcburB
 =C0wt
 -----END PGP SIGNATURE-----

Merge tag 'docs-4.20' of git://git.lwn.net/linux

Pull documentation updates from Jonathan Corbet:
 "This is a fairly typical cycle for documentation. There's some welcome
  readability improvements for the formatted output, some LICENSES
  updates including the addition of the ISC license, the removal of the
  unloved and unmaintained 00-INDEX files, the deprecated APIs document
  from Kees, more MM docs from Mike Rapoport, and the usual pile of typo
  fixes and corrections"

* tag 'docs-4.20' of git://git.lwn.net/linux: (41 commits)
  docs: Fix typos in histogram.rst
  docs: Introduce deprecated APIs list
  kernel-doc: fix declaration type determination
  doc: fix a typo in adding-syscalls.rst
  docs/admin-guide: memory-hotplug: remove table of contents
  doc: printk-formats: Remove bogus kobject references for device nodes
  Documentation: preempt-locking: Use better example
  dm flakey: Document "error_writes" feature
  docs/completion.txt: Fix a couple of punctuation nits
  LICENSES: Add ISC license text
  LICENSES: Add note to CDDL-1.0 license that it should not be used
  docs/core-api: memory-hotplug: add some details about locking internals
  docs/core-api: rename memory-hotplug-notifier to memory-hotplug
  docs: improve readability for people with poorer eyesight
  yama: clarify ptrace_scope=2 in Yama documentation
  docs/vm: split memory hotplug notifier description to Documentation/core-api
  docs: move memory hotplug description into admin-guide/mm
  doc: Fix acronym "FEKEK" in ecryptfs
  docs: fix some broken documentation references
  iommu: Fix passthrough option documentation
  ...
2018-10-24 18:01:11 +01:00
Linus Torvalds
fe0142df64 xfs: Changes for 4.20
- only support filesystems with unwritten extents
 - add definition for statfs XFS magic number
 - remove unused parameters around reflink code
 - more debug for dangling delalloc extents
 - cancel COW extents on extent swap targets
 - fix quota stats output and clean up the code
 - refactor some of the attribute code in preparation for parent pointers
 - fix several buffer handling bugs
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJbz7CiAAoJEK3oKUf0dfodz3YP/jRBYAMall0h5sJZt0FHdIb1
 y8bvDz3YaN3T9cFsSKC+2eRiDnxUwvX23Fm3rd3XhMZb0JkLwxX2Y4Tm+BMLeOtx
 JcoFvO6xVNFugYgLHvxNh0jGfLAPRYzd1KOkoGBiDWl3GYijrXCU9wdXukHmYq9k
 /JIfXjVabgmlLhOo1SnaXtqBOo140FjYAlL4h1sGBYxzeoRk/ltsXBnEzjwOJajY
 EYKDjBMoGtxSh38EoXPHHP0Pf89miTE+B3Y7wkR+sURG/cptt6WN1MFCOmEOAPLH
 RYpaZrFYLLTS4xvaDo0UMLuMTv72tabqEtjQ1Rj4bSPZaFb7QZVscpNkoFjuK2V7
 iVJUE3bqlCAGoHdnK22a6Mq1c7inOEG/GGDv+V+xfdOZFlWtaUNyjSmDtpmwxR5D
 0W8eSYaEYfvhQJ7I9066SWof8EtUdc8cc4P+hshego1DWzDF1TSpBEwwy2V+WUsC
 l4NJyLwjUPjfuD/MSUly9N7bIEzgLM5sh+aSBGNY87ODnWbRTzoBbNjl2NOCWv58
 2zT57WLlT7mBzQPE6yNpUGwrpubNEC5z+LzcQRfBedzx/Wh8XBnV/8Z6ETft3sO+
 v63i5e11ejZBUSR1TGf8dIzQonBroJo7Zwk9ghxNs+KIXmG/8vQHN9EuC9X4IUm7
 RD5X0oxgxuFBzrD3G9kD
 =twNy
 -----END PGP SIGNATURE-----

Merge tag 'xfs-4.20-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pul xfs updates from Dave Chinner:
 "There's not a huge amount of change in this cycle - Darrick has been
  out of action for a couple of months (hence me sending the last few
  pull requests), so we decided a quiet cycle mainly focussed on bug
  fixes was a good idea. Darrick will take the helm again at the end of
  this merge window.

  FYI, I may be sending another update later in the cycle - there's a
  pending rework of the clone/dedupe_file_range code that fixes numerous
  bugs that is spread amongst the VFS, XFS and ocfs2 code. It has been
  reviewed and tested, Al and I just need to work out the details of the
  merge, so it may come from him rather than me.

  Summary:

   - only support filesystems with unwritten extents

   - add definition for statfs XFS magic number

   - remove unused parameters around reflink code

   - more debug for dangling delalloc extents

   - cancel COW extents on extent swap targets

   - fix quota stats output and clean up the code

   - refactor some of the attribute code in preparation for parent
     pointers

   - fix several buffer handling bugs"

* tag 'xfs-4.20-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (21 commits)
  xfs: cancel COW blocks before swapext
  xfs: clear ail delwri queued bufs on unmount of shutdown fs
  xfs: use offsetof() in place of offset macros for __xfsstats
  xfs: Fix xqmstats offsets in /proc/fs/xfs/xqmstat
  xfs: fix use-after-free race in xfs_buf_rele
  xfs: Add attibute remove and helper functions
  xfs: Add attibute set and helper functions
  xfs: Add helper function xfs_attr_try_sf_addname
  xfs: Move fs/xfs/xfs_attr.h to fs/xfs/libxfs/xfs_attr.h
  xfs: issue log message on user force shutdown
  xfs: fix buffer state management in xrep_findroot_block
  xfs: always assign buffer verifiers when one is provided
  xfs: xrep_findroot_block should reject root blocks with siblings
  xfs: add a define for statfs magic to uapi
  xfs: print dangling delalloc extents
  xfs: fix fork selection in xfs_find_trim_cow_extent
  xfs: remove the unused trimmed argument from xfs_reflink_trim_around_shared
  xfs: remove the unused shared argument to xfs_reflink_reserve_cow
  xfs: handle zeroing in xfs_file_iomap_begin_delay
  xfs: remove suport for filesystems without unwritten extent flag
  ...
2018-10-24 17:36:12 +01:00
Linus Torvalds
638820d8da Merge branch 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "In this patchset, there are a couple of minor updates, as well as some
  reworking of the LSM initialization code from Kees Cook (these prepare
  the way for ordered stackable LSMs, but are a valuable cleanup on
  their own)"

* 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  LSM: Don't ignore initialization failures
  LSM: Provide init debugging infrastructure
  LSM: Record LSM name in struct lsm_info
  LSM: Convert security_initcall() into DEFINE_LSM()
  vmlinux.lds.h: Move LSM_TABLE into INIT_DATA
  LSM: Convert from initcall to struct lsm_info
  LSM: Remove initcall tracing
  LSM: Rename .security_initcall section to .lsm_info
  vmlinux.lds.h: Avoid copy/paste of security_init section
  LSM: Correctly announce start of LSM initialization
  security: fix LSM description location
  keys: Fix the use of the C++ keyword "private" in uapi/linux/keyctl.h
  seccomp: remove unnecessary unlikely()
  security: tomoyo: Fix obsolete function
  security/capabilities: remove check for -EINVAL
2018-10-24 11:49:35 +01:00
Linus Torvalds
50b825d7e8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Add VF IPSEC offload support in ixgbe, from Shannon Nelson.

 2) Add zero-copy AF_XDP support to i40e, from Björn Töpel.

 3) All in-tree drivers are converted to {g,s}et_link_ksettings() so we
    can get rid of the {g,s}et_settings ethtool callbacks, from Michal
    Kubecek.

 4) Add software timestamping to veth driver, from Michael Walle.

 5) More work to make packet classifiers and actions lockless, from Vlad
    Buslov.

 6) Support sticky FDB entries in bridge, from Nikolay Aleksandrov.

 7) Add ipv6 version of IP_MULTICAST_ALL sockopt, from Andre Naujoks.

 8) Support batching of XDP buffers in vhost_net, from Jason Wang.

 9) Add flow dissector BPF hook, from Petar Penkov.

10) i40e vf --> generic iavf conversion, from Jesse Brandeburg.

11) Add NLA_REJECT netlink attribute policy type, to signal when users
    provide attributes in situations which don't make sense. From
    Johannes Berg.

12) Switch TCP and fair-queue scheduler over to earliest departure time
    model. From Eric Dumazet.

13) Improve guest receive performance by doing rx busy polling in tx
    path of vhost networking driver, from Tonghao Zhang.

14) Add per-cgroup local storage to bpf

15) Add reference tracking to BPF, from Joe Stringer. The verifier can
    now make sure that references taken to objects are properly released
    by the program.

16) Support in-place encryption in TLS, from Vakul Garg.

17) Add new taprio packet scheduler, from Vinicius Costa Gomes.

18) Lots of selftests additions, too numerous to mention one by one here
    but all of which are very much appreciated.

19) Support offloading of eBPF programs containing BPF to BPF calls in
    nfp driver, frm Quentin Monnet.

20) Move dpaa2_ptp driver out of staging, from Yangbo Lu.

21) Lots of u32 classifier cleanups and simplifications, from Al Viro.

22) Add new strict versions of netlink message parsers, and enable them
    for some situations. From David Ahern.

23) Evict neighbour entries on carrier down, also from David Ahern.

24) Support BPF sk_msg verdict programs with kTLS, from Daniel Borkmann
    and John Fastabend.

25) Add support for filtering route dumps, from David Ahern.

26) New igc Intel driver for 2.5G parts, from Sasha Neftin et al.

27) Allow vxlan enslavement to bridges in mlxsw driver, from Ido
    Schimmel.

28) Add queue and stack map types to eBPF, from Mauricio Vasquez B.

29) Add back byte-queue-limit support to r8169, with all the bug fixes
    in other areas of the driver it works now! From Florian Westphal and
    Heiner Kallweit.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2147 commits)
  tcp: add tcp_reset_xmit_timer() helper
  qed: Fix static checker warning
  Revert "be2net: remove desc field from be_eq_obj"
  Revert "net: simplify sock_poll_wait"
  net: socionext: Reset tx queue in ndo_stop
  net: socionext: Add dummy PHY register read in phy_write()
  net: socionext: Stop PHY before resetting netsec
  net: stmmac: Set OWN bit for jumbo frames
  arm64: dts: stratix10: Support Ethernet Jumbo frame
  tls: Add maintainers
  net: ethernet: ti: cpsw: unsync mcast entries while switch promisc mode
  octeontx2-af: Support for NIXLF's UCAST/PROMISC/ALLMULTI modes
  octeontx2-af: Support for setting MAC address
  octeontx2-af: Support for changing RSS algorithm
  octeontx2-af: NIX Rx flowkey configuration for RSS
  octeontx2-af: Install ucast and bcast pkt forwarding rules
  octeontx2-af: Add LMAC channel info to NIXLF_ALLOC response
  octeontx2-af: NPC MCAM and LDATA extract minimal configuration
  octeontx2-af: Enable packet length and csum validation
  octeontx2-af: Support for VTAG strip and capture
  ...
2018-10-24 06:47:44 +01:00
Jiri Kosina
276e722761 Merge branch 'for-4.20/logitech-highres' into for-linus
High-resolution support for hid-logitech
2018-10-23 13:35:22 +02:00
Linus Torvalds
114b5f8f7e This is the bulk of GPIO changes for the v4.20 series:
Core changes:
 
 - A patch series from Hans Verkuil to make it possible to
   enable/disable IRQs on a GPIO line at runtime and drive GPIO
   lines as output without having to put/get them from scratch.
   The irqchip callbacks have been improved so that they can
   use only the fastpatch callbacks to enable/disable irqs
   like any normal irqchip, especially the gpiod_lock_as_irq()
   has been improved to be callable in fastpath context.
   A bunch of rework had to be done to achieve this but it is
   a big win since I never liked to restrict this to slowpath.
   The only call requireing slowpath was try_module_get() and
   this is kept at the .request_resources() slowpath callback.
   In the GPIO CEC driver this is a big win sine a single
   line is used for both outgoing and incoming traffic, and
   this needs to use IRQs for incoming traffic while actively
   driving the line for outgoing traffic.
 
 - Janusz Krzysztofik improved the GPIO array API to pass a
   "cookie" (struct gpio_array) and a bitmap for setting or
   getting multiple GPIO lines at once. This improvement
   orginated in a specific need to speed up an OMAP1 driver and
   has led to a much better API and real performance gains
   when the state of the array can be used to bypass a lot
   of checks and code when we want things to go really fast.
   The previous code would minimize the number of calls
   down to the driver callbacks assuming the CPU speed was
   orders of magnitude faster than the I/O latency, but this
   assumption was wrong on several platforms: what we needed
   to do was to profile and improve the speed on the hot
   path of the array functions and this change is now
   completed.
 
 - Clean out the painful and hard to grasp BNF experiments
   from the device tree bindings. Future approaches are looking
   into using JSON schema for this purpose. (Rob Herring
   is floating a patch series.)
 
 New drivers:
 
 - The RCAR driver now supports r8a774a1 (RZ/G2M).
 
 - Synopsys GPIO via CREGs driver.
 
 Major improvements:
 
 - Modernization of the EP93xx driver to use irqdomain and
   other contemporary concepts.
 
 - The ingenic driver has been merged into the Ingenic pin
   control driver and removed from the GPIO subsystem.
 
 - Debounce support in the ftgpio010 driver.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJbzdyOAAoJEEEQszewGV1zfYcP/0HBEAOPhHD/i5OQxfKs1msh
 mFT/t/IbTmRpCgbEv4CDx4Kc/InE0sUnQr1TL/1WvU6uObM6Ncxq5Z90MvyrgzYu
 BqQHq2k2tORvkVSNRxcfD/BAAoo1EerXts1kDhutvdKfepfS6DxpENwzvsFgkVlq
 2jj1cdZztjv8A+9cspHDpQP+jDvl1VSc10nR5fRu1TttSpUwzRJaB30NBNXJmMJc
 5KUr67lEbsQRPsBvFErU11bydPqhfT+pXmODcfIwS0EtATQ8WC5mkSb/Ooei0fvT
 oZ7uR3Os8tMf7isOKssEyFabKwhnfOEt6TBt9em0TfUtInOo0Dc7r8TfBcn57fyZ
 xg2R9DQEVRfac8bjhF/BI5KHuN9IMGDDvj6XApumQVliZbISRjMnh3jte6RpcV0A
 Ejqz8FeDY13qvEdOnW1EPpwmXdDVWiEAq0ebGLStKNls+/4gB2HmyxGUOzJf+og5
 hujsxcJzGQqjCe0moeY/1d7vsy0ZjbHoS+p5fy79U212y2O7onEzFU92AX89bxKC
 rx2eCNmiZxCUy1nqu8edO62VnH6QdnqG3o+a4DJfCSHPvFM/E/NX9zHemZubQQ4I
 rYXNy4bL4tEG9cqWMfBxWrpiDZw7H6l8kXwdZG8IMyRU9BcKu96amgZ+jBXwzoaB
 JZelAAUWB9APghJYFr7o
 =YosT
 -----END PGP SIGNATURE-----

Merge tag 'gpio-v4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO updates from Linus Walleij:
 "This is the bulk of GPIO changes for the v4.20 series:

  Core changes:

   - A patch series from Hans Verkuil to make it possible to
     enable/disable IRQs on a GPIO line at runtime and drive GPIO lines
     as output without having to put/get them from scratch.

     The irqchip callbacks have been improved so that they can use only
     the fastpatch callbacks to enable/disable irqs like any normal
     irqchip, especially the gpiod_lock_as_irq() has been improved to be
     callable in fastpath context.

     A bunch of rework had to be done to achieve this but it is a big
     win since I never liked to restrict this to slowpath. The only call
     requireing slowpath was try_module_get() and this is kept at the
     .request_resources() slowpath callback. In the GPIO CEC driver this
     is a big win sine a single line is used for both outgoing and
     incoming traffic, and this needs to use IRQs for incoming traffic
     while actively driving the line for outgoing traffic.

   - Janusz Krzysztofik improved the GPIO array API to pass a "cookie"
     (struct gpio_array) and a bitmap for setting or getting multiple
     GPIO lines at once.

     This improvement orginated in a specific need to speed up an OMAP1
     driver and has led to a much better API and real performance gains
     when the state of the array can be used to bypass a lot of checks
     and code when we want things to go really fast.

     The previous code would minimize the number of calls down to the
     driver callbacks assuming the CPU speed was orders of magnitude
     faster than the I/O latency, but this assumption was wrong on
     several platforms: what we needed to do was to profile and improve
     the speed on the hot path of the array functions and this change is
     now completed.

   - Clean out the painful and hard to grasp BNF experiments from the
     device tree bindings. Future approaches are looking into using JSON
     schema for this purpose. (Rob Herring is floating a patch series.)

  New drivers:

   - The RCAR driver now supports r8a774a1 (RZ/G2M).

   - Synopsys GPIO via CREGs driver.

  Major improvements:

   - Modernization of the EP93xx driver to use irqdomain and other
     contemporary concepts.

   - The ingenic driver has been merged into the Ingenic pin control
     driver and removed from the GPIO subsystem.

   - Debounce support in the ftgpio010 driver"

* tag 'gpio-v4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (116 commits)
  gpio: Clarify kerneldoc on gpiochip_set_chained_irqchip()
  gpio: Remove unused 'irqchip' argument to gpiochip_set_cascaded_irqchip()
  gpio: Drop parent irq assignment during cascade setup
  mmc: pwrseq_simple: Fix incorrect handling of GPIO bitmap
  gpio: fix SNPS_CREG kconfig dependency warning
  gpiolib: Initialize gdev field before is used
  gpio: fix kernel-doc after devres.c file rename
  gpio: fix doc string for devm_gpiochip_add_data() to not talk about irq_chip
  gpio: syscon: Fix possible NULL ptr usage
  gpiolib: Show correct direction from the beginning
  pinctrl: msm: Use init_valid_mask exported function
  gpiolib: Add init_valid_mask exported function
  GPIO: add single-register GPIO via CREG driver
  dt-bindings: Document the Synopsys GPIO via CREG bindings
  gpio: mockup: use device properties instead of platform_data
  gpio: Slightly more helpful debugfs
  gpio: omap: Remove set but not used variable 'dev'
  gpio: omap: drop omap_gpio_list
  Accept partial 'gpio-line-names' property.
  gpio: omap: get rid of the conditional PM runtime calls
  ...
2018-10-23 08:45:05 +01:00
David S. Miller
a19c59cc10 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2018-10-21

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Implement two new kind of BPF maps, that is, queue and stack
   map along with new peek, push and pop operations, from Mauricio.

2) Add support for MSG_PEEK flag when redirecting into an ingress
   psock sk_msg queue, and add a new helper bpf_msg_push_data() for
   insert data into the message, from John.

3) Allow for BPF programs of type BPF_PROG_TYPE_CGROUP_SKB to use
   direct packet access for __skb_buff, from Song.

4) Use more lightweight barriers for walking perf ring buffer for
   libbpf and perf tool as well. Also, various fixes and improvements
   from verifier side, from Daniel.

5) Add per-symbol visibility for DSO in libbpf and hide by default
   global symbols such as netlink related functions, from Andrey.

6) Two improvements to nfp's BPF offload to check vNIC capabilities
   in case prog is shared with multiple vNICs and to protect against
   mis-initializing atomic counters, from Jakub.

7) Fix for bpftool to use 4 context mode for the nfp disassembler,
   also from Jakub.

8) Fix a return value comparison in test_libbpf.sh and add several
   bpftool improvements in bash completion, documentation of bpf fs
   restrictions and batch mode summary print, from Quentin.

9) Fix a file resource leak in BPF selftest's load_kallsyms()
   helper, from Peng.

10) Fix an unused variable warning in map_lookup_and_delete_elem(),
    from Alexei.

11) Fix bpf_skb_adjust_room() signature in BPF UAPI helper doc,
    from Nicolas.

12) Add missing executables to .gitignore in BPF selftests, from Anders.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-21 21:11:46 -07:00
John Fastabend
6fff607e2f bpf: sk_msg program helper bpf_msg_push_data
This allows user to push data into a msg using sk_msg program types.
The format is as follows,

	bpf_msg_push_data(msg, offset, len, flags)

this will insert 'len' bytes at offset 'offset'. For example to
prepend 10 bytes at the front of the message the user can,

	bpf_msg_push_data(msg, 0, 10, 0);

This will invalidate data bounds so BPF user will have to then recheck
data bounds after calling this. After this the msg size will have been
updated and the user is free to write into the added bytes. We allow
any offset/len as long as it is within the (data, data_end) range.
However, a copy will be required if the ring is full and its possible
for the helper to fail with ENOMEM or EINVAL errors which need to be
handled by the BPF program.

This can be used similar to XDP metadata to pass data between sk_msg
layer and lower layers.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-20 21:37:11 +02:00
David S. Miller
a4efbaf622 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for your net-next tree:

1) Use lockdep_is_held() in ipset_dereference_protected(), from Lance Roy.

2) Remove unused variable in cttimeout, from YueHaibing.

3) Add ttl option for nft_osf, from Fernando Fernandez Mancera.

4) Use xfrm family to deal with IPv6-in-IPv4 packets from nft_xfrm,
   from Florian Westphal.

5) Simplify xt_osf_match_packet().

6) Missing ct helper alias definition in snmp_trap helper, from Taehee Yoo.

7) Remove unnecessary parameter in nf_flow_table_cleanup(), from Taehee Yoo.

8) Remove unused variable definitions in nft_{dup,fwd}, from Weongyo Jeong.

9) Remove empty net/netfilter/nfnetlink_log.h file, from Taehee Yoo.

10) Revert xt_quota updates remain option due to problems in the listing
    path for 32-bit arches, from Maze.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-20 12:32:44 -07:00
Mauricio Vasquez B
bd513cd08f bpf: add MAP_LOOKUP_AND_DELETE_ELEM syscall
The previous patch implemented a bpf queue/stack maps that
provided the peek/pop/push functions.  There is not a direct
relationship between those functions and the current maps
syscalls, hence a new MAP_LOOKUP_AND_DELETE_ELEM syscall is added,
this is mapped to the pop operation in the queue/stack maps
and it is still to implement in other kind of maps.

Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-10-19 13:24:31 -07:00
Mauricio Vasquez B
f1a2e44a3a bpf: add queue and stack maps
Queue/stack maps implement a FIFO/LIFO data storage for ebpf programs.
These maps support peek, pop and push operations that are exposed to eBPF
programs through the new bpf_map[peek/pop/push] helpers.  Those operations
are exposed to userspace applications through the already existing
syscalls in the following way:

BPF_MAP_LOOKUP_ELEM            -> peek
BPF_MAP_LOOKUP_AND_DELETE_ELEM -> pop
BPF_MAP_UPDATE_ELEM            -> push

Queue/stack maps are implemented using a buffer, tail and head indexes,
hence BPF_F_NO_PREALLOC is not supported.

As opposite to other maps, queue and stack do not use RCU for protecting
maps values, the bpf_map[peek/pop] have a ARG_PTR_TO_UNINIT_MAP_VALUE
argument that is a pointer to a memory zone where to save the value of a
map.  Basically the same as ARG_PTR_TO_UNINIT_MEM, but the size has not
be passed as an extra argument.

Our main motivation for implementing queue/stack maps was to keep track
of a pool of elements, like network ports in a SNAT, however we forsee
other use cases, like for exampling saving last N kernel events in a map
and then analysing from userspace.

Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-10-19 13:24:31 -07:00
David S. Miller
2e2d6f0342 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
net/sched/cls_api.c has overlapping changes to a call to
nlmsg_parse(), one (from 'net') added rtm_tca_policy instead of NULL
to the 5th argument, and another (from 'net-next') added cb->extack
instead of NULL to the 6th argument.

net/ipv4/ipmr_base.c is a case of a bug fix in 'net' being done to
code which moved (to mr_table_dump)) in 'net-next'.  Thanks to David
Ahern for the heads up.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-19 11:03:06 -07:00
Paolo Bonzini
e42b4a507e KVM/arm updates for 4.20
- Improved guest IPA space support (32 to 52 bits)
 - RAS event delivery for 32bit
 - PMU fixes
 - Guest entry hardening
 - Various cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAlvJ0HIVHG1hcmMuenlu
 Z2llckBhcm0uY29tAAoJECPQ0LrRPXpDnWsP/02W6iIZUlg0SfsNq3bownJv+3VH
 BwEWTfRhWqqzSnsPwUEcOakKI8OIDJ07wIr6XoqPqq2PESS4BQv90qUTxytJXIt4
 gdTxZbNdCSzOc8Zf5URi1WtydekxsEFKgZy9iYWuILJzGW8iFbDZasgG6l8TWupN
 SsoyoGYBVwqR4xRf2f+PLf2n4U0McM8gFuKBFpnp1vCg6gZMBOvvKxQSRk9lUXEL
 C5LERL1CsGVn1Q2GxEB4yAxqrlAMMjy/S2dAf2KpCvMvviK3t05C4vY/+/mT21YE
 wCStX7W5Jfhy3hEsyHCkeulODdomIyro32/hw1qLhMXv4+wRvoiNrMVEoxUPi+by
 L89C6slwxqZOgcF2epSQgf7LBiLw+LnCGtACq2xY7p8yGuy0XW7mK9DlY5RvBHka
 aMmZ6kK/GIZFqRHDHa+ND2cAqS+Xyg2t/j2rvUPL0/xNelI1hpUUyGECTcqAXLr7
 N28+8aoHWcYb03r8YwfgWkEcwT9leAS45NBmHgnkOL4srcyW7anSW4NhZb/+U0mM
 8cLF+2BxfUo733Q5EyM2Q3JdbgaDaeanf6zzy7xAsPEywK4P5/kdqjc0N9se+LUx
 WhU3BRDU4KwV6S7bBS9ZuFK3heuwfuKWaYwwDaxrTlem++8FhoLBNV2vN8VjemD/
 AY5RvHrEhFYndijj
 =vjLz
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-for-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm updates for 4.20

- Improved guest IPA space support (32 to 52 bits)
- RAS event delivery for 32bit
- PMU fixes
- Guest entry hardening
- Various cleanups
2018-10-19 15:24:24 +02:00
Pablo Neira Ayuso
af510ebd89 Revert "netfilter: xt_quota: fix the behavior of xt_quota module"
This reverts commit e9837e55b0.

When talking to Maze and Chenbo, we agreed to keep this back by now
due to problems in the ruleset listing path with 32-bit arches.

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-10-19 14:00:34 +02:00
Adam Borowski
dddde68b8f xfs: add a define for statfs magic to uapi
Needed by userspace programs that call fstatfs().

It'd be natural to publish XFS_SB_MAGIC in uapi, but while these two
have identical values, they have different semantic meaning: one is
an enum cookie meant for statfs, the other a signature of the
on-disk format.

Signed-off-by: Adam Borowski <kilobyte@angband.pl>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-18 17:20:19 +11:00
Nicolas Dichtel
b55cbc8d9b bpf: fix doc of bpf_skb_adjust_room() in uapi
len_diff is signed.

Fixes: fa15601ab3 ("bpf: add documentation for eBPF helpers (33-41)")
CC: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-10-17 21:45:50 -07:00
David Howells
f366d322ae UAPI: ndctl: Remove use of PAGE_SIZE
The macro PAGE_SIZE isn't valid outside of the kernel, so it should not
appear in UAPI headers.

Furthermore, the actual machine page size could theoretically change from
an application's point of view if it's running in a container that gets
migrated to another machine (say 4K/ppc64 to 64K/ppc64).

Fixes: f2ba5a5bae ("libnvdimm, namespace: make min namespace size 4K")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-10-17 13:56:58 -07:00
David Howells
9607871f37 UAPI: ndctl: Fix g++-unsupported initialisation in headers
The following code in the linux/ndctl header file:

	static inline const char *nvdimm_bus_cmd_name(unsigned cmd)
	{
		static const char * const names[] = {
			[ND_CMD_ARS_CAP] = "ars_cap",
			[ND_CMD_ARS_START] = "ars_start",
			[ND_CMD_ARS_STATUS] = "ars_status",
			[ND_CMD_CLEAR_ERROR] = "clear_error",
			[ND_CMD_CALL] = "cmd_call",
		};

		if (cmd < ARRAY_SIZE(names) && names[cmd])
			return names[cmd];
		return "unknown";
	}

is broken in a number of ways:

 (1) ARRAY_SIZE() is not generally defined.

 (2) g++ does not support "non-trivial" array initialisers fully yet.

 (3) Every file that calls this function will acquire a copy of names[].

The same goes for nvdimm_cmd_name().

Fix all three by converting to a switch statement where each case returns a
string.  That way if cmd is a constant, the compiler can trivially reduce it
and, if not, the compiler can use a shared lookup table if it thinks that is
more efficient.

A better way would be to remove these functions and their arrays from the
header entirely.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-10-17 13:56:54 -07:00
Jim Mattson
c4f55198c7 kvm: x86: Introduce KVM_CAP_EXCEPTION_PAYLOAD
This is a per-VM capability which can be enabled by userspace so that
the faulting linear address will be included with the information
about a pending #PF in L2, and the "new DR6 bits" will be included
with the information about a pending #DB in L2. With this capability
enabled, the L1 hypervisor can now intercept #PF before CR2 is
modified. Under VMX, the L1 hypervisor can now intercept #DB before
DR6 and DR7 are modified.

When userspace has enabled KVM_CAP_EXCEPTION_PAYLOAD, it should
generally provide an appropriate payload when injecting a #PF or #DB
exception via KVM_SET_VCPU_EVENTS. However, to support restoring old
checkpoints, this payload is not required.

Note that bit 16 of the "new DR6 bits" is set to indicate that a debug
exception (#DB) or a breakpoint exception (#BP) occurred inside an RTM
region while advanced debugging of RTM transactional regions was
enabled. This is the reverse of DR6.RTM, which is cleared in this
scenario.

This capability also enables exception.pending in struct
kvm_vcpu_events, which allows userspace to distinguish between pending
and injected exceptions.

Reported-by: Jim Mattson <jmattson@google.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 19:07:44 +02:00
Vitaly Kuznetsov
57b119da35 KVM: nVMX: add KVM_CAP_HYPERV_ENLIGHTENED_VMCS capability
Enlightened VMCS is opt-in. The current version does not contain all
fields supported by nested VMX so we must not advertise the
corresponding VMX features if enlightened VMCS is enabled.

Userspace is given the enlightened VMCS version supported by KVM as
part of enabling KVM_CAP_HYPERV_ENLIGHTENED_VMCS. The version is to
be advertised to the nested hypervisor, currently done via a cpuid
leaf for Hyper-V.

Suggested-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:14 +02:00
Peng Hao
0804c849f1 kvm/x86 : add coalesced pio support
Coalesced pio is based on coalesced mmio and can be used for some port
like rtc port, pci-host config port and so on.

Specially in case of rtc as coalesced pio, some versions of windows guest
access rtc frequently because of rtc as system tick. guest access rtc like
this: write register index to 0x70, then write or read data from 0x71.
writing 0x70 port is just as index and do nothing else. So we can use
coalesced pio to handle this scene to reduce VM-EXIT time.

When starting and closing a virtual machine, it will access pci-host config
port frequently. So setting these port as coalesced pio can reduce startup
and shutdown time.

without my patch, get the vm-exit time of accessing rtc 0x70 and piix 0xcf8
using perf tools: (guest OS : windows 7 64bit)
IO Port Access  Samples Samples%  Time%  Min Time  Max Time  Avg time
0x70:POUT        86     30.99%    74.59%   9us      29us    10.75us (+- 3.41%)
0xcf8:POUT     1119     2.60%     2.12%   2.79us    56.83us 3.41us (+- 2.23%)

with my patch
IO Port Access  Samples Samples%  Time%   Min Time  Max Time   Avg time
0x70:POUT       106    32.02%    29.47%    0us      10us     1.57us (+- 7.38%)
0xcf8:POUT      1065    1.67%     0.28%   0.41us    65.44us   0.66us (+- 10.55%)

Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:30:11 +02:00
Vitaly Kuznetsov
214ff83d44 KVM: x86: hyperv: implement PV IPI send hypercalls
Using hypercall for sending IPIs is faster because this allows to specify
any number of vCPUs (even > 64 with sparse CPU set), the whole procedure
will take only one VMEXIT.

Current Hyper-V TLFS (v5.0b) claims that HvCallSendSyntheticClusterIpi
hypercall can't be 'fast' (passing parameters through registers) but
apparently this is not true, Windows always uses it as 'fast' so we need
to support that.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-17 00:29:47 +02:00
Xin Long
0ac1077e3a sctp: get pr_assoc and pr_stream all status with SCTP_PR_SCTP_ALL instead
According to rfc7496 section 4.3 or 4.4:

   sprstat_policy:  This parameter indicates for which PR-SCTP policy
      the user wants the information.  It is an error to use
      SCTP_PR_SCTP_NONE in sprstat_policy.  If SCTP_PR_SCTP_ALL is used,
      the counters provided are aggregated over all supported policies.

We change to dump pr_assoc and pr_stream all status by SCTP_PR_SCTP_ALL
instead, and return error for SCTP_PR_SCTP_NONE, as it also said "It is
an error to use SCTP_PR_SCTP_NONE in sprstat_policy. "

Fixes: 826d253d57 ("sctp: add SCTP_PR_ASSOC_STATUS on sctp sockopt")
Fixes: d229d48d18 ("sctp: add SCTP_PR_STREAM_STATUS sockopt for prsctp")
Reported-by: Ying Xu <yinxu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-16 09:58:49 -07:00
Fernando Fernandez Mancera
a218dc82f0 netfilter: nft_osf: Add ttl option support
Add ttl option support to the nftables "osf" expression.

Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-10-16 10:01:48 +02:00
Justin.Lee1@Dell.com
9771b8ccdf net/ncsi: Extend NC-SI Netlink interface to allow user space to send NC-SI command
The new command (NCSI_CMD_SEND_CMD) is added to allow user space application
to send NC-SI command to the network card.
Also, add a new attribute (NCSI_ATTR_DATA) for transferring request and response.

The work flow is as below.

Request:
User space application
	-> Netlink interface (msg)
	-> new Netlink handler - ncsi_send_cmd_nl()
	-> ncsi_xmit_cmd()

Response:
Response received - ncsi_rcv_rsp()
	-> internal response handler - ncsi_rsp_handler_xxx()
	-> ncsi_rsp_handler_netlink()
	-> ncsi_send_netlink_rsp ()
	-> Netlink interface (msg)
	-> user space application

Command timeout - ncsi_request_timeout()
	-> ncsi_send_netlink_timeout ()
	-> Netlink interface (msg with zero data length)
	-> user space application

Error:
Error detected
	-> ncsi_send_netlink_err ()
	-> Netlink interface (err msg)
	-> user space application

Signed-off-by: Justin Lee <justin.lee1@dell.com>
Reviewed-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:00:59 -07:00
Maciej W. Rozycki
61414f5ec9 FDDI: defza: Add support for DEC FDDIcontroller 700 TURBOchannel adapter
Add support for the DEC FDDIcontroller 700 (DEFZA), Digital Equipment
Corporation's first-generation FDDI network interface adapter, made for
TURBOchannel and based on a discrete version of what eventually became
Motorola's widely used CAMEL chipset.

The CAMEL chipset is present for example in the DEC FDDIcontroller
TURBOchannel, EISA and PCI adapters (DEFTA/DEFEA/DEFPA) that we support
with the `defxx' driver, however the host bus interface logic and the
firmware API are different in the DEFZA and hence a separate driver is
required.

There isn't much to say about the driver except that it works, but there
is one peculiarity to mention.  The adapter implements two Tx/Rx queue
pairs.

Of these one pair is the usual network Tx/Rx queue pair, in this case
used by the adapter to exchange frames with the ring, via the RMC (Ring
Memory Controller) chip.  The Tx queue is handled directly by the RMC
chip and resides in onboard packet memory.  The Rx queue is maintained
via DMA in host memory by adapter's firmware copying received data
stored by the RMC in onboard packet memory.

The other pair is used to communicate SMT frames with adapter's
firmware.  Any SMT frame received from the RMC via the Rx queue must be
queued back by the driver to the SMT Rx queue for the firmware to
process.  Similarly the firmware uses the SMT Tx queue to supply the
driver with SMT frames that must be queued back to the Tx queue for the
RMC to send to the ring.

This solution was chosen because the designers ran out of PCB space and
could not squeeze in more logic onto the board that would be required to
handle this SMT frame traffic without the need to involve the driver, as
with the later DEFTA/DEFEA/DEFPA adapters.

Finally the driver does some Frame Control byte decoding, so to avoid
magic numbers some macros are added to <linux/if_fddi.h>.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 21:46:06 -07:00
Dan Schatzberg
5571f1e654 fuse: enable caching of symlinks
FUSE file reads are cached in the page cache, but symlink reads are
not. This patch enables FUSE READLINK operations to be cached which
can improve performance of some FUSE workloads.

In particular, I'm working on a FUSE filesystem for access to source
code and discovered that about a 10% improvement to build times is
achieved with this patch (there are a lot of symlinks in the source
tree).

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-10-15 15:43:07 +02:00
David S. Miller
d864991b22 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts were easy to resolve using immediate context mostly,
except the cls_u32.c one where I simply too the entire HEAD
chunk.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-12 21:38:46 -07:00
David S. Miller
e32cf9a386 Highlights:
* merge net-next, so I can finish the hwsim workqueue removal
  * fix TXQ NULL pointer issue that was reported multiple times
  * minstrel cleanups from Felix
  * simplify lib80211 code by not using skcipher, note that this
    will conflict with the crypto tree (and this new code here
    should be used)
  * use new netlink policy validation in nl80211
  * fix up SAE (part of WPA3) in client-mode
  * FTM responder support in the stack
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAlvAgX8ACgkQB8qZga/f
 l8Sg6A/+IhbIRerfje8kVIWNG3fUa5oF/vXYRfNRDeNwHu6OVYer5uj6bonkSSoU
 lpWivhuF1XxdIQx6npr8j7R6bw8sbKMKrpnR7ObFaaCmxjmjgMNY4pkkmzSRv3zi
 LP91woUQ2mFv16U9E+9HROBJXgeZfU96bBBR4mfqKrhnhFz6O8SYa9JvRpylcLTz
 HY+a+ezIztbrCPImBUOkSfUPuoeWSJ8Y4LN2zhqbY/ZUk3n4iFOD44bAN346u3+q
 CzfL+e+PKjuSiwPrjrKMIp9emiAE1zeWPUcmnSAYU/s9zISoZ8v5L/kSBanVnvJI
 DIHJ6LB+aETm1gnj1dSnboXRKpUaW63nbkoxwTy15JrdIMysh9vDaXWok6p85mDq
 FVbFx1MzBcYPCMrACEBhhRz71eZHFtyrGyqUm1EZcdcNeQtg+eJdXvhTiowg5JhH
 Csl+ZloZfqrd0aPjaM3F819aVMKcJORdQhrfTN+GYW81mdZyZO3VWmzd3kI8Pjuc
 WRZmykYTub/MbDFjE5g1LRJ4jZ/bThIl71i8hYEopqfim9URE1RPm9mSMUv9ggVL
 /xNIIMhw3H/+P3g0z2a0Z06hT/dScHsZ9If5Srb699/NUOiES5zl43eXidBzuAbI
 36DHaHhf6k8/e3aaLOS1oD3AjpiW28uxq7baeLfd1B8l9WmmgZM=
 =RqMD
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-davem-2018-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
Highlights:
 * merge net-next, so I can finish the hwsim workqueue removal
 * fix TXQ NULL pointer issue that was reported multiple times
 * minstrel cleanups from Felix
 * simplify lib80211 code by not using skcipher, note that this
   will conflict with the crypto tree (and this new code here
   should be used)
 * use new netlink policy validation in nl80211
 * fix up SAE (part of WPA3) in client-mode
 * FTM responder support in the stack
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-12 10:56:56 -07:00
Nikolay Aleksandrov
9163a0fc1f net: bridge: add support for per-port vlan stats
This patch adds an option to have per-port vlan stats instead of the
default global stats. The option can be set only when there are no port
vlans in the bridge since we need to allocate the stats if it is set
when vlans are being added to ports (and respectively free them
when being deleted). Also bump RTNL_MAX_TYPE as the bridge is the
largest user of options. The current stats design allows us to add
these without any changes to the fast-path, it all comes down to
the per-vlan stats pointer which, if this option is enabled, will
be allocated for each port vlan instead of using the global bridge-wide
one.

CC: bridge@lists.linux-foundation.org
CC: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-12 10:18:58 -07:00
Ankita Bajaj
0d4e14a32d nl80211: Add per peer statistics to compute FCS error rate
Add support for drivers to report the total number of MPDUs received
and the number of MPDUs received with an FCS error from a specific
peer. These counters will be incremented only when the TA of the
frame matches the MAC address of the peer irrespective of FCS
error.

It should be noted that the TA field in the frame might be corrupted
when there is an FCS error and TA matching logic would fail in such
cases. Hence, FCS error counter might not be fully accurate, but it can
provide help in detecting bad RX links in significant number of cases.
This FCS error counter without full accuracy can be used, e.g., to
trigger a kick-out of a connected client with a bad link in AP mode to
force such a client to roam to another AP.

Signed-off-by: Ankita Bajaj <bankita@codeaurora.org>
Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-10-12 12:56:34 +02:00
Gerd Hoffmann
3cdf752506 vfio: add edid api for display (vgpu) devices.
This allows to set EDID monitor information for the vgpu display, for a
more flexible display configuration, using a special vfio region.  Check
the comment describing struct vfio_region_gfx_edid for more details.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-10-11 10:22:35 -06:00
David S. Miller
49b538e79b RxRPC fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIVAwUAW7vWy/u3V2unywtrAQJkeA//UTmUmv9bL8k38XvnnHQ82iTyIjvZ40md
 9XbNagwjkxuDv2eqW1rTxk5+Po596Nb4s/cBvpyAABjCA+zoT0/fklK5PqsSKFla
 ibAHlrBiv/1FlSUSTf4dvAfVVszRAItCP4OhKTU+0fkX3Hq75T7Lty1h92Yk/ijx
 ccqSV9K/nB4WN9bbg+sDngu3RAVR3VJ5RhVSmrnE9xd3M9ihz2RHacJ6GWNQf5PV
 j8cKuP3qY/2K245hb4sGn1i9OO6/WVVQhnojYA+jR8Vg4+8mY0Y++MLsW8ywlsuo
 xlrDCVPBaR6YqjnlDoOXIXYZ8EbEsmy5FP51zfEZJqeKJcVPLavQyoku0xJ8fHgr
 oK/3DBUkwgi0hiq4dC1CrmXZRSbBQqGTEJMA6AKh2ApXb9BkmhmqRGMx+2nqocwF
 TjAQmCu0FJTAYdwJuqzv5SRFjocjrHGcYDpQ27CgNqSiiDlh8CZ1ej/mbV+o8pUQ
 tOrpnRCLY8Nl55AgJbl2260mKcpcKQcRUwggKAOzqRjXwcy9q3KKm8H149qejJSl
 AQihDkPVqBZrv5ODKR0z88e5cVF9ad+vBPd+2TrRVGtTWhm/q8N/0Hh0M2sYDSc6
 vLklYfwI4U5U1/naJC9WCkW+ybj5ekVNHgW91ICvQVNG0TH2zfE7sUGMQn6lRZL8
 jfpLTVbtKfI=
 =q6E6
 -----END PGP SIGNATURE-----

Merge tag 'rxrpc-fixes-20181008' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
rxrpc: Fix packet reception code

Here are a set of patches that prepares for and fix problems in rxrpc's
package reception code.  There serious problems are:

 (A) There's a window between binding the socket and setting the data_ready
     hook in which packets can find their way into the UDP socket's receive
     queues.

 (B) The skb_recv_udp() will return an error (and clear the error state) if
     there was an error on the Tx side.  rxrpc doesn't handle this.

 (C) The rxrpc data_ready handler doesn't fully drain the UDP receive
     queue.

 (D) The rxrpc data_ready handler assumes it is called in a non-reentrant
 state.

The second patch fixes (A) - (C); the third patch renders (B) and (C)
non-issues by using the recap_rcv hook instead of data_ready - and the
final patch fixes (D).  That last is the most complex.

The preparatory patches are:

 (1) Fix some places that are doing things in the wrong net namespace.

 (2) Stop taking the rcu read lock as it's held by the IP input routine in
     the call chain.

 (3) Only end the Tx phase if *we* rotated the final packet out of the Tx
     buffer.

 (4) Don't assume that the call state won't change after dropping the
     call_state lock.

 (5) Only take receive window and MTU suze parameters from an ACK packet if
     it's the latest ACK packet.

 (6) Record connection-level abort information correctly.

 (7) Fix a trace line.

And then there are three main patches - note that these are mixed in with
the preparatory patches somewhat:

 (1) Fix the setup window (A), skb_recv_udp() error check (B) and packet
     drainage (C).

 (2) Switch to using the encap_rcv instead of data_ready to cut out the
     effects of the UDP read queues and get the packets delivered directly.

 (3) Add more locking into the various packet input paths to defend against
     re-entrance (D).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:27:38 -07:00
David S. Miller
071a234ad7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2018-10-08

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) sk_lookup_[tcp|udp] and sk_release helpers from Joe Stringer which allow
BPF programs to perform lookups for sockets in a network namespace. This would
allow programs to determine early on in processing whether the stack is
expecting to receive the packet, and perform some action (eg drop,
forward somewhere) based on this information.

2) per-cpu cgroup local storage from Roman Gushchin.
Per-cpu cgroup local storage is very similar to simple cgroup storage
except all the data is per-cpu. The main goal of per-cpu variant is to
implement super fast counters (e.g. packet counters), which don't require
neither lookups, neither atomic operations in a fast path.
The example of these hybrid counters is in selftests/bpf/netcnt_prog.c

3) allow HW offload of programs with BPF-to-BPF function calls from Quentin Monnet

4) support more than 64-byte key/value in HW offloaded BPF maps from Jakub Kicinski

5) rename of libbpf interfaces from Andrey Ignatov.
libbpf is maturing as a library and should follow good practices in
library design and implementation to play well with other libraries.
This patch set brings consistent naming convention to global symbols.

6) relicense libbpf as LGPL-2.1 OR BSD-2-Clause from Alexei Starovoitov
to let Apache2 projects use libbpf

7) various AF_XDP fixes from Björn and Magnus
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 23:42:44 -07:00
Paul Mackerras
901f8c3f6f KVM: PPC: Book3S HV: Add NO_HASH flag to GET_SMMU_INFO ioctl result
This adds a KVM_PPC_NO_HASH flag to the flags field of the
kvm_ppc_smmu_info struct, and arranges for it to be set when
running as a nested hypervisor, as an unambiguous indication
to userspace that HPT guests are not supported.  Reporting the
KVM_CAP_PPC_MMU_HASH_V3 capability as false could be taken as
indicating only that the new HPT features in ISA V3.0 are not
supported, leaving it ambiguous whether pre-V3.0 HPT features
are supported.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-10-09 16:14:54 +11:00
Paul Mackerras
aa069a9969 KVM: PPC: Book3S HV: Add a VM capability to enable nested virtualization
With this, userspace can enable a KVM-HV guest to run nested guests
under it.

The administrator can control whether any nested guests can be run;
setting the "nested" module parameter to false prevents any guests
becoming nested hypervisors (that is, any attempt to enable the nested
capability on a guest will fail).  Guests which are already nested
hypervisors will continue to be so.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-10-09 16:14:47 +11:00
David S. Miller
9000a457a0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for your net-next tree:

1) Support for matching on ipsec policy already set in the route, from
   Florian Westphal.

2) Split set destruction into deactivate and destroy phase to make it
   fit better into the transaction infrastructure, also from Florian.
   This includes a patch to warn on imbalance when setting the new
   activate and deactivate interfaces.

3) Release transaction list from the workqueue to remove expensive
   synchronize_rcu() from configuration plane path. This speeds up
   configuration plane quite a bit. From Florian Westphal.

4) Add new xfrm/ipsec extension, this new extension allows you to match
   for ipsec tunnel keys such as source and destination address, spi and
   reqid. From Máté Eckl and Florian Westphal.

5) Add secmark support, this includes connsecmark too, patches
   from Christian Gottsche.

6) Allow to specify remaining bytes in xt_quota, from Chenbo Feng.
   One follow up patch to calm a clang warning for this one, from
   Nathan Chancellor.

7) Flush conntrack entries based on layer 3 family, from Kristian Evensen.

8) New revision for cgroups2 to shrink the path field.

9) Get rid of obsolete need_conntrack(), as a result from recent
   demodularization works.

10) Use WARN_ON instead of BUG_ON, from Florian Westphal.

11) Unused exported symbol in nf_nat_ipv4_fn(), from Florian.

12) Remove superfluous check for timeout netlink parser and dump
    functions in layer 4 conntrack helpers.

13) Unnecessary redundant rcu read side locks in NAT redirect,
    from Taehee Yoo.

14) Pass nf_hook_state structure to error handlers, patch from
    Florian Westphal.

15) Remove ->new() interface from layer 4 protocol trackers. Place
    them in the ->packet() interface. From Florian.

16) Place conntrack ->error() handling in the ->packet() interface.
    Patches from Florian Westphal.

17) Remove unused parameter in the pernet initialization path,
    also from Florian.

18) Remove additional parameter to specify layer 3 protocol when
    looking up for protocol tracker. From Florian.

19) Shrink array of layer 4 protocol trackers, from Florian.

20) Check for linear skb only once from the ALG NAT mangling
    codebase, from Taehee Yoo.

21) Use rhashtable_walk_enter() instead of deprecated
    rhashtable_walk_init(), also from Taehee.

22) No need to flush all conntracks when only one single address
    is gone, from Tan Hu.

23) Remove redundant check for NAT flags in flowtable code, from
    Taehee Yoo.

24) Use rhashtable_lookup() instead of rhashtable_lookup_fast()
    from netfilter codebase, since rcu read lock side is already
    assumed in this path.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 21:28:55 -07:00
David Ahern
89d35528d1 netlink: Add new socket option to enable strict checking on dumps
Add a new socket option, NETLINK_DUMP_STRICT_CHK, that userspace
can use via setsockopt to request strict checking of headers and
attributes on dump requests.

To get dump features such as kernel side filtering based on data in
the header or attributes appended to the dump request, userspace
must call setsockopt() for NETLINK_DUMP_STRICT_CHK and a non-zero
value. Since the netlink sock and its flags are private to the
af_netlink code, the strict checking flag is passed to dump handlers
via a flag in the netlink_callback struct.

For old userspace on new kernel there is no impact as all of the data
checks in later patches are wrapped in a check on the new strict flag.

For new userspace on old kernel, the setsockopt will fail and even if
new userspace sets data in the headers and appended attributes the
kernel will silently ignore it. Moving forward when the setsockopt
succeeds, the new userspace on old kernel means the dump request can
pass an attribute the kernel does not understand. The dump will then
fail as the older kernel does not understand it.

New userspace on new kernel setting the socket option gets the benefit
of the improved data dump.

Kernel side the NETLINK_DUMP_STRICT_CHK uapi is converted to a generic
NETLINK_F_STRICT_CHK flag which can potentially be leveraged for tighter
checking on the NEW, DEL, and SET commands.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Howells
5271953cad rxrpc: Use the UDP encap_rcv hook
Use the UDP encap_rcv hook to cut the bit out of the rxrpc packet reception
in which a packet is placed onto the UDP receive queue and then immediately
removed again by rxrpc.  Going via the queue in this manner seems like it
should be unnecessary.

This does, however, require the invention of a value to place in encap_type
as that's one of the conditions to switch packets out to the encap_rcv
hook.  Possibly the value doesn't actually matter for anything other than
sockopts on the UDP socket, which aren't accessible outside of rxrpc
anyway.

This seems to cut a bit of time out of the time elapsed between each
sk_buff being timestamped and turning up in rxrpc (the final number in the
following trace excerpts).  I measured this by making the rxrpc_rx_packet
trace point print the time elapsed between the skb being timestamped and
the current time (in ns), e.g.:

	... 424.278721: rxrpc_rx_packet: ...  ACK 25026

So doing a 512MiB DIO read from my test server, with an unmodified kernel:

	N       min     max     sum		mean    stddev
	27605   2626    7581    7.83992e+07     2840.04 181.029

and with the patch applied:

	N       min     max     sum		mean    stddev
	27547   1895    12165   6.77461e+07     2459.29 255.02

Signed-off-by: David Howells <dhowells@redhat.com>
2018-10-08 15:45:18 +01:00
Greg Kroah-Hartman
4e1a606d55 Merge 4.19-rc7 into tty-next
We want the fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-08 15:43:12 +02:00
Greg Kroah-Hartman
8aff4eaa1d Merge 4.19-rc7 into usb-next
We want the USB fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-08 15:40:42 +02:00
Greg Kroah-Hartman
ba1cb318dc Merge 4.19-rc7 into char-misc-next
We want the fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-08 15:33:21 +02:00
Amir Goldstein
d0a6a87e40 fanotify: support reporting thread id instead of process id
In order to identify which thread triggered the event in a
multi-threaded program, add the FAN_REPORT_TID flag in fanotify_init to
opt-in for reporting the event creator's thread id information.

Signed-off-by: nixiaoming <nixiaoming@huawei.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2018-10-08 13:48:45 +02:00
Johannes Berg
188de5dd80 Merge remote-tracking branch 'net-next/master' into mac80211-next
Merge net-next, which pulled in net, so I can merge a few more
patches that would otherwise conflict.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-10-08 09:48:36 +02:00
Eugene Syromiatnikov
d4f0006a08 net/smc: retain old name for diag_mode field
Commit c601171d7a ("net/smc: provide smc mode in smc_diag.c") changed
the name of diag_fallback field of struct smc_diag_msg structure
to diag_mode.  However, this structure is a part of UAPI, and this change
breaks user space applications that use it ([1], for example).  Since
the new name is more suitable, convert the field to a union that provides
access to the data via both the new and the old name.

[1] https://gitlab.com/strace/strace/blob/v4.24/netlink_smc_diag.c#L165

Fixes: c601171d7a ("net/smc: provide smc mode in smc_diag.c")
Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-07 21:06:28 -07:00
Eugene Syromiatnikov
a21048c8ec net/smc: use __aligned_u64 for 64-bit smc_diag fields
Commit 4b1b7d3b30 ("net/smc: add SMC-D diag support") introduced
new UAPI-exposed structure, struct smcd_diag_dmbinfo.  However,
it's not usable by compat binaries, as it has different layout there.
Probably, the most straightforward fix that will avoid similar issues
in the future is to use __aligned_u64 for 64-bit fields.

Fixes: 4b1b7d3b30 ("net/smc: add SMC-D diag support")
Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-07 21:06:28 -07:00
David S. Miller
72438f8cef Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-10-06 14:43:42 -07:00