Commit graph

549700 commits

Author SHA1 Message Date
David Sterba
8eb934591f btrfs: check unsupported filters in balance arguments
We don't verify that all the balance filter arguments supplemented by
the flags are actually known to the kernel. Thus we let it silently pass
and do nothing.

At the moment this means only the 'limit' filter, but we're going to add
a few more soon so it's better to have that fixed. Also in older stable
kernels so that it works with newer userspace tools.

Cc: stable@vger.kernel.org # 3.16+
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-10-13 18:53:03 -07:00
Geliang Tang
c3643885aa mISDN: use kstrdup() in dsp_pipeline_build
Use kstrdup instead of strlen-kmalloc-strcpy. Remove unneeded NULL
test, it will be tested inside kstrdup. Remove 0 length string test,
it has been tested in the caller of dsp_pipeline_build.

Signed-off-by: Geliang Tang <geliangtang@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 18:29:50 -07:00
Eric Dumazet
4bdc3d6614 tcp/dccp: fix behavior of stale SYN_RECV request sockets
When a TCP/DCCP listener is closed, its pending SYN_RECV request sockets
become stale, meaning 3WHS can not complete.

But current behavior is wrong :
incoming packets finding such stale sockets are dropped.

We need instead to cleanup the request socket and perform another
lookup :
- Incoming ACK will give a RST answer,
- SYN rtx might find another listener if available.
- We expedite cleanup of request sockets and old listener socket.

Fixes: 079096f103 ("tcp/dccp: install syn_recv requests into ehash table")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 18:26:34 -07:00
Ulf Hansson
a98f1b78ec PM / Domains: Fix validation of latency constraints in genpd governor
Commit ba2bbfbf63 (PM / Domains: Remove intermediate states from the
power off sequence) changed the power off sequence in genpd. That also
required some updates regarding the validation of latency constraints in
the genpd governor. Unfortunate that wasn't covered, so let's fix this.

From a runtime PM and latency point of view, we need to consider the worst
case scenario while validating latency constraints. That's typically when
a call to pm_runtime_get_sync() needs to wait for a ongoing runtime
suspend operation to be carried out, as it then also needs to wait for the
device to be runtime resumed again.

The above mentioned commit made the genpd governor's ->stop_ok() callback
responsible of validating genpd's device's runtime suspend/resume latency.
In other words, the constraint needs to be validated towards the relevant
latencies present in genpd's ->runtime_suspend|resume() callbacks.

Earlier, that included latencies from the ->stop|start() callbacks, but as
->save|restore_state() are now also being invoked from genpd's
->runtime_suspend|resume() and to comply with the worst case scenario,
let's take also those latencies into account.

Fixes: ba2bbfbf63 (PM / Domains: Remove intermediate states from the power off sequence)
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-10-14 02:34:43 +02:00
Christoph Lameter
0b5c9279e5 IB/ipoib: For sendonly join free the multicast group on leave
When we leave the multicast group on expiration of a neighbor we
do not free the mcast structure. This results in a memory leak
that causes ib_dealloc_pd to fail and print a WARN_ON message
and backtrace.

Fixes: bd99b2e05c (IB/ipoib: Expire sendonly multicast joins)
Signed-off-by: Christoph Lameter <cl@linux.com>
Tested-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-13 16:43:59 -04:00
Linus Torvalds
5b5f145527 Two nfsd fixes, one for an RDMA crash, one for a pnfs/block protocol
bug.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWHUj5AAoJECebzXlCjuG+KIoP/RW5zigAEKqUiD7ycKR91BxD
 9Nt0fqTTrbkGJhKM1/DN4YEjogAHeFW5OnGiLQRUNI/qdy+I1Gyr1kgwGmCCVDt9
 d8AhnxcnXR5SmsQHk7eeUd/rnODetf0bW5YJ8PfFbnC6cmM013nR9ujEccUuCl9M
 hHTp+690Doab00PtWtsjmZv5d+eT1bktY/R2PuQhyQM2CKWh1u4FeNTd1lWE551D
 b1wSvhAGMYVEsQv8+HICDrIQ8loGfH2gpBILERLM2yJlhN1IPU3RmNSAcQpZSaql
 veJYVmHdpMACCLp0Dd3hwWKDYvcQ2lCqKk+Cpd0vLpvZ8J5OjCLC+a2dh0PRIYuf
 pwFCvbWz6dn27/9eXEKbyT2JIeBIl4qwrFjfiRKlNX0c4HGKXaE2gJrY7bxnDxe1
 BatAbEFZ+rxHyPmycaj3JdyOxafmw94XzbT8q2g7tmUCj+pvAI+Pbv6PlwN6W2r7
 aGBZzgd8Y9pT6ZbCB0e413d/t5ulxwkt6vVz9Jze4gfcUrWcqHaqt7AadMl7obUx
 AYPLAVGeHybdKlLvqv42IF2QM8ZhizM0+EnxkjfWLrsa7WbstWX5KLPpm3K80dM7
 98p1ToNQDFcNU8WBZw8AkBpFz4j32RVOkvzWFWbhCo+T3is4BmP16uEEjH90aCCY
 skQKMrq8J1ox33gz5gT7
 =Pkuy
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-4.3-2' of git://linux-nfs.org/~bfields/linux

Pull nfsd fixes from Bruce Fields:
 "Two nfsd fixes, one for an RDMA crash, one for a pnfs/block protocol
  bug"

* tag 'nfsd-4.3-2' of git://linux-nfs.org/~bfields/linux:
  svcrdma: Fix NFS server crash triggered by 1MB NFS WRITE
  nfsd/blocklayout: accept any minlength
2015-10-13 11:31:03 -07:00
Linus Torvalds
6006d4521b Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
 "This fixes the following issues:

   - Fix AVX detection to prevent use of non-existent AESNI.

   - Some SPARC ciphers did not set their IV size which may lead to
     memory corruption"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: ahash - ensure statesize is non-zero
  crypto: camellia_aesni_avx - Fix CPU feature checks
  crypto: sparc - initialize blkcipher.ivsize
2015-10-13 10:18:54 -07:00
Linus Torvalds
7554225312 IOMMU Fixes for Linux v4.3-rc5
A few fixes piled up:
 
 	* Fix for a suspend/resume issue where PCI probing code overwrote
 	  dev->irq for the MSI irq of the AMD IOMMU.
 
 	* Fix for a kernel crash when a 32 bit PCI device was assigned to a KVM
 	  guest.
 
 	* Fix for a possible memory leak in the VT-d driver
 
 	* A couple of fixes for the ARM-SMMU driver
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJWHNdbAAoJECvwRC2XARrjB/YQAKouJaRMjBaehx6kbaZMhMJy
 hXDsh8Xl6TtCe6kLD2uXrvjLZAdu32kjrtzhhcM21EO5Ms2Weq6A60/98LwnJ4Eg
 AqftjfxQsIwf2G1PvHb+xepgcFxIAhW6a3nORzx6d2AGrNWmMtUhbLTSncYjmojf
 Td4dscuRmRPenJUV1JhcJQBR62QonknIHV99QmevaCSAoUdyuMH+t5kQVEgPjx7C
 GlMPNEZZmGl7J3NXSWRtDSkUxFZ1OU8MTKc1LmPPHHAOZk37wbePihQbLLySlHPH
 v4G1R05e2hG7C66yu959fyOleL87lDToUXhwQNFJMqEc+e7IzBzZsB3ANEHjpLQH
 UJC9COU+sf8mPafja4ge/KbyGDmgDg/OMQJDhU6+DSXUflwymeWJmXr7sLFQex6O
 nZO/SVzkbKj+PKxV7UnGD0sTeAAk0X6vfhFCL0l/acPpQg0T6Fpky5D5fUMv5dWS
 xxxvxfwBcDoI44fxWBhfPYvmLFT9f5da+bpbzeeGjVSNezOkPJ65AJcVk5An4kQu
 PRzJGoq3XpZHOeg5+O7IKzeuJ+3qc7Tz4wAzMxcaNFpVBl2qp1RUkTbmS9/YV1b5
 ZOcIFBMLuUROE1ExsU19c5Uo0j1Bvh9jtdy6lNFCagQYzihtA0Jk19ucllx1jIjD
 sdv2hgDIauRToKF1d9xz
 =v5G4
 -----END PGP SIGNATURE-----

Merge tag 'iommu-fixes-v4.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU fixes from Joerg Roedel:
 "A few fixes piled up:

   - Fix for a suspend/resume issue where PCI probing code overwrote
     dev->irq for the MSI irq of the AMD IOMMU.

   - Fix for a kernel crash when a 32 bit PCI device was assigned to a
     KVM guest.

   - Fix for a possible memory leak in the VT-d driver

   - A couple of fixes for the ARM-SMMU driver"

* tag 'iommu-fixes-v4.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/amd: Fix NULL pointer deref on device detach
  iommu/amd: Prevent binding other PCI drivers to IOMMU PCI devices
  iommu/vt-d: Fix memory leak in dmar_insert_one_dev_info()
  iommu/arm-smmu: Use correct address mask for CMD_TLBI_S2_IPA
  iommu/arm-smmu: Ensure IAS is set correctly for AArch32-capable SMMUs
  iommu/io-pgtable-arm: Don't use dma_to_phys()
2015-10-13 10:09:59 -07:00
Linus Torvalds
06d1ee32a4 Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "I got a bit behind last week, so here is a delayed fixes pull:

   - a bunch of radeon/amd gpu fixes
   - some nouveau regression fixes (ppc bios reading and runtime pm fix)
   - one drm core oops fix
   - two qxl locking fixes
   - one qxl regression fix"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/nouveau/bios: fix OF loading
  drm/nouveau/fbcon: take runpm reference when userspace has an open fd
  drm/nouveau/nouveau: Disable AGP for SiS 761
  drm/nouveau/display: allow up to 16k width/height for fermi+
  drm/nouveau/bios: translate devinit pri/sec i2c bus to internal identifiers
  drm: Fix locking for sysfs dpms file
  drm/amdgpu: fix memory leak in amdgpu_vm_update_page_directory
  drm/amdgpu: fix 32-bit compiler warning
  drm/qxl: avoid dependency lock
  drm/qxl: avoid buffer reservation in qxl_crtc_page_flip
  drm/qxl: fix framebuffer dirty rectangle tracking.
  drm/amdgpu: flag iceland as experimental
  drm/amdgpu: check before checking pci bridge registers
  drm/amdgpu: fix num_crtc on CZ
  drm/amdgpu: restore the fbdev mode in lastclose
  drm/radeon: restore the fbdev mode in lastclose
  drm/radeon: add quirk for ASUS R7 370
  drm/amdgpu: add pm sysfs files late
  drm/radeon: add pm sysfs files late
2015-10-13 09:45:21 -07:00
Paolo Bonzini
7391773933 KVM: x86: fix SMI to halted VCPU
An SMI to a halted VCPU must wake it up, hence a VCPU with a pending
SMI must be considered runnable.

Fixes: 64d6067057
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-13 18:29:41 +02:00
Paolo Bonzini
5d9bc648b9 KVM: x86: clean up kvm_arch_vcpu_runnable
Split the huge conditional in two functions.

Fixes: 64d6067057
Cc: stable@vger.kernel.org
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-13 18:28:59 +02:00
Paolo Bonzini
f0d648bdf0 KVM: x86: map/unmap private slots in __x86_set_memory_region
Otherwise, two copies (one of them never populated and thus bogus)
are allocated for the regular and SMM address spaces.  This breaks
SMM with EPT but without unrestricted guest support, because the
SMM copy of the identity page map is all zeros.

By moving the allocation to the caller we also remove the last
vestiges of kernel-allocated memory regions (not accessible anymore
in userspace since commit b74a07beed, "KVM: Remove kernel-allocated
memory regions", 2010-06-21); that is a nice bonus.

Reported-by: Alexandre DERUMIER <aderumier@odiso.com>
Cc: stable@vger.kernel.org
Fixes: 9da0e4d5ac
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-13 18:28:58 +02:00
Paolo Bonzini
1d8007bdee KVM: x86: build kvm_userspace_memory_region in x86_set_memory_region
The next patch will make x86_set_memory_region fill the
userspace_addr.  Since the struct is not used untouched
anymore, it makes sense to build it in x86_set_memory_region
directly; it also simplifies the callers.

Reported-by: Alexandre DERUMIER <aderumier@odiso.com>
Cc: stable@vger.kernel.org
Fixes: 9da0e4d5ac
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-13 18:28:46 +02:00
Mike Snitzer
ba30670f4d dm thin: fix missing pool reference count decrement in pool_ctr error path
Fixes: ac8c3f3df ("dm thin: generate event when metadata threshold passed")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org # 3.10+
2015-10-13 12:20:55 -04:00
Sudip Mukherjee
a2a678ed4d dm snapshot persistent: fix missing cleanup in persistent_ctr error path
If an unsupported option is given then the early return from
persistent_ctr() leaked memory allocated for the 'pstore' and never
destroyed the 'metadata_wq'.

Fixes: b0d3cc011e ("dm snapshot: add new persistent store option to support overflow")
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2015-10-13 12:20:54 -04:00
Alexandre Belloni
42160a041d can: at91: remove at91_can_data
struct at91_can_data was used to pass a callback to the driver, allowing it
to switch the transceiver on and off. As all at91 boards are now using DT,
this is not used anymore, remove that structure.

Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2015-10-13 17:42:35 +02:00
Arnd Bergmann
ba61a8d9d7 can: avoid using timeval for uapi
The can subsystem communicates with user space using a bcm_msg_head
header, which contains two timestamps. This is problematic for
multiple reasons:

a) The structure layout is currently incompatible between 64-bit
   user space and 32-bit user space, and cannot work in compat
   mode (other than x32).

b) The timeval structure layout will change in 32-bit user
   space when we fix the y2038 overflow problem by redefining
   time_t to 64-bit, making new 32-bit user space incompatible
   with the current kernel interface.
   Cars last a long time and often use old kernels, so the actual
   users of this code are the most likely ones to migrate to y2038
   safe user space.

This tries to work around part of the problem by changing the
publicly visible user interface in the header, but not the binary
interface. Fortunately, the values passed around in the structure
are relative times and do not actually suffer from the y2038
overflow, so 32-bit is enough here.

We replace the use of 'struct timeval' with a newly defined
'struct bcm_timeval' that uses the exact same binary layout
as before and that still suffers from problem a) but not problem
b).

The downside of this approach is that any user space program
that currently assigns a timeval structure to these members
rather than writing the tv_sec/tv_usec portions individually
will suffer a compile-time error when built with an updated
kernel header. Fixing this error makes it work fine with old
and new headers though.

We could address problem a) by using '__u32' or 'int' members
rather than 'long', but that would have a more significant
downside in also breaking support for all existing 64-bit user
binaries that might be using this interface, which is likely
not acceptable.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: linux-can@vger.kernel.org
Cc: linux-api@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2015-10-13 17:42:34 +02:00
Gerhard Bertelsmann
3c200db564 can: sun4i: fix MODULE_DESCRIPTION
This patch change description of the module.

Signed-off-by: Gerhard Bertelsmann <info@gerhard-bertelsmann.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2015-10-13 17:42:34 +02:00
Gerhard Bertelsmann
887e07be3f can: sun4i: fix arbitration lost error reporting
This patch fixes a bug in arbitration error reporting

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Gerhard Bertelsmann <info@gerhard-bertelsmann.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2015-10-13 17:42:33 +02:00
Russell King
8996eafdcb crypto: ahash - ensure statesize is non-zero
Unlike shash algorithms, ahash drivers must implement export
and import as their descriptors may contain hardware state and
cannot be exported as is.  Unfortunately some ahash drivers did
not provide them and end up causing crashes with algif_hash.

This patch adds a check to prevent these drivers from registering
ahash algorithms until they are fixed.

Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-10-13 22:28:10 +08:00
Ian Morris
544d9b17f9 netfilter: ip6_tables: ternary operator layout
Correct whitespace layout of ternary operators in the netfilter-ipv6
code.

No changes detected by objdiff.

Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-13 14:12:38 +02:00
Ian Morris
f9527ea9b6 netfilter: ipv6: whitespace around operators
This patch cleanses whitespace around arithmetical operators.

No changes detected by objdiff.

Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-13 14:12:38 +02:00
Ian Morris
7695495d5a netfilter: ipv6: code indentation
Use tabs instead of spaces to indent code.

No changes detected by objdiff.

Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-13 14:12:38 +02:00
Ian Morris
cda219c6ad netfilter: ip6_tables: function definition layout
Use tabs instead of spaces to indent second line of parameters in
function definitions.

No changes detected by objdiff.

Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-13 14:12:37 +02:00
Ian Morris
6ac94619b6 netfilter: ip6_tables: label placement
Whitespace cleansing: Labels should not be indented.

No changes detected by objdiff.

Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-13 14:12:37 +02:00
Florian Westphal
514ed62ed3 netfilter: sync with packet rx also after removing queue entries
We need to sync packet rx again after flushing the queue entries.
Otherwise, the following race could happen:

cpu1: nf_unregister_hook(H) called, H unliked from lists, calls
synchronize_net() to wait for packet rx completion.

Problem is that while no new nf_queue_entry structs that use H can be
allocated, another CPU might receive a verdict from userspace just before
cpu1 calls nf_queue_nf_hook_drop to remove this entry:

cpu2: receive verdict from userspace, lock queue
cpu2: unlink nf_queue_entry struct E, which references H, from queue list
cpu1: calls nf_queue_nf_hook_drop, blocks on queue spinlock
cpu2: unlock queue
cpu1: nf_queue_nf_hook_drop drops affected queue entries
cpu2: call nf_reinject for E
cpu1: kfree(H)
cpu2: potential use-after-free for H

Cc: Eric W. Biederman <ebiederm@xmission.com>
Fixes: 085db2c045 ("netfilter: Per network namespace netfilter hooks.")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-10-13 13:59:56 +02:00
David S. Miller
bbb300eb97 Merge branch 'bridge-vlan'
Nikolay Aleksandrov says:

====================
bridge: vlan: cleanups & fixes (part 3)

Patch 01 converts the vlgrp member to use rcu as it was already used in a
similar way so better to make it official and use all the available RCU
instrumentation. Patch 02 fixes a bug where the vlan_list can be traversed
without rtnl or rcu held which could lead to using freed entries.
Patch 03 removes some redundant code that isn't needed anymore.
Patch 04 fixes a bug reported by Ido Schimmel about the vlan_flush order
and switchdevs, it moves it back.

v2: patch 03 and 04 are new, couldn't escape the second synchronize_rcu()
since the rhtable destruction can sleep
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:58:04 -07:00
Nikolay Aleksandrov
f409d0ed87 bridge: vlan: move back vlan_flush
Ido Schimmel reported a problem with switchdev devices because of the
order change of del_nbp operations, more specifically the move of
nbp_vlan_flush() which deletes all vlans and frees vlgrp after the
rx_handler has been unregistered. So in order to fix this move
vlan_flush back where it was and make it destroy the rhtable after
NULLing vlgrp and waiting a grace period to make sure noone can see it.

Reported-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:57:58 -07:00
Nikolay Aleksandrov
b8d02c3cac bridge: vlan: drop unnecessary flush code
As Ido Schimmel pointed out the vlan_vid_del() code in nbp_vlan_flush is
unnecessary (and is actually a remnant of the old vlan code) so we can
remove it.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:57:56 -07:00
Nikolay Aleksandrov
e9c953eff7 bridge: vlan: use rcu for vlan_list traversal in br_fill_ifinfo
br_fill_ifinfo is called by br_ifinfo_notify which can be called from
many contexts with different locks held, sometimes it relies upon
bridge's spinlock only which is a problem for the vlan code, so use
explicitly rcu for that to avoid problems.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:57:54 -07:00
Nikolay Aleksandrov
907b1e6e83 bridge: vlan: use proper rcu for the vlgrp member
The bridge and port's vlgrp member is already used in RCU way, currently
we rely on the fact that it cannot disappear while the port exists but
that is error-prone and we might miss places with improper locking
(either RCU or RTNL must be held to walk the vlan_list). So make it
official and use RCU for vlgrp to catch offenders. Introduce proper vlgrp
accessors and use them consistently throughout the code.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:57:52 -07:00
David S. Miller
4b918163ae Merge branch 'vrf-ipv6'
David Ahern says:

====================
net: VRF support in IPv6 stack

Initial support for VRF in IPv6 stack. Makes IPv6 functionality on par
with IPv4 -- ping, tcp client/server and udp client/server all work fine.
tcpdump on vrf device and external tap (e.g., host side tap device) shows
all packets with proper addresses. IPv6 does not need the source address
operation like IPv4. Verified vti6 works properly in my setup as does use
of an IPv6 address on the VRF device.

v3
- re-based to top of net-next (updates per net namespace changes by Eric)
- fixed dst_entry typecasts as requested by Dave
- added flags to inet6_rtm_getroute (IPv6 version of deaa0a6a93)

v2
- fixed CONFIG_IPV6 dependency as questioned by Cong
  - if IPV6 is a module, kbuild ensures VRF is a module
  - if IPV6 is disabled IPV6 functionality is compiled out of VRF module
- addressed comments from Nik over IRC
  - removed duplicate call to netif_is_l3_master in l3mdev_rt6_dst_by_oif
  - changed allocation flag from GFP_ATOMIC to GFP_KERNEL since it is init time
  - added free of rt6i_pcpu
  - check_ipv6_frame returns false only if packet is NDISC type
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:55:10 -07:00
David Ahern
ca254490c8 net: Add VRF support to IPv6 stack
As with IPv4 support for VRFs added to IPv6 stack by replacing hardcoded
table ids with possibly device specific ones and manipulating the oif in
the flowi6. The flow flags are used to skip oif compare in nexthop lookups
if the device is enslaved to a VRF via the L3 master device.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:55:08 -07:00
David Ahern
35402e3136 net: Add IPv6 support to VRF device
Add support for IPv6 to VRF device driver. Implemenation parallels what
has been done for IPv4.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:55:07 -07:00
David Ahern
c485068778 net: Export fib6_get_table and nd_tbl
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:55:05 -07:00
David Ahern
ccf3c8c3fe net: Add IPv6 support to l3mdev
Add operations to retrieve cached IPv6 dst entry from l3mdev device
and lookup IPv6 source address.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:55:04 -07:00
Eric W. Biederman
e332bc67cf ipv6: Don't call with rt6_uncached_list_flush_dev
As originally written rt6_uncached_list_flush_dev makes no sense when
called with dev == NULL as it attempts to flush all uncached routes
regardless of network namespace when dev == NULL.  Which is simply
incorrect behavior.

Furthermore at the point rt6_ifdown is called with dev == NULL no more
network devices exist in the network namespace so even if the code in
rt6_uncached_list_flush_dev were to attempt something sensible it
would be meaningless.

Therefore remove support in rt6_uncached_list_flush_dev for handling
network devices where dev == NULL, and only call rt6_uncached_list_flush_dev
 when rt6_ifdown is called with a network device.

Fixes: 8d0b94afdc ("ipv6: Keep track of DST_NOCACHE routes in case of iface down/unregister")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Reviewed-by: Martin KaFai Lau <kafai@fb.com>
Tested-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:52:40 -07:00
Nikolay Aleksandrov
af3793921d bridge: fix gc_timer mod/del race condition
commit c62987bbd8 ("bridge: push bridge setting ageing_time down to
switchdev") introduced a timer race condition because the gc_timer can
get rearmed after it's supposedly stopped and flushed in br_dev_delete()
leading to a use of freed memory. So take rtnl to sync with bridge
destruction when setting ageing_timer.
Here's the trace reproduced with these two commands running in parallel:
while :; do echo 10000 > /sys/class/net/br0/bridge/ageing_timer; done;
while :; do brctl addbr br0; ip l set br0 up; ip l set br0 down;
brctl delbr br0; done;

[  300.000029] BUG: unable to handle kernel paging request at
ffffffff811c59d3
[  300.000263] IP: [<ffffffff810f168e>] __internal_add_timer+0x2e/0xd0
[  300.000422] PGD 1a0f067 PUD 1a10063 PMD 10001e1
[  300.000639] Oops: 0003 [#1] SMP
[  300.000793] Modules linked in: bridge stp llc nfsd auth_rpcgss
oid_registry nfs_acl nfs lockd grace fscache sunrpc crct10dif_pclmul
crc32_pclmul crc32c_intel ghash_clmulni_intel ppdev aesni_intel
aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd
snd_hda_codec_generic qxl drm_kms_helper psmouse pcspkr ttm
snd_hda_intel 9pnet_virtio evdev serio_raw joydev snd_hda_codec 9pnet
virtio_balloon drm snd_hwdep virtio_console snd_hda_core pvpanic snd_pcm
i2c_piix4 snd_timer acpi_cpufreq parport_pc snd parport soundcore button
processor i2c_core ipv6 autofs4 hid_generic usbhid hid ext4 crc16
mbcache jbd2 sg sr_mod cdrom ata_generic virtio_blk virtio_net e1000
ehci_pci uhci_hcd ehci_hcd usbcore usb_common floppy ata_piix libata
virtio_pci virtio_ring virtio scsi_mod
[  300.004008] CPU: 1 PID: 1169 Comm: bash Not tainted 4.3.0-rc3+ #46
[  300.004008] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  300.004008] task: ffff880035be2200 ti: ffff88003795c000 task.ti:
ffff88003795c000
[  300.004008] RIP: 0010:[<ffffffff810f168e>]  [<ffffffff810f168e>]
__internal_add_timer+0x2e/0xd0
[  300.004008] RSP: 0018:ffff88003fd03e78  EFLAGS: 00010046
[  300.004008] RAX: ffff88003fd0ef60 RBX: 840fc78949c08548 RCX:
00000001ffffffff
[  300.004008] RDX: 0000000000000000 RSI: ffffffff811c59d3 RDI:
ffff88003fd0df00
[  300.004008] RBP: ffff88003fd03e78 R08: 00000000ffffffff R09:
0000000000000000
[  300.004008] R10: 0000000000000000 R11: 0000000000000000 R12:
ffff88003fd0df00
[  300.004008] R13: 0000000000000000 R14: 0000000000000001 R15:
ffffffff816032e0
[  300.004008] FS:  00007fcbdd609700(0000) GS:ffff88003fd00000(0000)
knlGS:0000000000000000
[  300.004008] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  300.004008] CR2: ffffffff811c59d3 CR3: 0000000037879000 CR4:
00000000000406e0
[  300.004008] Stack:
[  300.004008]  ffff88003fd03ea8 ffffffff810f1775 ffff88003c8cb958
ffff88003fd0df00
[  300.004008]  0000000000000000 0000000000000001 ffff88003fd03f18
ffffffff810f28c4
[  300.004008]  ffff88003fd0eb68 ffff88003fd0e968 ffff88003fd0e768
ffff88003fd0df68
[  300.004008] Call Trace:
[  300.004008]  <IRQ>
[  300.004008]  [<ffffffff810f1775>] cascade+0x45/0x70
[  300.004008]  [<ffffffff810f28c4>] run_timer_softirq+0x2f4/0x340
[  300.004008]  [<ffffffff8107e380>] __do_softirq+0xd0/0x440
[  300.004008]  [<ffffffff8107e8a3>] irq_exit+0xb3/0xc0
[  300.004008]  [<ffffffff815c2032>] smp_apic_timer_interrupt+0x42/0x50
[  300.004008]  [<ffffffff815bfe37>] apic_timer_interrupt+0x87/0x90
[  300.004008]  <EOI>
[  300.004008]  [<ffffffff811fb80c>] ? create_object+0x13c/0x2e0
[  300.004008]  [<ffffffff8109b23e>] ? __kernel_text_address+0x4e/0x70
[  300.004008]  [<ffffffff8109b23e>] ? __kernel_text_address+0x4e/0x70
[  300.004008]  [<ffffffff8101e17f>] print_context_stack+0x7f/0xf0
[  300.004008]  [<ffffffff8101d55b>] dump_trace+0x11b/0x300
[  300.004008]  [<ffffffff8102970b>] save_stack_trace+0x2b/0x50
[  300.004008]  [<ffffffff811fb80c>] create_object+0x13c/0x2e0
[  300.004008]  [<ffffffff815b2e8e>] kmemleak_alloc+0x4e/0xb0
[  300.004008]  [<ffffffff811e475d>] kmem_cache_alloc_trace+0x18d/0x2f0
[  300.004008]  [<ffffffff8128b139>] kernfs_fop_open+0xc9/0x380
[  300.004008]  [<ffffffff8120214f>] do_dentry_open+0x1ff/0x2f0
[  300.004008]  [<ffffffff8128b070>] ? kernfs_fop_release+0x70/0x70
[  300.004008]  [<ffffffff812034f9>] vfs_open+0x59/0x60
[  300.004008]  [<ffffffff812130de>] path_openat+0x1ce/0x1260
[  300.004008]  [<ffffffff812154ae>] do_filp_open+0x7e/0xe0
[  300.004008]  [<ffffffff812251ff>] ? __alloc_fd+0xaf/0x180
[  300.004008]  [<ffffffff8120387b>] do_sys_open+0x12b/0x210
[  300.004008]  [<ffffffff8120397e>] SyS_open+0x1e/0x20
[  300.004008]  [<ffffffff815bf0b6>] entry_SYSCALL_64_fastpath+0x16/0x7a
[  300.004008] Code: 66 90 48 8b 46 10 48 8b 4f 40 55 48 89 c2 48 89 e5
48 29 ca 48 81 fa ff 00 00 00 77 20 0f b6 c0 48 8d 44 c7 68 48 8b 10 48
85 d2 <48> 89 16 74 04 48 89 72 08 48 89 30 48 89 46 08 5d c3 48 81 fa
[  300.004008] RIP  [<ffffffff810f168e>] __internal_add_timer+0x2e/0xd0
[  300.004008]  RSP <ffff88003fd03e78>
[  300.004008] CR2: ffffffff811c59d3

Fixes: c62987bbd8 ("bridge: push bridge setting ageing_time down to switchdev")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:50:17 -07:00
Nikolay Aleksandrov
87aaf2caed switchdev: check if the vlan id is in the proper vlan range
VLANs 0 and 4095 are reserved and shouldn't be used, add checks to
switchdev similar to the bridge. Also make sure ids above 4095 cannot
be passed either.

Fixes: 47f8328bb1 ("switchdev: add new switchdev bridge setlink")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:43:24 -07:00
Nikolay Aleksandrov
cc02aa8e41 switchdev: enforce no pvid flag in vlan ranges
We shouldn't allow BRIDGE_VLAN_INFO_PVID flag in VLAN ranges.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Elad Raz <eladr@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:41:40 -07:00
David S. Miller
1f225031fe Merge branch 'be2net-fixes'
Sathya Perla says:

====================
be2net: patch set

Patch 1 fixes a FW image compatibility check in the driver that
prevents certain FW images from being flashed on BE3 (not BE3-R)
adapters.

Patch 2 fixes a spin_lock not being released in a failure case in
be_cmd_notify_wait().

Patch 3 includes a workaround to pad packets that are only 32b long or less
to be applicabe to BE3 too. This workaround was currently applied only to
Skyhawk and Lancer chips. Such packets are causing BE3's TX path to stall
on a SR-IOV config.

Patch 4 fixes the be_cmd_get_profile_config() routine to set the pf_num
field in the cmd request. The FW requires this field to be set for it to
return the specific function's descriptors. If not set, the FW returns
the descriptors of all the functions on the device. If the first descriptor
is not what is being queried for, the driver will read wrong data.
This patch fixes this issue by using the GET_CNTL_ATTRIB cmd to query the
real pci_func_num of a function and then uses it in the GET_PROFILE_CONFIG
cmd.

Patch 5 completes an earlier fix that removed the vlan promisc capability
for VFs. The earlier fix did not update the removal of this capability from
the profile descriptor of the VF. This causes the VF driver to request this
capability when it tries to create it's interface at probe time. This could
potentailly cause the VF probe to fail if the FW enforces strict checking of
the flags based on what was provisoned by the PF.  This strict checking is
not being done by FW currently but will be fixed in a future version. This
patch fixes this issue by updating the VF's profile descriptor so that they
match the interface capability flags provisioned by the PF.

Pls consider adding these patches to the net tree. Thanks!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:32:50 -07:00
Kalesh AP
196e3735fa be2net: remove vlan promisc capability from VF's profile descriptors
The commit 435452aa88 ("Prevent VFs from enabling VLAN promiscuous mode")
fixed the PF driver to not include the VLAN promisc capability while
provisioning the interface for a VF. But the fix did not remove this
capability from the profile descriptor of the VF. This causes the VF
driver to request this capability when it tries to create it's interface
at probe time.  This could potentailly cause the VF probe to fail if the
FW enforces strict checking of the flags based on what was provisoned
by the PF.  This strict checking is not being done by FW currently but
will be fixed in a future version. This patch fixes this issue by updating
the VF's profile descriptor so that they match the interface capability
flags provisioned by the PF.

Fixes: 435452aa88 ("Prevent VFs from enabling VLAN promiscuous mode")
Signed-off-by: Kalesh AP <kalesh.purayil@avagotech.com>
Signed-off-by: Sathya Perla <sathya.perla@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:32:45 -07:00
Somnath Kotur
72ef3a88fa be2net: set pci_func_num while issuing GET_PROFILE_CONFIG cmd
The FW requires the pf_num field in the cmd hdr to be set for it to return
the specific function's descriptors in the GET_PROFILE_CONFIG cmd. If not
set, the FW returns the descriptors of all the functions on the device.
If the first descriptor is not what is being queried for, the driver will
read wrong data. This patch fixes this issue by using the GET_CNTL_ATTRIB
cmd to query the real pci_func_num of a function and then uses it in the
GET_PROFILE_CONFIG cmd.

Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:32:44 -07:00
Suresh Reddy
8227e9901d be2net: pad skb to meet minimum TX pkt size in BE3
On BE3 chips in SRIOV configs, the TX path stalls when a packet less
than 32B is received from the host. A workaround to pad such packets
already exists for the Skyhawk and Lancer chips. Use the same workaround
for BE3 chips too.

Signed-off-by: Suresh Reddy <suresh.reddy@avagotech.com>
Signed-off-by: Sathya Perla <sathya.perla@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:32:43 -07:00
Suresh Reddy
0c8845679f be2net: release mcc-lock in a failure case in be_cmd_notify_wait()
The mcc/mbox lock is not being released when be_cmd_copy() returns
an error.

Signed-off-by: Suresh Reddy <suresh.reddy@avagotech.com>
Signed-off-by: Sathya Perla <sathya.perla@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:32:42 -07:00
Kalesh AP
ae4a9d6a63 be2net: fix BE3-R FW download compatibility check
In the BE3 FW image, unlike Skyhawk's, the "asic_type_rev" field doesn't
track the asic_rev of chip it is compatible with. When asic_type_rev
is 0 the image is compatible only with pre-BE3-R chips (asic_rev < 0x10).
Fix the current compatibility check to take care of this.
We hit this issue when we try to flash old BE3 images (used prior to the
release of BE3-R) on pre-BE3-R adapters.

Fixes: a6e6ff6eee ("be2net: simplify UFI compatibility checking")
Signed-off-by: Kalesh AP <kalesh.purayil@avagotech.com>
Signed-off-by: Sathya Perla <sathya.perla@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:32:41 -07:00
Gerlando Falauto
3bb35ac497 net/fsl_pq_mdio: fix computed address for the TBI register
commit afae5ad78b
  "net/fsl_pq_mdio: streamline probing of MDIO nodes"

added support for different types of MDIO devices:
1) Gianfar MDIO nodes that only map the MII registers
2) Gianfar MDIO nodes that map the full MDIO register set
3) eTSEC2 MDIO nodes (which map the full MDIO register set)
4) QE MDIO nodes (which map only the MII registers)

However, the implementation for types 1 and 4 would mistakenly assume
a mapping of the full MDIO register set, thereby computing the address
for the TBI register starting from the containing structure.
The TBI register would therefore be accessed at a wrong (much bigger)
address, not giving the expected result at all.
This patch restores the correct behavior we had prior to the above one.

The consequences of this bug are apparent when trying to access a PHY
with the same address as the value contained in the initial value of
the TBI register (normally 0); in that case you'll get answers from the
internal TBI device (even though MDIO/MDC pins are actually *also*
toggling on the physical bus!).
Beware that you also need to add a fake tbi node to your device tree
with an unused address.

Notice how this fix is related to commit
220669495b
  "powerpc: Add TBI PHY node to first MDIO bus"

which fixed the behavior in kernel 3.3, which was later broken by the
above commit on kernel 3.7.

Signed-off-by: Gerlando Falauto <gerlando.falauto@keymile.com>
Cc: Timur Tabi <timur@tabi.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:29:55 -07:00
Gerlando Falauto
3dd03e52a4 net/fsl_pq_mdio: check TBI address for consistency with mapped range
When configuring the MDIO subsystem it is also necessary to configure
the TBI register. Make sure the TBI is contained within the mapped
register range in order to:
a) make sure the address is computed correctly
b) make users aware that we're actually accessing that register

In case of error, print a message but continue anyway.

Signed-off-by: Gerlando Falauto <gerlando.falauto@keymile.com>
Cc: Timur Tabi <timur@tabi.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:29:54 -07:00
David S. Miller
f83665d0c4 Merge branch 'dsa-mv88e6xxx-fix-hardware-bridging'
Vivien Didelot says:

====================
net: dsa: mv88e6xxx: fix hardware bridging

DSA and its drivers currently hook the NETDEV_CHANGEUPPER net_device event in
order to configure the VLAN map of every port.

This VLAN map is a feature of these switch chips to hardcode and restrict which
output ports a given input port can egress frames to.

A Linux bridge is a simple untagged VLAN propagated by the bridge code itself.
With a proper 802.1Q support, a driver does not need this hook anymore, and
will simply program the related VLAN object.

This patchset improves the hardware bridging code in the mv88e6xxx driver with
a strict 802.1Q mode.

Ideally, the equivalent must be done for Broadcom Starfighter 2 and Rocker,
before completely getting rid of this hook.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:26:44 -07:00
Vivien Didelot
5fe7f68016 net: dsa: mv88e6xxx: fix hardware bridging
Playing with the VLAN map of every port to implement "hardware bridging"
in the 88E6352 driver was a hack until full 802.1Q was supported.

Indeed with 802.1Q port mode "Disabled" or "Fallback", this feature is
used to restrict which output ports an input port can egress frames to.

A Linux bridge is an untagged VLAN. With full 802.1Q support, we don't
need this hack anymore and can use the "Secure" strict 802.1Q port mode.

With this mode, the port-based VLAN map still needs to be configured,
but all the logic is VTU-centric. This means that the switch only cares
about rules described in its hardware VLAN table, which is exactly what
Linux bridge expects and what we want.

Note also that the hardware bridging was broken with the previous
flexible "Fallback" 802.1Q port mode. Here's an example:

Port0 and Port1 belong to the same bridge. If Port0 sends crafted tagged
frames with VID 200 to Port1, Port1 receives it. Even if Port1 is in
hardware VLAN 200, but not Port0, Port1 will still receive it, because
Fallback mode doesn't care about invalid VID or non-member source port.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-13 04:26:31 -07:00