Signed-off-by: Phil Sutter <phil@nwl.cc>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the topology API is still in sufficient flux for changes to be
identified, disable the use of the userspace ABI by adding #error
statements to the code, ensuring that nobody relies on the headers as
currently defined. It is expected that this change will be reverted for
v4.3.
Signed-off-by: Mark Brown <broonie@kernel.org>
Alex Deucher, Mark Rustad and Alexander Holler reported a regression
with the latest v4.2-rc4 kernel, which breaks some SATA controllers.
With multi-MSI capable SATA controllers, only the first port works,
all other ports time out when executing SATA commands.
This happens because the first argument to assign_irq_vector_policy()
is always the base linux irq number of the multi MSI interrupt block.
All subsequent vector assignments therefore operate on the base linux
irq number, and all MSI irqs are handled as the first irq number. The
other MSI irqs of a device are thus never set up correctly and never
fire.
Add the loop iterator to the base irq number so all vectors are
assigned correctly.
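The effect of the missing loop iterator can be illustrated with a small model. This is only a sketch, not the kernel code: the function names and the `assign` callback are hypothetical stand-ins for the per-vector assignment described above.

```python
def assign_vectors_buggy(base_irq, nr_irqs, assign):
    """Model of the regression: every iteration passes the base irq."""
    for _ in range(nr_irqs):
        assign(base_irq)        # always the first irq of the block


def assign_vectors_fixed(base_irq, nr_irqs, assign):
    """Model of the fix: add the loop iterator to the base irq."""
    for i in range(nr_irqs):
        assign(base_irq + i)    # each irq of the block gets its own vector


def configured_irqs(assign_fn, base_irq, nr_irqs):
    """Return the distinct irq numbers that were actually set up."""
    seen = []
    assign_fn(base_irq, nr_irqs, seen.append)
    return sorted(set(seen))
```

With a block of four irqs starting at a made-up base of 40, the buggy variant only ever configures irq 40, while the fixed variant configures 40 through 43.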
Fixes: b5dc8e6c21 "x86/irq: Use hierarchical irqdomain to manage CPU interrupt vectors"
Reported-and-tested-by: Alex Deucher <alexdeucher@gmail.com>
Reported-and-tested-by: Mark Rustad <mrustad@gmail.com>
Reported-and-tested-by: Alexander Holler <holler@ahsoftware.de>
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1439911228-9880-1-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The routines in scsi_rpm.c assume that if a runtime-PM callback is
invoked for a SCSI device, it can only mean that the device's driver
has asked the block layer to handle the runtime power management (by
calling blk_pm_runtime_init(), which among other things sets q->dev).
However, this assumption turns out to be wrong for things like the ses
driver. Normally ses devices are not allowed to do runtime PM, but
userspace can override this setting. If this happens, the kernel gets
a NULL pointer dereference when blk_post_runtime_resume() tries to use
the uninitialized q->dev pointer.
This patch fixes the problem by calling the block layer's runtime-PM
routines only if the device's driver really does have a runtime-PM
callback routine. Since ses doesn't define any such callbacks, the
crash won't occur.
This fixes Bugzilla #101371.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Stanisław Pitucha <viraptor@gmail.com>
Reported-by: Ilan Cohen <ilanco@gmail.com>
Tested-by: Ilan Cohen <ilanco@gmail.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Odin.com>
fnic driver patch 1.6.0.16 changed fnic_queuecommand() to acquire
io_req_lock before issuing I/O so that I/O completion is serialized.
But when releasing the lock we check the I/O flag, which can be
modified if an I/O abort occurs before the I/O completes. In that case
we do not release the lock, which causes a deadlock in some scenarios.
Using a local variable to track the lock status resolves the problem.
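The pattern can be sketched in a simplified model. This is not the fnic code: IO_ISSUED, IoRequest, and the abort_hook callback are hypothetical names standing in for the shared I/O flag and the abort path that may modify it.

```python
import threading

IO_ISSUED = 0x1  # hypothetical flag bit, for illustration only


class IoRequest:
    def __init__(self):
        self.flags = 0


def queue_command_buggy(lock, req, abort_hook):
    lock.acquire()
    req.flags |= IO_ISSUED
    abort_hook(req)              # abort path may clear the shared flag
    if req.flags & IO_ISSUED:    # unlock decision depends on shared state
        lock.release()


def queue_command_fixed(lock, req, abort_hook):
    lock.acquire()
    lock_acquired = True         # local state, invisible to the abort path
    req.flags |= IO_ISSUED
    abort_hook(req)
    if lock_acquired:            # unlock decision depends on local state only
        lock.release()


def clear_flags(req):
    """Stand-in for an abort that resets the request flags."""
    req.flags = 0
```

If the abort path clears the flag, the buggy variant returns with the lock still held, so the next acquirer blocks forever. The fixed variant always releases the lock it took.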
Fixes: 41df7b02db
Signed-off-by: Hiral Shah <hishah@cisco.com>
Signed-off-by: Sesidhar Baddela <sebaddel@cisco.com>
Signed-off-by: Anil Chintalapati <achintal@cisco.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Pull drm fixes from Dave Airlie:
"These came in late last week, I wanted to look over the mst one before
forwarding, but it seems good.
Just three i915 and one MST fix"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/i915: Commit planes on each crtc separately.
drm/i915: calculate primary visibility changes instead of calling from set_config
drm/i915: Only dither on 6bpc panels
drm/dp/mst: Remove port after removing connector.
Merge tag 'iwlwifi-next-for-kalle-2015-08-18' of https://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next
* polish the Miracast operation
* fix a few power consumption issues
* scan cleanup
* fixes for D0i3 system state
* add paging for devices that support it
* add again the new RBD allocation model
* add more options to the firmware debug system
* add support for frag SKBs in Tx
lock_timer_base() cannot prevent the following:
CPU1 (in __mod_timer())
timer->flags |= TIMER_MIGRATING;
spin_unlock(&base->lock);
base = new_base;
spin_lock(&base->lock);
// The next line clears TIMER_MIGRATING
timer->flags &= ~TIMER_BASEMASK;
                                  CPU2 (in lock_timer_base())
                                  see timer base is cpu0 base
                                  spin_lock_irqsave(&base->lock, *flags);
                                  if (timer->flags == tf)
                                       return base; // oops, wrong base
timer->flags |= base->cpu // too late
We must write timer->flags in one go, otherwise we can fool other cpus.
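The difference between the two-step update and a single write can be modelled by listing every value of the flags word that another CPU could observe. This is a sketch only: the mask values are illustrative assumptions, and the generator functions are hypothetical helpers, not kernel code.

```python
# Illustrative bit layout (assumed for this sketch): low bits hold the CPU
# number, one bit above them marks a timer that is being migrated.
TIMER_CPUMASK = 0x0007FFFF
TIMER_MIGRATING = 0x00080000
TIMER_BASEMASK = TIMER_CPUMASK | TIMER_MIGRATING


def retarget_two_steps(flags, new_cpu):
    """Yield each value of flags visible to other CPUs (racy variant)."""
    flags &= ~TIMER_BASEMASK
    yield flags                  # window: not migrating, CPU field is 0
    flags |= new_cpu
    yield flags


def retarget_one_step(flags, new_cpu):
    """Yield each value of flags visible to other CPUs (single write)."""
    yield (flags & ~TIMER_BASEMASK) | new_cpu
```

Moving a migrating timer from CPU 3 to CPU 5, the two-step variant exposes an intermediate value of 0, which a reader interprets as "on CPU 0 and not migrating". The single write exposes only the final value.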
Fixes: bc7a34b8b9 ("timer: Reduce timer migration overhead if disabled")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jon Christopherson <jon@jons.org>
Cc: David Miller <davem@davemloft.net>
Cc: xen-devel@lists.xen.org
Cc: david.vrabel@citrix.com
Cc: Sander Eikelenboom <linux@eikelenboom.it>
Link: http://lkml.kernel.org/r/1439831928.32680.11.camel@edumazet-glaptop2.roam.corp.google.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
U-Boot is often used to boot the kernel on ARM boards, but uImage
is not built by "make all", so we are often inclined to do
"make all uImage" to generate DTBs, modules and uImage in a single
command, but we should notice a pitfall behind it. In fact,
"make all uImage" could generate an invalid uImage if it is run with
the parallel option (-j).
You can reproduce this problem with the following procedure:
[1] First, build "all" and "uImage" separately.
You will get a valid uImage
$ git clean -f -x -d
$ export CROSS_COMPILE=<your-tools-prefix>
$ make -s -j8 ARCH=arm multi_v7_defconfig
$ make -s -j8 ARCH=arm all
$ make -j8 ARCH=arm UIMAGE_LOADADDR=0x80208000 uImage
CHK include/config/kernel.release
CHK include/generated/uapi/linux/version.h
CHK include/generated/utsrelease.h
make[1]: `include/generated/mach-types.h' is up to date.
CHK include/generated/timeconst.h
CHK include/generated/bounds.h
CHK include/generated/asm-offsets.h
CALL scripts/checksyscalls.sh
CHK include/generated/compile.h
Kernel: arch/arm/boot/Image is ready
Kernel: arch/arm/boot/zImage is ready
UIMAGE arch/arm/boot/uImage
Image Name: Linux-4.2.0-rc5-00156-gdd2384a-d
Created: Sat Aug 8 23:21:35 2015
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 6138648 Bytes = 5994.77 kB = 5.85 MB
Load Address: 80208000
Entry Point: 80208000
Image arch/arm/boot/uImage is ready
$ ls -l arch/arm/boot/*Image
-rwxrwxr-x 1 masahiro masahiro 13766656 Aug 8 23:20 arch/arm/boot/Image
-rw-rw-r-- 1 masahiro masahiro 6138712 Aug 8 23:21 arch/arm/boot/uImage
-rwxrwxr-x 1 masahiro masahiro 6138648 Aug 8 23:20 arch/arm/boot/zImage
[2] Update some source file(s)
$ touch init/main.c
[3] Then, re-build "all" and "uImage" simultaneously.
You will get an invalid uImage at random.
$ make -j8 ARCH=arm UIMAGE_LOADADDR=0x80208000 all uImage
CHK include/config/kernel.release
CHK include/generated/uapi/linux/version.h
CHK include/generated/utsrelease.h
make[1]: `include/generated/mach-types.h' is up to date.
CHK include/generated/timeconst.h
CHK include/generated/bounds.h
CHK include/generated/asm-offsets.h
CALL scripts/checksyscalls.sh
CC init/main.o
CHK include/generated/compile.h
LD init/built-in.o
LINK vmlinux
LD vmlinux.o
MODPOST vmlinux.o
GEN .version
CHK include/generated/compile.h
UPD include/generated/compile.h
CC init/version.o
LD init/built-in.o
KSYM .tmp_kallsyms1.o
KSYM .tmp_kallsyms2.o
LD vmlinux
SORTEX vmlinux
SYSMAP System.map
OBJCOPY arch/arm/boot/Image
Building modules, stage 2.
Kernel: arch/arm/boot/Image is ready
GZIP arch/arm/boot/compressed/piggy.gzip
AS arch/arm/boot/compressed/piggy.gzip.o
Kernel: arch/arm/boot/Image is ready
LD arch/arm/boot/compressed/vmlinux
GZIP arch/arm/boot/compressed/piggy.gzip
OBJCOPY arch/arm/boot/zImage
Kernel: arch/arm/boot/zImage is ready
UIMAGE arch/arm/boot/uImage
Image Name: Linux-4.2.0-rc5-00156-gdd2384a-d
Created: Sat Aug 8 23:23:14 2015
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 26472 Bytes = 25.85 kB = 0.03 MB
Load Address: 80208000
Entry Point: 80208000
Image arch/arm/boot/uImage is ready
MODPOST 192 modules
AS arch/arm/boot/compressed/piggy.gzip.o
LD arch/arm/boot/compressed/vmlinux
OBJCOPY arch/arm/boot/zImage
Kernel: arch/arm/boot/zImage is ready
$ ls -l arch/arm/boot/*Image
-rwxrwxr-x 1 masahiro masahiro 13766656 Aug 8 23:23 arch/arm/boot/Image
-rw-rw-r-- 1 masahiro masahiro 26536 Aug 8 23:23 arch/arm/boot/uImage
-rwxrwxr-x 1 masahiro masahiro 6138648 Aug 8 23:23 arch/arm/boot/zImage
Please notice the uImage is extremely small when this issue is
encountered. Besides, "Kernel: arch/arm/boot/zImage is ready" is
displayed twice, before and after the uImage log.
The root cause of this is the race condition between zImage and
uImage. Actually, uImage depends on zImage, but the dependency
between the two is only described in arch/arm/boot/Makefile.
Because arch/arm/boot/Makefile is not included from the top-level
Makefile, it cannot know the dependency between zImage and uImage.
Consequently, when we run make with the parallel option, Kbuild
updates vmlinux first, and then two different threads descend into
the arch/arm/boot/Makefile almost at the same time, one for updating
zImage and the other for uImage. While one thread is re-generating
zImage, the other also tries to update zImage before creating uImage
on top of that. zImage is overwritten by the slower thread and then
uImage is created based on the half-written zImage.
This is the reason why "Kernel: arch/arm/boot/zImage is ready" is
displayed twice, and a broken uImage is created.
The same problem could happen on bootpImage.
This commit adds dependencies among Image, zImage, uImage, and
bootpImage to arch/arm/Makefile, which is included from the
top-level Makefile.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The mmap semaphore should not be taken when page faults are disabled.
Since pagefault_disable() no longer disables preemption, we now need
to use faulthandler_disabled() in place of in_atomic().
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Matthew Fortune <Matthew.Fortune@imgtec.com> reports:
The genex.S file appears to mix the case of a macro between its definition and
use. A cut down example of this is below. The macro __build_clear_none has
lower case 'build' but ends up being instantiated with upper case BUILD. Can
this be fixed on master. It has been picked up by the LLVM integrated assembler
which is currently case sensitive. We are likely to fix the assembler as well
but the code is currently inconsistent in the kernel.
.macro __build_clear_none
.endm
.macro __BUILD_HANDLER exception handler clear verbose ext
.align 5
.globl handle_\exception; .align 2; .type handle_\exception, @function; .ent
handle_\exception, 0; handle_\exception: .frame $29, 184, $29
.set noat
.globl handle_\exception\ext; .type handle_\exception\ext, @function;
handle_\exception\ext:
__BUILD_clear_\clear
.endm
.macro BUILD_HANDLER exception handler clear verbose
__BUILD_HANDLER \exception \handler \clear \verbose _int
.endm
BUILD_HANDLER ftlb ftlb none silent
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reported-by: Matthew Fortune <Matthew.Fortune@imgtec.com>
If PM is enabled but PM_SLEEP is disabled, the suspend/resume functions
are still unused and produce a compiler warning.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Cc: <stable@vger.kernel.org> # 4.1+
This reverts commit:
2c7577a758 ("sched/x86_64: Don't save flags on context switch")
It was a nice speedup. It's also not quite correct: SYSENTER
enables interrupts too early.
We can re-add this optimization once the SYSENTER code is beaten
into shape, which should happen in 4.3 or 4.4.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org # v3.19
Link: http://lkml.kernel.org/r/85f56651f59f76624e80785a8fd3bdfdd089a818.1439838962.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When we enter D0i3, we must stop TXing otherwise the
sequence number we use might conflict with the firmware's
internal TX. In order to do so, we have
IWL_MVM_STATUS_IN_D0I3 which should prevent any Tx while we
enter D0i3. There is a bug in this code since we may Tx even
if IWL_MVM_STATUS_IN_D0I3 is set. This can happen as long as
mvm->d0i3_ap_sta_id is not set.
To make sure that we don't have any packet in the Tx path
while we set mvm->d0i3_ap_sta_id, call synchronize_net only
after we already set mvm->d0i3_ap_sta_id.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Currently if we wake up during D0I3 due to beacon loss we disconnect
immediately. This behaviour causes redundant disconnection, which could
be prevented by polling as it is usually done in mac80211.
Instead, we prefer reporting beacon loss and letting mac80211 try polling
before disconnection.
Signed-off-by: David Spinadel <david.spinadel@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
KASan error report:
==================================================================
BUG: KASan: out of bounds access in iwl_init_sband_channels+0x207/0x260 [iwlwifi] at addr ffff8800c2d0aac8
Read of size 4 by task modprobe/329
==================================================================
Both loops of this function compare data from the 'chan' array and then
check if the index is valid.
The 2 conditions should be inverted to avoid an out-of-bounds access.
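The ordering problem can be shown with a short model. It is a sketch under assumptions: the function names and the list of band labels are hypothetical, and Python's IndexError stands in for the out-of-bounds read that KASan reports.

```python
def first_index_of_band_buggy(chan, band):
    idx = 0
    # Bug: chan[idx] is read before idx is checked against len(chan).
    while chan[idx] != band and idx < len(chan):
        idx += 1
    return idx


def first_index_of_band_fixed(chan, band):
    idx = 0
    # Fix: validate the index first; short-circuit evaluation then
    # guarantees chan[idx] is only read when idx is in range.
    while idx < len(chan) and chan[idx] != band:
        idx += 1
    return idx
```

When the band is not present, the buggy loop reads one element past the end of the array, while the fixed loop stops at len(chan).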
Signed-off-by: Adrien Schildknecht <adrien+dev@schischi.me>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Fix bug where MIMO is disabled for low latency TX on P2P VIF
regardless of configuration. Make it dependent on
IWL_MVM_RS_DISABLE_P2P_MIMO compilation option. Change configuration
so that MIMO will be disabled only in SDIO platforms.
Signed-off-by: Alexander Bondar <alexander.bondar@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
This switches the BCMA GPIO driver to use GPIOLIB_IRQCHIP to
handle its interrupts instead of rolling its own copy of the
irqdomain handling etc.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
MAC/BB name is "????" if the MAC/BB is unknown.
Signed-off-by: Miaoqing Pan <miaoqing@qca.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Add diversity statistics and sync the driver
statistics acx and debugfs representation
with the current fw api.
Signed-off-by: Guy Mishol <guym@ti.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Sync the driver statistics acx and debugfs representation
with the current fw api.
Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
rt2500usb_validate_eeprom() reads data up to 0x6e (EEPROM_CALIBRATE_OFFSET)
but only 0x6a bytes have been allocated and read from the eeprom.
This leads to out-of-bounds accesses and invalid values for
EEPROM_BBPTUNE_R17 and EEPROM_CALIBRATE_OFFSET.
Change the EEPROM_SIZE to 0x6e in order to retrieve all the fields.
Tested with a rt2570 device.
Signed-off-by: Adrien Schildknecht <adrien+dev@schischi.me>
Acked-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
CC [M] drivers/net/wireless/mwl8k.o
drivers/net/wireless/mwl8k.c: In function ‘mwl8k_bss_info_changed’:
drivers/net/wireless/mwl8k.c:3290:2: warning: ‘ap_mcs_rates’ may be used uninitialized in this function [-Wmaybe-uninitialized]
memcpy(cmd->mcs_set, mcs_rates, 16);
^
drivers/net/wireless/mwl8k.c:4987:5: note: ‘ap_mcs_rates’ was declared here
u8 ap_mcs_rates[16];
^
The warning was bogus. But the conditionals were rather complicated,
with multiple redundant checks. This consolidates the checking and
makes it more readable IMHO.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Remove duplicated routines related to rtl92cu_set_hw_reg().
1. rtl92c_set_qos() and the HW_VAR_AC_PARAM routine contain similar code,
so replace the code with a call to rtlpriv->cfg->ops->set_hw_reg().
2. rtl92c_set_mac_addr() and the 'HW_VAR_ETHER_ADDR' case in the
rtl92cu_set_hw_reg() routine contain similar code, so remove the
rtl92c_set_mac_addr() function. It was also not used anywhere.
3. Remove the HW_VAR_ACM_CTRL routine in rtl92cu_set_hw_reg().
If rtl_usb->acm_method is not EACMWAY2_SW, HW_VAR_ACM_CTRL is called
from HW_VAR_AC_PARAM. But it is never called, because acm_method is
always EACMWAY2_SW. So remove the acm_method check routine
and the HW_VAR_ACM_CTRL routine.
Neither the usb nor the pci interface uses HW_VAR_ACM_CTRL,
but I can't test the pci interface module, so I didn't modify the pci code.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
The rtl92c_set_xxx_filter functions duplicate the routine in
rtl92cu_set_hw_reg, so remove them. (The rtl92c_get_xxx_filter
functions are removed for the same reason.)
Also add code to rtl92cu_set_hw_reg that updates the struct rtl_mac
member variables.
After that, _update_mac_setting is no longer useful, so remove it as well.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Allow the topology code to be compiled out so that users who don't need
topology don't need to have the code compiled in, saving them some
memory.
Some more configuration could be added to remove some of the hooks into
the core data structures but that is probably best done with some
refactoring to use functions to do the updates of the data structures
rather than ifdefing in the code as we'd need to do at the minute.
Suggested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
a bit of content:
* mesh fixes/improvements from Alexis, Bob, Chun-Yeow and Jesse
* TDLS higher bandwidth support (Arik)
* OCB fixes from Bertold Van den Bergh
* suspend/resume fixes from Eliad
* dynamic SMPS support for minstrel-HT (Krishna Chaitanya)
* VHT bitrate mask support (Lorenzo Bianconi)
* better regulatory support for 5/10 MHz channels (Matthias May)
* basic support for MU-MIMO to avoid the multi-vif issue (Sara Sharon)
along with a number of other cleanups.
Merge tag 'mac80211-next-for-davem-2015-08-14' mac80211-next.git
iwlwifi needs new mac80211 patches so merge mac80211-next.git to
wireless-drivers-next.git.
Tom Herbert says:
====================
net: Identifier Locator Addressing - Part I
This patch set provides rudimentary support for Identifier Locator
Addressing or ILA. The basic concept of ILA is that we split an IPv6
address into a 64 bit locator and 64 bit identifier. The identifier is
the identity of an entity in communication ("who"), and the locator
expresses the location of the entity ("where"). Applications
use externally visible address that contains the identifier.
When a packet is actually sent, a translation is done that
overwrites the first 64 bits of the address with a locator.
The packet can then be forwarded over the network to the host where
the addressed entity is located. At the receiver, the reverse
translation is done so that the application sees the original,
untranslated address. Presumably an external control plane will
provide identifier->locator mappings.
v2:
- Fix compilation errors when LWT not configured
- Consolidate ILA into a single ila.c
v3:
- Change pseudohdr argument of inet_proto_csum_replace functions to
be a bool
v4:
- In ila_build_state check locator being in netlink params before
allocating tunnel state
The data path for ILA is a simple NAT translation that only operates
on the upper 64 bits of a destination address in IPv6 packets. The
basic process is:
1) Lookup 64 bit identifier (lower 64 bits of destination)
2) If a match is found
a) Overwrite locator (upper 64 bits of destination) with
the new locator
b) Adjust any checksum that has destination address included in
pseudo header
3) Send or receive packet
ILA is a means to implement tunnels or network virtualization without
encapsulation. Since there is no encapsulation involved, we assume that
stateless support in the network for IPv6 (e.g. RSS, ECMP, TSO, etc.)
just works. Also, since we're minimally changing the packet many of
the worries about encapsulation (MTU, checksum, fragmentation) are
not relevant. The downside is that ILA is not extensible like other
encapsulations (GUE for instance) so it might not be appropriate for
all use cases. Also, this only makes sense to do in IPv6!
A key aspect of ILA is performance. The intent is that ILA would be
used in data centers in virtualizing tasks or jobs. In the fullest
incarnation all intra data center communications might be targeted to
virtual ILA addresses. This is basically adding a new virtualization
capability to the existing services in a datacenter, so there is a
strong expectation that this does not degrade performance for
existing applications.
Performance seems to be dependent on how ILA is hooked into the kernel.
ILA can be implemented under some different models:
- Mechanically it is a form of stateless DNAT
- It can be thought of as a type of (source) routing
- As a functional replacement of encapsulation
In this patch set we hook into the data path using Light Weight
Tunnels (LWT) infrastructure. As part of that, we add support in LWT
to redirect dst input. iproute will be modified to take a new ila encap
type. ILA can be configured like:
ip route add 3333:0:0:1:5555:0:2:0/128 \
encap ila 2001:0:0:2 via 2401:db00:20:911a:face:0:27:0
ip -6 addr add 3333:0:0:1:5555:0:1:0/128 dev eth0
ip route add table local local 2001:0:0:1:5555:0:1:0/128 \
encap ila 3333:0:0:1 dev lo
So sending to destination 3333:0:0:1:5555:0:2:0 will have destination
of 2001:0:0:2:5555:0:2:0 on the wire.
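The address rewrite in this example can be reproduced with a few lines of arithmetic on the 128-bit address. This is a minimal sketch of the translation step only, assuming the locator is given as the first four 16-bit groups; ila_translate is a hypothetical helper, not part of the patch set, and checksum adjustment is not modelled.

```python
import ipaddress

LOW64 = (1 << 64) - 1  # mask selecting the identifier (lower 64 bits)


def ila_translate(dst, locator):
    """Overwrite the upper 64 bits of dst with the locator.

    dst is a full IPv6 address string; locator is the upper four
    16-bit groups, e.g. "2001:0:0:2".
    """
    addr = int(ipaddress.IPv6Address(dst))
    loc = int(ipaddress.IPv6Address(locator + "::")) >> 64
    return ipaddress.IPv6Address((loc << 64) | (addr & LOW64))
```

Calling ila_translate("3333:0:0:1:5555:0:2:0", "2001:0:0:2") yields the address 2001:0:0:2:5555:0:2:0; the identifier 5555:0:2:0 is left untouched.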
Performance results are below. With ILA we see about a 10% drop in
pps compared to non-ILA. Much of this drop can be attributed to the
loss of early demux on input (translation occurs after it is attempted).
We will address this in the next patch set. Also, IPvlan input path
does not work with ILA since the routing is bypassed-- this will
be addressed in a future patch.
Performance testing:
Performing netperf TCP_RR with 200 clients:
Non-ILA baseline
84.92% CPU utilization
1861922.9 tps
93/163/330 50/90/99% latencies
ILA single destination
83.16% CPU utilization
1679683.4 tps
105/180/332 50/90/99% latencies
References:
Slides from netconf:
http://vger.kernel.org/netconf2015Herbert-ILA.pdf
Slides from presentation at IETF:
https://www.ietf.org/proceedings/92/slides/slides-92-nvo3-1.pdf
I-D:
https://tools.ietf.org/html/draft-herbert-nvo3-ila-00
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new module named ila. This implements ILA translation. Light
weight tunnel redirection is used to perform the translation in
the data path. This is configured by the "ip -6 route" command
using the "encap ila <locator>" option, where <locator> is the
value to set in destination locator of the packet. e.g.
ip -6 route add 3333:0:0:1:5555:0:1:0/128 \
encap ila 2001:0:0:1 via 2401:db00:20:911a:face:0:25:0
Sets a route where 3333:0:0:1 will be overwritten by
2001:0:0:1 on output.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This function updates a checksum field value and skb->csum based on
a value which is the difference between the old and new checksum.
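The arithmetic behind updating a checksum from a difference can be sketched with 16-bit ones' complement sums, following the well-known incremental update HC' = ~(~HC + ~m + m'). The helper names below are hypothetical, and the sketch covers only the checksum field; the skb->csum update is not modelled.

```python
def csum_fold(x):
    """Fold carries back in until the sum fits in 16 bits."""
    while x >> 16:
        x = (x & 0xFFFF) + (x >> 16)
    return x


def checksum(words):
    """Ones' complement checksum over a list of 16-bit words."""
    return ~csum_fold(sum(words)) & 0xFFFF


def csum_diff(old_words, new_words):
    """Difference value: sum of the new words plus the complemented old words."""
    return csum_fold(sum(new_words) + sum(~w & 0xFFFF for w in old_words))


def csum_replace_by_diff(check, diff):
    """Update a checksum field using only the difference value."""
    return ~csum_fold((~check & 0xFFFF) + diff) & 0xFFFF
```

For example, replacing the locator words 3333:0:0:1 with 2001:0:0:2 in a checksummed block gives the same result whether the checksum is recomputed from scratch or updated from the difference alone.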
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
inet_proto_csum_replace4,2,16 take a pseudohdr argument which indicates
the checksum field carries a pseudo header. This argument should be a
boolean instead of an int.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the capability to redirect dst input in the same way
that dst output is redirected by LWT.
Also, save the original dst.input and dst.output when setting up
lwtunnel redirection. These can be called by the client as a
pass-through.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
>> drivers/net/ethernet/cisco/enic/vnic_dev.c:1095:13: sparse: incorrect type in assignment (different address spaces)
drivers/net/ethernet/cisco/enic/vnic_dev.c:1095:13: expected void *res
drivers/net/ethernet/cisco/enic/vnic_dev.c:1095:13: got void [noderef] <asn:2>*
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
>> drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:173:44: sparse: incorrect type in argument 1 (different base types)
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:173:44: expected restricted __sum16 [usertype] n
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:173:44: got restricted __be16 [usertype] check_sum
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This work adds the possibility of deriving the zone id from the skb->mark
field in a scalable manner. This allows for having only a single template
serving hundreds/thousands of different zones, for example, instead of the
need to have one match for each zone as an extra CT jump target.
Note that we'd need to have this information attached to the template as at
the time when we're trying to look up a possible ct object, we already need
to know zone information for a possible match when going into
__nf_conntrack_find_get(). This work provides a minimal implementation for
a possible mapping.
In order to not add/expose an extra ct->status bit, the zone structure has
been extended to carry a flag for deriving the mark.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>