Under certain circumstances, the device may not have enough resources to
enable all of the VFs that it advertises in config space. Although the
number of supported VFs is reported upon driver init, it is not obvious
when this is different from the number reported in config space. To
eliminate this confusion, add an error message explaining the problem.
Additionally, move the 'Allocating VFs' message down below the error
checks so as to prevent further confusion.
Change-ID: I45b7efca53a7aebf7777be33a8bc9d615ae48ea1
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The interrupt enable function can be inlined by moving it to the header
file, which decreases the function call overhead for a frequently called
function.
Change-ID: I3214cc99593725768642680e7b8ce7e9bba7e44d
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The driver was issuing a WARN_ON during ring size changes
because the code was cloning the rx_ring struct but
not zeroing out the pointers before allocating new memory.
Zero out the pointers in the cloned copy before allocating
new memory for them. In this case the code was correctly
avoiding memory leaks but still triggering the warning.
Change-ID: I186dd493948e9b7254ab0593d4aad8b68808918d
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The commit 55ce74d4bf (md/raid1: ensure
device failure recorded before write request returns) is causing crash in
the LVM2 testsuite test shell/lvchange-raid.sh. For me the crash is 100%
reproducible.
The reason for the crash is that the newly added code in raid1d moves the
list from conf->bio_end_io_list to tmp, then tests if tmp is non-empty and
then incorrectly pops the bio from conf->bio_end_io_list (which is empty
because the list was alrady moved).
Raid-10 has a similar bug.
Kernel Fault: Code=15 regs=000000006ccb8640 (Addr=0000000100000000)
CPU: 3 PID: 1930 Comm: mdX_raid1 Not tainted 4.2.0-rc5-bisect+ #35
task: 000000006cc1f258 ti: 000000006ccb8000 task.ti: 000000006ccb8000
YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00001000000001001111111000001111 Not tainted
r00-03 000000ff0804fe0f 000000001059d000 000000001059f818 000000007f16be38
r04-07 000000001059d000 000000007f16be08 0000000000200200 0000000000000001
r08-11 000000006ccb8260 000000007b7934d0 0000000000000001 0000000000000000
r12-15 000000004056f320 0000000000000000 0000000000013dd0 0000000000000000
r16-19 00000000f0d00ae0 0000000000000000 0000000000000000 0000000000000001
r20-23 000000000800000f 0000000042200390 0000000000000000 0000000000000000
r24-27 0000000000000001 000000000800000f 000000007f16be08 000000001059d000
r28-31 0000000100000000 000000006ccb8560 000000006ccb8640 0000000000000000
sr00-03 0000000000249800 0000000000000000 0000000000000000 0000000000249800
sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000
IASQ: 0000000000000000 0000000000000000 IAOQ: 000000001059f61c 000000001059f620
IIR: 0f8010c6 ISR: 0000000000000000 IOR: 0000000100000000
CPU: 3 CR30: 000000006ccb8000 CR31: 0000000000000000
ORIG_R28: 000000001059d000
IAOQ[0]: call_bio_endio+0x34/0x1a8 [raid1]
IAOQ[1]: call_bio_endio+0x38/0x1a8 [raid1]
RP(r2): raid_end_bio_io+0x88/0x168 [raid1]
Backtrace:
[<000000001059f818>] raid_end_bio_io+0x88/0x168 [raid1]
[<00000000105a4f64>] raid1d+0x144/0x1640 [raid1]
[<000000004017fd5c>] kthread+0x144/0x160
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Fixes: 55ce74d4bf ("md/raid1: ensure device failure recorded before write request returns.")
Fixes: 95af587e95 ("md/raid10: ensure device failure recorded before write request returns.")
Signed-off-by: NeilBrown <neilb@suse.com>
All unrecovered machine check errors on PowerNV should cause an
immediate panic. There are 2 reasons that this is the right policy:
it's not safe to continue, and we're already trying to reboot.
Firstly, if we go through the recovery process and do not successfully
recover, we can't be sure about the state of the machine, and it is
not safe to recover and proceed.
Linux knows about the following sources of Machine Check Errors:
- Uncorrectable Errors (UE)
- Effective - Real Address Translation (ERAT)
- Segment Lookaside Buffer (SLB)
- Translation Lookaside Buffer (TLB)
- Unknown/Unrecognised
In the SLB, TLB and ERAT cases, we can further categorise these as
parity errors, multihit errors or unknown/unrecognised.
We can handle SLB errors by flushing and reloading the SLB. We can
handle TLB and ERAT multihit errors by flushing the TLB. (It appears
we may not handle TLB and ERAT parity errors: I will investigate
further and send a followup patch if appropriate.)
This leaves us with uncorrectable errors. Uncorrectable errors are
usually the result of ECC memory detecting an error that it cannot
correct, but they also crop up in the context of PCI cards failing
during DMA writes, and during CAPI error events.
There are several types of UE, and there are 3 places a UE can occur:
Skiboot, the kernel, and userspace. For Skiboot errors, we have the
facility to make some recoverable. For userspace, we can simply kill
(SIGBUS) the affected process. We have no meaningful way to deal with
UEs in kernel space or in unrecoverable sections of Skiboot.
Currently, these unrecovered UEs fall through to
machine_check_expection() in traps.c, which calls die(), which OOPSes
and sends SIGBUS to the process. This sometimes allows us to stumble
onwards. For example we've seen UEs kill the kernel eehd and
khugepaged. However, the process killed could have held a lock, or it
could have been a more important process, etc: we can no longer make
any assertions about the state of the machine. Similarly if we see a
UE in skiboot (and again we've seen this happen), we're not in a
position where we can make any assertions about the state of the
machine.
Likewise, for unknown or unrecognised errors, we're not able to say
anything about the state of the machine.
Therefore, if we have an unrecovered MCE, the most appropriate thing
to do is to panic.
The second reason is that since e784b6499d ("powerpc/powernv: Invoke
opal_cec_reboot2() on unrecoverable machine check errors."), we
attempt a special OPAL reboot on an unhandled MCE. This is so the
hardware can record error data for later debugging.
The comments in that commit assert that we are heading down the panic
path anyway. At the moment this is not always true. With UEs in kernel
space, for instance, they are marked as recoverable by the hardware,
so if the attempt to reboot failed (e.g. old Skiboot), we wouldn't
panic() but would simply die() and OOPS. It doesn't make sense to be
staggering on if we've just tried to reboot: we should panic().
Explicitly panic() on unrecovered MCEs on PowerNV.
Update the comments appropriately.
This fixes some hangs following EEH events on cxlflash setups.
Signed-off-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
native_hpte_clear() is called in real mode from two places:
- Early in boot during htab initialisation if firmware assisted dump is
active.
- Late in the kexec path.
In both contexts there is no need to disable interrupts are they are
already disabled. Furthermore, locking around the tlbie() is only required
for pre POWER5 hardware.
On POWER5 or newer hardware concurrent tlbie()s work as expected and on pre
POWER5 hardware concurrent tlbie()s could result in deadlock. This code
would only be executed at crashdump time, during which all bets are off,
concurrent tlbie()s are unlikely and taking locks is unsafe therefore the
best course of action is to simply do nothing. Concurrent tlbie()s are not
possible in the first case as secondary CPUs have not come up yet.
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When scaling_available_frequencies is read on an offlined cpu, then
either lockup or junk values are displayed. This is caused by
freed freq_table, which policy is using.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The Atmel sdhci device needs the
SDHCI_QUIRK2_NEED_DELAY_AFTER_INT_CLK_RST quirk. Without it, the
internal clock could never stabilised when changing the sd clock
frequency.
Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The Atmel sdhci device needs a new quirk. sdhci_set_clock set the Clock
Control Register to 0 before computing the new value and writing it.
It disables the internal clock which causes a reset mecanism. If we
write the new value before this reset mecanism is done, it will prevent
the stabilisation of the internal clock, so a delay is needed. This
delay is about 2-3 cycles of the base clock. To be safe, a 1 ms delay is
used.
Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
In case of armada_38x_quirks error, all clocks should be cleaned-up, same
as after mv_conf_mbus_windows failure.
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Cc: <stable@vger.kernel.org> # v4.2
Reviewed-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
According to 'FE-2946959' erratum the clock inversion option is
needed to support slow frequencies when the card input hold time
requirement is high. This setting is not required for high speed
MMC and might cause timing violation.
Signed-off-by: Nadav Haklai <nadavh@marvell.com>
Cc: <stable@vger.kernel.org> # v4.2
Reviewed-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
shci-pxav3 driver is enabling by default the
SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN quirk. However this quirk is not
required for Armada 38x and leads to wrong clock setting in the divider.
Signed-off-by: Nadav Haklai <nadavh@marvell.com>
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Cc: <stable@vger.kernel.org> # v4.2
Reviewed-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
For some reason, only the little-endian flavor of
powerpc provided the zero_bytemask() implementation.
Reported-by: Michal Sojka <sojkam1@fel.cvut.cz>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
We need to explicitly check the AVX and AES CPU features, as we can't
infer them from the related XSAVE feature flags. For example, the
Core i3 2310M passes the XSAVE feature test but does not implement
AES-NI.
Reported-and-tested-by: Stéphane Glondu <glondu@debian.org>
References: https://bugs.debian.org/800934
Fixes: ce4f5f9b65 ("x86/fpu, crypto x86/camellia_aesni_avx: Simplify...")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: stable <stable@vger.kernel.org> # 4.2
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Some of the crypto algorithms write to the initialization vector,
but no space has been allocated for it. This clobbers adjacent memory.
Cc: stable@vger.kernel.org
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When the controller is unconfigured (for example it does not have a
valid Bluetooth address), then the basic debugfs entries for dut_mode
and vendor_diag are not creates. Ensure they are created in __hci_init
and also __hci_unconf_init functions. One of them is called during setup
stage of a new controller.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Daniel Borkmann says:
====================
BPF/random32 updates
BPF update to split the prandom state apart, and to move the
*once helpers to the core. For details, please see individual
patches. Given the changes and since it's in the tree for
quite some time, net-next is a better choice in our opinion.
v1 -> v2:
- Make DO_ONCE() type-safe, remove the kvec helper. Credits
go to Alexei Starovoitov for the __VA_ARGS__ hint, thanks!
- Add a comment to the DO_ONCE() helper as suggested by Alexei.
- Rework prandom_init_once() helper to the new API.
- Keep Alexei's Acked-by on the last patch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
While recently arguing on a seccomp discussion that raw prandom_u32()
access shouldn't be exposed to unpriviledged user space, I forgot the
fact that SKF_AD_RANDOM extension actually already does it for some time
in cBPF via commit 4cd3675ebf ("filter: added BPF random opcode").
Since prandom_u32() is being used in a lot of critical networking code,
lets be more conservative and split their states. Furthermore, consolidate
eBPF and cBPF prandom handlers to use the new internal PRNG. For eBPF,
bpf_get_prandom_u32() was only accessible for priviledged users, but
should that change one day, we also don't want to leak raw sequences
through things like eBPF maps.
One thought was also to have own per bpf_prog states, but due to ABI
reasons this is not easily possible, i.e. the program code currently
cannot access bpf_prog itself, and copying the rnd_state to/from the
stack scratch space whenever a program uses the prng seems not really
worth the trouble and seems too hacky. If needed, taus113 could in such
cases be implemented within eBPF using a map entry to keep the state
space, or get_random_bytes() could become a second helper in cases where
performance would not be critical.
Both sides can trigger a one-time late init via prandom_init_once() on
the shared state. Performance-wise, there should even be a tiny gain
as bpf_user_rnd_u32() saves one function call. The PRNG needs to live
inside the BPF core since kernels could have a NET-less config as well.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Chema Gonzalez <chema@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a prandom_init_once() facility that works on the rnd_state, so that
users that are keeping their own state independent from prandom_u32() can
initialize their taus113 per cpu states.
The motivation here is similar to net_get_random_once(): initialize the
state as late as possible in the hope that enough entropy has been
collected for the seeding. prandom_init_once() makes use of the recently
introduced prandom_seed_full_state() helper and is generic enough so that
it could also be used on fast-paths due to the DO_ONCE().
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Factor out the full reseed handling code that populates the state
through get_random_bytes() and runs prandom_warmup(). The resulting
prandom_seed_full_state() will be used later on in more than the
current __prandom_reseed() user. Fix also two minor whitespace
issues along the way.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the get_random_once() helper generic enough, so that functions
in general would only be called once, where one user of this is then
net_get_random_once().
The only implementation specific call is to get_random_bytes(), all
the rest of this *_once() facility would be duplicated among different
subsystems otherwise. The new DO_ONCE() helper will be used by prandom()
later on, but might also be useful for other scenarios/subsystems as
well where a one-time initialization in often-called, possibly fast
path code could occur.
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's no good reason why users outside of networking should not
be using this facility, f.e. for initializing their seeds.
Therefore, make it accessible from there as get_random_once().
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch moves values for all lowpan interface to the shared
implementation of 6lowpan. This patch also quietly fixes the forgotten
IFF_NO_QUEUE flag for the bluetooth 6LoWPAN interface. An identically
commit is 4afbc0d ("net: 6lowpan: convert to using IFF_NO_QUEUE") which
wasn't changed for bluetooth 6lowpan.
All 6lowpan interfaces should be virtual with IFF_NO_QUEUE, using EUI64
address length, the mtu size is 1280 (IPV6_MIN_MTU) and the netdev type
is ARPHRD_6LOWPAN.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
If we get MAX_MSIX interrupts would like to have each receive ring
with his own msix interrupt line. Do not need the shared_ports
variable at mlx4_enable_msix
Fixes: 9293267a3e ('net/mlx4_core: Capping number of requested MSIXs to MAX_MSIX')
Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com>
Acked-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit deaa0a6a93 ("net: Lookup actual route when oif is VRF device")
exposed a bug in __ip_route_output_key_hash for VRF devices: on FIB lookup
failure if the oif is specified the current logic drops to make_route on
the assumption that the route tables are wrong. For VRF/L3 master devices
this leads to wrong dst entries and route lookups. For example:
$ ip route ls table vrf-red
unreachable default
broadcast 10.2.1.0 dev eth1 proto kernel scope link src 10.2.1.2
10.2.1.0/24 dev eth1 proto kernel scope link src 10.2.1.2
local 10.2.1.2 dev eth1 proto kernel scope host src 10.2.1.2
broadcast 10.2.1.255 dev eth1 proto kernel scope link src 10.2.1.2
$ ip route get oif vrf-red 1.1.1.1
1.1.1.1 dev vrf-red src 10.0.0.2
cache
With this patch:
$ ip route get oif vrf-red 1.1.1.1
RTNETLINK answers: No route to host
which is the correct response based on the default route
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
"Y" was the right answer for MDIO_OCTEON when this option was only
available on CAVIUM_OCTEON_SOC. But now that the option is visible on
all (64-bit) systems, this piece of advice no longer makes sense. This
helper module is selected automatically by drivers which need it
anyway.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixes: a6d6786452 ("net: mdio-octeon: Modify driver to work on both ThunderX and Octeon")
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Sunil Goutham <sgoutham@cavium.com>
Cc: Radha Mohan Chintakuntla <rchintakuntla@cavium.com>
Cc: David Daney <david.daney@cavium.com>
Cc: David S. Miller <davem@davemloft.net>
Acked-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The recently added mlxsw driver produces warnings in ARM
allmodconfig:
drivers/net/ethernet/mellanox/mlxsw/pci.c: In function 'mlxsw_pci_cmd_exec':
drivers/net/ethernet/mellanox/mlxsw/pci.c:1585:59: warning: right shift count >= width of type [-Wshift-count-overflow]
linux/byteorder/big_endian.h:38:51: note: in definition of macro '__cpu_to_be32'
drivers/net/ethernet/mellanox/mlxsw/pci.c:76:2: note: in expansion of macro 'iowrite32be'
This uses upper_32_bits() to extract the bits while avoiding that warning.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Fixes: eda6500a98 "mlxsw: Add PCI bus implementation"
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to commit c29390c6df ("xps: must clear sender_cpu before
forwarding"), we also need to clear the skb->sender_cpu when moving
from RX to TX via skb_do_redirect() due to the shared location of
napi_id (used on RX) and sender_cpu (used on TX).
Fixes: 27b29f6305 ("bpf: add bpf_redirect() helper")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to commit c29390c6df ("xps: must clear sender_cpu before forwarding")
the skb->sender_cpu needs to be cleared before xmit.
Fixes: 3896d655f4 ("bpf: introduce bpf_clone_redirect() helper")
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to commit c29390c6df ("xps: must clear sender_cpu before forwarding")
the skb->sender_cpu needs to be cleared when moving from Rx
Tx, otherwise kernel could crash.
Fixes: 2bd82484bb ("xps: fix xps for stacked devices")
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The recently added hns driver causes a build warning in ARM
allmodconfig builds:
drivers/net/ethernet/hisilicon/hns/hnae.c: In function 'handles_show':
drivers/net/ethernet/hisilicon/hns/hnae.c:452:13: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
j, (u64)h->qs[i]->io_base);
^
This removes the pointless cast and prints the pointer address using
the "%p" format string in all three locations.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This ethernet driver supports the Micorchip enc424j600/626j600 Ethernet
controller over a SPI bus interface. This driver makes use of the regmap API to
optimize access to registers by caching registers where possible.
Datasheet:
http://ww1.microchip.com/downloads/en/DeviceDoc/39935b.pdf
Signed-off-by: Jon Ringle <jringle@gridpoint.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arun Parameswaran says:
====================
Add support for Broadcom's iProc MDIO and Cygnus Ethernet PHY
This patchset adds support for the iProc MDIO interface and the
Broadcom Cygnus SoC's internal Ethernet PHY.
The internal Ethernet PHY(s) in the Cygnus SoC's are accessed
via the MDIO interface found in most of the iProc based chips.
The patch also consolidates the common API's used by the
Broadcom phys to a common library. Existing Broadcom phy
drivers have been modified to use the common library API's.
This patch series is based on Linux v4.3-rc1 and is avaliable in:
https://github.com/Broadcom/cygnus-linux/tree/cygnus-net-phy-mdio-v3
The Ethernet driver for the iProc family will be submitted soon,
as will the device tree configurations for the different iProc
family SoCs.
Changes from v2:
- Modified drivers/net/phy/Kconfig to modify the BCM_CYGNUS_PHY
driver to 'depends on MDIO_BCM_IPROC' instead of 'select'.
- Added github branch to the cover letter
Changes from v1:
- Updated device tree documentation for the iProc MDIO driver
based on Florian's feedback.
- Moved the core register defines from the Cygnus PHY driver to
'include/linux/brcmphy.h' based on Florian's feedback.
- Created a new patch/commit to modify the bcm7xxx phy driver
to use the new core register defines.
- Modified the Kconfig entry for the Broadcom PHY library to
'tristate' instead of 'bool'
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Modified the bcm7xxx phy driver to remove local core register
defines and use the common ones from "include/linux/brcmphy.h"
Signed-off-by: Arun Parameswaran <arunp@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for the Broadcom Cygnus SoCs internal PHY's.
The PHYs are 1000M/100M/10M capable with support for 'EEE'
and 'APD' (Auto Power Down).
This driver supports the following Broadcom Cygnus SoCs:
- BCM583XX (BCM58300, BCM58302, BCM58303, BCM58305)
- BCM113XX (BCM11300, BCM11320, BCM11350, BCM11360)
The PHY's on these SoC's require some workarounds for
stable operation, both during configuration time and
during suspend/resume. This driver handles the
application of the workarounds.
Signed-off-by: Arun Parameswaran <arunp@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the Broadcom phy library to consolidate common
interfaces shared by Broadcom phy's.
Moved the common interfaces to the 'bcm-phy-lib.c' and updated
the Broadcom PHY drivers to use the new APIs.
Signed-off-by: Arun Parameswaran <arunp@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for the Broadcom iProc MDIO bus interface.
The MDIO interface can be found in the Broadcom iProc family Soc's.
The MDIO bus is accessed using a combination of command and data
registers. This MDIO driver provides access to the Etherent GPHY's
connected to the MDIO bus.
Signed-off-by: Arun Parameswaran <arunp@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add device tree binding documentation for the Broadcom iProc MDIO
bus driver.
Signed-off-by: Arun Parameswaran <arunp@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Santosh Shilimkar says:
====================
RDS: connection scalability and performance improvements
[v4]
Re-sending the same patches from v3 again since my repost of
patch 05/14 from v3 was whitespace damaged.
[v3]
Updated patch "[PATCH v2 05/14] RDS: defer the over_batch work to
send worker" as per David Miller's comment [4] to avoid the magic
value usage. Patch now makes use of already available but unused
send_batch_count module parameter. Rest of the patches are same as
earlier version v2 [3]
[v2]:
Dropped "[PATCH 05/15] RDS: increase size of hash-table to 8K" from
earlier version [1]. I plan to address the hash table scalability using
re-sizable hash tables as suggested by David Laight and David Miller [2]
This series addresses RDS connection bottlenecks on massive workloads and
improve the RDMA performance almost by 3X. RDS TCP also gets a small gain
of about 12%.
RDS is being used in massive systems with high scalability where several
hundred thousand end points and tens of thousands of local processes
are operating in tens of thousand sockets. Being RC(reliable connection),
socket bind and release happens very often and any inefficiencies in
bind hash look ups hurts the overall system performance. RDS bin hash-table
uses global spin-lock which is the biggest bottleneck. To make matter worst,
it uses rcu inside global lock for hash buckets.
This is being addressed by simply using per bucket rw lock which makes the
locking simple and very efficient. The hash table size is still an issue and
I plan to address it by using re-sizable hash tables as suggested on the list.
For RDS RDMA improvement, the completion handling is revamped so that we
can do batch completions. Both send and receive completion handlers are
split logically to achieve the same. RDS 8K messages being one of the
key usecase, mr pool is adapted to have the 8K mrs along with default 1M
mrs. And while doing this, few fixes and couple of bottlenecks seen with
rds_sendmsg() are addressed.
Series applies against 4.3-rc1 as well net-next. Its tested on Oracle
hardware with IB fabric for both bcopy as well as RDMA mode. RDS TCP is
tested with iXGB NIC. Like last time, iWARP transport is untested with
these changes. The patchset is also available at below git repo:
git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux.git net/rds/4.3-v3
As a side note, the IB HCA driver I used for testing misses at least 3
important patches in upstream to see the full blown IB performance and
am hoping to get that in mainline with help of them.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biederman says:
====================
net: Pass net through the output path v2
This is the next installment of my work to pass struct net through the
output path so the code does not need to guess how to figure out which
network namespace it is in, and ultimately routes can have output
devices in another network namespace.
The first patch in this series is a fix for a bug that came in when sk
was passed through the functions in the output path, and as such is
probably a candidate for net. At the same time my later patches depend
on it so sending the fix separately would be confusing.
The second patch in this series is another fix that for an issue that
came in when sk was passed through the output path. I don't think it
needs a backport as I don't think anyone uses the path where the code
was incorrect.
The rest of the patchset focuses on the path from xxx_local_out to
dst_output and in the end succeeds in passing sock_net(sk) from the
socket a packet locally originates on to the dst->output function.
Given the size reduction in the code I think this counts as a cleanup as
much as feature work.
There remain a number of helper functions (like ip option processing) to
take care of before the network stack can support destination devices in
other network namespaces but with this set of changes the backbone of
the work is done.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The network namespace is already passed into dst_output pass it into
dst->output lwt->output and friends.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Compute net once in ipvlan_process_v4_outbound and
ipvlan_process_v6_outbound and store it in a variable so that net does
not need to be recomputed next time it is used.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Compute net and store it in a variable in pptp_xmit, so that the value
can be reused the next time it is needed.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Compute net and store it in a variable in the functions
ip_build_and_send_pkt and ip_queue_xmit so that it does not need to be
recomputed next time it is needed.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Store net in a variable in ip_tunnel_xmit so it does not need
to be recomputed when it is used again.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stop hidding the sk parameter with an inline helper function and make
all of the callers pass it, so that it is clear what the function is
doing.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>