This if statement was accidentally dropped in (aaa795a netfilter:
nat: propagate errors from xfrm_me_harder()) so now it returns
unconditionally.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
struct ip_vs_sync_mesg and ip_vs_sync_mesg_v0 are both sent across the wire
and used internally to store IPVS synchronisation messages.
Up until now the scheme used has been to convert the size field
to network byte order before sending a message on the wire and
convert it to host byte order when sending a message.
This patch changes that scheme to always treat the field
as being network byte order. This seems appropriate as
the structure is sent across the wire. And by consistently
treating the field has network byte order it is now possible
to take advantage of sparse to flag any future miss-use.
Acked-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
The sctp_events[] come from sch->type in set_sctp_state(). They are
between 0-255 so that means we need 256 elements in the array.
I believe that because of how the code is aligned there is normally a
hole after sctp_events[] so this patch doesn't actually change anything.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
There are two motivations for this:
1. It improves readability to my eyes
2. Using nested min() calls results in a shadowed _min1 variable,
which is a bit untidy. Sparse complained about this.
I have also replaced (size_t)64 with a variable of type size_t and value 64.
This also improves readability to my eyes.
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Some service fields are in network order:
- netmask: used once in network order and also as prefix len for IPv6
- port
Other parameters are in host order:
- struct ip_vs_flags: flags and mask moved between user and kernel only
- sync state: moved between user and kernel only
- syncid: sent over network as single octet
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
kbuild test robot reports for sparse warnings in
commits c2a4ffb70e ("ipvs: convert lblc scheduler to rcu")
and c5549571f9 ("ipvs: convert lblcr scheduler to rcu").
Fix it by removing extra __rcu annotation.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
- RCU annotations for ip_vs_info_seq_start and _stop
- __percpu for cpustats
- properly dereference svc->pe in ip_vs_genl_fill_service
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
kbuild test robot reports for sparse warnings
in commit 088339a57d ("ipvs: convert connection locking"):
net/netfilter/ipvs/ip_vs_conn.c:962:13: warning: context imbalance
in 'ip_vs_conn_array' - wrong count at exit
include/linux/rcupdate.h:326:30: warning: context imbalance in
'ip_vs_conn_seq_next' - unexpected unlock
include/linux/rcupdate.h:326:30: warning: context imbalance in
'ip_vs_conn_seq_stop' - unexpected unlock
Fix it by running ip_vs_conn_array under RCU lock
to avoid conditional locking and by adding proper RCU
annotations.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Use rcu_dereference_protected to resolve
sparse warning, found by kbuild test robot:
net/netfilter/ipvs/ip_vs_ctl.c:1464:35: warning: dereference of
noderef expression
Problem from commit 026ace060d
("ipvs: optimize dst usage for real server")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
struct sctp_packet is currently embedded into sctp_transport or
sits on the stack as 'singleton' in sctp_outq_flush(). Therefore,
its member 'malloced' is always 0, thus a kfree() is never called.
Because of that, we can just remove this code.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow L2 redirection when VXLAN L3 switching is enabled
This patch restricts L3 switching to destination MAC addresses that are
marked as routers in order to allow virtual IP appliances that do L2
redirection to function with VXLAN L3 switching enabled.
We use L3 switching on VXLAN networks to avoid extra hops when the nominal
router for cross-subnet traffic for a VM is remote and the ultimate
destination may be local, or closer to the local node. Currently, the
destination IP address takes precedence over the MAC address in all cases.
Some network appliances receive packets for a virtualized IP address and
redirect by changing the destination MAC address (only) to be the final
destination for packet processing. VXLAN tunnel endpoints with L3 switching
enabled may then overwrite this destination MAC address based on the packet IP
address, resulting in potential loops and, at least, breaking L2 redirections
that travel through tunnel endpoints.
This patch limits L3 switching to the intended case where the original
destination MAC address is a next-hop router and relies on the destination
MAC address for all other cases, thus allowing L2 redirection and L3 switching
to coexist peacefully.
Signed-Off-By: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The return value from list_netdevice() is not used and no need, so remove it.
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dl_next member in struct request_sock doesn't need to be first.
We expect to insert a "struct common_sock" or a subset of it,
so this claim had to be verified.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
qeth_hdr_chk_and_bounce() can possibly shift the skb->data
pointer. However, the existing code didn't update the hdr pointer,
which should point to skb->data, accordingly.
Symptoms of this issue are sporadic recoveries.
Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
remove unused variable
Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
remove cast for kzalloc return value.
Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some frontend drivers are sending packets > 64 KiB in length. This length
overflows the length field in the first slot making the following slots have
an invalid length.
Turn this error back into a non-fatal error by dropping the packet. To avoid
having the following slots having fatal errors, consume all slots in the
packet.
This does not reopen the security hole in XSA-39 as if the packet as an
invalid number of slots it will still hit fatal error case.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch tries to coalesce tx requests when constructing grant copy
structures. It enables netback to deal with situation when frontend's
MAX_SKB_FRAGS is larger than backend's MAX_SKB_FRAGS.
With the help of coalescing, this patch tries to address two regressions
avoid reopening the security hole in XSA-39.
Regression 1. The reduction of the number of supported ring entries (slots)
per packet (from 18 to 17). This regression has been around for some time but
remains unnoticed until XSA-39 security fix. This is fixed by coalescing
slots.
Regression 2. The XSA-39 security fix turning "too many frags" errors from
just dropping the packet to a fatal error and disabling the VIF. This is fixed
by coalescing slots (handling 18 slots when backend's MAX_SKB_FRAGS is 17)
which rules out false positive (using 18 slots is legit) and dropping packets
using 19 to `max_skb_slots` slots.
To avoid reopening security hole in XSA-39, frontend sending packet using more
than max_skb_slots is considered malicious.
The behavior of netback for packet is thus:
1-18 slots: valid
19-max_skb_slots slots: drop and respond with an error
max_skb_slots+ slots: fatal error
max_skb_slots is configurable by admin, default value is 20.
Also change variable name from "frags" to "slots" in netbk_count_requests.
Please note that RX path still has dependency on MAX_SKB_FRAGS. This will be
fixed with separate patch.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The maximum packet including header that can be handled by netfront / netback
wire format is 65535. Reduce gso_max_size accordingly.
Drop skb and print warning when skb->len > 65535. This can 1) save the effort
to send malformed packet to netback, 2) help spotting misconfiguration of
netfront in the future.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Also fix a typo in comment.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch sets the coherent DMA mask to 64-bit after the be2net driver has
been acknowledged that the system is 64-bit DMA capable. The coherent DMA
mask is examined by the Intel IOMMU driver to determine whether to allow
pass through context mapping for all devices. With this patch, the be2net
driver combined with be2net compatible hardware provides comparable
performance to the case where vt-d is disabled. The main use-case for this
change is to decrease the time necessary to copy virtual machine memory
during KVM live migration instantiations.
This patch was tested on a system that enables the IOMMU in non-coherent
mode. Two DMA remapper issues were encountered in the previous version and
both patches have been committed.
commit ea2447f700
commit 2e12bc29fc
Signed-off-by: Craig Hada <craig.hada@hp.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use GET_PROFILE_CONFIG_V1 cmd for BE3-R, to query the maximum number of
TX rings available per function. On SH-R the same is queried via the
GET_FUNCTION_CONFIG cmd.
Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Avoid flashing BE3 UFI on BE3-R chip by verifying asic_revision
number of the chip.
Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't log "Out of MCCQ wrbs" msg. When the driver doesn't receive any
response from the FW it already logs a "FW not responding" message.
The driver runs out of MCCQ wrbs much later. Also, this message can
swamp the kernel log in HW/FW error scenarios.
Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Skyhawk-R and BE3-R (SuperNIC profile) require V2 version
of TXQ_CREATE cmd to be used.
Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
a. Common tree of `dir` structures.
b. Multi-port devices structures.
CC: Francious Romieu <romieu@fz.zoreil.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
introduce a procedure to read in u32 granularity.
CC: Francious Romieu <romieu@fz.zoreil.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/s390/net/qeth_l3_main.c: In function 'qeth_l3_add_vlan_mc':
>> drivers/s390/net/qeth_l3_main.c:1662:3: error: too few arguments to function '__vlan_find_dev_deep'
include/linux/if_vlan.h:88:27: note: declared here
drivers/s390/net/qeth_l3_main.c: In function 'qeth_l3_add_vlan_mc6':
>> drivers/s390/net/qeth_l3_main.c:1723:3: error: too few arguments to function '__vlan_find_dev_deep'
include/linux/if_vlan.h:88:27: note: declared here
drivers/s390/net/qeth_l3_main.c: In function 'qeth_l3_free_vlan_addresses4':
>> drivers/s390/net/qeth_l3_main.c:1767:2: error: too few arguments to function '__vlan_find_dev_deep'
include/linux/if_vlan.h:88:27: note: declared here
drivers/s390/net/qeth_l3_main.c: In function 'qeth_l3_free_vlan_addresses6':
>> drivers/s390/net/qeth_l3_main.c:1797:2: error: too few arguments to function '__vlan_find_dev_deep'
include/linux/if_vlan.h:88:27: note: declared here
drivers/s390/net/qeth_l3_main.c: In function 'qeth_l3_process_inbound_buffer':
>> drivers/s390/net/qeth_l3_main.c:1980:6: error: too few arguments to function '__vlan_hwaccel_put_tag'
include/linux/if_vlan.h:234:31: note: declared here
drivers/s390/net/qeth_l3_main.c: In function 'qeth_l3_verify_vlan_dev':
>> drivers/s390/net/qeth_l3_main.c:2089:3: error: too few arguments to function '__vlan_find_dev_deep'
include/linux/if_vlan.h:88:27: note: declared here
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add missing return statement for CONFIG_BUG=n.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix up some function signatures for CONFIG_VLAN=n that were missed during
the 802.1ad support patches.
Found by the kbuild robot.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The following leak is reported by kmemleak:
[ 86.812073] kmemleak: Found object by alias at 0xffff88006ecc76f0
[ 86.816019] Pid: 739, comm: kworker/u:1 Not tainted 3.9.0-rc5+ #842
[ 86.816019] Call Trace:
[ 86.816019] <IRQ> [<ffffffff81151c58>] find_and_get_object+0x8c/0xdf
[ 86.816019] [<ffffffff8190e90d>] ? vlan_info_rcu_free+0x33/0x49
[ 86.816019] [<ffffffff81151cbe>] delete_object_full+0x13/0x2f
[ 86.816019] [<ffffffff8194bbb6>] kmemleak_free+0x26/0x45
[ 86.816019] [<ffffffff8113e8c7>] slab_free_hook+0x1e/0x7b
[ 86.816019] [<ffffffff81141c05>] kfree+0xce/0x14b
[ 86.816019] [<ffffffff8190e90d>] vlan_info_rcu_free+0x33/0x49
[ 86.816019] [<ffffffff810d0b0b>] rcu_do_batch+0x261/0x4e7
The reason is that in vlan_info_rcu_free() we don't take the VLAN protocol
into account when iterating over the vlan_devices_array.
Reported-by: Cong Wang <amwang@redhat.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Tested-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
The following patchset contains a small batch of Netfilter
updates for your net-next tree, they are:
* Three patches that provide more accurate error reporting to
user-space, instead of -EPERM, in IPv4/IPv6 netfilter re-routing
code and NAT, from Patrick McHardy.
* Update copyright statements in Netfilter filters of
Patrick McHardy, from himself.
* Add Kconfig dependency on the raw/mangle tables to the
rpfilter, from Florian Westphal.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for the get_settings ethtool op to the bonding
driver. This was motivated by users who wanted to get the speed of the
bond and compare that against throughput to understand utilization.
The behavior before this patch was added was problematic when computing
line utilization after trying to get link-speed and throughput via SNMP.
Output from ethtool looks like this for a round-robin bond:
Settings for bond0:
Supported ports: [ ]
Supported link modes: Not reported
Supported pause frame use: No
Supports auto-negotiation: No
Advertised link modes: Not reported
Advertised pause frame use: No
Advertised auto-negotiation: No
Speed: 11000Mb/s
Duplex: Full
Port: Other
PHYAD: 0
Transceiver: internal
Auto-negotiation: off
MDI-X: Unknown
Link detected: yes
I tested this and verified it works as expected. A test was also done
on a version backported to an older kernel and it worked well there.
v2: Switch to using ethtool_cmd_speed_set to set speed, added check to
SLAVE_IS_OK for each slave in bond, dropped mode-specific calculations
as they were not needed, and set port type to 'Other.'
v3: Fix useless assignment and checkpatch warning.
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces a small, internal helper function, that is used by
PF_PACKET. Based on the flags that are passed, it extracts the packet
timestamp in the receive path. This is merely a refactoring to remove
some duplicate code in tpacket_rcv(), to make it more readable, and to
enable others to use this function in PF_PACKET as well, e.g. for TX.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, ktime2ts is a small helper function that is only used in
net/socket.c. Move this helper into the ktime API as a small inline
function, so that i) it's maintained together with ktime routines,
and ii) also other files can make use of it. The function is named
ktime_to_timespec_cond() and placed into the generic part of ktime,
since we internally make use of ktime_to_timespec(). ktime_to_timespec()
itself does not check the ktime variable for zero, hence, we name
this function ktime_to_timespec_cond() for only a conditional
conversion, and adapt its users to it.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rajesh Borundia says:
====================
* "qlcnic: Change 82xx adapter VLAN id endian type".
- Adapter requires VLAN id in little endian. VLAN id was being
converted to __le16 and then passed as a parameter. Pass VLAN id
as u16 and then use cpu_to_le16 at appropriate places. It is
appropriate for net-next as SR-IOV patches have a dependency on it.
* "qlcnic: Fix loopback test for SR-IOV PF".
- It is appropriate for net-next as change is needed for SRIOV PF
only.
* Remaining patches add enhancements to SR-IOV functionality like
- FLR handling
- Adapter reset recovery handling
- iproute2 tool support for configuring MAC address, Tx rate and
VLAN id.
- Mailbox polling support for SR-IOV PF in case mailbox interrupts
are disabled.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
o When mailbox interrupt is disabled PF should be
able to process request from VF. Enable polling
for such cases.
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o Do not disable mailbox interrupts while running
loopback test through SR-IOV PF.
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o Add support for VLAN id configuration per VF using
iproute2 tool.
o VLAN id's 1-4094 are treated as PVID by the PF and
Guest VLAN tagging is not allowed by default.
o PVID is disabled when the VLAN id is set to 0
o Guest VLAN tagging is allowed when the VLAN id is set to 4095.
o Only one Guest VLAN id is supported.
o VLAN id can be changed only when the VF driver is not loaded.
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o Add support for MAC address and Tx rate configuration
per VF via iproute2 tool.
o Tx rate change is allowed while the guest is running
and the VF driver is loaded.
o MAC address change is allowed only when VF driver
is not loaded.
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o Implement recovery mechanism for VF to recover from
adapter resets.
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o FLR from Hypervisor - When hypervisor issues a VF FLR request,
adapter notifies the parent PF driver of the FLR request for PF
driver to perform any cleanup on behalf of that VF.
o FLR from VF Driver - VF driver may initiate a VF FLR request,
if VF state needs to be cleaned up before a re-initialization.
VF re-initialization during kdump is an example.
o PF driver cleans up all resources allocated on behalf of a VF,
on VF FLR notifications from the adapter or from the VF driver.
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>