The recent change in IP_VS_XMIT_TUNNEL to set
CHECKSUM_NONE is not correct. After adding IPIP header
skb->csum becomes invalid but the CHECKSUM_PARTIAL
case must be supported. So, use skb_forward_csum() which is
most suitable for us to allow local clients to send IPIP
to remote real server.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Delivering locally ICMP from FORWARD hook is not supported.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
This patch is needed to avoid scheduling of
packets from local real server when we add ip_vs_in
in LOCAL_OUT hook to support local client.
Currently, when ip_vs_in can not find existing
connection it tries to create new one by calling ip_vs_schedule.
The default indication from ip_vs_schedule was if
connection was scheduled to real server. If real server is
not available we try to use the bypass forwarding method
or to send ICMP error. But in some cases we do not want to use
the bypass feature. So, add flag 'ignored' to indicate if
the scheduler ignores this packet.
Make sure we do not create new connections from replies.
We can hit this problem for persistent services and local real
server when ip_vs_in is added to LOCAL_OUT hook to handle
local clients.
Also, make sure ip_vs_schedule ignores SYN packets
for Active FTP DATA from local real server. The FTP DATA
connection should be created on SYN+ACK from client to assign
correct connection daddr.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Change skb->ipvs_property semantic. This is preparation
to support ip_vs_out processing in LOCAL_OUT. ipvs_property=1
will be used to avoid expensive lookups for traffic sent by
transmitters. Now when conntrack support is not used we call
ip_vs_notrack method to avoid problems in OUTPUT and
POST_ROUTING hooks instead of exiting POST_ROUTING as before.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Avoid full checksum calculation for apps that can provide
info whether csum was broken after payload mangling. For now only
ip_vs_ftp mangles payload and it updates the csum, so the full
recalculation is avoided for all packets.
Add CHECKSUM_UNNECESSARY for snat_handler (TCP and UDP).
It is needed to support SNAT from local address for the case
when csum is fully recalculated.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Fix CHECKSUM_PARTIAL handling. Tested for IPv4 TCP,
UDP not tested because it needs network card with HW CSUM support.
May be fixes problem where IPVS can not be used in virtual boxes.
Problem appears with DNAT to local address when the local stack
sends reply in CHECKSUM_PARTIAL mode.
Fix tcp_dnat_handler and udp_dnat_handler to provide
vaddr and daddr in right order (old and new IP) when calling
tcp_partial_csum_update/udp_partial_csum_update (CHECKSUM_PARTIAL).
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
If dma mapping fail we are dropping packages or fail to open device.
But exact reason of drop/fail stays unknow for a user, so print errors.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix switching device to low-speed mode after resume reported in:
https://bugzilla.redhat.com/show_bug.cgi?id=502974
Reported-and-tested-by: Laurentiu Badea <bugzilla-redhat@wotevah.com>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since we do not change rx buffer size any longer, we can
clean up rtl8169_change_mtu and in consequence rtl8169_down.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Only increase tx_{packets,dropped} statistics when transmit or drop
full skb, not just fragment.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Check possible dma mapping errors and do clean up if it happens.
Fix overwrap bug in rtl8169_tx_clear on the way.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert bvec_k{un,}map_irq() from macros to static inline functions if
!CONFIG_HIGHMEM, so we can easier detect mistakes like the one fixed in
93055c3104 ("ps3disk: passing wrong variable =
to
bvec_kunmap_irq()")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Make the bnx2x driver use the new vlan accleration model.
Signed-off-by: Hao Zheng <hzheng@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
CC: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the ixgbe driver use the new vlan accleration model.
Signed-off-by: Jesse Gross <jesse@nicira.com>
CC: Peter Waskiewicz <peter.p.waskiewicz.jr@intel.com>
CC: Emil Tantilov <emil.s.tantilov@intel.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the bnx2 driver use the new vlan accleration model.
Signed-off-by: Jesse Gross <jesse@nicira.com>
CC: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If some of the underlying devices support it, enable vlan offload on
transmit for bridge devices. This allows senders to take advantage of the
hardware support, similar to other forms of acceleration.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that vlan acceleration is handled consistently regardless of usage,
it is possible to enable and disable it at will. This adds support for
Ethtool operations that change the offloading status for debugging
purposes, similar to other forms of hardware acceleration.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently each driver that is capable of vlan hardware acceleration
must be aware of the vlan groups that are configured and then pass
the stripped tag to a specialized receive function. This is
different from other types of hardware offload in that it places a
significant amount of knowledge in the driver itself rather keeping
it in the networking core.
This makes vlan offloading function more similarly to other forms
of offloading (such as checksum offloading or TSO) by doing the
following:
* On receive, stripped vlans are passed directly to the network
core, without attempting to check for vlan groups or reconstructing
the header if no group
* vlans are made less special by folding the logic into the main
receive routines
* On transmit, the device layer will add the vlan header in software
if the hardware doesn't support it, instead of spreading that logic
out in upper layers, such as bonding.
There are a number of advantages to this:
* Fixes all bugs with drivers incorrectly dropping vlan headers at once.
* Avoids having to disable VLAN acceleration when in promiscuous mode
(good for bridging since it always puts devices in promiscuous mode).
* Keeps VLAN tag separate until given to ultimate consumer, which
avoids needing to do header reconstruction as in tg3 unless absolutely
necessary.
* Consolidates common code in core networking.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A struct net_device always maps to zero or one vlan groups and we
always know the device when we are looking up a group. We currently
do a hash table lookup on the device to find the group but it is
much simpler to just store a pointer.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently users of hardware vlan accleration need to know whether
the device supports it before generating packets. However, vlan
acceleration will soon be available in a more flexible manner so
knowing ahead of time becomes much more difficult. This adds
a software fallback path for vlan packets on devices without the
necessary offloading support, similar to other types of hardware
accleration.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Many (but not all) drivers check to see whether there is a vlan
group configured before using a tag stored in the skb. There's
not much point in this check since it just throws away data that
should only be present in the expected circumstances. However,
it will soon be legal and expected to get a vlan tag when no
vlan group is configured, so remove this check from all drivers
to avoid dropping the tags.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VLAN_GROUP_ARRAY_LEN is simply the number of possible vlan VIDs.
Since vlan groups will soon be more of an implementation detail
for vlan devices, rename the constant to be descriptive of its
actual purpose.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
An upcoming commit will allow packets with hardware vlan acceleration
information to be passed though more parts of the network stack, including
packets trunked through the bridge. This adds support for matching and
filtering those packets through ebtables.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix a log message
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change min MTU to 68.
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace no wait CMD_ENABLE firmware devcmd with CMD_ENABLE_WAIT
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Let the firmware know about the mac address set by the user using ndo_set_mac_address
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for multiple hardware receive queues. The ingress traffic is hashed into one of the receive queues based on IP or TCP or both headers. The max no. of receive queues supported is 8.
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we get a bit-flip of ECC error while reading the data area, do not add it to
corrupted list, because it is possible that this is just unstable PEB with
corruptions caused by unclean reboots.
This patch also improves commentaries.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
When the data does not contain all 0xFF bytes, 'check_data_ff()' should return
1, not -EINVAL; Also, the caller ('process_eb()') should not add the PEB to the
"corrupted" list if there was a read error.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
While scanning the flash we read all VID headers and store some important
information in 'struct ubi_scan_leb'. Store also the 'copy_flag' value there
as it is needed when comparing LEBs. We do not increase memory consumption
because this is just one bit and we have plenty of spare bits in
'struct ubi_scan_leb' (sizeof(struct ubi_scan_leb) is 48 both with and
without this patch).
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
In 'ubifs_replay_journal()' we allocate 'sbuf' for scanning the log.
However, we already have 'c->sbuf' for these purposes, so do not
allocate yet another one. This reduces UBIFS memory consumption while
recovering.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
This is a bug-fix: when we unmount, and we are currently in R/O
mode because of an error - we do not sync write-buffers, which
means we also do not cancel write-buffer timers we may possibly
have armed. This patch fixes the issue.
The issue can easily be reproduced by enabling UBIFS failure debug
mode (echo 4 > /sys/module/ubifs/parameters/debug_tsts) and
unmounting as soon as a failure happen. At some point the system
oopses because we have an armed hrtimer but UBIFS is unmounted
already.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
This is a clean-up patch which:
1. Removes explicite 'hrtimer_cancel()' after 'ubifs_wbuf_sync()' in
'ubifs_remount_ro()', because the timers will be canceled by
'ubifs_wbuf_sync()', no need to cancel them for the second time.
2. Remove "if (c->jheads)" check from 'ubifs_put_super()', because
at journal heads must always be allocated there, since we checked
earlier that we were mounted R/W, and the olny situation when
journal heads are not allocated is when mounter or re-mounted R/O.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Checkin c957ef2c59 had inconsistent
ordering of .data..percpu..page_aligned and .data..percpu..readmostly;
the still-broken version affected x86-32 at least.
The page aligned version really must be page aligned...
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <1287544022.4571.7.camel@sli10-conroe.sh.intel.com>
Cc: Shaohua Li <shaohua.li@intel.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Remove the BKL usage added in "block: push down BKL into .locked_ioctl".
Virtio-blk doesn't use the BKL for anything, and doesn't implement any
ioctl command by itself, but only uses the generic scsi_cmd_ioctl
which is fine without the BKL.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The ports are char devices; do not have seeking capabilities. Calling
nonseekable_open() from the fops_open() call and setting the llseek fops
pointer to no_llseek ensures an lseek() call from userspace returns
-ESPIPE.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
CC: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If a port has registered for SIGIO signals, let the application
know that the port is getting unplugged.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Send a SIGIO signal when new data arrives on a port. This is sent only
when the process has requested for the signal to be sent using fcntl().
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
A process can request for SIGIO on host connect / disconnect events
using the O_ASYNC file flag using fcntl().
If that's requested, and if the guest-side connection for the port is
open, any host-side open/close events for that port will raise a SIGIO.
The process can then use poll() within the signal handler to find out
which port triggered the signal.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Explain in a comment why there's no need to reference-count the portdev
struct: when a device is yanked out, we can't do anything more with it
anyway so just give up doing anything more with the data or the vqs and
exit cleanly.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When a port got hot-unplugged, when a port was open, any file operation
after the unplugging resulted in a crash. This is fixed by ref-counting
the port structure, and releasing it only when the file is closed.
This splits the unplug operation in two parts: first marks the port
as unavailable, removes all the buffers in the vqs and removes the port
from the per-device list of ports. The second stage, invoked when all
references drop to zero, releases the chardev and frees all other memory.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>