linux-pinenote

Author	SHA1	Message	Date
Eric Dumazet	740b0f1841	tcp: switch rtt estimations to usec resolution Upcoming congestion controls for TCP require usec resolution for RTT estimations. Millisecond resolution is simply not enough these days. FQ/pacing in DC environments also require this change for finer control and removal of bimodal behavior due to the current hack in tcp_update_pacing_rate() for 'small rtt' TCP_CONG_RTT_STAMP is no longer needed. As Julian Anastasov pointed out, we need to keep user compatibility : tcp_metrics used to export RTT and RTTVAR in msec resolution, so we added RTT_US and RTTVAR_US. An iproute2 patch is needed to use the new attributes if provided by the kernel. In this example ss command displays a srtt of 32 usecs (10Gbit link) lpk51:~# ./ss -i dst lpk52 Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port tcp ESTAB 0 1 10.246.11.51:42959 10.246.11.52:64614 cubic wscale:6,6 rto:201 rtt:0.032/0.001 ato:40 mss:1448 cwnd:10 send 3620.0Mbps pacing_rate 7240.0Mbps unacked:1 rcv_rtt:993 rcv_space:29559 Updated iproute2 ip command displays : lpk51:~# ./ip tcp_metrics \| grep 10.246.11.52 10.246.11.52 age 561.914sec cwnd 10 rtt 274us rttvar 213us source 10.246.11.51 Old binary displays : lpk51:~# ip tcp_metrics \| grep 10.246.11.52 10.246.11.52 age 561.914sec cwnd 10 rtt 250us rttvar 125us source 10.246.11.51 With help from Julian Anastasov, Stephen Hemminger and Yuchung Cheng Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Yuchung Cheng <ycheng@google.com> Cc: Larry Brakmo <brakmo@google.com> Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 17:08:40 -05:00
Eric Dumazet	363ec39235	net: add skb_mstamp infrastructure ktime_get() is too expensive on some cases, and we'd like to get usec resolution timestamps in TCP stack. This patch adds a light weight facility using a combination of local_clock() and jiffies samples. Instead of : u64 t0, t1; t0 = ktime_get(); // stuff t1 = ktime_get(); delta_us = ktime_us_delta(t1, t0); use : struct skb_mstamp t0, t1; skb_mstamp_get(&t0); // stuff skb_mstamp_get(&t1); delta_us = skb_mstamp_us_delta(&t1, &t0); Note : local_clock() might have a (bounded) drift between cpus. Do not use this infra in place of ktime_get() without understanding the issues. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Larry Brakmo <brakmo@google.com> Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 17:08:39 -05:00
Ben Dooks	20d8435a1c	phy: micrel: add of configuration for LED mode Add support for the led-mode property for the following PHYs which have a single LED mode configuration value. KSZ8001 and KSZ8041 which both use register 0x1e bits 15,14 and KSZ8021, KSZ8031 and KSZ8051 which use register 0x1f bits 5,4 to control the LED configuration. Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 17:00:07 -05:00
Arnd Bergmann	94fcf6964c	isdn: fix multiple sleep_on races The isdn core code uses a couple of wait queues with interruptible_sleep_on, which is racy and about to get removed from the kernel. Fortunately, we know for each case what we are waiting for, so they can all be converted to the better wait_event_interruptible interface. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Karsten Keil <isdn@linux-pingi.de> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 16:06:13 -05:00
Arnd Bergmann	c11da83bda	isdn: divert, hysdn: fix interruptible_sleep_on race These two drivers use identical code for their procfs status file handling, which contains a small race against status data becoming available while reading the file. This uses wait_event_interruptible instead to fix this particular race and eventually get rid of all sleep_on instances. There seems to be another race involving multiple concurrent readers of the same procfs file, which I don't try to fix here. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Karsten Keil <isdn@linux-pingi.de> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 16:06:13 -05:00
Arnd Bergmann	c728cc88ce	isdn: hisax/elsa: fix sleep_on race in elsa FSM The state machine code in the elsa driver uses interruptible_sleep_on to wait for state changes, which is racy. A closer look at the possible states reveals that it is always used to wait for getting back into ARCOFI_NOP, so we can use wait_event_interruptible instead. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Karsten Keil <isdn@linux-pingi.de> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 16:06:13 -05:00
Arnd Bergmann	e5b3fa1547	isdn: pcbit: fix interruptible_sleep_on race interruptible_sleep_on is racy and going away. In case of pcbit, the driver would run into a timeout if the card is initialized before we start waiting for it. This uses wait_event to fix the race. In order to do this, the state machine handling for the timeout case has to get trivially reorganized so we actually know whether the timeout has occorred or not. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Karsten Keil <isdn@linux-pingi.de> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 16:06:12 -05:00
Arnd Bergmann	c73b1f6a04	atm: firestream: fix interruptible_sleep_on race interruptible_sleep_on is racy and going away. This replaces the one use in the firestream driver with the appropriate wait_event_interruptible variant. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Chas Williams <chas@cmf.nrl.navy.mil> Cc: linux-atm-general@lists.sourceforge.net Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 16:06:12 -05:00
David S. Miller	53e0b080bc	Merge branch 'intel-next' Aaron Brown says: ==================== Intel Wired LAN Driver Updates This series contains updates to ixgbe, igb and documentation. The first four have been sent up as part of other series where 1 or more in the series were rejected and either dropped or still being worked on for reasons unrelated to these patches. Don makes recovery from a HW ECC error just schedule a reset as it turns out the previous behaviour of forcing the user to reload is not necessary. Mark adds WoL support to port 0 of a new device. Jacob removes a magic number from the ptp_caps.name and updates the SubmittingPatches documentation with details on the Fixed: tag. And Carolyn updates igb files to remove the FSF physical mail address. [ DaveM Note: SubmittingPatches change omitted, will go via LKML ] ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 15:55:53 -05:00
Carolyn Wyborny	74cfb2e1f2	igb: Update license text to remove FSF address and update copyright. This patch updates the license text to remove address of Free Software Foundation and refer users to www.gnu.org instead. This patch also updates the copyright dates in appropriate igb driver files. Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 15:54:52 -05:00
Jeff Kirsher	167f3f71c7	igb: make local functions static and remove dead code Based on Stephen Hemminger's original patch. Make local functions static, and remove unused functions. Reported-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 15:54:52 -05:00
Mark Rustad	87557440d8	ixgbe: Add WoL support for a new device Add WoL support for port 0 of a new 82599-based device. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 15:54:52 -05:00
Jacob Keller	ca324099fb	ixgbe: don't use magic size number to assign ptp_caps.name Rather than using a magic size number, just use sizeof since that will work and is more robust to future changes. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 15:54:51 -05:00
Don Skidmore	d773ce2de1	ixgbe: modify behavior on receiving a HW ECC error. Currently when we noticed a HW ECC error we would request the use reload the driver to force a reset of the part. This was done due to the mistaken believe that a normal reset would not be sufficient. Well it turns out it would be so now we just schedule a reset upon seeing the ECC. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 15:54:51 -05:00
Hannes Frederic Sowa	0b95227a7b	ipv6: yet another new IPV6_MTU_DISCOVER option IPV6_PMTUDISC_OMIT This option has the same semantic as IP_PMTUDISC_OMIT for IPv4 which got recently introduced. It doesn't honor the path mtu discovered by the host but in contrary to IPV6_PMTUDISC_INTERFACE allows the generation of fragments if the packet size exceeds the MTU of the outgoing interface MTU. Fixes: `93b36cf342` ("ipv6: support IPV6_PMTU_INTERFACE on sockets") Cc: Florian Weimer <fweimer@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 15:51:01 -05:00
Hannes Frederic Sowa	1b34657635	ipv4: yet another new IP_MTU_DISCOVER option IP_PMTUDISC_OMIT IP_PMTUDISC_INTERFACE has a design error: because it does not allow the generation of fragments if the interface mtu is exceeded, it is very hard to make use of this option in already deployed name server software for which I introduced this option. This patch adds yet another new IP_MTU_DISCOVER option to not honor any path mtu information and not accepting new icmp notifications destined for the socket this option is enabled on. But we allow outgoing fragmentation in case the packet size exceeds the outgoing interface mtu. As such this new option can be used as a drop-in replacement for IP_PMTUDISC_DONT, which is currently in use by most name server software making the adoption of this option very smooth and easy. The original advantage of IP_PMTUDISC_INTERFACE is still maintained: ignoring incoming path MTU updates and not honoring discovered path MTUs in the output path. Fixes: `482fc6094a` ("ipv4: introduce new IP_MTU_DISCOVER mode IP_PMTUDISC_INTERFACE") Cc: Florian Weimer <fweimer@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 15:51:00 -05:00
Hannes Frederic Sowa	69647ce46a	ipv4: use ip_skb_dst_mtu to determine mtu in ip_fragment ip_skb_dst_mtu mostly falls back to ip_dst_mtu_maybe_forward if no socket is attached to the skb (in case of forwarding) or determines the mtu like we do in ip_finish_output, which actually checks if we should branch to ip_fragment. Thus use the same function to determine the mtu here, too. This is important for the introduction of IP_PMTUDISC_OMIT, where we want the packets getting cut in pieces of the size of the outgoing interface mtu. IPv6 already does this correctly. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 15:51:00 -05:00
Timo Teräs	a960ff81f0	neigh: probe application via netlink in NUD_PROBE iproute2 arpd seems to expect this as there's code and comments to handle netlink probes with NUD_PROBE set. It is used to flush the arpd cached mappings. opennhrp instead turns off unicast probes (so it can handle all neighbour discovery). Without this change it will not see NUD_PROBE probes and cannot reconfirm the mapping. Thus currently neigh entry will just fail and can cause few packets dropped until broadcast discovery is restarted. Earlier discussion on the subject: http://marc.info/?t=139305877100001&r=1&w=2 Signed-off-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 15:46:25 -05:00
Jean Sacren	44a6bd8656	ieee802154: fix new function declaration The commit `8fad346f36` ("eee802154: add basic support for RF212 to at86rf230 driver") introduced the new function is_rf212() with some minor issues in declaration: 1) Fix the function type by changing it to bool as the function definition returns a boolean value. Additionally both callers of is_rf212() are expected to return a boolean value. 2) Fix the function specifier by deleting the inline keyword as the compiler takes care of that. Signed-off-by: Jean Sacren <sakiwit@gmail.com> Cc: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 15:46:25 -05:00
Bjørn Mork	84a3e72c3a	ipv6: log src and dst along with "udp checksum is 0" These info messages are rather pointless without any means to identify the source of the bogus packets. Logging the src and dst addresses and ports may help a bit. Cc: Joe Perches <joe@perches.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 15:46:25 -05:00
Pablo Neira Ayuso	86c3f0f830	vxlan: remove unused port variable in vxlan_udp_encap_recv() Signed-off-by: Pablo Neira Ayuso <pablo@gnumonks.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 15:46:25 -05:00
David S. Miller	941ee45c4c	Merge branch 'mlx4' Amir Vadai says: ==================== net, net/mlx4: Add sysfs file for port number Modern distro's are using biosdevname to rename interface to a name based on slot/port number. biosdevname can't get the port number of devices that have multiple ports that share the same PCI function. This patch adds a sysfs file under: /sys/devices/.../net/<interface>/dev_port, that contains the port number (0 based) - to be used by biosdevname. Also, dev_id was wrongly used in mlx4_en driver - added a patch that fix it. This patch was tested and applied over commit 51adfcc "net: bcmgenet: remove unused bh_lock member" ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 15:38:18 -05:00
Amir Vadai	ca9f9f7039	net/mlx4_en: Fix bad use of dev_id dev_id should be set for multiple netdev's sharing the same MAC, which is not the case here. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 15:38:07 -05:00
Amir Vadai	76a066f2a2	net/mlx4_en: Expose port number through sysfs Initialize dev_port with port number (0 based) to be accessed through sysfs from user space. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 15:38:06 -05:00
Amir Vadai	3f85944fe2	net: Add sysfs file for port number Add a sysfs file to enable user space to query the device port number used by a netdevice instance. This is needed for devices that have multiple ports on the same PCI function. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 15:38:06 -05:00
David S. Miller	f2ea0cfd6b	Merge branch 'bnx2x' Michal Schmidt says: ==================== bnx2x: minimize RAM usage in kdump kdump kernels usually have only a small amount of memory reserved. bnx2x can be memory-hungry. Let's minimize its memory usage when running in kdump. I detect kdump by looking at the "reset_devices" flag. A couple of storage drivers (cciss, hpsa) use it for the same purpose. I am not sure this is the best way to solve the problem, but it works. Should it be made more generic by, say, looking at the total amount of lowmem instead? Not using TPA by default when lowmem is small and/or defaulting to fewer queues would help 32bit systems where a driver for a multi-function multi-queue NIC can consume a significant amount of available memory. Or do we want no such heuristics? Is this something to consider doing for other network drivers too? ==================== Acked-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 15:28:08 -05:00
Michal Schmidt	94d9de3cf7	bnx2x: save RAM in kdump kernel by disabling TPA When running in a kdump kernel, disable TPA. This saves memory, which tends to be scarce in kdump. TPA, being a receive acceleration, is unlikely to be useful for kdump, whose purpose is to send the memory image out. This saves additional 5 MB in the kdump environment. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 15:27:50 -05:00
Michal Schmidt	ff2ad3071f	bnx2x: save RAM in kdump kernel by using a single queue When running in a kdump kernel, make sure to use only a single ethernet queue even if a num_queues option in /etc/modprobe.d/*.conf would specify otherwise. This saves memory, which tends to be scarce in kdump. This saves about 40 MB in the kdump environment on a setup with num_queues=8 in the config file. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 15:27:50 -05:00
Michal Schmidt	7d0445d66a	bnx2x: clamp num_queues to prevent passing a negative value Use the clamp() macro to make the calculation of the number of queues slightly easier to understand. It also avoids a crash when someone accidentally passes a negative value in num_queues= module parameter. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 15:27:50 -05:00
Florian Westphal	8e165e2034	net: tcp: add mib counters to track zero window transitions Three counters are added: - one to track when we went from non-zero to zero window - one to track the reverse - one counter incremented when we want to announce zero window, but can't because we would shrink current window. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 15:23:30 -05:00
Neil Jerram	2ebe21fdde	net: order MPLS ethertypes numerically All ethertypes other than ETH_P_MPLS_UC, ETH_P_MPLS_MC and ETH_P_ATMMPOA were already ordered numerically. This commit moves those three ETH_P_... values into correct numerical order too. Signed-off-by: Neil Jerram <Neil.Jerram@metaswitch.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-26 15:14:26 -05:00
Joe Perches	cd2b0389dc	bnx2x: Remove hidden flow control goto from BNX2X_ALLOC macros BNX2X_ALLOC macros use "goto alloc_mem_err" so these labels appear unused in some functions. Expand these macros in-place via coccinelle and some typing. Update the macros to use statement expressions and remove the BNX2X_ALLOC macro. This adds some > 80 char lines. $ cat bnx2x_pci_alloc.cocci @@ expression e1; expression e2; expression e3; @@ - BNX2X_PCI_ALLOC(e1, e2, e3); + e1 = BNX2X_PCI_ALLOC(e2, e3); if (!e1) goto alloc_mem_err; @@ expression e1; expression e2; expression e3; @@ - BNX2X_PCI_FALLOC(e1, e2, e3); + e1 = BNX2X_PCI_FALLOC(e2, e3); if (!e1) goto alloc_mem_err; @@ expression e1; expression e2; @@ - BNX2X_ALLOC(e1, e2); + e1 = kzalloc(e2, GFP_KERNEL); if (!e1) goto alloc_mem_err; @@ expression e1; expression e2; expression e3; @@ - kzalloc(sizeof(e1) * e2, e3) + kcalloc(e2, sizeof(e1), e3) @@ expression e1; expression e2; expression e3; @@ - kzalloc(e1 * sizeof(e2), e3) + kcalloc(e1, sizeof(e2), e3) Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-25 17:36:35 -05:00
Steffen Klassert	895de9a348	vti4: Enable namespace changing vti4 is now fully namespace aware, so allow namespace changing for vti devices Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-25 07:04:19 +01:00
Steffen Klassert	6e2de802af	vti4: Check the tunnel endpoints of the xfrm state and the vti interface The tunnel endpoints of the xfrm_state we got from the xfrm_lookup must match the tunnel endpoints of the vti interface. This patch ensures this matching. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-25 07:04:19 +01:00
Steffen Klassert	78a010cca0	vti4: Support inter address family tunneling. With this patch we can tunnel ipv6 traffic via a vti4 interface. A vti4 interface can now have an ipv6 address and ipv6 traffic can be routed via a vti4 interface. The resulting traffic is xfrm transformed and tunneled throuhg ipv4 if matching IPsec policies and states are present. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-25 07:04:19 +01:00
Steffen Klassert	a34cd4f319	vti4: Use the on xfrm_lookup returned dst_entry directly We need to be protocol family indepenent to support inter addresss family tunneling with vti. So use a dst_entry instead of the ipv4 rtable in vti_tunnel_xmit. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-25 07:04:18 +01:00
Steffen Klassert	9994bb8e1e	xfrm4: Remove xfrm_tunnel_notifier This was used from vti and is replaced by the IPsec protocol multiplexer hooks. It is now unused, so remove it. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-25 07:04:18 +01:00
Steffen Klassert	df3893c176	vti: Update the ipv4 side to use it's own receive hook. With this patch, vti uses the IPsec protocol multiplexer to register it's own receive side hooks for ESP, AH and IPCOMP. Vti now does the following on receive side: 1. Do an input policy check for the IPsec packet we received. This is required because this packet could be already prosecces by IPsec, so an inbuond policy check is needed. 2. Mark the packet with the i_key. The policy and the state must match this key now. Policy and state belong to the outer namespace and policy enforcement is done at the further layers. 3. Call the generic xfrm layer to do decryption and decapsulation. 4. Wait for a callback from the xfrm layer to properly clean the skb to not leak informations on namespace and to update the device statistics. On transmit side: 1. Mark the packet with the o_key. The policy and the state must match this key now. 2. Do a xfrm_lookup on the original packet with the mark applied. 3. Check if we got an IPsec route. 4. Clean the skb to not leak informations on namespace transitions. 5. Attach the dst_enty we got from the xfrm_lookup to the skb. 6. Call dst_output to do the IPsec processing. 7. Do the device statistics. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-25 07:04:18 +01:00
Steffen Klassert	6d608f06e3	ip_tunnel: Make vti work with i_key set Vti uses the o_key to mark packets that were transmitted or received by a vti interface. Unfortunately we can't apply different marks to in and outbound packets with only one key availabe. Vti interfaces typically use wildcard selectors for vti IPsec policies. On forwarding, the same output policy will match for both directions. This generates a loop between the IPsec gateways until the ttl of the packet is exceeded. The gre i_key/o_key are usually there to find the right gre tunnel during a lookup. When vti uses the i_key to mark packets, the tunnel lookup does not work any more because vti does not use the gre keys as a hash key for the lookup. This patch workarounds this my not including the i_key when comupting the hash for the tunnel lookup in case of vti tunnels. With this we have separate keys available for the transmitting and receiving side of the vti interface. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-25 07:04:18 +01:00
Steffen Klassert	70be6c91c8	xfrm: Add xfrm_tunnel_skb_cb to the skb common buffer IPsec vti_rcv needs to remind the tunnel pointer to check it later at the vti_rcv_cb callback. So add this pointer to the IPsec common buffer, initialize it and check it to avoid transport state matching of a tunneled packet. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-25 07:04:17 +01:00
Steffen Klassert	d099160e02	ipcomp4: Use the IPsec protocol multiplexer API Switch ipcomp4 to use the new IPsec protocol multiplexer. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-25 07:04:17 +01:00
Steffen Klassert	e5b56454e0	ah4: Use the IPsec protocol multiplexer API Switch ah4 to use the new IPsec protocol multiplexer. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-25 07:04:17 +01:00
Steffen Klassert	827789cbd7	esp4: Use the IPsec protocol multiplexer API Switch esp4 to use the new IPsec protocol multiplexer. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-25 07:04:17 +01:00
Steffen Klassert	3328715e6c	xfrm4: Add IPsec protocol multiplexer This patch add an IPsec protocol multiplexer. With this it is possible to add alternative protocol handlers as needed for IPsec virtual tunnel interfaces. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-02-25 07:04:16 +01:00
Florian Fainelli	51adfcc333	net: bcmgenet: remove unused bh_lock member bh_lock spinlock is unused, remove it from the private driver structure. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-24 20:26:37 -05:00
Florian Fainelli	da56bbf71d	net: bcmgenet: remove commented code in bcmgenet_xmit() This code is commented since it is unused, left-over from the very first time this driver was merged. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-24 20:26:36 -05:00
Florian Fainelli	80d8e96d12	net: bcmgenet: drop checks on priv->phydev Drop all the checks on priv->phydev since we will refuse probing the driver if we cannot attach to a PHY device. Drop all checks on priv->phydev. This also fixes some smatch issues reported by Dan Carpenter. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-24 20:26:36 -05:00
David S. Miller	432c5b3a10	Merge branch 'gianfar' Claudiu Manoil says: ==================== gianfar: Device reset and reconfig fixes These patches end up fixing some notable device reset & reconfig related problems. One issue is on-the-fly (Rx/Tx on) programming of interrupt coalescing (IC) registers on the processing path, against HW recommendation. This is an old issue that became visible after BQL introduction, as under certain conditions (low traffic) one TX interrupt gets lost and BQL fires Tx timeout as a result. Another notable issue is a race on the Tx path (xmit, clean_tx) during device reset (i.e. during Tx timeout watchdog firing) that leads to NULL access. Fixing the problematic on-thy-fly register writes (i.e. the IC regs) required the implementation of a MAC soft reset procedure. The race leading to NULL access was addressed by fixing the stop_gfar()/startup_gfar() pair (disable/enable napi a.s.o.) and adding the device state DOWN to sync with the TX path. v2: Refactored if() clauses from gfar_set_features(), PATCH 2. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-24 19:38:53 -05:00
Claudiu Manoil	f19015baa2	gianfar: Fix Tx int miss, dont write IC on-the-fly Programming the interrupt coalescing (IC) registers while the controller/DMA is on may incur the loss of one Tx confirmation interrupt, under certain conditions. This is a subtle hw race because it does not occur during a burst of Tx packets. It has been observed on p2020 devices that, if just one packet is being xmit'ed, the Tx confirmation doesn't trigger and BQL evetually blocks the Tx queues, followed by Tx timeout and an un-responsive device. This issue was not apparent prior to introducing BQL support, as a late Tx confirmation was not an issue back then and the next burst of Tx frames would have triggered the Tx confirmation/ Tx ring cleanup anyway. Bottom line, the hw specifications state that the IC registers should not be programmed while the Rx/Tx blocks (the DMA) are enabled. Further more, these registers are currently re-written with the same values on the processing path, over and over again. To fix this, rewriting the IC registers has been removed from the processing path (napi poll). A complete MAC reset procedure has been implemented for the ethtool -c option instead, to reliably update these registers while the controller is stopped. Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-24 19:38:20 -05:00
Claudiu Manoil	0851133bb5	gianfar: Fix device reset races (oops) for Tx The device reset procedure, stop_gfar()/startup_gfar(), has concurrency issues. "Kernel access of bad area" oopses show up during Tx timeout device reset or other reset cases (like changing MTU) that happen while the interface still has traffic. The oopses happen in start_xmit and clean_tx_ring when accessing tx_queue-> tx_skbuff which is NULL. The race comes from de-allocating the tx_skbuff while transmission and napi processing are still active. Though the Tx queues get temoprarily stopped when Tx timeout occurs, they get re-enabled as a result of Tx congestion handling inside the napi context (see clean_tx_ring()). Not disabling the napi during reset is also a bug, because clean_tx_ring() will try to access tx_skbuff while it is being de-alloc'ed and re-alloc'ed. To fix this, stop_gfar() needs to disable napi processing after stopping the Tx queues. However, in order to prevent clean_tx_ring() to re-enable the Tx queue before the napi gets disabled, the device state DOWN has been introduced. It prevents the Tx congestion management from re-enabling the de-congested Tx queue while the device is brought down. An additional locking state, RESETTING, has been introduced to prevent simultaneous resets or to prevent configuring the device while it is resetting. The bogus 'rxlock's (for each Rx queue) have been removed since their purpose is not justified, as they don't prevent nor are suited to prevent device reset/reconfig races (such as this one). Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-24 19:38:20 -05:00

1 2 3 4 5 ...

427483 commits