Commit graph

110 commits

Author SHA1 Message Date
Haiyang Zhang
4c87454a47 hyperv: Add IPv6 into the hash computation for vRSS
This will allow the workload spreading via vRSS for IPv6.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-30 16:10:04 -04:00
Haiyang Zhang
942396b019 hyperv: Fix the total_data_buflen in send path
total_data_buflen is used by netvsc_send() to decide if a packet can be put
into send buffer. It should also include the size of RNDIS message before the
Ethernet frame. Otherwise, a messge with total size bigger than send_section_size
may be copied into the send buffer, and cause data corruption.

[Request to include this patch to the Stable branches]

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-22 17:58:50 -04:00
Haiyang Zhang
f88e67149f hyperv: Add handling of IP header with option field in netvsc_set_hash()
In case that the IP header has optional field at the end, this patch will
get the port numbers after that field, and compute the hash. The general
parser skb_flow_dissect() is used here.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-17 23:43:40 -04:00
David S. Miller
64b1f00a08 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-10-08 16:22:22 -04:00
KY Srinivasan
3a67c9ccad hyperv: Fix a bug in netvsc_send()
After the packet is successfully sent, we should not touch the packet
as it may have been freed. This patch is based on the work done by
Long Li <longli@microsoft.com>.

David, please queue this up for stable.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-05 21:10:48 -04:00
David S. Miller
739e4a758e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/usb/r8152.c
	net/netfilter/nfnetlink.c

Both r8152 and nfnetlink conflicts were simple overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-02 11:25:43 -07:00
KY Srinivasan
dedb845ded hyperv: Fix a bug in netvsc_start_xmit()
After the packet is successfully sent, we should not touch the skb
as it may have been freed. This patch is based on the work done by
Long Li <longli@microsoft.com>.

In this version of the patch I have fixed issues pointed out by David.
David, please queue this up for stable.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Tested-by: Long Li <longli@microsoft.com>
Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-30 01:21:03 -04:00
Dan Carpenter
b1c849276b hyperv: NULL dereference on error
We try to call free_netvsc_device(net_device) when "net_device" is NULL.
It leads to an Oops.

Fixes: f90251c8a6 ('hyperv: Increase the buffer length for netvsc_channel_cb()')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 17:29:22 -07:00
Haiyang Zhang
f90251c8a6 hyperv: Increase the buffer length for netvsc_channel_cb()
When the buffer is too small for a packet from VMBus, a bigger buffer will be
allocated in netvsc_channel_cb() and retry reading the packet from VMBus.
Increasing this buffer size will reduce the retry overhead.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-22 12:23:09 -07:00
KY Srinivasan
be136ed30a hyperv: Adjust the size of sendbuf region to support ws2008r2
WS2008R2 is a supported platform and it turns out that the maximum sendbuf
size that ws2008R2 can support is only 15MB. Make the necessary
adjustment.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-06 13:52:07 -07:00
KY Srinivasan
06b47aac49 Drivers: net-next: hyperv: Increase the size of the sendbuf region
Intel did some benchmarking on our network throughput when Linux on Hyper-V
is as used as a gateway. This fix gave us almost a 1 Gbps additional throughput
on about 5Gbps base throughput we hadi, prior to increasing the sendbuf size.
The sendbuf mechanism is a copy based transport that we have which is clearly
more optimal than the copy-free page flipping mechanism (for small packets).
In the forwarding scenario, we deal only with MTU sized packets,
and increasing the size of the senbuf area gave us the additional performance.
For what it is worth, Windows guests on Hyper-V, I am told use similar sendbuf
size as well.

The exact value of sendbuf I think is less important than the fact that it needs
to be larger than what Linux can allocate as physically contiguous memory.
Thus the change over to allocating via vmalloc().

We currently allocate 16MB receive buffer and we use vmalloc there for allocation.
Also the low level channel code has already been modified to deal with physically
dis-contiguous memory in the ringbuffer setup.

Based on experimentation Intel did, they say there was some improvement in throughput
as the sendbuf size was increased up to 16MB and there was no effect on throughput
beyond 16MB. Thus I have chosen 16MB here.

Increasing the sendbuf value makes a material difference in small packet handling

In this version of the patch, based on David's feedback, I have added
additional details in the commit log.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-04 15:07:14 -07:00
David S. Miller
f139c74a8d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-30 13:25:49 -07:00
Wei Yongjun
dd1d3f8f99 hyperv: Fix error return code in netvsc_init_buf()
Fix to return -ENOMEM from the kalloc error handling
case instead of 0.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-23 14:55:47 -07:00
Richard Weinberger
316158feff hyperv: Add netpoll support
In order to have at least a netconsole to debug kernel issues on
Windows Azure this patch implements netpoll support.
Sending packets is easy, netvsc_start_xmit() does already everything
needed.

Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-09 16:52:49 -07:00
Fabian Frederick
bd4578bc84 drivers/net/hyperv/netvsc.c: remove unnecessary null test before kfree
Fix checkpatch warning:
WARNING: kfree(NULL) is safe this check is probably not required

Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-02 18:22:25 -07:00
David S. Miller
9b8d90b963 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-06-25 22:40:43 -07:00
Haiyang Zhang
3a494e7103 hyperv: Add handler for RNDIS_STATUS_NETWORK_CHANGE event
The RNDIS_STATUS_NETWORK_CHANGE event is received after the Hyper-V host
sleep or hibernation. We refresh network at this time.
MS-TFS: 135162

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-19 21:17:37 -07:00
Dave Jones
2f18423d7e hyperv: fix apparent cut-n-paste error in send path teardown
c25aaf814a: "hyperv: Enable sendbuf mechanism on the send path" added
some teardown code that looks like it was copied from the recieve path
above, but missed a variable name replacement.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-16 21:36:13 -07:00
Haiyang Zhang
307f099520 hyperv: Add hash value into RNDIS Per-packet info
It passes the hash value as the RNDIS Per-packet info to the Hyper-V host,
so that the send completion notices can be spread across multiple channels.
MS-TFS: 140273

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-23 14:49:00 -04:00
Wilfried Klaebe
7ad24ea4bf net: get rid of SET_ETHTOOL_OPS
net: get rid of SET_ETHTOOL_OPS

Dave Miller mentioned he'd like to see SET_ETHTOOL_OPS gone.
This does that.

Mostly done via coccinelle script:
@@
struct ethtool_ops *ops;
struct net_device *dev;
@@
-       SET_ETHTOOL_OPS(dev, ops);
+       dev->ethtool_ops = ops;

Compile tested only, but I'd seriously wonder if this broke anything.

Suggested-by: Dave Miller <davem@davemloft.net>
Signed-off-by: Wilfried Klaebe <w-lkml@lebenslange-mailadresse.de>
Acked-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-13 17:43:20 -04:00
David S. Miller
5f013c9bc7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/altera/altera_sgdma.c
	net/netlink/af_netlink.c
	net/sched/cls_api.c
	net/sched/sch_api.c

The netlink conflict dealt with moving to netlink_capable() and
netlink_ns_capable() in the 'net' tree vs. supporting 'tc' operations
in non-init namespaces.  These were simple transformations from
netlink_capable to netlink_ns_capable.

The Altera driver conflict was simply code removal overlapping some
void pointer cast cleanups in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-12 13:19:14 -04:00
Haiyang Zhang
e565e803d4 Add support for netvsc build without CONFIG_SYSFS flag
This change ensures the driver can be built successfully without the
CONFIG_SYSFS flag.
MS-TFS: 182270

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-12 01:02:32 -04:00
KY Srinivasan
22041fb05b hyperv: Properly handle checksum offload
Do checksum offload only if the client of the driver wants checksum to be
offloaded.

In V1 version of this patch, I  addressed comments from
Stephen Hemminger <stephen@networkplumber.org> and
Eric Dumazet <eric.dumazet@gmail.com>.

In this version of the patch I have addressed comments from
David Miller.

This patch fixes a bug that is exposed in gateway scenarios.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-30 16:12:23 -04:00
KY Srinivasan
c25aaf814a hyperv: Enable sendbuf mechanism on the send path
We send packets using a copy-free mechanism (this is the Guest to Host transport
via VMBUS). While this is obviously optimal for large packets,
it may not be optimal for small packets. Hyper-V host supports
a second mechanism for sending packets that is "copy based". We implement that
mechanism in this patch.

In this version of the patch I have addressed a comment from David Miller.

With this patch (and all of the other offload and VRSS patches), we are now able
to almost saturate a 10G interface between Linux VMs on Hyper-V
on different hosts - close to  9 Gbps as measured via iperf.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-30 13:48:46 -04:00
Haiyang Zhang
893f662777 hyperv: Simplify the send_completion variables
The union contains only one member now, so we use the variables in it directly.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-23 14:48:39 -04:00
Haiyang Zhang
4baab26129 hyperv: Remove recv_pkt_list and lock
Removed recv_pkt_list and lock, and updated related code, so that
the locking overhead is reduced especially when multiple channels
are in use.

The recv_pkt_list isn't actually necessary because the packets are
processed sequentially in each channel. It has been replaced by a
local variable, and the related lock for this list is also removed.
The is_data_pkt field is not used in receive path, so its assignment
is cleaned up.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-23 14:48:39 -04:00
Haiyang Zhang
5b54dac856 hyperv: Add support for virtual Receive Side Scaling (vRSS)
This feature allows multiple channels to be used by each virtual NIC.
It is available on Hyper-V host 2012 R2.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-21 13:32:29 -04:00
KY Srinivasan
af9893a3dc Drivers: net: hyperv: Address UDP checksum issues
ws2008r2 does not support UDP checksum offload. Thus, we cannnot turn on
UDP offload in the host. Also, on ws2012 and ws2012 r2, there appear to be
an issue with UDP checksum offload.
Fix this issue by computing the UDP checksum in the Hyper-V driver.

Based on Dave Miller's comments, in this version, I have COWed the skb
before modifying the UDP header (the checksum field).

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-11 15:15:12 -04:00
KY Srinivasan
1f73db495a Drivers: net: hyperv: Negotiate suitable ndis version for offload support
Ws2008R2 supports ndis_version 6.1 and 6.1 is the minimal version required
for various offloads. Negotiate ndis_version 6.1 when on ws2008r2.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-11 15:15:12 -04:00
KY Srinivasan
4276372f0d Drivers: net: hyperv: Allocate memory for all possible per-pecket information
An outgoing packet can potentially need per-packet information for
all the offloads and VLAN tagging. Fix this issue.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-11 15:15:12 -04:00
David S. Miller
85dcce7a73 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/usb/r8152.c
	drivers/net/xen-netback/netback.c

Both the r8152 and netback conflicts were simple overlapping
changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-14 22:31:55 -04:00
Haiyang Zhang
99d3016de4 hyperv: Change the receive buffer size for legacy hosts
Due to a bug in the Hyper-V host verion 2008R2, we need to use a slightly smaller
receive buffer size, otherwise the buffer will not be accepted by the legacy hosts.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 16:11:26 -04:00
KY Srinivasan
77bf548794 Drivers: net: hyperv: Enable large send offload
Enable segmentation offload.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 15:51:37 -04:00
KY Srinivasan
08cd04bf6d Drivers: net: hyperv: Enable send side checksum offload
Enable send side checksum offload.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 15:51:37 -04:00
KY Srinivasan
e3d605ed44 Drivers: net: hyperv: Enable receive side IP checksum offload
Enable receive side checksum offload.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 15:51:37 -04:00
KY Srinivasan
4a0e70ae5e Drivers: net: hyperv: Enable offloads on the host
Prior to enabling guest side offloads, enable the offloads on the host.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 15:51:37 -04:00
KY Srinivasan
8a00251a36 Drivers: net: hyperv: Cleanup the send path
In preparation for enabling offloads, cleanup the send path.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 15:51:37 -04:00
KY Srinivasan
54a7357f7a Drivers: net: hyperv: Enable scatter gather I/O
Cleanup the code and enable scatter gather I/O.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 15:51:36 -04:00
Haiyang Zhang
1b07da516e hyperv: Move state setting for link query
It moves the state setting for query into rndis_filter_receive_response().
All callbacks including query-complete and status-callback are synchronized
by channel->inbound_lock. This prevents pentential race between them.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-05 20:40:25 -05:00
Haiyang Zhang
a1eabb0178 hyperv: Add latest NetVSP versions to auto negotiation
It auto negotiates the highest NetVSP version supported by both guest and host.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-19 19:52:45 -05:00
David S. Miller
1e8d6421cf Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/bonding/bond_3ad.h
	drivers/net/bonding/bond_main.c

Two minor conflicts in bonding, both of which were overlapping
changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-19 01:24:22 -05:00
KY Srinivasan
ee0c4c39c5 Drivers: net: hyperv: Cleanup the netvsc receive callback functio
Get rid of the buffer allocation in the receive path for normal packets.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-17 16:32:32 -05:00
KY Srinivasan
97c1723a61 Drivers: net: hyperv: Cleanup the receive path
Make the receive path a little more efficient by parameterizing the
required state rather than re-establishing that state.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-17 16:32:32 -05:00
KY Srinivasan
86eedacc63 Drivers: net: hyperv: Get rid of the rndis_filter_packet structure
This structure is redundant; get rid of it make the code little more efficient -
get rid of the unnecessary indirection.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-17 16:32:31 -05:00
Haiyang Zhang
891de74d69 hyperv: Fix the carrier status setting
Without this patch, the "cat /sys/class/net/ethN/operstate" shows
"unknown", and "ethtool ethN" shows "Link detected: yes", when VM
boots up with or without vNIC connected.

This patch fixed the problem.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-16 23:45:00 -05:00
Haiyang Zhang
b679ef73ed hyperv: Add support for physically discontinuous receive buffer
This will allow us to use bigger receive buffer, and prevent allocation failure
due to fragmented memory.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-27 16:40:45 -08:00
David S. Miller
56a4342dfe Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c
	net/ipv6/ip6_tunnel.c
	net/ipv6/ip6_vti.c

ipv6 tunnel statistic bug fixes conflicting with consolidation into
generic sw per-cpu net stats.

qlogic conflict between queue counting bug fix and the addition
of multiple MAC address support.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-06 17:37:45 -05:00
Haiyang Zhang
a68f961461 hyperv: Fix race between probe and open calls
Moving the register_netdev to the end of probe to prevent
possible open call happens before NetVSP is connected.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-21 22:23:06 -05:00
David S. Miller
143c905494 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/intel/i40e/i40e_main.c
	drivers/net/macvtap.c

Both minor merge hassles, simple overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-18 16:42:06 -05:00
Jason Wang
50dc875f2e netvsc: don't flush peers notifying work during setting mtu
There's a possible deadlock if we flush the peers notifying work during setting
mtu:

[   22.991149] ======================================================
[   22.991173] [ INFO: possible circular locking dependency detected ]
[   22.991198] 3.10.0-54.0.1.el7.x86_64.debug #1 Not tainted
[   22.991219] -------------------------------------------------------
[   22.991243] ip/974 is trying to acquire lock:
[   22.991261]  ((&(&net_device_ctx->dwork)->work)){+.+.+.}, at: [<ffffffff8108af95>] flush_work+0x5/0x2e0
[   22.991307]
but task is already holding lock:
[   22.991330]  (rtnl_mutex){+.+.+.}, at: [<ffffffff81539deb>] rtnetlink_rcv+0x1b/0x40
[   22.991367]
which lock already depends on the new lock.

[   22.991398]
the existing dependency chain (in reverse order) is:
[   22.991426]
-> #1 (rtnl_mutex){+.+.+.}:
[   22.991449]        [<ffffffff810dfdd9>] __lock_acquire+0xb19/0x1260
[   22.991477]        [<ffffffff810e0d12>] lock_acquire+0xa2/0x1f0
[   22.991501]        [<ffffffff81673659>] mutex_lock_nested+0x89/0x4f0
[   22.991529]        [<ffffffff815392b7>] rtnl_lock+0x17/0x20
[   22.991552]        [<ffffffff815230b2>] netdev_notify_peers+0x12/0x30
[   22.991579]        [<ffffffffa0340212>] netvsc_send_garp+0x22/0x30 [hv_netvsc]
[   22.991610]        [<ffffffff8108d251>] process_one_work+0x211/0x6e0
[   22.991637]        [<ffffffff8108d83b>] worker_thread+0x11b/0x3a0
[   22.991663]        [<ffffffff81095e5d>] kthread+0xed/0x100
[   22.991686]        [<ffffffff81681c6c>] ret_from_fork+0x7c/0xb0
[   22.991715]
-> #0 ((&(&net_device_ctx->dwork)->work)){+.+.+.}:
[   22.991715]        [<ffffffff810de817>] check_prevs_add+0x967/0x970
[   22.991715]        [<ffffffff810dfdd9>] __lock_acquire+0xb19/0x1260
[   22.991715]        [<ffffffff810e0d12>] lock_acquire+0xa2/0x1f0
[   22.991715]        [<ffffffff8108afde>] flush_work+0x4e/0x2e0
[   22.991715]        [<ffffffff8108e1b5>] __cancel_work_timer+0x95/0x130
[   22.991715]        [<ffffffff8108e303>] cancel_delayed_work_sync+0x13/0x20
[   22.991715]        [<ffffffffa03404e4>] netvsc_change_mtu+0x84/0x200 [hv_netvsc]
[   22.991715]        [<ffffffff815233d4>] dev_set_mtu+0x34/0x80
[   22.991715]        [<ffffffff8153bc2a>] do_setlink+0x23a/0xa00
[   22.991715]        [<ffffffff8153d054>] rtnl_newlink+0x394/0x5e0
[   22.991715]        [<ffffffff81539eac>] rtnetlink_rcv_msg+0x9c/0x260
[   22.991715]        [<ffffffff8155cdd9>] netlink_rcv_skb+0xa9/0xc0
[   22.991715]        [<ffffffff81539dfa>] rtnetlink_rcv+0x2a/0x40
[   22.991715]        [<ffffffff8155c41d>] netlink_unicast+0xdd/0x190
[   22.991715]        [<ffffffff8155c807>] netlink_sendmsg+0x337/0x750
[   22.991715]        [<ffffffff8150d219>] sock_sendmsg+0x99/0xd0
[   22.991715]        [<ffffffff8150d63e>] ___sys_sendmsg+0x39e/0x3b0
[   22.991715]        [<ffffffff8150eba2>] __sys_sendmsg+0x42/0x80
[   22.991715]        [<ffffffff8150ebf2>] SyS_sendmsg+0x12/0x20
[   22.991715]        [<ffffffff81681d19>] system_call_fastpath+0x16/0x1b

This is because we hold the rtnl_lock() before ndo_change_mtu() and try to flush
the work in netvsc_change_mtu(), in the mean time, netdev_notify_peers() may be
called from worker and also trying to hold the rtnl_lock. This will lead the
flush won't succeed forever. Solve this by not canceling and flushing the work,
this is safe because the transmission done by NETDEV_NOTIFY_PEERS was
synchronized with the netif_tx_disable() called by netvsc_change_mtu().

Reported-by: Yaju Cao <yacao@redhat.com>
Tested-by: Yaju Cao <yacao@redhat.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-17 14:45:42 -05:00