Commit graph

549675 commits

Author SHA1 Message Date
Andrej Ota
5f715c0979 via-rhine: fix VLAN receive handling regression.
Because eth_type_trans() consumes ethernet header worth of bytes, a call
to read TCI from end of packet using rhine_rx_vlan_tag() no longer works
as it's reading from an invalid offset.

Tested to be working on PCEngines Alix board.

Fixes: 810f19bcb8 ("via-rhine: add consistent memory barrier in vlan receive code.")
Signed-off-by: Andrej Ota <andrej@ota.si>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 00:55:30 -07:00
Tom Herbert
ac00737f4e bpf: Need to call bpf_prog_uncharge_memlock from bpf_prog_put
Currently, is only called from __prog_put_rcu in the bpf_prog_release
path. Need this to call this from bpf_prog_put also to get correct
accounting.

Fixes: aaac3ba95e ("bpf: charge user for creation of BPF maps and programs")
Signed-off-by: Tom Herbert <tom@herbertland.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 00:55:02 -07:00
David S. Miller
a302afe980 Merge branch 'robust_listener'
Eric Dumazet says:

====================
tcp/dccp: make our listener code more robust

This patch series addresses request sockets leaks and listener dismantle
phase. This survives a stress test with listeners being added/removed
quite randomly.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 00:52:27 -07:00
Eric Dumazet
ebb516af60 tcp/dccp: fix race at listener dismantle phase
Under stress, a close() on a listener can trigger the
WARN_ON(sk->sk_ack_backlog) in inet_csk_listen_stop()

We need to test if listener is still active before queueing
a child in inet_csk_reqsk_queue_add()

Create a common inet_child_forget() helper, and use it
from inet_csk_reqsk_queue_add() and inet_csk_listen_stop()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 00:52:19 -07:00
Eric Dumazet
f03f2e154f tcp/dccp: add inet_csk_reqsk_queue_drop_and_put() helper
Let's reduce the confusion about inet_csk_reqsk_queue_drop() :
In many cases we also need to release reference on request socket,
so add a helper to do this, reducing code size and complexity.

Fixes: 4bdc3d6614 ("tcp/dccp: fix behavior of stale SYN_RECV request sockets")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 00:52:18 -07:00
Eric Dumazet
ef84d8ce5a Revert "inet: fix double request socket freeing"
This reverts commit c69736696c.

At the time of above commit, tcp_req_err() and dccp_req_err()
were dead code, as SYN_RECV request sockets were not yet in ehash table.

Real bug was fixed later in a different commit.

We need to revert to not leak a refcount on request socket.

inet_csk_reqsk_queue_drop_and_put() will be added
in following commit to make clean inet_csk_reqsk_queue_drop()
does not release the reference owned by caller.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 00:52:17 -07:00
David S. Miller
7de88271da Merge branch 'ipv6-blackhole-route-fix'
Martin KaFai Lau says:

====================
ipv6: Initialize rt6_info properly in ip6_blackhole_route()

This patchset ensures the rt6_info's fields are initialized properly
in ip6_blackhole_route() where xfrm_policy is the primarily user.
The first patch is a prep work.  The second patch is the fix.  It
fixes d52d3997f8 ("ipv6: Create percpu rt6_info").

Here is the oops reported by Phil Sutter <phil@nwl.cc>:

BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0
IP: [<ffffffff8171a95e>] __ip6_datagram_connect+0x71e/0xa20
PGD c2cb1067 PUD c2d7a067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: cmac nfs lockd grace sunrpc bridge stp llc nvidia(PO) snd_usb_audio snd_usbmidi_lib iTCO_wdt
CPU: 1 PID: 2964 Comm: ping6 Tainted: P           O    4.2.1-aufs #10
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./4Core1333-Viiv, BIOS P1.60 07/01/2008
task: ffff8800ca62bc00 ti: ffff880129a14000 task.ti: ffff880129a14000
RIP: 0010:[<ffffffff8171a95e>]  [<ffffffff8171a95e>] __ip6_datagram_connect+0x71e/0xa20
RSP: 0018:ffff880129a17da8  EFLAGS: 00010296
RAX: 000000000000000b RBX: 0000000000000000 RCX: 0000000000000006
RDX: 0000000000000007 RSI: 0000000000000246 RDI: ffff88012fc8d5a0
RBP: ffff8800cb9a9048 R08: 756e207369207472 R09: 216c6c756e207369
R10: 0000000000000665 R11: 0000000000000006 R12: ffff8800cb9a8cf8
R13: ffff8800cb9a8cf8 R14: 0000000000000000 R15: ffff8800cb9a8cc0
FS:  00007fb76ad74700(0000) GS:ffff88012fc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000a0 CR3: 00000000c2dba000 CR4: 00000000000406e0
Stack:
 ffff8800cb9a9048 ffff8800cb9a8de0 ffff8800cb9feb70 ffffffff816b2c41
 00007fb70000000b ffffea0000df7200 ffff8800cb9f5cfc ffff8800cb9a8cc0
 03fffffffe068a20 ffff8800cb9a8cc0 ffffffff817097c0 0000000100000000
Call Trace:
 [<ffffffff816b2c41>] ? udp_lib_get_port+0x1a1/0x380
 [<ffffffff817097c0>] ? udpv6_rcv+0x20/0x20
 [<ffffffff8171ac82>] ? ip6_datagram_connect+0x22/0x40
 [<ffffffff8163ae9b>] ? SyS_connect+0x6b/0xb0
 [<ffffffff810767ac>] ? __do_page_fault+0x15c/0x380
 [<ffffffff8163a8d3>] ? SyS_socket+0x63/0xa0
 [<ffffffff81741957>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
Code: ba ae 00 00 00 48 c7 c6 7b 71 94 81 48 c7 c7 63 71 94 81 e8 6c 0f 02 00 48 85 db 75 0e 48 c7 c7 9f 71 94 81 31 c0 e8 59 0f 02 00 <48> 83 bb a0 00 00 00 00 75 0e 48 c7 c7 ae 71 94 81 31 c0 e8 41
RIP  [<ffffffff8171a95e>] __ip6_datagram_connect+0x71e/0xa20
 RSP <ffff880129a17da8>
CR2: 00000000000000a0
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 00:39:25 -07:00
Martin KaFai Lau
0a1f596200 ipv6: Initialize rt6_info properly in ip6_blackhole_route()
ip6_blackhole_route() does not initialize the newly allocated
rt6_info properly.  This patch:
1. Call rt6_info_init() to initialize rt6i_siblings and rt6i_uncached

2. The current rt->dst._metrics init code is incorrect:
   - 'rt->dst._metrics = ort->dst._metris' is not always safe
   - Not sure what dst_copy_metrics() is trying to do here
     considering ip6_rt_blackhole_cow_metrics() always returns
     NULL

   Fix:
   - Always do dst_copy_metrics()
   - Replace ip6_rt_blackhole_cow_metrics() with
     dst_cow_metrics_generic()

3. Mask out the RTF_PCPU bit from the newly allocated blackhole route.
   This bug triggers an oops (reported by Phil Sutter) in rt6_get_cookie().
   It is because RTF_PCPU is set while rt->dst.from is NULL.

Fixes: d52d3997f8 ("ipv6: Create percpu rt6_info")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Reported-by: Phil Sutter <phil@nwl.cc>
Tested-by: Phil Sutter <phil@nwl.cc>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Phil Sutter <phil@nwl.cc>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 00:39:16 -07:00
Martin KaFai Lau
ebfa45f0d9 ipv6: Move common init code for rt6_info to a new function rt6_info_init()
Introduce rt6_info_init() to do the common init work for
'struct rt6_info' (after calling dst_alloc).

It is a prep work to fix the rt6_info init logic in the
ip6_blackhole_route().

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Phil Sutter <phil@nwl.cc>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 00:39:14 -07:00
Jakub Pawlowski
5157b8a503 Bluetooth: Fix initializing conn_params in scan phase
This patch makes sure that conn_params that were created just for
explicit_connect, will get properly deleted during cleanup.

Signed-off-by: Jakub Pawlowski <jpawlowski@google.com>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-10-16 09:24:41 +02:00
Johan Hedberg
9ad3e6ffe1 Bluetooth: Fix conn_params list update in hci_connect_le_scan_cleanup
After clearing the params->explicit_connect variable the parameters
may need to be either added back to the right list or potentially left
absent from both the le_reports and the le_conns lists.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-10-16 09:24:41 +02:00
Johan Hedberg
679d2b6f9d Bluetooth: Fix remove_device behavior for explicit connects
Devices undergoing an explicit connect should not have their
conn_params struct removed by the mgmt Remove Device command. This
patch fixes the necessary checks in the command handler to correct the
behavior.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-10-16 09:24:41 +02:00
Johan Hedberg
49c509220d Bluetooth: Fix LE reconnection logic
We can't use hci_explicit_connect_lookup() since that would only cover
explicit connections, leaving normal reconnections completely
untouched. Not using it in turn means leaving out entries in
pend_le_reports.

To fix this and simplify the logic move conn params from the reports
list to the pend_le_conns list for the duration of an explicit
connect. Once the connect is complete move the params back to the
pend_le_reports list. This also means that the explicit connect lookup
function only needs to look into the pend_le_conns list.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-10-16 09:24:41 +02:00
Johan Hedberg
b958f9a3e8 Bluetooth: Fix reference counting for LE-scan based connections
The code should never directly call hci_conn_hash_del since many
cleanup & reference counting updates would be lost. Normally
hci_conn_del is the right thing to do, but in the case of a connection
doing LE scanning this could cause a deadlock due to doing a
cancel_delayed_work_sync() on the same work callback that we were
called from.

Connections in the LE scanning state actually need very little cleanup
- just a small subset of hci_conn_del. To solve the issue, refactor
out these essential pieces into a new hci_conn_cleanup() function and
call that from the two necessary places.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-10-16 09:24:41 +02:00
Jakub Pawlowski
168b8a25c0 Bluetooth: Fix double scan updates
When disable/enable scan command is issued twice, some controllers
will return an error for the second request, i.e. requests with this
command will fail on some controllers, and succeed on others.

This patch makes sure that unnecessary scan disable/enable commands
are not issued.

When adding device to the auto connect whitelist when there is pending
connect attempt, there is no need to update scan.

hci_connect_le_scan_cleanup is conditionally executing
hci_conn_params_del, that is calling hci_update_background_scan. Make
the other case also update scan, and remove reduntand call from
hci_connect_le_scan_remove.

When stopping interleaved discovery the state should be set to stopped
only when both LE scanning and discovery has stopped.

Signed-off-by: Jakub Pawlowski <jpawlowski@google.com>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-10-16 09:24:41 +02:00
Ivan Vecera
47ea032533 drivers/net: get rid of unnecessary initializations in .get_drvinfo()
Many drivers initialize uselessly n_priv_flags, n_stats, testinfo_len,
eedump_len & regdump_len fields in their .get_drvinfo() ethtool op.
It's not necessary as these fields is filled in ethtool_get_drvinfo().

v2: removed unused variable
v3: removed another unused variable

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-16 00:24:10 -07:00
Johannes Berg
a515de6607 cfg80211: reg: fix reg_ignore_cell_hint return type
The return type should be enum reg_request_treatment for both
branches of the #ifdef.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-10-16 09:15:45 +02:00
Johannes Berg
81e925747e cfg80211: reg: reduce chan_reg_rule_print_dbg() ifdef
The function is void and static, so just ifdef its contents
instead of duplicating the declaration.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-10-16 09:15:45 +02:00
Johannes Berg
9f50680292 cfg80211: reg: fix antenna gain in chan_reg_rule_print_dbg()
Printing "N/A mBi" is strange - print just "N/A" instead.

Also add a missing opening parenthesis.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-10-16 09:15:44 +02:00
Johannes Berg
d34265a3ee cfg80211: reg: centralize freeing ignored requests
Instead of having a lot of places that free ignored requests
and then return REG_REQ_OK, make reg_process_hint() process
REG_REQ_IGNORE by freeing the request, and let functions it
calls return that instead of freeing.

This also fixes a leak when a second (different) country IE
hint was ignored.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-10-16 09:15:44 +02:00
Johannes Berg
480908a7ec cfg80211: reg: clarify 'treatment' handling in reg_process_hint()
This function can only deal with treatment values OK and ALREADY_SET
so make the callees not return anything else and warn if they do.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-10-16 09:15:44 +02:00
Johannes Berg
fd453d3c53 cfg80211: reg: rename reg_regdb_query() to reg_query_builtin()
The new name better reflects the functionality.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-10-16 09:15:43 +02:00
Johannes Berg
b686303691 cfg80211: reg: make CRDA support optional
If there's a built-in regulatory database, there may be little point
in also calling out to CRDA and failing if the system is configured
that way. Allow removing CRDA support to save ~1K kernel size.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-10-16 09:15:39 +02:00
David S. Miller
ae23051820 Merge branch 'tipc-link-improvements'
Jon Maloy says:

====================
tipc: some link level code improvements

Extensive testing has revealed some weaknesses and non-optimal solutions
in the link level code.

This commit series addresses those issues.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:55:33 -07:00
Jon Paul Maloy
c819930090 tipc: update node FSM when peer RESET message is received
The change made in the previous commit revealed a small flaw in the way
the node FSM is updated. When the function tipc_node_link_down() is
called for the last link to a node, we should check whether this was
caused by a local reset or by a received RESET message from the peer.
In the latter case, we can directly issue a PEER_LOST_CONTACT_EVT to
the node FSM, so that it is ready to re-establish contact. If this is
not done, the peer node will sometimes have to go through a second
establish cycle before the link becomes stable.

We fix this in this commit by conditionally issuing the mentioned
event in the function tipc_node_link_down(). We also move LINK_RESET
FSM even away from the link_reset() function and into the caller
function, partially because it is easier to follow the code when state
changes are gathered at a limited number of locations, partially
because there will be cases in future commits where we don't want the
link to go RESET mode when link_reset() is called.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:55:23 -07:00
Jon Paul Maloy
282b3a0562 tipc: send out RESET immediately when link goes down
When a link is taken down because of a node local event, such as
disabling of a bearer or an interface, we currently leave it to the
peer node to discover the broken communication. The default time for
such failure discovery is 1.5-2 seconds.

If we instead allow the terminating link endpoint to send out a RESET
message at the moment it is reset, we can achieve the impression that
both endpoints are going down instantly. Since this is a very common
scenario, we find it worthwhile to make this small modification.

Apart from letting the link produce the said message, we also have to
ensure that the interface is able to transmit it before TIPC is
detached. We do this by performing the disabling of a bearer in three
steps:

1) Disable reception of TIPC packets from the interface in question.
2) Take down the links, while allowing them so send out a RESET message.
3) Disable transmission of TIPC packets on the interface.

Apart from this, we now have to react on the NETDEV_GOING_DOWN event,
instead of as currently the NEDEV_DOWN event, to ensure that such
transmission is possible during the teardown phase.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:55:22 -07:00
Jon Paul Maloy
73f646cec3 tipc: delay ESTABLISH state event when link is established
Link establishing, just like link teardown, is a non-atomic action, in
the sense that discovering that conditions are right to establish a link,
and the actual adding of the link to one of the node's send slots is done
in two different lock contexts. The link FSM is designed to help bridging
the gap between the two contexts in a safe manner.

We have now discovered a weakness in the implementaton of this FSM.
Because we directly let the link go from state LINK_ESTABLISHING to
state LINK_ESTABLISHED already in the first lock context, we are unable
to distinguish between a fully established link, i.e., a link that has
been added to its slot, and a link that has not yet reached the second
lock context. It may hence happen that a manual intervention, e.g., when
disabling an interface, causes the function tipc_node_link_down() to try
removing the link from the node slots, decrementing its active link
counter etc, although the link was never added there in the first place.

We solve this by delaying the actual state change until we reach the
second lock context, inside the function tipc_node_link_up(). This
makes it possible for potentail callers of __tipc_node_link_down() to
know if they should proceed or not, and the problem is solved.

Unforunately, the situation described above also has a second problem.
Since there by necessity is a tipc_node_link_up() call pending once
the node lock has been released, we must defuse that call by setting
the link back from LINK_ESTABLISHING to LINK_RESET state. This forces
us to make a slight modification to the link FSM, which will now look
as follows.

 +------------------------------------+
 |RESET_EVT                           |
 |                                    |
 |                             +--------------+
 |           +-----------------|   SYNCHING   |-----------------+
 |           |FAILURE_EVT      +--------------+   PEER_RESET_EVT|
 |           |                  A            |                  |
 |           |                  |            |                  |
 |           |                  |            |                  |
 |           |                  |SYNCH_      |SYNCH_            |
 |           |                  |BEGIN_EVT   |END_EVT           |
 |           |                  |            |                  |
 |           V                  |            V                  V
 |    +-------------+          +--------------+          +------------+
 |    |  RESETTING  |<---------|  ESTABLISHED |--------->| PEER_RESET |
 |    +-------------+ FAILURE_ +--------------+ PEER_    +------------+
 |           |        EVT        |    A         RESET_EVT       |
 |           |                   |    |                         |
 |           |  +----------------+    |                         |
 |  RESET_EVT|  |RESET_EVT            |                         |
 |           |  |                     |                         |
 |           |  |                     |ESTABLISH_EVT            |
 |           |  |  +-------------+    |                         |
 |           |  |  | RESET_EVT   |    |                         |
 |           |  |  |             |    |                         |
 |           V  V  V             |    |                         |
 |    +-------------+          +--------------+        RESET_EVT|
 +--->|    RESET    |--------->| ESTABLISHING |<----------------+
      +-------------+ PEER_    +--------------+
       |           A  RESET_EVT       |
       |           |                  |
       |           |                  |
       |FAILOVER_  |FAILOVER_         |FAILOVER_
       |BEGIN_EVT  |END_EVT           |BEGIN_EVT
       |           |                  |
       V           |                  |
      +-------------+                 |
      | FAILINGOVER |<----------------+
      +-------------+

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:55:21 -07:00
Jon Paul Maloy
8306f99a51 tipc: disallow packet duplicates in link deferred queue
After the previous commits, we are guaranteed that no packets
of type LINK_PROTOCOL or with illegal sequence numbers will be
attempted added to the link deferred queue. This makes it possible to
make some simplifications to the sorting algorithm in the function
tipc_skb_queue_sorted().

We also alter the function so that it will drop packets if one with
the same seqeunce number is already present in the queue. This is
necessary because we have identified weird packet sequences, involving
duplicate packets, where a legitimate in-sequence packet may advance to
the head of the queue without being detected and de-queued.

Finally, we make this function outline, since it will now be called only
in exceptional cases.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:55:21 -07:00
Jon Paul Maloy
81204c492b tipc: improve sequence number checking
The sequence number of an incoming packet is currently only checked
for less than, equality to, or bigger than the next expected number,
meaning that the receive window in practice becomes one half sequence
number cycle, or U16_MAX/2. This does not make sense, and may not even
be safe if there are extreme delays in the network. Any packet sent by
the peer during the ongoing cycle must belong inside his current send
window, or should otherwise be dropped if possible.

Since a link endpoint cannot know its peer's current send window, it
has to base this sanity check on a worst-case assumption, i.e., that
the peer is using a maximum sized window of 8191 packets. Using this
assumption, we now add a check that the sequence number is not bigger
than next_expected + TIPC_MAX_LINK_WIN. We also re-order the checks
done, so that the receive window test is performed before the gap test.
This way, we are guaranteed that no packet with illegal sequence numbers
are ever added to the deferred queue.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:55:20 -07:00
Jon Paul Maloy
f9aa358a81 tipc: simplify tipc_link_rcv() reception loop
Currently, all packets received in tipc_link_rcv() are unconditionally
added to the packet deferred queue, whereafter that queue is walked and
all its buffers evaluated for delivery. This is both non-optimal and
and makes the queue sorting function unnecessary complex.

This commit changes the loop so that an arrived packet is evaluated
first, and added to the deferred queue only when a sequence number gap
is discovered. A non-empty deferred queue is walked until it is empty
or until its head's sequence number doesn't fit.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:55:19 -07:00
Jon Paul Maloy
9945e8043e tipc: limit usage of temporary skb list during packet reception
During packet reception, the function tipc_link_rcv() adds its accepted
packets to a temporary buffer queue, before finally splicing this queue
into the lock protected input queue that will be delivered up to the
socket layer. The purpose is to reduce potential contention on the input
queue lock. However, since the vast majority of packets arrive in
sequence, they will anyway be added one by one to the input queue, and
the use of the temporary queue becomes a sub-optimization.

The only case where this queue makes sense is when unpacking buffers
from a bundle packet; here we want to avoid dozens of small buffers
to be added individually to the lock-protected input queue in a tight
loop.

In this commit, we remove the general usage of the temporary queue,
and keep it only for the packet unbundling case.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:55:18 -07:00
Insu Yun
175f8d6746 mlx4: corretly check failed allocation
When allocation fails, mlx4_alloc_cmd_mailbox returns -ENOMEM.
Since there is no case that mlx4_alloc_cmd_mailbox returns NULL,
it needs to be checked by IS_ERR, not IS_ERR_OR_NULL

Signed-off-by: Insu Yun <wuninsu@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:31:38 -07:00
Eric Dumazet
e87eb4051e bonding: support encapsulated ipv6 TSO
If using a sixtofour device on top of a bonding device,
skb segmentation of TCP traffic is done right before calling
bonding xmit, because bonding only enables TSO for IPv4.

This patch improves single flow performance by about 120 % on my hosts,
because segmentation is deferred right before calling slave xmit.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:29:28 -07:00
David S. Miller
181e4246b4 Merge branch 'mlxsw-cleanups'
Jiri Pirko says:

====================
mlxsw: Driver update, cleanups

This patchset contains various cleanups and improvements in mlxsw driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:28:03 -07:00
Ido Schimmel
5cd16d8c78 mlxsw: cmd: Update CONFIG_PROFILE command documentation
The meaning of certain parameters in the profile passed to the device
during initialization has changed, so update their documentation
accordingly.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:57 -07:00
Ido Schimmel
801bd3defb mlxsw: Add trap group for control packets
Previously, we trapped flooded and control packets using the same trap
group. This can cause flooded packets to overflow the PCI bus and
prevent control packets (e.g. STP, LACP) from getting to the CPU.

Solve this by splitting the RX trap group to RX and control, which allows
us to configure a policer on the first, thereby preventing it from
overflowing the PCI bus.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:56 -07:00
Ido Schimmel
f24af33015 mlxsw: Simplify traps creation
The Host Trap Group Table (HTGT) register configures trap groups, which
are populated with trap IDs using the Host PacKet Trap (HPKT) register.
However, a trap ID can only be present inside one trap group (the last
configured).

Instead of passing both the trap group and ID for the function that
packs HPKT, pass only the trap ID and derive from it the trap group.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:55 -07:00
Jiri Pirko
ebb7963f9b mlxsw: Introduce mlxsw_reg_spms_vid_pack helper and use it
Introduce separate helper for packing SPMS VIDs, as it can be used for
multiple VIDs and not only for one as previous SPMS pack function
provided.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:55 -07:00
Ido Schimmel
fa6ad058bc mlxsw: reg: Adjust definition of enum mlxsw_reg_sfgc_type
Define max which would be needed later on.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:54 -07:00
Jiri Pirko
36b78e8aba mlxsw: reg: Remove extra space in SFGC ID define
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:53 -07:00
Jiri Pirko
3f0effd16b mlxsw: reg: Uppercase letters in register IDs
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:52 -07:00
Jiri Pirko
6cf9dc8b77 mlxsw: Use dev_level_ratelimited instead of net_ratelimit & dev_level
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:51 -07:00
Jiri Pirko
18ea54454e mlxsw: core: Do not use EMADs in mlxsw_emad_fini
Be symmetric with mlxsw_emad_init and don't use EMADs in mlxsw_emad_fini
cleanup function. Use command interface instead.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:51 -07:00
Jiri Pirko
3e2206da73 mlxsw: pci: Limit number of entries being sent in single MAP_FA cmd
Firmware accepts only limited number of mapping entries for MAP_FA
command. In order to prevent overflow, introduce a limit and in case the
number of entries is bigger, call MAP_FA multiple times.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:49 -07:00
Jiri Pirko
c85c3882ad mlxsw: pci: Remove MLXSW_PCI_RDQS/SDQS defines and checks
Remove strict number check of queues count as various ASICs have
different counts.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:48 -07:00
Jiri Pirko
424e1114af mlxsw: pci: Do not use MLXSW_PCI_SDQS_COUNT define
Use mlxsw_pci_sdq_count helper instead.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:48 -07:00
Jiri Pirko
e4c870b1b4 mlxsw: pci: Use MLXSW_PCI_CQS_MAX instead of MLXSW_PCI_CQS_COUNT
The count of CQs can be different for various ASICs, so just define
maximal value and check for that.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:47 -07:00
Jiri Pirko
ffe053285b mlxsw: switchx2: Use ETH_ALEN for mac address length
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:45 -07:00
Ido Schimmel
33a704a59b mlxsw: Remove multicast ID configuration
With respect to a firmware change, the Switch Multicast ID (SMID)
register is no longer needed, so the related configuration code can be
removed.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-15 23:27:45 -07:00
Arnd Bergmann
d549f545e6 drm/virtio: use %llu format string form atomic64_t
The virtgpu driver prints the last_seq variable using the %ld or
%lu format string, which does not work correctly on all architectures
and causes this compiler warning on ARM:

drivers/gpu/drm/virtio/virtgpu_fence.c: In function 'virtio_timeline_value_str':
drivers/gpu/drm/virtio/virtgpu_fence.c:64:22: warning: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'long long int' [-Wformat=]
  snprintf(str, size, "%lu", atomic64_read(&fence->drv->last_seq));
                      ^
drivers/gpu/drm/virtio/virtgpu_debugfs.c: In function 'virtio_gpu_debugfs_irq_info':
drivers/gpu/drm/virtio/virtgpu_debugfs.c:37:16: warning: format '%ld' expects argument of type 'long int', but argument 3 has type 'long long int' [-Wformat=]
  seq_printf(m, "fence %ld %lld\n",
                ^

In order to avoid the warnings, this changes the format strings to %llu
and adds a cast to u64, which makes it work the same way everywhere.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-10-16 11:36:36 +10:00