Commit graph

198242 commits

Author SHA1 Message Date
Eric Dumazet
ab6e3feba1 net: No dst refcounting in ip_queue_xmit()
TCP outgoing packets can avoid two atomic ops, and dirtying
of previously higly contended cache line using new refdst
infrastructure.

Note 1: loopback device excluded because of !IFF_XMIT_DST_RELEASE
Note 2: UDP packets dsts are built before ip_queue_xmit().

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-17 17:18:52 -07:00
Eric Dumazet
4a94445c9a net: Use ip_route_input_noref() in input path
Use ip_route_input_noref() in ip fast path, to avoid two atomic ops per
incoming packet.

Note: loopback is excluded from this optimization in ip_rcv_finish()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-17 17:18:51 -07:00
Eric Dumazet
407eadd996 net: implements ip_route_input_noref()
ip_route_input() is the version returning a refcounted dst, while
ip_route_input_noref() returns a non refcounted one.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-17 17:18:51 -07:00
Eric Dumazet
7fee226ad2 net: add a noref bit on skb dst
Use low order bit of skb->_skb_dst to tell dst is not refcounted.

Change _skb_dst to _skb_refdst to make sure all uses are catched.

skb_dst() returns the dst, regardless of noref bit set or not, but
with a lockdep check to make sure a noref dst is not given if current
user is not rcu protected.

New skb_dst_set_noref() helper to set an notrefcounted dst on a skb.
(with lockdep check)

skb_dst_drop() drops a reference only if skb dst was refcounted.

skb_dst_force() helper is used to force a refcount on dst, when skb
is queued and not anymore RCU protected.

Use skb_dst_force() in __sk_add_backlog(), __dev_xmit_skb() if
!IFF_XMIT_DST_RELEASE or skb enqueued on qdisc queue, in
sock_queue_rcv_skb(), in __nf_queue().

Use skb_dst_force() in dev_requeue_skb().

Note: dst_use_noref() still dirties dst, we might transform it
later to do one dirtying per jiffies.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-17 17:18:50 -07:00
Eric Dumazet
ebda37c27d rps: avoid one atomic in enqueue_to_backlog
If CONFIG_SMP=y, then we own a queue spinlock, we can avoid the atomic
test_and_set_bit() from napi_schedule_prep().

We now have same number of atomic ops per netif_rx() calls than with
pre-RPS kernel.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-17 17:18:50 -07:00
Neil Jones
3f78d1f210 drivers/net/usb/asix.c: Fix unaligned accesses
Using this driver can cause unaligned accesses in the IP layer
This has been fixed by aligning the skb data correctly using the
spare room left over by the 4 byte header inserted between packets
by the device.

Signed-off-by: Neil Jones <NeilJay@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-17 17:18:28 -07:00
Brian King
e7a3af5d8c ibmveth: Add suspend/resume support
Adds support for resuming from suspend for IBM virtual ethernet devices.
We may have lost an interrupt over the suspend, so we just kick the
interrupt handler to process anything that is outstanding.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-17 17:09:10 -07:00
Florian Westphal
0771275b25 ipv6 addrlabel: permit deletion of labels assigned to removed dev
as addrlabels with an interface index are left alone when the
interface gets removed this results in addrlabels that can no
longer be removed.

Restrict validation of index to adding new addrlabels.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-17 17:08:08 -07:00
Julia Lawall
2db4e42eac drivers/block/drbd: Use kzalloc
Use kzalloc rather than the combination of kmalloc and memset.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression x,size,flags;
statement S;
@@

-x = kmalloc(size,flags);
+x = kzalloc(size,flags);
 if (x == NULL) S
-memset(x, 0, size);
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 02:04:10 +02:00
Philipp Reisner
0c3f34516e drbd: Create new current UUID as late as possible
The choice was to either delay creation of the new UUID until
IO got thawed or to delay it until the first IO request.

Both are correct, the later is more friendly to users of
dual-primary setups, that actually only write on one side.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 02:03:49 +02:00
Philipp Reisner
9a25a04c80 drbd: If we detect late that IO got frozen, retry after we thawed.
If we detect late (= after grabing mdev->req_lock) that IO got frozen, we
return 1 to generic_make_request(), which simply will retry to make a
request for that bio.

In the subsequent call of generic_make_request() into drbd_make_request_26()
we sleep in inc_ap_bio().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 02:03:32 +02:00
Lars Ellenberg
a1c88d0d7a drbd: always use_bmbv, ignore setting
Now that the peer may handle multi-bio EEs,
we can ignore the peer's limit,
and concentrate on the limits of the local IO stack.

This is safe accross drbd protocol versions,
as our queue_max_sectors() will be adjusted accordingly.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 02:03:05 +02:00
Lars Ellenberg
bb3d000cb9 drbd: allow resync requests to be larger than max_segment_size
this should allow for better background resync performance.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 02:02:36 +02:00
Lars Ellenberg
45bb912bd5 drbd: Allow drbd_epoch_entries to use multiple bios.
This should allow for better performance if the lower level IO stack
of the peers differs in limits exposed either via the queue,
or via some merge_bvec_fn.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 02:01:23 +02:00
Christoph Fritz
59497bba59 Staging: wlan-ng prism2usb: add suspend/resume
There is no need trying to load the (even in most cases) not availible
firmware after suspend. This saves about 30 secounds on reset waiting
for timeout.

Signed-off-by: Christoph Fritz <chf.fritz@googlemail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-17 16:33:28 -07:00
Hank Janssen
9153f7b997 staging: hv: Added heartbeat functionality to hv_utils
Add heartbeat functionality to hv_utils/Hyper-V

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Hank Janssen <hjanssen@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-17 16:32:30 -07:00
Julia Lawall
94002c07ff Staging: Use kmemdup
Use kmemdup when some other buffer is immediately copied into the
allocated region.

A simplified version of the semantic patch that makes this change is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression from,to,size,flag;
statement S;
@@

-  to = \(kmalloc\|kzalloc\)(size,flag);
+  to = kmemdup(from,size,flag);
   if (to==NULL || ...) S
-  memcpy(to, from, size);
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-17 16:31:15 -07:00
Dan Williams
0b28330e39 Merge branch 'ioat' into dmaengine 2010-05-17 16:30:58 -07:00
Linus Walleij
058276303d DMAENGINE: extend the control command to include an arg
This adds an argument to the DMAengine control function, so that
we can later provide control commands that need some external data
passed in through an argument akin to the ioctl() operation
prototype.

[dan.j.williams@intel.com: fix up some missed conversions]
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-05-17 16:30:42 -07:00
Lars Ellenberg
708d740ed8 drbd: reduce sizeof struct drbd_epoch_entry by 8 byte by aligning members
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:28:35 +02:00
Philipp Reisner
162f3ec7f0 drbd: Fixes to the new delay_probes code
* Only send delay_probes with protocol 93 or newer
* drbd_send_delay_probes() is called only from worker context,
  no atomic_t needed for delay_seq

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:28:08 +02:00
Charles Clément
96fe9ee2c2 Staging: vt6655: use ETH_HLEN macro instead of custom one
Replaced custom header length definition U_HEADER_LEN by ETH_HLEN
from <linux/if_ether.h>. Also remove unused U_TYPE_LEN.

Signed-off-by: Charles Clément <caratorn@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-17 16:28:00 -07:00
Philipp Reisner
a8cdfd8d3b drbd: A fixes to the new resync speed code
* Mention P_DELAY_PROBE in the packet naming array
* Do not corrupt the mdev->data.work list in case the timer goes
  off before delay_probe_work got handled by the worker
* Do not mod_timer() twice for a single delay_probe pair

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:26:51 +02:00
Charles Clément
078b078f66 Staging: vt6655: use ETH_ALEN macro instead of custom one
Replaced custom ethernet address length definition U_ETHER_ADDR_LEN by
ETH_ALEN from <linux/if_ether.h>.

Signed-off-by: Charles Clément <caratorn@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-17 16:26:42 -07:00
Philipp Reisner
eedf386ae9 drbd: Proc bits of new resync speed stuff
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:26:27 +02:00
Philipp Reisner
cdd67a7460 drbd: Control the actual resync rate based on the queuing delay of data packets
In a setup with a high bandwidth and high latency network, eventually
involving deep queues in routers, it is beneficial to only fill those
queues up to an limited extend with resync data.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:25:47 +02:00
Philipp Reisner
bd26bfc5b4 drbd: Actually send delay probes
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:25:28 +02:00
Philipp Reisner
67c7ddd055 drbd: Four new configuration settings for resync speed control
To reasonably control resync speed over drbd-proxy connections,
drbd has to measure the current delay of packets transmitted over
the (possibly congested) data socket vs the meta-data socket.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:25:00 +02:00
Dan Williams
caa20d974c async_tx: trim dma_async_tx_descriptor in 'no channel switch' case
Saves 24 bytes per descriptor (64-bit) when the channel-switching
capabilities of async_tx are not required.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-05-17 16:24:16 -07:00
Philipp Reisner
7237bc430f drbd: Sending of delay_probes
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:22:46 +02:00
Philipp Reisner
0ced55a3be drbd: Receiving of delay_probes
Delay_probes are new packets in the DRBD protocol, which allow
DRBD to know the current delay packets have on the data socket.
(relative to the meta data socket)

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:22:11 +02:00
Philipp Reisner
5223671bb0 drbd: Fixed bitmap in case of online-grow without resync
The "surplus" bits of the old (smaller) bitmap must be clean
in case of online-grow without resync.

Note: Reverted 67ae8b80d4a116ab3b7094eb3723506b20c06dff as
well, since the lines added by this patch are redundant. The
bits get set by the bm_set_surplus(b) call before that.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:20:33 +02:00
Philipp Reisner
6b4388ac1f drbd: Added transmission faults to the fault injection code
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:19:51 +02:00
Philipp Reisner
087c24925c drbd: bugfix: Make resize work, if remote's size was limiting and increased in the meantime
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:18:22 +02:00
Philipp Reisner
6495d2c6d0 drbd: Implemented the --assume-clean option for drbdsetup resize
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:17:47 +02:00
Philipp Reisner
b4ee79dac3 drbd: Added some missing statics
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:17:11 +02:00
Philipp Reisner
fd76438c24 drbd: Make sure to resync all of the new storage upon online resize
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:16:20 +02:00
Philipp Reisner
e89b591c3a drbd: Implemented flags for the resize packet
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:15:44 +02:00
Philipp Reisner
02d9a94bbb drbd: Implemented the set_new_bits parameter for drbd_bm_resize()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:14:43 +02:00
Philipp Reisner
d845030f21 drbd: made determin_dev_size's parameter an flag enum
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:14:04 +02:00
Adam Gandelman
3a11a48789 drbd: New handler: initial-split-brain
Some wish to be notified of all instances of split brain, not just those that
go unresolved.  The initial-split-brain handler is called to notify someone
upon  detection of all split brain conditions even if auto-recovery policies
are configured.

Signed-off-by: Adam Gandelman <adam.gandelman@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:13:33 +02:00
Lars Ellenberg
979f5c7f1f drbd: fail_requests_early: remove incorrect and unnecessary optimization
The condition does not fit the commend (I may well be Primary,
even if I lost the disk earlier and now the connection).

And this is catched below anyways, where it also gets logged.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:10:31 +02:00
Lars Ellenberg
6666032ade drbd: check for corrupt or malicous sector addresses when receiving data
Even if it should never happen if the peer does behave, we need to
double check, and not even attempt access beyond end of device.
It usually would be caught by lower layers, resulting in "IO error",
but may also end up in the internal meta data area.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:09:57 +02:00
Philipp Reisner
c3fe30b0e7 drbd: cleanup: This code path to trigger a resync is no longer needed
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:09:13 +02:00
Lars Ellenberg
8d4ce82b3c drbd: don't start a resync without access to up-to-date Data
In case both nodes are "inconsistent", invalidate would
have started a resync anyways, without a chance to ever
succeed, just filling the logs with warning messages.

Simply disallow that state change,
re-using the SS_NO_UP_TO_DATE_DISK return value.

This also changes the corresponding error string to
"Need access to UpToDate Data" -- I found the
"Refusing to be Primary without at least one UpToDate disk"
answer misleading in some situations anyways.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:08:18 +02:00
Lars Ellenberg
c3470cde57 drbd: fix potential protocol error
Don't forget to drain the digest in case we cannot satisfy a
checksum based resync or online-verify request.

It would additionally cause a protocoll error,
dropping the connection.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:07:38 +02:00
Lars Ellenberg
8d1894ebe4 drbd: remove bogus ASSERT
block_id may be ID_SYNCER,
as well as checksum based resync request magic, or online verify magic.

Let's just drop that ASSERT.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:06:59 +02:00
Lars Ellenberg
e0f83012dc drbd: fix regression: attach while connected failed
commit e4f925e12e
Author: Philipp Reisner <philipp.reisner@linbit.com>
Date:   Wed Mar 17 14:18:41 2010 +0100

    drbd: Do not upgrade state to Outdated if already Inconsistent

prevented the necessary state transition for attaching while connected
(Diskless -> Consistent respectively Outdated).
This is the fix for the fix.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:06:07 +02:00
Philipp Reisner
e4f925e12e drbd: Do not upgrade state to Outdated if already Inconsistent [Bugz 277]
There was a race condition:
  In a situation with a SyncSource+Primary and a SyncTarget+Secondary node,
  and a resync dependency to some other device. After both nodes decided
  to do the resync, the other device finishes its resync process.
  At that time SyncSource already sent the P_SYNC_UUID packet, and
  already updated its peer disk state to Inconsistent.
  The SyncTarget node waits for the P_SYNC_UUID and sends a state packet
  to report the resync dependency change. That packet still carries
  a disk state of Outdated.

Impact:
  If application writes come in, during that time on the Primary node,
  those do not get replicated, and the out-of-sync counter gets increased.
  => The completion of resync is not detected on the primary node.
  => stalled.
  Those blocks get resync'ed with the next resync, since the are get
  marked as out-of-sync in the bitmap.

In order to fix this, we filter out that wrong state change in the
sanitize_state() function.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:01:05 +02:00
Lars Ellenberg
8c484ee491 drbd: use proc_create_data with explicit NULL argument
To document that we know about deprecation of proc_create,
even though we are not affected, as we don't use the ->data member,
open code proc_create_data(..., NULL);

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 00:59:00 +02:00