Commit graph

310,704 commits

Author SHA1 Message Date
Thomas Abraham
95dcc2cb6c mmc: dw_mmc: make multiple instances of dw_mci_card_workqueue
The variable 'dw_mci_card_workqueue' is a global variable shared between
multiple instances of the dw_mmc host controller. Due to this, data
corruption has been noticed when multiple instances of dw_mmc controllers
are actively reading/writing the media. Fix this by adding a instance
of 'struct workqueue_struct' for each host instance and removing the
global 'dw_mci_card_workqueue' instance.

Signed-off-by: Thomas Abraham <thomas.abraham@linaro.org>
Acked-by: Jaehoon Chung <jh80.chung@samsung.com>
Acked-by: Will Newton <will.newton@imgtec.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2012-05-09 10:14:10 -04:00
Venkatraman S
b41b6f1d1c mmc: queue: remove redundant memsets
Not needed to memset, as they are pointers and are assigned
to proper values in the next line anyway.

Signed-off-by: Venkatraman S <svenkatr@ti.com>
Reviewed-by: Namjae Jeon <linkinjeon@gmail.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2012-05-09 10:10:46 -04:00
Olof Johansson
9e69ab4116 mfd: Fix of_match_node() da9052 arguments
The driver calls of_match_node() with the arguments swapped.

Signed-off-by: Olof Johansson <olof@lixom.net>
Tested-by: Ying-Chun Liu <paul.liu@linaro.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2012-05-09 16:09:32 +02:00
Venkatraman S
1b50f5f392 mmc: queue: rename mmc_request function
The name mmc_request is used for both the issue function
and a data structure, which creates conflicts in symbol lookups
in editors. Rename the function to mmc_request_fn.

Signed-off-by: Venkatraman S <svenkatr@ti.com>
Reviewed-by: Namjae Jeon <linkinjeon@gmail.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2012-05-09 10:08:54 -04:00
Subhash Jadavani
3d93576e34 mmc: core: skip card initialization if power class selection fails
With current implementation of power class selection,
mmc_select_powerclass() should never fail. So treat any error
returned by this function as serious enough to skip the card
initialization.

Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Reviewed-by: Namjae Jeon <linkinjeon@gmail.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2012-05-09 10:07:12 -04:00
Seungwon Jeon
10942aa40a mmc: core: fix the signaling 1.8V for HS200
Currently only 1.2V is treated for HS200 mode. If the host has only
1.8V I/O capability not 1.2V, mmc_set_signal_voltage can't be called
for 1.8V HS200. EXT_CSD_CARD_TYPE_SDR_1_8V needs to be considered.

Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
2012-05-09 10:04:12 -04:00
Seungwon Jeon
96cf5f02ae mmc: core: fix the decision of HS200/DDR card-type
Current implementation decides the card type exclusively. Even though
eMMC device can support both HS200 and DDR mode, card type will be
set only for HS200. If the host doesn't support HS200 but has DDR
capability, then DDR mode can't be selected.

Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
2012-05-09 10:03:21 -04:00
Seth Heasley
8ee3c2a79f lpc_sch: Add Intel Centerton Multifunction Device support
This patch adds the Intel Centerton processor DeviceID for the
Integrated Legacy Block (ILB).
The ILB provides GPIO, SMBus, and Watchdog functionality.

Signed-off-by: Seth Heasley <seth.heasley@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2012-05-09 15:58:21 +02:00
Seth Heasley
1fc9b1eade pci_ids: Add Intel Centerton Legacy Block DeviceID
This patch adds the Integrated Legacy Block DeviceID for the Centerton CPU.  It will be used in the GPIO and Multifunction Devices driver.

Signed-off-by: Seth Heasley <seth.heasley@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2012-05-09 15:57:01 +02:00
Alessandro Rubini
7b0d44f3b7 gpio: Add STA2X11 GPIO block
This introduces 128 gpio bits (for each PCI device installed) with
working interrupt support.

Signed-off-by: Alessandro Rubini <rubini@gnudd.com>
Acked-by: Giancarlo Asnaghi <giancarlo.asnaghi@st.com>
Cc: Alan Cox <alan@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2012-05-09 15:38:39 +02:00
Alessandro Rubini
35bdd29095 mfd: Add driver for STA2X11 MFD block
This also introduces <asm/sta2x11.h> to export a function that is in
the base sta2x11 support patches. The header will increase with other
prototypes and constants over time.

Signed-off-by: Alessandro Rubini <rubini@gnudd.com>
Acked-by: Giancarlo Asnaghi <giancarlo.asnaghi@st.com>
Cc: Alan Cox <alan@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2012-05-09 15:34:28 +02:00
Peter Zijlstra
cb04ff9ac4 sched, perf: Use a single callback into the scheduler
We can easily use a single callback for both sched-in and sched-out. This
reduces the code footprint in the scheduler path as well as removes
the PMU black spot otherwise present between the out and in callback.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-o56ajxp1edwqg6x9d31wb805@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-09 15:23:17 +02:00
Robert Richter
8b1e13638d perf/x86-ibs: Fix usage of IBS op current count
The value of IbsOpCurCnt rolls over when it reaches IbsOpMaxCnt. Thus,
it is reset to zero by hardware. To get the correct count we need to
add the max count to it in case we received an ibs sample (valid bit
set).

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1333390758-10893-13-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-09 15:23:17 +02:00
Robert Richter
fc5fb2b5e1 perf/x86-ibs: Catch spurious interrupts after stopping IBS
After disabling IBS there could be still incomming NMIs with samples
that even have the valid bit cleared. Mark all this NMIs as handled to
avoid spurious interrupt messages.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1333390758-10893-12-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-09 15:23:16 +02:00
Robert Richter
c9574fe0bd perf/x86-ibs: Implement workaround for IBS erratum #420
When disabling ibs there might be the case where hardware continuously
generates interrupts. This is described in erratum #420 (Instruction-
Based Sampling Engine May Generate Interrupt that Cannot Be Cleared).
To avoid this we must clear the counter mask first and then clear the
enable bit. This patch implements this.

See Revision Guide for AMD Family 10h Processors, Publication #41322.

Note: We now keep track of the last read ibs config value which is
then used to disable ibs. To update the config value we pass now a
pointer to the functions reading it.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1333390758-10893-11-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-09 15:23:16 +02:00
Robert Richter
7caaf4d824 perf/x86-ibs: Extend hw period that triggers overflow
If the last hw period is too short we might hit the irq handler which
biases the results. Thus try to have a max last period that triggers
the sw overflow.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1333390758-10893-10-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-09 15:23:15 +02:00
Robert Richter
fc006cf7cc perf/x86-ibs: Trigger overflow if remaining period is too small
There are cases where the remaining period is smaller than the minimal
possible value. In this case the counter is restarted with the minimal
period. This is of no use as the interrupt handler will trigger
immediately again and most likely hits itself. This biases the
results.

So, if the remaining period is within the min range, we better do not
restart the counter and instead trigger the overflow.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1333390758-10893-9-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-09 15:23:15 +02:00
Robert Richter
98112d2e95 perf/x86-ibs: Rename some variables
Simple patch that just renames some variables for better
understanding.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1333390758-10893-8-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-09 15:23:14 +02:00
Robert Richter
450bbd493d perf/x86-ibs: Precise event sampling with IBS for AMD CPUs
This patch adds support for precise event sampling with IBS. There are
two counting modes to count either cycles or micro-ops. If the
corresponding performance counter events (hw events) are setup with
the precise flag set, the request is redirected to the ibs pmu:

 perf record -a -e cpu-cycles:p ...    # use ibs op counting cycle count
 perf record -a -e r076:p ...          # same as -e cpu-cycles:p
 perf record -a -e r0C1:p ...          # use ibs op counting micro-ops

Each ibs sample contains a linear address that points to the
instruction that was causing the sample to trigger. With ibs we have
skid 0. Thus, ibs supports precise levels 1 and 2. Samples are marked
with the PERF_EFLAGS_EXACT flag set. In rare cases the rip is invalid
when IBS was not able to record the rip correctly. Then the
PERF_EFLAGS_EXACT flag is cleared and the rip is taken from pt_regs.

V2:
* don't drop samples in precise level 2 if rip is invalid, instead
  support the PERF_EFLAGS_EXACT flag

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120502103309.GP18810@erda.amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-09 15:23:14 +02:00
Robert Richter
d47e8238cd perf/x86-ibs: Take instruction pointer from ibs sample
Each IBS sample contains a linear address of the instruction that
caused the sample to trigger. This address is more precise than the
rip that was taken from the interrupt handler's stack. Update the rip
with that address. We use this in the next patch to implement
precise-event sampling on AMD systems using IBS.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1333390758-10893-6-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-09 15:23:13 +02:00
Robert Richter
6accb9cf76 perf/x86-ibs: Fix frequency profiling
Fixing profiling at a fixed frequency, in this case the freq value and
sample period was setup incorrectly. Since sampling periods are
adjusted we also allow periods that have lower 4 bits set.

Another fix is the setup of the hw counter: If we modify
hwc->sample_period, we also need to update hwc->last_period and
hwc->period_left.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1333390758-10893-5-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-09 15:23:13 +02:00
Robert Richter
7bf352384f perf/x86-ibs: Enable ibs op micro-ops counting mode
Allow enabling ibs op micro-ops counting mode.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1333390758-10893-4-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-09 15:23:12 +02:00
Robert Richter
fd0d000b2c perf: Pass last sampling period to perf_sample_data_init()
We always need to pass the last sample period to
perf_sample_data_init(), otherwise the event distribution will be
wrong. Thus, modifiyng the function interface with the required period
as argument. So basically a pattern like this:

        perf_sample_data_init(&data, ~0ULL);
        data.period = event->hw.last_period;

will now be like that:

        perf_sample_data_init(&data, ~0ULL, event->hw.last_period);

Avoids unininitialized data.period and simplifies code.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1333390758-10893-3-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-09 15:23:12 +02:00
Robert Richter
c75841a398 perf/x86-ibs: Fix update of period
The last sw period was not correctly updated on overflow and thus led
to wrong distribution of events. We always need to properly initialize
data.period in struct perf_sample_data.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1333390758-10893-2-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-09 15:23:11 +02:00
Ingo Molnar
ad8537cda6 Merge branch 'perf/x86-ibs' into perf/core 2012-05-09 15:22:23 +02:00
Lars Ellenberg
9476f39d66 drbd: introduce a bio_set to allocate housekeeping bios from
Don't rely on availability of bios from the global fs_bio_set,
we should use our own bio_set for meta data IO.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:17:07 +02:00
Lars Ellenberg
3c2f7a856f drbd: remove unused define
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:17:06 +02:00
Arne Redlich
0c7db27920 drbd: bm_page_async_io: properly initialize page->private
If bm_page_async_io is advised to use a new page for I/O
(BM_AIO_COPY_PAGES is set), it will get it from a mempool.
Once the mempool has to dip into its reserves the page is
not reinitialized, i.e. page->private contains garbage, which
will lead to various problems once the I/O completes (dereferences
of NULL pointers, the submitting thread getting stuck in D-state,
 ...).

Signed-off-by: Arne Redlich <arne.redlich@googlemail.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2012-05-09 15:17:04 +02:00
Lars Ellenberg
4d95a10f97 drbd: use the newly introduced page pool for bitmap IO
Conflicts:

	drbd/drbd_bitmap.c

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:17:03 +02:00
Lars Ellenberg
4281808fb3 drbd: add page pool to be used for meta data IO
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:17:02 +02:00
Lars Ellenberg
0e8488ade2 drbd: allow bitmap to change during writeout from resync_finished
Symptom: messages similar to
 "FIXME asender in bm_change_bits_to,
  bitmap locked for 'write from resync_finished' by worker"

If a resync or verify is finished (or aborted), a full bitmap writeout
is triggered.  If we have ongoing local IO, the bitmap may still change
during that writeout, pending and not yet processed acks may cause bits
to be cleared, while new writes may cause bits to be to be set.

To fix this, introduce the drbd_bm_write_copy_pages() variant.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:17:00 +02:00
Lars Ellenberg
a574daf5d7 drbd: fix race between drbdadm invalidate/verify and finishing resync
When a resync or online verify is finished or aborted,
drbd does a bulk write-out of changed bitmap pages.

If *in that very moment* a new verify or resync is triggered,
this can race:
 ASSERT( !test_bit(BITMAP_IO, &mdev->flags) ) in drbd_main.c
 FIXME going to queue 'set_n_write from StartingSync' but 'write from resync_finished' still pending?
and similar.

This can be observed with e.g. tight invalidate loops in test scripts,
and probably has no real-life implication.

Still, that race can be solved by first quiescen the device,
before starting a new resync or verify.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:59 +02:00
Lars Ellenberg
ba280c092e drbd: fix resend/resubmit of frozen IO
DRBD can freeze IO, due to fencing policy (fencing resource-and-stonith),
or because we lost access to data (on-no-data-accessible suspend-io).

Resuming from there (re-connect, or re-attach, or explicit admin
intervention) should "just work".

Unfortunately, if the re-attach/re-connect did not happen within
the timeout, since the commit
  drbd: Implemented real timeout checking for request processing time
if so configured, the request_timer_fn() would timeout and
detach/disconnect virtually immediately.

This change tracks the most recent attach and connect, and does not
timeout within <configured timeout interval> after attach/connect.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:58 +02:00
Philipp Reisner
5de738272e drbd: Ensure that data_size is not 0 before using data_size-1 as index
This could be exploited by a peer which runs modified code.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:56 +02:00
Philipp Reisner
197296ffed drbd: Delay/reject other state changes while establishing a connection
Changes to the role and disk state should be delayed or rejected
while we establish a connection.

This is necessary, since the peer will base its resync decision
on the UUIDs and the state we sent in the drbd_connect() function.

The most prominent example for this race is becoming primary after
sending state and UUIDs and before the state changes to C_WF_CONNECTION.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:55 +02:00
Lars Ellenberg
46385c84ac drbd: move put_ldev from __req_mod() to the endio callback
One invocation in the endio handler is good enough,
we don't need mention it for each of the different ways
it calls __req_mod().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:51 +02:00
Lars Ellenberg
d64957c9a9 drbd: fix WRITE_ACKED_BY_PEER_AND_SIS to not set RQ_NET_DONE
Just because this request happened during a resync does
not mean it may pretend to have been barrier-acked.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:50 +02:00
Lars Ellenberg
41c4a0035b drbd: fix READ_RETRY_REMOTE_CANCELED to not complete if device is suspended
READ_RETRY_REMOTE_CANCELED needs to be grouped with the other _CANCELED
cases, not with CONNECTION_LOST_WHILE_PENDING, as that would complete
(fail) the bio even if the device became suspended.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:48 +02:00
Lars Ellenberg
6d49e101fd drbd: make OOS_HANDED_TO_NETWORK its own case
OOS_HANDED_TO_NETWORK should not be grouped with the various
*_CANCELED/*_FAILED cases.
Also, not only clear the RQ_NET_QUEUED flag, but also mark it RQ_NET_DONE,
so it can be distinguished from a local-only request even after that.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:47 +02:00
Lars Ellenberg
c088b2d904 drbd: don't pretend that barrier_nr == 0 was special
We used to have a barrier implementation where barrier_nr 0 was
reserved. That is long gone. Just use the full sequence space.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:46 +02:00
Lars Ellenberg
7ffcaa7194 drbd: remove unused static helper function
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:44 +02:00
Lars Ellenberg
a5d214f621 drbd: remove some very outdated comments
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:43 +02:00
Lars Ellenberg
1abc2af205 drbd: missing wakeup after drbd_rs_del_all
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:42 +02:00
Lars Ellenberg
671a74e749 drbd: remove now unused seq_num member from struct drbd_request
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:40 +02:00
Lars Ellenberg
001a88687a drbd: fix potential data corruption and protocol error
We assumed only bios with bi_idx == 0 would end up
in drbd_make_request().

That is wrong.

At least device mapper, in __clone_and_map(), may submit
clones only covering a partial bio, but sharing
the original bvec, by adjusting bi_idx and relevant
other bio members of the clone.

We used __bio_for_each_segment() in various places,
even though that is documented as
 * drivers should not use the __ version unless they _really_ want to
 * run through the entire bio and not just pending pieces

Impact: we would send the full bio bvec, even for the clone
with bi_idx > 0, which will cause data corruption on the
peer (because we submit wrong data at the clone offset),
and will cause a DRBD protocol error, disconnect/reconnect
and resync (thus fixing the corruption),
because the next package header would be expected right
in the middle of the sent data, causing DRBD magic mismatch.

Fix: drop the assert, and use bio_for_each_segment()
instead of the __ version.

Conflicts:

	drbd/drbd_tracing.c

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:39 +02:00
Philipp Reisner
b6a370ba07 drbd: Fix a potential write ordering issue on SyncTarget nodes
If a SyncTarget node gets a P_RS_DATA_REPLY before a P_DATA packet
for the same sector, it simply submits these two IO requests.

  This is be possible because on the SyncSource node, the data of the
  P_RS_DATA_REPLY packet was read from disk.  Immediately after that a
  write request from upper layers came in.

The disk scheduler or even the "hardware" queues on the disk drive might
reorder these writes.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:38 +02:00
Philipp Reisner
fc28845bc0 drbd: Fix a potential race that could case data inconsistency
When we have a write request and a state change C_WF_BITMAP_S -> C_SYNC_SOURCE
at the same time, and it happens that the line

	remote = remote && drbd_should_do_remote(s);

stills sees C_WF_BITMAP_S, and

	send_oos = rw == WRITE && drbd_should_send_oos(s);

already sees C_SYNC_SOURCE both are 0.

This causes the write to not be mirrored, but marked as out-of-sync on the
Sync_Source node.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:34 +02:00
Lars Ellenberg
031a7c173f drbd: add missing part_round_stats to _drbd_start_io_acct
Without this, iostat frequently sees bogus svctime and >= 100% "utilization".

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:33 +02:00
Lars Ellenberg
47a4f1c1bb drbd: Fix module refcount leak in drbd_accept()
drbd_accept was modelled after kernel_accept
with drbd commit 53eb779 in July 2008.

Only, kernel_accept was then broken, and only fixed later
with kernel commit 1b08534e in Dec 2008:
net: Fix module refcount leak in kernel_accept()

Impact: protocol families provided as modules, e.g. ipv6 or ib_sdp,
would soon have their reference count become negative, preventing
them from being unloaded (likely), or worse, hit zero without actually
being unused, allowing them to be unloaded while still in use (unlikely,
but if triggered, causing a kernel crash).

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:32 +02:00
Philipp Reisner
7caacb69ac drbd: Consider the disk-timeout also for meta-data IO operations
If the backing device is already frozen during attach, we failed
to recognize that. The current disk-timeout code works on top
of the drbd_request objects. During attach we do not allow IO
and therefore never generate a drbd_request object but block
before that in drbd_make_request().

This patch adds the timeout to all drbd_md_sync_page_io().

Before this patch we used to go from D_ATTACHING directly
to D_DISKLESS if IO failed during attach. We can no longer
do this since we have to stay in D_FAILED until all IO
ops issued to the backing device returned.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:30 +02:00