Initial EXPERIMENTAL implementation of device-mapper thin provisioning
with snapshot support. The 'thin' target is used to create instances of
the virtual devices that are hosted in the 'thin-pool' target. The
thin-pool target provides data sharing among devices. This sharing is
made possible using the persistent-data library in the previous patch.
The main highlight of this implementation, compared to the previous
implementation of snapshots, is that it allows many virtual devices to
be stored on the same data volume, simplifying administration and
allowing sharing of data between volumes (thus reducing disk usage).
Another big feature is support for arbitrary depth of recursive
snapshots (snapshots of snapshots of snapshots ...). The previous
implementation of snapshots did this by chaining together lookup tables,
and so performance was O(depth). This new implementation uses a single
data structure so we don't get this degradation with depth.
For further information and examples of how to use this, please read
Documentation/device-mapper/thin-provisioning.txt
Signed-off-by: Joe Thornber <thornber@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
The persistent-data library offers a re-usable framework for the storage
and management of on-disk metadata in device-mapper targets.
It's used by the thin-provisioning target in the next patch and in an
upcoming hierarchical storage target.
For further information, please read
Documentation/device-mapper/persistent-data.txt
Signed-off-by: Joe Thornber <thornber@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
The dm-bufio interface allows you to do cached I/O on devices,
holding recently-read blocks in memory and performing delayed writes.
We don't use buffer cache or page cache already present in the kernel, because:
* we need to handle block sizes larger than a page
* we can't allocate memory to perform reads or we'd have deadlocks
Currently, when a cache is required, we limit its size to a fraction of
available memory. Usage can be viewed and changed in
/sys/module/dm_bufio/parameters/ .
The first user is thin provisioning, but more dm users are planned.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Introduce DM_TARGET_IMMUTABLE to indicate that the target type cannot be mixed
with any other target type, and once loaded into a device, it cannot be
replaced with a table containing a different type.
The thin provisioning pool device will use this.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Add a target feature flag DM_TARGET_ALWAYS_WRITEABLE to indicate that a target
does not support read-only mode.
The initial implementation of the thin provisioning target uses this.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Introduce the concept of a singleton table which contains exactly one target.
If a target type sets the DM_TARGET_SINGLETON feature bit device-mapper
will ensure that any table that includes that target contains no others.
The thin provisioning pool target uses this.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
This patch introduces dm_kcopyd_zero() to make it easy to use
kcopyd to write zeros into the requested areas instead
instead of copying. It is implemented by passing a NULL
copying source to dm_kcopyd_copy().
The forthcoming thin provisioning target uses this.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Since set_current_state() contains a memory barrier in it,
an additional barrier isn't needed.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
printk_ratelimit() shares global ratelimiting state with all
other subsystems, so its usage is discouraged. Instead,
define and use dm's local state.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Allow QUEUE_FLAG_NONROT to propagate up the device stack if all
underlying devices are non-rotational. Tools like ureadahead will
schedule IOs differently based on the rotational flag.
With this patch, I see boot time go from 7.75 s to 7.46 s on my device.
Suggested-by: J. Richard Barnette <jrbarnette@chromium.org>
Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Jens Axboe <jaxboe@fusionio.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: dm-devel@redhat.com
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
We can't just pass back the return value of snd_soc_update_bits() as it
will be 1 if a bit changed rather than zero.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Enable the maximum size (128) supported by the device for the shadow
vlans table, ignoring the module parameter that overrides it. This
table is only used by the IBoE control plane for setting a vlan index
into an RC/UC QP context or UD Address Handle.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
There's no need to set the vlan-related fields in an IBoE send WQE
control segment:
- the vlan to be used by a UD QP is set in the datagram segment.
- for GSI (CM) QP, all the headers down to 8021q and MAC are built by
the software anyway.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
The IBoE port MTU is derived from the corresponding Ethernet netdevice
MTU, which can support jumbo frames of 9K, and hence surely supports
the max IB mtu of 4K.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
QPs need to be moved to error before telling the firwmare to shutdown
the queue. Otherwise, the application can submit WRs that will never
get fetched by the hardware and never flushed by the driver.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Acked-by: Steve Wise <swsie@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Commit 01e7da6ba5 ("RDMA/cxgb4: Make sure flush CQ entries are
collected on connection close") introduced a potential problem where a
CQ's comp_handler can get called simultaneously from different places
in the iw_cxgb4 driver. This does not comply with
Documentation/infiniband/core_locking.txt, which states that at a
given point of time, there should be only one callback per CQ should
be active.
This problem was reported by Parav Pandit <Parav.Pandit@Emulex.Com>.
Based on discussion between Parav Pandit and Steve Wise, this patch
fixes the above problem by serializing the calls to a CQ's
comp_handler using a spin_lock.
Reported-by: Parav Pandit <Parav.Pandit@Emulex.Com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
iw_cxgb3 has a potential problem where a CQ's comp_handler can get
called simultaneously from different places in iw_cxgb3 driver. This
does not comply with Documentation/infiniband/core_locking.txt, which
states that at a given point of time, there should be only one
callback per CQ should be active.
Such problem was reported by Parav Pandit <Parav.Pandit@Emulex.Com>
for iw_cxgb4 driver. Based on discussion between Parav Pandit and
Steve Wise, this patch fixes the above problem by serializing the
calls to a CQ's comp_handler using a spin_lock.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Fix an issue where the link would come up after replugging a cable
even if it has been DISABLED manually.
Signed-off-by: Mitko Haralanov <mitko@qlogic.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Drop the edac_mce custom hook in favor of the generic notifier
mechanism. Also, do not log the error to mcelog if the notified agent
was able to decode it.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
mce->socketid and cpu_data(mce->cpu).phys_proc_id are the same,
compare with mce_setup (in mce.c):
m->cpu = m->extcpu = smp_processor_id();
...
m->socketid = cpu_data(m->extcpu).phys_proc_id;
This makes it easier for example for XEN patches to hook into
the MCE subsystem.
Compile tested on x86_64.
Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: JBeulich@novell.com
CC: linux-edac@vger.kernel.org
CC: Mauro Carvalho Chehab <mchehab@redhat.com>
Add scrubbing support to i7core_edac, tested on intel Xeon L5638.
Signed-off-by: Samuel Gabrielsson <samuel.gabrielsson@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
As we'll need to use those structs for trace functions, they should
be on a more public place. So, move struct mem_ctl_info & friends
to edac.h.
No functional changes on this patch.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Error injection needs the pci device 0:0. So, we need to revert
this changeset: 79daef2099.
Tests need to be made to be sure that refcount won't be wrong
as noticed before.
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Clean up: the first parameter of nfs_create_request() has been
incorrectly documented since time immemorial (OK, since before
2.6.12).
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
craa_type_mask is bitmap4 per RFC5661. We need to expect a length before
extracting bitmap value.
Cc: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Current pnfs_layoutcommit_inode can not handle parallel layoutcommit.
And as Trond suggested , there is no need for client to optimize for
parallel layoutcommit. So add NFS_INO_LAYOUTCOMMITTING flag to
mark inflight layoutcommit and serialize lalyoutcommit with it.
Also mark_inode_dirty_sync if pnfs_layoutcommit_inode fails to issue
layoutcommit.
Reported-by: Vitaliy Gusev <gusev.vitaliy@nexenta.com>
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Now that we are doing the locking correctly, we need to grab the
i_completed_io_lock() twice per end_io. We can clean this up by
removing the structure from the i_complted_io_list, and use this as
the locking mechanism to prevent ext4_flush_completed_IO() racing
against ext4_end_io_work(), instead of clearing the
EXT4_IO_END_UNWRITTEN in io->flag.
In addition, if the ext4_convert_unwritten_extents() returns an error,
we no longer keep the end_io structure on the linked list. This
doesn't help, because it tends to lock up the file system and wedges
the system. That's one way to call attention to the problem, but it
doesn't help the overall robustness of the system.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The patch merges the build of imx3 and imx6. The Kconfig symbol
ARCH_IMX_V6_V7 is introduced to replace ARCH_MX3 and ARCH_MX6.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
It adds a number of core drivers support for imx6q, including clock,
General Power Controller (gpc), Multi Mode DDR Controller(mmdc) and
System Reset Controller (src).
Signed-off-by: Ranjani Vaidyanathan <ra5478@freescale.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
This is a plain translation of assembly gic irq handler to C function
for CONFIG_MULTI_IRQ_HANDLER support on imx family.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
This adds cpu hotplug for highbank. On highbank, a core is always reset and
boots up the same path as a cold boot.
Signed-off-by: Martin Bogomolni <martin@calxeda.com>
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Reviewed-by: Jamie Iles <jamie@jamieiles.com>
Reviewed-by: Shawn Guo <shawn.guo@linaro.org>
This enables SMP support on highbank processor.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Reviewed-by: Jamie Iles <jamie@jamieiles.com>
Reviewed-by: Shawn Guo <shawn.guo@linaro.org>
This adds basic support for the Calxeda Highbank platform.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Reviewed-by: Jamie Iles <jamie@jamieiles.com>
Reviewed-by: Shawn Guo <shawn.guo@linaro.org>