Commit graph

311670 commits

Author SHA1 Message Date
Len Brown
7ae30986dc ACPI: fix acpi_bus.h build warnings when ACPI is not enabled
introduced in Linux-3.5-rc1 by
66886d6f8c
(ACPI: Add stubs for (un)register_acpi_bus_type)

Fix header file warnings when CONFIG_ACPI is not enabled:

include/acpi/acpi_bus.h:443:42: warning: 'struct acpi_bus_type' declared inside parameter list
include/acpi/acpi_bus.h:443:42: warning: its scope is only this definition or declaration, which is probably not
include/acpi/acpi_bus.h:444:44: warning: 'struct acpi_bus_type' declared inside parameter list

Signed-off-by: Len Brown <len.brown@intel.com>
2012-06-04 00:29:11 -04:00
Kukjin Kim
5041caa4d5 gpio/samsung: fix the typo 'exynos5_xxx' instead of 'exonys5_xxx'
Should be 'exynos5_xxx' instead of 'exonys5_xxx'.

It happened at the commit 30b842889e ("Merge tag 'soc2' of
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc")
during v3.5 merge window.

Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
[ My bad  - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-03 21:21:01 -07:00
Fabio Estevam
2f07a6134f drivers: acpi: Fix dependency for ACPI_HOTPLUG_CPU
Fix the following build warning:

warning: (ACPI_HOTPLUG_CPU) selects ACPI_CONTAINER which has unmet direct dependencies (ACPI && EXPERIMENTAL)

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-06-04 00:07:33 -04:00
Len Brown
650a37f32d tools/power turbostat: fix IVB support
Initial IVB support went into turbostat in Linux-3.1:
553575f1ae
(tools turbostat: recognize and run properly on IVB)

However, when running on IVB, turbostat would fail
to report the new couters added with SNB, c7, pc2 and pc7.
So in scenarios where these counters are non-zero on IVB,
turbostat would report erroneous residencey results.

In particular c7 time would be added to c1 time,
since c1 time is calculated as "that which is left over".

Also, turbostat reports MHz capabilities when passed
the "-v" option, and it would incorrectly report 133MHz
bclk instead of 100MHz bclk for IVB, which would inflate
GHz reported with that option.

This patch is a backport of a fix already included in turbostat v2.

Signed-off-by: Len Brown <len.brown@intel.com>
2012-06-03 23:47:49 -04:00
Len Brown
d15cf7c129 tools/power turbostat: fix un-intended affinity of forked program
Linux 3.4 included a modification to turbostat to
lower cross-call overhead by using scheduler affinity:

15aaa34654
(tools turbostat: reduce measurement overhead due to IPIs)

In the use-case where turbostat forks a child program,
that change had the un-intended side-effect of binding
the child to the last cpu in the system.

This change removed the binding before forking the child.

This is a back-port of a fix already included in turbostat v2.

Signed-off-by: Len Brown <len.brown@intel.com>
2012-06-03 23:24:00 -04:00
Linus Torvalds
4d578573b8 Merge branch 'pm-acpi' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull some left-over PM patches from Rafael J. Wysocki.

* 'pm-acpi' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / PM: Make acpi_pm_device_sleep_state() follow the specification
  ACPI / PM: Make __acpi_bus_get_power() cover D3cold correctly
  ACPI / PM: Fix error messages in drivers/acpi/bus.c
  rtc-cmos / PM: report wakeup event on ACPI RTC alarm
  ACPI / PM: Generate wakeup events on fixed power button
2012-06-03 20:15:57 -07:00
Linus Torvalds
68e3e92620 Revert "mm: compaction: handle incorrect MIGRATE_UNMOVABLE type pageblocks"
This reverts commit 5ceb9ce6fe.

That commit seems to be the cause of the mm compation list corruption
issues that Dave Jones reported.  The locking (or rather, absense
there-of) is dubious, as is the use of the 'page' variable once it has
been found to be outside the pageblock range.

So revert it for now, we can re-visit this for 3.6.  If we even need to:
as Minchan Kim says, "The patch wasn't a bug fix and even test workload
was very theoretical".

Reported-and-tested-by: Dave Jones <davej@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-03 20:05:57 -07:00
Hugh Dickins
752dc185da mm: fix warning in __set_page_dirty_nobuffers
New tmpfs use of !PageUptodate pages for fallocate() is triggering the
WARNING: at mm/page-writeback.c:1990 when __set_page_dirty_nobuffers()
is called from migrate_page_copy() for compaction.

It is anomalous that migration should use __set_page_dirty_nobuffers()
on an address_space that does not participate in dirty and writeback
accounting; and this has also been observed to insert surprising dirty
tags into a tmpfs radix_tree, despite tmpfs not using tags at all.

We should probably give migrate_page_copy() a better way to preserve the
tag and migrate accounting info, when mapping_cap_account_dirty().  But
that needs some more work: so in the interim, avoid the warning by using
a simple SetPageDirty on PageSwapBacked pages.

Reported-and-tested-by: Dave Jones <davej@redhat.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-03 20:05:47 -07:00
Linus Torvalds
2f9d3df8aa vfs: move inode stat information closer together
The comment above it says "Stat data, not accessed from path walking",
but in fact some of inode fields we use for the common stat data was way
down at the end of the inode, causing unnecessary cache misses for the
common stat operations.

The inode structure is pretty big, and this can change padding depending
on field width, but at least on the common 64-bit configurations this
doesn't change the size.  Some of our inode layout has historically been
to tro to avoid unnecessary padding fields, but cache locality is at
least as important for layout, if not more.

Noticed by looking at kernel profiles, and noticing that the "i_blkbits"
access stood out like a sore thumb.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-03 14:50:19 -07:00
Axel Lin
2e3f7f26e1 regulator: max77686: Use regulator_map_voltage_linear for simple linear mappings
Both max77686_ops and max77686_buck_dvs_ops use simple linear voltage maps.
Thus use regulator_map_voltage_linear is more efficient than using the defult
regulator_map_voltage_iterate.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 20:11:09 +01:00
Axel Lin
fcbb71f6f8 regulator: wm8350: Use regulator_map_voltage_linear for wm8350_dcdc_ops
wm8350_dcdc_ops uses simple linear voltage maps.
Thus use regulator_map_voltage_linear is more efficient than using the default
regulator_map_voltage_iterate.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 20:08:54 +01:00
Joe Perches
dc605dbdb8 can: cc770: Fix likely misuse of | for &
Using | with a constant is always true.
Likely this should have be &.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2012-06-03 18:59:21 +02:00
AnilKumar Ch
f461f27a44 can: c_can: fix race condition in c_can_open()
Fix the issue of C_CAN interrupts getting disabled forever when canconfig
utility is used multiple times. According to NAPI usage we disable all
the hardware interrupts in ISR and re-enable them in poll(). Current
implementation calls napi_enable() after hardware interrupts are enabled.
If we get any interrupts between these two steps then we do not process
those interrupts because napi is not enabled. Mostly these interrupts
come because of STATUS is not 0x7 or ERROR interrupts. If napi_enable()
happens before HW interrupts enabled then c_can_poll() function will be
called eventual re-enabling.

This patch moves the napi_enable() call before interrupts enabled.

Cc: stable@kernel.org # 2.6.39+
Signed-off-by: AnilKumar Ch <anilkumar@ti.com>
Acked-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2012-06-03 18:59:20 +02:00
AnilKumar Ch
148c87c89e can: c_can: fix an interrupt thrash issue with c_can driver
This patch fixes an interrupt thrash issue with c_can driver.

In c_can_isr() function interrupts are disabled and enabled only in
c_can_poll() function. c_can_isr() & c_can_poll() both read the
irqstatus flag. However, irqstatus is always read as 0 in c_can_poll()
because all C_CAN interrupts are disabled in c_can_isr(). This causes
all interrupts to be re-enabled in c_can_poll() which in turn causes
another interrupt since the event is not really handled. This keeps
happening causing a flood of interrupts.

To fix this, read the irqstatus register in isr and use the same cached
value in the poll function.

Cc: stable@kernel.org # 2.6.39+
Signed-off-by: AnilKumar Ch <anilkumar@ti.com>
Acked-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2012-06-03 18:59:19 +02:00
AnilKumar Ch
617caccebe can: c_can: fix "BUG! echo_skb is occupied!" during transmit
This patch fixes an issue with transmit routine, which causes
"can_put_echo_skb: BUG! echo_skb is occupied!" message when
using "cansequence -p" on D_CAN controller.

In c_can driver, while transmitting packets tx_echo flag holds
the no of can frames put for transmission into the hardware.

As the comment above c_can_do_tx() indicates, if we find any packet
which is not transmitted then we should stop looking for more.
In the current implementation this is not taken care of causing the
said message.

Also, fix the condition used to find if the packet is transmitted
or not. Current code skips the first tx message object and ends up
checking one extra invalid object.

While at it, fix the comment on top of c_can_do_tx() to use the
terminology "packet" instead of "package" since it is more
standard.

Cc: stable@kernel.org # 2.6.39+
Signed-off-by: AnilKumar Ch <anilkumar@ti.com>
Acked-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2012-06-03 18:59:18 +02:00
Guenter Roeck
ca4620853a MAINTAINERS: Update my e-mail address
Maintainer activities moved to home server. Update e-mail address to reflect
reality.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Jean Delvare <khali@linux-fr.org>
2012-06-03 08:13:41 -07:00
Marc Dionne
d8dc3494f7 HID: logitech: don't use stack based dj_report structures
On a system with a logitech wireless keyboard/mouse and DMA-API debugging
enabled, this warning appears at boot:

kernel: WARNING: at lib/dma-debug.c:929 check_for_stack.part.12+0x70/0xa7()
kernel: Hardware name: MS-7593
kernel: uhci_hcd 0000:00:1d.1: DMA-API: device driver maps memory fromstack [addr=ffff8801b0079c29]

Make logi_dj_recv_query_paired_devices and logi_dj_recv_switch_to_dj_mode
use a structure allocated with kzalloc rather than a stack based one.

Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-06-03 15:11:43 +02:00
Lee Jones
392b309644 regulator: Change db8500-prcmu match names to reflect Device Tree
The 'name' field in 'struct of_regulator_match' expects to match with
its corresponding regulator device node in the Device Tree. This patch
renames each of the regulators in the db8500-prcum regulator driver so
this is true.

Signed-off-by: Lee Jones <lee.jones@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 13:25:06 +01:00
Lee Jones
7e715b954a regulator: Change ab8500 match names to reflect Device Tree
The 'name' field in 'struct of_regulator_match' expects to match with
its corresponding regulator device node in the Device Tree. This patch
renames each of the regulators in the ab8500 regulator driver so this
is true.

Signed-off-by: Lee Jones <lee.jones@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 13:25:06 +01:00
Axel Lin
abcfaf23c7 regulator: fixed: Use of_match_ptr() for of_match_table entry
Use the new of_match_ptr() macro for the of_match_table
pointer entry to avoid having to #define match NULL.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 13:25:05 +01:00
Jonghwa Lee
133d4016f1 regulator: MAX77686: Add Maxim 77686 regulator driver
Add driver for support max77686 regulator.
MAX77686 provides LDOs[1~26] and BUCKs[1~9]. It support to set or get the
volatege of regulator on max77686 chip with using regmap.

Signed-off-by: Chiwoong Byun <woong.byun@samsung.com>
Signed-off-by: Jonghwa Lee <jonghwa3.lee@samsung.com>
Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Reviewed-by: Yadwinder Singh Brar <yadi.brar@samsung.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 13:25:05 +01:00
Axel Lin
27eeabb7a1 regulator: wm8400: Use regulator_map_voltage_linear for wm8400_dcdc_ops
wm8400_dcdc_ops uses simple linear voltage maps.
Thus use regulator_map_voltage_linear is more efficient than using the default
regulator_map_voltage_iterate.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 13:25:05 +01:00
Axel Lin
c2543b5f6c regulator: wm831x-ldo: Use regulator_map_voltage_linear for wm831x_alive_ldo_ops
wm831x_alive_ldo_ops uses simple linear voltage maps.
Thus use regulator_map_voltage_linear is more efficient than using the default
regulator_map_voltage_iterate.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 13:25:05 +01:00
Axel Lin
f2d103add1 regulator: wm8994: Convert wm8994_ldo1_ops to regulator_[list|map]_voltage_linear
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 13:25:05 +01:00
Axel Lin
0713e6abf3 regulator: anatop: Convert to regulator_list_voltage_linear()
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 13:25:05 +01:00
Axel Lin
8029a00686 regulator: palmas: Use regulator_[list|map]_voltage_linear() for palmas_ops_smps10
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Acked-by: Graeme Gregory <gg@slimlogic.co.uk>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 13:25:04 +01:00
Mark Brown
107a3967a8 regulator: wm8350: Convert DCDCs to set_voltage_sel() and linear voltages
The WM8350 DCDCs have a simple linear mapping from selectors to voltages
so can be converted very simply to use the new infrastructure for a nice
reduction in code size.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 13:25:04 +01:00
Mark Brown
a540f68286 regulator: wm8350: Convert to core regmap-based enable operations
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 13:25:04 +01:00
Mark Brown
b4ec87aedb regulator: wm8350: Convert to use core regmap vsel readback
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 13:25:03 +01:00
Axel Lin
6333e9ddf4 regulator: ab8500: Let regulator core handle the case no delay for setting new voltage if regulator is off
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 13:25:03 +01:00
Axel Lin
cad8d76e20 regulator: lp3971: Use regulator_list_voltage_table()
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 13:25:02 +01:00
Axel Lin
055917ac56 regulator: tps6507x: Use regulator_list_voltage_table()
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 13:25:02 +01:00
Axel Lin
c55979b7bd regulator: tps6105x: Use regulator_list_voltage_table()
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 13:25:01 +01:00
Axel Lin
4f73ccad5c regulator: lp3972: Use regulator_list_voltage_table()
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 13:25:01 +01:00
Axel Lin
ec1cc4d9da regulator: ab8500: Use regulator_list_voltage_table()
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 13:25:01 +01:00
Axel Lin
a3beb74261 regulator: ab3100: Use regulator_list_voltage_table()
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 13:25:01 +01:00
Mark Brown
3a4b0a07fa regulator: core: Use dev_get_regmap() to find the regmap
If no regmap is explicitly specified then use dev_get_regmap() to obtain
one. The driver must explicitly enable any actual usage of the regmap
so there's no concern with unwanted usage.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Liam Girdwood <lrg@ti.com>
2012-06-03 13:20:34 +01:00
Mark Brown
361ff50174 regulator: Use newly added devres_release() rather than open coding
devres_release() will call the destructor for the resource as well as
freeing the devres data.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Liam Girdwood <lrg@ti.com>
2012-06-03 13:20:12 +01:00
Axel Lin
8b7485ef62 regulator: core: Call set_voltage_time_sel() only when the regulator is on
If the regulator is not on, it won't take time setting new voltage.
So only call set_voltage_time_sel() to get the necessary delay when
the regulator is on.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 13:20:00 +01:00
Axel Lin
cffc9592fd regulator: core: Allow drivers to set voltage mapping table in regulator_desc
Some regulator hardware use table based mapping can set volt_table in
regulator_desc and use regulator_list_voltage_table() for their list_voltage
callback.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Acked-by: Liam Girdwood <lrg@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 13:19:45 +01:00
Fabio Estevam
5494a98f45 regmap: Fix the size calculation for map->format.buf_size
The word to be transmitted/received via regmap is composed by the following
parts:

config->reg_bits
config->val_bits
config->pad_bits

,so the total size should be calculated by summing up the number of bits of
each element and using a DIV_ROUND_UP to return the number of bytes.

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 13:14:01 +01:00
Stephen Warren
bfaa25f334 regmap: clean up debugfs if regmap_init fails
If debugfs isn't cleaned up, stale files will be left in the filesystem
which will cause an OOPS when accessed the first time, and hang the
accessing application when accessed again, presumably due to some lock
being left held.

Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-06-03 13:13:38 +01:00
Nicholas Bellinger
a4dff3043c target/file: Use O_DSYNC by default for FILEIO backends
Convert to use O_DSYNC for all cases at FILEIO backend creation time to
avoid the extra syncing of pure timestamp updates with legacy O_SYNC during
default operation as recommended by hch.  Continue to do this independently of
Write Cache Enable (WCE) bit, as WCE=0 is currently the default for all backend
devices and enabled by user on per device basis via attrib/emulate_write_cache.

This patch drops the now unnecessary fd_buffered_io= token usage that was
originally signalling when to explictly disable O_SYNC at backend creation
time for buffered I/O operation.  This can end up being dangerous for a number
of reasons during physical node failure, so go ahead and drop this option
for now when O_DSYNC is used as the default.

Also allow explict FUA WRITEs -> vfs_fsync_range() call to function in
fd_execute_cmd() independently of WCE bit setting.

Reported-by: Christoph Hellwig <hch@lst.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-06-02 23:47:20 -07:00
Linus Torvalds
f8f5701bda Linux 3.5-rc1 2012-06-02 18:29:26 -07:00
Linus Torvalds
912afc3616 Improve multipath's retrying mechanism in some defined circumstances
and provide a simple reserve/release mechanism for userspace tools to
 access thin provisioning metadata while the pool is in use.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJPyqIdAAoJEK2W1qbAHj1nSPoQAJvAb/6UHufTWC/lufbEyo7t
 ft6uwZZ4S/VV1Gdx8V5YXo3rxkVIZj/CV0hiJIDctmDMKGPMlzup39kCgjD/rOUF
 mzcFAE8sEr3QEavkfjSWw2RHIIlhnJpvqVnb8nu3p/mSgAB4qYGgaDjBpi+W60PV
 aqQSSWgwH1uNhfGDBIxQoJ8OIjjYvKPIf2Ir2FAXam/dNi9chWO9nzFdj3q2LccP
 nZir094BDsFac1BF0FYW3J+rgT1FfPO7RRGAQct6WNJ197IZlYWYjKH3XehxnUHE
 wgiJmjfUO8vrho1hhWmWDOesKJPPWFN67EQnl5FqAu9itP7c7k8bd7Ay4jWgtZQU
 QIx10uiAgAuFUmTdWGK1fLlE8HGKUFINYLp63N5n5NZ4TDJrgo8e7CIID3rvYf/O
 EtmL7HzAyztL9Uc6oaXzCK6TgMUtd/ht8OJCDFhjitzQTNjbrfAGz6m+RHnEZyyj
 dtOVK7WBlmuKEANl2vDFGuVVF0+MwJLTlvPx1/b/ejFvnHI/R5Wuk9EH7t/DO4LB
 nCmiwzB6uWMzU3y3vnZG72AYSF5NTKSvnAl5B8U/0rI1MZU+6PehjeviJNx6ddJN
 2YheHBLU4vbBV/LF4XIpaHK2aiHN1ltaKCp8INo3EKhCwpR4ZdlVvnAGU9ocf9+c
 qoaFTOP7zGD9zgPeGjoG
 =wCpY
 -----END PGP SIGNATURE-----

Merge tag 'dm-3.5-changes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm

Pull device-mapper updates from Alasdair G Kergon:
 "Improve multipath's retrying mechanism in some defined circumstances
  and provide a simple reserve/release mechanism for userspace tools to
  access thin provisioning metadata while the pool is in use."

* tag 'dm-3.5-changes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm:
  dm thin: provide userspace access to pool metadata
  dm thin: use slab mempools
  dm mpath: allow ioctls to trigger pg init
  dm mpath: delay retry of bypassed pg
  dm mpath: reduce size of struct multipath
2012-06-02 17:39:40 -07:00
Joe Thornber
cc8394d86f dm thin: provide userspace access to pool metadata
This patch implements two new messages that can be sent to the thin
pool target allowing it to take a snapshot of the _metadata_.  This,
read-only snapshot can be accessed by userland, concurrently with the
live target.

Only one metadata snapshot can be held at a time.  The pool's status
line will give the block location for the current msnap.

Since version 0.1.5 of the userland thin provisioning tools, the
thin_dump program displays the msnap as follows:

    thin_dump -m <msnap root> <metadata dev>

Available here: https://github.com/jthornber/thin-provisioning-tools

Now that userland can access the metadata we can do various things
that have traditionally been kernel side tasks:

     i) Incremental backups.

     By using metadata snapshots we can work out what blocks have
     changed over time.  Combined with data snapshots we can ensure
     the data doesn't change while we back it up.

     A short proof of concept script can be found here:

     https://github.com/jthornber/thinp-test-suite/blob/master/incremental_backup_example.rb

     ii) Migration of thin devices from one pool to another.

     iii) Merging snapshots back into an external origin.

     iv) Asyncronous replication.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-06-03 00:30:01 +01:00
Mike Snitzer
a24c25696b dm thin: use slab mempools
Use dedicated caches prefixed with a "dm_" name rather than relying on
kmalloc mempools backed by generic slab caches so the memory usage of
thin provisioning (and any leaks) can be accounted for independently.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-06-03 00:30:00 +01:00
Mikulas Patocka
35991652ba dm mpath: allow ioctls to trigger pg init
After the failure of a group of paths, any alternative paths that
need initialising do not become available until further I/O is sent to
the device.  Until this has happened, ioctls return -EAGAIN.

With this patch, new paths are made available in response to an ioctl
too.  The processing of the ioctl gets delayed until this has happened.

Instead of returning an error, we submit a work item to kmultipathd
(that will potentially activate the new path) and retry in ten
milliseconds.

Note that the patch doesn't retry an ioctl if the ioctl itself fails due
to a path failure.  Such retries should be handled intelligently by the
code that generated the ioctl in the first place, noting that some SCSI
commands should not be retried because they are not idempotent (XOR write
commands).  For commands that could be retried, there is a danger that
if the device rejected the SCSI command, the path could be errorneously
marked as failed, and the request would be retried on another path which
might fail too.  It can be determined if the failure happens on the
device or on the SCSI controller, but there is no guarantee that all
SCSI drivers set these flags correctly.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-06-03 00:29:58 +01:00
Mike Christie
f220fd4efb dm mpath: delay retry of bypassed pg
If I/O needs retrying and only bypassed priority groups are available,
set the pg_init_delay_retry flag to wait before retrying.

If, for example, the reason for the bypass is that the controller is
getting reset or there is a firmware upgrade happening, retrying right
away would cause a flood of log messages and retries for what could be a
few seconds or even several minutes.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-06-03 00:29:45 +01:00
Mike Snitzer
1fbdd2b3a3 dm mpath: reduce size of struct multipath
Move multipath structure's 'lock' and 'queue_size' members to eliminate
two 4-byte holes.  Also use a bit within a single unsigned int for each
existing flag (saves 8-bytes).  This allows future flags to be added
without each consuming an unsigned int.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-06-03 00:29:43 +01:00