Architectures that handle DMA-non-coherent memory need to set
ARCH_KMALLOC_MINALIGN to make sure that kmalloc'ed buffer is DMA-safe: the
buffer doesn't share a cache with the others.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Chris Zankel <chris@zankel.net>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The port data structure related to fc_host statistics collection is
not initialized. This causes system crash when reading the fc_host
statistics. The fix is to initialize port structure during driver
attach.
Signed-off-by: Krishna Gudipati <kgudipat@brocade.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
There is no need to call sk_sleep before calling wake_up_interruptible,
because the wait_queue_head is now with the socket.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The async helper threads offload crc work onto all the
CPUs, and make streaming writes much faster. This
changes the O_DIRECT write code to use them. The only
small complication was that we need to pass in the
logical offset in the file for each bio, because we can't
find it in the bio's pages.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Yan Zheng noticed two places we were doing a lot of work
without task->state set to TASK_RUNNING. This sets the state
properly after we get ready to sleep but decide not to.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
In order for AIO to work, we need to implement aio_write. This patch converts
our btrfs_file_write to btrfs_aio_write. I've tested this with xfstests and
nothing broke, and the AIO stuff magically started working. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
This provides basic DIO support for reading and writing. It does not do the
work to recover from mismatching checksums, that will come later. A few design
changes have been made from Jim's code (sorry Jim!)
1) Use the generic direct-io code. Jim originally re-wrote all the generic DIO
code in order to account for all of BTRFS's oddities, but thanks to that work it
seems like the best bet is to just ignore compression and such and just opt to
fallback on buffered IO.
2) Fallback on buffered IO for compressed or inline extents. Jim's code did
it's own buffering to make dio with compressed extents work. Now we just
fallback onto normal buffered IO.
3) Use ordered extents for the writes so that all of the
lock_extent()
lookup_ordered()
type checks continue to work.
4) Do the lock_extent() lookup_ordered() loop in readpage so we don't race with
DIO writes.
I've tested this with fsx and everything works great. This patch depends on my
dio and filemap.c patches to work. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Btrfs cannot handle having logically non-contiguous requests submitted. For
example if you have
Logical: [0-4095][HOLE][8192-12287]
Physical: [0-4095] [4096-8191]
Normally the DIO code would put these into the same BIO's. The problem is we
need to know exactly what offset is associated with what BIO so we can do our
checksumming and unlocking properly, so putting them in the same BIO doesn't
work. So add another check where we submit the current BIO if the physical
blocks are not contigous OR the logical blocks are not contiguous.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Because BTRFS can do RAID and such, we need our own submit hook so we can setup
the bio's in the correct fashion, and handle checksum errors properly. So there
are a few changes here
1) The submit_io hook. This is straightforward, just call this instead of
submit_bio.
2) Allow the fs to return -ENOTBLK for reads. Usually this has only worked for
writes, since writes can fallback onto buffered IO. But BTRFS needs the option
of falling back on buffered IO if it encounters a compressed extent, since we
need to read the entire extent in and decompress it. So if we get -ENOTBLK back
from get_block we'll return back and fallback on buffered just like the write
case.
I've tested these changes with fsx and everything seems to work. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
This is similar to what already happens in the write case. If we have a short
read while doing O_DIRECT, instead of just returning, fallthrough and try to
read the rest via buffered IO. BTRFS needs this because if we encounter a
compressed or inline extent during DIO, we need to fallback on buffered. If the
extent is compressed we need to read the entire thing into memory and
de-compress it into the users pages. I have tested this with fsx and everything
works great. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
This patch adds metadata ENOSPC handling for the balance code.
It is consisted by following major changes:
1. Avoid COW tree leave in the phrase of merging tree.
2. Handle interaction with snapshot creation.
3. make the backref cache can live across transactions.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Pre-allocate space for data relocation. This can detect ENOPSC
condition caused by fragmentation of free space.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Previous patches make the allocater return -ENOSPC if there is no
unreserved free metadata space. This patch updates tree log code
and various other places to propagate/handle the ENOSPC error.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Reserve metadata space for extent tree, checksum tree and root tree
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Introduce metadata reservation context for delayed allocation
and update various related functions.
This patch also introduces EXTENT_FIRST_DELALLOC control bit for
set/clear_extent_bit. It tells set/clear_bit_hook whether they
are processing the first extent_state with EXTENT_DELALLOC bit
set. This change is important if set/clear_extent_bit involves
multiple extent_state.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Besides simplify the code, this change makes sure all metadata
reservation for normal metadata operations are released after
committing transaction.
Changes since V1:
Add code that check if unlink and rmdir will free space.
Add ENOSPC handling for clone ioctl.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Introducing metadata reseravtion contexts has two major advantages.
First, it makes metadata reseravtion more traceable. Second, it can
reclaim freed space and re-add them to the itself after transaction
committed.
Besides add btrfs_block_rsv structure and related helper functions,
This patch contains following changes:
Move code that decides if freed tree block should be pinned into
btrfs_free_tree_block().
Make space accounting more accurate, mainly for handling read only
block groups.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
All code in init_btrfs_i can be moved into btrfs_alloc_inode()
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Shrink delayed allocation space in a synchronized manner is more
controllable than flushing all delay allocated space in an async
thread.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
We already have fs_info->chunk_mutex to avoid concurrent
chunk creation.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
The size of reserved space is stored in space_info. If block groups
of different raid types are linked to separate space_info, changing
allocation profile will corrupt reserved space accounting.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
The TRACE_EVENT() macros automate creation of trace events. To automate
initialization, the set up variables are loaded in a special section
that is read on boot up. GCC is not aware that these static variables
are used and will complain about them if we do not inform GCC that
they are indeed used.
One of the declarations of the event element was missing a __used
annotation. This patch adds it.
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Allow userspace filesystem implementation to use splice() to read from
the fuse device.
The userspace filesystem can now transfer data coming from a WRITE
request to an arbitrary file descriptor (regular file, block device or
socket) without having to go through a userspace buffer.
The semantics of using splice() to read messages are:
1) with a single splice() call move the whole message from the fuse
device to a temporary pipe
2) read the header from the pipe and determine the message type
3a) if message is a WRITE then splice data from pipe to destination
3b) else read rest of message to userspace buffer
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
When splicing buffers to the fuse device with SPLICE_F_MOVE, try to
move pages from the pipe buffer into the page cache. This allows
populating the fuse filesystem's cache without ever touching the page
contents, i.e. zero copy read capability.
The following steps are performed when trying to move a page into the
page cache:
- buf->ops->confirm() to make sure the new page is uptodate
- buf->ops->steal() to try to remove the new page from it's previous place
- remove_from_page_cache() on the old page
- add_to_page_cache_locked() on the new page
If any of the above steps fail (non fatally) then the code falls back
to copying the page. In particular ->steal() will fail if there are
external references (other than the page cache and the pipe buffer) to
the page.
Also since the remove_from_page_cache() + add_to_page_cache_locked()
are non-atomic it is possible that the page cache is repopulated in
between the two and add_to_page_cache_locked() will fail. This could
be fixed by creating a new atomic replace_page_cache_page() function.
fuse_readpages_end() needed to be reworked so it works even if
page->mapping is NULL for some or all pages which can happen if the
add_to_page_cache_locked() failed.
A number of sanity checks were added to make sure the stolen pages
don't have weird flags set, etc... These could be moved into generic
splice/steal code.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Allow userspace filesystem implementation to use splice() to write to
the fuse device. The semantics of using splice() are:
1) buffer the message header and data in a temporary pipe
2) with a *single* splice() call move the message from the temporary pipe
to the fuse device
The READ reply message has the most interesting use for this, since
now the data from an arbitrary file descriptor (which could be a
regular file, a block device or a socket) can be tranferred into the
fuse device without having to go through a userspace buffer. It will
also allow zero copy moving of pages.
One caveat is that the protocol on the fuse device requires the length
of the whole message to be written into the header. But the length of
the data transferred into the temporary pipe may not be known in
advance. The current library implementation works around this by
using vmplice to write the header and modifying the header after
splicing the data into the pipe (error handling omitted):
struct fuse_out_header out;
iov.iov_base = &out;
iov.iov_len = sizeof(struct fuse_out_header);
vmsplice(pip[1], &iov, 1, 0);
len = splice(input_fd, input_offset, pip[1], NULL, len, 0);
/* retrospectively modify the header: */
out.len = len + sizeof(struct fuse_out_header);
splice(pip[0], NULL, fuse_chan_fd(req->ch), NULL, out.len, flags);
This works since vmsplice only saves a pointer to the data, it does
not copy the data itself.
Since pipes are currently limited to 16 pages and messages need to be
spliced atomically, the length of the data is limited to 15 pages (or
60kB for 4k pages).
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acquire a page ref on pages in ->readpages() and release them when the
read has finished. Not acquiring a reference didn't seem to cause any
trouble since the page is locked and will not be kicked out of the
page cache during the read.
However the following patches will want to remove the page from the
cache so a separate ref is needed. Making the reference in req->pages
explicit also makes the code easier to understand.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Replace uses of get_user_pages() with get_user_pages_fast(). It looks
nicer and should be faster in most cases.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
"map" isn't needed any more after: 0bd87182d3 "fuse: fix kunmap in
fuse_ioctl_copy_user"
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
This is the driver for the hardware watchdog on the Freescale IMX2 and later processors.
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Cc: Vladimir Zapolskiy <vzapolskiy@gmail.com>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Tested-by: Juergen Beisert <jbe@pengutronix.de>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Commit 5f487cd34f (power_supply: Use
attribute groups) causes a regression the power supply core does not
export the 'type' attribute anymore.
POWER_SUPPLY_PROP_TYPE is handled by the power supply core without the
low-level driver, so power_supply_attr_is_visible() must always return
the entry as readable.
Reported-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Daniel Mack <daniel@caiaq.de>
Tested-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com>
Patch b7e2ecef92 (perf, trace: Optimize tracepoints by removing
IRQ-disable from perf/tracepoint interaction) made the
unfortunate mistake of assuming the world is x86 only, correct
this.
The problem was that perf_fetch_caller_regs() did
local_save_flags() into regs->flags, and I re-used that to
remove another local_save_flags(), forgetting !x86 doesn't have
regs->flags.
Do the reverse, remove the local_save_flags() from
perf_fetch_caller_regs() and let the ftrace site do the
local_save_flags() instead.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mackerras <paulus@samba.org>
Cc: acme@redhat.com
Cc: efault@gmx.de
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
LKML-Reference: <1274778175.5882.623.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
On Mon, 2010-05-17 at 17:34 +0200, Mark Brown wrote:
> This doesn't seem like the right error handling - if the driver has a
> set_mode() you'd *expect* it to have a get_mode() but there's no need
> for it to be a strict requirement.
True. In such a case, even a valid request would be lost! So now
in the updated patch:
- check if get_mode is present to avoid oops;
- if get_mode is not present, proceed anyways for the request.
Here is the updated patch:
>From bad0d5eb51ef84be5b100e3dd0f5a590ea0529b6 Mon Sep 17 00:00:00 2001
From: Sundar R Iyer <sundar.iyer@stericsson.com>
Date: Fri, 14 May 2010 15:14:17 +0530
Subject: [PATCH 1/1] regulator: return set_mode when same mode is requested
save I/O costs by returning when the same mode is
requested for the regulator
Cc: Liam Girdwood <lrg@slimlogic.co.uk>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Sundar R Iyer <sundar.iyer@stericsson.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
This patch adds a missing .owner field in regulator_desc, which is used for refcounting.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
All twl6030 regulators can be programmed from 1.0v to 3.3v
with 100mV steps.
The below formula can be used to calculate the vsel values
to be programmed in the VREG_VOLTAGE registers.
Voltage(in mV) = 1000mv + 100mv * (vsel - 1)
Ex: if vsel = 0x9, mV = 1000 + 100 * (9 -1) = 1800mV.
This patch removes all existing VSEL tables for twl6030 adjustable
regulators and just uses the formula directly for vsel calculations
after verifing they fall in the allowed range.
Signed-off-by: Rajendra Nayak <rnayak@ti.com>
Cc: Liam Girdwood <lrg@slimlogic.co.uk>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
In the case of "min_uV == max_uV == mc13783_regulators[id].voltages[0]",
mc13783_fixed_regulator_set_voltage should return 0 instead of -EINVAL.
This patch also adds a missing ">" character for MODULE_AUTHOR, a trivial fix.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: Liam Girdwood <lrg@slimlogic.co.uk>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
A lot of condition comparision statements are used in original driver. These
statements are used to check the boundary of voltage numbers since voltage
number isn't linear.
Now use array of voltage numbers instead. Clean code with simpler way.
Signed-off-by: Haojian Zhuang <haojian.zhuang@marvell.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Remove a lot of driver structures in 88pm860x driver. Make regulators share
one driver structure.
Signed-off-by: Haojian Zhuang <haojian.zhuang@marvell.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Simply remove all consumer supplies for the regulator on errors. Remove
unset_consumer_device_supply() which is no longer used.
Signed-off-by: Jani Nikula <ext-jani.1.nikula@nokia.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Remove all matching consumer supplies, not just the first, to not leave
dangling pointers.
Signed-off-by: Jani Nikula <ext-jani.1.nikula@nokia.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Pointer comparison is not sufficient for non-NULL device name matching,
so use strcmp(). Otherwise the semantics remain the same.
Signed-off-by: Jani Nikula <ext-jani.1.nikula@nokia.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
When one regulator supplies another allow the relationship to be specified
using names rather than struct regulators, in a similar manner to that
allowed for consumer supplies. This allows static configuration at compile
time, reducing the need for dynamic init code.
Also change the references to LINE supply to be system supply since line
is sometimes used for actual supplies and therefore potentially confusing.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
If the request for wdt_mem region fails, this patch modifies the driver
such that, it does not try to release the wdt_mem region on exit.
Signed-off-by: Banajit Goswami <banajit.g@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
This patch adds HAVE_S3C2410_WATCHDOG to control inclusion of watchdog driver
for Samsung SoCs. This option will help to include the driver only for the
necessary machines and not for all for any given arch.
Signed-off-by: Banajit Goswami <banajit.g@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
For TCO V1 devices the programmed timeout was twice too long
because the fact that the TCO V1 timer needs to count down
twice before triggering the watchdog, wasn't accounted for.
Also the timeout values in the module description and error
message were clarified. And the _STS registers are 16 bit
instead of 8 bit.
Signed-off-by: Pádraig Brady <P@draigBrady.com>
Tested-by: Simon Kagstrom <simon.kagstrom@netinsight.se>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
If we are not able to register then it is better to have
watchdog in disabled state than noticing a system reboot.
Signed-off-by: Ameya Palande <ameya.palande@nokia.com>
Acked-By: Timo Kokkonen <timo.t.kokkonen@nokia.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Move the limited watchdog driver help from kernel-parameters.txt
to Documentation/watchdog/watchdog-parameters.txt and add info to it
for all watchdog drivers except the ones that have driver-specific
files already.
Correct minor comments and MODULE_PARM_DESC() text in 2 places.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>