Commit graph

33,155 commits

Author SHA1 Message Date
Tyler Hicks
bc5abcf7e4 eCryptfs: Check return of filemap_write_and_wait during fsync
Error out of ecryptfs_fsync() if filemap_write_and_wait() fails.

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Cc: Paul Taysom <taysom@chromium.org>
Cc: Olof Johansson <olofj@chromium.org>
Cc: stable@vger.kernel.org # v3.6+
2013-06-04 23:53:31 -07:00
Linus Torvalds
6f66f9005b Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes
Pull gfs2 fixes from Steven Whitehouse:
 "There are four patches this time.

  The first fixes a problem where the wrong descriptor type was being
  written into the log for journaled data blocks.

  The second fixes a race relating to the deallocation of allocator
  data.

  The third provides a fallback if kmalloc is unable to satisfy a
  request to allocate a directory hash table.

  The fourth fixes the iopen glock caching so that inodes are deleted in
  a more timely manner after rmdir/unlink"

* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes:
  GFS2: Don't cache iopen glocks
  GFS2: Fall back to vmalloc if kmalloc fails for dir hash tables
  GFS2: Increase i_writecount during gfs2_setattr_size
  GFS2: Set log descriptor type for jdata blocks
2013-06-05 09:06:28 +09:00
Linus Torvalds
8764d86100 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse fixes from Miklos Szeredi:
 "One patch fixes an Oops introduced in 3.9 with the readdirplus
  feature.  The rest are fixes for async-dio in 3.10"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: fix alignment in short read optimization for async_dio
  fuse: return -EIOCBQUEUED from fuse_direct_IO() for all async requests
  fuse: fix readdirplus Oops in fuse_dentry_revalidate
  fuse: update inode size and invalidate attributes on fallocate
  fuse: truncate pagecache range on hole punch
  fuse: allocate for_background dio requests based on io->async state
2013-06-05 09:03:31 +09:00
Dave Chinner
59913f14df xfs: fix remote attribute invalidation for a leaf
When invalidating an attribute leaf block block, there might be
remote attributes that it points to. With the recent rework of the
remote attribute format, we have to make sure we calculate the
length of the attribute correctly. We aren't doing that in
xfs_attr3_leaf_inactive(), so fix it.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Mark Tinguely <tinuguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-06-04 17:36:30 -05:00
Dave Chinner
6fcdc59de2 xfs: rework dquot CRCs
Calculating dquot CRCs when the backing buffer is written back just
doesn't work reliably. There are several places which manipulate
dquots directly in the buffers, and they don't calculate CRCs
appropriately, nor do they always set the buffer up to calculate
CRCs appropriately.

Firstly, if we log a dquot buffer (e.g. during allocation) it gets
logged without valid CRC, and so on recovery we end up with a dquot
that is not valid.

Secondly, if we recover/repair a dquot, we don't have a verifier
attached to the buffer and hence CRCs are not calculated on the way
down to disk.

Thirdly, calculating the CRC after we've changed the contents means
that if we re-read the dquot from the buffer, we cannot verify the
contents of the dquot are valid, as the CRC is invalid.

So, to avoid all the dquot CRC errors that are being detected by the
read verifier, change to using the same model as for inodes. That
is, dquot CRCs are calculated and written to the backing buffer at
the time the dquot is flushed to the backing buffer. If we modify
the dquot directly in the backing buffer, calculate the CRC
immediately after the modification is complete. Hence the dquot in
the on-disk buffer should always have a valid CRC.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-06-04 17:35:51 -05:00
Jan Kara
5dc23bdd5f ext4: remove ext4_ioend_wait()
Now that we clear PageWriteback after extent conversion, there's no
need to wait for io_end processing in ext4_evict_inode().  Running
AIO/DIO keeps file reference until aio_complete() is called so
ext4_evict_inode() cannot be called.  For io_end structures resulting
from buffered IO waiting is happening because we wait for
PageWriteback in truncate_inode_pages().

Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 14:46:12 -04:00
Jan Kara
c724585b62 ext4: don't wait for extent conversion in ext4_punch_hole()
We don't have to wait for extent conversion in ext4_punch_hole() as
buffered IO for the punched range has been flushed and waited upon
(thus all extent conversions for that range have completed).  Also we
wait for all DIO to finish using inode_dio_wait() so there cannot be
any extent conversions pending due to direct IO.

Also remove ext4_flush_unwritten_io() since it's unused now.

Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 14:44:36 -04:00
Jan Kara
38b8ff7db4 ext4: Remove wait for unwritten extents in ext4_ind_direct_IO()
We don't have to wait for unwritten extent conversion in
ext4_ind_direct_IO() as all writes that happened before DIO are
flushed by the generic code and extent conversion has happened before
we cleared PageWriteback bit.

Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 14:41:29 -04:00
Jan Kara
92e6222dfb ext4: remove i_mutex from ext4_file_sync()
After removal of ext4_flush_unwritten_io() call, ext4_file_sync()
doesn't need i_mutex anymore. Forcing of transaction commits doesn't
need i_mutex as there's nothing inode specific in that code apart from
grabbing transaction ids from the inode. So remove the lock.

Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 14:40:39 -04:00
Jan Kara
37b10dd063 ext4: use generic_file_fsync() in ext4_file_fsync() in nojournal mode
Just use the generic function instead of duplicating it.  We only need
to reshuffle the read-only check a bit (which is there to prevent
writing to a filesystem which has been remounted read-only after error
I assume).

Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 14:40:09 -04:00
Jan Kara
a115f749c1 ext4: remove wait for unwritten extent conversion from ext4_truncate()
Since PageWriteback bit is now cleared after extents are converted
from unwritten to written ones, we have full exclusion of writeback
path from truncate (truncate_inode_pages() waits for PageWriteback
bits to get cleared on all invalidated pages).  Exclusion from DIO
path is achieved by inode_dio_wait() call in ext4_setattr().  So
there's no need to wait for extent convertion in ext4_truncate()
anymore.

Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 14:30:00 -04:00
Jan Kara
e83403959f ext4: protect extent conversion after DIO with i_dio_count
Make sure extent conversion after DIO happens while i_dio_count is
still elevated so that inode_dio_wait() waits until extent conversion
is done.  This removes the need for explicit waiting for extent
conversion in some cases.

Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 14:27:38 -04:00
Jan Kara
b0857d309f ext4: defer clearing of PageWriteback after extent conversion
Currently PageWriteback bit gets cleared from put_io_page() called
from ext4_end_bio().  This is somewhat inconvenient as extent tree is
not fully updated at that time (unwritten extents are not marked as
written) so we cannot read the data back yet.  This design was
dictated by lock ordering as we cannot start a transaction while
PageWriteback bit is set (we could easily deadlock with
ext4_da_writepages()).  But now that we use transaction reservation
for extent conversion, locking issues are solved and we can move
PageWriteback bit clearing after extent conversion is done.  As a
result we can remove wait for unwritten extent conversion from
ext4_sync_file() because it already implicitely happens through
wait_on_page_writeback().

We implement deferring of PageWriteback clearing by queueing completed
bios to appropriate io_end and processing all the pages when io_end is
going to be freed instead of at the moment ext4_io_end() is called.

Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 14:23:41 -04:00
Jan Kara
2e8fa54e3b ext4: split extent conversion lists to reserved & unreserved parts
Now that we have extent conversions with reserved transaction, we have
to prevent extent conversions without reserved transaction (from DIO
code) to block these (as that would effectively void any transaction
reservation we did).  So split lists, work items, and work queues to
reserved and unreserved parts.

Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 14:21:02 -04:00
Jan Kara
6b523df4fb ext4: use transaction reservation for extent conversion in ext4_end_io
Later we would like to clear PageWriteback bit only after extent
conversion from unwritten to written extents is performed.  However it
is not possible to start a transaction after PageWriteback is set
because that violates lock ordering (and is easy to deadlock).  So we
have to reserve a transaction before locking pages and sending them
for IO and later we use the transaction for extent conversion from
ext4_end_io().

Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 13:21:11 -04:00
Jan Kara
3613d22807 ext4: remove buffer_uninit handling
There isn't any need for setting BH_Uninit on buffers anymore.  It was
only used to signal we need to mark io_end as needing extent
conversion in add_bh_to_extent() but now we can mark the io_end
directly when mapping extent.

Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 13:19:34 -04:00
Jan Kara
4e7ea81db5 ext4: restructure writeback path
There are two issues with current writeback path in ext4.  For one we
don't necessarily map complete pages when blocksize < pagesize and
thus needn't do any writeback in one iteration.  We always map some
blocks though so we will eventually finish mapping the page.  Just if
writeback races with other operations on the file, forward progress is
not really guaranteed. The second problem is that current code
structure makes it hard to associate all the bios to some range of
pages with one io_end structure so that unwritten extents can be
converted after all the bios are finished.  This will be especially
difficult later when io_end will be associated with reserved
transaction handle.

We restructure the writeback path to a relatively simple loop which
first prepares extent of pages, then maps one or more extents so that
no page is partially mapped, and once page is fully mapped it is
submitted for IO. We keep all the mapping and IO submission
information in mpage_da_data structure to somewhat reduce stack usage.
Resulting code is somewhat shorter than the old one and hopefully also
easier to read.

Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 13:17:40 -04:00
Jan Kara
fffb273997 ext4: better estimate credits needed for ext4_da_writepages()
We limit the number of blocks written in a single loop of
ext4_da_writepages() to 64 when inode uses indirect blocks.  That is
unnecessary as credit estimates for mapping logically continguous run
of blocks is rather low even for inode with indirect blocks.  So just
lift this limitation and properly calculate the number of necessary
credits.

This better credit estimate will also later allow us to always write
at least a single page in one iteration.

Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 13:01:11 -04:00
Jan Kara
fa55a0ed03 ext4: improve writepage credit estimate for files with indirect blocks
ext4_ind_trans_blocks() wrongly used 'chunk' argument to decide whether
blocks mapped are logically contiguous. That is wrong since the argument
informs whether the blocks are physically contiguous. As the blocks
mapped are always logically contiguous and that's all
ext4_ind_trans_blocks() cares about, just remove the 'chunk' argument.

Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 12:56:55 -04:00
Jan Kara
f2d50a65c9 ext4: deprecate max_writeback_mb_bump sysfs attribute
This attribute is now unused so deprecate it.  We still show the old
default value to keep some compatibility but we don't allow writing to
that attribute anymore.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 12:51:16 -04:00
Jan Kara
39bba40b7a ext4: stop messing with nr_to_write in ext4_da_writepages()
Writeback code got better in how it submits IO and now the number of
pages requested to be written is usually higher than original 1024.
The number is now dynamically computed based on observed throughput
and is set to be about 0.5 s worth of writeback.  E.g. on ordinary
SATA drive this ends up somewhere around 10000 as my testing shows.
So remove the unnecessary smarts from ext4_da_writepages().

Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 12:50:24 -04:00
Jan Kara
5fe2fe895a ext4: provide wrappers for transaction reservation calls
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 12:37:50 -04:00
Jan Kara
8f7d89f368 jbd2: transaction reservation support
In some cases we cannot start a transaction because of locking
constraints and passing started transaction into those places is not
handy either because we could block transaction commit for too long.
Transaction reservation is designed to solve these issues.  It
reserves a handle with given number of credits in the journal and the
handle can be later attached to the running transaction without
blocking on commit or checkpointing.  Reserved handles do not block
transaction commit in any way, they only reduce maximum size of the
running transaction (because we have to always be prepared to
accomodate request for attaching reserved handle).

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 12:35:11 -04:00
Jan Kara
f29fad7210 jbd2: remove unused waitqueues
j_wait_logspace and j_wait_checkpoint are unused.  Remove them.

Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 12:24:11 -04:00
Jan Kara
fe1e8db598 jbd2: fix race in t_outstanding_credits update in jbd2_journal_extend()
jbd2_journal_extend() first checked whether transaction can accept
extending handle with more credits and then added credits to
t_outstanding_credits.  This can race with start_this_handle() adding
another handle to a transaction and thus overbooking a transaction.
Make jbd2_journal_extend() use atomic_add_return() to close the race.

Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 12:22:15 -04:00
Jan Kara
76c3990456 jbd2: cleanup needed free block estimates when starting a transaction
__jbd2_log_space_left() and jbd_space_needed() were kind of odd.
jbd_space_needed() accounted also credits needed for currently
committing transaction while it didn't account for credits needed for
control blocks.  __jbd2_log_space_left() then accounted for control
blocks as a fraction of free space.  Since results of these two
functions are always only compared against each other, this works
correct but is somewhat strange.  Move the estimates so that
jbd_space_needed() returns number of blocks needed for a transaction
including control blocks and __jbd2_log_space_left() returns free
space in the journal (with the committing transaction already
subtracted).  Rename functions to jbd2_log_space_left() and
jbd2_space_needed() while we are changing them.

Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 12:12:57 -04:00
Jan Kara
2f387f849b jbd2: remove outdated comment
The comment about credit estimates isn't true anymore. We do what the
comment describes now.

Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 12:10:11 -04:00
Jan Kara
b34090e5e2 jbd2: refine waiting for shadow buffers
Currently when we add a buffer to a transaction, we wait until the
buffer is removed from BJ_Shadow list (so that we prevent any changes
to the buffer that is just written to the journal).  This can take
unnecessarily long as a lot happens between the time the buffer is
submitted to the journal and the time when we remove the buffer from
BJ_Shadow list.  (e.g.  We wait for all data buffers in the
transaction, we issue a cache flush, etc.)  Also this creates a
dependency of do_get_write_access() on transaction commit (namely
waiting for data IO to complete) which we want to avoid when
implementing transaction reservation.

So we modify commit code to set new BH_Shadow flag when temporary
shadowing buffer is created and we clear that flag once IO on that
buffer is complete.  This allows do_get_write_access() to wait only
for BH_Shadow bit and thus removes the dependency on data IO
completion.

Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 12:08:56 -04:00
Jan Kara
e5a120aeb5 jbd2: remove journal_head from descriptor buffers
Similarly as for metadata buffers, also log descriptor buffers don't
really need the journal head. So strip it and remove BJ_LogCtl list.

Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 12:06:01 -04:00
Jan Kara
f5113effc2 jbd2: don't create journal_head for temporary journal buffers
When writing metadata to the journal, we create temporary buffer heads
for that task.  We also attach journal heads to these buffer heads but
the only purpose of the journal heads is to keep buffers linked in
transaction's BJ_IO list.  We remove the need for journal heads by
reusing buffer_head's b_assoc_buffers list for that purpose.  Also
since BJ_IO list is just a temporary list for transaction commit, we
use a private list in jbd2_journal_commit_transaction() for that thus
removing BJ_IO list from transaction completely.

Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 12:01:45 -04:00
Jan Kara
97a851ed71 ext4: use io_end for multiple bios
Change writeback path to create just one io_end structure for the
extent to which we submit IO and share it among bios writing that
extent. This prevents needless splitting and joining of unwritten
extents when they cannot be submitted as a single bio.

Bugs in ENOMEM handling found by Linux File System Verification project
(linuxtesting.org) and fixed by Alexey Khoroshilov
<khoroshilov@ispras.ru>.

CC: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-06-04 11:58:58 -04:00
Linus Torvalds
0ab60871b4 A couple jfs bug fixes for 3.10-rc5
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.20 (GNU/Linux)
 
 iQIcBAABAgAGBQJRrLmSAAoJEDaohF61QIxkakoP/jo8fo2KaixVHRag1h7pBwVG
 7UJrgYmojoFUsmem+k34rXOzwc8eqv5BzTwGJe81ktatUjPkEQELdwryB4Rw6YYI
 F8/Mrc2AjHEavuTnojmENP22+YkPOW90r+k00dVk+cY9FQsc1/AhNYtzpVg9hSRX
 RBhGcgfnQQCAlIf09vr7R19oBLiHfrFNnUYJeUg/d0bxW8uiTimX7ENDTOtn61WF
 hLEABnL2tC0UtK3z/B0qNIpDE5DHHaLo8WcGjlXE3Pte4JA2rPDJwaHaUgvCEo6R
 iYCw5Dq6bL7Ce7wMP1GzY+LHgaeUg5Nknnkl5vEKZSZuy7dfC2eBlYbziqyJm73A
 0e7UgypkaNGa4p4MzM0J5OW+4bRRPzNir99BnpU8Bm6dJTY8ZeIMTTQFX2Fla8+G
 SCo6dMjEUfF+yIykDwmeZYjHl5XeKM6B3imoJ8+yD9vZLzg6MY4SUmgPEKL7BFoi
 U4aiID6N67LYAMb5vapzpcl+uHZbrw7CBnI2FanqdsAoPjath1CgrEvseBAUowpy
 EUeCLaQ05xR8ynm44tQCPQFgqBpRxAiVtpy9nPulA3cQCbwelEuLKrZB17DBfK9i
 w7f5e8nNG8gXfRuL5idGzS7g6Hp90JUhBWKh0dJC820OM7idWnXGCLbYTm7t6pvo
 m931/xUofSsXCLNwEo88
 =JZIK
 -----END PGP SIGNATURE-----

Merge tag 'jfs-3.10-rc5' of git://github.com/kleikamp/linux-shaggy

Pull jfs bugfixes from David Kleikamp:
 "A couple jfs bug fixes for 3.10-rc5"

* tag 'jfs-3.10-rc5' of git://github.com/kleikamp/linux-shaggy:
  fs/jfs: Add check if journaling to disk has been disabled in lbmRead()
  jfs: Several bugs in jfs_freeze() and jfs_unfreeze()
2013-06-04 06:33:44 +09:00
Stephen Rothwell
40b313608a Finally eradicate CONFIG_HOTPLUG
Ever since commit 45f035ab9b ("CONFIG_HOTPLUG should be always on"),
it has been basically impossible to build a kernel with CONFIG_HOTPLUG
turned off.  Remove all the remaining references to it.

Cc: Russell King <linux@arm.linux.org.uk>
Cc: Doug Thompson <dougthompson@xmission.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-03 14:20:18 -07:00
Mathias Krause
a3b2c8c7aa debugfs: write_file_bool() - ensure strtobool() operates on valid data
In case, userland writes an empty string to a bool debugfs file, buf[]
will still be uninitialized when being passed to strtobool() making the
outcome of that function purely random.

Fix this by always zero-terminating the buffer.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-03 13:55:02 -07:00
Seth Jennings
3a76e5e09f debugfs: add get/set for atomic types
debugfs currently lack the ability to create attributes
that set/get atomic_t values.

This patch adds support for this through a new
debugfs_create_atomic_t() function.

Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-03 13:55:01 -07:00
Bob Peterson
a6a4d98b01 GFS2: Don't cache iopen glocks
This patch makes GFS2 immediately reclaim/delete all iopen glocks
as soon as they're dequeued. This allows deleters to get an
EXclusive lock on iopen so files are deleted properly instead of
being set as unlinked.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-06-03 16:40:22 +01:00
Bob Peterson
e8830d8856 GFS2: Fall back to vmalloc if kmalloc fails for dir hash tables
This version has one more correction: the vmalloc calls are replaced
by __vmalloc calls to preserve the GFP_NOFS flag.

When GFS2's directory management code allocates buffers for a
directory hash table, if it can't get the memory it needs, it
currently gives a bad return code. Rather than giving an error,
this patch allows it to use virtual memory rather than kernel
memory for the hash table. This should make it possible for
directories to function properly, even when kernel memory becomes
very fragmented.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-06-03 16:39:44 +01:00
Bob Peterson
2b3dcf3581 GFS2: Increase i_writecount during gfs2_setattr_size
This patch calls get_write_access in a few functions. This
merely increases inode->i_writecount for the duration of the function.
That will ensure that any file closes won't delete the inode's
multi-block reservation while the function is running.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-06-03 16:38:58 +01:00
Bob Peterson
4a58681205 GFS2: Set log descriptor type for jdata blocks
This patch sets the log descriptor type according to whether the
journal commit is for (journaled) data or metadata. This was
recently broken when the functions to process data and metadata
log ops were combined.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-06-03 16:38:39 +01:00
Maxim Patlasov
e5c5f05dca fuse: fix alignment in short read optimization for async_dio
The bug was introduced with async_dio feature: trying to optimize short reads,
we cut number-of-bytes-to-read to i_size boundary. Hence the following example:

	truncate --size=300 /mnt/file
	dd if=/mnt/file of=/dev/null iflag=direct

led to FUSE_READ request of 300 bytes size. This turned out to be problem
for userspace fuse implementations who rely on assumption that kernel fuse
does not change alignment of request from client FS.

The patch turns off the optimization if async_dio is disabled. And, if it's
enabled, the patch fixes adjustment of number-of-bytes-to-read to preserve
alignment.

Note, that we cannot throw out short read optimization entirely because
otherwise a direct read of a huge size issued on a tiny file would generate
a huge amount of fuse requests and most of them would be ACKed by userspace
with zero bytes read.

Signed-off-by: Maxim Patlasov <MPatlasov@parallels.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-06-03 15:15:42 +02:00
Brian Foster
c9ecf989cc fuse: return -EIOCBQUEUED from fuse_direct_IO() for all async requests
If request submission fails for an async request (i.e.,
get_user_pages() returns -ERESTARTSYS), we currently skip the
-EIOCBQUEUED return and drop into wait_for_sync_kiocb() forever.

Avoid this by always returning -EIOCBQUEUED for async requests. If
an error occurs, the error is passed into fuse_aio_complete(),
returned via aio_complete() and thus propagated to userspace via
io_getevents().

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Maxim Patlasov <MPatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-06-03 15:15:42 +02:00
Miklos Szeredi
28420dad23 fuse: fix readdirplus Oops in fuse_dentry_revalidate
Fix bug introduced by commit 4582a4ab2a "FUSE: Adapt readdirplus to application
usage patterns".

We need to check for a positive dentry; negative dentries are not added by
readdirplus.  Secondly we need to advise the use of readdirplus on the *parent*,
otherwise the whole thing is useless.  Thirdly all this is only relevant if
"readdirplus_auto" mode is selected by the filesystem.

We advise the use of readdirplus only if the dentry was still valid.  If we had
to redo the lookup then there was no use in doing the -plus version.

Reported-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Feng Shuo <steve.shuo.feng@gmail.com>
CC: stable@vger.kernel.org
2013-06-03 14:40:22 +02:00
Jason Hrycay
1e03e38b35 f2fs: handle errors from get_node_page calls
Add check for error pointers returned from get_node_page in order to
avoid dereferencing a bad address on the next use.

Signed-off-by: Jason Hrycay <jason.hrycay@motorola.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-06-03 19:49:09 +09:00
Linus Torvalds
6cf3c73620 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull assorted fixes from Al Viro:
 "There'll be more - I'm trying to dig out from under the pile of mail
  (a couple of weeks of something flu-like ;-/) and there's several more
  things waiting for review; this is just the obvious stuff."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  zoran: racy refcount handling in vm_ops ->open()/->close()
  befs_readdir(): do not increment ->f_pos if filldir tells us to stop
  hpfs: deadlock and race in directory lseek()
  qnx6: qnx6_readdir() has a braino in pos calculation
  fix buffer leak after "scsi: saner replacements for ->proc_info()"
  vfs: Fix invalid ida_remove() call
2013-06-01 19:51:52 +09:00
Linus Torvalds
f8cb27916a NFS client fixes:
- Fix a regression that broke NFS mounting using klibc and busybox
 - Stable fix to check access modes correctly on NFSv4 delegated open()
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJRqUQdAAoJEGcL54qWCgDy4gMP/RfrcBIAJxEBUfB7Q/hCjoRG
 CTmfoq8XK/fTTmCBuTTjf0QntlKNJMFJSnJwGZqFzqA6PJOSDvkWFDf0WzzCtIEN
 3D8C6rko7oYUU2tcu8bJK3CGkfFNgh6ON519XMHMcOZCIpzycIO6DrZFvkcOS6aW
 lzlqHkB/ksH/G4Z7k2C9YyS6Ljr0odAxhsOcn27BWMtktCvsgTyPhowfo83aOzyd
 EFnwHwXSCbawyYIqNiBTFjoudaZwlFDhbStSWX5jcI/JwpR8rfhN8n1dU+AJknE6
 d6DdEgk1CVuBUke0VcDLf0dJ5iMbKY5bnoJF5PF09qTINPmErKUMNI13P3wt+Uyn
 RkmhoiH0ooiBhrkCJWlZHo/EJxl/ionQR7vOlEhN/Hvp11oreXmvHx/pL2WCCanJ
 wLEJnFGflay7NUfkKWSuUpyyBJTS7/5lrWaabBZ1qxlHSOonzwaKW3G0slzJroiV
 x6YmM7xEe6CEquEQmyFm5kJjlueYJpPzIvxFYsPun1cxpzjKosXgNCOEe8oe5Znj
 JkvNS+zc+VN7pw06EtmKgyqoavUSbVcAeZICegskCkUuhYQkdJaP5Lc819qeZj8D
 rlgyltkh6XP7s8W7a4UWePHqrjZyen62ULtNPLoEainSbdpu0BrItZqgnokQ3AIe
 mRiIKkpkoqHKVWWozhCh
 =HTIq
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.10-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull two NFS client fixes from Trond Myklebust:
 - Fix a regression that broke NFS mounting using klibc and busybox
 - Stable fix to check access modes correctly on NFSv4 delegated open()

* tag 'nfs-for-3.10-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFS: Fix security flavor negotiation with legacy binary mounts
  NFSv4: Fix a thinko in nfs4_try_open_cached
2013-06-01 19:48:59 +09:00
Jan Kara
8af8eecc13 ext4: fix overflow when counting used blocks on 32-bit architectures
The arithmetics adding delalloc blocks to the number of used blocks in
ext4_getattr() can easily overflow on 32-bit archs as we first multiply
number of blocks by blocksize and then divide back by 512. Make the
arithmetics more clever and also use proper type (unsigned long long
instead of unsigned long).

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2013-05-31 19:39:56 -04:00
Jan Kara
a60697f411 ext4: fix data offset overflow in ext4_xattr_fiemap() on 32-bit archs
On 32-bit architectures with 32-bit sector_t computation of data offset
in ext4_xattr_fiemap() can overflow resulting in reporting bogus data
location. Fix the problem by typing block number to proper type before
shifting.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2013-05-31 19:38:56 -04:00
Jan Kara
e7293fd146 ext4: fix overflows in SEEK_HOLE, SEEK_DATA implementations
ext4_lblk_t is just u32 so multiplying it by blocksize can easily
overflow for files larger than 4 GB. Fix that by properly typing the
block offsets before shifting.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
2013-05-31 19:37:56 -04:00
Jan Kara
eaf3793728 ext4: fix data offset overflow on 32-bit archs in ext4_inline_data_fiemap()
On 32-bit archs when sector_t is defined as 32-bit the logic computing
data offset in ext4_inline_data_fiemap(). Fix that by properly typing
the shifted value.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2013-05-31 19:33:42 -04:00
Linus Torvalds
1d822d6094 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull reiserfs fixes from Jan Kara:
 "Three reiserfs fixes.  They fix real problems spotted by users so I
  hope they are ok even at this stage."

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  reiserfs: fix deadlock with nfs racing on create/lookup
  reiserfs: fix problems with chowning setuid file w/ xattrs
  reiserfs: fix spurious multiple-fill in reiserfs_readdir_dentry
2013-06-01 06:59:14 +09:00