Commit graph

33,155 commits

Author SHA1 Message Date
Josef Bacik
655b09fe54 Btrfs: make sure roots are assigned before freeing their nodes
If we fail to load the chunk tree we'll call free_root_pointers, except we may
not have assigned the roots for the dev_root/extent_root/csum_root yet, so we
could NULL pointer deref at this point.  Just add checks to make sure these
roots are set to keep us from panicing.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:38 -04:00
Stefan Behrens
3a6cad9009 Btrfs: explicitly use global_block_rsv for quota_tree
The quota_tree was set up to use the empty_block_rsv before
which would be problematic when the filesystem is filled up
and ENOSPC happens during internal operations while the quota
tree is updated and COWed (when the btrfs_qgroup_info_item
items) are written. In fact, use_block_rsv() which is used
in btrfs_cow_block() falls back to the global_block_rsv in
this case. But just in order to make it more clear what is
happening, change it to explicitly use the global_block_rsv.

Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:36 -04:00
Alexandre Oliva
17a5adccf3 btrfs: do away with non-whole_page extent I/O
end_bio_extent_readpage computes whole_page based on bv_offset and
bv_len, without taking into account that blk_update_request may modify
them when some of the blocks to be read into a page produce a read
error.  This would cause the read to unlock only part of the file
range associated with the page, which would in turn leave the entire
page locked, which would not only keep the process blocked instead of
returning -EIO to it, but also prevent any further access to the file.

It turns out that btrfs always issues whole-page reads and writes.
The special handling of non-whole_page appears to be a mistake or a
left-over from a time when this wasn't the case.  Indeed,
end_bio_extent_writepage distinguished between whole_page and
non-whole_page writes but behaved identically in both cases!

I've replaced the whole_page computations with warnings, just to be
sure that we're not issuing partial page reads or writes.  The
warnings should probably just go away some time.

Signed-off-by: Alexandre Oliva <oliva@gnu.org>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:35 -04:00
Miao Xie
b216cbfb52 Btrfs: don't invoke btrfs_invalidate_inodes() in the spin lock context
btrfs_invalidate_inodes() may sleep, so we should not invoke it in the
spin lock context. Fix it.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:34 -04:00
Miao Xie
314297c2a3 Btrfs: remove BUG_ON() in btrfs_read_fs_tree_no_radix()
We have checked if ->node is NULL or not, so it is unnecessary to
use BUG_ON() to check again. Remove it.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:32 -04:00
Miao Xie
061594ef17 Btrfs: pause the space balance when remounting to R/O
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:31 -04:00
Miao Xie
e1409cef85 Btrfs: fix unprotected root node of the subvolume's inode rb-tree
The root node of the rb-tree may be changed, so we should get it under
the lock. Fix it.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:30 -04:00
Miao Xie
89042e5ad2 Btrfs: fix accessing a freed tree root
inode_tree_del() will move the tree root into the dead root list, and
then the tree will be destroyed by the cleaner. So if we remove the
delayed node which is cached in the inode after inode_tree_del(),
we may access a freed tree root. Fix it.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:29 -04:00
Liu Bo
b9aa55bed1 Btrfs: return errno if possible when we fail to allocate memory
We need to set return value explicitly, otherwise we'll lose the error
value.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:27 -04:00
Miao Xie
d88033dbf4 Btrfs: update the global reserve if it is empty
Before applying this patch, we reserved the space for the global reserve
by the minimum unit if we found it is empty, it was unreasonable and
inefficient, because if the global reserve space was depleted, it implied
that the size of the global reserve was too small. In this case, we shoud
update the global reserve and fill it.

Cc: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:26 -04:00
Miao Xie
5881cfc924 Btrfs: don't steal the reserved space from the global reserve if their space type is different
If the type of the space we need is different with the global reserve, we
can not steal the space from the global reserve, because we can not allocate
the space from the free space cache that the global reserve points to.

Cc: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:25 -04:00
Miao Xie
b586b32374 Btrfs: optimize the error handle of use_block_rsv()
cc: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:24 -04:00
Miao Xie
7b61cd9224 Btrfs: don't use global block reservation for inode cache truncation
It is very likely that there are lots of subvolumes/snapshots in the filesystem,
so if we use global block reservation to do inode cache truncation, we may hog
all the free space that is reserved in global rsv. So it is better that we do
the free space reservation for inode cache truncation by ourselves.

Cc: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:22 -04:00
Miao Xie
7cfa9e51d2 Btrfs: don't abort the current transaction if there is no enough space for inode cache
The filesystem with inode cache was forced to be read-only when we umounted it.

Steps to reproduce:
 # mkfs.btrfs -f ${DEV}
 # mount -o inode_cache ${DEV} ${MNT}
 # dd if=/dev/zero of=${MNT}/file1 bs=1M count=8192
 # btrfs fi syn ${MNT}
 # dd if=${MNT}/file1 of=/dev/null bs=1M
 # rm -f ${MNT}/file1
 # btrfs fi syn ${MNT}
 # umount ${MNT}

It is because there was no enough space to do inode cache truncation, and then
we aborted the current transaction.

But no space error is not a serious problem when we write out the inode cache,
and it is safe that we just skip this step if we meet this problem. So we need
not abort the current transaction.

Reported-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Tested-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:21 -04:00
Andreas Philipp
8250dabedb Correct allowed raid levels on balance.
Raid5 with 3 devices is well defined while the old logic allowed
raid5 only with a minimum of 4 devices when converting the block group
profile via btrfs balance. Creating a raid5 with just three devices
using mkfs.btrfs worked always as expected. This is now fixed and the
whole logic is rewritten.

Signed-off-by: Andreas Philipp <philipp.andreas@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:20 -04:00
Stefan Behrens
379cde741b Btrfs: fix possible memory leak in replace_path()
In replace_path(), if read_tree_block() fails, we cannot return
directly, we should free some allocated memory otherwise memory
leak happens.

Similar to Wang's "Btrfs: fix possible memory leak in the
find_parent_nodes()" patch, the current commit fixes an issue that
is related to the "Btrfs: fix all callers of read_tree_block"
commit.

Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:19 -04:00
Wang Shilong
c16c2e2e51 Btrfs: fix possible memory leak in the find_parent_nodes()
In the find_parent_nodes(), if read_tree_block() fails, we can
not return directly, we should free some allocated memory otherwise
memory leak happens.

Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:17 -04:00
Stefan Behrens
4968810752 Btrfs: don't allow device replace on RAID5/RAID6
This is not yet supported and causes crashes. One sad user reported
that it destroyed his filesystem.

One failure is in __btrfs_map_block+0xc1f calling kmalloc(0).

0x5f21f is in __btrfs_map_block (fs/btrfs/volumes.c:4923).
4918                            num_stripes = map->num_stripes;
4919                            max_errors = nr_parity_stripes(map);
4920
4921                            raid_map = kmalloc(sizeof(u64) * num_stripes,
4922                                               GFP_NOFS);
4923                            if (!raid_map) {
4924                                    ret = -ENOMEM;
4925                                    goto out;
4926                            }
4927

There might be more issues. Until this is really tested, don't allow
users to start the procedure on RAID5/RAID6 filesystems.

Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:16 -04:00
Josef Bacik
b1c79e0947 Btrfs: handle running extent ops with skinny metadata
Chris hit a bug where we weren't finding extent records when running extent ops.
This is because we use the delayed_ref_head when running the extent op, which
means we can't use the ->type checks to see if we are metadata.  We also lose
the level of the metadata we are working on.  So to fix this we can just check
the ->is_data section of the extent_op, and we can store the level of the buffer
we were modifying in the extent_op.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:15 -04:00
Josef Bacik
73e1e61fb8 Btrfs: remove warn on in free space cache writeout
This catches block groups that are too large to properly cache.  We deal with
this case fine, so the warning just confuses users.  Remove the warning.
Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:13 -04:00
Josef Bacik
69a85bd87c Btrfs: don't null pointer deref on abort
I'm sorry, theres no excuse for this sort of work.  We need to use
root->leafsize since eb may be NULL.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:12 -04:00
Gabriel de Perthuis
03b71c6ca6 btrfs: don't stop searching after encountering the wrong item
The search ioctl skips items that are too large for a result buffer, but
inline items of a certain size occuring before any search result is
found would trigger an overflow and stop the search entirely.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=57641

Cc: stable@vger.kernel.org
Signed-off-by: Gabriel de Perthuis <g2p.code+btrfs@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 21:40:10 -04:00
Liu Bo
a52f4cd2b1 Btrfs: fix off-by-one in fiemap
lock_extent/unlock_extent expect an exclusive end.

Tested-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 16:27:26 -04:00
David Sterba
60b62978bc btrfs: annotate quota tree for lockdep
Quota tree has been missing from lockdep annotations, though no warning
has been seen in the wild.

There's currently one entry that does not belong there,
BTRFS_ORPHAN_OBJECTID.  No such tree exists, it's probably a copy &
paste mistake, the id is defined among tree ids.

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2013-05-17 16:27:25 -04:00
Jim Schutt
39be95e9c8 ceph: ceph_pagelist_append might sleep while atomic
Ceph's encode_caps_cb() worked hard to not call __page_cache_alloc()
while holding a lock, but it's spoiled because ceph_pagelist_addpage()
always calls kmap(), which might sleep.  Here's the result:

[13439.295457] ceph: mds0 reconnect start
[13439.300572] BUG: sleeping function called from invalid context at include/linux/highmem.h:58
[13439.309243] in_atomic(): 1, irqs_disabled(): 0, pid: 12059, name: kworker/1:1
    . . .
[13439.376225] Call Trace:
[13439.378757]  [<ffffffff81076f4c>] __might_sleep+0xfc/0x110
[13439.384353]  [<ffffffffa03f4ce0>] ceph_pagelist_append+0x120/0x1b0 [libceph]
[13439.391491]  [<ffffffffa0448fe9>] ceph_encode_locks+0x89/0x190 [ceph]
[13439.398035]  [<ffffffff814ee849>] ? _raw_spin_lock+0x49/0x50
[13439.403775]  [<ffffffff811cadf5>] ? lock_flocks+0x15/0x20
[13439.409277]  [<ffffffffa045e2af>] encode_caps_cb+0x41f/0x4a0 [ceph]
[13439.415622]  [<ffffffff81196748>] ? igrab+0x28/0x70
[13439.420610]  [<ffffffffa045e9f8>] ? iterate_session_caps+0xe8/0x250 [ceph]
[13439.427584]  [<ffffffffa045ea25>] iterate_session_caps+0x115/0x250 [ceph]
[13439.434499]  [<ffffffffa045de90>] ? set_request_path_attr+0x2d0/0x2d0 [ceph]
[13439.441646]  [<ffffffffa0462888>] send_mds_reconnect+0x238/0x450 [ceph]
[13439.448363]  [<ffffffffa0464542>] ? ceph_mdsmap_decode+0x5e2/0x770 [ceph]
[13439.455250]  [<ffffffffa0462e42>] check_new_map+0x352/0x500 [ceph]
[13439.461534]  [<ffffffffa04631ad>] ceph_mdsc_handle_map+0x1bd/0x260 [ceph]
[13439.468432]  [<ffffffff814ebc7e>] ? mutex_unlock+0xe/0x10
[13439.473934]  [<ffffffffa043c612>] extra_mon_dispatch+0x22/0x30 [ceph]
[13439.480464]  [<ffffffffa03f6c2c>] dispatch+0xbc/0x110 [libceph]
[13439.486492]  [<ffffffffa03eec3d>] process_message+0x1ad/0x1d0 [libceph]
[13439.493190]  [<ffffffffa03f1498>] ? read_partial_message+0x3e8/0x520 [libceph]
    . . .
[13439.587132] ceph: mds0 reconnect success
[13490.720032] ceph: mds0 caps stale
[13501.235257] ceph: mds0 recovery completed
[13501.300419] ceph: mds0 caps renewed

Fix it up by encoding locks into a buffer first, and when the number
of encoded locks is stable, copy that into a ceph_pagelist.

[elder@inktank.com: abbreviated the stack info a bit.]

Cc: stable@vger.kernel.org # 3.4+
Signed-off-by: Jim Schutt <jaschut@sandia.gov>
Reviewed-by: Alex Elder <elder@inktank.com>
2013-05-17 12:45:48 -05:00
Jim Schutt
c420276a53 ceph: add cpu_to_le32() calls when encoding a reconnect capability
In his review, Alex Elder mentioned that he hadn't checked that
num_fcntl_locks and num_flock_locks were properly decoded on the
server side, from a le32 over-the-wire type to a cpu type.
I checked, and AFAICS it is done; those interested can consult
    Locker::_do_cap_update()
in src/mds/Locker.cc and src/include/encoding.h in the Ceph server
code (git://github.com/ceph/ceph).

I also checked the server side for flock_len decoding, and I believe
that also happens correctly, by virtue of having been declared
__le32 in struct ceph_mds_cap_reconnect, in src/include/ceph_fs.h.

Cc: stable@vger.kernel.org # 3.4+
Signed-off-by: Jim Schutt <jaschut@sandia.gov>
Reviewed-by: Alex Elder <elder@inktank.com>
2013-05-17 12:45:43 -05:00
Rami Rosen
5240d58c04 sysfs: kill sysfs_sb declaration in fs/sysfs/inode.c.
This patch removes sysfs_sb declaration from fs/sysfs/inode.c
(due to 0f4288ec6f,
 "Kill unused sysfs_sb variable").

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-17 10:26:08 -07:00
Warner Wang
434749108c sysfs: sysfs_link_sibling(): fix typo in comment
Fix a typo subling->sibling in the comment of sysfs_link_sibling().

Signed-off-by: Warner Wang <warner.wang@hp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-17 10:26:08 -07:00
J. Bruce Fields
ba4e55bb67 nfsd4: fix compile in !CONFIG_NFSD_V4_SECURITY_LABEL case
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-05-15 10:35:18 -04:00
David Quigley
18032ca062 NFSD: Server implementation of MAC Labeling
Implement labeled NFS on the server: encoding and decoding, and writing
and reading, of file labels.

Enabled with CONFIG_NFSD_V4_SECURITY_LABEL.

Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com>
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-05-15 09:27:02 -04:00
Linus Torvalds
b973425cbb Fixed regressions (two stability regressions and a performance
regression) introduced during the 3.10-rc1 merge window.  Also
 included is a bug fix relating to allocating blocks after resizing an
 ext3 file system when using the ext4 file system driver.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABCAAGBQJRkZBlAAoJENNvdpvBGATwLYQP/iWOBs2z93WG23cqkgqvL8o6
 ZyeJdgy9dkFCArVDX5SSnGkJXZ3iqIKi5HoTKTJKfytgMzgiDAZcLsIHVv6NczwR
 UGhjgS3HEdV5tJ46E6JnpB3NLSb+rAdc5kCdlsbzU46CP+JjFiYEhxVpK7ELuM/G
 yctChbIH9FY+1OwxHccacBOaJU2ELhnH6B/8Ry/6gM2H0vfKeTNOdocOHdxvbNqg
 ooGjytMfVopMQEfVG8aXtTfy341NFJH5fAYEahCcXxeO9ta6Unj9yOu5JV2wVrTt
 39+DBsquGX6AVQsc9IxJ6YAN6ldwWN7l3huE9/AI0o/alwGsfVi5M+M/d1MMjDqf
 Fgl2EzzBpZQeKKY9UXNi4LLgYdBiILMgKDOGoRKhRb8ynSSf/JX43+24FvidEi3o
 o//J4aR+oSZfaovGAeikqyF1cumayhoNN8MINRN8igIinBiC4GjBFEl/Kl/1eAY/
 lREGcsmYPXOkVPpM72waRYlP4GwNdOg4QSEY0SGljpwluO+dYtKQjHXcv/s/xL5v
 j3GemzYVyjx4zaq1g3PxGfuD6VKFHr0T6jvzd6cHu17lnPlw9fwznHbEm9BEcXDY
 gbGx9u+a2ZTqDwYVALbeoRpf9Zz6DUCse3ts4N3rbkXUQQiBYo7tybfVopIMAukb
 CexvidDE/ryJrJJFBwoK
 =6cRD
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 update from Ted Ts'o:
 "Fixed regressions (two stability regressions and a performance
  regression) introduced during the 3.10-rc1 merge window.

  Also included is a bug fix relating to allocating blocks after
  resizing an ext3 file system when using the ext4 file system driver"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  jbd,jbd2: fix oops in jbd2_journal_put_journal_head()
  ext4: revert "ext4: use io_end for multiple bios"
  ext4: limit group search loop for non-extent files
  ext4: fix fio regression
2013-05-14 09:30:54 -07:00
Brian Foster
de82b92301 fuse: allocate for_background dio requests based on io->async state
Commit 8b41e671 introduced explicit background checking for fuse_req
structures with BUG_ON() checks for the appropriate type of request in
in the associated send functions. Commit bcba24cc introduced the ability
to send dio requests as background requests but does not update the
request allocation based on the type of I/O request. As a result, a
BUG_ON() triggers in the fuse_request_send_background() background path if
an async I/O is sent.

Allocate a request based on the async state of the fuse_io_priv to avoid
the BUG.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-05-14 18:25:44 +02:00
Lingzhu Xiang
3fab70c165 efivarfs: Never return ENOENT from firmware again
Previously in 1fa7e69 efi_status_to_err() translated firmware status
EFI_NOT_FOUND to -EIO instead of -ENOENT for efivarfs operations to
avoid confusion. After refactoring in e14ab23, it is also used in other
places where the translation may be unnecessary.

So move the translation to efivarfs specific code. Also return EOF
for reading zero-length files, which is what users would expect.

Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Jeremy Kerr <jk@ozlabs.org>
Cc: Lee, Chun-Yi <jlee@suse.com>
Cc: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Lingzhu Xiang <lxiang@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-05-13 20:12:10 +01:00
Steve Dickson
4bdc33ed5b NFSDv4.2: Add NFS v4.2 support to the NFS server
This enables NFSv4.2 support for the server. To enable this
code do the following:
  echo "+4.2" >/proc/fs/nfsd/versions

after the nfsd kernel module is loaded.

On its own this does nothing except allow the server to respond to
compounds with minorversion set to 2.  All the new NFSv4.2 features are
optional, so this is perfectly legal.

Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-05-13 10:11:47 -04:00
J. Bruce Fields
4f540e29dc nfsd4: store correct client minorversion for >=4.2
This code assumes that any client using exchange_id is using NFSv4.1,
but with the introduction of 4.2 that will no longer true.

This main effect of this is that client callbacks will use the same
minorversion as that used on the exchange_id.

Note that clients are forbidden from mixing 4.1 and 4.2 compounds.  (See
rfc 5661, section 2.7, #13: "A client MUST NOT attempt to use a stateid,
filehandle, or similar returned object from the COMPOUND procedure with
minor version X for another COMPOUND procedure with minor version Y,
where X != Y.")  However, we do not currently attempt to enforce this
except in the case of mixing zero minor version with non-zero minor
versions.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-05-13 10:11:45 -04:00
Zhao Hongjiang
eb48241bb4 nfsd: get rid of the unused functions in vfs
The fh_lock_parent(), nfsd_truncate(), nfsd_notify_change() and
nfsd_sync_dir() fuctions are neither implemented nor used, just remove
them.

Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-05-13 10:11:44 -04:00
Steve Dickson
4488cc96c5 NFS: Add NFSv4.2 protocol constants
Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com>
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-05-13 10:05:43 -04:00
Colin Cross
9745cdb36d select: use freezable blocking call
Avoid waking up every thread sleeping in a select call during
suspend and resume by calling a freezable blocking call.  Previous
patches modified the freezer to avoid sending wakeups to threads
that are blocked in freezable blocking calls.

This call was selected to be converted to a freezable call because
it doesn't hold any locks or release any resources when interrupted
that might be needed by another freezing task or a kernel driver
during suspend, and is a common site where idle userspace tasks are
blocked.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Colin Cross <ccross@android.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-12 14:16:22 +02:00
Colin Cross
1c441e9212 epoll: use freezable blocking call
Avoid waking up every thread sleeping in an epoll_wait call during
suspend and resume by calling a freezable blocking call.  Previous
patches modified the freezer to avoid sending wakeups to threads
that are blocked in freezable blocking calls.

This call was selected to be converted to a freezable call because
it doesn't hold any locks or release any resources when interrupted
that might be needed by another freezing task or a kernel driver
during suspend, and is a common site where idle userspace tasks are
blocked.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Colin Cross <ccross@android.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-12 14:16:22 +02:00
Colin Cross
5853cc2a89 freezer: add unsafe versions of freezable helpers for CIFS
CIFS calls wait_event_freezekillable_unsafe with a VFS lock held,
which is unsafe and will cause lockdep warnings when 6aa9707
"lockdep: check that no locks held at freeze time" is reapplied
(it was reverted in dbf520a).  CIFS shouldn't be doing this, but
it has long-running syscalls that must hold a lock but also
shouldn't block suspend.  Until CIFS freeze handling is rewritten
to use a signal to exit out of the critical section, add a new
wait_event_freezekillable_unsafe helper that will not run the
lockdep test when 6aa9707 is reapplied, and call it from CIFS.

In practice the likley result of holding the lock while freezing
is that a second task blocked on the lock will never freeze,
aborting suspend, but it is possible to manufacture a case using
the cgroup freezer, the lock, and the suspend freezer to create
a deadlock.  Silencing the lockdep warning here will allow
problems to be found in other drivers that may have a more
serious deadlock risk, and prevent new problems from being added.

Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Colin Cross <ccross@android.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-12 14:16:21 +02:00
Colin Cross
416ad3c9c0 freezer: add unsafe versions of freezable helpers for NFS
NFS calls the freezable helpers with locks held, which is unsafe
and will cause lockdep warnings when 6aa9707 "lockdep: check
that no locks held at freeze time" is reapplied (it was reverted
in dbf520a).  NFS shouldn't be doing this, but it has
long-running syscalls that must hold a lock but also shouldn't
block suspend.  Until NFS freeze handling is rewritten to use a
signal to exit out of the critical section, add new *_unsafe
versions of the helpers that will not run the lockdep test when
6aa9707 is reapplied, and call them from NFS.

In practice the likley result of holding the lock while freezing
is that a second task blocked on the lock will never freeze,
aborting suspend, but it is possible to manufacture a case using
the cgroup freezer, the lock, and the suspend freezer to create
a deadlock.  Silencing the lockdep warning here will allow
problems to be found in other drivers that may have a more
serious deadlock risk, and prevent new problems from being added.

Signed-off-by: Colin Cross <ccross@android.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-12 14:16:21 +02:00
Theodore Ts'o
a549984b8c ext4: revert "ext4: use io_end for multiple bios"
This reverts commit 4eec708d26.

Multiple users have reported crashes which is apparently caused by
this commit.  Thanks to Dmitry Monakhov for bisecting it.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Dmitry Monakhov <dmonakhov@openvz.org>
Cc: Jan Kara <jack@suse.cz>
2013-05-11 19:07:42 -04:00
Linus Torvalds
c4cc75c332 Merge git://git.infradead.org/users/eparis/audit
Pull audit changes from Eric Paris:
 "Al used to send pull requests every couple of years but he told me to
  just start pushing them to you directly.

  Our touching outside of core audit code is pretty straight forward.  A
  couple of interface changes which hit net/.  A simple argument bug
  calling audit functions in namei.c and the removal of some assembly
  branch prediction code on ppc"

* git://git.infradead.org/users/eparis/audit: (31 commits)
  audit: fix message spacing printing auid
  Revert "audit: move kaudit thread start from auditd registration to kaudit init"
  audit: vfs: fix audit_inode call in O_CREAT case of do_last
  audit: Make testing for a valid loginuid explicit.
  audit: fix event coverage of AUDIT_ANOM_LINK
  audit: use spin_lock in audit_receive_msg to process tty logging
  audit: do not needlessly take a lock in tty_audit_exit
  audit: do not needlessly take a spinlock in copy_signal
  audit: add an option to control logging of passwords with pam_tty_audit
  audit: use spin_lock_irqsave/restore in audit tty code
  helper for some session id stuff
  audit: use a consistent audit helper to log lsm information
  audit: push loginuid and sessionid processing down
  audit: stop pushing loginid, uid, sessionid as arguments
  audit: remove the old depricated kernel interface
  audit: make validity checking generic
  audit: allow checking the type of audit message in the user filter
  audit: fix build break when AUDIT_DEBUG == 2
  audit: remove duplicate export of audit_enabled
  Audit: do not print error when LSMs disabled
  ...
2013-05-11 14:29:11 -07:00
Linus Torvalds
2dbd3cac87 Merge branch 'for-3.10' of git://linux-nfs.org/~bfields/linux
Pull nfsd fixes from Bruce Fields:
 "Small fixes for two bugs and two warnings"

* 'for-3.10' of git://linux-nfs.org/~bfields/linux:
  nfsd: fix oops when legacy_recdir_name_error is passed a -ENOENT error
  SUNRPC: fix decoding of optional gss-proxy xdr fields
  SUNRPC: Refactor gssx_dec_option_array() to kill uninitialized warning
  nfsd4: don't allow owner override on 4.1 CLAIM_FH opens
2013-05-10 09:28:55 -07:00
Linus Torvalds
3644bc2ec7 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull stray syscall bits from Al Viro:
 "Several syscall-related commits that were missing from the original"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  switch compat_sys_sysctl to COMPAT_SYSCALL_DEFINE
  unicore32: just use mmap_pgoff()...
  unify compat fanotify_mark(2), switch to COMPAT_SYSCALL_DEFINE
  x86, vm86: fix VM86 syscalls: use SYSCALL_DEFINEx(...)
2013-05-10 09:21:05 -07:00
Linus Torvalds
6fad8d02ef Improve performance when AES-NI (and most likely other crypto accelerators) is
available by moving to the ablkcipher crypto API. The improvement is more
 apparent on faster storage devices. There's no noticeable change when hardware
 crypto is not available.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABCgAGBQJRjJFjAAoJENaSAD2qAscKuRgQAJkefyLPNBb4plXr3R+zJ1aM
 jkcqkUJWamCxlmqHg/n9LmlxpSZiiWSWJM8iiq8zQhPE0PVXVOVhqvzogT1xwv75
 xfT9xTuRi1v7UFaSGHtj3WoO2nscQ0pjZW/0CHgd/PZjz9y0iJ/l6ueoWCOz5L2i
 3oJvx/W407qM+MSogWKx79i1B2jILdBdQH/7PZ+UJS3jWEo3rMWBbfCwbYhd4pUG
 oVc+qFglNs+3HLdHVUmHPCerCL9qYAJIJmDrvupSOQ6DwdFaV8IysTgSEdFtLcfC
 8Z6DUOPzXnvA/+y+NCCCUxg1CrkgYkNrefLKAq18atFu63zIZIHZyJBTJ5Q6vXVF
 o1H8UcIOg/liGa6lXGf5b4ENNKvB0qMYQgiSrL0/FVVima4zGqUWkQLno4kQl1zx
 FHB5imQ7F/EMcow/nTN3YQYC3N/iYIFAIRxf35SiGGsNhO2sEyIqYlRSyq2MNrRl
 pLWNbhnRuhBUqcbqZDxq1oZ7624Ui4jnHHx7rl6Y3gfm8Xa1ZmQeY6rOadSZaRd7
 +ZqFZi1jiHz1c0tVUO/3DsqABIhbr9Ee03vracN8bVTGV0ZmO0qneugFB9esoeDf
 UnrU0Im8ilHu3OHAyc+UkRZuzThL9bLZEivwICQ+JDUJ1zsLYSQZEaGIt8hA0iDy
 8Bu3gtfX2WxD4ak/AkFv
 =3EhP
 -----END PGP SIGNATURE-----

Merge tag 'ecryptfs-3.10-rc1-ablkcipher' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs

Pull eCryptfs update from Tyler Hicks:
 "Improve performance when AES-NI (and most likely other crypto
  accelerators) is available by moving to the ablkcipher crypto API.
  The improvement is more apparent on faster storage devices.

  There's no noticeable change when hardware crypto is not available"

* tag 'ecryptfs-3.10-rc1-ablkcipher' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs:
  eCryptfs: Use the ablkcipher crypto API
2013-05-10 09:20:01 -07:00
Linus Torvalds
977b58e1dd Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu
Pull m68knommu updates from Greg Ungerer:
 "The bulk of the changes are generalizing the ColdFire v3 core support
  and adding in 537x CPU support.  Also a couple of other bug fixes, one
  to fix a reintroduction of a past bug in the romfs filesystem nommu
  support."

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
  m68knommu: enable Timer on coldfire 532x
  m68knommu: fix ColdFire 5373/5329 QSPI base address
  m68knommu: add support for configuring a Freescale M5373EVB board
  m68knommu: add support for the ColdFire 537x family of CPUs
  m68knommu: make ColdFire M532x platform support more v3 generic
  m68knommu: create and use a common M53xx ColdFire class of CPUs
  m68k: remove unused asm/dbg.h
  m68k: Set ColdFire ACR1 cache mode depending on kernel configuration
  romfs: fix nommu map length to keep inside filesystem
  m68k: clean up unused "config ROMVECSIZE"
2013-05-10 07:22:35 -07:00
Richard Genoud
beadadfa54 UBIFS: correct mount message
When mounting an UBIFS R/W volume, we have the message:
UBIFS: mounted UBI device 0, volume 1, name "rootfs"(null)
With this patch, we'll have:
UBIFS: mounted UBI device 0, volume 1, name "rootfs"
Which is, I think, what was intended.

Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Cc: stable@vger.kernel.org [v3.7+]
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2013-05-10 17:07:01 +03:00
Tyler Hicks
4dfea4f0d7 eCryptfs: Use the ablkcipher crypto API
Make the switch from the blkcipher kernel crypto interface to the
ablkcipher interface.

encrypt_scatterlist() and decrypt_scatterlist() now use the ablkcipher
interface but, from the eCryptfs standpoint, still treat the crypto
operation as a synchronous operation. They submit the async request and
then wait until the operation is finished before they return. Most of
the changes are contained inside those two functions.

Despite waiting for the completion of the crypto operation, the
ablkcipher interface provides performance increases in most cases when
used on AES-NI capable hardware.

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Reviewed-by: Zeev Zilberman <zeev@annapurnaLabs.com>
Cc: Dustin Kirkland <dustin.kirkland@gazzang.com>
Cc: Tim Chen <tim.c.chen@intel.com>
Cc: Ying Huang <ying.huang@intel.com>
Cc: Thieu Le <thieule@google.com>
Cc: Li Wang <dragonylffly@163.com>
Cc: Jarkko Sakkinen <jarkko.sakkinen@iki.fi>
2013-05-09 16:55:07 -07:00
Linus Torvalds
70eba4226d Couple of pstore cleanups
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJRiosmAAoJEKurIx+X31iBQ/IP/2D+JHfGcMzZbRo+2moRyhp9
 VYUrNOg+QA+Csm6ZO43GMB4JZtodI/tO12YLNGFbNh+YELdpox7XTqmG058+pABV
 El1xt7i2Ro2/PxnWBRnnvWVHoWSsPI9R65aVNgz8C2QyDxG4wY9/ZYfcZQgEajJO
 M4gyKx/d54WjKcD31OJBF5NGki2zZQcuBI8vWkjXZBPximNj+cJeC27VPnuNI2GC
 p32p9Q+pvBv43bkf3EPEFGsd/ZKdczZ75SOzLXVqOEJhmsxfEKgleQbBd3hiRlwe
 zwj8lCzjZS3xR9oCnvalzWgswFQMd9S1kQYbItdztIofU4Y9hP6wmsEzLofXku+0
 FshRxAOCtW1jSgRGo/BiDNfRQm8w+l+rJGafZafK6cQOtUHLD+Kig8AhFBIY3Nt1
 Wzsmjz5ERhMDImM2lVw4ypji5vWkQ50wst6sX5CQHiOdOEmX+HYJkpOXCHzZYNY2
 IrAa/qq6EKbQCfUf7btbsFMDMDIYk66gy6R596sC1Onwy14ecxWerY1HO0BFg3b8
 UUo9tGVHEBBLdogc1aErU6OBL2DLeO7TeURhFOu7UHo0PfJNtLyG2ekYOYjwgS+Q
 RKNJz9PHKsQnBOdI+dK3iHG2uqkbv6rg0pH5oIglCptabqflN9SE0RpM5b2PSLT3
 74fTPSlUDSuCcm1PXJ8O
 =Zw0E
 -----END PGP SIGNATURE-----

Merge tag 'please-pull-pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux

Pull trivial pstore update from Tony Luck:
 "Couple of pstore cleanups"

It turns out that the kmemdup() conversion ends up being undone by the
fact that the memory block also needed the ecc information (see commit
bd08ec33b5: "pstore/ram: Restore ecc information block"), so all that
remains after merging is the error return code change.

* tag 'please-pull-pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
  pstore/ram: fix error return code in ramoops_probe()
  fs: pstore: Replaced calls to kmalloc and memcpy with kmemdup
2013-05-09 16:42:10 -07:00