linux-pinenote

Author	SHA1	Message	Date
Yan, Zheng	51da8e8c6f	ceph: reset r_resend_mds after receiving -ESTALE this makes __choose_mds() choose mds according caps Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2014-07-14 10:49:15 +08:00
Christoph Hellwig	73a8f5f7e6	locks: purge fl_owner_t from fs/locks.c Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jeff Layton <jlayton@primarydata.com>	2014-07-13 21:39:07 -04:00
Linus Torvalds	18b34d9a7a	More bug fixes for ext4 -- most importantly, a fix for a bug (introduced in 3.15) that can end up triggering a file system corruption error after a journal replay. (It shouldn't lead to any actual data corruption, but it is scary and can force file systems to be remounted read-only, etc.) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJTwuY+AAoJENNvdpvBGATwgN8QAJ2/S5GFxQwHbglHmayXYuMQ fU411FwJ1wbqjYyYb+jyBoYcsgpsCKPTqA2JbPlHsFTm2Ec+BPzsybhtYw5ybdeW 1qAfPTSgNxYXroNwpaqOamxgfXgOaV4iqwvZ4tYcLcrtPq0MOcC5rlSaKMdJuSA1 6M2/8PijOTndUVJpS/GhSMdKlTAXjtfv9V6t/pfLuoo7cNadlggpJnwC8Qm9DNAA 5ETVZK44q2+2YvGwrvY6LBb9BVBpL29YbWPNqqw/OXXY++ZFhBJV07osZO38MpsB QzUyfRaMTgm9/BdbkG8uxA7Zk6C0YBl5eC4aU79LWGWjGO225CLj95LoBOVjQw9f eh+RFGapwVvtyzScDF/a9pH6UwGco/s4kCq8rLr2ztljlO595N3LUwhQBHtiSGtm fr65NRDyJMXbqy8yLGrlOnP/4ll2VfTH+el2+tzr5smoTD29EASM155hKDDUOAG0 TrDHtNrxG1MIROHjp+HSui424Op7NXTnfjwmuKzo+mGpPOcPclPSmAacFJpRGVBE 220hnk+LrBf525nJzQYHifdCL+JAqbWv/S4YSRGizgppK3DlO/gYcu1zpWb0WWuo 0VuvxUZDSIZY1aVpMEOQov74WtovB7YyG8RPHl7h2m5dJuLLFgJmLDMDTJR1LLNT +tHNJ6jERLQz9wqTvquh =OX7Z -----END PGP SIGNATURE----- Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bugfixes from Ted Ts'o: "More bug fixes for ext4 -- most importantly, a fix for a bug introduced in 3.15 that can end up triggering a file system corruption error after a journal replay. It shouldn't lead to any actual data corruption, but it is scary and can force file systems to be remounted read-only, etc" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix potential null pointer dereference in ext4_free_inode ext4: fix a potential deadlock in __ext4_es_shrink() ext4: revert commit which was causing fs corruption after journal replays ext4: disable synchronous transaction batching if max_batch_time==0 ext4: clarify ext4_error message in ext4_mb_generate_buddy_error() ext4: clarify error count warning messages ext4: fix unjournalled bg descriptor while initializing inode bitmap	2014-07-13 13:14:55 -07:00
Trond Myklebust	e655f945cd	Merge branch 'bugfixes' into linux-next * bugfixes: NFS: Don't reset pg_moreio in __nfs_pageio_add_request NFS: Remove 2 unused variables nfs: handle multiple reqs in nfs_wb_page_cancel nfs: handle multiple reqs in nfs_page_async_flush nfs: change find_request to find_head_request nfs: nfs_page should take a ref on the head req nfs: mark nfs_page reqs with flag for extra ref nfs: only show Posix ACLs in listxattr if actually present Conflicts: fs/nfs/write.c	2014-07-13 15:22:02 -04:00
Trond Myklebust	f563b89b18	NFS: Don't reset pg_moreio in __nfs_pageio_add_request Once we've started sending unstable NFS writes, we do not want to clear pg_moreio, or we may end up sending the very last request as a stable write if the commit lists are still empty. Do, however, reset pg_moreio in the case where we end up having to recoalesce the write if an attempt to use pNFS failed. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-13 15:18:44 -04:00
Fabian Frederick	002160269f	NFS: use ARRAY_SIZE instead of sizeof/sizeof[0] Use macro definition Cc: Trond Myklebust <trond.myklebust@primarydata.com> Cc: linux-nfs@vger.kernel.org Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-12 18:43:58 -04:00
Himangi Saraogi	8ee2b78a44	NFSv4: Drop cast This patch does away with the cast on void * as it is unnecessary. The following Coccinelle semantic patch was used for making the change: @r@ expression x; void* e; type T; identifier f; @@ ( ((T )e) \| ((T )x)[...] \| ((T )x)->f \| - (T *) e ) Signed-off-by: Himangi Saraogi <himangi774@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-12 18:43:47 -04:00
Fabian Frederick	57b696fb1b	fs/nfs_common/nfsacl.c: move EXPORT symbol after functions Fix checkpatch warnings: "WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable" Cc: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-12 18:43:42 -04:00
Jeff Layton	f11b2a1cfb	nfs4: copy acceptor name from context to nfs_client The current CB_COMPOUND handling code tries to compare the principal name of the request with the cl_hostname in the client. This is not guaranteed to ever work, particularly if the client happened to mount a CNAME of the server or a non-fqdn. Fix this by instead comparing the cr_principal string with the acceptor name that we get from gssd. In the event that gssd didn't send one down (i.e. it was too old), then we fall back to trying to use the cl_hostname as we do today. Signed-off-by: Jeff Layton <jlayton@poochiereds.net> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-12 18:41:25 -04:00
Jeff Layton	f1cdae87fc	nfs4: turn free_lock_state into a void return operation Nothing checks its return value. Signed-off-by: Jeff Layton <jlayton@poochiereds.net> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-12 18:36:37 -04:00
Jeff Layton	49a4bda22e	nfs4: queue free_lock_state job submission to nfsiod We got a report of the following warning in Fedora: BUG: sleeping function called from invalid context at mm/slub.c:969 in_atomic(): 1, irqs_disabled(): 0, pid: 533, name: bash 3 locks held by bash/533: #0: (&sp->so_delegreturn_mutex){+.+...}, at: [<ffffffffa033da62>] nfs4_proc_lock+0x262/0x910 [nfsv4] #1: (&nfsi->rwsem){.+.+.+}, at: [<ffffffffa033da6a>] nfs4_proc_lock+0x26a/0x910 [nfsv4] #2: (&sb->s_type->i_lock_key#23){+.+...}, at: [<ffffffff812998dc>] flock_lock_file_wait+0x8c/0x3a0 CPU: 0 PID: 533 Comm: bash Not tainted 3.15.0-0.rc1.git1.1.fc21.x86_64 #1 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 0000000000000000 00000000d664ff3c ffff880078b69a70 ffffffff817e82e0 0000000000000000 ffff880078b69a98 ffffffff810cf1a4 0000000000000050 0000000000000050 ffff88007cc01a00 ffff880078b69ad8 ffffffff8121449e Call Trace: [<ffffffff817e82e0>] dump_stack+0x4d/0x66 [<ffffffff810cf1a4>] __might_sleep+0x184/0x240 [<ffffffff8121449e>] kmem_cache_alloc_trace+0x4e/0x330 [<ffffffffa0331124>] ? nfs4_release_lockowner+0x74/0x110 [nfsv4] [<ffffffffa0331124>] nfs4_release_lockowner+0x74/0x110 [nfsv4] [<ffffffffa0352340>] nfs4_put_lock_state+0x90/0xb0 [nfsv4] [<ffffffffa0352375>] nfs4_fl_release_lock+0x15/0x20 [nfsv4] [<ffffffff81297515>] locks_free_lock+0x45/0x90 [<ffffffff8129996c>] flock_lock_file_wait+0x11c/0x3a0 [<ffffffffa033da6a>] ? nfs4_proc_lock+0x26a/0x910 [nfsv4] [<ffffffffa033301e>] do_vfs_lock+0x1e/0x30 [nfsv4] [<ffffffffa033da79>] nfs4_proc_lock+0x279/0x910 [nfsv4] [<ffffffff810dbb26>] ? local_clock+0x16/0x30 [<ffffffff810f5a3f>] ? lock_release_holdtime.part.28+0xf/0x200 [<ffffffffa02f820c>] do_unlk+0x8c/0xc0 [nfs] [<ffffffffa02f85c5>] nfs_flock+0xa5/0xf0 [nfs] [<ffffffff8129a6f6>] locks_remove_file+0xb6/0x1e0 [<ffffffff812159d8>] ? kfree+0xd8/0x2d0 [<ffffffff8123bc63>] __fput+0xd3/0x210 [<ffffffff8123bdee>] ____fput+0xe/0x10 [<ffffffff810bfb6d>] task_work_run+0xcd/0xf0 [<ffffffff81019cd1>] do_notify_resume+0x61/0x90 [<ffffffff817fbea2>] int_signal+0x12/0x17 The problem is that NFSv4 is trying to do an allocation from fl_release_private (in order to send a RELEASE_LOCKOWNER call). That function can be called while holding the inode->i_lock, and it's currently set up to do __GFP_WAIT allocations. v4.1 code has a similar problem. This patch adds a work_struct to the nfs4_lock_state and has the code queue the free_lock_state operation to nfsiod. Reported-by: Josh Stone <jistone@redhat.com> Signed-off-by: Jeff Layton <jlayton@poochiereds.net> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-12 18:36:35 -04:00
Jeff Layton	8003d3c4aa	nfs4: treat lock owners as opaque values Do the following set of ops with a file on a NFSv4 mount: exec 3>>/file/on/nfsv4 flock -x 3 exec 3>&- You'll see the LOCK request go across the wire, but no LOCKU when the file is closed. What happens is that the fd is passed across a fork, and the final close is done in a different process than the opener. That makes __nfs4_find_lock_state miss finding the correct lock state because it uses the fl_pid as a search key. A new one is created, and the locking code treats it as a delegation stateid (because NFS_LOCK_INITIALIZED isn't set). The root cause of this breakage seems to be commit `77041ed9b4` (NFSv4: Ensure the lockowners are labelled using the fl_owner and/or fl_pid). That changed it so that flock lockowners are allocated based on the fl_pid. I think this is incorrect. flock locks should be "owned" by the struct file, and that is already accounted for in the fl_owner field of the lock request when it comes through nfs_flock. This patch basically reverts the above commit and with it, a LOCKU is sent in the above reproducer. Signed-off-by: Jeff Layton <jlayton@poochiereds.net> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-12 18:36:31 -04:00
Peng Tao	039b756a2d	nfs41: layout return on close in delegation return If file is not opened by anyone, we do layout return on close in delegation return. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-12 18:23:17 -04:00
Peng Tao	fe08c54691	nfs41: return layout on last close If client has valid delegation, do not return layout on close at all. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-12 18:23:04 -04:00
Peng Tao	15bb3afe90	nfs4: add nfs4_check_delegation Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-12 18:22:58 -04:00
Peng Tao	0b0bc6ea77	pnfs/filelayout: retry ds commit if nfs_commitdata_alloc fails Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-12 18:22:45 -04:00
Peng Tao	c8a3292d24	pnfs/filelayout: fix race between mark_request_commit and scan_commit_lists We need to hold cinfo lock while setting bucket->wlseg and adding req to nwritten list at the same time. Otherwise there might be a window where nwritten list is empty yet we set bucket->wlseg, in which case ff_layout_scan_ds_commit_list() may end up clearing bucket->wlseg incorrectly, casuing client to oops later on. This was found when testing flexfile layout but filelayout has the same problem. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-12 18:22:41 -04:00
Trond Myklebust	f3792d63d2	NFSv4: Fix OPEN w/create access mode checking POSIX states that open("foo", O_CREAT\|O_RDONLY, 000) should succeed if the file "foo" does not already exist. With the current NFS client, it will fail with an EACCES error because of the permissions checks in nfs4_opendata_access(). Fix is to turn that test off if the server says that we created the file. Reported-by: "Frank S. Filz" <ffilzlnx@mindspring.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-12 18:20:55 -04:00
Trond Myklebust	aafe37504c	NFS: Remove 2 unused variables Cc: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-12 17:35:57 -04:00
Weston Andros Adamson	3e2170451e	nfs: handle multiple reqs in nfs_wb_page_cancel Use nfs_lock_and_join_requests to merge all subrequests into the head request - this cancels and dereferences all subrequests. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-12 17:35:47 -04:00
Weston Andros Adamson	d458138353	nfs: handle multiple reqs in nfs_page_async_flush Change nfs_find_and_lock_request so nfs_page_async_flush can handle multiple requests in a page. There is only one request for a page the first time nfs_page_async_flush is called, but if a write or commit fails, async_flush is called again and there may be multiple requests associated with the page. The solution is to merge all the requests in a page group into a single request before calling nfs_pageio_add_request. Rename nfs_find_and_lock_request to nfs_lock_and_join_requests and change it to first lock all requests for the page, then cancel and merge all subrequests into the head request. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-12 17:35:46 -04:00
Weston Andros Adamson	84d3a9a913	nfs: change find_request to find_head_request nfs_page_find_request_locked* should find the head request for that page. Rename the functions and add comments to make this clear, and fix a bug that could return a subrequest when page_private isn't set on the page. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-12 16:51:41 -04:00
Weston Andros Adamson	85710a837c	nfs: nfs_page should take a ref on the head req nfs_pages that aren't the the head of a group must take a reference on the head as long as ->wb_head is set to it. This stops the head from hitting a refcount of 0 while there is still an active nfs_page for the page group. This avoids kref warnings in the writeback code when the page group head is found and referenced. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-12 16:51:41 -04:00
Weston Andros Adamson	17089a29a2	nfs: mark nfs_page reqs with flag for extra ref Change the use of PG_INODE_REF - set it when taking extra reference on subrequests and take care to only release once for each request. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-12 16:51:41 -04:00
Namjae Jeon	bf40c92635	ext4: fix potential null pointer dereference in ext4_free_inode Fix potential null pointer dereferencing problem caused by `e43bb4e612` ("ext4: decrement free clusters/inodes counters when block group declared bad") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Lukas Czerner <lczerner@redhat.com>	2014-07-12 16:11:42 -04:00
Theodore Ts'o	3f1f9b8513	ext4: fix a potential deadlock in __ext4_es_shrink() This fixes the following lockdep complaint: [ INFO: possible circular locking dependency detected ] 3.16.0-rc2-mm1+ #7 Tainted: G O ------------------------------------------------------- kworker/u24:0/4356 is trying to acquire lock: (&(&sbi->s_es_lru_lock)->rlock){+.+.-.}, at: [<ffffffff81285fff>] __ext4_es_shrink+0x4f/0x2e0 but task is already holding lock: (&ei->i_es_lock){++++-.}, at: [<ffffffff81286961>] ext4_es_insert_extent+0x71/0x180 which lock already depends on the new lock. Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&ei->i_es_lock); lock(&(&sbi->s_es_lru_lock)->rlock); lock(&ei->i_es_lock); lock(&(&sbi->s_es_lru_lock)->rlock); * DEADLOCK * 6 locks held by kworker/u24:0/4356: #0: ("writeback"){.+.+.+}, at: [<ffffffff81071d00>] process_one_work+0x180/0x560 #1: ((&(&wb->dwork)->work)){+.+.+.}, at: [<ffffffff81071d00>] process_one_work+0x180/0x560 #2: (&type->s_umount_key#22){++++++}, at: [<ffffffff811a9c74>] grab_super_passive+0x44/0x90 #3: (jbd2_handle){+.+...}, at: [<ffffffff812979f9>] start_this_handle+0x189/0x5f0 #4: (&ei->i_data_sem){++++..}, at: [<ffffffff81247062>] ext4_map_blocks+0x132/0x550 #5: (&ei->i_es_lock){++++-.}, at: [<ffffffff81286961>] ext4_es_insert_extent+0x71/0x180 stack backtrace: CPU: 0 PID: 4356 Comm: kworker/u24:0 Tainted: G O 3.16.0-rc2-mm1+ #7 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: writeback bdi_writeback_workfn (flush-253:0) ffffffff8213dce0 ffff880014b07538 ffffffff815df0bb 0000000000000007 ffffffff8213e040 ffff880014b07588 ffffffff815db3dd ffff880014b07568 ffff880014b07610 ffff88003b868930 ffff88003b868908 ffff88003b868930 Call Trace: [<ffffffff815df0bb>] dump_stack+0x4e/0x68 [<ffffffff815db3dd>] print_circular_bug+0x1fb/0x20c [<ffffffff810a7a3e>] __lock_acquire+0x163e/0x1d00 [<ffffffff815e89dc>] ? retint_restore_args+0xe/0xe [<ffffffff815ddc7b>] ? __slab_alloc+0x4a8/0x4ce [<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0 [<ffffffff810a8707>] lock_acquire+0x87/0x120 [<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0 [<ffffffff8128592d>] ? ext4_es_free_extent+0x5d/0x70 [<ffffffff815e6f09>] _raw_spin_lock+0x39/0x50 [<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0 [<ffffffff8119760b>] ? kmem_cache_alloc+0x18b/0x1a0 [<ffffffff81285fff>] __ext4_es_shrink+0x4f/0x2e0 [<ffffffff812869b8>] ext4_es_insert_extent+0xc8/0x180 [<ffffffff812470f4>] ext4_map_blocks+0x1c4/0x550 [<ffffffff8124c4c4>] ext4_writepages+0x6d4/0xd00 ... Reported-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reported-by: Minchan Kim <minchan@kernel.org> Cc: stable@vger.kernel.org Cc: Zheng Liu <gnehzuil.liu@gmail.com>	2014-07-12 15:32:24 -04:00
Linus Torvalds	bae78dc259	Merge branch 'for-3.16' of git://linux-nfs.org/~bfields/linux Pull nfsd bugfix from Bruce Fields: "Another xdr encoding regression that may cause incorrect encoding on failures of certain readdirs" * 'for-3.16' of git://linux-nfs.org/~bfields/linux: nfsd: Fix bad reserving space for encoding rdattr_error	2014-07-11 15:10:04 -07:00
Gu Zheng	4b2868aa4f	f2fs: remove the unused stat_lock Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-07-11 15:01:48 -07:00
Gu Zheng	7a6c76b1b2	f2fs: cleanup the needless return of f2fs_create_root_stats Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-07-11 15:01:47 -07:00
Kinglong Mee	d5d5c304b1	NFSD: Fix bad checking of space for padding in splice read Note that the caller has already reserved space for count and eof, so xdr->p has already moved past them, only the padding remains. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Fixes `dc97618ddd` (nfsd4: separate splice and readv cases) Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-11 15:19:25 -04:00
Kinglong Mee	35e634b83c	NFSD: Check acl returned from get_acl/posix_acl_from_mode Commit `4ac7249ea5` (nfsd: use get_acl and ->set_acl) don't check the acl returned from get_acl()/posix_acl_from_mode(). Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-11 15:03:53 -04:00
Theodore Ts'o	f9ae9cf5d7	ext4: revert commit which was causing fs corruption after journal replays Commit `007649375f` ("ext4: initialize multi-block allocator before checking block descriptors") causes the block group descriptor's count of the number of free blocks to become inconsistent with the number of free blocks in the allocation bitmap. This is a harmless form of fs corruption, but it causes the kernel to potentially remount the file system read-only, or to panic, depending on the file systems's error behavior. Thanks to Eric Whitney for his tireless work to reproduce and to find the guilty commit. Fixes: `007649375f` ("ext4: initialize multi-block allocator before checking block descriptors" Cc: stable@vger.kernel.org # 3.15 Reported-by: David Jander <david@protonic.nl> Reported-by: Matteo Croce <technoboy85@gmail.com> Tested-by: Eric Whitney <enwlinux@gmail.com> Suggested-by: Eric Whitney <enwlinux@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-07-11 13:55:40 -04:00
Jeff Layton	a46cb7f287	nfsd: cleanup and rename nfs4_check_open Rename it to better describe what it does, and have it just return the stateid instead of a __be32 (which is now always nfs_ok). Also, do the search for an existing stateid after the delegation check, to reduce cleanup if the delegation check returns error. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-11 11:06:38 -04:00
Jeff Layton	baeb4ff0e5	nfsd: make deny mode enforcement more efficient and close races in it The current enforcement of deny modes is both inefficient and scattered across several places, which makes it hard to guarantee atomicity. The inefficiency is a problem now, and the lack of atomicity will mean races once the client_mutex is removed. First, we address the inefficiency. We have to track deny modes on a per-stateid basis to ensure that open downgrades are sane, but when the server goes to enforce them it has to walk the entire list of stateids and check against each one. Instead of doing that, maintain a per-nfs4_file deny mode. When a file is opened, we simply set any deny bits in that mode that were specified in the OPEN call. We can then use that unified deny mode to do a simple check to see whether there are any conflicts without needing to walk the entire stateid list. The only time we'll need to walk the entire list of stateids is when a stateid that has a deny mode on it is being released, or one is having its deny mode downgraded. In that case, we must walk the entire list and recalculate the fi_share_deny field. Since deny modes are pretty rare today, this should be very rare under normal workloads. To address the potential for races once the client_mutex is removed, protect fi_share_deny with the fi_lock. In nfs4_get_vfs_file, check to make sure that any deny mode we want to apply won't conflict with existing access. If that's ok, then have nfs4_file_get_access check that new access to the file won't conflict with existing deny modes. If that also passes, then get file access references, set the correct access and deny bits in the stateid, and update the fi_share_deny field. If opening the file or truncating it fails, then unwind the whole mess and return the appropriate error. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-11 11:06:32 -04:00
Jeff Layton	7214e8600e	nfsd: always hold the fi_lock when bumping fi_access refcounts Once we remove the client_mutex, there's an unlikely but possible race that could occur. It will be possible for nfs4_file_put_access to race with nfs4_file_get_access. The refcount will go to zero (briefly) and then bumped back to one. If that happens we set ourselves up for a use-after-free and the potential for a lock to race onto the i_flock list as a filp is being torn down. Ensure that we can safely bump the refcount on the file by holding the fi_lock whenever that's done. The only place it currently isn't is in get_lock_access. In order to ensure atomicity with finding the file, use the find_*_file_locked variants and then call get_lock_access to get new access references on the nfs4_file under the same lock. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-11 11:06:17 -04:00
Jeff Layton	3b84240a7b	nfsd: clean up reset_union_bmap_deny Fix the "deny" argument type, and start the loop at 1. The 0 iteration is always a noop. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-11 11:06:11 -04:00
Jeff Layton	6eb3a1d096	nfsd: set stateid access and deny bits in nfs4_get_vfs_file Cleanup -- ensure that the stateid bits are set at the same time that the file access refcounts are incremented. Keeping them coherent like this makes it easier to ensure that we account for all of the references. Since the initialization of the st_*_bmap fields is done when it's hashed, we go ahead and hash the stateid before getting access to the file and unhash it if that function returns error. This will be necessary anyway in a follow-on patch that will overhaul deny mode handling. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-11 11:06:05 -04:00
Jeff Layton	c11c591fe6	nfsd: shrink st_access_bmap and st_deny_bmap We never use anything above bit #3, so an unsigned long for each is wasteful. Shrink them to a char each, and add some WARN_ON_ONCE calls if we try to set or clear bits that would go outside those sizes. Note too that because atomic bitops work on unsigned longs, we have to abandon their use here. That shouldn't be a problem though since we don't really care about the atomicity in this code anyway. Using them was just a convenient way to flip bits. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-11 11:06:04 -04:00
Jeff Layton	6d338b51eb	nfsd: remove nfs4_file_put_fd ...and replace it with a simple swap call. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-11 11:05:57 -04:00
Jeff Layton	1265965172	nfsd: refactor nfs4_file_get_access and nfs4_file_put_access Have them take NFS4_SHARE_ACCESS_* flags instead of an open mode. This spares the callers from having to convert it themselves. This also allows us to simplify these functions as we no longer need to do the access_to_omode conversion in either one. Note too that this patch eliminates the WARN_ON in __nfs4_file_get_access. It's valid for now, but in a later patch we'll be bumping the refcounts prior to opening the file in order to close some races, at which point we'll need to remove it anyway. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-11 11:03:23 -04:00
Marcel Holtmann	f49daa8190	Bluetooth: Move HCI socket definitions into its own header file All the HCI sockets and ioctl based definitions have been in a global header file that also includes all the HCI protocol structures. To make this a bit cleaner, move them into its own file. This also adjusts fs/compat_ioctl.c to only include this new file and not all the protocol structures that are not needed. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-07-11 13:53:04 +03:00
Chao Yu	81e366f87f	f2fs: check name_len of dir entry to prevent from deadloop We assume that modification of some special application could result in zeroed name_len, or it is consciously made by somebody. We will deadloop in find_in_block when name_len of dir entry is zero. This patch is added for preventing deadloop in above scenario. change log from v1: o use f2fs_bug_on rather than break out from searching dir entry suggested by Jaegeuk Kim. Jaegeuk describe: "Well, IMO, it would be good to add f2fs_bug_on() here with a specific comment. In the current phase of f2fs, it is more important to investigate the file system bugs, rather than workarounds for any corrupted images. And, definitely it needs to stop the kernel if any corrupted image was mounted, so that we can figure out where the bugs are occurred." Suggested-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-07-10 17:00:02 -07:00
Trond Myklebust	e20fcf1e65	nfsd: clean up helper __release_lock_stateid Use filp_close instead of open coding. filp_close does a bit more than just release the locks and put the filp. It also calls ->flush and dnotify_flush, both of which should be done here anyway. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-10 15:05:26 -04:00
Trond Myklebust	de18643dce	nfsd: Add locking to the nfs4_file->fi_fds[] array Preparation for removal of the client_mutex, which currently protects this array. While we don't actually need the find_*_file_locked variants just yet, a later patch will. So go ahead and add them now to reduce future churn in this code. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-10 15:05:26 -04:00
Trond Myklebust	1d31a2531a	nfsd: Add fine grained protection for the nfs4_file->fi_stateids list Access to this list is currently serialized by the client_mutex. Add finer grained locking around this list in preparation for its removal. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-10 15:05:25 -04:00
Linus Torvalds	40f6123737	Merge branch 'for-3.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: "Mostly fixes for the fallouts from the recent cgroup core changes. The decoupled nature of cgroup dynamic hierarchy management (hierarchies are created dynamically on mount but may or may not be reused once unmounted depending on remaining usages) led to more ugliness being added to kernfs. Hopefully, this is the last of it" * 'for-3.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cpuset: break kernfs active protection in cpuset_write_resmask() cgroup: fix a race between cgroup_mount() and cgroup_kill_sb() kernfs: introduce kernfs_pin_sb() cgroup: fix mount failure in a corner case cpuset,mempolicy: fix sleeping function called from invalid context cgroup: fix broken css_has_online_children()	2014-07-10 11:38:23 -07:00
Jeff Layton	d6c249b4d4	nfsd: reduce some spinlocking in put_client_renew No need to take the lock unless the count goes to 0. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-10 13:41:00 -04:00
Jeff Layton	dff1399f8a	nfsd: close potential race between delegation break and laundromat Bruce says: There's also a preexisting expire_client/laundromat vs break race: - expire_client/laundromat adds a delegation to its local reaplist using the same dl_recall_lru field that a delegation uses to track its position on the recall lru and drops the state lock. - a concurrent break_lease adds the delegation to the lru. - expire/client/laundromat then walks it reaplist and sees the lru head as just another delegation on the list.... Fix this race by checking the dl_time under the state_lock. If we find that it's not 0, then we know that it has already been queued to the LRU list and that we shouldn't queue it again. In the case of destroy_client, we must also ensure that we don't hit similar races by ensuring that we don't move any delegations to the reaplist with a dl_time of 0. Just bump the dl_time by one before we drop the state_lock. We're destroying the delegations anyway, so a 1s difference there won't matter. The fault injection code also requires a bit of surgery here: First, in the case of nfsd_forget_client_delegations, we must prevent the same sort of race vs. the delegation break callback. For that, we just increment the dl_time to ensure that a delegation callback can't race in while we're working on it. We can't do that for nfsd_recall_client_delegations, as we need to have it actually queue the delegation, and that won't happen if we increment the dl_time. The state lock is held over that function, so we don't need to worry about these sorts of races there. There is one other potential bug nfsd_recall_client_delegations though. Entries on the victims list are not dequeued before calling nfsd_break_one_deleg. That's a potential list corruptor, so ensure that we do that there. Reported-by: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-10 13:40:51 -04:00
Miklos Szeredi	4237ba43b6	fuse: restructure ->rename2() Make ->rename2() universal, i.e. able to handle zero flags. This is to make future change of the API easier. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2014-07-10 10:50:19 +02:00
Kinglong Mee	01529e3f81	NFSD: Fix memory leak in encoding denied lock Commit `8c7424cff6` (nfsd4: don't try to encode conflicting owner if low on space) forgot free conf->data in nfsd4_encode_lockt and before sign conf->data to NULL in nfsd4_encode_lock_denied. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-09 20:55:08 -04:00

... 25 26 27 28 29 ...

38,263 commits