linux-pinenote

Author	SHA1	Message	Date
Changman Lee	80ec2e914d	f2fs: no more dirty_nat_entires when flushing After flushing dirty nat entries, it has to be no more dirty nat entries. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-25 17:26:36 -08:00
Changman Lee	20d047c876	f2fs: check dirty_nat_cnt before flushing nat entries in journal It's meaningless to check dirty_nat_cnt after re-dirtying nat entries in journal. And although there are rooms for dirty nat entires if dirty_nat_cnt is zero, it's also meaningless to check __has_cursum_space. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-25 17:26:34 -08:00
Jan Kara	d4f7610743	ext4: forbid journal_async_commit in data=ordered mode Option journal_async_commit breaks gurantees of data=ordered mode as it sends only a single cache flush after writing a transaction commit block. Thus even though the transaction including the commit block is fully stored on persistent storage, file data may still linger in drives caches and will be lost on power failure. Since all checksums match on journal recovery, we replay the transaction thus possibly exposing stale user data. To fix this data exposure issue, remove the possibility to use journal_async_commit in data=ordered mode. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-11-25 20:19:17 -05:00
Theodore Ts'o	d9f39d1e44	jbd2: remove unnecessary NULL check before iput() Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-11-25 20:02:37 -05:00
Markus Elfring	bfcba2d035	ext4: Remove an unnecessary check for NULL before iput() The iput() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-11-25 20:01:37 -05:00
Anna Schumaker	624bd5b7b6	nfs: Add DEALLOCATE support This patch adds support for using the NFS v4.2 operation DEALLOCATE to punch holes in a file. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-11-25 16:38:32 -05:00
Anna Schumaker	f4ac1674f5	nfs: Add ALLOCATE support This patch adds support for using the NFS v4.2 operation ALLOCATE to preallocate data in a file. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-11-25 16:38:32 -05:00
Namjae Jeon	31fc006b12	ext4: remove unneeded code in ext4_unlink Setting retval to zero is not needed in ext4_unlink. Remove unneeded code. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-11-25 16:34:38 -05:00
Eric Sandeen	b003b52496	ext4: don't count external journal blocks as overhead This was fixed for ext3 with: `e6d8fb3` ext3: Count internal journal as bsddf overhead in ext3_statfs but was never fixed for ext4. With a large external journal and no used disk blocks, df comes out negative without this, as journal blocks are added to the overhead & subtracted from used blocks unconditionally. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-11-25 16:27:44 -05:00
Jan Kara	733ded2a80	ext4: remove never taken branch from ext4_ext_shift_path_extents() path[depth].p_hdr can never be NULL for a path passed to us (and even if it could, EXT_LAST_EXTENT() would make something != NULL from it). So just remove the branch. Coverity-id: 1196498 Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-11-25 16:23:48 -05:00
Chuck Lever	c2ef47b7f5	NFS: Clean up nfs4_init_callback() nfs4_init_callback() is never invoked for NFS versions other than 4. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-11-25 16:22:16 -05:00
Chuck Lever	6dd3436b9d	NFS: SETCLIENTID XDR buffer sizes are incorrect Use the correct calculation of the maximum size of a clientaddr4 when encoding and decoding SETCLIENTID operations. clientaddr4 is defined in section 2.2.10 of RFC3530bis-31. The usage in encode_setclientid_maxsz is missing the 4-byte length in both strings, but is otherwise correct. decode_setclientid_maxsz simply asks for a page of receive buffer space, which is unnecessarily large (more than 4KB). Note that a SETCLIENTID reply is either clientid+verifier, or clientaddr4, depending on the returned NFS status. It doesn't hurt to allocate enough space for both. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-11-25 16:22:16 -05:00
Darrick J. Wong	c6d3d56dd0	ext4: create nojournal_checksum mount option Create a mount option to disable journal checksumming (because the metadata_csum feature turns it on by default now), and fix remount not to allow changing the journal checksumming option, since changing the mount options has no effect on the journal. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-11-25 16:20:50 -05:00
Wang Shilong	58d86a50ee	ext4: update comments regarding ext4_delete_inode() ext4_delete_inode() has been renamed for a long time, update comments for this. Signed-off-by: Wang Shilong <wshilong@ddn.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-11-25 16:17:29 -05:00
Jaegeuk Kim	5f72739583	f2fs: fix deadlock during inline_data conversion A deadlock can be occurred: Thread 1] Thread 2] - f2fs_write_data_pages - f2fs_write_begin - lock_page(page #0) - grab_cache_page(page #X) - get_node_page(inode_page) - grab_cache_page(page #0) : to convert inline_data - f2fs_write_data_page - f2fs_write_inline_data - get_node_page(inode_page) In this case, trying to lock inode_page and page #0 causes deadlock. In order to avoid this, this patch adds a rule for this locking policy, which is that page #0 should be locked followed by inode_page lock. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-25 12:08:30 -08:00
Markus Elfring	ce3e6d25f3	f2fs: fix typos for the word "destroy" in jump labels Two jump labels were adjusted in the implementation of the create_node_manager_caches() function because these identifiers contained typos. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-25 12:08:22 -08:00
Dmitry Monakhov	4fdb554318	ext4: cleanup GFP flags inside resize path We must use GFP_NOFS instead GFP_KERNEL inside ext4_mb_add_groupinfo and ext4_calculate_overhead() because they are called from inside a journal transaction. Call trace: ioctl ->ext4_group_add ->journal_start ->ext4_setup_new_descs ->ext4_mb_add_groupinfo -> GFP_KERNEL ->ext4_flex_group_add ->ext4_update_super ->ext4_calculate_overhead -> GFP_KERNEL ->journal_stop Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-11-25 13:08:04 -05:00
Jan Kara	2be12de98a	ext4: introduce aging to extent status tree Introduce a simple aging to extent status tree. Each extent has a REFERENCED bit which gets set when the extent is used. Shrinker then skips entries with referenced bit set and clears the bit. Thus frequently used extents have higher chances of staying in memory. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-11-25 11:55:24 -05:00
Jan Kara	624d0f1dd7	ext4: cleanup flag definitions for extent status tree Currently flags for extent status tree are defined twice, once shifted and once without a being shifted. Consolidate these definitions into one place and make some computations automatic to make adding flags less error prone. Compiler should be clever enough to figure out these are constants and generate the same code. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-11-25 11:53:47 -05:00
Jan Kara	dd47592551	ext4: limit number of scanned extents in status tree shrinker Currently we scan extent status trees of inodes until we reclaim nr_to_scan extents. This can however require a lot of scanning when there are lots of delayed extents (as those cannot be reclaimed). Change shrinker to work as shrinkers are supposed to and scan only nr_to_scan extents regardless of how many extents did we actually reclaim. We however need to be careful and avoid scanning each status tree from the beginning - that could lead to a situation where we would not be able to reclaim anything at all when first nr_to_scan extents in the tree are always unreclaimable. We remember with each inode offset where we stopped scanning and continue from there when we next come across the inode. Note that we also need to update places calling __es_shrink() manually to pass reasonable nr_to_scan to have a chance of reclaiming anything and not just 1. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-11-25 11:51:23 -05:00
Jan Kara	b0dea4c165	ext4: move handling of list of shrinkable inodes into extent status code Currently callers adding extents to extent status tree were responsible for adding the inode to the list of inodes with freeable extents. This is error prone and puts list handling in unnecessarily many places. Just add inode to the list automatically when the first non-delay extent is added to the tree and remove inode from the list when the last non-delay extent is removed. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-11-25 11:49:25 -05:00
Zheng Liu	edaa53cac8	ext4: change LRU to round-robin in extent status tree shrinker In this commit we discard the lru algorithm for inodes with extent status tree because it takes significant effort to maintain a lru list in extent status tree shrinker and the shrinker can take a long time to scan this lru list in order to reclaim some objects. We replace the lru ordering with a simple round-robin. After that we never need to keep a lru list. That means that the list needn't be sorted if the shrinker can not reclaim any objects in the first round. Cc: Andreas Dilger <adilger.kernel@dilger.ca> Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-11-25 11:45:37 -05:00
Zheng Liu	2f8e0a7c6c	ext4: cache extent hole in extent status tree for ext4_da_map_blocks() Currently extent status tree doesn't cache extent hole when a write looks up in extent tree to make sure whether a block has been allocated or not. In this case, we don't put extent hole in extent cache because later this extent might be removed and a new delayed extent might be added back. But it will cause a defect when we do a lot of writes. If we don't put extent hole in extent cache, the following writes also need to access extent tree to look at whether or not a block has been allocated. It brings a cache miss. This commit fixes this defect. Also if the inode doesn't have any extent, this extent hole will be cached as well. Cc: Andreas Dilger <adilger.kernel@dilger.ca> Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-11-25 11:44:37 -05:00
Jan Kara	cbd7584e6e	ext4: fix block reservation for bigalloc filesystems For bigalloc filesystems we have to check whether newly requested inode block isn't already part of a cluster for which we already have delayed allocation reservation. This check happens in ext4_ext_map_blocks() and that function sets EXT4_MAP_FROM_CLUSTER if that's the case. However if ext4_da_map_blocks() finds in extent cache information about the block, we don't call into ext4_ext_map_blocks() and thus we always end up getting new reservation even if the space for cluster is already reserved. This results in overreservation and premature ENOSPC reports. Fix the problem by checking for existing cluster reservation already in ext4_da_map_blocks(). That simplifies the logic and actually allows us to get rid of the EXT4_MAP_FROM_CLUSTER flag completely. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-11-25 11:41:49 -05:00
Filipe Manana	9ea24bbe17	Btrfs: fix snapshot inconsistency after a file write followed by truncate If right after starting the snapshot creation ioctl we perform a write against a file followed by a truncate, with both operations increasing the file's size, we can get a snapshot tree that reflects a state of the source subvolume's tree where the file truncation happened but the write operation didn't. This leaves a gap between 2 file extent items of the inode, which makes btrfs' fsck complain about it. For example, if we perform the following file operations: $ mkfs.btrfs -f /dev/vdd $ mount /dev/vdd /mnt $ xfs_io -f \ -c "pwrite -S 0xaa -b 32K 0 32K" \ -c "fsync" \ -c "pwrite -S 0xbb -b 32770 16K 32770" \ -c "truncate 90123" \ /mnt/foobar and the snapshot creation ioctl was just called before the second write, we often can get the following inode items in the snapshot's btree: item 120 key (257 INODE_ITEM 0) itemoff 7987 itemsize 160 inode generation 146 transid 7 size 90123 block group 0 mode 100600 links 1 uid 0 gid 0 rdev 0 flags 0x0 item 121 key (257 INODE_REF 256) itemoff 7967 itemsize 20 inode ref index 282 namelen 10 name: foobar item 122 key (257 EXTENT_DATA 0) itemoff 7914 itemsize 53 extent data disk byte 1104855040 nr 32768 extent data offset 0 nr 32768 ram 32768 extent compression 0 item 123 key (257 EXTENT_DATA 53248) itemoff 7861 itemsize 53 extent data disk byte 0 nr 0 extent data offset 0 nr 40960 ram 40960 extent compression 0 There's a file range, corresponding to the interval [32K; ALIGN(16K + 32770, 4096)[ for which there's no file extent item covering it. This is because the file write and file truncate operations happened both right after the snapshot creation ioctl called btrfs_start_delalloc_inodes(), which means we didn't start and wait for the ordered extent that matches the write and, in btrfs_setsize(), we were able to call btrfs_cont_expand() before being able to commit the current transaction in the snapshot creation ioctl. So this made it possibe to insert the hole file extent item in the source subvolume (which represents the region added by the truncate) right before the transaction commit from the snapshot creation ioctl. Btrfs' fsck tool complains about such cases with a message like the following: "root 331 inode 257 errors 100, file extent discount" >From a user perspective, the expectation when a snapshot is created while those file operations are being performed is that the snapshot will have a file that either: 1) is empty 2) only the first write was captured 3) only the 2 writes were captured 4) both writes and the truncation were captured But never capture a state where only the first write and the truncation were captured (since the second write was performed before the truncation). A test case for xfstests follows. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>	2014-11-25 07:41:23 -08:00
Filipe Manana	e5fa8f865b	Btrfs: ensure send always works on roots without orphans Move the logic from the snapshot creation ioctl into send. This avoids doing the transaction commit if send isn't used, and ensures that if a crash/reboot happens after the transaction commit that created the snapshot and before the transaction commit that switched the commit root, send will not get a commit root that differs from the main root (that has orphan items). Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>	2014-11-25 07:41:23 -08:00
Filipe Manana	758eb51e71	Btrfs: fix freeing used extent after removing empty block group Due to ignoring errors returned by clear_extent_bits (at the moment only -ENOMEM is possible), we can end up freeing an extent that is actually in use (i.e. return the extent to the free space cache). The sequence of steps that lead to this: 1) Cleaner thread starts execution and calls btrfs_delete_unused_bgs(), with the goal of freeing empty block groups; 2) btrfs_delete_unused_bgs() finds an empty block group, joins the current transaction (or starts a new one if none is running) and attempts to clear the EXTENT_DIRTY bit for the block group's range from freed_extents[0] and freed_extents[1] (of which one corresponds to fs_info->pinned_extents); 3) Clearing the EXTENT_DIRTY bit (via clear_extent_bits()) fails with -ENOMEM, but such error is ignored and btrfs_delete_unused_bgs() proceeds to delete the block group and the respective chunk, while pinned_extents remains with that bit set for the whole (or a part of the) range covered by the block group; 4) Later while the transaction is still running, the chunk ends up being reused for a new block group (maybe for different purpose, data or metadata), and extents belonging to the new block group are allocated for file data or btree nodes/leafs; 5) The current transaction is committed, meaning that we unpinned one or more extents from the new block group (through btrfs_finish_extent_commit() and unpin_extent_range()) which are now being used for new file data or new metadata (through btrfs_finish_extent_commit() and unpin_extent_range()). And unpinning means we returned the extents to the free space cache of the new block group, which implies those extents can be used for future allocations while they're still in use. Alternatively, we can hit a BUG_ON() when doing a lookup for a block group's cache object in unpin_extent_range() if a new block group didn't end up being allocated for the same chunk (step 4 above). Fix this by not freeing the block group and chunk if we fail to clear the dirty bit. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>	2014-11-25 07:41:23 -08:00
Chris Mason	8f608de699	Btrfs: include vmalloc.h in check-integrity.c Fengguang's build monster reported warnings on some arches because we don't have vmalloc.h included Signed-off-by: Chris Mason <clm@fb.com> Reported-by: fengguang.wu@intel.com	2014-11-25 06:01:11 -08:00
Qu Wenruo	084b6e7c76	btrfs: Fix a lockdep warning when running xfstest. The following lockdep warning is triggered during xfstests: [ 1702.980872] ========================================================= [ 1702.981181] [ INFO: possible irq lock inversion dependency detected ] [ 1702.981482] 3.18.0-rc1 #27 Not tainted [ 1702.981781] --------------------------------------------------------- [ 1702.982095] kswapd0/77 just changed the state of lock: [ 1702.982415] (&delayed_node->mutex){+.+.-.}, at: [<ffffffffa03b0b51>] __btrfs_release_delayed_node+0x41/0x1f0 [btrfs] [ 1702.982794] but this lock took another, RECLAIM_FS-unsafe lock in the past: [ 1702.983160] (&fs_info->dev_replace.lock){+.+.+.} and interrupts could create inverse lock ordering between them. [ 1702.984675] other info that might help us debug this: [ 1702.985524] Chain exists of: &delayed_node->mutex --> &found->groups_sem --> &fs_info->dev_replace.lock [ 1702.986799] Possible interrupt unsafe locking scenario: [ 1702.987681] CPU0 CPU1 [ 1702.988137] ---- ---- [ 1702.988598] lock(&fs_info->dev_replace.lock); [ 1702.989069] local_irq_disable(); [ 1702.989534] lock(&delayed_node->mutex); [ 1702.990038] lock(&found->groups_sem); [ 1702.990494] <Interrupt> [ 1702.990938] lock(&delayed_node->mutex); [ 1702.991407] * DEADLOCK * It is because the btrfs_kobj_{add/rm}_device() will call memory allocation with GFP_KERNEL, which may flush fs page cache to free space, waiting for it self to do the commit, causing the deadlock. To solve the problem, move btrfs_kobj_{add/rm}_device() out of the dev_replace lock range, also involing split the btrfs_rm_dev_replace_srcdev() function into remove and free parts. Now only btrfs_rm_dev_replace_remove_srcdev() is called in dev_replace lock range, and kobj_{add/rm} and btrfs_rm_dev_replace_free_srcdev() are called out of the lock range. Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>	2014-11-25 05:55:38 -08:00
Chris Mason	ad27c0dab7	Merge branch 'dev/pending-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus	2014-11-25 05:45:30 -08:00
Li RongQing	e9f456ca50	nfs: define nfs_inc_fscache_stats and using it as possible Define and use nfs_inc_fscache_stats when plus one, which can save to pass one parameter. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-11-24 20:08:47 -05:00
Li RongQing	5a254d08b0	nfs: replace nfs_add_stats with nfs_inc_stats when add one Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-11-24 20:08:47 -05:00
Markus Elfring	fe0bf1185d	NFS: Deletion of unnecessary checks before the function call "nfs_put_client" The nfs_put_client() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-11-24 20:07:27 -05:00
Jeff Layton	10b89567db	lockd: eliminate LOCKD_DEBUG LOCKD_DEBUG is always the same value as CONFIG_SUNRPC_DEBUG, so we can just use it instead. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-11-24 17:24:08 -05:00
Peng Tao	4bd5a980de	nfs41: fix nfs4_proc_layoutget error handling nfs4_layoutget_release() drops layout hdr refcnt. Grab the refcnt early so that it is safe to call .release in case nfs4_alloc_pages fails. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Fixes: `a47970ff78` ("NFSv4.1: Hold reference to layout hdr in layoutget") Cc: stable@vger.kernel.org # 3.9+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-11-24 17:14:54 -05:00
Weston Andros Adamson	cb1410c71e	NFS: fix subtle change in COMMIT behavior Recent work in the pgio layer made it possible for there to be more than one request per page. This caused a subtle change in commit behavior, because write.c:nfs_commit_unstable_pages compares the number of pages waiting for writeback against the number of requests on a commit list to choose when to send a COMMIT in a non-blocking flush. This is probably hard to hit in normal operation - you have to be using rsize/wsize < PAGE_SIZE, or pnfs with lots of boundaries that are not page aligned to have a noticeable change in behavior. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-11-24 17:00:42 -05:00
Christoph Hellwig	6a74c0c940	pnfs/blocklayout: fix end calculation in pnfs_num_cont_bytes Use the number of pages in the pagecache mapping instead of the number of pnfs requests which is only slightly related. Reported-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-11-24 17:00:41 -05:00
Anna Schumaker	878ffa9f85	NFS: Use nfs_server_capable() for checknig NFS_CAP_SEEK This should make the code easier to maintain in the future. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-11-24 12:49:13 -05:00
Jan Kara	c1b69b1ca1	nfs: Remove dead case from nfs4_map_errors() NFS4ERR_ACCESS has number 13 and thus is matched and returned immediately at the beginning of nfs4_map_errors() and there's no point in checking it later. Coverity-id: 733891 Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-11-24 12:48:14 -05:00
Paul Burton	774c105ed8	binfmt_elf: allow arch code to examine PT_LOPROC ... PT_HIPROC headers MIPS is introducing new variants of its O32 ABI which differ in their handling of floating point, in order to enable a gradual transition towards a world where mips32 binaries can take advantage of new hardware features only available when configured for certain FP modes. In order to do this ELF binaries are being augmented with a new section that indicates, amongst other things, the FP mode requirements of the binary. The presence & location of such a section is indicated by a program header in the PT_LOPROC ... PT_HIPROC range. In order to allow the MIPS architecture code to examine the program header & section in question, pass all program headers in this range to an architecture-specific arch_elf_pt_proc function. This function may return an error if the header is deemed invalid or unsuitable for the system, in which case that error will be returned from load_elf_binary and upwards through the execve syscall. A means is required for the architecture code to make a decision once it is known that all such headers have been seen, but before it is too late to return from an execve syscall. For this purpose the arch_check_elf function is added, and called once, after all PT_LOPROC to PT_HIPROC headers have been passed to arch_elf_pt_proc but before the code which invoked execve has been lost. This enables the architecture code to make a decision based upon all the headers present in an ELF binary and its interpreter, as is required to forbid conflicting FP ABI requirements between an ELF & its interpreter. In order to allow data to be stored throughout the calls to the above functions, struct arch_elf_state is introduced. Finally a variant of the SET_PERSONALITY macro is introduced which accepts a pointer to the struct arch_elf_state, allowing it to act based upon state observed from the architecture specific program headers. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/7679/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2014-11-24 07:45:02 +01:00
Paul Burton	a9d9ef133f	binfmt_elf: load interpreter program headers earlier Load the program headers of an ELF interpreter early enough in load_elf_binary that they can be examined before it's too late to return an error from an exec syscall. This patch does not perform any such checking, it merely lays the groundwork for a further patch to do so. No functional change is intended. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/7675/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2014-11-24 07:45:02 +01:00
Paul Burton	6a8d38945c	binfmt_elf: Hoist ELF program header loading to a function load_elf_binary & load_elf_interp both load program headers from an ELF executable in the same way, duplicating the code. This patch introduces a helper function (load_elf_phdrs) which performs this common task & calls it from both load_elf_binary & load_elf_interp. In addition to reducing code duplication, this is part of preparing to load the ELF interpreter headers earlier such that they can be examined before it's too late to return an error from an exec syscall. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/7676/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2014-11-24 07:45:02 +01:00
Jaegeuk Kim	0341845efc	f2fs: fix livelock calling f2fs_iget during f2fs_evict_inode In f2fs_evict_inode, commit_inmemory_pages f2fs_gc f2fs_iget iget_locked -> wait for inode free Here, if the inode is same as the one to be evicted, f2fs should wait forever. Actually, we should not call f2fs_balance_fs during f2fs_evict_inode to avoid this. But, the commit_inmem_pages calls f2fs_balance_fs by default, even if f2fs_evict_inode wants to free inmemory pages only. Hence, this patch adds to trigger f2fs_balance_fs only when there is something to write. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-23 21:51:57 -08:00
Jaegeuk Kim	9486ba442b	f2fs: introduce f2fs_dentry_kunmap to clean up This patch introduces f2fs_dentry_kunmap to clean up dirty codes. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-23 21:51:53 -08:00
Changman Lee	c9ee00857c	f2fs: fix wrong data structure when create slab It used nat_entry_set when create slab for sit_entry_set. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-23 21:48:49 -08:00
Jaegeuk Kim	09b8b3c839	f2fs: call flush_dcache_page when the page was updated Whenever f2fs updates mapped pages, it needs to call flush_dcache_page. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-23 21:48:31 -08:00
Linus Torvalds	d038a63ace	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs deadlock fix from Chris Mason: "This has a fix for a long standing deadlock that we've been trying to nail down for a while. It ended up being a bad interaction with the fair reader/writer locks and the order btrfs reacquires locks in the btree" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: btrfs: fix lockups from btrfs_clear_path_blocking	2014-11-23 11:16:36 -08:00
Eric Whitney	0756b908a3	ext4: fix end of region partial cluster handling ext4_ext_remove_space() can incorrectly free a partial_cluster if EAGAIN is encountered while truncating or punching. Extent removal should be retried in this case. It also fails to free a partial cluster when the punched region begins at the start of a file on that unaligned cluster and where the entire file has not been punched. Remove the requirement that all blocks in the file must have been freed in order to free the partial cluster. Signed-off-by: Eric Whitney <enwlinux@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-11-23 00:59:39 -05:00
Eric Whitney	345ee94748	ext4: miscellaneous partial cluster cleanups Add some casts and rearrange a few statements for improved readability. Some code can also be simplified and made more readable if we set partial_cluster to 0 rather than to a negative value when we can tell we've hit the left edge of the punched region. Signed-off-by: Eric Whitney <enwlinux@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-11-23 00:59:39 -05:00
Eric Whitney	5bf4376065	ext4: fix end of leaf partial cluster handling The fix in commit `ad6599ab3a` ("ext4: fix premature freeing of partial clusters split across leaf blocks"), intended to avoid dereferencing an invalid extent pointer when determining whether a partial cluster should be freed, wasn't quite good enough. Assure that at least one extent remains at the start of the leaf once the hole has been punched. Otherwise, the pointer to the extent to the right of the hole will be invalid and a partial cluster will be incorrectly freed. Set partial_cluster to 0 when we can tell we've hit the left edge of the punched region within the leaf. This prevents incorrect freeing of a partial cluster when ext4_ext_rm_leaf is called one last time during extent tree traversal after the punched region has been removed. Adjust comments to reflect code changes and a correction. Remove a bit of dead code. Signed-off-by: Eric Whitney <enwlinux@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2014-11-23 00:58:11 -05:00

... 5 6 7 8 9 ...

38980 commits