Commit graph

42,026 commits

Author SHA1 Message Date
Jaegeuk Kim
e5e0906b6b f2fs crypto: clean up error handling in f2fs_fname_setup_filename
Sync with:
  ext4 crypto: clean up error handling in ext4_fname_setup_filename

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-06-01 16:21:08 -07:00
Jaegeuk Kim
e992e238ff f2fs crypto: avoid f2fs_inherit_context for symlink
This patch fixes to call f2fs_inherit_context twice for newly created symlink.
The original one is called by f2fs_add_link(), which invokes f2fs_setxattr.
If the second one is called again, f2fs_setxattr is triggered again with same
encryption index.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-06-01 16:21:07 -07:00
Chao Yu
4637fd11ff f2fs crypto: do not set encryption policy for non-directory by ioctl
Encryption policy should only be set to an empty directory through ioctl,
This patch add a judgement condition to verify type of the target inode
to avoid incorrectly configuring for non-directory.

Additionally, remove unneeded inline data conversion since regular or symlink
file should not be processed here.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-06-01 16:21:07 -07:00
Chao Yu
81b0a8ffaa f2fs crypto: allow setting encryption policy once
This patch add XATTR_CREATE flag in setxattr when setting encryption
context for inode. Without this flag the context could be set more than
once, this should never happen. So, fix it.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-06-01 16:21:06 -07:00
Chao Yu
d3baf7c472 f2fs crypto: check context consistent for rename2
For exchange rename, we should check context consistent of encryption
between new_dir and old_inode or old_dir and new_inode. Otherwise
inheritance of parent's encryption context will be broken.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
[Jaegeuk Kim: sync with ext4 approach]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-06-01 16:21:05 -07:00
Chao Yu
1237702471 f2fs: avoid duplicated code by reusing f2fs_read_end_io
This patch tries to clean up code because part code of f2fs_read_end_io
and mpage_end_io are the same, so it's better to merge and reuse them.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-06-01 16:21:04 -07:00
Jaegeuk Kim
26bf3dc7e2 f2fs crypto: use per-inode tfm structure
This patch applies the following ext4 patch:

  ext4 crypto: use per-inode tfm structure

As suggested by Herbert Xu, we shouldn't allocate a new tfm each time
we read or write a page.  Instead we can use a single tfm hanging off
the inode's crypt_info structure for all of our encryption needs for
that inode, since the tfm can be used by multiple crypto requests in
parallel.

Also use cmpxchg() to avoid races that could result in crypt_info
structure getting doubly allocated or doubly freed.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-06-01 16:21:04 -07:00
hujianyang
da554e48ca f2fs: recovering broken superblock during mount
This patch recovers a broken superblock with the other valid one.

Signed-off-by: hujianyang <hujianyang@huawei.com>
[Jaegeuk Kim: reinitialize local variables in f2fs_fill_super for retrial]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-06-01 16:21:03 -07:00
Jaegeuk Kim
304eecc346 f2fs crypto: check encryption for tmpfile
This patch adds to check encryption for tmpfile in early stage.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-06-01 16:21:02 -07:00
Chao Yu
7e01e7ad74 f2fs: support RENAME_WHITEOUT
As the description of rename in manual, RENAME_WHITEOUT is a special operation
that only makes sense for overlay/union type filesystem.

When performing rename with RENAME_WHITEOUT, dst will be replace with src, and
meanwhile, a 'whiteout' will be create with name of src.

A "whiteout" is designed to be a char device with 0,0 device number, it has
specially meaning for stackable filesystem. In these filesystems, there are
multiple layers exist, and only top of these can be modified. So a whiteout
in top layer is used to hide a corresponding file in lower layer, as well
removal of whiteout will make the file appear.

Now in overlayfs, when we rename a file which is exist in lower layer, it
will be copied up to upper if it is not on upper layer yet, and then rename
it on upper layer, source file will be whiteouted to hide corresponding file
in lower layer at the same time.

So in upper layer filesystem, implementation of RENAME_WHITEOUT provide a
atomic operation for stackable filesystem to support rename operation.

There are multiple ways to implement RENAME_WHITEOUT in log of this commit:
7dcf5c3e45 ("xfs: add RENAME_WHITEOUT support") which pointed out by
Dave Chinner.

For now, we just try to follow the way that xfs/ext4 use.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-06-01 16:21:01 -07:00
Chao Yu
381722d2ac f2fs: introduce update_meta_page
Add a help function update_meta_page() to update meta page with specified
buffer.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-06-01 16:21:00 -07:00
Chao Yu
cb5c94cf3a f2fs crypto: zero next free dnode block
Now page cache of meta inode is used by garbage collection for encrypted page,
it may contain random data, so we should zero it before issuing discard.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-06-01 16:21:00 -07:00
Jaegeuk Kim
cfc4d971df f2fs crypto: split f2fs_crypto_init/exit with two parts
This patch splits f2fs_crypto_init/exit with two parts: base initialization and
memory allocation.

Firstly, f2fs module declares the base encryption memory pointers.
Then, allocating internal memories is done at the first encrypted inode access.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-06-01 16:20:59 -07:00
Chao Yu
b9da898b05 f2fs crypto: fix incorrect release for crypto ctx
When encryption feature is enable, if we rmmod f2fs module,
we will encounter a stack backtrace reported in syslog:

"BUG: Bad page state in process rmmod  pfn:aaf8a
page:f0f4f148 count:0 mapcount:129 mapping:ee2f4104 index:0x80
flags: 0xee2830a4(referenced|lru|slab|private_2|writeback|swapbacked|mlocked)
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
bad because of flags:
flags: 0x2030a0(lru|slab|private_2|writeback|mlocked)
Modules linked in: f2fs(O-) fuse bnep rfcomm bluetooth dm_crypt binfmt_misc snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm
snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device joydev ppdev mac_hid lp hid_generic i2c_piix4
parport_pc psmouse snd serio_raw parport soundcore ext4 jbd2 mbcache usbhid hid e1000 [last unloaded: f2fs]
CPU: 1 PID: 3049 Comm: rmmod Tainted: G    B      O    4.1.0-rc3+ #10
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
00000000 00000000 c0021eb4 c15b7518 f0f4f148 c0021ed8 c112e0b7 c1779174
c9b75674 000aaf8a 01b13ce1 c17791a4 f0f4f148 ee2830a4 c0021ef8 c112e3c3
00000000 f0f4f148 c0021f34 f0f4f148 ee2830a4 ef9f0000 c0021f20 c112fdf8
Call Trace:
[<c15b7518>] dump_stack+0x41/0x52
[<c112e0b7>] bad_page.part.72+0xa7/0x100
[<c112e3c3>] free_pages_prepare+0x213/0x220
[<c112fdf8>] free_hot_cold_page+0x28/0x120
[<c1073380>] ? try_to_wake_up+0x2b0/0x2b0
[<c112ff15>] __free_pages+0x25/0x30
[<c112c4fd>] mempool_free_pages+0xd/0x10
[<c112c5f1>] mempool_free+0x31/0x90
[<f0f441cf>] f2fs_exit_crypto+0x6f/0xf0 [f2fs]
[<f0f456c4>] exit_f2fs_fs+0x23/0x95f [f2fs]
[<c10c30e0>] SyS_delete_module+0x130/0x180
[<c11556d6>] ? vm_munmap+0x46/0x60
[<c15bd888>] sysenter_do_call+0x12/0x12"

The reason is that:

since commit 0827e645fd35
("f2fs crypto: shrink size of the f2fs_crypto_ctx structure") is merged,
some fields in f2fs_crypto_ctx structure are merged into a union as they
will never be used simultaneously in write path, read path or on free list.

In f2fs_exit_crypto, we traverse each crypto ctx from free list, in this
moment, our free_list field in union is valid, but still we will try to
release memory space which is pointed by other invalid field in union
structure for each ctx.

Then the error occurs, let's fix it with this patch.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-06-01 16:20:58 -07:00
Chao Yu
7bf4b5576a f2fs crypto: fix to release buffer for fname crypto
This patch fixes memory leak issue in error path of f2fs_fname_setup_filename().

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-06-01 16:20:57 -07:00
Jaegeuk Kim
ca40b03052 f2fs crypto: shrink size of the f2fs_crypto_ctx structure
This patch integrates the below patch into f2fs.

"ext4 crypto: shrink size of the ext4_crypto_ctx structure

Some fields are only used when the crypto_ctx is being used on the
read path, some are only used on the write path, and some are only
used when the structure is on free list.  Optimize memory use by using
a union."

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-06-01 16:20:57 -07:00
Jaegeuk Kim
640778fbc9 f2fs crypto: get rid of ci_mode from struct f2fs_crypt_info
This patch integrates the below patch into f2fs.

"ext4 crypto: get rid of ci_mode from struct ext4_crypt_info

The ci_mode field was superfluous, and getting rid of it gets rid of
an unused hole in the structure."

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-06-01 16:20:56 -07:00
Jaegeuk Kim
8bacf6deb0 f2fs crypto: use slab caches
This patch integrates the below patch into f2fs.

"ext4 crypto: use slab caches

Use slab caches the ext4_crypto_ctx and ext4_crypt_info structures for
slighly better memory efficiency and debuggability."

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-06-01 16:20:55 -07:00
Jaegeuk Kim
06e1bc05ca f2fs: truncate data blocks for orphan inode
As Hu reported, F2FS has a space leak problem, when conducting:

1) format a 4GB f2fs partition
2) dd a 3G file,
3) unlink it.

So, when doing f2fs_drop_inode(), we need to truncate data blocks
before skipping it.
We can also drop unused caches assigned to each inode.

Reported-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-06-01 16:20:54 -07:00
Dan Carpenter
912a83b509 f2fs: cleanup a confusing indent
The return was not indented far enough so it looked like it was supposed
to go with the other if statement.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-06-01 16:20:53 -07:00
Arnd Bergmann
7beb428eda f2fs: fix building on 32-bit architectures
A bug fix to the debug output extended the type of some local
variables to 64-bit, which now causes the kernel to fail building
because of missing 64-bit division functions:

ERROR: "__aeabi_uldivmod" [fs/f2fs/f2fs.ko] undefined!

In the kernel, we have to use div_u64 or do_div to do this,
in order to annotate that this is an expensive operation.

As the function is only called for debug out, we know this
is not performance critical, so it is safe to use div_u64.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: d1f85bd38db19 ("f2fs: avoid value overflow in showing current status")
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-06-01 16:20:53 -07:00
Jaegeuk Kim
e19ef527aa f2fs: avoid buggy functions
This patch avoids to use a buggy function for now.
It needs to fix them later.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-06-01 16:20:52 -07:00
hujianyang
08b95126c7 f2fs: add compat_ioctl to provide backward compatability
introduce compat_ioctl to regular files, but doesn't add this
functionality to f2fs_dir_operations.

While running a 32-bit busybox, I met an error like this:
(A is a directory)

chattr: reading flags on A: Inappropriate ioctl for device

This patch copies compat_ioctl from f2fs_file_operations and
fix this problem.

Signed-off-by: hujianyang <hujianyang@huawei.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-06-01 16:20:51 -07:00
Jaegeuk Kim
40a02be178 f2fs: do not issue next dnode discard redundantly
We have a discard map, so that we can avoid redundant discard issues.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-06-01 16:20:50 -07:00
Nan Jia
55a8d02bbb jfs: removed a prohibited space after opening parenthesis
Fixed a coding style issue.

Signed-off-by: Nan Jia <jiananmail@gmail.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
2015-06-01 12:22:30 -05:00
Olga Kornievskaia
e8d975e73e fixing infinite OPEN loop in 4.0 stateid recovery
Problem: When an operation like WRITE receives a BAD_STATEID, even though
recovery code clears the RECLAIM_NOGRACE recovery flag before recovering
the open state, because of clearing delegation state for the associated
inode, nfs_inode_find_state_and_recover() gets called and it makes the
same state with RECLAIM_NOGRACE flag again. As a results, when we restart
looking over the open states, we end up in the infinite loop instead of
breaking out in the next test of state flags.

Solution: unset the RECLAIM_NOGRACE set because of
calling of nfs_inode_find_state_and_recover() after returning from calling
recover_open() function.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-06-01 09:58:02 -04:00
Vladimir Zapolskiy
eaa5cd9263 fs: sysfs: don't pass count == 0 to bin file readers
If count == 0 bytes are requested by a reader, sysfs_kf_bin_read()
deliberately returns 0 without passing a potentially harmful value to
some externally defined underlying battr->read() function.

However in case of (pos == size && count) the next clause always sets
count to 0 and this value is handed over to battr->read().

The change intends to make obsolete (and remove later) a redundant
sanity check in battr->read(), if it is present, or add more
protection to struct bin_attribute users, who does not care about
input arguments.

Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-06-01 10:17:17 +09:00
Dave Chinner
b9a350a118 Merge branch 'xfs-sparse-inode' into for-next 2015-06-01 10:51:38 +10:00
Dave Chinner
e01c025fbd Merge branch 'xfs-misc-fixes-for-4.2' into for-next 2015-06-01 10:50:18 +10:00
Nan Jia
339e4f66d1 xfs: Clean up xfs_trans_dup_dqinfo
Fixed two missing spaces.

Signed-off-by: Nan Jia <jiananmail@gmail.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-06-01 10:50:00 +10:00
Linus Torvalds
8ba64dc338 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fix from Al Viro:
 "Off-by-one in d_walk()/__dentry_kill() race fix.

  It's very hard to hit; possible in the same conditions as the original
  bug, except that you need the skipped branch to contain all the
  remaining evictables, so that the d_walk()-calling loop in
  d_invalidate() decides there's nothing more to do and doesn't go for
  another pass - otherwise that next pass will sweep the sucker.

  So it's not too urgent, but seeing that the fix is obvious and the
  original commit has spread into all -stable branches..."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  d_walk() might skip too much
2015-05-31 16:00:34 -07:00
Eric Sandeen
39e56d9219 xfs: don't cast string literals
The commit:

a9273ca5 xfs: convert attr to use unsigned names

added these (unsigned char *) casts, but then the _SIZE macros
return "7" - size of a pointer minus one - not the length of
the string.  This is harmless in the kernel, because the _SIZE
macros are not used, but as we sync up with userspace, this will
matter.

I don't think the cast is necessary; i.e. assigning the string
literal to an unsigned char *, or passing it to a function
expecting an unsigned char *, should be ok, right?

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-06-01 07:15:38 +10:00
Brian Foster
7f884dc198 xfs: fix quota block reservation leak when tp allocates and frees blocks
Al Viro reports that generic/231 fails frequently on XFS and bisected
the problem to the following commit:

	5d11fb4b xfs: rework zero range to prevent invalid i_size updates

... which is just the first commit that happens to cause fsx to
reproduce the problem. fsx reproduces via zero range calls. The
aforementioned commit overhauls zero range to use hole punch and
fallocate. As it turns out, the problem is reproducible on demand using
basic hole punch as follows:

$ mkfs.xfs -f -m crc=1,finobt=1 <dev>
$ mount <dev> /mnt -o uquota
$ xfs_io -f -c "falloc 0 50m" /mnt/file
$ for i in $(seq 1 20); do xfs_io -c "fpunch ${i}m 32k" /mnt/file; done
$ rm -f /mnt/file
$ repquota -us /mnt
...
User            used    soft    hard  grace    used  soft  hard  grace
----------------------------------------------------------------------
root      --     32K      0K      0K              3     0     0

A file is allocated with a single 50m extent. The extent count increases
via hole punches until the bmap converts to btree format. The file is
removed but quota reports 32k of space usage for the user. This
reservation is effectively leaked for the lifetime of the mount.

The reason this occurs is because the quota block reservation tracking
is confused when a transaction happens to free and allocate blocks at
the same time. Consider the following sequence of events:

- tp is allocated from xfs_free_file_space() and reserves several blocks
  for btree management. Blocks are reserved against the dquot and marked
  as such in the transaction (qtrx->qt_blk_res).
- 8 blocks are accounted free when the 32k range is punched out.
  xfs_trans_mod_dquot() is called with XFS_TRANS_DQ_BCOUNT and sets
  ->qt_bcount_delta to -8.
- Subsequently, a block is allocated against the same transaction by
  xfs_bmap_extents_to_btree() for btree conversion. A call to
  xfs_trans_mod_dquot() increases qt_blk_res_used to 1 and qt_bcount_delta
  to -7.
- The transaction is dup'd and committed by xfs_bmap_finish().
  xfs_trans_dup_dqinfo() sets the first transaction up such that it has a
  matching qt_blk_res and qt_blk_res_used of 1. The remaining unused
  reservation is transferred to the duplicate tp.

When the transactions are committed, the dquots are fixed up in
xfs_trans_apply_dquot_deltas() according to one of two methods:

1.) If the transaction holds a block reservation (->qt_blk_res != 0),
_only_ the unused portion reservation is unaccounted from the dquot.
Note that the tp duplication behavior of xfs_bmap_finish() makes it such
that qt_blk_res is typically 0 for tp's with unused reservation.
2.) Otherwise, the dquot is fixed up based on the block delta
(->qt_bcount_delta) created by the transaction.

Therefore, if a transaction has a negative qt_bcount_delta and positive
qt_blk_res_used, the former set of blocks that have been removed from
the file are never factored out of the in-core dquot reservation.
Instead, *_apply_dquot_deltas() sees 1 block used out of a 1 block
reservation and believes there is nothing to fix up. The on-disk
d_bcount is updated independently from qt_bcount_delta, and thus is
correct (and allows the quota usage to correct on remount).

To deal with this situation, we effectively want the "used reservation"
part of the transaction to be consistent with any freed blocks with
respect to quota tracking. For example, if 8 blocks are freed, the
subsequent single block allocation does not need to consume the initial
reservation made by the tp. Instead, it simply borrows one from the
previously freed. One possible implementation of such borrowing is to
avoid the blks_res_used increment when bcount_delta is negative. This
alone is flawed logic in that it only handles the case where blocks are
freed before allocated, however.

Rather than add more complexity to manage synchronization between
bcount_delta and blks_res_used, kill the latter entirely. blk_res_used
is only updated in one place and always in sync with delta_bcount.
Therefore, the net block reservation consumption of the transaction is
always available from bcount_delta. Calculate the reservation
consumption on the fly where necessary based on whether the tp has a
reservation and results in a positive net block delta on the inode.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-06-01 07:15:37 +10:00
Brian Foster
2e588a46aa xfs: always log the inode on unwritten extent conversion
The fsync() requirements for crash consistency on XFS are to flush file
data and force any in-core inode updates to the log. We currently check
whether the inode is pinned to identify whether the log needs to be
forced, since a non-zero pin count generally represents an inode that
has transactions awaiting a flush to the on-disk log.

This is not sufficient in all cases, however. Reports of xfstests test
generic/311 failures on ppc64/s390x hosts have identified failures to
fsync outstanding inode modifications due to the inode not being pinned
at the time of the fsync. This occurs because certain bmap updates can
complete by logging bmapbt buffers but without ever dirtying (and thus
pinning) the core inode. The following is a specific incarnation of this
problem:

$ mount $dev /mnt -o noatime,nobarrier
$ for i in $(seq 0 2 31); do \
        xfs_io -f -c "falloc $((i * 32768)) 32k" -c fsync /mnt/file; \
	done
$ xfs_io -c "pwrite -S 0 80k 16k" -c fsync -c "pwrite 76k 4k" -c fsync /mnt/file; \
	hexdump /mnt/file; \
	./xfstests-dev/src/godown /mnt
...
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0013000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd
*
0014000 0000 0000 0000 0000 0000 0000 0000 0000
*
00f8000
$ umount /mnt; mount ...
$ hexdump /mnt/file
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
00f8000

In short, the unwritten extent conversion for the last write is lost
despite the fact that an fsync executed before the filesystem was
shutdown. Note that this is impossible to reproduce on v5 supers due to
unconditional time callbacks for di_changecount and highly difficult to
reproduce on CONFIG_HZ=1000 kernels due to those same callbacks
frequently updating cmtime prior to the bmap update. CONFIG_HZ=100
reduces timer granularity enough to increase the odds that time updates
are skipped and allows this to reproduce within a handful of attempts.

To deal with this problem, unconditionally log the core in the unwritten
extent conversion path. Fix up logflags after the extent conversion to
keep the extent update code consistent with the other extent update
helpers. This fixup is not necessary for the other (hole, delay) extent
helpers because they execute in the block allocation codepath, which
already logs the inode for other reasons (e.g., for di_nblocks).

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-06-01 07:15:23 +10:00
Chao Yu
e298e73bd7 ext4 crypto: release crypto resource on module exit
Crypto resource should be released when ext4 module exits, otherwise
it will cause memory leak.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31 13:37:35 -04:00
Theodore Ts'o
abdd438b26 ext4 crypto: handle unexpected lack of encryption keys
Fix up attempts by users to try to write to a file when they don't
have access to the encryption key.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31 13:35:39 -04:00
Theodore Ts'o
4d3c4e5b8c ext4 crypto: allocate the right amount of memory for the on-disk symlink
Previously we were taking the required padding when allocating space
for the on-disk symlink.  This caused a buffer overrun which could
trigger a krenel crash when running fsstress.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31 13:35:32 -04:00
Theodore Ts'o
82d0d3e7e6 ext4 crypto: clean up error handling in ext4_fname_setup_filename
Fix a potential memory leak where fname->crypto_buf.name wouldn't get
freed in some error paths, and also make the error handling easier to
understand/audit.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31 13:35:22 -04:00
Theodore Ts'o
d87f6d78e9 ext4 crypto: policies may only be set on directories
Thanks to Chao Yu <chao2.yu@samsung.com> for pointing out we were
missing this check.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31 13:35:14 -04:00
Theodore Ts'o
c2faccaff6 ext4 crypto: enforce crypto policy restrictions on cross-renames
Thanks to Chao Yu <chao2.yu@samsung.com> for pointing out the need for
this check.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31 13:35:09 -04:00
Theodore Ts'o
e709e9df64 ext4 crypto: encrypt tmpfile located in encryption protected directory
Factor out calls to ext4_inherit_context() and move them to
__ext4_new_inode(); this fixes a problem where ext4_tmpfile() wasn't
calling calling ext4_inherit_context(), so the temporary file wasn't
getting protected.  Since the blocks for the tmpfile could end up on
disk, they really should be protected if the tmpfile is created within
the context of an encrypted directory.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31 13:35:02 -04:00
Theodore Ts'o
6bc445e0ff ext4 crypto: make sure the encryption info is initialized on opendir(2)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31 13:34:57 -04:00
Theodore Ts'o
5555702955 ext4 crypto: set up encryption info for new inodes in ext4_inherit_context()
Set up the encryption information for newly created inodes immediately
after they inherit their encryption context from their parent
directories.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31 13:34:29 -04:00
Theodore Ts'o
95ea68b4c7 ext4 crypto: fix memory leaks in ext4_encrypted_zeroout
ext4_encrypted_zeroout() could end up leaking a bio and bounce page.
Fortunately it's not used much.  While we're fixing things up,
refactor out common code into the static function alloc_bounce_page()
and fix up error handling if mempool_alloc() fails.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31 13:34:24 -04:00
Theodore Ts'o
c936e1ec28 ext4 crypto: use per-inode tfm structure
As suggested by Herbert Xu, we shouldn't allocate a new tfm each time
we read or write a page.  Instead we can use a single tfm hanging off
the inode's crypt_info structure for all of our encryption needs for
that inode, since the tfm can be used by multiple crypto requests in
parallel.

Also use cmpxchg() to avoid races that could result in crypt_info
structure getting doubly allocated or doubly freed.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31 13:34:22 -04:00
Theodore Ts'o
71dea01ea2 ext4 crypto: require CONFIG_CRYPTO_CTR if ext4 encryption is enabled
On arm64 this is apparently needed for CTS mode to function correctly.
Otherwise attempts to use CTS return ENOENT.

Change-Id: I732ea9a5157acc76de5b89edec195d0365f4ca63
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31 13:31:37 -04:00
Theodore Ts'o
614def7013 ext4 crypto: shrink size of the ext4_crypto_ctx structure
Some fields are only used when the crypto_ctx is being used on the
read path, some are only used on the write path, and some are only
used when the structure is on free list.  Optimize memory use by using
a union.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31 13:31:34 -04:00
Richard Weinberger
f74a14e870 um: Remove hppfs
hppfs (honeypot procfs) was an attempt to use UML as honeypot.
It was never stable nor in heavy use.

As Al Viro and Christoph Hellwig pointed some major issues out
it is better to let it die.

Signed-off-by: Richard Weinberger <richard@nod.at>
2015-05-31 13:23:08 +02:00
Linus Torvalds
1be44e234b xfs: update for 4.1-rc6
Changes in this update:
 o regression fix for new rename whiteout code
 o regression fixes for new superblock generic per-cpu counter code
 o fix for incorrect error return sign introduced in 3.17
 o metadata corruption fixes that need to go back to -stable kernels
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJVaO0JAAoJEK3oKUf0dfodd4UQANRdfXnUrpyQGhVS7HFoFoVt
 FIQ52pPGbMu72+DqHc+Q41uvgAPe65LFB2VUL6CUGCMExstF72F5+QonzppMgkMo
 unPER3eB8ya03SY+Kp+803ZGgzI2Nl2M6w8Kof730/RUk56PTGYIx4eLXd6iZSli
 RsYjw8JDbeue5OQo5FPmLCSQ/Kr5ZJXbgWVPyWkKg9aCcXLN5YSJIV3xcMTK9Q2I
 LqqODkyatnGc6YxGAKddS7Xzt1ntlZgbe5mndQw04a2g0Lf6emPH5r8b0UJXIu96
 advOBX0pEbad4FeFS6Mo5D+nNCaaNP4WzN7wgdb+BYNVw3ss4Ebam7+yY6Gexg6y
 bzZOEkk9saL4YeBDgyYICNu7kG4BRVKRQiiX220G6SFXM3nqbl7qBPb3kVFyDpcI
 RRuFJ0ZV0kFJ+3IQ4xVnIh6nootceRk/mvZaK5HhLhQLzklpZ8fj4HF3oBDUAnvN
 wNd+7GoZy7zldjCkbF4BP3GjUeW+b9ngrCNc+bFXi5cUbdECXAa2krjxyY+MlQF2
 veNVVcsoRdfeM0VjJh2/piGJxMWIlRqXdKzPKsfMWnlIaJ6YyslfbSq+2K7LxgGR
 Ho3Sjt0oUuPMZ9F/Mjj+XDqwmzgooUHXNyDBxhGXBNBPjApcRLcb2vQ2SrWEmeGJ
 vZmC2R1ZoGdBJg8a55BT
 =w5SP
 -----END PGP SIGNATURE-----

Merge tag 'xfs-for-linus-4.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs

Pull xfs fixes from Dave Chinner:
 "This is a little larger than I'd like late in the release cycle, but
  all the fixes are for regressions introduced in the 4.1-rc1 merge, or
  are needed back in -stable kernels fairly quickly as they are
  filesystem corruption or userspace visible correctness issues.

  Changes in this update:

   - regression fix for new rename whiteout code

   - regression fixes for new superblock generic per-cpu counter code

   - fix for incorrect error return sign introduced in 3.17

   - metadata corruption fixes that need to go back to -stable kernels"

* tag 'xfs-for-linus-4.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
  xfs: fix broken i_nlink accounting for whiteout tmpfile inode
  xfs: xfs_iozero can return positive errno
  xfs: xfs_attr_inactive leaves inconsistent attr fork state behind
  xfs: extent size hints can round up extents past MAXEXTLEN
  xfs: inode and free block counters need to use __percpu_counter_compare
  percpu_counter: batch size aware __percpu_counter_compare()
  xfs: use percpu_counter_read_positive for mp->m_icount
2015-05-29 16:45:45 -07:00
Andreas Gruenbacher
2f6b3879c2 nfsd: Remove dead declarations
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-05-29 11:04:04 -04:00