linux-pinenote

Author	SHA1	Message	Date
Filipe Manana	5f9a8a51d8	Btrfs: add semaphore to synchronize direct IO writes with fsync Due to the optimization of lockless direct IO writes (the inode's i_mutex is not held) introduced in commit `38851cc19a` ("Btrfs: implement unlocked dio write"), we started having races between such writes and concurrent fsync operations that use the fast fsync path. These races were addressed in the patches titled "Btrfs: fix race between fsync and lockless direct IO writes" and "Btrfs: fix race between fsync and direct IO writes for prealloc extents". The races happened because the direct IO path, like every other write path, creates extent maps followed by the corresponding ordered extents, while the fast fsync path first collected ordered extents and then collected extent maps. This made it possible to log file extent items (based on the collected extent maps) without waiting for the corresponding ordered extents to complete (get their IO done). The two fixes mentioned before added a solution that consists of making the direct IO path create first the ordered extents and then the extent maps, while the fsync path attempts to collect any new ordered extents once it collects the extent maps. This was simple and did not require adding any synchronization primitive to any data structure (struct btrfs_inode for example) but it makes things more fragile for future development endeavours and adds an exceptional approach compared to the other write paths. This change adds a read-write semaphore to the btrfs inode structure and makes the direct IO path create the extent maps and the ordered extents while holding read access on that semaphore, while the fast fsync path collects extent maps and ordered extents while holding write access on that semaphore. The logic for direct IO write path is encapsulated in a new helper function that is used both for cow and nocow direct IO writes. 
Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Josef Bacik <jbacik@fb.com>	2016-05-13 01:59:36 +01:00
Filipe Manana	f78c436c39	Btrfs: fix race between block group relocation and nocow writes Relocation of a block group waits for all existing tasks flushing delalloc, starting direct IO writes and any ordered extents before starting the relocation process. However for direct IO writes that end up doing nocow (inode either has the flag nodatacow set or the write is against a prealloc extent) we have a short time window that allows for a race that makes relocation proceed without waiting for the direct IO write to complete first, resulting in data loss after the relocation finishes. This is illustrated by the following diagram: CPU 1 CPU 2 btrfs_relocate_block_group(bg X) direct IO write starts against an extent in block group X using nocow mode (inode has the nodatacow flag or the write is for a prealloc extent) btrfs_direct_IO() btrfs_get_blocks_direct() --> can_nocow_extent() returns 1 btrfs_inc_block_group_ro(bg X) --> turns block group into RO mode btrfs_wait_ordered_roots() --> returns and does not know about the DIO write happening at CPU 2 (the task there has not yet created an ordered extent) relocate_block_group(bg X) --> rc->stage == MOVE_DATA_EXTENTS find_next_extent() --> returns extent that the DIO write is going to write to relocate_data_extent() relocate_file_extent_cluster() --> reads the extent from disk into pages belonging to the relocation inode and dirties them --> creates DIO ordered extent btrfs_submit_direct() --> submits bio against a location on disk obtained from an extent map before the relocation started btrfs_wait_ordered_range() --> writes all the pages read before to disk (belonging to the relocation inode) relocation finishes bio completes and wrote new data to the old location of the block group So fix this by tracking the number of nocow writers for a block group and make sure relocation waits for that number to go down to 0 before starting to move the extents. 
The same race can also happen with buffered writes in nocow mode since the patch I recently made titled "Btrfs: don't do unnecessary delalloc flushes when relocating", because we are no longer flushing all delalloc, which served as a synchronization mechanism (due to page locking) and ensured the ordered extents for nocow buffered writes were created before we called btrfs_wait_ordered_roots(). The race with direct IO writes in nocow mode existed before that patch (no pages are locked or used during direct IO) and that patch fixed only races with direct IO writes that do cow. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Josef Bacik <jbacik@fb.com>	2016-05-13 01:59:34 +01:00
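The fix described in commit f78c436c39 (count the nocow writers of a block group and make relocation wait until the count reaches 0) can be sketched in user space roughly as follows. All names here (`struct toy_block_group`, `toy_inc_nocow_writers()`, and so on) are hypothetical, and a mutex plus condition variable stands in for whatever primitives the kernel patch actually uses.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a block group; not the kernel structure. */
struct toy_block_group {
    pthread_mutex_t lock;
    pthread_cond_t no_writers;  /* signalled when nocow_writers drops to 0 */
    bool ro;                    /* set once relocation made the group read-only */
    int nocow_writers;          /* nocow writes currently in flight */
};

static void toy_bg_init(struct toy_block_group *bg)
{
    pthread_mutex_init(&bg->lock, NULL);
    pthread_cond_init(&bg->no_writers, NULL);
    bg->ro = false;
    bg->nocow_writers = 0;
}

/* A writer registers before going nocow; refused once the group is RO. */
static bool toy_inc_nocow_writers(struct toy_block_group *bg)
{
    bool ok;

    pthread_mutex_lock(&bg->lock);
    ok = !bg->ro;
    if (ok)
        bg->nocow_writers++;
    pthread_mutex_unlock(&bg->lock);
    return ok;  /* false: the caller must fall back to a cow write */
}

/* A writer unregisters once its ordered extent exists. */
static void toy_dec_nocow_writers(struct toy_block_group *bg)
{
    pthread_mutex_lock(&bg->lock);
    assert(bg->nocow_writers > 0);
    if (--bg->nocow_writers == 0)
        pthread_cond_broadcast(&bg->no_writers);
    pthread_mutex_unlock(&bg->lock);
}

/* Relocation: flip to RO, then wait until no nocow writer is in flight. */
static void toy_set_ro_and_wait(struct toy_block_group *bg)
{
    pthread_mutex_lock(&bg->lock);
    bg->ro = true;
    while (bg->nocow_writers > 0)
        pthread_cond_wait(&bg->no_writers, &bg->lock);
    pthread_mutex_unlock(&bg->lock);
}
```

Because the RO check and the increment happen under the same lock, a writer either registers before the group turns read-only (and relocation waits for it) or is refused and never writes to the old location.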
Filipe Manana	0b901916a0	Btrfs: fix race between fsync and direct IO writes for prealloc extents When we do a direct IO write against a preallocated extent (fallocate) that does not go beyond the i_size of the inode, we do the write operation without holding the inode's i_mutex (an optimization that landed in commit `38851cc19a` ("Btrfs: implement unlocked dio write")). This allows for a very tiny time window where a race can happen with a concurrent fsync using the fast code path, as the direct IO write path creates first a new extent map (no longer flagged as a prealloc extent) and then it creates the ordered extent, while the fast fsync path first collects ordered extents and then it collects extent maps. This allows for the possibility of the fast fsync path to collect the new extent map without collecting the new ordered extent, and therefore logging an extent item based on the extent map without waiting for the ordered extent to be created and complete. This can result in a situation where after a log replay we end up with an extent not marked anymore as prealloc but it was only partially written (or not written at all), exposing random, stale or garbage data corresponding to the unwritten pages and without any checksums in the csum tree covering the extent's range. This is an extension of what was done in commit `de0ee0edb2` ("Btrfs: fix race between fsync and lockless direct IO writes"). So fix this by creating first the ordered extent and then the extent map, so that this way if the fast fsync path collects the new extent map it also collects the corresponding ordered extent. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Josef Bacik <jbacik@fb.com>	2016-05-13 01:59:32 +01:00
Filipe Manana	5062af35c3	Btrfs: fix number of transaction units for renames with whiteout When we do a rename with the whiteout flag, we need to create the whiteout inode, which in the worst case requires 5 transaction units (1 inode item, 1 inode ref, 2 dir items and 1 xattr if selinux is enabled). So bump the number of transaction units from 11 to 16 if the whiteout flag is set. Signed-off-by: Filipe Manana <fdmanana@suse.com>	2016-05-13 01:59:30 +01:00
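The arithmetic behind commit 5062af35c3 is simple enough to spell out. The sketch below only reproduces the numbers given in the message (11 units for a plain rename, 5 more for the whiteout inode); the enum and function names are hypothetical and do not come from the btrfs sources.

```c
#include <assert.h>

/* Unit counts taken from the commit message; the names are illustrative. */
enum {
    TOY_RENAME_BASE_UNITS   = 11, /* rename without a whiteout */
    TOY_WHITEOUT_INODE_ITEM = 1,
    TOY_WHITEOUT_INODE_REF  = 1,
    TOY_WHITEOUT_DIR_ITEMS  = 2,
    TOY_WHITEOUT_XATTR      = 1   /* worst case: selinux enabled */
};

/* Worst-case number of transaction units to reserve for a rename. */
static unsigned int toy_rename_trans_units(int has_whiteout)
{
    unsigned int units = TOY_RENAME_BASE_UNITS;

    if (has_whiteout)
        units += TOY_WHITEOUT_INODE_ITEM + TOY_WHITEOUT_INODE_REF +
                 TOY_WHITEOUT_DIR_ITEMS + TOY_WHITEOUT_XATTR;
    return units;
}
```

Calling `toy_rename_trans_units(1)` yields 16, matching the bump from 11 to 16 described above.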
Filipe Manana	376e5a57bf	Btrfs: pin logs earlier when doing a rename exchange operation The btrfs_rename_exchange() started as a copy-paste from btrfs_rename(), which had a race fixed by my previous patch titled "Btrfs: pin log earlier when renaming", and so it suffers from the same problem. We pin the logs of the affected roots after we insert the new inode references, leaving a time window where concurrent tasks logging the inodes can end up logging both the new and old references, resulting in log trees that when replayed can turn the metadata into inconsistent states. This behaviour was added to btrfs_rename() in 2009 without any explanation of why the logs were not pinned earlier, just leaving a comment about the possibility of the race. As of today it's perfectly safe and sane to pin the logs before we start doing any of the steps involved in the rename operation. Signed-off-by: Filipe Manana <fdmanana@suse.com>	2016-05-13 01:59:28 +01:00
Filipe Manana	86e8aa0e77	Btrfs: unpin logs if rename exchange operation fails If rename exchange operations fail at some point after we pinned any of the logs, we end up aborting the current transaction but never unpin the logs, which leaves concurrent tasks that are trying to sync the logs (as part of an fsync request from user space) blocked forever and prevents the filesystem from being unmounted. Fix this by safely unpinning the log. Signed-off-by: Filipe Manana <fdmanana@suse.com>	2016-05-13 01:59:26 +01:00
Filipe Manana	c990161888	Btrfs: fix inode leak on failure to setup whiteout inode in rename If we failed to fully setup the whiteout inode during a rename operation with the whiteout flag, we ended up leaking the inode, not decrementing its link count nor removing all its items from the fs/subvol tree. Signed-off-by: Filipe Manana <fdmanana@suse.com>	2016-05-13 01:59:23 +01:00
Dan Fuhry	cdd1fedf82	btrfs: add support for RENAME_EXCHANGE and RENAME_WHITEOUT Two new flags, RENAME_EXCHANGE and RENAME_WHITEOUT, provide for new behavior in the renameat2() syscall. This behavior is primarily used by overlayfs. This patch adds support for these flags to btrfs, enabling it to be used as a fully functional upper layer for overlayfs. RENAME_EXCHANGE support was written by Davide Italiano and originally submitted on 2 April 2015. Signed-off-by: Davide Italiano <dccitaliano@gmail.com> Signed-off-by: Dan Fuhry <dfuhry@datto.com> [ remove unlikely ] Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com>	2016-05-13 01:59:21 +01:00
Filipe Manana	c4aba95454	Btrfs: pin log earlier when renaming We were pinning the log right after the first step in the rename operation (inserting inode ref for the new name in the destination directory) instead of doing it before. This behaviour was introduced in 2009 for some reason that was mentioned neither in the changelog nor in any comment, with the drawback of a small time window where concurrent log writers can end up logging the new inode reference for the inode we are renaming while the rename operation is in progress (so that we can end up with a log containing both the new and old references). As of today there's no longer any reason not to pin the log before that first step, so just fix this. Signed-off-by: Filipe Manana <fdmanana@suse.com>	2016-05-13 01:59:19 +01:00
Filipe Manana	3dc9e8f767	Btrfs: unpin log if rename operation fails If rename operations fail at some point after we pinned the log, we end up aborting the current transaction but never unpin the log, which leaves concurrent tasks that are trying to sync the log (as part of an fsync request from user space) blocked forever and prevents the filesystem from being unmounted. Fix this by safely unpinning the log. Signed-off-by: Filipe Manana <fdmanana@suse.com>	2016-05-13 01:59:18 +01:00
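The pin/unpin pattern from commits c4aba95454 and 3dc9e8f767 (pin the log before the first step, and unpin it on every exit path) can be sketched as below. This is a toy model under stated assumptions: `struct toy_log`, `toy_pin_log()`, `toy_unpin_log()` and `toy_rename()` are hypothetical names, and the three "steps" are placeholders for the real rename work.

```c
#include <assert.h>

/* Hypothetical stand-in for a log root; only the pin counter matters. */
struct toy_log {
    int pinned;  /* a log sync would block while this is greater than 0 */
};

static void toy_pin_log(struct toy_log *log)
{
    log->pinned++;
}

static void toy_unpin_log(struct toy_log *log)
{
    assert(log->pinned > 0);
    log->pinned--;
}

/*
 * Toy rename. fail_at_step selects which step (1..3) fails; 0 means
 * every step succeeds. Returns 0 on success and -1 on failure. In both
 * cases the log is unpinned before returning.
 */
static int toy_rename(struct toy_log *log, int fail_at_step)
{
    int ret = 0;
    int step;

    toy_pin_log(log);          /* pinned before the first step */
    for (step = 1; step <= 3; step++) {
        if (step == fail_at_step) {
            ret = -1;          /* real code would abort the transaction */
            goto out;
        }
    }
out:
    toy_unpin_log(log);        /* runs on success and on failure alike */
    return ret;
}
```

Routing every exit through a single `out` label is one common way to guarantee the unpin; whether the actual patch is structured this way is not stated in the message.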
Filipe Manana	9cfa3e34e2	Btrfs: don't do unnecessary delalloc flushes when relocating Before we start the actual relocation process of a block group, we do calls to flush delalloc of all inodes and then wait for ordered extents to complete. However we do these flush calls just to make sure we don't race with concurrent tasks that have actually already started to run delalloc and have allocated an extent from the block group we want to relocate, right before we set it to readonly mode, but have not yet created the respective ordered extents. The flush calls make us wait for such concurrent tasks because they end up calling filemap_fdatawrite_range() (through btrfs_start_delalloc_roots() -> __start_delalloc_inodes() -> btrfs_alloc_delalloc_work() -> btrfs_run_delalloc_work()) which ends up serializing us with those tasks due to attempts to lock the same pages (and the delalloc flush procedure calls the allocator and creates the ordered extents before unlocking the pages). These flushing calls not only make us waste time (cpu, IO) but also reduce the chances of writing larger extents (applications might be writing to contiguous ranges and we flush before they finish dirtying the whole ranges). So make sure we don't flush delalloc and just wait for concurrent tasks that have already started flushing delalloc and have allocated an extent from the block group we are about to relocate. This change also ends up fixing a race with direct IO writes that makes relocation not wait for direct IO ordered extents. 
This race is illustrated by the following diagram: CPU 1 CPU 2 btrfs_relocate_block_group(bg X) starts direct IO write, target inode currently has no ordered extents ongoing nor dirty pages (delalloc regions), therefore the root for our inode is not in the list fs_info->ordered_roots btrfs_direct_IO() __blockdev_direct_IO() btrfs_get_blocks_direct() btrfs_lock_extent_direct() locks range in the io tree btrfs_new_extent_direct() btrfs_reserve_extent() --> extent allocated from bg X btrfs_inc_block_group_ro(bg X) btrfs_start_delalloc_roots() __start_delalloc_inodes() --> does nothing, no delalloc ranges in the inode's io tree so the inode's root is not in the list fs_info->delalloc_roots btrfs_wait_ordered_roots() --> does not find the inode's root in the list fs_info->ordered_roots --> ends up not waiting for the direct IO write started by the task at CPU 2 relocate_block_group(rc->stage == MOVE_DATA_EXTENTS) prepare_to_relocate() btrfs_commit_transaction() iterates the extent tree, using its commit root and moves extents into new locations btrfs_add_ordered_extent_dio() --> now an ordered extent is created and added to the list root->ordered_extents and the root added to the list fs_info->ordered_roots --> this is too late and the task at CPU 1 already started the relocation btrfs_commit_transaction() btrfs_finish_ordered_io() btrfs_alloc_reserved_file_extent() --> adds delayed data reference for the extent allocated from bg X relocate_block_group(rc->stage == UPDATE_DATA_PTRS) prepare_to_relocate() btrfs_commit_transaction() --> delayed refs are run, so an extent item for the allocated extent from bg X is added to extent tree --> commit roots are switched, so the next scan in the extent tree will see the extent item sees the extent in the extent tree When this happens the relocation produces the following warning when it finishes: [ 7260.832836] ------------[ cut here ]------------ [ 7260.834653] WARNING: CPU: 5 PID: 6765 at fs/btrfs/relocation.c:4318 
btrfs_relocate_block_group+0x245/0x2a1 [btrfs]() [ 7260.838268] Modules linked in: btrfs crc32c_generic xor ppdev raid6_pq psmouse sg acpi_cpufreq evdev i2c_piix4 tpm_tis serio_raw tpm i2c_core pcspkr parport_pc [ 7260.850935] CPU: 5 PID: 6765 Comm: btrfs Not tainted 4.5.0-rc6-btrfs-next-28+ #1 [ 7260.852998] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014 [ 7260.852998] 0000000000000000 ffff88020bf57bc0 ffffffff812648b3 0000000000000000 [ 7260.852998] 0000000000000009 ffff88020bf57bf8 ffffffff81051608 ffffffffa03c1b2d [ 7260.852998] ffff8800b2bbb800 0000000000000000 ffff8800b17bcc58 ffff8800399dd000 [ 7260.852998] Call Trace: [ 7260.852998] [<ffffffff812648b3>] dump_stack+0x67/0x90 [ 7260.852998] [<ffffffff81051608>] warn_slowpath_common+0x99/0xb2 [ 7260.852998] [<ffffffffa03c1b2d>] ? btrfs_relocate_block_group+0x245/0x2a1 [btrfs] [ 7260.852998] [<ffffffff810516d4>] warn_slowpath_null+0x1a/0x1c [ 7260.852998] [<ffffffffa03c1b2d>] btrfs_relocate_block_group+0x245/0x2a1 [btrfs] [ 7260.852998] [<ffffffffa039d9de>] btrfs_relocate_chunk.isra.29+0x66/0xdb [btrfs] [ 7260.852998] [<ffffffffa039f314>] btrfs_balance+0xde1/0xe4e [btrfs] [ 7260.852998] [<ffffffff8127d671>] ? debug_smp_processor_id+0x17/0x19 [ 7260.852998] [<ffffffffa03a9583>] btrfs_ioctl_balance+0x255/0x2d3 [btrfs] [ 7260.852998] [<ffffffffa03ac96a>] btrfs_ioctl+0x11e0/0x1dff [btrfs] [ 7260.852998] [<ffffffff811451df>] ? handle_mm_fault+0x443/0xd63 [ 7260.852998] [<ffffffff81491817>] ? _raw_spin_unlock+0x31/0x44 [ 7260.852998] [<ffffffff8108b36a>] ? arch_local_irq_save+0x9/0xc [ 7260.852998] [<ffffffff811876ab>] vfs_ioctl+0x18/0x34 [ 7260.852998] [<ffffffff81187cb2>] do_vfs_ioctl+0x550/0x5be [ 7260.852998] [<ffffffff81190c30>] ? 
__fget_light+0x4d/0x71 [ 7260.852998] [<ffffffff81187d77>] SyS_ioctl+0x57/0x79 [ 7260.852998] [<ffffffff81492017>] entry_SYSCALL_64_fastpath+0x12/0x6b [ 7260.893268] ---[ end trace eb7803b24ebab8ad ]--- This is because at the end of the first stage, in relocate_block_group(), we commit the current transaction, which makes delayed refs run, the commit roots are switched and so the second stage will find the extent item that the ordered extent added to the delayed refs. But this extent was not moved (ordered extent completed after first stage finished), so at the end of the relocation our block group item still has a positive used bytes counter, triggering a warning at the end of btrfs_relocate_block_group(). Later on when trying to read the extent contents from disk we hit a BUG_ON() due to the inability to map a block with a logical address that belongs to the block group we relocated and is no longer valid, resulting in the following trace: [ 7344.885290] BTRFS critical (device sdi): unable to find logical 12845056 len 4096 [ 7344.887518] ------------[ cut here ]------------ [ 7344.888431] kernel BUG at fs/btrfs/inode.c:1833! 
[ 7344.888431] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 7344.888431] Modules linked in: btrfs crc32c_generic xor ppdev raid6_pq psmouse sg acpi_cpufreq evdev i2c_piix4 tpm_tis serio_raw tpm i2c_core pcspkr parport_pc [ 7344.888431] CPU: 0 PID: 6831 Comm: od Tainted: G W 4.5.0-rc6-btrfs-next-28+ #1 [ 7344.888431] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014 [ 7344.888431] task: ffff880215818600 ti: ffff880204684000 task.ti: ffff880204684000 [ 7344.888431] RIP: 0010:[<ffffffffa037c88c>] [<ffffffffa037c88c>] btrfs_merge_bio_hook+0x54/0x6b [btrfs] [ 7344.888431] RSP: 0018:ffff8802046878f0 EFLAGS: 00010282 [ 7344.888431] RAX: 00000000ffffffea RBX: 0000000000001000 RCX: 0000000000000001 [ 7344.888431] RDX: ffff88023ec0f950 RSI: ffffffff8183b638 RDI: 00000000ffffffff [ 7344.888431] RBP: ffff880204687908 R08: 0000000000000001 R09: 0000000000000000 [ 7344.888431] R10: ffff880204687770 R11: ffffffff82f2d52d R12: 0000000000001000 [ 7344.888431] R13: ffff88021afbfee8 R14: 0000000000006208 R15: ffff88006cd199b0 [ 7344.888431] FS: 00007f1f9e1d6700(0000) GS:ffff88023ec00000(0000) knlGS:0000000000000000 [ 7344.888431] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 7344.888431] CR2: 00007f1f9dc8cb60 CR3: 000000023e3b6000 CR4: 00000000000006f0 [ 7344.888431] Stack: [ 7344.888431] 0000000000001000 0000000000001000 ffff880204687b98 ffff880204687950 [ 7344.888431] ffffffffa0395c8f ffffea0004d64d48 0000000000000000 0000000000001000 [ 7344.888431] ffffea0004d64d48 0000000000001000 0000000000000000 0000000000000000 [ 7344.888431] Call Trace: [ 7344.888431] [<ffffffffa0395c8f>] submit_extent_page+0xf5/0x16f [btrfs] [ 7344.888431] [<ffffffffa03970ac>] __do_readpage+0x4a0/0x4f1 [btrfs] [ 7344.888431] [<ffffffffa039680d>] ? btrfs_create_repair_bio+0xcb/0xcb [btrfs] [ 7344.888431] [<ffffffffa037eeb4>] ? btrfs_writepage_start_hook+0xbc/0xbc [btrfs] [ 7344.888431] [<ffffffff8108df55>] ? 
trace_hardirqs_on+0xd/0xf [ 7344.888431] [<ffffffffa039728c>] __do_contiguous_readpages.constprop.26+0xc2/0xe4 [btrfs] [ 7344.888431] [<ffffffffa037eeb4>] ? btrfs_writepage_start_hook+0xbc/0xbc [btrfs] [ 7344.888431] [<ffffffffa039739b>] __extent_readpages.constprop.25+0xed/0x100 [btrfs] [ 7344.888431] [<ffffffff81129d24>] ? lru_cache_add+0xe/0x10 [ 7344.888431] [<ffffffffa0397ea8>] extent_readpages+0x160/0x1aa [btrfs] [ 7344.888431] [<ffffffffa037eeb4>] ? btrfs_writepage_start_hook+0xbc/0xbc [btrfs] [ 7344.888431] [<ffffffff8115daad>] ? alloc_pages_current+0xa9/0xcd [ 7344.888431] [<ffffffffa037cdc9>] btrfs_readpages+0x1f/0x21 [btrfs] [ 7344.888431] [<ffffffff81128316>] __do_page_cache_readahead+0x168/0x1fc [ 7344.888431] [<ffffffff811285a0>] ondemand_readahead+0x1f6/0x207 [ 7344.888431] [<ffffffff811285a0>] ? ondemand_readahead+0x1f6/0x207 [ 7344.888431] [<ffffffff8111cf34>] ? pagecache_get_page+0x2b/0x154 [ 7344.888431] [<ffffffff8112870e>] page_cache_sync_readahead+0x3d/0x3f [ 7344.888431] [<ffffffff8111dbf7>] generic_file_read_iter+0x197/0x4e1 [ 7344.888431] [<ffffffff8117773a>] __vfs_read+0x79/0x9d [ 7344.888431] [<ffffffff81178050>] vfs_read+0x8f/0xd2 [ 7344.888431] [<ffffffff81178a38>] SyS_read+0x50/0x7e [ 7344.888431] [<ffffffff81492017>] entry_SYSCALL_64_fastpath+0x12/0x6b [ 7344.888431] Code: 8d 4d e8 45 31 c9 45 31 c0 48 8b 00 48 c1 e2 09 48 8b 80 80 fc ff ff 4c 89 65 e8 48 8b b8 f0 01 00 00 e8 1d 42 02 00 85 c0 79 02 <0f> 0b 4c 0 [ 7344.888431] RIP [<ffffffffa037c88c>] btrfs_merge_bio_hook+0x54/0x6b [btrfs] [ 7344.888431] RSP <ffff8802046878f0> [ 7344.970544] ---[ end trace eb7803b24ebab8ae ]--- Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com>	2016-05-13 01:59:16 +01:00
Filipe Manana	578def7c50	Btrfs: don't wait for unrelated IO to finish before relocation Before the relocation process of a block group starts, it sets the block group to readonly mode, then flushes all delalloc writes and then finally it waits for all ordered extents to complete. This last step includes waiting for ordered extents destined for extents allocated in other block groups, making us waste time unnecessarily. So improve this by waiting only for ordered extents that fall into the block group's range. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com>	2016-05-13 01:59:14 +01:00
Filipe Manana	3f9749f6e9	Btrfs: fix empty symlink after creating symlink and fsync parent dir If we create a symlink, fsync its parent directory, crash/power fail and mount the filesystem, we end up with an empty symlink, which is not only useless but also not allowed in Linux (the symlink(2) man page is explicit about that). So we just need to make sure to fully log an inode if it's a symlink, to ensure its inline extent gets logged, ensuring the same behaviour as ext3, ext4, xfs, reiserfs, f2fs, nilfs2, etc. Example reproducer: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ mkdir /mnt/testdir $ sync $ ln -s /mnt/foo /mnt/testdir/bar $ xfs_io -c fsync /mnt/testdir <power fail> $ mount /dev/sdb /mnt $ readlink /mnt/testdir/bar <empty string> A test case for fstests follows soon. Signed-off-by: Filipe Manana <fdmanana@suse.com>	2016-05-13 01:59:12 +01:00
Filipe Manana	657ed1aa48	Btrfs: fix for incorrect directory entries after fsync log replay If we move a directory to a new parent and later log that parent and don't explicitly log the old parent, when we replay the log we can end up with entries for the moved directory in both the old and new parent directories. Besides being illegal to have directories with multiple hard links in Linux, it also resulted in leaving the inode item with a link count of 1. A similar issue also happens if we move a regular file - after the log tree is replayed the file has a link in both the old and new parent directories, when it should be only at the new directory. Sample reproducer: $ mkfs.btrfs -f /dev/sdc $ mount /dev/sdc /mnt $ mkdir /mnt/x $ mkdir /mnt/y $ touch /mnt/x/foo $ mkdir /mnt/y/z $ sync $ ln /mnt/x/foo /mnt/x/bar $ mv /mnt/y/z /mnt/x/z < power fail > $ mount /dev/sdc /mnt $ ls -1Ri /mnt /mnt: 257 x 258 y /mnt/x: 259 bar 259 foo 260 z /mnt/x/z: /mnt/y: 260 z /mnt/y/z: $ umount /dev/sdc $ btrfs check /dev/sdc Checking filesystem on /dev/sdc UUID: a67e2c4a-a4b4-4fdc-b015-9d9af1e344be checking extents checking free space cache checking fs roots root 5 inode 260 errors 2000, link count wrong unresolved ref dir 257 index 4 namelen 1 name z filetype 2 errors 0 unresolved ref dir 258 index 2 namelen 1 name z filetype 2 errors 0 (...) Attempting to remove the directory becomes impossible: $ mount /dev/sdc /mnt $ rmdir /mnt/y/z $ ls -lh /mnt/y ls: cannot access /mnt/y/z: No such file or directory total 0 d????????? ? ? ? ? ? z $ rmdir /mnt/x/z rmdir: failed to remove ‘/mnt/x/z’: Stale file handle $ ls -lh /mnt/x ls: cannot access /mnt/x/z: Stale file handle total 0 -rw-r--r-- 2 root root 0 Apr 6 18:06 bar -rw-r--r-- 2 root root 0 Apr 6 18:06 foo d????????? ? ? ? ? ? 
z So make sure that on rename we set the last_unlink_trans value for our inode, even if it's a directory, to the value of the current transaction's ID and that, if the new parent directory is logged, we fall back to a transaction commit. A test case for fstests is being submitted as well. Signed-off-by: Filipe Manana <fdmanana@suse.com>	2016-05-13 01:59:11 +01:00
Linus Walleij	0d5358330c	Revert "pinctrl: tegra: avoid parked_reg and parked_bank" This reverts commit `1d18a3f0f0`.	2016-05-13 02:45:04 +02:00
Al Viro	ae05327a00	ext4: switch to ->iterate_shared() Note that we need relax_dir() equivalent for directories locked shared. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-12 20:36:01 -04:00
Al Viro	9717a91b01	hfs: switch to ->iterate_shared() exact parallel of hfsplus analogue Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-12 20:13:50 -04:00
Al Viro	323ee8fc54	hfsplus: switch to ->iterate_shared() We need to protect the list of hfsplus_readdir_data against parallel insertions (in readdir) and removals (in release). Add a spinlock for that. Note that it has nothing to do with protection of hfsplus_readdir_data->key - we have an exclusion between hfsplus_readdir() and hfsplus_delete_cat() on directory lock and between several hfsplus_readdir() for the same struct file on ->f_pos_lock. The spinlock is strictly for list changes. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-12 20:08:40 -04:00
Al Viro	552a9d489f	hostfs: switch to ->iterate_shared() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-12 19:49:30 -04:00
Al Viro	7d674b3195	hpfs: switch to ->iterate_shared() NOTE: the only reason we can do that without ->i_rdir_offs races is that hpfs_lock() serializes everything in there anyway. It's not that hard to get rid of, but not as part of this series... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-12 19:47:13 -04:00
Al Viro	e82c314755	hpfs: handle allocation failures in hpfs_add_pos() pr_err() is nice, but we'd better propagate the error to caller and not proceed to violate the invariants (namely, "every file with f_pos tied to directory block should have its address visible in per-inode array"). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-12 19:35:57 -04:00
Andrea Arcangeli	6d0a07edd1	mm: thp: calculate the mapcount correctly for THP pages during WP faults This will provide full accuracy to the mapcount calculation in the write protect faults, so page pinning will not get broken by false positive copy-on-writes. total_mapcount() isn't the right calculation needed in reuse_swap_page(), so this introduces a page_trans_huge_mapcount() that is effectively the full accurate return value for page_mapcount() if dealing with Transparent Hugepages, however we only use the page_trans_huge_mapcount() during COW faults where it is strictly needed, due to its higher runtime cost. This also provides, at practically zero cost, the total_mapcount information which is needed to know if we can still relocate the page anon_vma to the local vma. If page_trans_huge_mapcount() returns 1 we can reuse the page no matter if it's a pte or a pmd_trans_huge triggering the fault, but we can only relocate the page anon_vma to the local vma->anon_vma if we're sure it's only this "vma" mapping the whole THP physical range. Kirill A. Shutemov discovered the problem with moving the page anon_vma to the local vma->anon_vma in a previous version of this patch and another problem in the way page_move_anon_rmap() was called. Andrew Morton discovered that CONFIG_SWAP=n wouldn't build in a previous version, because reuse_swap_page must be a macro to call page_trans_huge_mapcount from swap.h, so this uses a macro again instead of an inline function. With this change at least it's a less dangerous usage than it was before, because "page" is used only once now, while with the previous code reuse_swap_page(page++) would have called page_mapcount on page+1 and it would have increased page twice instead of just once. Dean Luick noticed an uninitialized variable that could result in a rmap inefficiency for the non-THP case in a previous version. 
Mike Marciniszyn said: : Our RDMA tests are seeing an issue with memory locking that bisects to : commit `61f5d698cc` ("mm: re-enable THP") : : The test program registers two rather large MRs (512M) and RDMA : writes data to a passive peer using the first and RDMA reads it back : into the second MR and compares that data. The sizes are chosen randomly : between 0 and 1024 bytes. : : The test will get through a few (<= 4 iterations) and then gets a : compare error. : : Tracing indicates the kernel logical addresses associated with the individual : pages at registration ARE correct , the data in the "RDMA read response only" : packets ARE correct. : : The "corruption" occurs when the packet crosse two pages that are not physically : contiguous. The second page reads back as zero in the program. : : It looks like the user VA at the point of the compare error no longer points to : the same physical address as was registered. : : This patch totally resolves the issue! Link: http://lkml.kernel.org/r/1462547040-1737-2-git-send-email-aarcange@redhat.com Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reviewed-by: "Kirill A. Shutemov" <kirill@shutemov.name> Reviewed-by: Dean Luick <dean.luick@intel.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Tested-by: Josh Collier <josh.d.collier@intel.com> Cc: Marc Haber <mh+linux-kernel@zugschlus.de> Cc: <stable@vger.kernel.org> [4.5] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-05-12 15:52:50 -07:00
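The macro hazard mentioned in commit 6d0a07edd1 (an argument with side effects, such as `page++`, being evaluated more than once) can be demonstrated with a toy example. The two macros below are hypothetical and are not the kernel's `reuse_swap_page()`; they only contrast a macro that mentions its argument twice with one that mentions it once.

```c
#include <assert.h>

/* Hypothetical miniature page descriptor; not struct page. */
struct toy_page {
    int mapped;
    int mapcount;
};

static int toy_mapcount(const struct toy_page *page)
{
    return page->mapcount;
}

/* Argument appears twice: a side effect in it runs twice. */
#define TOY_REUSE_OLD(page) ((page)->mapped ? toy_mapcount(page) : 0)

/* Argument appears once: a side effect in it runs once. */
#define TOY_REUSE_NEW(page) toy_mapcount(page)
```

With `p` pointing at the first of three mapped pages whose mapcounts are 10, 20 and 30, `TOY_REUSE_OLD(p++)` tests the first page but reads the mapcount of the second (20) and advances `p` by two, while `TOY_REUSE_NEW(p++)` reads 10 and advances `p` by one.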
Zhou Chengming	7496fea9a6	ksm: fix conflict between mmput and scan_get_next_rmap_item A concurrency issue about KSM in the function scan_get_next_rmap_item. task A (ksmd): \|task B (the mm's task): \| mm = slot->mm; \| down_read(&mm->mmap_sem); \| \| ... \| \| spin_lock(&ksm_mmlist_lock); \| \| ksm_scan.mm_slot go to the next slot; \| \| spin_unlock(&ksm_mmlist_lock); \| \|mmput() -> \| ksm_exit(): \| \|spin_lock(&ksm_mmlist_lock); \|if (mm_slot && ksm_scan.mm_slot != mm_slot) { \| if (!mm_slot->rmap_list) { \| easy_to_free = 1; \| ... \| \|if (easy_to_free) { \| mmdrop(mm); \| ... \| \|So this mm_struct may be freed in the mmput(). \| up_read(&mm->mmap_sem); \| As we can see above, the ksmd thread may access a mm_struct that has already been freed to the kmem_cache. Suppose a fork gets this mm_struct from the kmem_cache; the ksmd thread then calls up_read(&mm->mmap_sem), which causes mmap_sem.count to become -1. As suggested by Andrea Arcangeli, unmerge_and_remove_all_rmap_items has the same SMP race condition, so fix it too. My previous fix in function scan_get_next_rmap_item would introduce a different SMP race condition, so just invert the up_read/spin_unlock order as Andrea Arcangeli said. Link: http://lkml.kernel.org/r/1462708815-31301-1-git-send-email-zhouchengming1@huawei.com Signed-off-by: Zhou Chengming <zhouchengming1@huawei.com> Suggested-by: Andrea Arcangeli <aarcange@redhat.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Kirill A. 
Shutemov <kirill.shutemov@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Geliang Tang <geliangtang@163.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Ding Tianhong <dingtianhong@huawei.com> Cc: Li Bin <huawei.libin@huawei.com> Cc: Zhen Lei <thunder.leizhen@huawei.com> Cc: Xishi Qiu <qiuxishi@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-05-12 15:52:50 -07:00
Junxiao Bi	c25a1e0671	ocfs2: fix posix_acl_create deadlock Commit `702e5bc68a` ("ocfs2: use generic posix ACL infrastructure") refactored code to use posix_acl_create. The problem with this function is that it is not mindful of the cluster wide inode lock, making it unsuitable for use with ocfs2 inode creation with ACLs. For example, when used in ocfs2_mknod, this function can cause deadlock as follows. The parent dir inode lock is taken when calling posix_acl_create -> get_acl -> ocfs2_iop_get_acl, which takes the inode lock again. This can cause deadlock if there is a blocked remote lock request waiting for the lock to be downconverted. The same deadlock happened in ocfs2_reflink. The fix is to revert to using ocfs2_init_acl. Fixes: `702e5bc68a` ("ocfs2: use generic posix ACL infrastructure") Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com> Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Joseph Qi <joseph.qi@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-05-12 15:52:50 -07:00
Junxiao Bi	5ee0fbd50f	ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang Commit `743b5f1434` ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()") introduced this issue. ocfs2_setattr called by chmod command holds cluster wide inode lock when calling posix_acl_chmod. This latter function in turn calls ocfs2_iop_get_acl and ocfs2_iop_set_acl. These two are also called directly from vfs layer for getfacl/setfacl commands and therefore acquire the cluster wide inode lock. If a remote conversion request comes after the first inode lock in ocfs2_setattr, OCFS2_LOCK_BLOCKED will be set. And this will cause the second call to inode lock from the ocfs2_iop_get_acl() to block indefinitely. The deleted version of ocfs2_acl_chmod() calls __posix_acl_chmod() which does not call back into the filesystem. Therefore, we restore ocfs2_acl_chmod(), modify it slightly for locking as needed, and use that instead. Fixes: `743b5f1434` ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()") Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com> Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Joseph Qi <joseph.qi@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-05-12 15:52:50 -07:00
Bjorn Andersson	b3d39032d7	remoteproc: Add additional crash reasons The Qualcomm WCNSS can crash by watchdog or a fatal software error. Add these types to the list of remoteproc crash reasons. Signed-off-by: Bjorn Andersson <bjorn.andersson@sonymobile.com> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>	2016-05-12 15:50:19 -07:00
Bjorn Andersson	e395f9ce49	remoteproc: core: Make the loaded resource table optional Remote processors like the ones found in the Qualcomm SoCs do not have a resource table passed to them, so make it optional by only populating it if it does exist. Signed-off-by: Bjorn Andersson <bjorn.andersson@sonymobile.com> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>	2016-05-12 15:50:04 -07:00
Archit Taneja	9ffee1c4be	clk: qcom: mmcc-8996: Remove clocks that should be controlled by RPM The branch clocks MMSS_MMAGIC_AXI_CLK and MMAGIC_BIMC_AXI_CLK are controlled by RPM when the APPs processor enable or disable the RPM_MMAXI_CLK. During the boot sequence, someone can enable the RPM_MMAXI_CLK, resulting in register status bits showing that these clocks are enabled, our clock driver may look at the enabled status of these clocks and try to disable them since it thinks they are unused. Don't make the clock driver touch these clocks. Signed-off-by: Archit Taneja <architt@codeaurora.org> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>	2016-05-12 14:48:28 -07:00
Stephen Boyd	d8609a3a2e	Merge tag 'imx-clk-fixes-4.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into clk-next Pull some non-critical i.MX clk fixes from Shawn Guo: * Fix the commit `3713e3f5e9` ("clk: imx35: define two clocks for rtc") which messed up the clock enumeration when adding new clock. * tag 'imx-clk-fixes-4.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: ARM: dts: imx35: restore existing used clock enumeration clk: imx6q: fix typo in CAN clock definition	2016-05-12 14:48:26 -07:00
Harvey Hunt	4afe2d1a6e	clk: ingenic: Allow divider value to be divided The JZ4780's MSC clock divider registers multiply the clock divider by 2. This means that MMC devices run at half their expected speed. Add the ability to divide the clock divider in order to solve this. Signed-off-by: Harvey Hunt <harvey.hunt@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-clk@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>	2016-05-12 14:48:25 -07:00
Stephen Boyd	5707291c6c	Merge tag 'v4.7-rockchip-clk4' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into clk-next Pull rockchip clk updates from Heiko Stuebner: Another small rk3399 fixup as well as simplifications around our handling of the General-Register-Files syscon. * tag 'v4.7-rockchip-clk4' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: clk: rockchip: drop old_rate calculation on pll rate changes clk: rockchip: simplify GRF handling in pll clocks clk: rockchip: lookup General Register Files in rockchip_clk_init clk: rockchip: fix the rk3399 sdmmc sample / drv name	2016-05-12 14:48:22 -07:00
Maxime Ripard	98b8525abb	clk: sunxi: Add display and TCON0 clocks driver The A10 SoC and its relatives have a special clock controller to drive the display engines (both frontend and backend), which has a lot in common with the clock that drives the first TCON channel. Add a driver to support both. Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Acked-by: Rob Herring <robh@kernel.org> [sboyd@codeaurora.org: Silence variable sized array warning] Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>	2016-05-12 14:47:52 -07:00
Jon Paul Maloy	e7142c341c	tipc: eliminate risk of double link_up events When an ACTIVATE or data packet is received in a link in state ESTABLISHING, the link does not immediately change state to ESTABLISHED, but does instead return a LINK_UP event to the caller, which will execute the state change in a different lock context. This non-atomic approach incurs a low risk that we may have two LINK_UP events pending simultaneously for the same link, resulting in the final part of the setup procedure being executed twice. The only potential harm caused by this is that we may see two LINK_UP events issued to subscribers of the topology server, something that may cause confusion. This commit eliminates this risk by checking if the link is already up before proceeding with the second half of the setup. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-12 17:11:27 -04:00
Al Viro	1d1bb236bc	gfs2: switch to ->iterate_shared() protected by glock and already used without locking the directory by gfs2_get_name() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-12 17:00:20 -04:00
Arnd Bergmann	2073dbad17	net: mvneta: bm: fix dependencies again I tried to fix this before, but my previous fix was incomplete and we can still get the same link error in randconfig builds because of the way that Kconfig treats the default y if MVNETA=y && MVNETA_BM_ENABLE line that does not actually trigger when MVNETA_BM_ENABLE=m, unlike I intended. Changing the line to use MVNETA_BM_ENABLE!=n however has the desired effect and hopefully makes all configurations work as expected. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: `019ded3aa7` ("net: mvneta: bm: clarify dependencies") Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-12 16:56:30 -04:00
Omar Sandoval	2c4cb04300	coredump: only charge written data against RLIMIT_CORE Commit `9b56d54380` ("dump_skip(): dump_seek() replacement taking coredump_params") introduced a regression with regard to RLIMIT_CORE. Previously, when a core dump was sparse, only the data that was actually written out would count against the limit. Now, the sparse ranges are also included, which leads to truncated core dumps when the actual disk usage is still well below the limit. Restore the old behavior by only counting what gets emitted and ignoring what gets skipped. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-12 16:55:50 -04:00
Omar Sandoval	a008393951	coredump: get rid of coredump_params->written cprm->written is redundant with cprm->file->f_pos, so use that instead. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-12 16:55:50 -04:00
Fabio Estevam	f893a99e7e	phy: micrel: Use MICREL_PHY_ID_MASK definition Replace the hardcoded mask 0x00fffff0 with MICREL_PHY_ID_MASK for better readability. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-12 16:55:33 -04:00
Haishuang Yan	da73b4e953	gre: Fix wrong tpi->proto in WCCP When dealing with WCCP in gre6 tunnel, it sets the wrong tpi->protocol, that is, ETH_P_IP instead of ETH_P_IPV6 for the encapsulated traffic. Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-12 16:53:58 -04:00
Haishuang Yan	23f72215bc	ip6_gre: Fix get_size calculation for gre6 tunnel Do not include attribute IFLA_GRE_TOS. Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-12 16:53:58 -04:00
Linus Torvalds	02c9c0e9b9	Keyrings fixes Merge tag 'keys-fixes-20160512' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull keyring fix from David Howells: "Fix ASN.1 indefinite length object parsing" * tag 'keys-fixes-20160512' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: KEYS: Fix ASN.1 indefinite length object parsing	2016-05-12 13:00:33 -07:00
Linus Torvalds	e5ad8b6d1e	sound fixes for 4.6 This is a pretty boring pull request as you wish: including a few small and trivial HD-audio and USB-audio quirks and a couple of small regression fixes in HD-audio. Merge tag 'sound-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "This is a pretty boring pull request as you wish: including a few small and trivial HD-audio and USB-audio quirks and a couple of small regression fixes in HD-audio" * tag 'sound-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: usb-audio: Yet another Phoneix Audio device quirk ALSA: hda - Fix regression on ATI HDMI audio ALSA: hda - Fix subwoofer pin on ASUS N751 and N551 ALSA: hda - Fix broken reconfig ALSA: hda - Fix white noise on Asus UX501VW headset ALSA: usb-audio: Quirk for yet another Phoenix Audio devices (v2)	2016-05-12 12:55:42 -07:00
Linus Torvalds	ed1e33dded	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input subsystem fixes from Dmitry Torokhov. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: twl6040-vibra - fix DT node memory management Input: max8997-haptic - fix NULL pointer dereference Input: byd - update copyright header	2016-05-12 12:47:49 -07:00
Arnaldo Carvalho de Melo	42ef8a78c1	perf stat: Fallback to user only counters when perf_event_paranoid > 1 After `0161028b7c` ("perf/core: Change the default paranoia level to 2") 'perf stat' fails for users without CAP_SYS_ADMIN, so just use 'perf_evsel__fallback()' to have the same behaviour as 'perf record', i.e. set perf_event_attr.exclude_kernel to 1. Now: [acme@jouet linux]$ perf stat usleep 1 Performance counter stats for 'usleep 1': 0.352536 task-clock:u (msec) # 0.423 CPUs utilized 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 49 page-faults:u # 0.139 M/sec 309,407 cycles:u # 0.878 GHz 243,791 instructions:u # 0.79 insn per cycle 49,622 branches:u # 140.757 M/sec 3,884 branch-misses:u # 7.83% of all branches 0.000834174 seconds time elapsed [acme@jouet linux]$ Reported-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-b20jmx4dxt5hpaa9t2rroi0o@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-05-12 16:25:18 -03:00
Arnaldo Carvalho de Melo	08094828b7	perf evsel: Handle EACCESS + perf_event_paranoid=2 in fallback() Now with the default for the kernel.perf_event_paranoid sysctl being 2 [1] we need to fall back to :u, i.e. to set perf_event_attr.exclude_kernel to 1. Before: [acme@jouet linux]$ perf record usleep 1 Error: You may not have permission to collect stats. Consider tweaking /proc/sys/kernel/perf_event_paranoid, which controls use of the performance events system by unprivileged users (without CAP_SYS_ADMIN). The current value is 2: -1: Allow use of (almost) all events by all users >= 0: Disallow raw tracepoint access by users without CAP_IOC_LOCK >= 1: Disallow CPU event access by users without CAP_SYS_ADMIN >= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN [acme@jouet linux]$ After: [acme@jouet linux]$ perf record usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.016 MB perf.data (7 samples) ] [acme@jouet linux]$ perf evlist cycles:u [acme@jouet linux]$ perf evlist -v cycles:u: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP\|TID\|TIME\|PERIOD, disabled: 1, inherit: 1, exclude_kernel: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 [acme@jouet linux]$ And if the user turns on verbose mode, an explanation will appear: [acme@jouet linux]$ perf record -v usleep 1 Warning: kernel.perf_event_paranoid=2, trying to fall back to excluding kernel samples mmap size 528384B [ perf record: Woken up 1 times to write data ] Looking at the vmlinux_path (8 entries long) Using /lib/modules/4.6.0-rc7+/build/vmlinux for symbols [ perf record: Captured and wrote 0.016 MB perf.data (7 samples) ] [acme@jouet linux]$ [1] `0161028b7c` ("perf/core: Change the default paranoia level to 2") Reported-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-b20jmx4dxt5hpaa9t2rroi0o@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-05-12 16:13:16 -03:00
Alex Deucher	c47b9e0944	drm/amdgpu: fix DP mode validation Switch the order of the loops to walk the rates on the top so we exhaust all DP 1.1 rate/lane combinations before trying DP 1.2 rate/lane combos. This avoids selecting rates that are supported by the monitor, but not the connector leading to valid modes getting rejected. bug: https://bugs.freedesktop.org/show_bug.cgi?id=95206 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2016-05-12 15:03:49 -04:00
Alex Deucher	ff0bd441bd	drm/radeon: fix DP mode validation Switch the order of the loops to walk the rates on the top so we exhaust all DP 1.1 rate/lane combinations before trying DP 1.2 rate/lane combos. This avoids selecting rates that are supported by the monitor, but not the connector leading to valid modes getting rejected. bug: https://bugs.freedesktop.org/show_bug.cgi?id=95206 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2016-05-12 15:03:39 -04:00
Arnaldo Carvalho de Melo	7d173913a6	perf evsel: Improve EPERM error handling in open_strerror() We were showing a hardcoded default value for the kernel.perf_event_paranoid sysctl, now that it became more paranoid (1 -> 2 [1]), this would need to be updated, instead show the current value: [acme@jouet linux]$ perf record ls Error: You may not have permission to collect stats. Consider tweaking /proc/sys/kernel/perf_event_paranoid, which controls use of the performance events system by unprivileged users (without CAP_SYS_ADMIN). The current value is 2: -1: Allow use of (almost) all events by all users >= 0: Disallow raw tracepoint access by users without CAP_IOC_LOCK >= 1: Disallow CPU event access by users without CAP_SYS_ADMIN >= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN [acme@jouet linux]$ [1] `0161028b7c` ("perf/core: Change the default paranoia level to 2") Reported-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-0gc4rdpg8d025r5not8s8028@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-05-12 15:44:55 -03:00
Linus Torvalds	422ce5a975	A single last pin control fix for v4.6: - The pull up/down logic for the AT91 PIO4 controller was tilted: we need to mask the reverse pull when unmasking a pull direction. Setting both pull up & pull down is illegal and makes no sense. Merge tag 'pinctrl-v4.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pinctrl fix from Linus Walleij: "A single last pin control fix for v4.6. It's tagged for stable and only hits a single driver with two added lines so should be safe. Tested in linux-next. - The pull up/down logic for the AT91 PIO4 controller was tilted: we need to mask the reverse pull when unmasking a pull direction. Setting both pull up & pull down is illegal and makes no sense" * tag 'pinctrl-v4.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: at91-pio4: fix pull-up/down logic	2016-05-12 11:23:08 -07:00
Christoph Hellwig	0691a286d5	IB/cma: pass the port number to ib_create_qp The new RW API will need this. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2016-05-12 14:22:54 -04:00

599145 commits