Commit graph

616555 commits

Author SHA1 Message Date
Steven J. Hill
8552b5b4da MIPS: Octeon: Put restrictions on DMA descriptors.
Set the DMA mask such that all descriptors stay in the
lower 4GB of memory.

Signed-off-by: Steven J. Hill <steven.hill@cavium.com>
Acked-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13830/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-07-28 12:01:34 +02:00
Steven J. Hill
71471e2866 MIPS: Octeon: Remove forced mappings of USB interrupts.
Get rid of unnecessary forced interrupt mappings for
the USB host controller on OCTEON II.

Signed-off-by: Steven J. Hill <steven.hill@cavium.com>
Acked-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13824/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-07-28 12:01:06 +02:00
David Howells
20f06ed9f6 KEYS: 64-bit MIPS needs to use compat_sys_keyctl for 32-bit userspace
MIPS64 needs to use compat_sys_keyctl for 32-bit userspace rather than
calling sys_keyctl.  The latter will work in a lot of cases, thereby hiding
the issue.

Reported-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: stable@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-security-module@vger.kernel.org
Cc: keyrings@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/13832/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-07-28 11:56:48 +02:00
James Hogan
5573f6ad3e MIPS: Print segment physical address when EU=1
Currently the debugfs interface to print the segment configuration
refuses to print the physical address of mapped segments. However if the
EU bit is set these become unmapped at error level (when
CP0_Status.ERL=1), so the physical address is still relevant.

Update the logic to print the physical address of mapped segments when
the EU bit is set, while still hiding the Cache Coherency Attribute
(since EU overrides that to uncached when ERL=1 too).

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13833/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-07-28 11:44:30 +02:00
Jiri Kosina
8c2f421c1f Merge branch 'for-4.8/hid-led' into for-linus
Conflicts:
	drivers/hid/hid-thingm.c
2016-07-28 10:49:23 +02:00
Jiri Kosina
e82a82c19f Merge branches 'for-4.8/alps', 'for-4.8/apple', 'for-4.8/i2c-hid', 'for-4.8/uhid-offload-hid-device-add' and 'for-4.8/upstream' into for-linus 2016-07-28 10:48:15 +02:00
Stefan Christ
8d762017f4 drm/gma500: remove unnecessary stub for fb_ioctl()
Stub implementation of fb_ioctl can be omitted, because function
do_fb_ioctl already returns -ENOTTY when fb_ioctl is not assigned.

Signed-off-by: Stefan Christ <s.christ@phytec.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1469622270-10803-1-git-send-email-s.christ@phytec.de
2016-07-28 10:04:03 +02:00
Lukas Wunner
305964b7a0 apple-gmux: Sphinxify docs
Convert asciidoc-formatted docs to rst in accordance with Jonathan's and
Jani's effort to use sphinx for kernel-doc rendering in 4.8.

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Darren Hart <dvhart@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/4c1b29986fa77772156b1af0c965d3799e43a47b.1467628307.git.lukas@wunner.de
2016-07-28 10:04:03 +02:00
Benoît Thébaudeau
6f367788d6 rtc: rv8803: Clear V1F when setting the time
V1F indicates that the time accuracy may have been compromised because
of a voltage drop (possibly only temporary) below VLOW1, which stops the
temperature compensation. When the time is set, the accuracy is
restored, so V1F should be cleared in order to indicate this and to be
able to detect the next temperature compensation loss. This is the same
principle as for V2F, which is cleared when the time is set to indicate
that the time is no longer invalid and to be able to detect the next
data loss.

Signed-off-by: Benoît Thébaudeau <benoit@wsystem.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2016-07-28 09:59:43 +02:00
Benoît Thébaudeau
d3700b6b64 rtc: rv8803: Stop the clock while setting the time
According to the application manual of the RX8900, the RESET bit must be
set to 1 to prevent a timer update while setting the time. This also
resets the subsecond counter. The application manual of the RV-8803 does
not mention such a requirement, and it says that the 100th Seconds
register is cleared when writing to the Seconds register, but using the
RESET bit for the RV-8803 too should not be an issue and is probably
safer.

This change also ensures that the RESET bit is initialized properly in
all cases. Indeed, all the registers must be initialized if the voltage
has been lower than VLOW2 (triggering V2F), but not low enough to
trigger a POR.

Signed-off-by: Benoît Thébaudeau <benoit@wsystem.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2016-07-28 09:59:41 +02:00
Benoît Thébaudeau
d522649e26 rtc: rv8803: Always apply the I²C workaround
The I²C NACK issue of the RV-8803 may occur after any I²C START
condition, depending on the timings. Consequently, the workaround must
be applied for all the I²C transfers.

This commit abstracts the I²C transfer code into register access
functions. This avoids duplicating the I²C workaround everywhere. This
also avoids the duplication of the code handling the return value of
i2c_smbus_read_i2c_block_data(). Error messages are issued in case of
definitive register access failures (if the workaround fails). This
change also makes the I²C transfer return value checks consistent.

Signed-off-by: Benoît Thébaudeau <benoit@wsystem.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2016-07-28 09:59:39 +02:00
Benoît Thébaudeau
a1e98e0970 rtc: rv8803: Fix read day of week
The Weekday register is encoded as 2^tm_wday, with tm_wday in 0..6, so
using tm_wday = ffs(reg) to fill tm_wday from the register value is
wrong because this gives the expected value + 1. This could be fixed as
tm_wday = ffs(reg) - 1, but tm_wday = ilog2(reg) works as well and is
more direct.

Signed-off-by: Benoît Thébaudeau <benoit@wsystem.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2016-07-28 09:59:37 +02:00
Benoît Thébaudeau
96acb25c50 rtc: rv8803: Remove the check for valid time
The RTC core always calls rtc_valid_tm() after ->read_time() in case of
success (in __rtc_read_time()), so do not call it twice.

Signed-off-by: Benoît Thébaudeau <benoit@wsystem.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2016-07-28 09:59:33 +02:00
Benoît Thébaudeau
34166a00ec rtc: rv8803: Kconfig: Indicate rx8900 support
This driver supports the Epson RX8900, but this was not indicated in
Kconfig.

Signed-off-by: Benoît Thébaudeau <benoit@wsystem.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2016-07-28 09:59:30 +02:00
Paul Mackerras
93d17397e4 KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE
It turns out that if the guest does a H_CEDE while the CPU is in
a transactional state, and the H_CEDE does a nap, and the nap
loses the architected state of the CPU (which is is allowed to do),
then we lose the checkpointed state of the virtual CPU.  In addition,
the transactional-memory state recorded in the MSR gets reset back
to non-transactional, and when we try to return to the guest, we take
a TM bad thing type of program interrupt because we are trying to
transition from non-transactional to transactional with a hrfid
instruction, which is not permitted.

The result of the program interrupt occurring at that point is that
the host CPU will hang in an infinite loop with interrupts disabled.
Thus this is a denial of service vulnerability in the host which can
be triggered by any guest (and depending on the guest kernel, it can
potentially triggered by unprivileged userspace in the guest).

This vulnerability has been assigned the ID CVE-2016-5412.

To fix this, we save the TM state before napping and restore it
on exit from the nap, when handling a H_CEDE in real mode.  The
case where H_CEDE exits to host virtual mode is already OK (as are
other hcalls which exit to host virtual mode) because the exit
path saves the TM state.

Cc: stable@vger.kernel.org # v3.15+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-07-28 16:10:07 +10:00
Paul Mackerras
f024ee0984 KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures
This moves the transactional memory state save and restore sequences
out of the guest entry/exit paths into separate procedures.  This is
so that these sequences can be used in going into and out of nap
in a subsequent patch.

The only code changes here are (a) saving and restore LR on the
stack, since these new procedures get called with a bl instruction,
(b) explicitly saving r1 into the PACA instead of assuming that
HSTATE_HOST_R1(r13) is already set, and (c) removing an unnecessary
and redundant setting of MSR[TM] that should have been removed by
commit 9d4d0bdd9e0a ("KVM: PPC: Book3S HV: Add transactional memory
support", 2013-09-24) but wasn't.

Cc: stable@vger.kernel.org # v3.15+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2016-07-28 16:09:34 +10:00
Dan Carpenter
344e3c7734 sparc: serial: sunhv: fix a double lock bug
We accidentally take the "port->lock" twice in a row.  This old code
was supposed to be deleted.

Fixes: e58e241c17 ('sparc: serial: Clean up the locking for -rt')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-27 22:54:52 -07:00
Dan Carpenter
fa1608285b sparc32: off by ones in BUG_ON()
Smatch complains that these tests are off by one, which is true but not
life threatening.

	arch/sparc/kernel/irq_32.c:169 irq_link()
	error: buffer overflow 'irq_map' 384 <= 384

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-27 22:53:17 -07:00
Romain Perier
64ec6ccb76 crypto: marvell - Update cache with input sg only when it is unmapped
So far, the cache of the ahash requests was updated from the 'complete'
operation. This complete operation is called from mv_cesa_tdma_process
before the cleanup operation, which means that the content of req->src
can be read and copied when it is still mapped. This commit fixes the
issue by moving this cache update from mv_cesa_ahash_complete to
mv_cesa_ahash_req_cleanup, so the copy is done once the sglist is
unmapped.

Fixes: 1bf6682cb3 ("crypto: marvell - Add a complete operation for..")
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-07-28 13:04:44 +08:00
Romain Perier
603e989ebe crypto: marvell - Don't chain at DMA level when backlog is disabled
The flag CRYPTO_TFM_REQ_MAY_BACKLOG is optional and can be set from the
user to put requests into the backlog queue when the main cryptographic
queue is full. Before calling mv_cesa_tdma_chain we must check the value
of the return status to be sure that the current request has been
correctly queued or added to the backlog.

Fixes: 85030c5168 ("crypto: marvell - Add support for chaining...")
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-07-28 13:04:43 +08:00
Romain Perier
ec38f82e85 crypto: marvell - Fix memory leaks in TDMA chain for cipher requests
So far in mv_cesa_ablkcipher_dma_req_init, if an error is thrown while
the tdma chain is built there is a memory leak. This issue exists
because the chain is assigned later at the end of the function, so the
cleanup function is called with the wrong version of the chain.

Fixes: db509a4533 ("crypto: marvell/cesa - add TDMA support")
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-07-28 13:04:42 +08:00
Vinod Koul
4bb0439626 Merge branch 'topic/dmaengine_cleanups' into for-linus 2016-07-28 10:10:37 +05:30
Rob Rice
a24532f8d1 mailbox: Add Broadcom PDC mailbox driver
The Broadcom PDC mailbox driver is a mailbox controller that
manages data transfers to and from one or more offload engines.

Signed-off-by: Rob Rice <rob.rice@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2016-07-28 09:34:47 +05:30
Rob Rice
b04f127286 dt-bindings: add bindings documentation for PDC driver.
Add the device tree binding documentation for the PDC hardware
in Broadcom iProc SoCs.

Signed-off-by: Rob Rice <rob.rice@broadcom.com>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Reviewed-by: Anup Patel <anup.patel@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
2016-07-28 09:34:46 +05:30
Pavel Shilovsky
7893242e24 CIFS: Fix a possible invalid memory access in smb2_query_symlink()
During following a symbolic link we received err_buf from SMB2_open().
While the validity of SMB2 error response is checked previously
in smb2_check_message() a symbolic link payload is not checked at all.
Fix it by adding such checks.

Cc: Dan Carpenter <dan.carpenter@oracle.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
2016-07-27 22:55:56 -05:00
Aurelien Aptel
a6b5058faf fs/cifs: make share unaccessible at root level mountable
if, when mounting //HOST/share/sub/dir/foo we can query /sub/dir/foo but
not any of the path components above:

- store the /sub/dir/foo prefix in the cifs super_block info
- in the superblock, set root dentry to the subpath dentry (instead of
  the share root)
- set a flag in the superblock to remember it
- use prefixpath when building path from a dentry

fixes bso#8950

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
2016-07-27 22:50:55 -05:00
Theodore Ts'o
59b8d4f1f5 random: use for_each_online_node() to iterate over NUMA nodes
This fixes a crash on s390 with fake NUMA enabled.

Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Fixes: 1e7f583af6 ("random: make /dev/urandom scalable for silly userspace programs")
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-07-27 23:30:25 -04:00
Nicolas Pitre
002d2f01f1 m68k: enable binfmt_flat on systems with an MMU
Now that the generic changes are in place, this can be enabled on m68k
with the use of proper user space accessors in the flat_get_addr_from_rp()
and flat_put_addr_at_rp() handlers as rp actually holds a user space
address.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2016-07-28 13:29:13 +10:00
Nicolas Pitre
472f95f32d binfmt_flat: allow compressed flat binary format to work on MMU systems
Let's take the simple and obvious approach by decompressing the binary
into a kernel buffer and then copying it to user space.  Those who are
looking for top performance on an MMU system are unlikely to choose this
executable format anyway.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Greg Ungerer <gerg@linux-m68k.org>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2016-07-28 13:29:12 +10:00
Nicolas Pitre
015feacf93 binfmt_flat: add MMU-specific support
Not much else to do at this point except for the different stack setups.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Greg Ungerer <gerg@linux-m68k.org>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2016-07-28 13:29:12 +10:00
Nicolas Pitre
af521f92dc binfmt_flat: update libraries' data segment pointer with userspace accessors
This is needed on systems with a MMU.  This also gets rid of the
strangest C code I've seen lateli i.e. an integer indexed with a
pointer value within square brackets. That really looked backwards.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Greg Ungerer <gerg@linux-m68k.org>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2016-07-28 13:29:11 +10:00
Nicolas Pitre
467aa1465a binfmt_flat: use clear_user() rather than memset() to clear .bss
This is needed on systems with a MMU.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Greg Ungerer <gerg@linux-m68k.org>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2016-07-28 13:29:11 +10:00
Nicolas Pitre
1b2ce442ea binfmt_flat: use proper user space accessors with old relocs code
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Greg Ungerer <gerg@linux-m68k.org>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2016-07-28 13:28:58 +10:00
Linus Torvalds
194dc870a5 Add braces to avoid "ambiguous ‘else’" compiler warnings
Some of our "for_each_xyz()" macro constructs make gcc unhappy about
lack of braces around if-statements inside or outside the loop, because
the loop construct itself has a "if-then-else" statement inside of it.

The resulting warnings look something like this:

  drivers/gpu/drm/i915/i915_debugfs.c: In function ‘i915_dump_lrc’:
  drivers/gpu/drm/i915/i915_debugfs.c:2103:6: warning: suggest explicit braces to avoid ambiguous ‘else’ [-Wparentheses]
     if (ctx != dev_priv->kernel_context)
        ^

even if the code itself is fine.

Since the warning is fairly easy to avoid by adding a braces around the
if-statement near the for_each_xyz() construct, do so, rather than
disabling the otherwise potentially useful warning.

(The if-then-else statements used in the "for_each_xyz()" constructs are
designed to be inherently safe even with no braces, but in this case
it's quite understandable that gcc isn't really able to tell that).

This finally leaves the standard "allmodconfig" build with just a
handful of remaining warnings, so new and valid warnings hopefully will
stand out.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-27 20:03:31 -07:00
Stephen Rothwell
fbef66f0ad powerpc/mm: Parenthesise IS_ENABLED() in if condition
Currently IS_ENABLED() produces an expression surrounded by parentheses,
which allows this code to compile, generating eg:

    else if (1 || 0)
        hpte_init_native();

However a change to the macro in the kbuild tree will break this in
future by removing the parentheses.

Fixes: 7353644fa9 ("powerpc/mm: Fix build break when PPC_NATIVE=n")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-28 12:39:15 +10:00
Linus Torvalds
124a3d88fa Disable "frame-address" warning
Newer versions of gcc warn about the use of __builtin_return_address()
with a non-zero argument when "-Wall" is specified:

  kernel/trace/trace_irqsoff.c: In function ‘stop_critical_timings’:
  kernel/trace/trace_irqsoff.c:433:86: warning: calling ‘__builtin_return_address’ with a nonzero argument is unsafe [-Wframe-address]
     stop_critical_timing(CALLER_ADDR0, CALLER_ADDR1);
  [ .. repeats a few times for other similar cases .. ]

It is true that a non-zero argument is somewhat dangerous, and we do not
actually have very many uses of that in the kernel - but the ftrace code
does use it, and as Stephen Rostedt says:

 "We are well aware of the danger of using __builtin_return_address() of
  > 0.  In fact that's part of the reason for having the "thunk" code in
  x86 (See arch/x86/entry/thunk_{64,32}.S).  [..] it adds extra frames
  when tracking irqs off sections, to prevent __builtin_return_address()
  from accessing bad areas.  In fact the thunk_32.S states: 'Trampoline to
  trace irqs off.  (otherwise CALLER_ADDR1 might crash)'."

For now, __builtin_return_address() with a non-zero argument is the best
we can do, and the warning is not helpful and can end up making people
miss other warnings for real problems.

So disable the frame-address warning on compilers that need it.

Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-27 19:03:04 -07:00
Uwe Kleine-König
d205a21859 Input: rotary_encoder - support binary encoding of states
It's not advisable to use this encoding, but to support existing devices
add support for this to the driver.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2016-07-27 18:26:10 -07:00
Cameron Gutman
f712a5a052 Input: xpad - power off wireless 360 controllers on suspend
When the USB wireless adapter is suspended, the controllers
lose their connection. This causes them to start flashing
their LED rings and searching for the wireless adapter
again, wasting the controller's battery power.

Instead, we will tell the controllers to power down when
we suspend. This mirrors the behavior of the controllers
when connected to the console itself and how the official
Xbox One wireless adapter behaves on Windows.

Signed-off-by: Cameron Gutman <aicommander@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2016-07-27 18:26:09 -07:00
Phil Turnbull
955818cd5b ceph: Correctly return NXIO errors from ceph_llseek
ceph_llseek does not correctly return NXIO errors because the 'out' path
always returns 'offset'.

Fixes: 06222e491e ("fs: handle SEEK_HOLE/SEEK_DATA properly in all fs's that define their own llseek")
Signed-off-by: Phil Turnbull <phil.turnbull@oracle.com>
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-28 03:00:45 +02:00
Nikolay Borisov
6b1a9a6c54 ceph: Mark the file cache as unreclaimable
Ceph creates multiple caches with the SLAB_RECLAIMABLE flag set, so
that it can satisfy its internal needs. Inspecting the code shows that
most of the caches are indeed reclaimable since they are directly
related to the generic inode/dentry shrinkers. However, one of the
cache used to satisfy struct file is not reclaimable since its
entries are freed only when the last reference to the file is
dropped. If a heavily loaded node opens a lot of files it can
introduce non-trivial discrepancies between memory shown as reclaimable
and what is actually reclaimed when drop_caches is used.

Fix this by removing the reclaimable flag for the file's cache.

Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-28 03:00:45 +02:00
Yan, Zheng
c8799fc467 ceph: optimize cap flush waiting
Add a 'wake' flag to ceph_cap_flush struct, which indicates if there
is someone waiting for it to finish. When getting flush ack message,
we check the 'wake' flag in corresponding ceph_cap_flush struct to
decide if we should wake up waiters. One corner case is that the
acked cap flush has 'wake' flags is set, but it is not the first one
on the flushing list. We do not wake up waiters in this case, set
'wake' flags of preceding ceph_cap_flush struct instead

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-28 03:00:45 +02:00
Yan, Zheng
ed9b430c9b ceph: cleanup ceph_flush_snaps()
This patch devide __ceph_flush_snaps() into two stags. In the first
stage, __ceph_flush_snaps() assign snapcaps flush TIDs and add them
to cap flush lists. __ceph_flush_snaps() keeps holding the
i_ceph_lock in this stagge. So inode's auth cap can not change. In
the second stage, __ceph_flush_snaps() send flushsnap cap messages.
i_ceph_lock is unlocked before sending each cap message. If auth cap
changes in the middle, __ceph_flush_snaps() just stops. This is OK
because kick_flushing_inode_caps() will re-send flushsnap cap messages
to inode's new auth MDS.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-28 03:00:44 +02:00
Yan, Zheng
7bc00fddb9 ceph: kick cap flushes before sending other cap message
If ceph_check_caps() wants to send cap message to a recovering MDS,
make sure it kicks cap flushes first.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-28 03:00:44 +02:00
Yan, Zheng
70220ac8c2 ceph: introduce an inode flag to indicates if snapflush is needed
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-28 03:00:43 +02:00
Yan, Zheng
13c2b57d81 ceph: avoid sending duplicated cap flush message
make ceph_kick_flushing_caps() ignore inodes whose cap flushes
have already been re-sent by ceph_early_kick_flushing_caps()

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-28 03:00:43 +02:00
Yan, Zheng
0e29438789 ceph: unify cap flush and snapcap flush
This patch includes following changes
- Assign flush tid to snapcap flush
- Remove session's s_cap_snaps_flushing list. Add inode to session's
  s_cap_flushing list instead. Inode is removed from the list when
  there is no pending snapcap flush or cap flush.
- make __kick_flushing_caps() re-send both snapcap flushes and cap
  flushes.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-28 03:00:42 +02:00
Yan, Zheng
e4500b5e35 ceph: use list instead of rbtree to track cap flushes
We don't have requirement of searching cap flush by TID. In most cases,
we just need to know TID of the oldest cap flush. List is ideal for this
usage.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-28 03:00:42 +02:00
Yan, Zheng
3609404f8c ceph: update types of some local varibles
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-28 03:00:42 +02:00
Yan, Zheng
3469ed0d14 ceph: include 'follows' of pending snapflush in cap reconnect message
This helps the recovering MDS to reconstruct the internal states that
tracking pending snapflush.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-28 03:00:41 +02:00
Yan, Zheng
121f22a19a ceph: update cap reconnect message to version 3
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-07-28 03:00:41 +02:00