ca_maxoperations:
For the backchannel, the server MUST
NOT change the value the client offers. For the fore channel,
the server MAY change the requested value.
ca_maxrequests:
For the backchannel, the server MUST NOT change the
value the client offers. For the fore channel, the server MAY
change the requested value.
Signed-off-by: Vitaliy Gusev <gusev.vitaliy@nexenta.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Raid array setup code creates an extent buffer in an usual way. When the
PAGE_CACHE_SIZE is > super block size, the extent pages are not marked
up-to-date, which triggers a WARN_ON in the following
write_extent_buffer call. Add an explicit up-to-date call to silence the
warning.
Signed-off-by: David Sterba <dsterba@suse.cz>
On ia64, powerpc64 and sparc64 the bitfield is modified through a RMW cycle and current
gcc rewrites the adjacent 4B word, which in case of a spinlock or atomic has
disaterous effect.
https://lkml.org/lkml/2012/2/1/220
Signed-off-by: David Sterba <dsterba@suse.cz>
We encountered an issue that was easily observable on s/390 systems but
could really happen anywhere. The timing just seemed to hit reliably
on s/390 with limited memory.
The gist is that when an unexpected set_page_dirty() happened, we'd
run into the BUG() in btrfs_writepage_fixup_worker since it wasn't
properly set up for delalloc.
This patch does the following:
- Performs the missing delalloc in the fixup worker
- Allow the start hook to return -EBUSY which informs __extent_writepage
that it should mark the page skipped and not to redirty it. This is
required since the fixup worker can fail with -ENOSPC and the page
will have already been redirtied. That causes an Oops in
drop_outstanding_extents later. Retrying the fixup worker could
lead to an infinite loop. Deferring the page redirty also saves us
some cycles since the page would be stuck in a resubmit-redirty loop
until the fixup worker completes. It's not harmful, just wasteful.
- If the fixup worker fails, we mark the page and mapping as errored,
and end the writeback, similar to what we would do had the page
actually been submitted to writeback.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Because scrub enumerates the dev extent tree to find the chunks to scrub,
it currently finds each DUP chunk twice and also scrubs it twice. This
patch makes sure that scrub_chunk only checks that part of the chunk the
dev extent has been found for. This only changes the behaviour for DUP
chunks.
Reported-and-tested-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Arne Jansen <sensille@gmx.net>
A user reported a bug of btrfs's trim, that is we will trim 0 bytes
after a device delete.
The reproducer:
$ mkfs.btrfs disk1
$ mkfs.btrfs disk2
$ mount disk1 /mnt
$ fstrim -v /mnt
$ btrfs device add disk2 /mnt
$ btrfs device del disk1 /mnt
$ fstrim -v /mnt
This is because after we delete the device, the block group may start from
a non-zero place, which will confuse trim to discard nothing.
Reported-by: Lutz Euler <lutz.euler@freenet.de>
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Given that ENXIO only means "offset beyond EOF" for either SEEK_DATA or SEEK_HOLE inquiry
in a desired file range, so we should return the internal error unchanged if btrfs_get_extent_fiemap()
call failed, rather than ENXIO.
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
inode_ref_info() returns 1 when the element wasn't found and < 0 on error,
just like btrfs_search_slot(). In iref_to_path() it's an error when the
inode ref can't be found, thus we return ERR_PTR(ret) in that case. In order
to avoid ERR_PTR(1), we now set ret to -ENOENT in that case.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Gracefully fail when trying to mount a BTRFS file system that has a
sectorsize smaller than PAGE_SIZE.
On PPC it is possible to build a FS while using a 4k PAGE_SIZE kernel
then boot into a 64K PAGE_SIZE kernel. Presently open_ctree fails in an
endless loop and hangs the machine in this situation.
My debugging has show this Sector size < Page size to be a non trivial
situation and a graceful exit from the situation would be nice for the
time being.
Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
Don't allow invalid 'vers' and 'minorversion' combinations in mount options,
such as "vers=3,minorversion=1".
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Don't allocate the legacy idmapper tables until we actually need
them.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Instead of pre-allocating the storage for all the strings, we can
significantly reduce the size of that table by doing the allocation
when we do the downcall.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
It's not compilable in case of CONFIG_NFS_V4_1 is not set.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Ensure that we initialise the nfs_net->nfs_client_lock spinlock.
Also ensure that nfs_server_remove_lists() doesn't try to
dereference server->nfs_client before that is initialised.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Stanislav Kinsbursky <skinsbursky@parallels.com>
Lockd now managed in network namespace context. And this patch introduces
network namespace related NLM hosts shutdown in case of releasing per-net Lockd
resources.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
NLM host is network namespace aware now.
So NSM have to take it into account.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This object depends on RPC client, and thus on network namespace.
So let's make it's allocation and lookup in network namespace context.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch introduces per-net Lockd initialization and destruction routines.
The logic is the same as in global Lockd up and down routines. Probably the
solution is not the best one. But at least it looks clear.
So per-net "up" routine are called only in case of lockd is running already. If
per-net resources are not allocated yet, then service is being registered with
local portmapper and lockd sockets created.
Per-net "down" routine is called on every lockd_down() call in case of global
users counter is not zero.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Lockd is going to be shared between network namespaces - i.e. going to be able
to handle lock requests from different network namespaces. This means, that
network namespace related resources have to be allocated not once (like now),
but for every network namespace context, from which service is requested to
operate.
This patch implements Lockd per-net users accounting. New per-net counter is
used to determine, when per-net resources have to be freed.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch parametrizes Lockd permanent sockets creation routine by network
namespace context.
It also replaces hard-coded init_net with current network namespace context in
Lockd sockets creation routines.
This approach looks safe, because Lockd is created during NFS mount (or NFS
server start) and thus socket is required exactly in current network namespace
context. But in the same time it means, that Lockd sockets inherits first Lockd
requester network namespace. This issue will be fixed in further patches of the
series.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Add the module parameter 'max_session_slots' to set the initial number
of slots that the NFSv4.1 client will attempt to negotiate with the
server.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
It is perfectly legal to negotiate up to 2^32-1 slots in the protocol,
and with 10GigE, we are already seeing that 255 slots is far too limiting.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This fixes an oops when a buggy client tries to use an initial seqid of
0 on a new slot, which we may misinterpret as a replay.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Combine two booleans into a single flag field, move the smaller fields
to the end.
(In practice this doesn't make the struct any smaller. But we'll be
adding another flag here soon.)
Remove some debugging code that doesn't look useful, while we're in the
neighborhood.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* git://git.samba.org/sfrench/cifs-2.6:
cifs: don't return error from standard_receive3 after marking response malformed
cifs: request oplock when doing open on lookup
cifs: fix error handling when cifscreds key payload is an error
unfortunately, nlink_t may be smaller than 32 bits and ->i_nlink
on ocfs2 can grow up to 0xffffffff; storing it in nlink_t variable
will lose upper bits on such architectures. Needs to be made u32,
until we get kernel-side nlink_t uniformly 32bit...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Massaged cp_compat_stat() into form closer to cp_new_stat(); the only
real issue had been in handling of st_nlink overflows - native 32bit
stat(2) returns -EOVERFLOW in such situations, compat one silently
loses upper bits.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This script causes a kernel deadlock:
set -e
DEVICE=/dev/vg1/linear
lvchange -ay $DEVICE
mkfs.ext3 $DEVICE
mount -t ext3 -o usrquota,grpquota $DEVICE /mnt/test
quotacheck -gu /mnt/test
umount /mnt/test
mount -t ext3 -o usrquota,grpquota $DEVICE /mnt/test
quotaon /mnt/test
dmsetup suspend $DEVICE
setquota -u root 1 2 3 4 /mnt/test &
sleep 1
dmsetup resume $DEVICE
setquota acquired semaphore s_umount for read and then tried to perform a
transaction (and waits because the device is suspended). dmsetup resume tries
to acquire s_umount for write before resuming the device (and waits for
setquota).
Fix the deadlock by grabbing a thawed superblock for quota commands which need
it.
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
In quota code we need to find a superblock corresponding to a device and wait
for superblock to be unfrozen. However this waiting has to happen without
s_umount semaphore because that is required for superblock to thaw. So provide
a function in VFS for this to keep dances with s_umount where they belong.
[AV: implementation switched to saner variant]
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
When the number of dentry cache hash table entries gets too high
(2147483648 entries), as happens by default on a 16TB system, use of a
signed integer in the dcache_init() initialization loop prevents the
dentry_hashtable from getting initialized, causing a panic in
__d_lookup(). Fix this in dcache_init() and similar areas.
Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
When recursing down the locks when traversing a tree/list in
get_next_positive_dentry() or get_next_positive_subdir() a lock can
change from being nested to being a parent which breaks lockdep. This
patch tells lockdep about what we did.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
It looks to me like the two ASSERT()s in xfs_trans_add_item() really
want to do a compare (==) rather than assignment (=).
This patch changes it from the latter to the former.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit 05293485a0)
It looks to me like the two ASSERT()s in xfs_trans_add_item() really
want to do a compare (==) rather than assignment (=).
This patch changes it from the latter to the former.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Ben Myers <bpm@sgi.com>