At least two flashes exists that have the concept of a minimum write unit,
similar to NAND pages, but no other NAND characteristics. Therefore, rename
the minimum write unit to "writesize" for all flashes, including NAND.
Signed-off-by: Joern Engel <joern@wh.fh-wedel.de>
We'll be using a proper list of nodes in the jffs2_xattr_datum and
jffs2_xattr_ref structures, because the existing code to overwrite
them is just broken. Put it in the common part at the front of the
structure which is shared with the jffs2_inode_cache, so that the
jffs2_link_node_ref() function can do the right thing.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
In a couple of places, we assume that what's at the end of the
->next_in_ino list is a struct jffs2_inode_cache. Let's check
for that, since we expect it to change soon.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Let's avoid the potential for forgetting to set ref->next_in_ino, by doing
it within jffs2_link_node_ref() instead.
This highlights the ugliness of what we're currently doing with
xattr_datum and xattr_ref structures -- we should find a nicer way of
dealing with that.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
This adds support to GFS2 for selinux extended attributes. There is a
known bug in gfs2_ea_get() which is believed to be independant of this
patch. Further patches will follow once that bug is fixed in order to
make GFS2 use as much of the generic eattr infrastructure as possible.
Signed-off-by: Ryan O'Hara <rohara@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
When filing REF_OBSOLETE nodes, we'd add their size to the global
'dirty_size' count, but then to the eraseblock's 'used_size' count.
That's not clever.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Don't reassign to watch. If idr_find() returns NULL, then
put_inotify_watch() will choke.
Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Cc: John McCutchan <john@johnmccutchan.com>
Cc: Robert Love <rlove@rlove.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
While doing some inotify stress testing, I hit the following race. In
inotify_release(), it's possible for a watch to be removed from the lists
in between dropping dev->mutex and taking inode->inotify_mutex. The
reference we hold prevents the watch from being freed, but not from being
removed.
Checking the dev's idr mapping will prevent a double list_del of the
same watch.
Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Acked-by: John McCutchan <john@johnmccutchan.com>
Cc: Robert Love <rml@novell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Bernd Schmidt points out that binfmt_flat is now leaving the exec file open
while the application runs. This offsets all the application's fd numbers.
We should have closed the file within exec(), not at exit()-time.
But there doesn't seem to be a lot of point in doing all this just to avoid
going over RLIMIT_NOFILE by one fd for a few microseconds. So take the EMFILE
checking out again. This will cause binfmt_flat to again fail LTP's
exec-should-return-EMFILE-when-fdtable-is-full test. That test appears to be
wrong anyway - Open Group specs say nothing about exec() returning EMFILE.
Cc: Bernd Schmidt <bernd.schmidt@analog.com>
Cc: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Assigning the result of posix_acl_to_xattr() to an unsigned data type
(size/size_t) obscures possible errors.
Coverity CID: 1206.
Signed-off-by: Florin Malita <fmalita@gmail.com>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Address a problem found when a Linux NFS server uses the "subtree_check"
export option.
The "subtree_check" NFS export option was designed to prohibit a client
from using a file handle for which it should not have permission. The
algorithm used is to ensure that the entire path to the file being
referenced is accessible to the user attempting to use the file handle. If
some part of the path is not accessible, then the operation is aborted and
the appropriate version of ESTALE is returned to the NFS client.
The error, ESTALE, is unfortunate in that it causes NFS clients to make
certain assumptions about the continued existence of the file. They assume
that the file no longer exists and refuse to attempt to access it again.
In this case, the file really does exist, but access was denied by the
server for a particular user.
A better error to return would be an EACCES sort of error. This would
inform the client that the particular operation that it was attempting was
not allowed, without the nasty side effects of the ESTALE error.
Signed-off-by: Peter Staubach <staubach@redhat.com>
Acked-By: NeilBrown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Functions compat_nfs_svc_trans, compat_nfs_clnt_trans,
compat_nfs_exp_trans, compat_nfs_getfd_trans and compat_nfs_getfs_trans,
which are called by compat_sys_nfsservctl(fs/compat.c), don't handle the
return value of access_ok properly. access_ok return 1 when the addr is
valid, and 0 when it's not, but these functions have the reversed
understanding. When the address is valid, they always return -EFAULT to
compat_sys_nfsservctl.
An example is to run /usr/sbin/rpc.nfsd(32bit program on Power5). It
doesn't function as expected. strace showes that nfsservctl returns
-EFAULT.
The patch fixes this by correcting the error handling on the return value
of access_ok in the five functions.
Signed-off-by: Lin Feng Shen <shenlinf@cn.ibm.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Well, almost. We'll actually keep a 'TEST_TOTLEN' macro set for now, and keep
doing some paranoia checks to make sure it's all working correctly. But if
TEST_TOTLEN is unset, the size of struct jffs2_raw_node_ref drops from 16
bytes to 12 on 32-bit machines. That's a saving of about half a megabyte of
memory on the OLPC prototype board, with 125K or so nodes in its 512MiB of
flash.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
We can't use jffs2_scan_dirty_space() because it doesn't do any locking; it's
only for use at scan time -- hence the 'scan' in the name.
Also, don't allocate refs while we have c->erase_completion_lock held.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
We don't allocate this locally any more -- it's given to us and owner by
our caller. Also improve the debug messages a little.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Next step in ongoing campaign to file a struct jffs2_raw_node_ref for every
piece of dirty space in the system, so that __totlen can be killed off....
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
If __totlen is going away, we need to pass the length in separately.
Also stop callers from needlessly setting ref->next_phys to NULL,
since that's done for them... and since that'll also be going away soon.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Make sure we allocate a ref for any dirty space which exists between nodes
which we find in an eraseblock summary.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
The incoming ref_totlen() calculation is going to rely on the existence
of nodes which cover all dirty space. We can't just tweak the accounting
data any more; we have to call jffs2_scan_dirty_space() to do it.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
To eliminate the __totlen field from struct jffs2_raw_node_ref, we need
to allocate nodes for dirty space instead of just tweaking the accounting
data. Introduce jffs2_scan_dirty_space() in preparation for that.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
For RWCOMPAT and ROCOMPAT nodes, we should still allow the mount to
succeed. Just abandon the summary and fall through to the full scan.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
If we had to allocate extra space for the summary node, we weren't
correctly freeing it when jffs2_sum_scan_sumnode() returned nonzero --
which is both the success and the failure case. Only when it returned
zero, which means fall through to the full scan, were we correctly freeing
the buffer.
Document the meaning of those return codes while we're at it.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
We should preserve these when we come to garbage collect them, not let
them get erased. Use jffs2_garbage_collect_pristine() for this, and make
sure the summary code copes -- just refrain from writing a summary for any
block which contains a node we don't understand.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
The same sequence of code was repeated in many places, to add a new
struct jffs2_raw_node_ref to an eraseblock and adjust the space accounting
accordingly. Move it out-of-line.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
We were calling ref_totlen() 18 times. Even before that becomes a real
function rather than just a dereference, apparently some compilers still
suck anyway. It'll _certainly_ suck after ref_totlen() becomes more
complicated, so calculate it once and don't rely on CSE.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
This improves the time to mount 512MiB of NAND flash on my OLPC prototype
by about 4%. We used to read the last page of the eraseblock twice -- once
to find the offset of the summary node, and again to actually _read_ the
summary node. Now we read the last page only once, and read more only if
we need to.
We also don't allocate a new buffer just for the summary code -- we use
the buffer which was already allocated for the scan. Better still, if the
'buffer' for the scan is actually just a pointer directly into NOR flash,
we use that too, avoiding the memcpy() which we used to do.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Remove forgotten lines from jffs2_scan_eraseblock() which
were unnecessary and may cause problem in some environments.
Thanks to Alexander Belyakov <alexander.belyakov@intel.com>.
Signed-off-by: Ferenc Havasi <havasi@inf.u-szeged.hu>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Setup the lock_dlm kobject before setting up the dlm lockspace instead
of after. We want to use the sysfs files to detect the mount without
having to wait for the dlm setup which can take a while.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Device node major/minor numbers are just stored in the payload of a single
data node. Just extend that to 4 bytes and use new_encode_dev() for it.
We only use the 4-byte format if we _need_ to, if !old_valid_dev(foo).
This preserves backwards compatibility with older code as much as
possible. If we do make devices with major or minor numbers above 255, and
then mount the file system with the old code, it'll just read the first
two bytes and get the numbers wrong. If it comes to garbage-collect it,
it'll then write back those wrong numbers. But that's about the best we
can expect.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
This adds some extra debugging to glock.c and changes
inode.c's deallocation code to call the debugging code
at a suitable moment. I'm chasing down a particular bug
to do with deallocation at the moment and the code can
go again once the bug is fixed.
Also this includes the first part of some changes to unify
the Linux struct inode and GFS2's struct gfs2_inode. This
transformation will happen in small parts over the next short
period.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
We no longer use semaphores, everything has been converted to
mutex or rwsem, so we don't need to include this header any more.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
This patch drops the log spinlock when an I/O error occurs
to avoid any possible problems in case of blocking or
recursion in the I/O error routine. It also has a few
cosmetic changes to tidy up various other files.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Since they are small and will be inlined by the complier,
it makes sense to merge the contents of bits.[ch] into
rgrp.c
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
configfs_init() needs to be called first to register configfs before anyconsumers try to access it. Move up configfs in fs/Makefile to make
sure it is initialized early.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
If configfs_mkdir() errored in certain ways after the parent<->child
linkage was already created, it would not undo the linkage. Also,
comment the reference counting for clarity.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
configfs_mkdir() failed to release the working parent reference in most
exit paths. Also changed the exit path for readability.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
We were using GFP_KERNEL in a handful of places which really wanted
GFP_NOFS. Fix this.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Temporarily take the meta data lock in ocfs2_file_aio_read() to allow us to
update our inode fields.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
We need to take a data lock around extends to protect the pages that
ocfs2_zero_extend is going to be pulling into the page cache. Otherwise an
extend on one node might populate the page cache with data pages that have
no lock coverage.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
The ref count of certain glock's got elevated too far during unlink
which caused umount to fail. This fixes it.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>