Commit graph

4140 commits

Author SHA1 Message Date
Jan Kara
b9251b823b [PATCH] Fix reiserfs deadlock
reiserfs_cache_default_acl() should return whether we successfully found
the acl or not.  We have to return correct value even if reiserfs_get_acl()
returns error code and not just 0.  Otherwise callers such as
reiserfs_mkdir() can unnecessarily lock the xattrs and later functions such
as reiserfs_new_inode() fail to notice that we have already taken the lock
and try to take it again with obvious consequences.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: <reiserfs-dev@namesys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-22 09:19:53 -07:00
Steve French
60808233f3 [CIFS] Readdir fixes to allow search to start at arbitrary position
in directory

Also includes first part of fix to compensate for servers which forget
to return . and .. as well as updates to changelog and cifs readme.

Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-04-22 15:53:05 +00:00
Steve French
45af7a0f2e [CIFS] Use the kthread_ API instead of opencoding lots of hairy code for kernel
thread creation and teardown.

It does not move the cifsd thread handling to kthread due to problems
found in testing with wakeup of threads blocked in the socket peek api,
but the other cifs kernel threads now use kthread.
Also cleanup cifs_init to properly unwind when thread creation fails.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-04-21 22:52:25 +00:00
Steven Whitehouse
1e09ae544e [GFS2] Move BUG() back into the header file
In order to make the file and line number reporting work
correctly, this has been moved back into the header file.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-21 15:52:46 -04:00
Steven Whitehouse
1dde2dbfc7 [GFS2] Add back missing BUG()
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-21 15:39:02 -04:00
Steven Whitehouse
a74604bee2 [GFS2] sem -> mutex conversion in locking.c
Convert a semaphore to a mutex in locking.c and also tidy
up one or two loose ends.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-21 15:10:46 -04:00
Steve French
296034f7de [CIFS] Don't allow a backslash in a path component
Unless Posix paths have been negotiated, the backslash, "\", is not a valid
character in a path component.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Steve French  <sfrench@us.ibm.com>
2006-04-21 18:18:37 +00:00
Steve French
0bd4fa977f [CIFS] [CIFS] Do not take rename sem on most path based calls (during
building of full path) to avoid hang rename/readdir hang

Reported by Alan Tyson

Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-04-21 18:17:42 +00:00
Steven Whitehouse
a748422ee4 Merge branch 'master' 2006-04-21 12:52:36 -04:00
David Woodhouse
21f1d5fc59 [RBTREE] Update JFFS2 to use rb_parent() accessor macro.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-04-21 13:17:57 +01:00
David Woodhouse
c569882b2e [RBTREE] Update eventpoll.c to use rb_parent() accessor macro.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-04-21 13:17:24 +01:00
David Woodhouse
52b5108ca7 [RBTREE] Update ext3 to use rb_parent() accessor macro.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-04-21 13:15:57 +01:00
David Teigland
c63e31c2cc [GFS2] journal recovery patch
This is one of the changes related to journal recovery I mentioned a
couple weeks ago.  We can get into a situation where there are only
readonly nodes currently mounting the fs, but there are journals that need
to be recovered.  Since the readonly nodes can't recover journals, the
next rw mounter needs to go through and check all journals and recover any
that are dirty (i.e. what the first node to mount the fs does).  This rw
mounter needs to skip the journals held by the existing readonly nodes.
Skipping those journals amounts to using the TRY flag on the journal locks
so acquiring the lock of a journal held by a readonly node will fail
instead of blocking indefinately.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-20 17:03:48 -04:00
Steven Whitehouse
190562bd84 [GFS2] Fix a bug: scheduling under a spinlock
At some stage, a mutex was added to gfs2_glock_put() without
checking all its call sites. Two of them were called from
under a spinlock causing random delays at various points and
crashes.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-20 16:57:23 -04:00
Jens Axboe
82aa5d6183 [PATCH] splice: fix smaller sized splice reads
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-20 13:05:48 +02:00
Linus Torvalds
949b211235 Merge git://git.linux-nfs.org/pub/linux/nfs-2.6
* git://git.linux-nfs.org/pub/linux/nfs-2.6:
  SUNRPC: Dead code in net/sunrpc/auth_gss/auth_gss.c
  NFS: remove needless check in nfs_opendir()
  NFS: nfs_show_stats; for_each_possible_cpu(), not NR_CPUS
  NFS: make 2 functions static
  NFS,SUNRPC: Fix compiler warnings if CONFIG_PROC_FS & CONFIG_SYSCTL are unset
  NFS: fix PROC_FS=n compile error
  VFS: Fix another open intent Oops
  RPCSEC_GSS: fix leak in krb5 code caused by superfluous kmalloc
2006-04-19 10:46:59 -07:00
Carsten Otte
7451c4f0ee NFS: remove needless check in nfs_opendir()
Local variable res was initialized to 0 - no check needed here.

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-04-19 13:06:37 -04:00
John Hawkes
b9d9506d94 NFS: nfs_show_stats; for_each_possible_cpu(), not NR_CPUS
Convert a for-loop that explicitly references "NR_CPUS" into the
potentially more efficient for_each_possible_cpu() construct.

Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-04-19 13:06:20 -04:00
Adrian Bunk
ec535ce154 NFS: make 2 functions static
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-04-19 12:43:47 -04:00
Trond Myklebust
e99170ff3b NFS,SUNRPC: Fix compiler warnings if CONFIG_PROC_FS & CONFIG_SYSCTL are unset
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-04-19 12:43:47 -04:00
Trond Myklebust
95cf959b24 VFS: Fix another open intent Oops
If the call to nfs_intent_set_file() fails to open a file in
nfs4_proc_create(), we should return an error.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-04-19 12:43:46 -04:00
Linus Torvalds
0efd9323f3 Merge branch 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block
* 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] splice: fixup writeout path after ->map changes
  [PATCH] splice: offset fixes
  [PATCH] tee: link_pipe() must be careful when dropping one of the pipe locks
  [PATCH] splice: cleanup the SPLICE_F_NONBLOCK handling
  [PATCH] splice: close i_size truncate races on read
2006-04-19 09:25:52 -07:00
Dipankar Sarma
ca99c1da08 [PATCH] Fix file lookup without ref
There are places in the kernel where we look up files in fd tables and
access the file structure without holding refereces to the file.  So, we
need special care to avoid the race between looking up files in the fd
table and tearing down of the file in another CPU.  Otherwise, one might
see a NULL f_dentry or such torn down version of the file.  This patch
fixes those special places where such a race may happen.

Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
Acked-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-19 09:13:51 -07:00
Arthur Othieno
dda27d1a55 [PATCH] hugetlbfs: add Kconfig help text
In kernel bugzilla #6248 (http://bugzilla.kernel.org/show_bug.cgi?id=6248),
Adrian Bunk <bunk@stusta.de> notes that CONFIG_HUGETLBFS is missing Kconfig
help text.

Signed-off-by: Arthur Othieno <apgo@patchbomb.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-19 09:13:50 -07:00
Eric W. Biederman
5e85d4abe3 [PATCH] task: Make task list manipulations RCU safe
While we can currently walk through thread groups, process groups, and
sessions with just the rcu_read_lock, this opens the door to walking the
entire task list.

We already have all of the other RCU guarantees so there is no cost in
doing this, this should be enough so that proc can stop taking the
tasklist lock during readdir.

prev_task was killed because it has no users, and using it will miss new
tasks when doing an rcu traversal.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-19 09:13:49 -07:00
Jens Axboe
9e0267c26e [PATCH] splice: fixup writeout path after ->map changes
Since ->map() no longer locks the page, we need to adjust the handling
of those pages (and stealing) a little. This now passes full regressions
again.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-19 15:57:31 +02:00
Jens Axboe
a4514ebd8e [PATCH] splice: offset fixes
- We need to adjust *ppos for writes as well.
- Copy back modified offset value if one was passed in, similar to
  what sendfile does.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-19 15:57:05 +02:00
Jens Axboe
2a27250e6c [PATCH] tee: link_pipe() must be careful when dropping one of the pipe locks
We need to ensure that we only drop a lock that is ordered last, to avoid
ABBA deadlocks with competing processes.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-19 15:56:40 +02:00
Jens Axboe
c4f895cbe1 [PATCH] splice: cleanup the SPLICE_F_NONBLOCK handling
- generic_file_splice_read() more readable and correct
- Don't bail on page allocation with NONBLOCK set, just don't allow
  direct blocking on IO (eg lock_page).

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-19 15:56:12 +02:00
Jens Axboe
91ad66ef44 [PATCH] splice: close i_size truncate races on read
We need to check i_size after doing a blocking readpage.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-19 15:55:10 +02:00
Linus Torvalds
385910f2b2 x86: be careful about tailcall breakage for sys_open[at] too
Came up through a quick grep for other cases similar to the ftruncate()
one in commit 0a489cb3b6.

Also, add a comment, so that people who read the code understand why we
do what looks like a no-op.

(Again, this won't actually matter to any sane user, since libc will
save and restore the register gcc stomps on, but it's still wrong to
stomp on it)

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-18 13:22:59 -07:00
Linus Torvalds
0a489cb3b6 x86: don't allow tail-calls in sys_ftruncate[64]()
Gcc thinks it owns the incoming argument stack, but that's not true for
"asmlinkage" functions, and it corrupts the caller-set-up argument stack
when it pushes the third argument onto the stack.  Which can result in
%ebx getting corrupted in user space.

Now, normally nobody sane would ever notice, since libc will save and
restore %ebx anyway over the system call, but it's still wrong.

I'd much rather have "asmlinkage" tell gcc directly that it doesn't own
the stack, but no such attribute exists, so we're stuck with our hacky
manual "prevent_tail_call()" macro once more (we've had the same issue
before with sys_waitpid() and sys_wait4()).

Thanks to Hans-Werner Hilse <hilse@sub.uni-goettingen.de> for reporting
the issue and testing the fix.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-18 13:02:48 -07:00
Steven Whitehouse
fe1bdedc6c [GFS2] Use vmalloc() in dir code
When allocating memory to sort directory entries, use vmalloc()
rather than kmalloc() since for larger directories, the required
size can easily be graeter than the 128k maximum of kmalloc().

Also adding the first steps towards getting the AOP_TRUNCATED_PAGE
return code get in the glock code by flagging all places where we
request a glock and we are holding a page lock.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-18 10:09:15 -04:00
Richard Purdie
373d5e7183 JFFS2: Return an error for long filenames
Return an error if a name is too long for JFFS2 rather than
corrupting data.

Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
2006-04-18 02:05:46 +01:00
Ananiev, Leonid I
75616cf985 [PATCH] ext3: Fix missed mutex unlock
Missed unlock_super()call is added in error condition code path.

Signed-off-by: Leonid Ananiev <leonid.i.ananiev@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-04-17 14:24:57 -07:00
Stephen Rothwell
2436f039d2 [PATCH] Fix block device symlink name
As noted further on the this file, some block devices have a / in their
name, so fix the "block:..." symlink name the same as the /sys/block name.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-04-17 14:24:57 -07:00
David Woodhouse
94171db1d2 Merge with git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git 2006-04-17 15:35:18 +01:00
David Woodhouse
d96fb997c6 [JFFS2] Fix race in post-mount node checking
For a while now, we've postponed CRC-checking of data nodes to be done
by the GC thread, instead of being done while the user is waiting for
mount to finish. The GC thread would iterate through all the inodes on
the system and check each of their data nodes. It would skip over inodes
which had already been used or were already being read in by
read_inode(), because their data nodes would have been examined anyway.

However, we could sometimes reach the end of the for-each-inode loop and
still have some unchecked space left, if an inode we'd skipped was
_still_ in the process of being read. This fixes that race by actually
waiting for read_inode() to finish rather than just moving on.

Thanks to Ladislav Michl for coming up with a reproducible test case and
helping to track it down.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-04-17 00:19:48 +01:00
Kay Sievers
d4d7e5dffc [PATCH] BLOCK: delay all uevents until partition table is scanned
[BLOCK] delay all uevents until partition table is scanned

Here we delay the annoucement of all block device events until the
disk's partition table is scanned and all partition devices are already
created and sysfs is populated.

We have a bunch of old bugs for removable storage handling where we
probe successfully for a filesystem on the raw disk, but at the
same time the kernel recognizes a partition table and creates partition
devices.
Currently there is no sane way to tell if partitions will show up or not
at the time the disk device is announced to userspace. With the delayed
events we can simply skip any probe for a filesystem on the raw disk when
we find already present partitions.

Signed-off-by: Kay Sievers <kay.sievers@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-04-14 11:41:24 -07:00
NeilBrown
4508a7a734 [PATCH] sysfs: Allow sysfs attribute files to be pollable
It works like this:
  Open the file
  Read all the contents.
  Call poll requesting POLLERR or POLLPRI (so select/exceptfds works)
  When poll returns,
     close the file and go to top of loop.
   or lseek to start of file and go back to the 'read'.

Events are signaled by an object manager calling
   sysfs_notify(kobj, dir, attr);

If the dir is non-NULL, it is used to find a subdirectory which
contains the attribute (presumably created by sysfs_create_group).

This has a cost of one int  per attribute, one wait_queuehead per kobject,
one int per open file.

The name "sysfs_notify" may be confused with the inotify
functionality.  Maybe it would be nice to support inotify for sysfs
attributes as well?

This patch also uses sysfs_notify to allow /sys/block/md*/md/sync_action
to be pollable

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-04-14 11:41:24 -07:00
Linus Torvalds
9a7e9f1c60 Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/mszeredi/fuse
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/mszeredi/fuse:
  [fuse] Direct I/O  should not use fuse_reset_request
  [fuse] Don't init request twice
  [fuse] Fix accounting the number of waiting requests
  [fuse] fix deadlock between fuse_put_super() and request_end()
2006-04-14 09:11:34 -07:00
Linus Torvalds
9ca686626c Merge branch 'tee' of git://brick.kernel.dk/data/git/linux-2.6-block
* 'tee' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] splice: add support for sys_tee()
  [PATCH] splice: pass offset around for ->splice_read() and ->splice_write()
2006-04-14 09:02:07 -07:00
Eric W. Biederman
c06511d12d [PATCH] de_thread: Don't change our parents and ptrace flags.
This is two distinct changes.
 - Not changing our real parents.
 - Not changing our ptrace parents.

Not changing our real parents is trivially correct because both tasks
have the same real parents as they are part of a thread group.  Now that
we demote the leader to a thread there is no longer any reason to change
it's parentage.

Not changing our ptrace parents is a user visible change if someone
looks hard enough.  I don't think user space applications will care or
even notice.

In the practical and I think common case a debugger will have attached
to all of the threads using the same ptrace flags.  From my quick skim
of strace and gdb that appears to be the case.  Which if true means
debuggers will not notice a change.

Before this point we have already generated a ptrace event in do_exit
that reports the leaders pid has died so de_thread is visible to a
debugger.  Which means attempting to hide this case by copying flags
around appears excessive.

By not doing anything it avoids all of the weird locking issues between
de_thread and ptrace attach, and removes one case from consideration for
fixing the ptrace locking.

This only addresses Oleg's first concern with ptrace_attach, that of the
problems caused by reparenting.  Oleg's second concern is essentially a
race between ptrace_attach and release_task that causes an oops when we
get to force_sig_specific.  There is nothing special about de_thread
with respect to that race.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-14 08:49:19 -07:00
Steven Whitehouse
4d8012b60e [GFS2] Fix bug which was causing postmark to fail
A typo in the directory code was causing postmark to fail
somewhere in the allocation code, since it was unable to
find newly allocated directory leaf blocks under certain
circumstances.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-12 17:39:45 -04:00
Randy Dunlap
fb6a82c94a [PATCH] jffs2: fix printk warnings
Fix printk format warnings in jffs2.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-04-11 20:12:10 -04:00
Miklos Szeredi
56cf34ff07 [fuse] Direct I/O should not use fuse_reset_request
It's cleaner to allocate a new request, otherwise the uid/gid/pid
fields of the request won't be filled in.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
2006-04-11 21:16:51 +02:00
Miklos Szeredi
4858cae4f0 [fuse] Don't init request twice
Request is already initialized in fuse_request_alloc() so no need to
do it again in fuse_get_req().

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
2006-04-11 21:16:38 +02:00
Miklos Szeredi
9bc5dddad1 [fuse] Fix accounting the number of waiting requests
Properly accounting the number of waiting requests was forgotten in
"clean up request accounting" patch.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
2006-04-11 21:16:09 +02:00
Miklos Szeredi
73ce8355c2 [fuse] fix deadlock between fuse_put_super() and request_end()
A deadlock was possible, when the last reference to the superblock was
held due to a background request containing a file reference.

Releasing the file would release the vfsmount which in turn would
release the superblock.  Since sbput_sem is held during the fput() and
fuse_put_super() tries to acquire this same semaphore, a deadlock
results.

The chosen soltuion is to get rid of sbput_sem, and instead use the
spinlock to ensure the referenced inodes/file are released only once.
Since the actual release may sleep, defer these outside the locked
region, but using local variables instead of the structure members.

This is a much more rubust solution.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
2006-04-11 21:14:26 +02:00
Steven Whitehouse
f4154ea039 [GFS2] Update journal accounting code.
A small update to the journaling code to change the way that
the "extra" blocks are accounted for in the journal. These are
used at a rate of one per 503 metadata blocks or one per 251
journaled data blocks (or just one if the total number of journaled
blocks in the transaction is smaller). Since we are using them at
two different rates the old method of accounting for them no longer
works and we count them up as required.

Since the "per transaction" accounting can't handle this (there is no
fixed number of header blocks per transaction) we have to account for
it in the general journal code. We now require that each transaction
reserves more blocks than it actually needs to take account of the
possible extra blocks.

Also a final fix to dir.c to ensure that all ref counts are handled
correctly.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-11 14:49:06 -04:00