Commit graph

4140 commits

Author SHA1 Message Date
David Woodhouse
cbb9a56177 Move jffs2_fs_i.h and jffs2_fs_sb.h from include/linux/ to fs/jffs2/
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-03 13:07:27 +01:00
David Teigland
97a35d1e5f [DLM] fix grant_after_purge softlockup
In dlm_grant_after_purge() we were holding a hash table read_lock while
calling put_rsb() which potentially removes the rsb from the hash table,
taking the same lock in write.  Fix this by flagging rsb's ahead of time
that have been purged.  Then iteratively read_lock the hash table, find a
flagged rsb, unlock, process rsb.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-02 13:34:03 -04:00
Steven Whitehouse
d2d7b8a2a7 [GFS2] Fix bug in writepage()
As pointed out by Wendy Cheng, the logic in GFS2's writepage() function
wasn't quite right with respect to invalidating pages when a file has been
truncated. This patch fixes that.

CC: Wendy Cheng <wcheng@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-05-02 12:09:42 -04:00
Jens Axboe
330ab71619 [PATCH] vmsplice: restrict stealing a little more
Apply the same rules as the anon pipe pages, only allow stealing
if no one else is using the page.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-05-02 15:29:57 +02:00
Jens Axboe
a893b99be7 [PATCH] splice: fix page LRU accounting
Currently we rely on the PIPE_BUF_FLAG_LRU flag being set correctly
to know whether we need to fiddle with page LRU state after stealing it,
however for some origins we just don't know if the page is on the LRU
list or not.

So remove PIPE_BUF_FLAG_LRU and do this check/add manually in pipe_to_file()
instead.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-05-02 15:03:27 +02:00
Jens Axboe
7591489a8f [PATCH] vmsplice: fix badly placed end paranthesis
We need to use the minium of {len, PAGE_SIZE-off}, not {len, PAGE_SIZE}-off.
The latter doesn't make any sense, and could cause us to attempt negative
length transfers...

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-05-02 12:57:18 +02:00
Linus Torvalds
9817d207dc Merge branch 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block
* 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] vmsplice: allow user to pass in gift pages
  [PATCH] pipe: enable atomic copying of pipe data to/from user space
  [PATCH] splice: call handle_ra_miss() on failure to lookup page
  [PATCH] Add ->splice_read/splice_write to def_blk_fops
  [PATCH] pipe: introduce ->pin() buffer operation
  [PATCH] splice: fix bugs in pipe_to_file()
  [PATCH] splice: fix bugs with stealing regular pipe pages
2006-05-01 18:33:40 -07:00
Andi Kleen
d261020229 [PATCH] x86_64: Add compat_sys_vmsplice and use it in x86-64
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-01 18:17:43 -07:00
Jens Axboe
7afa6fd037 [PATCH] vmsplice: allow user to pass in gift pages
If SPLICE_F_GIFT is set, the user is basically giving this pages away to
the kernel. That means we can steal them for eg page cache uses instead
of copying it.

The data must be properly page aligned and also a multiple of the page size
in length.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-05-01 20:02:33 +02:00
Jens Axboe
f6762b7ad8 [PATCH] pipe: enable atomic copying of pipe data to/from user space
The pipe ->map() method uses kmap() to virtually map the pages, which
is both slow and has known scalability issues on SMP. This patch enables
atomic copying of pipe pages, by pre-faulting data and using kmap_atomic()
instead.

lmbench bw_pipe and lat_pipe measurements agree this is a Good Thing. Here
are results from that on a UP machine with highmem (1.5GiB of RAM), running
first a UP kernel, SMP kernel, and SMP kernel patched.

Vanilla-UP:
Pipe bandwidth: 1622.28 MB/sec
Pipe bandwidth: 1610.59 MB/sec
Pipe bandwidth: 1608.30 MB/sec
Pipe latency: 7.3275 microseconds
Pipe latency: 7.2995 microseconds
Pipe latency: 7.3097 microseconds

Vanilla-SMP:
Pipe bandwidth: 1382.19 MB/sec
Pipe bandwidth: 1317.27 MB/sec
Pipe bandwidth: 1355.61 MB/sec
Pipe latency: 9.6402 microseconds
Pipe latency: 9.6696 microseconds
Pipe latency: 9.6153 microseconds

Patched-SMP:
Pipe bandwidth: 1578.70 MB/sec
Pipe bandwidth: 1579.95 MB/sec
Pipe bandwidth: 1578.63 MB/sec
Pipe latency: 9.1654 microseconds
Pipe latency: 9.2266 microseconds
Pipe latency: 9.1527 microseconds

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-05-01 20:02:05 +02:00
Jens Axboe
e27dedd84c [PATCH] splice: call handle_ra_miss() on failure to lookup page
Notify the readahead logic of the missing page. Suggested by
Oleg Nesterov.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-05-01 19:59:54 +02:00
Jens Axboe
7f9c51f0d9 [PATCH] Add ->splice_read/splice_write to def_blk_fops
It can use the generic handlers.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-05-01 19:59:32 +02:00
Jens Axboe
f84d751994 [PATCH] pipe: introduce ->pin() buffer operation
The ->map() function is really expensive on highmem machines right now,
since it has to use the slower kmap() instead of kmap_atomic(). Splice
rarely needs to access the virtual address of a page, so it's a waste
of time doing it.

Introduce ->pin() to take over the responsibility of making sure the
page data is valid. ->map() is then reduced to just kmap(). That way we
can also share a most of the pipe buffer ops between pipe.c and splice.c

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-05-01 19:59:03 +02:00
Jens Axboe
0568b409c7 [PATCH] splice: fix bugs in pipe_to_file()
Found by Oleg Nesterov <oleg@tv-sign.ru>, fixed by me.

- Only allow full pages to go to the page cache.
- Check page != buf->page instead of using PIPE_BUF_FLAG_STOLEN.
- Remember to clear 'stolen' if add_to_page_cache() fails.

And as a cleanup on that:

- Make the bottom fall-through logic a little less convoluted. Also make
  the steal path hold an extra reference to the page, so we don't have
  to differentiate between stolen and non-stolen at the end.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-05-01 19:50:48 +02:00
Jens Axboe
46e678c96b [PATCH] splice: fix bugs with stealing regular pipe pages
- Check that page has suitable count for stealing in the regular pipes.
- pipe_to_file() assumes that the page is locked on succesful steal, so
  do that in the pipe steal hook
- Missing unlock_page() in add_to_page_cache() failure.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-30 16:36:32 +02:00
Steven Whitehouse
56409abbf8 [GFS2] Remove some unused code
Remove some of the unused code flagged up by Adrian Bunk.

Cc: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Steven Whitehouse
2006-04-28 11:48:45 -04:00
Adrian Bunk
08bc2dbc73 [GFS2] [-mm patch] fs/gfs2/: possible cleanups
This patch contains the following possible cleanups:
- make needlessly global code static
- #if 0 unused functions
- remove the following global function that was both unused and
  unimplemented:
  - super.c: gfs2_do_upgrade()

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-28 10:59:12 -04:00
David Teigland
c56b39cd2c [DLM] PATCH 3/3 dlm: show recover state
Expose the current recovery state in sysfs to help in debugging.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-28 10:51:53 -04:00
David Teigland
1c032c0311 [DLM] PATCH 2/3 dlm: lowcomms close
When a node is removed from a lockspace configuration, close our
connection to it, clearing any remaining messages for it.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-28 10:50:41 -04:00
David Teigland
ae118962b9 [DLM] PATCH 1/3 dlm: force free user lockspace
Lockspaces created from user space should be forcibly freed without
requiring any further user space interaction.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-28 10:48:59 -04:00
Steven Whitehouse
363275216c [GFS2] Reordering in deallocation to avoid recursive locking
Despite my earlier careful search, there was a recursive lock left
in the deallocation code. This removes it. It also should speed up
deallocation be reducing the number of locking operations which take
place by using two "try lock" operations on the two locks involved in
inode deallocation which allows us to grab the locks out of order
(compared with NFS which grabs the inode lock first and the iopen
lock later). It is ok for us to fail while doing this since if it
does fail it means that someone else is still using the inode and
thus it wouldn't be possible to deallocate anyway.

This fixes the bug reported to me by Rob Kenna.

Cc: Rob Kenna <rkenna@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-28 10:46:21 -04:00
Andreas Schwab
2833c28aa0 [PATCH] powerpc: Wire up *at syscalls
Wire up *at syscalls.

This patch has been tested on ppc64 (using glibc's testsuite, both 32bit
and 64bit), and compile-tested for ppc32 (I have currently no ppc32 system
available, but I expect no problems).

Signed-off-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-04-28 21:04:59 +10:00
David Teigland
d26046bb0a Merge branch 'master' 2006-04-27 11:49:55 -04:00
David Teigland
e7f5c01cad [GFS2] Remove redundant casts to/from void
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-27 11:25:45 -04:00
Jens Axboe
eb20796bf6 [PATCH] splice: make the read-side do batched page lookups
Use the new find_get_pages_contig() to potentially look up the entire
splice range in one single call. This speeds up generic_file_splice_read()
quite a bit.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-27 11:05:22 +02:00
Jens Axboe
eb645a24de [PATCH] splice: switch to using page_cache_readahead()
Avoids doing useless work, when the file is fully cached.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-27 08:59:48 +02:00
David Teigland
6bd70aba5a [DLM] lock_dlm recover_status patch
This saves the journal recovery result and makes it visible through sysfs.
User space needs to know if the node actually recovered the journal or
tried and gave up.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-26 15:56:35 -04:00
Steven Whitehouse
579b78a43b [GFS2] Remove GL_NEVER_RECURSE flag
There is no point in keeping this flag since recursion is not
now allowed for any glock.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-26 14:58:26 -04:00
Steven Whitehouse
5965b1f479 [GFS2] Don't do recursive locking in glock layer
This patch changes the last user of recursive locking so that
it no longer needs this feature and removes it from the glock
layer. This makes the glock code a lot simpler and easier to
understand. Its also a prerequsite to adding support for the
AOP_TRUNCATED_PAGE return code (or at least it is if you don't
want your brain to melt in the process)

I've left in a couple of checks just in case there is some place
else in the code which is still using this feature that I didn't
spot yet, but they can probably be removed long term.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-26 13:21:55 -04:00
James Morris
e7edf9cded [PATCH] LSM: add missing hook to do_compat_readv_writev()
This patch addresses a flaw in LSM, where there is no mediation of readv()
and writev() in for 32-bit compatible apps using a 64-bit kernel.

This bug was discovered and fixed initially in the native readv/writev
code [1], but was not fixed in the compat code.  Thanks to Al for spotting
this one.

  [1] http://lwn.net/Articles/154282/

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-26 07:52:21 -07:00
Al Viro
a090d9132c [PATCH] protect ext3 ioctl modifying append_only, immutable, etc. with i_mutex
All modifications of ->i_flags in inodes that might be visible to
somebody else must be under ->i_mutex.  That patch fixes ext3 ioctl()
setting S_APPEND and friends.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-26 07:52:21 -07:00
Al Viro
de0bb97aff [PATCH] forgotten ->b_data in memcpy() call in ext3/resize.c (oopsable)
sbi->s_group_desc is an array of pointers to buffer_head.  memcpy() of
buffer size from address of buffer_head is a bad idea - it will generate
junk in any case, may oops if buffer_head is close to the end of slab
page and next page is not mapped and isn't what was intended there.
IOW, ->b_data is missing in that call.  Fortunately, result doesn't go
into the primary on-disk data structures, so only backup ones get crap
written to them; that had allowed this bug to remain unnoticed until
now.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-26 07:52:21 -07:00
Linus Torvalds
7b97ebfb93 Merge branch 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block
* 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] splice: add ->splice_write support for /dev/null
  [PATCH] splice: rearrange moving to/from pipe helpers
  [PATCH] Add support for the sys_vmsplice syscall
  [PATCH] splice: fix offset problems
  [PATCH] splice: fix min() warning
2006-04-26 07:47:55 -07:00
Jens Axboe
00522fb41a [PATCH] splice: rearrange moving to/from pipe helpers
We need these for people writing their own ->splice_read/write hooks.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-26 14:39:29 +02:00
Jens Axboe
912d35f867 [PATCH] Add support for the sys_vmsplice syscall
sys_splice() moves data to/from pipes with a file input/output. sys_vmsplice()
moves data to a pipe, with the input being a user address range instead.

This uses an approach suggested by Linus, where we can hold partial ranges
inside the pages[] map. Hopefully this will be useful for network
receive support as well.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-26 10:59:21 +02:00
Miklos Szeredi
8aa09a50b5 [fuse] fix race between checking and setting file->private_data
BKL does not protect against races if the task may sleep between
checking and setting a value.  So move checking of file->private_data
near to setting it in fuse_fill_super().

Found by Al Viro.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
2006-04-26 10:49:16 +02:00
Miklos Szeredi
6dbbcb1205 [fuse] fix deadlock between fuse_put_super() and request_end(), try #2
A deadlock was possible, when the last reference to the superblock was
held due to a background request containing a file reference.

Releasing the file would release the vfsmount which in turn would
release the superblock.  Since sbput_sem is held during the fput() and
fuse_put_super() tries to acquire this same semaphore, a deadlock
results.

The solution is to move the fput() outside the region protected by
sbput_sem.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
2006-04-26 10:49:06 +02:00
Miklos Szeredi
5a5fb1ea74 Revert "[fuse] fix deadlock between fuse_put_super() and request_end()"
This reverts 73ce8355c2 commit.

It was wrong, because it didn't take into account the requirement,
that iput() for background requests must be performed synchronously
with ->put_super(), otherwise active inodes may remain after unmount.

The right solution is to keep the sbput_sem and perform iput() within
the locked region, but move fput() outside sbput_sem.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
2006-04-26 10:48:55 +02:00
Jens Axboe
016b661e2f [PATCH] splice: fix offset problems
Make the move_from_pipe() actors return number of bytes processed, then
move_from_pipe() can decide more cleverly when to move on to the next
buffer.

This fixes problems with pipe offset and differing file offset.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-26 10:33:34 +02:00
Andrew Morton
ba5f5d90c4 [PATCH] splice: fix min() warning
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-26 10:33:34 +02:00
David Teigland
3a2a9c96ac [GFS2] Update plock code in DLM locking module
We should be using fl_pid not fl_owner.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-25 15:45:51 -04:00
Patrick Caulfield
714dc65c34 [DLM] Convert a semaphore to a completion
Convert a semaphore into a completion in device.c.

Cc: David Teigland <teigland@redhat.com>
Cc: Andrew Morton <akpm@osdl.org>
Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-25 14:49:01 -04:00
Steven Whitehouse
96c2c0083d [DLM] Update Kconfig in the light of comments on lkml
We now depend on user selectable options rather than
select them. There is no dependancy on SYSFS since this
selection is independant of the DLM (even though it wouldn't
be sensible to build the DLM without it)

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-25 13:23:09 -04:00
Steven Whitehouse
4bcf7091f9 [GFS2] Remove inherited flags from exported flags.
We don't need the inherited flags since this action can be
implied by setting the flags on directories where they
wouldn't otherwise make sense. It reduces the number of extra
flags by two. Also updated the list of flags to take account of
one extra ext2/3 flag.

Cc: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-25 13:20:27 -04:00
Steven Whitehouse
b5ea3e1ef3 [GFS2] Tidy up Makefile & Kconfig
Remove select of SYSFS as requested by Greg KH. Change whitespace to
tabs rather than spaces in places where it was incorrect and removed
'default m' as suggested by Adrian Bunk.

Reorganised Makefile as suggested by Sam Ravnborg.

Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Adrian Bunk <bunk@stusta.de>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-24 14:14:42 -04:00
Steven Whitehouse
b800a1cb39 [GFS2] Tidy up daemon.c
As per Andrew Morton's comments, remove uneeded casts and use
wait_event_interruptible() rather than open code the wait.

Cc: Andrew Morton <akpm@osdl.org>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-24 13:13:56 -04:00
Steve French
301dc3e6f6 [CIFS] Fix compile error when CONFIG_CIFS_EXPERIMENTAL is undefined
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-04-24 16:24:54 +00:00
Steven Whitehouse
61e085a88c [GFS2] Tidy up dir code as per Christoph Hellwig's comments
1. Comment whitespace fix
2. Removed unused header files from dir.c
3. Split the gfs2_dir_get_buffer() function into two functions

Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-04-24 10:07:13 -04:00
Linus Torvalds
41bc3982b9 Merge master.kernel.org:/pub/scm/linux/kernel/git/sfrench/cifs-2.6-stable
* master.kernel.org:/pub/scm/linux/kernel/git/sfrench/cifs-2.6-stable:
  [CIFS] Fix typo in previous
  [CIFS] Readdir fixes to allow search to start at arbitrary position
  [CIFS] Use the kthread_ API instead of opencoding lots of hairy code for kernel
  [CIFS] Don't allow a backslash in a path component
  [CIFS] [CIFS] Do not take rename sem on most path based calls (during
2006-04-23 09:38:09 -07:00
Steve French
b66ac3ea21 [CIFS] Fix typo in previous
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-04-23 01:54:50 +00:00