linux-pinenote

Author	SHA1	Message	Date
Josef Bacik	6a63209fc0	Btrfs: add better -ENOSPC handling This is a step in the direction of better -ENOSPC handling. Instead of checking the global bytes counter we check the space_info bytes counters to make sure we have enough space. If we don't we go ahead and try to allocate a new chunk, and then if that fails we return -ENOSPC. This patch adds two counters to btrfs_space_info, bytes_delalloc and bytes_may_use. bytes_delalloc account for extents we've actually setup for delalloc and will be allocated at some point down the line. bytes_may_use is to keep track of how many bytes we may use for delalloc at some point. When we actually set the extent_bit for the delalloc bytes we subtract the reserved bytes from the bytes_may_use counter. This keeps us from not actually being able to allocate space for any delalloc bytes. Signed-off-by: Josef Bacik <jbacik@redhat.com>	2009-02-20 11:00:09 -05:00
Chris Mason	2cfbd50b53	Btrfs: check file pointer in btrfs_sync_file fsync can be called by NFS with a null file pointer, and btrfs was oopsing in this case. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-02-20 10:55:10 -05:00
Linus Torvalds	620565ef5f	Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs * 'for-linus' of git://oss.sgi.com/xfs/xfs: Revert "[XFS] remove old vmap cache" Revert "[XFS] use scalable vmap API"	2009-02-19 13:09:32 -08:00
Felix Blyakher	27e88bf6af	Revert "[XFS] remove old vmap cache" This reverts commit `d2859751cd`. This commit caused regression. We'll try to fix use of new vmap API for next release. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-02-19 13:15:55 -06:00
Felix Blyakher	7fdf582447	Revert "[XFS] use scalable vmap API" This reverts commit `95f8e302c0`. This commit caused regression. We'll try to fix use of new vmap API for next release. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-02-19 13:15:44 -06:00
Ingo Molnar	72c26c9a26	Merge branch 'linus' into tracing/blktrace Conflicts: block/blktrace.c Semantic merge: kernel/trace/blktrace.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-19 09:00:35 +01:00
Linus Torvalds	ba95fd47d1	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: block: fix deadlock in blk_abort_queue() for drivers that readd to timeout list block: fix booting from partitioned md array block: revert part of `18ce3751cc` cciss: PCI power management reset for kexec paride/pg.c: xs(): &&/|| confusion fs/bio: bio_alloc_bioset: pass right object ptr to mempool_free block: fix bad definition of BIO_RW_SYNC bsg: Fix sense buffer bug in SG_IO	2009-02-18 18:33:04 -08:00
Ingo Molnar	f04b30de3c	inotify: fix GFP_KERNEL related deadlock Enhanced lockdep coverage of __GFP_NOFS turned up this new lockdep assert: [ 1093.677775] [ 1093.677781] ================================= [ 1093.680031] [ INFO: inconsistent lock state ] [ 1093.680031] 2.6.29-rc5-tip-01504-gb49eca1-dirty #1 [ 1093.680031] --------------------------------- [ 1093.680031] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage. [ 1093.680031] kswapd0/308 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 1093.680031] (&inode->inotify_mutex){+.+.?.}, at: [<c0205942>] inotify_inode_is_dead+0x20/0x80 [ 1093.680031] {RECLAIM_FS-ON-W} state was registered at: [ 1093.680031] [<c01696b9>] mark_held_locks+0x43/0x5b [ 1093.680031] [<c016baa4>] lockdep_trace_alloc+0x6c/0x6e [ 1093.680031] [<c01cf8b0>] kmem_cache_alloc+0x20/0x150 [ 1093.680031] [<c040d0ec>] idr_pre_get+0x27/0x6c [ 1093.680031] [<c02056e3>] inotify_handle_get_wd+0x25/0xad [ 1093.680031] [<c0205f43>] inotify_add_watch+0x7a/0x129 [ 1093.680031] [<c020679e>] sys_inotify_add_watch+0x20f/0x250 [ 1093.680031] [<c010389e>] sysenter_do_call+0x12/0x35 [ 1093.680031] [<ffffffff>] 0xffffffff [ 1093.680031] irq event stamp: 60417 [ 1093.680031] hardirqs last enabled at (60417): [<c018d5f5>] call_rcu+0x53/0x59 [ 1093.680031] hardirqs last disabled at (60416): [<c018d5b9>] call_rcu+0x17/0x59 [ 1093.680031] softirqs last enabled at (59656): [<c0146229>] __do_softirq+0x157/0x16b [ 1093.680031] softirqs last disabled at (59651): [<c0106293>] do_softirq+0x74/0x15d [ 1093.680031] [ 1093.680031] other info that might help us debug this: [ 1093.680031] 2 locks held by kswapd0/308: [ 1093.680031] #0: (shrinker_rwsem){++++..}, at: [<c01b0502>] shrink_slab+0x36/0x189 [ 1093.680031] #1: (&type->s_umount_key#4){+++++.}, at: [<c01e6d77>] shrink_dcache_memory+0x110/0x1fb [ 1093.680031] [ 1093.680031] stack backtrace: [ 1093.680031] Pid: 308, comm: kswapd0 Not tainted 2.6.29-rc5-tip-01504-gb49eca1-dirty #1 [ 1093.680031] Call Trace: [ 1093.680031] 
[<c016947a>] valid_state+0x12a/0x13d [ 1093.680031] [<c016954e>] mark_lock+0xc1/0x1e9 [ 1093.680031] [<c016a5b4>] ? check_usage_forwards+0x0/0x3f [ 1093.680031] [<c016ab74>] __lock_acquire+0x2c6/0xac8 [ 1093.680031] [<c01688d9>] ? register_lock_class+0x17/0x228 [ 1093.680031] [<c016b3d3>] lock_acquire+0x5d/0x7a [ 1093.680031] [<c0205942>] ? inotify_inode_is_dead+0x20/0x80 [ 1093.680031] [<c08824c4>] __mutex_lock_common+0x3a/0x4cb [ 1093.680031] [<c0205942>] ? inotify_inode_is_dead+0x20/0x80 [ 1093.680031] [<c08829ed>] mutex_lock_nested+0x2e/0x36 [ 1093.680031] [<c0205942>] ? inotify_inode_is_dead+0x20/0x80 [ 1093.680031] [<c0205942>] inotify_inode_is_dead+0x20/0x80 [ 1093.680031] [<c01e6672>] dentry_iput+0x90/0xc2 [ 1093.680031] [<c01e67a3>] d_kill+0x21/0x45 [ 1093.680031] [<c01e6a46>] __shrink_dcache_sb+0x27f/0x355 [ 1093.680031] [<c01e6dc5>] shrink_dcache_memory+0x15e/0x1fb [ 1093.680031] [<c01b05ed>] shrink_slab+0x121/0x189 [ 1093.680031] [<c01b0d12>] kswapd+0x39f/0x561 [ 1093.680031] [<c01ae499>] ? isolate_pages_global+0x0/0x233 [ 1093.680031] [<c0157eae>] ? autoremove_wake_function+0x0/0x43 [ 1093.680031] [<c01b0973>] ? kswapd+0x0/0x561 [ 1093.680031] [<c0157daf>] kthread+0x41/0x82 [ 1093.680031] [<c0157d6e>] ? kthread+0x0/0x82 [ 1093.680031] [<c01043ab>] kernel_thread_helper+0x7/0x10 inotify_handle_get_wd() does idr_pre_get() which does a kmem_cache_alloc() without __GFP_FS - and is hence deadlockable under extreme MM pressure. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: MinChan Kim <minchan.kim@gmail.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-02-18 15:37:56 -08:00
Bill Nottingham	2db69a9340	vt: Declare PIO_CMAP/GIO_CMAP as compatible ioctls. Otherwise, these don't work when called from 32-bit userspace on 64-bit kernels. Cc: Jiri Kosina <jkosina@suse.cz> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: <stable@kernel.org> [2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-02-18 15:37:56 -08:00
Peter Zijlstra	ada723dcd6	fs/super.c: add lockdep annotation to s_umount Li Zefan said: Thread 1: for ((; ;)) { mount -t cpuset xxx /mnt > /dev/null 2>&1 cat /mnt/cpus > /dev/null 2>&1 umount /mnt > /dev/null 2>&1 } Thread 2: for ((; ;)) { mount -t cpuset xxx /mnt > /dev/null 2>&1 umount /mnt > /dev/null 2>&1 } (Note: It is irrelevant which cgroup subsys is used.) After a while a lockdep warning showed up: ============================================= [ INFO: possible recursive locking detected ] 2.6.28 #479 --------------------------------------------- mount/13554 is trying to acquire lock: (&type->s_umount_key#19){--..}, at: [<c049d888>] sget+0x5e/0x321 but task is already holding lock: (&type->s_umount_key#19){--..}, at: [<c049da0c>] sget+0x1e2/0x321 other info that might help us debug this: 1 lock held by mount/13554: #0: (&type->s_umount_key#19){--..}, at: [<c049da0c>] sget+0x1e2/0x321 stack backtrace: Pid: 13554, comm: mount Not tainted 2.6.28-mc #479 Call Trace: [<c044ad2e>] validate_chain+0x4c6/0xbbd [<c044ba9b>] __lock_acquire+0x676/0x700 [<c044bb82>] lock_acquire+0x5d/0x7a [<c049d888>] ? sget+0x5e/0x321 [<c061b9b8>] down_write+0x34/0x50 [<c049d888>] ? sget+0x5e/0x321 [<c049d888>] sget+0x5e/0x321 [<c045a2e7>] ? cgroup_set_super+0x0/0x3e [<c045959f>] ? cgroup_test_super+0x0/0x2f [<c045bcea>] cgroup_get_sb+0x98/0x2e7 [<c045cfb6>] cpuset_get_sb+0x4a/0x5f [<c049dfa4>] vfs_kern_mount+0x40/0x7b [<c049e02d>] do_kern_mount+0x37/0xbf [<c04af4a0>] do_mount+0x5c3/0x61a [<c04addd2>] ? copy_mount_options+0x2c/0x111 [<c04af560>] sys_mount+0x69/0xa0 [<c0403251>] sysenter_do_call+0x12/0x31 The cause is after alloc_super() and then retry, an old entry in list fs_supers is found, so grab_super(old) is called, but both functions hold s_umount lock: struct super_block *sget(...) { ... 
retry: spin_lock(&sb_lock); if (test) { list_for_each_entry(old, &type->fs_supers, s_instances) { if (!test(old, data)) continue; if (!grab_super(old)) <--- 2nd: down_write(&old->s_umount); goto retry; if (s) destroy_super(s); return old; } } if (!s) { spin_unlock(&sb_lock); s = alloc_super(type); <--- 1st: down_write(&s->s_umount) if (!s) return ERR_PTR(-ENOMEM); goto retry; } ... } It seems like a false positive, and seems like VFS but not cgroup needs to be fixed. Peter said: We can simply put the new s_umount instance in a but lockdep doesn't particularly care about subclass order. If there's any issue with the callers of sget() assuming the s_umount lock being of subclass 0, then there is another annotation we can use to fix that, but let's not bother with that if this is sufficient. Addresses http://bugzilla.kernel.org/show_bug.cgi?id=12673 Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Li Zefan <lizf@cn.fujitsu.com> Reported-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Paul Menage <menage@google.com> Cc: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-02-18 15:37:55 -08:00
Nick Piggin	1cf6e7d83b	mm: task dirty accounting fix YAMAMOTO-san noticed that task_dirty_inc doesn't seem to be called properly for cases where set_page_dirty is not used to dirty a page (eg. mark_buffer_dirty). Additionally, there is some inconsistency about when task_dirty_inc is called. It is used for dirty balancing, however it even gets called for __set_page_dirty_no_writeback. So rather than increment it in a set_page_dirty wrapper, move it down to exactly where the dirty page accounting stats are incremented. Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-02-18 15:37:54 -08:00
Davide Libenzi	610d18f412	timerfd: add flags check As requested by Michael, add a missing check for valid flags in timerfd_settime(), and make it return EINVAL in case some extra bits are set. Michael said: If this is to be any use to userland apps that want to check flag support (perhaps it is too late already), then the sooner we get it into the kernel the better: 2.6.29 would be good; earlier stables as well would be even better. [akpm@linux-foundation.org: remove unused TFD_FLAGS_SET] Acked-by: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: <stable@kernel.org> [2.6.27.x, 2.6.28.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-02-18 15:37:53 -08:00
Eric Biederman	8f19d47293	seq_file: properly cope with pread Currently seq_read assumes that the offset passed to it is always the offset it passed to user space. In the case of pread this assumption is broken and we do the wrong thing when presented with pread. To solve this I introduce an offset cache inside of struct seq_file so we know where our logical file position is. Then in seq_read if we try to read from another offset we reset our data structures and attempt to go to the offset user space wanted. [akpm@linux-foundation.org: restore FMODE_PWRITE] [pjt@google.com: seq_open needs its fmode opened up to take advantage of this] Signed-off-by: Eric Biederman <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Paul Turner <pjt@google.com> Cc: <stable@kernel.org> [2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-02-18 15:37:53 -08:00
Felix Blyakher	3a011a1719	Revert "[XFS] remove old vmap cache" This reverts commit `d2859751cd`. This commit caused regression. We'll try to fix use of new vmap API for next release. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-02-18 15:57:51 -06:00
Felix Blyakher	cf7dab8017	Revert "[XFS] use scalable vmap API" This reverts commit `95f8e302c0`. This commit caused regression. We'll try to fix use of new vmap API for next release. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-02-18 15:41:28 -06:00
Felix Blyakher	01234f3c87	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6	2009-02-18 15:35:05 -06:00
Jens Axboe	78f707bfc7	block: revert part of `18ce3751cc` The above commit added WRITE_SYNC and switched various places to using that for committing writes that will be waited upon immediately after submission. However, this causes a performance regression with AS and CFQ for ext3 at least, since sync_dirty_buffer() will submit some writes with WRITE_SYNC while ext3 has submitted other dependent writes without the sync flag set. This causes excessive anticipation/idling in the IO scheduler because sync and async writes get interleaved, causing a big performance regression for the below test case (which is meant to simulate sqlite like behaviour). ---- test case ---- int main(int argc, char **argv) { int fdes, i; FILE *fp; struct timeval start; struct timeval end; struct timeval res; gettimeofday(&start, NULL); for (i=0; i<ROWS; i++) { fp = fopen("test_file", "a"); fprintf(fp, "Some Text Data\n"); fdes = fileno(fp); fsync(fdes); fclose(fp); } gettimeofday(&end, NULL); timersub(&end, &start, &res); fprintf(stdout, "time to write %d lines is %ld(msec)\n", ROWS, (res.tv_sec*1000000 + res.tv_usec)/1000); return 0; } ------------------- Thanks to Sean.White@APCC.com for tracking down this performance regression and providing a test case. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-02-18 10:32:01 +01:00
Subhash Peddamallu	a60e78e57a	fs/bio: bio_alloc_bioset: pass right object ptr to mempool_free When freeing from bio pool use right ptr to account for bs->front_pad, instead of bio ptr, Signed-off-by: Subhash Peddamallu <subhash.peddamallu@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-02-18 10:32:01 +01:00
Linus Torvalds	48c0d9ece3	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable * git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: hold trans_mutex when using btrfs_record_root_in_trans Btrfs: make a lockdep class for the extent buffer locks Btrfs: fs/btrfs/volumes.c: remove useless kzalloc Btrfs: remove unused code in split_state() Btrfs: remove btrfs_init_path Btrfs: balance_level checks !child after access Btrfs: Avoid using __GFP_HIGHMEM with slab allocator Btrfs: don't clean old snapshots on sync(1) Btrfs: use larger metadata clusters in ssd mode Btrfs: process mount options on mount -o remount, Btrfs: make sure all pending extent operations are complete	2009-02-17 14:19:14 -08:00
Linus Torvalds	3512a79dbc	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: Fix NULL dereference in ext4_ext_migrate()'s error handling ext4: Implement range_cyclic in ext4_da_writepages instead of write_cache_pages ext4: Initialize preallocation list_head's properly ext4: Fix lockdep warning ext4: Fix to read empty directory blocks correctly in 64k jbd2: Avoid possible NULL dereference in jbd2_journal_begin_ordered_truncate() Revert "ext4: wait on all pending commits in ext4_sync_fs()" jbd2: Fix return value of jbd2_journal_start_commit()	2009-02-17 14:05:05 -08:00
Al Viro	1a88b5364b	Fix incomplete __mntput locking Getting this wrong caused WARNING: at fs/namespace.c:636 mntput_no_expire+0xac/0xf2() due to optimistically checking cpu_writer->mnt outside the spinlock. Here's what we really want: * we know that nobody will set cpu_writer->mnt to mnt from now on * all changes to that sucker are done under cpu_writer->lock * we want the laziest equivalent of spin_lock(&cpu_writer->lock); if (likely(cpu_writer->mnt != mnt)) { spin_unlock(&cpu_writer->lock); continue; } /* do stuff */ that would make sure we won't miss earlier setting of ->mnt done by another CPU. Anyway, for now we just move the spin_lock() earlier and move the test into the properly locked region. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Reported-and-tested-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-02-17 14:02:08 -08:00
Eric Sesterhenn	ec32816f94	UBIFS: list usage cleanup Trivial cleanup, list_del(); list_add{,_tail}() is equivalent to list_move{,_tail}(). Semantic patch for coccinelle can be found at www.cccmz.de/~snakebyte/list_move_tail.spatch Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2009-02-17 12:45:22 +02:00
Patrick Ohly	d24fff22d8	net: pass new SIOCSHWTSTAMP through to device drivers Signed-off-by: Patrick Ohly <patrick.ohly@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-15 22:43:38 -08:00
Dan Carpenter	090542641d	ext4: Fix NULL dereference in ext4_ext_migrate()'s error handling This was found through a code checker (http://repo.or.cz/w/smatch.git/). It looks like you might be able to trigger the error by trying to migrate a readonly file system. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-02-15 20:02:19 -05:00
Duane Griffin	2dc6b0d48c	ext4: tighten restrictions on inode flags At the moment there are few restrictions on which flags may be set on which inodes. Specifically DIRSYNC may only be set on directories and IMMUTABLE and APPEND may not be set on links. Tighten that to disallow TOPDIR being set on non-directories and only NODUMP and NOATIME to be set on non-regular file, non-directories. Introduces a flags masking function which masks flags based on mode and use it during inode creation and when flags are set via the ioctl to facilitate future consistency. Signed-off-by: Duane Griffin <duaneg@dghda.com> Acked-by: Andreas Dilger <adilger@sun.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-02-15 18:09:20 -05:00
Duane Griffin	8fa43a81b9	ext4: don't inherit inappropriate inode flags from parent At present INDEX and EXTENTS are the only flags that new ext4 inodes do NOT inherit from their parent. In addition prevent the flags DIRTY, ECOMPR, IMAGIC, TOPDIR, HUGE_FILE and EXT_MIGRATE from being inherited. List inheritable flags explicitly to prevent future flags from accidentally being inherited. This fixes the TOPDIR flag inheritance bug reported at http://bugzilla.kernel.org/show_bug.cgi?id=9866. Signed-off-by: Duane Griffin <duaneg@dghda.com> Acked-by: Andreas Dilger <adilger@sun.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-02-15 18:57:26 -05:00
Pekka Enberg	705895b611	ext4: allocate ->s_blockgroup_lock separately As spotted by kmemtrace, struct ext4_sb_info is 17664 bytes on 64-bit which makes it a very bad fit for SLAB allocators. The culprit of the wasted memory is ->s_blockgroup_lock which can be as big as 16 KB when NR_CPUS >= 32. To fix that, allocate ->s_blockgroup_lock, which fits nicely in an order 2 page in the worst case, separately. This shrinks down struct ext4_sb_info enough to fit a 2 KB slab cache so now we allocate 16 KB + 2 KB instead of 32 KB saving 14 KB of memory. Acked-by: Andreas Dilger <adilger@sun.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-02-15 18:07:52 -05:00
David S. Miller	5e30589521	Merge branch 'master' of /home/davem/src/GIT/linux-2.6/ Conflicts: drivers/net/wireless/iwlwifi/iwl-agn.c drivers/net/wireless/iwlwifi/iwl3945-base.c	2009-02-14 23:12:00 -08:00
Wei Yongjun	3d0518f475	ext4: New rec_len encoding for very large blocksizes The rec_len field in the directory entry is 16 bits, so encoding blocksizes larger than 64k becomes problematic. This patch allows us to support block sizes up to 256k, by using the low 2 bits to extend the range of rec_len to 2**18-1 (since valid rec_len sizes must be a multiple of 4). We use the convention that a rec_len of 0 or 65535 means the filesystem block size, for compatibility with older kernels. It's unlikely we'll see VM pages of up to 256k, but at some point we might find that the Linux VM has been enhanced to support filesystem block sizes larger than the VM page size, at which point it might be useful for some applications to allow very large filesystem block sizes. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-02-14 23:01:36 -05:00
Theodore Ts'o	8bad4597c2	ext4: Use unsigned int for blocksize in dx_make_map() and dx_pack_dirents() Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-02-14 21:46:54 -05:00
Aneesh Kumar K.V	2acf2c261b	ext4: Implement range_cyclic in ext4_da_writepages instead of write_cache_pages With delayed allocation we lock the page in write_cache_pages() and try to build an in memory extent of contiguous blocks. This is needed so that we can get large contiguous blocks request. If range_cyclic mode is enabled, write_cache_pages() will loop back to the 0 index if no I/O has been done yet, and try to start writing from the beginning of the range. That causes an attempt to take the page lock of lower index page while holding the page lock of higher index page, which can cause a dead lock with another writeback thread. The solution is to implement the range_cyclic behavior in ext4_da_writepages() instead. http://bugzilla.kernel.org/show_bug.cgi?id=12579 Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-02-14 10:42:58 -05:00
Aneesh Kumar K.V	d794bf8e09	ext4: Initialize preallocation list_head's properly When creating a new ext4_prealloc_space structure, we have to initialize its list_head pointers before we add them to any prealloc lists. Otherwise, with list debug enabled, we will get list corruption warnings. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-02-14 10:31:16 -05:00
Andres Salomon	efab0b5d3e	[JFFS2] force the jffs2 GC daemon to behave a bit better I've noticed some pretty poor behavior on OLPC machines after bootup, when gdm/X are starting. The GCD monopolizes the scheduler (which in turn means it gets to do more nand i/o), which results in processes taking much much longer than they should to start. As an example, on an OLPC machine going from OFW to a usable X (via auto-login gdm) takes 2m 30s. The majority of this time is consumed by the switch into graphical mode. With this patch, we cut a full 60s off of bootup time. After bootup, things are much snappier as well. Note that we have seen a CRC node error with this patch that causes the machine to fail to boot, but we've also seen that problem without this patch. Signed-off-by: Andres Salomon <dilinger@debian.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2009-02-14 08:59:04 +00:00
Ingo Molnar	1c511f740f	Merge branches 'tracing/ftrace', 'tracing/ring-buffer', 'tracing/sysprof', 'tracing/urgent' and 'linus' into tracing/core	2009-02-13 10:25:18 +01:00
Felix Blyakher	8aa4349ad5	Merge branch 'master' of git://git.kernel.org/pub/scm/fs/xfs/xfs	2009-02-12 15:06:27 -06:00
Felix Blyakher	b747664516	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6	2009-02-12 15:05:33 -06:00
Yan Zheng	2456242530	Btrfs: hold trans_mutex when using btrfs_record_root_in_trans btrfs_record_root_in_trans needs the trans_mutex held to make sure two callers don't race to setup the root in a given transaction. This adds it to all the places that were missing it. Signed-off-by: Yan Zheng <zheng.yan@oracle.com>	2009-02-12 14:14:53 -05:00
Chris Mason	4008c04a07	Btrfs: make a lockdep class for the extent buffer locks Btrfs is currently using spin_lock_nested with a nested value based on the tree depth of the block. But, this doesn't quite work because the max tree depth is bigger than what spin_lock_nested can deal with, and because locks are sometimes taken before the level field is filled in. The solution here is to use lockdep_set_class_and_name instead, and to set the class before unlocking the pages when the block is read from the disk and just after init of a freshly allocated tree block. btrfs_clear_path_blocking is also changed to take the locks in the proper order, and it also makes sure all the locks currently held are properly set to blocking before it tries to retake the spinlocks. Otherwise, lockdep gets upset about bad lock ordering. The lockdep magic came from Peter Zijlstra <peterz@infradead.org> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-02-12 14:09:45 -05:00
Christoph Hellwig	7c8f7af67d	xfs: reject swapext ioctl on swapfiles Swapfiles are magic - I/O is directly initialized by the VM without involving the filesystem. Swapping out extents underneath the VM thus can cause severe problems. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com>	2009-02-12 19:56:00 +01:00
Christoph Hellwig	264307520b	xfs: fix error handling in xfs_log_mount We can't just call xfs_log_unmount_dealloc on any failure because the ail thread which is torn down by xfs_log_unmount_dealloc might not be initialized yet. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Reported-by: Lachlan McIlroy <lachlan@sgi.com>	2009-02-12 19:55:48 +01:00
Julia Lawall	3f3420df50	Btrfs: fs/btrfs/volumes.c: remove useless kzalloc The call to kzalloc is followed by a kmalloc whose result is stored in the same variable. The semantic match that finds the problem is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @r exists@ local idexpression x; statement S; expression E; identifier f,l; position p1,p2; expression ptr != NULL; @@ ( if ((x@p1 = \(kmalloc\|kzalloc\|kcalloc\)(...)) == NULL) S | x@p1 = \(kmalloc\|kzalloc\|kcalloc\)(...); ... if (x == NULL) S ) <... when != x when != if (...) { <+...x...+> } x->f = E ...> ( return \(0\|<+...x...+>\|ptr\); | return@p2 ...; ) @script:python@ p1 << r.p1; p2 << r.p2; @@ print " file: %s kmalloc %s return %s" % (p1[0].file,p1[0].line,p2[0].line) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-02-12 10:16:03 -05:00
Qinghuang Feng	a48ddf08ba	Btrfs: remove unused code in split_state() These two lines are not used, remove them. Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-02-12 14:25:23 -05:00
Jeff Mahoney	e00f730865	Btrfs: remove btrfs_init_path btrfs_init_path was initially used when the path objects were on the stack. Now all the work is done by btrfs_alloc_path and btrfs_init_path isn't required. This patch removes it, and just uses kmem_cache_zalloc to zero out the object. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-02-12 14:11:25 -05:00
Jeff Mahoney	7951f3cefb	Btrfs: balance_level checks !child after access The BUG_ON() is in the wrong spot. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-02-12 10:06:15 -05:00
Yan Zheng	b335b0034e	Btrfs: Avoid using __GFP_HIGHMEM with slab allocator btrfs_releasepage may call kmem_cache_alloc indirectly, and provide same GFP flags it gets to kmem_cache_alloc. So it's possible to use __GFP_HIGHMEM with the slab allocator. Signed-off-by: Yan Zheng <zheng.yan@oracle.com>	2009-02-12 10:06:04 -05:00
Chris Mason	e1df36d2f1	Btrfs: don't clean old snapshots on sync(1) Cleaning old snapshots can make sync(1) somewhat slow, and some users and applications still use it in a global fsync kind of workload. This patch changes btrfs not to clean old snapshots during sync, which is safe from a FS consistency point of view. The major downside is that it makes it difficult to tell when old snapshots have been reaped and the space they were using has been reclaimed. A new ioctl will be added for this purpose instead. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-02-12 09:45:08 -05:00
Chris Mason	536ac8ae86	Btrfs: use larger metadata clusters in ssd mode Larger metadata clusters can significantly improve writeback performance on ssd drives with large erasure blocks. The larger clusters make it more likely a given IO will completely overwrite the ssd block, so it doesn't have to do an internal rmw cycle. On spinning media, larger metadata clusters end up spreading out the metadata more over time, which makes fsck slower, so we don't want this to be the default. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-02-12 09:41:38 -05:00
Chris Mason	b288052e17	Btrfs: process mount options on mount -o remount, Btrfs wasn't parsing any new mount options during remount, making it difficult to set mount options on a root drive. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-02-12 09:37:35 -05:00
Josef Bacik	eb09967089	Btrfs: make sure all pending extent operations are complete There's a slight problem with finish_current_insert: if we set all to 1 and then go through and don't actually skip any of the extents on the pending list, we could exit right after we've added new extents. This is a problem because by inserting the new extents we could have gotten new COW's to happen and such, so we may have some pending updates to do or even more inserts to do after that. So this patch will only exit if we have never skipped any of the extents in the pending list, and we have no extents to insert. This will make sure that all of the pending work is truly done before we return. I've been running with this patch for a few days with all of my other testing and have not seen issues. Thanks, Signed-off-by: Josef Bacik <jbacik@redhat.com>	2009-02-12 09:27:38 -05:00
Kentaro Takeda	f9ce1f1cda	Add in_execve flag into task_struct. This patch allows LSM modules to determine whether current process is in an execve operation or not so that they can behave differently while an execve operation is in progress. This patch is needed by TOMOYO. Please see another patch titled "LSM adapter functions." for backgrounds. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2009-02-12 15:15:03 +11:00
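
The rec_len encoding described in commit 3d0518f475 above (16-bit field, low 2 bits reused to reach block sizes up to 256k, 0 or 65535 meaning "the whole block") can be sketched roughly as follows. This is a user-space illustration, not the patch itself: the function names `rec_len_to_disk` and `rec_len_from_disk` and the macro `REC_LEN_MAX` are hypothetical, on-disk little-endian conversion is left out, and the choice of which sentinel is written for which block size is an assumption the commit message does not spell out.

```c
#include <assert.h>
#include <stdint.h>

/* Largest value the 16-bit on-disk rec_len field can hold (name is hypothetical). */
#define REC_LEN_MAX 65535u

/*
 * Encode an in-memory record length into a 16-bit on-disk value.
 * Assumption: byte-order conversion is ignored in this sketch.
 */
static uint16_t rec_len_to_disk(unsigned int len, unsigned int blocksize)
{
    /* Valid lengths are multiples of 4 and never exceed the block size. */
    assert(len % 4 == 0 && len <= blocksize && blocksize <= (1u << 18));

    if (len < 65536)
        return (uint16_t)len;   /* fits unchanged; the low 2 bits are zero */
    if (len == blocksize)       /* sentinel for "the whole block" (assumed mapping) */
        return (uint16_t)(blocksize == 65536 ? REC_LEN_MAX : 0);
    /* Bits 2..15 stay in place; bits 16..17 move into the unused low 2 bits. */
    return (uint16_t)((len & 65532u) | ((len >> 16) & 3u));
}

/* Decode a 16-bit on-disk value back into a record length. */
static unsigned int rec_len_from_disk(uint16_t dlen, unsigned int blocksize)
{
    unsigned int len = dlen;

    if (len == 0 || len == REC_LEN_MAX)
        return blocksize;       /* both sentinels mean the filesystem block size */
    return (len & 65532u) | ((len & 3u) << 16);
}
```

For example, with a 256k block size a record length of 131076 (0x20004) would be stored as 6 and decoded back to 131076, while a length equal to a 64k block size would be stored as 65535.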


13640 commits