linux-pinenote

Author	SHA1	Message	Date
Dmitry Monakhov	a9e7f44720	ext4: Convert to generic reserved quota's space management. This patch also fixes write vs chown race condition. Acked-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Jan Kara <jack@suse.cz>	2009-12-23 13:33:55 +01:00
Dmitry Monakhov	fd8fbfc170	quota: decouple fs reserved space from quota reservation Currently inode_reservation is managed by the fs itself, and this reservation is transferred on dquot_transfer(). This means that inode_reservation must always be in sync with dquot->dq_dqb.dqb_rsvspace. Otherwise dquot_transfer() will result in incorrect quota (the WARN_ON in dquot_claim_reserved_space() will be triggered). This is not easy because of complex locking order issues, for example http://bugzilla.kernel.org/show_bug.cgi?id=14739 The patch introduces a quota reservation field for each fs-inode (an fs-specific inode is used in order to prevent bloating the generic vfs inode). This reservation is managed by quota code internally, similar to i_blocks/i_bytes, and may not always be in sync with the internal fs reservation. Also perform some code rearrangement: - Unify dquot_alloc_space() and dquot_reserve_space() - Unify dquot_release_reserved_space() and dquot_free_space() - Also this patch adds the missing warning update to release_rsv(): dquot_release_reserved_space() must call flush_warnings() as dquot_free_space() does. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Jan Kara <jack@suse.cz>	2009-12-23 13:33:54 +01:00
Dmitry Monakhov	b462707e7c	Add unlocked version of inode_add_bytes() function Quota code requires an unlocked version of this function. Of course we could just copy-paste the code, but copy-pasting is always evil. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Jan Kara <jack@suse.cz>	2009-12-23 13:33:54 +01:00
Dmitry Monakhov	c459001fa4	ext3: quota macros cleanup [V2] Currently all quota block reservation macros contain a hardcoded "2", aka the MAXQUOTAS value. This is no good because in some places it is not obvious what this digit represents. Let's introduce a new macro with a self-descriptive name. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Jan Kara <jack@suse.cz>	2009-12-23 13:33:54 +01:00
Theodore Ts'o	cc3e1bea5d	ext4, jbd2: Add barriers for file systems with external journals This is a bit complicated because we are trying to optimize when we send barriers to the fs data disk. We could just throw in an extra barrier to the data disk whenever we send a barrier to the journal disk, but that's not always strictly necessary. We only need to send a barrier during a commit when there are data blocks which must be written out due to an inode written in ordered mode, or if fsync() depends on the commit to force data blocks to disk. Finally, before we drop transactions from the beginning of the journal during a checkpoint operation, we need to guarantee that any blocks that were flushed out to the data disk are firmly on the rust platter before we drop the transaction from the journal. Thanks to Oleg Drokin for pointing out this flaw in ext3/ext4. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-12-23 06:52:08 -05:00
Alan Cox	28ba0ec64c	jfs: Fix 32bit build warning loff_t is a type that isn't entirely dependent upon the 32 v 64bit choice Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-12-22 12:27:35 -05:00
Al Viro	5300990c03	Sanitize f_flags helpers * pull ACC_MODE to fs.h; we have several copies all over the place * nightmarish expression calculating f_mode by f_flags deserves a helper too (OPEN_FMODE(flags)) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-12-22 12:27:34 -05:00
Al Viro	482928d59d	Fix f_flags/f_mode in case of lookup_instantiate_filp() from open(pathname, 3) Just set f_flags when shoving struct file into nameidata; don't postpone that until __dentry_open(). do_filp_open() has correct value; lookup_instantiate_filp() doesn't - we lose the difference between O_RDWR and 3 by that point. We still set .intent.open.flags, so no fs code needs to be changed. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-12-22 12:27:34 -05:00
Roland Dreier	628ff7c1d8	anonfd: Allow making anon files read-only It seems a couple places such as arch/ia64/kernel/perfmon.c and drivers/infiniband/core/uverbs_main.c could use anon_inode_getfile() instead of a private pseudo-fs + alloc_file(), if only there were a way to get a read-only file. So provide this by having anon_inode_getfile() create a read-only file if we pass O_RDONLY in flags. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-12-22 12:27:34 -05:00
Arnd Bergmann	ed2617585f	fs/compat_ioctl.c: fix build error when !BLOCK No driver uses SG_SET_TRANSFORM any more in Linux, since the ide-scsi driver was removed in 2.6.29. The compat-ioctl cleanup series moved the handling for this around, which broke building without CONFIG_BLOCK. Just remove the code handling it for compat mode. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-12-22 12:27:33 -05:00
Roland Dreier	385e3ed4f0	alloc_file(): simplify handling of mnt_clone_write() errors When alloc_file() and init_file() were combined, the error handling of mnt_clone_write() was taken into alloc_file() in a somewhat obfuscated way. Since we don't use the error code for anything except warning, we might as well warn directly without an extra variable. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-12-22 12:27:33 -05:00
Sage Weil	7067f797b8	ceph: fix incremental osdmap pg_temp decoding bug An incremental pg_temp wasn't being decoded properly (wrong bound on for loop). Also remove unused local variable, while we're at it. Signed-off-by: Sage Weil <sage@newdream.net>	2009-12-21 16:40:00 -08:00
Sage Weil	30dc6381bb	ceph: fix error paths for corrupt osdmap messages Both osdmap_decode() and osdmap_apply_incremental() should never return NULL. Signed-off-by: Sage Weil <sage@newdream.net>	2009-12-21 16:39:59 -08:00
Sage Weil	5de7bf8afa	ceph: do not drop lease during revalidate We need to hold session s_mutex for __ceph_mdsc_drop_dentry_lease(), which we don't, so skip it. It was purely an optimization. Signed-off-by: Sage Weil <sage@newdream.net>	2009-12-21 16:39:58 -08:00
Sage Weil	c4a29f26d5	ceph: ensure rename target dentry fails revalidation This works around a bug in vfs_rename_dir() that rehashes the target dentry. Ensure such dentries always fail revalidation by timing out the dentry lease and kicking it out of the current directory lease gen. This can be reverted when the vfs bug is fixed. Signed-off-by: Sage Weil <sage@newdream.net>	2009-12-21 16:39:57 -08:00
Yehuda Sadeh	2baba25019	ceph: writeback congestion control Set bdi congestion bit when amount of write data in flight exceeds adjustable threshold. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>	2009-12-21 16:39:56 -08:00
Yehuda Sadeh	dbd646a851	ceph: writepage grabs and releases inode Fixes a deadlock that is triggered due to kswapd, while the page was locked and the iput couldn't tear down the address space. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>	2009-12-21 16:39:56 -08:00
Yehuda Sadeh	169e16ce81	ceph: remove unaccessible code Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>	2009-12-21 16:39:55 -08:00
Sage Weil	06edf046dd	ceph: include link to bdi in debugfs Signed-off-by: Sage Weil <sage@newdream.net>	2009-12-21 16:39:54 -08:00
Sage Weil	e2885f06ce	ceph: make mds ops interruptible Signed-off-by: Sage Weil <sage@newdream.net>	2009-12-21 16:39:53 -08:00
Sage Weil	cf3e5c409b	ceph: plug leak of incoming message during connection fault/close If we explicitly close a connection, or there is a socket error, we need to drop any partially received message. Signed-off-by: Sage Weil <sage@newdream.net>	2009-12-21 16:39:53 -08:00
Sage Weil	9ec7cab14e	ceph: hex dump corrupt server data to KERN_DEBUG Also, print fsid using standard format, NOT hex dump. Signed-off-by: Sage Weil <sage@newdream.net>	2009-12-21 16:39:52 -08:00
Yehuda Sadeh	93c20d98c2	ceph: fix msgpool reservation leak Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>	2009-12-21 16:39:51 -08:00
Sage Weil	b3d1dbbdd5	ceph: don't save sent messages on lossy connections For lossy connections we drop all state on socket errors, so there is no reason to keep sent ceph_msg's around. Signed-off-by: Sage Weil <sage@newdream.net>	2009-12-21 16:39:50 -08:00
Sage Weil	92ac41d0a4	ceph: detect lossy state of connection The server indicates whether a connection is lossy; set our LOSSYTX bit appropriately. Do not set lossy bit on outgoing connections. Signed-off-by: Sage Weil <sage@newdream.net>	2009-12-21 16:39:49 -08:00
Sage Weil	5e095e8b40	ceph: plug msg leak in con_fault Signed-off-by: Sage Weil <sage@newdream.net>	2009-12-21 16:39:49 -08:00
Sage Weil	c86a2930cc	ceph: carry explicit msg reference for currently sending message Carry a ceph_msg reference for connection->out_msg. This will allow us to make out_sent optional. Signed-off-by: Sage Weil <sage@newdream.net>	2009-12-21 16:39:38 -08:00
J. Bruce Fields	3d354cbc43	nfsd: fix "insecure" export option A typo in `12045a6ee9` "nfsd: let "insecure" flag vary by pseudoflavor" reversed the sense of the "insecure" flag. Reported-by: Michael Guntsche <mike@it-loops.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-20 20:19:51 -08:00
J. Bruce Fields	f69ac2f5a3	nfsd: fix "insecure" export option A typo in `12045a6ee9` "nfsd: let "insecure" flag vary by pseudoflavor" reversed the sense of the "insecure" flag. Reported-by: Michael Guntsche <mike@it-loops.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-20 10:22:58 -05:00
Linus Torvalds	aac3d39693	Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (25 commits) sched: Fix broken assertion sched: Assert task state bits at build time sched: Update task_state_array with new states sched: Add missing state chars to TASK_STATE_TO_CHAR_STR sched: Move TASK_STATE_TO_CHAR_STR near the TASK_state bits sched: Teach might_sleep() about preemptible RCU sched: Make warning less noisy sched: Simplify set_task_cpu() sched: Remove the cfs_rq dependency from set_task_cpu() sched: Add pre and post wakeup hooks sched: Move kthread_bind() back to kthread.c sched: Fix select_task_rq() vs hotplug issues sched: Fix sched_exec() balancing sched: Ensure set_task_cpu() is never called on blocked tasks sched: Use TASK_WAKING for fork wakups sched: Select_task_rq_fair() must honour SD_LOAD_BALANCE sched: Fix task_hot() test order sched: Fix set_cpu_active() in cpu_down() sched: Mark boot-cpu active before smp_init() sched: Fix cpu_clock() in NMIs, on !CONFIG_HAVE_UNSTABLE_SCHED_CLOCK ...	2009-12-19 09:47:49 -08:00
Tao Ma	10cf1a02f4	ocfs2: Set i_nlink properly during reflink. We create a file in the orphan dir for reflink so that if there is any error, we don't create any wrong dentry in the dir. But actually the file in the orphan dir should have i_nlink = 0 so that it can be replayed and freed successfully. This patch first sets i_nlink to 0 when creating the file in the orphan dir and then sets it to 1 (reflink now only works for regular files) when we move it to the dest dir. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-12-18 13:32:28 -08:00
Tao Ma	c7d260afcb	ocfs2: Add reflinked file's inode to inode hash earlier. We used to add the reflinked file's inode to the inode hash when we added it to the dest dir. But actually there is a race. Consider the following sequence. 1. reflink happens and creates the inode in the orphan dir. 2. The reflink thread is scheduled out because of some io. 3. Recovery begins to work and calls ocfs2_recover_orphans. It calls ocfs2_iget and gets a new inode with i_count = 1. It then calls iput and deletes the inode. The buffer's uptodate state is cleared. This patch moves insert_inode_hash to the create function so that the inode can be found by step 3 and prevented from being deleted because i_count > 1. This resolves the bug http://oss.oracle.com/bugzilla/show_bug.cgi?id=1183. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-12-18 13:32:20 -08:00
Tristan Ye	55f4946ed2	Ocfs2: Should ocfs2 support fiemap for S_IFDIR inode? Let userspace have a chance to get the extent info of a directory just like extN did. Signed-off-by: Tristan Ye <tristan.ye@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-12-17 21:21:32 -08:00
Sunil Mushran	faf8b70f79	ocfs2: Use FIEMAP_EXTENT_SHARED Adds FIEMAP_EXTENT_SHARED flag to refcounted extents. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-12-17 20:55:59 -08:00
Coly Li	9365454016	ocfs2: replace u8 by __u8 in ocfs2_fs.h This patch replaces data type 'u8' with '__u8', which follows the coding style of ocfs2_fs.h and is portable to user space for ocfs2-tools. Signed-off-by: Coly Li <coly.li@suse.de> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-12-17 20:55:54 -08:00
Coly Li	3a05d7961e	ocfs2: explicit declare uninitialized var in user_cluster_connect() This patch explicitly declares an uninitialized local variable in user_cluster_connect(), to remove a compiling warning. Signed-off-by: Coly Li <coly.li@suse.de> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-12-17 20:55:52 -08:00
Linus Torvalds	7c508e50be	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: make sure fallocate properly starts a transaction Btrfs: make metadata chunks smaller Btrfs: Show discard option in /proc/mounts Btrfs: deny sys_link across subvolumes. Btrfs: fail mount on bad mount options Btrfs: don't add extent 0 to the free space cache v2 Btrfs: Fix per root used space accounting Btrfs: Fix btrfs_drop_extent_cache for skip pinned case Btrfs: Add delayed iput Btrfs: Pass transaction handle to security and ACL initialization functions Btrfs: Make truncate(2) more ENOSPC friendly Btrfs: Make fallocate(2) more ENOSPC friendly Btrfs: Avoid orphan inodes cleanup during committing transaction Btrfs: Avoid orphan inodes cleanup while replaying log Btrfs: Fix disk_i_size update corner case Btrfs: Rewrite btrfs_drop_extents Btrfs: Add btrfs_duplicate_item Btrfs: Avoid superfluous tree-log writeout	2009-12-17 16:01:03 -08:00
Masami Hiramatsu	f6151dfea2	mm: introduce coredump parameter structure Introduce coredump parameter data structure (struct coredump_params) to simplify binfmt->core_dump() arguments. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Suggested-by: Ingo Molnar <mingo@elte.hu> Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Roland McGrath <roland@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-17 15:45:31 -08:00
Oleg Nesterov	9cd80bbb07	do_wait() optimization: do not place sub-threads on task_struct->children list Thanks to Roland who pointed out de_thread() issues. Currently we add sub-threads to ->real_parent->children list. This buys nothing but slows down do_wait(). With this patch ->children contains only main threads (group leaders). The only complication is that forget_original_parent() should iterate over sub-threads by hand, and de_thread() needs another list_replace() when it changes ->group_leader. Henceforth do_wait_thread() can never see task_detached() && !EXIT_DEAD tasks, we can remove this check (and we can unify do_wait_thread() and ptrace_do_wait()). This change can confuse the optimistic search in mm_update_next_owner(), but this is fixable and minor. Perhaps badness() and oom_kill_process() should be updated, but they should be fixed in any case. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Roland McGrath <roland@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Ratan Nalumasu <rnalumasu@gmail.com> Cc: Vitaly Mayatskikh <vmayatsk@redhat.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-17 15:45:31 -08:00
Mike Frysinger	0f67b0b039	nommu: ramfs: remove unused local var Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: David Howells <dhowells@redhat.com> Acked-by: Greg Ungerer <gerg@snapgear.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-17 15:45:31 -08:00
Jan Kara	ec8e2f7466	reiserfs: truncate blocks not used by a write It can happen that write does not use all the blocks allocated in write_begin either because of some filesystem error (like ENOSPC) or because page with data to write has been removed from memory. We truncate these blocks so that we don't have dangling blocks beyond i_size. Cc: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-17 15:45:30 -08:00
Linus Torvalds	b6e3224fb2	Revert "task_struct: make journal_info conditional" This reverts commit `e4c570c4cb`, as requested by Alexey: "I think I gave a good enough arguments to not merge it. To iterate: * patch makes impossible to start using ext3 on EXT3_FS=n kernels without reboot. * this is done only for one pointer on task_struct" None of config options which define task_struct are tristate directly or effectively." Requested-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-17 13:23:24 -08:00
Chris Mason	7a5d24b106	Merge branch 'master' of ssh://master.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable into for-linus	2009-12-17 16:01:41 -05:00
Linus Torvalds	a2770d86b3	Revert "fix mismerge with Trond's stuff (create_mnt_ns() export is gone now)" This reverts commit `e9496ff46a`. Quoth Al: "it's dependent on a lot of other stuff not currently in mainline and badly broken with current fs/namespace.c. Sorry, badly out-of-order cherry-pick from old queue. PS: there's a large pending series reworking the refcounting and lifetime rules for vfsmounts that will, among other things, allow to rip a subtree away _without_ dissolving connections in it, to be garbage-collected when all active references are gone. It's considerably saner wrt "is the subtree busy" logics, but it's nowhere near being ready for merge at the moment; this changeset is one of the things becoming possible with that sucker, but it certainly shouldn't have been picked during this cycle. My apologies..." Noticed-by: Eric Paris <eparis@redhat.com> Requested-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-17 12:51:05 -08:00
Chris Mason	3a1abec9f6	Btrfs: make sure fallocate properly starts a transaction The recent patch to make fallocate enospc friendly would send down a NULL trans handle to the allocator. This moves the transaction start to properly fix things. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-12-17 15:47:17 -05:00
Chris Mason	ebfee3d71d	Merge branch btrfs-master into for-linus Conflicts: fs/btrfs/acl.c	2009-12-17 15:02:22 -05:00
Josef Bacik	83d3c9696f	Btrfs: make metadata chunks smaller This patch makes us a bit less zealous about making sure we have enough free metadata space by paring down the size of new metadata chunks to 256mb instead of 1gb. Also, we used to try to allocate metadata chunks when allocating data, but that sort of thing is done elsewhere now so we can just remove it. With my -ENOSPC test I used to have 3gb reserved for metadata out of 75gb, now I have 1.7gb. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-12-17 12:33:38 -05:00
Matthew Wilcox	20a5239a5d	Btrfs: Show discard option in /proc/mounts Christoph's patch `e244a0aeb6` doesn't display the discard option in /proc/mounts, leading to some confusion for me. Here's the missing bit. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-12-17 12:33:37 -05:00
TARUISI Hiroaki	4a8be425a8	Btrfs: deny sys_link across subvolumes. I rebased Christian Parpart's patch to deny hard links across subvolumes. The original patch also modifies btrfs_rename, but I excluded that because we can move across subvolumes now and it causes no problem. ----------------- Hard links across subvolumes should not be allowed in Btrfs. btrfs_link checks that the root of the 'to' directory is the same as the root of the 'from' file. If they are not the same, btrfs_link returns -EPERM. Signed-off-by: TARUISI Hiroaki <taruishi.hiroak@jp.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-12-17 12:33:37 -05:00
Sage Weil	a7a3f7cadd	Btrfs: fail mount on bad mount options We shouldn't silently ignore unrecognized options. Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-12-17 12:33:36 -05:00


19,389 commits