Commit graph

34,180 commits

Author SHA1 Message Date
Jeff Liu
d6394b5900 ocfs2: fix null pointer dereference in ocfs2_dir_foreach_blk_id()
Fix a NULL pointer deference while removing an empty directory, which
was introduced by commit 3704412bdb ("[readdir] convert ocfs2").

  BUG: unable to handle kernel NULL pointer dereference at (null)
  IP: [<(null)>]           (null)
  PGD 6da85067 PUD 6da89067 PMD 0
  Oops: 0010 [#1] SMP
  CPU: 0 PID: 6564 Comm: rmdir Tainted: G           O 3.11.0-rc1 #4
  RIP: 0010:[<0000000000000000>]  [<          (null)>]           (null)
  Call Trace:
    ocfs2_dir_foreach+0x49/0x50 [ocfs2]
    ocfs2_empty_dir+0x12c/0x3e0 [ocfs2]
    ocfs2_unlink+0x56e/0xc10 [ocfs2]
    vfs_rmdir+0xd5/0x140
    do_rmdir+0x1cb/0x1e0
    SyS_rmdir+0x16/0x20
    system_call_fastpath+0x16/0x1b
  Code:  Bad RIP value.
  RIP  [<          (null)>]           (null)
  RSP <ffff88006daddc10>
  CR2: 0000000000000000

[dan.carpenter@oracle.com: fix pointer math]
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reported-by: David Weber <wb@munzinger.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-13 17:57:49 -07:00
Tiger Yang
c7dd3392ad ocfs2: fix NULL pointer dereference in ocfs2_duplicate_clusters_by_page
Since ocfs2_cow_file_pos will invoke ocfs2_refcount_icow with a NULL as
the struct file pointer, it finally result in a null pointer dereference
in ocfs2_duplicate_clusters_by_page.

This patch replace file pointer with inode pointer in
cow_duplicate_clusters to fix this issue.

[jeff.liu@oracle.com: rebased patch against linux-next tree]
Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Acked-by: Tao Ma <tm@tao.ma>
Tested-by: David Weber <wb@munzinger.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-13 17:57:49 -07:00
Jie Liu
6115ea2884 ocfs2: Revert 40bd62e to avoid regression in extended allocation
Revert commit 40bd62eb7f ("fs/ocfs2/journal.h: add bits_wanted while
calculating credits in ocfs2_calc_extend_credits").

Unfortunately this change broke fallocate even if there is insufficient
disk space for the preallocation, which is a serious problem.

  # df -h
  /dev/sda8        22G  1.2G   21G   6% /ocfs2
  # fallocate -o 0 -l 200M /ocfs2/testfile
  fallocate: /ocfs2/test: fallocate failed: No space left on device

and a kernel warning:

  CPU: 3 PID: 3656 Comm: fallocate Tainted: G        W  O 3.11.0-rc3 #2
  Call Trace:
    dump_stack+0x77/0x9e
    warn_slowpath_common+0xc4/0x110
    warn_slowpath_null+0x2a/0x40
    start_this_handle+0x6c/0x640 [jbd2]
    jbd2__journal_start+0x138/0x300 [jbd2]
    jbd2_journal_start+0x23/0x30 [jbd2]
    ocfs2_start_trans+0x166/0x300 [ocfs2]
    __ocfs2_extend_allocation+0x38f/0xdb0 [ocfs2]
    ocfs2_allocate_unwritten_extents+0x3c9/0x520
    __ocfs2_change_file_space+0x5e0/0xa60 [ocfs2]
    ocfs2_fallocate+0xb1/0xe0 [ocfs2]
    do_fallocate+0x1cb/0x220
    SyS_fallocate+0x6f/0xb0
    system_call_fastpath+0x16/0x1b
  JBD2: fallocate wants too many credits (51216 > 4381)

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Cc: Goldwyn Rodrigues <rgoldwyn@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-13 17:57:49 -07:00
Michal Hocko
b610ded719 hugetlb: fix lockdep splat caused by pmd sharing
Dave has reported the following lockdep splat:

  =================================
  [ INFO: inconsistent lock state ]
  3.11.0-rc1+ #9 Not tainted
  ---------------------------------
  inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
  kswapd0/49 [HC0[0]:SC0[0]:HE1:SE1] takes:
   (&mapping->i_mmap_mutex){+.+.?.}, at: [<c114971b>] page_referenced+0x87/0x5e3
  {RECLAIM_FS-ON-W} state was registered at:
     mark_held_locks+0x81/0xe7
     lockdep_trace_alloc+0x5e/0xbc
     __alloc_pages_nodemask+0x8b/0x9b6
     __get_free_pages+0x20/0x31
     get_zeroed_page+0x12/0x14
     __pmd_alloc+0x1c/0x6b
     huge_pmd_share+0x265/0x283
     huge_pte_alloc+0x5d/0x71
     hugetlb_fault+0x7c/0x64a
     handle_mm_fault+0x255/0x299
     __do_page_fault+0x142/0x55c
     do_page_fault+0xd/0x16
     error_code+0x6c/0x74
  irq event stamp: 3136917
  hardirqs last  enabled at (3136917):  _raw_spin_unlock_irq+0x27/0x50
  hardirqs last disabled at (3136916):  _raw_spin_lock_irq+0x15/0x78
  softirqs last  enabled at (3136180):  __do_softirq+0x137/0x30f
  softirqs last disabled at (3136175):  irq_exit+0xa8/0xaa
  other info that might help us debug this:
   Possible unsafe locking scenario:
         CPU0
         ----
    lock(&mapping->i_mmap_mutex);
    <Interrupt>
      lock(&mapping->i_mmap_mutex);

  *** DEADLOCK ***
  no locks held by kswapd0/49.

  stack backtrace:
  CPU: 1 PID: 49 Comm: kswapd0 Not tainted 3.11.0-rc1+ #9
  Hardware name: Dell Inc.                 Precision WorkStation 490    /0DT031, BIOS A08 04/25/2008
  Call Trace:
    dump_stack+0x4b/0x79
    print_usage_bug+0x1d9/0x1e3
    mark_lock+0x1e0/0x261
    __lock_acquire+0x623/0x17f2
    lock_acquire+0x7d/0x195
    mutex_lock_nested+0x6c/0x3a7
    page_referenced+0x87/0x5e3
    shrink_page_list+0x3d9/0x947
    shrink_inactive_list+0x155/0x4cb
    shrink_lruvec+0x300/0x5ce
    shrink_zone+0x53/0x14e
    kswapd+0x517/0xa75
    kthread+0xa8/0xaa
    ret_from_kernel_thread+0x1b/0x28

which is a false positive caused by hugetlb pmd sharing code which
allocates a new pmd from withing mapping->i_mmap_mutex.  If this
allocation causes reclaim then the lockdep detector complains that we
might self-deadlock.

This is not correct though, because hugetlb pages are not reclaimable so
their mapping will be never touched from the reclaim path.

The patch tells lockup detector that hugetlb i_mmap_mutex is special by
assigning it a separate lockdep class so it won't report possible
deadlocks on unrelated mappings.

[peterz@infradead.org: comment for annotation]
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-13 17:57:48 -07:00
Cyrill Gorcunov
41bb3476b3 mm: save soft-dirty bits on file pages
Andy reported that if file page get reclaimed we lose the soft-dirty bit
if it was there, so save _PAGE_BIT_SOFT_DIRTY bit when page address get
encoded into pte entry.  Thus when #pf happens on such non-present pte
we can restore it back.

Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-13 17:57:48 -07:00
Cyrill Gorcunov
179ef71cbc mm: save soft-dirty bits on swapped pages
Andy Lutomirski reported that if a page with _PAGE_SOFT_DIRTY bit set
get swapped out, the bit is getting lost and no longer available when
pte read back.

To resolve this we introduce _PTE_SWP_SOFT_DIRTY bit which is saved in
pte entry for the page being swapped out.  When such page is to be read
back from a swap cache we check for bit presence and if it's there we
clear it and restore the former _PAGE_SOFT_DIRTY bit back.

One of the problem was to find a place in pte entry where we can save
the _PTE_SWP_SOFT_DIRTY bit while page is in swap.  The _PAGE_PSE was
chosen for that, it doesn't intersect with swap entry format stored in
pte.

Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-13 17:57:47 -07:00
Dave Chinner
4bb928cdb9 xfs: split the CIL lock
The xc_cil_lock is used for two purposes - to protect the CIL
itself, and to protect the push/commit state and lists. These are
two logically separate structures and operations, so can have their
own locks. This means that pushing on the CIL and the commit wait
ordering won't contend for a lock with other transactions that are
completing concurrently. As the CIL insertion is the hottest path
throught eh CIL, this is a big win.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-13 16:21:21 -05:00
Dave Chinner
991aaf65ff xfs: Combine CIL insert and prepare passes
Now that all the log item preparation and formatting is done under
the CIL lock, we can get rid of the intermediate log vector chain
used to track items to be inserted into the CIL.

We can already find all the items to be committed from the
transaction handle, so as long as we attach the log vectors to the
item before we insert the items into the CIL, we don't need to
create a log vector chain to pass around.

This means we can move all the item insertion code into and optimise
it into a pair of simple passes across all the items in the
transaction. The first pass does the formatting and accounting, the
second inserts them all into the CIL.

We keep this two pass split so that we can separate the CIL
insertion - which must be done under the CIL spinlock - from the
formatting. We could insert each item into the CIL with a single
pass, but that massively increases the number of times we have to
grab the CIL spinlock. It is much more efficient (and hence
scalable) to do a batch operation and insert all objects in a single
lock grab.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-13 16:20:09 -05:00
Dave Chinner
f5baac354d xfs: avoid CIL allocation during insert
Now that we have the size of the log vector that has been allocated,
we can determine if we need to allocate a new log vector for
formatting and insertion. We only need to allocate a new vector if
it won't fit into the existing buffer.

However, we need to hold the CIL context lock while we do this so
that we can't race with a push draining the currently queued log
vectors. It is safe to do this as long as we do GFP_NOFS allocation
to avoid avoid memory allocation recursing into the filesystem.
Hence we can safely overwrite the existing log vector on the CIL if
it is large enough to hold all the dirty regions of the current
item.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-13 16:19:03 -05:00
Dave Chinner
7492c5b42d xfs: Reduce allocations during CIL insertion
Now that we have the size of the object before the formatting pass
is called, we can allocation the log vector and it's buffer in a
single allocation rather than two separate allocations.

Store the size of the allocated buffer in the log vector so that
we potentially avoid allocation for future modifications of the
object.

While touching this code, remove the IOP_FORMAT definition.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-13 16:12:30 -05:00
Dave Chinner
166d13688a xfs: return log item size in IOP_SIZE
To begin optimising the CIL commit process, we need to have IOP_SIZE
return both the number of vectors and the size of the data pointed
to by the vectors. This enables us to calculate the size ofthe
memory allocation needed before the formatting step and reduces the
number of memory allocations per item by one.

While there, kill the IOP_SIZE macro.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-13 16:10:21 -05:00
Eric Sandeen
050a1952c3 xfs:free bp in xlog_find_tail() error path
xlog_find_tail() currently leaks a bp on one error path.

There is no error target, so manually free the bp before
returning the error.

Found by Coverity.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-13 15:49:51 -05:00
Eric Sandeen
5d0a654974 xfs: free bp in xlog_find_zeroed() error path
xlog_find_zeroed() currently leaks a bp on one error path.

Using the bp_err: target resolves this.

Found by Coverity.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-13 15:48:51 -05:00
Eric Sandeen
6dd93e9e5e xfs: avoid double-free in xfs_attr_node_addname
xfs_attr_node_addname()'s error handling tests whether it
should free "state" in the out: error handling label:

out:
        if (state)
                xfs_da_state_free(state);

but an earlier free doesn't set state to NULL afterwards; this
could lead to a double free.  Fix it by setting state to NULL
after it's freed.

This was found by Coverity.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-13 15:48:01 -05:00
Jie Liu
2c2bcc0735 xfs: call roundup_64() to calculate the min_logblks
Replace roundup() with roundup_64() as we calculate min_logblks
with 64-bit divisions.  Hence, call roundup() will cause the
following error while compiling a 32-bit kernel:

fs/built-in.o: In function `xfs_log_calc_minimum_size':
fs/xfs/xfs_log_rlimit.c:140: undefined reference to `__udivdi3'

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-13 14:19:11 -05:00
Jie Liu
3e7b91cf8c xfs: Validate log space at mount time
Validate log space during log mount stage, the underlying function
will drop a warning message via syslog in critical level if the log
space is too small or too large.

[ dchinner: For CRC enable filesystems, abort the mounting of the
filesystem as mkfs should never make a log too small for the given
filesystem configuration. ]

[ dchinner: make a note of the fact that the log size limits in
block counts are in units of filesystem blocks, not basic blocks. ]

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 17:50:35 -05:00
Jie Liu
5a96a94547 xfs: Add xfs_log_rlimit.c
Add source files for xfs_log_rlimit.c The new file is used for log
size calculations and validation shared with userspace.

[dchinner: xfs_log_calc_max_attrsetm_res() does not modify the
tr_attrsetm reservation, just calculates the maximum. ]

[dchinner: rework loop in xfs_log_get_max_trans_res() ]

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 17:49:38 -05:00
Jie Liu
e773fc934f xfs: Refactor xfs_ticket_alloc() to extract a new helper
Refactor xlog_ticket_alloc() to extract a new helper, i.e.
xfs_log_calc_unit_res().

This helper would be used to calculate the total log reservation
size by adding extra log operation/transation headers for a new
log ticket.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 17:49:02 -05:00
Jie Liu
f749278c5a xfs: Get rid of all XFS_XXX_LOG_RES() macro
Get rid of all XFS_XXX_LOG_RES() macros since they are obsoleted now.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 17:48:08 -05:00
Jie Liu
3d3c8b5222 xfs: refactor xfs_trans_reserve() interface
With the new xfs_trans_res structure has been introduced, the log
reservation size, log count as well as log flags are pre-initialized
at mount time.  So it's time to refine xfs_trans_reserve() interface
to be more neat.

Also, introduce a new helper M_RES() to return a pointer to the
mp->m_resv structure to simplify the input.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 17:47:34 -05:00
Jie Liu
783cb6d172 xfs: Make writeid transaction use tr_writeid
tr_writeid is defined at mp->m_resv structure, however, it does not
really being used when it should be..

This patch changes it to tr_writeid to fetch the correct log
reservation size.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 17:46:59 -05:00
Jie Liu
20996c9326 xfs: Introduce tr_fsyncts to m_reservation
A preparation step.

For now fsync_ts transaction use the pre-calculated log reservation
size of tr_swrite.  This patch introduce a new item tr_fsyncts to
mp->m_reservations structure so that we can fetch the log
reservation value for it in a same manner to others.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 17:46:29 -05:00
Jie Liu
0eadd10288 xfs: Introduce a new structure to hold transaction reservation items
Introduce a new structure xfs_trans_res to hold transaction
reservation item info per log ticket.

We also need to improve xfs_trans_resv_calc() by initializing the
log count as well as log flags for permanent log reservation.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 17:45:49 -05:00
Dave Chinner
9356fe22af xfs: make struct xfs_perag kernel only
The struct xfs_perag has many kernel-only definitions in it,
requiring a __KERNEL__ guard so userspace can use it to. Move it to
xfs_mount.h so that it it kernel-only, and let userspace redefine
it's own version of the structure containing only what it needs.
This gets rid of another __KERNEL__ check in the XFS header files.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 17:44:36 -05:00
Dave Chinner
4f3d71f68b xfs: move kernel specific type definitions to xfs.h
xfs_types.h is shared with userspace, so having kernel specific
types defined in it is problematic. Move all the kernel specific
defines to xfs_linux.h so we can remove the __KERNEL__ guards from
xfs_types.h.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 17:04:08 -05:00
Linus Torvalds
584d88b2cd Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull CIFS fixes from Steve French:
 "A set of small cifs fixes, including 3 relating to symlink handling"

* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: don't instantiate new dentries in readdir for inodes that need to be revalidated immediately
  cifs: set sb->s_d_op before calling d_make_root()
  cifs: fix bad error handling in crypto code
  cifs: file: initialize oparms.reconnect before using it
  Do not attempt to do cifs operations reading symlinks with SMB2
  cifs: extend the buffer length enought for sprintf() using
2013-08-12 15:02:53 -07:00
Linus Torvalds
fd4f35d0fa A number of miscellaneous ext4 bugs fixes for v3.11, including a fix
so that if ext4 is built as a module, to allow it to be unloaded.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABCAAGBQJSCOS0AAoJENNvdpvBGATwOYkQALgOTX2ma7hgCr91ldg3Q69e
 bejRm2Prdjs3HXa3MwC5yZ0i2nDXxTjMXidnVcD2QgLwS6rPGQQfrGhQ1OtJqLxA
 Gzf6mHIWIaRG9/JtTkzp3ucONbhc13LnaJd9B2bHASlq4+5aJhlziblwapMSvVbI
 YixhejVREmXdZ/typvnQOsoPfXzH9NLOTY/asT9IcZK6flhczNIlMC0/zRZQw0+I
 pDK/akmdOLpkd57UvJcwdcGIgcioSrg+JXk++EsDe82v2bpxleG2IdLbVZ76whIS
 A3FHEJvUfiFPj+GhKMQJyI1sa1ZGmD7nH8n2Yf6TVqZ4jgFx4ox9d83c0PpSLn0+
 VTZD601lfAoThUwKRTk45FUI4siZ1zBP13cbTH1DJXvGVllehhC9zBCezlNma4vr
 cI+K3d8VBHV7bKSRSgRVcB2pY+bJE1qP5XJXGnfDhjEiecNgXFRm0i2UtYSEl6R7
 aQGAdfjxp6tf+gme6c13zFK43FEJ9/DMaC8l0eamiQyH5U4OhOD8Q4ExnHjrfN69
 V3yZc4AZq46J9+2H/Jc457NYiFH2YK74jpAcpsVV0fXKFDR8Ss+rKXmfB+TGS45f
 Q1eHbKLuA3fHzHfwJCozqnUBxQO41M4vEd/ktqenwsKzeOw0qH3qynxEUd6xgka5
 1pcnJZnSGbm1d8pwYLMc
 =NuDG
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull more ext4 bugfixes from Ted Ts'o:
 "A number of miscellaneous ext4 bugs fixes for v3.11, including a fix
  so that if ext4 is built as a module, to allow it to be unloaded"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: flush the extent status cache during EXT4_IOC_SWAP_BOOT
  ext4: fix mount/remount error messages for incompatible mount options
  ext4: allow the mount options nodelalloc and data=journal
2013-08-12 15:00:40 -07:00
Dave Chinner
9b90b0d9da xfs: xfs_filestreams.h doesn't need __KERNEL__
Because it is only used within the kernel.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 17:00:11 -05:00
Dave Chinner
cb9eabff58 xfs: remove __KERNEL__ check from xfs_dir2_leaf.c
It's actually an ifndef section, which means it is only included in
userspace. however, it's deep within the libxfs code, so it's
unlikely that the condition checked in userspace can actually occur
(search an empty leaf) through the libxfs interfaces. i.e. if it can
happen in usrspace, it can happen in the kernel, so remove it from
userspace too....

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:59:14 -05:00
Dave Chinner
b49a0c1883 xfs: remove __KERNEL__ from debug code
There is no reason the remaining kernel-only debug code needs to
remain kernel-only. Kill the __KERNEL__ part of the defines, and let
userspace handle the debug code appropriately.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:58:37 -05:00
Dave Chinner
63d20d6e36 xfs: kill __KERNEL__ check for debug code in allocation code
Userspace running debug builds is relatively rare, so there's need
to special case the allocation algorithm code coverage debug switch.
As it is, userspace defines random numbers to 0, so  invert the
logic of the switch so it is effectively a no-op in userspace.
This kills another couple of __KERNEL__ users.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:57:51 -05:00
Dave Chinner
94b406091b xfs: don't special case shared superblock mounts
Neither kernel or userspace support shared read-only mounts, so
don't bother special casing the support check to be different
between kernel and userspace. The same check can be used as neither
like it...

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:57:16 -05:00
Dave Chinner
a133d952b4 xfs: consolidate extent swap code
So we don't need xfs_dfrag.h in userspace anymore, move the extent
swap ioctl structure definition to xfs_fs.h where most of the other
ioctl structure definitions are.

Now that we don't need separate files for extent swapping, separate
the basic file descriptor checking code to xfs_ioctl.c, and the code
that does the extent swap operation to xfs_bmap_util.c.  This
cleanly separates the user interface code from the physical
mechanism used to do the extent swap.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:56:06 -05:00
Dave Chinner
e546cb79ef xfs: consolidate xfs_utils.c
There are a few small helper functions in xfs_util, all related to
xfs_inode modifications. Move them all to xfs_inode.c so all
xfs_inode operations are consiolidated in the one place.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:55:17 -05:00
Dave Chinner
f6bba2017a xfs: consolidate xfs_rename.c
Move the rename code to xfs_inode.c to continue consolidating
all the kernel xfs_inode operations in the one place.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:54:09 -05:00
Dave Chinner
c24b5dfadc xfs: kill xfs_vnodeops.[ch]
Now we have xfs_inode.c for holding kernel-only XFS inode
operations, move all the inode operations from xfs_vnodeops.c to
this new file as it holds another set of kernel-only inode
operations. The name of this file traces back to the days of Irix
and it's vnodes which we don't have anymore.

Essentially this move consolidates the inode locking functions
and a bunch of XFS inode operations into the one file. Eventually
the high level functions will be merged into the VFS interface
functions in xfs_iops.c.

This leaves only internal preallocation, EOF block manipulation and
hole punching functions in vnodeops.c. Move these to xfs_bmap_util.c
where we are already consolidating various in-kernel physical extent
manipulation and querying functions.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:53:39 -05:00
Dave Chinner
836a94ad59 xfs: fix issues that cause userspace warnings
Some of the code shared with userspace causes compilation warnings
from things turned off in the kernel code, such as differences in
variable signedness. Fix those issues.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:52:54 -05:00
Dave Chinner
c5c249b424 xfs: minor cleanups
These come from syncing the shared userspace and kernel code. Small
whitespace and trivial cleanups.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:46:08 -05:00
Dave Chinner
6898811459 xfs: create xfs_bmap_util.[ch]
There is a bunch of code in xfs_bmap.c that is kernel specific and
not shared with userspace. To minimise the difference between the
kernel and userspace code, shift this unshared code to
xfs_bmap_util.c, and the declarations to xfs_bmap_util.h.

The biggest issue here is xfs_bmap_finish() - userspace has it's own
definition of this function, and so we need to move it out of
xfs_bmap.[ch]. This means several other files need to include
xfs_bmap_util.h as well.

It also introduces and interesting dance for the stack switching
code in xfs_bmapi_allocate(). The stack switching/workqueue code is
actually moved to xfs_bmap_util.c, so that userspace can simply use
a #define in a header file to connect the dots without needing to
know about the stack switch code at all.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:45:17 -05:00
Dave Chinner
ff55068c20 xfs: introduce xfs_sb.c for sharing with libxfs
xfs_mount.c is shared with userspace, but the only functions that
are shared are to do with physical superblock manipulations. This
means that less than 25% of the xfs_mount.c code is actually shared
with userspace. Move all the superblock functions to xfs_sb.c and
share that instead with libxfs.

Note that this will leave all the in-core transaction related
superblock counter modifications in xfs_mount.c as none of that is
shared with userspace. With a few more small changes, xfs_mount.h
won't need to be shared with userspace anymore, either.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:44:11 -05:00
Dave Chinner
1fb7e48db6 xfs: split out the remote symlink handling
The remote symlink format definition and manipulation needs to be
shared with userspace, but the in-kernel interfaces do not. Split
the remote symlink format handling out into xfs_symlink_remote.[ch]
fo it can easily be shared with userspace.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:43:38 -05:00
Dave Chinner
fde2227ce1 xfs: split out attribute fork truncation code into separate file
The attribute inactivation code is not used by userspace, so like
the attribute listing, split it out into a separate file to minimise
the differences between the filesystem shared with libxfs in
userspace.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:42:30 -05:00
Dave Chinner
abec5f2bf9 xfs: split out attribute listing code into separate file
The attribute listing code is not used by userspace, so like the
directory readdir code, split it out into a separate file to
minimise the differences between the filesystem shared with libxfs
in userspace.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:41:29 -05:00
Dave Chinner
2b9ab5ab9c xfs: reshuffle dir2 definitions around for userspace
Many of the definitions within xfs_dir2_priv.h are needed in
userspace outside libxfs. Definitions within xfs_dir2_priv.h are
wholly contained within libxfs, so we need to shuffle some of the
definitions around to keep consistency across files shared between
user and kernel space.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:40:57 -05:00
Dave Chinner
4a8af273de xfs: move getdents code into it's own file
The directory readdir code is not used by userspace, but it is
intermingled with files that are shared with userspace. This makes
it difficult to compare the differences between the userspac eand
kernel files are the userspace files don't have the getdents code in
them. Move all the kernel getdents code to a separate file to bring
the shared content between userspace and kernel files closer
together.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:39:56 -05:00
Dave Chinner
1fd7115eda xfs: introduce xfs_inode_buf.c for inode buffer operations
The only thing remaining in xfs_inode.[ch] are the operations that
read, write or verify physical inodes in their underlying buffers.
Move all this code to xfs_inode_buf.[ch] and so we can stop sharing
xfs_inode.[ch] with userspace.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:39:05 -05:00
Dave Chinner
7bb85ef360 xfs: move unrelated definitions out of xfs_inode.h
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:37:57 -05:00
Dave Chinner
5c4d97d01a xfs: move inode fork definitions to a new header file
The inode fork definitions are a combination of on-disk format
definition and in-memory tracking and manipulation. They are both
shared with userspace, so move them all into their own file so
sharing is easy to do and track.  This removes all inode fork
related information from xfs_inode.h.

Do the same for the all the C code that currently resides in
xfs_inode.c for the same reason.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:37:32 -05:00
Dave Chinner
7fd36c4418 xfs: split out transaction reservation code
The transaction reservation size calculations is used by both kernel
and userspace, but most of the transaction code in xfs_trans.c is
kernel specific. Split all the transaction reservation code out into
it's own files to make sharing with userspace simpler. This just
leaves kernel-only definitions in xfs_trans.h, so it doesn't need to
be shared with userspace anymore, either.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:36:16 -05:00
Dave Chinner
d386b32b55 xfs: sync minor header differences needed by userspace.
Little things like exported functions, __KERNEL__ protections, and
so on that ensure user and kernel shared headers are identical.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:35:41 -05:00