Commit graph

165543 commits

Author SHA1 Message Date
Tao Ma
5aea1f0ef4 ocfs2: Abstract the creation of xattr block.
In xattr reflink, we also need to create xattr block, so
abstract the process out.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:42 -07:00
Tao Ma
fd68a894fc ocfs2: Remove inode from ocfs2_xattr_bucket_get_name_value.
In ocfs2_xattr_bucket_get_name_value, actually we only use
super_block. So use it.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:41 -07:00
Tao Ma
492a8a33e1 ocfs2: Add CoW support for xattr.
In order to make the 2 transactions (xattr and CoW) independent of each other,
we CoW the whole xattr out in case we are setting them.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:41 -07:00
Tao Ma
913580b4cd ocfs2: Abstract duplicate clusters process in CoW.
We currently use pagecache to duplicate clusters in CoW,
but it isn't suitable for xattr case. So abstract it out
so that the caller can decide which method it uses.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:40 -07:00
Tao Ma
1061f9c1c9 ocfs2: Return extent flags for xattr value tree.
With the new refcount tree, xattr value can also be refcounted
among multiple files. So return the appropriate extent flags
so that CoW can use it later.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:39 -07:00
Tao Ma
a9063ab9a3 ocfs2: handle file attributes issue for reflink.
A reflink creates a snapshot of a file, that means the attributes
must be identical except for three exceptions - nlink, ino, and ctime.

As for time changes, here is a brief description:

1. Source file:
   1) atime: Ignore. Let the lazy atime code handle that.
   2) mtime: don't touch.
   3) ctime: If we change the tree (adding REFCOUNTED to at least one
             extent), update it.
2. Destination file:
   1) atime: ignore.
   2) mtime: we want it to appear identical to the source.
   3) ctime: update.

The idea here is that an ls -l will show the same time for the
src and target - it shows mtime.  Backup software like rsync and tar
will treat the new file correctly too.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:39 -07:00
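
The timestamp rules listed in a9063ab9a3 can be sketched as below. This is only an illustration of the rules as the commit message states them, not the ocfs2 code: the `Times` class and `reflink_times` function are hypothetical names, timestamps are plain integers, and since the message says the destination atime is ignored, copying the source atime here is an arbitrary choice.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Times:
    atime: int
    mtime: int
    ctime: int


def reflink_times(src: Times, now: int, tree_changed: bool) -> Tuple[Times, Times]:
    """Return (source, destination) timestamps after a reflink at time `now`."""
    # Source: atime and mtime are left alone; ctime moves only if the
    # extent tree was changed (at least one extent gained REFCOUNTED).
    new_src = replace(src, ctime=now) if tree_changed else src
    # Destination: mtime matches the source, ctime is updated.
    # atime is "ignored" in the message; copying it is an assumption.
    dst = Times(atime=src.atime, mtime=src.mtime, ctime=now)
    return new_src, dst
```

Because `ls -l` shows mtime, source and destination list with the same time under these rules.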
Tao Ma
110a045aca ocfs2: Add normal functions for reflink a normal file's extents.
2 major functions are added in this patch.

ocfs2_attach_refcount_tree will create a new refcount tree to the
old file if it doesn't have one and insert all the extent records
to the tree if they are not refcounted.

ocfs2_create_reflink_node will:
1. set the refcount tree to the new file.
2. call ocfs2_duplicate_extent_list which will iterate all the
   extents for the old file, insert them into the new file and increase
   the corresponding reference count.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:38 -07:00
Tao Ma
37f8a2bfaa ocfs2: CoW a reflinked cluster when it is truncated.
When we truncate a file to a specific size which resides in a reflinked
cluster, we need to CoW it since ocfs2_zero_range_for_truncate will
zero the space after the size (just another type of write).

So we add a "max_cpos" in ocfs2_refcount_cow so that it will stop when
it hits the max cluster offset.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:38 -07:00
Tao Ma
293b2f70b4 ocfs2: Integrate CoW in file write.
When we use mmap, we CoW the refcounted clusters in
ocfs2_write_begin_nolock. For normal file
I/O (including direct I/O), we do CoW in
ocfs2_prepare_inode_for_write.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:37 -07:00
Tao Ma
6ae23c5555 ocfs2: CoW refcount tree improvement.
During CoW, if the old extent record is refcounted, we allocate
some new clusters and do CoW. Actually we can have some improvement
here. If the old extent has refcount=1, that means now it is only
used by this file. So we don't need to allocate new clusters, just
remove the refcounted flag and it is OK. We also have to remove
it from the refcount tree while not deleting it.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:36 -07:00
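
The refcount=1 shortcut described in 6ae23c5555 can be sketched as follows. This is a toy model under stated assumptions, not the kernel implementation: `Extent`, `cow_extent`, the `refcounts` dictionary keyed by physical start cluster, and the `alloc` callback are all hypothetical, and the actual data copy is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Extent:
    p_cpos: int        # physical start cluster
    clusters: int      # length in clusters
    refcounted: bool   # stands in for the REFCOUNTED extent flag


def cow_extent(ext: Extent, refcounts: Dict[int, int],
               alloc: Callable[[int], int]) -> Extent:
    """Return the extent a writer should use after CoW."""
    if not ext.refcounted:
        return ext  # nothing shared, write in place
    count = refcounts[ext.p_cpos]
    if count == 1:
        # Only this file uses the clusters: remove the record from the
        # refcount tree and clear the flag, but keep the clusters.
        del refcounts[ext.p_cpos]
        ext.refcounted = False
        return ext
    # Still shared: drop one reference and move to newly allocated clusters.
    refcounts[ext.p_cpos] = count - 1
    return Extent(alloc(ext.clusters), ext.clusters, False)
```

In the refcount=1 branch no allocation happens at all, which is the saving the commit message describes.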
Tao Ma
6f70fa5199 ocfs2: Add CoW support.
This patch adds CoW support for a refcounted record.

The whole process will be:
1. Calculate how many clusters we need to CoW and where we start.
   Extents that are not completely encompassed by the write will
   be broken on 1MB boundaries.
2. Do CoW for the clusters with the help of page cache.
3. Change the b-tree structure with the new allocated clusters.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:36 -07:00
Tao Ma
bcbbb24a6a ocfs2: Decrement refcount when truncating refcounted extents.
Add 'Decrement refcount for delete' into the normal truncate
process. So for a refcounted extent record, call refcount rec
decrementation instead of cluster free.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:35 -07:00
Tao Ma
1aa75fea64 ocfs2: Add functions for extents refcounted.
Add function ocfs2_mark_extent_refcounted which can mark
an extent refcounted.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:34 -07:00
Tao Ma
1823cb0b9f ocfs2: Add support of decrementing refcount for delete.
Given a physical cpos and length, decrement the refcount
in the tree. If the refcount for any portion of the extent goes
to zero, that portion is queued for freeing.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:33 -07:00
Tao Ma
e73a819db9 ocfs2: Add support for incrementing refcount in the tree.
Given a physical cpos and length, increment the refcount
in the tree. If the extent has not been seen before, a refcount
record is created for it. Refcount records may be merged or
split by this operation.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:33 -07:00
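
The increment (e73a819db9) and decrement (1823cb0b9f) operations above can be illustrated with a small range-based model. This is a sketch only: the real code works on a b-tree of ocfs2_refcount_rec, whereas here records are a sorted Python list of `(start, length, count)` tuples, and the names `adjust`, `increment`, and `decrement` are hypothetical. Treating an unseen range as count 0, so that the first increment creates a record with count 1, is also an assumption of the sketch.

```python
def adjust(records, cpos, length, delta):
    """Apply `delta` to the refcount of [cpos, cpos + length).

    `records` is a sorted list of non-overlapping (start, length, count).
    Returns (new_records, freed), where `freed` lists (start, length)
    ranges whose count dropped to zero.
    """
    end = cpos + length
    # Every record boundary and both ends of the range are cut points.
    points = {cpos, end}
    for start, rec_len, _ in records:
        points.update((start, start + rec_len))
    points = sorted(points)

    def count_at(cluster):
        for start, rec_len, count in records:
            if start <= cluster < start + rec_len:
                return count
        return 0  # never seen before

    pieces, freed = [], []
    for a, b in zip(points, points[1:]):
        old = count_at(a)
        new = old + delta if cpos <= a < end else old
        if new <= 0 < old:
            freed.append((a, b - a))          # queue this portion for freeing
        if new > 0:
            pieces.append((a, b - a, new))    # split piece with its new count

    # Merge neighbours that touch and carry the same count.
    merged = []
    for start, rec_len, count in pieces:
        if merged and merged[-1][0] + merged[-1][1] == start and merged[-1][2] == count:
            prev_start, prev_len, _ = merged[-1]
            merged[-1] = (prev_start, prev_len + rec_len, count)
        else:
            merged.append((start, rec_len, count))
    return merged, freed


def increment(records, cpos, length):
    return adjust(records, cpos, length, +1)[0]


def decrement(records, cpos, length):
    return adjust(records, cpos, length, -1)
```

Incrementing a sub-range of an existing record splits it into up to three records, and decrementing the same sub-range merges them back.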
Tao Ma
e2e9f6082b ocfs2: move tree path functions to alloc.h.
Now fs/ocfs2/alloc.c has more than 7000 lines. It contains our
basic b-tree operations. Although we have already made our b-tree
operations generic, the basic structure ocfs2_path, which is used
to iterate one b-tree branch, is still static and limited to
use in alloc.c. As the refcount tree needs them and I don't want to
add any more b-tree unrelated code to alloc.c, export them out.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:32 -07:00
Tao Ma
fe92441595 ocfs2: Add refcount b-tree as a new extent tree.
Add refcount b-tree as a new extent tree so that it can
use the b-tree to store and manipulate ocfs2_refcount_rec.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:31 -07:00
Tao Ma
555936bfcb ocfs2: Abstract extent split process.
ocfs2_mark_extent_written actually does the following things:
1. check the parameters.
2. initialize the left_path and split_rec.
3. call __ocfs2_mark_extent_written. It will do:
   1) check the flags of unwritten
   2) do the real split work.
The whole process is packed tightly somehow. So this patch
will abstract 2 different functions so that future b-tree
operation can work with it.

1. __ocfs2_split_extent will accept path and split_rec and do
  the real split work.
2. ocfs2_change_extent_flag will accept a new flag and initialize
   path and split_rec.

So now ocfs2_mark_extent_written will do:
1. check the parameters.
2. call ocfs2_change_extent_flag.
   1) initialize the left_path and split_rec.
   2) check whether the new flags conflict with the old one.
   3) call __ocfs2_split_extent to do the split.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:31 -07:00
Tao Ma
853a3a1439 ocfs2: Wrap ocfs2_extent_contig in ocfs2_extent_tree.
Add a new operation eo_ocfs2_extent_contig in the extent tree's
operations vector. We want this so that the new refcount trees
can always return CONTIG_NONE and
prevent extent merging.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:30 -07:00
Tao Ma
8bf396de98 ocfs2: Basic tree root operation.
Add basic refcount tree root operation.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:30 -07:00
Tao Ma
374a263e79 ocfs2: Add refcount tree lock mechanism.
Implement locking around struct ocfs2_refcount_tree.  This protects
all read/write operations on refcount trees.  ocfs2_refcount_tree
has its own lock and its own caching_info, protecting buffers among
multiple nodes.

User must call ocfs2_lock_refcount_tree before his operation on
the tree and unlock it after that.

ocfs2_refcount_trees are referenced by the block number of the
refcount tree root block, so we create an rb-tree on the ocfs2_super
to look them up.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:29 -07:00
Tao Ma
c732eb16bf ocfs2: Add caching info for refcount tree.
The refcount tree should use its own caching info so that when
we downconvert the refcount tree lock, we can drop all the
cached buffer heads.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:28 -07:00
Tao Ma
8dec98edfe ocfs2: Add new refcount tree lock resource in dlmglue.
refcount tree lock resource is used to protect refcount
tree read/write among multiple nodes.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:28 -07:00
Tao Ma
a433848132 ocfs2: Abstract caching info checkpoint.
In meta downconvert, we need to checkpoint the metadata in an inode.
For refcount tree, we also need it. So abstract the process out.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:27 -07:00
Tao Ma
f2c870e3b1 ocfs2: Add ocfs2_read_refcount_block.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:26 -07:00
Tao Ma
93c97087a6 ocfs2: Add metaecc for ocfs2_refcount_block.
Add metaecc and journal trigger for ocfs2_refcount_block.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:26 -07:00
Tao Ma
721f69c404 ocfs2: Define refcount tree structure.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:25 -07:00
Roland McGrath
18c1e2c80d x86: syscall_get_nr returns int
Make syscall_get_nr() return int, so we always sign-extend
the low 32 bits of orig_ax in checks.

Signed-off-by: Roland McGrath <roland@redhat.com>
2009-09-22 19:57:51 -07:00
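
The effect of returning int from syscall_get_nr(), as described in 18c1e2c80d, comes down to sign-extending the low 32 bits of a wider register value. The sketch below models that arithmetic only. `to_int32` is a hypothetical helper, not a kernel function, and the register values are made-up examples.

```python
def to_int32(value: int) -> int:
    """Interpret the low 32 bits of `value` as a signed 32-bit integer."""
    low = value & 0xFFFFFFFF
    # Bit 31 set means the 32-bit value is negative.
    return low - (1 << 32) if low & 0x80000000 else low
```

With this interpretation, a register holding 0x00000000FFFFFFFF and one holding 0xFFFFFFFFFFFFFFFF both compare equal to -1, so a check such as "is the syscall number negative" behaves the same in either case.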
Roland McGrath
268e46712d asm-generic: syscall_get_nr returns int
Only 32 bits of system call number are meaningful, so make the
specification for syscall_get_nr() be to return int, not long.

Signed-off-by: Roland McGrath <roland@redhat.com>
2009-09-22 19:56:50 -07:00
Andre Maasikas
5b31aee9d7 drm/radeon/r600: set correct pitch for 4 byte copy
[agd5f: also fix the non-kms path]

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
2009-09-23 10:21:06 +10:00
Dave Airlie
c214271563 drm/radeon: consolidate family flags used in pciids.
having these separate was pointless and introduced a bug when
one got updated without the other.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2009-09-23 10:21:00 +10:00
Ingo Molnar
7c329288d7 vgaarb: make client interface config invariant.
Fixes build when VGA_ARB is off.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2009-09-23 09:52:18 +10:00
Anton Vorontsov
f056878332 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:
	drivers/power/wm97xx_battery.c
2009-09-23 03:49:27 +04:00
Mark Brown
63209a71e8 regulator: Add some brief design documentation
Provide some brief documentation of some of the design decisions that
are made by the regulator API.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
2009-09-22 22:16:53 +01:00
Martin Schwidefsky
ed87b27e00 [S390] Update default configuration.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-22 22:58:46 +02:00
Michael Holzheu
1aaf179d04 [S390] hibernate: Do real CPU swap at resume time
Currently, when the physical resume CPU is not equal to the physical suspend
CPU, we swap the CPUs logically, by modifying the logical/physical CPU mapping.
This has two major drawbacks: First the change is visible from user space (e.g.
CPU sysfs files) and second it is hard to ensure that nowhere in the kernel
the physical CPU ID is stored before suspend.
To fix this, we now really swap the physical CPUs, if the resume CPU is not
the physical suspend CPU. We restart the suspend CPU and stop the resume CPU
using SIGP restart and SIGP stop. If the suspend CPU is no longer available,
we write a message and load a disabled wait PSW.

Signed-off-by: Michael Holzheu <michael.holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-22 22:58:46 +02:00
Stefan Weinhuber
68d1e5f08b [S390] dasd: tolerate devices that have no feature codes
The DASD device driver reads the feature codes of a device during
device initialization. These codes are later used to determine the
availability of advanced features like PAV or High Performance FICON.
Some very old devices do not support the command to read feature
codes and the initialization routine fails.
As the feature codes are not necessary for basic DASD operations, we
can support such devices by just ignoring missing feature codes.

Signed-off-by: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-22 22:58:46 +02:00
Felix Beck
5314af693d [S390] zcrypt: Do not add/remove devices in s/r callbacks
Devices are no longer removed or added in the suspend and resume
callbacks. Instead they are marked unregistered in suspend. In the
resume callback the ap_scan_bus method is scheduled. The bus scan
function will remove the old device and add new ones. This way all
the device handling will be done in only one function. Additionally
the case where the domain might change during suspend/resume is
caught. In that case the device's qid needs to be recalculated so
that the scan method can find it.

Signed-off-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-22 22:58:46 +02:00
Heiko Carstens
2573a57530 [S390] hibernate: make sure pfn_is_nosave handles lowcore pages
pfn_is_nosave doesn't return the correct value for the second lowcore
page if lowcore protection is enabled. Make sure it always returns
the correct value.

While at it simplify the whole thing.
NSS special handling is done by the tprot check like it already works
for DCSS as well. So remove the extra code for NSS.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-22 22:58:45 +02:00
Heiko Carstens
3fd26a7793 [S390] smp: introduce LC_ORDER and simplify lowcore handling
Removes a couple of simple code duplications. But before I have to do
this again, just simplify it.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-22 22:58:45 +02:00
Christian Borntraeger
07805ac81c [S390] ptrace: use common code for simple peek/poke operations
arch_ptrace on s390 implements PTRACE_(PEEK|POKE)(TEXT|DATA) instead of
using ptrace_request in kernel/ptrace.c.
The only reason is the 31bit addressing mode, where we have to unmask the
highest bit.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-22 22:58:45 +02:00
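
The 31-bit special case mentioned in 07805ac81c amounts to clearing the highest bit of a 32-bit address before the common peek/poke code uses it. A minimal sketch of that masking follows. The constant and function names are hypothetical and the address is a made-up example.

```python
ADDR_MASK_31BIT = 0x7FFFFFFF  # hypothetical name: keeps bits 0..30


def strip_high_bit(addr: int) -> int:
    """Drop bit 31 of a 32-bit value, leaving the 31-bit address."""
    return addr & ADDR_MASK_31BIT
```

For example, `strip_high_bit(0x80001000)` yields `0x1000`, while an address without the high bit set passes through unchanged.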
Heiko Carstens
bdd42b28cd [S390] fix disabled_wait inline assembly clobber list
The disabled_wait inline assembly also clobbers register r1, but it
is missing in the clobber list.
Fixes recursive Oops on panic.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-22 22:58:45 +02:00
Heiko Carstens
87458ff458 [S390] Change kernel_page_present coding style.
Make the inline assembly look like all others.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-22 22:58:44 +02:00
Heiko Carstens
2583d1efe0 [S390] hibernation: reset system after resume
Force system into defined state after resume.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-22 22:58:44 +02:00
Heiko Carstens
846955c8af [S390] hibernation: fix guest page hinting related crash
On resume the system that loads the to be resumed image might have
unstable pages.
When the resume image is copied back and a write access happens to an
unstable page this causes an exception and the system crashes.

To fix this set all free pages to stable before copying the resumed
image data. Also after everything has been restored set all free
pages of the resumed system to unstable again.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-22 22:58:44 +02:00
Heiko Carstens
2e50195f58 [S390] Get rid of init_module/delete_module compat functions.
These functions aren't needed. Might be a leftover of the pre
cond_syscall time.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-22 22:58:43 +02:00
Heiko Carstens
3e86a8c617 [S390] Convert sys_execve to function with parameters.
Use function parameters instead of accessing the pt_regs structure
to get the parameters.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-22 22:58:43 +02:00
Heiko Carstens
2d70ca23f8 [S390] Convert sys_clone to function with parameters.
Use function parameters instead of accessing the pt_regs structure
to get the parameters.
Also merge the 31 and 64 bit versions since they are identical.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-22 22:58:43 +02:00
Jan Glauber
6541f7b68f [S390] qdio: change state of all primed input buffers
If input buffers stay in primed state qdio may not receive further interrupts
for the input queue depending on the firmware. That can cause a connection
hang on OSA cards.

Change the state of all primed input buffers that are not acknowledged to
not initialized.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-22 22:58:43 +02:00
Jan Glauber
1d7e1500a6 [S390] qdio: reduce per device debug messages
Even if turned off the debug message overhead is measurable in the hot path.
Reduce the number of debug message calls in do_QDIO and qdio_kick_handler.
Also use hex numbers to save space in the debug entries.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-22 22:58:42 +02:00