Add the fl_owner to NLM compare locks. Since two different client can
present the same pid to the server it is not enough to distinguish locks
from different clients. The fl_owner field is a pointer to the struct
nlm_host which is unique for each client.
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Respond to a moved error on NFS lookup by setting up the referral.
Note: We don't actually follow the referral during lookup/getattr, but
later when we detect fsid mismatch in inode revalidation (similar to the
processing done for cloning submounts). Referrals will have fake attributes
until they are actually followed or traversed.
Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Set up mountpoint when hitting a referral on moved error by getting
fs_locations.
Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Use component4-style formats for decoding list of servers and pathnames in
fs_locations.
Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
NFSv4 allows for the fact that filesystems may be replicated across
several servers or that they may be migrated to a backup server in case of
failure of the primary server.
fs_locations is an NFSv4 operation for retrieving information about the
location of migrated and/or replicated filesystems.
Based on an initial implementation by Jiaying Zhang <jiayingz@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Make automounted partitions expire using the mark_mounts_for_expiry()
function. The timeout is controlled via a sysctl.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This should enable us to detect if we are crossing a mountpoint in the
case where the server is exporting "nohide" mounts.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Allow filesystems to decide to perform pre-umount processing whether or not
MNT_FORCE is set.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Allow a submount to be marked as being 'shrinkable' by means of the
vfsmount->mnt_flags, and then add a function 'shrink_submounts()' which
attempts to recursively unmount these submounts.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Replace all module uses with the new vfs_kern_mount() interface, and fix up
simple_pin_fs().
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
do_kern_mount() does not allow the kernel to use private mount interfaces
without exposing the same interfaces to userland. The problem is that the
filesystem is referenced by name, thus meaning that it and its mount
interface must be registered in the global filesystem list.
vfs_kern_mount() passes the struct file_system_type as an explicit
parameter in order to overcome this limitation.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
In the case of a call to truncate_inode_pages(), we should really try to
cancel any pending writes on the page.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Separate out the function of revalidating the inode metadata, and
revalidating the mapping. The former may be called by lookup(),
and only really needs to check that permissions, ctime, etc haven't changed
whereas the latter needs only done when we want to read data from the page
cache, and may need to sync and then invalidate the mapping.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Whenever the directory changes, we want to make sure that we always
invalidate its page cache. Fix up update_changeattr() and
nfs_mark_for_revalidate() so that they do so.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up use of page_array, and fix an off-by-one error noticed by Tom
Talpey which causes kmalloc calls in cases where using the page_array
is sufficient.
Test plan:
Normal client functional testing with r/wsize=32768.
Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This new prctl is intended for changing the execution mode of the
processor, on processors that support both a little-endian mode and a
big-endian mode. It is intended for use by programs such as
instruction set emulators (for example an x86 emulator on PowerPC),
which may find it convenient to use the processor in an alternate
endianness mode when executing translated instructions.
Note that this does not imply the existence of a fully-fledged ABI for
both endiannesses, or of compatibility code for converting system
calls done in the non-native endianness mode. The program is expected
to arrange for all of its system call arguments to be presented in the
native endianness.
Switching between big and little-endian mode will require some care in
constructing the instruction sequence for the switch. Generally the
instructions up to the instruction that invokes the prctl system call
will have to be in the old endianness, and subsequent instructions
will have to be in the new endianness.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
There's a race between shutting down one io scheduler and firing up the
next, in which a new io could enter and cause the io scheduler to be
invoked with bad or NULL data.
To fix this, we need to maintain the queue lock for a bit longer.
Unfortunately we cannot do that, since the elevator init requires to be
run without the lock held. This isn't easily fixable, without also
changing the mempool API. So split the initialization into two parts,
and alloc-init operation and an attach operation. Then we can
preallocate the io scheduler and related structures, and run the attach
inside the lock after we detach the old one.
This patch has survived 30 minutes of 1 second io scheduler switching
with a very busy io load.
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
From: Ralf Baechle <ralf@linux-mips.org>
<linux/mempolicy.h> uses struct mm_struct and relies on a definition or
declaration somehow magically being dragged in which may result in a
build:
[...]
CC mm/mempolicy.o
In file included from mm/mempolicy.c:69:
include/linux/mempolicy.h:150: warning: âstruct mm_structâ declared inside parameter list
include/linux/mempolicy.h:150: warning: its scope is only this definition or declaration, which is probably not what you want
include/linux/mempolicy.h:175: warning: âstruct mm_structâ declared inside parameter list
mm/mempolicy.c:622: error: conflicting types for âdo_migrate_pagesâ
include/linux/mempolicy.h:175: error: previous declaration of âdo_migrate_pagesâ was here
mm/mempolicy.c:1661: error: conflicting types for âmpol_rebind_mmâ
include/linux/mempolicy.h:150: error: previous declaration of âmpol_rebind_mmâ was here
make[1]: *** [mm/mempolicy.o] Error 1
make: *** [mm] Error 2
[ralf@denk linux-ip35]$
Including <linux/sched.h> is a step into direction of include hell so
fixed by adding a forward declaration of struct mm_struct instead.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
From: Andrew Morton <akpm@osdl.org>
drivers/rtc/rtc-m48t86.c: In function `m48t86_rtc_read_time':
drivers/rtc/rtc-m48t86.c:51: error: structure has no member named `ia64_mv'
drivers/rtc/rtc-m48t86.c:55: error: structure has no member named `ia64_mv'
drivers/rtc/rtc-m48t86.c:56: error: structure has no member named `ia64_mv'
drivers/rtc/rtc-m48t86.c:57: error: structure has no member named `ia64_mv'
drivers/rtc/rtc-m48t86.c:58: error: structure has no member named `ia64_mv'
drivers/rtc/rtc-m48t86.c:60: error: structure has no member named `ia64_mv'
readb() and writeb() are macros on ia64.
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
From: Ralf Baechle <ralf@linux-mips.org>
<linux/mmzone.h> uses PAGE_SIZE, PAGE_SHIFT from <asm/page.h> without
including that header itself. For some sparsemem configurations this may
result in build errors like:
CC init/initramfs.o
In file included from include/linux/gfp.h:4,
from include/linux/slab.h:15,
from include/linux/percpu.h:4,
from include/linux/rcupdate.h:41,
from include/linux/dcache.h:10,
from include/linux/fs.h:226,
from init/initramfs.c:2:
include/linux/mmzone.h:498:22: warning: "PAGE_SHIFT" is not defined
In file included from include/linux/gfp.h:4,
from include/linux/slab.h:15,
from include/linux/percpu.h:4,
from include/linux/rcupdate.h:41,
from include/linux/dcache.h:10,
from include/linux/fs.h:226,
from init/initramfs.c:2:
include/linux/mmzone.h:526: error: `PAGE_SIZE' undeclared here (not in a function)
include/linux/mmzone.h: In function `__pfn_to_section':
include/linux/mmzone.h:573: error: `PAGE_SHIFT' undeclared (first use in this function)
include/linux/mmzone.h:573: error: (Each undeclared identifier is reported only once
include/linux/mmzone.h:573: error: for each function it appears in.)
include/linux/mmzone.h: In function `pfn_valid':
include/linux/mmzone.h:578: error: `PAGE_SHIFT' undeclared (first use in this function)
make[1]: *** [init/initramfs.o] Error 1
make: *** [init] Error 2
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Seems-reasonable-to: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Since rb_insert_color() is part of the _public_ API, while the others are
purely internal, switch to be consistent with that.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
The rest of the file uses these types instead of C99 types.
Acked-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
From: Andrew Morton <akpm@osdl.org>
Revert commit ff4da2e262.
It broke APM suspend, probably because APM doesn't switch back to a VT
when suspending.
Tracked down by Matt Mackall <mpm@selenic.com>
Rafael sayeth:
"It only fixed the theoretical issue that a quick-handed user could
switch to X after processes have been frozen and before the devices
are suspended.
With the current userland suspend tools it shouldn't be necessary."
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Now that all drivers implementing new EH are converted to new probing
mechanism, ops->probe_reset doesn't have any user. Kill it.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Update unload unplug - driver unloading / PCI removal. This is done
by ata_port_detach() which short-circuits EH, disables all devices and
freezes the port. With this patch, EH and unloading/unplugging are
properly synchronized.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Implement warmplug. User-initiated unplug can be detected by
hostt->slave_destroy() and plug by transportt->user_scan(). This
patch only implements the two callbacks. The next function will hook
them.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Implement SCSI part of hotplug.
This must be done in a separate context as SCSI makes use of EH during
probing. SCSI scan fails silently if EH is in progress. In such
cases, libata pauses briefly and retries until every device is
attached.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Implement ATA part of hotplug. To avoid probing broken devices over
and over again, disabled devices are not automatically detached. They
are detached only if probing is requested for the device or the
associated port is offline. Also, to avoid infinite probing loop,
Each device is probed only once per EH run.
As SATA PHY status is fragile, devices are detached only after it has
used up its recovery chances unless explicitly requested by LLDD or
user (LLDD may request direct detach if, for example, it supports cold
presence detection).
Signed-off-by: Tejun Heo <htejun@gmail.com>
ata_hsm_move() will be used by LLDDs which depend on standard PIO HSM
but implement their own interrupt handlers.
Signed-off-by: Tejun Heo <htejun@gmail.com>
With hotplug, every reset might be a probing reset and thus something
similar to probe_init() is needed. prereset() method is called before
a series of resets to a port and is the counterpart of postreset().
prereset() can tell EH to use different type of reset or skip reset by
modifying ehc->i.action.
This patch also implements ata_std_prereset(). Most controllers
should be able to use this function directly or with some wrapping.
After hotplug, different controllers need different actions to resume
the PHY and detect the newly attached device. Controllers can be
categorized as follows.
* Controllers which can wait for the first D2H FIS after hotplug.
Note that if the waiting is implemented by polling TF status, there
needs to be a way to set BSY on PHY status change. It can be
implemented by hardware or with the help of the driver.
* Controllers which can wait for the first D2H FIS after sending
COMRESET. These controllers need to issue COMRESET to wait for the
first FIS. Note that the received D2H FIS could be the first D2H
FIS after POR (power-on-reset) or D2H FIS in response to the
COMRESET. Some controllers use COMRESET as TF status
synchronization point and clear TF automatically (sata_sil).
* Controllers which cannot wait for the first D2H FIS reliably.
Blindly issuing SRST to spinning-up device often results in command
issue failure or timeout, causing extended delay. For these
controllers, ata_std_prereset() explicitly waits ATA_SPINUP_WAIT
(currently 8s) to give newly attached device time to spin up, then
issues reset. Note that failing to getting ready in ATA_SPINUP_WAIT
is not critical. libata will retry. So, the timeout needs to be
long enough to spin up most devices.
LLDDs can tell ata_std_prereset() which of above action is needed with
ATA_FLAG_HRST_TO_RESUME and ATA_FLAG_SKIP_D2H_BSY flags. These flags
are PHY-specific property and will be moved to ata_link later.
While at it, this patch unifies function typedef's such that they all
have named arguments.
Signed-off-by: Tejun Heo <htejun@gmail.com>
With hotplug, PHY always needs to be debounced before a reset as any
reset might find new devices. Extract PHY waiting code from
sata_phy_resume() and extend it to include SStatus debouncing. Note
that sata_phy_debounce() is superset of what used to be done inside
sata_phy_resume().
Three default debounce timing parameters are defined to be used by
hot/boot plug. As resume failure during probing will be properly
handled as errors, timeout doesn't have to be long as before.
probeinit() uses the same timeout to retain the original behavior.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Add device persistent field dev->sdev and store the attached SCSI
device. With hotplug, libata needs to know the attached SCSI device
to offline and detach it, but scsi_device_lookup() cannot be used
because libata will reuse SCSI ID numbers - dead but not gone devices
(due to zombie opens, etc...) interfere with the lookup.
dev->sdev doesn't hold reference to the SCSI device. It's cleared
when the SCSI device goes away.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Add ap->hw_sata_spd_limit and initialize it once during the boot
initialization (or driver load initialization). ap->sata_spd_limit is
reset to ap->hw_sata_spd_limit on hotplug. This prevents spd limits
introduced by earlier devices from affecting new devices.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Lifetimes of some fields span over device plugging/unplugging. This
patch moves such persistent fields to the top of ata_device and
separate them with ATA_DEVICE_CLEAR_OFFSET. Fields above the offset
are initialized once during host initializatino while all other fields
are cleared before hotplugging. Currently ->ap, devno and part of
flags are persistent.
Note that flags is partially cleared while holding host_set lock.
This is to synchronize with later warm plug implementation which will
record hotplug request in dev->flags.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Implement ata_eh_wait(). On return from this function, it's
guaranteed that the EH which was pending or in progress when the
function was called is complete - including the tailing part of SCSI
EH. This will be used by hotplug and others to synchronize with EH.
Signed-off-by: Tejun Heo <htejun@gmail.com>
* master.kernel.org:/pub/scm/linux/kernel/git/davej/agpgart:
[AGPGART] VIA PT880 Ultra support.
[AGPGART] Fix Nforce3 suspend on amd64.
[AGPGART] Enable SIS AGP driver on x86-64 for EM64T systems
Remove the numbered SW_* entries from the input system and assign names
to the existing users.
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
The raw read/write access to NAND (without ECC) has been changed in the
NAND rework. Expose the new way - setting the file mode via ioctl - to
userspace. Also allow to read out the ecc statistics information so userspace
tools can see that bitflips happened and whether errors where correctable
or not. Also expose the number of bad blocks for the partition, so nandwrite
can check if the data fits into the parition before writing to it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Hopefully the last iteration on this!
The handling of out of band data on NAND was accompanied by tons of fruitless
discussions and halfarsed patches to make it work for a particular
problem. Sufficiently annoyed by I all those "I know it better" mails and the
resonable amount of discarded "it solves my problem" patches, I finally decided
to go for the big rework. After removing the _ecc variants of mtd read/write
functions the solution to satisfy the various requirements was to refactor the
read/write _oob functions in mtd.
The major change is that read/write_oob now takes a pointer to an operation
descriptor structure "struct mtd_oob_ops".instead of having a function with at
least seven arguments.
read/write_oob which should probably renamed to a more descriptive name, can do
the following tasks:
- read/write out of band data
- read/write data content and out of band data
- read/write raw data content and out of band data (ecc disabled)
struct mtd_oob_ops has a mode field, which determines the oob handling mode.
Aside of the MTD_OOB_RAW mode, which is intended to be especially for
diagnostic purposes and some internal functions e.g. bad block table creation,
the other two modes are for mtd clients:
MTD_OOB_PLACE puts/gets the given oob data exactly to/from the place which is
described by the ooboffs and ooblen fields of the mtd_oob_ops strcuture. It's
up to the caller to make sure that the byte positions are not used by the ECC
placement algorithms.
MTD_OOB_AUTO puts/gets the given oob data automaticaly to/from the places in
the out of band area which are described by the oobfree tuples in the ecclayout
data structre which is associated to the devicee.
The decision whether data plus oob or oob only handling is done depends on the
setting of the datbuf member of the data structure. When datbuf == NULL then
the internal read/write_oob functions are selected, otherwise the read/write
data routines are invoked.
Tested on a few platforms with all variants. Please be aware of possible
regressions for your particular device / application scenario
Disclaimer: Any whining will be ignored from those who just contributed "hot
air blurb" and never sat down to tackle the underlying problem of the mess in
the NAND driver grown over time and the big chunk of work to fix up the
existing users. The problem was not the holiness of the existing MTD
interfaces. The problems was the lack of time to go for the big overhaul. It's
easy to add more mess to the existing one, but it takes alot of effort to go
for a real solution.
Improvements and bugfixes are welcome!
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Most of those macros are unused and the used ones just obfuscate
the code. Remove them and fixup all users.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The nand_oobinfo structure is not fitting the newer error correction
demands anymore. Replace it by struct nand_ecclayout and fixup the users
all over the place. Keep the nand_oobinfo based ioctl for user space
compability reasons.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The info structure for out of band data was copied into
the mtd structure. Make it a pointer and remove the ability
to set it from userspace. The position of ecc bytes is
defined by the hardware and should not be changed by software.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>