The introduction of barriers to loop devices has created a new IO
order completion dependency that XFS does not handle. The loop
device implements barriers using fsync and so turns a log IO in the
XFS filesystem on the loop device into a data IO in the backing
filesystem. That is, the completion of log IOs in the loop
filesystem are now dependent on completion of data IO in the backing
filesystem.
This can cause deadlocks when a flush daemon issues a log force with
an inode locked because the IO completion of IO on the inode is
blocked by the inode lock. This in turn prevents further data IO
completion from occuring on all XFS filesystems on that CPU (due to
the shared nature of the completion queues). This then prevents the
log IO from completing because the log is waiting for data IO
completion as well.
The fix for this new completion order dependency issue is to make
the IO completion inode locking non-blocking. If the inode lock
can't be grabbed, simply requeue the IO completion back to the work
queue so that it can be processed later. This prevents the
completion queue from being blocked and allows data IO completion on
other inodes to proceed, hence avoiding completion order dependent
deadlocks.
Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
Allow us to track the difference between timestamp and size updates
by using mark_inode_dirty from the I/O completion code, and checking
the VFS inode flags in xfs_file_fsync.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
Currently the fsync file operation is divided into a low-level
routine doing all the work and one that implements the Linux file
operation and does minimal argument wrapping. This is a leftover
from the days of the vnode operations layer and can be removed to
simplify the code a bit, as well as preparing for the implementation
of an optimized fdatasync which needs to look at the Linux inode
state.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
Currently the aio_read, aio_write, splice_read and splice_write file
operations are divided into a low-level routine doing all the work
and one that implements the Linux file operations and does minimal
argument wrapping. This is a leftover from the days of the vnode
operations layer and can be removed to simplify the code a lot.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
Currently the code to implement the file operations is split over
two small files. Merge the content of xfs_lrw.c into xfs_file.c to
have it in one place. Note that I haven't done various cleanups
that are possible after this yet, they will follow in the next
patch. Also the function xfs_dev_is_read_only which was in
xfs_lrw.c before really doesn't fit in here at all and was moved to
xfs_mount.c.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
The be32_to_cpu in the TP_printk output breaks automatic parsing of
the trace format by the trace-cmd tools, so we have to move it into
the TP_assign block. While we're at it also fix the format for the
quota limits to more regular and easier parseable.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
While doing some testing of readdir perf a while back,
I noticed that the buffer size we're using internally is
smaller than what glibc gives us by default. Upping this
size helped a bit, and seems safe.
glibc's __alloc_dir() does:
const size_t default_allocation = (4 * BUFSIZ < sizeof (struct dirent64)
? sizeof (struct dirent64) : 4 * BUFSIZ);
const size_t small_allocation = (BUFSIZ < sizeof (struct dirent64)
? sizeof (struct dirent64) : BUFSIZ);
size_t allocation = default_allocation;
#ifdef _STATBUF_ST_BLKSIZE
if (statp != NULL && default_allocation < statp->st_blksize)
allocation = statp->st_blksize;
#endif
and
#define _G_BUFSIZ 8192
#define _IO_BUFSIZ _G_BUFSIZ
# define BUFSIZ _IO_BUFSIZ
so the default buffer is 4 * 8192 = 32768
(except in the unlikely case of blocks > 32k....)
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
Some new devices have extra keys, which we add to our list. Currently,
they all generate events that allow us to use a simple table/array,
without need for the sparse keymap.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
* 'davinci-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-davinci: (40 commits)
DaVinci DM365: Adding support for SPI EEPROM
DaVinci DM365: Adding DM365 SPI support
DaVinci DM355: Modifications to DM355 SPI support
DaVinci: SPI: Adding header file for SPI support.
davinci: dm646x: CDCE clocks: davinci_clk converted to clk_lookup
davinci: clkdev cleanup: remove clk_lookup wrapper, use clkdev_add_table()
DaVinci: DM365: Voice codec support for the DM365 SoC
davinci: clock: let clk->set_rate function sleep
Add SDA and SCL pin numbers to i2c platform data
davinci: da8xx/omap-l1xx: Add EDMA platform data for da850/omap-l138
davinci: build list of unused EDMA events dynamically
davinci: Fix edma_alloc_channel api for EDMA_CHANNEL_ANY case
davinci: Keep count of channel controllers on a platform
davinci: Correct return value of edma_alloc_channel api
davinci: add CDCE949 support on DM6467 EVM
davinci: add support for CDCE949 clock synthesizer
davinci: da850/omap-l138 EVM: register for suspend support
davinci: da850/omap-l138: add support for SoC suspend
davinci: add power management support
DaVinci: DM365: Changing default queue for DM365.
...
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: (38 commits)
sata_via: Delay on vt6420 when starting ATAPI DMA write
ata: Detect Delkin Devices compact flash
pata_efar: Enable parallel scanning
pata_atiixp: enable parallel scan
[libata] pata_atiixp: add locking for parallel scanning
[libata] pata_efar: add locking for parallel scanning
libata: Pass host flags into the pci helper
[libata] pata_marvell: CONFIG_AHCI is really CONFIG_SATA_AHCI
libata: Allow pata_legacy to be built on non-ISA but PCI systems
pata_pdc202xx_old: fix UDMA mode for PDC2026x chipsets
pata_pdc202xx_old: fix UDMA mode for Promise UDMA33 cards
[libata] pata_at91: fix backslash-continued string
pata_via: store UDMA masks in via_isa_bridges table
pata_via: fix address setup timings underlocking
pata_serverworks: fix error message
pata_serverworks: fix PIO setup for the second channel
pata_efar: fix secondary port support
pata_cypress: fix PIO timings underclocking
pata_cs5535: use correct values for PIO1 and PIO2 data timings
pata_cmd64x: remove unused definitions
...
When writing a disc on certain lite-on dvd-writers (also rebadged
as optiarc/LG/...) connected to a vt6420, the ATAPI CDB ends
up in the datastream and on the disc, causing silent corruption.
Delaying between sending the CDB and starting DMA seems to
prevent this.
I do not know if there are burners that do not suffer from
this, but the patch should be safe for those as well.
There are many reports of this issue, but AFAICT no solution was
found before. For example:
http://lkml.indiana.edu/hypermail/linux/kernel/0802.3/0561.html
Signed-off-by: Bart Hartgers <bart.hartgers@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
I have a Delkin Devices compact flash card that isn't being recognized using the
SATA/PATA drivers.
The card is recognized and works with the deprecated ATA drivers.
The error I am seeing is:
ata1.00: failed to IDENTIFY (device reports invalid type, err_mask=0x0)
I tracked it down to ata_id_is_cfa() in include/linux/ata.h.
The Delkin card has id[0] set to 0x844a and id[83] set to 0.
This isn't what the kernel expects and is probably incorrect.
The simplest work-around is to add a check for 0x844a to ata_id_is_cfa().
Signed-off-by: Ben Gardner <gardner.ben@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Again originally proposed by Bartlomiej but this does it by using the
generic helper logic instead.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
This was originally proposed by Bartlomiej but as a device specific
expansion of the init_one function rather than making the helper more
generic.
Enable the parallel scan via the generic flags.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
This is similar change as commit 60c3be3 for ata_piix host driver
and while pata_atiixp doesn't enable parallel scan yet the race
could probably also be triggered by requesting re-scanning of both
ports at the same time using SCSI sysfs interface.
[Ported to current tree without other patch dependancies by Alan Cox]
Original is
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
This one is
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Add clearing of UDMA enable bit also for PIO modes and then add
extra locking for parallel scanning.
This is similar change as commit 60c3be3 for ata_piix host driver
and while pata_efar doesn't enable parallel scan yet the race could
probably also be triggered by requesting re-scanning of both ports
at the same time using SCSI sysfs interface.
[Ported to current kernel without other patch dependancies by
Alan Cox]
Original is
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
This one is
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
This allows parallel scan and the like to be set without having to stop
using the existing full helper functions. This patch merely adds the argument
and fixes up the callers. It doesn't undo the special cases already in the
tree or add any new parallel callers.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
The marvell driver comtains a fallback to ahci for the sata ports
which is incorrectly checked as CONFIG_AHCI while the only AHCI config
item is actually called SATA_AHCI (which also sounds sensible
considering it's a fallback for the sata ports).
Signed-off-by: Christoph Egger <siccegge@stud.informatik.uni-erlangen.de>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
This is needed for some unsupported hardware setups on strange 64bit
mainboards where crazy stuff has been done like putting flash ata adapters
on the LPC bus, or where the real hardware is hidden/confused.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
PDC2026x chipsets need the same treatment as PDC20246 one.
This is completely untested but will hopefully fix UDMA issues
that people have been reporting against pata_pdc202xx_old for
the last couple of years.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
On Monday 04 January 2010 02:30:24 pm Russell King wrote:
> Found the problem - getting rid of the read of the alt status register
> after the command has been written fixes the UDMA CRC errors on write:
>
> @@ -676,7 +676,8 @@ void ata_sff_exec_command(struct ata_port *ap, const struct
> ata_taskfile *tf)
> DPRINTK("ata%u: cmd 0x%X\n", ap->print_id, tf->command);
>
> iowrite8(tf->command, ap->ioaddr.command_addr);
> - ata_sff_pause(ap);
> + ndelay(400);
> +// ata_sff_pause(ap);
> }
> EXPORT_SYMBOL_GPL(ata_sff_exec_command);
>
>
> This rather makes sense. The PDC20247 handles the UDMA part of the
> protocol. It has no way to tell the PDC20246 to wait while it suspends
> UDMA, so that a normal register access can take place - the 246 ploughs
> on with the register access without any regard to the state of the 247.
>
> If the drive immediately starts the UDMA protocol after a write to the
> command register (as it probably will for the DMA WRITE command), then
> we'll be accessing the taskfile in the middle of the UDMA setup, which
> can't be good. It's certainly a violation of the ATA specs.
Fix it by adding custom ->sff_exec_command method for UDMA33 chipsets.
Debugged-by: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* store UDMA masks in via_isa_bridges[] and while at it make "flags"
field to be u8 instead of u16
* convert the driver to use UDMA masks from via_isa_bridges[]
* remove no longer needed VIA_UDMA* defines
Make some minor documentation and CodingStyle fixes while at it.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Correct via_do_set_mode() documentation while at it.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Timing registers should be programmed with the desired number of clocks
minus one clock.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
There shouldn't be any problems with it as IDE cs5535 host driver
has been using those values for years and they match values given
in the (publicly available) datasheet.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
s/ARTIM2/ARTTIM23/ in cmd648_bmdma_stop() while at it
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Clear the primary channel pending interrupt bit
instead of the reserved one.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Account for the requirements of the DMA mode currently used
by the pair device.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Use standard cycle timing for CFA PIO5 and PIO6 modes.
Based on commit 74638c8 for IDE subsystem.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Before only the timings for master were set. Datasheet can be found
here: ftp://ftp.vtbridge.org/Docs/Storage/DS_VT6421A_100_CCPL.PDF
Surprisingly, a slave drive works without this patch. According to the
datasheet, the controller by default derives the DMA mode from the
Set Features command issued to a drive. Not sure about the PIO
timings, though. The real problem is that the timings for the master
effectively are the ones tuned for the slave. If these support
different UDMA-settings, there is trouble, especially when the slave
supports a higher UDMA than the master.
Anyhow, using the same mechanism for both master and slave seems like
a good idea.
Signed-off-by: Bart Hartgers <bart.hartgers@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Make some variables in ahci and a function in pata_pcmcia static, as found
using sparse.
Signed-off-by: Robert Hancock <hancockrwd@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Traditional IDE interface sucks in that it doesn't have a reliable IRQ
pending bit, so if the controller raises IRQ while the driver is
expecting it not to, the IRQ won't be cleared and eventually the IRQ
line will be killed by interrupt subsystem. Some controllers have
non-standard mechanism to indicate IRQ pending so that this condition
can be detected and worked around.
This patch adds an optional operation ->sff_irq_check() which will be
called for each port from the ata_sff_interrupt() if an unexpected
interrupt is received. If the operation returns %true,
->sff_check_status() and ->sff_irq_clear() will be cleared for the
port. Note that this doesn't mark the interrupt as handled so it
won't prevent IRQ subsystem from killing the IRQ if this mechanism
fails to clear the spurious IRQ.
This patch also implements ->sff_irq_check() for ata_piix. Note that
this adds slight overhead to shared IRQ operation as IRQs which are
destined for other controllers will trigger extra register accesses to
check whether IDE interrupt is pending but this solves rare screaming
IRQ cases and for some curious reason also helps weird BIOS related
glitch on Samsung n130 as reported in bko#14314.
http://bugzilla.kernel.org/show_bug.cgi?id=14314
* piix_base_ops dropped as suggested by Sergei.
* Spurious IRQ detection doesn't kick in anymore if polling qc is in
progress. This provides less protection but some controllers have
possible data corruption issues if the wrong register is accessed
while a command is in progress.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Johannes Stezenbach <js@sig21.net>
Reported-by: Hans Werner <hwerner4@gmx.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
host->ports[i] is never NULL if i < host->n_ports and non-NULL return
from ata_qc_from_tag() guarantees that the returned qc is active.
Drop unnecessary tests.
Superflous () dropped as suggested by Sergei.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Describe UDMA timing bits 18-20 and 21 separately; add a note to bit
31 about it being meaningful for PIO only. Reformat the whole comment,
while at it...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
There's no need to clear the fast interrupt bit in hpt366_set_mode()
since we're doing it in hpt366_init_chipset() already.
While at it, rename 'addr1' local variable to 'addr' and
exclude 'ap->port_no' from its calculation as HPT36x are
single-channel-per-function chips.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
As these drivers' set_piomode() and set_dmamode() methods are almost
identical, factor out the common hpt{37x|3x2n}_set_mode() function
to be called by both of them, the same as in 'pata_hpt366' driver.
This results in ~5% decrease in the 'pata_hpt37x' driver binary
size and in ~4% decrease in the 'pata_hpt3x2n' driver binary size
(as measured on x86-32).
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
The UltraDMA Tss timing must be stretched with ATA clock of 66 MHz, but the
driver only does this when PCI clock is 66 MHz, whereas it always programs
DPLL clock (which is used as the ATA clock) to 66 MHz.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: <stable@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Use ATA_DMA_* constants instead of the bare numbers for the BMIDE registers.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>