PIO executes without holding host_set lock, so it cannot be
synchronized using the same mechanism as interrupt driven execution.
port_task framework makes sure that EH is not entered until PIO task
is flushed, so PIO task can be sure the qc in progress won't go away
underneath it. One thing it cannot be sure of is whether the qc has
already been scheduled for EH by another exception condition while
host_set lock was released.
This patch makes ata_poll_qc-complete() handle such conditions
properly and make it freeze the port if HSM violation is detected
during PIO execution.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Freezing is performed atomic w.r.t. host_set->lock and once frozen
LLDD is not allowed to access the port or any qc on it. Also, libata
makes sure that no new qc gets issued to a frozen port.
A frozen port is thawed after a reset operation completes
successfully, so reset methods must do its job while the port is
frozen. During initialization all ports get frozen before requesting
IRQ, so reset methods are always invoked on a frozen port.
Optional ->freeze and ->thaw operations notify LLDD that the port is
being frozen and thawed, respectively. LLDD can disable/enable
hardware interrupt in these callbacks if the controller's IRQ mask can
be changed dynamically. If the controller doesn't allow such
operation, LLDD can check for frozen state in the interrupt handler
and ack/clear interrupts unconditionally while frozen.
Signed-off-by: Tejun Heo <htejun@gmail.com>
ata_port_schedule_eh() directly schedules EH for @ap without
associated qc. Once EH scheduled, no further qc is allowed and EH
kicks in as soon as all currently active qc's are drained.
ata_port_abort() schedules all currently active commands for EH by
qc_completing them with ATA_QCFLAG_FAILED set. If ata_port_abort()
doesn't find any qc to abort, it directly schedule EH using
ata_port_schedule_eh().
These two functions provide ways to invoke EH for conditions which
aren't directly related to any specfic qc.
Signed-off-by: Tejun Heo <htejun@gmail.com>
There are several ways a qc can get schedule for EH in new EH. This
patch implements one of them - completing a qc with ATA_QCFLAG_FAILED
set or with non-zero qc->err_mask. ALL such qc's are examined by EH.
New EH schedules a qc for EH from completion iff ->error_handler is
implemented, qc is marked as failed or qc->err_mask is non-zero and
the command is not an internal command (internal cmd is handled via
->post_internal_cmd). The EH scheduling itself is performed by asking
SCSI midlayer to schedule EH for the specified scmd.
For drivers implementing old-EH, nothing changes. As this change
makes ata_qc_complete() rather large, it's not inlined anymore and
__ata_qc_complete() is exported to other parts of libata for later
use.
Signed-off-by: Tejun Heo <htejun@gmail.com>
New EH framework has clear distinction about who owns a qc. Every qc
starts owned by normal execution path - PIO, interrupt or whatever.
When an exception condition occurs which affects the qc, the qc gets
scheduled for EH. Note that some events (say, link lost and regained,
command timeout) may schedule qc's which are not directly related but
could have been affected for EH too. Scheduling for EH is atomic
w.r.t. ap->host_set->lock and once schedule for EH, normal execution
path is not allowed to access the qc in whatever way. (PIO
synchronization acts a bit different and will be dealt with later)
This patch make ata_qc_from_tag() check whether a qc is active and
owned by normal path before returning it. If conditions don't match,
NULL is returned and thus access to the qc is denied.
__ata_qc_from_tag() is the original ata_qc_from_tag() and is used by
libata core/EH layers to access inactive/failed qc's.
This change is applied only if the associated LLDD implements new EH
as indicated by non-NULL ->error_handler
Signed-off-by: Tejun Heo <htejun@gmail.com>
New EH may issue internal commands to recover from error while failed
qc's are still hanging around. To allow such usage, reserve tag
ATA_MAX_QUEUE-1 for internal command. This also makes it easy to tell
whether a qc is for internal command or not. ata_tag_internal() test
implements this test.
To avoid breaking existing drivers, ata_exec_internal() uses
ATA_TAG_INTERNAL only for drivers which implement ->error_handler.
For drivers using old EH, tag 0 is used. Note that this makes
ata_tag_internal() test valid only when ->error_handler is
implemented. This is okay as drivers on old EH should not and does
not have any reason to use ata_tag_internal().
Signed-off-by: Tejun Heo <htejun@gmail.com>
Add dev->ap which points back to the port the device belongs to. This
makes it unnecessary to pass @ap for silly reasons (e.g. printks).
Also, this change is necessary to accomodate later PM support which
will introduce ATA link inbetween port and device.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Use new SCR and on/offline functions. Note that for LLDD which know
it implements SCR callbacks, SCR functions are guaranteed to succeed
and ata_port_online() == !ata_port_offline().
Signed-off-by: Tejun Heo <htejun@gmail.com>
Implement ata_scr_{valid|read|write|write_flush}() and
ata_port_{online|offline}(). These functions replace
scr_{read|write}() and sata_dev_present().
Major difference between between the new SCR functions and the old
ones is that the new ones have a way to signal error to the caller.
This makes handling SCR-available and SCR-unavailable cases in the
same path easier. Also, it eases later PM implementation where SCR
access can fail due to various reasons.
ata_port_{online|offline}() functions return 1 only when they are
affirmitive of the condition. e.g. if SCR is unaccessible or
presence cannot be determined for other reasons, these functions
return 0. So, ata_port_online() != !ata_port_offline(). This
distinction is useful in many exception handling cases.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Init ap->cbl to ATA_CBL_SATA in ata_host_init(). This is necessary
for soon-to-follow SCR handling function changes. LLDDs are free to
change ap->cbl during probing.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Add qc->result_tf and ATA_QCFLAG_RESULT_TF. This moves the
responsibility of loading result TF from post-compltion path to qc
execution path. qc->result_tf is loaded if explicitly requested or
the qc failsa. This allows more efficient completion implementation
and correct handling of result TF for controllers which don't have
global TF representation such as sil3124/32.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Make ata_do_reset() deal only with reset. postreset is now the
responsibility of the caller. This is simpler and eases later
prereset addition.
Signed-off-by: Tejun Heo <htejun@gmail.com>
It's not a very good idea to allocate memory during EH. Use
statically allocated buffer for dev->id[] and add 512byte buffer
ap->sector_buf. This buffer is owned by EH (or probing) and to be
used as temporary buffer for various purposes (IDENTIFY, NCQ log page
10h, PM GSCR block).
Signed-off-by: Tejun Heo <htejun@gmail.com>
ap->active_tag was cleared in ata_qc_free(). This left ap->active_tag
dangling after ata_qc_complete(). Spurious interrupts inbetween could
incorrectly access the qc. Clear active_tag in ata_qc_complete().
This change is necessary for later EH changes.
Signed-off-by: Tejun Heo <htejun@gmail.com>
ata_bus_probe() doesn't clear dev->class after ->phy_reset(). This
can result in falsely enabled devices if probing fails. Clear
dev->class to ATA_DEV_UNKNOWN after fetching it.
Signed-off-by: Tejun Heo <htejun@gmail.com>
While moving ata_scsi_error() from LLDD sht to libata transportt,
EXPORT_SYMBOL_GPL() entry was left out. Kill it.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Rename ata_down_sata_spd_limit() and friends to sata_down_spd_limit()
and likewise for simplicity & consistency.
Signed-off-by: Tejun Heo <htejun@gmail.com>
libata needs to invoke EH without scmd. This patch adds
shost->host_eh_scheduled to implement such behavior.
Currently the only user of this feature is libata and no general
interface is defined. This patch simply adds handling for
host_eh_scheduled where needed and exports scsi_eh_wakeup() to
modules. The rest is upto libata. This is the result of the
following discussion.
http://thread.gmane.org/gmane.linux.scsi/23853/focus=9760
In short, SCSI host is not supposed to know about exceptions unrelated
to specific device or command. Such exceptions should be handled by
transport layer proper. However, the distinction is not essential to
ATA and libata is planning to depart from SCSI, so, for the time
being, libata will be using SCSI EH to handle such exceptions.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Introduce scsi_req_abort_cmd(struct scsi_cmnd *).
This function requests that SCSI Core start recovery for the
command by deleting the timer and adding the command to the eh
queue. It can be called by either LLDDs or SCSI Core. LLDDs who
implement their own error recovery MAY ignore the timeout event if
they generated scsi_req_abort_cmd.
First post:
http://marc.theaimsgroup.com/?l=linux-scsi&m=113833937421677&w=2
Signed-off-by: Luben Tuikov <ltuikov@yahoo.com>
Signed-off-by: Tejun Heo <htejun@gmail.com>
Problem spotted by: Suzuki K P <suzuki@in.ibm.com>
A zero return on success isn't correct for filesystem write functions.
They should either return negative error or the length of bytes
consumed. Add code to convert our zero on success error return to
return the length of bytes passed in.
This fixes the following:
$ echo "scsi remove-single-device 0 0 3 0" > /proc/scsi/scsi
bash: echo: write error: No such device or address"
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
debugged by wrwhitehead@novell.com
patch and analysis by fujita.tomonori@lab.ntt.co.jp
Only tcp_read_sock and recv_actor (iscsi_tcp_data_recv for us) see
desc.count. It is is used just for permitting tcp_read_sock to read
the portion of data in the socket.
When iscsi_tcp_data_recv sees a partial header, it sets
desc.count. However, it is possible that the next skb (containing the
rest of the header) still does not come. So I'm not sure that this
scheme is completely correct.
Ideally, we should use the exact length of the data in the socket for
desc.count. However, it is not so simple (see SIOCINQ in
tcp_ioctl). So I think that iscsi_tcp_data_recv can just stop playing
with desc.count and tell tcp_read_sock to read the all skbs. As
proposed already, if iscsi_tcp_data_ready sets desc.count to
non-zero, tcp_read_sock does that.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
debugged by Ming and Rohan:
The problem Ming and Rohan debugged was that during a normal session
login, open-iscsi is not incrementing the exp_statsn counter. It was
stuck at zero. From the RFC, it looks like if the login response PDU has
a successful status then we should be incrementing that value. Also from
the RFC, it looks like if when we drop a connection then reconnect, we
should be using the exp_statsn from the old connection in the next
relogin attempt.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
align printk output
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
from patmans@us.ibm.com and michaelc@cs.wisc.edu
Fix bugs when forcing a mgmt task to fail and allow
session recovery to cleanup the session/connection
of any running mgmt tasks. When called during
the in login state.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
add transport end point callbacks so iscsi drivers that cannot connect
from userspace, like iscsi tcp, using sockets do not have to
implement their own socket infrastructure.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch updates the lpfc driver to revision 8.1.6, which includes
the following changes:
- Fix data corruption in SCSI BUS reset path, due to reusing
the same request structure for each target.
- Change version number to 8.1.6
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Let people enable the advansys driver on x86-32, even though it's broken
on other architectures due to missing DMA mapping infrastructure.
It's used by Jeffrey Phillips Freeman <jeffreyfreeman@syncleus.com> and
possibly others.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix the driver to return SUCCESS if the firmware or driver doesn't
have a command to abort, i.e., it's already been returned. Without
this patch, error recovery will take the target offline as it tries
harder and harder to get the driver to return the command it no longer
has.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
When abort failed, the driver gets reset handleer called. In the reset
handler, driver calls 'scsi_done()' callback for same SCSI command packet
(struct scsi_cmnd) multiple times if there are multiple SCSI command packet
in the pend_list. More over, if there are entry in the pend_lsit with
IOCTL packet associated, the driver returns it to wrong free_list so that,
in turn, the driver could end up with 'NULL pointer dereference..' during
I/O command building with incorrect resource.
Also, the patch contains several minor/cosmetic changes besides this.
Signed-off-by: Seokmann Ju <seokmann.ju@lsil.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Some versions of the IBM 2104-DU3 disk enclosure
have been observed to hang Inquiries to non zero
LUNs to the SES device. This device only has LUN 0,
so this patch adds it to the BLIST to prevent scsi
core from scanning beyond LUN 0.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Some pioneer DVDs are apparently returning odd "not ready" status
codes that the mid-layer doesn't recognise and so passes back to the
user as errors.
This patch overhauls our not-ready handling and adds transparent retries for:
format in progress
rebuild in progress
recalculation in progress
operation in progress
Long write in progress
self test in progress
The Pioneer was actually returning "long write in progress"
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fix resource leak in
drivers/scsi/aic7xxx/aic7xxx_osm_pci.c::ahc_linux_pci_dev_probe()
Found by the coverity checker (#668)
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Adds AHCI support for the VIA VT8251.
Includes a workaround for a hardware bug which requires a Command List
Override before softreset.
Signed-off-by: Bastiaan Jacques <b.jacques@planet.nl>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Fix ahc_pci_write_config's (wrong order of arguments).
Signed-off-by: Denis Vlasenko <vda@ilport.com.ua>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
drivers/scsi/megaraid.c: In function `mega_internal_command':
drivers/scsi/megaraid.c:4474: warning: unused variable `flags'
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
If qla2x00_probe_one() fails before calling request_irq() but gets to
qla2x00_free_device() then it will mistakenly try to free an irq it didn't
request. It's chosing to free based on ha->pdev->irq which is always set.
host->irq is set after request_irq() succeeds so let's use that to decide
to free or not.
This was observed and tested when a silly set of circumstances lead to
firmware loading failing on a 2100.
Signed-off-by: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This fixes coverity bug id #480. Since id_array is declared as
id_array[MAX_SLOTS], the check for i>MAX_SLOTS is obviously false.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
drivers/scsi/scsi_lib.c: In function `scsi_kmap_atomic_sg':
drivers/scsi/scsi_lib.c:2394: warning: unsigned int format, different type arg (arg 3)
drivers/scsi/scsi_lib.c:2394: warning: unsigned int format, different type arg (arg 4)
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>