Driver was processing a fixed max number of cq descriptors per ISR. For
instance, for the SCSI IO queue, number of IOs processed per ISR were 8.
If hardware writes 9 cq descriptors to the cq and generates an interrupt,
driver would process only 8 descriptors and decrement the outstanding
credit count by 8. Unless another interrupt event happens, the hw does
not generate any additional interrupt. This results in the cq descriptor
sitting in the queue without being procesed and can cause IO timeouts
and aborts.
Modify all ISR functions to process all queued cq descriptors in one shot.
Since bulk of ELS frame processing is done in thread context and bulk
of SCSI IO processing is done in soft ISR deferred context, the cycles
spent in the ISR per cq descriptor is small.
Signed-off-by: Herman Lee <hermlee@cisco.com>
Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
I was running into several different panics under stress, which I traced down
to a few different possible slab corruption issues in error handling paths.
I have not yet looked into why these exchange sends fail, but with these
fixes my test system is much more stable under stress than before.
fc_elsct_send() could fail and either leave the passed in frame intact
(failure in fc_ct/els_fill) or the frame could have been freed if the
failure was is fc_exch_seq_send(). The caller had no way of knowing, and
there was a potential double free in the error handling in fc_fcp_rec().
Make fc_elsct_send() always free the frame before returning, and remove the
fc_frame_free() call in fc_fcp_rec().
While fc_exch_seq_send() did always consume the frame, there were double free
bugs in the error handling of fc_fcp_cmd_send() and fc_fcp_srr() as well.
Numerous calls to error handling routines (fc_disc_error(),
fc_lport_error(), fc_rport_error_retry() ) were passing in a frame pointer that
had already been freed in the case of an error. I have changed the call
sites to pass in a NULL pointer, but there may be more appropriate error
codes to use.
Question: Why do these error routines take a frame pointer anyway? I
understand passing in a pointer encoded error to the response handlers, but
the error routines take no action on a valid pointer and should never be
called that way.
Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Calls ndo_fcoe_enabled() of the associated netdev upon creating the FCoE
instance to make sure LLD has all necessary resources allocated and setup
properly before passing FCoE traffic. Similarly, calls ndo_fcoe_disable()
upon destroying the FCoE instance on the associated netdev to allow the LLD
to release all allocated resources for FCoE.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
In case of sequence offload, in fc_fcp_send_data(), the skb_fill_page_info()
called may end up adding more frags to the skb_shinfo(fp_skb(fp))->frags[],
exceeding SKB_MAX_FRAGS, this eventually corrupts the memory. I am adding the
FR_FRAME_SG_LEN back, but as SKB_MAX_FRAGS -1, leaving 1 for our fcoe_eof_crc
page. And send will be broken into multiple large sends if the frame already
contains more frags than skb handle.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Add a define of FCOE_MTU as 2158 bytes and use FCOE_MTU when the LLD is found
to support NETIF_F_FCOE_MTU. The lport->mfs is then calculated out of the
2158 FCOE_MTU. Otherwise, we stick with the netdev->mtu, i.e., LAN MTU. Also,
change the notification on NETDEV_CHANGEMTU event to bypass changing mfs when
LAN MTU is changed if NETIF_F_FCOE_MTU is supported.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
When doing echo ethX > /sys..../destroy I am getting
errors when the tear down succeeds. It looks like the
reason for this is because the rc var is not getting set
when the destruction works. This just sets it to zero.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Reported-by: Alex Lyakas <alexl@mellanox.co.il>
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Adds missing exch release when RRQ is accepted by calling
fc_seq_ls_acc. Adds common exch release for fc_exch_els_rrq
by use of out label.
Reported-by: Alex Lyakas <alexl@mellanox.co.il>
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Initializing these libfc globals per lport could mess up exch
allocation/free for existing lport.
So this patch moves their initialization to fc_setup_exch_mgr
so that these globals gets initialized only once for libfc.
Reported-by: Alex Lyakas <alexl@mellanox.co.il>
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
It's possible and harmless to get FLOGI timeouts
while in RESET state. Don't do a WARN_ON in that case.
Also, split out the other WARN_ONs in fc_lport_timeout, so
we can tell which one is hit by its line number.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Fix minor errors.
A debug message said an RLIR was received instead of ECHO.
"Expected" was misspelled in several places.
Fix a type cast from u32 to __be32.
Rob, Some of these may have been also taken care of in your
other doc cleanup patch. Feel free to fold them in.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This bug is exposed when there is a link flap in LLD. Particularly, when it
happens right after a SCSI write command is sent out, no FCP_DATA is sent,
causing fsp->status_code to be set as FC_DATA_UNDRUN in fc_fcp_complete_locked
even no SCSI status is received. Consequently, fc_io_compl treats this as DID_OK.
This results in SCSI returning successful to the initial I/O request even
there is no DATA actually sent. Particularly, if you run an I/O tool w/ data
verification on, the read back for verification is gonna fail.
This is fixed here by checking when FC_DATA_UNDRUN happens, SCSI status is
received w/ FC_SRB_RCV_STATUS set in fsp->state.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This argument isn't used, let's not pass it into the routine.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
These are a few functions that were not used by other
modules. They did not need to be exported so this patch
removes the EXPORT_SYMBOLS call for each.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Remove the redundant checking of netdev->netdev_ops as it will never be NULL.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
xid 0 was used as an indication of invalid xid before but now xid 0
can be used as a valid exchange i. This patch fixes the ddp completion
in fcp layer, i.e., in fc_fcp.c:fc_fcp_ddp_done() function, to make sure it
does not use xid 0 for indication of an invalid xid, instead, it now
uses use FC_XID_UNKNOWN for such indication.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
A received Fibre Channel ELS PRLI request contains a bit that
indicates whether the remote port supports certain retry processing
sequences. The test for this bit was somehow coded to use multiply
instead of AND!
This case would apply only for target mode operation, and it is
unlikely to be noticed as an initiator.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
When issuing a Cancel to the virtual fibre channel adapter,
the interface specifies a flags field for the client to indicate
what kind of error recovery is being performed. Fix up these
flags for terminate_rport_io to indicate an abort task set
rather than a target reset.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Remove a parameter to ibmvfc_init_host which is always set to
zero by all callers.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Need to grab the host lock around the call to ibmvfc_link_down.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
When processing the response to either a LUN reset,
target reset, or an abort task set, the ibmvfc driver needs to
treat as success receiving a response with a non-zero
status in the response IU along with a general transport
error with the FCP response code being zero. The VIOS
currently guarantees this cannot happen, but a future version
of VIOS may allow this to be returned, so ensure we handle
this response combination correctly for TMFs, as we already
do for SCSI commands.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This version fixes 64-bit modulo on 32-bit as well as inadvertent map
updates when TP was disabled.
Implement support for thin provisioning in scsi_debug. No actual memory
de-allocation is taking place. The intent is to emulate a thinly
provisioned storage device, not to be one.
There are four new module options:
- unmap_granularity specifies the granularity at which to track mapped
blocks (specified in number of logical blocks). 2048 (1 MB) is a
realistic value for disk arrays although some may have a finer
granularity.
- unmap_alignment specifies the first LBA which is naturally aligned on
an unmap_granularity boundary.
- unmap_max_desc specifies the maximum number of ranges that can be
unmapped using one UNMAP command. If this is 0, only WRITE SAME is
supported and UNMAP will cause a check condition.
- unmap_max_blocks specifies the maximum number of blocks that can be
unmapped using a single UNMAP command. Default is 0xffffffff.
These parameters are reported in the new and extended block limits VPD.
If unmap_granularity is specified the device is tagged as thin
provisioning capable in READ CAPACITY(16). A bitmap is allocated to
track whether blocks are mapped or not. A WRITE request will cause a
block to be mapped. So will WRITE SAME unless the UNMAP bit is set.
Blocks can be unmapped using either WRITE SAME or UNMAP. No accounting
is done to track partial blocks. This means that only whole blocks will
be marked free. This is how the array people tell me their firmwares
work.
GET LBA STATUS is also supported. This command reports whether a block
is mapped or not, and how long the adjoining mapped/unmapped extent is.
The block allocation bitmap can also be viewed from user space via:
/sys/bus/pseudo/drivers/scsi_debug/map
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Add definitions for UNMAP, WRITE SAME{16,32} and GET LBA STATUS
commands.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Correct issues where the lower scsi-status would be improperly
cleared, instead, allow the midlayer to process the status after
the proper residual-count checks are performed. Finally,
validate firmware status flags prior to assigning values from the
FCP_RSP frame.
Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com>
Signed-off-by: Michael Hernandez <michael.hernandez@qlogic.com>
Signed-off-by: Ravi Anand <ravi.anand@qlogic.com>
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Original code would not register FC4 nor FDMI information after a
logical tear-down of an VFC link. Code now triggers registration
date during processing of a 'Report ID Acquisition IOCB', which
is submitted after a FLOGI or FDISC completes.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Original code discarded response-info field information and
assumed the command completed successfully without verifying the
target's status within the FCP_RSP packet.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
In some case, the MPI and PHY versions when retrieved after the
Execute-FW mailbox-command are incorrect (255.255.255.255).
Instead, query the information after the check for firmware ready
is done in the abort ISP path.
Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com>
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The mailbox register values may assist in debugging efforts.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Currently if the balloon driver is unable to increase the guest's
reservation it assumes the failure was due to reaching its full
allocation, gives up on the ballooning operation and records the limit
it reached as the "hard limit". The driver will not try again until
the target is set again (even to the same value).
However it is possible that ballooning has in fact failed due to
memory pressure in the host and therefore it is desirable to keep
attempting to reach the target in case memory becomes available. The
most likely scenario is that some guests are ballooning down while
others are ballooning up and therefore there is temporary memory
pressure while things stabilise. You would not expect a well behaved
toolstack to ask a domain to balloon to more than its allocation nor
would you expect it to deliberately over-commit memory by setting
balloon targets which exceed the total host memory.
This patch drops the concept of a hard limit and causes the balloon
driver to retry increasing the reservation on a timer in the same
manner as when decreasing the reservation.
Also if we partially succeed in increasing the reservation
(i.e. receive less pages than we asked for) then we may as well keep
those pages rather than returning them to Xen.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
Change totalram_pages when a single page is added/removed to the
ballooned list. This avoid totalram_pages to be set erroneously to
max_pfn at boot.
Signed-off-by: Gianluca Guida <gianluca.guida@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
o Now issues of blkio controller and CFQ in module mode should be fixed.
Enable the cfq group scheduling support in module mode.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
o One of the goals of block IO controller is that it should be able to
support mulitple io control policies, some of which be operational at
higher level in storage hierarchy.
o To begin with, we had one io controlling policy implemented by CFQ, and
I hard coded the CFQ functions called by blkio. This created issues when
CFQ is compiled as module.
o This patch implements a basic dynamic io controlling policy registration
functionality in blkio. This is similar to elevator functionality where
ioschedulers register the functions dynamically.
o Now in future, when more IO controlling policies are implemented, these
can dynakically register with block IO controller.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
o blkio controller is inside the kernel and cfq makes use of interfaces
exported by blkio. CFQ can be a module too, hence export symbols used
by CFQ.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
With CLONE_IO, parent's io_context->nr_tasks is incremented, but never
decremented whenever copy_process() fails afterwards, which prevents
exit_io_context() from calling IO schedulers exit functions.
Give a task_struct to exit_io_context(), and call exit_io_context() instead of
put_io_context() in copy_process() cleanup path.
Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
With CLONE_IO, copy_io() increments both ioc->refcount and ioc->nr_tasks.
However exit_io_context() only decrements ioc->refcount if ioc->nr_tasks
reaches 0.
Always call put_io_context() in exit_io_context().
Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>