linux-uconsole/drivers/nvme/host
James Smart 25d87eebc7 nvme-fc: fix double-free scenarios on hw queues
[ Upstream commit c869e494ef ]

If an error occurs on one of the ios used for creating an
association, the creating routine has error paths that are
invoked by the command failure and the error paths will free
up the controller resources created to that point.

However, the failure of that io is detected by an asynchronous
completion routine, which unconditionally invokes the
error_recovery path, which in turn calls delete_association.
Delete association terminates all outstanding io and then tears
down the controller resources. So the create_association thread
can be running in parallel with the error_recovery thread. What
was seen was that the LLDD received a call to delete a queue,
causing the LLDD to free a resource, and then the transport
called delete queue again, causing the driver to repeat the
free. The second free corrupted the allocator. The transport
shouldn't be making the duplicate call, and the queue is just
one of the resources being freed.

The fix relies on the create_association path being completely
serialized, with one command at a time. So the failed io
completion will always be seen by the create_association path,
and at the time of the failure there are no ios to terminate and
there is no reason to be manipulating queue freeze states, etc.
The serialized condition stays true until the controller is
transitioned to the LIVE state. Thus the fix is to change the
error recovery path to check the controller state and only
invoke the teardown path if the controller is not in the
CONNECTING state.
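
The state check described above can be illustrated with a small
userspace model. This is only a sketch and not the code in fc.c: the
names ctrl_state, ctrl, teardown, error_recovery and failed_create
are hypothetical, the two-state enum is a simplification, and the two
threads are modeled as sequential calls rather than real concurrency.
The counter stands in for the number of times the LLDD is asked to
free the same hw queue.

```c
#include <stdbool.h>

/* Hypothetical, simplified controller states; the real driver has more. */
enum ctrl_state { CTRL_CONNECTING, CTRL_LIVE };

struct ctrl {
	enum ctrl_state state;
	int queue_frees;	/* times the LLDD was asked to free the hw queue */
};

/* Stand-in for delete_association: tears down controller resources. */
static void teardown(struct ctrl *c)
{
	c->queue_frees++;
}

/*
 * Error recovery invoked from the asynchronous completion routine.
 * With check_state set (the behavior the commit describes), teardown
 * is skipped while the controller is still CONNECTING, because the
 * create_association path will see the failure and clean up itself.
 */
static void error_recovery(struct ctrl *c, bool check_state)
{
	if (check_state && c->state == CTRL_CONNECTING)
		return;
	teardown(c);
}

/*
 * Models a failed association create: the completion routine runs
 * error recovery, then the create path runs its own error path.
 * Returns how many times the queue was freed.
 */
static int failed_create(bool check_state)
{
	struct ctrl c = { .state = CTRL_CONNECTING, .queue_frees = 0 };

	error_recovery(&c, check_state);	/* from the io completion */
	teardown(&c);				/* create path's error handling */
	return c.queue_frees;
}

/* An error on a LIVE controller must still trigger the teardown. */
static int live_error(void)
{
	struct ctrl c = { .state = CTRL_LIVE, .queue_frees = 0 };

	error_recovery(&c, true);
	return c.queue_frees;
}
```

In this model, failed_create(false) returns 2 (the queue is freed
twice, as in the reported corruption), while failed_create(true)
returns 1, and live_error() still returns 1.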

Reviewed-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09 10:18:54 +01:00
..
core.c nvme: Discard workaround for non-conformant devices 2019-12-31 16:36:01 +01:00
fabrics.c nvme: call nvme_complete_rq when nvmf_check_ready fails for mpath I/O 2018-11-13 11:08:24 -08:00
fabrics.h nvme: if_ready checks to fail io to deleting controller 2018-07-24 13:44:40 +02:00
fault_inject.c nvme: Add fault injection feature 2018-03-26 08:53:43 -06:00
fc.c nvme-fc: fix double-free scenarios on hw queues 2020-01-09 10:18:54 +01:00
Kconfig IB: Revert "remove redundant INFINIBAND kconfig dependencies" 2018-05-28 10:40:16 -06:00
lightnvm.c lightnvm: do no update csecs and sos on 1.2 2019-11-24 08:20:51 +01:00
Makefile nvme: Add fault injection feature 2018-03-26 08:53:43 -06:00
multipath.c nvme-multipath: fix possible io hang after ctrl reconnect 2019-11-12 19:21:11 +01:00
nvme.h nvme: provide fallback for discard alloc failure 2019-12-05 09:20:04 +01:00
pci.c nvme-pci: fix conflicting p2p resource adds 2019-12-01 09:17:13 +01:00
rdma.c nvme-rdma: fix a NULL deref when an admin connect times out 2019-05-31 06:46:15 -07:00
trace.c nvme: add disk name to trace events 2018-07-24 15:55:48 +02:00
trace.h nvme: add disk name to trace events 2018-07-24 15:55:48 +02:00