Both backstores and fabrics use arrays of struct scatterlist to describe
data buffers. However TCM used struct se_mems, basically a linked list
of scatterlist entries. We are able to simplify the code by eliminating
this intermediate data structure and just using struct scatterlist[]
throughout.
Also, moved attachment of task to cmd out of transport_generic_get_task
and into allocate_control_task and allocate_data_tasks. The reasoning
is that it's nonintuitive that get_task should automatically add it to
the cmd's task list -- it should just return an allocated, initialized
task. That's all it should do, based on the function's name, so either the
function shouldn't do it, or the name should change to encapsulate the
entire essence of what it does.
(nab: Fix compile warnings in tcm_fc, and make transport_kmap_first_data_page
honor sg->offset for SGLs from contigious memory with TCM_Loop, and
fix control se_cmd descriptor memory leak)
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Since sectors is not modified, it's more straightforward to do this.
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Due to all cdbs' data buffers being referenced by scatterlists, buffers
of more than a page are not contiguous. Instead of handling this in all
control command handlers, we may be able to get away with just limiting
control cdb data buffers to one page. The only control CDBs we handle that
have potentially large data buffers are REPORT LUNS and UNMAP, so if we
didn't want to live with this limitation, they would need to be modified
to walk the pages in the data buffer's sgl.
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Previously, some control CDBs did not allocate memory in pages for their
data buffer, but just did a kmalloc. This patch makes all cdbs allocate
pages.
This has the benefit of streamlining some paths that had to behave
differently when we used two allocation methods. The downside is that
all accesses to the data buffer need to kmap it before use, and need to
handle data in page-sized chunks if more than a page is needed for a given
command's data buffer.
Finally, note that cdbs with no data buffers are handled a little
differently. Before, SCSI_NON_DATA_CDBs would not call get_mem at all
(they'd be in the final else in transport_allocate_resources) but now
these will make it into generic_get_mem, but just not allocate any
buffers.
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Implement page B1h, Block Device Characteristics, so that we can report
a medium rotation rate of 1 (non-rotating / solid state) if the
is_nonrot device attribute is set; we update the iblock backend to set
this attribute if the underlying Linux block device has its nonrot
flag set.
Signed-off-by: Roland Dreier <roland@purestorage.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds SCF_EMULATE_QUEUE_FULL support using -EAGAIN failures
via transport_handle_queue_full() to signal queue full in completion
path TFO->queue_data_in() and TFO->queue_status() callbacks.
This is done using a new se_cmd->transport_qf_callback() to handle
the following queue full exception cases within target core:
*) TRANSPORT_COMPLETE_OK (for completion path queue full)
*) TRANSPORT_COMPLETE_QF_WP (for TRANSPORT_WRITE_PENDING queue full)
*) transport_send_check_condition_and_sense() failure paths in
transport_generic_request_failure() and transport_generic_complete_ok()
All logic is driven using se_device->qf_work_queue -> target_qf_do_work()
to to requeue outstanding se_cmd at the head of se_dev->queue_obj->qobj_list
for transport_processing_thread() execution.
Tested using tcm_qla2xxx with MAX_OUTSTANDING_COMMANDS=128 for FCP READ
to trigger the TRANSPORT_COMPLETE_OK queue full cases, and a simulated
TFO->write_pending() -EAGAIN failure to trigger TRANSPORT_COMPLETE_QF_WP.
Reported-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds a transport_handle_cdb_direct() optimization for mapping
and queueing tasks directly from within fabric processing context by calling
the newly exported transport_generic_new_cmd(). This currently expects to
be called from process context only, and will fail if called within interrupt
context.
This patch also leaves transport_generic_handle_cdb() unmodified for the
moment to function as expected with existing tcm_fc and ib_srpt fabrics,
and will be removed once these have been converted and tested with v4.1
code using transport_handle_cdb_direct().
Based on Andy's original patch here:
[PATCH 39/42] target: Call transport_new_cmd instead of adding to cmd queue
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The release_cmd_to_pool and release_cmd_direct methods are always the same.
Merge them into a single release_cmd method, and clean up the fallout.
(nab: fix breakage in transport_generic_free_cmd() parameter build breakage
in drivers/target/tcm_fc/tfc_cmd.c)
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch contains a squashed version to remove unused SCF_* flags:
target: remove the unused SCF_SE_DISABLE_ONLINE_CHECK flag
target: remove the unused SCF_CMD_PASSTHROUGH_NOALLOC flag
target: remove the unused SCF_EMULATE_SYNC_UNMAP flag
target: remove the unused SCF_EMULATE_SYNC_CACHE flag
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch contains a squashed version of third round series cleanups,
improvements ,and simplfications from Andy and Christoph ahead of the
heavy lifting between round 3 -> 4 for the target core SGL conversion.
This include cleanups to the main target I/O path and other miscellaneous
updates.
target: Replace custom sg<->buf functions with lib funcs
target: Simplify sector limiting code
target: get_cdb should never return NULL
target: Simplify transport_memcpy_se_mem_read_contig
target: Use assignment rather than increment for t_task_cdbs
target: Don't pass dma_size to generic_get_mem
target: Pass sg with type scatterlist in transport_map_sg_to_mem
target: Move task_sg_num next to task_sg in struct se_task
target: inline struct se_transport_task into struct se_cmd
target: Change name & semantics of transport_get_sectors()
target: Remove unused members of se_cmd
target: Rename se_cmd.t_task_cdbs to t_task_list_num
target: Fix some spelling
target: Remove unused var from transport_generic_do_tmr
target: map_sg_to_mem: return sg_count in return value
target/pscsi: Use min_t for sector limits
target/pscsi: Unused param for pscsi_get_bio()
target: Rename get_cdb_count to allocate_tasks
target: Make transport_generic_new_cmd() available for iscsi-target
target: Remove fabric callback to allocate iovecs
target: Fix transport_generic_new_cmd WRITE comment
(hch: Use __GFP_ZERO usage for alloc_pages() usage)
Signed-off-by: Andy Grover <agrover@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch contains the squashed version of second round of target core
cleanups and simplifications and Andy and Co. It also contains a handful
of fixes to address bugs the original series and other minor cleanups.
Here is the condensed shortlog:
target: Remove unneeded casts to void*
target: Rename get_lun_for_{cmd,tmr} to lookup_{cmd,tmr}_lun
target: Make t_task a member of se_cmd, not a pointer
target: Handle functions returning "-2"
target: Use cmd->se_dev over cmd->se_lun->lun_se_dev
target: Embed qr in struct se_cmd
target: Replace embedded struct se_queue_req with a list_head
target: Rename list_heads that are nodes in struct se_cmd to "*_node"
target: Fold transport_device_setup_cmd() into lookup_{tmr,cmd}_lun()
target: Make t_mem_list and t_mem_list_bidi members of t_task
target: Add comment & cleanup transport_map_sg_to_mem()
target: Remove unneeded checks in transport_free_pages()
(Roland: Fix se_queue_req removal leftovers OOPs)
(nab: Fix transport_lookup_tmr_lun failure case)
(nab: Fix list_empty(&cmd->t_task.t_mem_bidi_list) inversion bugs)
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch contains the squashed version of a number of cleanups and
minor fixes from Andy's initial series (round 1) for target core this
past spring. The condensed log looks like:
target: use errno values instead of returning -1 for everything
target: Rename transport_calc_sg_num to transport_init_task_sg
target: Fix leak in error path in transport_init_task_sg
target/pscsi: Remove pscsi_get_sh() usage
target: Make two runtime checks into WARN_ONs
target: Remove hba queue depth and convert to spin_lock_irq usage
target: dev->dev_status_queue_obj is unused
target: Make struct se_queue_req.cmd type struct se_cmd *
target: Remove __transport_get_qr_from_queue()
target: Rename se_dev->g_se_dev_list to se_dev_node
target: Remove struct se_global
target: Simplify scsi mib index table code
target: Make dev_queue_obj a member of se_device instead of a pointer
target: remove extraneous returns at end of void functions
target: Ensure transport_dump_vpd_ident_type returns null-terminated str
target: Function pointers don't need to use '&' to be assigned
target: Fix comment in __transport_execute_tasks()
target: Misc style cleanups
target: rename struct pr_reservation_template to pr_reservation
target: Remove #defines that just perform indirection
target: Inline transport_get_task_from_execute_queue()
target: Minor header comment fixes
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch removes the now unnecessary 'unsigned char *cdb' function
parameter from transport_get_lun_for_cmd(). This also includes updating
lio-target, tcm_loop and tcm_fc usage of transport_get_lun_for_cmd().
Reported-by: Fubo Chen <fubo.chen@gmail.com>
Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
The Host used to create some page tables for the Guest to use at the
top of Guest memory; it would then tell the Guest where this was. In
particular, it created linear mappings for 0 and 0xC0000000 addresses
because lguest used to switch to its real page tables quite late in
boot.
However, since d50d8fe19 Linux initialized boot page tables in
head_32.S even before the "are we lguest?" boot jump. So, now we can
simplify things: the Host pagetable code assumes 1:1 linear mapping
until it first calls the LHCALL_NEW_PGTABLE hypercall, which we now do
before we reach C code.
This also means that the Host doesn't need to know anything about the
Guest's PAGE_OFFSET. (Non-Linux guests might not even have such a
thing).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
IPv6 fragment identification generation is way beyond what we use for
IPv4 : It uses a single generator. Its not scalable and allows DOS
attacks.
Now inetpeer is IPv6 aware, we can use it to provide a more secure and
scalable frag ident generator (per destination, instead of system wide)
This patch :
1) defines a new secure_ipv6_id() helper
2) extends inet_getid() to provide 32bit results
3) extends ipv6_select_ident() with a new dest parameter
Reported-by: Fernando Gont <fernando@gont.com.ar>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kobj and queues_kset are used with CONFIG_XPS=y.
Signed-off-by: Choi, Jong-Hwan <jhbird.choi@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prior to this change, most PHY configuration parameters were passed
into the STMMAC device as a separate PHY device. As well as being
unusual, this made it difficult to make changes to the MAC/PHY
relationship.
This patch moves all the PHY parameters into the MAC configuration
structure, mainly as a separate structure. This allows us to completely
ignore the MDIO bus attached to a stmmac if desired, and not create
the PHY bus. It also allows the stmmac driver to use a different PHY
from the one it is connected to, for example a fixed PHY or bit banging
PHY.
Also derive the stmmac/PHY connection type (MII/RMII etc) from the
mode can be passed into <platf>_configure_ethernet.
STLinux kernel at git://git.stlinux.com/stm/linux-sh4-2.6.32.y.git
provides several examples how to use this new infrastructure (that
actually is easier to maintain and clearer).
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since vlan_group_get_device and vlan_group is not going to be accessible
from device drivers, introduce function which substitutes it.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The machinery for __ARCH_HAS_CLOCKSOURCE_DATA assumed a file in
asm-generic would be the default for architectures without their own
file in asm/, but that is not how it works.
Replace it with a Kconfig option instead.
Link: http://lkml.kernel.org/r/4E288AA6.7090804@zytor.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Andy Lutomirski <luto@mit.edu>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Tony Luck <tony.luck@intel.com>
Some gcc versions warn about prototypes without "inline" when the declaration
includes the "inline" keyword. The fix generates a false error message
"marked inline, but without a definition" with sparse below 0.4.2.
Signed-off-by: Chris Friesen <chris.friesen@genband.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
If overlapping networks with different interfaces was added to
the set, the type did not handle it properly. Example
ipset create test hash:net,iface
ipset add test 192.168.0.0/16,eth0
ipset add test 192.168.0.0/24,eth1
Now, if a packet was sent from 192.168.0.0/24,eth0, the type returned
a match.
In the patch the algorithm is fixed in order to correctly handle
overlapping networks.
Limitation: the same network cannot be stored with more than 64 different
interfaces in a single set.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
The non-debug variant of mutex_destroy is a no-op, currently
implemented as a macro which does nothing. This approach fails
to check the type of the parameter, so an error would only show
when debugging gets enabled. Using an inline function instead,
offers type checking for earlier bug catching.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110716174200.41002352@endymion.delvare
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Big kernel lock had been removed and setlease now use the lock_flocks()
to hold a special spin lock file_lock_lock by Matthew.
So just remove the out-of-date NOTE.
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Btrfs needs to be able to control how filemap_write_and_wait_range() is called
in fsync to make it less of a painful operation, so push down taking i_mutex and
the calling of filemap_write_and_wait() down into the ->fsync() handlers. Some
file systems can drop taking the i_mutex altogether it seems, like ext3 and
ocfs2. For correctness sake I just pushed everything down in all cases to make
sure that we keep the current behavior the same for everybody, and then each
individual fs maintainer can make up their mind about what to do from there.
Thanks,
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This just gets us ready to support the SEEK_HOLE and SEEK_DATA flags. Turns out
using fiemap in things like cp cause more problems than it solves, so lets try
and give userspace an interface that doesn't suck. We need to match solaris
here, and the definitions are
*o* If /whence/ is SEEK_HOLE, the offset of the start of the
next hole greater than or equal to the supplied offset
is returned. The definition of a hole is provided near
the end of the DESCRIPTION.
*o* If /whence/ is SEEK_DATA, the file pointer is set to the
start of the next non-hole file region greater than or
equal to the supplied offset.
So in the generic case the entire file is data and there is a virtual hole at
the end. That means we will just return i_size for SEEK_HOLE and will return
the same offset for SEEK_DATA. This is how Solaris does it so we have to do it
the same way.
Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Moving the event counter into the dynamically allocated 'struc seq_file'
allows poll() support without the need to allocate its own tracking
structure.
All current users are switched over to use the new counter.
Requested-by: Andrew Morton akpm@linux-foundation.org
Acked-by: NeilBrown <neilb@suse.de>
Tested-by: Lucas De Marchi lucas.demarchi@profusion.mobi
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Simple filesystems always pass inode->i_sb_bdev as the block device
argument, and never need a end_io handler. Let's simply things for
them and for my grepping activity by dropping these arguments. The
only thing not falling into that scheme is ext4, which passes and
end_io handler without needing special flags (yet), but given how
messy the direct I/O code there is use of __blockdev_direct_IO
in one instead of two out of three cases isn't going to make a large
difference anyway.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
i_alloc_sem is a rather special rw_semaphore. It's the last one that may
be released by a non-owner, and it's write side is always mirrored by
real exclusion. It's intended use it to wait for all pending direct I/O
requests to finish before starting a truncate.
Replace it with a hand-grown construct:
- exclusion for truncates is already guaranteed by i_mutex, so it can
simply fall way
- the reader side is replaced by an i_dio_count member in struct inode
that counts the number of pending direct I/O requests. Truncate can't
proceed as long as it's non-zero
- when i_dio_count reaches non-zero we wake up a pending truncate using
wake_up_bit on a new bit in i_flags
- new references to i_dio_count can't appear while we are waiting for
it to read zero because the direct I/O count always needs i_mutex
(or an equivalent like XFS's i_iolock) for starting a new operation.
This scheme is much simpler, and saves the space of a spinlock_t and a
struct list_head in struct inode (typically 160 bits on a non-debug 64-bit
system).
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The forward declaration of struct file_operations is
added to avoid compilation warnings.
Signed-off-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Now we have a per-superblock shrinker implementation, we can add a
filesystem specific callout to it to allow filesystem internal
caches to be shrunk by the superblock shrinker.
Rather than perpetuate the multipurpose shrinker callback API (i.e.
nr_to_scan == 0 meaning "tell me how many objects freeable in the
cache), two operations will be added. The first will return the
number of objects that are freeable, the second is the actual
shrinker call.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
With context based shrinkers, we can implement a per-superblock
shrinker that shrinks the caches attached to the superblock. We
currently have global shrinkers for the inode and dentry caches that
split up into per-superblock operations via a coarse proportioning
method that does not batch very well. The global shrinkers also
have a dependency - dentries pin inodes - so we have to be very
careful about how we register the global shrinkers so that the
implicit call order is always correct.
With a per-sb shrinker callout, we can encode this dependency
directly into the per-sb shrinker, hence avoiding the need for
strictly ordering shrinker registrations. We also have no need for
any proportioning code for the shrinker subsystem already provides
this functionality across all shrinkers. Allowing the shrinker to
operate on a single superblock at a time means that we do less
superblock list traversals and locking and reclaim should batch more
effectively. This should result in less CPU overhead for reclaim and
potentially faster reclaim of items from each filesystem.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>