All of these functions are entered with IRQs enabled.
Hence the unconditional spin_unlock_irq can be used.
Function: Caller context:
dequeue_event() client process, via read(2)
fill_bus_reset_event() fw-device.c update worqueue job
release_client_resource() client process, via ioctl(2)
fw_device_op_release() client process, via close(2)
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Make the size check of ioctl_send_request and
ioctl_send_broadcast_request speed dependent. Also change the error
return code from -EINVAL to -EIO to distinguish this from other errors
concerning the ioctl parameters.
Another payload size limit for which we don't check here though is the
remote node's Bus_Info_Block.max_rec.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
We don't want random users write to Memory Space (e.g. PCs with physical
DMA filters down) or to core CSRs like Reset_Start.
This does not protect SBP-2 target CSRs. But properly behaving SBP-2
targets ignore broadcast write requests to these registers, and the
maximum damage which can happen with laxer targets is DOS. But there
are ways to create DOS situations anyway if there are devices with weak
device file permissions (like audio/video devices) present at the same
bus as an SBP-2 target.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Write transactions to the broadcast node ID are a convenient way to
trigger functions of multiple nodes at once. IIDC is a protocol which
can make use of this if multiple cameras with same command_regs_base are
connected at the same bus.
Based on
Date: Wed, 10 Sep 2008 11:32:16 -0400
From: Jay Fenlason <fenlason@redhat.com>
Subject: [patch] SEND_BROADCAST_REQUEST
Changes: ioctl_send_request() and ioctl_send_broadcast_request() now
share code. Broadcast speed corrected to S100. Check for proper tcode.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
While the speed of asynchronous transactions is automatically chosen by
the kernel, the speed of isochronous streams has to be chosen by the
initiating client.
In case of 1394a bus topologies, the maximum possible speed could be
figured out with some effort by evaluation of the remote node's link
speed field in the config ROM, the local node's link speed field, and
the PHY speeds and topologic information in the local node's or IRM's
topology map CSR. However, this does not work in case of 1394b buses.
Hence add an ioctl to export the maximum speed which the kernel already
determined.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This adds ioctls for allocation and deallocation of a channel or/and
bandwidth without auto-reallocation and without auto-deallocation.
The benefit of these ioctls is that libraw1394-style isochronous
resource management can be implemented without write access to the IRM's
character device file.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Based on
Date: Tue, 18 Nov 2008 11:41:27 -0500
From: Jay Fenlason <fenlason@redhat.com>
Subject: [Patch V4] Add ISO resource management support
with several changes to the ABI and implementation. Only the part of
the ABI which enables auto-reallocation and auto-deallocation is
included here.
This implements ioctls for kernel-assisted allocation of isochronous
channels and isochronous bandwidth. The benefits are:
- The client does not have to have write access to the /dev/fw* device
corresponding to the IRM.
- The client does not have to perform reallocation after bus resets.
- Channel and bandwidth are deallocated by the kernel if the file is
closed before the client deallocated the resources. Thus resources
are released even if the client crashes.
It is anticipated that future in-kernel code (firewire-core IRM code;
the firewire port of firedtv), will use the fw-iso.c portions of this
code too.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Tested-by: David Moore <dcm@acm.org>
to indicate that they are specializations of struct event or of struct
client_resource, respectively.
struct response was both an event and a client_resource; it is now split
into struct outbound_transaction_resource and ~_event in order to
document more explicitly which types of client resources exist.
struct request and struct_request_event are renamed to struct
inbound_transaction_resource and ~_event because requests and responses
occur in outbound and in inbound transactions.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The lifetime of struct client instances must be longer than the lifetime
of any client resource.
This fixes a possible race between fw_device_op_release and transaction
completions. It also prepares for new ioctls for isochronous resource
management which will involve delayed processing of client resources.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Reviewed-by: David Moore <dcm@acm.org>
The FW_CDEV_IOC_GET_INFO ioctl looks at client->device->config_rom, not
at the local node's config ROM.
We could fix the implementation or the documentation. I believe the way
how it is currently implemented is more useful than the way how it is
currently documented. In fact, libdc1394 uses the ABI already as
implemented, not as documented. Hence let's change the documentation.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
OHCI-1394 1.1 clause 10.4.3 says: "If more than one IR DMA context
specifies receives for packets from the same isochronous channel, the
context destination for that channel's packets is undefined."
Any userspace client and in the future also kernelspace clients can
allocate IR DMA contexts for any channel. We don't want them to
interfere with each other, hence it is preferable to return -EBUSY if
allocation of a second context for a channel is attempted.
Notes:
- This limitation is OHCI-1394 specific, therefore its proper place of
implementation is down in the low-level driver.
- Since the <linux/firewire-cdev.h> ABI simply maps one userspace iso
client context to one hardware iso context, this OHCI-1394
limitation alas requires userspace to implement its own multiplexing
of iso reception from the same channel and card to multiple clients
when needed.
- The limitation is independent of channel allocation at the IRM; the
latter is really only important for the initiation of iso
transmission but not of iso reception.
- We don't need to do the same for IT DMA because OHCI-1394 does not
have any ties between IT contexts and channels. Only the voluntary
channel allocation protocol via the IRM, globally to the FireWire
bus, can ensure proper isochronous transmit behaviour anyway.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Like before my commit 1415d9189e,
fw_core_add_address_handler() does not align the address region now.
Instead the caller is required to pass valid parameters.
Since one of the callers of fw_core_add_address_handler() is the cdev
userspace interface, we now check for valid input. If the client is
buggy, we give it a hint with -EINVAL.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The current code uses a linked list and a counter for storing
resources and the corresponding handle numbers. By changing to an idr
we can be safe from counter wrap-around giving two resources the same
handle.
Furthermore, the deallocation ioctls now check whether the resource to
be freed is of the intended type.
Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Some rework by Stefan R:
- The idr API documentation says we get an ID within 0...0x7fffffff.
Hence we can rest assured that idr handles fit into cdev handles.
- Fix some races. Add a client->in_shutdown flag for this purpose.
- Add allocation retry to add_client_resource().
- It is possible to use idr_for_each() in fw_device_op_release().
- Fix ioctl_send_response() regression.
- Small style changes.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Unlink the client from the fw_device earlier in order to prevent bus
reset events being added to client->event_list during shutdown.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The behaviour of fw-transaction.c::fw_send_request is ill-defined for
any other tcodes than read/ write/ lock request tcodes. Therefore
prevent requests with wrong tcodes from entering the transaction layer.
Maybe fw_send_request should check them itself, but I am not inclined to
change it and fw_fill_request from void-valued functions to ones which
return error codes and pass those up. Besides, maybe fw_send_request is
going to support one more tcode than ioctl_send_request in the future
(TCODE_STREAM_DATA).
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This adds a client_list_lock, which only protects the device's
client_list, so that future versions of the driver can call code that
takes the card->lock while holding the client_list_lock. Adding this
lock is much simpler than adding __ versions of all the functions that
the future version may need. The one ordering issue is to make sure
code never takes the client_list_lock with card->lock held. Since
client_list_lock is only used in three places, that isn't hard.
Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Update fill_bus_reset_event() accordingly. Include linux/spinlock.h.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Previously, when an iso context had header_size > 4, the iso header
(len/tag/channel/tcode/sy) was passed to userspace followed by quadlets
stripped from the payload. This patch changes the behavior:
header_size = 8 now passes the header quadlet followed by the timestamp
quadlet. When header_size > 8, quadlets are stripped from the payload.
The header_size = 4 case remains identical.
Since this alters the semantics of the API, the firewire API version
needs to be bumped concurrently with this change.
This change also refactors the header copying code slightly to be much
easier to read.
Signed-off-by: David Moore <dcm@acm.org>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This patch fixes following build error:
CC ucc_geth.o
ucc_geth.c: In function 'ucc_geth_probe':
ucc_geth.c:3644: error: implicit declaration of function 'uec_mdio_bus_name'
make[2]: *** [ucc_geth.o] Error 1
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We use RCU to defer freeing of conntrack structures. In DOS situation, RCU might
accumulate about 10.000 elements per CPU in its internal queues. To get accurate
conntrack counts (at the expense of slightly more RAM used), we might consider
conntrack counter not taking into account "about to be freed elements, waiting
in RCU queues". We thus decrement it in nf_conntrack_free(), not in the RCU
callback.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Tested-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Having a RTC that doesn't maintain proper time across a reboot is one
thing. But a RTC that doesn't work at all and only causes timeouts is
another.
Tested-by: Martin Michlmayr <tbm@cyrius.com>
Signed-off-by: Nicolas Pitre <nico@marvell.com>
Honour barrier requests in the loop back block device driver.
In case of barrier bios, flush the backing file once before processing the
barrier and once after to guarantee ordering. In case of filesystems that
does not support fsync, barrier bios would be failed with -EOPNOTSUPP.
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Currently inherited from sg.c bsg will submit asynchronous request
at the head-of-the-queue, (using "at_head" set in the call to
blk_execute_rq_nowait()). This is bad in situation where the queues
are full, requests will execute out of order, and can cause
starvation of the first submitted requests.
The sg_io_v4->flags member is used and a bit is allocated to denote the
Q_AT_TAIL. Zero is to queue at_head as before, to be compatible with old
code at the write/read path. SG_IO code path behavior was changed so to
be the same as write/read behavior. SG_IO was very rarely used and breaking
compatibility with it is OK at this stage.
sg_io_hdr at sg.h also has a flags member and uses 3 bits from the first
nibble and one bit from the last nibble. Even though none of these bits
are supported by bsg, The second nibble is allocated for use by bsg. Just
in case.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
CC: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
We've been carrying this patch for the last 3 years in Fedora,
long past time we got it upstream...
Call pci_set_master to enable bus-mastering if the BIOS hasn't
done it already.
Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
In include/linux/genhd.h: Line 335 has a comment that needs to be updated from: /* drivers/block/ll_rw_blk.c */ to /* block/blk-core.c */. Also as of kernel 2.6.16, the function definition for get_blkdev_list was removed from block/genhd.c but the function declaration is still present on line 339. This patch addresses both those fixes, by updating the comment and removing the declaration.
Signed-off-by: Petros Koutoupis <pkoutoupis@hydrasystemsllc.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The integrity bio allocation needs its own bio_set to avoid violating
the mempool allocation rules and risking deadlocks.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The include/linux/genhd.h file, on line 338-352 declares some function
prototypes in which the comment on line 338 states that the definition of
these prototypes are to be found at drivers/block/genhd.c. The problem is
that genhd.c has been relocated to block/genhd.c. See attached patch to
correct this minor cosmetic typo.
Signed-off-by: Petros Koutoupis <pkoutoupis@hydrasystemsllc.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The hardware requires 64-bit alignment of commands, so add a build bug
check for that. The recent commit 8a3173de4a
didn't change the size of the command, but other additions/changes may and
thus break badly at runtime.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
If we don't have CONFIG_BLK_DEV_INTEGRITY set, then we don't have
any external dependencies on the bio_vec slabs. So don't create
the ones that we will inline anyway.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
this warning (which got fixed by commit b2bf968):
fs/bio.c: In function ‘bio_alloc_bioset’:
fs/bio.c:305: warning: ‘p’ may be used uninitialized in this function
Triggered because the code flow in bio_alloc_bioset() is correct
but a bit complex for the compiler to see through.
Streamline it a bit - this also makes the code a tiny bit more compact:
text data bss dec hex filename
7540 256 40 7836 1e9c bio.o.before
7539 256 40 7835 1e9b bio.o.after
Also remove an older compiler-warnings annotation from this function,
it's not needed.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This removes some old code that was causing issues during
filesystem freeze.
Reported-by: Andrew Price <andy@andrewprice.me.uk>
Tested-by: Andrew Price <andy@andrewprice.me.uk>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
The logic requires that we mark the glock dirty in page_mkwrite
otherwise we might not flush correctly in the case that no
allocation was required in the process of dirying the page.
Also we need to set the shared write flag early for the same
reason.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
This cleans up a number of bits of code mostly based in glops.c.
A couple of simple functions have been merged into the callers
to make it more obvious what is going on, the mysterious raising
of i_writecount around the truncate_inode_pages() call has been
removed. The meta_go_* operations have been renamed rgrp_go_*
since that is the only lock type that they are used with.
The unused argument of gfs2_read_sb has been removed. Also
a bug has been fixed where a check for the rindex inode was
in the wrong callback. More comments are added, and the
debugging code is improved too.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
After calling out to the dlm, GFS2 sets the new state of a glock to
gl_target in gdlm_ast(). However, gl_target is not always the lock
state that was requested. If a conversion from shared to exclusive
fails, finish_xmote() will call do_xmote() with LM_ST_UNLOCKED, instead
of gl->gl_target, so that it can reacquire the lock in exlusive the next
time around. In this case, setting the lock to gl_target in gdlm_ast()
will make GFS2 think that it has the glock in exclusive mode, when
really, it doesn't have the glock locked at all. This patch adds a new
field to the gfs2_glock structure, gl_req, to track the mode that was
requested.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
I introduced "is_partially_uptodate" aops for GFS2.
A page can have multiple buffers and even if a page is not uptodate, some buffers
can be uptodate on pagesize != blocksize environment.
This aops checks that all buffers which correspond to a part of a file
that we want to read are uptodate. If so, we do not have to issue actual
read IO to HDD even if a page is not uptodate because the portion we
want to read are uptodate.
"block_is_partially_uptodate" function is already used by ext2/3/4.
With the following patch random read/write mixed workloads or random read after
random write workloads can be optimized and we can get performance improvement.
I did a performance test using the sysbench.
#sysbench --num-threads=16 --max-requests=200000 --test=fileio --file-num=1
--file-block-size=8K --file-total-size=2G --file-test-mode=rndrw --file-fsync-freq=0
--file-rw-ratio=1 run
-2.6.29-rc6
Test execution summary:
total time: 202.6389s
total number of events: 200000
total time taken by event execution: 2580.0480
per-request statistics:
min: 0.0000s
avg: 0.0129s
max: 49.5852s
approx. 95 percentile: 0.0462s
-2.6.29-rc6-patched
Test execution summary:
total time: 177.8639s
total number of events: 200000
total time taken by event execution: 2419.0199
per-request statistics:
min: 0.0000s
avg: 0.0121s
max: 52.4306s
approx. 95 percentile: 0.0444s
arch: ia64
pagesize: 16k
blocksize: 4k
Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Impact: Make symbol static.
Fix this sparse warning:
fs/gfs2/rgrp.c:188:5: warning: symbol 'gfs2_bitfit' was not declared. Should it be static?
Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Fix this sparse warnings:
fs/gfs2/rgrp.c:156:23: warning: constant 0xffffffffffffffff is so big it is unsigned long long
fs/gfs2/rgrp.c:157:23: warning: constant 0xaaaaaaaaaaaaaaaa is so big it is unsigned long long
fs/gfs2/rgrp.c:158:23: warning: constant 0x5555555555555555 is so big it is long long
fs/gfs2/rgrp.c:194:20: warning: constant 0x5555555555555555 is so big it is long long
fs/gfs2/rgrp.c:204:44: warning: constant 0x5555555555555555 is so big it is long long
Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
This adds support for "quota" and "noquota" mount options in addition to the
existing "quota=on/off/account" so that we are compatible with the names by
which these options are more generally known.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
An alignment issue with the existing bitfit algorithm was reported
on IA64. This patch attempts to fix that, and also to tidy up the
code a bit. There is now more documentation about how this works
and it has survived a number of different tests.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>