linux-pinenote

Author	SHA1	Message	Date
Mike Christie	0c70d84b79	[SCSI] iscsi class: export pid of process that created There could be multiple userspace entities creating/destroying/ recoverying sessions and also the kernel's iscsi drivers could be doing this too. If the userspace apps do try to manage the kernel ones it can get the driver/fw out of sync and cause the user to loose the root disk, oopses or ping ponging becasue userspace wants to do one thing but the kernel manager thought we are trying to do another. This patch fixes the problem by just exporting the pid of the entity that created the session. Userspace programs like iscsid, iscsiadm, iscsistart, qlogic's tools, etc, can then figure out which sessions they own and only manage them. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2011-12-15 10:57:40 +04:00
Moger, Babu	2b132577a0	[SCSI] scsi_dh: code cleanup and remove the references to scsi_dev_info All the handlers have now implemented the match function so We don't need to use scsi_dev_info any more for matching purposes. Signed-off-by: Babu Moger <babu.moger@netapp.com> Acked-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2011-12-15 10:55:00 +04:00
Mark Brown	3c8bedb7e4	mfd: Declare da9052_regmap_config for the bus drivers Fixes build failures. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2011-12-15 14:52:37 +08:00
Kay Sievers	0706802183	xen-balloon: convert sysdev_class to a regular subsystem After all sysdev classes are ported to regular driver core entities, the sysdev implementation will be entirely removed from the kernel. Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-12-14 15:32:50 -08:00
Kay Sievers	fe5ff8b84c	edac: convert sysdev_class to a regular subsystem After all sysdev classes are ported to regular driver core entities, the sysdev implementation will be entirely removed from the kernel. Cc: Doug Thompson <dougthompson@xmission.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Lucas De Marchi <lucas.demarchi@profusion.mobi> Cc: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-12-14 15:21:07 -08:00
Kay Sievers	ca22e56deb	driver-core: implement 'sysdev' functionality for regular devices and buses All sysdev classes and sysdev devices will converted to regular devices and buses to properly hook userspace into the event processing. There is no interesting difference between a 'sysdev' and 'device' which would justify to roll an entire own subsystem with different userspace export semantics. Userspace relies on events and generic sysfs subsystem infrastructure from sysdev devices, which are currently not properly available. Every converted sysdev class will create a regular device with the class name in /sys/devices/system and all registered devices will becom a children of theses devices. For compatibility reasons, the sysdev class-wide attributes are created at this parent device. (Do not copy that logic for anything new, subsystem- wide properties belong to the subsystem, not to some fake parent device created in /sys/devices.) Every sysdev driver is implemented as a simple subsystem interface now, and no longer called a driver. After all sysdev classes are ported to regular driver core entities, the sysdev implementation will be entirely removed from the kernel. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-12-14 14:29:38 -08:00
Samuel Ortiz	d646960f79	NFC: Initial LLCP support This patch is an initial implementation for the NFC Logical Link Control protocol. It's also known as NFC peer to peer mode. This is a basic implementation as it lacks SDP (services Discovery Protocol), frames aggregation support, and frame rejecion parsing. Follow up patches will implement those missing features. This code has been tested against a Nexus S phone implementing LLCP 1.0. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-12-14 14:50:13 -05:00
Samuel Ortiz	541d920b05	NFC: Set and get DEP general bytes Without an API for setting and getting the local and remote general bytes, drivers won't be able to properly establish a DEP link. This API also allows them to propagate the remote general bytes they get from the DEP link establishment up to the LLCP layer. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-12-14 14:50:13 -05:00
Samuel Ortiz	1ed28f6106	NFC: Add a DEP link control netlink command NFC-DEP (Data Exchange Protocol) is an NFC MAC layer. This command allows to enable and disable the DEP link on to which e.g. LLCP can run. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-12-14 14:50:12 -05:00
Samuel Ortiz	7c7cd3bfec	NFC: Add tx skb allocation routine This is a factorization of the current rawsock tx skb allocation routine, as it will be used by the LLCP code. We also rename nfc_alloc_skb to nfc_alloc_recv_skb for consistency sake. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-12-14 14:50:12 -05:00
Greg Kroah-Hartman	6261ddee70	kref: fix up the kfree build problems It turns out that some memory allocators use kobjects, which use krefs, and kref.h was wanting to figure out the address of kfree(), which ended up in a loop. kfree was only being needed for a warning to tell the caller that they were doing something stupid. Now we just move that warning into the comments for the functions, which results in a bit more fun as everyone enjoys digging for people to mock at times of boredom. So, remove the dependancy of slab.h on kref.h, and fix up the other include file as well (we really only need bug.h and atomic.h, not types.h). Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Oliver Neukum <oneukum@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-12-14 11:19:07 -08:00
Eric Dumazet	f943cbe6fb	inet: remove rcu protection on tw_net commit `b099ce2602` (net: Batch inet_twsk_purge) added rcu protection on tw_net for no obvious reason. struct net are refcounted anyway since timewait sockets escape from rcu protected sections. tw_net stay valid for the whole timwait lifetime. This also removes a lot of sparse errors. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-14 13:34:55 -05:00
Alex Deucher	cd5cfce856	drm/radeon/kms: add some new pci ids Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=43739 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com>	2011-12-14 12:29:03 +00:00
Mark Brown	704867ede0	Merge branch 'mfd/da9052' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/misc into regmap-next	2011-12-14 20:02:09 +08:00
Ashish Jangam	84c99db879	MFD: DA9052/53 MFD core module The DA9052/53 is a highly integrated PMIC subsystem with supply domain flexibility to support wide range of high performance application. It provides voltage regulators, GPIO controller, Touch Screen, RTC, Battery control and other functionality. This patch is functionally tested on Samsung SMDKV6410. Signed-off-by: David Dajun Chen <dchen@diasemi.com> Signed-off-by: Ashish Jangam <ashish.jangam@kpitcummins.com> Acked-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2011-12-14 19:53:32 +08:00
Joerg Roedel	a06ec394c9	Merge branch 'iommu/page-sizes' into x86/amd Conflicts: drivers/iommu/amd_iommu.c	2011-12-14 12:52:09 +01:00
Nicholas Bellinger	4d2300ccff	target: Remove extra se_device->execute_task_lock access in fast path This patch makes __transport_execute_tasks() perform the addition of tasks to dev->execute_task_list via __transport_add_tasks_from_cmd() while holding dev->execute_task_lock during normal I/O fast path submission. It effectively removes the unnecessary re-acquire of dev->execute_task_lock during transport_execute_tasks() -> transport_add_tasks_from_cmd() ahead of calling __transport_execute_tasks() to queue tasks for the passed *se_cmd descriptor. (v2: Re-add goto check_depth usage for multi-task submission for now..) Cc: Christoph Hellwig <hch@lst.de> Cc: Roland Dreier <roland@purestorage.com> Cc: Joern Engel <joern@logfs.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2011-12-14 11:48:46 +00:00
Nicholas Bellinger	65586d51e0	target: Drop se_device TCQ queue_depth usage from I/O path Historically, pSCSI devices have been the ones that required target-core to enforce a per se_device->depth_left. This patch changes target-core to no longer (by default) enforce a per se_device->depth_left or sleep in transport_tcq_window_closed() when we out of queue slots for all backend export cases. Cc: Christoph Hellwig <hch@lst.de> Cc: Roland Dreier <roland@purestorage.com> Cc: Joern Engel <joern@logfs.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2011-12-14 11:42:13 +00:00
Nicholas Bellinger	ec54cc081e	target: Remove TFO->check_release_cmd() fabric API caller Remove the now unused target_core_fabric_ops->check_release_cmd() as target_core handles this directly for se_cmd->cmd_kref objects now. Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2011-12-14 11:42:11 +00:00
Nicholas Bellinger	a636078552	target: Add target_submit_cmd() for process context fabric submission This patch adds a target_submit_cmd() caller that can be used by fabrics to submit an uninitialized se_cmd descriptor to an struct se_session + unpacked_lun from workqueue process context. This call will invoke the following steps: - transport_init_se_cmd() to setup se_cmd specific pointers - Obtain se_cmd->cmd_kref references with target_get_sess_cmd() - set se_cmd->t_tasks_bidi - transport_lookup_cmd_lun() to setup struct se_cmd->se_lun from the passed unpacked_lun - transport_generic_allocate_tasks() to setup the passed cdb, and - transport_handle_cdb_direct() handle READ dispatch or WRITE ready-to-transfer callback to fabric v2 changes from hch feedback: ) Add target_sc_flags_table for target_submit_cmd flags ) Rename bidi parameter to flags, add TARGET_SCF_BIDI_OP ) Convert checks to BUG_ON ) Add out_check_cond for transport_send_check_condition_and_sense usage v3 changes: ) Add TARGET_SCF_ACK_KREF for target_submit_cmd into target_get_sess_cmd to determine when the fabric caller is expecting a second kref_put() from fabric packet acknowledgement. Cc: Christoph Hellwig <hch@lst.de> Cc: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2011-12-14 11:40:56 +00:00
Nicholas Bellinger	7481deb413	target: Make target_put_sess_cmd use target_release_cmd_kref This patch moves target_put_sess_cmd() to use a se_cmd->cmd_kref callback target_release_cmd_kref when performing driver release of fabric->se_cmd descriptor memory. It sets the default cmd_kref count value to '2' within target_get_sess_cmd() setup, and currently assumes TFO->check_stop_free() usage. It drops se_tfo->check_release_cmd() usage in the main transport_release_cmd codepath. Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2011-12-14 11:38:29 +00:00
Christoph Hellwig	e0a53e70e8	target: remove overagressive ____cacheline_aligned annoations If we want dynamically allocated objects to be cacheline aligned we need to tell that to the slab allocator by using the proper flags and not by liberally sprinkling annotations onto all structures. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2011-12-14 11:28:13 +00:00
Christoph Hellwig	1880807adb	target: make the se_task task_state_active a normal bool There is no need to make task_state_active an atomic_t given that it is always set under the execute_task_lock so we can make it a simple bool. Also rename it to t_state_active to be closer to the list it guards, and make sure all checks before the list addion/removal actually happen under execute_task_lock. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2011-12-14 11:27:02 +00:00
Christoph Hellwig	41e16e9816	target: remove the se_task task_error_status field We only reach transport_complete_task once per task, so the test and set on task_error_status is never going to have an effect. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2011-12-14 11:26:44 +00:00
Christoph Hellwig	ef804a849f	target: fold se_task.task_sense into task_flags Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2011-12-14 11:26:27 +00:00
Christoph Hellwig	c4795fb20e	target: header reshuffle, part2 This reorganized the headers under include/target into: - target_core_base.h stays as is with all target-wide data stuctures and defines - target_core_backend.h contains the whole interface to I/O backends - target_core_fabric.h contains the whole interface to fabric modules Except for those only the various configfs macro headers stay around. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2011-12-14 11:26:05 +00:00
Joerg Roedel	175d614673	iommu/amd: Add invalid_ppr callback This callback can be used to change the PRI response code sent to a device when a PPR fault fails. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2011-12-14 12:09:39 +01:00
Christoph Hellwig	e26d99aed4	target: reshuffle headers Create a new headers, drivers/target/target_core_internal.h that is supposed to hold all target_core-internal prototypes. Move all non-exported includes from include/target to it, and merge the smaller prototype-only includes inside drivers/target into it as well. Mark functions that were found to not be called outside their implementation file static. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2011-12-14 08:51:12 +00:00
Ingo Molnar	919b83452b	Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu	2011-12-14 08:16:43 +01:00
Andrew Lunn	db33f4de99	ARM: Orion: Remove address map info from all platform data structures Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Michael Walle <michael@walle.cc> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Nicolas Pitre <nico@fluxnic.net>	2011-12-13 18:46:56 -05:00
Andrew Lunn	63a9332b23	ARM: Orion: Get address map from plat-orion instead of via platform_data Use an getter function in plat-orion/addr-map.c to get the address map structure, rather than pass it to drivers in the platform_data structures. When the drivers are built for none orion platforms, a dummy function is provided instead which returns NULL. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Michael Walle <michael@walle.cc> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Nicolas Pitre <nico@fluxnic.net>	2011-12-13 18:46:55 -05:00
Tejun Heo	f1f8cc9465	block, cfq: move icq creation and rq->elv.icq association to block core Now block layer knows everything necessary to create and associate icq's with requests. Move ioc_create_icq() to blk-ioc.c and update get_request() such that, if elevator_type->icq_size is set, requests are automatically associated with their matching icq's before elv_set_request(). io_context reference is also managed by block core on request alloc/free. * Only ioprio/cgroup changed handling remains from cfq_get_cic(). Collapsed into cfq_set_request(). * This removes queue kicking on icq allocation failure (for now). As icq allocation failure is rare and the only effect of queue kicking achieved was possibily accelerating queue processing, this change shouldn't be noticeable. There is a larger underlying problem. Unlike request allocation, icq allocation is not guaranteed to succeed eventually after retries. The number of icq is unbound and thus mempool can't be the solution either. This effectively adds allocation dependency on memory free path and thus possibility of deadlock. This usually wouldn't happen because icq allocation is not a hot path and, even when the condition triggers, it's highly unlikely that none of the writeback workers already has icq. However, this is still possible especially if elevator is being switched under high memory pressure, so we better get it fixed. Probably the only solution is just bypassing elevator and appending to dispatch queue on any elevator allocation failure. * Comment added to explain how icq's are managed and synchronized. This completes cleanup of io_context interface. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-12-14 00:33:42 +01:00
Tejun Heo	9b84cacd01	block, cfq: restructure io_cq creation path for io_context interface cleanup Add elevator_ops->elevator_init_icq_fn() and restructure cfq_create_cic() and rename it to ioc_create_icq(). The new function expects its caller to pass in io_context, uses elevator_type->icq_cache, handles generic init, calls the new elevator operation for elevator specific initialization, and returns pointer to created or looked up icq. This leaves cfq_icq_pool variable without any user. Removed. This prepares for io_context interface cleanup and doesn't introduce any functional difference. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-12-14 00:33:42 +01:00
Tejun Heo	7e5a879449	block, cfq: move io_cq exit/release to blk-ioc.c With kmem_cache managed by blk-ioc, io_cq exit/release can be moved to blk-ioc too. The odd ->io_cq->exit/release() callbacks are replaced with elevator_ops->elevator_exit_icq_fn() with unlinking from both ioc and q, and freeing automatically handled by blk-ioc. The elevator operation only need to perform exit operation specific to the elevator - in cfq's case, exiting the cfqq's. Also, clearing of io_cq's on q detach is moved to block core and automatically performed on elevator switch and q release. Because the q io_cq points to might be freed before RCU callback for the io_cq runs, blk-ioc code should remember to which cache the io_cq needs to be freed when the io_cq is released. New field io_cq->__rcu_icq_cache is added for this purpose. As both the new field and rcu_head are used only after io_cq is released and the q/ioc_node fields aren't, they are put into unions. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-12-14 00:33:42 +01:00
Tejun Heo	3d3c2379fe	block, cfq: move icq cache management to block core Let elevators set ->icq_size and ->icq_align in elevator_type and elv_register() and elv_unregister() respectively create and destroy kmem_cache for icq. * elv_register() now can return failure. All callers updated. * icq caches are automatically named "ELVNAME_io_cq". * cfq_slab_setup/kill() are collapsed into cfq_init/exit(). * While at it, minor indentation change for iosched_cfq.elevator_name for consistency. This will help moving icq management to block core. This doesn't introduce any functional change. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-12-14 00:33:42 +01:00
Tejun Heo	a612fddf0d	block, cfq: move cfqd->icq_list to request_queue and add request->elv.icq Most of icq management is about to be moved out of cfq into blk-ioc. This patch prepares for it. * Move cfqd->icq_list to request_queue->icq_list * Make request explicitly point to icq instead of through elevator private data. ->elevator_private[3] is replaced with sub struct elv which contains icq pointer and priv[2]. cfq is updated accordingly. * Meaningless clearing of ->elevator_private[0] removed from elv_set_request(). At that point in code, the field was guaranteed to be %NULL anyway. This patch doesn't introduce any functional change. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-12-14 00:33:41 +01:00
Tejun Heo	c586980732	block, cfq: reorganize cfq_io_context into generic and cfq specific parts Currently io_context and cfq logics are mixed without clear boundary. Most of io_context is independent from cfq but cfq_io_context handling logic is dispersed between generic ioc code and cfq. cfq_io_context represents association between an io_context and a request_queue, which is a concept useful outside of cfq, but it also contains fields which are useful only to cfq. This patch takes out generic part and put it into io_cq (io context-queue) and the rest into cfq_io_cq (cic moniker remains the same) which contains io_cq. The following changes are made together. * cfq_ttime and cfq_io_cq now live in cfq-iosched.c. * All related fields, functions and constants are renamed accordingly. * ioc->ioc_data is now "struct io_cq " instead of "void " and renamed to icq_hint. This prepares for io_context API cleanup. Documentation is currently sparse. It will be added later. Changes in this patch are mechanical and don't cause functional change. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-12-14 00:33:41 +01:00
Tejun Heo	22f746e235	block: remove elevator_queue->ops elevator_queue->ops points to the same ops struct ->elevator_type.ops is pointing to. The only effect of caching it in elevator_queue is shorter notation - it doesn't save any indirect derefence. Relocate elevator_type->list which used only during module init/exit to the end of the structure, rename elevator_queue->elevator_type to ->type, and replace elevator_queue->ops with elevator_queue->type.ops. This doesn't introduce any functional difference. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-12-14 00:33:41 +01:00
Tejun Heo	1238033c79	block, cfq: kill cic->key Now that lazy paths are removed, cfqd_dead_key() is meaningless and cic->q can be used whereever cic->key is used. Kill cic->key. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-12-14 00:33:40 +01:00
Tejun Heo	b50b636bce	block, cfq: kill ioc_gone Now that cic's are immediately unlinked under both locks, there's no need to count and drain cic's before module unload. RCU callback completion is waited with rcu_barrier(). While at it, remove residual RCU operations on cic_list. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-12-14 00:33:39 +01:00
Tejun Heo	b9a1920837	block, cfq: remove delayed unlink Now that all cic's are immediately unlinked from both ioc and queue, lazy dropping from lookup path and trimming on elevator unregister are unnecessary. Kill them and remove now unused elevator_ops->trim(). This also leaves call_for_each_cic() without any user. Removed. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-12-14 00:33:39 +01:00
Tejun Heo	b2efa05265	block, cfq: unlink cfq_io_context's immediately cic is association between io_context and request_queue. A cic is linked from both ioc and q and should be destroyed when either one goes away. As ioc and q both have their own locks, locking becomes a bit complex - both orders work for removal from one but not from the other. Currently, cfq tries to circumvent this locking order issue with RCU. ioc->lock nests inside queue_lock but the radix tree and cic's are also protected by RCU allowing either side to walk their lists without grabbing lock. This rather unconventional use of RCU quickly devolves into extremely fragile convolution. e.g. The following is from cfqd going away too soon after ioc and q exits raced. general protection fault: 0000 [#1] PREEMPT SMP CPU 2 Modules linked in: [ 88.503444] Pid: 599, comm: hexdump Not tainted 3.1.0-rc10-work+ #158 Bochs Bochs RIP: 0010:[<ffffffff81397628>] [<ffffffff81397628>] cfq_exit_single_io_context+0x58/0xf0 ... Call Trace: [<ffffffff81395a4a>] call_for_each_cic+0x5a/0x90 [<ffffffff81395ab5>] cfq_exit_io_context+0x15/0x20 [<ffffffff81389130>] exit_io_context+0x100/0x140 [<ffffffff81098a29>] do_exit+0x579/0x850 [<ffffffff81098d5b>] do_group_exit+0x5b/0xd0 [<ffffffff81098de7>] sys_exit_group+0x17/0x20 [<ffffffff81b02f2b>] system_call_fastpath+0x16/0x1b The only real hot path here is cic lookup during request initialization and avoiding extra locking requires very confined use of RCU. This patch makes cic removal from both ioc and request_queue perform double-locking and unlink immediately. * From q side, the change is almost trivial as ioc->lock nests inside queue_lock. It just needs to grab each ioc->lock as it walks cic_list and unlink it. * From ioc side, it's a bit more difficult because of inversed lock order. ioc needs its lock to walk its cic_list but can't grab the matching queue_lock and needs to perform unlock-relock dancing. Unlinking is now wholly done from put_io_context() and fast path is optimized by using the queue_lock the caller already holds, which is by far the most common case. If the ioc accessed multiple devices, it tries with trylock. In unlikely cases of fast path failure, it falls back to full double-locking dance from workqueue. Double-locking isn't the prettiest thing in the world but it's far simpler and more understandable than RCU trick without adding any meaningful overhead. This still leaves a lot of now unnecessary RCU logics. Future patches will trim them. -v2: Vivek pointed out that cic->q was being dereferenced after cic->release() was called. Updated to use local variable @this_q instead. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-12-14 00:33:39 +01:00
Tejun Heo	dc86900e0a	block, cfq: move ioc ioprio/cgroup changed handling to cic ioprio/cgroup change was handled by marking the changed state in ioc and, on the following access to the ioc, performing RCU-protected iteration through all cic's grabbing the matching queue_lock. This patch moves the changed state to each cic. When ioprio or cgroup changes, the respective bit is set on all cic's of the ioc and when each of those cic (not ioc) is accessed, change is applied for that specific ioc-queue pair. This also fixes the following two race conditions between setting and clearing of changed states. * Missing barrier between assign/load of ioprio and ioprio_changed allowed applying old ioprio. * Change requests could happen between application of change and clearing of changed variables. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-12-14 00:33:38 +01:00
Tejun Heo	283287a52e	block, cfq: misc updates to cfq_io_context Make the following changes to prepare for ioc/cic management cleanup. * Add cic->q so that ioc can determine the associated queue without querying cfq. This will eventually replace ->key. * Factor out cfq_release_cic() from cic_free_func(). This function assumes that the caller handled locking. * Rename __cfq_exit_single_io_context() to cfq_exit_cic() and make it take only @cic. * Restructure cfq_cic_link() for future updates. This patch doesn't introduce any functional changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-12-14 00:33:38 +01:00
Tejun Heo	09ac46c429	block: misc updates to blk_get_queue() * blk_get_queue() is peculiar in that it returns 0 on success and 1 on failure instead of 0 / -errno or boolean. Update it such that it returns %true on success and %false on failure. * Make sure the caller checks for the return value. * Separate out __blk_get_queue() which doesn't check whether @q is dead and put it in blk.h. This will be used later. This patch doesn't introduce any functional changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-12-14 00:33:38 +01:00
Tejun Heo	6e736be7f2	block: make ioc get/put interface more conventional and fix race on alloction Ignoring copy_io() during fork, io_context can be allocated from two places - current_io_context() and set_task_ioprio(). The former is always called from local task while the latter can be called from different task. The synchornization between them are peculiar and dubious. * current_io_context() doesn't grab task_lock() and assumes that if it saw %NULL ->io_context, it would stay that way until allocation and assignment is complete. It has smp_wmb() between alloc/init and assignment. * set_task_ioprio() grabs task_lock() for assignment and does smp_read_barrier_depends() between "ioc = task->io_context" and "if (ioc)". Unfortunately, this doesn't achieve anything - the latter is not a dependent load of the former. ie, if ioc itself were being dereferenced "ioc->xxx", it would mean something (not sure what tho) but as the code currently stands, the dependent read barrier is noop. As only one of the the two test-assignment sequences is task_lock() protected, the task_lock() can't do much about race between the two. Nothing prevents current_io_context() and set_task_ioprio() allocating its own ioc for the same task and overwriting the other's. Also, set_task_ioprio() can race with exiting task and create a new ioc after exit_io_context() is finished. ioc get/put doesn't have any reason to be complex. The only hot path is accessing the existing ioc of %current, which is simple to achieve given that ->io_context is never destroyed as long as the task is alive. All other paths can happily go through task_lock() like all other task sub structures without impacting anything. This patch updates ioc get/put so that it becomes more conventional. * alloc_io_context() is replaced with get_task_io_context(). This is the only interface which can acquire access to ioc of another task. On return, the caller has an explicit reference to the object which should be put using put_io_context() afterwards. * The functionality of current_io_context() remains the same but when creating a new ioc, it shares the code path with get_task_io_context() and always goes through task_lock(). * get_io_context() now means incrementing ref on an ioc which the caller already has access to (be that an explicit refcnt or implicit %current one). * PF_EXITING inhibits creation of new io_context and once exit_io_context() is finished, it's guaranteed that both ioc acquisition functions return %NULL. * All users are updated. Most are trivial but smp_read_barrier_depends() removal from cfq_get_io_context() needs a bit of explanation. I suppose the original intention was to ensure ioc->ioprio is visible when set_task_ioprio() allocates new io_context and installs it; however, this wouldn't have worked because set_task_ioprio() doesn't have wmb between init and install. There are other problems with this which will be fixed in another patch. * While at it, use NUMA_NO_NODE instead of -1 for wildcard node specification. -v2: Vivek spotted contamination from debug patch. Removed. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-12-14 00:33:38 +01:00
Tejun Heo	42ec57a8f6	block: misc ioc cleanups * int return from put_io_context() wasn't used by anybody. Make it return void like other put functions and docbook-fy the function comment. * Reorder dummy declarations for !CONFIG_BLOCK case a bit. * Make alloc_ioc_context() use __GFP_ZERO allocation, take init out of if block and drop 0'ing. * Docbook-fy current_io_context() comment. This patch doesn't introduce any functional change. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-12-14 00:33:37 +01:00
Tejun Heo	a73f730d01	block, cfq: move cfqd->cic_index to q->id cfq allocates per-queue id using ida and uses it to index cic radix tree from io_context. Move it to q->id and allocate on queue init and free on queue release. This simplifies cfq a bit and will allow for further improvements of io context life-cycle management. This patch doesn't introduce any functional difference. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-12-14 00:33:37 +01:00
Tejun Heo	34f6055c80	block: add blk_queue_dead() There are a number of QUEUE_FLAG_DEAD tests. Add blk_queue_dead() macro and use it. This patch doesn't introduce any functional difference. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-12-14 00:33:37 +01:00
Tejun Heo	1ba64edef6	block, sx8: kill blk_insert_request() The only user left for blk_insert_request() is sx8 and it can be trivially switched to use blk_execute_rq_nowait() - special requests aren't included in io stat and sx8 doesn't use block layer tagging. Switch sx8 and kill blk_insert_requeset(). This patch doesn't introduce any functional difference. Only compile tested. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-12-14 00:33:37 +01:00

... 14 15 16 17 18 ...

47,892 commits