linux-pinenote

Author	SHA1	Message	Date
Trond Myklebust	2d89a1d3c9	NFSv4.1/pNFS: Don't request a minimal read layout beyond the end of file If we have a read layout, then sanity check the minimal layout length so that it does not extend beyond the end of file. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-08-31 02:05:47 -07:00
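The sanity check described in the commit above can be sketched in userspace C. This is a minimal illustration, not the kernel code: the function name `clamp_read_minlength` and its parameters are hypothetical, and it assumes the minimal length is simply trimmed so that the requested range ends at or before the end of file.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): trim the minimal layout
 * length requested for a read so that it never extends past end of file.
 * Returns 0 when the offset is already at or beyond end of file.
 */
uint64_t clamp_read_minlength(uint64_t offset, uint64_t minlength,
                              uint64_t file_size)
{
    if (offset >= file_size)
        return 0;
    if (minlength > file_size - offset)
        return file_size - offset;
    return minlength;
}
```

For example, a 4096-byte minimum requested at offset 0 of a 1000-byte file would be trimmed to 1000 bytes under these assumptions.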
Trond Myklebust	21b874c873	NFSv4.1/pnfs: Handle LAYOUTGET return values correctly According to RFC5661 section 18.43.3, if the server cannot satisfy the loga_minlength argument to LAYOUTGET, there are 2 cases: 1) If loga_minlength == 0, it returns NFS4ERR_LAYOUTTRYLATER 2) If loga_minlength != 0, it returns NFS4ERR_BADLAYOUT Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-08-31 01:33:12 -07:00
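The two cases listed in the commit above amount to a small decision on `loga_minlength`. The sketch below only restates that rule as the commit message gives it; the enum and the function name `layoutget_unsatisfied_status` are hypothetical, and the enum values are placeholders rather than the on-the-wire NFSv4.1 error numbers.

```c
/*
 * Illustrative status names modelled on the errors named in the commit
 * message. The numeric values are placeholders, not protocol values.
 */
enum layoutget_status {
    LAYOUTGET_OK,
    LAYOUTGET_ERR_LAYOUTTRYLATER,
    LAYOUTGET_ERR_BADLAYOUT
};

/*
 * Hypothetical helper: which error a server would return, per the commit
 * message, when it cannot satisfy the loga_minlength argument.
 */
enum layoutget_status layoutget_unsatisfied_status(unsigned long long loga_minlength)
{
    if (loga_minlength == 0)
        return LAYOUTGET_ERR_LAYOUTTRYLATER;
    return LAYOUTGET_ERR_BADLAYOUT;
}
```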
Trond Myklebust	4ae93560b1	NFSv4.1/pnfs: Don't ask for a read layout for an empty file. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-08-31 01:33:12 -07:00
Ingo Molnar	bac2e4a96d	perf/core improvement and fixes: User visible: - Add new compaction-times python script (Tony Jones) - Make the --[no-]-demangle/--[no-]-demangle-kernel command line options available in 'perf script' too (Mark Drayton) - Allow for negative numbers in libtraceevent's print format, fixing up misformatting in some tracepoints (Steven Rostedt) Infrastructure: - perf_env/perf_evlist changes to allow accessing the data structure with the environment where some perf data was collected in functions not necessarily related to perf.data file processing (Kan Liang) - Cleanups for the tracepoint definition location paths routines (Jiri Olsa) - Introduce sysfs/filename__sprintf_build_id, removing code duplication (Masami Hiramatsu) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvement and fixes from Arnaldo Carvalho de Melo: User visible changes: - Add new compaction-times python script. (Tony Jones) - Make the --[no-]-demangle/--[no-]-demangle-kernel command line options available in 'perf script' too. (Mark Drayton) - Allow for negative numbers in libtraceevent's print format, fixing up misformatting in some tracepoints. (Steven Rostedt) Infrastructure changes: - perf_env/perf_evlist changes to allow accessing the data structure with the environment where some perf data was collected in functions not necessarily related to perf.data file processing. (Kan Liang) - Cleanups for the tracepoint definition location paths routines. (Jiri Olsa) - Introduce sysfs/filename__sprintf_build_id, removing code duplication. (Masami Hiramatsu) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-31 10:25:46 +02:00
Ingo Molnar	02b643b643	Merge branch 'perf/urgent' into perf/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-31 10:25:26 +02:00
Ingo Molnar	4c09e0d6ba	perf/urgent fix: User visible: - Use index, not CPU id, to find core/pkg id in 'perf stat' (Kan Liang) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fix from Arnaldo Carvalho de Melo: - Use index, not CPU id, to find core/pkg id in 'perf stat' (Kan Liang) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-31 10:24:24 +02:00
Jialing Fu	71f8a4b81d	mmc: core: fix race condition in mmc_wait_data_done The following panic is captured in ker3.14, but the issue still exists in latest kernel. --------------------------------------------------------------------- [ 20.738217] c0 3136 (Compiler) Unable to handle kernel NULL pointer dereference at virtual address 00000578 ...... [ 20.738499] c0 3136 (Compiler) PC is at _raw_spin_lock_irqsave+0x24/0x60 [ 20.738527] c0 3136 (Compiler) LR is at _raw_spin_lock_irqsave+0x20/0x60 [ 20.740134] c0 3136 (Compiler) Call trace: [ 20.740165] c0 3136 (Compiler) [<ffffffc0008ee900>] _raw_spin_lock_irqsave+0x24/0x60 [ 20.740200] c0 3136 (Compiler) [<ffffffc0000dd024>] __wake_up+0x1c/0x54 [ 20.740230] c0 3136 (Compiler) [<ffffffc000639414>] mmc_wait_data_done+0x28/0x34 [ 20.740262] c0 3136 (Compiler) [<ffffffc0006391a0>] mmc_request_done+0xa4/0x220 [ 20.740314] c0 3136 (Compiler) [<ffffffc000656894>] sdhci_tasklet_finish+0xac/0x264 [ 20.740352] c0 3136 (Compiler) [<ffffffc0000a2b58>] tasklet_action+0xa0/0x158 [ 20.740382] c0 3136 (Compiler) [<ffffffc0000a2078>] __do_softirq+0x10c/0x2e4 [ 20.740411] c0 3136 (Compiler) [<ffffffc0000a24bc>] irq_exit+0x8c/0xc0 [ 20.740439] c0 3136 (Compiler) [<ffffffc00008489c>] handle_IRQ+0x48/0xac [ 20.740469] c0 3136 (Compiler) [<ffffffc000081428>] gic_handle_irq+0x38/0x7c ---------------------------------------------------------------------- Because in SMP, "mrq" has race condition between below two paths: path1: CPU0: <tasklet context> static void mmc_wait_data_done(struct mmc_request *mrq) { mrq->host->context_info.is_done_rcv = true; // // If CPU0 has just finished "is_done_rcv = true" in path1, and at // this moment, IRQ or ICache line missing happens in CPU0. // What happens in CPU1 (path2)? // // If the mmcqd thread in CPU1(path2) hasn't entered to sleep mode: // path2 would have chance to break from wait_event_interruptible // in mmc_wait_for_data_req_done and continue to run for next // mmc_request (mmc_blk_rw_rq_prep). // // Within mmc_blk_rq_prep, mrq is cleared to 0. // If below line still gets host from "mrq" as the result of // compiler, the panic happens as we traced. wake_up_interruptible(&mrq->host->context_info.wait); } path2: CPU1: <The mmcqd thread runs mmc_queue_thread> static int mmc_wait_for_data_req_done(... { ... while (1) { wait_event_interruptible(context_info->wait, (context_info->is_done_rcv || context_info->is_new_req)); static void mmc_blk_rw_rq_prep(... { ... memset(brq, 0, sizeof(struct mmc_blk_request)); This issue happens very coincidentally; however adding mdelay(1) in mmc_wait_data_done as below could duplicate it easily. static void mmc_wait_data_done(struct mmc_request *mrq) { mrq->host->context_info.is_done_rcv = true; + mdelay(1); wake_up_interruptible(&mrq->host->context_info.wait); } At runtime, IRQ or ICache line missing may just happen at the same place of the mdelay(1). This patch gets the mmc_context_info at the beginning of function, it can avoid this race condition. Signed-off-by: Jialing Fu <jlfu@marvell.com> Tested-by: Shawn Lin <shawn.lin@rock-chips.com> Fixes: `2220eedfd7` ("mmc: fix async request mechanism ....") Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>	2015-08-31 09:20:16 +02:00
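The fix described at the end of the commit above (fetch the context info once, at the start of the function) can be sketched in plain userspace C. This is a simplified model, not the kernel patch: the structures are cut-down stand-ins, `wake_up_stub` is a hypothetical replacement for `wake_up_interruptible()`, and no real concurrency is exercised here.

```c
#include <stdbool.h>

/* Simplified stand-ins for the kernel structures named in the commit. */
struct context_info {
    bool is_done_rcv;
    int wakeups;            /* counts calls to the wake-up stub */
};

struct host {
    struct context_info context_info;
};

struct request {
    struct host *host;
};

/* Hypothetical stand-in for wake_up_interruptible(); it only counts calls. */
static void wake_up_stub(struct context_info *ci)
{
    ci->wakeups++;
}

void wait_data_done(struct request *mrq)
{
    /* Dereference mrq exactly once, before the done flag is published. */
    struct context_info *ci = &mrq->host->context_info;

    ci->is_done_rcv = true;
    /*
     * From this point the waiter may reuse and clear *mrq, so the
     * function only uses the cached pointer and never reads mrq again.
     */
    wake_up_stub(ci);
}
```

The design point is that the second access no longer goes through `mrq`, so a waiter that clears the request after seeing the flag cannot cause a NULL dereference in this function.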
Linus Walleij	01e2dae991	Revert "gpio: extraxfs: fix returnvar.cocci warnings" This reverts commit `5e22ec0198`.	2015-08-31 08:56:04 +02:00
David S. Miller	80ec1927b1	ipv4: Fix 32-bit build. net/ipv4/af_inet.c: In function 'snmp_get_cpu_field64': >> net/ipv4/af_inet.c:1486:26: error: 'offt' undeclared (first use in this function) v = *(((u64 *)bhptr) + offt); ^ net/ipv4/af_inet.c:1486:26: note: each undeclared identifier is reported only once for each function it appears in net/ipv4/af_inet.c: In function 'snmp_fold_field64': >> net/ipv4/af_inet.c:1499:39: error: 'offct' undeclared (first use in this function) res += snmp_get_cpu_field(mib, cpu, offct, syncp_offset); ^ >> net/ipv4/af_inet.c:1499:10: error: too many arguments to function 'snmp_get_cpu_field' res += snmp_get_cpu_field(mib, cpu, offct, syncp_offset); ^ net/ipv4/af_inet.c:1455:5: note: declared here u64 snmp_get_cpu_field(void __percpu *mib, int cpu, int offt) ^ Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-30 22:40:44 -07:00
Ken-ichirou MATSUZAWA	0ef707700f	netlink: rx mmap: fix POLLIN condition poll() returns immediately after user space sets the kernel's current frame (ring->head) to SKIP, even though there is no new frame. Conversely, when all frames are VALID and a user space program unintentionally sets only the kernel's current frame to UNUSED and then calls poll(), poll() does not return immediately even though VALID frames exist. To avoid these situations, I think we need to scan all frames for VALID frames at poll(), just as netlink_alloc_skb() and netlink_forward_ring() look for an UNUSED frame at skb allocation. Signed-off-by: Ken-ichirou MATSUZAWA <chamas@h4.dion.ne.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-30 21:55:51 -07:00
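The idea proposed in the commit above, scanning the whole ring for a VALID frame instead of inspecting only the current one, might look roughly like this in userspace C. The enum, the function name `ring_has_valid_frame` and the flat array representation of the ring are assumptions made for illustration; the kernel's frame headers and status constants differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative frame states, loosely following the names in the commit. */
enum frame_status {
    FRAME_UNUSED,
    FRAME_VALID,
    FRAME_SKIP
};

/*
 * Hypothetical helper: report whether any frame in the ring is VALID,
 * starting at head and wrapping around, so the answer does not depend
 * on the state of the frame at head alone.
 */
bool ring_has_valid_frame(const enum frame_status *ring, size_t nframes,
                          size_t head)
{
    for (size_t i = 0; i < nframes; i++) {
        size_t pos = (head + i) % nframes;

        if (ring[pos] == FRAME_VALID)
            return true;
    }
    return false;
}
```

With this check, a ring whose head frame is SKIP and whose other frames are UNUSED reports no data, while a ring whose head frame is UNUSED but whose other frames are VALID reports data, which matches the two cases the commit message describes.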
David S. Miller	793768f55c	Merge branch 'thunderx-features-fixes' Aleksey Makarov says: ==================== net: thunderx: New features and fixes v2: - The unused affinity_mask field of the structure cmp_queue has been deleted. (thanks to David Miller) - The unneeded initializers have been dropped. (thanks to Alexey Klimov) - The commit message "net: thunderx: Rework interrupt handling" has been fixed. (thanks to Alexey Klimov) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-30 21:54:13 -07:00
Sunil Goutham	d77a238498	net: thunderx: Support for internal loopback mode Support for setting VF's corresponding BGX LMAC in internal loopback mode. This mode can be used for verifying basic HW functionality such as packet I/O, RX checksum validation, CQ/RBDR interrupts, stats, etc. Useful when the DUT has no external network connectivity. 'loopback' mode can be enabled or disabled via ethtool. Note: This feature is not supported when the number of VFs enabled is more than the number of physical interfaces, i.e. active BGX LMACs. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-30 21:54:12 -07:00
Sunil Goutham	92dc87697e	net: thunderx: Support for upto 96 queues for a VF This patch adds support for handling multiple qsets assigned to a single VF. There by increasing no of queues from earlier 8 to max no of CPUs in the system i.e 48 queues on a single node and 96 on dual node system. User doesn't have option to assign which Qsets/VFs to be merged. Upon request from VF, PF assigns next free Qsets as secondary qsets. To maintain current behavior no of queues is kept to 8 by default which can be increased via ethtool. If user wants to unbind NICVF driver from a secondary Qset then it should be done after tearing down primary VF's interface. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-30 21:54:12 -07:00
Sunil Goutham	39ad6eea6c	net: thunderx: Rework interrupt handling Rework interrupt handler to avoid checking IRQ affinity of CQ interrupts. Now separate handlers are registered for each IRQ including RBDR. Register interrupt handlers for only those which are being used. Add nicvf_dump_intr_status() and use it in irq handlers. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-30 21:54:12 -07:00
Sunil Goutham	aa2e259b47	net: thunderx: Support for HW VLAN stripping This patch configures HW to strip 802.1Q header if found in a receiving packet. The stripped VLAN ID and TCI information is passed on to software via CQE_RX. Also sets netdev's 'vlan_features' so that other HW offload features can be used for tagged packets. This offload feature can be enabled or disabled via ethtool. Network stack normally ignores RPS for 802.1Q packets and hence low throughput. With this offload enabled throughput for tagged packets will be almost same as normal packets. Note: This patch doesn't enable HW VLAN insertion for transmit packets. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-30 21:54:12 -07:00
Sunil Goutham	38bb5d4f4f	net: thunderx: Receive hashing HW offload support Adding support for receive hashing HW offload by using RSS_ALG and RSS_TAG fields of CQE_RX descriptor. Also removed dependency on minimum receive queue count to configure RSS so that hash is always generated. This hash is used by RPS logic to distribute flows across multiple CPUs. Offload can be disabled via ethtool. Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-30 21:54:12 -07:00
Sunil Goutham	6051cba77c	net: thunderx: mailboxes: remove code duplication Use the nicvf_send_msg_to_pf() function in the mailbox code. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-30 21:54:11 -07:00
Sunil Goutham	a2dc5dedbb	net: thunderx: Add receive error stats reporting via ethtool Added ethtool support to dump receive packet error statistics reported in CQE. Also made some small fixes Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-30 21:54:11 -07:00
Aleksey Makarov	322e5cc5c6	net: thunderx: fix MAINTAINERS The liquidio and thunder drivers have different maintainers. Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-30 21:54:11 -07:00
David S. Miller	ef34c0f6c1	Merge branch 'snmp-stat-aggregation' Raghavendra K T says: ==================== Optimize the snmp stat aggregation for large cpus While creating 1000 containers, perf is showing lot of time spent in snmp_fold_field on a large cpu system. The current patch tries to improve by reordering the statistics gathering. Please note that similar overhead was also reported while creating veth pairs https://lkml.org/lkml/2013/3/19/556 Changes in V4: - remove 'item' variable and use IPSTATS_MIB_MAX to avoid sparse warning (Eric) also remove 'item' parameter (Joe) - add missing memset of padding. Changes in V3: - use memset to initialize temp buffer in leaf function. (David) - use memcpy to copy the buffer data to stat instead of unalign_pu (Joe) - Move buffer definition to leaf function __snmp6_fill_stats64() (Eric) - Changes in V2: - Allocate the stat calculation buffer in stack. (Eric) Setup: 160 cpu (20 core) baremetal powerpc system with 1TB memory 1000 docker containers was created with command docker run -itd ubuntu:15.04 /bin/bash in loop observation: Docker container creation linearly increased from around 1.6 sec to 7.5 sec (at 1000 containers) perf data showed, creating veth interfaces resulting in the below code path was taking more time. rtnl_fill_ifinfo -> inet6_fill_link_af -> inet6_fill_ifla6_attrs -> snmp_fold_field proposed idea: currently __snmp6_fill_stats64 calls snmp_fold_field that walks through per cpu data to of an item (iteratively for around 36 items). The patch tries to aggregate the statistics by going through all the items of each cpu sequentially which is reducing cache misses. Performance of docker creation improved by around more than 2x after the patch. 
before the patch: ================ 3f45ba571a42e925c4ec4aaee0e48d7610a9ed82a4c931f83324d41822cf6617 real 0m6.836s user 0m0.095s sys 0m0.011s perf record -a docker run -itd ubuntu:15.04 /bin/bash ======================================================= 50.73% docker [kernel.kallsyms] [k] snmp_fold_field 9.07% swapper [kernel.kallsyms] [k] snooze_loop 3.49% docker [kernel.kallsyms] [k] veth_stats_one 2.85% swapper [kernel.kallsyms] [k] _raw_spin_lock 1.37% docker docker [.] backtrace_qsort 1.31% docker docker [.] strings.FieldsFunc cache-misses: 2.7% after the patch: ============= 9178273e9df399c8290b6c196e4aef9273be2876225f63b14a60cf97eacfafb5 real 0m3.249s user 0m0.088s sys 0m0.020s perf record -a docker run -itd ubuntu:15.04 /bin/bash ======================================================= 10.57% docker docker [.] scanblock 8.37% swapper [kernel.kallsyms] [k] snooze_loop 6.91% docker [kernel.kallsyms] [k] snmp_get_cpu_field 6.67% docker [kernel.kallsyms] [k] veth_stats_one 3.96% docker docker [.] runtime_MSpan_Sweep 2.47% docker docker [.] strings.FieldsFunc cache-misses: 1.41 % Please let me know if you have suggestions/comments. Thanks Eric, Joe and David for the comments. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-30 21:48:59 -07:00
Raghavendra K T	a3a773726c	net: Optimize snmp stat aggregation by walking all the percpu data at once Docker container creation linearly increased from around 1.6 sec to 7.5 sec (at 1000 containers) and perf data showed 50% overhead in snmp_fold_field. reason: currently __snmp6_fill_stats64 calls snmp_fold_field that walks through per cpu data of an item (iteratively for around 36 items). idea: This patch tries to aggregate the statistics by going through all the items of each cpu sequentially which is reducing cache misses. Docker creation got faster by more than 2x after the patch. Result: Before After Docker creation time 6.836s 3.25s cache miss 2.7% 1.41% perf before: 50.73% docker [kernel.kallsyms] [k] snmp_fold_field 9.07% swapper [kernel.kallsyms] [k] snooze_loop 3.49% docker [kernel.kallsyms] [k] veth_stats_one 2.85% swapper [kernel.kallsyms] [k] _raw_spin_lock perf after: 10.57% docker docker [.] scanblock 8.37% swapper [kernel.kallsyms] [k] snooze_loop 6.91% docker [kernel.kallsyms] [k] snmp_get_cpu_field 6.67% docker [kernel.kallsyms] [k] veth_stats_one changes/ideas suggested: Using buffer in stack (Eric), Usage of memset (David), Using memcpy in place of unaligned_put (Joe). Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-30 21:48:58 -07:00
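The loop reordering described in the commit above can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the kernel implementation: per-cpu data is modelled as a plain 2D array, and the names `fold_item_major`, `fold_cpu_major`, `NR_CPUS_DEMO` and `NR_ITEMS_DEMO` are hypothetical. Both functions compute the same totals; only the traversal order differs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NR_CPUS_DEMO  4    /* hypothetical CPU count for the demo */
#define NR_ITEMS_DEMO 36   /* "around 36 items" per the commit message */

/*
 * Item-major order: for every item, visit every CPU. Each inner step
 * touches a different CPU's data, which models the pattern the commit
 * describes as the behaviour before the patch.
 */
void fold_item_major(uint64_t percpu[NR_CPUS_DEMO][NR_ITEMS_DEMO],
                     uint64_t out[NR_ITEMS_DEMO])
{
    for (size_t item = 0; item < NR_ITEMS_DEMO; item++) {
        out[item] = 0;
        for (size_t cpu = 0; cpu < NR_CPUS_DEMO; cpu++)
            out[item] += percpu[cpu][item];
    }
}

/*
 * CPU-major order: for every CPU, walk all of its items sequentially
 * and accumulate into a zeroed buffer, which models the idea of the patch.
 */
void fold_cpu_major(uint64_t percpu[NR_CPUS_DEMO][NR_ITEMS_DEMO],
                    uint64_t out[NR_ITEMS_DEMO])
{
    memset(out, 0, NR_ITEMS_DEMO * sizeof(out[0]));
    for (size_t cpu = 0; cpu < NR_CPUS_DEMO; cpu++)
        for (size_t item = 0; item < NR_ITEMS_DEMO; item++)
            out[item] += percpu[cpu][item];
}
```

The second form reads each CPU's block of counters contiguously, which is the access pattern the commit credits with reducing cache misses; this toy model does not attempt to reproduce the measured numbers.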
Raghavendra K T	c4c6bc3146	net: Introduce helper functions to get the per cpu data Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-30 21:48:58 -07:00
David S. Miller	06fb4e701b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2015-08-30 21:45:01 -07:00
Trond Myklebust	4a1e2feb9d	NFSv4.1: Fix a protocol issue with CLOSE stateids According to RFC5661 Section 18.2.4, CLOSE is supposed to return the zero stateid. This means that nfs_clear_open_stateid_locked() cannot assume that the result stateid will always match the 'other' field of the existing open stateid when trying to determine a race with a parallel OPEN. Instead, we look at the argument, and check for matches. Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-08-30 18:45:04 -07:00
Matt Turner	8f8dcb3f7f	alpha: select CONFIG_ARCH_USE_CMPXCHG_LOCKREF. On Alpha we have spinlocks that are 32b in size and an efficient cmpxchg64 implementation, so we qualify to make use of cmpxchg backed lockrefs. Select the ARCH_USE_CMPXCHG_LOCKREF Kconfig symbol and provide a trivial implementation of arch_spin_value_unlocked to satisfy the lockref code. Using Linus' simple testcase from http://article.gmane.org/gmane.linux.file-systems/77466 on a dual CPU ES47 system I see around an 8% gain: N Min Max Median Avg Stddev x 30 6194580 6295654 6272504 6272514 17694.232 + 30 6731164 6786334 6767982 6764274 13738.863 Difference at 95.0% confidence 491760 +/- 8188.17 7.83992% +/- 0.130541% (Student's t, pooled s = 15840.5) Signed-off-by: Matt Turner <mattst88@gmail.com>	2015-08-30 18:01:16 -07:00
Dave Airlie	879a37d00f	Merge branch 'exynos-drm-next' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-next This is a second pull-request which adds last part of atomic modeset/pageflip support, render node support, clean-up, and fix-up. * 'exynos-drm-next' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos: drm/exynos: fix build warning to exynos_drm_gem.c drm/exynos: Properly report supported formats for each device drm/exynos: add render node support drm/exynos: implement atomic_{begin/flush} of DECON drm/exynos: remove legacy ->suspend()/resume() drm/exynos: Enable atomic modesetting feature drm/exynos: remove wait queue for pending page flip drm/exynos: wait all planes updates to finish drm/exynos: add atomic asynchronous commit drm/exynos: fimd: only finish update if START == START_S drm/exynos: add macro to get the address of START_S reg drm/exynos: check for pending fb before finish update drm/exynos: fimd: move window protect code to prepare/cleanup_plane drm/exynos: add prepare and cleanup phases for planes drm/exynos: fimd: unify call to exynos_drm_crtc_finish_pageflip() drm/exynos: don't track enabled state at exynos_crtc	2015-08-31 10:25:45 +10:00
Dave Airlie	701078d538	Merge tag 'drm-intel-next-fixes-2015-08-28' of git://anongit.freedesktop.org/drm-intel into drm-next Some i915 fixes headed for v4.3. SKL DDI-E is a wip, but here's the first in a series. * tag 'drm-intel-next-fixes-2015-08-28' of git://anongit.freedesktop.org/drm-intel: drm/i915/skl: enable DDI-E hotplug drm/i915: Fix build warning on 32-bit drm/i915/skl: Update DDI buffer translation programming. drm/i915: Allow parsing of variable size child device entries from VBT drm/i915: fix link rates reported for SKL drm/i915: fix VBT parsing for SDVO child device mapping	2015-08-31 10:06:22 +10:00
Dave Airlie	d3e8ea5092	Merge tag 'drm-amdkfd-next-fixes-2015-08-30' of git://people.freedesktop.org/~gabbayo/linux into drm-next Just one small fix before 4.3 merge window: - Use linux/mman.h instead of uapi's mman-common.h inside the driver. * tag 'drm-amdkfd-next-fixes-2015-08-30' of git://people.freedesktop.org/~gabbayo/linux: amdkfd: use <linux/mman.h> instead of <uapi/asm-generic/mman-common.h>	2015-08-31 10:05:37 +10:00
Yishai Hadas	e1c30298cc	IB/ucma: HW Device hot-removal support Currently, IB/cma remove_one flow blocks until all user descriptor managed by IB/ucma are released. This prevents hot-removal of IB devices. This patch allows IB/cma to remove devices regardless of user space activity. Upon getting the RDMA_CM_EVENT_DEVICE_REMOVAL event we close all the underlying HW resources for the given ucontext. The ucontext itself is still alive till its explicit destroying by its creator. Running applications at that time will have some zombie device, further operations may fail. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:41 -04:00
Yishai Hadas	ae184ddeca	IB/mlx4_ib: Disassociate support Implements the IB core disassociate_ucontext API. The driver detaches the HW resources for a given user context to prevent a dependency between application termination and device disconnecting. This is done by managing the VMAs that were mapped to the HW bars such as door bell and blueflame. When need to detach remap them to an arbitrary kernel page returned by the zap API. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:40 -04:00
Yishai Hadas	036b106357	IB/uverbs: Enable device removal when there are active user space applications Enables the uverbs_remove_one to succeed despite the fact that there are running IB applications working with the given ib device. This functionality enables a HW device to be unbind/reset despite the fact that there are running user space applications using it. It exposes a new IB kernel API named 'disassociate_ucontext' which lets a driver detaching its HW resources from a given user context without crashing/terminating the application. In case a driver implemented the above API and registered with ib_uverb there will be no dependency between its device to its uverbs_device. Upon calling remove_one of ib_uverbs the call should return after disassociating the open HW resources without waiting to clients disconnecting. In case driver didn't implement this API there will be no change to current behaviour and uverbs_remove_one will return only when last client has disconnected and reference count on uverbs device became 0. In case the lower driver device was removed any application will continue working over some zombie HCA, further calls will ended with an immediate error. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:40 -04:00
Yishai Hadas	057aec0d23	IB/uverbs: Explicitly pass ib_dev to uverbs commands Done in preparation for deploying RCU for the device removal flow. Allows isolating the RCU handling to the uverb_main layer and keeping the uverbs_cmd code as is. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:40 -04:00
Yishai Hadas	35d4a0b63d	IB/uverbs: Fix race between ib_uverbs_open and remove_one Fixes: `2a72f21226` ("IB/uverbs: Remove dev_table") Before this commit there was a device look-up table that was protected by a spin_lock used by ib_uverbs_open and by ib_uverbs_remove_one. When it was dropped and container_of was used instead, it enabled the race with remove_one as dev might be freed just after: dev = container_of(inode->i_cdev, struct ib_uverbs_device, cdev) but before the kref_get. In addition, this buggy patch added some dead code as container_of(x,y,z) can never be NULL and so dev can never be NULL. As a result the comment above ib_uverbs_open saying "the open method will either immediately run -ENXIO" is wrong as it can never happen. The solution follows Jason Gunthorpe suggestion from below URL: https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg25692.html cdev will hold a kref on the parent (the containing structure, ib_uverbs_device) and only when that kref is released it is guaranteed that open will never be called again. In addition, fixes the active count scheme to use an atomic not a kref to prevent WARN_ON as pointed by above comment from Jason. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:40 -04:00
Yishai Hadas	03c40442a0	IB/uverbs: Fix reference counting usage of event files Fix the reference counting usage to be handled in the event file creation/destruction function, instead of being done by the caller. This is done for both async/non-async event files. Based on Jason Gunthorpe report at https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg24680.html: "The existing code for this is broken, in ib_uverbs_get_context all the error paths between ib_uverbs_alloc_event_file and the kref_get(file->ref) are wrong - this will result in fput() which will call ib_uverbs_event_close, which will try to do kref_put and ib_unregister_event_handler - which are no longer paired." Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:39 -04:00
Jason Gunthorpe	7dd78647a2	IB/core: Make ib_dealloc_pd return void The majority of callers never check the return value, and even if they did, they can't do anything about a failure. All possible failure cases represent a bug in the caller, so just WARN_ON inside the function instead. This fixes a few random errors: net/rd/iw.c infinite loops while it fails. (racing with EBUSY?) This also lays the ground work to get rid of error return from the drivers. Most drivers do not error, the few that do are broken since it cannot be handled. Since uverbs can legitimately make use of EBUSY, open code the check. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:39 -04:00
Bart Van Assche	03f6fb93fd	IB/srp: Create an insecure all physical rkey only if needed The SRP initiator only needs this if the insecure register_always=N performance optimization is enabled, or if FRWR/FMR is not supported in the driver. Do not create an all physical MR unless it is needed to support either of those modes. Default register_always to true so the out of the box configuration does not create an insecure all physical MR. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> [bvanassche: reworked and rebased this patch] Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:39 -04:00
Bart Van Assche	330179f2fa	IB/srp: Register the indirect data buffer descriptor Instead of always using the global rkey for the indirect data buffer descriptor, register that descriptor with the HCA if the kernel module parameter register_always has been set to Y. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:38 -04:00
Bart Van Assche	002f15674c	IB/srp: Introduce srp_device.use_fmr Introduce the variable srp_device.use_fmr. Leave out the dev->has_fr / dev->has_fmr and ch->fr_pool / ch->fmr_pool checks since these are redundant. This patch does not change any functionality but makes the source code easier to read. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:38 -04:00
Bart Van Assche	3ae95da883	IB/srp: Remove use_mr argument from srp_map_sg_entry() Move the srp_map_desc() call from inside srp_map_sg_entry() to srp_map_sg() such that the use_mr argument can be removed from srp_map_sg_entry(). Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:38 -04:00
Bart Van Assche	0e0d3a4800	IB/srp: Remove the memory registration backtracking code Mapping a discontiguous sg-list requires multiple memory regions and hence can exhaust the memory region pool. The SRP initiator already handles this by temporarily reducing the queue depth. This means that it is safe to remove the memory registration backtracking code. This patch has been tested with direct I/O sizes up to 256 MB. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:37 -04:00
Bart Van Assche	f731ed6293	IB/srp: Add memory descriptor array pointer range checking Although most paths through which a request is submitted check block layer parameters like the max_segments limit, these are not checked when an SG_IO or direct I/O request is submitted. Hence add a range check for the memory descriptor array pointer. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:37 -04:00
Bart Van Assche	7e85c91970	IB/srp: Use multiple registrations for large memory regions Instead of using the global rkey for large memory regions, use multiple registrations. See also the while (dma_len) loop further down in srp_map_sg_entry(). Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:37 -04:00
Bart Van Assche	186fbc6689	IB/srp: Re-enable FMR for non-page aligned buffers During a discussion in 2011 nobody recalled why FMR was not used for non-page aligned buffers (see also http://thread.gmane.org/gmane.linux.drivers.rdma/7149). Re-enable FMR for such buffers. For the reason why the srp_map_fmr() function needs to be modified, see also patch "IB/srp: rework mapping engine to use multiple FMR entries" (commit ID `8f26c9ff9c`; January 2011). Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:36 -04:00
Jason Gunthorpe	e5580242aa	rds/ib: Remove ib_get_dma_mr calls The pd now has a local_dma_lkey member which completely replaces ib_get_dma_mr, use it instead. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:36 -04:00
Jason Gunthorpe	2f31fa881f	net/9p: Remove ib_get_dma_mr calls The pd now has a local_dma_lkey member which completely replaces ib_get_dma_mr, use it instead. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Dominique Martinet <dominique.martinet@cea.fr> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:36 -04:00
Jason Gunthorpe	5a783956c2	ib_srpt: Remove ib_get_dma_mr calls The pd now has a local_dma_lkey member which completely replaces ib_get_dma_mr, use it instead. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:35 -04:00
Jason Gunthorpe	e6bf5f48d2	IB/srp: Use pd->local_dma_lkey Replace all lkeys with pd->local_dma_lkey. This driver does not support iWarp, so this is safe. The insecure use of ib_get_dma_mr is thus isolated to an rkey, and will have to be fixed separately. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:35 -04:00
Jason Gunthorpe	34efc7dfbd	iser-target: Remove ib_get_dma_mr calls The pd now has a local_dma_lkey member which completely replaces ib_get_dma_mr, use it instead. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:35 -04:00
Jason Gunthorpe	256b7ad273	IB/iser: Use pd->local_dma_lkey Replace all lkeys with pd->local_dma_lkey. This driver does not support iWarp, so this is safe. The insecure use of ib_get_dma_mr is thus isolated to an rkey, and this looks trivially fixed by forcing the use of registration in a future patch. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:34 -04:00
Jason Gunthorpe	b37c788f59	IB/mlx5: Remove ib_get_dma_mr calls The pd now has a local_dma_lkey member which completely replaces ib_get_dma_mr, use it instead. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-08-30 18:12:34 -04:00


545831 commits