The iowait_sdma_drained() callback lacked locking to
protect the qp's s_flags field.
This allows s_flags to get out of sync across CPUs,
potentially corrupting the field.
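For illustration, a minimal sketch of such a fix (helper and flag
names assumed, not the verbatim patch): take the qp's s_lock around
the s_flags update so all CPUs see a consistent value.

static void iowait_sdma_drained(struct iowait *wait)
{
    struct rvt_qp *qp = iowait_to_qp(wait);
    unsigned long flags;

    spin_lock_irqsave(&qp->s_lock, flags);
    if (qp->s_flags & RVT_S_WAIT_DMA) {
        qp->s_flags &= ~RVT_S_WAIT_DMA;
        hfi1_schedule_send(qp);
    }
    spin_unlock_irqrestore(&qp->s_lock, flags);
}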
Fixes: a545f5308b ("staging/rdma/hfi: fix CQ completion order issue")
Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
call_send is used to determine whether to send immediately or schedule
a send for later. The current logic in rdmavt is inverted and has a
negative impact on the latency of the hfi1 and qib drivers. Fix this
regression by correctly calling send immediately when call_send is set.
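A sketch of the corrected logic (simplified; driver_f member names
assumed from rdmavt):

if (nreq) {
    if (call_send)
        rdi->driver_f.do_send(qp);      /* low-latency path */
    else
        rdi->driver_f.schedule_send_no_lock(qp);
}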
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The routine used by the SDMA cache to handle already
cached nodes can extend an already existing node.
In its error handling code, the routine will unpin pages
when not all pages of the buffer extension were pinned.
There was a bug in that part of the routine, which would
mistakenly unpin pages from the original set rather than
the newly pinned pages.
This commit fixes that bug by offsetting into the page array
so that it points at the beginning of the newly pinned pages.
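A sketch of the corrected error path (helper signatures assumed): the
newly pinned pages start at index node->npages within the page array,
so the error-path unpinning must start there too.

pinned = hfi1_acquire_user_pages(addr, npages, 0, pages + node->npages);
if (pinned != npages) {
    /* was: unpinning from pages[0], i.e. from the original set */
    unpin_vector_pages(current->mm, pages + node->npages, pinned);
    ret = -EFAULT;
    goto bail;
}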
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The locking around the interval RB tree is designed to prevent
access to the tree while it's being modified. In its current form,
however, the locking is overly broad, causing a deadlock in
certain cases with the following backtrace:
Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 0
CPU: 0 PID: 5836 Comm: IMB-MPI1 Tainted: G O 3.12.18-wfr+ #1
0000000000000000 ffff88087f206c50 ffffffff814f1caa ffffffff817b53f0
ffff88087f206cc8 ffffffff814ecd56 0000000000000010 ffff88087f206cd8
ffff88087f206c78 0000000000000000 0000000000000000 0000000000001662
Call Trace:
<NMI> [<ffffffff814f1caa>] dump_stack+0x45/0x56
[<ffffffff814ecd56>] panic+0xc2/0x1cb
[<ffffffff810d4370>] ? restart_watchdog_hrtimer+0x50/0x50
[<ffffffff810d4432>] watchdog_overflow_callback+0xc2/0xd0
[<ffffffff81109b4e>] __perf_event_overflow+0x8e/0x2b0
[<ffffffff8110a714>] perf_event_overflow+0x14/0x20
[<ffffffff8101c906>] intel_pmu_handle_irq+0x1b6/0x390
[<ffffffff814f927b>] perf_event_nmi_handler+0x2b/0x50
[<ffffffff814f8ad8>] nmi_handle.isra.3+0x88/0x180
[<ffffffff814f8d39>] do_nmi+0x169/0x310
[<ffffffff814f8177>] end_repeat_nmi+0x1e/0x2e
[<ffffffff81272600>] ? unmap_single+0x30/0x30
[<ffffffff814f780d>] ? _raw_spin_lock_irqsave+0x2d/0x40
[<ffffffff814f780d>] ? _raw_spin_lock_irqsave+0x2d/0x40
[<ffffffff814f780d>] ? _raw_spin_lock_irqsave+0x2d/0x40
<<EOE>> <IRQ> [<ffffffffa056c4a8>] hfi1_mmu_rb_search+0x38/0x70 [hfi1]
[<ffffffffa05919cb>] user_sdma_free_request+0xcb/0x120 [hfi1]
[<ffffffffa0593393>] user_sdma_txreq_cb+0x263/0x350 [hfi1]
[<ffffffffa057fad7>] ? sdma_txclean+0x27/0x1c0 [hfi1]
[<ffffffffa0593130>] ? user_sdma_send_pkts+0x1710/0x1710 [hfi1]
[<ffffffffa057fdd6>] sdma_make_progress+0x166/0x480 [hfi1]
[<ffffffff810762c9>] ? ttwu_do_wakeup+0x19/0xd0
[<ffffffffa0581c7e>] sdma_engine_interrupt+0x8e/0x100 [hfi1]
[<ffffffffa0546bdd>] sdma_interrupt+0x5d/0xa0 [hfi1]
[<ffffffff81097e57>] handle_irq_event_percpu+0x47/0x1d0
[<ffffffff81098017>] handle_irq_event+0x37/0x60
[<ffffffff8109aa5f>] handle_edge_irq+0x6f/0x120
[<ffffffff810044af>] handle_irq+0xbf/0x150
[<ffffffff8104c9b7>] ? irq_enter+0x17/0x80
[<ffffffff8150168d>] do_IRQ+0x4d/0xc0
[<ffffffff814f7c6a>] common_interrupt+0x6a/0x6a
<EOI> [<ffffffff81073524>] ? finish_task_switch+0x54/0xe0
[<ffffffff814f56c6>] __schedule+0x3b6/0x7e0
[<ffffffff810763a6>] __cond_resched+0x26/0x30
[<ffffffff814f5eda>] _cond_resched+0x3a/0x50
[<ffffffff814f4f82>] down_write+0x12/0x30
[<ffffffffa0591619>] hfi1_release_user_pages+0x69/0x90 [hfi1]
[<ffffffffa059173a>] sdma_rb_remove+0x9a/0xc0 [hfi1]
[<ffffffffa056c00d>] __mmu_rb_remove.isra.5+0x5d/0x70 [hfi1]
[<ffffffffa056c536>] hfi1_mmu_rb_remove+0x56/0x70 [hfi1]
[<ffffffffa059427b>] hfi1_user_sdma_process_request+0x74b/0x1160 [hfi1]
[<ffffffffa055c763>] hfi1_aio_write+0xc3/0x100 [hfi1]
[<ffffffff8116a14c>] do_sync_readv_writev+0x4c/0x80
[<ffffffff8116b58b>] do_readv_writev+0xbb/0x230
[<ffffffff811a9da1>] ? fsnotify+0x241/0x320
[<ffffffff81073524>] ? finish_task_switch+0x54/0xe0
[<ffffffff8116b795>] vfs_writev+0x35/0x60
[<ffffffff8116b8c9>] SyS_writev+0x49/0xc0
[<ffffffff810cd876>] ? __audit_syscall_exit+0x1f6/0x2a0
[<ffffffff814ff992>] system_call_fastpath+0x16/0x1b
As evident from the backtrace above, the process was being put to sleep
while holding the lock.
Limiting the scope of the lock to the RB tree operation itself fixes the
problem, keeping the tree protected while still allowing the process to
be put to sleep when needed.
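A sketch of the narrowed locking (identifiers assumed from the hfi1
MMU RB code): only the tree manipulation runs under the spinlock, and
work that may sleep happens after the lock is dropped.

spin_lock_irqsave(&handler->lock, flags);
node = __mmu_rb_search(handler, addr, len);
if (node)
    __mmu_int_rb_remove(node, handler->root);
spin_unlock_irqrestore(&handler->lock, flags);

if (node)
    handler->ops->remove(handler->root, node);  /* may sleep */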
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
There is a potential kernel crash when the MMU notifier calls the
invalidation routines in the hfi1 pinned page caching code for sdma.
The invalidation routine could call the remove callback
for the node, which in turn ends up dereferencing the
current task_struct to get a pointer to the mm_struct.
However, the mm_struct pointer could be NULL, resulting in
the following backtrace:
BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
IP: [<ffffffffa041f75a>] sdma_rb_remove+0xaa/0x100 [hfi1]
task: ffff88085e66e080 ti: ffff88085c244000 task.ti: ffff88085c244000
RIP: 0010:[<ffffffffa041f75a>] [<ffffffffa041f75a>] sdma_rb_remove+0xaa/0x100 [hfi1]
RSP: 0000:ffff88085c245878 EFLAGS: 00010002
RAX: 0000000000000000 RBX: ffff88105b9bbd40 RCX: ffffea003931a830
RDX: 0000000000000004 RSI: ffff88105754a9c0 RDI: ffff88105754a9c0
RBP: ffff88085c245890 R08: ffff88105b9bbd70 R09: 00000000fffffffb
R10: ffff88105b9bbd58 R11: 0000000000000013 R12: ffff88105754a9c0
R13: 0000000000000001 R14: 0000000000000001 R15: ffff88105b9bbd40
FS: 0000000000000000(0000) GS:ffff88107ef40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000a8 CR3: 0000000001a0b000 CR4: 00000000001407e0
Stack:
ffff88105b9bbd40 ffff88080ec481a8 ffff88080ec481b8 ffff88085c2458c0
ffffffffa03fa00e ffff88080ec48190 ffff88080ed9cd00 0000000001024000
0000000000000000 ffff88085c245920 ffffffffa03fa0e7 0000000000000282
Call Trace:
[<ffffffffa03fa00e>] __mmu_rb_remove.isra.5+0x5e/0x70 [hfi1]
[<ffffffffa03fa0e7>] mmu_notifier_mem_invalidate+0xc7/0xf0 [hfi1]
[<ffffffffa03fa143>] mmu_notifier_page+0x13/0x20 [hfi1]
[<ffffffff81156dd0>] __mmu_notifier_invalidate_page+0x50/0x70
[<ffffffff81140bbb>] try_to_unmap_one+0x20b/0x470
[<ffffffff81141ee7>] try_to_unmap_anon+0xa7/0x120
[<ffffffff81141fad>] try_to_unmap+0x4d/0x60
[<ffffffff8111fd7b>] shrink_page_list+0x2eb/0x9d0
[<ffffffff81120ab3>] shrink_inactive_list+0x243/0x490
[<ffffffff81121491>] shrink_lruvec+0x4c1/0x640
[<ffffffff81121641>] shrink_zone+0x31/0x100
[<ffffffff81121b0f>] kswapd_shrink_zone.constprop.62+0xef/0x1c0
[<ffffffff811229e3>] kswapd+0x403/0x7e0
[<ffffffff811225e0>] ? shrink_all_memory+0xf0/0xf0
[<ffffffff81068ac0>] kthread+0xc0/0xd0
[<ffffffff81068a00>] ? insert_kthread_work+0x40/0x40
[<ffffffff814ff8ec>] ret_from_fork+0x7c/0xb0
[<ffffffff81068a00>] ? insert_kthread_work+0x40/0x40
To correct this, the mm_struct passed to us by the MMU notifier is
used (which is what should have been done to begin with). This avoids
the broken dereferences and ensures that the correct mm_struct is used.
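A sketch of the idea (callback signature and node layout assumed):

static void sdma_rb_remove(struct rb_root *root, struct mmu_rb_node *node,
                           struct mm_struct *mm)
{
    struct sdma_mmu_node *snode =
        container_of(node, struct sdma_mmu_node, rb);

    /* was: derived from current->mm, which can be NULL here */
    hfi1_release_user_pages(mm, snode->pages, snode->npages, false);
    kfree(snode);
}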
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This is more or less a renaming and moving of functions and should not
introduce any functional change.
cpupower was built from cpufrequtils (which had a C library providing
easy access to cpu frequency platform info). In the meantime it has
been enhanced with quite a few neat cpuidle userspace tools.
Now the cpu idle functions have been separated out and added to the
cpupower.so library.
So besides the already existing public header file:
cpufreq.h
cpupower now also exports the cpu idle functions in:
cpuidle.h
Here again pasted for better review of the interfaces:
======================================
int cpuidle_is_state_disabled(unsigned int cpu,
                              unsigned int idlestate);
int cpuidle_state_disable(unsigned int cpu, unsigned int idlestate,
                          unsigned int disable);
unsigned long cpuidle_state_latency(unsigned int cpu,
                                    unsigned int idlestate);
unsigned long cpuidle_state_usage(unsigned int cpu,
                                  unsigned int idlestate);
unsigned long long cpuidle_state_time(unsigned int cpu,
                                      unsigned int idlestate);
char *cpuidle_state_name(unsigned int cpu,
                         unsigned int idlestate);
char *cpuidle_state_desc(unsigned int cpu,
                         unsigned int idlestate);
unsigned int cpuidle_state_count(unsigned int cpu);
char *cpuidle_get_governor(void);
char *cpuidle_get_driver(void);
======================================
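A hedged usage sketch of the exported interface (header name as above;
whether the returned strings must be freed is an assumption here):

#include <stdio.h>
#include <stdlib.h>
#include <cpuidle.h>

int main(void)
{
    unsigned int cpu = 0;
    unsigned int state, count = cpuidle_state_count(cpu);

    printf("governor: %s, driver: %s\n",
           cpuidle_get_governor(), cpuidle_get_driver());

    for (state = 0; state < count; state++) {
        char *name = cpuidle_state_name(cpu, state);

        printf("state %u: %s, latency %lu us, usage %lu\n",
               state, name,
               cpuidle_state_latency(cpu, state),
               cpuidle_state_usage(cpu, state));
        free(name);  /* assuming the string is allocated */
    }
    return 0;
}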
Signed-off-by: Thomas Renninger <trenn@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Fix a spelling mistake: avarage -> average.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Thomas Renninger <trenn@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The token before "-" should be the program name, no spaces allowed.
See man(7) and lexgrog(1).
Signed-off-by: Mattia Dongili <malattia@linux.it>
Signed-off-by: Thomas Renninger <trenn@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The error handling in prepare_output has several resource leaks.
Ensure that filename is freed and the directory stream DIR is closed
before returning.
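A sketch of the corrected cleanup (identifiers assumed):

#include <dirent.h>
#include <limits.h>
#include <stdlib.h>

static int prepare_output(const char *dirname)
{
    DIR *dir;
    char *filename;

    dir = opendir(dirname);
    if (!dir)
        return -1;

    filename = malloc(PATH_MAX);
    if (!filename) {
        closedir(dir);   /* previously left open */
        return -1;
    }

    /* ... scan the directory, build the output file name ... */

    free(filename);      /* previously leaked */
    closedir(dir);
    return 0;
}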
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Thomas Renninger <trenn@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Mattia Dongili <malattia@linux.it>
Signed-off-by: Thomas Renninger <trenn@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The sti-cpufreq driver unconditionally registers the cpufreq-dt driver,
which causes issues in a multi-platform build. For example, on the
Vexpress TC2 platform, we get the following error on boot:
cpu cpu0: OPP-v2 not supported
cpu cpu0: Not doing voltage scaling
cpu: dev_pm_opp_of_cpumask_add_table: couldn't find opp table
for cpu:0, -19
cpu cpu0: dev_pm_opp_get_max_volt_latency: Invalid regulator (-6)
...
arm_big_little: bL_cpufreq_register: Failed registering platform driver:
vexpress-spc, err: -17
The actual driver fails to initialise because cpufreq-dt has already
been probed successfully, which is incorrect. This issue can occur on
any platform not using cpufreq-dt in a multi-platform build.
This patch adds a check to do selective initialization of the driver.
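A sketch of such a guard (compatible strings assumed): bail out early
unless we are actually booted on a matching ST platform.

static int __init sti_cpufreq_init(void)
{
    if (!of_machine_is_compatible("st,stih407") &&
        !of_machine_is_compatible("st,stih410"))
        return -ENODEV;

    /* ... existing OPP setup and cpufreq-dt registration ... */
    return 0;
}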
Fixes: ab0ea257fc ("cpufreq: st: Provide runtime initialised driver for ST's platforms")
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Lee Jones <lee.jones@linaro.org>
Cc: 4.5+ <stable@vger.kernel.org> # 4.5+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Move the cpufreq bits for mvebu into the drivers/cpufreq/ directory;
that's where they really belong.
Compile tested only.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There are no more users of platform-data for cpufreq-dt driver, get rid
of it.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
That will allow us to avoid using cpufreq-dt platform data.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Existing platforms that do not support operating-points-v2 can
explicitly tell the opp core that some of the CPUs share opp tables,
with the help of dev_pm_opp_set_sharing_cpus().
For such platforms, explicitly ask the opp core to provide the list of
CPUs sharing the opp table with the current cpu device, before falling
back to platform data.
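A sketch of platform code using the helper (cpu ids illustrative):

struct device *cpu_dev = get_cpu_device(0);
cpumask_var_t cpus;
int ret, cpu;

if (!zalloc_cpumask_var(&cpus, GFP_KERNEL))
    return -ENOMEM;

for (cpu = 0; cpu < 4; cpu++)       /* CPUs 0-3 share one table */
    cpumask_set_cpu(cpu, cpus);

ret = dev_pm_opp_set_sharing_cpus(cpu_dev, cpus);
free_cpumask_var(cpus);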
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The OPP core allows a platform to mark an OPP table as shared, when the
platform isn't using operating-points-v2 bindings.
So there should also be a non-DT way of finding out whether the OPP
table is shared or not.
This patch adds dev_pm_opp_get_sharing_cpus(), which first tries to get
OPP sharing information from the opp table (in case it is already marked
as shared), and otherwise uses the existing DT way of finding the
sharing information.
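A sketch of driver-side usage (simplified): the helper fills 'cpus'
with every CPU sharing cpu_dev's OPP table, regardless of how the
sharing was declared.

cpumask_var_t cpus;
int ret;

if (!zalloc_cpumask_var(&cpus, GFP_KERNEL))
    return -ENOMEM;

ret = dev_pm_opp_get_sharing_cpus(cpu_dev, cpus);
if (!ret) {
    /* ... create one cpufreq policy covering the whole mask ... */
}
free_cpumask_var(cpus);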
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
dev_pm_opp_set_sharing_cpus() isn't supposed to update the cpumask
passed as its parameter, and so it should always have been marked
'const'.
Do it now.
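The resulting prototype then reads:

int dev_pm_opp_set_sharing_cpus(struct device *cpu_dev,
                                const struct cpumask *cpumask);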
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Some of the routines have used -ENOSYS for cases where the
functionality isn't implemented in the kernel, but ENOSYS is supposed
to be used only for missing syscalls.
Replace that with -ENOTSUPP, which specifically means that the
operation isn't supported.
While at it, replace existing -EINVAL errors for similar cases with
-ENOTSUPP.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
arm_cpuidle_suspend() may return -EOPNOTSUPP, or any value returned
by the cpu_ops/cpuidle_ops suspend call. arm_enter_idle_state() doesn't
update 'ret' with this value, meaning we always signal success to
cpuidle_enter_state(), causing it to update the usage counters as if we
succeeded.
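A simplified sketch of the fix: capture and propagate the suspend
result instead of discarding it.

static int arm_enter_idle_state(struct cpuidle_device *dev,
                                struct cpuidle_driver *drv, int idx)
{
    int ret;

    if (!idx) {
        cpu_do_idle();
        return idx;
    }

    ret = cpu_pm_enter();
    if (!ret) {
        /* was: arm_cpuidle_suspend(idx); -- return value dropped */
        ret = arm_cpuidle_suspend(idx);
        cpu_pm_exit();
    }

    return ret ? -1 : idx;
}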
Fixes: 191de17aa3 ("ARM64: cpuidle: Replace cpu_suspend by the common ARM/ARM64 function")
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: 4.1+ <stable@vger.kernel.org> # 4.1+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 32e8d689dc (PM / sleep: trace_device_pm_callback coverage in
dpm_prepare/complete) removed all users of this variable but forgot to
remove the variable itself.
Signed-off-by: Thierry Reding <treding@nvidia.com>
[ rjw: Subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The name of the prev_cpu_wall field in struct cpu_dbs_info is
confusing, because it doesn't represent wall time, but the previous
update time as returned by get_cpu_idle_time() (that may be the
current value of jiffies_64 in some cases, for example).
Moreover, the names of some related variables in dbs_update() take
that confusion further.
Rename all of those things so that their names reflect their purpose
more accurately. While at it, drop unnecessary parens from one of
the updated expressions.
No functional changes.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Chen Yu <yu.c.chen@intel.com>
To reduce the size of builtin-trace.c.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-11zxg3qitk6bw2x30135k9z4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With 'perf record --switch-output' without -a, record__synthesize() in
record__switch_output() won't generate tracking events because there's
no thread_map in the evlist, which causes the newly created perf.data
to lack map and comm information.
This patch creates a fake thread_map and directly calls
perf_event__synthesize_thread_map() for those events.
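A sketch of the idea (the argument list follows perf's internal API of
the time and is an assumption here):

struct thread_map *thread_map = thread_map__new_by_tid(getpid());

if (thread_map == NULL)
    return -ENOMEM;

err = perf_event__synthesize_thread_map(&rec->tool, thread_map,
                                        process_synthesized_event,
                                        &rec->session->machines.host,
                                        rec->opts.sample_address,
                                        rec->opts.proc_map_timeout);
thread_map__put(thread_map);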
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1461178794-40467-8-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The cost of buildid cache processing is high: reading all events in
the output perf.data, opening each elf file to read buildids, then
copying them into the ~/.debug directory. In switch-output mode, this
heavy work blocks perf from receiving perf events for too long.
Enable no-buildid and no-buildid-cache by default if --switch-output is
provided. Still allow the user to pass --no-no-buildid to explicitly
enable buildids in this case.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1461178794-40467-6-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Updated man page ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Without this patch, the last output doesn't have a timestamp appended if
--timestamp-filename is not explicitly provided. For example:
# perf record -a --switch-output &
[1] 11224
# kill -s SIGUSR2 11224
[ perf record: dump data: Woken up 1 times ]
# [ perf record: Dump perf.data.2015122622372823 ]
# fg
perf record -a --switch-output
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.027 MB perf.data (540 samples) ]
# ls -l
total 836
-rw------- 1 root root 33256 Dec 26 22:37 perf.data <---- *Odd*
-rw------- 1 root root 817156 Dec 26 22:37 perf.data.2015122622372823
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1461178794-40467-5-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Updated man page, that also got an entry for --timestamp-filename ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
auxtrace_snapshot_state matches the trigger model, so use a trigger to
implement it. The auxtrace_snapshot_state and auxtrace_snapshot_err
variables are absorbed.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1461178794-40467-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use 'trigger' to model operations which need to be executed when an
event (a signal, for example) is observed.
States and transitions:

OFF--(on)--> READY --(hit)--> HIT
               ^               |
               |            (ready)
               |               |
                \_____________/

is_hit and is_ready are the two key functions for querying the state of
a trigger: is_hit means the event has already happened; is_ready means
the trigger is waiting for the event.
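A sketch of the intended usage (surrounding context assumed):

static DEFINE_TRIGGER(snapshot_trigger);

static void sig_handler(int sig)
{
    if (trigger_is_ready(&snapshot_trigger))
        trigger_hit(&snapshot_trigger);     /* READY -> HIT */
}

/* elsewhere, in the main loop: */
trigger_on(&snapshot_trigger);              /* OFF -> READY */

if (trigger_is_hit(&snapshot_trigger)) {
    /* handle the event, then wait for the next one */
    trigger_ready(&snapshot_trigger);       /* HIT -> READY */
}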
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1461178794-40467-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The error messages returned by this method should not have a trailing
newline; fix the two cases where one was present.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-8af0pazzhzl3dluuh8p7ar7p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When the kernel allows tweaking perf_event_max_stack and the event being
setup has PERF_SAMPLE_CALLCHAIN in its perf_event_attr.sample_type, tell
the user that tweaking /proc/sys/kernel/perf_event_max_stack may solve
the problem.
Before:
# echo 32000 > /proc/sys/kernel/perf_event_max_stack
# perf record -g usleep 1
Error:
The sys_perf_event_open() syscall returned with 12 (Cannot allocate memory) for event (cycles:ppp).
/bin/dmesg may provide additional information.
No CONFIG_PERF_EVENTS=y kernel support configured?
#
After:
# echo 64000 > /proc/sys/kernel/perf_event_max_stack
# perf record -g usleep 1
Error:
Not enough memory to setup event with callchain.
Hint: Try tweaking /proc/sys/kernel/perf_event_max_stack
Hint: Current value: 64000
#
Suggested-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ebv0orelj1s1ye857vhb82ov@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Coverity flagged this under CID 1354884 as a sizeof mismatch; it turns
out that the argument "attr" passed to syscall should have been a
pointer to attr in the first place.
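A simplified sketch of the fix:

union bpf_attr attr;

memset(&attr, 0, sizeof(attr));
/* ... fill in attr for the given cmd ... */

/* was: syscall(__NR_bpf, cmd, attr, sizeof(attr)); */
ret = syscall(__NR_bpf, cmd, &attr, sizeof(attr));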
Reported-by: coverity (CID 1354884)
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Fixes: 8f9e05fb29 ("perf tools: Fix PowerPC native building")
Link: http://lkml.kernel.org/r/1461551694-5512-3-git-send-email-f.fainelli@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Assigning "attr" to "attr" does not have any effect; it was caught by
Coverity, so let's remove it.
Reported-by: coverity (CID 1354720)
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Fixes: 1b76c13e4b ("bpf tools: Introduce 'bpf' library and add bpf feature check")
Link: http://lkml.kernel.org/r/1461551694-5512-2-git-send-email-f.fainelli@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Hibernation represents a system state save/restore through
a system reboot; this implies that the logical cpus carrying
out hibernation/thawing must be the same, so that the context
saved in the snapshot image on hibernation is consistent with
the state of the system on resume.
If resume from hibernation is driven through a kernel command line
parameter, the cpu responsible for thawing the system will be
whichever CPU the firmware boots the system on upon cold boot
(ie logical cpu 0). This means that, in order to keep the system
context consistent between the hibernate snapshot image and the
system state on kernel resume from hibernate, logical cpu 0 must
be online on hibernation and must be the logical cpu that creates
the snapshot image.
This patch adds a PM notifier that enforces that logical cpu 0 is
online when hibernation is started (and prevents hibernation if it
is not), which is sufficient to guarantee it will be the one creating
the snapshot image, therefore providing the resume cpu a consistent
snapshot of the system to resume to.
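A sketch of such a notifier (names assumed; the real patch may differ):

static int hibernate_cpu0_pm_notify(struct notifier_block *nb,
                                    unsigned long action, void *data)
{
    if (action == PM_HIBERNATION_PREPARE && !cpu_online(0)) {
        pr_warn("hibernate: logical cpu 0 offline, aborting\n");
        return NOTIFY_BAD;
    }

    return NOTIFY_OK;
}

static struct notifier_block hibernate_cpu0_nb = {
    .notifier_call = hibernate_cpu0_pm_notify,
};

/* registered once at boot via register_pm_notifier(&hibernate_cpu0_nb) */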
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Add support for hibernate/suspend-to-disk.
Suspend borrows code from cpu_suspend() to write cpu state onto the stack,
before calling swsusp_save() to save the memory image.
Restore creates a set of temporary page tables, covering only the
linear map, copies the restore code to a 'safe' page, then uses the copy to
restore the memory image. The copied code executes in the lower half of the
address space, and once complete, restores the original kernel's page
tables. It then calls into cpu_resume(), and follows the normal
cpu_suspend() path back into the suspend code.
To restore a kernel using KASLR, the address of the page tables and
of cpu_resume() are stored in the hibernate arch-header, and the el2
vectors are pivoted via the 'safe' page in low memory.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Kevin Hilman <khilman@baylibre.com> # Tested on Juno R2
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Some architectures require code written to memory as if it were data
to be 'cleaned' from any data caches before the processor can fetch it
as new instructions.
During resume from hibernate, the snapshot code copies some pages directly,
meaning these architectures do not get a chance to perform their cache
maintenance. Modify the read and decompress code to call
flush_icache_range() on all pages that are restored, so that the restored
in-place pages are guaranteed to be executable on these architectures.
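A simplified sketch (the real helpers were made static as
clean_pages_on_read()/clean_pages_on_decompress(), per the note below):

static void clean_pages(void *buf, ssize_t len)
{
    /* make the freshly written bytes visible to instruction fetch */
    flush_icache_range((unsigned long)buf,
                       (unsigned long)buf + len);
}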
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
[will: make clean_pages_on_* static and remove initialisers]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Kexec and hibernate need to copy pages of memory, but may not have all
of the kernel mapped, and are unable to call copy_page().
Add a simplistic copy_page() macro that can be inlined in these
situations. lib/copy_page.S provides a bigger, better version, but
uses more registers.
Signed-off-by: Geoff Levand <geoff@infradead.org>
[Changed asm label to 9998, added commit message]
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
KERNEL_START and KERNEL_END are useful outside head.S, move them to a
header file.
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
page.h uses '_AC' in the definition of PAGE_SIZE, but doesn't include
linux/const.h where this is defined. This produces build warnings when only
asm/page.h is included by asm code.
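For reference, the pattern in question (PAGE_SHIFT value illustrative,
since arm64 makes it configurable):

#include <linux/const.h>

#define PAGE_SHIFT  12
#define PAGE_SIZE   (_AC(1, UL) << PAGE_SHIFT)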
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
By enabling the MMU early in cpu_resume(), the sleep_save_sp and stack
can be accessed by VA, which avoids the need to convert addresses and
clean to PoC on the suspend path.
MMU setup is shared with the boot path, meaning the swapper_pg_dir is
restored directly: ttbr1_el1 is no longer saved/restored.
struct sleep_save_sp is removed, replacing it with a single array of
pointers.
cpu_do_{suspend,resume} could be further reduced to not restore: cpacr_el1,
mdscr_el1, tcr_el1, vbar_el1 and sctlr_el1, all of which are set by
__cpu_setup(). However these values all contain res0 bits that may be used
to enable future features.
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Hibernate could make use of the cpu_suspend() code to save/restore cpu
state, however it needs to be able to return '0' from the 'finisher'.
Rework cpu_suspend() so that the finisher is called from C code,
independently from the save/restore of cpu state. Space to save the context
in is allocated in the caller's stack frame, and passed into
__cpu_suspend_enter().
Hibernate's use of this API will look like a copy of the cpu_suspend()
function.
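A simplified sketch of the reworked flow (arm64 details trimmed):

int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
{
    struct sleep_stack_data state;  /* context lives on our stack */
    int ret = 0;

    if (__cpu_suspend_enter(&state)) {
        /* first pass: context saved, call the finisher; on success
         * the cpu powers down and we never return here */
        ret = fn(arg);
        if (!ret)
            ret = -EOPNOTSUPP;
    } else {
        /* woke up: we re-entered here via cpu_resume() */
        __cpu_suspend_exit();
    }

    return ret;
}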
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The current kvm implementation on arm64 does cpu-specific
initialization at system boot, and has no way to gracefully shut down
a core in terms of kvm. This prevents kexec from rebooting the system
at EL2.
This patch adds a cpu tear-down function, kvm_arch_hardware_disable(),
and moves the existing cpu-init code into a separate function,
kvm_arch_hardware_enable().
We don't need the arm64 specific cpu hotplug hook any more.
Since this patch modifies common code between arm and arm64, one stub
definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
compilation errors.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
[Rebase, added separate VHE init/exit path, changed resets use of
kvm_call_hyp() to the __version, en/disabled hardware in init_subsystems(),
added icache maintenance to __kvm_hyp_reset() and removed lr restore, removed
guest-enter after teardown handling]
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
A later patch implements kvm_arch_hardware_disable(), to remove kvm
from el2 and reinstate the hyp-stub.
This can happen while guests are running, particularly when kvm_reboot()
calls kvm_arch_hardware_disable() on each cpu. This can interrupt a guest,
remove kvm, then allow the guest to be scheduled again. This causes
kvm_call_hyp() to be run against the hyp-stub.
Change the hyp-stub to return a new exception type when this happens,
and add code to kvm's handle_exit() to tell userspace we failed to
enter the guest.
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The existing arm64 hcall implementations are limited in that they only
allow for two distinct hcalls; with the x0 register either zero or not
zero. Also, the API of the hyp-stub exception vector routines and the
KVM exception vector routines differ; hyp-stub uses a non-zero value in
x0 to implement __hyp_set_vectors, whereas KVM uses it to implement
kvm_call_hyp.
To allow for additional hcalls to be defined and to make the arm64 hcall
API more consistent across exception vector routines, change the hcall
implementations to reserve all x0 values below 0xfff for hcalls such
as {s,g}et_vectors().
Define two new preprocessor macros HVC_GET_VECTORS, and HVC_SET_VECTORS
to be used as hcall type specifiers and convert the existing
__hyp_get_vectors() and __hyp_set_vectors() routines to use these new
macros when executing an HVC call. Also, change the corresponding
hyp-stub and KVM el1_sync exception vector routines to use these new
macros.
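For illustration, the shape of the new type specifiers (values
illustrative, not taken from the patch):

#define HVC_GET_VECTORS 0   /* return the current vector base in x0 */
#define HVC_SET_VECTORS 1   /* x1 holds the new vector base address */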
Signed-off-by: Geoff Levand <geoff@infradead.org>
[Merged two hcall patches, moved immediate value from esr to x0, use lr
as a scratch register, changed limit to 0xfff]
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Today the 'hvc' calling KVM or the hyp-stub is expected to preserve all
registers. KVM saves/restores the registers it needs on the EL2 stack
using do_el2_call(). The hyp-stub has no stack, and later patches need
to be able to clobber the link register.
Move the link register save/restore to the call sites.
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>