This patch adds paravirt-ops hooks in pv_mmu_ops for ptep_modify_prot_start and
ptep_modify_prot_commit. This allows the hypervisor-specific backends to
implement these in some more efficient way.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds an API for doing read-modify-write updates to a pte's
protection bits which may race against hardware updates to the pte.
After reading the pte, the hardware may asynchonously set the accessed
or dirty bits on a pte, which would be lost when writing back the
modified pte value.
The existing technique to handle this race is to use
ptep_get_and_clear() atomically fetch the old pte value and clear it
in memory. This has the effect of marking the pte as non-present,
which will prevent the hardware from updating its state. When the new
value is written back, the pte will be present again, and the hardware
can resume updating the access/dirty flags.
When running in a virtualized environment, pagetable updates are
relatively expensive, since they generally involve some trap into the
hypervisor. To mitigate the cost of these updates, we tend to batch
them.
However, because of the atomic nature of ptep_get_and_clear(), it is
inherently non-batchable. This new interface allows batching by
giving the underlying implementation enough information to open a
transaction between the read and write phases:
ptep_modify_prot_start() returns the current pte value, and puts the
pte entry into a state where either the hardware will not update the
pte, or if it does, the updates will be preserved on commit.
ptep_modify_prot_commit() writes back the updated pte, makes sure that
any hardware updates made since ptep_modify_prot_start() are
preserved.
ptep_modify_prot_start() and _commit() must be exactly paired, and
used while holding the appropriate pte lock. They do not protect
against other software updates of the pte in any way.
The current implementations of ptep_modify_prot_start and _commit are
functionally unchanged from before: _start() uses ptep_get_and_clear()
fetch the pte and zero the entry, preventing any hardware updates.
_commit() simply writes the new pte value back knowing that the
hardware has not updated the pte in the meantime.
The only current user of this interface is mprotect
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
--
WARNING: vmlinux.o(.text+0x721a): Section mismatch in reference from the function ___fill_code_cplbtab() to the function .init.text:_fill_cplbtab()
The function ___fill_code_cplbtab() references
the function __init _fill_cplbtab().
This is often because ___fill_code_cplbtab lacks a __init
annotation or the annotation of _fill_cplbtab is wrong.
WARNING: vmlinux.o(.text+0x7238): Section mismatch in reference from the function ___fill_code_cplbtab() to the function .init.text:_fill_cplbtab()
The function ___fill_code_cplbtab() references
the function __init _fill_cplbtab().
This is often because ___fill_code_cplbtab lacks a __init
annotation or the annotation of _fill_cplbtab is wrong.
WARNING: vmlinux.o(.text+0x7250): Section mismatch in reference from the function ___fill_code_cplbtab() to the function .init.text:_fill_cplbtab()
The function ___fill_code_cplbtab() references
the function __init _fill_cplbtab().
This is often because ___fill_code_cplbtab lacks a __init
annotation or the annotation of _fill_cplbtab is wrong.
WARNING: vmlinux.o(.text+0x7264): Section mismatch in reference from the function ___fill_code_cplbtab() to the function .init.text:_fill_cplbtab()
The function ___fill_code_cplbtab() references
the function __init _fill_cplbtab().
This is often because ___fill_code_cplbtab lacks a __init
annotation or the annotation of _fill_cplbtab is wrong.
WARNING: vmlinux.o(.text+0x72a2): Section mismatch in reference from the function ___fill_data_cplbtab() to the function .init.text:_fill_cplbtab()
The function ___fill_data_cplbtab() references
the function __init _fill_cplbtab().
This is often because ___fill_data_cplbtab lacks a __init
annotation or the annotation of _fill_cplbtab is wrong.
WARNING: vmlinux.o(.text+0x72bc): Section mismatch in reference from the function ___fill_data_cplbtab() to the function .init.text:_fill_cplbtab()
The function ___fill_data_cplbtab() references
the function __init _fill_cplbtab().
This is often because ___fill_data_cplbtab lacks a __init
annotation or the annotation of _fill_cplbtab is wrong.
WARNING: vmlinux.o(.text+0x72d4): Section mismatch in reference from the function ___fill_data_cplbtab() to the function .init.text:_fill_cplbtab()
The function ___fill_data_cplbtab() references
the function __init _fill_cplbtab().
This is often because ___fill_data_cplbtab lacks a __init
annotation or the annotation of _fill_cplbtab is wrong.
WARNING: vmlinux.o(.text+0x72e8): Section mismatch in reference from the function ___fill_data_cplbtab() to the function .init.text:_fill_cplbtab()
The function ___fill_data_cplbtab() references
the function __init _fill_cplbtab().
This is often because ___fill_data_cplbtab lacks a __init
annotation or the annotation of _fill_cplbtab is wrong.
--
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Initialize the lock of bad_irq_desc properly.
The content of irq_desc array is replaced by bad_irq_desc in blackfin
arch irqchip init code. So, do it properly as common irq init code.
Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
The second argument "type" is not used in audit_filter_user(), so I think that type can be removed. If I'm wrong, please tell me.
Signed-off-by: Peng Haitao <penght@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Fix auditfilter kernel-doc misssing parameter description:
Warning(lin2626-rc3//kernel/auditfilter.c:1551): No description found for parameter 'sessionid'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The first argument "nlh->nlmsg_type" of audit_receive_filter() should be modified to "msg_type" in audit_receive_msg().
Signed-off-by: Peng Haitao <penght@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] Eliminate NULL test after alloc_bootmem in iosapic_alloc_rte()
[IA64] Handle count==0 in sn2_ptc_proc_write()
[IA64] Fix boot failure on ia64/sn2
* 'kvm-updates-2.6.26' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
KVM: Remove now unused structs from kvm_para.h
x86: KVM guest: Use the paravirt clocksource structs and functions
KVM: Make kvm host use the paravirt clocksource structs
x86: Make xen use the paravirt clocksource structs and functions
x86: Add structs and functions for paravirt clocksource
KVM: VMX: Fix host msr corruption with preemption enabled
KVM: ioapic: fix lost interrupt when changing a device's irq
KVM: MMU: Fix oops on guest userspace access to guest pagetable
KVM: MMU: large page update_pte issue with non-PAE 32-bit guests (resend)
KVM: MMU: Fix rmap_write_protect() hugepage iteration bug
KVM: close timer injection race window in __vcpu_run
KVM: Fix race between timer migration and vcpu migration
OOPS reported by Friedrich Oslage <bluebird@porno-bullen.de>
The problem here is that tp->starget is set every time a lun
is allocated for a particular target so we can catch the
sdev_target parent value.
The reset handler uses the NULL'ness of this value to determine
which targets are active.
But esp_slave_destroy() does not NULL out this value when appropriate.
So for every target that doesn't respond, the SCSI bus scan causes
a stale pointer to be left here, with ensuing crashes like you're
seeing.
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Rerouting should only happen in LOCAL_OUT, in INPUT its useless
since the packet has already chosen its final destination.
Noticed by Alexey Dobriyan <adobriyan@gmail.com>.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
On 9xx chips, bus mastering needs to be enabled at resume time for much of the
chip to function. With this patch, vblank interrupts will work as expected
on resume, along with other chip functions. Fixes kernel bugzilla #10844.
Signed-off-by: Jie Luo <clotho67@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The kvm_* structs are obsoleted by the pvclock_* ones.
Now all users have been switched over and the old structs
can be dropped.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
This patch updates the kvm host code to use the pvclock structs
and functions, thereby making it compatible with Xen.
The patch also fixes an initialization bug: on SMP systems the
per-cpu has two different locations early at boot and after CPU
bringup. kvmclock must take that in account when registering the
physical address within the host.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
This patch updates the kvm host code to use the pvclock structs.
It also makes the paravirt clock compatible with Xen.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
This patch updates the xen guest to use the pvclock structs
and helper functions.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
This patch adds structs for the paravirt clocksource ABI
used by both xen and kvm (pvclock-abi.h).
It also adds some helper functions to read system time and
wall clock time from a paravirtual clocksource (pvclock.[ch]).
They are based on the xen code. They are enabled using
CONFIG_PARAVIRT_CLOCK.
Subsequent patches of this series will put the code in use.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
This patch fixes bz 450641.
This patch changes the computation for zero_metapath_length(), which it
renames to metapath_branch_start(). When you are extending the metadata
tree, The indirect blocks that point to the new data block must either
diverge from the existing tree either at the inode, or at the first
indirect block. They can diverge at the first indirect block because the
inode has room for 483 pointers while the indirect blocks have room for
509 pointers, so when the tree is grown, there is some free space in the
first indirect block. What metapath_branch_start() now computes is the
height where the first indirect block for the new data block is located.
It can either be 1 (if the indirect block diverges from the inode) or 2
(if it diverges from the first indirect block).
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
As noted by Akinobu Mita alloc_bootmem and related functions never return
NULL and always return a zeroed region of memory. Thus a NULL test or
memset after calls to these functions is unnecessary.
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The fix applied in e0c6d97c65
"security hole in sn2_ptc_proc_write" didn't take into account
the case where count==0 (which results in a buffer underrun
when adding the trailing '\0'). Thanks to Andi Kleen for
pointing this out.
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Call check_sal_cache_flush() after platform_setup() as
check_sal_cache_flush() now relies on being able to call platform
vector code.
Problem was introduced by: 3463a93def
"Update check_sal_cache_flush to use platform_send_ipi()"
Signed-off-by: Jes Sorensen <jes@sgi.com>
Tested-by: Alex Chiang: <achiang@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Timeouts are measured in jiffies, not in seconds.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
- Fix warning reported by sparse
kernel/kgdb.c:1502:6: warning: symbol 'kgdb_console_write' was not declared.
Should it be static?
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
kgdboe is not presently included kgdb, and there should be no
references to it.
Also fix the tcp port terminal connection example.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Non-PAE operation has been deprecated in Xen for a while, and is
rarely tested or used. xen-unstable has now officially dropped
non-PAE support. Since Xen/pvops' non-PAE support has also been
broken for a while, we may as well completely drop it altogether.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>