linux-pinenote

Author	SHA1	Message	Date
Avi Kivity	a988b910ef	KVM: Add API for determining the number of supported memory slots Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:23 +03:00
Avi Kivity	f725230af9	KVM: Add API to retrieve the number of supported vcpus per vm Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:23 +03:00
Harvey Harrison	7a95727567	KVM: x86 emulator: make register_address_increment and JMP_REL static inlines Change jmp_rel() to a function as well. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:23 +03:00
Harvey Harrison	e4706772ea	KVM: x86 emulator: make register_address, address_mask static inlines Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:22 +03:00
Harvey Harrison	ddcb2885e2	KVM: x86 emulator: add ad_mask static inline Replaces open-coded mask calculation in macros. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:22 +03:00
Glauber de Oliveira Costa	790c73f628	x86: KVM guest: paravirtualized clocksource This is the guest part of kvm clock implementation It does not do tsc-only timing, as tsc can have deltas between cpus, and it did not seem worthy to me to keep adjusting them. We do use it, however, for fine-grained adjustment. Other than that, time comes from the host. [randy dunlap: add missing include] [randy dunlap: disallow on Voyager or Visual WS] Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:22 +03:00
Glauber de Oliveira Costa	18068523d3	KVM: paravirtualized clocksource: host part This is the host part of kvm clocksource implementation. As it does not include clockevents, it is a fairly simple implementation. We only have to register a per-vcpu area, and start writing to it periodically. The area is binary compatible with xen, as we use the same shadow_info structure. [marcelo: fix bad_page on MSR_KVM_SYSTEM_TIME] [avi: save full value of the msr, even if enable bit is clear] [avi: clear previous value of time_page] Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:22 +03:00
Joerg Roedel	24e09cbf48	KVM: SVM: enable LBR virtualization This patch implements the Last Branch Record Virtualization (LBRV) feature of the AMD Barcelona and Phenom processors into the kvm-amd module. It will only be enabled if the guest enables last branch recording in the DEBUG_CTL MSR. So there is no increased world switch overhead when the guest doesn't use these MSRs. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:21 +03:00
Joerg Roedel	f65c229c3e	KVM: SVM: allocate the MSR permission map per VCPU This patch changes the kvm-amd module to allocate the SVM MSR permission map per VCPU instead of a global map for all VCPUs. With this we have more flexibility allowing specific guests to access virtualized MSRs. This is required for LBR virtualization. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:21 +03:00
Joerg Roedel	e6101a96c9	KVM: SVM: let init_vmcb() take struct vcpu_svm as parameter Change the parameter of the init_vmcb() function in the kvm-amd module from struct vmcb to struct vcpu_svm. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:21 +03:00
Ryan Harper	2e11384c2c	KVM: VMX: fix typo in VMX header define Looking at Intel Volume 3b, page 148, table 20-11 and noticed that the field name is 'Deliver' not 'Deliever'. Attached patch changes the define name and its user in vmx.c Signed-off-by: Ryan Harper <ryanh@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:21 +03:00
Joerg Roedel	709ddebf81	KVM: SVM: add support for Nested Paging This patch contains the SVM architecture dependent changes for KVM to enable support for the Nested Paging feature of AMD Barcelona and Phenom processors. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:21 +03:00
Joerg Roedel	fb72d1674d	KVM: MMU: add TDP support to the KVM MMU This patch contains the changes to the KVM MMU necessary for support of the Nested Paging feature in AMD Barcelona and Phenom Processors. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:20 +03:00
Joerg Roedel	cc4b6871e7	KVM: export the load_pdptrs() function to modules The load_pdptrs() function is required in the SVM module for NPT support. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:20 +03:00
Joerg Roedel	4d9976bbdc	KVM: MMU: make the __nonpaging_map function generic The mapping function for the nonpaging case in the softmmu does basically the same as required for Nested Paging. Make this function generic so it can be used for both. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:20 +03:00
Joerg Roedel	1855267210	KVM: export information about NPT to generic x86 code The generic x86 code has to know if the specific implementation uses Nested Paging. In the generic code Nested Paging is called Two Dimensional Paging (TDP) to avoid confusion with (future) TDP implementations of other vendors. This patch exports the availability of TDP to the generic x86 code. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:19 +03:00
Joerg Roedel	6c7dac72d5	KVM: SVM: add module parameter to disable Nested Paging To disable the use of the Nested Paging feature even if it is available in hardware this patch adds a module parameter. Nested Paging can be disabled by passing npt=0 to the kvm_amd module. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:19 +03:00
Joerg Roedel	e3da3acdb3	KVM: SVM: add detection of Nested Paging feature Let SVM detect if the Nested Paging feature is available on the hardware. Disable it to keep this patch series bisectable. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:19 +03:00
Joerg Roedel	33bd6a0b3e	KVM: SVM: move feature detection to hardware setup code By moving the SVM feature detection from the each_cpu code to the hardware setup code it runs only once. As an additional advance the feature check is now available earlier in the module setup process. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:19 +03:00
Joerg Roedel	9457a712a2	KVM: allow access to EFER in 32bit KVM This patch makes the EFER register accessible on a 32bit KVM host. This is necessary to boot 32 bit PAE guests under SVM. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:19 +03:00
Joerg Roedel	9f62e19a11	KVM: VMX: unifdef the EFER specific code To allow access to the EFER register in 32bit KVM the EFER specific code has to be exported to the x86 generic code. This patch does this in a backwards compatible manner. [avi: add check for EFER-less hosts] Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:18 +03:00
Joerg Roedel	50a37eb4e0	KVM: align valid EFER bits with the features of the host system This patch aligns the bits the guest can set in the EFER register with the features in the host processor. Currently it lets EFER.NX disabled if the processor does not support it and enables EFER.LME and EFER.LMA only for KVM on 64 bit hosts. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:18 +03:00
Joerg Roedel	f2b4b7ddf6	KVM: make EFER_RESERVED_BITS configurable for architecture code This patch give the SVM and VMX implementations the ability to add some bits the guest can set in its EFER register. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:18 +03:00
Sheng Yang	2384d2b326	KVM: VMX: Enable Virtual Processor Identification (VPID) To allow TLB entries to be retained across VM entry and VM exit, the VMM can now identify distinct address spaces through a new virtual-processor ID (VPID) field of the VMCS. [avi: drop vpid_sync_all()] [avi: add "cc" to asm constraints] Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:17 +03:00
Avi Kivity	d196e34336	KVM: MMU: Decouple mmio from shadow page tables Currently an mmio guest pte is encoded in the shadow pagetable as a not-present trapping pte, with the SHADOW_IO_MARK bit set. However nothing is ever done with this information, so maintaining it is a useless complication. This patch moves the check for mmio to before shadow ptes are instantiated, so the shadow code is never invoked for ptes that reference mmio. The code is simpler, and with future work, can be made to handle mmio concurrently. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:17 +03:00
Avi Kivity	1d6ad2073e	KVM: x86 emulator: group decoding for group 1 instructions Opcodes 0x80-0x83 Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:16 +03:00
Avi Kivity	d95058a1a7	KVM: x86 emulator: add group 7 decoding This adds group decoding for opcode 0x0f 0x01 (group 7). Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:15 +03:00
Avi Kivity	fd60754e4f	KVM: x86 emulator: Group decoding for groups 4 and 5 Add group decoding support for opcode 0xfe (group 4) and 0xff (group 5). Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:15 +03:00
Avi Kivity	7d858a19ef	KVM: x86 emulator: Group decoding for group 3 This adds group decoding support for opcodes 0xf6, 0xf7 (group 3). Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:14 +03:00
Avi Kivity	43bb19cd33	KVM: x86 emulator: group decoding for group 1A This adds group decode support for opcode 0x8f. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:14 +03:00
Avi Kivity	e09d082c03	KVM: x86 emulator: add support for group decoding Certain x86 instructions use bits 3:5 of the byte following the opcode as an opcode extension, with the decode sometimes depending on bits 6:7 as well. Add support for this in the main decoding table rather than an ad-hock adaptation per opcode. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:14 +03:00
Dong, Eddie	1ae0a13def	KVM: MMU: Simplify hash table indexing Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:14 +03:00
Dong, Eddie	489f1d6526	KVM: MMU: Update shadow ptes on partial guest pte writes A guest partial guest pte write will leave shadow_trap_nonpresent_pte in spte, which generates a vmexit at the next guest access through that pte. This patch improves this by reading the full guest pte in advance and thus being able to update the spte and eliminate the vmexit. This helps pae guests which use two 32-bit writes to set a single 64-bit pte. [truncation fix by Eric] Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Feng (Eric) Liu <eric.e.liu@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 11:53:13 +03:00
David S. Miller	7cf069955f	sparc64: Kill bogus RT_ALIGNEDSZ macro from signal.c The structure has to be 8-byte aligned in size, so this macro is just noise. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-27 00:25:30 -07:00
David S. Miller	5da496e4b9	sparc64: Kill unused local ISA bus layer. No more drivers use this, and therefore it can die. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-26 21:41:23 -07:00
David S. Miller	dc8ca2a111	sparc64: Do not ignore 'pmu' device ranges. I must have disabled this due to other bugs which were fixed over time. And this is needed in order for child devices of "pmu" to get proper resource values. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-26 21:41:20 -07:00
David S. Miller	09337f501e	sparc64: Kill CONFIG_SPARC32_COMPAT It's completely superfluous, CONFIG_COMPAT is sufficient. What this used to be is an umbrella for enabling code shared by all 32-bit compat binary support types. But with the removal of SunOS and Solaris support, the only one left is Linux 32-bit ELF. Update defconfig. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-26 21:41:19 -07:00
David S. Miller	05d515ef3d	sparc64: Cleanups and corrections for arch/sparc64/Kconfig Refer to chip as "SPARC" throughout. Say 32-bit SPARC and 64-bit SPARC rather than mentioning specific chips such like UltraSPARC, as appropriate. Remove non-sense help text referring to things that will never appear on a SPARC system, such as EISA busses etc. Use "help" instead of "--help--" Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-26 21:41:17 -07:00
David S. Miller	227c331178	sparc64: Fix wedged irq regression. Kernel bugzilla 10273 As reported by Jos van der Ende, ever since commit `5a606b72a4` ("[SPARC64]: Do not ACK an INO if it is disabled or inprogress.") sun4u interrupts can get stuck. What this changset did was add the following conditional to the various IRQ chip ->enable() handlers on sparc64: if (unlikely(desc->status & (IRQ_DISABLED\|IRQ_INPROGRESS))) return; which is correct, however it means that special care is needed in the ->enable() method. Specifically we must put the interrupt into IDLE state during an enable, or else it might never be sent out again. Setting the INO interrupt state to IDLE resets the state machine, the interrupt input to the INO is retested by the hardware, and if an interrupt is being signalled by the device, the INO moves back into TRANSMIT state, and an interrupt vector is sent to the cpu. The two sun4v IRQ chip handlers were already doing this properly, only sun4u got it wrong. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-26 21:41:15 -07:00
Peter Zijlstra	7f424a8b08	fix idle (arch, acpi and apm) and lockdep OK, so 25-mm1 gave a lockdep error which made me look into this. The first thing that I noticed was the horrible mess; the second thing I saw was hacks like: `71e93d1561` The problem is that arch idle routines are somewhat inconsitent with their IRQ state handling and instead of fixing _that_, we go paper over the problem. So the thing I've tried to do is set a standard for idle routines and fix them all up to adhere to that. So the rules are: idle routines are entered with IRQs disabled idle routines will exit with IRQs enabled Nearly all already did this in one form or another. Merge the 32 and 64 bit bits so they no longer have different bugs. As for the actual lockdep warning; __sti_mwait() did a plainly un-annotated irq-enable. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-27 00:01:45 +02:00
Yinghai Lu	5f0b2976cb	x86: add pci=check_enable_amd_mmconf and dmi check so will disable that feature by default, and only enable that via pci=check_enable_amd_mmconf or for system match with dmi table. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-26 23:41:04 +02:00
Yinghai Lu	e8ee6f0ae5	x86: work around io allocation overlap of HT links normally BIOSes assign io/mmio range to different HT links without overlapping, even same node same link should get non overlapping entries. but Rafael L. Wysocki's buggy BIOS creates a link with overlapping entries for mmio and io: node 0 link 0: io port [1000, ffffff] node 0 link 0: mmio [e0000000, efffffff] node 0 link 0: mmio [a0000, bffff] node 0 link 0: mmio [80000000, ffffffff] try to merge them and we will get: bus: [00, ff] on node 0 link 0 bus: 00 index 0 io port: [0, ffff] bus: 00 index 1 mmio: [80000000, fcffffffff] bus: 00 index 2 mmio: [a0000, bffff] so later we will reduce the chance to assign used resource to unassigned device. Reported-by: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Tested-by: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-26 23:41:04 +02:00
Yinghai Lu	cbf9bd603a	acpi: get boot_cpu_id as early for k8_scan_nodes [mingo@elte.hu: split from "x86_64: get boot_cpu_id as early for k8_scan_nodes] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-26 23:41:04 +02:00
Yinghai Lu	4cf1946374	x86_64: don't need set default res if only have one root bus if there's only one root bus there's no need to split resources. This patch fixes the issue described at: http://lkml.org/lkml/2008/4/10/304 Reported-and-bisected-by: Rafael J. Wysocki <rjw@sisk.pl> Tested-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-26 23:41:04 +02:00
Yinghai Lu	6e184f299d	x86: double check the multi root bus with fam10h mmconf some bioses give same range to mmconf for fam10h msr, and mmio for node/link. fam10h msr will overide mmio for node/link. so we can not assign range to devices under node/link for unassigned resources. this patch will take range out from the mmio for node/link Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-26 23:41:04 +02:00
Yinghai Lu	30a18d6c3f	x86: multi pci root bus with different io resource range, on 64-bit scan AMD opteron io/mmio routing to make sure every pci root bus get correct resource range. Thus later pci scan could assign correct resource to device with unassigned resource. this can fix a system without _CRS for multi pci root bus. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-26 23:41:04 +02:00
Yinghai Lu	35ddd068fb	x86: use bus conf in NB conf fun1 to get bus range on, on 64-bit ... so we use the same code with Quad core cpu as old opteron. This patch is useful when acpi=off or _PXM is not there in DSDT. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-26 23:41:04 +02:00
Yinghai Lu	871d5f8dd0	x86: get mp_bus_to_node early Currently, on an amd k8 system with multi ht chains, the numa_node of pci devices under /sys/devices/pci0000:80/* is always 0, even if that chain is on node 1 or 2 or 3. Workaround: pcibus_to_node(bus) is used when we want to get the node that pci_device is on. In struct device, we already have numa_node member, and we could use dev_to_node()/set_dev_node() to get and set numa_node in the device. set_dev_node is called in pci_device_add() with pcibus_to_node(bus), and pcibus_to_node uses bus->sysdata for nodeid. The problem is when pci_add_device is called, bus->sysdata is not assigned correct nodeid yet. The result is that numa_node will always be 0. pcibios_scan_root and pci_scan_root could take sysdata. So we need to get mp_bus_to_node mapping before these two are called, and thus get_mp_bus_to_node could get correct node for sysdata in root bus. In scanning of the root bus, all child busses will take parent bus sysdata. So all pci_device->dev.numa_node will be assigned correctly and automatically. Later we could use dev_to_node(&pci_dev->dev) to get numa_node, and we could also could make other bus specific device get the correct numa_node too. This is an updated version of pci_sysdata and Jeff's pci_domain patch. [ mingo@elte.hu: build fix ] Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-26 23:41:04 +02:00
Yinghai Lu	bb63b42199	x86 pci: remove checking type for mmconfig probe doesn't need to check if it is type1 or type2, we can use raw_pci_ops directly. also make pci_direct_conf1 static again. anyway is there system with type 2 and mmconf support? Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-26 23:41:04 +02:00
Yinghai Lu	d2ebdf4bae	x86: remove unneeded check in mmconf reject mmconfig is only used to access extended configuration space. so don't need to reject MFG that only have one entry and only handle bus0. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-26 23:41:04 +02:00

... 14 15 16 17 18 ...

22544 commits