Commit graph

99308 commits

Author SHA1 Message Date
Kenji Kaneshige
9e4f2e8d4d pciehp: add message about pciehp_slot_with_bus option
Some (broken?) platform assign the same slot name to multiple hotplug
slots. On such system, slot initialization would fail because of name
collision. The pciehp driver already have a "slot_with_bus" module
option which adds the bus number into the slot name. This patch adds
the message about this module option that will be displayed when slot
name collision is detected.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-05-27 15:43:47 -07:00
Kenji Kaneshige
a86161b313 pci hotplug core: add check of duplicate slot name
Fix the following errors reported by Jan C. Nordholz in
http://bugzilla.kernel.org/show_bug.cgi?id=10751.

kobject_add_internal failed for 2 with -EEXIST, don't try to register things with the same name in the same directory.
Pid: 1, comm: swapper Tainted: G        W 2.6.26-rc3 #1
 [<c0266980>] kobject_add_internal+0x140/0x190
 [<c0266afd>] kobject_init_and_add+0x2d/0x40
 [<c027bc91>] pci_hp_register+0x81/0x2f0
 [<c027fd07>] pciehp_probe+0x1a7/0x470
 [<c01b3b84>] sysfs_add_one+0x44/0xa0
 [<c01b3c1f>] sysfs_addrm_start+0x3f/0xb0
 [<c01b497a>] sysfs_create_link+0x8a/0xf0
 [<c0279570>] pcie_port_probe_service+0x50/0x80
 [<c02e0545>] driver_sysfs_add+0x55/0x70
 [<c02e0662>] driver_probe_device+0x82/0x180
 [<c02e07cc>] __driver_attach+0x6c/0x70
 [<c02dfe0a>] bus_for_each_dev+0x3a/0x60
 [<c05db2d0>] pcied_init+0x0/0x80
 [<c02e04e6>] driver_attach+0x16/0x20
 [<c02e0760>] __driver_attach+0x0/0x70
 [<c02e0341>] bus_add_driver+0x1a1/0x220
 [<c05db2d0>] pcied_init+0x0/0x80
 [<c02e09cd>] driver_register+0x4d/0x120
 [<c05db050>] ibm_acpiphp_init+0x0/0x190
 [<c0125aab>] printk+0x1b/0x20
 [<c05db2d0>] pcied_init+0x0/0x80
 [<c05db2de>] pcied_init+0xe/0x80
 [<c05c751a>] kernel_init+0x10a/0x300
 [<c0120138>] schedule_tail+0x18/0x50
 [<c0103b9a>] ret_from_fork+0x6/0x1c
 [<c05c7410>] kernel_init+0x0/0x300
 [<c05c7410>] kernel_init+0x0/0x300
 [<c010485b>] kernel_thread_helper+0x7/0x1c
 =======================
pci_hotplug: Unable to register kobject '2'<3>pciehp: pci_hp_register failed with error -22

Slot with the same name can be registered multiple times if shpchp or
pciehp driver is loaded after acpiphp is loaded because ACPI based
hotplug driver and Native OS hotplug driver trying to handle the same
physical slot. In this case, current pci_hotplug core will call
kobject_init_and_add() muliple time with the same name. This is the
cause of this problem. To fix this problem, this patch adds the check
into pci_hp_register() to see if the slot with the same name.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-05-27 15:43:40 -07:00
Kenji Kaneshige
0711c70ec0 pciehp: move msleep after power off
According to the PCI Express specification, we must wait for at least
1 second after turning power off before taking any action that relies
on power having been removed from the slot/adapter. For this, current
pciehp wait for 1 second after issuing the power off command in
hpc_power_off_slot() function. But waiting for 1 second in
hpc_power_off_slot() can make pciehp probing slow-down because pciehp
probe code calls hpc_power_off_slot() if the slot is not occupied just
in case. We don't need to wait for 1 second at the pciehp probe time
because there is no action on that empty slot. So move 1 second wait
from hpc_power_off_slot() to the caller of hpc_power_off_slot().

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-05-27 15:43:33 -07:00
Kenji Kaneshige
6592e02ae4 pciehp: poll cmd completion if hotplug interrupt is disabled
Fix improper long wait for command completion in pciehp probing.

As described in PCI Express specification, software notification is
not generated if the command that occurs as a result of a write to the
Slot Control register that disables software notification of command
completed events. Since pciehp driver doesn't take it into account,
such command is issued in pciehp probing, and it causes improper long
wait for command completion.

This patch changes the pciehp driver to take such command into
account.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-05-27 15:43:25 -07:00
Kenji Kaneshige
5808639bfa pciehp: fix slow probing
Fix the "pciehp probing slow" problem reported from Jan C. Nordholz in
http://bugzilla.kernel.org/show_bug.cgi?id=10751.

The command completed bit in Slot Status register applies only to
commands issued to control the attention indicator, power indicator,
power controller, or electromechanical interlock. However, writes to
other parts of the Slot Control register would end up writing to the
control fields. Hence, any write to Slot Control register is
considered as a command. However, if the controller doesn't support
any of attention indicator, power indicator, power controller and
electromechanical interlock, command completed bit would not set in
writing to Slot Control register. In this case, we should not wait for
command completed bit set, otherwise all commands would be considered
not completed in timeout seconds (1 sec.).

The cause of the problem is pciehp driver didn't take this situation
into account. This patch changes pciehp to take it into account. This
patch also add the check for "No Command Completed Support" bit in
Slot Capability register. If it is set, we should not wait for command
completed bit set as well.

This problem seems to be revealed by the commit
c27fb883df that fixed the bug that
pciehp did not wait for command completed properly (pciehp just
ignored the command completion event).

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-05-27 15:43:16 -07:00
Kenji Kaneshige
dbd79aed1a pciehp: fix NULL dereference in interrupt handler
Fix the following NULL dereference problem reported from Pierre Ossman
and Ingo Molnar.

pciehp: HPC vendor_id 8086 device_id 27d0 ss_vid 0 ss_did 0
pciehp: pciehp_find_slot: slot (device=0x0) not found
BUG: unable to handle kernel NULL pointer dereference at 0000000000000070
IP: [<ffffffff80494a8b>] pciehp_handle_presence_change+0x7e/0x113
PGD 0
Oops: 0000 [1]
CPU 0
Modules linked in:
Pid: 1, comm: swapper Tainted: G        W 2.6.26-rc3-sched-devel.git-00001-g2b99b26-dirty #170
RIP: 0010:[<ffffffff80494a8b>]  [<ffffffff80494a8b>] pciehp_handle_presence_change+0x7e/0x113
RSP: 0000:ffff81003f83fbb0  EFLAGS: 00010046
RAX: 0000000000000039 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000046
RBP: ffff81003f83fbd0 R08: 0000000000000001 R09: ffffffff80245103
R10: 0000000000000020 R11: 0000000000000000 R12: ffff81003ea53a30
R13: 0000000000000000 R14: 0000000000000011 R15: ffffffff80495926
FS:  0000000000000000(0000) GS:ffffffff80be7400(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000070 CR3: 0000000000201000 CR4: 00000000000006a0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff81003f83e000, task ffff81003f840000)
Stack:  0000000000000008 ffff81003f83fbf6 ffff81003ea53a30 0000000000000008
 ffff81003f83fc10 ffffffff80495ab4 0000000000000011 0000000000000002
 0000000000000202 0000000000000202 00000000fffffff4 ffff81003ea53a30
Call Trace:
 [<ffffffff80495ab4>] pcie_isr+0x18e/0x1bc
 [<ffffffff80260831>] request_irq+0x106/0x12f
 [<ffffffff80495fb6>] pcie_init+0x15e/0x6cc
 [<ffffffff804933a3>] pciehp_probe+0x64/0x541
 [<ffffffff8048f4e7>] pcie_port_probe_service+0x4c/0x76
 [<ffffffff8054af70>] driver_probe_device+0xd4/0x1f0
 [<ffffffff8054b108>] __driver_attach+0x7c/0x7e
 [<ffffffff8054b08c>] ? __driver_attach+0x0/0x7e
 [<ffffffff8054a4b6>] bus_for_each_dev+0x53/0x7d
 [<ffffffff8054ad3c>] driver_attach+0x1c/0x1e
 [<ffffffff8054a9c2>] bus_add_driver+0xdd/0x25b
 [<ffffffff80c09d3d>] ? pcied_init+0x0/0x8b
 [<ffffffff8054b288>] driver_register+0x5f/0x13e
 [<ffffffff80c09d3d>] ? pcied_init+0x0/0x8b
 [<ffffffff8048f441>] pcie_port_service_register+0x47/0x49
 [<ffffffff80c09d52>] pcied_init+0x15/0x8b
 [<ffffffff80bf3938>] kernel_init+0x75/0x243
 [<ffffffff808639d2>] ? _spin_unlock_irq+0x2b/0x3a
 [<ffffffff80228d1f>] ? finish_task_switch+0x57/0x9a
 [<ffffffff8020c258>] child_rip+0xa/0x12
 [<ffffffff8020bcec>] ? restore_args+0x0/0x30
 [<ffffffff80bf38c3>] ? kernel_init+0x0/0x243
 [<ffffffff8020c24e>] ? child_rip+0x0/0x12

Code: 83 80 00 00 00 48 39 f0 75 e1 0f b6 c9 48 c7 c2 00 0e 8d 80 48 c7 c6 8a 60 a6 80 48 c7 c7 10 db a8 80 31 c0 e8 3f 8d d9 ff 31 db <48> 8b 43 70 48 8d 75 ef 48 89 df ff 50 30 80 7d ef 00 74 37 48
RIP  [<ffffffff80494a8b>] pciehp_handle_presence_change+0x7e/0x113
 RSP <ffff81003f83fbb0>
CR2: 0000000000000070
Kernel panic - not syncing: Fatal exception

The situation under which it occurs is hw and timing related: it appears
to happen on a system that has PCI hotplug hardware but with no active
hotplug cards, and another interrupt in the same (shared) IRQ line
arrives too early, before the hotplug-slot entry has been set up - as
triggered by CONFIG_DEBUG_SHIRQ=y:

This patch contains the following two fixes.

(1) Clear all events bits in Slot Status register to prevent the pciehp
    driver from detecting the spurious events that would have been occur
    before pciehp loading.

(2) Add check whether slot initialization had been already done.

This is short term fix. We need more structural fixes to install
interrupt handler after slot initialization is done.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-05-27 15:43:08 -07:00
Kenji Kaneshige
b3bd307c62 shpchp: add message about shpchp_slot_with_bus option
Some (broken?) platform assign the same slot name to multiple hotplug
slots. On such system, slot initialization would fail because of name
collision. The shpchp driver already have a "slot_with_bus" module
option which adds the bus number into the slot name. This patch adds
the message about this module option that will be displayed when slot
name collision is detected.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-05-27 15:42:55 -07:00
Olof Johansson
732bee4c85 [POWERPC] pasemi: update pasemi_defconfig, enable electra_cf
Refresh pasemi_defconfig and enable ELECTRA_CF=y.

Signed-off-by: Olof Johansson <olof@lixom.net>
2008-05-27 16:11:13 -05:00
Olof Johansson
c433a1b642 electra_cf: Add MODULE_DEVICE_TABLE()
Add a module device table to electra_cf so that modules can be
auto-probed/loaded.

Signed-off-by: Olof Johansson <olof@lixom.net>
2008-05-27 16:07:45 -05:00
Tony Luck
4dcc29e157 [IA64] Workaround for RSE issue
Problem: An application violating the architectural rules regarding
operation dependencies and having specific Register Stack Engine (RSE)
state at the time of the violation, may result in an illegal operation
fault and invalid RSE state.  Such faults may initiate a cascade of
repeated illegal operation faults within OS interruption handlers.
The specific behavior is OS dependent.

Implication: An application causing an illegal operation fault with
specific RSE state may result in a series of illegal operation faults
and an eventual OS stack overflow condition.

Workaround: OS interruption handlers that switch to kernel backing
store implement a check for invalid RSE state to avoid the series
of illegal operation faults.

The core of the workaround is the RSE_WORKAROUND code sequence
inserted into each invocation of the SAVE_MIN_WITH_COVER and
SAVE_MIN_WITH_COVER_R19 macros.  This sequence includes hard-coded
constants that depend on the number of stacked physical registers
being 96.  The rest of this patch consists of code to disable this
workaround should this not be the case (with the presumption that
if a future Itanium processor increases the number of registers, it
would also remove the need for this patch).

Move the start of the RBS up to a mod32 boundary to avoid some
corner cases.

The dispatch_illegal_op_fault code outgrew the spot it was
squatting in when built with this patch and CONFIG_VIRT_CPU_ACCOUNTING=y
Move it out to the end of the ivt.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-05-27 13:24:39 -07:00
Chris Wright
cc94bc37d5 LSM: remove stale web site from MAINTAINERS
Pointed out by Adrian Bunk.

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-05-27 12:29:49 -07:00
Brian King
ca61668b82 [SCSI] ibmvscsi: Non SCSI error status fixup
Some versions of the Virtual I/O Server on Power
return 0x99 in the non-SCSI error status field as success,
rather than 0. This fixes the ibmvscsi driver to treat this
response as success.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-05-27 11:10:57 -05:00
Pavel Machek
dd564d0cf0 x86: aperture_64.c: cleanups
Some small cleanups for aperture_64.c; they should not really change
any code.

Signed-off-by: Pavel Machek <pavel@suse.cz>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 18:03:56 +02:00
Michael Reed
7ba2db5f38 [SCSI] fusion mpt: fix target missing after resetting external raid
Following a hard reset of a SAS raid, one of the raid targets is occasionally
missing.  I tracked this down to a pretty obscure little bug.

The LSI fusion drivers for SAS and Fibre Channel both use their respective
transport layers.  Those transport layers increment the target number
assigned to new targets.

The routine __scsi_scan_target uses the "this_id" element of the Scsi_Host
structure to avoid scanning the scsi host adapter.  Both fusion drivers set
"this_id" from a value returned in a firmware PortFacts response.  For my
particular test case (SAS) the firmware id assigned to the initiator was
173.  After enough raid resets to cause the raid targets to go and come a
sufficient number of times, the id assigned by the transport to a raid
target would match the id assigned by the host adapter to the "this_id"
field, resulting in that target not being scanned.

Fix by not assigning this_id and not checking it in slave_configure. 

Signed-off-by: Michael Reed <mdr@sgi.com>
Acked-by: "Moore, Eric" <Eric.Moore@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-05-27 10:58:09 -05:00
Linus Torvalds
3dbfd0801b Merge git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6:
  avr32: Fix cpufreq oops when ondemand governor is default
  avr32: Update defconfigs
  avr32: export strnlen_user
  avr32: export copy_page
2008-05-27 08:27:20 -07:00
David Woodhouse
edb2301f29 ck804rom: fix driver_data in probe table.
There's a reason why using C99 initialisers even in the supposedly
trivial structs is a good idea.

Signed-off-by: Carl-Daniel Hailfinger <c-d.hailfinger.devel.2006@gmx.net>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-27 07:34:38 -07:00
Gerrit Renker
825de27d9e dccp ccid-3: Fix "t_ipi explosion" bug
The identification of this bug is thanks to Cheng Wei and Tomasz
Grobelny.

To avoid divide-by-zero, the implementation previously ignored RTTs
smaller than 4 microseconds when performing integer division RTT/4.

When the RTT reached a value less than 4 microseconds (as observed on
loopback), this prevented the Window Counter CCVal value from
advancing. As a result, the receiver stopped sending feedback. This in
turn caused non-ending expiries of the nofeedback timer at the sender,
so that the sending rate was progressively reduced until reaching the
minimum of one packet per 64 seconds.

The patch fixes this bug by handling integer division more
intelligently. Due to consistent use of dccp_sample_rtt(),
divide-by-zero-RTT is avoided.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-27 06:33:54 -07:00
Wei Yongjun
6079a463cf dccp: Fix to handle short sequence numbers packet correctly
RFC4340 said:
  8.5.  Pseudocode
       ...
       If P.type is not Data, Ack, or DataAck and P.X == 0 (the packet
             has short sequence numbers), drop packet and return

But DCCP has some mistake to handle short sequence numbers packet, now
it drop packet only if P.type is Data, Ack, or DataAck and P.X == 0.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-27 06:22:38 -07:00
Yinghai Lu
3945e2c9ab x86: extend e820 ealy_res support 32bit - fix #2
remove extra -1 in reseve_early calling
    panic if can not find space for new RAMDISK

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:20:12 +02:00
Jeremy Fitzhardinge
359cdd3f86 xen: maintain clock offset over save/restore
Hook into the device model to make sure that timekeeping's resume handler
is called.  This deals with our clocksource's non-monotonicity over the
save/restore.  Explicitly call clock_has_changed() to make sure that
all the timers get retriggered properly.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:38 +02:00
Jeremy Fitzhardinge
0e91398f2a xen: implement save/restore
This patch implements Xen save/restore and migration.

Saving is triggered via xenbus, which is polled in
drivers/xen/manage.c.  When a suspend request comes in, the kernel
prepares itself for saving by:

1 - Freeze all processes.  This is primarily to prevent any
    partially-completed pagetable updates from confusing the suspend
    process.  If CONFIG_PREEMPT isn't defined, then this isn't necessary.

2 - Suspend xenbus and other devices

3 - Stop_machine, to make sure all the other vcpus are quiescent.  The
    Xen tools require the domain to run its save off vcpu0.

4 - Within the stop_machine state, it pins any unpinned pgds (under
    construction or destruction), performs canonicalizes various other
    pieces of state (mostly converting mfns to pfns), and finally

5 - Suspend the domain

Restore reverses the steps used to save the domain, ending when all
the frozen processes are thawed.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:38 +02:00
Jeremy Fitzhardinge
7d88d32a46 xenbus: rebind irq on restore
When restoring, rebind the existing xenbus irq to the new xenbus event
channel.  (It turns out in practice that this is always the same, and
is never updated on restore.  That's a bug, but Xeno-linux has been
like this for a long time, so it can't really be fixed.)

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:38 +02:00
Jeremy Fitzhardinge
6b9b732d0e xen-console: add save/restore
Add code to:

 1. Deal with the console page being canonicalized.  During save, the
    console's mfn in the start_info structure is canonicalized to a pfn.
    In order to deal with that, we always use a copy of the pfn and
    indirect off that all the time.  However, we fall back to using the
    mfn if the pfn hasn't been initialized yet.

 2. Restore the console event channel, and rebind it to the existing irq.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:37 +02:00
Jeremy Fitzhardinge
0f2287ad7c xen: fix unbind_from_irq()
Rearrange the tests in unbind_from_irq() so that we can still unbind
an irq even if the underlying event channel is bad.  This allows a
device driver to shuffle its irqs on save/restore before the
underlying event channels have been fixed up.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:37 +02:00
Jeremy Fitzhardinge
eb1e305f4e xen: add rebind_evtchn_irq
Add rebind_evtchn_irq(), which will rebind an device driver's existing
irq to a new event channel on restore.  Since the new event channel
will be masked and bound to vcpu0, we update the state accordingly and
unmask the irq once everything is set up.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:37 +02:00
Jeremy Fitzhardinge
d5edbc1f75 xen: add p2m mfn_list_list
When saving a domain, the Xen tools need to remap all our mfns to
portable pfns.  In order to remap our p2m table, it needs to know
where all its pages are, so maintain the references to the p2m table
for it to use.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:37 +02:00
Jeremy Fitzhardinge
a0d695c821 xen: make dummy_shared_info non-static
Rename dummy_shared_info to xen_dummy_shared_info and make it
non-static, in anticipation of users outside of enlighten.c

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:37 +02:00
Jeremy Fitzhardinge
cf0923ea29 xen: efficiently support a holey p2m table
When using sparsemem and memory hotplug, the kernel's pseudo-physical
address space can be discontigious.  Previously this was dealt with by
having the upper parts of the radix tree stubbed off.  Unfortunately,
this is incompatible with save/restore, which requires a complete p2m
table.

The solution is to have a special distinguished all-invalid p2m leaf
page, which we can point all the hole areas at.  This allows the tools
to see a complete p2m table, but it only costs a page for all memory
holes.

It also simplifies the code since it removes a few special cases.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:37 +02:00
Jeremy Fitzhardinge
8006ec3e91 xen: add configurable max domain size
Add a config option to set the max size of a Xen domain.  This is used
to scale the size of the physical-to-machine array; it ends up using
around 1 page/GByte, so there's no reason to be very restrictive.

For a 32-bit guest, the default value of 8GB is probably sufficient;
there's not much point in giving a 32-bit machine much more memory
than that.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:37 +02:00
Jeremy Fitzhardinge
d451bb7aa8 xen: make phys_to_machine structure dynamic
We now support the use of memory hotplug, so the physical to machine
page mapping structure must be dynamic.  This is implemented as a
two-level radix tree structure, which allows us to efficiently
incrementally allocate memory for the p2m table as new pages are
added.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:37 +02:00
Adrian Bunk
955d6f1778 xen: drivers/xen/balloon.c: make a function static
Make the needlessly global balloon_set_new_target() static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:37 +02:00
Jeremy Fitzhardinge
38bb5ab417 xen: count resched interrupts properly
Make sure resched interrupts appear in /proc/interrupts in the proper
place.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:37 +02:00
Isaku Yamahata
bfdab126cf xen: add missing definitions in include/xen/interface/memory.h which ia64/xen needs
Add xen handles realted definitions for xen memory which ia64/xen needs.
Pointer argumsnts for ia64/xen hypercall are passed in pseudo physical
address (guest physical address) so that it is required to convert
guest kernel virtual address into pseudo physical address.
The xen guest handle represents such arguments.

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:36 +02:00
Isaku Yamahata
a90971ebdd xen: compilation fix to balloon driver for ia64 support
fix compilation error of ballon driver on ia64.
extent_start member is pointer argument. On x86 pointer argument for
xen hypercall is passed as virtual address.
On the other hand, ia64 and ppc, pointer argument is passed in pseudo
physical address. (guest physicall address.)
So they must be passed as handle and convert right before issuing hypercall.

  CC      drivers/xen/balloon.o
linux-2.6-x86/drivers/xen/balloon.c: In function 'increase_reservation':
linux-2.6-x86/drivers/xen/balloon.c:228: error: incompatible types in assignment
linux-2.6-x86/drivers/xen/balloon.c: In function 'decrease_reservation':
linux-2.6-x86/drivers/xen/balloon.c:324: error: incompatible types in assignment
linux-2.6-x86/drivers/xen/balloon.c: In function 'dealloc_pte_fn':
linux-2.6-x86/drivers/xen/balloon.c:486: error: incompatible types in assignment
linux-2.6-x86/drivers/xen/balloon.c: In function 'alloc_empty_pages_and_pagevec':
linux-2.6-x86/drivers/xen/balloon.c:522: error: incompatible types in assignment
make[2]: *** [drivers/xen/balloon.o] Error 1

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:36 +02:00
Isaku Yamahata
ec9b2065d4 xen: Move manage.c to drivers/xen for ia64/xen support
move arch/x86/xen/manage.c under drivers/xen/to share codes
with x86 and ia64.
ia64/xen also uses manage.c

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:36 +02:00
Jeremy Fitzhardinge
83abc70a4c xen: make earlyprintk=xen work again
For some perverse reason, if you call add_preferred_console() it prevents
setup_early_printk() from successfully enabling the boot console -
unless you make it a preferred console too...

Also, make xenboot console output distinct from normal console output,
since it gets repeated when the console handover happens, and the
duplicated output is confusing without disambiguation.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
2008-05-27 10:11:36 +02:00
Markus Armbruster
e4dcff1f6e xen pvfb: Dynamic mode support (screen resizing)
The pvfb backend indicates dynamic mode support by creating node
feature_resize with a non-zero value in its xenstore directory.
xen-fbfront sends a resize notification event on mode change.  Fully
backwards compatible both ways.

Framebuffer size and initial resolution can be controlled through
kernel parameter xen_fbfront.video.  The backend enforces a separate
size limit, which it advertises in node videoram in its xenstore
directory.

xen-kbdfront gets the maximum screen resolution from nodes width and
height in the backend's xenstore directory instead of hardcoding it.

Additional goodie: support for larger framebuffers (512M on a 64-bit
system with 4K pages).

Changing the number of bits per pixels dynamically is not supported,
yet.

Ported from
http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/92f7b3144f41
http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/bfc040135633

Signed-off-by: Pat Campbell <plc@novell.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:36 +02:00
Markus Armbruster
f4ad1ebd7a xen pvfb: Zero unused bytes in events sent to backend
This isn't a security flaw (the backend can see all our memory
anyway).  But it's the right thing to do all the same.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:36 +02:00
Markus Armbruster
1e892c959d xen pvfb: Module aliases to support module autoloading
These are mostly for completeness and consistency with the other
frontends, as PVFB is typically compiled in rather than a module.

Derived from
http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/5e294e29a43e

While there, add module descriptions.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:36 +02:00
Markus Armbruster
6ba0e7b36c xen pvfb: Pointer z-axis (mouse wheel) support
Add z-axis motion to pointer events.  Backward compatible, because
there's space for the z-axis in union xenkbd_in_event, and old
backends zero it.

Derived from
http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/57dfe0098000
http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/1edfea26a2a9
http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/c3ff0b26f664

Signed-off-by: Pat Campbell <plc@novell.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:36 +02:00
Markus Armbruster
9e124fe16f xen: Enable console tty by default in domU if it's not a dummy
Without console= arguments on the kernel command line, the first
console to register becomes enabled and the preferred console (the one
behind /dev/console).  This is normally tty (assuming
CONFIG_VT_CONSOLE is enabled, which it commonly is).

This is okay as long tty is a useful console.  But unless we have the
PV framebuffer, and it is enabled for this domain, tty0 in domU is
merely a dummy.  In that case, we want the preferred console to be the
Xen console hvc0, and we want it without having to fiddle with the
kernel command line.  Commit b8c2d3dfbc
did that for us.

Since we now have the PV framebuffer, we want to enable and prefer tty
again, but only when PVFB is enabled.  But even then we still want to
enable the Xen console as well.

Problem: when tty registers, we can't yet know whether the PVFB is
enabled.  By the time we can know (xenstore is up), the console setup
game is over.

Solution: enable console tty by default, but keep hvc as the preferred
console.  Change the preferred console to tty when PVFB probes
successfully, unless we've been given console kernel parameters.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:36 +02:00
Jeremy Fitzhardinge
a15af1c9ea x86/paravirt: add pte_flags to just get pte flags
Add pte_flags() to extract the flags from a pte.  This is a special
case of pte_val() which is only guaranteed to return the pte's flags
correctly; the page number may be corrupted or missing.

The intent is to allow paravirt implementations to return pte flags
without having to do any translation of the page number (most notably,
Xen).

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:36 +02:00
Jeremy Fitzhardinge
239d1fc04e xen: don't worry about preempt during xen_irq_enable()
When enabling interrupts, we don't need to worry about preemption,
because we either enter with interrupts disabled - so no preemption -
or the caller is confused and is re-enabling interrupts on some
indeterminate processor.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:35 +02:00
Jeremy Fitzhardinge
2956a3511c xen: allow some cr4 updates
The guest can legitimately change things like cr4.OSFXSR and
OSXMMEXCPT, so let it.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:35 +02:00
Jeremy Fitzhardinge
349c709f42 xen: use new sched_op
Use the new sched_op hypercall, mainly because xenner doesn't support
the old one.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:35 +02:00
Jeremy Fitzhardinge
7b1333aa4c xen: use hypercall rather than clts
Xen will trap and emulate clts, but its better to use a hypercall.
Also, xenner doesn't handle clts.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:35 +02:00
Jeremy Fitzhardinge
0922abdc39 xen: make early console also write to debug console
When using "earlyprintk=xen", also write the console output to the raw
debug console.  This will appear on dom0's console if the hypervisor
has been compiled to allow it.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:35 +02:00
Jeremy Fitzhardinge
0acf10d8fb xen: add raw console write functions for debug
Add a couple of functions which can write directly to the Xen console
for debugging.  This output ends up on the host's dom0 console
(assuming it allows the domain to write there).

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:11:35 +02:00
Jeremy Fitzhardinge
4e09e21ccb x86: use symbolic constant in stts()
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:04:29 +02:00
Jeremy Fitzhardinge
4226ab93d8 x86: use pteval_t for _PAGE_FOO
Rather than making _PAGE_* constants signed, and then relying on
sign-extension to make sure that masks derived from them are wide
enough, just explicitly type them pteval_t.  This guarantees that they
and any derived values are the right size for the current pte format.

The reliance on sign extension is fragile, and invokes some very
subtle corners of the C type system.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:01:20 +02:00