Commit graph

48801 commits

Author SHA1 Message Date
Ingo Molnar
95492e4646 [PATCH] x86: rewrite SMP TSC sync code
make the TSC synchronization code more robust, and unify it between x86_64 and
i386.

The biggest change is the removal of the 'fix up TSCs' code on x86_64 and
i386; in some rare cases it was /causing/ time-warps on SMP systems.

The new code only checks for TSC asynchronicity - and if it can prove a
time-warp (if it can observe the TSC going backwards when going from one CPU
to another within a critical section), then the TSC clock-source is turned
off.

The TSC synchronization-checking code also got moved into a separate file.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:57 -08:00
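
The warp check described in 95492e4646 can be illustrated with a small single-threaded sketch. This is not the kernel code: `struct tsc_sample` and `count_tsc_warps()` are hypothetical names, and the array stands in for TSC reads that CPUs take one at a time inside a critical section.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one TSC read taken while holding the shared lock. */
struct tsc_sample {
	int cpu;
	uint64_t tsc;
};

/*
 * Count observed time-warps: a sample whose value is lower than the
 * previous sample taken inside the same critical section. Because the
 * reads are serialized, a synchronized TSC can never go backwards here.
 */
static size_t count_tsc_warps(const struct tsc_sample *s, size_t n)
{
	size_t warps = 0;

	for (size_t i = 1; i < n; i++)
		if (s[i].tsc < s[i - 1].tsc)
			warps++;
	return warps;
}
```

Under this sketch's assumptions, any non-zero result would be the proof of a time-warp after which the TSC clock-source is turned off.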
Thomas Gleixner
92c7e00254 [PATCH] Simplify the registration of clocksources
Enqueue clocksources in rating order to make selection of the clocksource
easier.  Also check the match with a user override at enqueue time.

Preparatory patch for the generic clocksource verification.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:57 -08:00
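
Enqueueing in rating order, as 92c7e00254 describes, could look roughly like the sketch below. `struct cs` and `cs_enqueue()` are hypothetical, simplified stand-ins rather than the kernel's clocksource API, and the user-override check is left out.

```c
#include <stddef.h>

/* Hypothetical, simplified clocksource: a name, a rating and a list link. */
struct cs {
	const char *name;
	int rating;
	struct cs *next;
};

/*
 * Insert cs so that the list stays sorted by descending rating.
 * The head of the list is then always the best-rated clocksource.
 */
static void cs_enqueue(struct cs **head, struct cs *cs)
{
	struct cs **pos = head;

	while (*pos && (*pos)->rating >= cs->rating)
		pos = &(*pos)->next;
	cs->next = *pos;
	*pos = cs;
}
```

With the list kept sorted, selecting a clocksource reduces to looking at the head, which is presumably what makes the selection easier.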
Thomas Gleixner
26a08eb301 [PATCH] i386 Remove useless code in tsc.c
The delayed work code in arch/i386/kernel/tsc.c is an unused leftover of the
GTOD conversion. Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:57 -08:00
John Stultz
c1d370e167 [PATCH] i386: use GTOD persistent clock support
Persistent clock support: do proper timekeeping across suspend/resume, i386
arch support.

[bunk@stusta.de: cleanup]
Build-fixes-from: Andrew Morton <akpm@osdl.org>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:57 -08:00
John Stultz
411187fb05 [PATCH] GTOD: persistent clock support
Persistent clock support: do proper timekeeping across suspend/resume.

[bunk@stusta.de: cleanup]
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:57 -08:00
Ingo Molnar
9f907c0144 [PATCH] Fix timeout overflow with jiffies
Prevent timeout overflow if timer ticks are behind jiffies (due to high
softirq load or due to dyntick), by limiting the valid timeout range to
MAX_LONG/2.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:56 -08:00
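
Why limiting the range helps (9f907c0144) can be shown with wrap-safe tick comparison. The sketch assumes that "MAX_LONG/2" corresponds to `LONG_MAX / 2`; `later_than()` and `clamp_timeout()` are hypothetical helpers, not kernel functions.

```c
#include <limits.h>
#include <stdbool.h>

/* Assumed bound: the commit's "MAX_LONG/2", written here with LONG_MAX. */
#define TIMEOUT_LIMIT ((unsigned long)LONG_MAX / 2)

/*
 * Wrap-safe "a is later than b", in the style of time_after().
 * Relies on two's complement conversion from unsigned long to long.
 */
static bool later_than(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Clamp a relative timeout, in ticks, to the valid range. */
static unsigned long clamp_timeout(unsigned long delta)
{
	return delta > TIMEOUT_LIMIT ? TIMEOUT_LIMIT : delta;
}
```

If the timer base lags behind the current tick count, an unclamped maximum timeout lands more than `LONG_MAX` ticks ahead of the base and compares as already expired; the clamped value leaves headroom for that lag.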
Ingo Molnar
41cf54455d [PATCH] Fix multiple conversion bugs in msecs_to_jiffies
Fix multiple conversion bugs in msecs_to_jiffies().

The main problem is that this condition:

	if (m > jiffies_to_msecs(MAX_JIFFY_OFFSET))

overflows if HZ is smaller than 1000!

This change is user-visible: for HZ=250, a SUS-compliant poll() timeout
value of -20 is mistakenly converted to 'immediate timeout'.

(The new dyntick code also triggered this, that's how we noticed.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:56 -08:00
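
The overflow in the comparison from 41cf54455d can be reproduced with plain integer arithmetic. The sketch assumes HZ=250, a 32-bit `unsigned long`, and `MAX_JIFFY_OFFSET` equal to `(~0UL >> 1) - 1`; the function names are hypothetical.

```c
#include <stdint.h>

/* Assumptions for illustration: HZ=250 and a 32-bit unsigned long. */
#define HZ_ASSUMED 250u
#define MAX_JIFFY_OFFSET_32 ((UINT32_MAX >> 1) - 1)	/* 0x7ffffffe */

/* Jiffies to milliseconds in 32-bit arithmetic: the product wraps. */
static uint32_t jiffies_to_msecs_32(uint32_t j)
{
	return (1000u / HZ_ASSUMED) * j;	/* modulo 2^32 */
}

/* The same conversion in 64-bit arithmetic, which cannot wrap here. */
static uint64_t jiffies_to_msecs_64(uint32_t j)
{
	return (uint64_t)(1000u / HZ_ASSUMED) * j;
}
```

With 4 ms per jiffy, the true product is 0x1fffffff8, which does not fit in 32 bits, so the threshold used in the comparison is a truncated value rather than the real maximum.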
Ingo Molnar
8b9365d753 [PATCH] Uninline jiffies.h functions
There are loads of fat functions hidden in jiffies.h.  Uninline them.  No code
changes.

[jeremy@goop.org: export fix]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:56 -08:00
john stultz
f4304ab215 [PATCH] HZ free ntp
Disentangle the NTP update from HZ.  This is necessary for dynamic tick enabled
kernels.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:56 -08:00
Thomas Gleixner
771ee3b04e [PATCH] Add a function to handle interrupt affinity setting
Provide functions to:
 - check whether the affinity of an interrupt can be set
 - pin the interrupt to a given cpu

Necessary for the ability to set up clocksources more flexibly (e.g.  use the
different HPET channels per CPU)

[akpm@osdl.org: alpha build fix]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:56 -08:00
Thomas Gleixner
950f4427c2 [PATCH] Add irq flag to disable balancing for an interrupt
Add a flag so we can prevent the irq balancing of an interrupt.  Move the
bits, so we have room for more :)

Necessary for the ability to set up clocksources more flexibly (e.g.  use the
different HPET channels per CPU)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:56 -08:00
Andrew Morton
b463fc6073 [PATCH] vmi-versus-hrtimers
arch/i386/kernel/built-in.o: In function `vmi_stop_hz_timer':
: undefined reference to `next_timer_interrupt'

If CONFIG_NO_HZ, next_timer_interrupt() doesn't exist (and presumably doesn't
make sense).

Perhaps VMI shouldn't be playing with timer internals at this level.

Cc: Zachary Amsden <zach@vmware.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:56 -08:00
Adrian Bunk
c6025a79f5 [PATCH] correct CONFIG_GIGASET_M101 Makefile entry
Advanced Mathematics, lesson 1:
101 != 105

;-)

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:56 -08:00
Jeff Dike
838e56a11c [PATCH] uml: fix 2.6.20 hang
A previous cleanup misused need_poll, which had a fairly broken interface.
It implemented a growable array, changing the used elements count itself,
but leaving it up to the caller to fill in the actual elements, including
the entire array if the array had to be reallocated.  This worked because
the previous users were switching between two such structures, and the
elements were copied from the inactive array to the active array after
making sure the active array had enough room.

maybe_sigio_broken was made to use need_poll, but it was operating on a
single array, so when the buffer was reallocated, the previous contents
were lost.

This patch makes need_poll implement more sane semantics.  It merely
assures that the array is of the proper size and that the contents are
preserved.  It is up to the caller to adjust the used elements count and to
ensure that the proper elements are resent.

This manifested itself as a hang in 2.6.20 as the uninitialized buffer
convinced UML that one of its own file descriptors didn't support SIGIO and
needed to be watched by poll in a separate thread.  The result was an
interrupt flood as control traffic over this descriptor sparked interrupts,
which resulted in more control traffic, ad nauseam.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:56 -08:00
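
The semantics that 838e56a11c describes for need_poll (guarantee the size, preserve the contents, leave the used count to the caller) can be sketched as follows. `struct grow_array` and `ensure_capacity()` are hypothetical names; this is not the UML code.

```c
#include <stdlib.h>

/* Hypothetical growable array; 'used' is managed by the caller. */
struct grow_array {
	int *items;
	size_t size;	/* allocated capacity, in elements */
	size_t used;	/* elements in use, owned by the caller */
};

/*
 * Make sure the array can hold at least n elements, preserving the
 * existing contents. Returns 0 on success, -1 on allocation failure.
 */
static int ensure_capacity(struct grow_array *a, size_t n)
{
	int *bigger;

	if (n <= a->size)
		return 0;
	bigger = realloc(a->items, n * sizeof(*bigger));
	if (!bigger)
		return -1;	/* a->items is still valid and unchanged */
	a->items = bigger;
	a->size = n;
	return 0;
}
```

Because `realloc()` copies the old elements, a caller working on a single array keeps its data across a reallocation, which is the case that broke maybe_sigio_broken.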
Dmitriy Monakhov
beb497ab48 [PATCH] __page_symlink retry loop error code fix
If prepare_write or commit_write returns AOP_TRUNCATED_PAGE, we jump to the
"retry" label, and then if find_or_create_page() fails, the function returns an
incorrect error code.

Signed-off-by: Dmitriy Monakhov <dmonakhov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:56 -08:00
Frederik Deweerdt
fb4d64e78c [PATCH] pci_iomap_regions() error handling fix
It appears that the pcim_iomap_regions() function doesn't get the error
handling right. It BUGs early at boot with a backtrace along the lines of:

ahci_init
pci_register_driver
driver_register
[...]
ahci_init_one
pcim_iomap_region
pcim_iounmap

The following patch allows me to boot. Only the if(mask..) continue;
part actually fixes the problem; the gotos were changed so that we
don't try to unmap something we couldn't map anyway.

Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:55 -08:00
David Brownell
f5de611148 [PATCH] GPIO core documentation
Small updates to the GPIO documentation, addressing feedback and
fixing a few spelling errors.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:55 -08:00
Haavard Skinnemoen
41d8ca452f [AVR32] Use per-controller spi_board_info structures
Set up one spi_board_info array per controller and pass this to
at32_add_device_spi so that it can set up any GPIO pins for chip
selects based on this information.

Extracted from a patch by David Brownell and adapted slightly.

Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
2007-02-16 14:01:40 +01:00
Haavard Skinnemoen
23cebe2287 [AVR32] Warn, don't BUG if clk_disable is called too many times
Print a helpful warning along with a stack dump if clk_disable is
called on an already-disabled clock. Remove the BUG_ON().

Extracted from a patch by David Brownell.

Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
2007-02-16 13:19:47 +01:00
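
The warn-instead-of-BUG behaviour from 23cebe2287 amounts to something like the user-space sketch below. `struct sketch_clk` and `sketch_clk_disable()` are hypothetical; the real code prints a kernel warning with a stack dump rather than writing to stderr.

```c
#include <stdio.h>

/* Hypothetical clock with a simple user count. */
struct sketch_clk {
	const char *name;
	unsigned int users;
};

/*
 * Disable one user of the clock. An unbalanced call is reported and
 * ignored instead of being treated as a fatal error.
 */
static int sketch_clk_disable(struct sketch_clk *clk)
{
	if (clk->users == 0) {
		fprintf(stderr, "warning: clock %s already disabled\n", clk->name);
		return -1;
	}
	clk->users--;
	return 0;
}
```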
Haavard Skinnemoen
7a5fe23879 [AVR32] Make sure all genclocks have a parent
Initialize the parent field of each generic clock by looking at the
PM registers. This means that the genclock operations can always
assume that the parent field is non-null, so they don't have to
check. Also remove a few unnecessary BUG_ON()s.

Extracted from a patch by David Brownell.

Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
2007-02-16 13:14:33 +01:00
Haavard Skinnemoen
160f34531a [AVR32] Remove unnecessary sys_nfsservctl conditional
kernel/sys_ni.c defines sys_nfsservctl as a weak alias for
sys_ni_syscall, so it's always safe to include it in the system
call table.

Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
2007-02-16 12:55:42 +01:00
Haavard Skinnemoen
1a6f1436d5 [AVR32] Wire up the SysV IPC calls properly
Wire up the individual sysvipc system calls and remove sys_ipc.
Strictly speaking, this breaks the ABI, but since sys_ipc never
worked anyway due to a silly bug, it isn't actually a regression.

Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
2007-02-16 12:54:44 +01:00
Haavard Skinnemoen
2201ec2b10 [AVR32] Define ioremap_nocache, ioport_map and ioport_unmap
These are all defined in terms of ioremap/iounmap since port I/O
isn't really different from memory-mapped I/O on AVR32.

Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
2007-02-16 12:53:57 +01:00
Haavard Skinnemoen
b60f16eb56 [AVR32] Fix prototypes for __raw_writesb and friends
The first parameter to __raw_writes[bwl] and __raw_reads[bwl] should
be a void __iomem *, not unsigned long.

Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
2007-02-16 12:47:40 +01:00
Konstantin Karasyov
b1028c545c ACPI: fix fan after resume from S3
http://bugzilla.kernel.org/show_bug.cgi?id=7570

Signed-off-by: Konstantin Karasyov <konstantin.a.karasyov@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-16 02:23:07 -05:00
Len Brown
e8363f3327 ACPI: update acpi_power_resume() per new acpi_op_resume
drivers/acpi/power.c:69: warning: initialization from incompatible pointer type

Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-16 02:05:39 -05:00
Konstantin Karasyov
0a6139027f ACPI: Thermal issues on HP nx6325
The previous reference counting scheme to enable power resources
got confused when multiple devices were present that might
repeatedly enable or disable the resource and throw off the count.

The new code simply lists the referencing devices which
are requesting the resource to be enabled.  When there are none,
then it is off.

Signed-off-by: Konstantin Karasyov <konstantin.a.karasyov@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-16 01:47:06 -05:00
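
The difference between a bare reference count and a list of referencing devices (0a6139027f) is that repeated requests from one device cannot skew the state. A minimal sketch, with hypothetical names and an arbitrary fixed-size list:

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_REFS 8	/* arbitrary bound for this sketch */

/* Hypothetical power resource: tracks which devices currently need it. */
struct power_res {
	int refs[MAX_REFS];	/* ids of referencing devices */
	size_t nrefs;
};

/* The resource is on exactly when at least one device references it. */
static bool res_is_on(const struct power_res *r)
{
	return r->nrefs > 0;
}

/* Add dev to the list; a repeated request from the same device is a no-op. */
static bool res_acquire(struct power_res *r, int dev)
{
	for (size_t i = 0; i < r->nrefs; i++)
		if (r->refs[i] == dev)
			return true;
	if (r->nrefs == MAX_REFS)
		return false;
	r->refs[r->nrefs++] = dev;
	return true;
}

/* Remove dev from the list; releasing an unlisted device is a no-op. */
static void res_release(struct power_res *r, int dev)
{
	for (size_t i = 0; i < r->nrefs; i++) {
		if (r->refs[i] == dev) {
			r->refs[i] = r->refs[--r->nrefs];
			return;
		}
	}
}
```

A device that enables the resource twice and disables it once leaves the list empty of that device, whereas a plain counter would still read one.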
Sanjoy Mahajan
636cedf9df ACPI: thermal: fix units in debug output
http://bugzilla.kernel.org/show_bug.cgi?id=4972

Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-16 01:24:43 -05:00
Thomas Gleixner
5c95d3f578 ACPI: include apic.h in processor driver for benefit of UP kernels
apic.h does not get included on UP compiles.  That way the
APICTIMER_STOPS_ON_C3 is not there and UP boxen have no support for timer
broadcasting.  This was never noticed, because the lapic timer is only used
for profiling on UP.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-15 23:27:13 -05:00
Len Brown
8d4956c201 ACPI: remove non-PNPACPI version of get_rtc_dev()
It isn't needed in ACPI code anymore because
now ACPI always includes PNPACPI.

Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-15 22:46:42 -05:00
Len Brown
243b66e76a ACPI: always enable CONFIG_PNPACPI on CONFIG_ACPI kernels
We removed the ACPI motherboard driver which handled
the ACPI=y, PNP=n case, so now we need to enforce that
PNP & PNPACPI are always enabled for ACPI kernels.

Most major distros already ship this way.

Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-15 22:38:04 -05:00
Len Brown
fc955f670c ACPI: remove acpi_os_readable(), acpi_os_writable()
...which are now unused

Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-15 22:19:17 -05:00
Randy Dunlap
70c0846e43 ACPI: Fix sparse warnings
Use NULL for pointers

drivers/acpi/osl.c:208:10: warning: Using plain integer as NULL pointer
drivers/acpi/tables/tbxface.c:411:49: warning: Using plain integer as NULL pointer
drivers/acpi/processor_core.c:1008:10: warning: Using plain integer as NULL pointer

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-15 22:19:07 -05:00
Nate Dailey
7de970e11f sata_vsc: use default cache line size if non-zero
This modifies drivers/ata/sata_vsc.c to only set the cache line size
to 0x80 if the default value is zero. Apparently zero isn't allowed
due to a bug in the chip, but I've found performance is much better
with the (non-zero) default instead of 0x80.

[note1: "default" means BIOS-programmed value, in this context -jgarzik]

[note2: superfluous braces were removed from the patch -jg]

Signed-off-by: Nate Dailey <nate.dailey@stratus.com>
Signed-off-by: Jeremy Higdon <jeremy@sgi.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-15 18:13:46 -05:00
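
The rule from 7de970e11f is small enough to state as one function. `choose_cache_line_size()` is a hypothetical helper, not the driver's code; the real change operates on the PCI cache line size register.

```c
#include <stdint.h>

/*
 * Keep the BIOS-programmed cache line size when it is non-zero;
 * fall back to 0x80 only when the register reads zero.
 */
static uint8_t choose_cache_line_size(uint8_t bios_value)
{
	return bios_value != 0 ? bios_value : 0x80;
}
```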
Robert Hancock
5278b50cea sata_nv: handle SError status indication
ADMA-capable controllers provide a bit in the status register that appears
to indicate that the controller detected an SError condition. Update sata_nv
to detect this and trigger error handling in order to handle the fault.

Signed-off-by: Robert Hancock <hancockr@shaw.ca>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-15 18:05:32 -05:00
Olaf Hering
8361cd79f2 add delay around sl82c105_reset_engine calls
The hald media-change polling really confuses things.
No one knows why the delays are needed, but they give us access to the CD.

A udelay(50) will give reliable access to the drive, but there is still
one (or more) EH reset. The drive works without EH resets with udelay(100).

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-15 18:04:53 -05:00
Zhang, Yanmin
9f271d576a ATA convert GSI to irq on ia64
If an ATA drive uses legacy mode, the ATA driver chooses 14 and 15
as the fixed irq numbers. On the ia64 platform, such numbers are GSIs and
should be converted to irq vectors.

The patch below, against kernel 2.6.20, fixes it.

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-15 18:04:53 -05:00
Tejun Heo
81afe89318 libata: clear TF before IDENTIFYing
Some devices choke if Feature is not clear when IDENTIFY is issued.
Set ATA_TFLAG_ISADDR | ATA_TFLAG_DEVICE for IDENTIFY so that the whole
TF is cleared when reading ID data.

Kudos to Art Haas for testing various futile patches over several
months and Mark Lord for pointing out the fix.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Art Haas <ahaas@airmail.net>
Cc: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-15 18:04:01 -05:00
Alan Cox
f834e49f1a libata: Add a host flag to indicate lack of IORDY capability
This is the first preparation to doing the !IORDY cases properly.  Further
diffs will then add the needed logic to do it right.

Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Tejun Heo <htejun@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-15 18:04:01 -05:00
Tejun Heo
61f216c719 libata: fix drive side 80c cable check, take 3
The 80c wire bit is bit 13, not 14.  Bit 14 is always 1 if word93 is
implemented.  This increases the chance of incorrect wire detection
especially because host side cable detection is often unreliable and
we sometimes solely depend on drive side cable detection.  Fix the test
and add word93 validity check.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-15 18:04:01 -05:00
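
The corrected drive-side test from 61f216c719 can be written as two bit checks on IDENTIFY word 93. The helper names are hypothetical, and requiring bit 15 to be clear is an assumption taken from the ATA specification; the commit text itself only states that bit 14 is always 1 when the word is implemented.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Word 93 is treated as valid when bit 14 is set and bit 15 is clear.
 * The bit 15 requirement is an assumption, see the note above.
 */
static bool word93_valid(uint16_t w93)
{
	return (w93 & 0xc000) == 0x4000;
}

/* Drive-side check: a valid word 93 with bit 13 set reports an 80-wire cable. */
static bool drive_sees_80wire(uint16_t w93)
{
	return word93_valid(w93) && (w93 & (1u << 13)) != 0;
}
```

A value of 0x4000 shows the old bug: testing bit 14 would report an 80-wire cable for every drive that implements the word, while the corrected test reports a 40-wire cable.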
Mikael Pettersson
5387373bfe sata_promise: new EH conversion for 20619 chips, take 2
This patch updates the sata_promise driver to use new-style
libata error handling for 20619 (TX4000) chips. sata_promise
already uses new EH for the other chips it supports, so the
patch is quite simple:

* remove ->phy_reset and ->eng_timeout ops from pdc_pata_ops,
  and instead bind ->freeze, ->thaw, ->error_handler, and
  ->post_internal_cmd to existing new EH functions
* drop ATA_FLAG_SRST from board_20619's flags
* remove now unused pdc_pata_phy_reset() and pdc_eng_timeout()

Tested on a TX4000 with both modern working disks and old/quirky
disks. Also used a CD-RW drive to test reading and writing CDs.

Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-15 18:04:00 -05:00
Mikael Pettersson
2fb8b49fb2 sata_promise: fix missing PATA cable detection
This patch fixes an oversight which caused sata_promise to
not perform cable detection on the TX2plus chips' PATA ports.

Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-15 18:04:00 -05:00
Alexey Starikovskiy
5f7748cf91 Execute AML Notify() requests on stack.
HP nx6125/nx6325/... machines have a _GPE handler with an infinite
loop sending Notify() events to different ACPI subsystems.

The notify handler in the ACPI thermal driver is a C-routine,
which may invoke the ACPI interpreter again to get access
to some ACPI variables such as temperature.  (acpi_evaluate_xxx)
On these HP machines such an evaluation changes state of an ASL variable
and lets the loop above break.

In the current ACPI implementation, Notify requests are being deferred
to the same kacpid workqueue on which the above GPE handler with
infinite loop is executing. Thus we have a deadlock -- loop will
continue to spin, sending notify events, and at the same time
preventing these notify events from being run on a workqueue. All
notify events are deferred, thus we see an explosion in memory consumption.

Also as GPE handling is blocked, machines overheat because ACPI-based
fan control is stalled.  Eventually by external poll of the same
acpi_evaluate, kacpid is released and all the queued notify events are
free to run, thus 100% CPU utilization by kacpid for several seconds
or more.

To prevent this failure,  Linux must not send notify events to the
kacpid workqueue -- either executing them immediately or putting them
on some other thread.

The first attempt to create a new thread was done by Peter Wainwright.
He created a bunch of threads, which were stealing work from a kacpid
workqueue.
This patch appeared in the 2.6.15-based kernel shipped with Ubuntu 6.06 LTS.

The second attempt was done by Alexey Starikovskiy, who created a new thread
for each Notify event. This worked OK on HP nx machines,
but broke Linus' Compaq n620c by producing threads at such a speed that
they stopped the machine completely.
Thus this patch was reverted from 2.6.18-rc2.

Alexey re-made the patch to create a second workqueue just for notify events,
hoping that it would not break Linus' machine. The patch was tested on the
same HP nx machines in #5534 and #7122, but it broke Linus' machine
as well and was reverted from 2.6.19-rc with much fanfare.

The 4th patch inserted schedule_timeout(1) into deferred
execution of kacpid, if we had any notify requests pending, but Linus
decided that it was too complex (involved either changes to workqueue
to see if it's empty or atomic inc/dec).  Then a 5th attempt added a
yield() to every GPE execution.

Finally, this 6th generation patch simply executes the notify handler
on the stack.  Previous attempts to do this simple solution failed
because of issues in AML mutex re-entrancy which are now fixed
by the previous patch in this series.

http://bugzilla.kernel.org/show_bug.cgi?id=5534

Signed-off-by: Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-15 16:13:51 -05:00
Alexey Starikovskiy
c0d127b569 ACPICA: fix AML mutex re-entrancy
ACPI AML supports "serialized" methods which are protected
by an implicit mutex.  The mutex is re-entrant for that AML thread
to allow recursion.

However, Linux implements notify() by creating a new AML thread.
So for systems where notify() re-enters a serialized method,
deadlock results.

The fix is to use the Linux thread_id as the key to allowing
re-entrancy, not the AML thread pointer.

http://bugzilla.kernel.org/show_bug.cgi?id=5534

Signed-off-by: Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-02-15 16:13:16 -05:00
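
The idea behind c0d127b569, re-entrancy keyed by a stable owner id, can be sketched with a single-threaded model. `struct reentrant_mutex` and its helpers are hypothetical and not thread-safe; they only show the ownership rule, not the ACPICA implementation.

```c
#include <stdbool.h>

/* Hypothetical re-entrant mutex keyed by an owner id (e.g. an OS thread id). */
struct reentrant_mutex {
	bool held;
	unsigned long owner;	/* valid only while held */
	unsigned int depth;	/* nesting level of the owner */
};

/*
 * Try to take the mutex on behalf of 'id'. Succeeds if the mutex is free
 * or already held by the same id; a real implementation would block
 * instead of returning false.
 */
static bool rm_try_acquire(struct reentrant_mutex *m, unsigned long id)
{
	if (m->held && m->owner != id)
		return false;
	m->held = true;
	m->owner = id;
	m->depth++;
	return true;
}

/* Drop one nesting level; the mutex becomes free when depth reaches zero. */
static void rm_release(struct reentrant_mutex *m, unsigned long id)
{
	if (!m->held || m->owner != id)
		return;
	if (--m->depth == 0)
		m->held = false;
}
```

If the key changes whenever notify() creates a new AML thread, the second acquire looks like a different owner and deadlocks; keying on an id that stays the same across that re-entry lets the nested acquire succeed.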
Linus Torvalds
f99c6bb6e2 Merge master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6: (35 commits)
  sh: rts7751r2d board updates.
  sh: Kill off dead bigsur and ec3104 boards.
  sh: Fixup r7780rp pata_platform for devres conversion.
  sh: Revert TLB miss fast-path changes that broke PTEA parts.
  sh: Compile fix for heartbeat consolidation.
  sh: heartbeat consolidation for banked LEDs.
  sh: define dma noncoherent API functions.
  sh: Missing flush_dcache_all() proto in cacheflush.h.
  sh: Kill dead/unused ISA code from __ioremap().
  sh: Add cpu-features header to asm/Kbuild.
  sh: Move __KERNEL__ up in asm/page.h.
  sh: Fix syscall numbering breakage.
  sh: dcache write-back for R7780RP PIO.
  sh: Switch to local TLB flush variants in additional callsites.
  sh: Local TLB flushing variants for SMP prep.
  sh: Fixup cpu_data references for the non-boot CPUs.
  sh: Use a per-cpu ASID cache.
  sh: add SH_CLK_MD Kconfig default.
  sh: Fixup SHMIN INTC register definitions.
  sh: SH-DMAC compile fixes
  ...
2007-02-15 10:01:15 -08:00
Nick Piggin
e0a04cffa4 [PATCH] mincore: vma crossing fix
My mincore also forgot about crossing vmas.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-15 09:57:03 -08:00
Nick Piggin
4a76ef036a [PATCH] mincore: fill in results properly
Paper bag time. Thanks to Randy for noticing that I didn't actually assign
'present' to anything.

Unfortunately my original patch passed the few simple test cases I gave it,
purely by coincidence.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-15 09:57:03 -08:00
Nick Piggin
30fcffed81 [PATCH] mincore: CONFIG_SWAP=n fix
Fix mincore-anon patch to compile with CONFIG_SWAP=n

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-15 09:57:03 -08:00
Paul Mundt
9c57548f17 sh: rts7751r2d board updates.
This tidies up some of the rts7751r2d mess and gets it booting
again. Update the defconfig, too.

Signed-off-by: Masayuki Hosokawa <hosokawa@ace-jp.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-02-15 18:20:52 +09:00
Rafal Bilski
2b8c0e1302 [CPUFREQ] Longhaul - Redo Longhaul ver. 2
Start using the v2 version of Longhaul when available. It provides
voltage scaling and can use the ACPI C3 state. Curiously, the CPU
will not change frequency on ACPI C3 when v1 is in use, but it will
when v2 is used. The driver will return the max frequency all the time if
this isn't true for all processors. There is a strange thing with
mobile voltage: it looks like only Nehemiah (C3-M) supports it.
Earlier processors have a different mobile VRM (in the docs), but I can't
find any that uses it. It looks like all are using VRM 8.5, so
fail for non-Nehemiah processors with a mobile VRM.

Signed-off-by: Rafal Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
2007-02-14 17:32:06 -05:00