When we hot-plug in new cpus, the core_id and proc_id of existing
cpus can change. So in order to set the cpu groups correctly we
need to clear the maps out completely first.
Signed-off-by: David S. Miller <davem@davemloft.net>
dr-cpu unconfigure requests will walk throught he enabled
IRQs and trigger ->set_affinity so that the going-down
cpu no longer has INOs targetted to it.
Signed-off-by: David S. Miller <davem@davemloft.net>
Take a page from the powerpc folks and just calculate the
delay factor directly.
Since frequency scaling chips use a system-tick register,
the value is going to be the same system-wide.
Signed-off-by: David S. Miller <davem@davemloft.net>
With the move of ldom_startcpu_cpuid() into smp.c some other
things need to follow along:
1) smp.c is not a driver so we can't use "PFX" macro in the
printk calls.
2) smp.c now needs asm/io.h and asm/hvtramp.h, ds.c no longer
does
3) kimage_addr_to_ra() also needs to move into smp.c
While we're here, update copyright info and my email address
in smp.c
Signed-off-by: David S. Miller <davem@davemloft.net>
Do not select HOTPLUG_CPU from SUN_LDOMS, that causes
HOTPLUG_CPU to be selected even on non-SMP which is
illegal.
Only build hvtramp.o when SMP, just like trampoline.o
Protect dr-cpu code in ds.c with HOTPLUG_CPU.
Likewise move ldom_startcpu_cpuid() to smp.c and protect
it and the call site with SUN_LDOMS && HOTPLUG_CPU.
Signed-off-by: David S. Miller <davem@davemloft.net>
The VIO drivers register themselves unconditionally just
like those of any other bus type, so to avoid crashes
on non-VIO systems we need to always register vio_bus_type.
Signed-off-by: David S. Miller <davem@davemloft.net>
Only adding cpus is supports at the moment, removal
will come next.
When new cpus are configured, the machine description is
updated. When we get the configure request we pass in a
cpu mask of to-be-added cpus to the mdesc CPU node parser
so it only fetches information for those cpus. That code
also proceeds to update the SMT/multi-core scheduling bitmaps.
cpu_up() does all the work and we return the status back
over the DS channel.
CPUs via dr-cpu need to be booted straight out of the
hypervisor, and this requires:
1) A new trampoline mechanism. CPUs are booted straight
out of the hypervisor with MMU disabled and running in
physical addresses with no mappings installed in the TLB.
The new hvtramp.S code sets up the critical cpu state,
installs the locked TLB mappings for the kernel, and
turns the MMU on. It then proceeds to follow the logic
of the existing trampoline.S SMP cpu bringup code.
2) All calls into OBP have to be disallowed when domaining
is enabled. Since cpus boot straight into the kernel from
the hypervisor, OBP has no state about that cpu and therefore
cannot handle being invoked on that cpu.
Luckily it's only a handful of interfaces which can be called
after the OBP device tree is obtained. For example, rebooting,
halting, powering-off, and setting options node variables.
CPU removal support will require some infrastructure changes
here. Namely we'll have to process the requests via a true
kernel thread instead of in a workqueue. workqueues run on
a per-cpu thread, but when unconfiguring we might need to
force the thread to execute on another cpu if the current cpu
is the one being removed. Removal of a cpu also causes the kernel
to destroy that cpu's workqueue running thread.
Another issue on removal is that we may have interrupts still
pointing to the cpu-to-be-removed. So new code will be needed
to walk the active INO list and retarget those cpus as-needed.
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a special domain services capability for setting
variables in the OBP options node. Guests don't have permanent
store for the OBP variables like a normal system, so they are
instead maintained in the LDOM control node or in the SC.
Signed-off-by: David S. Miller <davem@davemloft.net>
Property values cannot be referenced outside of
mdesc_grab()/mdesc_release() pairs. The only major
offender was the VIO bus layer, easily fixed.
Add some commentary to mdesc.h describing these rules.
Signed-off-by: David S. Miller <davem@davemloft.net>
Since we have to be able to handle MD updates, having an in-tree
set of data structures representing the MD objects actually makes
things more painful.
The MD itself is easy to parse, and we can implement the existing
interfaces using direct parsing of the MD binary image.
The MD is now reference counted, so accesses have to now take the
form:
handle = mdesc_grab();
... operations on MD ...
mdesc_release(handle);
The only remaining issue are cases where code holds on to references
to MD property values. mdesc_get_property() returns a direct pointer
to the property value, most cases just pull in the information they
need and discard the pointer, but there are few that use the pointer
directly over a long lifetime. Those will be fixed up in a subsequent
changeset.
A preliminary handler for MD update events from domain services is
there, it is rudimentry but it works and handles all of the reference
counting. It does not check the generation number of the MDs,
and it does not generate a "add/delete" list for notification to
interesting parties about MD changes but that will be forthcoming.
Signed-off-by: David S. Miller <davem@davemloft.net>
All of the interrupts say "LDX RX" and "LDX TX" currently
which is next to useless. Put a device specific prefix
before "RX" and "TX" instead which makes it much more
useful.
Signed-off-by: David S. Miller <davem@davemloft.net>
Besides the existing usage for power-button interrupts, we'll
want to make use of this code for domain-services where the
LDOM manager can send reboot requests to the guest node.
Signed-off-by: David S. Miller <davem@davemloft.net>
1) LDC_MODE_RELIABLE is deprecated an unused by anything, plus
it and LDC_MODE_STREAM were mis-numbered.
2) read_stream() should try to read as much as possible into
the per-LDC stream buffer area, so do not trim the read_nonraw()
length by the caller's size parameter.
3) Send data ACKs when necessary in read_nonraw().
4) In read_nonraw() when we get a pure ACK, advance the RX head
unconditionally past it.
5) Provide the ACKID field in the ldcdgb() packet dump in read_nonraw().
This helps debugging stream mode LDC channel problems.
6) Decrease verbosity of rx_data_wait() so that it is more useful.
A debugging message each loop iteration is too much.
7) In process_data_ack() stop the loop checking when we hit lp->tx_tail
not lp->tx_head.
8) Set the seqid field properly in send_data_nack().
Signed-off-by: David S. Miller <davem@davemloft.net>
This is also a partial workaround for a bug in the LDOM firmware which
double-transmits RX inos during high load. Without this, such an
event causes the kernel to loop forever in the interrupt call chain
ACK'ing but never actually running the IRQ handler (and thus clearing
the interrupt condition in the device).
There is still a bad potential effect when double INOs occur,
not covered by this changeset. Namely, if the INO is already on
the per-cpu INO vector list, we still blindly re-insert it and
thus we can end up losing interrupts already linked in after
it.
We could deal with that by traversing the list before insertion,
but that's too expensive for this edge case.
Signed-off-by: David S. Miller <davem@davemloft.net>
Virtual devices on Sun Logical Domains are built on top
of a virtual channel framework. This, with help of hypervisor
interfaces, provides a link layer protocol with basic
handshaking over which virtual device clients and servers
communicate.
Built on top of this is a VIO device protocol which has it's
own handshaking and message types. At this layer attributes
are exchanged (disk size, network device addresses, etc.)
descriptor rings are registered, and data transfers are
triggers and replied to.
Signed-off-by: David S. Miller <davem@davemloft.net>
This removes the requirement for callers to get_cpu() to check in simple
cases.
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Avi Kivity <avi@qumranet.com>
This removes the requirement for callers to get_cpu() to check in simple
cases.
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Add GPIO leds to the NGW100 platform and its defconfig.
Access through /sys/class/leds/{a,b,sys}/* files; one
defaults to a heartbeat.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
s390 is the only 32bit with unsigned long for size_t (usual for those
is unsigned int). Tell sparse...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Initialize the exception vectors early in the boot process, so that CPLB faults
can be handled when memory protection is enabled.
Signed-off-by: Bernd Schmidt <bernd.schmidt@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
switch to using proper defines this time (THREAD_SIZE and PAGE_SIZE)
instead of just PAGE_SIZE everywhere
Signed-off-by: Mike Frysinger <michael.frysinger@analog.com>
Signed-off-by: Bernd Schmidt <bernd.schmidt@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Fortunately this function is only used on old 533 revisions.
Signed-off-by: Bernd Schmidt <bernd.schmidt@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
setup aliases for some core Core A MMRs to ease porting in cases
where common code would actually want Core A (or Core B MMR is reserved)
Signed-off-by: Mike Frysinger <michael.frysinger@analog.com>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Robert P.J. Day has a script that finds places in the code that
use non-existent CONFIG variables. It complained of two uses in
ia64 specific code: CONFIG_IA64_SDV and CONFIG_KDB (both used in
the hp/sim code).
Signed-off-by: Tony Luck <tony.luck@intel.com>
The ".acq" semantics of the load only apply w.r.t. other data access.
Reading the clock (ar.itc) isn't a data access so strange things can
happen here. Specifically the read of ar.itc can be launched as soon
as the read of xtime_lock.sequence is ISSUED. Since this may cache
miss, and that might cause a thread switch, and there may be cache
contention for the line containing xtime_lock, it may be a long time
before the actual value is returned, so the ar.itc value may be very
stale.
Move the consumption of r28 up before the read of ar.itc to make sure
that we really have got the current value of xtime_lock.sequence
before look at ar.itc.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Correctly count CPU objects for SGI ia64/sn hwperf interface
Signed-off-by: Mark Goodwin <markgw@sgi.com>
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
* master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq:
[CPUFREQ] Fix typos in powernow-k8 printk's.
[CPUFREQ] Restore previously used governor on a hot-replugged CPU
[CPUFREQ] bugfix cpufreq in combination with performance governor
[CPUFREQ] powernow-k8 compile fix.
[CPUFREQ] the overdue removal of X86_SPEEDSTEP_CENTRINO_ACPI
[CPUFREQ] Longhaul - Option to disable ACPI C3 support
Fixed up arch/i386/kernel/cpu/cpufreq/powernow-k8.c due to revert that
got fixed differently in the cpufreq branch.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'ioat-md-accel-for-linus' of git://lost.foo-projects.org/~dwillia2/git/iop: (28 commits)
ioatdma: add the unisys "i/oat" pci vendor/device id
ARM: Add drivers/dma to arch/arm/Kconfig
iop3xx: surface the iop3xx DMA and AAU units to the iop-adma driver
iop13xx: surface the iop13xx adma units to the iop-adma driver
dmaengine: driver for the iop32x, iop33x, and iop13xx raid engines
md: remove raid5 compute_block and compute_parity5
md: handle_stripe5 - request io processing in raid5_run_ops
md: handle_stripe5 - add request/completion logic for async expand ops
md: handle_stripe5 - add request/completion logic for async read ops
md: handle_stripe5 - add request/completion logic for async check ops
md: handle_stripe5 - add request/completion logic for async compute ops
md: handle_stripe5 - add request/completion logic for async write ops
md: common infrastructure for running operations with raid5_run_ops
md: raid5_run_ops - run stripe operations outside sh->lock
raid5: replace custom debug PRINTKs with standard pr_debug
raid5: refactor handle_stripe5 and handle_stripe6 (v3)
async_tx: add the async_tx api
xor: make 'xor_blocks' a library routine for use with async_tx
dmaengine: make clients responsible for managing channels
dmaengine: refactor dmaengine around dma_async_tx_descriptor
...
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
[MIPS] Workaround for a sparse warning in include/asm-mips/mach-tx4927/ioremap.h
[MIPS] Make show_code static and add __user tag
[MIPS] Workaround for a sparse warning in include/asm-mips/compat.h
[MIPS] Add some __user tags
[MIPS] math-emu minor cleanup
[MIPS] Kill CONFIG_TX4927BUG_WORKAROUND
[MIPS] Alchemy: Remove code wrapped by dead symbol CONFIG_FB_XPERT98
[MIPS] Alchemy: Remove code wrapped by dead symbol CONFIG_AU1000_SRC_CLK
[MIPS] Alchemy: Remove code wrapped by dead symbol CONFIG_AU1000_USE32K
[MIPS] Alchemy: Remove code wrapped by dead symbol CONFIG_AU1XXX_PSC_SPI
[CHAR] Delete leftovers of old Alchemy UART driver
This reverts commit 904f7a3f04.
As noted by Peter Anvin:
"It causes build failures on i386.
Yet another case of unnecessary divergence between i386 and x86-64
I'm afraid..."
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Declaring emulpc and contpc as "unsigned long" can get rid of some casts.
This also get rid of some sparse warnings.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>