Basic setup routines: the mm_struct has a pointer to the cgroup that
it belongs to, and the page has a page_cgroup associated with it.
Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Paul Menage <menage@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Kirill Korotaev <dev@sw.ru>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Set up the memory cgroup and add basic hooks and controls to integrate
and work with the cgroup.
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Pavel Emelianov <xemul@openvz.org>
Cc: Paul Menage <menage@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Kirill Korotaev <dev@sw.ru>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: David Rientjes <rientjes@google.com>
Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With fixes from David Rientjes <rientjes@google.com>
Introduce generic structures and routines for resource accounting.
Each resource accounting cgroup is supposed to aggregate this structure,
a cgroup_subsystem_state and its resource-specific members.
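A minimal user-space sketch of what such a generic counter could look like.
All names here (res_counter_sketch, mem_group_sketch, counter_charge and so
on) are hypothetical, locking is left out, and the structures in the patch
itself may differ:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical generic counter; fields and names are illustrative only. */
struct res_counter_sketch {
	unsigned long usage;   /* current consumption */
	unsigned long limit;   /* usage is never allowed above this */
	unsigned long failcnt; /* number of rejected charges */
};

/* A resource-specific group embeds the counter next to its own members. */
struct mem_group_sketch {
	struct res_counter_sketch res;
	/* resource-specific members would follow here */
};

static void counter_init(struct res_counter_sketch *c, unsigned long limit)
{
	c->usage = 0;
	c->limit = limit;
	c->failcnt = 0;
}

/* Charge val units; return 0 on success, -ENOMEM if the limit would be hit. */
static int counter_charge(struct res_counter_sketch *c, unsigned long val)
{
	/* usage <= limit always holds, so the subtraction cannot wrap */
	if (val > c->limit - c->usage) {
		c->failcnt++;
		return -ENOMEM;
	}
	c->usage += val;
	return 0;
}

static void counter_uncharge(struct res_counter_sketch *c, unsigned long val)
{
	assert(val <= c->usage);
	c->usage -= val;
}
```

A charge that would exceed the limit is refused and counted, so the group
can tell how often it ran into its limit.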
Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Paul Menage <menage@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Kirill Korotaev <dev@sw.ru>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Pavel Emelianov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This legacy define from the old buffer code is now only used in a single
powerpc driver that doesn't compile anyway.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
And to go with it Dave's type checking x86 termios headers. I've updated
these as the original sent by Dave had some wrong types in it.
Signed-off-by: Alan Cox <alan@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rename old vfs_ioctl to do_ioctl, because the comment above it clearly
indicates that it is an internal function not to be exported to modules;
therefore it should have a more traditional do_XXX name. The new do_ioctl
is exported in fs.h but not to modules.
Rename the old do_ioctl to vfs_ioctl because the names vfs_XXX should
preferably be reserved to callable VFS functions which modules may call, as
many other vfs_XXX functions already do. Export the new vfs_ioctl to GPL
modules so others can use it (including Unionfs and eCryptfs). Add DocBook
for new vfs_ioctl.
[akpm@linux-foundation.org: fix build]
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The DS1WM driver incorrectly infers the IAS bit (1-wire interrupt active
high) from IRQ settings. There are devices that have IAS=0 but still need
the IRQ to trigger on a rising edge. With this patch, machines with DS1WM
that need IAS=1 have to set .active_high=1 in the ds1wm_platform_data.
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Acked-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Acked-by: Matt Reimer <mreimer@vpop.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Supporting SunOS ptrace() is pretty pointless and these
kinds of quirks keep us from being able to share more
code with other platforms.
Signed-off-by: David S. Miller <davem@davemloft.net>
MTDs are well suited for logging critical data and the mtdoops driver
allows kernel panics/oops to be written to flash in a blackbox flight
recorder fashion allowing better debugging and analysis of crashes.
Any kernel oops in user context can be easily handled since the kernel
continues as normal and any queued mtd writes are scheduled. Any kernel
oops in interrupt context results in a panic and the delayed writes will
not be scheduled however. The existing mtd->write function cannot be
called in interrupt context so these messages can never be written to
flash.
This patch adds a panic_write function pointer that drivers can
optionally implement which can be called in interrupt context. It is
only intended to be called when it's known the kernel is about to panic
and we need the write to succeed. Since the kernel is not going to be
running for much longer, this function can break locks and delay to
ensure the write succeeds (but not sleep).
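The optional-callback idea can be sketched in user space roughly as follows.
This is only an illustration: mtd_sketch, raw_copy and log_oops are
hypothetical names, not the kernel's struct mtd_info or the mtdoops code, and
the normal write path is called directly here instead of being queued:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an MTD device. */
struct mtd_sketch {
	char flash[64];
	/* normal write path: may sleep, so not usable in interrupt context */
	int (*write)(struct mtd_sketch *m, size_t to,
		     const char *buf, size_t len);
	/* optional: may busy-wait or break locks, but must never sleep */
	int (*panic_write)(struct mtd_sketch *m, size_t to,
			   const char *buf, size_t len);
};

/* Demo backend used for both callbacks in this sketch. */
static int raw_copy(struct mtd_sketch *m, size_t to,
		    const char *buf, size_t len)
{
	if (to > sizeof(m->flash) || len > sizeof(m->flash) - to)
		return -1;
	memcpy(m->flash + to, buf, len);
	return 0;
}

/*
 * Returns 0 if the message was written, 1 if it could not be written
 * because we are in interrupt context and the driver has no panic_write,
 * and -1 on a write error.
 */
static int log_oops(struct mtd_sketch *m, int in_interrupt, const char *msg)
{
	size_t len;

	assert(m != NULL && msg != NULL);
	len = strlen(msg) + 1;

	if (!in_interrupt)
		return m->write(m, 0, msg, len);
	if (m->panic_write)
		return m->panic_write(m, 0, msg, len);
	return 1;
}
```

Drivers that leave panic_write unset keep the old behaviour, which is why
the pointer is checked before every use.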
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Extends the leds subsystem with a blink_set() callback function which can
be optionally implemented by an LED driver. If implemented, the driver can use
hardware acceleration for blinking an LED.
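A rough user-space sketch of how a caller might use such an optional
callback. The struct, the simplified callback signature and the fallback to
a software timer are assumptions made for illustration, not the leds
subsystem's actual definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical LED object; not the kernel's led_classdev. */
struct led_sketch {
	int hw_blinking;    /* set when the hardware does the blinking */
	int sw_timer_armed; /* set when a software timer would be used */
	/* optional; returns 0 if the hardware accepted the timings */
	int (*blink_set)(struct led_sketch *led,
			 unsigned long on_ms, unsigned long off_ms);
};

/* Prefer the driver's hardware blink; otherwise fall back to software. */
static void led_blink(struct led_sketch *led,
		      unsigned long on_ms, unsigned long off_ms)
{
	assert(led != NULL);
	if (led->blink_set && led->blink_set(led, on_ms, off_ms) == 0)
		return;
	led->sw_timer_armed = 1;
}

/* Demo driver callback: pretend the hardware only blinks symmetrically. */
static int demo_hw_blink(struct led_sketch *led,
			 unsigned long on_ms, unsigned long off_ms)
{
	if (on_ms != off_ms)
		return -1;
	led->hw_blinking = 1;
	return 0;
}
```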
Signed-off-by: Márton Németh <nm127@freemail.hu>
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
This makes the SPE register data appear in ELF core dumps, using the
new n_type value NT_PPC_SPE (0x101). This new note type is not used
by any consumers of core files yet, but support can be added. I don't
even have any hardware with SPE capabilities, so I've never seen such
a note. But this demonstrates how simple it is to export register
information in core dumps when the user_regset style is used for the
low-level code.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This replaces powerpc's compat_sys_ptrace with a compat_arch_ptrace and
enables the new generic definition of compat_sys_ptrace instead.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This switches the CONFIG_PPC64 support for 32-bit ELF to use the
generic fs/compat_binfmt_elf.c implementation instead of our own
binfmt_elf32.c. Since so much is the same between 32/64, there is
only one macro we have to define to make the generic support work out
of the box.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This switches powerpc to using the user_regset-based code for ELF core
dumps. The core dumps come out exactly the same either way, except that
the NT_PPC_VMX note is now omitted for any thread that never touched its
Altivec registers (thread_struct.vr_used).
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Two cleanups to <linux/acpi.h>:
* Stop defining acpi_mp_config, it isn't used anywhere.
* Discard nested "#ifdef CONFIG_ACPI", they are useless and
error-prone.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Commit 798d910398 (ACPI: create CONFIG_ACPI_DEBUG_FUNC_TRACE)
failed to associate the new tracing config option with the tracing code.
Signed-off-by: Len Brown <len.brown@intel.com>
Kernel mode graphics drivers need this ACPI notifier chain
so that they can get notified upon hotkey events.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Add a default poll idle state with 0 latency. Provides an option to users
to use poll_idle by using 0 as the latency requirement.
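To illustrate why a 0 latency requirement ends up selecting the poll state,
here is a small sketch of a latency-bounded state selection. The state table,
its latency values and pick_state() are made up for the example and are not
taken from the cpuidle code:

```c
#include <assert.h>
#include <stddef.h>

struct idle_state_sketch {
	const char *name;
	unsigned int exit_latency_us;
};

/* Ordered shallow to deep; index 0 is the zero-latency poll state. */
static const struct idle_state_sketch states[] = {
	{ "poll", 0 },
	{ "C1", 1 },
	{ "C2", 20 },
	{ "C3", 100 },
};

/* Return the index of the deepest state within the latency requirement. */
static size_t pick_state(unsigned int max_latency_us)
{
	size_t i, best = 0;

	/* The poll state guarantees there is always a valid answer. */
	assert(states[0].exit_latency_us == 0);

	for (i = 0; i < sizeof(states) / sizeof(states[0]); i++)
		if (states[i].exit_latency_us <= max_latency_us)
			best = i;
	return best;
}
```

With a requirement of 0 only the poll entry qualifies, so it is the one
that gets picked.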
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Add MWAIT idle for C1 state instead of halt, on platforms that support
C1 state with MWAIT.
Renames cx->space_id to something more appropriate.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Export acpi_check_resource_conflict(). Sometimes drivers already have
a struct resource at hand, so there is no need to use the wrappers to
build a new one.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: "Mark M. Hoffman" <mhoffman@lightlink.com>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Small ACPICA extension to be able to store the name of operation regions in osl.c later
In ACPI, AML can define accesses to IO ports and System Memory by Operation
Regions. Those are not registered as done by PNPACPI using resource templates
(and _CRS/_SRS methods).
The IO ports and System Memory regions may get accessed by arbitrary AML code.
When native drivers are accessing the same resources bad things can happen
(e.g. a critical shutdown temperature of 3000 C every 2 months or so).
It is not really possible to register the operation regions via
request_resource, as they often overlap with pnp or other resources (e.g.
statically setup IO resources below 0x100).
This approach stores all Operation Region declarations (IO and System Memory
only) at ACPI table parse time. It offers functionality similar to
request_region and lets drivers that are known to possibly use the same IO
ports and memory that ACPI also often uses (hwmon and i2c) check for
ACPI interference.
A boot parameter acpi_enforce_resources=strict/lax/no is provided, which
defaults to lax:
- strict: let conflicting drivers fail to load with an error message
- lax: let conflicting drivers work normally with a warning message
- no: no functional change at all
Depending on the feedback and the kind of interferences we see, this
should be set to strict at a later time.
The goals of this patch set are:
- Identify ACPI interferences in bug reports (very hard to reproduce
and to identify)
- Find BIOSes for which an ACPI driver should exist for specific HW
instead of a native one.
- Improve stability in general
Provide acpi_check_{mem_}region.
Drivers can additionally check against possible ACPI interference by also
invoking this shortly before they call request_region.
If -EBUSY is returned, the driver must not load.
Use acpi_enforce_resources=strict/lax/no options to:
- strict: let conflicting drivers fail to load with an error message
- lax: let conflicting drivers work normally with a warning message
- no: no functional change at all
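The three policies can be sketched as a plain range-overlap check. This is
a user-space illustration only: region_sketch, enforce_sketch and
check_region() are hypothetical and do not reproduce the real
acpi_check_region() signature or its bookkeeping:

```c
#include <assert.h>
#include <errno.h>

enum enforce_sketch { ENFORCE_NO, ENFORCE_LAX, ENFORCE_STRICT };

/* Inclusive address range, e.g. an IO port window. */
struct region_sketch {
	unsigned long start, end;
};

/*
 * Returns 0 if the driver may load and -EBUSY if it must not.
 * *reported is set when a warning or error message would be printed.
 */
static int check_region(const struct region_sketch *opregions, int n,
			struct region_sketch want,
			enum enforce_sketch policy, int *reported)
{
	int i;

	assert(want.start <= want.end);
	*reported = 0;

	if (policy == ENFORCE_NO)
		return 0; /* no functional change at all */

	for (i = 0; i < n; i++) {
		if (want.start <= opregions[i].end &&
		    opregions[i].start <= want.end) {
			*reported = 1;
			return policy == ENFORCE_STRICT ? -EBUSY : 0;
		}
	}
	return 0;
}
```

A driver would run such a check shortly before its request_region call and
refuse to load when -EBUSY comes back.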
Cc: "Mark M. Hoffman" <mhoffman@lightlink.com>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Now that struct mlx4_buf.u is a struct instead of a union because of
the vmap() changes, there's no point in having a struct at all. So
move .direct and .page_list directly into struct mlx4_buf and get rid
of a bunch of unnecessary ".u"s.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Since kernel virtual memory is not a problem on 64-bit systems, there
is no reason to use our own 2-layer page mapping scheme for large
kernel queue buffers on such systems. Instead, map the page list to a
single virtually contiguous buffer with vmap(), so that we can access
buffer memory via direct indexing.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
We use struct mlx4_buf for kernel QP, CQ and SRQ buffers, and the code
to look up an entry is duplicated in get_cqe_from_buf() and the QP and
SRQ versions of get_wqe(). Factor this out into mlx4_buf_offset().
This will also make it easier to switch over to using vmap() for buffers.
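As an illustration of the lookup being factored out, a user-space sketch
might look like this. The struct layout, the 4 KiB page size and the name
buf_offset() are assumptions for the example and are not copied from the
driver:

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

struct buf_list_sketch {
	char *buf;
};

/* Hypothetical buffer: one contiguous chunk, or a list of page chunks. */
struct buf_sketch {
	struct buf_list_sketch direct;     /* used when nbufs == 1 */
	struct buf_list_sketch *page_list; /* used when nbufs > 1 */
	int nbufs;
};

/* Translate a byte offset in the logical buffer into a pointer. */
static void *buf_offset(const struct buf_sketch *b, size_t offset)
{
	assert(b->nbufs >= 1);
	if (b->nbufs == 1)
		return b->direct.buf + offset;

	/* page index from the high bits, offset in page from the low bits */
	assert((offset >> SKETCH_PAGE_SHIFT) < (size_t)b->nbufs);
	return b->page_list[offset >> SKETCH_PAGE_SHIFT].buf +
	       (offset & (SKETCH_PAGE_SIZE - 1));
}
```

Keeping the arithmetic in one helper means callers stay unchanged if the
backing storage later becomes a single mapping.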
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Eliminate cpufreq_userspace scaling_setspeed deadlock.
Luming Yu recently uncovered yet another cpufreq-related deadlock.
When one thread continuously switches the governors while another thread
repeatedly cats the contents of the cpufreq directory, both threads end up
in a deadlock.
Detailed examination of the deadlock showed the exact flow before the deadlock
as:
Thread 1                                          Thread 2
________                                          ________
cats files under /sys/devices/.../cpufreq/
                                                  Set governor to userspace
                                                    Adds a new sysfs entry for
                                                    scaling_setspeed
cats files under /sys/devices/.../cpufreq/
                                                  Set governor to performance
                                                    Holds cpufreq_rw_sem in write
                                                    mode
                                                    Sends a STOP notify to
                                                    userspace governor
cat /sys/devices/.../cpufreq/scaling_setspeed
  Gets a handle on the above sysfs entry with
  sysfs_get_active
  Blocks while trying to get cpufreq_rw_sem
  in read mode
                                                    Remove a sysfs entry for
                                                    scaling_setspeed
                                                    Blocks on sysfs_deactivate
                                                    while waiting for earlier
                                                    get_active (on other thread)
                                                    to drain
At this point both threads go into deadlock and any other thread that tries to
do anything with sysfs cpufreq will also block.
There seems to be no easy way to avoid this deadlock as long as
cpufreq_userspace adds/removes the sysfs entry under same kobject as cpufreq.
The patch below moves scaling_setspeed to cpufreq.c, keeping it always
present and calling back the governor on read/write. This is the cleanest
fix I could think of, even though adding two callbacks in the governor
structure just for this seems unnecessary.
Note that the change makes scaling_setspeed under /sys/.../cpufreq permanent
and returns <unsupported> when governor is not userspace.
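The permanent attribute with governor callbacks can be sketched in user
space as below. The struct and function names are hypothetical, and
returning -1 on a write to an unsupported governor is an assumption (only
the read behaviour is described above):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct policy_sketch;

/* Hypothetical governor with the two optional setspeed callbacks. */
struct governor_sketch {
	const char *name;
	int (*show_setspeed)(struct policy_sketch *p, char *buf, size_t len);
	int (*store_setspeed)(struct policy_sketch *p, unsigned int khz);
};

struct policy_sketch {
	const struct governor_sketch *governor;
	unsigned int cur_khz;
};

/* Always-present read handler: forwards to the governor if it can. */
static int show_scaling_setspeed(struct policy_sketch *p, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	if (!p->governor || !p->governor->show_setspeed)
		return snprintf(buf, len, "<unsupported>\n");
	return p->governor->show_setspeed(p, buf, len);
}

/* Always-present write handler; failing with -1 here is an assumption. */
static int store_scaling_setspeed(struct policy_sketch *p, unsigned int khz)
{
	if (!p->governor || !p->governor->store_setspeed)
		return -1;
	return p->governor->store_setspeed(p, khz);
}

static int us_show(struct policy_sketch *p, char *buf, size_t len)
{
	return snprintf(buf, len, "%u\n", p->cur_khz);
}

static int us_store(struct policy_sketch *p, unsigned int khz)
{
	p->cur_khz = khz;
	return 0;
}

static const struct governor_sketch userspace_gov = {
	"userspace", us_show, us_store
};
static const struct governor_sketch performance_gov = {
	"performance", NULL, NULL
};
```

Because the file is never added or removed on a governor switch, the
add/remove step that took part in the deadlock no longer exists.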
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
See Documentation/ABI/testing/sysfs-firmware-acpi
Based-on-original-patch-by: Luming Yu <luming.yu@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
When an ACPI table is overridden (for now this can happen only for DSDT)
display a big warning and taint the kernel with flag A.
Signed-off-by: Eric Piel <eric.piel@tremplin-utc.net>
Signed-off-by: Len Brown <len.brown@intel.com>
Since we have mfdcri() and mtdcri() as macros, we can't use constructions,
such as "mtdcri(base, reg, mfdcri(base, reg) | val)". In this case the
mfdcri() stuff is not evaluated first. It's evaluated inside the mtdcri()
macro and we have the dcr_ind_lock spinlock acquired twice.
To avoid this error, I've added __mfdcri()/__mtdcri() inline functions that
take the lock after register name fix-up.
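The difference between the macro and the inline-function form can be shown
with a mock lock that records its nesting depth instead of deadlocking.
Everything here (mock_lock, regs, fn_read, fn_write, MACRO_WRITE) is
hypothetical and only mirrors the shape of the problem, not the DCR code:

```c
#include <assert.h>

static int lock_depth;     /* current nesting of the mock lock */
static int max_lock_depth; /* 2 would mean a deadlock on a real spinlock */
static unsigned int regs[4];

static void mock_lock(void)
{
	if (++lock_depth > max_lock_depth)
		max_lock_depth = lock_depth;
}

static void mock_unlock(void)
{
	assert(lock_depth > 0);
	--lock_depth;
}

/* Function accessors: arguments are evaluated before the body runs. */
static inline unsigned int fn_read(int reg)
{
	unsigned int v;

	mock_lock();
	v = regs[reg];
	mock_unlock();
	return v;
}

static inline void fn_write(int reg, unsigned int val)
{
	mock_lock();
	regs[reg] = val;
	mock_unlock();
}

/* Macro accessor: 'val' is expanded, and so evaluated, under the lock. */
#define MACRO_WRITE(reg, val)		\
	do {				\
		mock_lock();		\
		regs[reg] = (val);	\
		mock_unlock();		\
	} while (0)
```

Calling fn_write(1, fn_read(1) | 0x4) takes the lock twice in sequence,
while MACRO_WRITE(1, fn_read(1) | 0x4) takes it a second time while it is
already held.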
Signed-off-by: Valentine Barshak <vbarshak@ru.mvista.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
This merges the mux.c (including the connection interface) with trans_fd
in preparation for transport API changes. Ultimately, trans_fd will need
to be rewritten to clean it up and simplify the implementation, but this
reorganization is viewed as the first step.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
GDM gets unhappy if /var/gdm doesn't have the sticky bit set. This patch adds
support for the sticky bit in much the same way setuid/setgid is supported.
With this patch, I can launch X from a v9fs rootfs (although I quickly run out
of fds in the server once gnome starts up).
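A sketch of the mode translation this implies, handling the sticky bit the
same way as setuid/setgid. The SK_DM* protocol bit values and both function
names are assumptions for illustration and should be checked against the
9P headers:

```c
#include <assert.h>

/* Unix special mode bits (octal). */
#define SK_S_ISUID 04000u
#define SK_S_ISGID 02000u
#define SK_S_ISVTX 01000u

/* Assumed protocol-side mode bits; the values are illustrative. */
#define SK_DMSETUID 0x00080000u
#define SK_DMSETGID 0x00040000u
#define SK_DMSETVTX 0x00010000u

/* Translate Unix permission and special bits to the protocol mode. */
static unsigned int unix_to_p9_mode(unsigned int mode)
{
	unsigned int res = mode & 0777u;

	if (mode & SK_S_ISUID)
		res |= SK_DMSETUID;
	if (mode & SK_S_ISGID)
		res |= SK_DMSETGID;
	if (mode & SK_S_ISVTX)
		res |= SK_DMSETVTX;
	return res;
}

/* Inverse translation, back to Unix mode bits. */
static unsigned int p9_to_unix_mode(unsigned int p9)
{
	unsigned int res = p9 & 0777u;

	if (p9 & SK_DMSETUID)
		res |= SK_S_ISUID;
	if (p9 & SK_DMSETGID)
		res |= SK_S_ISGID;
	if (p9 & SK_DMSETVTX)
		res |= SK_S_ISVTX;
	assert((res & ~07777u) == 0);
	return res;
}
```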
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>