this patch adds a _range version of hrtimer_start() so that range timers
can be created; the hrtimer_start() function is just a wrapper around this.
In addition, hrtimer_start_expires() will now preserve existing ranges.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
This patch provides a mechanism for platforms to be able to supply the
LED configuration via platform data, rather than having to hard code
it in smc91x.h.
Acked-by: Eric Miao <eric.y.miao@gmail.com>
Acked-by: Nicolas Pitre <nico@cam.org>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Periodically check for corruption in low physical memory. Don't bother
checking at fault time, since it won't show anything useful.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Some BIOSes have been observed to corrupt memory in the low 64k. This
change:
- Reserves all memory which does not have to be in that area, to
prevent it from being used as general memory by the kernel. Things
like the SMP trampoline are still in the memory, however.
- Clears the reserved memory so we can observe changes to it.
- Adds a function check_for_bios_corruption() which checks and reports on
memory becoming unexpectedly non-zero. Currently it's called in the
x86 fault handler, and the power management debug output.
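The clear-then-scan idea above can be sketched in userspace as follows. This
is only an illustration under stated assumptions: scan_area, SCAN_WORDS,
clear_scan_area() and check_scan_area() are hypothetical names standing in
for the reserved low memory and for check_for_bios_corruption(); the real
function works on physical memory and is not reproduced here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the reserved low-memory area. */
#define SCAN_WORDS 1024
static unsigned long scan_area[SCAN_WORDS];

/* Clear the area so that any later non-zero word is a visible change. */
static void clear_scan_area(void)
{
	memset(scan_area, 0, sizeof(scan_area));
}

/*
 * Count the words that are no longer zero and re-zero them, so that the
 * next scan only reports corruption that happened after this one.
 */
static size_t check_scan_area(void)
{
	size_t i, corrupted = 0;

	for (i = 0; i < SCAN_WORDS; i++) {
		if (scan_area[i] != 0) {
			corrupted++;
			scan_area[i] = 0;
		}
	}
	return corrupted;
}
```

A caller would run clear_scan_area() once at setup and then call
check_scan_area() from whatever hook is convenient, reporting when the
result is non-zero.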
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We don't need all 32 of them, only NR_SOFTIRQS.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
What I realized recently is that calling rebuild_sched_domains() in
arch_reinit_sched_domains() by itself is not enough when cpusets are enabled.
partition_sched_domains() code is trying to avoid unnecessary domain rebuilds
and will not actually rebuild anything if new domain masks match the old ones.
What this means is that doing
echo 1 > /sys/devices/system/cpu/sched_mc_power_savings
on a system with cpusets enabled will not take effect until something changes
in the cpuset setup (i.e. new sets created or deleted).
This patch restores the correct behaviour, where domains must be rebuilt in
order to enable MC powersaving flags.
Tested on quad-core Core2 box with both CONFIG_CPUSETS and !CONFIG_CPUSETS.
Also tested on dual-core Core2 laptop. Lockdep is happy and things are working
as expected.
Signed-off-by: Max Krasnyansky <maxk@qualcomm.com>
Tested-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In some randconfig configurations, hrtimers are used even though
the hrtimer config is off; this broke the build because some of
the new functions were on the wrong side of the ifdef.
This patch moves the functions to the other side of the ifdef, fixing
the build bug.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
We want to be able to control the default "rounding" that is used by
select() and poll() and friends. This is a per process property
(so that we can have a "nice" like program to start certain programs with
a looser or stricter rounding) that can be set/get via a prctl().
For this purpose, a field called "timer_slack_ns" is added to the task
struct. In addition, a field called "default_timer_slack_ns" is added
so that tasks can easily switch temporarily to a more/less accurate slack
and then back to the default.
The default value of the slack is set to 50 usec; this is significantly less
than 2.6.27's average select() and poll() timing error but still allows
the kernel to group timers somewhat to preserve power behavior. Applications
and admins can override this via the prctl().
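As a rough illustration of how the two fields relate, here is a userspace
model. It is a sketch, not the kernel code: struct task_model and its
helpers are hypothetical names, and treating a request of 0 as "go back to
the default" is an assumption of this sketch.

```c
#include <stdint.h>

/* Simplified model of the two per-task fields named above. */
struct task_model {
	uint64_t timer_slack_ns;		/* slack currently in effect */
	uint64_t default_timer_slack_ns;	/* value to fall back to */
};

#define MODEL_DEFAULT_SLACK_NS 50000ULL		/* 50 usec */

static void task_model_init(struct task_model *t)
{
	t->default_timer_slack_ns = MODEL_DEFAULT_SLACK_NS;
	t->timer_slack_ns = t->default_timer_slack_ns;
}

/* Assumption of this sketch: a request of 0 means "return to the default". */
static void task_model_set_slack(struct task_model *t, uint64_t ns)
{
	t->timer_slack_ns = ns ? ns : t->default_timer_slack_ns;
}
```

A task can thus loosen its slack for a while and later restore the default
without having to remember the old value itself.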
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
this patch turns hrtimers into range timers; they have 2 expire points
1) the soft expire point
2) the hard expire point
the kernel will do its regular best effort attempt to get the timer run
at the hard expire point. However, if some other timer fires after the soft
expire point, the kernel now has the freedom to fire this timer at this point,
and thus grouping the events and preventing a power-expensive wakeup in the
future.
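The two expire points can be pictured with a small userspace model. This is
a sketch only: struct range_timer and both helpers are hypothetical, and
plain nanosecond counts stand in for the kernel's time type.

```c
#include <stdbool.h>
#include <stdint.h>

struct range_timer {
	int64_t soft_expires_ns;	/* earliest point the timer may run */
	int64_t hard_expires_ns;	/* point the kernel aims for */
};

/*
 * The CPU is already awake at now_ns because some other timer fired.
 * May this timer be run in the same wakeup?
 */
static bool range_timer_may_fire(const struct range_timer *t, int64_t now_ns)
{
	return now_ns >= t->soft_expires_ns;
}

/* If nothing else wakes the CPU first, program the wakeup for this time. */
static int64_t range_timer_wakeup_time(const struct range_timer *t)
{
	return t->hard_expires_ns;
}
```

With soft = 100 and hard = 150, an unrelated wakeup at 120 may run the timer
early, while a wakeup at 90 may not.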
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
To catch code that still touches the "expires" member directly, rename it
so that the compiler complains, rather than getting nasty, hard-to-explain
runtime behavior.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
In order to be able to turn hrtimers into range based, we need to provide
accessor functions for getting to the "expires" ktime_t member of the
struct hrtimer.
This patch adds a set of accessors for this purpose:
* hrtimer_set_expires
* hrtimer_set_expires_tv64
* hrtimer_add_expires
* hrtimer_add_expires_ns
* hrtimer_get_expires
* hrtimer_get_expires_tv64
* hrtimer_get_expires_ns
* hrtimer_expires_remaining
* hrtimer_start_expires
No users of these new accessors are added yet; these follow in later patches.
Hopefully this patch can even go into 2.6.27-rc so that the conversions will
not have a bottleneck in -next
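To show the shape of such accessors, here is a reduced userspace sketch. It
does not reproduce the kernel definitions: struct timer_model and the
timer_model_* helpers are hypothetical, and a plain nanosecond count is used
where the kernel uses ktime_t.

```c
#include <stdint.h>

/* Simplified stand-in for the timer; callers should not touch the field. */
struct timer_model {
	int64_t expires_ns;
};

static inline void timer_model_set_expires(struct timer_model *t, int64_t ns)
{
	t->expires_ns = ns;
}

static inline void timer_model_add_expires_ns(struct timer_model *t,
					       int64_t ns)
{
	t->expires_ns += ns;
}

static inline int64_t timer_model_get_expires_ns(const struct timer_model *t)
{
	return t->expires_ns;
}

/* Time left until expiry; negative if the expiry is already in the past. */
static inline int64_t timer_model_expires_remaining(const struct timer_model *t,
						    int64_t now_ns)
{
	return t->expires_ns - now_ns;
}
```

Once every user goes through helpers like these, the representation behind
them can change without touching the callers.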
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
With lots of help, input and cleanups from Thomas Gleixner
This patch switches select() and poll() over to hrtimers.
The core of the patch is replacing the "s64 timeout" with a
"struct timespec end_time" in all the plumbing.
But most of the diffstat comes from using the just introduced helpers:
poll_select_set_timeout
poll_select_copy_remaining
timespec_add_safe
which make manipulating the timespec easier and less error-prone.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
with hrtimer poll/select, the signal restart data no longer is a single
long representing a jiffies count, but it becomes a second/nanosecond pair
that also needs to encode if there was a timeout at all or not.
This patch adds a struct to the restart_block union for this purpose
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
This patch adds 2 helpers that will be used for the hrtimer based select/poll:
poll_select_set_timeout() is a helper that takes a timeout (as a second, nanosecond
pair) and turns that into a "struct timespec" that represents the absolute end time.
This is a common operation in the many select() and poll() variants and needs various,
common, sanity checks.
poll_select_copy_remaining() is a helper that takes care of copying the remaining
time to userspace, as select(), pselect() and ppoll() do. This function comes in
both a natural and a compat implementation (due to datastructure differences).
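The "relative timeout to absolute end time" step could look roughly like
this in userspace. It is a sketch under assumptions: struct ts_model and
set_end_time() are hypothetical, the current time is passed in by the
caller, and the exact sanity checks of poll_select_set_timeout() are not
reproduced.

```c
#include <stdint.h>

#define MODEL_NSEC_PER_SEC 1000000000L

struct ts_model {
	int64_t sec;
	long nsec;	/* always in [0, MODEL_NSEC_PER_SEC) */
};

/*
 * Turn a relative (sec, nsec) timeout into an absolute end time.
 * Returns 0 on success, -1 if the timeout is malformed.
 * Overflow of now.sec + sec is not handled in this sketch; the caller is
 * assumed to pass values that are small enough.
 */
static int set_end_time(struct ts_model *end, struct ts_model now,
			int64_t sec, long nsec)
{
	if (sec < 0 || nsec < 0 || nsec >= MODEL_NSEC_PER_SEC)
		return -1;

	end->sec = now.sec + sec;
	end->nsec = now.nsec + nsec;
	if (end->nsec >= MODEL_NSEC_PER_SEC) {
		end->nsec -= MODEL_NSEC_PER_SEC;
		end->sec++;
	}
	return 0;
}
```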
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
For the select() rework, it's important to be able to add timespec
structures in an overflow-safe manner.
This patch adds a timespec_add_safe() function for this which is similar in
operation to ktime_add_safe(), but works on a struct timespec.
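One way to do such an overflow-safe addition in portable C is shown below.
This is a sketch, not the kernel implementation: struct ts64 and
ts64_add_saturating() are hypothetical names, both inputs are assumed to be
non-negative and normalized, and saturating to the maximum value on overflow
is a choice made for this example.

```c
#include <stdint.h>

struct ts64 {
	int64_t sec;
	long nsec;	/* normalized: 0 <= nsec < 1000000000 */
};

static struct ts64 ts64_add_saturating(struct ts64 a, struct ts64 b)
{
	struct ts64 r;
	long nsec = a.nsec + b.nsec;	/* below 2e9, fits in a 32-bit long */
	int64_t carry = 0;

	if (nsec >= 1000000000L) {
		nsec -= 1000000000L;
		carry = 1;
	}

	/* Check before adding: signed overflow is undefined in C. */
	if (a.sec > INT64_MAX - b.sec || a.sec + b.sec > INT64_MAX - carry) {
		r.sec = INT64_MAX;
		r.nsec = 0;
		return r;
	}

	r.sec = a.sec + b.sec + carry;
	r.nsec = nsec;
	return r;
}
```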
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
This patch adds a schedule_hrtimeout() function, to be used by select() and
poll() in a later patch. This function works similarly to schedule_timeout()
in most ways, but takes a timespec rather than jiffies.
With a lot of contributions/fixes from Thomas
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Fix some pasto's in comments in the new linux/tracehook.h and
asm-generic/syscall.h files.
Reported-by: Wenji Huang <wenji.huang@oracle.com>
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I found we can no longer set limit to 0 with 2.6.27-rcX:
# mount -t cgroup -omemory xxx /mnt
# mkdir /mnt/0
# echo 0 > /mnt/0/memory.limit_in_bytes
bash: echo: write error: Device or resource busy
It turned out 'limit' can't be set to 'usage', which is wrong IMO.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: fix process time monotonicity
sched_clock: fix NOHZ interaction
* git://git.infradead.org/~dwmw2/dwmw2-2.6.27:
Revert "[ARM] use the new byteorder headers"
Fix conditional export of kvh.h and a.out.h to userspace.
[MTD] [NAND] tmio_nand: fix base address programming
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (98 commits)
V4L/DVB (8881): gspca: After 'while (retry--) {...}', retry will be -1 but not 0.
V4L/DVB (8880): PATCH: Fix parents on some webcam drivers
V4L/DVB (8877): b2c2 and bt8xx: udelay to mdelay
V4L/DVB (8876): budget: udelay changed to mdelay
V4L/DVB (8874): gspca: Adjust hstart for sn9c103/ov7630 and update usb-id's.
V4L/DVB (8873): gspca: Bad image offset with rev012a of spca561 and adjust exposure.
V4L/DVB (8872): gspca: Bad image format and offset with rev072a of spca561.
V4L/DVB (8870): gspca: Fix dark room problem with sonixb.
V4L/DVB (8869): gspca: Move the Sonix webcams with TAS5110C1B from sn9c102 to gspca.
V4L/DVB (8868): gspca: Support for vga modes with sif sensors in sonixb.
V4L/DVB (8844): dabusb_fpga_download(): fix a memory leak
V4L/DVB (8843): tda10048_firmware_upload(): fix a memory leak
V4L/DVB (8842): vivi_release(): fix use-after-free
V4L/DVB (8840): dib0700: add basic support for Hauppauge Nova-TD-500 (84xxx)
V4L/DVB (8839): dib0700: add comment to identify 35th USB id pair
V4L/DVB (8837): dvb: fix I2C adapters name size
V4L/DVB (8835): gspca: Same pixfmt as the sn9c102 driver and raw Bayer added in sonixb.
V4L/DVB (8834): gspca: Have a bigger buffer for sn9c10x compressed images.
V4L/DVB (8833): gspca: Cleanup the sonixb code.
V4L/DVB (8832): gspca: Bad pixelformat of vc0321 webcams.
...
It is obviously good for userspace to know up front which
interface modes a given piece of hardware might support (even
if adding such an interface might fail later because of
concurrency issues), so let's make cfg80211 aware of that.
For good measure, disallow adding interfaces in all other
modes so drivers don't forget to announce support for one mode
when they add it.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Stephen Blackheath <tramp.enshrine.stephen@blacksapphire.com>
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The PCI device ids for AMD family 0x11 processors are missing in pci_ids.h.
This patch adds them.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Spencer reported a problem where utime and stime were going negative despite
the fixes in commit b27f03d4bd. The suspected
reason for the problem is that signal_struct maintains its own utime and
stime (of exited tasks), and these are not updated using the new task_utime()
routine. Hence sig->utime can go backwards and cause the same problem
to occur (sig->utime adds tsk->utime and not task_utime()). This patch
fixes the problem.
TODO: using max(task->prev_utime, derived utime) works for now, but a more
generic solution is to implement cputime_max() and use the cputime_gt()
function for comparison.
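The max(prev_utime, derived utime) idea from the TODO can be sketched in
userspace like this. struct task_times and report_utime() are hypothetical
names, and a plain 64-bit counter stands in for cputime_t.

```c
#include <stdint.h>

struct task_times {
	uint64_t prev_utime;	/* largest value reported so far */
};

/*
 * Report max(previous report, newly derived value), so that the value
 * seen by callers never goes backwards.
 */
static uint64_t report_utime(struct task_times *t, uint64_t derived)
{
	if (derived > t->prev_utime)
		t->prev_utime = derived;
	return t->prev_utime;
}
```

A derived value that dips from 100 to 90 is still reported as 100; the
report only moves again once the derived value exceeds 100.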
Reported-by: spencer@bluehost.com
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Some architectures have moved the asm/ into arch/ and some have not.
This patch checks for a.out.h and kvm.h in both places before exporting
the corresponding file from linux/
[dwmw2: simplified a little]
Signed-off-by: Khem Raj <raj.khem@gmail.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
There is an ordering-related problem with the clockevents code, due to which
clockevents_register_device() called after tickless/highres switch
will not work. The new clockevent ends up with clockevents_handle_noop as
event handler, resulting in no timer activity.
The problematic path seems to be
* old device already has hrtimer_interrupt as the event_handler
* new clockevent device registers with a higher rating
* tick_check_new_device() is called
* clockevents_exchange_device() gets called
* old->event_handler is set to clockevents_handle_noop
* tick_setup_device() is called for the new device
* which sets new->event_handler using the old->event_handler which is noop.
Change the ordering so that new device inherits the proper handler.
This causes no issue in the normal case, as most likely all the clockevent
devices are set up before the highres switch. But it can potentially affect
some corner cases where HPET force detect happens after the highres switch.
This was a problem with HPET in MSI mode code that we have been experimenting
with.
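The ordering problem can be reproduced with a tiny userspace model. This is
a sketch of the sequence described above, not the actual change: struct
clock_dev and every function below are hypothetical stand-ins for the
clockevents structures and helpers.

```c
struct clock_dev;
typedef void (*event_handler_t)(struct clock_dev *);

struct clock_dev {
	event_handler_t event_handler;
};

static void handle_noop(struct clock_dev *dev) { (void)dev; }
static void handle_highres(struct clock_dev *dev) { (void)dev; }

/* Retire the old device: its handler becomes the no-op. */
static void exchange_device(struct clock_dev *old)
{
	old->event_handler = handle_noop;
}

/* The new device takes whatever handler the old one has right now. */
static void setup_device(struct clock_dev *newdev, const struct clock_dev *old)
{
	newdev->event_handler = old->event_handler;
}

/* Problematic order: the handler is overwritten before it is inherited. */
static void replace_buggy(struct clock_dev *old, struct clock_dev *newdev)
{
	exchange_device(old);
	setup_device(newdev, old);
}

/* Reordered: inherit first, then retire the old device. */
static void replace_fixed(struct clock_dev *old, struct clock_dev *newdev)
{
	setup_device(newdev, old);
	exchange_device(old);
}
```

In the first order the new device ends up with the no-op handler; in the
second it inherits the high-resolution handler and only the old device is
left with the no-op.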
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently, there are two different fields in the
mv643xx_eth_platform_data struct that together describe the PHY
address -- one field (phy_addr) has the address of the PHY, but if
that address is zero, a second field (force_phy_addr) needs to be
set to distinguish the actual address zero from a zero due to not
having filled in the PHY address explicitly (which should mean
'use the default PHY address').
If we are a bit smarter about the encoding of the phy_addr field,
we can avoid the need for a second field -- this patch does that.
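One possible encoding is sketched below. The macro and helper names are
hypothetical and the marker bit is an assumption chosen for illustration;
the idea is only that an explicitly given address carries a flag, so that
an explicit address zero differs from an unset field.

```c
/*
 * Hypothetical encoding: 0 means "use the default PHY address"; an
 * explicit address is stored with a marker bit set, so that an explicit
 * address 0 is distinguishable from "not filled in".
 */
#define PHY_ADDR_DEFAULT	0
#define PHY_ADDR_EXPLICIT	0x80
#define PHY_ADDR(x)		(PHY_ADDR_EXPLICIT | (x))

static int phy_addr_is_default(int encoded)
{
	return encoded == PHY_ADDR_DEFAULT;
}

/* PHY addresses are 5 bits wide, so mask off everything else. */
static int phy_addr_decode(int encoded)
{
	return encoded & 0x1f;
}
```

Platform code would then write PHY_ADDR(0) for a PHY at address zero and
leave the field untouched to get the default.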
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Which top-level unit's SMI interface to use should be a property of
the top-level unit, not of the individual ports. This patch moves the
->shared_smi pointer from the per-port platform data to the global
platform data.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Simplify receive and transmit queue handling by requiring the set
of queue numbers to be contiguous starting from zero.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
add reserve_region_with_split() to not lose e820 reserved entries if
they overlap with existing IO regions:
with a test case that extends 0xe0000000 - 0xefffffff to 0xdd800000 -
in /proc/iomem we get:
e0000000-efffffff : PCI MMCONFIG 0
e0000000-efffffff : reserved
and in dmesg we get:
found conflict for reserved [dd800000, efffffff], try to reserve with split
__reserve_region_with_split: (PCI Bus #80) [dd000000, ddffffff], res: (reserved) [dd800000, efffffff]
__reserve_region_with_split: (PCI Bus #00) [de000000, dfffffff], res: (reserved) [de000000, efffffff]
initcall pci_subsys_init+0x0/0x121 returned 0 after 381 msecs
various fixes and improvements suggested by Linus.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Also, stop looking at the NAND controller (0x4100) and checking the
device class. For a while during development, all three functions on the
chip had the same ID. We made them fix that fairly promptly, and we can
forget about it now.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Acked-by: Jonathan Corbet <corbet@lwn.net>
This patch adds a generic infrastructure for policy-based dequeueing of
TX packets and provides two policies:
* a simple FIFO policy (which is the default) and
* a priority based policy (set via socket options).
Both policies honour the tx_qlen sysctl for the maximum size of the write
queue (can be overridden via socket options).
The priority policy uses skb->priority internally to assign a u32 priority
identifier, using the same ranking as SO_PRIORITY. The skb->priority field
is set to 0 when the packet leaves DCCP. The priority is supplied as ancillary
data using cmsg(3); the patch also provides the requisite parsing routines.
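The policy-based selection can be illustrated with a small userspace model.
It is a sketch only: struct pkt, the pick_* functions and policy_table are
hypothetical, the queue is a plain array rather than an skb list, and
"larger value wins, first among equals" is an assumption of this example.

```c
#include <stddef.h>
#include <stdint.h>

struct pkt {
	uint32_t priority;
};

/* A policy returns the index of the packet to dequeue next (len > 0). */
typedef size_t (*pick_fn)(const struct pkt *q, size_t len);

/* FIFO: always take the head of the queue. */
static size_t pick_fifo(const struct pkt *q, size_t len)
{
	(void)q;
	(void)len;
	return 0;
}

/* Priority: take the first packet with the highest priority value. */
static size_t pick_prio(const struct pkt *q, size_t len)
{
	size_t i, best = 0;

	for (i = 1; i < len; i++)
		if (q[i].priority > q[best].priority)
			best = i;
	return best;
}

enum { POLICY_FIFO, POLICY_PRIO };

static const pick_fn policy_table[] = {
	[POLICY_FIFO] = pick_fifo,
	[POLICY_PRIO] = pick_prio,
};
```

The sender then calls policy_table[policy](queue, len) without needing to
know which policy the socket has selected.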
Signed-off-by: Tomasz Grobelny <tomasz@grobelny.oswiecenia.net>
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
This extends the packet dequeuing interface of dccp_write_xmit() to allow
1. CCIDs to take care of timing when the next packet may be sent;
2. delayed sending (as before, with an inter-packet gap up to 65.535 seconds).
The main purpose is to take CCID2 out of its polling mode (when it is network-
limited, it tries every millisecond to send, without interruption).
The interface can also be used to support other CCIDs.
The mode of operation for (2) is as follows:
* new packet is enqueued via dccp_sendmsg() => dccp_write_xmit(),
* ccid_hc_tx_send_packet() detects that it may not send (e.g. window full),
* it signals this condition via `CCID_PACKET_WILL_DEQUEUE_LATER',
* dccp_write_xmit() returns without further action;
* after some time the wait-condition for CCID becomes true,
* that CCID schedules the tasklet,
* tasklet function calls ccid_hc_tx_send_packet() via dccp_write_xmit(),
* since the wait-condition is now true, ccid_hc_tx_send_packet() returns "send now",
* packet is sent, and possibly more (since dccp_write_xmit() loops).
Code reuse: the tasklet function calls dccp_write_xmit(), the timer function
reduces to a wrapper around the same code.
If the tasklet finds that the socket is locked, it re-schedules the tasklet
function (not the tasklet) after one jiffy.
Changed DCCP_BUG to dccp_pr_debug when transmit_skb returns an error (e.g. when a
local qdisc is used, NET_XMIT_DROP=1 can be returned for many packets).
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
The problem with Ack Vectors is that
i) their length is variable and can in principle grow quite large,
ii) it is hard to predict exactly how large they will be.
Due to the second point it seems not a good idea to reduce the MPS; in
particular when on average there is enough room for the Ack Vector and an
increase in length is momentarily due to some burst loss, after which the
Ack Vector returns to its normal/average length.
The solution taken by this patch is to subtract a minimum-expected Ack Vector
length from the MPS (previous patch), and to defer any larger Ack Vectors onto
a separate Sync - but only if indeed there is no space left on the skb.
This patch provides the infrastructure to schedule Sync-packets for transporting
(urgent) out-of-band data. Its signalling is quicker than scheduling an Ack, since
it does not need to wait for new application data.
It can thus serve other parts of the DCCP code as well.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
The constants DCCPO_{MIN,MAX}_CCID_SPECIFIC are not used anywhere in the code;
instead, raw numbers are used for the CCID-specific options.
This patch unifies the use of CCID-specific option numbers, by adding symbolic
names reflecting the definitions in RFC 4340, 10.3.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
This patch takes care of initialising and type-checking sysctls related to
feature negotiation. Type checking is important since some of the sysctls
now directly act on the feature-negotiation process.
The sysctls are initialised with the known default values for each feature.
For the type-checking the value constraints from RFC 4340 are used:
* Sequence Window uses the specified Wmin=32, the maximum is ulong (4 bytes),
tested and confirmed that it works up to 4294967295 - for Gbps speed;
* Ack Ratio is between 0 .. 0xffff (2-byte unsigned integer);
* CCIDs are between 0 .. 255;
* request_retries, retries1, retries2 also between 0..255 for good measure;
* tx_qlen is checked to be non-negative;
* sync_ratelimit remains as before.
Further changes:
----------------
Performed s@sysctl_dccp_feat@sysctl_dccp@g since the sysctls are now in feat.c.
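The range checks listed above amount to a table of minimum/maximum values.
A userspace sketch of that idea follows; it does not use the sysctl
machinery, and struct range_rule, the rule names and sysctl_value_ok() are
hypothetical labels picked for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct range_rule {
	const char *name;	/* illustrative label, not a sysctl path */
	unsigned long min;
	unsigned long max;
};

static const struct range_rule rules[] = {
	{ "seq_window",	32, 4294967295UL },	/* Wmin = 32, 4-byte maximum */
	{ "ack_ratio",	 0, 0xffff },		/* 2-byte unsigned integer */
	{ "ccid",	 0, 255 },
	{ "retries",	 0, 255 },
};

/* Accept a value only if a rule exists and the value lies in its range. */
static bool sysctl_value_ok(const char *name, unsigned long val)
{
	size_t i;

	for (i = 0; i < sizeof(rules) / sizeof(rules[0]); i++) {
		if (strcmp(rules[i].name, name) == 0)
			return val >= rules[i].min && val <= rules[i].max;
	}
	return false;	/* unknown name: reject */
}
```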
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>