There is a NUMA memory configuration issue in 2.6.24:
A 2-node machine of ours has got the following memory layout:
Node 0: 0 - 2 Gbytes
Node 0: 4 - 8 Gbytes
Node 1: 8 - 16 Gbytes
Node 0: 16 - 18 Gbytes
"efi_memmap_init()" merges the three last ranges into one.
"register_active_ranges()" is called as follows:
efi_memmap_walk(register_active_ranges, NULL);
i.e. once for the 4 - 18 Gbytes range. It picks up the node
number from the start address, and registers all the memory for
the node #0.
"register_active_ranges()" should be called as follows to
make sure there is no merged address range at its entry:
efi_memmap_walk(filter_memory, register_active_ranges);
"filter_memory()" is similar to "filter_rsvd_memory()",
but the reserved memory ranges are not filtered out.
Signed-off-by: Zoltan Menyhart <Zoltan.Menyhart@bull.net>
Signed-off-by: Tony Luck <tony.luck@intel.com>
* 'docs' of git://git.lwn.net/linux-2.6:
Add additional examples in Documentation/spinlocks.txt
Move sched-rt-group.txt to scheduler/
Documentation: move rpc-cache.txt to filesystems/
Documentation: move nfsroot.txt to filesystems/
Spell out behavior of atomic_dec_and_lock() in kerneldoc
Fix a typo in highres.txt
Fixes to the seq_file document
Fill out information on patch tags in SubmittingPatches
Add the seq_file documentation
git commit 54a0151041 ("asmlinkage_protect
replaces prevent_tail_call") causes this build failure on s390:
AS arch/s390/kernel/entry64.o
In file included from arch/s390/kernel/entry64.S:14:
include/linux/linkage.h:34: error: syntax error in macro parameter list
make[1]: *** [arch/s390/kernel/entry64.o] Error 1
make: *** [arch/s390/kernel] Error 2
and some other architectures. The reason is that some architectures add
the "-traditional" flag to the invocation of $(AS), which disables
variadic macro argument support.
So just surround the new define with an #ifndef __ASSEMBLY__ to prevent
any side effects on asm code.
Cc: Roland McGrath <roland@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
[NETNS][IPV6] tcp - assign the netns for timewait sockets
[IPV4]: Fix byte value boundary check in do_ip_getsockopt().
BNX2X: Correct bringing chip out of reset
[NETFILTER]: nf_nat: autoload IPv4 connection tracking
[NETFILTER]: xt_hashlimit: fix mask calculation
[XFRM]: xfrm_user: fix selector family initialization
rt61pci: rt61pci_beacon_update do not free skb twice
ssb-mipscore: Fix interrupt vectors
ssb-pcicore: Fix IRQ TPS flag handling
mac80211: use short_preamble mode from capability if ERP IE not present
[NET]: Undo code bloat in hot paths due to print_mac().
[TCP]: Don't allow FRTO to take place while MTU is being probed
[TCP]: tcp_simple_retransmit can cause S+L
[TCP]: Fix NewReno's fast rexmit/recovery problems with GSOed skb
[TCP]: Restore 2.6.24 mark_head_lost behavior for newreno/fack
nl80211: fix STA AID bug
b43legacy: fix bcm4303 crash
iwlwifi: fix n-band association problem
ipw2200: set MAC address on radiotap interface
libertas: fix mode initialization problem
Increase the PNP "number of devices" limit. We currently use an unsigned
char, which limits us to 256 devices per protocol. This patch changes that to
an unsigned int.
Not all backends can take advantage of this: we limit ISAPNP to 10 devices in
isapnp_cfg_begin(), and PNPBIOS is limited to 256 devices because the BIOS
interfaces use a one-byte device node number.
But there is no limit on the number of PNPACPI devices we may have. Large HP
Integrity machines have more than 256, which causes the current "unsigned char
number" to wrap around. This causes errors like this:
pnp: PnP ACPI init
kobject_add failed for 00:00 with -EEXIST, don't try to register things with the same name in the same directory.
Call Trace:
[<a000000100010720>] show_stack+0x40/0xa0
[<a0000001000107b0>] dump_stack+0x30/0x60
[<a0000001001dbdf0>] kobject_add+0x290/0x2c0
[<a0000001002bfd40>] device_add+0x160/0x860
[<a0000001002c0470>] device_register+0x30/0x60
[<a00000010026ba70>] __pnp_add_device+0x130/0x180
[<a00000010026bb70>] pnp_add_device+0xb0/0xe0
[<a0000001007f2730>] pnpacpi_add_device+0x510/0x5a0
[<a0000001007f2810>] pnpacpi_add_device_handler+0x50/0x80
This patch increases the limit to fix this PNPACPI problem. It should not
have any adverse effect on ISAPNP or PNPBIOS because their limits are still
enforced in the backends.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It's really a pretty ugly thing to need, and some day it will hopefully
be obviated by teaching gcc about the magic calling conventions for the
low-level system call code, but in the meantime we can at least add big
honking comments about why we need these insane and strange macros.
I took my comments from my version of the macro, but I ended up deciding
to just pick Roland's version of the actual code instead (with his
prettier syntax that uses vararg macros). Thus the previous two commits
that actually implement it.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The prevent_tail_call() macro works around the problem of the compiler
clobbering argument words on the stack, which for asmlinkage functions
is the caller's (user's) struct pt_regs. The tail/sibling-call
optimization is not the only way that the compiler can decide to use
stack argument words as scratch space, which we have to prevent.
Other optimizations can do it too.
Until we have new compiler support to make "asmlinkage" binding on the
compiler's own use of the stack argument frame, we have work around all
the manifestations of this issue that crop up.
More cases seem to be prevented by also keeping the incoming argument
variables live at the end of the function. This makes their original
stack slots attractive places to leave those variables, so the compiler
tends not clobber them for something else. It's still no guarantee, but
it handles some observed cases that prevent_tail_call() did not.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Don't make smp_{r,w,}mb() interpolate a MEMBAR instruction when CONFIG_SMP=n as
SMP memory barries on UP systems should interpolate a compiler barrier only.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use traps 120-126 to emulate atomic cmpxchg32, xchg32, and XOR-, OR-, AND-, SUB-
and ADD-to-memory operations for userspace.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move STACK_TOP_MAX up so that we don't try moving the stack above it as that
causes setup_arg_pages() to malfunction.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Handle update_mmu_cache() being called when current->mm is NULL.
We cache static TLB mappings for the current page table in DAMPR4 and DAMPR5
on the theory that the next data lookup is likely to be in the same general
region, and thus is likely to be mapped by the same page table. However, we
can't get this information if we can't access the appropriate mm_struct.
If current->mm is NULL, we just clear the cache in the knowledge that the TLB
miss handlers will load it.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This deprecates old set/reset_scoop_gpio interfacein favour of
support for generic gpio interface.
It requires gpiolib, so it depends on the previous patch
(gpiolib for SA-1100).
Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This adds gpiolib support for the SA-1100 arch:
- Move all GPIO API functions from generic.c into gpio.c
- Convert all gpio functions into gpiolib callbacks.
Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The allowed values for the MDIV field (Master Clock Division) in the
PMC controller differ between the AT91RM9200 and AT91SAM9/CAP9.
To remove possible confusion, change the definitions to be more explicit.
Also define the Processor Clock Division bits.
Signed-off-by: Andrew Victor <linux@maxim.org.za>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Allow the use of SACK and window scaling when syncookies are used
and the client supports tcp timestamps. Options are encoded into
the timestamp sent in the syn-ack and restored from the timestamp
echo when the ack is received.
Based on earlier work by Glenn Griffin.
This patch avoids increasing the size of structs by encoding TCP
options into the least significant bits of the timestamp and
by not using any 'timestamp offset'.
The downside is that the timestamp sent in the packet after the synack
will increase by several seconds.
changes since v1:
don't duplicate timestamp echo decoding function, put it into ipv4/syncookie.c
and have ipv6/syncookies.c use it.
Feedback from Glenn Griffin: fix line indented with spaces, kill redundant if ()
Reviewed-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
1) Standlaone ip6_null_entry is no longer needed as it is replaced by
the ip6_null_entry member of ipv6 (instance of struct netns_ipv6) in
struct net (as a result of Network Namespaces patches).
2) These 3 methods from this same header are not defined anywhere:
ip6_rt_addr_add(), ip6_rt_addr_del(), rt6_sndmsg()
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
SKF_ADF_NLATTR searches for a netlink attribute, which avoids manually
parsing and walking attributes. It takes the offset at which to start
searching in the 'A' register and the attribute type in the 'X' register
and returns the offset in the 'A' register. When the attribute is not
found it returns zero.
A top-level attribute can be located using a filter like this
(example for nfnetlink, using struct nfgenmsg):
...
{
/* A = offset of first attribute */
.code = BPF_LD | BPF_IMM,
.k = sizeof(struct nlmsghdr) + sizeof(struct nfgenmsg)
},
{
/* X = CTA_PROTOINFO */
.code = BPF_LDX | BPF_IMM,
.k = CTA_PROTOINFO,
},
{
/* A = netlink attribute offset */
.code = BPF_LD | BPF_B | BPF_ABS,
.k = SKF_AD_OFF + SKF_AD_NLATTR
},
{
/* Exit if not found */
.code = BPF_JMP | BPF_JEQ | BPF_K,
.k = 0,
.jt = <error>
},
...
A nested attribute below the CTA_PROTOINFO attribute would then
be parsed like this:
...
{
/* A += sizeof(struct nlattr) */
.code = BPF_ALU | BPF_ADD | BPF_K,
.k = sizeof(struct nlattr),
},
{
/* X = CTA_PROTOINFO_TCP */
.code = BPF_LDX | BPF_IMM,
.k = CTA_PROTOINFO_TCP,
},
{
/* A = netlink attribute offset */
.code = BPF_LD | BPF_B | BPF_ABS,
.k = SKF_AD_OFF + SKF_AD_NLATTR
},
...
The data of an attribute can be loaded into 'A' like this:
...
{
/* X = A (attribute offset) */
.code = BPF_MISC | BPF_TAX,
},
{
/* A = skb->data[X + k] */
.code = BPF_LD | BPF_B | BPF_IND,
.k = sizeof(struct nlattr),
},
...
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes two unused method declarations in
include/net/ndisc.h: ndisc_forwarding_on(void) and
ndisc_forwarding_off(void);
Also igmp6_cleanup(void) appears twice in this header, so one
igmp6_cleanup(void) declaration is removed.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The sk_filter function is too big to be inlined. This saves 2296 bytes
of text on allyesconfig.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some minor style cleanups:
* Move __KERNEL__ definitions to one place in filter.h
* Use const for sk_filter_len
* Line wrapping
* Put EXPORT_SYMBOL next to function definition
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Updates based on the "Intel® Itanium® Architecture Software Developer's Manual
Specification Update October 2007".
24869911.pdf
Signed-off-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Add kprobe-booster support on ia64.
Kprobe-booster improves the performance of kprobes by eliminating single-step,
where possible. Currently, kprobe-booster is implemented on x86 and x86-64.
This is an ia64 port.
On ia64, kprobe-booster executes a copied bundle directly, instead of single
stepping. Bundles which have B or X unit and which may cause an exception
(including break) are not executed directly. And also, to prevent hitting
break exceptions on the copied bundle, only the hindmost kprobe is executed
directly if several kprobes share a bundle and are placed in different slots.
Note: set_brl_inst() is used for preparing an instruction buffer(it does not
modify any active code), so it does not need any atomic operation.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: bibo,mao <bibo.mao@intel.com>
Cc: Rusty Lynch <rusty.lynch@intel.com>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
when compile 2.6.25-rc8-mm1, below warning happend.
because walk_page_range pass argument as "const struct mm*",
but pgd_offset() receive as "struct mm*".
CC mm/pagewalk.o
mm/pagewalk.c: In function 'walk_page_range':
mm/pagewalk.c:111: warning: passing argument 1 of 'pgd_offset' discards qualifiers from pointer target type
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This attached patch significantly shrinks boot memory allocation on ia64.
It does this by not allocating per_cpu areas for cpus that can never
exist.
In the case where acpi does not have any numa node description of the
cpus, I defaulted to assigning the first 32 round-robin on the known
nodes.. For the !CONFIG_ACPI I used for_each_possible_cpu().
Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Add new API to MAC80211 to allow low level driver to
notify MAC with driver status.
Signed-off-by: Mohamed Abbas <mabbas@linux.intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This adds support for block based I/O to SSB.
This is needed in order to efficiently support PIO data
transfers to the card.
The block-I/O support is only compiled, if it's selected by the
weird driver that needs it. So there's no overhead for sane devices.
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch is necessary for the upcoming Accesspoint patch for p54.
Signed-off-by: Christian Lamparter <chunkeey@web.de>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Turn the SSB bus suspend mechanism upside down.
Instead of deciding by an internal reference count when to suspend/resume,
let the parent bus call us in their suspend/resume routine.
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch adds assocation capability, timestamp (tsf) and beacon interval
to bss_conf. This is required for successful assocation of iwlwifi drivers
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch eliminates the use of conf_ht, replacing it with
bss_info_changed.
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This fixes Bugzilla #10384
tcp_simple_retransmit does L increment without any checking
whatsoever for overflowing S+L when Reno is in use.
The simplest scenario I can currently think of is rather
complex in practice (there might be some more straightforward
cases though). Ie., if mss is reduced during mtu probing, it
may end up marking everything lost and if some duplicate ACKs
arrived prior to that sacked_out will be non-zero as well,
leading to S+L > packets_out, tcp_clean_rtx_queue on the next
cumulative ACK or tcp_fastretrans_alert on the next duplicate
ACK will fix the S counter.
More straightforward (but questionable) solution would be to
just call tcp_reset_reno_sack() in tcp_simple_retransmit but
it would negatively impact the probe's retransmission, ie.,
the retransmissions would not occur if some duplicate ACKs
had arrived.
So I had to add reno sacked_out reseting to CA_Loss state
when the first cumulative ACK arrives (this stale sacked_out
might actually be the explanation for the reports of left_out
overflows in kernel prior to 2.6.23 and S+L overflow reports
of 2.6.24). However, this alone won't be enough to fix kernel
before 2.6.24 because it is building on top of the commit
1b6d427bb7 ([TCP]: Reduce sacked_out with reno when purging
write_queue) to keep the sacked_out from overflowing.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Reported-by: Alessandro Suardi <alessandro.suardi@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
flush_cache_vmap / flush_cache_vunmap were calling flush_cache_all which -
having been deprecated - turned into a nop ...
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
x86: fix 64-bit asm NOPS for CONFIG_GENERIC_CPU
x86: fix call to set_cyc2ns_scale() from time_cpufreq_notifier()
revert "x86: tsc prevent time going backwards"
The 'disable_cb' callback is designed as an optimization to tell the host
we don't need callbacks now. As it is not reliable, the debug check is
overzealous: it can happen on two CPUs at the same time. Document this.
Even if it were reliable, the virtio_net driver doesn't disable
callbacks on transmit so the START_USE/END_USE debugging reentrance
protection can be easily tripped even on UP.
Thanks to Balaji Rao for the bug report and testing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
CC: Balaji Rao <balajirrao@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Two functions in include/scsi/sas_ata.h don't have static inlines
leading to problems if they're built in:
On Thu, 2008-04-03 at 14:06 +0200, Toralf Förster wrote:
> drivers/scsi/mvsas.o: In function `sas_ata_init_host_and_port':
> mvsas.c:(.text+0x0): multiple definition of `sas_ata_init_host_and_port'
> drivers/scsi/libsas/built-in.o:(.text+0x37f4): first defined here
> drivers/scsi/mvsas.o: In function `sas_ata_task_abort':
> mvsas.c:(.text+0x7): multiple definition of `sas_ata_task_abort'
> drivers/scsi/libsas/built-in.o:(.text+0x37fb): first defined here
> make[2]: *** [drivers/scsi/built-in.o] Error 1
> make[1]: *** [drivers/scsi] Error 2
> make: *** [drivers] Error 2
Add the correct static inline modifiers.
Tested-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Every current transport class calls transport_container_release but
ignores the return value. This is catastrophic if it returns an error
because the containers are part of a global list and the next action of
almost every transport class is to free the memory used by the
container.
Fix this by making transport_container_release a void, but making it BUG
if attribute_container_release returns an error ... this catches the
root cause of a system panic much earlier. If we don't do this, we get
an eventual BUG when the attribute container list notices the corruption
caused by the freed memory it's still referencing.
Also made attribute_container_release __must_check as a reminder.
Cc: Greg KH <greg@kroah.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This adds scsi_build_sense_buffer, a simple helper function to build
sense data in a buffer.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This is needed by things like USB storage that want to set up static
commands for later use at start of day.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
LLDs need to copies data between the SG table in struct scsi_cmnd and
liner buffer. So they use the helper functions like
sg_copy_from_buffer(scsi_sglist(sc), scsi_sg_count(sc), buf, buflen)
sg_copy_to_buffer(scsi_sglist(sc), scsi_sg_count(sc), buf, buflen)
This patch just adds wrapper functions:
scsi_sg_copy_from_buffer(sc, buf, buflen)
scsi_sg_copy_to_buffer(sc, buf, buflen)
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch adds new three helper functions to copy data between an SG
list and a linear buffer.
- sg_copy_from_buffer copies data from linear buffer to an SG list
- sg_copy_to_buffer copies data from an SG list to a linear buffer
When the APIs copy data from a linear buffer to an SG list,
flush_kernel_dcache_page is called. It's not necessary for everyone
but it's a no-op on most architectures and in general the API is not
used in performance critical path.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The problem is that serveral drivers are sending a target reset from the
device reset handler, and if we have multiple devices a target reset gets
sent for each device when only one would be sufficient. And if we do a target
reset it affects all the commands on the target so the device reset handler
code only cleaning up one devices's commands makes programming the driver a
little more difficult than it should be.
This patch adds a target reset handler, which drivers can use to send
a target reset. If successful it cleans up the commands for a devices
accessed through that starget.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Add new option MT_ST_SILI to enable setting the SILI bit in reads in variable
block mode. If SILI is set, reading a block shorter than the byte count does
not result in CHECK CONDITION. The length of the block is determined using the
residual count from the HBA. Avoiding the REQUEST SENSE command for every
block speeds up some real applications considerably.
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>