lock_kiocb() was introduced to serialize retrying and cancellation. In the
process of doing so it tried to sleep waiting for KIF_LOCKED while holding
the ctx_lock spinlock. Recent fixes have ensured that multiple concurrent
retries won't be attempted for a given iocb. Cancel has other problems and
has no significant in-tree users that have been complaining about it. So
for the immediate future we'll revert sleeping with the lock held and will
address proper cancellation and retry serialization in the future.
Signed-off-by: Zach Brown <zach.brown@oracle.com>
Acked-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Bind communication identifiers to a device to support device removal.
Export per HCA CM devices to userspace.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
This makes call_rcu() keep track of how many events there are on the RCU
list, and cause a reschedule event when the list gets too long.
This helps keep RCU event lists down.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add kernel/user ABI structures for marshalling poll CQ, request CQ
notification, post send, post receive, post SRQ receive, create AH and
destroy AH commands. These commands allow us to support userspace
verbs for devices that can't perform these operations directly from
userspace (eg the PathScale HCA).
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Give each device a uverbs_cmd_mask, so that a low-level driver can
control which methods may be called on behalf of userspace.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add abi_version attribute to uverbs class devices to allow for
ABI versioning of device-specific interfaces.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Introduce new userspace verbs ABI version 3. This eliminates some
unneeded commands, and adds support for user-created completion
channels. This cleans up problems with file leaks on error paths, and
also makes sure that file descriptors are always installed into the
correct process.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
It seems that all the list_*_rcu primitives are missing a memory barrier
on the very first dereference. For example,
#define list_for_each_rcu(pos, head) \
for (pos = (head)->next; prefetch(pos->next), pos != (head); \
pos = rcu_dereference(pos->next))
It will go something like:
pos = (head)->next
prefetch(pos->next)
pos != (head)
do stuff
We're missing a barrier here.
pos = rcu_dereference(pos->next)
fetch pos->next
barrier given by rcu_dereference(pos->next)
store pos
Without the missing barrier, the pos->next value may turn out to be stale.
In fact, if "do stuff" were also dereferencing pos and relying on
list_for_each_rcu to provide the barrier then it may also break.
So here is a patch to make sure that we have a barrier for the first
element in the list.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
... otherwise, things like alpha and sparc64 break and break
badly. They define cpu_possible_map to something else in smp.h
*AFTER* having included cpumask.h.
If that puppy is a macro, expansion will happen at the actual
caller, when we'd already seen #define cpu_possible_map ... and we will
get the right thing used.
As an inline helper it will be tokenized before we get to that
define and that's it; no matter what we define later, it won't affect
anything. We get modules with dependency on cpu_possible_map instead
of the right symbol (phys_cpu_present_map in case of sparc64), or outright
link errors if they are built-in.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix copy and paste error in jiffies_to_AHZ conversion which leads to wrong
BSD accounting information on alpha and ia64 when
CONFIG_BSD_PROCESS_ACCT_V3 is turned on.
Also update comment to match reorganised header files.
Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Richard Purdie
Add a function to allow machines to set the parent of the pxa
framebuffer device. This means the power up/down sequence can be
controlled where required by the machine.
Update spitz to use the new function, fixing a compile error.
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
The __inwc/__outwc calls are capable of creating
LDRH and STRH instructions with offsets over 8bits
as GCC does not have a constraint for an 8bit
offset.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The hairy fast allocator in the sparc64 PCI IOMMU code
has a hard limit of 256 pages. Certain devices can
exceed this when performing very large I/Os.
So replace with a more simple allocator, based largely
upon the arch/ppc64/kernel/iommu.c code.
Signed-off-by: David S. Miller <davem@davemloft.net>
The only real user of this file outside platforms/iseries was
drivers/net/iseries_veth.c but all it wanted was ISERIES_HV_ADDR()
so we move that to abs_addr.h (and lowercase it).
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
All the PCI controller drivers were doing the same thing
setting up the IOMMU software state, put it all in one spot.
Signed-off-by: David S. Miller <davem@davemloft.net>
Original patch by Harald Welte, with feedback from Herbert Xu
and testing by Sbastien Bernard.
EBTABLES, ARP tables, and IP/IP6 tables all assume that cpus
are numbered linearly. That is not necessarily true.
This patch fixes that up by calculating the largest possible
cpu number, and allocating enough per-cpu structure space given
that.
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch from Ben Dooks
include/asm-arm/arch-s3c2410/hardware.h was missing
the definition for s3c2440_set_dsc()
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When netpoll is not being used, the macro that
defines the removed routing netpoll_poll_lock
defines the return as zero, but the real
routine returns a `void *`
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch from Liam Girdwood
This patch updates the pxa2xx channel map registers definitions in
pxa-regs.h
Changes:-
o Added description for SSP2 registers
o Added definitions for SSP3 registers
Signed-off-by:Liam Girdwood <liam.girdwood@wolfsonmicro.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Glibc is about to get some new high precision timer stuff that relies on
the standard timebase of the PPC architecture.
However, some (rare & old) CPUs do not have such timebase and it is a
bit annoying to have your stuff just crash because you are running on
the wrong CPU...
This exposes to userland a CPU feature bit that tells that the current
processor doesn't have a standard timebase. It's negative logic so that
glibc will still "just work" on older kernels (it will just be unhappy
on those old CPUs but that doesn't really matter as distro tend to
update glibc & kernel at the same time).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Interestingly enough, ppc32 had broken timekeeping for ages... It
worked, but probably drifted a bit more than could be explained by the
actual bad precision of the timebase calibration. We discovered that
recently when somebody figured out that the common code was using
CLOCK_TICK_RATE to correct the timekeeing, and ppc32 had a completely
bogus value for it.
This patch turns it into something saner. Probably not as good as doing
something based on the actual timebase frequency precision but I'll
leave that sort of math to others. This at least makes it better for
the common HZ values.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This removes three headers from include/asm-ppc64 that are now in
include/asm-powerpc and are sufficiently similar that they can be
used with ARCH=ppc64.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Recent commits upstream have changed files which are currently
duplicated in arch/powerpc and include/asm-powerpc. This updates
them with the corresponding changes.
Signed-off-by: Paul Mackerras <paulus@samba.org>
This fixes up a variety of minor problems in compiling with ARCH=ppc
arising from using the merged versions of various header files.
A lot of the changes are just adding #include <asm/machdep.h> to
files that use ppc_md or smp_ops_t.
This also arranges for us to use semaphore.c, vecemu.c, vector.S and
fpu.S from arch/powerpc/kernel when compiling with ARCH=ppc.
Signed-off-by: Paul Mackerras <paulus@samba.org>
This is required to avoid unloading a module that has active timewait
sockets, such as DCCP.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the ability of changing the state a TCP connection. I know
that this must be used with care but it's required to provide a complete
conntrack creation via conntrack_netlink. So I'll document this aspect on
the upcoming docs.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Initially we used 64bit counters for conntrack-based accounting, since we
had no event mechanism to tell userspace that our counters are about to
overflow. With nfnetlink_conntrack, we now have such a event mechanism and
thus can save 16bytes per connection.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
To keep consistency, the TCP private protocol information is nested
attributes under CTA_PROTOINFO_TCP. This way the sequence of attributes to
access the TCP state information looks like here below:
CTA_PROTOINFO
CTA_PROTOINFO_TCP
CTA_PROTOINFO_TCP_STATE
instead of:
CTA_PROTOINFO
CTA_PROTOINFO_TCP_STATE
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Without this #include, __be16 is not defined and userspace programs
will break.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When 'rustynat' was merged in 2.6.12, the use of the "helper" pointer of
struct ipt_nat_info was obsoleted, but the pointer not removed from the
struct.
This patch removes the pointer, thereby yet again shrinking struct
ip_conntrack.
Discovered-by: Rusty Russell <rusty@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
As Henrik Nordstrom pointed out, all our efforts with "split endian" (i.e.
host byte order tags, net byte order values) are useless, unless a parser
can determine whether an attribute is nested or not.
This patch steals the highest bit of nfattr.nfa_type to indicate whether
the data payload contains a nested nfattr (1) or not (0).
This will break userspace compatibility, but luckily no kernel with
nfnetlink was released so far.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>