Commit graph

189269 commits

Author SHA1 Message Date
Akinobu Mita
984b3f5746 bitops: rename for_each_bit() to for_each_set_bit()
Rename for_each_bit to for_each_set_bit in the kernel source tree.  To
permit for_each_clear_bit(), should that ever be added.

The patch includes a macro to map the old for_each_bit() onto the new
for_each_set_bit().  This is a (very) temporary thing to ease the migration.

[akpm@linux-foundation.org: add temporary for_each_bit()]
Suggested-by: Alexey Dobriyan <adobriyan@gmail.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Artem Bityutskiy <dedekind@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-06 11:26:23 -08:00
David Miller
e3cb91ce1a timbgpio: fix build
Use of get_irq_chip_data() et al.  requires including linux/irq.h

Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Richard Röjfors <richard.rojfors@pelagicore.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-06 11:26:23 -08:00
Al Viro
781b16775b Fix a dumb typo - use of & instead of &&
We managed to lose O_DIRECTORY testing due to a stupid typo in commit
1f36f774b2 ("Switch !O_CREAT case to use of do_last()")

Reported-by: Walter Sheets <w41ter@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-06 10:54:48 -08:00
Martyn Welch
cda61c9420 [WATCHDOG] gef_wdt: Author corrections following split of GE Fanuc joint venture
This patch corrects author and copyright notices  following the split-up of
the GE Fanuc joint venture.

Signed-off-by: Martyn Welch <martyn.welch@ge.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2010-03-06 15:13:04 +00:00
Naga Chumbalkar
ec26985be4 [WATCHDOG] iTCO_wdt: clean up probe(), modify err msg
It's possible that the platform is not allowing reboot via TCO timer
expiration.

Also, differentiate between not finding a chipset that has TCO, and the case
where TCO is present but the driver fails to initialize for some reason.

Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2010-03-06 15:12:55 +00:00
Wim Van Sebroeck
f538ed9ea0 [WATCHDOG] ep93xx: watchdog timer driver for TS-72xx SBCs cleanup
Clean-up driver:
* make release the reverse of probe so that both are consistent
* add WDIOC_GETSTATUS & WDIOC_GETBOOTSTATUS ioctls.

Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2010-03-06 15:12:40 +00:00
Marc Zyngier
66aaa7a559 [WATCHDOG] support for max63xx watchdog timer chips
This driver adds support for the max63{69,70,71,72,73,74} family of
watchdog timer chips.

It has been tested on an Arcom Zeus (max6369).

Signed-off-by: Marc Zyngier <maz@misterjones.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2010-03-06 15:12:29 +00:00
Mika Westerberg
12926dc416 [WATCHDOG] ep93xx: added platform side support for TS-72xx WDT driver
Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2010-03-06 15:12:17 +00:00
Mika Westerberg
c90bf2aa94 [WATCHDOG] ep93xx: implemented watchdog timer driver for TS-72xx SBCs
Technologic Systems TS-72xx SBCs have external glue logic
CPLD which includes watchdog timer. This driver implements
kernel support for that.

Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2010-03-06 15:12:03 +00:00
Joern Engel
c2f843f03d [LogFS] Change magic number
Many changes were made during development that could result in old
versions of mklogfs and the kernel code being subtly incompatible.
Not being a friend of subtleties, I hereby change the magic number.
Any old version of mklogfs is now guaranteed to fail.
2010-03-06 10:03:11 +01:00
Joern Engel
9cf05b416d [LogFS] Remove h_version field
Incompatible change: h_compr is moved up so the padding is all in one chunk.
2010-03-06 10:01:46 +01:00
Jeff Layton
c8634fd311 cifs: add a CIFSSMBUnixQFileInfo function
...to allow us to get unix attrs via filehandle.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-03-06 04:38:25 +00:00
Jeff Layton
bcd5357f43 cifs: add a CIFSSMBQFileInfo function
...to get inode attributes via filehandle instead of by path.

In some places, we need to revalidate an inode on an open filehandle,
but we can't necessarily guarantee that the dentry associated with it
will still be valid. When we have an open filehandle already, it makes
more sense to do a filehandle based operation anyway.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-03-06 04:38:02 +00:00
Jeff Layton
df2cf170c8 cifs: overhaul cifs_revalidate and rename to cifs_revalidate_dentry
cifs_revalidate is renamed to cifs_revalidate_dentry as a later patch
will add a by-filehandle variant.

Add a new "invalid_mapping" flag to the cifsInodeInfo that indicates
that the pagecache is considered invalid. Add a new routine to check
inode attributes whenever they're updated and set that flag if the inode
has changed on the server.

cifs_revalidate_dentry is then changed to just update the attrcache if
needed and then to zap the pagecache if it's not valid.

There are some other behavior changes in here as well. Open files are
now allowed to have their caches invalidated. I see no reason why we'd
want to keep stale data around just because a file is open. Also,
cifs_revalidate_cache uses the server_eof for revalidating the file
size since that should more closely match the size of the file on the
server.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-03-06 04:37:05 +00:00
Takahiro Yasui
f070304094 dm raid1: fix deadlock when suspending failed device
To prevent deadlock, bios in the hold list should be flushed before
dm_rh_stop_recovery() is called in mirror_suspend().

The recovery can't start because there are pending bios and therefore
dm_rh_stop_recovery deadlocks.

When there are pending bios in the hold list, the recovery waits for
the completion of the bios after recovery_count is acquired.
The recovery_count is released when the recovery finished, however,
the bios in the hold list are processed after dm_rh_stop_recovery() in
mirror_presuspend(). dm_rh_stop_recovery() also acquires recovery_count,
then deadlock occurs.

Signed-off-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
2010-03-06 02:32:35 +00:00
Mike Snitzer
924e600d41 dm: eliminate some holes data structures
Eliminate a 4-byte hole in 'struct dm_io_memory' by moving 'offset' above the
'ptr' to which it applies (size reduced from 24 to 16 bytes).  And by
association, 1-4 byte hole is eliminated in 'struct dm_io_request' (size
reduced from 56 to 48 bytes).

Eliminate all 6 4-byte holes and 1 cache-line in 'struct dm_snapshot' (size
reduced from 392 to 368 bytes).

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-03-06 02:32:33 +00:00
Peter Rajnoha
3abf85b5b5 dm ioctl: introduce flag indicating uevent was generated
Set a new DM_UEVENT_GENERATED_FLAG when returning from ioctls to
indicate that a uevent was actually generated.  This tells the userspace
caller that it may need to wait for the event to be processed.

Signed-off-by: Peter Rajnoha <prajnoha@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-03-06 02:32:31 +00:00
Mikulas Patocka
a97f925a32 dm: free dm_io before bio_endio not after
Free the dm_io structure before calling bio_endio() instead of after it,
to ensure that the io_pool containing it is not referenced after it is
freed.

This partially fixes a problem described here
  https://www.redhat.com/archives/dm-devel/2010-February/msg00109.html

thread 1:
bio_endio(bio, io_error);
/* scheduling happens */
					thread 2:
					close the device
					remove the device
thread 1:
free_io(md, io);

Thread 2, when removing the device, sees non-empty md->io_pool (because the
io hasn't been freed by thread 1 yet) and may crash with BUG in mempool_free.
Thread 1 may also crash, when freeing into a nonexisting mempool.

To fix this we must make sure that bio_endio() is the last call and
the md structure is not accessed afterwards.

There is another bio_endio in process_barrier, but it is called from the thread
and the thread is destroyed prior to freeing the mempools, so this call is
not affected by the bug.

A similar bug exists with module unloads - the module may be unloaded
immediately after bio_endio - but that is more difficult to fix.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-03-06 02:32:29 +00:00
Nikanth Karthikesan
8215d6ec5f dm table: remove unused dm_get_device range parameters
Remove unused parameters(start and len) of dm_get_device()
and fix the callers.

Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-03-06 02:32:27 +00:00
Mike Snitzer
0f3649a9e3 dm ioctl: only issue uevent on resume if state changed
Only issue a uevent on a resume if the state of the device changed,
i.e. if it was suspended and/or its table was replaced.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-03-06 02:32:24 +00:00
Mikulas Patocka
ede5ea0b8b dm raid1: always return error if all legs fail
If all mirror legs fail, always return an error instead of holding the
bio, even if the handle_errors option was set.  At present it is the
responsibility of the driver underneath us to deal with retries,
multipath etc.

The patch adds the bio to the failures list instead of holding it
directly.  do_failures tests first if all legs failed and, if so,
returns the bio with -EIO.  If any leg is still alive and handle_errors
is set, do_failures calls hold_bio.

Reviewed-by: Takahiro Yasui <tyasui@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-03-06 02:32:22 +00:00
Kiyoshi Ueda
fb61264297 dm mpath: refactor pg_init
This patch pulls the pg_init path activation code out of
process_queued_ios() into a new function.

No functional change.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-03-06 02:32:18 +00:00
Kiyoshi Ueda
2bded7bd7e dm mpath: wait for pg_init completion when suspending
When suspending the device we must wait for all I/O to complete, but
pg-init may be still in progress even after flushing the workqueue
for kmpath_handlerd in multipath_postsuspend.

This patch waits for pg-init completion correctly in
multipath_postsuspend().

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-03-06 02:32:13 +00:00
Kiyoshi Ueda
d0259bf0ee dm mpath: hold io until all pg_inits completed
m->queue_io is set to block processing I/Os, and it needs to be kept
while pg-init, which issues multiple path activations, is in progress.
But m->queue is cleared when a path activation completes without error
in pg_init_done(), even while other path activations are in progress.
That may cause undesired -EIO on paths which are not complete activation.

This patch fixes that by not clearing m->queue_io until all path
activations complete.

(Before the hardware handlers were moved into the SCSI layer, pg_init
only used one path.)

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-03-06 02:30:02 +00:00
Kiyoshi Ueda
fce323dd68 dm mpath: avoid storing private suspended state
'suspended' flag in struct multipath was introduced to check whether
the multipath target is in suspended state, but the same check is
done through dm_suspended() now, so remove the flag and related code.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-03-06 02:29:59 +00:00
Mike Snitzer
c53a381efb dm: document when snapshot has finished merging
Update Documentation/device-mapper/snapshot.txt to cover "How to
determine when a snapshot has finished merging".

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-03-06 02:29:56 +00:00
Kiyoshi Ueda
ecdb2e257a dm table: remove dm_get from dm_table_get_md
Remove the dm_get() in dm_table_get_md() because dm_table_get_md() could
be called from presuspend/postsuspend, which are called while
mapped_device is in DMF_FREEING state, where dm_get() is not allowed.

Justification for that is the lifetime of both objects: As far as the
current dm design/implementation, mapped_device is never freed while
targets are doing something, because dm core waits for targets to become
quiet in dm_put() using presuspend/postsuspend.  So targets should be
able to touch mapped_device without holding reference count of the
mapped_device, and we should allow targets to touch mapped_device even
if it is in DMF_FREEING state.

Backgrounds:
I'm trying to remove the multipath internal queue, since dm core now has
a generic queue for request-based dm.  In the patch-set, the multipath
target wants to request dm core to start/stop queue.  One of such
start/stop requests can happen during postsuspend() while the target
waits for pg-init to complete, because the target stops queue when
starting pg-init and tries to restart it when completing pg-init.  Since
queue belongs to mapped_device, it involves calling dm_table_get_md()
and dm_put().  On the other hand, postsuspend() is called in dm_put()
for mapped_device which is in DMF_FREEING state, and that triggers
BUG_ON(DMF_FREEING) in the 2nd dm_put().

I had tried to solve this problem by changing only multipath not to
touch mapped_device which is in DMF_FREEING state, but I couldn't and I
came up with a question why we need dm_get() in dm_table_get_md().

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-03-06 02:29:52 +00:00
Moger, Babu
f7b934c812 dm mpath: skip activate_path for failed paths
This patch adds two minor fixes while processing device mapper path activation.

Skip failed paths while calling activate_path.  If the path is already failed
then activate_path will fail for sure. We don't have to call in that case. In
some case this might cause prolonged retries unnecessarily.

Change the misleading message if the path being activated fails with SCSI_DH_NOSYS.

Signed-off-by: Babu Moger <babu.moger@lsi.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-03-06 02:29:49 +00:00
Moger, Babu
83c0d5d538 dm mpath: pass struct pgpath to pg init done
This patch removes some unnecessary argument casting. There is no
functional change with this patch.

Passes 'struct pgpath' through to pg_init_done() instead of the enclosed
'struct dm_path'.

Tested the changes with LSI storage..

CC: Chandra Seetharaman <chandra.seetharaman@us.ibm.com>
Signed-off-by: Babu Moger <babu.moger@lsi.com>
Acked-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-03-06 02:29:45 +00:00
Tim Bird
0e95017355 function-graph: Add tracing_thresh support to function_graph tracer
Add support for tracing_thresh to the function_graph tracer.  This
version of this feature isolates the checks into new entry and
return functions, to avoid adding more conditional code into the
main function_graph paths.

When the tracing_thresh is set and the function graph tracer is
enabled, only the functions that took longer than the time in
microseconds that was set in tracing_thresh are recorded. To do this
efficiently, only the function exits are recorded:

 [tracing]# echo 100 > tracing_thresh
 [tracing]# echo function_graph > current_tracer
 [tracing]# cat trace
 # tracer: function_graph
 #
 # CPU  DURATION                  FUNCTION CALLS
 # |     |   |                     |   |   |   |
  1) ! 119.214 us  |  } /* smp_apic_timer_interrupt */
  1)   <========== |
  0) ! 101.527 us  |              } /* __rcu_process_callbacks */
  0) ! 126.461 us  |            } /* rcu_process_callbacks */
  0) ! 145.111 us  |          } /* __do_softirq */
  0) ! 149.667 us  |        } /* do_softirq */
  0) ! 168.817 us  |      } /* irq_exit */
  0) ! 248.254 us  |    } /* smp_apic_timer_interrupt */

Also, add support for specifying tracing_thresh on the kernel
command line.  When used like so: "tracing_thresh=200 ftrace=function_graph"
this can be used to analyse system startup.  It is important to disable
tracing soon after boot, in order to avoid losing the trace data.

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Tim Bird <tim.bird@am.sony.com>
LKML-Reference: <4B87098B.4040308@am.sony.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-03-05 21:20:57 -05:00
Arnaldo Carvalho de Melo
1acaa1b2d9 tracing: Update the comm field in the right variable in update_max_tr
The latency output showed:

 #    | task: -3 (uid:0 nice:0 policy:1 rt_prio:99)

The comm is missing in the "task:" and it looks like a minus 3 is
the output. The correct display should be:

 #    | task: migration/0-3 (uid:0 nice:0 policy:1 rt_prio:99)

The problem is that the comm is being stored in the wrong data
structure. The max_tr.data[cpu] is what stores the comm, not the
tr->data[cpu].

Before this patch the max_tr.data[cpu]->comm was zeroed and the /debug/trace
ended up showing just the '-' sign followed by the pid.

Also remove a needless initialization of max_data.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1267824230-23861-1-git-send-email-acme@infradead.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-03-05 21:12:08 -05:00
Steven Rostedt
a094fe04c7 function-graph: Use comment notation for func names of dangling '}'
When a '}' does not have a matching function start, the name is printed
within parenthesis. But this makes it confusing between ending '}'
and function starts. This patch makes the function name appear in C comment
notation.

Old view:
 3)   1.281 us    |            } (might_fault)
 3)   3.620 us    |          } (filldir)
 3)   5.251 us    |        } (call_filldir)
 3)               |        call_filldir() {
 3)               |          filldir() {

New view:
 3)   1.281 us    |            } /* might_fault */
 3)   3.620 us    |          } /* filldir */
 3)   5.251 us    |        } /* call_filldir */
 3)               |        call_filldir() {
 3)               |          filldir() {

Requested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-03-05 21:11:13 -05:00
Steven Rostedt
801c29fd1f function-graph: Fix unused reference to ftrace_set_func()
The declaration of ftrace_set_func() is at the start of the ftrace.c file
and wrapped with a #ifdef CONFIG_FUNCTION_GRAPH condition. If function
graph tracing is enabled but CONFIG_DYNAMIC_FTRACE is not, a warning
about that function being declared static and unused is given.

This really should have been placed within the CONFIG_FUNCTION_GRAPH
condition that uses ftrace_set_func().

Moving the declaration down fixes the warning and makes the code cleaner.

Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-03-05 21:00:30 -05:00
Rafael J. Wysocki
bb910a7040 PCI/PM Runtime: Make runtime PM of PCI devices inactive by default
Make the run-time power management of PCI devices be inactive by
default by calling pm_runtime_forbid() for each PCI device during its
initialization.  This setting may be overriden by the user space with
the help of the /sys/devices/.../power/control interface.

That's necessary to avoid breakage on systems where ACPI-based
wake-up is known to fail for some devices.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-03-05 15:09:39 -08:00
Stephen Rothwell
f1a3d57213 ceph: update for write_inode API change
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Sage Weil <sage@newdream.net>
2010-03-05 14:49:41 -08:00
Linus Torvalds
64096c1741 Merge branch 'slab-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6
* 'slab-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
  SLUB: Fix per-cpu merge conflict
  failslab: add ability to filter slab caches
  slab: fix regression in touched logic
  dma kmalloc handling fixes
  slub: remove impossible condition
  slab: initialize unused alien cache entry as NULL at alloc_alien_cache().
  SLUB: Make slub statistics use this_cpu_inc
  SLUB: this_cpu: Remove slub kmem_cache fields
  SLUB: Get rid of dynamic DMA kmalloc cache allocation
  SLUB: Use this_cpu operations in slub
2010-03-05 14:35:40 -08:00
Breno Leitao
3a22813a5a s2io: Fixing debug message
Currently s2io is dumping debug messages using the interface name
before it was allocated, showing a message like the following:

s2io: eth%d: Ring Mem PHY: 0x7ef80000
s2io: s2io_reset: Resetting XFrame card eth%d

This patch just fixes it, printing the pci bus information for
the card instead of the interface name.

Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-05 14:00:19 -08:00
Jesse Brandeburg
a80483d372 e1000e: fix packet corruption and tx hang during NFSv2
when receiving a particular type of NFS v2 UDP traffic, the hardware could
DMA some bad data and then hang, possibly corrupting memory.

Disable the NFS parsing in this hardware, verified to fix the bug.

Originally reported and reproduced by RedHat's Neil Horman
CC: nhorman@tuxdriver.com
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-05 14:00:18 -08:00
David Dillow
5fe88eae26 typhoon: fix incorrect use of smp_wmb()
The typhoon driver was incorrectly using smp_wmb() to order memory
accesses against IO to the NIC in a few instances. Use wmb() instead,
which is required to actually order between memory types.

Signed-off-by: David Dillow <dave@thedillows.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-05 14:00:18 -08:00
Jeff Garzik
d17792ebdf ethtool: Add direct access to ops->get_sset_count
On 03/04/2010 09:26 AM, Ben Hutchings wrote:
> On Thu, 2010-03-04 at 00:51 -0800, Jeff Kirsher wrote:
>> From: Jeff Garzik<jgarzik@redhat.com>
>>
>> This patch is an alternative approach for accessing string
>> counts, vs. the drvinfo indirect approach.  This way the drvinfo
>> space doesn't run out, and we don't break ABI later.
> [...]
>> --- a/net/core/ethtool.c
>> +++ b/net/core/ethtool.c
>> @@ -214,6 +214,10 @@ static noinline int ethtool_get_drvinfo(struct net_device *dev, void __user *use
>>   	info.cmd = ETHTOOL_GDRVINFO;
>>   	ops->get_drvinfo(dev,&info);
>>
>> +	/*
>> +	 * this method of obtaining string set info is deprecated;
>> +	 * consider using ETHTOOL_GSSET_INFO instead
>> +	 */
>
> This comment belongs on the interface (ethtool.h) not the
> implementation.

Debatable -- the current comment is located at the callsite of
ops->get_sset_count(), which is where an implementor might think to add
a new call.  Not all the numeric fields in ethtool_drvinfo are obtained
from ->get_sset_count().

Hence the "some" in the attached patch to include/linux/ethtool.h,
addressing your comment.

> [...]
>> +static noinline int ethtool_get_sset_info(struct net_device *dev,
>> +                                          void __user *useraddr)
>> +{
> [...]
>> +	/* calculate size of return buffer */
>> +	for (i = 0; i<  64; i++)
>> +		if (sset_mask&  (1ULL<<  i))
>> +			n_bits++;
> [...]
>
> We have a function for this:
>
> 	n_bits = hweight64(sset_mask);

Agreed.

I've attached a follow-up patch, which should enable my/Jeff's kernel
patch to be applied, followed by this one.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-05 14:00:17 -08:00
Jeff Garzik
723b2f57ad ethtool: Add direct access to ops->get_sset_count
This patch is an alternative approach for accessing string
counts, vs. the drvinfo indirect approach.  This way the drvinfo
space doesn't run out, and we don't break ABI later.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-05 14:00:17 -08:00
David Brown
4b79a1aedc net: smc91x: Support Qualcomm MSM development boards.
Signed-off-by: David Brown <davidb@quicinc.com>
Signed-off-by: Daniel Walker <dwalker@codeaurora.org>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-05 13:56:40 -08:00
Zhu Yi
a3a858ff18 net: backlog functions rename
sk_add_backlog -> __sk_add_backlog
sk_add_backlog_limited -> sk_add_backlog

Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-05 13:34:03 -08:00
Zhu Yi
2499849ee8 x25: use limited socket backlog
Make x25 adapt to the limited socket backlog change.

Cc: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-05 13:34:02 -08:00
Zhu Yi
53eecb1be5 tipc: use limited socket backlog
Make tipc adapt to the limited socket backlog change.

Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-05 13:34:02 -08:00
Zhu Yi
50b1a782f8 sctp: use limited socket backlog
Make sctp adapt to the limited socket backlog change.

Cc: Vlad Yasevich <vladislav.yasevich@hp.com>
Cc: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-05 13:34:01 -08:00
Zhu Yi
79545b6819 llc: use limited socket backlog
Make llc adapt to the limited socket backlog change.

Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-05 13:34:01 -08:00
Zhu Yi
55349790d7 udp: use limited socket backlog
Make udp adapt to the limited socket backlog change.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-05 13:34:00 -08:00
Zhu Yi
6b03a53a5a tcp: use limited socket backlog
Make tcp adapt to the limited socket backlog change.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-05 13:34:00 -08:00
Zhu Yi
8eae939f14 net: add limit for socket backlog
We got system OOM while running some UDP netperf testing on the loopback
device. The case is multiple senders sent stream UDP packets to a single
receiver via loopback on local host. Of course, the receiver is not able
to handle all the packets in time. But we surprisingly found that these
packets were not discarded due to the receiver's sk->sk_rcvbuf limit.
Instead, they are kept queuing to sk->sk_backlog and finally ate up all
the memory. We believe this is a secure hole that a none privileged user
can crash the system.

The root cause for this problem is, when the receiver is doing
__release_sock() (i.e. after userspace recv, kernel udp_recvmsg ->
skb_free_datagram_locked -> release_sock), it moves skbs from backlog to
sk_receive_queue with the softirq enabled. In the above case, multiple
busy senders will almost make it an endless loop. The skbs in the
backlog end up eat all the system memory.

The issue is not only for UDP. Any protocols using socket backlog is
potentially affected. The patch adds limit for socket backlog so that
the backlog size cannot be expanded endlessly.

Reported-by: Alex Shi <alex.shi@intel.com>
Cc: David Miller <davem@davemloft.net>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru
Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Vlad Yasevich <vladislav.yasevich@hp.com>
Cc: Sridhar Samudrala <sri@us.ibm.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Allan Stephens <allan.stephens@windriver.com>
Cc: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-05 13:33:59 -08:00