Commit graph

145274 commits

Author SHA1 Message Date
Steven Rostedt
4671c79408 tracing: add trace_set_clr_event to export event enabling function
Other parts of the kernel may need to be able to enable or disable
specific events. Especially parts that create trace events.

[ Impact: allow enabling of trace events by those that create the event ]

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-08 16:30:26 -04:00
Steven Rostedt
29f93943d1 tracing: initialize return value for __ftrace_set_clr_event
Commit 8f31bfe538
tracing/events: clean up for ftrace_set_clr_event()

Moved out the code for ftrace_set_clr_event into a helper funciton but
did not initialize the return value. As a result, we do not warn about
a typo in the echoing of events in set_event.

This patch restores the old warning:

 # echo foobar > set_event
-bash: echo: write error: Invalid argument

[ Impact: restore warning of invalid entries to set_event ]

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-08 16:06:47 -04:00
David S. Miller
e81963b180 ipv4: Make INET_LRO a bool instead of tristate.
This code is used as a library by several device drivers,
which select INET_LRO.

If some are modules and some are statically built into the
kernel, we get build failures if INET_LRO is modular.

Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-08 12:45:26 -07:00
Jean Delvare
848ddf116b hwmon: (w83781d) Fix W83782D support (NULL pointer dereference)
Commit 360782dde0 (hwmon: (w83781d) Stop
abusing struct i2c_client for ISA devices) broke W83782D support for
devices connected on the ISA bus. You will hit a NULL pointer
dereference as soon as you read any device attribute. Other devices,
and W83782D devices on the SMBus, aren't affected.

Reported-by: Michel Abraham
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Tested-by: Michel Abraham
2009-05-08 20:27:28 +02:00
Luca Tettamanti
b9008708f2 hwmon: (asus_atk0110) Fix compiler warning
atk_sensor_type is only used when DEBUG is defined.

Signed-off-by: Luca Tettamanti <kronos.it@gmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2009-05-08 20:27:28 +02:00
Peter Horton
cd1a6de7d4 mtd: fix timeout in M25P80 driver
Extend erase timeout in M25P80 SPI Flash driver.

The M25P80 drivers fails erasing sectors on a M25P128 because the ready
wait timeout is too short. Change the timeout from a simple loop count to a
suitable number of seconds.

Signed-off-by: Peter Horton <zero@colonel-panic.org>
Tested-by: Martin Michlmayr <tbm@cyrius.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-05-08 13:51:53 +01:00
Li Zefan
c142b15dc5 tracing/events: simplify system_enable_read()
A smarter way to figure out the output of an enable file.

[ Impact: clean up ]

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <4A0399A5.2080603@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-08 14:00:36 +02:00
Li Zefan
8f31bfe538 tracing/events: clean up for ftrace_set_clr_event()
Add a helper function __ftrace_set_clr_event(), and replace some
ftrace_set_clr_event() calls with this helper, thus we don't need any
kstrdup() or kmalloc().

As a side effect, this patch fixes an issue in self tests code, which is
similar to the one fixed in commit d6bf81ef0f
("tracing: append ":*" to internal setting of system events")

It's a small issue and won't cause any bug in fact, but we should do things
right anyway.

[ Impact: prevent spurious event-enabling in tracing self-tests ]

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <4A03998E.3020503@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-08 14:00:35 +02:00
Hidetoshi Seto
e5299926d7 x86: MCE: make cmci_discover_lock irq-safe
Lockdep reports the warning below when Li tries to offline one cpu:

[  110.835487] =================================
[  110.835616] [ INFO: inconsistent lock state ]
[  110.835688] 2.6.30-rc4-00336-g8c9ed89 #52
[  110.835757] ---------------------------------
[  110.835828] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
[  110.835908] swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
[  110.835982]  (cmci_discover_lock){?.+...}, at: [<ffffffff80236dc0>] cmci_clear+0x30/0x9b

cmci_clear() can be called via smp_call_function_single().

It is better to disable interrupt while holding cmci_discover_lock,
to turn it into an irq-safe lock - we can deadlock otherwise.

[ Impact: fix possible deadlock in the MCE code ]

Reported-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4A03ED38.8000700@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reported-by: Shaohua Li<shaohua.li@intel.com>
2009-05-08 11:03:26 +02:00
Jeremy Fitzhardinge
33df4db04a x86: xen, i386: reserve Xen pagetables
The Xen pagetables are no longer implicitly reserved as part of the other
i386_start_kernel reservations, so make sure we explicitly reserve them.
This prevents them from being released into the general kernel free page
pool and reused.

[ Impact: fix Xen guest crash ]

Also-Bisected-by: Bryan Donlan <bdonlan@gmail.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Xen-devel <xen-devel@lists.xensource.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <4A032EEC.30509@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-08 10:49:11 +02:00
Takashi Iwai
5dd17cb992 ALSA: hda - Fix line-in on Mac Mini Core2 Duo
BIOS on Mac Mini Core2 Duo sets both INPUT and OUTPUT pinctl bits to
the line-in jack, and it confuses the driver as if it's a valid input.
This patch adds the check of OUTPUT bit so that the driver fixes the
invalid pin setup.

Tested-by: Tino Keitel <tino.keitel@gmx.de>
Cc: <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2009-05-08 07:55:10 +02:00
Huang Ying
6407df5ca5 x86, kexec: fix crashdump panic with CONFIG_KEXEC_JUMP
Tim Starling reported that crashdump will panic with kernel compiled
with CONFIG_KEXEC_JUMP due to null pointer deference in
machine_kexec_32.c: machine_kexec(), when deferencing
kexec_image. Refering to:

http://bugzilla.kernel.org/show_bug.cgi?id=13265

This patch fixes the BUG via replacing global variable reference:
kexec_image in machine_kexec() with local variable reference: image,
which is more appropriate, and will not be null.

Same BUG is in machine_kexec_64.c too, so fixed too in the same way.

[ Impact: fix crash on kexec ]

Reported-by: Tim Starling <tstarling@wikimedia.org>
Signed-off-by: Huang Ying <ying.huang@intel.com>
LKML-Reference: <1241751101.6259.85.camel@yhuang-dev.sh.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-05-07 22:01:05 -07:00
Jan Beulich
4983439676 x86-64: finish cleanup_highmaps()'s job wrt. _brk_end
With the introduction of the .brk section, special care must be taken
that no unused page table entries remain if _brk_end and _end are
separated by a 2M page boundary. cleanup_highmap() runs very early and
hence cannot take care of that, hence potential entries needing to be
removed past _brk_end must be cleared once the brk allocator has done
its job.

[ Impact: avoids undesirable TLB aliases ]

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-05-07 21:51:34 -07:00
Jan Beulich
6143876651 x86: fix boot hang in early_reserve_e820()
If the first non-reserved (sub-)range doesn't fit the size requested,
an endless loop will be entered. If a range returned from
find_e820_area_size() turns out insufficient in size, the range must
be skipped before calling the function again.

[ Impact: fixes boot hang on some platforms ]

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-05-07 21:42:39 -07:00
Jack Morgenstein
2b6b7d4be4 IB/mlx4: Don't overwrite fast registration page list when posting work request
The low-level mlx4 driver modified the page-list addresses for fast
register work requests post send to big-endian, and set a "present"
bit.  This caused problems later when the consumer attempted to unmap
the pages using the page-list (using the list addresses which were
assumed to be still in CPU-endian order).  Fix the mlx4 driver to
allocate two buffers and use a private buffer for the hardware-format
bus addresses.

This patch fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1571>,
an NFS/RDMA server crash.  The cause of the crash was found by Vu Pham
of Mellanox.  The fix is along the lines suggested by Steve Wise in
comment #21 in bug 1571.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-07 21:35:13 -07:00
Linus Torvalds
d7a5926978 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: (32 commits)
  [CIFS] Fix double list addition in cifs posix open code
  [CIFS] Allow raw ntlmssp code to be enabled with sec=ntlmssp
  [CIFS] Fix SMB uid in NTLMSSP authenticate request
  [CIFS] NTLMSSP reenabled after move from connect.c to sess.c
  [CIFS] Remove sparse warning
  [CIFS] remove checkpatch warning
  [CIFS] Fix final user of old string conversion code
  [CIFS] remove cifs_strfromUCS_le
  [CIFS] NTLMSSP support moving into new file, old dead code removed
  [CIFS] Fix endian conversion of vcnum field
  [CIFS] Remove trailing whitespace
  [CIFS] Remove sparse endian warnings
  [CIFS] Add remaining ntlmssp flags and standardize field names
  [CIFS] Fix build warning
  cifs: fix length handling in cifs_get_name_from_search_buf
  [CIFS] Remove unneeded QuerySymlink call and fix mapping for unmapped status
  [CIFS] rename cifs_strndup to cifs_strndup_from_ucs
  Added loop check when mounting DFS tree.
  Enable dfs submounts to handle remote referrals.
  [CIFS] Remove older session setup implementation
  ...
2009-05-07 21:13:24 -07:00
Steve French
90e4ee5d31 [CIFS] Fix double list addition in cifs posix open code
Remove adding open file entry twice to lists in the file
Do not fill file info twice in case of posix opens and creates

Signed-off-by: Shirish Pargaonkar <shirishp@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-05-08 03:04:30 +00:00
Jussi Kivilinna
9e68177ef9 Input: ff-memless - fix signed to unsigned bit overflow
When userspace sets effect->u.rumble.strong_magnitude to 0x8001 or
larger, ml_combine_effects() would always return strong_magnitude
0xffff.

Problem is that 'gain' is passed in as signed integer. Multiplying
magnitude (__u16) with gain (int) causes magnitude read as signed and
results negative value (with magnitude > 0x8000). This signed integer
is then divided and value, still negative, converted to 32bit unsigned
integer. Finally checking combine overflow min(new+old, 0xffff) gives
out 0xffff.

Fix is to simply change 'gain' to unsigned int.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: Anssi Hannula <anssi.hannula@gmail.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-05-07 19:04:16 -07:00
Tim Cole
d07a9cba6b Input: joydev - blacklist digitizers
BTN_TOUCH is not set by the wacom driver which causes it to be handled by the
joydev driver while the resulting device is broken. This causes problems with
applications that try to use a joystick device.

Ubuntu BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/300143

Signed-off-by: Tim Cole <tim.cole@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Amit Kucheria <amit.kucheria@canonical.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-05-07 19:04:03 -07:00
Steven Rostedt
74f4fd2166 ring-buffer: change WARN_ON from checking preempt_count to preemptible
There's a WARN_ON in the ring buffer code that makes sure preemption
is disabled. It checks "!preempt_count()". But when CONFIG_PREEMPT is not
enabled, preempt_count() is always zero, and this will trigger the warning.

[ Impact: prevent false warning on non preemptible kernels ]

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-07 20:01:11 -04:00
Steven Rostedt
7da3046d6c ring-buffer: add total count in ring-buffer-benchmark
It is nice to see the overhead of the benchmark test when tracing is
disabled. That is, we turn off the ring buffer just to see what the
cost of running the loop that calls into the ring buffer is.

Currently, if no entries wer made, we get 0. This is not informative.
This patch changes it to check if we had any "missed" (non recorded)
events. If so, a total count is also reported.

[ Impact: evaluate the over head of the ring buffer benchmark test ]

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-07 19:52:20 -04:00
Ashish Karkare
9b05126baa net: remove stale reference to fastroute from Kconfig help text
Signed-off-by: Ashish Karkare <akarkare@marvell.com>
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-07 16:31:01 -07:00
David Howells
8c9ed899b4 NOMMU: Don't check vm_region::vm_start is page aligned in add_nommu_region()
Don't check vm_region::vm_start is page aligned in add_nommu_region() because
the region may reflect some non-page-aligned mapped file, such as could be
obtained from RomFS XIP.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-07 12:03:41 -07:00
Linus Torvalds
ee7fee0b91 Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md:
  md: remove rd%d links immediately after stopping an array.
  md: remove ability to explicit set an inactive array to 'clean'.
  md: constify VFTs
  md: tidy up status_resync to handle large arrays.
  md: fix some (more) errors with bitmaps on devices larger than 2TB.
  md/raid10: don't clear bitmap during recovery if array will still be degraded.
  md: fix loading of out-of-date bitmap.
2009-05-07 12:01:41 -07:00
Linus Torvalds
8a0a9bd4db random: make get_random_int() more random
It's a really simple patch that basically just open-codes the current
"secure_ip_id()" call, but when open-coding it we now use a _static_
hashing area, so that it gets updated every time.

And to make sure somebody can't just start from the same original seed of
all-zeroes, and then do the "half_md4_transform()" over and over until
they get the same sequence as the kernel has, each iteration also mixes in
the same old "current->pid + jiffies" we used - so we should now have a
regular strong pseudo-number generator, but we also have one that doesn't
have a single seed.

Note: the "pid + jiffies" is just meant to be a tiny tiny bit of noise. It
has no real meaning. It could be anything. I just picked the previous
seed, it's just that now we keep the state in between calls and that will
feed into the next result, and that should make all the difference.

I made that hash be a per-cpu data just to avoid cache-line ping-pong:
having multiple CPU's write to the same data would be fine for randomness,
and add yet another layer of chaos to it, but since get_random_int() is
supposed to be a fast interface I did it that way instead. I considered
using "__raw_get_cpu_var()" to avoid any preemption overhead while still
getting the hash be _mostly_ ping-pong free, but in the end good taste won
out.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-07 11:59:06 -07:00
Steven Rostedt
0574ea421b ring-buffer: only periodically call cond_resched to ring-buffer-benchmark
Calling cond_resched at every iteration of the loop adds a bit of
overhead to the benchmark.

This patch does two things.

1) only calls cond-resched when CONFIG_PREEMPT is not enabled
2) only calls cond-resched after so many traces has been performed.

[ Impact: less overhead to the ring-buffer-benchmark ]

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-07 14:20:28 -04:00
Linus Torvalds
2c66fa7e6b Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
  [ARM] 5507/1: support R_ARM_MOVW_ABS_NC and MOVT_ABS relocation types
  [ARM] 5506/1: davinci: DMA_32BIT_MASK --> DMA_BIT_MASK(32)
  i.MX31: Disable CPU_32v6K in mx3_defconfig.
  mx3fb: Fix compilation with CONFIG_PM
  mx27ads: move PBC mapping out of vmalloc space
  MXC: remove BUG_ON in interrupt handler
  mx31: remove mx31moboard_defconfig
  ARM: ARCH_MXC should select HAVE_CLK
  mxc : BUG in imx_dma_request
  mxc : Clean up properly when imx_dma_free() used without imx_dma_disable()
  [ARM] mv78xx0: update defconfig
  [ARM] orion5x: update defconfig
  [ARM] Kirkwood: update defconfig
  [ARM] Kconfig typo fix:  "PXA930" -> "CPU_PXA930".
  [ARM] S3C2412: Add missing cache flush in suspend code
  [ARM] S3C: Add UDIVSLOT support for newer UARTS
  [ARM] S3C64XX: Add S3C64XX_PA_IIS{0,1} to <mach/map.h>
2009-05-07 10:54:32 -07:00
Steven Rostedt
65b7724204 tracing: have menu default enabled when kernel debug is configured
Tracing can be very helpful to debug the kernel. When DEBUG_KERNEL is
enabled it is nice to enable the trace menu as well.

This patch only make the tracing menu enabled by default, it does not
make any of the tracers enabled. And the menu is only enabled by
default if DEBUG_KERNEL is enabled.

[ Impact: show tracing options to those debugging the kernel ]

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-07 12:49:27 -04:00
Paul Gortmaker
ae51e60984 [ARM] 5507/1: support R_ARM_MOVW_ABS_NC and MOVT_ABS relocation types
From: Bruce Ashfield <bruce.ashfield@windriver.com>

To fully support the armv7-a instruction set/optimizations, support
for the R_ARM_MOVW_ABS_NC and R_ARM_MOVT_ABS relocation types is
required.

The MOVW and MOVT are both load-immediate instructions, MOVW loads 16
bits into the bottom half of a register, and MOVT loads 16 bits into the
top half of a register.

The relocation information for these instructions has a full 32 bit
value, plus an addend which is stored in the 16 immediate bits in the
instruction itself.  The immediate bits in the instruction are not
contiguous (the register # splits it into a 4 bit and 12 bit value),
so the addend has to be extracted accordingly and added to the value.
The value is then split and put into the instruction; a MOVW uses the
bottom 16 bits of the value, and a MOVT uses the top 16 bits.

Signed-off-by: David Borman <david.borman@windriver.com>
Signed-off-by: Bruce Ashfield <bruce.ashfield@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-05-07 17:21:01 +01:00
Steven Rostedt
d6bf81ef0f tracing: append ":*" to internal setting of system events
The system enabling of events uses the same code as the set_event file.
It passes in the name of the system to the parser and that will enable
all the events that has that system as a name.

The problem is that it will also enable events with the same name as the
system.

If you have system name foo, and system name bar, but within the system
bar, there exists an event called foo. By setting the system name foo,
you will also be enabling the event foo in the system bar. This is not
an expected result.

The solution is to pass in "foo:*", which will only enable the system
foo and not events called foo.

[ Impact: prevent accidental enabling of events with same name as a system ]

Reported-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-07 11:49:35 -04:00
Steven Rostedt
29c8000ee7 ring-buffer: remove complex calculations in ring-buffer-test
Ingo Molnar thought that the code to calculate the time in cond_resched
is a bit too ugly and is not needed. This patch removes it and replaces
it with a simple call to cond_resched. I kept the comment that explains
the reason for the cond_resched.

[ Impact: remove ugly code ]

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-07 11:16:18 -04:00
Kevin Hilman
a029b706d3 [ARM] 5506/1: davinci: DMA_32BIT_MASK --> DMA_BIT_MASK(32)
As per commit 284901a90a, use
DMA_BIT_MASK(n)

Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-05-07 14:44:47 +01:00
Ingo Molnar
0ad5d703c6 Merge branch 'tracing/hw-branch-tracing' into tracing/core
Merge reason: this topic is ready for upstream now. It passed
              Oleg's review and Andrew had no further mm/*
              objections/observations either.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-07 13:36:22 +02:00
Ingo Molnar
44347d947f Merge branch 'linus' into tracing/core
Merge reason: tracing/core was on a .30-rc1 base and was missing out on
              on a handful of tracing fixes present in .30-rc5-almost.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-07 11:17:34 +02:00
Li Zefan
d94fc523f3 tracing/events: fix concurrent access to ftrace_events list, fix
In filter_add_subsystem_pred() we should release event_mutex before
calling filter_free_subsystem_preds(), since both functions hold
event_mutex.

[ Impact: fix deadlock when writing invalid pred into subsystem filter ]

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: tzanussi@gmail.com
Cc: a.p.zijlstra@chello.nl
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
LKML-Reference: <4A028993.7020509@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-07 10:07:28 +02:00
Frederic Weisbecker
5928c3cc0f tracing/filters: support for operator reserved characters in strings
When we set a filter for an event, such as:

echo "name == my_lock_name" > \
	/debug/tracing/events/lockdep/lock_acquired/filter

then the following order of token type is parsed:

- space
- operator
- parentheses
- operand

Because the operators and parentheses have a higher precedence
than the operand characters, which is normal, then we can't
use any string containing such special characters:

()=<>!&|

To get this support and also avoid ambiguous intepretation from
the parser or the human, we can do it using double quotes so that
we keep the usual languages habits.

Then after this patch you can still declare string condition like
before:

echo name == myname

But if you want to compare against a string containing an operator
character, you can use double quotes:

echo 'name == "&myname"'

Don't forget to include the whole expression into single quotes or
the double ones will be eaten by echo.

[ Impact: support strings with special characters for tracing filters ]

Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Zhaolei <zhaolei@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-05-07 10:05:57 +02:00
Frederic Weisbecker
e8808c1019 tracing/filters: support for filters of dynamic sized arrays
Currently the filtering infrastructure supports well the
numeric types and fixed sized array types.

But the recently added __string() field uses a specific
indirect offset mechanism which requires a specific
predicate. Until now it wasn't supported.

This patch adds this support and implies very few changes,
only a new predicate is needed, the management of this specific
field can be done through the usual string helpers in the
filtering infrastructure.

[ Impact: support all kinds of strings in the tracing filters ]

Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Zhaolei <zhaolei@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-05-07 10:05:57 +02:00
Steven Rostedt
8ae79a138e tracing: add hierarchical enabling of events
With the current event directory, you can only enable individual events.
The file debugfs/tracing/set_event is used to be able to enable or
disable several events at once. But that can still be awkward.

This patch adds hierarchical enabling of events. That is, each directory
in debugfs/tracing/events has an "enable" file. This file can enable
or disable all events within the directory and below.

 # echo 1 > /debugfs/tracing/events/enable

will enable all events.

 # echo 1 > /debugfs/tracing/events/sched/enable

will enable all events in the sched subsystem.

 # echo 1 > /debugfs/tracing/events/enable
 # echo 0 > /debugfs/tracing/events/irq/enable

will enable all events, but then disable just the irq subsystem events.

When reading one of these enable files, there are four results:

 0 - all events this file affects are disabled
 1 - all events this file affects are enabled
 X - there is a mixture of events enabled and disabled
 ? - this file does not affect any event

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-06 23:11:42 -04:00
Steven Rostedt
9456f0fa6d tracing: reset ring buffer when removing modules with events
Li Zefan found that there's a race using the event ids of events and
modules. When a module is loaded, an event id is incremented. We only
have 16 bits for event ids (65536) and there is a possible (but highly
unlikely) race that we could load and unload a module that registers
events so many times that the event id counter overflows.

When it overflows, it then restarts and goes looking for available
ids. An id is available if it was added by a module and released.

The race is if you have one module add an id, and then is removed.
Another module loaded can use that same event id. But if the old module
still had events in the ring buffer, the new module's call back would
get bogus data.  At best (and most likely) the output would just be
garbage. But if the module for some reason used pointers (not recommended)
then this could potentially crash.

The safest thing to do is just reset the ring buffer if a module that
registered events is removed.

[ Impact: prevent unpredictable results of event id overflows ]

Reported-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <49FEAFD0.30106@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-06 23:11:41 -04:00
Steven Rostedt
71e1c8ac42 tracing: update sample with TRACE_INCLUDE_FILE
When creating trace events for ftrace, the header file with the TRACE_EVENT
macros must also have a macro called TRACE_SYSTEM. This macro describes
the name of the system the TRACE_EVENTS are defined for. It also doubles
as a way for the define_trace.h file to include the file that included
it.

For example:

in irq.h

 #define TRACE_SYSTEM irq

[...]

 #include <trace/define_trace.h>

The define_trace will use TRACE_SYSTEM to include irq.h. But if the name
of the trace system does not match the name of the trace header file,
one can override it with:

Which will change define_trace.h to inclued foo_trace.h instead of foo.h

The sample comments this, but people that use the sample code will more
likely use the code and not read the comments. This patch changes the
sample code to use the TRACE_INCLUDE_FILE to better show developers how to
use it.

[ Impact: make sample less confusing to developers ]

Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-06 23:10:42 -04:00
NeilBrown
c4647292fd md: remove rd%d links immediately after stopping an array.
md maintains link in sys/mdXX/md/ to identify which device has
which role in the array. e.g.
   rd2 -> dev-sda

indicates that the device with role '2' in the array is sda.

These links are only present when the array is active.  They are
created immediately after ->run is called, and so should be removed
immediately after ->stop is called.
However they are currently removed a little bit later, and it is
possible for ->run to be called again, thus adding these links, before
they are removed.

So move the removal earlier so they are consistently only present when
the array is active.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-05-07 12:51:06 +10:00
NeilBrown
5bf2959754 md: remove ability to explicit set an inactive array to 'clean'.
Being able to write 'clean' to an 'array_state' of an inactive array
to activate it in 'clean' mode is both unnecessary and inconvenient.

It is unnecessary because the same can be achieved by writing
'active'.  This activates and array, but it still remains 'clean'
until the first write.

It is inconvenient because writing 'clean' is more often used to
cause an 'active' array to revert to 'clean' mode (thus blocking
any writes until a 'write-pending' is promoted to 'active').

Allowing 'clean' to both activate an array and mark an active array as
clean can lead to races:  One program writes 'clean' to mark the
active array as clean at the same time as another program writes
'inactive' to deactivate (stop) and active array.  Depending on which
writes first, the array could be deactivated and immediately
reactivated which isn't what was desired.

So just disable the use of 'clean' to activate an array.

This avoids a race that can be triggered with mdadm-3.0 and external
metadata, so it suitable for -stable.

Reported-by: Rafal Marszewski <rafal.marszewski@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@kernel.org>
Signed-off-by: NeilBrown <neilb@suse.de>
2009-05-07 12:50:57 +10:00
Jan Engelhardt
110518bccf md: constify VFTs
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2009-05-07 12:49:37 +10:00
NeilBrown
dd71cf6b27 md: tidy up status_resync to handle large arrays.
Two problems in status_resync.
1/ It still used Kilobytes as the basic block unit, while most code
   now uses sectors uniformly.
2/ It doesn't allow for the possibility that max_sectors exceeds
   the range of "unsigned long".

So
 - change "max_blocks" to "max_sectors", and store sector numbers
   in there and in 'resync'
 - Make 'rt' a 'sector_t' so it can temporarily hold the number of
   remaining sectors.
 - use sector_div rather than normal division.
 - change the magic '100' used to preserve precision to '32'.
   + making it a power of 2 makes division easier
   + it doesn't need to be as large as it was chosen when we averaged
     speed over the entire run.  Now we average speed over the last 30
     seconds or so.

Reported-by: "Mario 'BitKoenig' Holbe" <Mario.Holbe@TU-Ilmenau.DE>
Signed-off-by: NeilBrown <neilb@suse.de>
2009-05-07 12:49:35 +10:00
NeilBrown
db305e507d md: fix some (more) errors with bitmaps on devices larger than 2TB.
If a write intent bitmap covers more than 2TB, we sometimes work with
values beyond 32bit, so these need to be sector_t.  This patches
add the required casts to some unsigned longs that are being shifted
up.

This will affect any raid10 larger than 2TB, or any raid1/4/5/6 with
member devices that are larger than 2TB.

Signed-off-by: NeilBrown <neilb@suse.de>
Reported-by: "Mario 'BitKoenig' Holbe" <Mario.Holbe@TU-Ilmenau.DE>
Cc: stable@kernel.org
2009-05-07 12:49:06 +10:00
NeilBrown
1805556912 md/raid10: don't clear bitmap during recovery if array will still be degraded.
If we have a raid10 with multiple missing devices, and we recover just
one of these to a spare, then we risk (depending on the bitmap and
array chunk size) clearing bits of the bitmap for which recovery isn't
complete (because a device is still missing).

This can lead to a subsequent "re-add" being recovered without
any IO happening, which would result in loss of data.

This patch takes the safe approach of not clearing bitmap bits
if the array will still be degraded.

This patch is suitable for all active -stable kernels.

Cc: stable@kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
2009-05-07 12:48:10 +10:00
NeilBrown
b74fd2826c md: fix loading of out-of-date bitmap.
When md is loading a bitmap which it knows is out of date, it fills
each page with 1s and writes it back out again.  However the
write_page call makes used of bitmap->file_pages and
bitmap->last_page_size which haven't been set correctly yet.  So this
can sometimes fail.

Move the setting of file_pages and last_page_size to before the call
to write_page.

This bug can cause the assembly on an array to fail, thus making the
data inaccessible.  Hence I think it is a suitable candidate for
-stable.

Cc: stable@kernel.org
Reported-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: NeilBrown <neilb@suse.de>
2009-05-07 12:47:19 +10:00
Lennert Buytenhek
b805007545 net: update skb_recycle_check() for hardware timestamping changes
Commit ac45f602ee ("net: infrastructure
for hardware time stamping") added two skb initialization actions to
__alloc_skb(), which need to be added to skb_recycle_check() as well.

Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Patrick Ohly <patrick.ohly@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-06 16:49:18 -07:00
Michael Chan
581daf7e00 bnx2: Fix panic in bnx2_poll_work().
Add barrier() to bnx2_get_hw_{tx|rx}_cons() to fix this issue:

http://bugzilla.kernel.org/show_bug.cgi?id=12698

This issue was reported by multiple i386 users.  Without barrier(),
the compiled code looks like the following where %eax contains the
address of the tx_cons or rx_cons in the DMA status block.  The
status block contents can change between the cmpb and the movzwl
instruction.  The driver would crash if the value was not 0xff during
the cmpb instruction, but changed to 0xff during the movzwl
instruction.

6828:	80 38 ff             	cmpb   $0xff,(%eax)
682b:	0f b7 10             	movzwl (%eax),%edx

With the added barrier(), the compiled code now looks correct:

683d:	0f b7 10             	movzwl (%eax),%edx
6840:	0f b6 c2             	movzbl %dl,%eax
6843:	3d ff 00 00 00       	cmp    $0xff,%eax

Thanks to Pascal de Bruijn <pmjdebruijn@pcode.nl> for reporting the
problem and Holger Noefer <hnoefer@pironet-ndh.com> for patiently
testing test patches for us.

Also updated version to 2.0.1.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-06 16:46:47 -07:00
Patrick McHardy
6473990c7f net-sched: fix bfifo default limit
When no limit is given, the bfifo uses a default of tx_queue_len * mtu.
Packets handled by qdiscs include the link layer header, so this should
be taken into account, similar to what other qdiscs do.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-06 16:45:07 -07:00