Commit graph

10,164 commits

Author SHA1 Message Date
Rusty Russell
3c3ed482dc lguest: Simplify device initialization.
We used to notify the Host every time we updated a device's status.  However,
it only really needs to know when we're resetting the device, or failed to
initialize it, or when we've finished our feature negotiation.

In particular, we used to wait for VIRTIO_CONFIG_S_DRIVER_OK in the
status byte before starting the device service threads.  But this
corresponds to the successful finish of device initialization, which
might (like virtio_blk's partition scanning) use the device.  So we
had a hack, if they used the device before we expected we started the
threads anyway.

Now we hook into the finalize_features hook in the Guest: at that
point we tell the Launcher that it can rely on the features we have
acked.  On the Launcher side, we look at the status at that point, and
start servicing the device.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-07-22 14:39:49 +09:30
Sakari Ailus
e0377e2520 lguest: Do not exit on non-fatal errors
Do not exit on some non-fatal errors:

- writev() fails in net_output(). The result is a lost packet or packets.
- writev() fails in console_output(). The result is partially lost console
output.
- readv() fails in net_input(). The result is a lost packet or packets.

Rather than bringing the guest down, this patch ignores e.g. an allocation
failure on the host side. Example:

lguest: page allocation failure. order:4, mode:0x4d0
Pid: 4045, comm: lguest Tainted: G        W   2.6.36 #1
Call Trace:
 [<c138d614>] ? printk+0x18/0x1c
 [<c106a4e2>] __alloc_pages_nodemask+0x4d2/0x570
 [<c1087954>] cache_alloc_refill+0x2a4/0x4d0
 [<c1305149>] ? __netif_receive_skb+0x189/0x270
 [<c1087c5a>] __kmalloc+0xda/0xf0
 [<c12fffa5>] __alloc_skb+0x55/0x100
 [<c1305519>] ? net_rx_action+0x79/0x100
 [<c12fafed>] sock_alloc_send_pskb+0x18d/0x280
 [<c11fda25>] ? _copy_from_user+0x35/0x130
 [<c13010b6>] ? memcpy_fromiovecend+0x56/0x80
 [<c12a74dc>] tun_chr_aio_write+0x1cc/0x500
 [<c108a125>] do_sync_readv_writev+0x95/0xd0
 [<c11fda25>] ? _copy_from_user+0x35/0x130
 [<c1089fa8>] ? rw_copy_check_uvector+0x58/0x100
 [<c108a7bc>] do_readv_writev+0x9c/0x1d0
 [<c12a7310>] ? tun_chr_aio_write+0x0/0x500
 [<c108a93a>] vfs_writev+0x4a/0x60
 [<c108aa21>] sys_writev+0x41/0x80
 [<c138f061>] syscall_call+0x7/0xb
Mem-Info:
DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd:   0
HighMem per-cpu:
CPU    0: hi:  186, btch:  31 usd:   0
active_anon:134651 inactive_anon:50543 isolated_anon:0
 active_file:96881 inactive_file:132007 isolated_file:0
 unevictable:0 dirty:3 writeback:0 unstable:0
 free:91374 slab_reclaimable:6300 slab_unreclaimable:2802
 mapped:2281 shmem:9 pagetables:330 bounce:0
DMA free:3524kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:8kB active_file:8760kB inactive_file:2760kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15868kB mlocked:0kB dirty:0kB writeback:0kB mapped:16kB shmem:0kB slab_reclaimable:88kB slab_unreclaimable:148kB kernel_stack:40kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 865 2016 2016
Normal free:150100kB min:3728kB low:4660kB high:5592kB active_anon:6224kB inactive_anon:15772kB active_file:324084kB inactive_file:325944kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:885944kB mlocked:0kB dirty:12kB writeback:0kB mapped:1520kB shmem:0kB slab_reclaimable:25112kB slab_unreclaimable:11060kB kernel_stack:1888kB pagetables:1320kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 9207 9207
HighMem free:211872kB min:512kB low:1752kB high:2992kB active_anon:532380kB inactive_anon:186392kB active_file:54680kB inactive_file:199324kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1178504kB mlocked:0kB dirty:0kB writeback:0kB mapped:7588kB shmem:36kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 3*4kB 65*8kB 35*16kB 18*32kB 11*64kB 9*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3524kB
Normal: 35981*4kB 344*8kB 158*16kB 28*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 150100kB
HighMem: 5732*4kB 5462*8kB 2826*16kB 1598*32kB 84*64kB 10*128kB 7*256kB 1*512kB 1*1024kB 1*2048kB 9*4096kB = 211872kB
231237 total pagecache pages
2340 pages in swap cache
Swap cache stats: add 160060, delete 157720, find 189017/194106
Free swap  = 4179840kB
Total swap = 4194300kB
524271 pages RAM
296946 pages HighMem
5668 pages reserved
867664 pages shared
82155 pages non-shared

Signed-off-by: Sakari Ailus <sakari.ailus@iki.fi>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-07-22 14:39:48 +09:30
Phillip Lougher
812753d66f Squashfs: Update documentation for XZ and add squashfs-tools devel tree
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
2011-07-22 02:33:52 +01:00
Giuseppe CAVALLARO
557e2a394d stmmac: improve and up-to-date the documentation
This patch adds new information for the driver
especially about its platform structure fields.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-21 15:29:16 -07:00
David S. Miller
033b1142f4 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	net/bluetooth/l2cap_core.c
2011-07-21 13:38:42 -07:00
Ingo Molnar
994bf1c922 Merge branch 'linus' into sched/core
Merge reason: pick up the latest scheduler fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-21 18:00:01 +02:00
Per Forlin
7937e878f9 mmc: documentation of mmc non-blocking request usage and design.
Documentation about the background and the design of mmc non-blocking.
Host driver guidelines to minimize request preparation overhead.

Signed-off-by: Per Forlin <per.forlin@linaro.org>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-07-21 10:34:52 -04:00
Phil Carmody
497888cf69 treewide: fix potentially dangerous trailing ';' in #defined values/expressions
All these are instances of
  #define NAME value;
or
  #define NAME(params_opt) value;

These of course fail to build when used in contexts like
  if(foo $OP NAME)
  while(bar $OP NAME)
and may silently generate the wrong code in contexts such as
  foo = NAME + 1;    /* foo = value; + 1; */
  bar = NAME - 1;    /* bar = value; - 1; */
  baz = NAME & quux; /* baz = value; & quux; */

Reported on comp.lang.c,
Message-ID: <ab0d55fe-25e5-482b-811e-c475aa6065c3@c29g2000yqd.googlegroups.com>
Initial analysis of the dangers provided by Keith Thompson in that thread.

There are many more instances of more complicated macros having unnecessary
trailing semicolons, but this pile seems to be all of the cases of simple
values suffering from the problem. (Thus things that are likely to be found
in one of the contexts above, more complicated ones aren't.)

Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-07-21 14:10:00 +02:00
Josef Bacik
02c24a8218 fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers
Btrfs needs to be able to control how filemap_write_and_wait_range() is called
in fsync to make it less of a painful operation, so push down taking i_mutex and
the calling of filemap_write_and_wait() down into the ->fsync() handlers.  Some
file systems can drop taking the i_mutex altogether it seems, like ext3 and
ocfs2.  For correctness sake I just pushed everything down in all cases to make
sure that we keep the current behavior the same for everybody, and then each
individual fs maintainer can make up their mind about what to do from there.
Thanks,

Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 20:47:59 -04:00
Josef Bacik
982d816581 fs: add SEEK_HOLE and SEEK_DATA flags
This just gets us ready to support the SEEK_HOLE and SEEK_DATA flags.  Turns out
using fiemap in things like cp cause more problems than it solves, so lets try
and give userspace an interface that doesn't suck.  We need to match solaris
here, and the definitions are

*o* If /whence/ is SEEK_HOLE, the offset of the start of the
next hole greater than or equal to the supplied offset
is returned. The definition of a hole is provided near
the end of the DESCRIPTION.

*o* If /whence/ is SEEK_DATA, the file pointer is set to the
start of the next non-hole file region greater than or
equal to the supplied offset.

So in the generic case the entire file is data and there is a virtual hole at
the end.  That means we will just return i_size for SEEK_HOLE and will return
the same offset for SEEK_DATA.  This is how Solaris does it so we have to do it
the same way.

Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 20:47:56 -04:00
Dave Chinner
8ab47664d5 vfs: increase shrinker batch size
Now that the per-sb shrinker is responsible for shrinking 2 or more
caches, increase the batch size to keep econmies of scale for
shrinking each cache.  Increase the shrinker batch size to 1024
objects.

To allow for a large increase in batch size, add a conditional
reschedule to prune_icache_sb() so that we don't hold the LRU spin
lock for too long. This mirrors the behaviour of the
__shrink_dcache_sb(), and allows us to increase the batch size
without needing to worry about problems caused by long lock hold
times.

To ensure that filesystems using the per-sb shrinker callouts don't
cause problems, document that the object freeing method must
reschedule appropriately inside loops.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 20:47:41 -04:00
Dave Chinner
0e1fdafd93 superblock: add filesystem shrinker operations
Now we have a per-superblock shrinker implementation, we can add a
filesystem specific callout to it to allow filesystem internal
caches to be shrunk by the superblock shrinker.

Rather than perpetuate the multipurpose shrinker callback API (i.e.
nr_to_scan == 0 meaning "tell me how many objects freeable in the
cache), two operations will be added. The first will return the
number of objects that are freeable, the second is the actual
shrinker call.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 20:47:41 -04:00
J. Bruce Fields
8fb47a4fbf locks: rename lock-manager ops
Both the filesystem and the lock manager can associate operations with a
lock.  Confusingly, one of them (fl_release_private) actually has the
same name in both operation structures.

It would save some confusion to give the lock-manager ops different
names.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-07-20 20:23:19 -04:00
Linus Torvalds
919d25a710 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86. reboot: Make Dell Latitude E6320 use reboot=pci
  x86, doc only: Correct real-mode kernel header offset for init_size
  x86: Disable AMD_NUMA for 32bit for now
2011-07-20 15:33:59 -07:00
Al Viro
76fe3276be ->permission() sanitizing: document API changes
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:36 -04:00
Al Viro
10556cb21a ->permission() sanitizing: don't pass flags to ->permission()
not used by the instances anymore.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:24 -04:00
Al Viro
7e40145eb1 ->permission() sanitizing: don't pass flags to ->check_acl()
not used in the instances anymore.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:21 -04:00
Dave Martin
540b573875 ARM: 6999/1: head, zImage: Always Enter the kernel in ARM state
Currently, the documented kernel entry requirements are not
explicit about whether the kernel should be entered in ARM or
Thumb, leading to an ambiguitity about how to enter Thumb-2
kernels.  As a result, the kernel is reliant on the zImage
decompressor to enter the kernel proper in the correct instruction
set state.

This patch changes the boot entry protocol for head.S and Image to
be the same as for zImage: in all cases, the kernel is now entered
in ARM.

Documentation/arm/Booting is updated to reflect this new policy.

A different rule will be needed for Cortex-M class CPUs as and when
support for those lands in mainline, since these CPUs don't support
the ARM instruction set at all: a note is added to the effect that
the kernel must be entered in Thumb on such systems.

Signed-off-by: Dave Martin <dave.martin@linaro.org>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-07-19 12:00:53 +01:00
Takashi Iwai
737c265bb8 ALSA: hda - Add documentation for codec-specific mixer controls
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2011-07-19 09:34:10 +02:00
J. Bruce Fields
c46556c6be nfsd4: update nfsv4.1 implementation notes
Update documentation to reflect recent progress.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-07-18 18:41:33 -04:00
Keiichi Kii
21d541aa19 updated Documentation/ja_JP/SubmittingPatches
Updated Documentaion/ja_JP/SubmittingPatches due to changes from upstream
version(2.6.39) of the file.

And also, this translated SubmittingPatches is already reviewed by JF
project. The JF project is the Japanese equivalent of LDP.

Signed-off-by: Keiichi Kii <k-keiichi@bx.jp.nec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-07-18 13:37:12 -07:00
Akinobu Mita
d0a542637f debugfs: add documentation for debugfs_create_x64
debugfs_create_x64() exists.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-07-18 13:36:15 -07:00
Mimi Zohar
7102ebcd65 evm: permit only valid security.evm xattrs to be updated
In addition to requiring CAP_SYS_ADMIN permission to modify/delete
security.evm, prohibit invalid security.evm xattrs from changing,
unless in fixmode. This patch prevents inadvertent 'fixing' of
security.evm to reflect offline modifications.

Changelog v7:
- rename boot paramater 'evm_mode' to 'evm'

Reported-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
2011-07-18 12:29:49 -04:00
Mimi Zohar
66dbc325af evm: re-release
EVM protects a file's security extended attributes(xattrs) against integrity
attacks.  This patchset provides the framework and an initial method.  The
initial method maintains an HMAC-sha1 value across the security extended
attributes, storing the HMAC value as the extended attribute 'security.evm'.
Other methods of validating the integrity of a file's metadata will be posted
separately (eg. EVM-digital-signatures).

While this patchset does authenticate the security xattrs, and
cryptographically binds them to the inode, coming extensions will bind other
directory and inode metadata for more complete protection.  To help simplify
the review and upstreaming process, each extension will be posted separately
(eg. IMA-appraisal, IMA-appraisal-directory).  For a general overview of the
proposed Linux integrity subsystem, refer to Dave Safford's whitepaper:
http://downloads.sf.net/project/linux-ima/linux-ima/Integrity_overview.pdf.

EVM depends on the Kernel Key Retention System to provide it with a
trusted/encrypted key for the HMAC-sha1 operation. The key is loaded onto the
root's keyring using keyctl.  Until EVM receives notification that the key has
been successfully loaded onto the keyring (echo 1 > <securityfs>/evm), EVM can
not create or validate the 'security.evm' xattr, but returns INTEGRITY_UNKNOWN.
Loading the key and signaling EVM should be done as early as possible. Normally
this is done in the initramfs, which has already been measured as part of the
trusted boot.  For more information on creating and loading existing
trusted/encrypted keys, refer to Documentation/keys-trusted-encrypted.txt.  A
sample dracut patch, which loads the trusted/encrypted key and enables EVM, is
available from http://linux-ima.sourceforge.net/#EVM.

Based on the LSMs enabled, the set of EVM protected security xattrs is defined
at compile.  EVM adds the following three calls to the existing security hooks:
evm_inode_setxattr(), evm_inode_post_setxattr(), and evm_inode_removexattr.  To
initialize and update the 'security.evm' extended attribute, EVM defines three
calls: evm_inode_post_init(), evm_inode_post_setattr() and
evm_inode_post_removexattr() hooks.  To verify the integrity of a security
xattr, EVM exports evm_verifyxattr().

Changelog v7:
- Fixed URL in EVM ABI documentation

Changelog v6: (based on Serge Hallyn's review)
- fix URL in patch description
- remove evm_hmac_size definition
- use SHA1_DIGEST_SIZE (removed both MAX_DIGEST_SIZE and evm_hmac_size)
- moved linux include before other includes
- test for crypto_hash_setkey failure
- fail earlier for invalid key
- clear entire encrypted key, even on failure
- check xattr name length before comparing xattr names

Changelog:
- locking based on i_mutex, remove evm_mutex
- using trusted/encrypted keys for storing the EVM key used in the HMAC-sha1
  operation.
- replaced crypto hash with shash (Dmitry Kasatkin)
- support for additional methods of verifying the security xattrs
  (Dmitry Kasatkin)
- iint not allocated for all regular files, but only for those appraised
- Use cap_sys_admin in lieu of cap_mac_admin
- Use __vfs_setxattr_noperm(), without permission checks, from EVM

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
2011-07-18 12:29:40 -04:00
Nicolas Pitre
632b7cf6c0 ARM: mach-s3c2400: delete
On Tue, 28 Jun 2011, Ben Dooks wrote:

> On Tue, Jun 28, 2011 at 11:22:57PM +0200, Arnd Bergmann wrote:
>
> > On a related note, what about mach-s3c2400? It seems to be even more
> > incomplete.
>
> Probably the same fate awaits that. It is so old that there's little
> incentive to do anything with it.

So out it goes as well.

The PORT_S3C2400 definition in include/linux/serial_core.h is left there
to prevent a reuse of the same number for another port type.

Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
2011-07-18 10:59:26 -04:00
Arnd Bergmann
3a6cb8ce07 Merge branch 'zynq/master' of git+ssh://master.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc into next/soc
Conflicts:
	arch/arm/Kconfig
	arch/arm/mm/Kconfig
2011-07-17 21:43:26 +02:00
Takao Indoh
4996c02306 ACPI: introduce "acpi_rsdp=" parameter for kdump
There is a problem with putting the first kernel in EFI virtual mode,
it is that when the second kernel comes up it tries to initialize the
EFI again and once we have put EFI in virtual mode we can not really
do that.

Actually, EFI is not necessary for kdump, we can boot the second kernel
with "noefi" parameter, but the boot will mostly fail because 2nd kernel
cannot find RSDP.

In this situation, we introduced "acpi_rsdp=" kernel parameter, so that
kexec-tools can pass the "noefi acpi_rsdp=X" to the second kernel to
make kdump works. The physical address of the RSDP can be got from
sysfs(/sys/firmware/efi/systab).

Signed-off-by: Takao Indoh <indou.takao@jp.fujitsu.com>
Reviewed-by: WANG Cong <amwang@redhat.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-07-16 18:40:16 -04:00
Stefan Richter
9a00c24ae7 firewire: document the sysfs ABIs
of firewire-core and firewire-sbp2.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2011-07-16 07:24:33 +02:00
Stefan Richter
f6a7cd0212 firewire: cdev: ABI documentation enhancements
Add overview documentation in Documentation/ABI/stable/firewire-cdev.

Improve the inline reference documentation in firewire-cdev.h:

  - Add /* available since kernel... */ comments to event numbers
    consistent with the comments on ioctl numbers.

  - Shorten some documentation on an event and an ioctl that are
    less interesting to current programming because there are newer
    preferable variants.

  - Spell Configuration ROM (name of an IEEE 1212 register) in
    upper case.

  - Move the dummy FW_CDEV_VERSION out of the reader's field of
    vision.  We should remove it from the header next year or so.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2011-07-16 07:24:32 +02:00
Grant Likely
8c11642a50 Merge commit 'v3.0-rc7' into devicetree/next 2011-07-15 20:11:34 -06:00
NeilBrown
49b28684fd nfsd: Remove deprecated nfsctl system call and related code.
As promised in feature-removal-schedule.txt it is time to
remove the nfsctl system call.

Userspace has perferred to not use this call throughout 2.6 and it has been
excluded in the default configuration since 2.6.36 (9 months ago).

So this patch removes all the code that was being compiled out.

There are still references to sys_nfsctl in various arch systemcall tables
and related code.  These should be cleaned out too, probably in the next
merge window.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-07-15 18:58:42 -04:00
Rafael J. Wysocki
7ae033cc0d Merge branch 'pm-runtime' into for-linus
* pm-runtime:
  OMAP: PM: disable idle on suspend for GPIO and UART
  OMAP: PM: omap_device: add API to disable idle on suspend
  OMAP: PM: omap_device: add system PM methods for PM domain handling
  OMAP: PM: omap_device: conditionally use PM domain runtime helpers
  PM / Runtime: Add new helper function: pm_runtime_status_suspended()
  PM / Runtime: Consistent utilization of deferred_resume
  PM / Runtime: Prevent runtime_resume from racing with probe
  PM / Runtime: Replace "run-time" with "runtime" in documentation
  PM / Runtime: Improve documentation of enable, disable and barrier
  PM: Limit race conditions between runtime PM and system sleep (v2)
  PCI / PM: Detect early wakeup in pci_pm_prepare()
  PM / Runtime: Return special error code if runtime PM is disabled
  PM / Runtime: Update documentation of interactions with system sleep
2011-07-15 23:59:25 +02:00
Rafael J. Wysocki
ba1389d74f Merge branch 'pm-domains' into for-linus
* pm-domains: (33 commits)
  ARM / shmobile: Return -EBUSY from A4LC power off if A3RV is active
  PM / Domains: Take .power_off() error code into account
  ARM / shmobile: Use genpd_queue_power_off_work()
  ARM / shmobile: Use pm_genpd_poweroff_unused()
  PM / Domains: Introduce function to power off all unused PM domains
  PM / Domains: Queue up power off work only if it is not pending
  PM / Domains: Improve handling of wakeup devices during system suspend
  PM / Domains: Do not restore all devices on power off error
  PM / Domains: Allow callbacks to execute all runtime PM helpers
  PM / Domains: Do not execute device callbacks under locks
  PM / Domains: Make failing pm_genpd_prepare() clean up properly
  PM / Domains: Set device state to "active" during system resume
  ARM: mach-shmobile: sh7372 A3RV requires A4LC
  PM / Domains: Export pm_genpd_poweron() in header
  ARM: mach-shmobile: sh7372 late pm domain off
  ARM: mach-shmobile: Runtime PM late init callback
  ARM: mach-shmobile: sh7372 D4 support
  ARM: mach-shmobile: sh7372 A4MP support
  ARM: mach-shmobile: sh7372: make sure that fsi is peripheral of spu2
  ARM: mach-shmobile: sh7372 A3SG support
  ...
2011-07-15 23:59:09 +02:00
Nishanth Menon
99f381d354 PM / OPP: Introduce function to free cpufreq table
cpufreq table allocated by opp_init_cpufreq_table is better
freed by OPP layer itself. This allows future modifications to
the table handling to be transparent to the users.

Signed-off-by: Nishanth Menon <nm@ti.com>
Acked-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-07-15 23:58:18 +02:00
Masami Hiramatsu
6142431810 tracing/kprobes: Support module init function probing
To support probing module init functions, kprobe-tracer allows
user to define a probe on non-existed function when it is given
with a module name. This also enables user to set a probe on
a function on a specific module, even if a same name (but different)
function is locally defined in another module.

The module name must be in the front of function name and separated
by a ':'. e.g. btrfs:btrfs_init_sysfs

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/20110627072656.6528.89970.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-15 15:17:14 -04:00
Linus Torvalds
af8a927c8b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
  nilfs2: remove resize from unsupported features list
2011-07-15 11:03:49 -07:00
Olof Johansson
eb5064db40 gpio/tegra: dt: add binding for gpio polarity
Allocate one bit in the available extra cell to indicate if the gpio
should be considered logically inverted.

Signed-off-by: Olof Johansson <olof@lixom.net>
Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-07-15 11:51:27 -06:00
John W. Linville
95a943c162 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
Conflicts:
	net/bluetooth/l2cap_core.c
2011-07-15 10:05:24 -04:00
Russell King
4aa96ccf9e Merge branch 'kprobes-thumb' of git://git.yxit.co.uk/linux into devel-stable 2011-07-15 10:06:42 +01:00
Andy Lutomirski
98eedc3a9d Document the vDSO and add a reference parser
It turns out that parsing the vDSO is nontrivial if you don't already
have an ELF dynamic loader around.  So document it in Documentation/ABI
and add a reference CC0-licenced parser.

This code is dedicated to Go issue 1933:
http://code.google.com/p/go/issues/detail?id=1933

Signed-off-by: Andy Lutomirski <luto@mit.edu>
Link: http://lkml.kernel.org/r/a315a9514cd71bcf29436cc31e35aada21a5ff21.1310563276.git.luto@mit.edu
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-07-14 17:57:09 -07:00
Shawn Guo
22a85e4cd5 spi/imx: add device tree probe support
It adds device tree probe support for spi-imx driver.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Cc: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-07-14 13:47:18 -06:00
David S. Miller
6a7ebdf2fd Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	net/bluetooth/l2cap_core.c
2011-07-14 07:56:40 -07:00
Linus Torvalds
5d7d5d9332 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
  slip: fix wrong SLIP6 ifdef-endif placing
  natsemi: fix another dma-debug report
  sctp: ABORT if receive, reassmbly, or reodering queue is not empty while closing socket
  net: Fix default in docs for tcp_orphan_retries.
  hso: fix a use after free condition
  net/natsemi: Fix module parameter permissions
  XFRM: Fix memory leak in xfrm_state_update
  sctp: Enforce retransmission limit during shutdown
  mac80211: fix TKIP replay vulnerability
  mac80211: fix ie memory allocation for scheduled scans
  ssb: fix init regression of hostmode PCI core
  rtlwifi: rtl8192cu: Add new USB ID for Netgear WNA1000M
  ath9k: Fix tx throughput drops for AR9003 chips with AES encryption
  carl9170: add NEC WL300NU-AG usbid
  cfg80211: fix deadlock with rfkill/sched_scan by adding new mutex
  ath5k: fix incorrect use of drvdata in PCI suspend/resume code
  ath5k: fix incorrect use of drvdata in sysfs code
  Bluetooth: Fix memory leak under page timeouts
  Bluetooth: Fix regression with incoming L2CAP connections
  Bluetooth: Fix hidp disconnect deadlocks and lost wakeup
  ...
2011-07-13 13:51:32 -07:00
Ryusuke Konishi
3a36199114 nilfs2: remove resize from unsupported features list
Resize feature was supported by the commit 4e33f9eab0 but it was not
reflected to the list of unsupported features in nilfs2.txt file.
This updates the list to fix discrepancy.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-07-13 16:08:59 +09:00
Michał Mirosław
e5b1de1f5e net: Add documentation for netdev features handling
v2: incorporated suggestions from Randy Dunlap

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-12 22:27:00 -07:00
Darren Hart
11e48feebe x86, doc only: Correct real-mode kernel header offset for init_size
The real-mode kernel header init_size field is located at 0x260 per the field
listing in th e"REAL-MODE KERNEL HEADER" section. It is listed as 0x25c in
the "DETAILS OF HEADER FIELDS" section, which overlaps with pref_address.
Correct the details listing to 0x260.

Signed-off-by: Darren Hart <dvhart@linux.intel.com>
Link: http://lkml.kernel.org/r/541cf88e2dfe5b8186d8b96b136d892e769a68c1.1310441260.git.dvhart@linux.intel.com
CC: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-07-12 20:39:11 -07:00
Glauber Costa
9ddabbe72e KVM: KVM Steal time guest/host interface
To implement steal time, we need the hypervisor to pass the guest information
about how much time was spent running other processes outside the VM.
This is per-vcpu, and using the kvmclock structure for that is an abuse
we decided not to make.

In this patchset, I am introducing a new msr, KVM_MSR_STEAL_TIME, that
holds the memory area address containing information about steal time

This patch contains the headers for it. I am keeping it separate to facilitate
backports to people who wants to backport the kernel part but not the
hypervisor, or the other way around.

Signed-off-by: Glauber Costa <glommer@redhat.com>
Acked-by: Rik van Riel <riel@redhat.com>
Tested-by: Eric B Munson <emunson@mgebm.net>
CC: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-07-12 13:17:03 +03:00
Paul Mackerras
aa04b4cc5b KVM: PPC: Allocate RMAs (Real Mode Areas) at boot for use by guests
This adds infrastructure which will be needed to allow book3s_hv KVM to
run on older POWER processors, including PPC970, which don't support
the Virtual Real Mode Area (VRMA) facility, but only the Real Mode
Offset (RMO) facility.  These processors require a physically
contiguous, aligned area of memory for each guest.  When the guest does
an access in real mode (MMU off), the address is compared against a
limit value, and if it is lower, the address is ORed with an offset
value (from the Real Mode Offset Register (RMOR)) and the result becomes
the real address for the access.  The size of the RMA has to be one of
a set of supported values, which usually includes 64MB, 128MB, 256MB
and some larger powers of 2.

Since we are unlikely to be able to allocate 64MB or more of physically
contiguous memory after the kernel has been running for a while, we
allocate a pool of RMAs at boot time using the bootmem allocator.  The
size and number of the RMAs can be set using the kvm_rma_size=xx and
kvm_rma_count=xx kernel command line options.

KVM exports a new capability, KVM_CAP_PPC_RMA, to signal the availability
of the pool of preallocated RMAs.  The capability value is 1 if the
processor can use an RMA but doesn't require one (because it supports
the VRMA facility), or 2 if the processor requires an RMA for each guest.

This adds a new ioctl, KVM_ALLOCATE_RMA, which allocates an RMA from the
pool and returns a file descriptor which can be used to map the RMA.  It
also returns the size of the RMA in the argument structure.

Having an RMA means we will get multiple KMV_SET_USER_MEMORY_REGION
ioctl calls from userspace.  To cope with this, we now preallocate the
kvm->arch.ram_pginfo array when the VM is created with a size sufficient
for up to 64GB of guest memory.  Subsequently we will get rid of this
array and use memory associated with each memslot instead.

This moves most of the code that translates the user addresses into
host pfns (page frame numbers) out of kvmppc_prepare_vrma up one level
to kvmppc_core_prepare_memory_region.  Also, instead of having to look
up the VMA for each page in order to check the page size, we now check
that the pages we get are compound pages of 16MB.  However, if we are
adding memory that is mapped to an RMA, we don't bother with calling
get_user_pages_fast and instead just offset from the base pfn for the
RMA.

Typically the RMA gets added after vcpus are created, which makes it
inconvenient to have the LPCR (logical partition control register) value
in the vcpu->arch struct, since the LPCR controls whether the processor
uses RMA or VRMA for the guest.  This moves the LPCR value into the
kvm->arch struct and arranges for the MER (mediated external request)
bit, which is the only bit that varies between vcpus, to be set in
assembly code when going into the guest if there is a pending external
interrupt request.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:57 +03:00
Paul Mackerras
371fefd6f2 KVM: PPC: Allow book3s_hv guests to use SMT processor modes
This lifts the restriction that book3s_hv guests can only run one
hardware thread per core, and allows them to use up to 4 threads
per core on POWER7.  The host still has to run single-threaded.

This capability is advertised to qemu through a new KVM_CAP_PPC_SMT
capability.  The return value of the ioctl querying this capability
is the number of vcpus per virtual CPU core (vcore), currently 4.

To use this, the host kernel should be booted with all threads
active, and then all the secondary threads should be offlined.
This will put the secondary threads into nap mode.  KVM will then
wake them from nap mode and use them for running guest code (while
they are still offline).  To wake the secondary threads, we send
them an IPI using a new xics_wake_cpu() function, implemented in
arch/powerpc/sysdev/xics/icp-native.c.  In other words, at this stage
we assume that the platform has a XICS interrupt controller and
we are using icp-native.c to drive it.  Since the woken thread will
need to acknowledge and clear the IPI, we also export the base
physical address of the XICS registers using kvmppc_set_xics_phys()
for use in the low-level KVM book3s code.

When a vcpu is created, it is assigned to a virtual CPU core.
The vcore number is obtained by dividing the vcpu number by the
number of threads per core in the host.  This number is exported
to userspace via the KVM_CAP_PPC_SMT capability.  If qemu wishes
to run the guest in single-threaded mode, it should make all vcpu
numbers be multiples of the number of threads per core.

We distinguish three states of a vcpu: runnable (i.e., ready to execute
the guest), blocked (that is, idle), and busy in host.  We currently
implement a policy that the vcore can run only when all its threads
are runnable or blocked.  This way, if a vcpu needs to execute elsewhere
in the kernel or in qemu, it can do so without being starved of CPU
by the other vcpus.

When a vcore starts to run, it executes in the context of one of the
vcpu threads.  The other vcpu threads all go to sleep and stay asleep
until something happens requiring the vcpu thread to return to qemu,
or to wake up to run the vcore (this can happen when another vcpu
thread goes from busy in host state to blocked).

It can happen that a vcpu goes from blocked to runnable state (e.g.
because of an interrupt), and the vcore it belongs to is already
running.  In that case it can start to run immediately as long as
the none of the vcpus in the vcore have started to exit the guest.
We send the next free thread in the vcore an IPI to get it to start
to execute the guest.  It synchronizes with the other threads via
the vcore->entry_exit_count field to make sure that it doesn't go
into the guest if the other vcpus are exiting by the time that it
is ready to actually enter the guest.

Note that there is no fixed relationship between the hardware thread
number and the vcpu number.  Hardware threads are assigned to vcpus
as they become runnable, so we will always use the lower-numbered
hardware threads in preference to higher-numbered threads if not all
the vcpus in the vcore are runnable, regardless of which vcpus are
runnable.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:57 +03:00
David Gibson
54738c0971 KVM: PPC: Accelerate H_PUT_TCE by implementing it in real mode
This improves I/O performance for guests using the PAPR
paravirtualization interface by making the H_PUT_TCE hcall faster, by
implementing it in real mode.  H_PUT_TCE is used for updating virtual
IOMMU tables, and is used both for virtual I/O and for real I/O in the
PAPR interface.

Since this moves the IOMMU tables into the kernel, we define a new
KVM_CREATE_SPAPR_TCE ioctl to allow qemu to create the tables.  The
ioctl returns a file descriptor which can be used to mmap the newly
created table.  The qemu driver models use them in the same way as
userspace managed tables, but they can be updated directly by the
guest with a real-mode H_PUT_TCE implementation, reducing the number
of host/guest context switches during guest IO.

There are certain circumstances where it is useful for userland qemu
to write to the TCE table even if the kernel H_PUT_TCE path is used
most of the time.  Specifically, allowing this will avoid awkwardness
when we need to reset the table.  More importantly, we will in the
future need to write the table in order to restore its state after a
checkpoint resume or migration.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:56 +03:00