Commit graph

423182 commits

Author SHA1 Message Date
Jan Kiszka
7af40ad37b KVM: nVMX: Fix nested_run_pending on activity state HLT
When we suspend the guest in HLT state, the nested run is no longer
pending - we emulated it completely. So only set nested_run_pending
after checking the activity state.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-01-17 10:22:17 +01:00
Jan Kiszka
cae501397a KVM: nVMX: Clean up handling of VMX-related MSRs
This simplifies the code and also stops issuing warning about writing to
unhandled MSRs when VMX is disabled or the Feature Control MSR is
locked - we do handle them all according to the spec.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-01-17 10:22:16 +01:00
Jan Kiszka
542060ea79 KVM: nVMX: Add tracepoints for nested_vmexit and nested_vmexit_inject
Already used by nested SVM for tracing nested vmexit: kvm_nested_vmexit
marks exits from L2 to L0 while kvm_nested_vmexit_inject marks vmexits
that are reflected to L1.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-01-17 10:22:15 +01:00
Jan Kiszka
533558bcb6 KVM: nVMX: Pass vmexit parameters to nested_vmx_vmexit
Instead of fixing up the vmcs12 after the nested vmexit, pass key
parameters already when calling nested_vmx_vmexit. This will help
tracing those vmexits.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-01-17 10:22:14 +01:00
Jan Kiszka
42124925c1 KVM: nVMX: Leave VMX mode on clearing of feature control MSR
When userspace sets MSR_IA32_FEATURE_CONTROL to 0, make sure we leave
root and non-root mode, fully disabling VMX. The register state of the
VCPU is undefined after this step, so userspace has to set it to a
proper state afterward.

This enables to reboot a VM while it is running some hypervisor code.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-01-17 10:22:13 +01:00
Jan Kiszka
8246bf52c7 KVM: VMX: Fix DR6 update on #DB exception
According to the SDM, only bits 0-3 of DR6 "may" be cleared by "certain"
debug exception. So do update them on #DB exception in KVM, but leave
the rest alone, only setting BD and BS in addition to already set bits
in DR6. This also aligns us with kvm_vcpu_check_singlestep.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-01-17 10:22:12 +01:00
Jan Kiszka
73aaf249ee KVM: SVM: Fix reading of DR6
In contrast to VMX, SVM dose not automatically transfer DR6 into the
VCPU's arch.dr6. So if we face a DR6 read, we must consult a new vendor
hook to obtain the current value. And as SVM now picks the DR6 state
from its VMCB, we also need a set callback in order to write updates of
DR6 back.

Fixes a regression of 020df0794f.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-01-17 10:22:10 +01:00
Jan Kiszka
9926c9fdbd KVM: x86: Sync DR7 on KVM_SET_DEBUGREGS
Whenever we change arch.dr7, we also have to call kvm_update_dr7. In
case guest debugging is off, this will synchronize the new state into
hardware.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-01-17 10:22:09 +01:00
Vadim Rozenfeld
e984097b55 add support for Hyper-V reference time counter
Signed-off: Peter Lieven <pl@kamp.de>
Signed-off: Gleb Natapov
Signed-off: Vadim Rozenfeld <vrozenfe@redhat.com>

After some consideration I decided to submit only Hyper-V reference
counters support this time. I will submit iTSC support as a separate
patch as soon as it is ready.

v1 -> v2
1. mark TSC page dirty as suggested by
    Eric Northup <digitaleric@google.com> and Gleb
2. disable local irq when calling get_kernel_ns,
    as it was done by Peter Lieven <pl@amp.de>
3. move check for TSC page enable from second patch
    to this one.

v3 -> v4
    Get rid of ref counter offset.

v4 -> v5
    replace __copy_to_user with kvm_write_guest
    when updateing iTSC page.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-01-17 10:22:08 +01:00
Julia Lawall
073bf17261 i810: delete useless variable
Delete a variable that is at most only assigned to a constant, but never
used otherwise.

A simplified version of the semantic patch that fixes this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
type T;
identifier i;
constant c;
@@

-T i;
<... when != i
-i = c;
...>
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-01-17 10:57:50 +02:00
Stefan Kristiansson
e61d05dd88 video: add OpenCores VGA/LCD framebuffer driver
This adds support for the VGA/LCD core available from OpenCores:
http://opencores.org/project,vga_lcd

The driver have been tested together with both OpenRISC and
ARM (socfpga) processors.

Signed-off-by: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-01-17 10:57:49 +02:00
Geert Uytterhoeven
db25295788 video/logo: Remove MIPS-specific include section
Since commit 41702d9a4f ("logo.c: get rid of
mips_machgroup") there's no longer a need to include <asm/bootinfo.h> on
MIPS.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-01-17 10:57:49 +02:00
Dan Carpenter
0ede5804ca tgafb: potential NULL dereference in init
Static checkers complain that there are paths where "tga_type_name" can
be NULL.  I've re-arranged the code slightly so that's impossible.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-01-17 10:57:48 +02:00
Dan Carpenter
6fc19c40be video: mmp: Using plain integer as NULL pointer
Sparse complains here:

	drivers/video/mmp/core.c:33:16:
		warning: Using plain integer as NULL pointer

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-01-17 10:57:48 +02:00
Dan Carpenter
4ac8bd618b video: mmp: delete a stray mutex_unlock()
We aren't holding the disp_lock so we shouldn't release it.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-01-17 10:57:47 +02:00
Mark Brown
c3235bfc0c video: amba-clcd: Make CLCD driver available on more platforms
The CLCD driver is used on ARM reference models for ARMv8 so add ARM64
to the list of dependencies. The driver also has no build time dependencies
on ARM (stubs are provided for ARM-specific DMA functions in the code) so
make it available with COMPILE_TEST in order to maximise build coverage.

Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-01-17 10:57:47 +02:00
Yijing Wang
3ca356b6de video: Replace local macro with PCI standard macro
Replace local macro TGA_BUS_PCI with PCI standard
marco dev_is_pci().

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-01-17 10:57:46 +02:00
Olaf Hering
f5d2b7c28b fbmem: really support wildcard video=options for all fbdev drivers
Documentation/fb/modedb.txt states that video=option should be
considered a global option. But video_setup and fb_get_options are not
coded that way. Instead its required to boot with video=driver:option to
set a given option in drvier.  This is cumbersome because it requires to
know in advance which driver will be active for a given board/kernel.

The following patch implements the documented catchall for the fbdev
drivers. It is now possible to boot with video=XxY without the need to
know the active driver in advance. The specific case it tries to fix is
syslinux in the SUSE installer which offers a menu to set a display
resolution. Right now this just appends the vga= option the kernel. But
in addition to vga= it should be possible to pass a generic video=XxY
for all framebuffer/drm drivers. With this change forcing a certain
window size of VM displays is now much easier.

Today the video= option is stored in a global fb_mode_option. But
unfortunately only drm uses it.

Note: this change introduces a small memleak if video=option is actually
used because fb_mode_option is const. Most drivers use strsep to get to
individual options. This could be fixed in a followup patch which always
releases the option string in every caller of fb_get_options.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-01-17 10:57:45 +02:00
Mark Brown
ee23794b86 video: vgacon: Don't build on arm64
arm64 is unlikely to have a VGA console and does not export screen_info
causing build failures if the driver is build, for example in all*config.
Add a dependency on !ARM64 to prevent this.

This list is getting quite long, it may be easier to depend on a symbol
which architectures that do support the driver can select.

Signed-off-by: Mark Brown <broonie@linaro.org>
[tomi.valkeinen@ti.com: moved && to first modified line]
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-01-17 10:57:45 +02:00
Sascha Hauer
4d32d04769 video: mx3fb: Allow blocking during framebuffer allocation
No need to allocate the framebuffer from the atomic pool, we are not
in interrupt context. Adding GFP_KERNEL to the framebuffer allocation
allows to use the much bigger CMA pool to allocate the framebuffer.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: linux-fbdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-01-17 10:57:44 +02:00
Masami Ichikawa
46862145d4 fbcon: Fix memory leak in fbcon_exit().
kmemleak reported a memory leak as below.

unreferenced object 0xffff880036ca84c0 (size 16):
  comm "swapper/0", pid 1, jiffies 4294877407 (age 4434.633s)
  hex dump (first 16 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff ff  ................
  backtrace:
    [<ffffffff814ed01e>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff8118913c>] __kmalloc+0x1fc/0x290
    [<ffffffff81302c9e>] bit_cursor+0x24e/0x6c0
    [<ffffffff812ff2f4>] fbcon_cursor+0x154/0x1d0
    [<ffffffff813675d8>] hide_cursor+0x28/0xa0
    [<ffffffff81368acf>] update_region+0x6f/0x90
    [<ffffffff81300268>] fbcon_switch+0x518/0x550
    [<ffffffff813695b9>] redraw_screen+0x189/0x240
    [<ffffffff8136a0e0>] do_bind_con_driver+0x360/0x380
    [<ffffffff8136a6e4>] do_take_over_console+0x114/0x1c0
    [<ffffffff812fdc83>] do_fbcon_takeover+0x63/0xd0
    [<ffffffff813023e5>] fbcon_event_notify+0x605/0x720
    [<ffffffff81501dcc>] notifier_call_chain+0x4c/0x70
    [<ffffffff81087f8d>] __blocking_notifier_call_chain+0x4d/0x70
    [<ffffffff81087fc6>] blocking_notifier_call_chain+0x16/0x20
    [<ffffffff812f201b>] fb_notifier_call_chain+0x1b/0x20

In this case ops->cursor_state.mask is allocated in bit_cursor() but
not freed in fbcon_exit(). So, fbcon_exit() needs to free buffer in its
process.
In the case, fbcon_exit() was called from fbcon_deinit() when driver
called remove_conflicting_framebuffers().

Signed-off-by: Masami Ichikawa <masami256@gmail.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-01-17 10:57:44 +02:00
Wang YanQing
5aa133d6c8 fbcon: trivial optimization for fbcon_exit
Break out as soon as we find a mapped entry con2fb_map.

Signed-off-by: Wang YanQing <udknight@gmail.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-01-17 10:57:43 +02:00
Sachin Kamat
8fe940af05 video: pxa168fb: Cleanup pxa168fb.h file
Remove incorrect file reference in comments section.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Eric Miao <eric.y.miao@gmail.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-01-17 10:57:43 +02:00
Sachin Kamat
71269ec875 video: pxa: Cleanup video-pxafb.h header
Commit 293b2da1b6 ("ARM: pxa: move platform_data definitions")
moved the file to the current location but forgot to remove the pointer
to its previous location. Clean it up.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-01-17 10:57:42 +02:00
Sachin Kamat
b6c5a2b845 video: msm: Cleanup video-msm_fb.h header
Commit 1ef21f6343 ("ARM: msm: move platform_data definitions")
moved the file to the current location but forgot to remove the pointer
to its previous location. Clean it up.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-01-17 10:57:42 +02:00
Sachin Kamat
6abe6e51f9 video: ep93xx: Cleanup video-ep93xx.h header
Commit a3b2924547 ("ARM: ep93xx: move platform_data definitions")
moved the file to the current location but forgot to remove the pointer
to its previous location. Clean it up. While at it also change the header
file protection macros appropriately.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-01-17 10:57:41 +02:00
Lothar Waßmann
838bdf723b video: mxsfb: fix broken videomode selection
Currently the driver re-implements the code found in of_get_videomode()
except for the fact that the latter honors the 'native-mode' property
to select a spcific video timing from the list of possible timings.
The driver builds up a list of all video timings, but uses only the
last mode from the list anyway. While building the list it incorrectly
OR's the 'pixelclk-active' and 'de-active' flags of all modes into
single flags, possibly leading to a wrong pixelclock or data-enable
polarity setting.

Fix this by using the of_get_videomode() directly with the
OF_USE_NATIVE_MODE flag.

Since all current dts files only have one entry in their
display-timings node, this bug was not apparent and the fix does not
change the driver's behaviour for the current users.

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-01-17 10:57:41 +02:00
Lothar Waßmann
18dd44d89f video: mxsfb: convert pr_debug()/dev_dbg() to pr_err()/dev_err() for error messages
Make the messages that are printed in case of fatal errors actually
visible to the user without having to recompile the driver with
debugging enabled.

Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-01-17 10:57:40 +02:00
Jingoo Han
15b6176857 video: vmlfb: remove unnecessary pci_set_drvdata()
The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-01-17 10:57:40 +02:00
Jingoo Han
11cf994c99 video: rivafb: remove unnecessary pci_set_drvdata()
The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-01-17 10:57:39 +02:00
Jingoo Han
7c815c59c0 video: nvidiafb: remove unnecessary pci_set_drvdata()
The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-01-17 10:57:39 +02:00
Jingoo Han
a1d99607c4 video: asiliantfb: remove unnecessary pci_set_drvdata()
The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-01-17 10:57:38 +02:00
Laurent Pinchart
40af1eb5a7 fbdev: sh_mobile_lcdcfb: Don't use plain 0 as NULL pointer
This fixes a sparse warning.

Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: linux-fbdev@vger.kernel.org
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-01-17 10:57:38 +02:00
David S. Miller
cf84eb0b09 Merge branch 'virtio_rx_merging'
Michael Dalton says:

====================
virtio-net: mergeable rx buffer size auto-tuning

The virtio-net device currently uses aligned MTU-sized mergeable receive
packet buffers. Network throughput for workloads with large average
packet size can be improved by posting larger receive packet buffers.
However, due to SKB truesize effects, posting large (e.g, PAGE_SIZE)
buffers reduces the throughput of workloads that do not benefit from GRO
and have no large inbound packets.

This patchset introduces virtio-net mergeable buffer size auto-tuning,
with buffer sizes ranging from aligned MTU-size to PAGE_SIZE. Packet
buffer size is chosen based on a per-receive queue EWMA of incoming
packet size.

To unify mergeable receive buffer memory allocation and improve
SKB frag coalescing, all mergeable buffer memory allocation is
migrated to per-receive queue page frag allocators.

The per-receive queue mergeable packet buffer size is exported via
sysfs, and the network device sysfs layer has been extended to add
support for device-specific per-receive queue sysfs attribute groups.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16 23:46:17 -08:00
Michael Dalton
fbf28d78f5 virtio-net: initial rx sysfs support, export mergeable rx buffer size
Add initial support for per-rx queue sysfs attributes to virtio-net. If
mergeable packet buffers are enabled, adds a read-only mergeable packet
buffer size sysfs attribute for each RX queue.

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael Dalton <mwdalton@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16 23:46:07 -08:00
Michael Dalton
03144b5869 lib: Ensure EWMA does not store wrong intermediate values
To ensure ewma_read() without a lock returns a valid but possibly
out of date average, modify ewma_add() by using ACCESS_ONCE to prevent
intermediate wrong values from being written to avg->internal.

Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Michael Dalton <mwdalton@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16 23:46:06 -08:00
Michael Dalton
a953be53ce net-sysfs: add support for device-specific rx queue sysfs attributes
Extend existing support for netdevice receive queue sysfs attributes to
permit a device-specific attribute group. Initial use case for this
support will be to allow the virtio-net device to export per-receive
queue mergeable receive buffer size.

Signed-off-by: Michael Dalton <mwdalton@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16 23:46:06 -08:00
Michael Dalton
ab7db91705 virtio-net: auto-tune mergeable rx buffer size for improved performance
Commit 2613af0ed1 ("virtio_net: migrate mergeable rx buffers to page frag
allocators") changed the mergeable receive buffer size from PAGE_SIZE to
MTU-size, introducing a single-stream regression for benchmarks with large
average packet size. There is no single optimal buffer size for all
workloads.  For workloads with packet size <= MTU bytes, MTU + virtio-net
header-sized buffers are preferred as larger buffers reduce the TCP window
due to SKB truesize. However, single-stream workloads with large average
packet sizes have higher throughput if larger (e.g., PAGE_SIZE) buffers
are used.

This commit auto-tunes the mergeable receiver buffer packet size by
choosing the packet buffer size based on an EWMA of the recent packet
sizes for the receive queue. Packet buffer sizes range from MTU_SIZE +
virtio-net header len to PAGE_SIZE. This improves throughput for
large packet workloads, as any workload with average packet size >=
PAGE_SIZE will use PAGE_SIZE buffers.

These optimizations interact positively with recent commit
ba27524103 ("virtio-net: coalesce rx frags when possible during rx"),
which coalesces adjacent RX SKB fragments in virtio_net. The coalescing
optimizations benefit buffers of any size.

Benchmarks taken from an average of 5 netperf 30-second TCP_STREAM runs
between two QEMU VMs on a single physical machine. Each VM has two VCPUs
with all offloads & vhost enabled. All VMs and vhost threads run in a
single 4 CPU cgroup cpuset, using cgroups to ensure that other processes
in the system will not be scheduled on the benchmark CPUs. Trunk includes
SKB rx frag coalescing.

net-next w/ virtio_net before 2613af0ed1 (PAGE_SIZE bufs): 14642.85Gb/s
net-next (MTU-size bufs):  13170.01Gb/s
net-next + auto-tune: 14555.94Gb/s

Jason Wang also reported a throughput increase on mlx4 from 22Gb/s
using MTU-sized buffers to about 26Gb/s using auto-tuning.

Signed-off-by: Michael Dalton <mwdalton@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16 23:46:06 -08:00
Michael Dalton
fb51879dbc virtio-net: use per-receive queue page frag alloc for mergeable bufs
The virtio-net driver currently uses netdev_alloc_frag() for GFP_ATOMIC
mergeable rx buffer allocations. This commit migrates virtio-net to use
per-receive queue page frags for GFP_ATOMIC allocation. This change unifies
mergeable rx buffer memory allocation, which now will use skb_refill_frag()
for both atomic and GFP-WAIT buffer allocations.

To address fragmentation concerns, if after buffer allocation there
is too little space left in the page frag to allocate a subsequent
buffer, the remaining space is added to the current allocated buffer
so that the remaining space can be used to store packet data.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael Dalton <mwdalton@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16 23:46:06 -08:00
Michael Dalton
097b4f19e5 net: allow > 0 order atomic page alloc in skb_page_frag_refill
skb_page_frag_refill currently permits only order-0 page allocs
unless GFP_WAIT is used. Change skb_page_frag_refill to attempt
higher-order page allocations whether or not GFP_WAIT is used. If
memory cannot be allocated, the allocator will fall back to
successively smaller page allocs (down to order-0 page allocs).

This change brings skb_page_frag_refill in line with the existing
page allocation strategy employed by netdev_alloc_frag, which attempts
higher-order page allocations whether or not GFP_WAIT is set, falling
back to successively lower-order page allocations on failure. Part
of migration of virtio-net to per-receive queue page frag allocators.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Michael Dalton <mwdalton@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16 23:46:06 -08:00
Wei Yongjun
722e47d792 net_sched: fix error return code in fw_change_attrs()
The error code was not set if change indev fail, so the error
condition wasn't reflected in the return value. Fix to return a
negative error code from this error handling case instead of 0.

Fixes: 2519a602c2 ('net_sched: optimize tcf_match_indev()')
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16 19:12:03 -08:00
David S. Miller
8b88a11e44 Merge branch 'tipc'
Ying Xue says:

====================
tipc: align TIPC behaviours of waiting for events with other stacks

Comparing the current implementations of waiting for events in TIPC
socket layer with other stacks, TIPC's behaviour is very different
because wait_event_interruptible_timeout()/wait_event_interruptible()
are always used by TIPC to wait for events while relevant socket or
port variables are fed to them as their arguments. As socket lock has
to be released temporarily before the two routines of waiting for
events are called, their arguments associated with socket or port
structures are out of socket lock protection. This might cause
serious issues where the process of calling socket syscall such as
sendsmg(), connect(), accept(), and recvmsg(), cannot be waken up
at all even if proper event arrives or improperly be woken up
although the condition of waking up the process is not satisfied
in practice.

Therefore, aligning its behaviours with similar functions implemented
in other stacks, for instance, sk_stream_wait_connect() and
inet_csk_wait_for_connect() etc, can avoid above risks for us.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16 19:11:22 -08:00
Ying Xue
9bbb4ecc68 tipc: standardize recvmsg routine
Standardize the behaviour of waiting for events in TIPC recvmsg()
so that all variables of socket or port structures are protected
within socket lock, allowing the process of calling recvmsg() to
be woken up at appropriate time.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16 19:10:34 -08:00
Ying Xue
391a6dd1da tipc: standardize sendmsg routine of connected socket
Standardize the behaviour of waiting for events in TIPC send_packet()
so that all variables of socket or port structures are protected within
socket lock, allowing the process of calling sendmsg() to be woken up
at appropriate time.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16 19:10:34 -08:00
Ying Xue
3f40504f7e tipc: standardize sendmsg routine of connectionless socket
Comparing the behaviour of how to wait for events in TIPC sendmsg()
with other stacks, the TIPC implementation might be perceived as
different, and sometimes even incorrect. For instance, sk_sleep()
and tport->congested variables associated with socket are exposed
without socket lock protection while wait_event_interruptible_timeout()
accesses them. So standardizing it with similar implementation
in other stacks can help us correct these errors which the process
of calling sendmsg() cannot be woken up event if an expected event
arrive at socket or improperly woken up although the wake condition
doesn't match.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16 19:10:34 -08:00
Ying Xue
6398e23cdb tipc: standardize accept routine
Comparing the behaviour of how to wait for events in TIPC accept()
with other stacks, the TIPC implementation might be perceived as
different, and sometimes even incorrect. As sk_sleep() and
sk->sk_receive_queue variables associated with socket are not
protected by socket lock, the process of calling accept() may be
woken up improperly or sometimes cannot be woken up at all. After
standardizing it with inet_csk_wait_for_connect routine, we can
get benefits including: avoiding 'thundering herd' phenomenon,
adding a timeout mechanism for accept(), coping with a pending
signal, and having sk_sleep() and sk->sk_receive_queue being
always protected within socket lock scope and so on.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16 19:10:34 -08:00
Ying Xue
78eb3a5379 tipc: standardize connect routine
Comparing the behaviour of how to wait for events in TIPC connect()
with other stacks, the TIPC implementation might be perceived as
different, and sometimes even incorrect. For instance, as both
sock->state and sk_sleep() are directly fed to
wait_event_interruptible_timeout() as its arguments, and socket lock
has to be released before we call wait_event_interruptible_timeout(),
the two variables associated with socket are exposed out of socket
lock protection, thereby probably getting stale values so that the
process of calling connect() cannot be woken up exactly even if
correct event arrives or it is woken up improperly even if the wake
condition is not satisfied in practice. Therefore, standardizing its
behaviour with sk_stream_wait_connect routine can avoid these risks.

Additionally the implementation of connect routine is simplified as a
whole, allowing it to return correct values in all different cases.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16 19:10:34 -08:00
wangweidong
abfce3ef58 sctp: remove the unnecessary assignment
When go the right path, the status is 0, no need to assign it again.
So just remove the assignment.

Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16 17:31:49 -08:00
Jason Wang
be121f46af virtio-net: drop rq->max and rq->num
It looks like there's no need for those two fields:

- Unless there's a failure for the first refill try, rq->max should be always
  equal to the vring size.
- rq->num is only used to determine the condition that we need to do the refill,
  we could check vq->num_free instead.
- rq->num was required to be increased or decreased explicitly after each
  get/put which results a bad API.

So this patch removes them both to make the code simpler.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16 17:30:42 -08:00
Lad, Prabhakar
9b05f462b7 net: davinci_mdio: Fix sparse warning
This patch fixes following sparse warning
davinci_mdio.c:85:27: warning: symbol 'default_pdata' was not declared. Should it be static?
Also makes the default_pdata as a constant.

Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16 17:29:53 -08:00