Add helper function to handle the submission of fenced control requests.
Make sure we initialize the fence while holding the virtqueue lock, so
requests can't be reordered.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Add virtio_gpu_queue_ctrl_buffer_locked function, which does the same as
virtio_gpu_queue_ctrl_buffer but does not take the virtqueue lock. The
caller must hold the lock instead.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Allow for arch-specific interrupt types to be set. For that, add
kvm_arch_set_irq() which takes interrupt type-specific action if it
recognizes the interrupt type given, and -EWOULDBLOCK otherwise.
The default implementation always returns -EWOULDBLOCK.
Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Factor out kvm_notify_acked_gsi() helper to iterate over EOI listeners
and notify those matching the given gsi.
It will be reused in the upcoming Hyper-V SynIC implementation.
Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The loop(for) inside irqfd_update() is unnecessary
because any other value for irq_entry.type will just trigger
schedule_work(&irqfd->inject) in irqfd_wakeup.
Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As reported at https://bugs.launchpad.net/qemu/+bug/1494350,
it is possible to have vcpu->arch.st.last_steal initialized
from a thread other than vcpu thread, say the iothread, via
KVM_SET_MSRS.
Which can cause an overflow later (when subtracting from vcpu threads
sched_info.run_delay).
To avoid that, move steal time accumulation to vcpu entry time,
before copying steal time data to guest.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Calling kvm_vcpu_gfn_to_memslot() twice in mapping_level() should be
avoided since getting a slot by binary search may not be negligible,
especially for virtual machines with many memory slots.
Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now that it has only one caller, and its name is not so helpful for
readers, remove it. The new memslot_valid_for_gpte() function
makes it possible to share the common code between
gfn_to_memslot_dirty_bitmap() and mapping_level().
Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is necessary to eliminate an extra memory slot search later.
Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As a bonus, an extra memory slot search can be eliminated when
is_self_change_mapping is true.
Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This will be passed to a function later.
Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently we always write the next_rip of the shadow vmcb to
the guests vmcb when we emulate a vmexit. This could confuse
the guest when its cpuid indicated no support for the
next_rip feature.
Fix this by only propagating next_rip if the guest actually
supports it.
Cc: Bandan Das <bsd@redhat.com>
Cc: Dirk Mueller <dmueller@suse.com>
Tested-By: Dirk Mueller <dmueller@suse.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Expose VPID capability to L1. For nested guests, we don't do anything
specific for single context invalidation. Hence, only advertise support
for global context invalidation. The major benefit of nested VPID comes
from having separate vpids when switching between L1 and L2, and also
when L2's vCPUs not sched in/out on L1.
Reviewed-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
VPID is used to tag address space and avoid a TLB flush. Currently L0 use
the same VPID to run L1 and all its guests. KVM flushes VPID when switching
between L1 and L2.
This patch advertises VPID to the L1 hypervisor, then address space of L1
and L2 can be separately treated and avoid TLB flush when swithing between
L1 and L2. For each nested vmentry, if vpid12 is changed, reuse shadow vpid
w/ an invvpid.
Performance:
run lmbench on L2 w/ 3.5 kernel.
Context switching - times in microseconds - smaller is better
-------------------------------------------------------------------------
Host OS 2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
ctxsw ctxsw ctxsw ctxsw ctxsw ctxsw ctxsw
--------- ------------- ------ ------ ------ ------ ------ ------- -------
kernel Linux 3.5.0-1 1.2200 1.3700 1.4500 4.7800 2.3300 5.60000 2.88000 nested VPID
kernel Linux 3.5.0-1 1.2600 1.4300 1.5600 12.7 12.9 3.49000 7.46000 vanilla
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add the INVVPID instruction emulation.
Reviewed-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Because eth_type_trans() consumes ethernet header worth of bytes, a call
to read TCI from end of packet using rhine_rx_vlan_tag() no longer works
as it's reading from an invalid offset.
Tested to be working on PCEngines Alix board.
Fixes: 810f19bcb8 ("via-rhine: add consistent memory barrier in vlan receive code.")
Signed-off-by: Andrej Ota <andrej@ota.si>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, is only called from __prog_put_rcu in the bpf_prog_release
path. Need this to call this from bpf_prog_put also to get correct
accounting.
Fixes: aaac3ba95e ("bpf: charge user for creation of BPF maps and programs")
Signed-off-by: Tom Herbert <tom@herbertland.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet says:
====================
tcp/dccp: make our listener code more robust
This patch series addresses request sockets leaks and listener dismantle
phase. This survives a stress test with listeners being added/removed
quite randomly.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Under stress, a close() on a listener can trigger the
WARN_ON(sk->sk_ack_backlog) in inet_csk_listen_stop()
We need to test if listener is still active before queueing
a child in inet_csk_reqsk_queue_add()
Create a common inet_child_forget() helper, and use it
from inet_csk_reqsk_queue_add() and inet_csk_listen_stop()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Let's reduce the confusion about inet_csk_reqsk_queue_drop() :
In many cases we also need to release reference on request socket,
so add a helper to do this, reducing code size and complexity.
Fixes: 4bdc3d6614 ("tcp/dccp: fix behavior of stale SYN_RECV request sockets")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit c69736696c.
At the time of above commit, tcp_req_err() and dccp_req_err()
were dead code, as SYN_RECV request sockets were not yet in ehash table.
Real bug was fixed later in a different commit.
We need to revert to not leak a refcount on request socket.
inet_csk_reqsk_queue_drop_and_put() will be added
in following commit to make clean inet_csk_reqsk_queue_drop()
does not release the reference owned by caller.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Split the API and FPU type definitions into separate header files
similar to "x86/fpu: Rename fpu-internal.h to fpu/internal.h" (78f7f1e54b).
The new header files and their meaning are:
asm/fpu/types.h:
FPU related data types, needed for 'struct thread_struct' and
'struct task_struct'.
asm/fpu/api.h:
FPU related 'public' functions for other subsystems and device
drivers.
asm/fpu/internal.h:
FPU internal functions mainly used to convert
FPU register contents in signal handling.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
ip6_blackhole_route() does not initialize the newly allocated
rt6_info properly. This patch:
1. Call rt6_info_init() to initialize rt6i_siblings and rt6i_uncached
2. The current rt->dst._metrics init code is incorrect:
- 'rt->dst._metrics = ort->dst._metris' is not always safe
- Not sure what dst_copy_metrics() is trying to do here
considering ip6_rt_blackhole_cow_metrics() always returns
NULL
Fix:
- Always do dst_copy_metrics()
- Replace ip6_rt_blackhole_cow_metrics() with
dst_cow_metrics_generic()
3. Mask out the RTF_PCPU bit from the newly allocated blackhole route.
This bug triggers an oops (reported by Phil Sutter) in rt6_get_cookie().
It is because RTF_PCPU is set while rt->dst.from is NULL.
Fixes: d52d3997f8 ("ipv6: Create percpu rt6_info")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Reported-by: Phil Sutter <phil@nwl.cc>
Tested-by: Phil Sutter <phil@nwl.cc>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Phil Sutter <phil@nwl.cc>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce rt6_info_init() to do the common init work for
'struct rt6_info' (after calling dst_alloc).
It is a prep work to fix the rt6_info init logic in the
ip6_blackhole_route().
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Phil Sutter <phil@nwl.cc>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch makes sure that conn_params that were created just for
explicit_connect, will get properly deleted during cleanup.
Signed-off-by: Jakub Pawlowski <jpawlowski@google.com>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
After clearing the params->explicit_connect variable the parameters
may need to be either added back to the right list or potentially left
absent from both the le_reports and the le_conns lists.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Devices undergoing an explicit connect should not have their
conn_params struct removed by the mgmt Remove Device command. This
patch fixes the necessary checks in the command handler to correct the
behavior.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
We can't use hci_explicit_connect_lookup() since that would only cover
explicit connections, leaving normal reconnections completely
untouched. Not using it in turn means leaving out entries in
pend_le_reports.
To fix this and simplify the logic move conn params from the reports
list to the pend_le_conns list for the duration of an explicit
connect. Once the connect is complete move the params back to the
pend_le_reports list. This also means that the explicit connect lookup
function only needs to look into the pend_le_conns list.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The code should never directly call hci_conn_hash_del since many
cleanup & reference counting updates would be lost. Normally
hci_conn_del is the right thing to do, but in the case of a connection
doing LE scanning this could cause a deadlock due to doing a
cancel_delayed_work_sync() on the same work callback that we were
called from.
Connections in the LE scanning state actually need very little cleanup
- just a small subset of hci_conn_del. To solve the issue, refactor
out these essential pieces into a new hci_conn_cleanup() function and
call that from the two necessary places.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
When disable/enable scan command is issued twice, some controllers
will return an error for the second request, i.e. requests with this
command will fail on some controllers, and succeed on others.
This patch makes sure that unnecessary scan disable/enable commands
are not issued.
When adding device to the auto connect whitelist when there is pending
connect attempt, there is no need to update scan.
hci_connect_le_scan_cleanup is conditionally executing
hci_conn_params_del, that is calling hci_update_background_scan. Make
the other case also update scan, and remove reduntand call from
hci_connect_le_scan_remove.
When stopping interleaved discovery the state should be set to stopped
only when both LE scanning and discovery has stopped.
Signed-off-by: Jakub Pawlowski <jpawlowski@google.com>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Many drivers initialize uselessly n_priv_flags, n_stats, testinfo_len,
eedump_len & regdump_len fields in their .get_drvinfo() ethtool op.
It's not necessary as these fields is filled in ethtool_get_drvinfo().
v2: removed unused variable
v3: removed another unused variable
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we never got the the clock-reference, i.e. when IS_ERR(clk) is true,
don't try to print the clock name via %pC as this of course produces a
null-pointer-dereference in __clk_get_name().
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Caesar Wang <wxt@rock-chips.com>
The function is void and static, so just ifdef its contents
instead of duplicating the declaration.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Printing "N/A mBi" is strange - print just "N/A" instead.
Also add a missing opening parenthesis.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Instead of having a lot of places that free ignored requests
and then return REG_REQ_OK, make reg_process_hint() process
REG_REQ_IGNORE by freeing the request, and let functions it
calls return that instead of freeing.
This also fixes a leak when a second (different) country IE
hint was ignored.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This function can only deal with treatment values OK and ALREADY_SET
so make the callees not return anything else and warn if they do.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
If there's a built-in regulatory database, there may be little point
in also calling out to CRDA and failing if the system is configured
that way. Allow removing CRDA support to save ~1K kernel size.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Jon Maloy says:
====================
tipc: some link level code improvements
Extensive testing has revealed some weaknesses and non-optimal solutions
in the link level code.
This commit series addresses those issues.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The change made in the previous commit revealed a small flaw in the way
the node FSM is updated. When the function tipc_node_link_down() is
called for the last link to a node, we should check whether this was
caused by a local reset or by a received RESET message from the peer.
In the latter case, we can directly issue a PEER_LOST_CONTACT_EVT to
the node FSM, so that it is ready to re-establish contact. If this is
not done, the peer node will sometimes have to go through a second
establish cycle before the link becomes stable.
We fix this in this commit by conditionally issuing the mentioned
event in the function tipc_node_link_down(). We also move LINK_RESET
FSM even away from the link_reset() function and into the caller
function, partially because it is easier to follow the code when state
changes are gathered at a limited number of locations, partially
because there will be cases in future commits where we don't want the
link to go RESET mode when link_reset() is called.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a link is taken down because of a node local event, such as
disabling of a bearer or an interface, we currently leave it to the
peer node to discover the broken communication. The default time for
such failure discovery is 1.5-2 seconds.
If we instead allow the terminating link endpoint to send out a RESET
message at the moment it is reset, we can achieve the impression that
both endpoints are going down instantly. Since this is a very common
scenario, we find it worthwhile to make this small modification.
Apart from letting the link produce the said message, we also have to
ensure that the interface is able to transmit it before TIPC is
detached. We do this by performing the disabling of a bearer in three
steps:
1) Disable reception of TIPC packets from the interface in question.
2) Take down the links, while allowing them so send out a RESET message.
3) Disable transmission of TIPC packets on the interface.
Apart from this, we now have to react on the NETDEV_GOING_DOWN event,
instead of as currently the NEDEV_DOWN event, to ensure that such
transmission is possible during the teardown phase.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>