The array subscript increments after the execution of the statement.
So there is no issue here. However it helps to read the code better.
Signed-off-by: Babu Moger <babu.moger@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
I noticed this variable used for register reads wasn't an unsigned
so this patch corrects that. I don't believe this was causing any
issue as is but this is more consistent with the rest of the driver.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bump the version number to more closely match the function included
in the driver.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The second parameter of these functions is the index to the led we
are interested in affecting. However we were mistakenly passing
the offset in the register. This patch corrects that and adds some
bonds checking which would hopefully make bugs like this more noticeable
in the future.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Fix the NACK check in ixgbevf_set_uc_addr_vf() for instances where
index != 0.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Fixes dmesg spam when we just need to wait a moment for the clock
driver to probe.
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
i40e PF reset clears mac filters. If platform-specific mac address
is used, driver has to explicitly write default mac address to mac
filters otherwise all incoming traffic destined to default mac
address will be dropped after reset.
This issue was found on SPARC while toggling i40e ntuple via ethtool.
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Adding the missing link advertise for some X710 NICs.
This can be observed by simply calling ethtool on the interface.
root@rhel7:~ # ethtool eth0
Settings for eth0:
Supported ports: [ FIBRE ]
Supported link modes: 10000baseT/Full
Supported pause frame use: Symmetric
Supports auto-negotiation: No
Advertised link modes: Not reported
[...]
With fix applied.
root@rhel7:~ # ethtool eth0
Settings for eth0:
Supported ports: [ FIBRE ]
Supported link modes: 10000baseT/Full
Supported pause frame use: Symmetric
Supports auto-negotiation: No
Advertised link modes: 10000baseT/Full
[...]
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We need to lock the client list around the i40e_client_release call to
prevent the release from interrupting the client instances while they are
being added.
Change-Id: I99993f20179aaf8730207833e7d0869d2ccffa1d
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Remove redundant call to memset before a call to memcpy.
The Coccinelle semantic patch used to make this change is as follows:
@@
expression e1,e2,e3,e4;
@@
- memset(e1,e2,e3);
memcpy(e1,e4,e3);
Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Properly track filter adds and deletes so the driver doesn't lose filters
during resets and up/down cycles. Add a tracking mechanism so that the
driver knows when to enter and leave promiscuous mode.
Implement a simple state machine so the driver can track the status of
each filter throughout its lifecycle. Properly manage the overflow promiscuous
state for the each VSI, and provide a way for the driver to detect when to exit
overflow promiscuous mode.
Remove all possible default MAC filters that the firmware may have set up so
that the driver can manage these correctly, particularly when VLANs come into
play. Remove the LAA flag for filters; instead just send whatever we get through
set_mac to the firmware as the LAA for wakeup purposes.
Finally, add the state of each filter to debugfs output so we can see what's
going on inside the driver's pointy little head.
Change-ID: I97c5e366fac2254fa01eaff4f65c0af61dcf2e1f
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds the Hyper-V specific VF device ids.
Change-ID: I9c4fe6d8dfd34f7f68ebc9fdae225c8768439c89
Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This device ID is not needed, so take it out.
Change-ID: I148d29f68a1f58b03980ecd83047a1b440f4f74d
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This initializer isn't needed because the variable is assigned right
away.
Change-ID: I6ce3edb3f4e0364db248a7a0bcc62ca95c01d941
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When link is down, Advertised Link Modes was wrongly displaying full
supported link modes instead of Advertised link mode. Added conditional
checks in order to make sure correct Advertised link modes are
displayed when the link is down.
Change-ID: I8a61413f9ee174149c7a33157b5f0b0a8da9842d
Signed-off-by: Avinash Dayanand <avinash.dayanand@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Drivers which use the SMBus extensions select I2C_SMBUS, so the
stubs are not needed.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
In function i40e_debug_aq parameter desc is assumed to be
possibly NULL. Do not dereference it before checking the
value.
Fixes: f905dd62be ("i40e/i40evf: add max buf len to aq debug print helper")
Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
sock_cmsg_send() can return different error codes and not only
-EINVAL, and we should properly propagate them.
Fixes: c14ac9451c ("sock: enable timestamping using control messages")
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni says:
====================
macsec: enable s/w offloads
This patches leverage gro_cells infrastructure to enable both GRO and RPS
on macsec devices.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Use gro_gells to trigger GRO and allow RPS on macsec traffic
after decryption.
Also, be sure to avoid clearing software offload features in
macsec_fix_features().
Overall this increase TCP tput by 30% on recent h/w.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
so that the caller can update stats accordingly, if needed
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Been around for long enough now, hasn't caused any regression test
failures in the past 3 months, so it's time to make it a fully
supported feature.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
In xfs_finish_page_writeback(), we have a loop that looks like this:
do {
if (off < bvec->bv_offset)
goto next_bh;
if (off > end)
break;
bh->b_end_io(bh, !error);
next_bh:
off += bh->b_size;
} while ((bh = bh->b_this_page) != head);
The b_end_io function is end_buffer_async_write(), which will call
end_page_writeback() once all the buffers have marked as no longer
under IO. This issue here is that the only thing currently
protecting both the bufferhead chain and the page from being
reclaimed is the PageWriteback state held on the page.
While we attempt to limit the loop to just the buffers covered by
the IO, we still read from the buffer size and follow the next
pointer in the bufferhead chain. There is no guarantee that either
of these are valid after the PageWriteback flag has been cleared.
Hence, loops like this are completely unsafe, and result in
use-after-free issues. One such problem was caught by Calvin Owens
with KASAN:
.....
INFO: Freed in 0x103fc80ec age=18446651500051355200 cpu=2165122683 pid=-1
free_buffer_head+0x41/0x90
__slab_free+0x1ed/0x340
kmem_cache_free+0x270/0x300
free_buffer_head+0x41/0x90
try_to_free_buffers+0x171/0x240
xfs_vm_releasepage+0xcb/0x3b0
try_to_release_page+0x106/0x190
shrink_page_list+0x118e/0x1a10
shrink_inactive_list+0x42c/0xdf0
shrink_zone_memcg+0xa09/0xfa0
shrink_zone+0x2c3/0xbc0
.....
Call Trace:
<IRQ> [<ffffffff81e8b8e4>] dump_stack+0x68/0x94
[<ffffffff8153a995>] print_trailer+0x115/0x1a0
[<ffffffff81541174>] object_err+0x34/0x40
[<ffffffff815436e7>] kasan_report_error+0x217/0x530
[<ffffffff81543b33>] __asan_report_load8_noabort+0x43/0x50
[<ffffffff819d651f>] xfs_destroy_ioend+0x3bf/0x4c0
[<ffffffff819d69d4>] xfs_end_bio+0x154/0x220
[<ffffffff81de0c58>] bio_endio+0x158/0x1b0
[<ffffffff81dff61b>] blk_update_request+0x18b/0xb80
[<ffffffff821baf57>] scsi_end_request+0x97/0x5a0
[<ffffffff821c5558>] scsi_io_completion+0x438/0x1690
[<ffffffff821a8d95>] scsi_finish_command+0x375/0x4e0
[<ffffffff821c3940>] scsi_softirq_done+0x280/0x340
Where the access is occuring during IO completion after the buffer
had been freed from direct memory reclaim.
Prevent use-after-free accidents in this end_io processing loop by
pre-calculating the loop conditionals before calling bh->b_end_io().
The loop is already limited to just the bufferheads covered by the
IO in progress, so the offset checks are sufficient to prevent
accessing buffers in the chain after end_page_writeback() has been
called by the the bh->b_end_io() callout.
Yet another example of why Bufferheads Must Die.
cc: <stable@vger.kernel.org> # 4.7
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reported-and-Tested-by: Calvin Owens <calvinowens@fb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
One of the problems we currently have with delayed logging is that
under serious memory pressure we can deadlock memory reclaim. THis
occurs when memory reclaim (such as run by kswapd) is reclaiming XFS
inodes and issues a log force to unpin inodes that are dirty in the
CIL.
The CIL is pushed, but this will only occur once it gets the CIL
context lock to ensure that all committing transactions are complete
and no new transactions start being committed to the CIL while the
push switches to a new context.
The deadlock occurs when the CIL context lock is held by a
committing process that is doing memory allocation for log vector
buffers, and that allocation is then blocked on memory reclaim
making progress. Memory reclaim, however, is blocked waiting for
a log force to make progress, and so we effectively deadlock at this
point.
To solve this problem, we have to move the CIL log vector buffer
allocation outside of the context lock so that memory reclaim can
always make progress when it needs to force the log. The problem
with doing this is that a CIL push can take place while we are
determining if we need to allocate a new log vector buffer for
an item and hence the current log vector may go away without
warning. That means we canot rely on the existing log vector being
present when we finally grab the context lock and so we must have a
replacement buffer ready to go at all times.
To ensure this, introduce a "shadow log vector" buffer that is
always guaranteed to be present when we gain the CIL context lock
and format the item. This shadow buffer may or may not be used
during the formatting, but if the log item does not have an existing
log vector buffer or that buffer is too small for the new
modifications, we swap it for the new shadow buffer and format
the modifications into that new log vector buffer.
The result of this is that for any object we modify more than once
in a given CIL checkpoint, we double the memory required
to track dirty regions in the log. For single modifications then
we consume the shadow log vectorwe allocate on commit, and that gets
consumed by the checkpoint. However, if we make multiple
modifications, then the second transaction commit will allocate a
shadow log vector and hence we will end up with double the memory
usage as only one of the log vectors is consumed by the CIL
checkpoint. The remaining shadow vector will be freed when th elog
item is freed.
This can probably be optimised in future - access to the shadow log
vector is serialised by the object lock (as opposited to the active
log vector, which is controlled by the CIL context lock) and so we
can probably free shadow log vector from some objects when the log
item is marked clean on removal from the AIL.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
xfsprogs source commit 4280e59dcbc4cd8e01585efe788a68eb378048e8
xfs_da3_split() has to handle all three versions of the
directory/attribute btree structure. The attr tree is v1, the dir
tre is v2 or v3. The main difference between the v1 and v2/3 trees
is the way tree nodes are split - in the v1 tree we can require a
double split to occur because the object to be inserted may be
larger than the space made by splitting a leaf. In this case we need
to do a double split - one to split the full leaf, then another to
allocate an empty leaf block in the correct location for the new
entry. This does not happen with dir (v2/v3) formats as the objects
being inserted are always guaranteed to fit into the new space in
the split blocks.
Indeed, for directories they *may* be an extra block on this buffer
pointer. However, it's guaranteed not to be a leaf block (i.e. a
directory data block) - the directory code only ever places hash
index or free space blocks in this pointer (as a cursor of
sorts), and so to use it as a directory data block will immediately
corrupt the directory.
The problem is that the code assumes that there may be extra blocks
that we need to link into the tree once we've split the root, but
this is not true for either dir or attr trees, because the extra
attr block is always consumed by the last node split before we split
the root. Hence the linking in an extra block is always wrong at the
root split level, and this manifests itself in repair as a directory
corruption in a repaired directory, leaving the directory rebuild
incomplete.
This is a dir v2 zero-day bug - it was in the initial dir v2 commit
that was made back in February 1998.
Fix this by ensuring the linking of the blocks after the root split
never tries to make use of the extra blocks that may be held in the
cursor. They are held there for other purposes and should never be
touched by the root splitting code.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
We check IS_DAX(inode) before calling either xfs_file_dax_read or
xfs_file_dax_write, and this will lead the call being optimized out at
compile time when CONFIG_FS_DAX is disabled.
However, the two functions are marked STATIC, so they become global
symbols when CONFIG_XFS_DEBUG is set, leaving us with two unused global
functions that call into an undefined function and a broken "allmodconfig"
build:
fs/built-in.o: In function `xfs_file_dax_read':
fs/xfs/xfs_file.c:348: undefined reference to `dax_do_io'
fs/built-in.o: In function `xfs_file_dax_write':
fs/xfs/xfs_file.c:758: undefined reference to `dax_do_io'
Marking the two functions 'static noinline' instead of 'STATIC' will let
the compiler drop the symbols when there are no callers but avoid the
implicit inlining.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 16d4d43595 ("xfs: split direct I/O and DAX path")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
XFS has had scattered reports of delalloc blocks present at
->releasepage() time. This results in a warning with a stack trace
similar to the following:
...
Call Trace:
[<ffffffffa23c5b8f>] dump_stack+0x63/0x84
[<ffffffffa20837a7>] warn_slowpath_common+0x97/0xe0
[<ffffffffa208380a>] warn_slowpath_null+0x1a/0x20
[<ffffffffa2326caf>] xfs_vm_releasepage+0x10f/0x140
[<ffffffffa218c680>] ? page_mkclean_one+0xd0/0xd0
[<ffffffffa218d3a0>] ? anon_vma_prepare+0x150/0x150
[<ffffffffa21521c2>] try_to_release_page+0x32/0x50
[<ffffffffa2166b2e>] shrink_active_list+0x3ce/0x3e0
[<ffffffffa21671c7>] shrink_lruvec+0x687/0x7d0
[<ffffffffa21673ec>] shrink_zone+0xdc/0x2c0
[<ffffffffa2168539>] kswapd+0x4f9/0x970
[<ffffffffa2168040>] ? mem_cgroup_shrink_node_zone+0x1a0/0x1a0
[<ffffffffa20a0d99>] kthread+0xc9/0xe0
[<ffffffffa20a0cd0>] ? kthread_stop+0x100/0x100
[<ffffffffa26b404f>] ret_from_fork+0x3f/0x70
[<ffffffffa20a0cd0>] ? kthread_stop+0x100/0x100
This occurs because it is possible for shrink_active_list() to send
pages marked dirty to ->releasepage() when certain buffer_head threshold
conditions are met. shrink_active_list() doesn't check the page dirty
state apparently to handle an old ext3 corner case where in some cases
clean pages would not have the dirty bit cleared, thus it is up to the
filesystem to determine how to handle the page.
XFS currently handles the delalloc case properly, but this behavior
makes the warning spurious. Update the XFS ->releasepage() handler to
explicitly skip dirty pages. Retain the existing delalloc/unwritten
checks so we continue to warn if such buffers exist on clean pages when
they shouldn't.
Diagnosed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
The handlers provided by cpufreq core are sufficient for resolving the
frequency for drivers providing ->target_index(), as the core already
has the frequency table and so ->resolve_freq() isn't required for such
platforms.
This patch disallows drivers with ->target_index() callback to use the
->resolve_freq() callback.
Also, it fixes a potential kernel crash for drivers providing ->target()
but no ->resolve_freq().
Fixes: e3c0623608 "cpufreq: add cpufreq_driver_resolve_freq()"
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Now that ACPI processor idle driver supports LPI(Low Power Idle), lets
enable ACPI_PROCESSOR_IDLE for ARM64 too.
This patch just removes the IA64 and X86 dependency on ACPI_PROCESSOR_IDLE
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch adds appropriate callbacks to support ACPI Low Power Idle
(LPI) on ARM64.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch adds support for initialisation of PSCI CPUIdle states
from Low Power Idle(_LPI) entries in the ACPI tables when acpi is
enabled.
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The function arm_enter_idle_state is exactly the same in both generic
ARM{32,64} CPUIdle driver and will be the same even on ARM64 backend
for ACPI processor idle driver. So we can unify it and move it to a
common place by introducing CPU_PM_CPU_IDLE_ENTER macro that can be
used in all places avoiding duplication.
This is in preparation of reuse of the generic cpuidle entry function
for ACPI LPI support on ARM64.
Suggested-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit ea389daa7f (arm64: cpuidle: add __init section marker to
arm_cpuidle_init) added the __init annotation to arm_cpuidle_init
as it was not needed after booting which was correct at that time.
However with the introduction of ACPI LPI support, this will be used
from cpuhotplug path in ACPI processor driver.
This patch drops the __init annotation from arm_cpuidle_init to avoid
the following warning:
WARNING: vmlinux.o(.text+0x113c8): Section mismatch in reference from the
function acpi_processor_ffh_lpi_probe() to the function
.init.text:arm_cpuidle_init()
The function acpi_processor_ffh_lpi_probe() references
the function __init arm_cpuidle_init().
This is often because acpi_processor_ffh_lpi_probe() lacks a __init
annotation or the annotation of arm_cpuidle_init is wrong.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ACPI 6.0 introduced an optional object _LPI that provides an alternate
method to describe Low Power Idle states. It defines the local power
states for each node in a hierarchical processor topology. The OSPM can
use _LPI object to select a local power state for each level of processor
hierarchy in the system. They used to produce a composite power state
request that is presented to the platform by the OSPM.
Since multiple processors affect the idle state for any non-leaf hierarchy
node, coordination of idle state requests between the processors is
required. ACPI supports two different coordination schemes: Platform
coordinated and OS initiated.
This patch adds initial support for Platform coordination scheme of LPI.
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ACPI 6.0 adds a new method to specify the CPU idle states(C-states)
called Low Power Idle(LPI) states. Since new architectures like ARM64
use only LPIs, introduce ACPI_PROCESSOR_CSTATE to encapsulate all the
code supporting the old style C-states(_CST).
This patch will help to extend the processor_idle module to support
LPI.
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When assign new PCI platform PM operations check for all mandatory fields to
prevent NULL pointer dereference.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
A call to cpufreq_driver_resolve_freq will cache the mapping from
the desired target frequency to the frequency table index. If there
is a mapping for the desired target frequency then use it instead of
looking up the mapping again.
Signed-off-by: Steve Muckle <smuckle@linaro.org>
Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The slow-path frequency transition path is relatively expensive as it
requires waking up a thread to do work. Should support be added for
remote CPU cpufreq updates that is also expensive since it requires an
IPI. These activities should be avoided if they are not necessary.
To that end, calculate the actual driver-supported frequency required by
the new utilization value in schedutil by using the recently added
cpufreq_driver_resolve_freq API. If it is the same as the previously
requested driver frequency then there is no need to continue with the
update assuming the cpu frequency limits have not changed. This will
have additional benefits should the semantics of the rate limit be
changed to apply solely to frequency transitions rather than to
frequency calculations in schedutil.
The last raw required frequency is cached. This allows the driver
frequency lookup to be skipped in the event that the new raw required
frequency matches the last one, assuming a frequency update has not been
forced due to limits changing (indicated by a next_freq value of
UINT_MAX, see sugov_should_update_freq).
Signed-off-by: Steve Muckle <smuckle@linaro.org>
Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Before this patch, if you used gfs2_jadd to add new journals of a
size smaller than the existing journals, replaying those new journals
would withdraw. That's because function gfs2_replay_incr_blk was
using the number of journal blocks (jd_block) from the superblock's
journal pointer. In other words, "My journal's max size" rather than
"the journal we're replaying's size." This patch changes the function
to use the size of the pertinent journal rather than always using the
journal we happen to be using.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
* kprobes:
arm64: kprobes: Add KASAN instrumentation around stack accesses
arm64: kprobes: Cleanup jprobe_return
arm64: kprobes: Fix overflow when saving stack
arm64: kprobes: WARN if attempting to step with PSTATE.D=1
kprobes: Add arm64 case in kprobe example module
arm64: Add kernel return probes support (kretprobes)
arm64: Add trampoline code for kretprobes
arm64: kprobes instruction simulation support
arm64: Treat all entry code as non-kprobe-able
arm64: Blacklist non-kprobe-able symbol
arm64: Kprobes with single stepping support
arm64: add conditional instruction simulation support
arm64: Add more test functions to insn.c
arm64: Add HAVE_REGS_AND_STACK_ACCESS_API feature
I don't think it is really possible to have a system where CPUID
enumerates support for XSAVE but that it does not have FP/SSE
(they are "legacy" features and always present).
But, I did manage to hit this case in qemu when I enabled its
somewhat shaky XSAVE support. The bummer is that the FPU is set
up before we parse the command-line or have *any* console support
including earlyprintk. That turned what should have been an easy
thing to debug in to a bit more of an odyssey.
So a BUG() here is worthless. All it does it guarantee that
if/when we hit this case we have an empty console. So, remove
the BUG() and try to limp along by disabling XSAVE and trying to
continue. Add a comment on why we are doing this, and also add
a common "out_disable" path for leaving fpu__init_system_xstate().
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160720194551.63BB2B58@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Passing "nosmp" should boot the kernel with a single processor, without
provision to enable secondary CPUs even if they are present. "nosmp" is
implemented by setting maxcpus=0. At the moment we still mark the secondary
CPUs present even with nosmp, which allows the userspace to bring them
up. This patch corrects the smp_prepare_cpus() to honor the maxcpus == 0.
Commit 44dbcc93ab ("arm64: Fix behavior of maxcpus=N") fixed the
behavior for maxcpus >= 1, but broke maxcpus = 0.
Fixes: 44dbcc93ab ("arm64: Fix behavior of maxcpus=N")
Cc: <stable@vger.kernel.org> # 4.7+
Cc: Will Deacon <will.deacon@arm.com>
Cc: James Morse <james.morse@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
[catalin.marinas@arm.com: updated code comment]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In smp_prepare_boot_cpu(), we invoke cpuinfo_store_boot_cpu to store
the cpuinfo in a per-cpu ptr, before initialising the per-cpu offset for
the boot CPU. This patch reorders the sequence to make sure we initialise
the per-cpu offset before accessing the per-cpu area.
Commit 4b998ff188 ("arm64: Delay cpuinfo_store_boot_cpu") fixed the
issue where we modified the per-cpu area even before the kernel initialises
the per-cpu areas, but failed to wait until the boot cpu updated it's
offset.
Fixes: 4b998ff188 ("arm64: Delay cpuinfo_store_boot_cpu")
Cc: <stable@vger.kernel.org> # 4.4+
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cpufreq governors may need to know what a particular target frequency
maps to in the driver without necessarily wanting to set the frequency.
Support this operation via a new cpufreq API,
cpufreq_driver_resolve_freq(). This API returns the lowest driver
frequency equal or greater than the target frequency
(CPUFREQ_RELATION_L), subject to any policy (min/max) or driver
limitations. The mapping is also cached in the policy so that a
subsequent fast_switch operation can avoid repeating the same lookup.
The API will call a new cpufreq driver callback, resolve_freq(), if it
has been registered by the driver. Otherwise the frequency is resolved
via cpufreq_frequency_table_target(). Rather than require ->target()
style drivers to provide a resolve_freq() callback it is left to the
caller to ensure that the driver implements this callback if necessary
to use cpufreq_driver_resolve_freq().
Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Steve Muckle <smuckle@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Previous patches added support for Intel's AVX-512 instructions to the
kernel and perf tools instruction decoders.
AVX-512 instructions are documented in Intel Architecture Instruction
Set Extensions Programming Reference (February 2016).
Add a representative set of instructions to perf's "new instructions"
test. e.g.
perf test "new instructions"
Or to view a particular instruction:
perf test -v "new instructions" 2>&1 | grep vbroadcasti64x4
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: X86 ML <x86@kernel.org>
Link: http://lkml.kernel.org/r/1469003437-32706-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add support for Intel's AVX-512 instructions to perf tools instruction
decoder used by Intel PT. The kernel's instruction decoder was updated in
a previous patch.
AVX-512 instructions are documented in Intel Architecture Instruction Set
Extensions Programming Reference (February 2016).
AVX-512 instructions are identified by a EVEX prefix which, for the purpose
of instruction decoding, can be treated as though it were a 4-byte VEX
prefix.
Existing instructions which can now accept an EVEX prefix need not be
further annotated in the op code map (x86-opcode-map.txt). In the case of
new instructions, the op code map is updated accordingly.
Also add associated Mask Instructions that are used to manipulate mask
registers used in AVX-512 instructions.
A representative set of instructions is added to the perf tools new
instructions test in a subsequent patch.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: X86 ML <x86@kernel.org>
Link: http://lkml.kernel.org/r/1469003437-32706-4-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add support for Intel's AVX-512 instructions to the instruction decoder.
AVX-512 instructions are documented in Intel Architecture Instruction
Set Extensions Programming Reference (February 2016).
AVX-512 instructions are identified by a EVEX prefix which, for the
purpose of instruction decoding, can be treated as though it were a
4-byte VEX prefix.
Existing instructions which can now accept an EVEX prefix need not be
further annotated in the op code map (x86-opcode-map.txt). In the case
of new instructions, the op code map is updated accordingly.
Also add associated Mask Instructions that are used to manipulate mask
registers used in AVX-512 instructions.
The 'perf tools' instruction decoder is updated in a subsequent patch.
And a representative set of instructions is added to the perf tools new
instructions test in a subsequent patch.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: X86 ML <x86@kernel.org>
Link: http://lkml.kernel.org/r/1469003437-32706-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>