* pci/host-dra7xx:
ARM: dts: am57xx-evm: Add 'gpios' property with gpio2_8
PCI: dra7xx: Add support to make GPIO drive PERST# line
PCI: dra7xx: Clear MSE bit during suspend so clocks will idle
PCI: dra7xx: Add PM support
PCI: dra7xx: Disable pm_runtime on get_sync failure
* pci/host-iproc:
PCI: iproc: Allow BCMA bus driver to be built as module
PCI: iproc: Add arm64 support
PCI: iproc: Delete unnecessary checks before phy calls
* pci/hotplug:
PCI: pciehp: Remove ignored MRL sensor interrupt events
PCI: pciehp: Remove unused interrupt events
PCI: pciehp: Handle invalid data when reading from non-existent devices
PCI: Hold pci_slot_mutex while searching bus->slots list
PCI: Protect pci_bus->slots with pci_slot_mutex, not pci_bus_sem
PCI: pciehp: Simplify pcie_poll_cmd()
PCI: Use "slot" and "pci_slot" for struct hotplug_slot and struct pci_slot
* pci/iommu:
PCI: Remove pci_ats_enabled()
PCI: Stop caching ATS Invalidate Queue Depth
PCI: Move ATS declarations to linux/pci.h so they're all together
PCI: Clean up ATS error handling
PCI: Use pci_physfn() rather than looking up physfn by hand
PCI: Inline the ATS setup code into pci_ats_init()
PCI: Rationalize pci_ats_queue_depth() error checking
PCI: Reduce size of ATS structure elements
PCI: Embed ATS info directly into struct pci_dev
PCI: Allocate ATS struct during enumeration
iommu/vt-d: Cache PCI ATS state and Invalidate Queue Depth
* pci/irq:
PCI: Kill off set_irq_flags() usage
* pci/virtualization:
PCI: Add ACS quirks for Intel I219-LM/V
During probe free the memory allocated to "exynos_info" in case of
unknown SoC type.
Signed-off-by: Shailendra Verma <shailendra.capricorn@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Lukasz Majewski <l.majewski@samsung.com>
[k.kozlowski: Rebased the patch around if(of_machine_is_compatible)]
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Before this commit, the following would happen:
a) acpi_video_get_backlight_type() gets called
b) acpi_video_get_backlight_type() calls acpi_video_init_backlight_type()
c) acpi_video_init_backlight_type() locks its function static init_mutex
d) acpi_video_init_backlight_type() calls backlight_register_notifier()
e) backlight_register_notifier() takes its notifier-chain lock
And when the backlight notifier chain gets called we've:
1) blocking_notifier_call_chain() gets called
2) blocking_notifier_call_chain() takes the notifier-chain lock
3) blocking_notifier_call_chain() calls acpi_video_backlight_notify()
4) acpi_video_backlight_notify() calls acpi_video_get_backlight_type()
5) acpi_video_get_backlight_type() calls acpi_video_init_backlight_type()
6) acpi_video_init_backlight_type() locks its function static init_mutex
So in the first call sequence we have:
a) init_mutex gets locked
b) notifier-chain gets locked
and in the second call sequence we have:
1) notifier-chain gets locked
2) init_mutex gets locked
And we've a circular locking dependency. This specific locking dependency
is fixable without using the big hammer otherwise known as a workqueue,
but further analysis shows a similar problem with the backlight notifier
chain lock vs register_count_mutex from drivers/acpi/acpi_video.c,
and fixing that becomes problematic.
So this commit simply fixes this with the big hammer, performance
wise this is a non issue as we expect the work to get scheduled
exactly zero or one times during normal system use.
Fixes: 93a291dfaf (ACPI / video: Move backlight notifier to video_detect.c)
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reported-and-tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When replacing del_timer() with del_timer_sync(), I introduced
a deadlock condition :
reqsk_queue_unlink() is called from inet_csk_reqsk_queue_drop()
inet_csk_reqsk_queue_drop() can be called from many contexts,
one being the timer handler itself (reqsk_timer_handler()).
In this case, del_timer_sync() loops forever.
Simple fix is to test if timer is pending.
Fixes: 2235f2ac75 ("inet: fix races with reqsk timers")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The code in question is called outside of standard driver
probe()/remove() callbacks and thus will not benefit from use of devm*
infrastructure.
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
There are some MAC registers that need to be kept in sync
with the link state parameters, see adjust_link().
However, after a MAC soft reset default values for
these registers are assumed. In some cases (excepting
if down/ if up for example) adjust_link() does not see
that these values were reset to default because the
priv->old* link parameters were left unchanged.
So, reset the priv->old* link params as well during a
MAC reset to let adjust_link() restore the MAC link
settings to the actual link state values.
Fixes following case, for example:
Setting link to 100M, changing MTU (implies MAC reset),
link state remains unchanged to 100M but MAC registers
were reset to default (1G) breaking the connectivity w/
the PHY. Closing and re-opening the interface would
restore the MAC link parameters to the correct values.
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When generating /proc/net/route we emit a header followed by a line for
each route. When a short read is performed we will restart this process
based on the open file descriptor. When calculating the start point we
fail to take into account that the 0th entry is the header. This leads
us to skip the first entry when doing a continuation read.
This can be easily seen with the comparison below:
while read l; do echo "$l"; done </proc/net/route >A
cat /proc/net/route >B
diff -bu A B | grep '^[+-]'
On my example machine I have approximatly 10KB of route output. There we
see the very first non-title element is lost in the while read case,
and an entry around the 8K mark in the cat case:
+wlan0 00000000 02021EAC 0003 0 0 400 00000000 0 0 0
-tun1 00C0AC0A 00000000 0001 0 0 950 00C0FFFF 0 0 0
Fix up the off-by-one when reaquiring position on continuation.
Fixes: 8be33e955c ("fib_trie: Fib walk rcu should take a tnode and key instead of a trie and a leaf")
BugLink: http://bugs.launchpad.net/bugs/1483440
Acked-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is used to override the default setting for burst size, changing
burst size takes effect only when the SBUSCFG.AHBBRST = 0.
Signed-off-by: Peter Chen <peter.chen@freescale.com>
With this setting and AHBBRST at SBUSCFG as "Incremental burst of
unspecified length", each non-burst size can be taken as single
transfer. It is benefit for non-burst size transfer.
Signed-off-by: Peter Chen <peter.chen@freescale.com>
Stream mode enable is known for better performance, this stream mode
enable patch has been passed with stress tests at device mode for
imx6sl and imx6sx, and no issue is found.
Signed-off-by: Peter Chen <peter.chen@freescale.com>
The system bus and chipidea IP have different limitations for
both host and device mode.
For example, with below errata, we need to enable SDIS(Stream Disable
Mode) at host mode. But we don't want it for device mode at the
same system.
TAR 9000378958
Title: Non-Double Word Aligned Buffer Address Sometimes Causes Host to
Hang on OUT Retry
Impacted Configuration: Host mode, all transfer types
Description:
The host core operating in streaming mode may under run while sending
the data packet of an OUT transaction. This under run can occur if
there are unexpected system delays in fetching the remaining packet
data from memory. The host forces a bad CRC on the packet, the device
detects the error and discards the packet. The host then retries a Bulk,
Interrupt, or Control transfer if an under run occurs according to the
USB specification. During simulations, it was found that the host does
not issue the retry of the failed bulk OUT. It does not issue any other
transactions except SOF packets that have incorrect frame numbers.
The second failure mode occurs if the under run occurs on an ISO OUT
transaction and the next ISO transaction is a zero byte packet. The host
does not issue any transactions (including SOFs). The device detects a
Suspend condition, reverts to full speed, and waits for resume signaling.
A third failure mode occurs when the host under runs on an ISO OUT and
the next ISO in the schedule is an ISO OUT with two max packets of 1024
bytes each. The host should issue MDATA for the first OUT followed by
DATA1 for the second. However, it drops the MDATA transaction, and
issues the DATA1 transaction.
The system impact of this bug is the same regardless of the failure mode
observed. The host core hangs, the ehci_ctrl state machine waits for the
protocol engine to send the completion status for the corrupted
transaction, which never occurs. No indication is sent to the host
controller driver, no register bits change and no interrupts occur.
Eventually the requesting application times out.
Detailed internal behavior:
The EHCI control state machine (ehci_ctrl) in the DMA block is responsible
for parsing the schedules and initiating all transactions. The ehci_ctrl
state machine passes the transaction details to the protocol block by
writing the transaction information in to the TxFIFO. It then asserts
the pe_hst_run_pkt signal to inform the host protocol state machine
(pe_hst_state) that there is a packet in the TxFIFO.
A tag of 0x0 indicates a start of packet with the data providing the
following information:
35:32 Tag
31:30 Reserved
29:23 Endpoint (lowest 4 bits)
22:16 Address
15:10 Reserved
9:8 Endpoint speed
7:6 Endpoint type
5:6 Data Toggle
3:0 PID
The pe_hst_state reads the packet information and constructs the packet
and issues it to the PHY interface.
The ehci_ctrl state machine writes the start transaction information in
to the TxFIFO as 0x03002910c for the OUT packet that had the under run
error. However, it writes 0xC3002910C for the retry of the Out
transaction, which is incorrect.
The pe_hst_state enters a bus timeout state after sending the bad CRC
for the packet that under ran. It then purges any data that was back
filled in to the TxFIFO for the packet that under ran. The pe_hst_state
machine stops purging the TxFIFO when it is empty or if it reads a
location that has a tag of 0x0, indicating a start of packet command.
The pe_hst_state reads 0xC3002910C and discards it as it does not decode
to a start of packet command. It continues to purge the OUT data that
has been pre-buffered for the OUT retry . The pe_hst_state detects the
hst_packet_run signal and attempts to read the PID and address
information from the TxFIFO. This location has packet data and so does
not decode to a valid PID and so falls through to the PE_HST_SOF_LOAD
state where the frame_num_counter is updated. The frame_num_counter
is updated with the data in the TxFIFO. In this case, the data is
incorrect as the ehci_ctrl state machine did not initiate the load.
The hst_pe_state machine detects the SOF request signal and sends an
SOF with the bad frame number. Meanwhile, the ehci_ctrl state machine
waits indefinitely in the run_pkt state waiting for the completion
status from pe_hst_state machine, which will never happen.
The ISO failure case is similar except that there is no retry for ISO.
The ehci_ctrl state machine moves to the next transfer in the periodic
schedule. If the under run occurs on the last entry of the periodic
list then it moves to the Async schedule.
In the case of ISO OUT simulations, the next ISO is a zero byte OUT
and again the start of packet command gets corrupted. The TxFIFO is
empty when the hst_pe_state attempts to read the Address and PID
information as the transaction is a zero byte packet. This results
in the hst_pe_state machine staying in the GET_PID state, which means
that it does not issue any transactions (including SOFs). The device
detects a Suspend condition and reverts to full speed mode and waits
for a Resume or Reset signal.
The EHCI specification allows a Non-DoubleWord (32 bits) offset to
be used as a current offset for Buffer Pointer Page 0 of the qTD.
In Non-DoubleWord aligned cases, the core reads the packet data
from the AHB memory, performs the alignment operation before writing
it in to the TxFIFO as a 32 bit data word. An End Of Packet tag (EOP)
is written to the TxFIFO after all the packet data has been written
in to the TxFIFO. The alignment function is reset to Idle by the EOP
tag. The corruption of the start of packet command arises because the
packet buffer for the OUT transaction that under ran is not aligned
to a DoubleWord, and hence no EOP tag is written to the TxFIFO. The
alignment function is still active when the start packet information
is written in to the TxFIFO for the retry of the bulk packet or for
the next transaction in the case of an under run on an ISO. This
results in the corruption of the start tag and the transaction
information.
Click for waveform showing the command 0x 0000300291 being written in
to the TX FIFO for the Out that under ran.
Click for waveform showing the command 0xC3002910C written to the
TxFIFO instead of 0x 0000300291
Versions affected: Versions 2.10a and previous versions
How discovered: Customer simulation
Workaround:
1- The EHCI specification allows a non-DoubleWord offset to be used
as a current offset for Buffer Pointer Page 0 of the qTD. However,
if a DoubleWord offset is used then this issue does not arise.
2- Use non streaming mode to eliminate under runs.
Resolution:
The fix involves changes to the traffic state machine in the
vusb_hs_dma_traf block. The ehci_ctrl state machine updates the context
information by encoding the transaction results on the
hst_op_context_update signals at the end of a transaction. The signal
hst_op_context_update is added to the traffic state machine, and the
tx_fifo_under_ran_r signal is generated if the transaction results in
an under run error. Click for waveform
The traffic state machine then traverses to the do_eop states if the
tx_fifo_under_ran error is asserted. Thus an EOP tag is written in to
the TxFIFO as shown in this waveform .
The EOP tag resets the align state machine to the Idle state ensuring
that the next command written by the echi_ctrl state machine does not
get corrupted.
File(s) modified:
RTL code fixed: …..
Method of reproducing: This failure cannot be reproduced in the current
test bench.
Date Found: March 2010
Date Fixed: June 2010
Update information:
Added the RTL code fix
Signed-off-by: Peter Chen <peter.chen@freescale.com>
The zero-length packet is the sendor tells the receiver that there
is no more data, so it is only needed at the TX side.
Signed-off-by: Peter Chen <peter.chen@freescale.com>
The system configuration API should be called before the controller
run, otherwise, undefined results may occur. So, we override hcd
reset API, and add system configuration API after controller reset.
Cc: Li Jun <peter.chen@freescale.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Peter Chen <peter.chen@freescale.com>
If usbmisc hasn't probed yet, defer the probe.
It's not enough to check if the platform device for the OF node of the
usbmisc has been registered, but it also needs to have been probed
already before we can call imx_usbmisc_init().
This can happen if the order in which devices are probed change due to
async probing or on-demand probing of dependencies.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
This makes the function hw_alloc_repmap be declared to have a return
type of void now due to this particular function never returning
a error code to its caller due to this function always running
successfully to completion nor it's caller putting the return
value into a variable in order to check if a error code is passed
from the function hw_alloc_repmap when calling this function.
Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
The struct ci_hdrc is the drvdata for hcd device, so we don't
need to introduce extra ci_hdrc structure for ehci.
Signed-off-by: Peter Chen <peter.chen@freescale.com>
ITC (Interrupt Threshold Control) is used to set the maximum rate at which
the host/device controller will issue interrupts. The default value is 8 (1ms)
for it. EHCI core will modify it to 1, but device mode keeps it as default
value.
In some use cases like Android ADB, it only has one usb request for each
direction, and maximum payload data is only 4KB, so the speed is 4MB/s
at most, it needs controller to trigger interrupt as fast as possible
to increase the speed. The USB performance will be better if the interrupt
can be triggered faster.
Reduce ITC value is benefit for USB performance, but the interrupt number
is increased at the same time, it may increase cpu utilization too.
Most of use case cares about performance, but some may care about
cpu utilization, so, we leave a platform interface for user.
We set ITC as 1 (1 micro-frame) as default value which is aligned
with ehci core default value.
Signed-off-by: Peter Chen <peter.chen@freescale.com>
It is used to configure controller parameters according to
platform data, like speed, stream mode, etc, both host and
device's initialization need it, most of code are the
same for both roles, with this new interface, it can reduce
the duplicated code and be easy to maintain in future.
Signed-off-by: Peter Chen <peter.chen@freescale.com>
We can support USB OTG 1.3 USB_DEVICE_A_HNP_SUPPORT request when
the driver supports OTG FSM mode.
Signed-off-by: Peter Chen <peter.chen@freescale.com>
For i.mx platform, set ttctrl.ttha with non-zero value only affects
sitd, and ehci core makes sure the schedule is not full when accepts
new request, so it will not occur the transaction which will acorss
the SoF.
Signed-off-by: Peter Chen <peter.chen@freescale.com>
The register of ttctrl.ttha describes like below:
- Internal TT Hub Address Representation
- RW
- Default = 0000000b
This field is used to match against the Hub Address field in QH & siTD
to determine if the packet is routed to the internal TT for directly
attached FS/LS devices. If the Hub Address in the QH or siTD does not
match this address then the packet will be broadcast on the High Speed
ports destined for a downstream High Speed hub with the address in the QH/siTD.
In silicon RTL, this entry only affects QH and siTD, and the hub.addr at
both QH and siTD are 0 in ehci core for chipidea (with hcd->has_tt = 1).
So, for QH, if the "usage_tt" flag at RTL is 0, set CI_HDRC_SET_NON_ZERO_TTHA
will not affect QH (with non-hs device); for siTD, set this flag
will change remaining space requirement for the last transaction from 1023
bytes to 188 bytes, it can increase the number of transactions within one
frame, ehci periodic schedule code will not queue the packet if the frame space
is full, so it is safe to set this flag for siTD.
With this flag, it can fix the problem Alan Stern reported below:
http://www.spinics.net/lists/linux-usb/msg123125.html
And may fix Michael Tessier's problem too.
http://www.spinics.net/lists/linux-usb/msg118679.html
CC: stern@rowland.harvard.edu
CC: michael.tessier@axiontech.ca
Signed-off-by: Peter Chen <peter.chen@freescale.com>
The recent refactoring of the IGMP and MLD parsing code into
ipv6_mc_check_mld() / ip_mc_check_igmp() introduced a potential crash /
BUG() invocation for bridges:
I wrongly assumed that skb_get() could be used as a simple reference
counter for an skb which is not the case. skb_get() bears additional
semantics, a user count. This leads to a BUG() invocation in
pskb_expand_head() / kernel panic if pskb_may_pull() is called on an skb
with a user count greater than one - unfortunately the refactoring did
just that.
Fixing this by removing the skb_get() call and changing the API: The
caller of ipv6_mc_check_mld() / ip_mc_check_igmp() now needs to
additionally check whether the returned skb_trimmed is a clone.
Fixes: 9afd85c9e4 ("net: Export IGMP/MLD message validation code")
Reported-by: Brenden Blanco <bblanco@plumgrid.com>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix minor typo so that can pass correct pointer variable for
container_of().
Signed-off-by: Leo Yan <leo.yan@linaro.org>
[jc: tweaked formatting]
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Add a document describing the process of adding a new system call,
including the need for a flags argument for future compatibility, and
covering 32-bit/64-bit concerns (albeit in an x86-centric way).
Signed-off-by: David Drysdale <drysdale@google.com>
Reviewed-by: Michael Kerrisk <mtk.manpages@gmail.com>
Reviewed-by: Eric B Munson <emunson@akamai.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
This reverts commit 1addc12648
This commit seems to cause crashes in gk104_fifo_intr_runlist() by
returning 0xbad0da00 when register 0x2a00 is read. Since this commit was
intended for GM20B which is not completely supported yet, let's revert
it for the time being.
Reported-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Tested-by: Afzal Mohammed <afzal.mohd.ma@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This addresses two issues that cause problems with viewperf maya-03 in
situation with memory pressure.
The first issue causes attempts to unreserve buffers if batched
reservation fails due to, for example, a signal pending. While previously
the ttm_eu api was resistant against this type of error, it is no longer
and the lockdep code will complain about attempting to unreserve buffers
that are not reserved. The issue is resolved by avoid calling
ttm_eu_backoff_reservation in the buffer reserve error path.
The second issue is that the binding_mutex may be held when user-space
fence objects are created and hence during memory reclaims. This may cause
recursive attempts to grab the binding mutex. The issue is resolved by not
holding the binding mutex across fence creation and submission.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
There are some workqueue functions declared in workqueue.h, so include
that in the workqueue section of the DocBook docs.
Signed-off-by: Tim Bird <tim.bird@sonymobile.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Pull ARM fixes from Russell King:
"Another few small ARM fixes, mostly addressing some VDSO issues"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8410/1: VDSO: fix coarse clock monotonicity regression
ARM: 8409/1: Mark ret_fast_syscall as a function
ARM: 8408/1: Fix the secondary_startup function in Big Endian case
ARM: 8405/1: VDSO: fix regression with toolchains lacking ld.bfd executable
Commit 3f5159a922 ("x86/asm/entry/32: Update -ENOSYS handling to match
the 64-bit logic") broke the ENOSYS handling for the 32-bit compat case.
The proper error return value was never loaded into %rax, except if
things just happened to go through the audit paths, which ended up
reloading the return value.
This moves the loading or %rax into the normal system call path, just to
make sure the error case triggers it. It's kind of sad, since it adds a
useless instruction to reload the register to the fast path, but it's
not like that single load from the stack is going to be noticeable.
Reported-by: David Drysdale <drysdale@google.com>
Tested-by: Kees Cook <keescook@chromium.org>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove pci_ats_enabled(). There are no callers outside the ATS code
itself. We don't need to check ats_cap, because if we don't find an ATS
capability, we'll never set ats_enabled.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Stop caching the Invalidate Queue Depth in struct pci_dev.
pci_ats_queue_depth() is typically called only once per device, and it
returns a fixed value per-device, so callers who need the value frequently
can cache it themselves.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Move ATS declarations to linux/pci.h so they're all in one place.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
There's no need to BUG() if we enable ATS when it's already enabled. We
don't need to BUG() when disabling ATS on a device that doesn't support ATS
or if it's already disabled. If ATS is enabled, certainly we found an ATS
capability in the past, so it should still be there now.
Clean up these error paths.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Use the pci_physfn() helper rather than looking up physfn by hand.
No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
The ATS setup code in ats_alloc_one() is only used by pci_ats_init(), so
inline it there. No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
We previously returned -ENODEV for devices that don't support ATS (except
that we always returned 0 for VFs, whether or not they support ATS).
For consistency, always return -EINVAL (not -ENODEV) if the device doesn't
support ATS. Return zero for VFs that support ATS.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
The extended capabilities list is linked with 12-bit pointers, and the ATS
Smallest Translation Unit and Invalidate Queue Depth fields are both 5
bits.
Use u16 and u8 to hold the extended capability address and the stu and qdep
values. No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
The pci_ats struct is small and will get smaller, so I don't think it's
worth allocating it separately from the pci_dev struct.
Embed the ATS fields directly into struct pci_dev.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Previously, we allocated pci_ats structures when an IOMMU driver called
pci_enable_ats(). An SR-IOV VF shares the STU setting with its PF, so when
enabling ATS on the VF, we allocated a pci_ats struct for the PF if it
didn't already have one. We held the sriov->lock to serialize threads
concurrently enabling ATS on several VFS so only one would allocate the PF
pci_ats.
Gregor reported a deadlock here:
pci_enable_sriov
sriov_enable
virtfn_add
mutex_lock(dev->sriov->lock) # acquire sriov->lock
pci_device_add
device_add
BUS_NOTIFY_ADD_DEVICE notifier chain
iommu_bus_notifier
amd_iommu_add_device # iommu_ops.add_device
init_iommu_group
iommu_group_get_for_dev
iommu_group_add_device
__iommu_attach_device
amd_iommu_attach_device # iommu_ops.attach_device
attach_device
pci_enable_ats
mutex_lock(dev->sriov->lock) # deadlock
There's no reason to delay allocating the pci_ats struct, and if we
allocate it for each device at enumeration-time, there's no need for
locking in pci_enable_ats().
Allocate pci_ats struct during enumeration, when we initialize other
capabilities.
Note that this implementation requires ATS to be enabled on the PF first,
before on any of the VFs because the PF controls the STU for all the VFs.
Link: http://permalink.gmane.org/gmane.linux.kernel.iommu/9433
Reported-by: Gregor Dick <gdick@solarflare.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
metadata snapshots aren't widely used but help provide a consistent
view of the metadata associated with an active thin-pool.
- a dm-cache fix for the 4.2 "default" policy switch from "mq" to "smq"
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJVzPl/AAoJEMUj8QotnQNa2MUH/24BKs8WYNy79B9pr6H41ouQ
LAXFVfx3Vy7n3U1KrZcPylXb7R0feXphUrSXJ6ZlKz/Df+AhG5ycUOCfSZxSDpA9
pX5rtWFSWLXGF3wRf0GXqB3CXkrMKlNWVAUlzge39TpIzpdkbh9reM9He4yQphl9
u0QVvIOTQsUrtN9htnQb7Dgmm7qD6NinrebOrWTiOqBdCh3Iw4J48RsYggYOG0ZP
fbMnpZlH537vgLyKaz72Xh+kMrYmP5DBd1f+iAUBVKk12HQVQNMzEwOgkEB0M+4l
EG3c4O3zMVrBxXdxsI2srFZfLX2XeUYAuMx8cfVSSCDEzQwjC0kmXVdkCYuAjAk=
=r/53
-----END PGP SIGNATURE-----
Merge tag 'dm-4.2-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- two stable fixes for corruption seen in a snapshot of thinp metadata;
metadata snapshots aren't widely used but help provide a consistent
view of the metadata associated with an active thin-pool.
- a dm-cache fix for the 4.2 "default" policy switch from "mq" to "smq"
* tag 'dm-4.2-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm cache policy smq: move 'dm-cache-default' module alias to SMQ
dm btree: add ref counting ops for the leaves of top level btrees
dm thin metadata: delete btrees when releasing metadata snapshot
Pull xen block driver fixes from Jens Axboe:
"A few small bug fixes for xen-blk{front,back} that have been sitting
over my vacation"
* 'for-linus' of git://git.kernel.dk/linux-block:
xen-blkback: replace work_pending with work_busy in purge_persistent_gnt()
xen-blkfront: don't add indirect pages to list when !feature_persistent
xen-blkfront: introduce blkfront_gather_backend_features()
- Revert a fix from 4.2-rc5 that was causing lots of WARNING spam.
- Fix a memory leak affecting backends in HVM guests.
- Fix PV domU hang with certain configurations.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJVzHsAAAoJEFxbo/MsZsTR9Y0H/2j1PHt29RPcNdgGQ84AH0Wh
tw1emL8rMcdhWQnsO7bNmywNNvRNQnU3ZJ8dzoq+5GPikNsbfQzYc7U2pIL4A+gB
AAJsNDNzecuq4srk8vNxcmZ7ySvm9w6dccDUex2ge3sNWaq6gzSQvz6FSWiL0Sxg
k3JcnemEg6JrYOTWdxKInAORMcRO6rgx9eIsdPUPOpgC5XLg6/mZOqBAWXIksDvs
V9uCMqQicaUgBgKFIOSllqH6fcCNooRu3aDwNNj/2mMcJmEvMeBkHmNlQgEm2j5L
ubdDyrC5y48TUPJm8i3+W2/AY+kgWzhThcqyVy6LRAAj5RItJFxMf0nMXzIEqlQ=
=UgMy
-----END PGP SIGNATURE-----
Merge tag 'for-linus-4.2-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen bug fixes from David Vrabel:
- revert a fix from 4.2-rc5 that was causing lots of WARNING spam.
- fix a memory leak affecting backends in HVM guests.
- fix PV domU hang with certain configurations.
* tag 'for-linus-4.2-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/xenbus: Don't leak memory when unmapping the ring on HVM backend
Revert "xen/events/fifo: Handle linked events when closing a port"
x86/xen: build "Xen PV" APIC driver for domU as well