Prefetch instructions can generate spurious faults on certain
models of older CPUs. The faults themselves cannot be stopped
and they can occur pretty much anywhere - so the way we solve
them is that we detect certain patterns and ignore the fault.
There is one small path of code where we must not take faults
though: the #PF handler execution leading up to the reading
of the CR2 (the faulting address). If we take a fault there
then we destroy the CR2 value (overwriting it with the address the
prefetching instruction faulted on) and possibly mishandle user-space or
kernel-space pagefaults.
It turns out that in current upstream we do exactly that:
	prefetchw(&mm->mmap_sem);
	/* Get the faulting address: */
	address = read_cr2();
This is not good.
So turn around the order: first read the cr2 then prefetch
the lock address. Reading cr2 is plenty fast (2 cycles) so
delaying the prefetch by this amount shouldn't be a big issue
performance-wise.
[ And this might explain a mystery fault.c warning that sometimes
occurs on an old AMD/Sempron based test-system I have -
which does have such prefetch problems. ]
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
LKML-Reference: <20090616030522.GA22162@Krystal>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Now that enable_iommus() will call iommu_disable() for each iommu,
the call to disable_iommus() during resume is redundant. Also, the order
for an invalidation is to invalidate device table entries first, then
domain translations.
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
The clock function was changed, but highlander still used the old function.
Signed-off-by: Nobuhiro Iwamatsu <iwamatsu.nobuhiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch unifies the flex_bdry setting for module vs. built-in
configuration of OneNAND.
Signed-off-by: Amul Kumar Saha <amul.saha@samsung.com>
Signed-off-by: Vishak G <vishak.g@samsung.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Maintain two flows, one for pow2 chunk sizes (which uses masks and
shift), and a flow for the general case (which uses sector_div).
This is for the sake of performance.
- introduce map_sector and is_io_in_chunk_boundary to encapsulate
those two flows better for raid0_make_request
- fix blk_mergeable to support the two flows.
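As a rough user-space sketch of the two flows (not the driver code: the
_sketch names are made up for illustration, and plain / and % stand in
for sector_div):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static bool is_pow2(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* log2 of a power-of-two chunk size */
static unsigned int chunk_shift(uint32_t chunk_sects)
{
	unsigned int shift = 0;

	while ((1u << shift) < chunk_sects)
		shift++;
	return shift;
}

/* Chunk index of a sector: shift for pow2 sizes, division otherwise. */
static uint64_t chunk_number_sketch(uint64_t sector, uint32_t chunk_sects)
{
	assert(chunk_sects != 0);
	if (is_pow2(chunk_sects))
		return sector >> chunk_shift(chunk_sects);
	return sector / chunk_sects;
}

/* Does an I/O of io_sects starting at sector stay within one chunk? */
static bool io_in_chunk_sketch(uint64_t sector, uint32_t io_sects,
			       uint32_t chunk_sects)
{
	uint32_t offset;

	assert(chunk_sects != 0);
	if (is_pow2(chunk_sects))
		offset = (uint32_t)(sector & (chunk_sects - 1));
	else
		offset = (uint32_t)(sector % chunk_sects);
	return (uint64_t)offset + io_sects <= chunk_sects;
}
```

Both branches compute the same result for pow2 sizes; the mask and shift
path only avoids the division.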
Signed-off-by: raziebe@gmail.com
Signed-off-by: NeilBrown <neilb@suse.de>
Remove chunk size check from md as this is now performed in the run
function in each personality.
Replace chunk size power 2 code calculations by a regular division.
Signed-off-by: raziebe@gmail.com
Signed-off-by: NeilBrown <neilb@suse.de>
Have raid0 check chunk size in the run method instead of in md.
This is part of a series moving the checks from common code to
the personalities where they belong.
hardsect is short and chunksize is an int, so it is safe to use %.
Signed-off-by: raziebe@gmail.com
Signed-off-by: NeilBrown <neilb@suse.de>
Replace the linear search with binary search in which_dev.
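A minimal sketch of such a search, assuming each device records only an
exclusive end_sector and the devices are sorted by it. The struct and
function names here are illustrative, not the driver's:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct dev_info_sketch {
	uint64_t end_sector;	/* first sector past this device */
};

/*
 * Return the index of the device holding 'sector', i.e. the first
 * device whose end_sector is greater than 'sector'.
 */
static size_t which_dev_sketch(const struct dev_info_sketch *devs,
			       size_t n, uint64_t sector)
{
	size_t lo = 0, hi;

	assert(n > 0 && sector < devs[n - 1].end_sector);
	hi = n - 1;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (sector < devs[mid].end_sector)
			hi = mid;	/* answer is mid or to its left */
		else
			lo = mid + 1;	/* answer is right of mid */
	}
	return lo;
}
```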
Signed-off-by: Sandeep K Sinha <sandeepksinha@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Remove num_sectors from dev_info and replace start_sector with
end_sector. This makes a lot of comparisons much simpler.
Signed-off-by: Sandeep K Sinha <sandeepksinha@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Get rid of sector_div and hash table for linear raid and replace
with a linear search in which_dev.
The hash table adds a lot of complexity for little if any gain.
Ultimately a binary search will be used which will have a smaller
cache footprint, a similar number of memory accesses, and no
divisions.
Signed-off-by: Sandeep K Sinha <sandeepksinha@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Having a macro just to cast a void* isn't really helpful.
I would much rather see that we are simply de-referencing ->private,
than have to know what the macro does.
So open code the macro everywhere and remove the pointless cast.
Signed-off-by: NeilBrown <neilb@suse.de>
This setting doesn't seem to make sense (half the chunk size??) and
shouldn't be needed.
The segment boundary exported by raid0 should simply be the minimum
of the segment boundary of all component devices. And we already
get that right.
Signed-off-by: NeilBrown <neilb@suse.de>
If we treat conf->devlist more like a 2 dimensional array,
we can get the devlist for a particular zone simply by indexing
that array, so we don't need to store the pointers to subarrays
in strip_zone. This makes strip_zone smaller and so (hopefully)
searches faster.
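The indexing idea, sketched in user space. The conf_sketch layout and
the use of int device ids are assumptions made for illustration only:

```c
#include <assert.h>

struct conf_sketch {
	int nr_zones;
	int raid_disks;
	int *devlist;	/* nr_zones * raid_disks entries, row per zone */
};

/* Device list of one zone: a row of the flat array, no stored pointer. */
static int *zone_devlist_sketch(const struct conf_sketch *conf, int zone)
{
	assert(zone >= 0 && zone < conf->nr_zones);
	return conf->devlist + zone * conf->raid_disks;
}
```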
Signed-off-by: NeilBrown <neilb@suse.de>
Storing ->sectors is redundant as it can be computed from the
difference z->zone_end - (z-1)->zone_end.
The one place where it is used, it is just as efficient to use
a zone_end value instead.
And removing it makes strip_zone smaller, so the array of these that
is searched on every request has a better chance to stay in cache.
So discard the field and get the value from elsewhere.
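For illustration, a sketch of recovering a zone's length from zone_end
alone. The struct is reduced to the one field, and treating the first
zone as starting at sector 0 is an assumption of this sketch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct strip_zone_sketch {
	uint64_t zone_end;	/* first sector past this zone */
};

/* Length of zone i, derived from the end of the previous zone. */
static uint64_t zone_sectors_sketch(const struct strip_zone_sketch *zones,
				    size_t nr, size_t i)
{
	assert(i < nr);
	return zones[i].zone_end - (i ? zones[i - 1].zone_end : 0);
}
```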
Signed-off-by: NeilBrown <neilb@suse.de>
raid0_stop() removes all references to the raid0 configuration but
fails to free the ->devlist buffer.
This patch closes this leak, removes a pointless initialization and
fixes a coding style issue in raid0_stop().
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
Currently the raid0 configuration is allocated in raid0_run() while
the buffers for the strip_zone and the dev_list arrays are allocated
in create_strip_zones(). On errors, all three buffers are freed
in raid0_run().
It's easier and more readable to do the allocation and cleanup within
a single function. So move that code into create_strip_zones().
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
Currently raid0_run() always returns -ENOMEM on errors. This is
incorrect as running the array might fail for other reasons, for
example because not all component devices were available.
This patch changes create_strip_zones() so that it returns a proper
error code (either -ENOMEM or -EINVAL) rather than 1 on errors and
makes raid0_run(), its single caller, return that value instead
of -ENOMEM.
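The pattern, reduced to a user-space sketch. The _sketch functions and
their parameters are hypothetical; only the error-propagation shape is
the point:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Returns 0 or a negative errno that says what actually went wrong. */
static int create_zones_sketch(int nr_devices, int expected, int **out)
{
	int *zones;

	assert(out != NULL);
	if (nr_devices <= 0 || nr_devices != expected)
		return -EINVAL;		/* bad configuration */

	zones = calloc((size_t)nr_devices, sizeof(*zones));
	if (!zones)
		return -ENOMEM;		/* allocation failure */

	*out = zones;
	return 0;
}

/* The single caller passes the error through instead of guessing. */
static int run_sketch(int nr_devices, int expected)
{
	int *zones = NULL;
	int ret = create_zones_sketch(nr_devices, expected, &zones);

	if (ret)
		return ret;

	free(zones);
	return 0;
}
```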
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
The "sector_shift" and "spacing" fields of struct raid0_private_data
were only used for the hash table lookups. So the removal of the
hash table allows us to get rid of these fields as well, which simplifies
create_strip_zones() and raid0_run() quite a bit.
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
The raid0 hash table has become unused due to the changes in the
previous patch. This patch removes the hash table allocation and
setup code and kills the hash_table field of struct raid0_private_data.
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
1/ remove current_start. The same value is available in
zone->dev_start and storing it separately doesn't gain anything.
2/ rename curr_zone_start to curr_zone_end as we are now more
focused on the 'end' of each zone. We end up storing the
same number though - the old name was a little confusing
(and what does 'current' mean in this context anyway).
Signed-off-by: NeilBrown <neilb@suse.de>
Commit eec9462088 folded mg_disk.h into mg_disk.c, but the mg_disk
platform driver needs private data for operation. This also makes
mg_disk.c machine independent. Separate only the needed structures and
defines into mg_disk.h.
Signed-off-by: unsik Kim <donari75@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
DM reuses the request queue when swapping in a new device table.
Introduce blk_set_default_limits() which can be used to reset the
queue_limits prior to stacking devices.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
I noticed a blank line in blktrace output. This patch fixes that.
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
btrfs assigns this bdi to all inodes on that file system, so make
sure it's registered. This isn't really important now, but will be
when we put dirty inodes there. Even now, we miss the stats when the
bdi isn't visible.
Also fixes failure to check bdi_init() return value, and bad inheritance of
->capabilities flags from the default bdi.
Acked-by: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Actually, last_end_request in cfq_data isn't used now. So let's
just remove it.
Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The number of strip_zones of a raid0 array is bounded by the number of
drives in the array and is in fact much smaller for typical setups. For
example, any raid0 array containing identical disks will have only
a single strip_zone.
Therefore, the hash tables which are used for quickly finding the
strip_zone that holds a particular sector are of questionable value
and add quite a bit of unnecessary complexity.
This patch replaces the hash table lookup by equivalent code which
simply loops over all strip zones to find the zone that holds the
given sector.
In order to make this loop as fast as possible, the zone->start field
of struct strip_zone has been renamed to zone_end, and it now stores
the beginning of the next zone in sectors. This saves one
addition in the loop.
Subsequent cleanup patches will remove the hash table structure.
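The loop described above might look roughly like this. It is a sketch
with an illustrative one-field struct, not the patch itself:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct zone_sketch {
	uint64_t zone_end;	/* start of the next zone, in sectors */
};

/*
 * Walk the zones in order and return the first one that ends after
 * 'sector'. Comparing against zone_end needs no start + size addition.
 * Returns NULL if the sector lies beyond the last zone.
 */
static const struct zone_sketch *find_zone_sketch(const struct zone_sketch *zones,
						  size_t nr, uint64_t sector)
{
	size_t i;

	assert(zones != NULL);
	for (i = 0; i < nr; i++)
		if (sector < zones[i].zone_end)
			return &zones[i];
	return NULL;
}
```

With identical disks there is a single zone, so the loop body runs once.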
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
While it looks like xhci was written with both PCI and non-PCI in mind,
apparently only the former has seen any testing. xhci-mem.o can be "fixed"
with a linux/dmapool.h include, but there are still parts of the code that
make use of struct pci_dev directly. So, at least more work is needed before
this can be turned on for non-PCI builds:
CC drivers/usb/host/xhci-mem.o
drivers/usb/host/xhci-mem.c: In function 'xhci_segment_alloc':
drivers/usb/host/xhci-mem.c:45: error: implicit declaration of function 'dma_pool_alloc'
drivers/usb/host/xhci-mem.c:45: warning: assignment makes pointer from integer without a cast
drivers/usb/host/xhci-mem.c: In function 'xhci_segment_free':
drivers/usb/host/xhci-mem.c:67: error: implicit declaration of function 'dma_pool_free'
drivers/usb/host/xhci-mem.c: In function 'xhci_alloc_virt_device':
drivers/usb/host/xhci-mem.c:239: warning: assignment makes pointer from integer without a cast
drivers/usb/host/xhci-mem.c:248: warning: assignment makes pointer from integer without a cast
drivers/usb/host/xhci-mem.c: In function 'xhci_mem_cleanup':
drivers/usb/host/xhci-mem.c:578: error: implicit declaration of function 'dma_pool_destroy'
drivers/usb/host/xhci-mem.c: In function 'xhci_mem_init':
drivers/usb/host/xhci-mem.c:657: error: implicit declaration of function 'dma_pool_create'
drivers/usb/host/xhci-mem.c:658: warning: assignment makes pointer from integer without a cast
drivers/usb/host/xhci-mem.c:663: warning: assignment makes pointer from integer without a cast
make[3]: *** [drivers/usb/host/xhci-mem.o] Error 1
CC drivers/usb/host/xhci-pci.o
drivers/usb/host/xhci-pci.c: In function 'xhci_pci_reinit':
drivers/usb/host/xhci-pci.c:39: error: implicit declaration of function 'pci_set_mwi'
drivers/usb/host/xhci-pci.c: At top level:
drivers/usb/host/xhci-pci.c:151: error: 'usb_hcd_pci_probe' undeclared here (not in a function)
drivers/usb/host/xhci-pci.c:152: error: 'usb_hcd_pci_remove' undeclared here (not in a function)
drivers/usb/host/xhci-pci.c:155: error: 'usb_hcd_pci_shutdown' undeclared here (not in a function)
drivers/usb/host/xhci-pci.c:159: warning: function declaration isn't a prototype
drivers/usb/host/xhci-pci.c:164: warning: function declaration isn't a prototype
make[3]: *** [drivers/usb/host/xhci-pci.o] Error 1
Cc: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add Makefile and Kconfig entries for the xHCI host controller driver.
List Sarah Sharp as the maintainer for the xHCI driver.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Narrow down time spent holding the xHCI spinlock so that it's only used to
protect the xHCI rings, not as mutual exclusion. Stop allocating memory
while holding the spinlock and calling xhci_alloc_virt_device() and
xhci_endpoint_init().
The USB core should have locking in it to prevent device state from being
manipulated by more than one kernel thread. E.g. you can't free a device
while you're in the middle of setting a new configuration. So removing
the locks from the sections where xhci_alloc_dev() and
xhci_reset_bandwidth() touch xHCI's representation of the device should be
OK.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Mask off the lower 16 bits of the interrupt control register, instead of
masking off the upper 16 bits. The interrupt moderation interval field is
the lower 16 bits, and is set to 0x4000 (1ms) by default. The previous
code was adding 40 us to the default value, instead of setting it to 40
us. This makes performance really bad.
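A small sketch of the difference between the two masks. The macro and
function names are illustrative, the register contents are made up, and
160 assumes the interval is counted in 250 ns units (160 * 250 ns = 40 us):

```c
#include <assert.h>
#include <stdint.h>

#define IRQ_INTERVAL_MASK	0x0000ffffu	/* lower 16 bits: interval */
#define IRQ_COUNTER_MASK	0xffff0000u	/* upper 16 bits: counter */

static_assert((IRQ_INTERVAL_MASK & IRQ_COUNTER_MASK) == 0,
	      "the two fields must not overlap");

/* Buggy: clears the upper half and ORs into the old interval. */
static uint32_t set_interval_buggy(uint32_t reg, uint16_t interval)
{
	return (reg & IRQ_INTERVAL_MASK) | interval;
}

/* Fixed: clears only the interval field, then sets the new value. */
static uint32_t set_interval_fixed(uint32_t reg, uint16_t interval)
{
	return (reg & ~IRQ_INTERVAL_MASK) | interval;
}
```

With a register value of 0x00010fa0, the buggy variant leaves the old
interval bits in place and drops the upper half, while the fixed variant
yields 0x000100a0.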
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The packed attribute allows gcc to muck with the alignment of data
structures, which may lead to byte-wise writes that break atomicity of
writes. Packed should only be used when the compiler may add undesired
padding to the structure. Each element of the structure will be aligned
by C based on its size and the size of the elements around it. E.g. a u64
would be aligned on an 8 byte boundary, the next u32 would be aligned on a
four byte boundary, etc.
Since most of the xHCI structures contain only 32-bit values, removing
the packed attribute for them should be harmless. (A future patch will
change some of the twin 32-bit address fields to one 64-bit field, but all
those places have an even number of 32-bit fields before them, so the
alignment should be correct.) Add BUILD_BUG_ON statements to check that
the compiler doesn't add padding to the data structures that have a
hardware-defined layout.
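In user space the same kind of check can be sketched with C11
static_assert standing in for BUILD_BUG_ON. The structure below is an
illustrative all-32-bit layout, not a copy of the driver's definition:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* No packed attribute: every member is naturally aligned already. */
struct intr_reg_sketch {
	uint32_t irq_pending;
	uint32_t irq_control;
	uint32_t erst_size;
	uint32_t rsvd;
	uint32_t erst_base[2];
	uint32_t erst_dequeue[2];
};

/* Fail the build if the compiler ever inserts padding. */
static_assert(sizeof(struct intr_reg_sketch) == 8 * sizeof(uint32_t),
	      "intr_reg_sketch must match the hardware layout");
static_assert(offsetof(struct intr_reg_sketch, erst_base) == 16,
	      "erst_base must sit at byte offset 16");
```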
While we're modifying the registers, change the name of intr_reg to
xhci_intr_reg to avoid global conflicts.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Greg KH introduced a bug into xhci_trb_virt_to_dma() when he changed the
type of offset to dma_addr_t from unsigned int and dropped the casts to
unsigned int around the virtual address pointer subtraction.
trb and seg->trbs are both valid pointers to virtual addresses, so the
compiler will divide the subtraction by the size of union trb (16 bytes).
segment_offset is an unsigned long, which is guaranteed to be at least as
big as a void *.
Drop the void * casts in the first if statement because trb and seg->trbs
are both pointers of the same type (pointers to union trb).
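A user-space sketch of the arithmetic. The types and the
TRBS_PER_SEGMENT_SKETCH value are stand-ins chosen for illustration, and
the function assumes trb points into seg->trbs:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRBS_PER_SEGMENT_SKETCH 64

union trb_sketch {
	uint32_t field[4];	/* 16 bytes */
};

struct seg_sketch {
	union trb_sketch *trbs;	/* virtual address of the first TRB */
	uint64_t dma;		/* bus address of the first TRB */
};

static uint64_t trb_virt_to_dma_sketch(const struct seg_sketch *seg,
				       const union trb_sketch *trb)
{
	/* Typed pointer subtraction counts TRBs, not bytes. */
	ptrdiff_t segment_offset = trb - seg->trbs;

	assert(segment_offset >= 0 &&
	       segment_offset < TRBS_PER_SEGMENT_SKETCH);
	/* Convert back to bytes before adding to the bus address. */
	return seg->dma + (uint64_t)segment_offset * sizeof(*trb);
}
```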
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Replace if-elseif-else with switch-case to keep the code consistent;
the two are semantically the same.
Switch-case is already used here:
http://www.spinics.net/lists/linux-usb/msg17201.html
This makes the other places in usb/core consistent with it.
It is also easier to read and maintain when USB 4.0, 5.0, ... come along.
Signed-off-by: Viral Mehta <viral.mehta@einfochips.com>
Tested-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
xhci-mem.c includes calls to dma_pool_alloc() and other functions defined
in linux/dmapool.h. Make sure to include that header file.
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Make sure the error path in xhci_urb_enqueue() releases the spinlock
before it returns. Reported by Oliver in
http://marc.info/?l=linux-usb&m=124091637311832&w=2
Reported-by: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Differentiate between SuperSpeed endpoint companion descriptor and the
wireless USB endpoint companion descriptor. Make all structure names for
this descriptor have "ss" (SuperSpeed) in them. David Vrabel asked for
this change in http://marc.info/?l=linux-usb&m=124091465109367&w=2
Reported-by: David Vrabel <david.vrabel@csr.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Force the compiler to write the cycle bit of the Link TRB last. This
ensures that the hardware doesn't think it owns the Link TRB before we set
the chain bit. Reported by Oliver in this thread:
http://marc.info/?l=linux-usb&m=124091532410219&w=2
Reported-by: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Drop spinlock in xhci_irq() error path.
This fixes the issue reported by Oliver Neukum on this thread:
http://marc.info/?l=linux-usb&m=124090924401444&w=2
Remove unnecessary register read reported by Viral Mehta:
http://marc.info/?l=linux-usb&m=124091326007398&w=2
Reported-by: Oliver Neukum <oliver@neukum.org>
Reported-by: Viral Mehta <viral.mehta@einfochips.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Make all globally visible functions start with xhci_ and mark functions as
static if they're only called within the same C file. Fix some long lines
while we're at it.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Make sure to preserve all bits *except* the TRB_CHAIN bit when giving a
Link TRB to the hardware. We need to save things like TRB type and the
toggle bit in the control dword.
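A sketch of the masking in question. The bit positions used here (chain
at bit 4, TRB type in bits 10-15) are assumptions of this example, and
the helper name is made up:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TRB_CHAIN_SKETCH	(1u << 4)
#define TRB_TYPE_MASK_SKETCH	(0x3fu << 10)

static_assert((TRB_CHAIN_SKETCH & TRB_TYPE_MASK_SKETCH) == 0,
	      "chain bit must not overlap the type field");

/*
 * Clear only the chain bit, keep everything else (type, toggle, ...),
 * then set the chain bit again if the caller wants it.
 */
static uint32_t link_trb_control_sketch(uint32_t control, bool chain)
{
	control &= ~TRB_CHAIN_SKETCH;
	if (chain)
		control |= TRB_CHAIN_SKETCH;
	return control;
}
```

Masking with the chain bit itself rather than its complement would wipe
the type and toggle bits, which is the failure this change avoids.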
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>