We need a cleared cpu_mask to record if mce is initialized, especially
when MAXSMP is used.
used zalloc_... instead
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: stable@kernel.org
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The sysfs attribute cmci_disabled was accidentall turned into a
duplicate of ignore_ce, breaking all other attributes.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
arch/sh has a couple of stray markers without any users introduced
in commit 3d58695edb. Remove them in
preparation of removing the markers in favour of the TRACE_EVENT
macro (and also because we don't keep dead code around).
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
asm/desc.h is included in three assembly files, but the only macro
it defines, GET_DESC_BASE, is never used. This patch removes the
includes, removes the macro GET_DESC_BASE and the ASSEMBLY guard
from asm/desc.h.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The espfix code triggers if we have a protected mode userspace
application with a 16-bit stack. On returning to userspace, with iret,
the CPU doesn't restore the high word of the stack pointer. This is an
"official" bug, and the work-around used in the kernel is to temporarily
switch to a 32-bit stack segment/pointer pair where the high word of the
pointer is equal to the high word of the userspace stackpointer.
The current implementation uses THREAD_SIZE to determine the cut-off,
but there is no good reason not to use the more natural 64kb... However,
implementing this by simply substituting THREAD_SIZE with 65536 in
patch_espfix_desc crashed the test application. patch_espfix_desc tries
to do what is described above, but gets it subtly wrong if the userspace
stack pointer is just below a multiple of THREAD_SIZE: an overflow
occurs to bit 13... With a bit of luck, when the kernelspace
stackpointer is just below a 64kb-boundary, the overflow then ripples
trough to bit 16 and userspace will see its stack pointer changed by
65536.
This patch moves all espfix code into entry_32.S. Selecting a 16-bit
cut-off simplifies the code. The game with changing the limit dynamically
is removed too. It complicates matters and I see no value in it. Changing
only the top 16-bit word of ESP is one instruction and it also implies
that only two bytes of the ESPFIX GDT entry need to be changed and this
can be implemented in just a handful simple to understand instructions.
As a side effect, the operation to compute the original ESP from the
ESPFIX ESP and the GDT entry simplifies a bit too, and the remaining
three instructions have been expanded inline in entry_32.S.
impact: can now reliably run userspace with ESP=xxxxfffc on 16-bit
stack segment
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Acked-by: Stas Sergeev <stsp@aknet.ru>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Returning to a task with a 16-bit stack requires special care: the iret
instruction does not restore the high word of esp in that case. The
espfix code fixes this, but currently is not invoked on NMIs. This means
that a running task gets the upper word of esp clobbered due intervening
NMIs. To reproduce, compile and run the following program with the nmi
watchdog enabled (nmi_watchdog=2 on the command line). Using gdb you can
see that the high bits of esp contain garbage, while the low bits are
still correct.
This patch puts the espfix code back into the NMI code path.
The patch is slightly complicated due to the irqtrace infrastructure not
being NMI-safe. The NMI return path cannot call TRACE_IRQS_IRET.
Otherwise, the tail of the normal iret-code is correct for the nmi code
path too. To be able to share this code-path, the TRACE_IRQS_IRET was
move up a bit. The espfix code exists after the TRACE_IRQS_IRET, but
this code explicitly disables interrupts. This short interrupts-off
section is now not traced anymore. The return-to-kernel path now always
includes the preliminary test to decide if the espfix code should be
called. This is never the case, but doing it this way keeps the patch as
simple as possible and the few extra instructions should not affect
timing in any significant way.
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <asm/ldt.h>
int modify_ldt(int func, void *ptr, unsigned long bytecount)
{
return syscall(SYS_modify_ldt, func, ptr, bytecount);
}
/* this is assumed to be usable */
#define SEGBASEADDR 0x10000
#define SEGLIMIT 0x20000
/* 16-bit segment */
struct user_desc desc = {
.entry_number = 0,
.base_addr = SEGBASEADDR,
.limit = SEGLIMIT,
.seg_32bit = 0,
.contents = 0, /* ??? */
.read_exec_only = 0,
.limit_in_pages = 0,
.seg_not_present = 0,
.useable = 1
};
int main(void)
{
setvbuf(stdout, NULL, _IONBF, 0);
/* map a 64 kb segment */
char *pointer = mmap((void *)SEGBASEADDR, SEGLIMIT+1,
PROT_EXEC|PROT_READ|PROT_WRITE,
MAP_SHARED|MAP_ANONYMOUS, -1, 0);
if (pointer == NULL) {
printf("could not map space\n");
return 0;
}
/* write ldt, new mode */
int err = modify_ldt(0x11, &desc, sizeof(desc));
if (err) {
printf("error modifying ldt: %i\n", err);
return 0;
}
for (int i=0; i<1000; i++) {
asm volatile (
"pusha\n\t"
"mov %ss, %eax\n\t" /* preserve ss:esp */
"mov %esp, %ebp\n\t"
"push $7\n\t" /* index 0, ldt, user mode */
"push $65536-4096\n\t" /* esp */
"lss (%esp), %esp\n\t" /* switch to new stack */
"push %eax\n\t" /* save old ss:esp on new stack */
"push %ebp\n\t"
"add $17*65536, %esp\n\t" /* set high bits */
"mov %esp, %edx\n\t"
"mov $10000000, %ecx\n\t" /* wait... */
"1: loop 1b\n\t" /* ... a bit */
"cmp %esp, %edx\n\t"
"je 1f\n\t"
"ud2\n\t" /* esp changed inexplicably! */
"1:\n\t"
"sub $17*65536, %esp\n\t" /* restore high bits */
"lss (%esp), %esp\n\t" /* restore old ss:esp */
"popa\n\t");
printf("\rx%ix", i);
}
return 0;
}
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Acked-by: Stas Sergeev <stsp@aknet.ru>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Action police statistics could be misleading because drops are not
shown when expected.
With feedback from: Jamal Hadi Salim <hadi@cyberus.ca>
Reported-by: Pawel Staszewski <pstaszewski@itcare.pl>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch implements skb recycling. It reclaims transmitted skb's
for use in the receive ring.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reduce the size of the driver transmit ring to reduce latency
and allow qdisc to do better rate control. Also make it
obvious what the minimum transmit ring allowed is and why.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since it is likely that there are multiple packets received per
interrupt, only update the receive counters once after all
packets are processed.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The logic in sky2_down was incorrect. Receiver could report status
after rx_stop was called.
The steps need to be:
* stop new frames from being transmitted
* shut off transmit/receive logic
* synchronize with NAPI to process status info about transmitter
and receiver
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add some read's to avoid any PCI posting issues when controlling
irq's.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reset more parts of the receive path when device is take offline.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This unblocks the chip if it is stuck in pause cycle during
shutdown.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stopping all activity through ChipCmd and blindly acking the irqs
is neither nice nor completely needed: the transition to low-power
mode does enough work and it apparently keeps the device in a sane
state.
Patch suggested by a fix for http://bugzilla.kernel.org/show_bug.cgi?id=9512
The rtl_shutdown path is kept unchanged so far.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Tested-by: Anders Eriksson <aeriksson@fastmail.fm>
Cc: Edward Hsu <edward_hsu@realtek.com.tw>
Signed-off-by: David S. Miller <davem@davemloft.net>
When sis190 driver is trying to get default phy, if it doesn't find home
or lan phy, it falls back to the first phy in the phy list but list_entry()
points to a bogus entry. list_first_entry() should be used instead.
Signed-off-by: Arnaud Patard <apatard@mandriva.com>
Acked-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-- derived from reverted commit 047584ce94
-- reworked by Grant Likely to play nice with commit:
"net: Rework ucc_geth driver to use of_mdio infrastructure"
(0b9da337dc)
Signed-off-by: Haiying Wang <Haiying.Wang@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 047584ce94.
This patch meshes badly with "net: Rework ucc_geth driver to use
of_mdio infrastructure" (0b9da337dc).
Since most of the patch needs to be reworked, it is clearer to revert
the patch and then apply the corrected version
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
The skb mac_header field is sometimes NULL (or ~0u) as a sentinel
value. The places where skb is expanded add an offset which would
change this flag into an invalid pointer (or offset).
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Looking at the crash in log_martians(), one suspect is that the check for
mac header being set is not correct. The value of mac_header defaults to
0 on allocation, therefore skb_mac_header_was_set will always be true on
platforms using NET_SKBUFF_USES_OFFSET.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch reverts 3f31fddf, which is no longer needed because if a
race between freeing buffer and committing transaction functionality
occurs and dio gets error, currently dio falls back to buffered IO due
to the commit 6ccfa806.
Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
Cc: Mingming Cao <cmm@us.ibm.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
At the end of reshape_request we update cyrr_resync_completed
if we are about to pause due to reaching resync_max.
However we update it to the wrong value. We need to add the
"reshape_sectors" that have just been reshaped.
Signed-off-by: NeilBrown <neilb@suse.de>
In the unlikely event that reshape progresses past the current request
while it is waiting for a stripe we need to schedule() before retrying
for 2 reasons:
1/ Prevent list corruption from duplicated list_add() calls without
intervening list_del().
2/ Give the reshape code a chance to make some progress to resolve the
conflict.
Cc: <stable@kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Current, when we update the 'conf' structure, when adding a
drive to a linear array, we keep the old version around until
the array is finally stopped, as it is not safe to free it
immediately.
Now that we have rcu protection on all accesses to 'conf',
we can use call_rcu to free it more promptly.
Signed-off-by: NeilBrown <neilb@suse.de>
Due to the lack of memory ordering guarantees, we may have races around
mddev->conf.
In particular, the correct contents of the structure we get from
dereferencing ->private might not be visible to this CPU yet, and
they might not be correct w.r.t mddev->raid_disks.
This patch addresses the problem using rcu protection to avoid
such race conditions.
Signed-off-by: SandeepKsinha <sandeepksinha@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
If the superblock of a component device indicates the presence of a
bitmap but the corresponding raid personality does not support bitmaps
(raid0, linear, multipath, faulty), then something is seriously wrong
and we'd better refuse to run such an array.
Currently, this check is performed while the superblocks are examined,
i.e. before entering personality code. Therefore the generic md layer
must know which raid levels support bitmaps and which do not.
This patch avoids this layer violation without adding identical code
to various personalities. This is accomplished by introducing a new
public function to md.c, md_check_no_bitmap(), which replaces the
hard-coded checks in the superblock loading functions.
A call to md_check_no_bitmap() is added to the ->run method of each
personality which does not support bitmaps and assembly is aborted
if at least one component device contains a bitmap.
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
It is easiest to round sizes to multiples of chunk size in
the personality code for those personalities which care.
Those personalities now do the rounding, so we can
remove that function from common code.
Also remove the upper bound on the size of a chunk, and the lower
bound on the size of a device (1 chunk), neither of which really buy
us anything.
Signed-off-by: NeilBrown <neilb@suse.de>
This is currently ensured by common code, but it is more reliable to
ensure it where it is needed in personality code.
All the other personalities that care already round the size to
the chunk_size. raid0 and linear are the only hold-outs.
Signed-off-by: NeilBrown <neilb@suse.de>
Currently the assignment to utime gets skipped for 'external'
metadata. So move it to the top of the function so that it
always gets effected.
This is of largely cosmetic interest. Nothing actually depends
on ->utime being right for external arrays.
"mdadm --monitor" does use it for 0.90 and 1.x arrays, but with
mdadm-3.0, this is not important for external metadata.
Signed-off-by: NeilBrown <neilb@suse.de>
Currently, the md layer checks in analyze_sbs() if the raid level
supports reconstruction (mddev->level >= 1) and if reconstruction is
in progress (mddev->recovery_cp != MaxSector).
Move that printk into the personality code of those raid levels that
care (levels 1, 4, 5, 6, 10).
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
The difference between these two methods is artificial.
Both check that a pending reshape is valid, and perform any
aspect of it that can be done immediately.
'reconfig' handles chunk size and layout.
'check_reshape' handles raid_disks.
So make them just one method.
Signed-off-by: NeilBrown <neilb@suse.de>
Passing the new layout and chunksize as args is not necessary as
the mddev has fields for new_check and new_layout.
This is preparation for combining the check_reshape and reconfig
methods
Signed-off-by: NeilBrown <neilb@suse.de>
In reshape cases that do not change the number of devices,
start_reshape is called without first calling check_reshape.
Currently, the check that the stripe_cache is large enough is
only done in check_reshape. It should be in start_reshape too.
Signed-off-by: NeilBrown <neilb@suse.de>
1/ Raid5 has learned to take over also raid4 and raid6 arrays.
2/ new_chunk in mdp_superblock_1 is in sectors, not bytes.
Signed-off-by: NeilBrown <neilb@suse.de>
A straight-forward conversion which gets rid of some
multiplications/divisions/shifts. The patch also introduces a couple
of new ones, most of which are due to conf->chunk_size still being
represented in bytes. This will be cleaned up in subsequent patches.
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
This patch renames the chunk_size field to chunk_sectors with the
implied change of semantics. Since
is_power_of_2(chunk_size) = is_power_of_2(chunk_sectors << 9)
= is_power_of_2(chunk_sectors)
these bits don't need an adjustment for the shift.
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
After performing an operation over the page list for a buffer retrieved by
i915_gem_object_get_pages() the pages need to be returned with
i915_gem_object_put_pages(). This was not being observed for the phys
objects which were thus leaking references to their backing pages.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
CC: Dave Airlie <airlied@gmail.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
While sifting through the inteldrmfb code trying to solve #22040 I found that
the fb restore path doesn't check the return value of
drm_crtc_helper_set_config(), which seems to have all sorts of potential
failure modes. We should warn someone if one of these is triggered.
Signed-Off-By: Ben Gamari <bgamari.foss@gmail.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[anholt: hand-applied, failures are mine]
Signed-off-by: Eric Anholt <eric@anholt.net>
ia64 was assigning resources to root busses after allocations had
been made for child busses. Calling pcibios_setup_root_windows() from
pcibios_fixup_bus() solves this problem by assigning the resources to
the root bus before child busses are scanned.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Tested-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Instead of open-coding pci_find_parent_resource and request_resource,
just call pci_claim_resource.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This function was only used by pci_claim_resource(), and the last commit
deleted that use.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>