v1->v2: Use strlcpy() to ensure s[i].name be null-termination.
1. In netdev_boot_setup_add(), a long name will leak.
ex. : dev=21,0x1234,0x1234,0x2345,eth123456789verylongname.........
2. In netdev_boot_setup_check(), mismatch will happen if s[i].name
is a substring of dev->name.
ex. : dev=...eth1 dev=...eth11
[ With feedback from Ben Hutchings. ]
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We already have a variable, which has the same capability.
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Filters need to be destroyed before beginning to destroy classes
since the destination class needs to still be alive to unbind the
filter.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pass double tcf_proto pointers to tcf_destroy_chain() to make it
clear the start of the filter list for more consistency.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
These sysctl values are time related and all use the same routine
(proc_dointvec_jiffies) that internally converts from seconds to jiffies.
The code is fine, the documentation is just wrong.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Suspend/resume ("echo mem > /sys/power/state") does not work with
vanilla kernels -- the system does not suspend correctly and just
hangs. This patch fixes this so suspend/resume works:
1) of_iomap does not map the whole 0xC000 of the MPC5200 immr so
saving registers does not work.
2) PCI registers need to be saved and restored.
Signed-off-by: Tim Yamin <plasm@roo.me.uk>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
The legacy serial driver does not work with an 8250 type UART that is
described in the device tree with the reg-offset and reg-shift
properties. This change makes legacy_serial ignore these devices.
Signed-off-by: John Linn <john.linn@xilinx.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
i2c.h mentions -1 as a not-issued irq. This false hint was taken by
of_i2c and caused crashes. Don't give any advice as 'no irq' is not
consistent across all architectures yet and it is not needed internally
by the i2c-core.
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
The matching process described for new style clients in
Documentation/i2c/writing-clients is classed as out-of-date
as it requires the presence of an .id_table entry in the
driver's i2c_driver entry.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
This change to the makefile corrects the build of a simpleImage with initrd.
Signed-off-by: John Linn <john.linn@xilinx>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Some PCI devices will lock up if we attempt to read from VPD addresses
beyond some device-dependent limit. Until we can identify these
devices and adjust the file size accordingly, only let root read VPD
through sysfs to prevent a DoS by normal users.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Add a MODULE_ALIAS() statement for the i2c-s3c2410 controller
to ensure that it can be autoloaded on the S3C2440 systems that
we support.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
The driver should be returning -ENXIO for transfers that do not
pass the initial address byte stage.
Note, also small tidyups to the driver comments in the area.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
We should check for the reception of an ACK after transmitting each
data byte. The address send has been correctly checking this, but the
data write byte state should have also been checking for these failures.
As part of the same fix, we remove the ACK checking from the receive
path where it should not have been checking for an ACK which our hardware
was sending.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Dhaval Giani reported this warning during cpu hotplug stress-tests:
| On running kernel compiles in parallel with cpu hotplug:
|
| WARNING: at arch/x86/kernel/smp.c:118
| native_smp_send_reschedule+0x21/0x36()
| Modules linked in:
| Pid: 27483, comm: cc1 Not tainted 2.6.26-rc7 #1
| [...]
| [<c0110355>] native_smp_send_reschedule+0x21/0x36
| [<c014fe8f>] force_quiescent_state+0x47/0x57
| [<c014fef0>] call_rcu+0x51/0x6d
| [<c01713b3>] __fput+0x130/0x158
| [<c0171231>] fput+0x17/0x19
| [<c016fd99>] filp_close+0x4d/0x57
| [<c016fdff>] sys_close+0x5c/0x97
IMHO the warning is a spurious one.
cpu_online_map is updated by the _cpu_down() using stop_machine_run().
Since force_quiescent_state is invoked from irqs disabled section,
stop_machine_run() won't be executing while a cpu is executing
force_quiescent_state(). Hence the cpu_online_map is stable while we're
in the irq disabled section.
However, a cpu might have been offlined _just_ before we disabled irqs
while entering force_quiescent_state(). And rcu subsystem might not yet
have handled the CPU_DEAD notification, leading to the offlined cpu's
bit being set in the rcp->cpumask.
Hence cpumask = (rcp->cpumask & cpu_online_map) to prevent sending
smp_reschedule() to an offlined CPU.
Here's the timeline:
CPU_A CPU_B
--------------------------------------------------------------
cpu_down(): .
. .
. .
stop_machine(): /* disables preemption, .
* and irqs */ .
. .
. .
take_cpu_down(); .
. .
. .
. .
cpu_disable(); /*this removes cpu .
*from cpu_online_map .
*/ .
. .
. .
restart_machine(); /* enables irqs */ .
------WINDOW DURING WHICH rcp->cpumask is stale ---------------
. call_rcu();
. /* disables irqs here */
. .force_quiescent_state();
.CPU_DEAD: .for_each_cpu(rcp->cpumask)
. . smp_send_reschedule();
. .
. . WARN_ON() for offlined CPU!
.
.
.
rcu_cpu_notify:
.
-------- WINDOW ENDS ------------------------------------------
rcu_offline_cpu() /* Which calls cpu_quiet()
* which removes
* cpu from rcp->cpumask.
*/
If a new batch was started just before calling stop_machine_run(), the
"tobe-offlined" cpu is still present in rcp-cpumask.
During a cpu-offline, from take_cpu_down(), we queue an rt-prio idle
task as the next task to be picked by the scheduler. We also call
cpu_disable() which will disable any further interrupts and remove the
cpu's bit from the cpu_online_map.
Once the stop_machine_run() successfully calls take_cpu_down(), it calls
schedule(). That's the last time a schedule is called on the offlined
cpu, and hence the last time when rdp->passed_quiesc will be set to 1
through rcu_qsctr_inc().
But the cpu_quiet() will be on this cpu will be called only when the
next RCU_SOFTIRQ occurs on this CPU. So at this time, the offlined CPU
is still set in rcp->cpumask.
Now coming back to the idle_task which truely offlines the CPU, it does
check for a pending RCU and raises the softirq, since it will find
rdp->passed_quiesc to be 0 in this case. However, since the cpu is
offline I am not sure if the softirq will trigger on the CPU.
Even if it doesn't the rcu_offline_cpu() will find that rcp->completed
is not the same as rcp->cur, which means that our cpu could be holding
up the grace period progression. Hence we call cpu_quiet() and move
ahead.
But because of the window explained in the timeline, we could still have
a call_rcu() before the RCU subsystem executes it's CPU_DEAD
notification, and we send smp_send_reschedule() to offlined cpu while
trying to force the quiescent states. The appended patch adds comments
and prevents checking for offlined cpu everytime.
cpu_online_map is updated by the _cpu_down() using stop_machine_run().
Since force_quiescent_state is invoked from irqs disabled section,
stop_machine_run() won't be executing while a cpu is executing
force_quiescent_state(). Hence the cpu_online_map is stable while we're
in the irq disabled section.
Reported-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Acked-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rusty Russel <rusty@rustcorp.com.au>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
fsync_buffers_list() and sync_dirty_buffer() both issue async writes and
then immediately wait on them. Conceptually, that makes them sync writes
and we should treat them as such so that the IO schedulers can handle
them appropriately.
This patch fixes a write starvation issue that Lin Ming reported, where
xx is stuck for more than 2 minutes because of a large number of
synchronous IO in the system:
INFO: task kjournald:20558 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
kjournald D ffff810010820978 6712 20558 2
ffff81022ddb1d10 0000000000000046 ffff81022e7baa10 ffffffff803ba6f2
ffff81022ecd0000 ffff8101e6dc9160 ffff81022ecd0348 000000008048b6cb
0000000000000086 ffff81022c4e8d30 0000000000000000 ffffffff80247537
Call Trace:
[<ffffffff803ba6f2>] kobject_get+0x12/0x17
[<ffffffff80247537>] getnstimeofday+0x2f/0x83
[<ffffffff8029c1ac>] sync_buffer+0x0/0x3f
[<ffffffff8066d195>] io_schedule+0x5d/0x9f
[<ffffffff8029c1e7>] sync_buffer+0x3b/0x3f
[<ffffffff8066d3f0>] __wait_on_bit+0x40/0x6f
[<ffffffff8029c1ac>] sync_buffer+0x0/0x3f
[<ffffffff8066d48b>] out_of_line_wait_on_bit+0x6c/0x78
[<ffffffff80243909>] wake_bit_function+0x0/0x23
[<ffffffff8029e3ad>] sync_dirty_buffer+0x98/0xcb
[<ffffffff8030056b>] journal_commit_transaction+0x97d/0xcb6
[<ffffffff8023a676>] lock_timer_base+0x26/0x4b
[<ffffffff8030300a>] kjournald+0xc1/0x1fb
[<ffffffff802438db>] autoremove_wake_function+0x0/0x2e
[<ffffffff80302f49>] kjournald+0x0/0x1fb
[<ffffffff802437bb>] kthread+0x47/0x74
[<ffffffff8022de51>] schedule_tail+0x28/0x5d
[<ffffffff8020cac8>] child_rip+0xa/0x12
[<ffffffff80243774>] kthread+0x0/0x74
[<ffffffff8020cabe>] child_rip+0x0/0x12
Lin Ming confirms that this patch fixes the issue. I've run tests with
it for the past week and no ill effects have been observed, so I'm
proposing it for inclusion into 2.6.26.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
AS scheduler alternates between issuing read and write batches. It does
the batch switch only after all requests from the previous batch are
completed.
When switching to a write batch, if there is an on-going read request,
it waits for its completion and indicates its intention of switching by
setting ad->changed_batch and the new direction but does not update the
batch_expire_time for the new write batch which it does in the case of
no previous pending requests.
On completion of the read request, it sees that we were waiting for the
switch and schedules work for kblockd right away and resets the
ad->changed_data flag.
Now when kblockd enters dispatch_request where it is expected to pick
up a write request, it in turn ends the write batch because the
batch_expire_timer was not updated and shows the expire timestamp for
the previous batch.
This results in the write starvation for all the cases where there is
the intention for switching to a write batch, but there is a previous
in-flight read request and the batch gets reverted to a read_batch
right away.
This also holds true in the reverse case (switching from a write batch
to a read batch with an in-flight write request).
I've checked that this bug exists on 2.6.11, 2.6.18, 2.6.24 and
linux-2.6-block git HEAD. I've tested the fix on x86 platforms with
SCSI drives where the driver asks for the next request while a current
request is in-flight.
This patch is based off linux-2.6-block git HEAD.
Bug reproduction:
A simple scenario which reproduces this bug is:
- dd if=/dev/hda3 of=/dev/null &
- lilo
The lilo takes forever to complete.
This can also be reproduced fairly easily with the earlier dd and
another test
program doing msync().
The example test program below should print out a message after every
iteration
but it simply hangs forever. With this bugfix it makes forward progress.
====
Example test program using msync() (thanks to suleiman AT google DOT
com)
inline uint64_t
rdtsc(void)
{
int64_t tsc;
__asm __volatile("rdtsc" : "=A" (tsc));
return (tsc);
}
int
main(int argc, char **argv)
{
struct stat st;
uint64_t e, s, t;
char *p, q;
long i;
int fd;
if (argc < 2) {
printf("Usage: %s <file>\n", argv[0]);
return (1);
}
if ((fd = open(argv[1], O_RDWR | O_NOATIME)) < 0)
err(1, "open");
if (fstat(fd, &st) < 0)
err(1, "fstat");
p = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
MAP_SHARED, fd, 0);
t = 0;
for (i = 0; i < 1000; i++) {
*p = 0;
msync(p, 4096, MS_SYNC);
s = rdtsc();
*p = 0;
__asm __volatile(""::: "memory");
e = rdtsc();
if (argc > 2)
printf("%d: %lld cycles %jd %jd\n",
i, e - s, (intmax_t)s, (intmax_t)e);
t += e - s;
}
printf("average time: %lld cycles\n", t / 1000);
return (0);
}
Cc: <stable@kernel.org>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
commit 4323838215
x86: change size of node ids from u8 to s16
set the range for NODES_SHIFT to 1..15.
The possible range is 1..9
Fixes Bugzilla #10726
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This correctly hooks the VSX dump into Roland McGrath core file
infrastructure. It adds the VSX dump information as an additional elf
note in the core file (after talking more to the tool chain/gdb guys).
This also ensures the formats are consistent between signals, ptrace
and core files.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Fix compile error when CONFIG_VSX is enabled.
arch/powerpc/kernel/signal_64.c: In function 'restore_sigcontext':
arch/powerpc/kernel/signal_64.c:241: error: 'i' undeclared (first use in this function)
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently when a 32 bit process is exec'd on a powerpc 64 bit host the
value in the top three bytes of the personality is clobbered. patch
adds a check in the SET_PERSONALITY macro that will carry all the
values in the top three bytes across the exec.
These three bytes currently carry flags to disable address randomisation,
limit the address space, force zeroing of an mmapped page, etc. Should an
application set any of these bits they will be maintained and honoured on
homogeneous environment but discarded and ignored on a heterogeneous
environment. So if an application requires all mmapped pages to be initialised
to zero and a wrapper is used to setup the personality and exec the target,
these flags will remain set on an all 32 or all 64 bit envrionment, but they
will be lost in the exec on a mixed 32/64 bit environment. Losing these bits
means that the same application would behave differently in different
environments. Tested on a POWER5+ machine with 64bit kernel and a mixed
64/32 bit user space.
Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
When compiling kernel modules for ppc that include <linux/spinlock.h>,
gcc prints a warning message every time it encounters a function
declaration where the inline keyword appears after the return type.
This makes sure that the order of the inline keyword and the return
type is as gcc expects it. Additionally, the __inline__ keyword is
replaced by inline, as checkpatch expects.
Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Gcc 4.3 produced this warning:
arch/powerpc/kernel/signal_64.c: In function 'restore_sigcontext':
arch/powerpc/kernel/signal_64.c:161: warning: array subscript is above array bounds
This is caused by us copying to aliases of elements of the pt_regs
structure. Make those explicit.
This adds one extra __get_user and unrolls a loop.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This removes the experimental status of kdump on PPC64. kdump is on
PPC64 now since more than one year and it has proven to be stable.
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The implementation of huge_ptep_set_wrprotect() directly calls
ptep_set_wrprotect() to mark a hugepte write protected. However this
call is not appropriate on ppc64 kernels as this is a small page only
implementation. This can lead to the hash not being flushed correctly
when a mapping is being converted to COW, allowing processes to continue
using the original copy.
Currently huge_ptep_set_wrprotect() unconditionally calls
ptep_set_wrprotect(). This is fine on ppc32 kernels as this call is
generic. On 64 bit this is implemented as:
pte_update(mm, addr, ptep, _PAGE_RW, 0);
On ppc64 this last parameter is the page size and is passed directly on
to hpte_need_flush():
hpte_need_flush(mm, addr, ptep, old, huge);
And this directly affects the page size we pass to flush_hash_page():
flush_hash_page(vaddr, rpte, psize, ssize, 0);
As this changes the way the hash is calculated we will flush the wrong
pages, potentially leaving live hashes to the original page.
Move the definition of huge_ptep_set_wrprotect() to the 32/64 bit specific
headers.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
On PowerPC processors with non-coherent cache architectures the DMA
subsystem calls invalidate_dcache_range() before performing a DMA read
operation. If the address and length of the DMA buffer are not aligned
to a cache-line boundary this can result in memory outside of the DMA
buffer being invalidated in the cache. If this memory has an
uncommitted store then the data will be lost and a subsequent read of
that address will result in an old value being returned from main memory.
Only when the DMA buffer starts on a cache-line boundary and is an exact
mutiple of the cache-line size can invalidate_dcache_range() be called,
otherwise flush_dcache_range() must be called. flush_dcache_range()
will first flush uncommitted writes, and then invalidate the cache.
Signed-off-by: Andrew Lewis <andrew-lewis at netspace.net.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Since most bootloaders or wrappers tend to update or add some information
to the .dtb they a handled they need some working space to do that in.
By default add 1K of padding via a default setting of DTS_FLAGS.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add CONFIG_VSX config build option. Must compile with POWER4, FPU and ALTIVEC.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch extends the floating point save and restore code to use the
VSX load/stores when VSX is available. This will make FP context
save/restore marginally slower on FP only code, when VSX is available,
as it has to load/store 128bits rather than just 64bits.
Mixing FP, VMX and VSX code will get constant architected state.
The signals interface is extended to enable access to VSR 0-31
doubleword 1 after discussions with tool chain maintainers. Backward
compatibility is maintained.
The ptrace interface is also extended to allow access to VSR 0-31 full
registers.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds the macros for the VSX load/store instruction as most
binutils are not going to support this for a while.
Also add VSX register save/restore macros and vsr[0-63] register definitions.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add a VSX CPU feature. Also add code to detect if VSX is available
from the device tree.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Joel Schopp <jschopp@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The layout of the new VSR registers and how they overlap on top of the
legacy FPR and VR registers is:
VSR doubleword 0 VSR doubleword 1
----------------------------------------------------------------
VSR[0] | FPR[0] | |
----------------------------------------------------------------
VSR[1] | FPR[1] | |
----------------------------------------------------------------
| ... | |
| ... | |
----------------------------------------------------------------
VSR[30] | FPR[30] | |
----------------------------------------------------------------
VSR[31] | FPR[31] | |
----------------------------------------------------------------
VSR[32] | VR[0] |
----------------------------------------------------------------
VSR[33] | VR[1] |
----------------------------------------------------------------
| ... |
| ... |
----------------------------------------------------------------
VSR[62] | VR[30] |
----------------------------------------------------------------
VSR[63] | VR[31] |
----------------------------------------------------------------
VSX has 64 128bit registers. The first 32 regs overlap with the FP
registers and hence extend them with and additional 64 bits. The
second 32 regs overlap with the VMX registers.
This commit introduces the thread_struct changes required to reflect
this register layout. Ptrace and signals code is updated so that the
floating point registers are correctly accessed from the thread_struct
when CONFIG_VSX is enabled.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Make load_up_fpu and load_up_altivec callable so they can be reused by
the VSX code.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Move the altivec_unavailable code, to make room at 0xf40 where the
vsx_unavailable exception will be.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We are going to change where the floating point registers are stored
in the thread_struct, so in preparation add some macros to access the
floating point registers. Update all code to use these new macros.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
If we set the SPE MSR bit in save_user_regs we can blow away the VEC
bit. This doesn't matter in reality as they are in fact the same bit
but looks bad.
Also, when we add VSX in a later patch, we need to be able to set two
separate MSR bits here.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently we set the start of the .text section to be 4Mb for pSeries.
In situations where the zImage is > 8Mb we'll fail to boot (due to
overlapping with OF). Move .text in a zImage from 4MB to 64MB
(well past OF).
We still will not be able to load large zImage unless we also move OF,
to that end, add a note to the zImage ELF to move OF to 32Mb. If this
is the very first kernel booted then we'll need to move OF manually by
setting real-base.
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Use an alternative feature section in _switch. There are three cases
handled here, either we don't have an SLB, in which case we jump over the
entire code section, or if we do we either do or don't have 1TB segments.
Boot tested on Power3, Power5 and Power5+.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This commit adds tests of the feature fixup code, they are run during
boot if CONFIG_FTR_FIXUP_SELFTEST=y. Some of the tests manually invoke
the patching routines to check their behaviour, and others use the
macros and so are patched during the normal patching done during boot.
Because we have two sets of macros with different names, we use a macro
to generate the test of the macros, very niiiice.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This commit adds the logic to patch alternative sections. This is fairly
straightforward, except for branches. Relative branches that jump from
inside the else section to outside of it need to be translated as they're
moved, otherwise they will jump to the wrong location.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The current feature section logic only supports nop'ing out code, this means
if you want to choose at runtime between instruction sequences, one or both
cases will have to execute the nop'ed out contents of the other section, eg:
BEGIN_FTR_SECTION
or 1,1,1
END_FTR_SECTION_IFSET(FOO)
BEGIN_FTR_SECTION
or 2,2,2
END_FTR_SECTION_IFCLR(FOO)
and the resulting code will be either,
or 1,1,1
nop
or,
nop
or 2,2,2
For small code segments this is fine, but for larger code blocks and in
performance criticial code segments, it would be nice to avoid the nops.
This commit starts to implement logic to allow the following:
BEGIN_FTR_SECTION
or 1,1,1
FTR_SECTION_ELSE
or 2,2,2
ALT_FTR_SECTION_END_IFSET(FOO)
and the resulting code will be:
or 1,1,1
or,
or 2,2,2
We achieve this by extending the existing FTR macros. The current feature
section semantic just becomes a special case, ie. if the else case is empty
we nop out the default case.
The key limitation is that the size of the else case must be less than or
equal to the size of the default case. If the else case is smaller the
remainder of the section is nop'ed.
We let the linker put the else case code in with the rest of the text,
so that relative branches from the else case are more likley to link,
this has the disadvantage that we can't free the unused else cases.
This commit introduces the required macro and linker script changes, but
does not enable the patching of the alternative sections.
We also need to update two hand-made section entries in reg.h and timex.h
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently we have three versions of MAKE_FTR_SECTION_ENTRY(), the macro that
generates a feature section entry. There is 64bit version, a 32bit version
and version for 32bit code built with a 64bit kernel.
Rather than triplicating (?) the MAKE_FTR_SECTION_ENTRY() logic, we can
move the 64bit/32bit differences into separate macros, and then only have
one version of MAKE_FTR_SECTION_ENTRY().
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The CPU and firmware feature fixup macros are currently spread across
three files, firmware.h, cputable.h and asm-compat.h. Consolidate them
into their own file, feature-fixups.h
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The logic to patch CPU feature sections lives in cputable.c, but these days
it's used for CPU features as well as firmware features. Move it into
it's own file for neatness and as preparation for some additions.
While we're moving the code, we pull the loop body logic into a separate
routine, and remove a comment which doesn't apply anymore.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
A bunch of code has hard-coded the value for a "nop" instruction, it
would be nice to have a #define for it.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add tests of the existing code patching routines, as well as the new
routines added in the last commit. The self-tests are run late in boot
when CONFIG_CODE_PATCHING_SELFTEST=y, which depends on DEBUG_KERNEL=y.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This commit adds some new routines for patching code, which will be used
in a following commit.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Because function pointers point to different things on 32-bit vs 64-bit,
add a macro that deals with dereferencing the OPD on 64-bit. The soon to
be merged ftrace wants this, as well as other code I am working on.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>