Pull block fixes from Jens Axboe:
"Another week, another round of fixes.
These have been brewing for a bit and in various iterations, but I
feel pretty comfortable about the quality of them. They fix real
issues. The pull request is mostly blk-mq related, and the only one
not fixing a real bug, is the tag iterator abstraction from Christoph.
But it's pretty trivial, and we'll need it for another fix soon.
Apart from the blk-mq fixes, there's an NVMe affinity fix from Keith,
and a single fix for xen-blkback from Roger fixing failure to free
requests on disconnect"
* 'for-linus' of git://git.kernel.dk/linux-block:
blk-mq: factor out a helper to iterate all tags for a request_queue
blk-mq: fix racy updates of rq->errors
blk-mq: fix deadlock when reading cpu_list
blk-mq: avoid inserting requests before establishing new mapping
blk-mq: fix q->mq_usage_counter access race
blk-mq: Fix use after of free q->mq_map
blk-mq: fix sysfs registration/unregistration race
blk-mq: avoid setting hctx->tags->cpumask before allocation
NVMe: Set affinity after allocating request queues
xen/blkback: free requests on disconnection
This reverts commit e51e38494a: we
actually do want the device to work in extended W mode, as this is the
mode that allows us to receive information for multiple contacts.
Cc: stable@vger.kernel.org
During development it was found that a number of builds would panic
during the kernel init process, more specifically in 'delayed_fput()'.
The panic showed the kernel trying to access a memory address of
'0xb7fdc00' while traversing the 'delayed_fput_list' structure.
Comparing this memory address to the value of the pointer used on
builds that did not panic confirmed that the pointer on crashing
builds must have been corrupted at some stage earlier in the init
process.
By traversing the list earlier and earlier in the code it was found
that 'plat_mem_setup()' was responsible for corrupting the list.
Specifically the line:
memory = cvmx_bootmem_phy_alloc(mem_alloc_size,
__pa_symbol(&__init_end), -1,
0x100000,
CVMX_BOOTMEM_FLAG_NO_LOCKING);
Which would eventually call:
cvmx_bootmem_phy_set_size(new_ent_addr,
cvmx_bootmem_phy_get_size
(ent_addr) -
(desired_min_addr -
ent_addr));
Where 'new_ent_addr'=0x4800000 (the address of 'delayed_fput_list')
and the second argument (size)=0xb7fdc00 (the address causing the
kernel panic). The job of this part of 'plat_mem_setup()' is to
allocate chunks of memory for the kernel to use. At the start of
each chunk of memory the size of the chunk is written, hence the
value 0xb7fdc00 is written onto memory at 0x4800000, therefore the
kernel panics when it goes back to access 'delayed_fput_list' later
on in the initialisation process.
On builds that were not crashing it was found that the compiler had
placed 'delayed_fput_list' at 0x4800008, meaning it wasn't corrupted
(but something else in memory was overwritten).
As can be seen in the first function call above the code begins to
allocate chunks of memory beginning from the symbol '__init_end'.
The MIPS linker script (vmlinux.lds.S) however defines the .bss
section to begin after '__init_end'. Therefore memory within the
.bss section is allocated to the kernel to use (System.map shows
'delayed_fput_list' and other kernel structures to be in .bss).
To stop the kernel panic (and the .bss section being corrupted)
memory should begin being allocated from the symbol '_end'.
Signed-off-by: Matt Bennett <matt.bennett@alliedtelesis.co.nz>
Acked-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: aleksey.makarov@auriga.com
Patchwork: https://patchwork.linux-mips.org/patch/11251/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Commit 1a3d59579b ("MIPS: Tidy up FPU context switching") removed FP
context saving from the asm-written resume function in favour of reusing
existing code to perform the same task. However it only removed the FP
context saving code from the r4k_switch.S implementation of resume.
Remove it from the r2300_switch.S implementation too in order to prevent
attempting to save the FP context twice, which would likely lead to an
exception from the second save because the FPU had already been disabled
by the first save.
This patch has only been build tested, using rbtx49xx_defconfig.
Fixes: 1a3d59579b ("MIPS: Tidy up FPU context switching")
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-kernel@vger.kernel.org
Cc: Manuel Lauss <manuel.lauss@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/11167/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The codec supports 4 channel recording with TDM on AIF1.
This patch modifies the DAI capability to allow it.
Signed-off-by: Ben Zhang <benzh@chromium.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
This adds an entry to indicate the DA7219 bindings document (and
other Dialog codecs bindings documents) are supported.
Signed-off-by: Adam Thomson <Adam.Thomson.Opensource@diasemi.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
This adds support for the DA7219 audio codec with built-in advanced
accessory detect features.
Signed-off-by: Adam Thomson <Adam.Thomson.Opensource@diasemi.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Designware I2S uses the tx empty and rx available signals as the DMA
handshaking signals. If an XRUN occurs during playback, i2s_stop() is
executed and both the tx and rx irqs are masked. When playback
resumes, i2s_start() is executed but the tx and rx irqs are not
unmasked, so I2S stops sending DMA handshaking signals to the DMA
controller. As a result, playback stops for good the first time an
XRUN occurs.
[On list discussion suggests this may be partly a race condition on slow
systems -- broonie]
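The mask/unmask pairing described above can be modelled in a few lines of userspace C. This is only a sketch: the struct and the i2s_sketch_* names are hypothetical and do not reflect the Designware register layout or the driver's real code. It shows why a start path that fails to unmask the irqs leaves the device unable to handshake after the first stop.

```c
#include <stdbool.h>

/* Hypothetical model of the state described above; not the real driver. */
struct i2s_sketch {
	bool running;
	bool tx_irq_masked;
	bool rx_irq_masked;
};

/* Stop: streaming halts and both irqs are masked. */
static void i2s_sketch_stop(struct i2s_sketch *dev)
{
	dev->running = false;
	dev->tx_irq_masked = true;
	dev->rx_irq_masked = true;
}

/* Start: unmask both irqs again before streaming resumes. */
static void i2s_sketch_start(struct i2s_sketch *dev)
{
	dev->tx_irq_masked = false;
	dev->rx_irq_masked = false;
	dev->running = true;
}

/* Handshaking needs a running stream with both irqs unmasked. */
static bool i2s_sketch_can_handshake(const struct i2s_sketch *dev)
{
	return dev->running && !dev->tx_irq_masked && !dev->rx_irq_masked;
}
```

Dropping the two unmask assignments from i2s_sketch_start() reproduces the reported behaviour in this model: after one stop, i2s_sketch_can_handshake() stays false.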
Signed-off-by: Yitian Bu <yitian.bu@tangramtek.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
These two fields are line parameters for BE/CC links and
should not be from topology but from ACPI.
Signed-off-by: Mengdong Lin <mengdong.lin@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
struct snd_soc_tplg_link_config is defined to configure BE & CC links.
Signed-off-by: Vedang Patel <vedang.patel@intel.com>
Signed-off-by: Mengdong Lin <mengdong.lin@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Define the topology type for BE DAI link: SND_SOC_TPLG_TYPE_BACKEND_LINK.
Signed-off-by: Vedang Patel <vedang.patel@intel.com>
Signed-off-by: Mengdong Lin <mengdong.lin@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
The topology user space tool will generate this bitwise flag by using
SNDRV_PCM_FORMAT_* exposed by asound.h, and the topology core will copy
this flag when generating DAI streams.
Signed-off-by: Mengdong Lin <mengdong.lin@intel.com>
Acked-by: Liam Girdwood <liam.r.girdwood@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
For codec-codec links, this struct will be mapped to the DAI link's
params, which is struct snd_soc_pcm_stream and it needs a stream name.
Signed-off-by: Vedang Patel <vedang.patel@intel.com>
Signed-off-by: Mengdong Lin <mengdong.lin@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
This fixes the endianness of the ABI parameters in the struct.
The field 'num_kcontrols' is also extended from 16 bits to 32 bits.
Signed-off-by: Vedang Patel <vedang.patel@intel.com>
Signed-off-by: Mengdong Lin <mengdong.lin@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
The struct snd_soc_tplg_stream_config is no longer used in the ABI.
We are using snd_soc_tplg_stream instead.
Signed-off-by: Vedang Patel <vedang.patel@intel.com>
Signed-off-by: Mengdong Lin <mengdong.lin@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
The struct snd_soc_tplg_pcm_dai is renamed to snd_soc_tplg_pcm.
This struct will now be used to handle data related to PCMs
(FE DAI & DAI links). It's not for BE, because BE DAI mappings will be
provided by ACPI/FDT data.
Remove the unused struct snd_soc_tplg_pcm_cfg_caps. We are using
snd_soc_tplg_stream and snd_soc_stream_caps instead.
Bump ABI version to 4.
Signed-off-by: Vedang Patel <vedang.patel@intel.com>
Signed-off-by: Mengdong Lin <mengdong.lin@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Richard Fitzgerald <rf@opensource.wolfsonmicro.com>
Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Adds convenience defines for declaring a gain control that
has an input mux. These blocks are functionally equivalent to
the existing mixer blocks but can only have a single input
active at once.
Signed-off-by: Richard Fitzgerald <rf@opensource.wolfsonmicro.com>
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
This commit adds a hwdep interface, as the drivers for other IEEE 1394
sound devices have.
This interface is designed for mixer/control applications. By using this
interface, an application can get information about the FireWire node,
lock/unlock kernel streaming, and be notified when kernel streaming
starts or stops.
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
This commit adds PCM functionality to transmit/receive PCM samples.
When one of the PCM substreams is running or an external clock source is
selected, the current sampling rate is used. Otherwise, the sampling rate
is changed as the userspace application requests.
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
This commit adds streaming functionality for both directions. To utilize
the sequence of the number of data blocks in packets, full duplex with
synchronization is applied.
Besides, TASCAM FireWire series allows drivers to decide which PCM data
channels are enabled. For convenience, this driver always enables all of
the data channels.
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
TASCAM FireWire series uses non-blocking transmission for AMDTP packet
streaming, while the format of data blocks is unique.
The CIP headers include a specific value in the FMT field and no SYT
information.
In transmitted packets, the first data channel represents an event
counter, and the last data channel has status and control information.
The rest have 24-bit PCM samples with right padding.
In received packets, all of the data channels carry 16-, 24- or 32-bit
PCM samples. There's no other kind of information.
This commit adds support for this protocol. For convenience, the size of
PCM samples in outgoing packets is limited to 16 and 24 bits. The status
and control information will be supported in future commits.
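As an illustration of "24-bit PCM samples with right padding", the following userspace sketch extracts one sample from a 32-bit data channel. It assumes the channel arrives as four big-endian bytes with the sample in the upper three bytes and one padding byte at the end; the function name is hypothetical and is not taken from the driver.

```c
#include <stdint.h>

/*
 * Assumption: q[] holds one 32-bit data channel in big-endian order,
 * with the 24-bit sample in q[0..2] and q[3] as right padding.
 */
static int32_t pcm24_from_quadlet_sketch(const uint8_t q[4])
{
	uint32_t raw = ((uint32_t)q[0] << 16) |
		       ((uint32_t)q[1] << 8) |
		       (uint32_t)q[2];

	/* Sign-extend from bit 23 without relying on signed right shifts. */
	if (raw & 0x800000u)
		return (int32_t)raw - 0x1000000;
	return (int32_t)raw;
}
```

The padding byte is ignored entirely, so whatever the device places there does not affect the decoded sample in this sketch.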
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
TASCAM FireWire series has certain registers for firmware information.
This commit adds proc node to show the information.
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
TASCAM FireWire series doesn't tell drivers their capabilities, thus
the drivers should have model-dependent parameters and apply them to
detected devices.
This commit adds a structure to represent such parameters.
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
This commit adds a new driver for TASCAM FireWire series. In this commit,
this driver just creates/removes a card instance according to bus events.
More functionality will be added in following commits.
TASCAM FireWire series consists of:
* PDI 1394P23 for IEEE 1394 PHY layer
* PDI 1394L40 for IEEE 1394 LINK layer and IEC 61883 interface
* XILINX XC9536XL
* XILINX Spartan-II XC2S100
* ATMEL AT91M42800A
Ilya Zimnovich investigated the TASCAM FireWire series in 2011 and
discovered some features of his FW-1804. Part of his research is
available in the FFADO project.
http://subversion.ffado.org/wiki/Tascam
Part of my work is based on Ilya's investigation, while this series
doesn't support the FW-1804, because of a lack of config ROM
information and its protocol detail, especially for PCM channels.
I observed that the FW-1884 and FW-1082 don't work properly with 1394
OHCI controllers based on the VT6315. The controller can actually exchange
packets with these models, but the models generate no sound. This may be
due to PHY/LINK layer issues. Using a 1394 OHCI controller produced by
another vendor, such as Texas Instruments, may work, as may adding
another node to the bus.
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Commit 3a0f9aaee0 ("dm raid: round region_size to power of two")
intended to make sure that the default region size is a power of two.
However, the logic in that commit is incorrect and sets the variable
region_size to 0 or 1, depending on whether min_region_size is a power
of two.
Fix this logic, using roundup_pow_of_two(), so that region_size is
properly rounded up to the next power of two.
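For readers unfamiliar with the operation, here is a userspace sketch of rounding up to the next power of two. It is not the kernel's roundup_pow_of_two() implementation; the *_sketch names are hypothetical, and unlike the kernel helper this version returns 1 for an input of 0 and assumes inputs no larger than 2^63.

```c
#include <stdint.h>

/* True if n has exactly one bit set. */
static int is_pow2_sketch(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Smallest power of two >= n (assumes n <= 2^63). */
static uint64_t roundup_pow2_sketch(uint64_t n)
{
	if (n <= 1)
		return 1;
	n--;
	/* Smear the highest set bit into every lower position. */
	n |= n >> 1;
	n |= n >> 2;
	n |= n >> 4;
	n |= n >> 8;
	n |= n >> 16;
	n |= n >> 32;
	return n + 1;
}
```

A value that is already a power of two is returned unchanged, which is the behaviour a default region size needs.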
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Fixes: 3a0f9aaee0 ("dm raid: round region_size to power of two")
Cc: stable@vger.kernel.org # v3.8+
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Merge tag 'mmc-v4.3-rc3' of git://git.linaro.org/people/ulf.hansson/mmc
Pull MMC fixes from Ulf Hansson:
"Here are some mmc fixes intended for v4.3 rc4:
MMC core:
- Allow users of mmc_of_parse() to succeed when CONFIG_GPIOLIB is
unset
- Prevent infinite loop of re-tuning for CRC-errors for CMD19 and
CMD21
MMC host:
- pxamci: Fix issues with card detect
- sunxi: Fix clk-delay settings"
* tag 'mmc-v4.3-rc3' of git://git.linaro.org/people/ulf.hansson/mmc:
mmc: core: fix dead loop of mmc_retune
mmc: pxamci: fix card detect with slot-gpio API
mmc: sunxi: Fix clk-delay settings
mmc: core: Don't return an error for CD/WP GPIOs when GPIOLIB is unset
Pull IOVA fixes from David Woodhouse:
"The main fix here is the first one, fixing the over-allocation of
size-aligned requests. The other patches simply make the existing
IOVA code available to users other than the Intel VT-d driver, with no
functional change.
I concede the latter really *should* have been submitted during the
merge window, but since it's basically risk-free and people are
waiting to build on top of it and it's my fault I didn't get it in, I
(and they) would be grateful if you'd take it"
* git://git.infradead.org/intel-iommu:
iommu: Make the iova library a module
iommu: iova: Export symbols
iommu: iova: Move iova cache management to the iova library
iommu/iova: Avoid over-allocating when size-aligned
Currently, input enable settings are missing from the PH1-sLD8
pinctrl driver. (All the entries in the pin table are set to
UNIPHIER_PIN_IECTRL_NONE).
Fill the table with correct values.
Fixes: 95372f9dc8 ("pinctrl: UniPhier: add UniPhier PH1-sLD8 pinctrl driver")
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
The code in pinctrl-imx.c only works correctly if in the
imx_pinctrl_soc_info passed to imx_pinctrl_probe we have:
info->pins[i].number = i
conf_reg(info->pins[i]) = 4 * i
(with conf_reg(pin) being the offset of the pin's configuration
register).
When the imx25 specific part was introduced in b4a87c9b96 ("pinctrl:
pinctrl-imx: add imx25 pinctrl driver") we had:
info->pins[i].number = i + 1
conf_reg(info->pins[i]) = 4 * i
Commit 34027ca2bb ("pinctrl: imx25: fix numbering for pins") tried
to fix that but made the situation:
info->pins[i-1].number = i
conf_reg(info->pins[i-1]) = 4 * i
which is hardly better but fixed the error seen back then.
So insert another reserved entry in the array to finally yield:
info->pins[i].number = i
conf_reg(info->pins[i]) = 4 * i
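The invariant can be expressed as a small userspace check. This is a sketch only: struct pin_sketch and first_bad_pin_sketch() are hypothetical stand-ins, not the types or helpers used by pinctrl-imx.c.

```c
#include <stddef.h>

/* Hypothetical stand-in for one pin table entry. */
struct pin_sketch {
	unsigned int number;
	unsigned int conf_reg;
};

/*
 * Return the index of the first entry violating
 *   pins[i].number == i  and  pins[i].conf_reg == 4 * i,
 * or -1 if the whole table satisfies the invariant.
 */
static long first_bad_pin_sketch(const struct pin_sketch *pins, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (pins[i].number != i || pins[i].conf_reg != 4 * i)
			return (long)i;
	}
	return -1;
}
```

In this model, a table starting at { 1, 4 } fails at index 0, and prepending a reserved { 0, 0 } entry makes it pass, which mirrors the fix described above.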
Fixes: 34027ca2bb ("pinctrl: imx25: fix numbering for pins")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
The comment for PG14 mux setting 3 already correctly states that this
muxes PG13 to pwm1, but the text associated with it said uart3, fix this.
Note that we use "pwm" rather than "pwm1" to be consistent with pwm0
where the mux setting is also simply called "pwm" and to be consistent
with sun4i/sun7i which do the same.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
According to the Designware I2S datasheet, the tx/rx XRUN irq is cleared
by reading the TOR/ROR registers, rather than by writing to them.
Signed-off-by: Yitian Bu <yitian.bu@tangramtek.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
The entire bpf_jit_asm.S is written in noreorder mode because "we know
better" according to a comment. This also prevented the assembler from
throwing in the required NOPs for MIPS I processors which have no
load-use interlock, thus the load's consumer might end up using the
old value of the register from prior to the load.
Fixed by putting the assembler in reorder mode for just the affected
load instructions. This is not enough for gas to actually try to be
clever by looking at the next instruction and inserting a nop only
when needed but as the comment said "we know better", so getting gas
to unconditionally emit a NOP is just right in this case and prevents
adding further ifdefery.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
On x32, gcc predefines __x86_64__ but long is only 32-bit. Use
__ILP32__ to distinguish x32.
Fixes this compiler error in perf:
tools/include/asm-generic/bitops/__ffs.h: In function '__ffs':
tools/include/asm-generic/bitops/__ffs.h:19:8: error: right shift count >= width of type [-Werror=shift-count-overflow]
word >>= 32;
^
This isn't sufficient to build perf for x32, though.
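A minimal sketch of the macro test, assuming a gcc-compatible compiler on a Unix-like target (where __LP64__ is predefined for LP64 ABIs). LONG_BITS_SKETCH and long_bits_actual_sketch() are hypothetical names, not the identifiers used in the tools/ headers.

```c
#include <limits.h>

/* x32 predefines both __x86_64__ and __ILP32__, so test for it first. */
#if defined(__x86_64__) && defined(__ILP32__)
# define LONG_BITS_SKETCH 32	/* x32: 64-bit ISA, 32-bit long */
#elif defined(__LP64__)
# define LONG_BITS_SKETCH 64
#else
# define LONG_BITS_SKETCH 32
#endif

/* What the compiler actually uses, for comparison with the macro. */
static int long_bits_actual_sketch(void)
{
	return (int)(sizeof(long) * CHAR_BIT);
}
```

Testing __x86_64__ alone would select 64 on x32, which is how a shift by 32 on a 32-bit long ends up in the compiler error quoted above.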
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1443660043.2730.15.camel@decadent.org.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Passing -1 to bitmap_storage_alloc() causes page->index to be set to
-1, which is quite problematic.
So only pass ->cluster_slot if mddev_is_clustered().
Fixes: b97e92574c ("Use separate bitmaps for each nodes in the cluster")
Cc: stable@vger.kernel.org (v4.1+)
Signed-off-by: NeilBrown <neilb@suse.com>
close_sync() needs to set conf->next_resync to a large, but safe value
below MaxSector and use it to determine whether or not to set
start_next_window in wait_barrier().
Solution suggested by Neil Brown.
Reported-by: Nate Dailey <nate.dailey@stratus.com>
Tested-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Remove unneeded NULL test.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@ expression x; @@
-if (x != NULL)
\(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: NeilBrown <neilb@suse.com>
If an array has more faulty disks than its allowed degraded number, the
array enters error handling. It will be marked as read-only with
MD_CHANGE_PENDING/RECOVERY_NEEDED set. But currently recovery doesn't
clear the CHANGE_PENDING bit for a read-only array. If MD_CHANGE_PENDING
is set for a raid5 array, all returned IO is held on a list until the
bit is cleared. Because recovery never clears this bit, the IO stays in
the pending state and never finishes. This has bad effects: the upper
layer can't get an IO error and the array can't be stopped.
Fixes: c3cce6cda1 ("md/raid5: ensure device failure recorded before write request returns.")
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Calling e.g. blk_queue_max_hw_sectors() after calls to
disk_stack_limits() discards the settings determined by
disk_stack_limits().
So we need to make those calls first.
Fixes: 199dc6ed51 ("md/raid0: update queue parameter in a safer location.")
Cc: stable@vger.kernel.org (v2.6.35+ - please apply with 199dc6ed51).
Reported-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
While need_this_block() probably shouldn't be called when there
are more than 2 failed devices, we really don't want it to try
indexing beyond the end of the failed_num[] or fdev[] arrays.
So limit the loops to at most 2 iterations.
Reported-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.de>
handle_failed_stripe() makes the stripe fail, i.e. all IO will return
with a failure, but it doesn't update stripe_head_state. Later
handle_stripe() has special handling for raid6 for handle_stripe_fill().
That check before handle_stripe_fill() doesn't skip the failed stripe
and we get a kernel crash in need_this_block(). This patch clears the
analysis state to make sure no functions are wrongly called after
handle_failed_stripe().
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
If a superblock update is pending, wait for it to complete before
letting md_set_readonly() switch to readonly.
Otherwise we might lose important information about a device having
failed.
For external arrays, waiting for superblock updates can wait on
user-space, so in that case, just return an error.
Reported-and-tested-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Unused space between the end of __ex_table and the start of
rodata can be left W+x in the kernel page tables. Extend the
setting of the NX bit to cover this gap by starting from
text_end rather than rodata_start.
Before:
---[ High Kernel Mapping ]---
0xffffffff80000000-0xffffffff81000000 16M pmd
0xffffffff81000000-0xffffffff81600000 6M ro PSE GLB x pmd
0xffffffff81600000-0xffffffff81754000 1360K ro GLB x pte
0xffffffff81754000-0xffffffff81800000 688K RW GLB x pte
0xffffffff81800000-0xffffffff81a00000 2M ro PSE GLB NX pmd
0xffffffff81a00000-0xffffffff81b3b000 1260K ro GLB NX pte
0xffffffff81b3b000-0xffffffff82000000 4884K RW GLB NX pte
0xffffffff82000000-0xffffffff82200000 2M RW PSE GLB NX pmd
0xffffffff82200000-0xffffffffa0000000 478M pmd
After:
---[ High Kernel Mapping ]---
0xffffffff80000000-0xffffffff81000000 16M pmd
0xffffffff81000000-0xffffffff81600000 6M ro PSE GLB x pmd
0xffffffff81600000-0xffffffff81754000 1360K ro GLB x pte
0xffffffff81754000-0xffffffff81800000 688K RW GLB NX pte
0xffffffff81800000-0xffffffff81a00000 2M ro PSE GLB NX pmd
0xffffffff81a00000-0xffffffff81b3b000 1260K ro GLB NX pte
0xffffffff81b3b000-0xffffffff82000000 4884K RW GLB NX pte
0xffffffff82000000-0xffffffff82200000 2M RW PSE GLB NX pmd
0xffffffff82200000-0xffffffffa0000000 478M pmd
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: <stable@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1443704662-3138-1-git-send-email-sds@tycho.nsa.gov
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The original bug is a page fault crash that sometimes happens
on big machines when preparing ELF headers:
BUG: unable to handle kernel paging request at ffffc90613fc9000
IP: [<ffffffff8103d645>] prepare_elf64_ram_headers_callback+0x165/0x260
The bug is caused by us under-counting the number of memory ranges
and subsequently not allocating enough ELF header space for them.
The bug is typically masked on smaller systems, because the ELF header
allocation is rounded up to the next page.
This patch modifies the code in fill_up_crash_elf_data() by using
walk_system_ram_res() instead of walk_system_ram_range() to correctly
count the max number of crash memory ranges. That's because the
walk_system_ram_range() filters out small memory regions that
reside in the same page, but walk_system_ram_res() does not.
Here's how I found the bug:
Tracing prepare_elf64_headers() and prepare_elf64_ram_headers_callback()
shows that the code uses walk_system_ram_res() to fill in crash memory
region information in the program headers, so it counts those small
memory regions that reside within a single page.
But when the kernel used walk_system_ram_range() in
fill_up_crash_elf_data() to count the number of crash memory regions,
it filtered out the small regions.
I printed those small memory regions, for example:
kexec: Get nr_ram ranges. vaddr=0xffff880077592258 paddr=0x77592258, sz=0xdc0
Based on the code in walk_system_ram_range(), this memory region
will be filtered out:
pfn = (0x77592258 + 0x1000 - 1) >> 12 = 0x77593
end_pfn = (0x77592258 + 0xfc0 -1 + 1) >> 12 = 0x77593
end_pfn - pfn = 0x77593 - 0x77593 = 0 <=== if (end_pfn > pfn) is FALSE
So, the max_nr_ranges that's counted by the kernel doesn't include
small memory regions - causing us to under-allocate the required space.
That causes the page fault crash that happens in a later code path
when preparing ELF headers.
This bug is not easy to reproduce on small machines that have few
CPUs, because the allocated page aligned ELF buffer has more free
space to cover those small memory regions' PT_LOAD headers.
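The filtering arithmetic shown above can be replayed in userspace. This sketch assumes 4 KiB pages (a shift of 12, as in the example); the macro and function names are hypothetical. Note that the logged size is 0xdc0 while the worked arithmetic uses 0xfc0; both give the same result.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 12
#define PAGE_SIZE_SKETCH  (UINT64_C(1) << PAGE_SHIFT_SKETCH)

/*
 * Mirror the pfn test from the commit message: a region is only
 * reported if it contains at least one whole page frame.
 */
static bool range_has_full_page_sketch(uint64_t start, uint64_t size)
{
	uint64_t end = start + size - 1;	/* inclusive end address */
	uint64_t pfn = (start + PAGE_SIZE_SKETCH - 1) >> PAGE_SHIFT_SKETCH;
	uint64_t end_pfn = (end + 1) >> PAGE_SHIFT_SKETCH;

	return end_pfn > pfn;
}
```

For the logged region at 0x77592258, both pfn and end_pfn come out as 0x77593, so the region is skipped by this test even though a PT_LOAD header is later written for it.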
Signed-off-by: Lee, Chun-Yi <jlee@suse.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: kexec@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1443531537-29436-1-git-send-email-jlee@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The default clock enabling functions for TI clocks,
omap2_dflt_clk_enable() and omap2_dflt_clk_disable(), perform a
NULL check for the enable_reg field of the clk_hw_omap structure.
This enable_reg field however is merely a combination of the index
of the master IP module, and the offset from the master IP module's
base address. A value of 0 is perfectly valid, and the current error
checking will fail in these cases. The issue was found when trying
to enable the iva2_ck clock on OMAP3 platforms.
So, switch the check to use IS_ERR. This correction is similar to the
logic used in commit c807dbedb5 ("clk: ti: fix ti_clk_get_reg_addr
error handling").
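The difference between the two checks can be sketched in userspace. The kernel's IS_ERR() treats only the top 4095 values of the address space as errors; the helpers below imitate that under hypothetical *_sketch names and are not the kernel macros themselves.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095

/* Error pointers occupy the last MAX_ERRNO_SKETCH values of the range. */
static bool is_err_sketch(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

/* The old style of check: rejects a value of 0 outright. */
static bool null_check_rejects_sketch(const void *ptr)
{
	return ptr == NULL;
}
```

A register value of 0 is rejected by the NULL-style check but accepted by the IS_ERR-style check, while an encoded error such as -22 is caught only by the latter.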
Fixes: 9f37e90efa ("clk: ti: dflt: move support for default gate clock..")
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Tero Kristo <t-kristo@ti.com>