linux-pinenote

Author	SHA1	Message	Date
Wang Nan	52d3352e79	bpf tools: Create eBPF maps defined in an object file This patch creates maps based on 'map' section in object file using bpf_create_map(), and stores the fds into an array in 'struct bpf_object'. Previous patches parse ELF object file and collects required data, but doesn't play with the kernel. They belong to the 'opening' phase. This patch is the first patch in 'loading' phase. The 'loaded' field is introduced in 'struct bpf_object' to avoid loading an object twice, because the loading phase clears resources collected during the opening which becomes useless after loading. In this patch, maps_buf is cleared. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1435716878-189507-17-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2015-08-07 10:16:58 -03:00
Wang Nan	e3ed2fef22	bpf tools: Add bpf.c/h for common bpf operations This patch introduces bpf.c and bpf.h, which hold common functions issuing bpf syscall. The goal of these two files is to hide syscall completely from user. Note that bpf.c and bpf.h deal with kernel interface only. Things like structure of 'map' section in the ELF object is not cared by of bpf.[ch]. We first introduce bpf_create_map(). Note that, since functions in bpf.[ch] are wrapper of sys_bpf, they don't use OO style naming. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1435716878-189507-16-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2015-08-07 10:16:58 -03:00
Wang Nan	340909152a	bpf tools: Record map accessing instructions for each program This patch records the indices of instructions which are needed to be relocated. That information is saved in the 'reloc_desc' field in 'struct bpf_program'. In the loading phase (this patch takes effect in the opening phase), the collected instructions will be replaced by map loading instructions. Since we are going to close the ELF file and clear all data at the end of the 'opening' phase, the ELF information will no longer be valid in the 'loading' phase. We have to locate the instructions before maps are loaded, instead of directly modifying the instruction. 'struct bpf_map_def' is introduced in this patch to let us know how many maps are defined in the object. This is the third part of map relocation. The principle of map relocation is described in commit message of 'bpf tools: Collect symbol table from SHT_SYMTAB section'. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1435716878-189507-15-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2015-08-07 10:16:58 -03:00
Wang Nan	b62f06e81b	bpf tools: Collect relocation sections from SHT_REL sections This patch collects relocation sections into 'struct object'. Such sections are used for connecting maps to bpf programs. 'reloc' field in 'struct bpf_object' is introduced for storing such information. This patch simply store the data into 'reloc' field. Following patch will parse them to know the exact instructions which are needed to be relocated. Note that the collected data will be invalid after ELF object file is closed. This is the second patch related to map relocation. The first one is 'bpf tools: Collect symbol table from SHT_SYMTAB section'. The principle of map relocation is described in its commit message. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1435716878-189507-14-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2015-08-07 10:16:57 -03:00
Wang Nan	a5b8bd47dc	bpf tools: Collect eBPF programs from their own sections This patch collects all programs in an object file into an array of 'struct bpf_program' for further processing. That structure is for representing each eBPF program. 'bpf_prog' should be a better name, but it has been used by linux/filter.h. Although it is a kernel space name, I still prefer to call it 'bpf_program' to prevent possible confusion. bpf_object__add_program() creates a new 'struct bpf_program' object. It first init a variable in stack using bpf_program__init(), then if success, enlarges obj->programs array and copy the new object in. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1435716878-189507-13-git-send-email-wangnan0@huawei.com [ Made bpf_object__add_program() propagate the error (-EINVAL or -ENOMEM) ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2015-08-07 10:16:57 -03:00
Wang Nan	bec7d68cb5	bpf tools: Collect symbol table from SHT_SYMTAB section This patch collects symbols section. This section is useful when linking BPF maps. What 'bpf_map_xxx()' functions actually require are map's file descriptors (and the internal verifier converts fds into pointers to 'struct bpf_map'), which we don't know when compiling. Therefore, we should make compiler generate a 'ldr_64 r1, <imm>' instruction, and fill the 'imm' field with the actual file descriptor when loading in libbpf. BPF programs should be written in this way: struct bpf_map_def SEC("maps") my_map = { .type = BPF_MAP_TYPE_HASH, .key_size = sizeof(unsigned long), .value_size = sizeof(unsigned long), .max_entries = 1000000, }; SEC("my_func=sys_write") int my_func(void *ctx) { ... bpf_map_update_elem(&my_map, &key, &value, BPF_ANY); ... } Compiler should convert '&my_map' into a 'ldr_64, r1, <imm>' instruction, where imm should be the address of 'my_map'. According to the address, libbpf knows which map it actually referenced, and then fills the imm field with the 'fd' of that map created by it. However, since we never really 'link' the object file, the imm field is only a record in relocation section. Therefore libbpf should do the relocation: 1. In relocation section (type == SHT_REL), positions of each such 'ldr_64' instruction are recorded with a reference of an entry in symbol table (SHT_SYMTAB); 2. From records in symbol table we can find the indics of map variables. Libbpf first record SHT_SYMTAB and positions of each instruction which required bu such operation. Then create file descriptor. Finally, after map creation complete, replace the imm field. This is the first patch of BPF map related stuff. It records SHT_SYMTAB into object's efile field for further use. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1435716878-189507-12-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2015-08-07 10:16:57 -03:00
Wang Nan	0b3d1efade	bpf tools: Collect map definitions from 'maps' section If maps are used by eBPF programs, corresponding object file(s) should contain a section named 'map'. Which contains map definitions. This patch copies the data of the whole section. Map data parsing should be acted just before map loading. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1435716878-189507-11-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2015-08-07 10:16:57 -03:00
Wang Nan	cb1e5e9619	bpf tools: Collect version and license from ELF sections Expand bpf_obj_elf_collect() to collect license and kernel version information in eBPF object file. eBPF object file should have a section named 'license', which contains a string. It should also have a section named 'version', contains a u32 LINUX_VERSION_CODE. bpf_obj_validate() is introduced to validate object file after loaded. Currently it only check existence of 'version' section. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1435716878-189507-10-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2015-08-07 10:16:57 -03:00
Wang Nan	296036653a	bpf tools: Iterate over ELF sections to collect information bpf_obj_elf_collect() is introduced to iterate over each elf sections to collection information in eBPF object files. This function will futher enhanced to collect license, kernel version, programs, configs and map information. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1435716878-189507-9-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2015-08-07 10:16:56 -03:00
Wang Nan	cc4228d57c	bpf tools: Check endianness and make libbpf fail early Check endianness according to EHDR. Code is taken from tools/perf/util/symbol-elf.c. Libbpf doesn't magically convert missmatched endianness. Even if we swap eBPF instructions to correct byte order, we are unable to deal with endianness in code logical generated by LLVM. Therefore, libbpf should simply reject missmatched ELF object, and let LLVM to create good code. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1435716878-189507-8-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2015-08-07 10:16:56 -03:00
Wang Nan	6c956392b0	bpf tools: Read eBPF object from buffer To support dynamic compiling, this patch allows caller to pass a in-memory buffer to libbpf by bpf_object__open_buffer(). libbpf calls elf_memory() to open it as ELF object file. Because __bpf_object__open() collects all required data and won't need that buffer anymore, libbpf uses that buffer directly instead of clone a new buffer. Caller of libbpf can free that buffer or use it do other things after bpf_object__open_buffer() return. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1435716878-189507-7-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2015-08-07 10:16:56 -03:00
Wang Nan	1a5e3fb1e9	bpf tools: Open eBPF object file and do basic validation This patch defines basic interface of libbpf. 'struct bpf_object' will be the handler of each object file. Its internal structure is hide to user. eBPF object files are compiled by LLVM as ELF format. In this patch, libelf is used to open those files, read EHDR and do basic validation according to e_type and e_machine. All elf related staffs are grouped together and reside in efile field of 'struct bpf_object'. bpf_object__elf_finish() is introduced to clear it. After all eBPF programs in an object file are loaded, related ELF information is useless. Close the object file and free those memory. The zfree() and zclose() functions are introduced to ensure setting NULL pointers and negative file descriptors after resources are released. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1435716878-189507-6-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2015-08-07 10:16:56 -03:00
Wang Nan	b3f59d66e2	bpf tools: Allow caller to set printing function By libbpf_set_print(), users of libbpf are allowed to register he/she own debug, info and warning printing functions. Libbpf will use those functions to print messages. If not provided, default info and warning printing functions are fprintf(stderr, ...); default debug printing is NULL. This API is designed to be used by perf, enables it to register its own logging functions to make all logs uniform, instead of separated logging level control. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1435716878-189507-5-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2015-08-07 10:16:56 -03:00
Wang Nan	1b76c13e4b	bpf tools: Introduce 'bpf' library and add bpf feature check This is the first patch of libbpf. The goal of libbpf is to create a standard way for accessing eBPF object files. This patch creates 'Makefile' and 'Build' for it, allows 'make' to build libbpf.a and libbpf.so, 'make install' to put them into proper directories. Most part of Makefile is borrowed from traceevent. Before building, it checks the existence of libelf in Makefile, and deny to build if not found. Instead of throwing an error if libelf not found, the error raises in a phony target "elfdep". This design is to ensure 'make clean' still workable even if libelf is not found. Because libbpf requires 'kern_version' field set for 'union bpf_attr' (bpfdep" is used for that dependency), Kernel BPF API is also checked by intruducing a new feature check 'bpf' into tools/build/feature, which checks the existence and version of linux/bpf.h. When building libbpf, it searches that file from include/uapi/linux in kernel source tree (controlled by FEATURE_CHECK_CFLAGS-bpf). Since it searches kernel source tree it reside, installing of newest kernel headers is not required, except we are trying to port these files to an old kernel. To avoid checking that file when perf building, the newly introduced 'bpf' feature check doesn't added into FEATURE_TESTS and FEATURE_DISPLAY by default in tools/build/Makefile.feature, but added into libbpf's specific. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Bcc: pi3orama@163.com Link: http://lkml.kernel.org/r/1435716878-189507-4-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2015-08-07 10:16:56 -03:00
Daniel Vetter	209e4dbc8d	drm/vblank: Use u32 consistently for vblank counters In commit `99264a61df` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Wed Apr 15 19:34:43 2015 +0200 drm/vblank: Fixup and document timestamp update/read barriers I've switched vblank->count from atomic_t to unsigned long and accidentally created an integer comparison bug in drm_vblank_count_and_time since vblanke->count might overflow the u32 local copy and hence the retry loop never succeed. Fix this by consistently using u32. Cc: Michel Dänzer <michel@daenzer.net> Reported-by: Michel Dänzer <michel@daenzer.net> Reviewed-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2015-08-07 14:35:53 +02:00
Takashi Iwai	6ac7ada210	ASoC: Fixes for v4.2 There are a couple of small driver specific fixes here but the overwhelming bulk of these changes are fixes to the topology ABI that has been newly introduced in v4.2. Once this makes it into a release we will have to firm this up but for now getting enhancements in before they've made it into a release is the most expedient thing. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJVxJsBAAoJECTWi3JdVIfQHUIH/2NeQyG7bL3TSIm0qK9RdmaT UrWyGm6AwhOrg9j9jnZkXag1v4Fsmo4nnslaDbmCKaWJPs7ol94NuWJK7z3He3nr 47yxl2m8OhxsS7aBDACa4gJFQce+JSjKs1ADnKR7QzY9c1/nTAWLO/LE/I5Ca18u tZg77hviip2ftfLYdNXDZvz3HzuCcRnIouc4izTs16DXwU3aYpqcctFrcqU3nNmp /+5zPwIoLA7KDwPqdYhSPNRuCnFtjVbI8hTzq+aWFlxFBZYlKmSCTSwpCN2aQDgF 1oCOP4dlv7s9Xavd2TOMJ8g3Fuyb/gjX2DhIEm7QCYxwBxdkpwm0na6gMrRJqsg= =NEOq -----END PGP SIGNATURE----- Merge tag 'asoc-fix-v4.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v4.2 There are a couple of small driver specific fixes here but the overwhelming bulk of these changes are fixes to the topology ABI that has been newly introduced in v4.2. Once this makes it into a release we will have to firm this up but for now getting enhancements in before they've made it into a release is the most expedient thing.	2015-08-07 13:53:41 +02:00
Haozhong Zhang	d7add05458	KVM: x86: Use adjustment in guest cycles when handling MSR_IA32_TSC_ADJUST When kvm_set_msr_common() handles a guest's write to MSR_IA32_TSC_ADJUST, it will calcuate an adjustment based on the data written by guest and then use it to adjust TSC offset by calling a call-back adjust_tsc_offset(). The 3rd parameter of adjust_tsc_offset() indicates whether the adjustment is in host TSC cycles or in guest TSC cycles. If SVM TSC scaling is enabled, adjust_tsc_offset() [i.e. svm_adjust_tsc_offset()] will first scale the adjustment; otherwise, it will just use the unscaled one. As the MSR write here comes from the guest, the adjustment is in guest TSC cycles. However, the current kvm_set_msr_common() uses it as a value in host TSC cycles (by using true as the 3rd parameter of adjust_tsc_offset()), which can result in an incorrect adjustment of TSC offset if SVM TSC scaling is enabled. This patch fixes this problem. Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Cc: stable@vger.linux.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-08-07 13:28:03 +02:00
Paolo Bonzini	18c3626e3d	KVM: x86: zero IDT limit on entry to SMM The recent BlackHat 2015 presentation "The Memory Sinkhole" mentions that the IDT limit is zeroed on entry to SMM. This is not documented, and must have changed some time after 2010 (see http://www.ssi.gouv.fr/uploads/IMG/pdf/IT_Defense_2010_final.pdf). KVM was not doing it, but the fix is easy. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-08-07 12:46:32 +02:00
Vineet Gupta	1097163870	ARCv2: spinlock/rwlock/atomics: reduce 1 instruction in exponential backoff The increment of delay counter was 2 instructions: Arithmatic Shfit Left (ASL) + set to 1 on overflow This can be done in 1 using ROtate Left (ROL) Suggested-by: Nigel Topham <ntopham@synopsys.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-07 13:56:16 +05:30
Jason Wang	48900cb6af	virtio-net: drop NETIF_F_FRAGLIST virtio declares support for NETIF_F_FRAGLIST, but assumes that there are at most MAX_SKB_FRAGS + 2 fragments which isn't always true with a fraglist. A longer fraglist in the skb will make the call to skb_to_sgvec overflow the sg array, leading to memory corruption. Drop NETIF_F_FRAGLIST so we only get what we can handle. Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-07 00:13:10 -07:00
Mathieu Olivari	4f7eb70f7b	stmmac: dwmac-ipq806x: fix static checker warning The patch `b1c17215d7`: "stmmac: add ipq806x glue layer", leads to the following static checker warning: .../stmmac/dwmac-ipq806x.c:314 ipq806x_gmac_probe() warn: double left shift '1 << (1 << gmac->id)' The NSS_COMMON_CLK_SRC_CTRL_OFFSET macro is used once as an offset, and once as a mask, which is a bug indeed. We'll fix it by defining the offset as the real offset value and computing the mask from it when required. Tested on IPQ806x ref designs AP148 & DB149. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Mathieu Olivari <mathieu@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-07 00:12:04 -07:00
Ingo Molnar	1354ac6ad8	perf/core improvements and fixes: User visible: - IPC and cycle accounting in 'perf annotate' (Andi Kleen) - Display cycles in branch sort mode in 'perf report' (Andi Kleen) - Add total time column to 'perf trace' syscall stats summary (Milian Woff) Infrastructure: - PMU helpers to use in Intel PT (Adrian Hunter) - Fix perf-with-kcore script not to split args with spaces (Adrian Hunter) - Add empty Build files for some more architectures (Ben Hutchings) - Move 'perf stat' config variables to a struct to allow using some of its functions in more places (Jiri Olsa) - Add DWARF register names for 'xtensa' arch (Max Filippov) - Implement BPF programs attached to uprobes (Wang Nan) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJVxA9EAAoJENZQFvNTUqpAP88P/0I8l7DJrD4e2PpwSIuwUPII kq8fYMJ4OpR2XiGjyny68iZmnASIon+cV6AoidZ27eqG/+qmhzgv9nCjMlPUIpAT KcilxUXjOc2xba8nUdrNRKHKdcxcvp5iuw1dXkfJuf5U5l7cSTv5tko6vDaA6ngH bpmn8wa73ajRwtTErgBSJAwQVMPzo9Ods/FLeZK6t0hYNYNN9ISp7pq+0RhEnNVb gtlE1/DgGccsTs9NDWQqi3bmvCVsVMhaeWLDyCBjx/cwkwuhdhYHAfs8Llmse+51 7adaqFQ7BZMS/8wXCUwCNnIMBBURpQodW/3H//GQ8CtBSNRt+EX8u0zHs+aJj/NR JqlRVOxhFFJU3E/67HDnU1IM9ANQYZq2JomQ1B+PJTUBZvUaBQtKfFlfKhI36o21 2Sv/fsOjcZLJePBPeUVgjCmBvc0vAUBHPN23wHMyP8o6I6NTmTb3LvomZGaO7Af5 HuebGfd92ahVPT1/h3y5lVDnjiNYikoNKJdDh8JiTTbuj8LtvhHN/o5AeAP3Ig2H kJEWMbSuDCdUPRGYeW4z3aDDP0/vxEH8+kXWoTSwORVZcXXbg38eoYcOPqfQQWQt 80+Rf3sTNt8RCoKWJ0AS+nS/S2HWHrJ5G4DIdc1ldm+zL6ElkOkPVm1W7EqvCBd6 tZP13miwhMxritYGX7pM =b2ho -----END PGP SIGNATURE----- Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: - IPC and cycle accounting in 'perf annotate'. (Andi Kleen) - Display cycles in branch sort mode in 'perf report'. (Andi Kleen) - Add total time column to 'perf trace' syscall stats summary. (Milian Woff) Infrastructure changes: - PMU helpers to use in Intel PT. (Adrian Hunter) - Fix perf-with-kcore script not to split args with spaces. (Adrian Hunter) - Add empty Build files for some more architectures. (Ben Hutchings) - Move 'perf stat' config variables to a struct to allow using some of its functions in more places. (Jiri Olsa) - Add DWARF register names for 'xtensa' arch. (Max Filippov) - Implement BPF programs attached to uprobes. (Wang Nan) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-07 09:11:30 +02:00
WingMan Kwok	866b8b18e3	net: netcp: fix unused interface rx buffer size configuration Prior to this patch, rx buffer size for each rx queue of an interface is configurable through dts bindings. But for an interface, the first rx queue's rx buffer size is always the usual MTU size (plus usual overhead) and page size for the remaining rx queues (if they are enabled by specifying a non-zero rx queue depth dts binding of the corresponding interface). This patch removes the rx buffer size configuration capability. Signed-off-by: WingMan Kwok <w-kwok2@ti.com> Acked-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-07 00:02:36 -07:00
Ian Campbell	22f54bf932	net: thunderx: remove effective "default y" from Kconfig if ARCH_THUNDER=y As well as for kernels built only for ThunderX ARCH_THUNDERX is also enabled for kernels which support multiple platforms (such as distro kernels). Thus "default ARCH_THUNDER" is inappropriate. I believe default m is equally frowned upon, so remove the line completely rather than "default m if ARCH_THUNDER". Signed-off-by: Ian Campbell <ijc@hellion.org.uk> Cc: Sunil Goutham <sgoutham@cavium.com> Cc: Robert Richter <rric@kernel.org> Cc: Derek Chickles <derek.chickles@caviumnetworks.com> Cc: Satanand Burla <satananda.burla@caviumnetworks.com> Cc: Felix Manlunas <felix.manlunas@caviumnetworks.com> Cc: Raghu Vatsavayi <raghu.vatsavayi@caviumnetworks.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-arm-kernel@lists.infradead.org Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-07 00:01:37 -07:00
Ivan Vecera	7ebc482202	r8169: enforce RX_MULTI_EN on rtl8168ep/8111ep chips Enforcing this flag in RxConfig for the mentioned chips fixes netdev watchdog issues prepended with AMD IOMMU message(s) like: AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x001d address=0x0000000000003000 flags=0x0050] Note that this flag is also set in Realtek's own driver for these chips. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Tested-by: Alexander Lindqvist <alexander@bitspace.se> Acked-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-07 00:00:10 -07:00
Nikolay Aleksandrov	786c2077ec	bridge: netlink: account for the IFLA_BRPORT_PROXYARP_WIFI attribute size and policy The attribute size wasn't accounted for in the get_slave_size() callback (br_port_get_slave_size) when it was introduced, so fix it now. Also add a policy entry for it in br_port_policy. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Fixes: `842a9ae08a` ("bridge: Extend Proxy ARP design to allow optional rules for Wi-Fi") Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-06 23:54:42 -07:00
Nikolay Aleksandrov	355b9f9df1	bridge: netlink: account for the IFLA_BRPORT_PROXYARP attribute size and policy The attribute size wasn't accounted for in the get_slave_size() callback (br_port_get_slave_size) when it was introduced, so fix it now. Also add a policy entry for it in br_port_policy. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Fixes: `958501163d` ("bridge: Add support for IEEE 802.11 Proxy ARP") Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-06 23:54:42 -07:00
David S. Miller	50e18af16f	iwlwifi: * a fix for the stuck TFD queue mechanism - it was producing noisy false alarms * a fix for the NIC prepare flow that prevented the driver from being able to access the device on certain systems * a fix for the scan prority handling which allows the regular scan to run even if a scheduled scan is already running rsi: * fix firmware load DMA regression b43: * fix extpa_gain check for 2GHz rtlwifi: * fix NULL dereference when PCI driver used as an AP * add missing module parameter declaration for rtl8723be_mod_params.msi_support -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQEcBAABAgAGBQJVwOsmAAoJEG4XJFUm622bWSAIAJDlW0ZKnxI+tmsbU2CYuoQS yfiK+oTuQE5+eB+5bNaJqSI2QSzVo11JBxnf4014wu+/gjKy2OHQ48ufXbkFfHB2 t2o+rIx7WL5zvoy67fIifgIQSdg542e68xPc8Wz6MV58szuYscBT78dy2ZWvthqB S6DK8K+ohHzHYMV5fw65xkcmsXB8BEKEoUckm8ZxjsLF4Pj+ZzAgpqf+i85JYkFM LwktfKRxg2un9u51IhBCPZUUxe7MhLgZbSuHSP/7ltxWIGctseMAFMlCMrmi9+2B zXPR+2EMWLy0rgmQ04/sG6LjNdW6tmF1rCtcuKnS7hiOngubAj3UEj9d5r98/LI= =SqUX -----END PGP SIGNATURE----- Merge tag 'wireless-drivers-for-davem-2015-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== iwlwifi: * a fix for the stuck TFD queue mechanism - it was producing noisy false alarms * a fix for the NIC prepare flow that prevented the driver from being able to access the device on certain systems * a fix for the scan prority handling which allows the regular scan to run even if a scheduled scan is already running rsi: * fix firmware load DMA regression b43: * fix extpa_gain check for 2GHz rtlwifi: * fix NULL dereference when PCI driver used as an AP * add missing module parameter declaration for rtl8723be_mod_params.msi_support ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-06 23:53:34 -07:00
Oleg Nesterov	7ba8bd75dd	net: pktgen: don't abuse current->state in pktgen_thread_worker() Commit `1fbe4b46ca` "net: pktgen: kill the Wait for kthread_stop code in pktgen_thread_worker()" removed (in particular) the final __set_current_state(TASK_RUNNING) and I didn't notice the previous set_current_state(TASK_INTERRUPTIBLE). This triggers the warning in __might_sleep() after return. Afaics, we can simply remove both set_current_state()'s, and we could do this a long ago right after `ef87979c27` "pktgen: better scheduler friendliness" which changed pktgen_thread_worker() to use wait_event_interruptible_timeout(). Reported-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-06 23:52:44 -07:00
Ross Lagerwall	57b229063a	xen/netback: Wake dealloc thread after completing zerocopy work Waking the dealloc thread before decrementing inflight_packets is racy because it means the thread may go to sleep before inflight_packets is decremented. If kthread_stop() has already been called, the dealloc thread may wait forever with nothing to wake it. Instead, wake the thread only after decrementing inflight_packets. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-06 23:42:48 -07:00
Herbert Xu	a0a2a66024	net: Fix skb_set_peeked use-after-free bug The commit `738ac1ebb9` ("net: Clone skb before setting peeked flag") introduced a use-after-free bug in skb_recv_datagram. This is because skb_set_peeked may create a new skb and free the existing one. As it stands the caller will continue to use the old freed skb. This patch fixes it by making skb_set_peeked return the new skb (or the old one if unchanged). Fixes: `738ac1ebb9` ("net: Clone skb before setting peeked flag") Reported-by: Brenden Blanco <bblanco@plumgrid.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Brenden Blanco <bblanco@plumgrid.com> Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-06 21:55:47 -07:00
Linus Torvalds	49d7c6559b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc Pull sparc fix from David Miller: "FPU register corruption bug fix" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc64: Fix userspace FPU register corruptions.	2015-08-07 05:28:24 +03:00
Linus Torvalds	8664b90bae	Merge branch 'akpm' (patches from Andrew) Merge fixes from Andrew Morton: "21 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (21 commits) writeback: fix initial dirty limit mm/memory-failure: set PageHWPoison before migrate_pages() mm: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_* mm/memory-failure: give up error handling for non-tail-refcounted thp mm/memory-failure: fix race in counting num_poisoned_pages mm/memory-failure: unlock_page before put_page ipc: use private shmem or hugetlbfs inodes for shm segments. mm: initialize hotplugged pages as reserved ocfs2: fix shift left overflow kthread: export kthread functions fsnotify: fix oops in fsnotify_clear_marks_by_group_flags() lib/iommu-common.c: do not use 0xffffffffffffffffl for computing align_mask mm/slub: allow merging when SLAB_DEBUG_FREE is set signalfd: fix information leak in signalfd_copyinfo signal: fix information leak in copy_siginfo_to_user signal: fix information leak in copy_siginfo_from_user32 ocfs2: fix BUG in ocfs2_downconvert_thread_do_work() fs, file table: reinit files_stat.max_files after deferred memory initialisation mm, meminit: replace rwsem with completion mm, meminit: allow early_pfn_to_nid to be used during runtime ...	2015-08-07 05:20:40 +03:00
David S. Miller	44922150d8	sparc64: Fix userspace FPU register corruptions. If we have a series of events from userpsace, with %fprs=FPRS_FEF, like follows: ETRAP ETRAP VIS_ENTRY(fprs=0x4) VIS_EXIT RTRAP (kernel FPU restore with fpu_saved=0x4) RTRAP We will not restore the user registers that were clobbered by the FPU using kernel code in the inner-most trap. Traps allocate FPU save slots in the thread struct, and FPU using sequences save the "dirty" FPU registers only. This works at the initial trap level because all of the registers get recorded into the top-level FPU save area, and we'll return to userspace with the FPU disabled so that any FPU use by the user will take an FPU disabled trap wherein we'll load the registers back up properly. But this is not how trap returns from kernel to kernel operate. The simplest fix for this bug is to always save all FPU register state for anything other than the top-most FPU save area. Getting rid of the optimized inner-slot FPU saving code ends up making VISEntryHalf degenerate into plain VISEntry. Longer term we need to do something smarter to reinstate the partial save optimizations. Perhaps the fundament error is having trap entry and exit allocate FPU save slots and restore register state. Instead, the VISEntry et al. calls should be doing that work. This bug is about two decades old. Reported-by: James Y Knight <jyknight@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-06 19:13:25 -07:00
Lucas Stach	14d2b7c1a9	net: fec: fix initial runtime PM refcount The clocks are initially active and thus the device is marked active. This still keeps the PM refcount at 0, the pm_runtime_put_autosuspend() call at the end of probe then leaves us with an invalid refcount of -1, which in turn leads to the device staying in suspended state even though netdev open had been called. Fix this by initializing the refcount to be coherent with the initial device status. Fixes: `8fff755e9f` (net: fec: Ensure clocks are enabled while using mdio bus) Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-06 18:53:25 -07:00
Linus Torvalds	a58997e1a6	Merge branch 'drm-fixes-4.2' of git://people.freedesktop.org/~agd5f/linux Pull amdgpu fixes from Alex Deucher: "Just a few amdgpu fixes to make sure we report the proper firmware information and number of render buffers to userspace and a typo in a debugging function" [ Pulling directly from Alex since Dave Airlie is on vacation - Linus ] * 'drm-fixes-4.2' of git://people.freedesktop.org/~agd5f/linux: drm/amdgpu: set fw_version and feature_version for smu fw loading drm/amdgpu: add feature version for SDMA ucode drm/amdgpu: add feature version for RLC and MEC v2 drm/amdgpu: increment queue when iterating on this variable. drm/amdgpu: fix rb setting for CZ	2015-08-07 04:51:14 +03:00
Linus Torvalds	ebc90be6b9	Merge branch 'drm-tda998x-fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm Pull TDA998x i2c driver fixes from Russell King: "This fixes the double-checksumming of the AVI infoframe which was resulting in the checksum always being zero. It went unnoticed as none of my HDMI devices had a problem with this" [ Pulling directly from rmk since Dave Airlie is on vacation - Linus ] * 'drm-tda998x-fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: drm/i2c: tda998x: fix bad checksum of the HDMI AVI infoframe	2015-08-07 04:48:46 +03:00
Rabin Vincent	a50fcb512d	writeback: fix initial dirty limit The initial value of global_wb_domain.dirty_limit set by writeback_set_ratelimit() is zeroed out by the memset in wb_domain_init(). Signed-off-by: Rabin Vincent <rabin.vincent@axis.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-08-07 04:39:42 +03:00
Naoya Horiguchi	4491f71260	mm/memory-failure: set PageHWPoison before migrate_pages() Now page freeing code doesn't consider PageHWPoison as a bad page, so by setting it before completing the page containment, we can prevent the error page from being reused just after successful page migration. I added TTU_IGNORE_HWPOISON for try_to_unmap() to make sure that the page table entry is transformed into migration entry, not to hwpoison entry. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Dean Nelson <dnelson@redhat.com> Cc: Tony Luck <tony.luck@intel.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Hugh Dickins <hughd@google.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-08-07 04:39:42 +03:00
Naoya Horiguchi	f4c18e6f7b	mm: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_* The race condition addressed in commit `add05cecef` ("mm: soft-offline: don't free target page in successful page migration") was not closed completely, because that can happen not only for soft-offline, but also for hard-offline. Consider that a slab page is about to be freed into buddy pool, and then an uncorrected memory error hits the page just after entering __free_one_page(), then VM_BUG_ON_PAGE(page->flags & PAGE_FLAGS_CHECK_AT_PREP) is triggered, despite the fact that it's not necessary because the data on the affected page is not consumed. To solve it, this patch drops __PG_HWPOISON from page flag checks at allocation/free time. I think it's justified because __PG_HWPOISON flags is defined to prevent the page from being reused, and setting it outside the page's alloc-free cycle is a designed behavior (not a bug.) For recent months, I was annoyed about BUG_ON when soft-offlined page remains on lru cache list for a while, which is avoided by calling put_page() instead of putback_lru_page() in page migration's success path. This means that this patch reverts a major change from commit `add05cecef` about the new refcounting rule of soft-offlined pages, so "reuse window" revives. This will be closed by a subsequent patch. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Dean Nelson <dnelson@redhat.com> Cc: Tony Luck <tony.luck@intel.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Hugh Dickins <hughd@google.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-08-07 04:39:42 +03:00
Naoya Horiguchi	98ed2b0052	mm/memory-failure: give up error handling for non-tail-refcounted thp "non anonymous thp" case is still racy with freeing thp, which causes panic due to put_page() for refcount-0 page. It seems that closing up this race might be hard (and/or not worth doing,) so let's give up the error handling for this case. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Dean Nelson <dnelson@redhat.com> Cc: Tony Luck <tony.luck@intel.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Hugh Dickins <hughd@google.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-08-07 04:39:41 +03:00
Naoya Horiguchi	a209ef09af	mm/memory-failure: fix race in counting num_poisoned_pages When memory_failure() is called on a page which are just freed after page migration from soft offlining, the counter num_poisoned_pages is raised twi= ce. So let's fix it with using TestSetPageHWPoison. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Dean Nelson <dnelson@redhat.com> Cc: Tony Luck <tony.luck@intel.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Hugh Dickins <hughd@google.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-08-07 04:39:41 +03:00
Naoya Horiguchi	a09233f3e1	mm/memory-failure: unlock_page before put_page Recently I addressed a few of hwpoison race problems and the patches are merged on v4.2-rc1. It made progress, but unfortunately some problems still remain due to less coverage of my testing. So I'm trying to fix or avoid them in this series. One point I'm expecting to discuss is that patch 4/5 changes the page flag set to be checked on free time. In current behavior, __PG_HWPOISON is not supposed to be set when the page is freed. I think that there is no strong reason for this behavior, and it causes a problem hard to fix only in error handler side (because __PG_HWPOISON could be set at arbitrary timing.) So I suggest to change it. With this patchset, hwpoison stress testing in official mce-test testsuite (which previously failed) passes. This patch (of 5): In "just unpoisoned" path, we do put_page and then unlock_page, which is a wrong order and causes "freeing locked page" bug. So let's fix it. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Dean Nelson <dnelson@redhat.com> Cc: Tony Luck <tony.luck@intel.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Hugh Dickins <hughd@google.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-08-07 04:39:41 +03:00
Stephen Smalley	e1832f2923	ipc: use private shmem or hugetlbfs inodes for shm segments. The shm implementation internally uses shmem or hugetlbfs inodes for shm segments. As these inodes are never directly exposed to userspace and only accessed through the shm operations which are already hooked by security modules, mark the inodes with the S_PRIVATE flag so that inode security initialization and permission checking is skipped. This was motivated by the following lockdep warning: ====================================================== [ INFO: possible circular locking dependency detected ] 4.2.0-0.rc3.git0.1.fc24.x86_64+debug #1 Tainted: G W ------------------------------------------------------- httpd/1597 is trying to acquire lock: (&ids->rwsem){+++++.}, at: shm_close+0x34/0x130 but task is already holding lock: (&mm->mmap_sem){++++++}, at: SyS_shmdt+0x4b/0x180 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (&mm->mmap_sem){++++++}: lock_acquire+0xc7/0x270 __might_fault+0x7a/0xa0 filldir+0x9e/0x130 xfs_dir2_block_getdents.isra.12+0x198/0x1c0 [xfs] xfs_readdir+0x1b4/0x330 [xfs] xfs_file_readdir+0x2b/0x30 [xfs] iterate_dir+0x97/0x130 SyS_getdents+0x91/0x120 entry_SYSCALL_64_fastpath+0x12/0x76 -> #2 (&xfs_dir_ilock_class){++++.+}: lock_acquire+0xc7/0x270 down_read_nested+0x57/0xa0 xfs_ilock+0x167/0x350 [xfs] xfs_ilock_attr_map_shared+0x38/0x50 [xfs] xfs_attr_get+0xbd/0x190 [xfs] xfs_xattr_get+0x3d/0x70 [xfs] generic_getxattr+0x4f/0x70 inode_doinit_with_dentry+0x162/0x670 sb_finish_set_opts+0xd9/0x230 selinux_set_mnt_opts+0x35c/0x660 superblock_doinit+0x77/0xf0 delayed_superblock_init+0x10/0x20 iterate_supers+0xb3/0x110 selinux_complete_init+0x2f/0x40 security_load_policy+0x103/0x600 sel_write_load+0xc1/0x750 __vfs_write+0x37/0x100 vfs_write+0xa9/0x1a0 SyS_write+0x58/0xd0 entry_SYSCALL_64_fastpath+0x12/0x76 ... Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Reported-by: Morten Stevens <mstevens@fedoraproject.org> Acked-by: Hugh Dickins <hughd@google.com> Acked-by: Paul Moore <paul@paul-moore.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-08-07 04:39:41 +03:00
Mel Gorman	e298ff75f1	mm: initialize hotplugged pages as reserved Commit `92923ca3aa` ("mm: meminit: only set page reserved in the memblock region") broke memory hotplug which expects the memmap for newly added sections to be reserved until onlined by online_pages_range(). This patch marks hotplugged pages as reserved when adding new zones. Signed-off-by: Mel Gorman <mgorman@suse.de> Reported-by: David Vrabel <david.vrabel@citrix.com> Tested-by: David Vrabel <david.vrabel@citrix.com> Cc: Nathan Zimmer <nzimmer@sgi.com> Cc: Robin Holt <holt@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-08-07 04:39:41 +03:00
Joseph Qi	32e5a2a2be	ocfs2: fix shift left overflow When using a large volume, for example 9T volume with 2T already used, frequent creation of small files with O_DIRECT when the IO is not cluster aligned may clear sectors in the wrong place. This will cause filesystem corruption. This is because p_cpos is a u32. When calculating the corresponding sector it should be converted to u64 first, otherwise it may overflow. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: <stable@vger.kernel.org> [4.0+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-08-07 04:39:41 +03:00
David Kershner	18896451ea	kthread: export kthread functions The s-Par visornic driver, currently in staging, processes a queue being serviced by the an s-Par service partition. We can get a message that something has happened with the Service Partition, when that happens, we must not access the channel until we get a message that the service partition is back again. The visornic driver has a thread for processing the channel, when we get the message, we need to be able to park the thread and then resume it when the problem clears. We can do this with kthread_park and unpark but they are not exported from the kernel, this patch exports the needed functions. Signed-off-by: David Kershner <david.kershner@unisys.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Richard Weinberger <richard.weinberger@gmail.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-08-07 04:39:41 +03:00
Jan Kara	8f2f3eb59d	fsnotify: fix oops in fsnotify_clear_marks_by_group_flags() fsnotify_clear_marks_by_group_flags() can race with fsnotify_destroy_marks() so that when fsnotify_destroy_mark_locked() drops mark_mutex, a mark from the list iterated by fsnotify_clear_marks_by_group_flags() can be freed and thus the next entry pointer we have cached may become stale and we dereference free memory. Fix the problem by first moving marks to free to a special private list and then always free the first entry in the special list. This method is safe even when entries from the list can disappear once we drop the lock. Signed-off-by: Jan Kara <jack@suse.com> Reported-by: Ashish Sangwan <a.sangwan@samsung.com> Reviewed-by: Ashish Sangwan <a.sangwan@samsung.com> Cc: Lino Sanfilippo <LinoSanfilippo@gmx.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-08-07 04:39:41 +03:00
Sowmini Varadhan	447f6a95a9	lib/iommu-common.c: do not use 0xffffffffffffffffl for computing align_mask Using a 64 bit constant generates "warning: integer constant is too large for 'long' type" on 32 bit platforms. Instead use ~0ul and BITS_PER_LONG. Detected by Andrew Morton on ARMD. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David S. Miller <davem@davemloft.net> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-08-07 04:39:41 +03:00
Konstantin Khlebnikov	3e810ae2db	mm/slub: allow merging when SLAB_DEBUG_FREE is set This patch fixes creation of new kmem-caches after enabling sanity_checks for existing mergeable kmem-caches in runtime: before that patch creation fails because unique name in sysfs already taken by existing kmem-cache. Unlike other debug options this doesn't change object layout and could be enabled and disabled at any time. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Acked-by: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Acked-by: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-08-07 04:39:40 +03:00

... 4 5 6 7 8 ...

535088 commits