This repository was archived by the owner on Jan 10, 2023. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 25
Add Support to enable xHCI DBC #1
Open
gururajk53
wants to merge
5,872
commits into
projectceladon:master
Choose a base branch
from
gururajk53:master
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
The code scans all components looking for the default name i2c-10EC5670:00, which of course doesn't work in platforms where the BIOS uses a different HID such as Dell 5585 Since we already have the correct information available, just use the actual codec name and length. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-By: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit e45e391)
For now reason the current card name is a ridiculous 'cherrytrailcraudio'. This isn't very useful or self-explanatory, change to driver name cht-bsw-rt5672. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Takashi Iwai <tiwai@suse.de> Acked-By: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit 57e9e87)
Signed-off-by: Ryan Lee <ryans.lee@maximintegrated.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit 4eee202)
Signed-off-by: Ryan Lee <ryans.lee@maximintegrated.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit d4a8bce)
…OS driver Remove dead code which was missed in previous changes. This is not visible with git diff but there is a test+return on the same condition just above. Also reuse local variable instead of fetching the jack information twice. Tested on Acer R11 (cyan) Fixes: 3bbda5a ('ASoC: Intel: cht_bsw_max98090_ti: Fix jack initialization') Cc: Thierry Escande <thierry.escande@collabora.com> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-By: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit 0087f20)
Return proper error instead of 0 if the revision does not match. Signed-off-by: Axel Lin <axel.lin@ingics.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit d6e18b8)
Move set_sysclk to codec level and people can use it at both codec and dai level. Signed-off-by: Bard Liao <bardliao@realtek.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit fe01e5e)
Move set_pll function to codec level and people can use it at both codec and dai level. Signed-off-by: Bard Liao <bardliao@realtek.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit c8a04b5)
Create a helper function to remove duplicate code used in machine drivers Suggested-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-by: Liam Girdwood <liam.r.girdwood@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit 7c761b5)
Same as for other codecs, enable MCLK by default. When it is not present, e.g. on MinnowBoard B3 since it's not routed on the LSE connector, we fall back to blck-based clocking. The DMIC quirks are also fixed, there is a single DMIC input of the codec. reorder variables in reverse x-mas tree as suggested by Andy Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-by: Liam Girdwood <liam.r.girdwood@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit 02c0a3b)
Reorder variable names, change MCLK test, change quirks No functional change Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-by: Liam Girdwood <liam.r.girdwood@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit 17b5273)
Reorder variable names, change MCLK test, change quirks No functional change Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-by: Liam Girdwood <liam.r.girdwood@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit 6cdf01a)
Reorder variable names, change tests No functional change Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-by: Liam Girdwood <liam.r.girdwood@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit 22a317a)
get_codec_dai() is not used, remove it Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-by: Liam Girdwood <liam.r.girdwood@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit 8a2eca5)
Remove duplicate code with a common helper in all Intel machine drivers. Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Suggested-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-by: Liam Girdwood <liam.r.girdwood@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit dfb6ec7)
For some reason the Atom/HiFi2 machine drivers use an id=1 instead of zero as done on all other platforms. This gets in the way of topology-based matching, realign for consistency. This should not have any functional impact on existing solutions with don't rely on topology. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-by: Liam Girdwood <liam.r.girdwood@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit 149f775)
The hard-coded compressed dailinks are not supported using publicly-available firmwares, which creates unnecessary user confusion [1]. Even if the firmware was available, the mainline code does not have the required .dynamic=1 and .dpcm_playback=1 fields so probably never worked as is, and last and they conflict with topology-defined streams. Remove them and move on. This can be re-enabled with SOF later in a more flexible manner. [1] http://mailman.alsa-project.org/pipermail/alsa-devel/2017-August/124868.html Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-by: Liam Girdwood <liam.r.girdwood@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit 2193eb9)
The pointer dma_dev_name is assigned but never read, it is redundant and can therefore be removed. Cleans up clang warning: sound/soc/intel/common/sst-firmware.c:288:3: warning: Value stored to 'dma_dev_name' is never read Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit 271ef65)
Introduce a new custom dapm routes map to quirk platforms with the internal mic connected to IN2P. Signed-off-by: Carlo Caione <carlo@endlessm.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit 39712db)
ACPI support is not specific to the Intel/SST driver. Move the enumeration and matching code which is not hardware-dependent to sound/soc and rename relevant sst_acpi_ structures and functions with snd_soc_acpi_ prefix soc-acpi.h is protected by a #ifndef __LINUX_SND_SOC_ACPI_H for consistency with all other SoC .h files: grep -L __LINUX include/sound/soc* | wc -l 0 grep __LINUX include/sound/soc* | wc -l 14 Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-by: Liam Girdwood <liam.r.girdwood@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit 7feb2f7)
Before we add new fields for SOF support, move to C99 syntax as done for atom/sst and legacy hsw/bdw code No functional change Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-by: Liam Girdwood <liam.r.girdwood@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit 928c8a8)
To prepare for SOF integration, we need new fields in the machine table. It is intended that the same table is used for both closed-source and open-source firmware to avoid repeating ACPI-related information multiple times No functional change Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-by: Liam Girdwood <liam.r.girdwood@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit 976b5a0)
First step of cleaning, move all tables to soc-acpi-intel-match module. The tables remain in separate files per platform to keep them manageable. Skylake+ platforms are still handled elsewhere since there is no conflict with SOF for now, but this will have to be handled at a later point. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-by: Liam Girdwood <liam.r.girdwood@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit 6004640)
Prepare for SOF integration, no functional change Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-by: Liam Girdwood <liam.r.girdwood@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit c1a8ed6)
split Kconfig to prepare for reuse of machine drivers for SOF support no functional change or edits Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-by: Liam Girdwood <liam.r.girdwood@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit ea2851b)
This file is a mess, order by generation with more recent last Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-by: Liam Girdwood <liam.r.girdwood@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit d4d86b8)
Introduce more logical dependencies, with the SOC selected first and the relevant machine drivers are exposed. The same mechanism will be used for SOF support. Also select SND_SOC_ACPI_INTEL_MATCH for all machine drivers Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-by: Liam Girdwood <liam.r.girdwood@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit f6a118a)
Make all Intel audio drivers dependent on X86 to avoid compilation errors for s390 and xtensa architectures. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit 468663a)
MICBIAS widget type has been deprecated. Convert it to a SUPPLY widget. Signed-off-by: Carlo Caione <carlo@endlessm.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit be96fc5)
Enable jack detection for the RT5651 codec on the JD* pins. Signed-off-by: Carlo Caione <carlo@endlessm.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit 80bbe4a)
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
… module commit ec7470b834fe7b5d7eff11b6677f5d7fdf5e9a91 upstream. This patch fixes a WARN_ON in nft_set_destroy() due to missing set reference count drop from the preparation phase. This is triggered by the module autoload path. Do not exercise the abort path from nft_request_module() while preparation phase cleaning up is still pending. WARNING: CPU: 3 PID: 3456 at net/netfilter/nf_tables_api.c:3740 nft_set_destroy+0x45/0x50 [nf_tables] [...] CPU: 3 PID: 3456 Comm: nft Not tainted 5.4.6-arch3-1 #1 RIP: 0010:nft_set_destroy+0x45/0x50 [nf_tables] Code: e8 30 eb 83 c6 48 8b 85 80 00 00 00 48 8b b8 90 00 00 00 e8 dd 6b d7 c5 48 8b 7d 30 e8 24 dd eb c5 48 89 ef 5d e9 6b c6 e5 c5 <0f> 0b c3 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 7f 10 e9 52 RSP: 0018:ffffac4f43e53700 EFLAGS: 00010202 RAX: 0000000000000001 RBX: ffff99d63a154d80 RCX: 0000000001f88e03 RDX: 0000000001f88c03 RSI: ffff99d6560ef0c0 RDI: ffff99d63a101200 RBP: ffff99d617721de0 R08: 0000000000000000 R09: 0000000000000318 R10: 00000000f0000000 R11: 0000000000000001 R12: ffffffff880fabf0 R13: dead000000000122 R14: dead000000000100 R15: ffff99d63a154d80 FS: 00007ff3dbd5b740(0000) GS:ffff99d6560c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00001cb5de6a9000 CR3: 000000016eb6a004 CR4: 00000000001606e0 Call Trace: __nf_tables_abort+0x3e3/0x6d0 [nf_tables] nft_request_module+0x6f/0x110 [nf_tables] nft_expr_type_request_module+0x28/0x50 [nf_tables] nf_tables_expr_parse+0x198/0x1f0 [nf_tables] nft_expr_init+0x3b/0xf0 [nf_tables] nft_dynset_init+0x1e2/0x410 [nf_tables] nf_tables_newrule+0x30a/0x930 [nf_tables] nfnetlink_rcv_batch+0x2a0/0x640 [nfnetlink] nfnetlink_rcv+0x125/0x171 [nfnetlink] netlink_unicast+0x179/0x210 netlink_sendmsg+0x208/0x3d0 sock_sendmsg+0x5e/0x60 ____sys_sendmsg+0x21b/0x290 Update comment on the code to describe the new behaviour. Reported-by: Marco Oliverio <marco.oliverio@tanaza.com> Fixes: 452238e ("netfilter: nf_tables: add and use helper for module autoload") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
[ Upstream commit 682fee802843b332f9c51ffc8e062de5ff773f2e ] Trying to register the DSP platform device results in a null pointer access: [ 0.124184] CPU 0 Unable to handle kernel paging request at virtual address 00000000, epc == 804e305c, ra == 804e6f20 [ 0.135208] Oops[#1]: [ 0.137514] CPU: 0 PID: 1 Comm: swapper Not tainted 4.14.87 ... [ 0.197117] epc : 804e305c bcm63xx_dsp_register+0x80/0xa4 [ 0.202838] ra : 804e6f20 board_register_devices+0x258/0x390 ... This happens because it tries to copy the passed platform data over the platform_device's unpopulated platform_data. Since this code has been broken since its submission, no driver was ever submitted for it, and apparently nobody was using it, just remove it instead of trying to fix it. Fixes: e7300d0 ("MIPS: BCM63xx: Add support for the Broadcom BCM63xx family of SOCs.") Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Signed-off-by: Paul Burton <paul.burton@mips.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Cc: linux-mips@linux-mips.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
[ Upstream commit 638a3a1e349ddf5b82f222ff5cb3b4f266e7c278 ] BUG: unable to handle kernel NULL pointer dereference at 0000000000000128 PGD 0 P4D 0 Oops: 0000 [#1 CPU: 0 PID: 5697 Comm: modprobe Tainted: G W 5.1.0-rc7+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:__lock_acquire+0x53/0x10b0 Code: 8b 1c 25 40 5e 01 00 4c 8b 6d 10 45 85 e4 0f 84 bd 06 00 00 44 8b 1d 7c d2 09 02 49 89 fe 41 89 d2 45 85 db 0f 84 47 02 00 00 <48> 81 3f a0 05 70 83 b8 00 00 00 00 44 0f 44 c0 83 fe 01 0f 86 3a RSP: 0018:ffffc90001c07a28 EFLAGS: 00010002 RAX: 0000000000000000 RBX: ffff88822f038440 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000128 RBP: ffffc90001c07a88 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001 R13: 0000000000000000 R14: 0000000000000128 R15: 0000000000000000 FS: 00007fead0811540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000128 CR3: 00000002310da000 CR4: 00000000000006f0 Call Trace: ? __lock_acquire+0x24e/0x10b0 lock_acquire+0xdf/0x230 ? flush_workqueue+0x71/0x530 flush_workqueue+0x97/0x530 ? flush_workqueue+0x71/0x530 l2tp_exit_net+0x170/0x2b0 [l2tp_core ? l2tp_exit_net+0x93/0x2b0 [l2tp_core ops_exit_list.isra.6+0x36/0x60 unregister_pernet_operations+0xb8/0x110 unregister_pernet_device+0x25/0x40 l2tp_init+0x55/0x1000 [l2tp_core ? 0xffffffffa018d000 do_one_initcall+0x6c/0x3cc ? do_init_module+0x22/0x1f1 ? rcu_read_lock_sched_held+0x97/0xb0 ? kmem_cache_alloc_trace+0x325/0x3b0 do_init_module+0x5b/0x1f1 load_module+0x1db1/0x2690 ? m_show+0x1d0/0x1d0 __do_sys_finit_module+0xc5/0xd0 __x64_sys_finit_module+0x15/0x20 do_syscall_64+0x6b/0x1d0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7fead031a839 Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffe8d9acca8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 RAX: ffffffffffffffda RBX: 0000560078398b80 RCX: 00007fead031a839 RDX: 0000000000000000 RSI: 000056007659dc2e RDI: 0000000000000003 RBP: 000056007659dc2e R08: 0000000000000000 R09: 0000560078398b80 R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000000 R13: 00005600783a04a0 R14: 0000000000040000 R15: 0000560078398b80 Modules linked in: l2tp_core(+) e1000 ip_tables ipv6 [last unloaded: l2tp_core CR2: 0000000000000128 ---[ end trace 8322b2b8bf83f8e1 If alloc_workqueue fails in l2tp_init, l2tp_net_ops is unregistered on failure path. Then l2tp_exit_net is called which will flush NULL workqueue, this patch add a NULL check to fix it. Fixes: 67e04c2 ("l2tp: unregister l2tp_net_ops on failure path") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
[ Upstream commit 2b81f8161dfeda4017cef4f2498ccb64b13f0d61 ] af_inet sets sock->sk to NULL which trips strparser over: BUG: kernel NULL pointer dereference, address: 0000000000000012 PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 7 PID: 0 Comm: swapper/7 Not tainted 5.2.0-rc1-00139-g14629453a6d3 #21 RIP: 0010:tcp_peek_len+0x10/0x60 RSP: 0018:ffffc02e41c54b98 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff9cf924c4e030 RCX: 0000000000000051 RDX: 0000000000000000 RSI: 000000000000000c RDI: ffff9cf97128f480 RBP: ffff9cf9365e0300 R08: ffff9cf94fe7d2c0 R09: 0000000000000000 R10: 000000000000036b R11: ffff9cf939735e00 R12: ffff9cf91ad9ae40 R13: ffff9cf924c4e000 R14: ffff9cf9a8fcbaae R15: 0000000000000020 FS: 0000000000000000(0000) GS:ffff9cf9af7c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000012 CR3: 000000013920a003 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> strp_data_ready+0x48/0x90 tls_data_ready+0x22/0xd0 [tls] tcp_rcv_established+0x569/0x620 tcp_v4_do_rcv+0x127/0x1e0 tcp_v4_rcv+0xad7/0xbf0 ip_protocol_deliver_rcu+0x2c/0x1c0 ip_local_deliver_finish+0x41/0x50 ip_local_deliver+0x6b/0xe0 ? ip_protocol_deliver_rcu+0x1c0/0x1c0 ip_rcv+0x52/0xd0 ? ip_rcv_finish_core.isra.20+0x380/0x380 __netif_receive_skb_one_core+0x7e/0x90 netif_receive_skb_internal+0x42/0xf0 napi_gro_receive+0xed/0x150 nfp_net_poll+0x7a2/0xd30 [nfp] ? kmem_cache_free_bulk+0x286/0x310 net_rx_action+0x149/0x3b0 __do_softirq+0xe3/0x30a ? handle_irq_event_percpu+0x6a/0x80 irq_exit+0xe8/0xf0 do_IRQ+0x85/0xd0 common_interrupt+0xf/0xf </IRQ> RIP: 0010:cpuidle_enter_state+0xbc/0x450 To avoid this issue set sock->sk after sk_prot->close. My grepping and testing did not discover any code which would depend on the current behaviour. Fixes: c46234e ("tls: RX path for ktls") Reported-by: David Beckett <david.beckett@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
[ Upstream commit bf8c02f961c89e5ccae5987b7ab28f5592a35101 ] kasan will report a BUG when run command 'insmod hns_roce_hw_v2.ko', the calltrace is as follows: ================================================================== BUG: KASAN: slab-out-of-bounds in hns_roce_v2_init_eq_table+0x1324/0x1948 [hns_roce_hw_v2] Read of size 8 at addr ffff8020e7a10608 by task insmod/256 CPU: 0 PID: 256 Comm: insmod Tainted: G O 5.2.0-rc4 #1 Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 Call trace: dump_backtrace+0x0/0x1e8 show_stack+0x14/0x20 dump_stack+0xc4/0xfc print_address_description+0x60/0x270 __kasan_report+0x164/0x1b8 kasan_report+0xc/0x18 __asan_load8+0x84/0xa8 hns_roce_v2_init_eq_table+0x1324/0x1948 [hns_roce_hw_v2] hns_roce_init+0xf8/0xfe0 [hns_roce] __hns_roce_hw_v2_init_instance+0x284/0x330 [hns_roce_hw_v2] hns_roce_hw_v2_init_instance+0xd0/0x1b8 [hns_roce_hw_v2] hclge_init_roce_client_instance+0x180/0x310 [hclge] hclge_init_client_instance+0xcc/0x508 [hclge] hnae3_init_client_instance.part.3+0x3c/0x80 [hnae3] hnae3_register_client+0x134/0x1a8 [hnae3] hns_roce_hw_v2_init+0x14/0x10000 [hns_roce_hw_v2] do_one_initcall+0x9c/0x3e0 do_init_module+0xd4/0x2d8 load_module+0x3284/0x3690 __se_sys_init_module+0x274/0x308 __arm64_sys_init_module+0x40/0x50 el0_svc_handler+0xbc/0x210 el0_svc+0x8/0xc Allocated by task 256: __kasan_kmalloc.isra.0+0xd0/0x180 kasan_kmalloc+0xc/0x18 __kmalloc+0x16c/0x328 hns_roce_v2_init_eq_table+0x764/0x1948 [hns_roce_hw_v2] hns_roce_init+0xf8/0xfe0 [hns_roce] __hns_roce_hw_v2_init_instance+0x284/0x330 [hns_roce_hw_v2] hns_roce_hw_v2_init_instance+0xd0/0x1b8 [hns_roce_hw_v2] hclge_init_roce_client_instance+0x180/0x310 [hclge] hclge_init_client_instance+0xcc/0x508 [hclge] hnae3_init_client_instance.part.3+0x3c/0x80 [hnae3] hnae3_register_client+0x134/0x1a8 [hnae3] hns_roce_hw_v2_init+0x14/0x10000 [hns_roce_hw_v2] do_one_initcall+0x9c/0x3e0 do_init_module+0xd4/0x2d8 load_module+0x3284/0x3690 __se_sys_init_module+0x274/0x308 __arm64_sys_init_module+0x40/0x50 el0_svc_handler+0xbc/0x210 el0_svc+0x8/0xc Freed by task 0: (stack is not available) The buggy address belongs to the object at ffff8020e7a10600 which belongs to the cache kmalloc-128 of size 128 The buggy address is located 8 bytes inside of 128-byte region [ffff8020e7a10600, ffff8020e7a10680) The buggy address belongs to the page: page:ffff7fe00839e840 refcount:1 mapcount:0 mapping:ffff802340020200 index:0x0 flags: 0x5fffe00000000200(slab) raw: 5fffe00000000200 dead000000000100 dead000000000200 ffff802340020200 raw: 0000000000000000 0000000081000100 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8020e7a10500: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc ffff8020e7a10580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff8020e7a10600: 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff8020e7a10680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff8020e7a10700: 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ================================================================== Disabling lock debugging due to kernel taint Fixes: a5073d6 ("RDMA/hns: Add eq support of hip08") Signed-off-by: Xi Wang <wangxi11@huawei.com> Link: https://lore.kernel.org/r/1565343666-73193-7-git-send-email-oulijun@huawei.com Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
[ Upstream commit 7764d56baa844d7f6206394f21a0e8c1f303c476 ] If we are able to load an existing inode cache off disk, we set the state of the cache to BTRFS_CACHE_FINISHED, but we don't wake up any one waiting for the cache to be available. This means that anyone waiting for the cache to be available, waiting on the condition that either its state is BTRFS_CACHE_FINISHED or its available free space is greather than zero, can hang forever. This could be observed running fstests with MOUNT_OPTIONS="-o inode_cache", in particular test case generic/161 triggered it very frequently for me, producing a trace like the following: [63795.739712] BTRFS info (device sdc): enabling inode map caching [63795.739714] BTRFS info (device sdc): disk space caching is enabled [63795.739716] BTRFS info (device sdc): has skinny extents [64036.653886] INFO: task btrfs-transacti:3917 blocked for more than 120 seconds. [64036.654079] Not tainted 5.2.0-rc4-btrfs-next-50 #1 [64036.654143] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [64036.654232] btrfs-transacti D 0 3917 2 0x80004000 [64036.654239] Call Trace: [64036.654258] ? __schedule+0x3ae/0x7b0 [64036.654271] schedule+0x3a/0xb0 [64036.654325] btrfs_commit_transaction+0x978/0xae0 [btrfs] [64036.654339] ? remove_wait_queue+0x60/0x60 [64036.654395] transaction_kthread+0x146/0x180 [btrfs] [64036.654450] ? btrfs_cleanup_transaction+0x620/0x620 [btrfs] [64036.654456] kthread+0x103/0x140 [64036.654464] ? kthread_create_worker_on_cpu+0x70/0x70 [64036.654476] ret_from_fork+0x3a/0x50 [64036.654504] INFO: task xfs_io:3919 blocked for more than 120 seconds. [64036.654568] Not tainted 5.2.0-rc4-btrfs-next-50 #1 [64036.654617] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [64036.654685] xfs_io D 0 3919 3633 0x00000000 [64036.654691] Call Trace: [64036.654703] ? __schedule+0x3ae/0x7b0 [64036.654716] schedule+0x3a/0xb0 [64036.654756] btrfs_find_free_ino+0xa9/0x120 [btrfs] [64036.654764] ? remove_wait_queue+0x60/0x60 [64036.654809] btrfs_create+0x72/0x1f0 [btrfs] [64036.654822] lookup_open+0x6bc/0x790 [64036.654849] path_openat+0x3bc/0xc00 [64036.654854] ? __lock_acquire+0x331/0x1cb0 [64036.654869] do_filp_open+0x99/0x110 [64036.654884] ? __alloc_fd+0xee/0x200 [64036.654895] ? do_raw_spin_unlock+0x49/0xc0 [64036.654909] ? do_sys_open+0x132/0x220 [64036.654913] do_sys_open+0x132/0x220 [64036.654926] do_syscall_64+0x60/0x1d0 [64036.654933] entry_SYSCALL_64_after_hwframe+0x49/0xbe Fix this by adding a wake_up() call right after setting the cache state to BTRFS_CACHE_FINISHED, at start_caching(), when we are able to load the cache from disk. Fixes: 82d5902 ("Btrfs: Support reading/writing on disk free ino cache") Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
commit 927d780ee371d7e121cea4fc7812f6ef2cea461c upstream.
Scenario 1, ARMv7
=================
If code in arch/arm/kernel/ftrace.c would operate on mcount() pointer
the following may be generated:
00000230 <prealloc_fixed_plts>:
230: b5f8 push {r3, r4, r5, r6, r7, lr}
232: b500 push {lr}
234: f7ff fffe bl 0 <__gnu_mcount_nc>
234: R_ARM_THM_CALL __gnu_mcount_nc
238: f240 0600 movw r6, #0
238: R_ARM_THM_MOVW_ABS_NC __gnu_mcount_nc
23c: f8d0 1180 ldr.w r1, [r0, #384] ; 0x180
FTRACE currently is not able to deal with it:
WARNING: CPU: 0 PID: 0 at .../kernel/trace/ftrace.c:1979 ftrace_bug+0x1ad/0x230()
...
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.4.116-... #1
...
[<c0314e3d>] (unwind_backtrace) from [<c03115e9>] (show_stack+0x11/0x14)
[<c03115e9>] (show_stack) from [<c051a7f1>] (dump_stack+0x81/0xa8)
[<c051a7f1>] (dump_stack) from [<c0321c5d>] (warn_slowpath_common+0x69/0x90)
[<c0321c5d>] (warn_slowpath_common) from [<c0321cf3>] (warn_slowpath_null+0x17/0x1c)
[<c0321cf3>] (warn_slowpath_null) from [<c038ee9d>] (ftrace_bug+0x1ad/0x230)
[<c038ee9d>] (ftrace_bug) from [<c038f1f9>] (ftrace_process_locs+0x27d/0x444)
[<c038f1f9>] (ftrace_process_locs) from [<c08915bd>] (ftrace_init+0x91/0xe8)
[<c08915bd>] (ftrace_init) from [<c0885a67>] (start_kernel+0x34b/0x358)
[<c0885a67>] (start_kernel) from [<00308095>] (0x308095)
---[ end trace cb88537fdc8fa200 ]---
ftrace failed to modify [<c031266c>] prealloc_fixed_plts+0x8/0x60
actual: 44:f2:e1:36
ftrace record flags: 0
(0) expected tramp: c03143e9
Scenario 2, ARMv4T
==================
ftrace: allocating 14435 entries in 43 pages
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at kernel/trace/ftrace.c:2029 ftrace_bug+0x204/0x310
CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.5 #1
Hardware name: Cirrus Logic EDB9302 Evaluation Board
[<c0010a24>] (unwind_backtrace) from [<c000ecb0>] (show_stack+0x20/0x2c)
[<c000ecb0>] (show_stack) from [<c03c72e8>] (dump_stack+0x20/0x30)
[<c03c72e8>] (dump_stack) from [<c0021c18>] (__warn+0xdc/0x104)
[<c0021c18>] (__warn) from [<c0021d7c>] (warn_slowpath_null+0x4c/0x5c)
[<c0021d7c>] (warn_slowpath_null) from [<c0095360>] (ftrace_bug+0x204/0x310)
[<c0095360>] (ftrace_bug) from [<c04dabac>] (ftrace_init+0x3b4/0x4d4)
[<c04dabac>] (ftrace_init) from [<c04cef4c>] (start_kernel+0x20c/0x410)
[<c04cef4c>] (start_kernel) from [<00000000>] ( (null))
---[ end trace 0506a2f5dae6b341 ]---
ftrace failed to modify
[<c000c350>] perf_trace_sys_exit+0x5c/0xe8
actual: 1e:ff:2f:e1
Initializing ftrace call sites
ftrace record flags: 0
(0)
expected tramp: c000fb24
The analysis for this problem has been already performed previously,
refer to the link below.
Fix the above problems by allowing only selected reloc types in
__mcount_loc. The list itself comes from the legacy recordmcount.pl
script.
Link: https://lore.kernel.org/lkml/56961010.6000806@pengutronix.de/
Cc: stable@vger.kernel.org
Fixes: ed60453 ("ARM: 6511/1: ftrace: add ARM support for C version of recordmcount")
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
commit feee6b2989165631b17ac6d4ccdbf6759254e85a upstream.
-- snip --
- Missing arm64 hot(un)plug support
- Missing some vmem_altmap_offset() cleanups
- Missing sub-section hotadd support
- Missing unification of mm/hmm.c and kernel/memremap.c
-- snip --
We currently try to shrink a single zone when removing memory. We use
the zone of the first page of the memory we are removing. If that
memmap was never initialized (e.g., memory was never onlined), we will
read garbage and can trigger kernel BUGs (due to a stale pointer):
BUG: unable to handle page fault for address: 000000000000353d
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: 0002 [#1] SMP PTI
CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
Workqueue: kacpi_hotplug acpi_hotplug_work_fn
RIP: 0010:clear_zone_contiguous+0x5/0x10
Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840
RSP: 0018:ffffad2400043c98 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000
RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40
RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680
FS: 0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
__remove_pages+0x4b/0x640
arch_remove_memory+0x63/0x8d
try_remove_memory+0xdb/0x130
__remove_memory+0xa/0x11
acpi_memory_device_remove+0x70/0x100
acpi_bus_trim+0x55/0x90
acpi_device_hotplug+0x227/0x3a0
acpi_hotplug_work_fn+0x1a/0x30
process_one_work+0x221/0x550
worker_thread+0x50/0x3b0
kthread+0x105/0x140
ret_from_fork+0x3a/0x50
Modules linked in:
CR2: 000000000000353d
Instead, shrink the zones when offlining memory or when onlining failed.
Introduce and use remove_pfn_range_from_zone(() for that. We now
properly shrink the zones, even if we have DIMMs whereby
- Some memory blocks fall into no zone (never onlined)
- Some memory blocks fall into multiple zones (offlined+re-onlined)
- Multiple memory blocks that fall into different zones
Drop the zone parameter (with a potential dubious value) from
__remove_pages() and __remove_section().
Link: http://lkml.kernel.org/r/20191006085646.5768-6-david@redhat.com
Fixes: f1dd2cd ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [visible after d0dc12e]
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: <stable@vger.kernel.org> [5.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
commit 334b0f4e9b1b4a1d475f803419d202f6c5e4d18e upstream.
There is a race condition which results in a deadlock when rmdir and
mkdir execute concurrently:
$ ls /sys/fs/resctrl/c1/mon_groups/m1/
cpus cpus_list mon_data tasks
Thread 1: rmdir /sys/fs/resctrl/c1
Thread 2: mkdir /sys/fs/resctrl/c1/mon_groups/m1
3 locks held by mkdir/48649:
#0: (sb_writers#17){.+.+}, at: [<ffffffffb4ca2aa0>] mnt_want_write+0x20/0x50
#1: (&type->i_mutex_dir_key#8/1){+.+.}, at: [<ffffffffb4c8c13b>] filename_create+0x7b/0x170
#2: (rdtgroup_mutex){+.+.}, at: [<ffffffffb4a4389d>] rdtgroup_kn_lock_live+0x3d/0x70
4 locks held by rmdir/48652:
#0: (sb_writers#17){.+.+}, at: [<ffffffffb4ca2aa0>] mnt_want_write+0x20/0x50
#1: (&type->i_mutex_dir_key#8/1){+.+.}, at: [<ffffffffb4c8c3cf>] do_rmdir+0x13f/0x1e0
#2: (&type->i_mutex_dir_key#8){++++}, at: [<ffffffffb4c86d5d>] vfs_rmdir+0x4d/0x120
#3: (rdtgroup_mutex){+.+.}, at: [<ffffffffb4a4389d>] rdtgroup_kn_lock_live+0x3d/0x70
Thread 1 is deleting control group "c1". Holding rdtgroup_mutex,
kernfs_remove() removes all kernfs nodes under directory "c1"
recursively, then waits for sub kernfs node "mon_groups" to drop active
reference.
Thread 2 is trying to create a subdirectory "m1" in the "mon_groups"
directory. The wrapper kernfs_iop_mkdir() takes an active reference to
the "mon_groups" directory but the code drops the active reference to
the parent directory "c1" instead.
As a result, Thread 1 is blocked on waiting for active reference to drop
and never release rdtgroup_mutex, while Thread 2 is also blocked on
trying to get rdtgroup_mutex.
Thread 1 (rdtgroup_rmdir) Thread 2 (rdtgroup_mkdir)
(rmdir /sys/fs/resctrl/c1) (mkdir /sys/fs/resctrl/c1/mon_groups/m1)
------------------------- -------------------------
kernfs_iop_mkdir
/*
* kn: "m1", parent_kn: "mon_groups",
* prgrp_kn: parent_kn->parent: "c1",
*
* "mon_groups", parent_kn->active++: 1
*/
kernfs_get_active(parent_kn)
kernfs_iop_rmdir
/* "c1", kn->active++ */
kernfs_get_active(kn)
rdtgroup_kn_lock_live
atomic_inc(&rdtgrp->waitcount)
/* "c1", kn->active-- */
kernfs_break_active_protection(kn)
mutex_lock
rdtgroup_rmdir_ctrl
free_all_child_rdtgrp
sentry->flags = RDT_DELETED
rdtgroup_ctrl_remove
rdtgrp->flags = RDT_DELETED
kernfs_get(kn)
kernfs_remove(rdtgrp->kn)
__kernfs_remove
/* "mon_groups", sub_kn */
atomic_add(KN_DEACTIVATED_BIAS, &sub_kn->active)
kernfs_drain(sub_kn)
/*
* sub_kn->active == KN_DEACTIVATED_BIAS + 1,
* waiting on sub_kn->active to drop, but it
* never drops in Thread 2 which is blocked
* on getting rdtgroup_mutex.
*/
Thread 1 hangs here ---->
wait_event(sub_kn->active == KN_DEACTIVATED_BIAS)
...
rdtgroup_mkdir
rdtgroup_mkdir_mon(parent_kn, prgrp_kn)
mkdir_rdt_prepare(parent_kn, prgrp_kn)
rdtgroup_kn_lock_live(prgrp_kn)
atomic_inc(&rdtgrp->waitcount)
/*
* "c1", prgrp_kn->active--
*
* The active reference on "c1" is
* dropped, but not matching the
* actual active reference taken
* on "mon_groups", thus causing
* Thread 1 to wait forever while
* holding rdtgroup_mutex.
*/
kernfs_break_active_protection(
prgrp_kn)
/*
* Trying to get rdtgroup_mutex
* which is held by Thread 1.
*/
Thread 2 hangs here ----> mutex_lock
...
The problem is that the creation of a subdirectory in the "mon_groups"
directory incorrectly releases the active protection of its parent
directory instead of itself before it starts waiting for rdtgroup_mutex.
This is triggered by the rdtgroup_mkdir() flow calling
rdtgroup_kn_lock_live()/rdtgroup_kn_unlock() with kernfs node of the
parent control group ("c1") as argument. It should be called with kernfs
node "mon_groups" instead. What is currently missing is that the
kn->priv of "mon_groups" is NULL instead of pointing to the rdtgrp.
Fix it by pointing kn->priv to rdtgrp when "mon_groups" is created. Then
it could be passed to rdtgroup_kn_lock_live()/rdtgroup_kn_unlock()
instead. And then it operates on the same rdtgroup structure but handles
the active reference of kernfs node "mon_groups" to prevent deadlock.
The same changes are also made to the "mon_data" directories.
This results in some unused function parameters that will be cleaned up
in follow-up patch as the focus here is on the fix only in support of
backporting efforts.
Backporting notes:
Since upstream commit fa7d949337cc ("x86/resctrl: Rename and move rdt
files to a separate directory"), the file
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c has been renamed and moved to
arch/x86/kernel/cpu/resctrl/rdtgroup.c.
Apply the change against file arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
for older stable trees.
Fixes: c7d9aac ("x86/intel_rdt/cqm: Add mkdir support for RDT monitoring")
Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1578500886-21771-4-git-send-email-xiaochen.shen@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
[ Upstream commit 2d797e9ff95ecbcf0a83d657928ed20579444857 ] Writing a cloned file triggers a kernel oops and the user-space command process is also killed by the system. The bug can be reproduced stably via: 1) create a file under ocfs2 file system directory. journalctl -b > aa.txt 2) create a cloned file for this file. reflink aa.txt bb.txt 3) write the cloned file with dd command. dd if=/dev/zero of=bb.txt bs=512 count=1 conv=notrunc The dd command is killed by the kernel, then you can see the oops message via dmesg command. [ 463.875404] BUG: kernel NULL pointer dereference, address: 0000000000000028 [ 463.875413] #PF: supervisor read access in kernel mode [ 463.875416] #PF: error_code(0x0000) - not-present page [ 463.875418] PGD 0 P4D 0 [ 463.875425] Oops: 0000 [#1] SMP PTI [ 463.875431] CPU: 1 PID: 2291 Comm: dd Tainted: G OE 5.3.16-2-default [ 463.875433] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 [ 463.875500] RIP: 0010:ocfs2_refcount_cow+0xa4/0x5d0 [ocfs2] [ 463.875505] Code: 06 89 6c 24 38 89 eb f6 44 24 3c 02 74 be 49 8b 47 28 [ 463.875508] RSP: 0018:ffffa2cb409dfce8 EFLAGS: 00010202 [ 463.875512] RAX: ffff8b1ebdca8000 RBX: 0000000000000001 RCX: ffff8b1eb73a9df0 [ 463.875515] RDX: 0000000000056a01 RSI: 0000000000000000 RDI: 0000000000000000 [ 463.875517] RBP: 0000000000000001 R08: ffff8b1eb73a9de0 R09: 0000000000000000 [ 463.875520] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000 [ 463.875522] R13: ffff8b1eb922f048 R14: 0000000000000000 R15: ffff8b1eb922f048 [ 463.875526] FS: 00007f8f44d15540(0000) GS:ffff8b1ebeb00000(0000) knlGS:0000000000000000 [ 463.875529] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 463.875532] CR2: 0000000000000028 CR3: 000000003c17a000 CR4: 00000000000006e0 [ 463.875546] Call Trace: [ 463.875596] ? ocfs2_inode_lock_full_nested+0x18b/0x960 [ocfs2] [ 463.875648] ocfs2_file_write_iter+0xaf8/0xc70 [ocfs2] [ 463.875672] new_sync_write+0x12d/0x1d0 [ 463.875688] vfs_write+0xad/0x1a0 [ 463.875697] ksys_write+0xa1/0xe0 [ 463.875710] do_syscall_64+0x60/0x1f0 [ 463.875743] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 463.875758] RIP: 0033:0x7f8f4482ed44 [ 463.875762] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 80 00 00 00 [ 463.875765] RSP: 002b:00007fff300a79d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 463.875769] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f8f4482ed44 [ 463.875771] RDX: 0000000000000200 RSI: 000055f771b5c000 RDI: 0000000000000001 [ 463.875774] RBP: 0000000000000200 R08: 00007f8f44af9c78 R09: 0000000000000003 [ 463.875776] R10: 000000000000089f R11: 0000000000000246 R12: 000055f771b5c000 [ 463.875779] R13: 0000000000000200 R14: 0000000000000000 R15: 000055f771b5c000 This regression problem was introduced by commit e74540b28556 ("ocfs2: protect extent tree in ocfs2_prepare_inode_for_write()"). Link: http://lkml.kernel.org/r/20200121050153.13290-1-ghe@suse.com Fixes: e74540b28556 ("ocfs2: protect extent tree in ocfs2_prepare_inode_for_write()"). Signed-off-by: Gang He <ghe@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
[ Upstream commit 2b5b8251bc9fe2f9118411f037862ee17cf81e97 ] hsr_port_get_rcu() can return NULL, so we need to be careful. general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037] CPU: 1 PID: 10249 Comm: syz-executor.5 Not tainted 5.5.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__read_once_size include/linux/compiler.h:199 [inline] RIP: 0010:hsr_addr_is_self+0x86/0x330 net/hsr/hsr_framereg.c:44 Code: 04 00 f3 f3 f3 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 e8 6b ff 94 f9 4c 89 f2 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 75 02 00 00 48 8b 43 30 49 39 c6 49 89 47 c0 0f RSP: 0018:ffffc90000da8a90 EFLAGS: 00010206 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff87e0cc33 RDX: 0000000000000006 RSI: ffffffff87e035d5 RDI: 0000000000000000 RBP: ffffc90000da8b20 R08: ffff88808e7de040 R09: ffffed1015d2707c R10: ffffed1015d2707b R11: ffff8880ae9383db R12: ffff8880a689bc5e R13: 1ffff920001b5153 R14: 0000000000000030 R15: ffffc90000da8af8 FS: 00007fd7a42be700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b32338000 CR3: 00000000a928c000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> hsr_handle_frame+0x1c5/0x630 net/hsr/hsr_slave.c:31 __netif_receive_skb_core+0xfbc/0x30b0 net/core/dev.c:5099 __netif_receive_skb_one_core+0xa8/0x1a0 net/core/dev.c:5196 __netif_receive_skb+0x2c/0x1d0 net/core/dev.c:5312 process_backlog+0x206/0x750 net/core/dev.c:6144 napi_poll net/core/dev.c:6582 [inline] net_rx_action+0x508/0x1120 net/core/dev.c:6650 __do_softirq+0x262/0x98c kernel/softirq.c:292 do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1082 </IRQ> Fixes: c5a7591 ("net/hsr: Use list_head (and rcu) instead of array for slave devices.") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
commit f1037ec0cc8ac1a450974ad9754e991f72884f48 upstream.
The daxctl unit test for the dax_kmem driver currently triggers the
(false positive) lockdep splat below. It results from the fact that
remove_memory_block_devices() is invoked under the mem_hotplug_lock()
causing lockdep entanglements with cpu_hotplug_lock() and sysfs (kernfs
active state tracking). It is a false positive because the sysfs
attribute path triggering the memory remove is not the same attribute
path associated with memory-block device.
sysfs_break_active_protection() is not applicable since there is no real
deadlock conflict, instead move memory-block device removal outside the
lock. The mem_hotplug_lock() is not needed to synchronize the
memory-block device removal vs the page online state, that is already
handled by lock_device_hotplug(). Specifically, lock_device_hotplug()
is sufficient to allow try_remove_memory() to check the offline state of
the memblocks and be assured that any in progress online attempts are
flushed / blocked by kernfs_drain() / attribute removal.
The add_memory() path safely creates memblock devices under the
mem_hotplug_lock(). There is no kernfs active state synchronization in
the memblock device_register() path, so nothing to fix there.
This change is only possible thanks to the recent change that refactored
memory block device removal out of arch_remove_memory() (commit
4c4b7f9ba948 "mm/memory_hotplug: remove memory block devices before
arch_remove_memory()"), and David's due diligence tracking down the
guarantees afforded by kernfs_drain(). Not flagged for -stable since
this only impacts ongoing development and lockdep validation, not a
runtime issue.
======================================================
WARNING: possible circular locking dependency detected
5.5.0-rc3+ #230 Tainted: G OE
------------------------------------------------------
lt-daxctl/6459 is trying to acquire lock:
ffff99c7f0003510 (kn->count#241){++++}, at: kernfs_remove_by_name_ns+0x41/0x80
but task is already holding lock:
ffffffffa76a5450 (mem_hotplug_lock.rw_sem){++++}, at: percpu_down_write+0x20/0xe0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (mem_hotplug_lock.rw_sem){++++}:
__lock_acquire+0x39c/0x790
lock_acquire+0xa2/0x1b0
get_online_mems+0x3e/0xb0
kmem_cache_create_usercopy+0x2e/0x260
kmem_cache_create+0x12/0x20
ptlock_cache_init+0x20/0x28
start_kernel+0x243/0x547
secondary_startup_64+0xb6/0xc0
-> #1 (cpu_hotplug_lock.rw_sem){++++}:
__lock_acquire+0x39c/0x790
lock_acquire+0xa2/0x1b0
cpus_read_lock+0x3e/0xb0
online_pages+0x37/0x300
memory_subsys_online+0x17d/0x1c0
device_online+0x60/0x80
state_store+0x65/0xd0
kernfs_fop_write+0xcf/0x1c0
vfs_write+0xdb/0x1d0
ksys_write+0x65/0xe0
do_syscall_64+0x5c/0xa0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
-> #0 (kn->count#241){++++}:
check_prev_add+0x98/0xa40
validate_chain+0x576/0x860
__lock_acquire+0x39c/0x790
lock_acquire+0xa2/0x1b0
__kernfs_remove+0x25f/0x2e0
kernfs_remove_by_name_ns+0x41/0x80
remove_files.isra.0+0x30/0x70
sysfs_remove_group+0x3d/0x80
sysfs_remove_groups+0x29/0x40
device_remove_attrs+0x39/0x70
device_del+0x16a/0x3f0
device_unregister+0x16/0x60
remove_memory_block_devices+0x82/0xb0
try_remove_memory+0xb5/0x130
remove_memory+0x26/0x40
dev_dax_kmem_remove+0x44/0x6a [kmem]
device_release_driver_internal+0xe4/0x1c0
unbind_store+0xef/0x120
kernfs_fop_write+0xcf/0x1c0
vfs_write+0xdb/0x1d0
ksys_write+0x65/0xe0
do_syscall_64+0x5c/0xa0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
other info that might help us debug this:
Chain exists of:
kn->count#241 --> cpu_hotplug_lock.rw_sem --> mem_hotplug_lock.rw_sem
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(mem_hotplug_lock.rw_sem);
lock(cpu_hotplug_lock.rw_sem);
lock(mem_hotplug_lock.rw_sem);
lock(kn->count#241);
*** DEADLOCK ***
No fixes tag as this has been a long standing issue that predated the
addition of kernfs lockdep annotations.
Link: http://lkml.kernel.org/r/157991441887.2763922.4770790047389427325.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
…nodes commit 7227ff4de55d931bbdc156c8ef0ce4f100c78a5b upstream. There is a race between adding and removing elements to the tree mod log list and rbtree that can lead to use-after-free problems. Consider the following example that explains how/why the problems happens: 1) Task A has mod log element with sequence number 200. It currently is the only element in the mod log list; 2) Task A calls btrfs_put_tree_mod_seq() because it no longer needs to access the tree mod log. When it enters the function, it initializes 'min_seq' to (u64)-1. Then it acquires the lock 'tree_mod_seq_lock' before checking if there are other elements in the mod seq list. Since the list it empty, 'min_seq' remains set to (u64)-1. Then it unlocks the lock 'tree_mod_seq_lock'; 3) Before task A acquires the lock 'tree_mod_log_lock', task B adds itself to the mod seq list through btrfs_get_tree_mod_seq() and gets a sequence number of 201; 4) Some other task, name it task C, modifies a btree and because there elements in the mod seq list, it adds a tree mod elem to the tree mod log rbtree. That node added to the mod log rbtree is assigned a sequence number of 202; 5) Task B, which is doing fiemap and resolving indirect back references, calls btrfs get_old_root(), with 'time_seq' == 201, which in turn calls tree_mod_log_search() - the search returns the mod log node from the rbtree with sequence number 202, created by task C; 6) Task A now acquires the lock 'tree_mod_log_lock', starts iterating the mod log rbtree and finds the node with sequence number 202. Since 202 is less than the previously computed 'min_seq', (u64)-1, it removes the node and frees it; 7) Task B still has a pointer to the node with sequence number 202, and it dereferences the pointer itself and through the call to __tree_mod_log_rewind(), resulting in a use-after-free problem. This issue can be triggered sporadically with the test case generic/561 from fstests, and it happens more frequently with a higher number of duperemove processes. When it happens to me, it either freezes the VM or it produces a trace like the following before crashing: [ 1245.321140] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI [ 1245.321200] CPU: 1 PID: 26997 Comm: pool Not tainted 5.5.0-rc6-btrfs-next-52 #1 [ 1245.321235] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014 [ 1245.321287] RIP: 0010:rb_next+0x16/0x50 [ 1245.321307] Code: .... [ 1245.321372] RSP: 0018:ffffa151c4d039b0 EFLAGS: 00010202 [ 1245.321388] RAX: 6b6b6b6b6b6b6b6b RBX: ffff8ae221363c80 RCX: 6b6b6b6b6b6b6b6b [ 1245.321409] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8ae221363c80 [ 1245.321439] RBP: ffff8ae20fcc4688 R08: 0000000000000002 R09: 0000000000000000 [ 1245.321475] R10: ffff8ae20b120910 R11: 00000000243f8bb1 R12: 0000000000000038 [ 1245.321506] R13: ffff8ae221363c80 R14: 000000000000075f R15: ffff8ae223f762b8 [ 1245.321539] FS: 00007fdee1ec7700(0000) GS:ffff8ae236c80000(0000) knlGS:0000000000000000 [ 1245.321591] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1245.321614] CR2: 00007fded4030c48 CR3: 000000021da16003 CR4: 00000000003606e0 [ 1245.321642] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1245.321668] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1245.321706] Call Trace: [ 1245.321798] __tree_mod_log_rewind+0xbf/0x280 [btrfs] [ 1245.321841] btrfs_search_old_slot+0x105/0xd00 [btrfs] [ 1245.321877] resolve_indirect_refs+0x1eb/0xc60 [btrfs] [ 1245.321912] find_parent_nodes+0x3dc/0x11b0 [btrfs] [ 1245.321947] btrfs_check_shared+0x115/0x1c0 [btrfs] [ 1245.321980] ? extent_fiemap+0x59d/0x6d0 [btrfs] [ 1245.322029] extent_fiemap+0x59d/0x6d0 [btrfs] [ 1245.322066] do_vfs_ioctl+0x45a/0x750 [ 1245.322081] ksys_ioctl+0x70/0x80 [ 1245.322092] ? trace_hardirqs_off_thunk+0x1a/0x1c [ 1245.322113] __x64_sys_ioctl+0x16/0x20 [ 1245.322126] do_syscall_64+0x5c/0x280 [ 1245.322139] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 1245.322155] RIP: 0033:0x7fdee3942dd7 [ 1245.322177] Code: .... [ 1245.322258] RSP: 002b:00007fdee1ec6c88 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 1245.322294] RAX: ffffffffffffffda RBX: 00007fded40210d8 RCX: 00007fdee3942dd7 [ 1245.322314] RDX: 00007fded40210d8 RSI: 00000000c020660b RDI: 0000000000000004 [ 1245.322337] RBP: 0000562aa89e7510 R08: 0000000000000000 R09: 00007fdee1ec6d44 [ 1245.322369] R10: 0000000000000073 R11: 0000000000000246 R12: 00007fdee1ec6d48 [ 1245.322390] R13: 00007fdee1ec6d40 R14: 00007fded40210d0 R15: 00007fdee1ec6d50 [ 1245.322423] Modules linked in: .... [ 1245.323443] ---[ end trace 01de1e9ec5dff3cd ]--- Fix this by ensuring that btrfs_put_tree_mod_seq() computes the minimum sequence number and iterates the rbtree while holding the lock 'tree_mod_log_lock' in write mode. Also get rid of the 'tree_mod_seq_lock' lock, since it is now redundant. Fixes: bd989ba ("Btrfs: add tree modification log functions") Fixes: 097b8a7 ("Btrfs: join tree mod log code with the code holding back delayed refs") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
commit 080d89f522e2baddb4fbbd1af4b67b5f92537ef8 upstream. Once rc_open is called on the input device, lirc events can be delivered. Ensure lirc is ready to do so else we might get this: Registered IR keymap rc-hauppauge rc rc0: Hauppauge WinTV PVR-350 as /devices/pci0000:00/0000:00:1e.0/0000:04:00.0/i2c-0/0-0018/rc/rc0 input: Hauppauge WinTV PVR-350 as /devices/pci0000:00/0000:00:1e.0/0000:04:00.0/i2c-0/0-0018/rc/rc0/input9 BUG: kernel NULL pointer dereference, address: 0000000000000038 PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 1 PID: 17 Comm: kworker/1:0 Not tainted 5.3.11-300.fc31.x86_64 #1 Hardware name: /DG43NB, BIOS NBG4310H.86A.0096.2009.0903.1845 09/03/2009 Workqueue: events ir_work [ir_kbd_i2c] RIP: 0010:ir_lirc_scancode_event+0x3d/0xb0 Code: a6 b4 07 00 00 49 81 c6 b8 07 00 00 55 53 e8 ba a7 9d ff 4c 89 e7 49 89 45 00 e8 5e 7a 25 00 49 8b 1e 48 89 c5 4c 39 f3 74 58 <8b> 43 38 8b 53 40 89 c1 2b 4b 3c 39 ca 72 41 21 d0 49 8b 7d 00 49 RSP: 0018:ffffaae2000b3d88 EFLAGS: 00010017 RAX: 0000000000000002 RBX: 0000000000000000 RCX: 0000000000000019 RDX: 0000000000000001 RSI: 006e801b1f26ce6a RDI: ffff9e39797c37b4 RBP: 0000000000000002 R08: 0000000000000001 R09: 0000000000000001 R10: 0000000000000001 R11: 0000000000000001 R12: ffff9e39797c37b4 R13: ffffaae2000b3db8 R14: ffff9e39797c37b8 R15: ffff9e39797c33d8 FS: 0000000000000000(0000) GS:ffff9e397b680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000038 CR3: 0000000035844000 CR4: 00000000000006e0 Call Trace: ir_do_keydown+0x8e/0x2b0 rc_keydown+0x52/0xc0 ir_work+0xb8/0x130 [ir_kbd_i2c] process_one_work+0x19d/0x340 worker_thread+0x50/0x3b0 kthread+0xfb/0x130 ? process_one_work+0x340/0x340 ? kthread_park+0x80/0x80 ret_from_fork+0x35/0x40 Modules linked in: rc_hauppauge tuner msp3400 saa7127 saa7115 ivtv(+) tveeprom cx2341x v4l2_common videodev mc i2c_algo_bit ir_kbd_i2c ip_tables firewire_ohci e1000e serio_raw firewire_core ata_generic crc_itu_t pata_acpi pata_jmicron fuse CR2: 0000000000000038 ---[ end trace c67c2697a99fa74b ]--- RIP: 0010:ir_lirc_scancode_event+0x3d/0xb0 Code: a6 b4 07 00 00 49 81 c6 b8 07 00 00 55 53 e8 ba a7 9d ff 4c 89 e7 49 89 45 00 e8 5e 7a 25 00 49 8b 1e 48 89 c5 4c 39 f3 74 58 <8b> 43 38 8b 53 40 89 c1 2b 4b 3c 39 ca 72 41 21 d0 49 8b 7d 00 49 RSP: 0018:ffffaae2000b3d88 EFLAGS: 00010017 RAX: 0000000000000002 RBX: 0000000000000000 RCX: 0000000000000019 RDX: 0000000000000001 RSI: 006e801b1f26ce6a RDI: ffff9e39797c37b4 RBP: 0000000000000002 R08: 0000000000000001 R09: 0000000000000001 R10: 0000000000000001 R11: 0000000000000001 R12: ffff9e39797c37b4 R13: ffffaae2000b3db8 R14: ffff9e39797c37b8 R15: ffff9e39797c33d8 FS: 0000000000000000(0000) GS:ffff9e397b680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000038 CR3: 0000000035844000 CR4: 00000000000006e0 rc rc0: lirc_dev: driver ir_kbd_i2c registered at minor = 0, scancode receiver, no transmitter tuner-simple 0-0061: creating new instance tuner-simple 0-0061: type set to 2 (Philips NTSC (FI1236,FM1236 and compatibles)) ivtv0: Registered device video0 for encoder MPG (4096 kB) ivtv0: Registered device video32 for encoder YUV (2048 kB) ivtv0: Registered device vbi0 for encoder VBI (1024 kB) ivtv0: Registered device video24 for encoder PCM (320 kB) ivtv0: Registered device radio0 for encoder radio ivtv0: Registered device video16 for decoder MPG (1024 kB) ivtv0: Registered device vbi8 for decoder VBI (64 kB) ivtv0: Registered device vbi16 for decoder VOUT ivtv0: Registered device video48 for decoder YUV (1024 kB) Cc: stable@vger.kernel.org Tested-by: Nick French <nickfrench@gmail.com> Reported-by: Nick French <nickfrench@gmail.com> Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
commit 69503e585192fdd84b240f18a0873d20e18a2e0a upstream. After the commit 44ea394 ("drivers/watchdog: make use of devm_register_reboot_notifier()") the struct notifier_block reboot_nb in the struct watchdog_device is removed from the reboot notifiers chain at the time watchdog's chardev is closed. But at least in i6300esb.c case reboot_nb is embedded in the struct esb_dev which can be freed on its device removal and before the chardev is closed, thus UAF at reboot: [ 7.728581] esb_probe: esb_dev.watchdog_device ffff91316f91ab28 ts# uname -r note the address ^^^ 5.5.0-rc5-ae6088-wdog ts# ./openwdog0 & [1] 696 ts# opened /dev/watchdog0, sleeping 10s... ts# echo 1 > /sys/devices/pci0000\:00/0000\:00\:09.0/remove [ 178.086079] devres:rel_nodes: dev ffff91317668a0b0 data ffff91316f91ab28 esb_dev.watchdog_device.reboot_nb memory is freed here ^^^ ts# ...woken up [ 181.459010] devres:rel_nodes: dev ffff913171781000 data ffff913174a1dae8 [ 181.460195] devm_unreg_reboot_notifier: res ffff913174a1dae8 nb ffff91316f91ab78 attempt to use memory already freed ^^^ [ 181.461063] devm_unreg_reboot_notifier: nb->call 6b6b6b6b6b6b6b6b [ 181.461243] devm_unreg_reboot_notifier: nb->next 6b6b6b6b6b6b6b6b freed memory is filled with a slub poison ^^^ [1]+ Done ./openwdog0 ts# reboot [ 229.921862] systemd-shutdown[1]: Rebooting. [ 229.939265] notifier_call_chain: nb ffffffff9c6c2f20 nb->next ffffffff9c6d50c0 [ 229.943080] notifier_call_chain: nb ffffffff9c6d50c0 nb->next 6b6b6b6b6b6b6b6b [ 229.946054] notifier_call_chain: nb 6b6b6b6b6b6b6b6b INVAL [ 229.957584] general protection fault: 0000 [#1] SMP [ 229.958770] CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 5.5.0-rc5-ae6088-wdog [ 229.960224] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ... [ 229.963288] RIP: 0010:notifier_call_chain+0x66/0xd0 [ 229.969082] RSP: 0018:ffffb20dc0013d88 EFLAGS: 00010246 [ 229.970812] RAX: 000000000000002e RBX: 6b6b6b6b6b6b6b6b RCX: 00000000000008b3 [ 229.972929] RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffffffff9ccc46ac [ 229.975028] RBP: 0000000000000001 R08: 0000000000000000 R09: 00000000000008b3 [ 229.977039] R10: 0000000000000001 R11: ffffffff9c26c740 R12: 0000000000000000 [ 229.979155] R13: 6b6b6b6b6b6b6b6b R14: 0000000000000000 R15: 00000000fffffffa ... slub_debug=FZP poison ^^^ [ 229.989089] Call Trace: [ 229.990157] blocking_notifier_call_chain+0x43/0x59 [ 229.991401] kernel_restart_prepare+0x14/0x30 [ 229.992607] kernel_restart+0x9/0x30 [ 229.993800] __do_sys_reboot+0x1d2/0x210 [ 230.000149] do_syscall_64+0x3d/0x130 [ 230.001277] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 230.002639] RIP: 0033:0x7f5461bdd177 [ 230.016402] Modules linked in: i6300esb [ 230.050261] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b Fix the crash by reverting 44ea394 so unregister_reboot_notifier() is called when watchdog device is removed. This also makes handling of the reboot notifier unified with the handling of the restart handler, which is freed with unregister_restart_handler() in the same place. Fixes: 44ea394 ("drivers/watchdog: make use of devm_register_reboot_notifier()") Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Vladis Dronov <vdronov@redhat.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20200108125347.6067-1-vdronov@redhat.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
[ Upstream commit 96bf313ecb33567af4cb53928b0c951254a02759 ] There exists a deadlock with range_cyclic that has existed forever. If we loop around with a bio already built we could deadlock with a writer who has the page locked that we're attempting to write but is waiting on a page in our bio to be written out. The task traces are as follows PID: 1329874 TASK: ffff889ebcdf3800 CPU: 33 COMMAND: "kworker/u113:5" #0 [ffffc900297bb658] __schedule at ffffffff81a4c33f #1 [ffffc900297bb6e0] schedule at ffffffff81a4c6e3 #2 [ffffc900297bb6f8] io_schedule at ffffffff81a4ca42 #3 [ffffc900297bb708] __lock_page at ffffffff811f145b #4 [ffffc900297bb798] __process_pages_contig at ffffffff814bc502 #5 [ffffc900297bb8c8] lock_delalloc_pages at ffffffff814bc684 #6 [ffffc900297bb900] find_lock_delalloc_range at ffffffff814be9ff #7 [ffffc900297bb9a0] writepage_delalloc at ffffffff814bebd0 #8 [ffffc900297bba18] __extent_writepage at ffffffff814bfbf2 #9 [ffffc900297bba98] extent_write_cache_pages at ffffffff814bffbd PID: 2167901 TASK: ffff889dc6a59c00 CPU: 14 COMMAND: "aio-dio-invalid" #0 [ffffc9003b50bb18] __schedule at ffffffff81a4c33f #1 [ffffc9003b50bba0] schedule at ffffffff81a4c6e3 #2 [ffffc9003b50bbb8] io_schedule at ffffffff81a4ca42 #3 [ffffc9003b50bbc8] wait_on_page_bit at ffffffff811f24d6 #4 [ffffc9003b50bc60] prepare_pages at ffffffff814b05a7 #5 [ffffc9003b50bcd8] btrfs_buffered_write at ffffffff814b1359 #6 [ffffc9003b50bdb0] btrfs_file_write_iter at ffffffff814b5933 #7 [ffffc9003b50be38] new_sync_write at ffffffff8128f6a8 #8 [ffffc9003b50bec8] vfs_write at ffffffff81292b9d #9 [ffffc9003b50bf00] ksys_pwrite64 at ffffffff81293032 I used drgn to find the respective pages we were stuck on page_entry.page 0xffffea00fbfc7500 index 8148 bit 15 pid 2167901 page_entry.page 0xffffea00f9bb7400 index 7680 bit 0 pid 1329874 As you can see the kworker is waiting for bit 0 (PG_locked) on index 7680, and aio-dio-invalid is waiting for bit 15 (PG_writeback) on index 8148. aio-dio-invalid has 7680, and the kworker epd looks like the following crash> struct extent_page_data ffffc900297bbbb0 struct extent_page_data { bio = 0xffff889f747ed830, tree = 0xffff889eed6ba448, extent_locked = 0, sync_io = 0 } Probably worth mentioning as well that it waits for writeback of the page to complete while holding a lock on it (at prepare_pages()). Using drgn I walked the bio pages looking for page 0xffffea00fbfc7500 which is the one we're waiting for writeback on bio = Object(prog, 'struct bio', address=0xffff889f747ed830) for i in range(0, bio.bi_vcnt.value_()): bv = bio.bi_io_vec[i] if bv.bv_page.value_() == 0xffffea00fbfc7500: print("FOUND IT") which validated what I suspected. The fix for this is simple, flush the epd before we loop back around to the beginning of the file during writeout. Fixes: b293f02 ("Btrfs: Add writepages support") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
[ Upstream commit 907ec5f ] Patch series "mm: Fix for movable_node boot option", v3. This patch series contains a fix for the movable_node boot option issue which was introduced by commit 124049d ("x86/e820: put !E820_TYPE_RAM regions into memblock.reserved"). The commit breaks the option because it changed the memory gap range to reserved memblock. So, the node is marked as Normal zone even if the SRAT has Hot pluggable affinity. First and second patch fix the original issue which the commit tried to fix, then revert the commit. This patch (of 3): There is a kernel panic that is triggered when reading /proc/kpageflags on the kernel booted with kernel parameter 'memmap=nn[KMG]!ss[KMG]': BUG: unable to handle kernel paging request at fffffffffffffffe PGD 9b20e067 P4D 9b20e067 PUD 9b210067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 2 PID: 1728 Comm: page-types Not tainted 4.17.0-rc6-mm1-v4.17-rc6-180605-0816-00236-g2dfb086ef02c+ #160 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.fc28 04/01/2014 RIP: 0010:stable_page_flags+0x27/0x3c0 Code: 00 00 00 0f 1f 44 00 00 48 85 ff 0f 84 a0 03 00 00 41 54 55 49 89 fc 53 48 8b 57 08 48 8b 2f 48 8d 42 ff 83 e2 01 48 0f 44 c7 <48> 8b 00 f6 c4 01 0f 84 10 03 00 00 31 db 49 8b 54 24 08 4c 89 e7 RSP: 0018:ffffbbd44111fde0 EFLAGS: 00010202 RAX: fffffffffffffffe RBX: 00007fffffffeff9 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000202 RDI: ffffed1182fff5c0 RBP: ffffffffffffffff R08: 0000000000000001 R09: 0000000000000001 R10: ffffbbd44111fed8 R11: 0000000000000000 R12: ffffed1182fff5c0 R13: 00000000000bffd7 R14: 0000000002fff5c0 R15: ffffbbd44111ff10 FS: 00007efc4335a500(0000) GS:ffff93a5bfc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: fffffffffffffffe CR3: 00000000b2a58000 CR4: 00000000001406e0 Call Trace: kpageflags_read+0xc7/0x120 proc_reg_read+0x3c/0x60 __vfs_read+0x36/0x170 vfs_read+0x89/0x130 ksys_pread64+0x71/0x90 do_syscall_64+0x5b/0x160 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7efc42e75e23 Code: 09 00 ba 9f 01 00 00 e8 ab 81 f4 ff 66 2e 0f 1f 84 00 00 00 00 00 90 83 3d 29 0a 2d 00 00 75 13 49 89 ca b8 11 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 db d3 01 00 48 89 04 24 According to kernel bisection, this problem became visible due to commit f7f9910 which changes how struct pages are initialized. Memblock layout affects the pfn ranges covered by node/zone. Consider that we have a VM with 2 NUMA nodes and each node has 4GB memory, and the default (no memmap= given) memblock layout is like below: MEMBLOCK configuration: memory size = 0x00000001fff75c00 reserved size = 0x000000000300c000 memory.cnt = 0x4 memory[0x0] [0x0000000000001000-0x000000000009efff], 0x000000000009e000 bytes on node 0 flags: 0x0 memory[0x1] [0x0000000000100000-0x00000000bffd6fff], 0x00000000bfed7000 bytes on node 0 flags: 0x0 memory[0x2] [0x0000000100000000-0x000000013fffffff], 0x0000000040000000 bytes on node 0 flags: 0x0 memory[0x3] [0x0000000140000000-0x000000023fffffff], 0x0000000100000000 bytes on node 1 flags: 0x0 ... If you give memmap=1G!4G (so it just covers memory[0x2]), the range [0x100000000-0x13fffffff] is gone: MEMBLOCK configuration: memory size = 0x00000001bff75c00 reserved size = 0x000000000300c000 memory.cnt = 0x3 memory[0x0] [0x0000000000001000-0x000000000009efff], 0x000000000009e000 bytes on node 0 flags: 0x0 memory[0x1] [0x0000000000100000-0x00000000bffd6fff], 0x00000000bfed7000 bytes on node 0 flags: 0x0 memory[0x2] [0x0000000140000000-0x000000023fffffff], 0x0000000100000000 bytes on node 1 flags: 0x0 ... This causes shrinking node 0's pfn range because it is calculated by the address range of memblock.memory. So some of struct pages in the gap range are left uninitialized. We have a function zero_resv_unavail() which does zeroing the struct pages outside memblock.memory, but currently it covers only the reserved unavailable range (i.e. memblock.memory && !memblock.reserved). This patch extends it to cover all unavailable range, which fixes the reported issue. Link: http://lkml.kernel.org/r/20181002143821.5112-2-msys.mizuma@gmail.com Fixes: f7f9910 ("mm: stop zeroing memory during allocation in vmemmap") Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Tested-by: Oscar Salvador <osalvador@suse.de> Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
commit 6f1a4891a5928a5969c87fa5a584844c983ec823 upstream. Evan tracked down a subtle race between the update of the MSI message and the device raising an interrupt internally on PCI devices which do not support MSI masking. The update of the MSI message is non-atomic and consists of either 2 or 3 sequential 32bit wide writes to the PCI config space. - Write address low 32bits - Write address high 32bits (If supported by device) - Write data When an interrupt is migrated then both address and data might change, so the kernel attempts to mask the MSI interrupt first. But for MSI masking is optional, so there exist devices which do not provide it. That means that if the device raises an interrupt internally between the writes then a MSI message is sent built from half updated state. On x86 this can lead to spurious interrupts on the wrong interrupt vector when the affinity setting changes both address and data. As a consequence the device interrupt can be lost causing the device to become stuck or malfunctioning. Evan tried to handle that by disabling MSI accross an MSI message update. That's not feasible because disabling MSI has issues on its own: If MSI is disabled the PCI device is routing an interrupt to the legacy INTx mechanism. The INTx delivery can be disabled, but the disablement is not working on all devices. Some devices lose interrupts when both MSI and INTx delivery are disabled. Another way to solve this would be to enforce the allocation of the same vector on all CPUs in the system for this kind of screwed devices. That could be done, but it would bring back the vector space exhaustion problems which got solved a few years ago. Fortunately the high address (if supported by the device) is only relevant when X2APIC is enabled which implies interrupt remapping. In the interrupt remapping case the affinity setting is happening at the interrupt remapping unit and the PCI MSI message is programmed only once when the PCI device is initialized. That makes it possible to solve it with a two step update: 1) Target the MSI msg to the new vector on the current target CPU 2) Target the MSI msg to the new vector on the new target CPU In both cases writing the MSI message is only changing a single 32bit word which prevents the issue of inconsistency. After writing the final destination it is necessary to check whether the device issued an interrupt while the intermediate state #1 (new vector, current CPU) was in effect. This is possible because the affinity change is always happening on the current target CPU. The code runs with interrupts disabled, so the interrupt can be detected by checking the IRR of the local APIC. If the vector is pending in the IRR then the interrupt is retriggered on the new target CPU by sending an IPI for the associated vector on the target CPU. This can cause spurious interrupts on both the local and the new target CPU. 1) If the new vector is not in use on the local CPU and the device affected by the affinity change raised an interrupt during the transitional state (step #1 above) then interrupt entry code will ignore that spurious interrupt. The vector is marked so that the 'No irq handler for vector' warning is supressed once. 2) If the new vector is in use already on the local CPU then the IRR check might see an pending interrupt from the device which is using this vector. The IPI to the new target CPU will then invoke the handler of the device, which got the affinity change, even if that device did not issue an interrupt 3) If the new vector is in use already on the local CPU and the device affected by the affinity change raised an interrupt during the transitional state (step #1 above) then the handler of the device which uses that vector on the local CPU will be invoked. expose issues in device driver interrupt handlers which are not prepared to handle a spurious interrupt correctly. This not a regression, it's just exposing something which was already broken as spurious interrupts can happen for a lot of reasons and all driver handlers need to be able to deal with them. Reported-by: Evan Green <evgreen@chromium.org> Debugged-by: Evan Green <evgreen@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Evan Green <evgreen@chromium.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87imkr4s7n.fsf@nanos.tec.linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
commit febac332a819f0e764aa4da62757ba21d18c182b upstream. Kernel crashes inside QEMU/KVM are observed: kernel BUG at kernel/time/timer.c:1154! BUG_ON(timer_pending(timer) || !timer->function) in add_timer_on(). At the same time another cpu got: general protection fault: 0000 [#1] SMP PTI of poinson pointer 0xdead000000000200 in: __hlist_del at include/linux/list.h:681 (inlined by) detach_timer at kernel/time/timer.c:818 (inlined by) expire_timers at kernel/time/timer.c:1355 (inlined by) __run_timers at kernel/time/timer.c:1686 (inlined by) run_timer_softirq at kernel/time/timer.c:1699 Unfortunately kernel logs are badly scrambled, stacktraces are lost. Printing the timer->function before the BUG_ON() pointed to clocksource_watchdog(). The execution of clocksource_watchdog() can race with a sequence of clocksource_stop_watchdog() .. clocksource_start_watchdog(): expire_timers() detach_timer(timer, true); timer->entry.pprev = NULL; raw_spin_unlock_irq(&base->lock); call_timer_fn clocksource_watchdog() clocksource_watchdog_kthread() or clocksource_unbind() spin_lock_irqsave(&watchdog_lock, flags); clocksource_stop_watchdog(); del_timer(&watchdog_timer); watchdog_running = 0; spin_unlock_irqrestore(&watchdog_lock, flags); spin_lock_irqsave(&watchdog_lock, flags); clocksource_start_watchdog(); add_timer_on(&watchdog_timer, ...); watchdog_running = 1; spin_unlock_irqrestore(&watchdog_lock, flags); spin_lock(&watchdog_lock); add_timer_on(&watchdog_timer, ...); BUG_ON(timer_pending(timer) || !timer->function); timer_pending() -> true BUG() I.e. inside clocksource_watchdog() watchdog_timer could be already armed. Check timer_pending() before calling add_timer_on(). This is sufficient as all operations are synchronized by watchdog_lock. Fixes: 75c5158 ("timekeeping: Update clocksource with stop_machine") Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/158048693917.4378.13823603769948933793.stgit@buzz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
commit be8638344c70bf492963ace206a9896606b6922d upstream. Cleaning up a pq can result in the following warning and panic: WARNING: CPU: 52 PID: 77418 at lib/list_debug.c:53 __list_del_entry+0x63/0xd0 list_del corruption, ffff88cb2c6ac068->next is LIST_POISON1 (dead000000000100) Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) 8021q garp mrp ib_isert iscsi_target_mod target_core_mod crc_t10dif crct10dif_generic opa_vnic rpcrdma ib_iser libiscsi scsi_transport_iscsi ib_ipoib(OE) bridge stp llc iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crct10dif_pclmul crct10dif_common crc32_pclmul ghash_clmulni_intel ast aesni_intel ttm lrw gf128mul glue_helper ablk_helper drm_kms_helper cryptd syscopyarea sysfillrect sysimgblt fb_sys_fops drm pcspkr joydev lpc_ich mei_me drm_panel_orientation_quirks i2c_i801 mei wmi ipmi_si ipmi_devintf ipmi_msghandler nfit libnvdimm acpi_power_meter acpi_pad hfi1(OE) rdmavt(OE) rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core binfmt_misc numatools(OE) xpmem(OE) ip_tables nfsv3 nfs_acl nfs lockd grace sunrpc fscache igb ahci i2c_algo_bit libahci dca ptp libata pps_core crc32c_intel [last unloaded: i2c_algo_bit] CPU: 52 PID: 77418 Comm: pvbatch Kdump: loaded Tainted: G OE ------------ 3.10.0-957.38.3.el7.x86_64 #1 Hardware name: HPE.COM HPE SGI 8600-XA730i Gen10/X11DPT-SB-SG007, BIOS SBED1229 01/22/2019 Call Trace: [<ffffffff90365ac0>] dump_stack+0x19/0x1b [<ffffffff8fc98b78>] __warn+0xd8/0x100 [<ffffffff8fc98bff>] warn_slowpath_fmt+0x5f/0x80 [<ffffffff8ff970c3>] __list_del_entry+0x63/0xd0 [<ffffffff8ff9713d>] list_del+0xd/0x30 [<ffffffff8fddda70>] kmem_cache_destroy+0x50/0x110 [<ffffffffc0328130>] hfi1_user_sdma_free_queues+0xf0/0x200 [hfi1] [<ffffffffc02e2350>] hfi1_file_close+0x70/0x1e0 [hfi1] [<ffffffff8fe4519c>] __fput+0xec/0x260 [<ffffffff8fe453fe>] ____fput+0xe/0x10 [<ffffffff8fcbfd1b>] task_work_run+0xbb/0xe0 [<ffffffff8fc2bc65>] do_notify_resume+0xa5/0xc0 [<ffffffff90379134>] int_signal+0x12/0x17 BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 IP: [<ffffffff8fe1f93e>] kmem_cache_close+0x7e/0x300 PGD 2cdab19067 PUD 2f7bfdb067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) 8021q garp mrp ib_isert iscsi_target_mod target_core_mod crc_t10dif crct10dif_generic opa_vnic rpcrdma ib_iser libiscsi scsi_transport_iscsi ib_ipoib(OE) bridge stp llc iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crct10dif_pclmul crct10dif_common crc32_pclmul ghash_clmulni_intel ast aesni_intel ttm lrw gf128mul glue_helper ablk_helper drm_kms_helper cryptd syscopyarea sysfillrect sysimgblt fb_sys_fops drm pcspkr joydev lpc_ich mei_me drm_panel_orientation_quirks i2c_i801 mei wmi ipmi_si ipmi_devintf ipmi_msghandler nfit libnvdimm acpi_power_meter acpi_pad hfi1(OE) rdmavt(OE) rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core binfmt_misc numatools(OE) xpmem(OE) ip_tables nfsv3 nfs_acl nfs lockd grace sunrpc fscache igb ahci i2c_algo_bit libahci dca ptp libata pps_core crc32c_intel [last unloaded: i2c_algo_bit] CPU: 52 PID: 77418 Comm: pvbatch Kdump: loaded Tainted: G W OE ------------ 3.10.0-957.38.3.el7.x86_64 #1 Hardware name: HPE.COM HPE SGI 8600-XA730i Gen10/X11DPT-SB-SG007, BIOS SBED1229 01/22/2019 task: ffff88cc26db9040 ti: ffff88b5393a8000 task.ti: ffff88b5393a8000 RIP: 0010:[<ffffffff8fe1f93e>] [<ffffffff8fe1f93e>] kmem_cache_close+0x7e/0x300 RSP: 0018:ffff88b5393abd60 EFLAGS: 00010287 RAX: 0000000000000000 RBX: ffff88cb2c6ac000 RCX: 0000000000000003 RDX: 0000000000000400 RSI: 0000000000000400 RDI: ffffffff9095b800 RBP: ffff88b5393abdb0 R08: ffffffff9095b808 R09: ffffffff8ff77c19 R10: ffff88b73ce1f160 R11: ffffddecddde9800 R12: ffff88cb2c6ac000 R13: 000000000000000c R14: ffff88cf3fdca780 R15: 0000000000000000 FS: 00002aaaaab52500(0000) GS:ffff88b73ce00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000010 CR3: 0000002d27664000 CR4: 00000000007607e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: [<ffffffff8fe20d44>] __kmem_cache_shutdown+0x14/0x80 [<ffffffff8fddda78>] kmem_cache_destroy+0x58/0x110 [<ffffffffc0328130>] hfi1_user_sdma_free_queues+0xf0/0x200 [hfi1] [<ffffffffc02e2350>] hfi1_file_close+0x70/0x1e0 [hfi1] [<ffffffff8fe4519c>] __fput+0xec/0x260 [<ffffffff8fe453fe>] ____fput+0xe/0x10 [<ffffffff8fcbfd1b>] task_work_run+0xbb/0xe0 [<ffffffff8fc2bc65>] do_notify_resume+0xa5/0xc0 [<ffffffff90379134>] int_signal+0x12/0x17 Code: 00 00 ba 00 04 00 00 0f 4f c2 3d 00 04 00 00 89 45 bc 0f 84 e7 01 00 00 48 63 45 bc 49 8d 04 c4 48 89 45 b0 48 8b 80 c8 00 00 00 <48> 8b 78 10 48 89 45 c0 48 83 c0 10 48 89 45 d0 48 8b 17 48 39 RIP [<ffffffff8fe1f93e>] kmem_cache_close+0x7e/0x300 RSP <ffff88b5393abd60> CR2: 0000000000000010 The panic is the result of slab entries being freed during the destruction of the pq slab. The code attempts to quiesce the pq, but looking for n_req == 0 doesn't account for new requests. Fix the issue by using SRCU to get a pq pointer and adjust the pq free logic to NULL the fd pq pointer prior to the quiesce. Fixes: e87473b ("IB/hfi1: Only set fd pointer when base context is completely initialized") Link: https://lore.kernel.org/r/20200210131033.87408.81174.stgit@awfm-01.aw.intel.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
commit f92e48718889b3d49cee41853402aa88cac84a6b upstream. When the hfi1 device is shut down during a system reboot, it is possible that some QPs might have not not freed by ULPs. More requests could be post sent and a lingering timer could be triggered to schedule more packet sends, leading to a crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000102 IP: [ffffffff810a65f2] __queue_work+0x32/0x3c0 PGD 0 Oops: 0000 1 SMP Modules linked in: nvmet_rdma(OE) nvmet(OE) nvme(OE) dm_round_robin nvme_rdma(OE) nvme_fabrics(OE) nvme_core(OE) pal_raw(POE) pal_pmt(POE) pal_cache(POE) pal_pile(POE) pal(POE) pal_compatible(OE) rpcrdma sunrpc ib_isert iscsi_target_mod target_core_mod ib_iser libiscsi scsi_transport_iscsi ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx4_ib sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt iTCO_vendor_support mxm_wmi ipmi_ssif pcspkr ses enclosure joydev scsi_transport_sas i2c_i801 sg mei_me lpc_ich mei ioatdma shpchp ipmi_si ipmi_devintf ipmi_msghandler wmi acpi_power_meter acpi_pad dm_multipath hangcheck_timer ip_tables ext4 mbcache jbd2 mlx4_en sd_mod crc_t10dif crct10dif_generic mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm mlx4_core crct10dif_pclmul crct10dif_common hfi1(OE) igb crc32c_intel rdmavt(OE) ahci ib_core libahci libata ptp megaraid_sas pps_core dca i2c_algo_bit i2c_core devlink dm_mirror dm_region_hash dm_log dm_mod CPU: 23 PID: 0 Comm: swapper/23 Tainted: P OE ------------ 3.10.0-693.el7.x86_64 #1 Hardware name: Intel Corporation S2600CWR/S2600CWR, BIOS SE5C610.86B.01.01.0028.121720182203 12/17/2018 task: ffff8808f4ec4f10 ti: ffff8808f4ed8000 task.ti: ffff8808f4ed8000 RIP: 0010:[ffffffff810a65f2] [ffffffff810a65f2] __queue_work+0x32/0x3c0 RSP: 0018:ffff88105df43d48 EFLAGS: 00010046 RAX: 0000000000000086 RBX: 0000000000000086 RCX: 0000000000000000 RDX: ffff880f74e758b0 RSI: 0000000000000000 RDI: 000000000000001f RBP: ffff88105df43d80 R08: ffff8808f3c583c8 R09: ffff8808f3c58000 R10: 0000000000000002 R11: ffff88105df43da8 R12: ffff880f74e758b0 R13: 000000000000001f R14: 0000000000000000 R15: ffff88105a300000 FS: 0000000000000000(0000) GS:ffff88105df40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000102 CR3: 00000000019f2000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffff88105b6dd708 0000001f00000286 0000000000000086 ffff88105a300000 ffff880f74e75800 0000000000000000 ffff88105a300000 ffff88105df43d98 ffffffff810a6b85 ffff88105a301e80 ffff88105df43dc8 ffffffffc0224cde Call Trace: IRQ [ffffffff810a6b85] queue_work_on+0x45/0x50 [ffffffffc0224cde] _hfi1_schedule_send+0x6e/0xc0 [hfi1] [ffffffffc0170570] ? get_map_page+0x60/0x60 [rdmavt] [ffffffffc0224d62] hfi1_schedule_send+0x32/0x70 [hfi1] [ffffffffc0170644] rvt_rc_timeout+0xd4/0x120 [rdmavt] [ffffffffc0170570] ? get_map_page+0x60/0x60 [rdmavt] [ffffffff81097316] call_timer_fn+0x36/0x110 [ffffffffc0170570] ? get_map_page+0x60/0x60 [rdmavt] [ffffffff8109982d] run_timer_softirq+0x22d/0x310 [ffffffff81090b3f] __do_softirq+0xef/0x280 [ffffffff816b6a5c] call_softirq+0x1c/0x30 [ffffffff8102d3c5] do_softirq+0x65/0xa0 [ffffffff81090ec5] irq_exit+0x105/0x110 [ffffffff816b76c2] smp_apic_timer_interrupt+0x42/0x50 [ffffffff816b5c1d] apic_timer_interrupt+0x6d/0x80 EOI [ffffffff81527a02] ? cpuidle_enter_state+0x52/0xc0 [ffffffff81527b48] cpuidle_idle_call+0xd8/0x210 [ffffffff81034fee] arch_cpu_idle+0xe/0x30 [ffffffff810e7bca] cpu_startup_entry+0x14a/0x1c0 [ffffffff81051af6] start_secondary+0x1b6/0x230 Code: 89 e5 41 57 41 56 49 89 f6 41 55 41 89 fd 41 54 49 89 d4 53 48 83 ec 10 89 7d d4 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 be 02 00 00 41 f6 86 02 01 00 00 01 0f 85 58 02 00 00 49 c7 c7 28 19 01 00 RIP [ffffffff810a65f2] __queue_work+0x32/0x3c0 RSP ffff88105df43d48 CR2: 0000000000000102 The solution is to reset the QPs before the device resources are freed. This reset will change the QP state to prevent post sends and delete timers to prevent callbacks. Fixes: 0acb0cc ("IB/rdmavt: Initialize and teardown of qpn table") Link: https://lore.kernel.org/r/20200210131040.87408.38161.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
commit a72f4ac1d778f7bde93dfee69bfc23377ec3d74f upstream. Add a check that the size specified in the flow spec header doesn't cause an overflow when calculating the filter size, and thus prevent access to invalid memory. The following crash from syzkaller revealed it. kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN PTI CPU: 1 PID: 17834 Comm: syz-executor.3 Not tainted 5.5.0-rc5 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 RIP: 0010:memchr_inv+0xd3/0x330 Code: 89 f9 89 f5 83 e1 07 0f 85 f9 00 00 00 49 89 d5 49 c1 ed 03 45 85 ed 74 6f 48 89 d9 48 b8 00 00 00 00 00 fc ff df 48 c1 e9 03 <80> 3c 01 00 0f 85 0d 02 00 00 44 0f b6 e5 48 b8 01 01 01 01 01 01 RSP: 0018:ffffc9000a13fa50 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 7fff88810de9d820 RCX: 0ffff11021bd3b04 RDX: 000000000000fff8 RSI: 0000000000000000 RDI: 7fff88810de9d820 RBP: 0000000000000000 R08: ffff888110d69018 R09: 0000000000000009 R10: 0000000000000001 R11: ffffed10236267cc R12: 0000000000000004 R13: 0000000000001fff R14: ffff88810de9d820 R15: 0000000000000040 FS: 00007f9ee0e51700(0000) GS:ffff88811b100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000115ea0006 CR4: 0000000000360ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: spec_filter_size.part.16+0x34/0x50 ib_uverbs_kern_spec_to_ib_spec_filter+0x691/0x770 ib_uverbs_ex_create_flow+0x9ea/0x1b40 ib_uverbs_write+0xaa5/0xdf0 __vfs_write+0x7c/0x100 vfs_write+0x168/0x4a0 ksys_write+0xc8/0x200 do_syscall_64+0x9c/0x390 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x465b49 Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f9ee0e50c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000465b49 RDX: 00000000000003a0 RSI: 00000000200007c0 RDI: 0000000000000004 RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f9ee0e516bc R13: 00000000004ca2da R14: 000000000070deb8 R15: 00000000ffffffff Modules linked in: Dumping ftrace buffer: (ftrace buffer empty) Fixes: 94e03f1 ("IB/uverbs: Add support for flow tag") Link: https://lore.kernel.org/r/20200126171500.4623-1-leon@kernel.org Signed-off-by: Avihai Horon <avihaih@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
commit 1dd017882e01d2fcd9c5dbbf1eb376211111c393 upstream. We don't need to set pkey as valid in case that user set only one of pkey index or port number, otherwise it will be resulted in NULL pointer dereference while accessing to uninitialized pkey list. The following crash from Syzkaller revealed it. kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN PTI CPU: 1 PID: 14753 Comm: syz-executor.2 Not tainted 5.5.0-rc5 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 RIP: 0010:get_pkey_idx_qp_list+0x161/0x2d0 Code: 01 00 00 49 8b 5e 20 4c 39 e3 0f 84 b9 00 00 00 e8 e4 42 6e fe 48 8d 7b 10 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 01 0f 8e d0 00 00 00 48 8d 7d 04 48 b8 RSP: 0018:ffffc9000bc6f950 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff82c8bdec RDX: 0000000000000002 RSI: ffffc900030a8000 RDI: 0000000000000010 RBP: ffff888112c8ce80 R08: 0000000000000004 R09: fffff5200178df1f R10: 0000000000000001 R11: fffff5200178df1f R12: ffff888115dc4430 R13: ffff888115da8498 R14: ffff888115dc4410 R15: ffff888115da8000 FS: 00007f20777de700(0000) GS:ffff88811b100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b2f721000 CR3: 00000001173ca002 CR4: 0000000000360ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: port_pkey_list_insert+0xd7/0x7c0 ib_security_modify_qp+0x6fa/0xfc0 _ib_modify_qp+0x8c4/0xbf0 modify_qp+0x10da/0x16d0 ib_uverbs_modify_qp+0x9a/0x100 ib_uverbs_write+0xaa5/0xdf0 __vfs_write+0x7c/0x100 vfs_write+0x168/0x4a0 ksys_write+0xc8/0x200 do_syscall_64+0x9c/0x390 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: d291f1a ("IB/core: Enforce PKey security on QPs") Link: https://lore.kernel.org/r/20200212080651.GB679970@unreal Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Message-Id: <20200212080651.GB679970@unreal> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
[ Upstream commit 1b710b1b10eff9d46666064ea25f079f70bc67a8 ]
Sergey didn't like the locking order,
uart_port->lock -> tty_port->lock
uart_write (uart_port->lock)
__uart_start
pl011_start_tx
pl011_tx_chars
uart_write_wakeup
tty_port_tty_wakeup
tty_port_default
tty_port_tty_get (tty_port->lock)
but those code is so old, and I have no clue how to de-couple it after
checking other locks in the splat. There is an onging effort to make all
printk() as deferred, so until that happens, workaround it for now as a
short-term fix.
LTP: starting iogen01 (export LTPROOT; rwtest -N iogen01 -i 120s -s
read,write -Da -Dv -n 2 500b:$TMPDIR/doio.f1.$$
1000b:$TMPDIR/doio.f2.$$)
WARNING: possible circular locking dependency detected
------------------------------------------------------
doio/49441 is trying to acquire lock:
ffff008b7cff7290 (&(&zone->lock)->rlock){..-.}, at: rmqueue+0x138/0x2050
but task is already holding lock:
60ff000822352818 (&pool->lock/1){-.-.}, at: start_flush_work+0xd8/0x3f0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #4 (&pool->lock/1){-.-.}:
lock_acquire+0x320/0x360
_raw_spin_lock+0x64/0x80
__queue_work+0x4b4/0xa10
queue_work_on+0xac/0x11c
tty_schedule_flip+0x84/0xbc
tty_flip_buffer_push+0x1c/0x28
pty_write+0x98/0xd0
n_tty_write+0x450/0x60c
tty_write+0x338/0x474
__vfs_write+0x88/0x214
vfs_write+0x12c/0x1a4
redirected_tty_write+0x90/0xdc
do_loop_readv_writev+0x140/0x180
do_iter_write+0xe0/0x10c
vfs_writev+0x134/0x1cc
do_writev+0xbc/0x130
__arm64_sys_writev+0x58/0x8c
el0_svc_handler+0x170/0x240
el0_sync_handler+0x150/0x250
el0_sync+0x164/0x180
-> #3 (&(&port->lock)->rlock){-.-.}:
lock_acquire+0x320/0x360
_raw_spin_lock_irqsave+0x7c/0x9c
tty_port_tty_get+0x24/0x60
tty_port_default_wakeup+0x1c/0x3c
tty_port_tty_wakeup+0x34/0x40
uart_write_wakeup+0x28/0x44
pl011_tx_chars+0x1b8/0x270
pl011_start_tx+0x24/0x70
__uart_start+0x5c/0x68
uart_write+0x164/0x1c8
do_output_char+0x33c/0x348
n_tty_write+0x4bc/0x60c
tty_write+0x338/0x474
redirected_tty_write+0xc0/0xdc
do_loop_readv_writev+0x140/0x180
do_iter_write+0xe0/0x10c
vfs_writev+0x134/0x1cc
do_writev+0xbc/0x130
__arm64_sys_writev+0x58/0x8c
el0_svc_handler+0x170/0x240
el0_sync_handler+0x150/0x250
el0_sync+0x164/0x180
-> #2 (&port_lock_key){-.-.}:
lock_acquire+0x320/0x360
_raw_spin_lock+0x64/0x80
pl011_console_write+0xec/0x2cc
console_unlock+0x794/0x96c
vprintk_emit+0x260/0x31c
vprintk_default+0x54/0x7c
vprintk_func+0x218/0x254
printk+0x7c/0xa4
register_console+0x734/0x7b0
uart_add_one_port+0x734/0x834
pl011_register_port+0x6c/0xac
sbsa_uart_probe+0x234/0x2ec
platform_drv_probe+0xd4/0x124
really_probe+0x250/0x71c
driver_probe_device+0xb4/0x200
__device_attach_driver+0xd8/0x188
bus_for_each_drv+0xbc/0x110
__device_attach+0x120/0x220
device_initial_probe+0x20/0x2c
bus_probe_device+0x54/0x100
device_add+0xae8/0xc2c
platform_device_add+0x278/0x3b8
platform_device_register_full+0x238/0x2ac
acpi_create_platform_device+0x2dc/0x3a8
acpi_bus_attach+0x390/0x3cc
acpi_bus_attach+0x108/0x3cc
acpi_bus_attach+0x108/0x3cc
acpi_bus_attach+0x108/0x3cc
acpi_bus_scan+0x7c/0xb0
acpi_scan_init+0xe4/0x304
acpi_init+0x100/0x114
do_one_initcall+0x348/0x6a0
do_initcall_level+0x190/0x1fc
do_basic_setup+0x34/0x4c
kernel_init_freeable+0x19c/0x260
kernel_init+0x18/0x338
ret_from_fork+0x10/0x18
-> #1 (console_owner){-...}:
lock_acquire+0x320/0x360
console_lock_spinning_enable+0x6c/0x7c
console_unlock+0x4f8/0x96c
vprintk_emit+0x260/0x31c
vprintk_default+0x54/0x7c
vprintk_func+0x218/0x254
printk+0x7c/0xa4
get_random_u64+0x1c4/0x1dc
shuffle_pick_tail+0x40/0xac
__free_one_page+0x424/0x710
free_one_page+0x70/0x120
__free_pages_ok+0x61c/0xa94
__free_pages_core+0x1bc/0x294
memblock_free_pages+0x38/0x48
__free_pages_memory+0xcc/0xfc
__free_memory_core+0x70/0x78
free_low_memory_core_early+0x148/0x18c
memblock_free_all+0x18/0x54
mem_init+0xb4/0x17c
mm_init+0x14/0x38
start_kernel+0x19c/0x530
-> #0 (&(&zone->lock)->rlock){..-.}:
validate_chain+0xf6c/0x2e2c
__lock_acquire+0x868/0xc2c
lock_acquire+0x320/0x360
_raw_spin_lock+0x64/0x80
rmqueue+0x138/0x2050
get_page_from_freelist+0x474/0x688
__alloc_pages_nodemask+0x3b4/0x18dc
alloc_pages_current+0xd0/0xe0
alloc_slab_page+0x2b4/0x5e0
new_slab+0xc8/0x6bc
___slab_alloc+0x3b8/0x640
kmem_cache_alloc+0x4b4/0x588
__debug_object_init+0x778/0x8b4
debug_object_init_on_stack+0x40/0x50
start_flush_work+0x16c/0x3f0
__flush_work+0xb8/0x124
flush_work+0x20/0x30
xlog_cil_force_lsn+0x88/0x204 [xfs]
xfs_log_force_lsn+0x128/0x1b8 [xfs]
xfs_file_fsync+0x3c4/0x488 [xfs]
vfs_fsync_range+0xb0/0xd0
generic_write_sync+0x80/0xa0 [xfs]
xfs_file_buffered_aio_write+0x66c/0x6e4 [xfs]
xfs_file_write_iter+0x1a0/0x218 [xfs]
__vfs_write+0x1cc/0x214
vfs_write+0x12c/0x1a4
ksys_write+0xb0/0x120
__arm64_sys_write+0x54/0x88
el0_svc_handler+0x170/0x240
el0_sync_handler+0x150/0x250
el0_sync+0x164/0x180
other info that might help us debug this:
Chain exists of:
&(&zone->lock)->rlock --> &(&port->lock)->rlock --> &pool->lock/1
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&pool->lock/1);
lock(&(&port->lock)->rlock);
lock(&pool->lock/1);
lock(&(&zone->lock)->rlock);
*** DEADLOCK ***
4 locks held by doio/49441:
#0: a0ff00886fc27408 (sb_writers#8){.+.+}, at: vfs_write+0x118/0x1a4
#1: 8fff00080810dfe0 (&xfs_nondir_ilock_class){++++}, at:
xfs_ilock+0x2a8/0x300 [xfs]
#2: ffff9000129f2390 (rcu_read_lock){....}, at:
rcu_lock_acquire+0x8/0x38
#3: 60ff000822352818 (&pool->lock/1){-.-.}, at:
start_flush_work+0xd8/0x3f0
stack backtrace:
CPU: 48 PID: 49441 Comm: doio Tainted: G W
Hardware name: HPE Apollo 70 /C01_APACHE_MB , BIOS
L50_5.13_1.11 06/18/2019
Call trace:
dump_backtrace+0x0/0x248
show_stack+0x20/0x2c
dump_stack+0xe8/0x150
print_circular_bug+0x368/0x380
check_noncircular+0x28c/0x294
validate_chain+0xf6c/0x2e2c
__lock_acquire+0x868/0xc2c
lock_acquire+0x320/0x360
_raw_spin_lock+0x64/0x80
rmqueue+0x138/0x2050
get_page_from_freelist+0x474/0x688
__alloc_pages_nodemask+0x3b4/0x18dc
alloc_pages_current+0xd0/0xe0
alloc_slab_page+0x2b4/0x5e0
new_slab+0xc8/0x6bc
___slab_alloc+0x3b8/0x640
kmem_cache_alloc+0x4b4/0x588
__debug_object_init+0x778/0x8b4
debug_object_init_on_stack+0x40/0x50
start_flush_work+0x16c/0x3f0
__flush_work+0xb8/0x124
flush_work+0x20/0x30
xlog_cil_force_lsn+0x88/0x204 [xfs]
xfs_log_force_lsn+0x128/0x1b8 [xfs]
xfs_file_fsync+0x3c4/0x488 [xfs]
vfs_fsync_range+0xb0/0xd0
generic_write_sync+0x80/0xa0 [xfs]
xfs_file_buffered_aio_write+0x66c/0x6e4 [xfs]
xfs_file_write_iter+0x1a0/0x218 [xfs]
__vfs_write+0x1cc/0x214
vfs_write+0x12c/0x1a4
ksys_write+0xb0/0x120
__arm64_sys_write+0x54/0x88
el0_svc_handler+0x170/0x240
el0_sync_handler+0x150/0x250
el0_sync+0x164/0x180
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Signed-off-by: Qian Cai <cai@lca.pw>
Link: https://lore.kernel.org/r/1573679785-21068-1-git-send-email-cai@lca.pw
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
[ Upstream commit 576c75e36c689bec6a940e807bae27291ab0c0de ] With zpci_disable() working, lockdep detected a potential deadlock (lockdep output at the end). The deadlock is between recovering a PCI function via the /sys/bus/pci/devices/<dev>/recover attribute vs powering it off via /sys/bus/pci/slots/<slot>/power. The fix is analogous to the changes in commit 0ee223b ("scsi: core: Avoid that SCSI device removal through sysfs triggers a deadlock") that fixed a potential deadlock on removing a SCSI device via sysfs. [ 204.830107] ====================================================== [ 204.830109] WARNING: possible circular locking dependency detected [ 204.830111] 5.5.0-rc2-06072-gbc03ecc9a672 #6 Tainted: G W [ 204.830112] ------------------------------------------------------ [ 204.830113] bash/1034 is trying to acquire lock: [ 204.830115] 0000000192a1a610 (kn->count#200){++++}, at: kernfs_remove_by_name_ns+0x5c/0xa8 [ 204.830122] but task is already holding lock: [ 204.830123] 00000000c16134a8 (pci_rescan_remove_lock){+.+.}, at: pci_stop_and_remove_bus_device_locked+0x26/0x48 [ 204.830128] which lock already depends on the new lock. [ 204.830129] the existing dependency chain (in reverse order) is: [ 204.830130] -> #1 (pci_rescan_remove_lock){+.+.}: [ 204.830134] validate_chain+0x93a/0xd08 [ 204.830136] __lock_acquire+0x4ae/0x9d0 [ 204.830137] lock_acquire+0x114/0x280 [ 204.830140] __mutex_lock+0xa2/0x960 [ 204.830142] mutex_lock_nested+0x32/0x40 [ 204.830145] recover_store+0x4c/0xa8 [ 204.830147] kernfs_fop_write+0xe6/0x218 [ 204.830151] vfs_write+0xb0/0x1b8 [ 204.830152] ksys_write+0x6c/0xf8 [ 204.830154] system_call+0xd8/0x2d8 [ 204.830155] -> #0 (kn->count#200){++++}: [ 204.830187] check_noncircular+0x1e6/0x240 [ 204.830189] check_prev_add+0xfc/0xdb0 [ 204.830190] validate_chain+0x93a/0xd08 [ 204.830192] __lock_acquire+0x4ae/0x9d0 [ 204.830193] lock_acquire+0x114/0x280 [ 204.830194] __kernfs_remove.part.0+0x2e4/0x360 [ 204.830196] kernfs_remove_by_name_ns+0x5c/0xa8 [ 204.830198] remove_files.isra.0+0x4c/0x98 [ 204.830199] sysfs_remove_group+0x66/0xc8 [ 204.830201] sysfs_remove_groups+0x46/0x68 [ 204.830204] device_remove_attrs+0x52/0x90 [ 204.830207] device_del+0x182/0x418 [ 204.830208] pci_remove_bus_device+0x8a/0x130 [ 204.830210] pci_stop_and_remove_bus_device_locked+0x3a/0x48 [ 204.830212] disable_slot+0x68/0x100 [ 204.830213] power_write_file+0x7c/0x130 [ 204.830215] kernfs_fop_write+0xe6/0x218 [ 204.830217] vfs_write+0xb0/0x1b8 [ 204.830218] ksys_write+0x6c/0xf8 [ 204.830220] system_call+0xd8/0x2d8 [ 204.830221] other info that might help us debug this: [ 204.830223] Possible unsafe locking scenario: [ 204.830224] CPU0 CPU1 [ 204.830225] ---- ---- [ 204.830226] lock(pci_rescan_remove_lock); [ 204.830227] lock(kn->count#200); [ 204.830229] lock(pci_rescan_remove_lock); [ 204.830231] lock(kn->count#200); [ 204.830233] *** DEADLOCK *** [ 204.830234] 4 locks held by bash/1034: [ 204.830235] #0: 00000001b6fbc498 (sb_writers#4){.+.+}, at: vfs_write+0x158/0x1b8 [ 204.830239] #1: 000000018c9f5090 (&of->mutex){+.+.}, at: kernfs_fop_write+0xaa/0x218 [ 204.830242] #2: 00000001f7da0810 (kn->count#235){.+.+}, at: kernfs_fop_write+0xb6/0x218 [ 204.830245] #3: 00000000c16134a8 (pci_rescan_remove_lock){+.+.}, at: pci_stop_and_remove_bus_device_locked+0x26/0x48 [ 204.830248] stack backtrace: [ 204.830250] CPU: 2 PID: 1034 Comm: bash Tainted: G W 5.5.0-rc2-06072-gbc03ecc9a672 #6 [ 204.830252] Hardware name: IBM 8561 T01 703 (LPAR) [ 204.830253] Call Trace: [ 204.830257] [<00000000c05e10c0>] show_stack+0x88/0xf0 [ 204.830260] [<00000000c112dca4>] dump_stack+0xa4/0xe0 [ 204.830261] [<00000000c0694c06>] check_noncircular+0x1e6/0x240 [ 204.830263] [<00000000c0695bec>] check_prev_add+0xfc/0xdb0 [ 204.830264] [<00000000c06971da>] validate_chain+0x93a/0xd08 [ 204.830266] [<00000000c06994c6>] __lock_acquire+0x4ae/0x9d0 [ 204.830267] [<00000000c069867c>] lock_acquire+0x114/0x280 [ 204.830269] [<00000000c09ca15c>] __kernfs_remove.part.0+0x2e4/0x360 [ 204.830270] [<00000000c09cb5c4>] kernfs_remove_by_name_ns+0x5c/0xa8 [ 204.830272] [<00000000c09cee14>] remove_files.isra.0+0x4c/0x98 [ 204.830274] [<00000000c09cf2ae>] sysfs_remove_group+0x66/0xc8 [ 204.830276] [<00000000c09cf356>] sysfs_remove_groups+0x46/0x68 [ 204.830278] [<00000000c0e3dfe2>] device_remove_attrs+0x52/0x90 [ 204.830280] [<00000000c0e40382>] device_del+0x182/0x418 [ 204.830281] [<00000000c0dcfd7a>] pci_remove_bus_device+0x8a/0x130 [ 204.830283] [<00000000c0dcfe92>] pci_stop_and_remove_bus_device_locked+0x3a/0x48 [ 204.830285] [<00000000c0de7190>] disable_slot+0x68/0x100 [ 204.830286] [<00000000c0de6514>] power_write_file+0x7c/0x130 [ 204.830288] [<00000000c09cc846>] kernfs_fop_write+0xe6/0x218 [ 204.830290] [<00000000c08f3480>] vfs_write+0xb0/0x1b8 [ 204.830291] [<00000000c08f378c>] ksys_write+0x6c/0xf8 [ 204.830293] [<00000000c1154374>] system_call+0xd8/0x2d8 [ 204.830294] INFO: lockdep is turned off. Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
[ Upstream commit 54155ed4199c7aa3fd20866648024ab63c96d579 ] A faulty userspace that calls destroy_session() before destroying the connections can trigger the failure. This patch prevents the issue by refusing to destroy the session if there are outstanding connections. ------------[ cut here ]------------ kernel BUG at mm/slub.c:306! invalid opcode: 0000 [#1] SMP PTI CPU: 1 PID: 1224 Comm: iscsid Not tainted 5.4.0-rc2.iscsi+ #7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 RIP: 0010:__slab_free+0x181/0x350 [...] [ 1209.686056] RSP: 0018:ffffa93d4074fae0 EFLAGS: 00010246 [ 1209.686694] RAX: ffff934efa5ad800 RBX: 000000008010000a RCX: ffff934efa5ad800 [ 1209.687651] RDX: ffff934efa5ad800 RSI: ffffeb4041e96b00 RDI: ffff934efd402c40 [ 1209.688582] RBP: ffffa93d4074fb80 R08: 0000000000000001 R09: ffffffffbb5dfa26 [ 1209.689425] R10: ffff934efa5ad800 R11: 0000000000000001 R12: ffffeb4041e96b00 [ 1209.690285] R13: ffff934efa5ad800 R14: ffff934efd402c40 R15: 0000000000000000 [ 1209.691213] FS: 00007f7945dfb540(0000) GS:ffff934efda80000(0000) knlGS:0000000000000000 [ 1209.692316] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1209.693013] CR2: 000055877fd3da80 CR3: 0000000077384000 CR4: 00000000000006e0 [ 1209.693897] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1209.694773] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1209.695631] Call Trace: [ 1209.695957] ? __wake_up_common_lock+0x8a/0xc0 [ 1209.696712] iscsi_pool_free+0x26/0x40 [ 1209.697263] iscsi_session_teardown+0x2f/0xf0 [ 1209.698117] iscsi_sw_tcp_session_destroy+0x45/0x60 [ 1209.698831] iscsi_if_rx+0xd88/0x14e0 [ 1209.699370] netlink_unicast+0x16f/0x200 [ 1209.699932] netlink_sendmsg+0x21a/0x3e0 [ 1209.700446] sock_sendmsg+0x4f/0x60 [ 1209.700902] ___sys_sendmsg+0x2ae/0x320 [ 1209.701451] ? cp_new_stat+0x150/0x180 [ 1209.701922] __sys_sendmsg+0x59/0xa0 [ 1209.702357] do_syscall_64+0x52/0x160 [ 1209.702812] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 1209.703419] RIP: 0033:0x7f7946433914 [...] [ 1209.706084] RSP: 002b:00007fffb99f2378 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 1209.706994] RAX: ffffffffffffffda RBX: 000055bc869eac20 RCX: 00007f7946433914 [ 1209.708082] RDX: 0000000000000000 RSI: 00007fffb99f2390 RDI: 0000000000000005 [ 1209.709120] RBP: 00007fffb99f2390 R08: 000055bc84fe9320 R09: 00007fffb99f1f07 [ 1209.710110] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000038 [ 1209.711085] R13: 000055bc8502306e R14: 0000000000000000 R15: 0000000000000000 Modules linked in: ---[ end trace a2d933ede7f730d8 ]--- Link: https://lore.kernel.org/r/20191226203148.2172200-1-krisman@collabora.com Signed-off-by: Nick Black <nlb@google.com> Co-developed-by: Salman Qazi <sqazi@google.com> Signed-off-by: Salman Qazi <sqazi@google.com> Co-developed-by: Junho Ryu <jayr@google.com> Signed-off-by: Junho Ryu <jayr@google.com> Co-developed-by: Khazhismel Kumykov <khazhy@google.com> Signed-off-by: Khazhismel Kumykov <khazhy@google.com> Co-developed-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
[ Upstream commit a8f40111d184098cd2b3dc0c7170c42250a5fa09 ] The remoteproc_core and remoteproc drivers all initialize with module_init(). However remoteproc drivers need the rproc_class during their probe. If one of the remoteproc drivers runs init and gets through probe before remoteproc_init() runs, a NULL pointer access of rproc_class's `glue_dirs` spinlock occurs. > Unable to handle kernel NULL pointer dereference at virtual address 000000dc > pgd = c0004000 > [000000dc] *pgd=00000000 > Internal error: Oops: 5 [#1] PREEMPT ARM > Modules linked in: > CPU: 0 PID: 1 Comm: swapper Tainted: G W 4.14.106-rt56 #1 > Hardware name: Generic OMAP36xx (Flattened Device Tree) > task: c6050000 task.stack: c604a000 > PC is at rt_spin_lock+0x40/0x6c > LR is at rt_spin_lock+0x28/0x6c > pc : [<c0523c90>] lr : [<c0523c78>] psr: 60000013 > sp : c604bdc0 ip : 00000000 fp : 00000000 > r10: 00000000 r9 : c61c7c10 r8 : c6269c20 > r7 : c0905888 r6 : c6269c20 r5 : 00000000 r4 : 000000d4 > r3 : 000000dc r2 : c6050000 r1 : 00000002 r0 : 000000d4 > Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none ... > [<c0523c90>] (rt_spin_lock) from [<c03b65a4>] (get_device_parent+0x54/0x17c) > [<c03b65a4>] (get_device_parent) from [<c03b6bec>] (device_add+0xe0/0x5b4) > [<c03b6bec>] (device_add) from [<c042adf4>] (rproc_add+0x18/0xd8) > [<c042adf4>] (rproc_add) from [<c01110e4>] (my_rproc_probe+0x158/0x204) > [<c01110e4>] (my_rproc_probe) from [<c03bb6b8>] (platform_drv_probe+0x34/0x70) > [<c03bb6b8>] (platform_drv_probe) from [<c03b9dd4>] (driver_probe_device+0x2c8/0x420) > [<c03b9dd4>] (driver_probe_device) from [<c03ba02c>] (__driver_attach+0x100/0x11c) > [<c03ba02c>] (__driver_attach) from [<c03b7d08>] (bus_for_each_dev+0x7c/0xc0) > [<c03b7d08>] (bus_for_each_dev) from [<c03b910c>] (bus_add_driver+0x1cc/0x264) > [<c03b910c>] (bus_add_driver) from [<c03ba714>] (driver_register+0x78/0xf8) > [<c03ba714>] (driver_register) from [<c010181c>] (do_one_initcall+0x100/0x190) > [<c010181c>] (do_one_initcall) from [<c0800de8>] (kernel_init_freeable+0x130/0x1d0) > [<c0800de8>] (kernel_init_freeable) from [<c051eee8>] (kernel_init+0x8/0x114) > [<c051eee8>] (kernel_init) from [<c01175b0>] (ret_from_fork+0x14/0x24) > Code: e2843008 e3c2203f f5d3f000 e5922010 (e193cf9f) > ---[ end trace 0000000000000002 ]--- Signed-off-by: Brandon Maier <brandon.maier@rockwellcollins.com> Link: https://lore.kernel.org/r/20190530225223.136420-1-brandon.maier@rockwellcollins.com Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
[ Upstream commit d6152e6ec9e2171280436f7b31a571509b9287e1 ] The following crash can be seen for setting CONFIG_DEBUG_TEST_DRIVER_REMOVE=y for DT FW (which some people still use): Hisilicon MBIGEN-V2 60080000.interrupt-controller: Failed to create mbi-gen irqdomain Hisilicon MBIGEN-V2: probe of 60080000.interrupt-controller failed with error -12 [...] Unable to handle kernel paging request at virtual address 0000000000005008 Mem abort info: ESR = 0x96000004 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x00000004 CM = 0, WnR = 0 user pgtable: 4k pages, 48-bit VAs, pgdp=0000041fb9990000 [0000000000005008] pgd=0000000000000000 Internal error: Oops: 96000004 [#1] PREEMPT SMP Modules linked in: CPU: 7 PID: 1 Comm: swapper/0 Not tainted 5.5.0-rc6-00002-g3fc42638a506-dirty #1622 Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT21 Nemo 2.0 RC0 04/18/2018 pstate: 40000085 (nZcv daIf -PAN -UAO) pc : mbigen_set_type+0x38/0x60 lr : __irq_set_trigger+0x6c/0x188 sp : ffff800014b4b400 x29: ffff800014b4b400 x28: 0000000000000007 x27: 0000000000000000 x26: 0000000000000000 x25: ffff041fd83bd0d4 x24: ffff041fd83bd188 x23: 0000000000000000 x22: ffff80001193ce00 x21: 0000000000000004 x20: 0000000000000000 x19: ffff041fd83bd000 x18: ffffffffffffffff x17: 0000000000000000 x16: 0000000000000000 x15: ffff8000119098c8 x14: ffff041fb94ec91c x13: ffff041fb94ec1a1 x12: 0000000000000030 x11: 0101010101010101 x10: 0000000000000040 x9 : 0000000000000000 x8 : ffff041fb98c6680 x7 : ffff800014b4b380 x6 : ffff041fd81636c8 x5 : 0000000000000000 x4 : 000000000000025f x3 : 0000000000005000 x2 : 0000000000005008 x1 : 0000000000000004 x0 : 0000000080000000 Call trace: mbigen_set_type+0x38/0x60 __setup_irq+0x744/0x900 request_threaded_irq+0xe0/0x198 pcie_pme_probe+0x98/0x118 pcie_port_probe_service+0x38/0x78 really_probe+0xa0/0x3e0 driver_probe_device+0x58/0x100 __device_attach_driver+0x90/0xb0 bus_for_each_drv+0x64/0xc8 __device_attach+0xd8/0x138 device_initial_probe+0x10/0x18 bus_probe_device+0x90/0x98 device_add+0x4c4/0x770 device_register+0x1c/0x28 pcie_port_device_register+0x1e4/0x4f0 pcie_portdrv_probe+0x34/0xd8 local_pci_probe+0x3c/0xa0 pci_device_probe+0x128/0x1c0 really_probe+0xa0/0x3e0 driver_probe_device+0x58/0x100 __device_attach_driver+0x90/0xb0 bus_for_each_drv+0x64/0xc8 __device_attach+0xd8/0x138 device_attach+0x10/0x18 pci_bus_add_device+0x4c/0xb8 pci_bus_add_devices+0x38/0x88 pci_host_probe+0x3c/0xc0 pci_host_common_probe+0xf0/0x208 hisi_pcie_almost_ecam_probe+0x24/0x30 platform_drv_probe+0x50/0xa0 really_probe+0xa0/0x3e0 driver_probe_device+0x58/0x100 device_driver_attach+0x6c/0x90 __driver_attach+0x84/0xc8 bus_for_each_dev+0x74/0xc8 driver_attach+0x20/0x28 bus_add_driver+0x148/0x1f0 driver_register+0x60/0x110 __platform_driver_register+0x40/0x48 hisi_pcie_almost_ecam_driver_init+0x1c/0x24 The specific problem here is that the mbigen driver real probe has failed as the mbigen_of_create_domain()->of_platform_device_create() call fails, the reason for that being that we never destroyed the platform device created during the remove test dry run and there is some conflict. Since we generally would never want to unbind this driver, and to save adding a driver tear down path for that, just set the driver .suppress_bind_attrs member to avoid this possibility. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Link: https://lore.kernel.org/r/1579196323-180137-1-git-send-email-john.garry@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
…ync_trans()
[ Upstream commit 9f16ca48fc818a17de8be1f75d08e7f4addc4497 ]
I found a NULL pointer dereference in ocfs2_update_inode_fsync_trans(),
handle->h_transaction may be NULL in this situation:
ocfs2_file_write_iter
->__generic_file_write_iter
->generic_perform_write
->ocfs2_write_begin
->ocfs2_write_begin_nolock
->ocfs2_write_cluster_by_desc
->ocfs2_write_cluster
->ocfs2_mark_extent_written
->ocfs2_change_extent_flag
->ocfs2_split_extent
->ocfs2_try_to_merge_extent
->ocfs2_extend_rotate_transaction
->ocfs2_extend_trans
->jbd2_journal_restart
->jbd2__journal_restart
// handle->h_transaction is NULL here
->handle->h_transaction = NULL;
->start_this_handle
/* journal aborted due to storage
network disconnection, return error */
->return -EROFS;
/* line 3806 in ocfs2_try_to_merge_extent (),
it will ignore ret error. */
->ret = 0;
->...
->ocfs2_write_end
->ocfs2_write_end_nolock
->ocfs2_update_inode_fsync_trans
// NULL pointer dereference
->oi->i_sync_tid = handle->h_transaction->t_tid;
The information of NULL pointer dereference as follows:
JBD2: Detected IO errors while flushing file data on dm-11-45
Aborting journal on device dm-11-45.
JBD2: Error -5 detected when updating journal superblock for dm-11-45.
(dd,22081,3):ocfs2_extend_trans:474 ERROR: status = -30
(dd,22081,3):ocfs2_try_to_merge_extent:3877 ERROR: status = -30
Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000008
Mem abort info:
ESR = 0x96000004
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000004
CM = 0, WnR = 0
user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000e74e1338
[0000000000000008] pgd=0000000000000000
Internal error: Oops: 96000004 [#1] SMP
Process dd (pid: 22081, stack limit = 0x00000000584f35a9)
CPU: 3 PID: 22081 Comm: dd Kdump: loaded
Hardware name: Huawei TaiShan 2280 V2/BC82AMDD, BIOS 0.98 08/25/2019
pstate: 60400009 (nZCv daif +PAN -UAO)
pc : ocfs2_write_end_nolock+0x2b8/0x550 [ocfs2]
lr : ocfs2_write_end_nolock+0x2a0/0x550 [ocfs2]
sp : ffff0000459fba70
x29: ffff0000459fba70 x28: 0000000000000000
x27: ffff807ccf7f1000 x26: 0000000000000001
x25: ffff807bdff57970 x24: ffff807caf1d4000
x23: ffff807cc79e9000 x22: 0000000000001000
x21: 000000006c6cd000 x20: ffff0000091d9000
x19: ffff807ccb239db0 x18: ffffffffffffffff
x17: 000000000000000e x16: 0000000000000007
x15: ffff807c5e15bd78 x14: 0000000000000000
x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000000 x10: 0000000000000001
x9 : 0000000000000228 x8 : 000000000000000c
x7 : 0000000000000fff x6 : ffff807a308ed6b0
x5 : ffff7e01f10967c0 x4 : 0000000000000018
x3 : d0bc661572445600 x2 : 0000000000000000
x1 : 000000001b2e0200 x0 : 0000000000000000
Call trace:
ocfs2_write_end_nolock+0x2b8/0x550 [ocfs2]
ocfs2_write_end+0x4c/0x80 [ocfs2]
generic_perform_write+0x108/0x1a8
__generic_file_write_iter+0x158/0x1c8
ocfs2_file_write_iter+0x668/0x950 [ocfs2]
__vfs_write+0x11c/0x190
vfs_write+0xac/0x1c0
ksys_write+0x6c/0xd8
__arm64_sys_write+0x24/0x30
el0_svc_common+0x78/0x130
el0_svc_handler+0x38/0x78
el0_svc+0x8/0xc
To prevent NULL pointer dereference in this situation, we use
is_handle_aborted() before using handle->h_transaction->t_tid.
Link: http://lkml.kernel.org/r/03e750ab-9ade-83aa-b000-b9e81e34e539@huawei.com
Signed-off-by: Yan Wang <wangyan122@huawei.com>
Reviewed-by: Jun Piao <piaojun@huawei.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
swaroopbalan
pushed a commit
that referenced
this pull request
Mar 12, 2020
syzkaller reported this UAF: BUG: KASAN: use-after-free in n_tty_receive_buf_common+0x2481/0x2940 drivers/tty/n_tty.c:1741 Read of size 1 at addr ffff8880089e40e9 by task syz-executor.1/13184 CPU: 0 PID: 13184 Comm: syz-executor.1 Not tainted 5.4.7 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 Call Trace: ... kasan_report+0xe/0x20 mm/kasan/common.c:634 n_tty_receive_buf_common+0x2481/0x2940 drivers/tty/n_tty.c:1741 tty_ldisc_receive_buf+0xac/0x190 drivers/tty/tty_buffer.c:461 paste_selection+0x297/0x400 drivers/tty/vt/selection.c:372 tioclinux+0x20d/0x4e0 drivers/tty/vt/vt.c:3044 vt_ioctl+0x1bcf/0x28d0 drivers/tty/vt/vt_ioctl.c:364 tty_ioctl+0x525/0x15a0 drivers/tty/tty_io.c:2657 vfs_ioctl fs/ioctl.c:47 [inline] It is due to a race between parallel paste_selection (TIOCL_PASTESEL) and set_selection_user (TIOCL_SETSEL) invocations. One uses sel_buffer, while the other frees it and reallocates a new one for another selection. Add a mutex to close this race. The mutex takes care properly of sel_buffer and sel_buffer_lth only. The other selection global variables (like sel_start, sel_end, and sel_cons) are protected only in set_selection_user. The other functions need quite some more work to close the races of the variables there. This is going to happen later. This likely fixes (I am unsure as there is no reproducer provided) bug 206361 too. It was marked as CVE-2020-8648. Change-Id: Icb94c959993237f0557b83ab5e232af6fbafaa6a Tracked-On: PKT-3168 Signed-off-by: Jiri Slaby <jslaby@suse.cz> Reported-by: syzbot+59997e8d5cbdc486e6f6@syzkaller.appspotmail.com References: https://bugzilla.kernel.org/show_bug.cgi?id=206361 Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200210081131.23572-2-jslaby@suse.cz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
xHCI compatible USB host controllers(i.e. super-speed USB3 controllers)
can be implemented with the Debug Capability(DbC). It presents a debug
device which is fully compliant with the USB framework and provides the
equivalent of a very high performance full-duplex serial link. The debug
capability operation model and registers interface are defined in 7.6.8
of the xHCI specification, revision 1.1.
The DbC debug device shares a root port with the xHCI host. By default,
the debug capability is disabled and the root port is assigned to xHCI.
When the DbC is enabled, the root port will be assigned to the DbC debug
device, and the xHCI sees nothing on this port. This implementation uses
a sysfs node named under the xHCI device to manage the enabling
and disabling of the debug capability.This implementation also uses a
sysfs node named <dbc_func> under the xHCI device to choose either
a tty interface or a raw character interface provided by DbC.
When the debug capability is enabled, it will present a debug device
through the debug port. This debug device is fully compliant with the
USB3 framework, and it can be enumerated by a debug host on the other
end of the USB link. As soon as the debug device is configured, if the
dbc_func is set to 1, a TTY serial device named /dev/ttyDBC0 will be
created, otherwise if the dbc_func is set to 0. a Character device
named /dev/dbc_raw will be created.
One use of this link is running a login service on the debug target.
Hence it can be remote accessed by a debug host. Another use case can
probably be found in servers. It provides a peer-to-peer USB link
between two host-only machines. This provides a reasonable out-of-band
communication method between two servers.
The raw USB function exposed to the user space as a cdev can be used
for user space USB functions development.
Jira: None
Test: None
Signed-off-by: Lu Baolu baolu.lu@linux.intel.com
Signed-off-by: Rajaram Regupathy rajaram.regupathy@intel.com
Signed-off-by: K, Gururaj gururaj.k@intel.com