syzbot

BUG: workqueue leaked lock or atomic in ocfs2_truncate_log_worker

Status: upstream: reported on 2025/03/19 19:34
Reported-by: syzbot+308b761b4bf510188d07@syzkaller.appspotmail.com
First crash: 61d, last: 61d

Sample crash report:
(kworker/u4:0,9,1):ocfs2_replay_truncate_records:5967 ERROR: status = -30
(kworker/u4:0,9,1):__ocfs2_flush_truncate_log:6048 ERROR: status = -30
(kworker/u4:0,9,1):ocfs2_truncate_log_worker:6082 ERROR: status = -30
BUG: workqueue leaked lock or atomic: kworker/u4:0/0x00000000/9
     last function: ocfs2_truncate_log_worker
3 locks held by kworker/u4:0/9:
 #0: ffff0000c16dc650 (sb_internal#2){.+.+}-{0:0}, at: ocfs2_replay_truncate_records fs/ocfs2/alloc.c:5931 [inline]
 #0: ffff0000c16dc650 (sb_internal#2){.+.+}-{0:0}, at: __ocfs2_flush_truncate_log+0x414/0x10f0 fs/ocfs2/alloc.c:6045
 #1: ffff0000d9d3cce8 (&journal->j_trans_barrier){.+.+}-{3:3}, at: ocfs2_start_trans+0x45c/0x804 fs/ocfs2/journal.c:352
 #2: ffff0000daf6e990 (jbd2_handle){++++}-{0:0}, at: start_this_handle+0xf98/0x12a4 fs/jbd2/transaction.c:462
CPU: 0 PID: 9 Comm: kworker/u4:0 Not tainted 5.15.179-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
Workqueue: ocfs2_wq ocfs2_truncate_log_worker
Call trace:
 dump_backtrace+0x0/0x530 arch/arm64/kernel/stacktrace.c:152
 show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:216
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x108/0x170 lib/dump_stack.c:106
 dump_stack+0x1c/0x58 lib/dump_stack.c:113
 process_one_work+0xb7c/0x11b8 kernel/workqueue.c:2325
 worker_thread+0x910/0x1034 kernel/workqueue.c:2457
 kthread+0x37c/0x45c kernel/kthread.c:334
 ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:870

======================================================
WARNING: possible circular locking dependency detected
5.15.179-syzkaller #0 Not tainted
------------------------------------------------------
kworker/u4:0/9 is trying to acquire lock:
ffff0000c0029138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x66c/0x11b8 kernel/workqueue.c:2283

but task is already holding lock:
ffff0000daf6e990 (jbd2_handle){++++}-{0:0}, at: start_this_handle+0xf98/0x12a4 fs/jbd2/transaction.c:462

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #4 (jbd2_handle){++++}-{0:0}:
       start_this_handle+0xfc0/0x12a4 fs/jbd2/transaction.c:464
       jbd2__journal_start+0x29c/0x7b4 fs/jbd2/transaction.c:521
       __ext4_journal_start_sb+0x358/0x70c fs/ext4/ext4_jbd2.c:105
       __ext4_journal_start fs/ext4/ext4_jbd2.h:326 [inline]
       ext4_dirty_inode+0x9c/0x100 fs/ext4/inode.c:6007
       __mark_inode_dirty+0x2b0/0x10f4 fs/fs-writeback.c:2464
       generic_update_time fs/inode.c:1881 [inline]
       inode_update_time fs/inode.c:1894 [inline]
       touch_atime+0x4d0/0xa4c fs/inode.c:1966
       file_accessed include/linux/fs.h:2521 [inline]
       ext4_file_mmap+0x140/0x2fc fs/ext4/file.c:763
       call_mmap include/linux/fs.h:2177 [inline]
       mmap_file+0x6c/0xc8 mm/util.c:1092
       __mmap_region mm/mmap.c:1784 [inline]
       mmap_region+0xb24/0x1408 mm/mmap.c:2921
       do_mmap+0x698/0xdc4 mm/mmap.c:1574
       vm_mmap_pgoff+0x1a4/0x2b4 mm/util.c:551
       vm_mmap+0x90/0xbc mm/util.c:570
       elf_map+0xec/0x214 fs/binfmt_elf.c:388
       load_elf_binary+0xd48/0x21a0 fs/binfmt_elf.c:1141
       search_binary_handler fs/exec.c:1742 [inline]
       exec_binprm fs/exec.c:1783 [inline]
       bprm_execve+0x7f4/0x1578 fs/exec.c:1852
       do_execveat_common+0x668/0x814 fs/exec.c:1957
       do_execve fs/exec.c:2027 [inline]
       __do_sys_execve fs/exec.c:2103 [inline]
       __se_sys_execve fs/exec.c:2098 [inline]
       __arm64_sys_execve+0x98/0xb0 fs/exec.c:2098
       __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline]
       invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:52
       el0_svc_common+0x138/0x258 arch/arm64/kernel/syscall.c:142
       do_el0_svc+0x58/0x14c arch/arm64/kernel/syscall.c:181
       el0_svc+0x7c/0x1f0 arch/arm64/kernel/entry-common.c:608
       el0t_64_sync_handler+0x84/0xe4 arch/arm64/kernel/entry-common.c:626
       el0t_64_sync+0x1a0/0x1a4 arch/arm64/kernel/entry.S:584

-> #3 (&mm->mmap_lock){++++}-{3:3}:
       __might_fault+0xc8/0x128 mm/memory.c:5357
       _copy_to_user include/linux/uaccess.h:174 [inline]
       copy_to_user include/linux/uaccess.h:200 [inline]
       __tun_chr_ioctl+0xa78/0x2cf4 drivers/net/tun.c:3067
       tun_chr_ioctl+0x38/0x4c drivers/net/tun.c:3349
       vfs_ioctl fs/ioctl.c:51 [inline]
       __do_sys_ioctl fs/ioctl.c:874 [inline]
       __se_sys_ioctl fs/ioctl.c:860 [inline]
       __arm64_sys_ioctl+0x14c/0x1c8 fs/ioctl.c:860
       __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline]
       invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:52
       el0_svc_common+0x138/0x258 arch/arm64/kernel/syscall.c:142
       do_el0_svc+0x58/0x14c arch/arm64/kernel/syscall.c:181
       el0_svc+0x7c/0x1f0 arch/arm64/kernel/entry-common.c:608
       el0t_64_sync_handler+0x84/0xe4 arch/arm64/kernel/entry-common.c:626
       el0t_64_sync+0x1a0/0x1a4 arch/arm64/kernel/entry.S:584

-> #2 (rtnl_mutex){+.+.}-{3:3}:
       __mutex_lock_common+0x194/0x2154 kernel/locking/mutex.c:596
       __mutex_lock kernel/locking/mutex.c:729 [inline]
       mutex_lock_nested+0xa4/0xf8 kernel/locking/mutex.c:743
       rtnl_lock+0x20/0x2c net/core/rtnetlink.c:72
       linkwatch_event+0x14/0x68 net/core/link_watch.c:251
       process_one_work+0x790/0x11b8 kernel/workqueue.c:2310
       worker_thread+0x910/0x1034 kernel/workqueue.c:2457
       kthread+0x37c/0x45c kernel/kthread.c:334
       ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:870

-> #1 ((linkwatch_work).work){+.+.}-{0:0}:
       process_one_work+0x6d4/0x11b8 kernel/workqueue.c:2286
       worker_thread+0x910/0x1034 kernel/workqueue.c:2457
       kthread+0x37c/0x45c kernel/kthread.c:334
       ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:870

-> #0 ((wq_completion)events_unbound){+.+.}-{0:0}:
       check_prev_add kernel/locking/lockdep.c:3053 [inline]
       check_prevs_add kernel/locking/lockdep.c:3172 [inline]
       validate_chain kernel/locking/lockdep.c:3788 [inline]
       __lock_acquire+0x32d4/0x7638 kernel/locking/lockdep.c:5012
       lock_acquire+0x240/0x77c kernel/locking/lockdep.c:5623
       process_one_work+0x6ac/0x11b8 kernel/workqueue.c:2285
       worker_thread+0x910/0x1034 kernel/workqueue.c:2457
       kthread+0x37c/0x45c kernel/kthread.c:334
       ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:870

other info that might help us debug this:

Chain exists of:
  (wq_completion)events_unbound --> &mm->mmap_lock --> jbd2_handle

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(jbd2_handle);
                               lock(&mm->mmap_lock);
                               lock(jbd2_handle);
  lock((wq_completion)events_unbound);

 *** DEADLOCK ***

3 locks held by kworker/u4:0/9:
 #0: ffff0000c16dc650 (sb_internal#2){.+.+}-{0:0}, at: ocfs2_replay_truncate_records fs/ocfs2/alloc.c:5931 [inline]
 #0: ffff0000c16dc650 (sb_internal#2){.+.+}-{0:0}, at: __ocfs2_flush_truncate_log+0x414/0x10f0 fs/ocfs2/alloc.c:6045
 #1: ffff0000d9d3cce8 (&journal->j_trans_barrier){.+.+}-{3:3}, at: ocfs2_start_trans+0x45c/0x804 fs/ocfs2/journal.c:352
 #2: ffff0000daf6e990 (jbd2_handle){++++}-{0:0}, at: start_this_handle+0xf98/0x12a4 fs/jbd2/transaction.c:462

stack backtrace:
CPU: 0 PID: 9 Comm: kworker/u4:0 Not tainted 5.15.179-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
Workqueue: events_unbound fsnotify_connector_destroy_workfn
Call trace:
 dump_backtrace+0x0/0x530 arch/arm64/kernel/stacktrace.c:152
 show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:216
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x108/0x170 lib/dump_stack.c:106
 dump_stack+0x1c/0x58 lib/dump_stack.c:113
 print_circular_bug+0x150/0x1b8 kernel/locking/lockdep.c:2011
 check_noncircular+0x2cc/0x378 kernel/locking/lockdep.c:2133
 check_prev_add kernel/locking/lockdep.c:3053 [inline]
 check_prevs_add kernel/locking/lockdep.c:3172 [inline]
 validate_chain kernel/locking/lockdep.c:3788 [inline]
 __lock_acquire+0x32d4/0x7638 kernel/locking/lockdep.c:5012
 lock_acquire+0x240/0x77c kernel/locking/lockdep.c:5623
 process_one_work+0x6ac/0x11b8 kernel/workqueue.c:2285
 worker_thread+0x910/0x1034 kernel/workqueue.c:2457
 kthread+0x37c/0x45c kernel/kthread.c:334
 ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:870
------------[ cut here ]------------
WARNING: CPU: 0 PID: 9 at fs/jbd2/transaction.c:615 jbd2_journal_start_reserved+0x2d8/0x56c fs/jbd2/transaction.c:616
Modules linked in:
CPU: 0 PID: 9 Comm: kworker/u4:0 Not tainted 5.15.179-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
Workqueue: ext4-rsv-conversion ext4_end_io_rsv_work
pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : jbd2_journal_start_reserved+0x2d8/0x56c fs/jbd2/transaction.c:616
lr : jbd2_journal_start_reserved+0x2d4/0x56c fs/jbd2/transaction.c:615
sp : ffff80001bd07940
x29: ffff80001bd07950 x28: 1fffe00018a59e3b x27: 1fffe00018a59e37
x26: ffff0000c0948000 x25: dfff800000000000 x24: ffff0000c0949168
x23: ffff0000c52cf1dc x22: ffff0000d4258000 x21: ffff0000c52cf1b8
x20: 000000000000000b x19: 0000000000001324 x18: 1fffe0003682e78e
x17: 1fffe0003682e78e x16: ffff800011b5a2f4 x15: ffff800014c0f2a0
x14: ffff0001b4173c80 x13: 0000000000000000 x12: 0000000000000001
x11: 0000000000000000 x10: 0000000000000000 x9 : ffff0000c0948000
x8 : ffff800008ed8544 x7 : 0000000000000000 x6 : ffff800008e0c8b0
x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffff800008046154
x2 : 0000000000001324 x1 : 0000000000000004 x0 : ffff0000c52cf1b8
Call trace:
 jbd2_journal_start_reserved+0x2d8/0x56c fs/jbd2/transaction.c:616
 __ext4_journal_start_reserved+0x3b4/0x744 fs/ext4/ext4_jbd2.c:154
 ext4_convert_unwritten_io_end_vec+0x40/0x170 fs/ext4/extents.c:4899
 ext4_end_io_end fs/ext4/page-io.c:186 [inline]
 ext4_do_flush_completed_IO fs/ext4/page-io.c:259 [inline]
 ext4_end_io_rsv_work+0x2cc/0x5b0 fs/ext4/page-io.c:273
 process_one_work+0x790/0x11b8 kernel/workqueue.c:2310
 worker_thread+0x910/0x1034 kernel/workqueue.c:2457
 kthread+0x37c/0x45c kernel/kthread.c:334
 ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:870
irq event stamp: 458253
hardirqs last  enabled at (458253): [<ffff800011c31eb0>] __raw_spin_unlock_irq include/linux/spinlock_api_smp.h:168 [inline]
hardirqs last  enabled at (458253): [<ffff800011c31eb0>] _raw_spin_unlock_irq+0x9c/0x134 kernel/locking/spinlock.c:202
hardirqs last disabled at (458252): [<ffff800011c318cc>] __raw_spin_lock_irq include/linux/spinlock_api_smp.h:126 [inline]
hardirqs last disabled at (458252): [<ffff800011c318cc>] _raw_spin_lock_irq+0x38/0x13c kernel/locking/spinlock.c:170
softirqs last  enabled at (458242): [<ffff8000081b70b8>] softirq_handle_end kernel/softirq.c:401 [inline]
softirqs last  enabled at (458242): [<ffff8000081b70b8>] handle_softirqs+0xb88/0xdbc kernel/softirq.c:586
softirqs last disabled at (458189): [<ffff8000081b7750>] __do_softirq kernel/softirq.c:592 [inline]
softirqs last disabled at (458189): [<ffff8000081b7750>] do_softirq_own_stack include/asm-generic/softirq_stack.h:10 [inline]
softirqs last disabled at (458189): [<ffff8000081b7750>] invoke_softirq kernel/softirq.c:439 [inline]
softirqs last disabled at (458189): [<ffff8000081b7750>] __irq_exit_rcu+0x268/0x4d8 kernel/softirq.c:641
---[ end trace 01a1a18efeafc328 ]---
EXT4-fs (nvme0n1p2): failed to convert unwritten extents to written extents -- potential data loss!  (inode 1734, error -5)

Crashes (2):

2025/03/19 19:36  linux-5.15.y  commit 0c935c049b5c  syzkaller e20d7b13  manager ci2-linux-5-15-kasan-arm64
    BUG: workqueue leaked lock or atomic in ocfs2_truncate_log_worker
2025/03/19 19:34  linux-5.15.y  commit 0c935c049b5c  syzkaller e20d7b13  manager ci2-linux-5-15-kasan-arm64
    BUG: workqueue leaked lock or atomic in ocfs2_truncate_log_worker

Each crash has a console log, report, and .config, plus assets (disk image, vmlinux, kernel image). No syz or C reproducer is available.