syzbot


ID Workflow Result Correct Bug Created Started Finished Revision Error
d82fc7a6-619f-4455-b288-312a73dd876b assessment-kcsan Benign: ✅  Confident: ✅  KCSAN: data-race in _copy_to_iter / copy_folio_from_iter_atomic (3) 2026/01/16 01:23 2026/01/16 01:23 2026/01/16 01:44 a9d6a79219801d2130df3b1a792c57f0e5428e9f

Explanation:
The data race reported by KCSAN occurs between `shmem_file_read_iter` and `shmem_file_write_iter` (called via `generic_perform_write`). Specifically, one task is reading from a shmem folio while another task is writing to the same folio.

### Analysis:
1.  **Memory Accesses**: 
    *   **Task 13848** is performing a read in `_copy_to_iter`, which is called by `shmem_file_read_iter` through `copy_folio_to_iter`. This read access is from a shmem folio being copied to a destination (in this case, a pipe buffer during a `sendfile` operation).
    *   **Task 13854** is performing a write in `copy_folio_from_iter_atomic`, called by `generic_perform_write` within `shmem_file_write_iter`. This write access is to the same shmem folio.
2.  **Locking**:
    *   `shmem_file_write_iter` takes the `inode_lock(inode)` and locks the folio (via `shmem_write_begin`) during the write.
    *   `shmem_file_read_iter` does **not** take the `inode_lock`. It acquires the folio but explicitly unlocks it (`folio_unlock(folio)`) before copying the data to the user/destination iterator. The code contains a comment acknowledging this: `reads (unlike writes) are called without i_rwsem protection against truncate`.
3.  **Nature of the Race**:
    *   The race is on the actual data contents of the file stored in the shmem folio. 
    *   Because the reader does not hold the inode lock or the folio lock during the copy, a concurrent writer can modify the same bytes that the reader is currently copying.
    *   The addresses involved (`0xffff888155bc8a48` and `0xffff888155bc8a08`) are within the same page in the kernel's direct map.
4.  **Consequences**:
    *   **Kernel Integrity**: Standard `memcpy` operations on valid, reference-counted memory (the folio is kept alive by both tasks) do not cause kernel crashes or memory corruption, even if they race.
    *   **Data Integrity**: A reader might see "torn" data (a mix of old and new values). In the context of POSIX file I/O, this is expected behavior for concurrent, unsynchronized reads and writes to the same file region. The kernel itself does not use this data for control flow or internal state; it merely acts as a transport.

### Conclusion:
The data race is on the file content itself and is a result of a deliberate performance optimization in shmem (allowing concurrent reads without the inode lock). It does not pose a risk to the kernel's stability or correctness.

Final Verdict: **Benign**

Crash report:
==================================================================
BUG: KCSAN: data-race in _copy_to_iter / copy_folio_from_iter_atomic

read to 0xffff888155bc8a48 of 1 bytes by task 13848 on cpu 0:
 memcpy_to_iter lib/iov_iter.c:77 [inline]
 iterate_bvec include/linux/iov_iter.h:123 [inline]
 iterate_and_advance2 include/linux/iov_iter.h:306 [inline]
 iterate_and_advance include/linux/iov_iter.h:330 [inline]
 _copy_to_iter+0x602/0xe70 lib/iov_iter.c:197
 copy_page_to_iter+0x18f/0x2d0 lib/iov_iter.c:374
 copy_folio_to_iter include/linux/uio.h:204 [inline]
 shmem_file_read_iter+0x2d6/0x540 mm/shmem.c:3438
 copy_splice_read+0x442/0x660 fs/splice.c:363
 do_splice_read fs/splice.c:981 [inline]
 splice_direct_to_actor+0x290/0x680 fs/splice.c:1086
 do_splice_direct_actor fs/splice.c:1204 [inline]
 do_splice_direct+0xda/0x150 fs/splice.c:1230
 do_sendfile+0x380/0x650 fs/read_write.c:1370
 __do_sys_sendfile64 fs/read_write.c:1431 [inline]
 __se_sys_sendfile64 fs/read_write.c:1417 [inline]
 __x64_sys_sendfile64+0x105/0x150 fs/read_write.c:1417
 x64_sys_call+0x2db1/0x3000 arch/x86/include/generated/asm/syscalls_64.h:41
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xca/0x2b0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

write to 0xffff888155bc8a08 of 66 bytes by task 13854 on cpu 1:
 memcpy_from_iter lib/iov_iter.c:85 [inline]
 iterate_bvec include/linux/iov_iter.h:123 [inline]
 iterate_and_advance2 include/linux/iov_iter.h:306 [inline]
 iterate_and_advance include/linux/iov_iter.h:330 [inline]
 __copy_from_iter lib/iov_iter.c:261 [inline]
 copy_folio_from_iter_atomic+0x760/0x1150 lib/iov_iter.c:491
 generic_perform_write+0x2c2/0x490 mm/filemap.c:4332
 shmem_file_write_iter+0xc5/0xf0 mm/shmem.c:3490
 iter_file_splice_write+0x66b/0xa20 fs/splice.c:738
 do_splice_from fs/splice.c:938 [inline]
 direct_splice_actor+0x156/0x2a0 fs/splice.c:1161
 splice_direct_to_actor+0x312/0x680 fs/splice.c:1105
 do_splice_direct_actor fs/splice.c:1204 [inline]
 do_splice_direct+0xda/0x150 fs/splice.c:1230
 do_sendfile+0x380/0x650 fs/read_write.c:1370
 __do_sys_sendfile64 fs/read_write.c:1431 [inline]
 __se_sys_sendfile64 fs/read_write.c:1417 [inline]
 __x64_sys_sendfile64+0x105/0x150 fs/read_write.c:1417
 x64_sys_call+0x2db1/0x3000 arch/x86/include/generated/asm/syscalls_64.h:41
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xca/0x2b0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 UID: 0 PID: 13854 Comm: syz.1.2788 Not tainted syzkaller #0 PREEMPT(voluntary) 
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
==================================================================

Trajectory:
Seq Timestamp Type Name Duration
0/0 2026/01/16 01:23 flow assessment-kcsan 21m
Model: unknown

Results:
map[Benign:true Confident:true Explanation:(same as the explanation above)]

1/1 2026/01/16 01:23 action kernel-checkouter 0m
Model: unknown

Results:
map[KernelSrc:/usr/local/google/home/dvyukov/syzkaller/agent/workdir/cache/src/30caab9703681a9dd401b6cd9636d83c8b199bfa]

2/1 2026/01/16 01:24 action kernel-builder 4m
Model: unknown

Results:
map[KernelObj:/usr/local/google/home/dvyukov/syzkaller/agent/workdir/cache/build/e25083339169d51b486d91871aaa529a9be6e6aa]

3/1 2026/01/16 01:28 action codesearch-prepare 14m
Model: unknown

Results:
map[Index:codesearch-index]

4/1 2026/01/16 01:42 agent expert 2m
Model: unknown

Results:
map[Benign:true Confident:true]

Instruction:
You are an experienced Linux kernel developer tasked with determining if the given kernel
data race is benign or not. The data race report is from the KCSAN tool.
It contains 2 stack traces of the memory accesses that constitute a data race.

A "benign" data race is one on a simple int/bool variable or similar field,
where the accesses are not supposed to be protected by any mutual exclusion primitive.
Common examples of such "benign" data races are accesses to flag fields,
statistics counters, and the like. A "benign" data race does not lead to memory corruption or a crash
with a conservative compiler that compiles memory accesses to primitive types
effectively as atomic.

A non-benign (or "harmful") data race can lead to corruption or a crash even with
a conservative compiler that compiles memory accesses to primitive types
effectively as atomic. A common example of a "harmful" data race is a race on
a complex container (list/hashmap/etc.), where accesses are supposed to be protected
by a mutual exclusion primitive.

In the final reply, explain why you think the given data race is benign or harmful.

Use the provided tools to confirm any assumptions, the variables/fields being accessed, etc.
In particular, don't make assumptions about the kernel source code;
use the codesearch tools to read the actual source code.


Use the set-results tool to provide the results of the analysis.
It must be called exactly once before the final reply.
Ignore the results of this tool.

Prompt:
The data race report is:

==================================================================
BUG: KCSAN: data-race in _copy_to_iter / copy_folio_from_iter_atomic

read to 0xffff888155bc8a48 of 1 bytes by task 13848 on cpu 0:
 memcpy_to_iter lib/iov_iter.c:77 [inline]
 iterate_bvec include/linux/iov_iter.h:123 [inline]
 iterate_and_advance2 include/linux/iov_iter.h:306 [inline]
 iterate_and_advance include/linux/iov_iter.h:330 [inline]
 _copy_to_iter+0x602/0xe70 lib/iov_iter.c:197
 copy_page_to_iter+0x18f/0x2d0 lib/iov_iter.c:374
 copy_folio_to_iter include/linux/uio.h:204 [inline]
 shmem_file_read_iter+0x2d6/0x540 mm/shmem.c:3438
 copy_splice_read+0x442/0x660 fs/splice.c:363
 do_splice_read fs/splice.c:981 [inline]
 splice_direct_to_actor+0x290/0x680 fs/splice.c:1086
 do_splice_direct_actor fs/splice.c:1204 [inline]
 do_splice_direct+0xda/0x150 fs/splice.c:1230
 do_sendfile+0x380/0x650 fs/read_write.c:1370
 __do_sys_sendfile64 fs/read_write.c:1431 [inline]
 __se_sys_sendfile64 fs/read_write.c:1417 [inline]
 __x64_sys_sendfile64+0x105/0x150 fs/read_write.c:1417
 x64_sys_call+0x2db1/0x3000 arch/x86/include/generated/asm/syscalls_64.h:41
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xca/0x2b0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

write to 0xffff888155bc8a08 of 66 bytes by task 13854 on cpu 1:
 memcpy_from_iter lib/iov_iter.c:85 [inline]
 iterate_bvec include/linux/iov_iter.h:123 [inline]
 iterate_and_advance2 include/linux/iov_iter.h:306 [inline]
 iterate_and_advance include/linux/iov_iter.h:330 [inline]
 __copy_from_iter lib/iov_iter.c:261 [inline]
 copy_folio_from_iter_atomic+0x760/0x1150 lib/iov_iter.c:491
 generic_perform_write+0x2c2/0x490 mm/filemap.c:4332
 shmem_file_write_iter+0xc5/0xf0 mm/shmem.c:3490
 iter_file_splice_write+0x66b/0xa20 fs/splice.c:738
 do_splice_from fs/splice.c:938 [inline]
 direct_splice_actor+0x156/0x2a0 fs/splice.c:1161
 splice_direct_to_actor+0x312/0x680 fs/splice.c:1105
 do_splice_direct_actor fs/splice.c:1204 [inline]
 do_splice_direct+0xda/0x150 fs/splice.c:1230
 do_sendfile+0x380/0x650 fs/read_write.c:1370
 __do_sys_sendfile64 fs/read_write.c:1431 [inline]
 __se_sys_sendfile64 fs/read_write.c:1417 [inline]
 __x64_sys_sendfile64+0x105/0x150 fs/read_write.c:1417
 x64_sys_call+0x2db1/0x3000 arch/x86/include/generated/asm/syscalls_64.h:41
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xca/0x2b0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 UID: 0 PID: 13854 Comm: syz.1.2788 Not tainted syzkaller #0 PREEMPT(voluntary) 
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
==================================================================


Reply:
(same as the explanation above)

5/2 2026/01/16 01:42 llm expert 0m
Model: unknown

Thoughts:
**Data Race Analysis: Shmem File I/O**

Okay, so the report indicates a data race. My initial read of it suggests a simultaneous read and write to the same memory location, which seems to be occurring within a shmem file. The read path starts with `_copy_to_iter` called by `shmem_file_read_iter`, and the write path starts with `copy_folio_from_iter_atomic` called by `shmem_file_write_iter`. Both paths ultimately trace back to `do_sendfile`, so this is likely file I/O related.

The call stacks clearly show these operations happen on the shmem file. The suspicion is that one process is reading and another is writing, leading to the race condition.

Looking at the memory addresses, the read is at `0xffff888155bc8a48` and the write at `0xffff888155bc8a08`. This is important: the read is of 1 byte, while the write is of 66 bytes. The addresses are quite close. A quick calculation reveals the read's memory location is just 64 bytes past the write's starting location. This indicates a definite overlap: the read is *within* the range of the write. Bingo!

Re-examining the stack traces, the read path uses `_copy_to_iter`, which calls `memcpy_to_iter`, and the write path uses `copy_folio_from_iter_atomic`, which calls `memcpy_from_iter`. Now, I need to make sure I'm thinking about this correctly: in `memcpy_to_iter`, it is reading *from* the source and writing *to* the iterator. The report states "read to ...", this would be the `from` argument in the function.

In the case of the write, `memcpy_from_iter` writes *to* the `to` argument. The write call is to the folio. So `shmem_file_read_iter` is reading from a folio, while `copy_folio_from_iter_atomic` is writing to a folio.

Since these paths are sharing the same folio memory, it confirms my hypothesis. This *is* a data race. Typically, file I/O operations are protected by inode locks, but in this scenario, either the locking scheme for `shmem_file_read_iter` and `shmem_file_write_iter` is different, or, and this is a key point, they might be accessing different shmem files that happen to share a page. However, they are both `shmem`, so I'd expect similar locking behavior.

Now, I'm going to dig into the `mm/shmem.c` file and focus on the `shmem_file_read_iter` and `shmem_file_write_iter` functions to understand how they handle concurrency. This should give me a better understanding of *why* this data race is happening.



6/2 2026/01/16 01:42 tool codesearch-file-index 0m
Model: unknown

Args:
map[SourceFile:mm/shmem.c]

Results:
map[Entities:[map[Kind:function Name:SHMEM_SB] map[Kind:function Name:__shmem_file_setup] map[Kind:function Name:__shmem_get_inode] map[Kind:function Name:__shmem_zero_setup] map[Kind:function Name:limit_gfp_mask] map[Kind:function Name:shmem_acct_blocks] map[Kind:function Name:shmem_acct_size] map[Kind:function Name:shmem_add_to_page_cache] map[Kind:function Name:shmem_alloc_and_add_folio] map[Kind:function Name:shmem_alloc_folio] map[Kind:function Name:shmem_alloc_inode] map[Kind:function Name:shmem_charge] map[Kind:function Name:shmem_confirm_swap] map[Kind:function Name:shmem_create] map[Kind:function Name:shmem_default_max_blocks] map[Kind:function Name:shmem_default_max_inodes] map[Kind:function Name:shmem_delete_from_page_cache] map[Kind:function Name:shmem_destroy_inode] map[Kind:function Name:shmem_destroy_inodecache] map[Kind:function Name:shmem_encode_fh] map[Kind:function Name:shmem_error_remove_folio] map[Kind:function Name:shmem_evict_inode] map[Kind:function Name:shmem_falloc_wait] map[Kind:function Name:shmem_fallocate] map[Kind:function Name:shmem_fault] map[Kind:function Name:shmem_fh_to_dentry] map[Kind:function Name:shmem_file_llseek] map[Kind:function Name:shmem_file_open] map[Kind:function Name:shmem_file_read_iter] map[Kind:function Name:shmem_file_setup] map[Kind:function Name:shmem_file_setup] map[Kind:function Name:shmem_file_setup_with_mnt] map[Kind:function Name:shmem_file_setup_with_mnt] map[Kind:function Name:shmem_file_splice_read] map[Kind:function Name:shmem_file_write_iter] map[Kind:function Name:shmem_fileattr_get] map[Kind:function Name:shmem_fileattr_set] map[Kind:function Name:shmem_fill_super] map[Kind:function Name:shmem_find_alias] map[Kind:function Name:shmem_find_swap_entries] map[Kind:function Name:shmem_free_fc] map[Kind:function Name:shmem_free_in_core_inode] map[Kind:function Name:shmem_free_inode] map[Kind:function Name:shmem_free_swap] map[Kind:function Name:shmem_get_folio] map[Kind:function Name:shmem_get_folio] 
map[Kind:function Name:shmem_get_folio_gfp] map[Kind:function Name:shmem_get_inode] map[Kind:function Name:shmem_get_link] map[Kind:function Name:shmem_get_offset_ctx] map[Kind:function Name:shmem_get_parent] map[Kind:function Name:shmem_get_partial_folio] map[Kind:function Name:shmem_get_pgoff_policy] map[Kind:function Name:shmem_get_pgoff_policy] map[Kind:function Name:shmem_get_policy] map[Kind:function Name:shmem_get_sbmpol] map[Kind:function Name:shmem_get_tree] map[Kind:function Name:shmem_get_unmapped_area] map[Kind:function Name:shmem_getattr] map[Kind:function Name:shmem_huge_global_enabled] map[Kind:function Name:shmem_init] map[Kind:function Name:shmem_init_fs_context] map[Kind:function Name:shmem_init_inode] map[Kind:function Name:shmem_init_inodecache] map[Kind:function Name:shmem_initxattrs] map[Kind:function Name:shmem_initxattrs] map[Kind:function Name:shmem_inode_acct_blocks] map[Kind:function Name:shmem_inode_casefold_flags] map[Kind:function Name:shmem_inode_unacct_blocks] map[Kind:function Name:shmem_kernel_file_setup] map[Kind:function Name:shmem_kernel_file_setup] map[Kind:function Name:shmem_link] map[Kind:function Name:shmem_listxattr] map[Kind:function Name:shmem_lock] map[Kind:function Name:shmem_mapping] map[Kind:function Name:shmem_mapping] map[Kind:function Name:shmem_match] map[Kind:function Name:shmem_mkdir] map[Kind:function Name:shmem_mknod] map[Kind:function Name:shmem_mmap_prepare] map[Kind:function Name:shmem_next_opt] map[Kind:function Name:shmem_parse_monolithic] map[Kind:function Name:shmem_parse_one] map[Kind:function Name:shmem_parse_opt_casefold] map[Kind:function Name:shmem_partial_swap_usage] map[Kind:function Name:shmem_put_link] map[Kind:function Name:shmem_put_super] map[Kind:function Name:shmem_reacct_size] map[Kind:function Name:shmem_read_folio_gfp] map[Kind:function Name:shmem_read_folio_gfp] map[Kind:function Name:shmem_read_mapping_page_gfp] map[Kind:function Name:shmem_read_mapping_page_gfp] map[Kind:function 
Name:shmem_recalc_inode] map[Kind:function Name:shmem_reconfigure] map[Kind:function Name:shmem_rename2] map[Kind:function Name:shmem_replace_entry] map[Kind:function Name:shmem_replace_folio] map[Kind:function Name:shmem_reserve_inode] map[Kind:function Name:shmem_rmdir] map[Kind:function Name:shmem_set_folio_swapin_error] map[Kind:function Name:shmem_set_inode_flags] map[Kind:function Name:shmem_set_policy] map[Kind:function Name:shmem_setattr] map[Kind:function Name:shmem_should_replace_folio] map[Kind:function Name:shmem_show_mpol] map[Kind:function Name:shmem_show_options] map[Kind:function Name:shmem_split_large_entry] map[Kind:function Name:shmem_statfs] map[Kind:function Name:shmem_suitable_orders] map[Kind:function Name:shmem_swap_alloc_folio] map[Kind:function Name:shmem_swap_usage] map[Kind:function Name:shmem_swapin_cluster] map[Kind:function Name:shmem_swapin_folio] map[Kind:function Name:shmem_swapin_folio] map[Kind:function Name:shmem_symlink] map[Kind:function Name:shmem_tmpfile] map[Kind:function Name:shmem_truncate_range] map[Kind:function Name:shmem_truncate_range] map[Kind:function Name:shmem_unacct_blocks] map[Kind:function Name:shmem_unacct_size] map[Kind:function Name:shmem_uncharge] map[Kind:function Name:shmem_undo_range] map[Kind:function Name:shmem_unlink] map[Kind:function Name:shmem_unlock_mapping] map[Kind:function Name:shmem_unuse] map[Kind:function Name:shmem_unuse_inode] map[Kind:function Name:shmem_unuse_swap_entries] map[Kind:function Name:shmem_unused_huge_shrink] map[Kind:function Name:shmem_update_stats] map[Kind:function Name:shmem_whiteout] map[Kind:function Name:shmem_write_begin] map[Kind:function Name:shmem_write_end] map[Kind:function Name:shmem_writeout] map[Kind:function Name:shmem_writeout] map[Kind:function Name:shmem_xattr_handler_get] map[Kind:function Name:shmem_xattr_handler_set] map[Kind:function Name:shmem_zero_setup] map[Kind:function Name:shmem_zero_setup_desc] map[Kind:function Name:splice_zeropage_into_pipe] 
map[Kind:function Name:synchronous_wake_function] map[Kind:function Name:tmpfs_sysfs_init] map[Kind:function Name:vma_is_anon_shmem] map[Kind:function Name:vma_is_shmem] map[Kind:function Name:zero_pipe_buf_get] map[Kind:function Name:zero_pipe_buf_release] map[Kind:function Name:zero_pipe_buf_try_steal]] Missing:false]

7/2 2026/01/16 01:42 llm expert 0m
Model: unknown

Thoughts:
**Diving into Shared Memory Write Operations**

Okay, so I need to understand what's happening during these shared memory write operations. My initial thought is to zero in on `shmem_file_read_iter` and `shmem_file_write_iter` within `mm/shmem.c`. That's the obvious starting point. I also recall `generic_perform_write` in `mm/filemap.c`, which is called by `shmem_file_write_iter`, so that needs investigation as well.

Wait a minute, does `shmem_file_write_iter` actually call `generic_perform_write`? Let me double-check that. Yes, absolutely. Time to consult the stack trace again for clarity.

Ah, here's the sequence for Task 13854: `shmem_file_write_iter` -> `generic_perform_write` -> `copy_folio_from_iter_atomic`. Now it's perfectly clear. Let's delve into the specifics of the `shmem_file_write_iter` implementation itself to grasp the flow.



8/2 2026/01/16 01:42 tool codesearch-definition-source 0m
Model: unknown

Args:
map[IncludeLines:true Name:shmem_file_write_iter SourceFile:mm/shmem.c]

Results:
map[Missing:false SourceCode:3473:	static ssize_t shmem_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
3474:	{
3475:		struct file *file = iocb->ki_filp;
3476:		struct inode *inode = file->f_mapping->host;
3477:		ssize_t ret;
3478:	
3479:		inode_lock(inode);
3480:		ret = generic_write_checks(iocb, from);
3481:		if (ret <= 0)
3482:			goto unlock;
3483:		ret = file_remove_privs(file);
3484:		if (ret)
3485:			goto unlock;
3486:		ret = file_update_time(file);
3487:		if (ret)
3488:			goto unlock;
3489:		ret = generic_perform_write(iocb, from);
3490:	unlock:
3491:		inode_unlock(inode);
3492:		return ret;
3493:	}
 SourceFile:mm/shmem.c]

9/2 2026/01/16 01:42 llm expert 0m
Model: unknown

Thoughts:
Okay, here's my summary, taking the provided thought process and presenting it as my own:

**Working with Shared Memory and Inodes**

Alright, let's see... first, I need to grab that `inode_lock(inode)`. Got to make sure we're thread-safe when dealing with those inodes. Now that that's out of the way, I think it's time to dig into `shmem_file_read_iter`. I'm curious how they've handled the read iteration in shared memory files; there's always something interesting to learn from that area. Let's see what we've got.



10/2 2026/01/16 01:42 tool codesearch-definition-source 0m
Model: unknown

Args:
map[IncludeLines:true Name:shmem_file_read_iter SourceFile:mm/shmem.c]

Results:
map[Missing:false SourceCode:3353:	static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
3354:	{
3355:		struct file *file = iocb->ki_filp;
3356:		struct inode *inode = file_inode(file);
3357:		struct address_space *mapping = inode->i_mapping;
3358:		pgoff_t index;
3359:		unsigned long offset;
3360:		int error = 0;
3361:		ssize_t retval = 0;
3362:	
3363:		for (;;) {
3364:			struct folio *folio = NULL;
3365:			struct page *page = NULL;
3366:			unsigned long nr, ret;
3367:			loff_t end_offset, i_size = i_size_read(inode);
3368:			bool fallback_page_copy = false;
3369:			size_t fsize;
3370:	
3371:			if (unlikely(iocb->ki_pos >= i_size))
3372:				break;
3373:	
3374:			index = iocb->ki_pos >> PAGE_SHIFT;
3375:			error = shmem_get_folio(inode, index, 0, &folio, SGP_READ);
3376:			if (error) {
3377:				if (error == -EINVAL)
3378:					error = 0;
3379:				break;
3380:			}
3381:			if (folio) {
3382:				folio_unlock(folio);
3383:	
3384:				page = folio_file_page(folio, index);
3385:				if (PageHWPoison(page)) {
3386:					folio_put(folio);
3387:					error = -EIO;
3388:					break;
3389:				}
3390:	
3391:				if (folio_test_large(folio) &&
3392:				    folio_test_has_hwpoisoned(folio))
3393:					fallback_page_copy = true;
3394:			}
3395:	
3396:			/*
3397:			 * We must evaluate after, since reads (unlike writes)
3398:			 * are called without i_rwsem protection against truncate
3399:			 */
3400:			i_size = i_size_read(inode);
3401:			if (unlikely(iocb->ki_pos >= i_size)) {
3402:				if (folio)
3403:					folio_put(folio);
3404:				break;
3405:			}
3406:			end_offset = min_t(loff_t, i_size, iocb->ki_pos + to->count);
3407:			if (folio && likely(!fallback_page_copy))
3408:				fsize = folio_size(folio);
3409:			else
3410:				fsize = PAGE_SIZE;
3411:			offset = iocb->ki_pos & (fsize - 1);
3412:			nr = min_t(loff_t, end_offset - iocb->ki_pos, fsize - offset);
3413:	
3414:			if (folio) {
3415:				/*
3416:				 * If users can be writing to this page using arbitrary
3417:				 * virtual addresses, take care about potential aliasing
3418:				 * before reading the page on the kernel side.
3419:				 */
3420:				if (mapping_writably_mapped(mapping)) {
3421:					if (likely(!fallback_page_copy))
3422:						flush_dcache_folio(folio);
3423:					else
3424:						flush_dcache_page(page);
3425:				}
3426:	
3427:				/*
3428:				 * Mark the folio accessed if we read the beginning.
3429:				 */
3430:				if (!offset)
3431:					folio_mark_accessed(folio);
3432:				/*
3433:				 * Ok, we have the page, and it's up-to-date, so
3434:				 * now we can copy it to user space...
3435:				 */
3436:				if (likely(!fallback_page_copy))
3437:					ret = copy_folio_to_iter(folio, offset, nr, to);
3438:				else
3439:					ret = copy_page_to_iter(page, offset, nr, to);
3440:				folio_put(folio);
3441:			} else if (user_backed_iter(to)) {
3442:				/*
3443:				 * Copy to user tends to be so well optimized, but
3444:				 * clear_user() not so much, that it is noticeably
3445:				 * faster to copy the zero page instead of clearing.
3446:				 */
3447:				ret = copy_page_to_iter(ZERO_PAGE(0), offset, nr, to);
3448:			} else {
3449:				/*
3450:				 * But submitting the same page twice in a row to
3451:				 * splice() - or others? - can result in confusion:
3452:				 * so don't attempt that optimization on pipes etc.
3453:				 */
3454:				ret = iov_iter_zero(nr, to);
3455:			}
3456:	
3457:			retval += ret;
3458:			iocb->ki_pos += ret;
3459:	
3460:			if (!iov_iter_count(to))
3461:				break;
3462:			if (ret < nr) {
3463:				error = -EFAULT;
3464:				break;
3465:			}
3466:			cond_resched();
3467:		}
3468:	
3469:		file_accessed(file);
3470:		return retval ? retval : error;
3471:	}
 SourceFile:mm/shmem.c]

11/2 2026/01/16 01:42 llm expert 0m
Model: unknown

Thoughts:
**Understanding the Potential Data Race in `shmem_file_read_iter`**

Okay, so I'm digging into this `shmem_file_read_iter` function and immediately I see something interesting – it *doesn't* take the `inode_lock(inode)`. Right there in the comments! It even spells it out why, it's specifically designed to allow concurrent reads and writes, and it explicitly mentions that this is to handle truncates correctly. This means we're *expecting* concurrent access. The potential problem isn't in locking, it's in the data itself.

The data race is specifically on the page content being copied during read and write operations. The classic concurrent read/write scenario. If a read and a write try to touch the same part of a file at the same time, the read could end up with partially updated data. Not ideal, but pretty standard for file I/O unless you're using explicit application-level synchronization, like advisory locking.

Now, KCSAN is flagging this. My question is, is this a genuinely bad race that could corrupt the kernel, or is it a "benign" one where the worst outcome is inconsistent file data? It's the file content itself that's being raced upon, so I'm leaning toward the latter.

The read path is `copy_folio_to_iter`, which boils down to `_copy_to_iter`. The write path is `generic_perform_write`, which uses `copy_folio_from_iter_atomic`. Ah, `copy_folio_from_iter_atomic`... that's interesting. I should probably dive into `generic_perform_write` in `mm/filemap.c` to see how it's being used there. Let's see what's really happening.



12/2 2026/01/16 01:42 tool codesearch-definition-source
0m Model:
unknown

Args:
map[IncludeLines:true Name:generic_perform_write SourceFile:mm/filemap.c]

Results:
map[Missing:false SourceCode:4285:	ssize_t generic_perform_write(struct kiocb *iocb, struct iov_iter *i)
4286:	{
4287:		struct file *file = iocb->ki_filp;
4288:		loff_t pos = iocb->ki_pos;
4289:		struct address_space *mapping = file->f_mapping;
4290:		const struct address_space_operations *a_ops = mapping->a_ops;
4291:		size_t chunk = mapping_max_folio_size(mapping);
4292:		long status = 0;
4293:		ssize_t written = 0;
4294:	
4295:		do {
4296:			struct folio *folio;
4297:			size_t offset;		/* Offset into folio */
4298:			size_t bytes;		/* Bytes to write to folio */
4299:			size_t copied;		/* Bytes copied from user */
4300:			void *fsdata = NULL;
4301:	
4302:			bytes = iov_iter_count(i);
4303:	retry:
4304:			offset = pos & (chunk - 1);
4305:			bytes = min(chunk - offset, bytes);
4306:			balance_dirty_pages_ratelimited(mapping);
4307:	
4308:			if (fatal_signal_pending(current)) {
4309:				status = -EINTR;
4310:				break;
4311:			}
4312:	
4313:			status = a_ops->write_begin(iocb, mapping, pos, bytes,
4314:							&folio, &fsdata);
4315:			if (unlikely(status < 0))
4316:				break;
4317:	
4318:			offset = offset_in_folio(folio, pos);
4319:			if (bytes > folio_size(folio) - offset)
4320:				bytes = folio_size(folio) - offset;
4321:	
4322:			if (mapping_writably_mapped(mapping))
4323:				flush_dcache_folio(folio);
4324:	
4325:			/*
4326:			 * Faults here on mmap()s can recurse into arbitrary
4327:			 * filesystem code. Lots of locks are held that can
4328:			 * deadlock. Use an atomic copy to avoid deadlocking
4329:			 * in page fault handling.
4330:			 */
4331:			copied = copy_folio_from_iter_atomic(folio, offset, bytes, i);
4332:			flush_dcache_folio(folio);
4333:	
4334:			status = a_ops->write_end(iocb, mapping, pos, bytes, copied,
4335:							folio, fsdata);
4336:			if (unlikely(status != copied)) {
4337:				iov_iter_revert(i, copied - max(status, 0L));
4338:				if (unlikely(status < 0))
4339:					break;
4340:			}
4341:			cond_resched();
4342:	
4343:			if (unlikely(status == 0)) {
4344:				/*
4345:				 * A short copy made ->write_end() reject the
4346:				 * thing entirely.  Might be memory poisoning
4347:				 * halfway through, might be a race with munmap,
4348:				 * might be severe memory pressure.
4349:				 */
4350:				if (chunk > PAGE_SIZE)
4351:					chunk /= 2;
4352:				if (copied) {
4353:					bytes = copied;
4354:					goto retry;
4355:				}
4356:	
4357:				/*
4358:				 * 'folio' is now unlocked and faults on it can be
4359:				 * handled. Ensure forward progress by trying to
4360:				 * fault it in now.
4361:				 */
4362:				if (fault_in_iov_iter_readable(i, bytes) == bytes) {
4363:					status = -EFAULT;
4364:					break;
4365:				}
4366:			} else {
4367:				pos += status;
4368:				written += status;
4369:			}
4370:		} while (iov_iter_count(i));
4371:	
4372:		if (!written)
4373:			return status;
4374:		iocb->ki_pos += written;
4375:		return written;
4376:	}
 SourceFile:mm/filemap.c]

13/2 2026/01/16 01:42 llm expert
0m Model:
unknown

Thoughts:
**Shmem Data Race Analysis**

Okay, let's break this down. My initial thought is that `shmem`'s `write_begin` is `shmem_write_begin` and `write_end` is `shmem_write_end`. I need to understand what those do. The obvious assumption is that `shmem_write_begin` gets the folio and locks it for writing. But then I see that `shmem_file_read_iter` calls `shmem_get_folio` and *then* unlocks it with `folio_unlock(folio)`. That's unexpected!

So, the folio isn't locked during the copy in `shmem_file_read_iter`, while `shmem_file_write_iter` (via `generic_perform_write`) likely *does* hold the folio lock, thanks to `write_begin`. Since the read unlocks the folio before copying, there's no mutual exclusion on the folio data. This immediately screams "data race" on the file contents! Is this an issue?

Historically, standard POSIX semantics don't guarantee that a read is atomic with respect to concurrent writes to the same range; applications that need that must serialize access themselves (for example with advisory locking). Kernel developers often consider this benign for file data because users expect potentially inconsistent reads if another process is writing concurrently. It's essentially "garbage in, garbage out" from the user's perspective.

But then... KCSAN is reporting this. That makes me worry. It's flagging a *raw* memory access race. While "garbage" file data might be okay, could this lead to kernel instability?

Both accesses use `memcpy`. On the read side, it's `copy_page_to_iter` -> `_copy_to_iter` -> `memcpy_to_iter`. On the write side, it's `copy_folio_from_iter_atomic` -> `__copy_from_iter` -> `memcpy_from_iter`.

The KCSAN report gives me addresses: `read to 0xffff888155bc8a48` (1 byte) and `write to 0xffff888155bc8a08` (66 bytes). Those `0xffff...` addresses are in the direct map, which means these are kernel addresses for the page cache (or `shmem`'s equivalent).

If it's *just* data copying, then it *should* be fine. However, I have to be absolutely sure there aren't *other* consequences of this race.

Does `shmem_file_read_iter` or `shmem_file_write_iter` do something that relies on the folio being locked, *besides* copying the data? The comment in `shmem_file_read_iter` explicitly says that it runs without the `i_rwsem` protection. So, it acknowledges the potential for races against truncation.

Looking at `shmem_file_write_iter`, I see that it does use `inode_lock(inode)` to protect the write. Specifically, it uses `inode_lock`, `shmem_write_begin` (which locks the folio), then does the `copy_folio_from_iter_atomic` write, `shmem_write_end` (which unlocks the folio), and finally unlocks the `inode`.

In contrast, the read does `shmem_get_folio` (which locks, but unlocks immediately), and then reads the folio with the folio unlocked. So clearly, Task B (read) unlocks the folio *before* the copy, meaning Task A (write) can potentially come in and start writing while Task B is reading. That's exactly what's been flagged.

Is this intentional? The comment in `shmem_file_read_iter` suggests it is. Many filesystems allow concurrent reads, and `shmem` seems to allow them without even inode locks.

Is there a risk of "load tearing" or "store tearing" that could be harmful? For file data, usually no. The user just gets whatever the CPU and memory system provide.

But, why does `shmem_file_read_iter` unlock the folio? In *many* filesystems, `read_iter` doesn't hold the folio lock while copying to user-space. This is to avoid deadlocks (if a user-space buffer is mmapped to the same folio). `generic_file_read_iter` usually calls `filemap_read`, and that may or may not hold the folio lock.

I need to dig into the `mm/shmem.c` history. Wait, the comment at line 3416 is interesting. It says: "If users can be writing


14/2 2026/01/16 01:43 tool codesearch-definition-source
0m Model:
unknown

Args:
map[IncludeLines:true Name:_copy_to_iter SourceFile:lib/iov_iter.c]

Results:
map[Missing:false SourceCode: 190:	size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
 191:	{
 192:		if (WARN_ON_ONCE(i->data_source))
 193:			return 0;
 194:		if (user_backed_iter(i))
 195:			might_fault();
 196:		return iterate_and_advance(i, bytes, (void *)addr,
 197:					   copy_to_user_iter, memcpy_to_iter);
 198:	}
 SourceFile:lib/iov_iter.c]

15/2 2026/01/16 01:43 llm expert
0m Model:
unknown

Thoughts:
**Understanding a Data Race in Shmem File Splicing**

Okay, so I'm looking at this stack trace and it's pretty clear what's happening. The kernel is using `memcpy_to_iter` and `memcpy_from_iter` because these are kernel-to-kernel copies, specifically when splicing between a shared memory file (shmem) and pipes. Task 13848 (the reader) is reading from the shmem file and writing to a pipe, and Task 13854 (the writer) is reading from a pipe and writing to the same shmem file. It looks like:

*   Pipe A -> File X (Writer)
*   File X -> Pipe B (Reader)

If both tasks are operating on the same region of File X simultaneously, we have a clear race condition on the shmem file data. And because it's shmem, this is just a memory copy operation. This race is occurring when writing to one region and simultaneously reading from it.

Now, is this harmful? From a kernel perspective, no. A `memcpy` on valid memory isn't going to crash the system, even if the data is changing concurrently. Data integrity is the concern here. The reader will likely get a mix of old and new data, which is standard for concurrent unsynchronized I/O. Nothing special about shmem seems immediately problematic; it doesn't have "dirty" pages in the same sense as disk-backed filesystems.

However, the reader *will* see inconsistent data. Task B gets an inconsistent view of the shmem folio, and if that data is then spliced into Pipe B, whatever reads that pipe inherits the inconsistency. It's a classic data race.

The kernel developers are aware of this, because the inode lock (`i_rwsem`) is *not* held during reads. This is an explicit performance optimization. In `shmem_file_read_iter`, no lock is held.

The question then is, is this a known benign race? I've seen plenty of KCSAN reports on file data races that are marked as benign.

The writer is calling `copy_folio_from_iter_atomic`, and this is "atomic" in the sense that it doesn't sleep. It uses something like `kmap_atomic`, or similar, and handles faults by failing.



16/2 2026/01/16 01:43 tool codesearch-definition-source
0m Model:
unknown

Args:
map[IncludeLines:true Name:copy_folio_from_iter_atomic SourceFile:lib/iov_iter.c]

Results:
map[Missing:false SourceCode: 471:	size_t copy_folio_from_iter_atomic(struct folio *folio, size_t offset,
 472:			size_t bytes, struct iov_iter *i)
 473:	{
 474:		size_t n, copied = 0;
 475:	
 476:		if (!page_copy_sane(&folio->page, offset, bytes))
 477:			return 0;
 478:		if (WARN_ON_ONCE(!i->data_source))
 479:			return 0;
 480:	
 481:		do {
 482:			char *to = kmap_local_folio(folio, offset);
 483:	
 484:			n = bytes - copied;
 485:			if (folio_test_partial_kmap(folio) &&
 486:			    n > PAGE_SIZE - offset_in_page(offset))
 487:				n = PAGE_SIZE - offset_in_page(offset);
 488:	
 489:			pagefault_disable();
 490:			n = __copy_from_iter(to, n, i);
 491:			pagefault_enable();
 492:			kunmap_local(to);
 493:			copied += n;
 494:			offset += n;
 495:		} while (copied != bytes && n > 0);
 496:	
 497:		return copied;
 498:	}
 SourceFile:lib/iov_iter.c]

17/2 2026/01/16 01:43 llm expert
0m Model:
unknown

Thoughts:
**Analyzing `__copy_from_iter` – A Deep Dive**

Alright, let's dissect `__copy_from_iter`. Given the context, I expect a thin dispatcher over the iterator type: user-backed iterators (`ubuf`/`iovec`) should go through a fault-capable `copy_from_user`-style step, while kernel-backed ones (`bvec`, `kvec`, folio queues) get a plain `memcpy` step. That distinction matters here because the writer calls it under `pagefault_disable()`: a user-backed copy can fault partway, return a short count, and rely on `generic_perform_write`'s retry logic, whereas the kernel-backed path cannot fault at all.

For the splice scenario in this report, the source is pipe buffers, so the segments are kernel pages and the step function should be a straight `memcpy`. Pure data movement, no shmem metadata involved. Let me confirm against the definition.



18/2 2026/01/16 01:43 tool codesearch-definition-source
0m Model:
unknown

Args:
map[IncludeLines:true Name:__copy_from_iter SourceFile:lib/iov_iter.c]

Results:
map[Missing:false SourceCode: 257:	static __always_inline
 258:	size_t __copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
 259:	{
 260:		return iterate_and_advance(i, bytes, addr,
 261:					   copy_from_user_iter, memcpy_from_iter);
 262:	}
 SourceFile:lib/iov_iter.c]

19/2 2026/01/16 01:43 llm expert
0m Model:
unknown

Thoughts:
**Analysis of a Potential Data Race in shmem_file_read_iter**

Okay, so I'm looking at this report highlighting a potential data race between `memcpy_to_iter` and `memcpy_from_iter`, both of which are essentially just memory copies. The context is shared memory, specifically `shmem`. My initial thought is that the write side is synchronized through the inode lock, so concurrent writes to the *same* file are protected. Reads, however, aren't locked, which immediately raises a flag.

The question then becomes: can a read in `shmem_file_read_iter` see a partially written page (folio)? Yes, it can. But is it a *problem*?  Given that we're dealing with regular file data (not metadata or directory structures, as I know `shmem_file_read_iter` is used for regular files), my initial inclination is that it's probably benign.  The data is just being copied between buffers for a user-space application.

I need to confirm if `shmem_get_folio` could return a freshly allocated folio that’s in the process of being initialized. The `SGP_READ` flag suggests this possibility, although the folio lock in `shmem_get_folio` and the subsequent unlock should provide some safety.

The core of the issue seems to be a classic "read vs write" scenario: Task A writes to File X at index 0, Task B reads from File X at index 0, and they're accessing the same folio.  Crucially, Task B holds no lock during the `memcpy`, so the race is real.  The report confirms the memory addresses overlap (0xffff888155bc8a48 read, 0xffff888155bc8a08 write, and a 66-byte write). So, it's a 1-byte read that overlaps with the write.

Given it's just file data being copied, and the kernel isn't directly *using* this data for anything internally, it feels like a benign data race. The fact that the tasks are running `sendfile64`, probably under syzkaller, supports this conclusion.

However, I need to be more thorough. Is there a possibility of an information leak during folio initialization?  If `shmem_get_folio` returns a zeroed folio, that shouldn’t be a problem, and if the folio is already in cache, the write has *started*.  If the read sees the data while being written, it's still just the standard "read vs. write" race.

I then ask myself, are there any potential problems with `memcpy` itself in this context? Could it use instructions vulnerable to memory changes? No, that's not how it works. It might just give weird values. What if the memory is being *freed*? The folio refcount should prevent that, as both `shmem_file_read_iter` and `shmem_file_write_iter` hold a folio reference.

What about the pipe buffers, since this seems to be a pipe interaction? The race is on the shmem folio, not the pipe buffers themselves, so that's not an issue here.

My current confidence level is high that this is benign, but I need to make sure. I'll re-check if there's any “structural” issue. The report’s mention of “_copy_to_iter / copy_folio_from_iter_atomic” confirms it's a race specifically *within* the `shmem` data.

Could `shmem` be using the folio for something else? I know shmem folios can be swapped out. However, `shmem_get_folio` ensures the folio is in RAM, and `shmem_file_read_iter` keeps it alive until the copy is done. `folio_put(folio)` is called after the copy. Could the folio be truncated?  I see that `shmem_file_read_iter` checks `i_size` *before* and *after* getting the folio and also notes that reads aren't protected with `i_rwsem`. Even if a truncate occurs, the folio still has a valid refcount, and the copy is safe for the kernel (even if the data is no longer part of the file).

What if Task A is writing, Task B is reading, *and* Task A truncates? The writer holds `inode_lock`, so it should be synchronized. A truncate holds `inode_lock` also.  But the read does not, so a read could race. If the read happens during the truncate, it might see truncated data, but that's standard POSIX behavior.

My current conclusion: It's a benign data


20/2 2026/01/16 01:43 tool codesearch-definition-source
0m Model:
unknown

Args:
map[IncludeLines:true Name:memcpy_to_iter SourceFile:lib/iov_iter.c]

Results:
map[Missing:false SourceCode:  72:	static __always_inline
  73:	size_t memcpy_to_iter(void *iter_to, size_t progress,
  74:			      size_t len, void *from, void *priv2)
  75:	{
  76:		memcpy(iter_to, from + progress, len);
  77:		return 0;
  78:	}
 SourceFile:lib/iov_iter.c]

21/2 2026/01/16 01:43 llm expert
0m Model:
unknown

Thoughts:
Here's a summary of my thinking process, focusing on the code snippet:

**Code Verification and Function Analysis**

Okay, something's shifted in my memory. The previous iteration of this code snippet *felt* different, but this version seems to align much better with the report's line numbers, especially after factoring in a likely offset. Good, that's reassuring.

Specifically, Line 76: `memcpy(iter_to, from + progress, len);` – Ah, it's a standard `memcpy` call. No surprises there. Just a straight memory copy, moving `len` bytes from the source pointer (`from + progress`) to the destination (`iter_to`). That's as expected.

Now, let's turn our attention to `memcpy_from_iter`. Time to see what that function's doing.



22/2 2026/01/16 01:43 tool codesearch-definition-source
0m Model:
unknown

Args:
map[IncludeLines:true Name:memcpy_from_iter SourceFile:lib/iov_iter.c]

Results:
map[Missing:false SourceCode:  80:	static __always_inline
  81:	size_t memcpy_from_iter(void *iter_from, size_t progress,
  82:				size_t len, void *to, void *priv2)
  83:	{
  84:		memcpy(to + progress, iter_from, len);
  85:		return 0;
  86:	}
 SourceFile:lib/iov_iter.c]

23/2 2026/01/16 01:43 llm expert
0m Model:
unknown

Thoughts:
**Benign Data Race in Shmem File I/O: A Deep Dive**

Okay, so I'm looking at this reported data race involving `memcpy`, and my initial thought is, "Standard `memcpy`, nothing to see here... or is there?" The trace points to two tasks: Task 13848 reading from a location (`from + progress`) and Task 13854 writing to another (`to + progress`). Based on the code, in the `shmem_file_read_iter` path, the `from + progress` address points to the folio data, which makes sense.  Similarly, in the `shmem_file_write_iter` path, `to + progress` is also a location within the folio. So, it appears the race is on the folio data itself.

Since it's just raw file data, with no kernel metadata being touched, my first instinct screams "benign." It's concurrent I/O, which, without synchronization, is expected to result in some level of data inconsistency from an application's point of view. The kernel isn't trying to interpret the *meaning* of the bits being copied; it's just shuffling them.

But wait, I need to check something... Could `iter_to` or `iter_from` (the other side of the copy) be the real source of the race? Task 13848 uses `iter_to` for pipe buffers and Task 13854 uses `iter_from` for pipe buffers. The report provides specific addresses (`0xffff888155bc8a48` and `0xffff888155bc8a08`). Are these folio addresses or the pipe buffer addresses? In the `shmem_file_read_iter` and `shmem_file_write_iter` cases, I believe they are folio addresses because they're fixed in the direct map.

Aha! Both tasks are using `splice_direct_to_actor`. Could they be using the *same* pipe? Consider a `do_sendfile` scenario, since that transfers data to and from files: one `sendfile` copying from a shmem file into a pipe, another copying from a pipe into the same shmem file. If both target the same file region, that's a race on the folio. The addresses fit this picture, since a folio's direct-map address stays fixed for the duration of the copy.

Okay, the report specifically says one task *reads* from `0xffff888155bc8a48` (the folio) and the other *writes* to `0xffff888155bc8a08` (also the folio). It's the *same* folio. Given KCSAN reports data races on `memcpy`, this raises a flag.

The real question is, should I care? The data *will* be corrupted from the perspective of an application reading it concurrently. But, from the kernel's view, it is just bits being shuffled.

But is there anything else that *could* make it matter? Is there a case where data racing in a buffer using `memcpy` actually harmful? Yes, if that buffer contains *kernel* control flow data or *kernel* metadata. Here, it is shmem file data, which the kernel doesn't use for control.

What about `flush_dcache_folio`? I see it in both `copy_folio_to_iter` and `generic_perform_write`, but those are just cache-maintenance operations, so that's not the problem.

Ok, so it is just a plain benign data race.

Wait, I should confirm no other critical operations are involved, `copy_folio_from_iter_atomic` uses `pagefault_disable()`. This is good; it avoids potential deadlocks. This reinforces the fact that the race is limited to the data itself, not some critical shmem internal structure.

One last thought: the offset in `_copy_to_iter+0x602/0xe70` seems quite large for such a short function. Looking at `include/linux/iov_iter.h`, it's built on `iterate_and_advance`, a heavily inlined helper that expands into a separate copy loop per iterator type. That explains the size.

**Conclusion:** It's benign. The data race on the shmem folio is expected and not harmful.



24/2 2026/01/16 01:44 tool codesearch-definition-source
0m Model:
unknown

Args:
map[IncludeLines:true Name:iterate_and_advance2 SourceFile:include/linux/iov_iter.h]

Results:
map[Missing:false SourceCode: 265:	/**
 266:	 * iterate_and_advance2 - Iterate over an iterator
 267:	 * @iter: The iterator to iterate over.
 268:	 * @len: The amount to iterate over.
 269:	 * @priv: Data for the step functions.
 270:	 * @priv2: More data for the step functions.
 271:	 * @ustep: Function for UBUF/IOVEC iterators; given __user addresses.
 272:	 * @step: Function for other iterators; given kernel addresses.
 273:	 *
 274:	 * Iterate over the next part of an iterator, up to the specified length.  The
 275:	 * buffer is presented in segments, which for kernel iteration are broken up by
 276:	 * physical pages and mapped, with the mapped address being presented.
 277:	 *
 278:	 * Two step functions, @step and @ustep, must be provided, one for handling
 279:	 * mapped kernel addresses and the other is given user addresses which have the
 280:	 * potential to fault since no pinning is performed.
 281:	 *
 282:	 * The step functions are passed the address and length of the segment, @priv,
 283:	 * @priv2 and the amount of data so far iterated over (which can, for example,
 284:	 * be added to @priv to point to the right part of a second buffer).  The step
 285:	 * functions should return the amount of the segment they didn't process (ie. 0
 286:	 * indicates complete processsing).
 287:	 *
 288:	 * This function returns the amount of data processed (ie. 0 means nothing was
 289:	 * processed and the value of @len means processes to completion).
 290:	 */
 291:	static __always_inline
 292:	size_t iterate_and_advance2(struct iov_iter *iter, size_t len, void *priv,
 293:				    void *priv2, iov_ustep_f ustep, iov_step_f step)
 294:	{
 295:		if (unlikely(iter->count < len))
 296:			len = iter->count;
 297:		if (unlikely(!len))
 298:			return 0;
 299:	
 300:		if (likely(iter_is_ubuf(iter)))
 301:			return iterate_ubuf(iter, len, priv, priv2, ustep);
 302:		if (likely(iter_is_iovec(iter)))
 303:			return iterate_iovec(iter, len, priv, priv2, ustep);
 304:		if (iov_iter_is_bvec(iter))
 305:			return iterate_bvec(iter, len, priv, priv2, step);
 306:		if (iov_iter_is_kvec(iter))
 307:			return iterate_kvec(iter, len, priv, priv2, step);
 308:		if (iov_iter_is_folioq(iter))
 309:			return iterate_folioq(iter, len, priv, priv2, step);
 310:		if (iov_iter_is_xarray(iter))
 311:			return iterate_xarray(iter, len, priv, priv2, step);
 312:		return iterate_discard(iter, len, priv, priv2, step);
 313:	}
 SourceFile:include/linux/iov_iter.h]

25/2 2026/01/16 01:44 llm expert
0m Model:
unknown

Thoughts:
**Diving into the Kernel Data Structures**

Okay, so I'm looking at this report and it's pointing me directly to `iterate_bvec` in `include/linux/iov_iter.h` at line 123. It's also mentioning `iterate_and_advance2` in the same header, line 306. My internal logic is immediately mapping this to what I'm seeing in my debug output. Ah, line 306, that's where the code checks `if (iov_iter_is_kvec(iter))`. But wait, line 305 says `if (iov_iter_is_bvec(iter)) return iterate_bvec(...)`. Right! That confirms it. It's an `iterate_bvec` path and that aligns perfectly with a pipe. Pipes often use block vectors (bvecs) or folios to manage their buffers, and that's precisely what's happening here. This makes total sense. Now, let's go deeper and take a closer look at the `iterate_bvec` function.



26/2 2026/01/16 01:44 tool codesearch-definition-source
0m Model:
unknown

Args:
map[IncludeLines:true Name:iterate_bvec SourceFile:include/linux/iov_iter.h]

Results:
map[Missing:false SourceCode: 104:	/*
 105:	 * Handle ITER_BVEC.
 106:	 */
 107:	static __always_inline
 108:	size_t iterate_bvec(struct iov_iter *iter, size_t len, void *priv, void *priv2,
 109:			    iov_step_f step)
 110:	{
 111:		const struct bio_vec *p = iter->bvec;
 112:		size_t progress = 0, skip = iter->iov_offset;
 113:	
 114:		do {
 115:			size_t remain, consumed;
 116:			size_t offset = p->bv_offset + skip, part;
 117:			void *kaddr = kmap_local_page(p->bv_page + offset / PAGE_SIZE);
 118:	
 119:			part = min3(len,
 120:				   (size_t)(p->bv_len - skip),
 121:				   (size_t)(PAGE_SIZE - offset % PAGE_SIZE));
 122:			remain = step(kaddr + offset % PAGE_SIZE, progress, part, priv, priv2);
 123:			kunmap_local(kaddr);
 124:			consumed = part - remain;
 125:			len -= consumed;
 126:			progress += consumed;
 127:			skip += consumed;
 128:			if (skip >= p->bv_len) {
 129:				skip = 0;
 130:				p++;
 131:			}
 132:			if (remain)
 133:				break;
 134:		} while (len);
 135:	
 136:		iter->nr_segs -= p - iter->bvec;
 137:		iter->bvec = p;
 138:		iter->iov_offset = skip;
 139:		iter->count -= progress;
 140:		return progress;
 141:	}
 SourceFile:include/linux/iov_iter.h]

27/2 2026/01/16 01:44 llm expert
0m Model:
unknown

Thoughts:
**Data Race in Shared Memory: A Benign Finding**

Okay, let's break this down. It seems we have a data race within `step(...)`, which, for all intents and purposes, is either `memcpy_to_iter` (for the reader) or `memcpy_from_iter` (for the writer). The key is that the data being raced on is *the actual data being copied* in shared memory or regular files – straight data.

My immediate thought was to check if `kmap_local_page` or `kunmap_local` were somehow involved, but based on the addresses provided (`0xffff888155bc8a48`, `0xffff888155bc8a08`), which are likely within the direct map on x86_64, those functions are probably just returning the direct map address. So no, the race *is* at the data level.

So, the race is on file content, specifically. Is there any special protection I might have missed in `shmem`? I double-checked `shmem_file_read_iter` and `shmem_file_write_iter`. Read doesn't take the `inode_lock`, which is a key distinction. Write does. Write also holds the folio lock via `shmem_write_begin`, but this is only held during writing. However, the folio lock doesn't protect against concurrent readers.

Ah, I see. This is intentional, designed for performance. Given the specific tasks identified in the report, are there other points where `_copy_to_iter` or `copy_folio_from_iter_atomic` could also be problematic? Probably not, since the report focuses on these two specific tasks.

Now, let's consider the implications. Is there a risk of a "torn read" or "torn write" causing kernel misinterpretation? No. The kernel isn't interpreting the shared memory file data itself; it's simply acting as a conduit to or from the user or pipes. This isn't a race on critical kernel metadata like `shmem_inode_info->alloced`.

My initial assessment is that this is benign.

Let's do a final check, just to be sure. Those addresses, `0xffff888155bc8a48` and `0xffff888155bc8a08`, again, are likely in the direct map on x86_64. Within the 4KB page, those addresses are at offsets 0xa48 and 0xa08, respectively.  They're within the same page.

If this *were* a race on the `iov_iter` structure itself, it would be bad, but the report explicitly states: "read to 0xffff888155bc8a48" and "write to 0xffff888155bc8a08". And from the `memcpy` calls: reader is reading from shmem folio, writer is writing to shmem folio. These are the addresses being accessed.

Task 13848 reads from `0xffff888155bc8a48`. Task 13854 writes to `0xffff888155bc8a08`. These are *clearly* addresses in the shmem folio.

Wait a second, if Task 13854 is writing 66 bytes, then it's writing to the range `[0xa08, 0xa4a)`. Task 13848 is reading *one* byte at address `0xa48`.  That overlaps!

Okay, everything confirms that this is a data race on shmem folio content.

Now... is it possible that Task 13848's `iter_to` is also `0xffff888155bc8a48`? No, because `iter_to` is the *destination* of the read, and the report is clear: "read TO 0xffff888155bc8a48." Let me clarify my understanding. KCSAN means "read AT" or "write AT". So the report is saying:
*   `read to 0xffff888155bc8a48 of 1 bytes by task 13848`: a memory read AT that address.
*   `write to 0xffff888155bc8a08 of 66 bytes by task 13854`: a memory write AT that address.
This all confirms my analysis.

**Conclusion:** The data race is on the shared memory file content, and it's benign. It's an


28/2 2026/01/16 01:44 tool set-results
0m Model:
unknown

Args:
map[Benign:true Confident:true]

Results:
map[Benign:true Confident:true]

29/2 2026/01/16 01:44 llm expert
0m Model:
unknown

Thoughts:
**Benign Data Race in Shared Memory File I/O**

Okay, so the results are in, and now it's time to solidify the explanation. I'm looking at a data race situation, specifically between `shmem_file_read_iter` and `shmem_file_write_iter`, which boils down to concurrent read and write operations on a shared memory folio. It seems one task is reading while another is writing, which is where things get interesting.

I recall that `shmem_file_read_iter` *intentionally* avoids taking the inode lock (`i_rwsem`), and it unlocks the folio before copying the data out to the destination iterator. That design choice favors concurrency, a common trade-off: strict consistency exchanged for throughput. It's a calculated decision, and the implications need to be clearly understood.

Now, because we're operating within the standard POSIX file I/O model, atomicity of reads in the face of concurrent writes is not guaranteed. This is expected. We’re working with the content of a file; nothing the kernel cares about for its own housekeeping.

The crux of the matter is this: because the kernel doesn't rely on the data's integrity for its internal operation or control flow, this data race is ultimately benign. We're *not* looking at memory corruption, crashes, or any form of incorrect kernel behavior. The worst-case scenario is the reader ends up with a mix of old and new data, which is completely acceptable within the context of unsynchronized I/O.


