@anonymous0719

Motivation and Context

When multiple L2ARC devices perform writes in parallel, the free-on-write list, a global structure shared across all devices, can let one device’s write completion free ABDs that belong to another device with writes still in flight. This may result in memory corruption and hard-to-debug race conditions under concurrent workloads.

We developed an internal tool to analyze downstream ZFS forks and identify potentially valuable fixes that have not yet been merged upstream. Using this tool, we discovered the following commit in the TrueNAS ZFS fork:

truenas@69942e1

The change addresses a real concurrency issue in the L2ARC multi-device write path, and we believe it is generally applicable, so we are proposing it upstream for review.

Description

The free-on-write list is a global structure shared across all L2ARC devices. To prevent cross-device interference during parallel writes, this change tags each deferred ABD with its owning l2arc_dev_t.

The l2arc_do_free_on_write() function is updated to accept a device parameter, allowing it to selectively free only ABDs associated with the specified device. Passing NULL preserves the existing behavior of freeing all deferred ABDs during shutdown.

This approach maintains the simplicity of a global free-on-write list while ensuring correct and safe behavior under concurrent multi-device L2ARC write operations.

Credit goes to the original authors in the TrueNAS ZFS project.

How Has This Been Tested?

This change was validated through code review and targeted stress testing of parallel L2ARC write paths. The updated logic ensures that deferred ABDs are freed only by their owning device, eliminating cross-device races under concurrent write workloads.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • Quality assurance (non-breaking change which makes the code more robust against bugs)

Checklist

  • My code follows the OpenZFS code style requirements.
  • I have updated the documentation accordingly.
  • I have read the contributing document.
  • I have added tests to cover my changes.
  • I have run the ZFS Test Suite with this change applied.
  • All commit messages are properly formatted and contain Signed-off-by.

The free-on-write list is a global structure shared across all L2ARC
devices. When multiple L2ARC devices write in parallel, each device's
write completion must only free its own deferred ABDs to prevent one
thread from destroying ABDs belonging to another device's in-flight
writes. This commit adds device tagging to the l2arc_data_free_t
structure by storing the owning l2arc_dev_t pointer with each deferred
ABD. The l2arc_do_free_on_write() function now accepts a device
parameter to selectively free only ABDs belonging to that specific
device, preventing cross-device interference during parallel L2ARC write
operations while maintaining the global list structure for simplicity.
@ixhamza
Member

ixhamza commented Jan 27, 2026

Thanks for the work. However, the mentioned patch is already included in #18093, which is currently under review.

@ixhamza ixhamza closed this Jan 27, 2026
@anonymous0719
Author

> Thanks for the work. However, the mentioned patch is already included in #18093, which is currently under review.

Thanks for your feedback.

@amotin
Member

amotin commented Jan 27, 2026

> This change was validated through code review and targeted stress testing of parallel L2ARC write paths.

I don't believe this. The code in OpenZFS writes only one L2ARC device at a time, so I don't know who was testing what. This looks like an AI hallucination.

@adamdmoss
Contributor

> Thanks for the work. However, the mentioned patch is already included in #18093, which is currently under review.

I can picture this isolated fix being valuable for OpenZFS 2.3.6 and 2.4.x while the full #18093 rework may be considered too invasive for a point-release. Not my call though.

@ixhamza
Member

ixhamza commented Jan 27, 2026

@adamdmoss - Thanks for the feedback. Parallel multi-device write support is introduced in #18093. Before that, a single global feed thread wrote to all devices sequentially, so I don’t believe we could have a regression without per-device feed threads.

@adamdmoss
Contributor

> @adamdmoss - Thanks for the feedback. Parallel multi-device write support is introduced in #18093. Before that, a single global feed thread wrote to all devices sequentially, so I don’t believe we could have a regression without per-device feed threads.

Right! Sorry. amotin's comment makes more sense now.


4 participants