@3pacccccc
Contributor

Fixes #25103

Motivation

The ManagedCursorImpl.removeProperty method contains a thread safety vulnerability where it directly modifies the shared properties Map without proper synchronization. This can lead to:

  1. ConcurrentModificationException when putProperty creates a new HashMap copy while removeProperty modifies the original Map
  2. Data corruption in the internal HashMap structure under concurrent access
  3. Inconsistent state where one thread's modifications may not be visible to others

Modifications

  1. Fixed removeProperty method in ManagedCursorImpl.java:

    • Changed from direct Map modification (properties.remove(key)) to creating a new Map copy
    • Updated implementation to match the thread-safe pattern already used in putProperty
    • Maintains atomicity through LAST_MARK_DELETE_ENTRY_UPDATER.updateAndGet()
  2. Added concurrency test in ManagedCursorTest.java:

    • 100-thread stress test with mixed put/remove/get operations
    • 5-second duration to simulate sustained concurrent access
    • Verifies no ConcurrentModificationException or data inconsistencies
    • Validates final state matches expected results
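
The copy-on-write approach described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual ManagedCursorImpl code: the class `CursorPropertiesSketch`, the `Entry` snapshot type, and `ENTRY_UPDATER` are hypothetical stand-ins for the cursor, `MarkDeleteEntry`, and `LAST_MARK_DELETE_ENTRY_UPDATER`.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

public class CursorPropertiesSketch {
    // Hypothetical immutable snapshot, standing in for MarkDeleteEntry.
    static final class Entry {
        final Map<String, Long> properties;

        Entry(Map<String, Long> properties) {
            // Wrap the map so that a published snapshot can never be mutated.
            this.properties = Collections.unmodifiableMap(properties);
        }
    }

    // Hypothetical updater, standing in for LAST_MARK_DELETE_ENTRY_UPDATER.
    private static final AtomicReferenceFieldUpdater<CursorPropertiesSketch, Entry> ENTRY_UPDATER =
            AtomicReferenceFieldUpdater.newUpdater(CursorPropertiesSketch.class, Entry.class, "entry");

    private volatile Entry entry = new Entry(new HashMap<>());

    public void putProperty(String key, Long value) {
        // Copy, modify the copy, then publish it atomically.
        ENTRY_UPDATER.updateAndGet(this, last -> {
            Map<String, Long> copy = new HashMap<>(last.properties);
            copy.put(key, value);
            return new Entry(copy);
        });
    }

    public void removeProperty(String key) {
        // Same pattern as putProperty: the shared map is never modified in place.
        ENTRY_UPDATER.updateAndGet(this, last -> {
            Map<String, Long> copy = new HashMap<>(last.properties);
            copy.remove(key);
            return new Entry(copy);
        });
    }

    public Map<String, Long> getProperties() {
        return entry.properties;
    }

    public static void main(String[] args) {
        CursorPropertiesSketch cursor = new CursorPropertiesSketch();
        cursor.putProperty("a", 1L);
        Map<String, Long> before = cursor.getProperties();
        cursor.removeProperty("a");
        // The earlier snapshot is unaffected by the later removal.
        System.out.println("before contains a: " + before.containsKey("a"));
        System.out.println("current contains a: " + cursor.getProperties().containsKey("a"));
    }
}
```

Because every update publishes a fresh map, a reader that holds an older snapshot can iterate it without seeing concurrent structural changes. The update function may be retried under contention, so it should stay free of side effects.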

Verifying this change

  • Make sure that the change passes the CI checks.

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: 3pacccccc#37

@github-actions bot added the doc-not-needed label on Dec 22, 2025
case 0: // Put operation
Long randomValue = random.nextLong();
cursor.putProperty(randomKey, randomValue);
records.put(randomKey, randomValue);
Member

The two separate operations aren't atomic together, so I don't see how the behavior could be validated this way.

It's possible to have a benign race, for example between the put and the remove, where records becomes inconsistent with the cursor even though there's no problem.

t1

cursor.putProperty(randomKey, randomValue);
records.put(randomKey, randomValue);

t2

cursor.removeProperty(randomKey);
records.remove(randomKey);

For example with the order of operations:

t1 - cursor.putProperty(randomKey, randomValue);
t2 - cursor.removeProperty(randomKey);
t2 - records.remove(randomKey);
t1 - records.put(randomKey, randomValue);
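
The interleaving above can be replayed on a single thread to show that the two maps diverge even when each individual operation is thread safe. This is only an illustrative sketch: `cursorProps` is a hypothetical stand-in for the cursor's property state, and both maps are plain `ConcurrentHashMap` instances.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class InterleavingSketch {
    /** Replays the four steps in the order listed above and reports whether the maps diverge. */
    static boolean replayDiverges() {
        Map<String, Long> cursorProps = new ConcurrentHashMap<>(); // stand-in for the cursor
        Map<String, Long> records = new ConcurrentHashMap<>();     // the test's tracking map
        String randomKey = "k";
        Long randomValue = 42L;

        cursorProps.put(randomKey, randomValue); // t1 - cursor.putProperty
        cursorProps.remove(randomKey);           // t2 - cursor.removeProperty
        records.remove(randomKey);               // t2 - records.remove
        records.put(randomKey, randomValue);     // t1 - records.put

        // The cursor no longer has the key, but the tracking map still does.
        return !cursorProps.equals(records);
    }

    public static void main(String[] args) {
        System.out.println("diverged: " + replayDiverges()); // prints "diverged: true"
    }
}
```

A final-state comparison against `records` would therefore fail here even though neither map was corrupted.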

Contributor Author

@lhotari Thanks for the catch! You're right about the atomicity issue. I've removed the tracking map, and the test now checks for thread-safety issues directly: no exceptions and no data corruption.
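
A test of that shape, with no tracking map, might look roughly like the sketch below. It is not the test added in this PR: `PropertyStressSketch` is a hypothetical stand-in for the cursor, built on an `AtomicReference` with copy-on-write updates, and the thread and iteration counts are arbitrary.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class PropertyStressSketch {
    // Hypothetical stand-in for the cursor's property state (copy-on-write).
    private final AtomicReference<Map<String, Long>> props = new AtomicReference<>(new HashMap<>());

    void put(String key, long value) {
        props.updateAndGet(m -> { Map<String, Long> c = new HashMap<>(m); c.put(key, value); return c; });
    }

    void remove(String key) {
        props.updateAndGet(m -> { Map<String, Long> c = new HashMap<>(m); c.remove(key); return c; });
    }

    Map<String, Long> snapshot() {
        return props.get();
    }

    /** Runs mixed put/remove/get operations and returns the number of failures observed. */
    static int runStress(int threads, int opsPerThread) {
        PropertyStressSketch target = new PropertyStressSketch();
        Queue<Throwable> failures = new ConcurrentLinkedQueue<>();
        CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                try {
                    start.await(); // release all workers at once to maximize contention
                    ThreadLocalRandom rnd = ThreadLocalRandom.current();
                    for (int i = 0; i < opsPerThread; i++) {
                        String key = "key-" + rnd.nextInt(10);
                        switch (rnd.nextInt(3)) {
                            case 0: target.put(key, rnd.nextLong()); break;
                            case 1: target.remove(key); break;
                            default:
                                // Iterating a snapshot must never throw or expose a broken entry.
                                for (Map.Entry<String, Long> e : target.snapshot().entrySet()) {
                                    if (e.getKey() == null || e.getValue() == null) {
                                        throw new IllegalStateException("corrupt entry");
                                    }
                                }
                        }
                    }
                } catch (Throwable e) {
                    failures.add(e); // includes ConcurrentModificationException, if any
                }
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                pool.shutdownNow();
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return failures.size();
    }

    public static void main(String[] args) {
        System.out.println("failures: " + runStress(8, 2000));
    }
}
```

A stress test like this can only raise confidence. A clean run does not prove the absence of races, which is why it checks for exceptions and broken entries instead of asserting a specific final state.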

Member

@lhotari left a comment

Please check the comment about the test

Copilot AI left a comment

Pull request overview

This PR fixes a critical thread safety issue in the ManagedCursorImpl.removeProperty method where direct modification of a shared Map could lead to ConcurrentModificationException and data corruption. The fix aligns the implementation with the existing thread-safe pattern used in putProperty by creating a new Map copy instead of modifying the original.

Key Changes:

  • Modified removeProperty to create a new HashMap copy and return a new MarkDeleteEntry instance, ensuring thread-safe updates through the atomic field updater
  • Added a concurrent stress test with 100 threads performing mixed put/remove/get operations to validate the fix

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedCursorImpl.java | Fixed removeProperty method to use copy-on-write pattern instead of direct Map modification |
| managed-ledger/src/test/java/org/apache/bookkeeper/mledger/impl/ManagedCursorTest.java | Added concurrency test testConcurrentPropertyOperationsThreadSafety with new imports for ConcurrentModificationException and Random |

@3pacccccc requested a review from lhotari on December 23, 2025 11:26
Member

@lhotari left a comment

LGTM

Labels

doc-not-needed, ready-to-test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Non-thread safe code in ManagedCursorImpl.removeProperty

2 participants