[improve][broker] Improve ManagedLedger search position by offset #25099
Motivation
The current `asyncFindPosition` method has performance issues: it performs a binary search based on offset, reading all entries from the beginning, to find the position. This implementation causes several problems when dealing with large amounts of data. The problem becomes particularly severe in Pulsar topics with large amounts of historical data, significantly impacting consumer startup speed and overall performance.
Modifications
This PR optimizes the `asyncFindPosition` method through the following approaches:
- Record index information during ledger creation: `ManagedLedgerInterceptor` records the index into `LedgerInfo` as `firstEntryIndex`.
- Optimize search logic: use `firstEntryIndex` in `asyncFindPosition`.
- Reduce unnecessary data reads.
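The idea behind the approach above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `LedgerMeta`, `findLedgerForOffset`, and the use of `-1` as a "missing index" marker are hypothetical names and conventions chosen for illustration. The sketch assumes the ledger list is sorted by `firstEntryIndex` and shows how a per-ledger first index lets the search narrow to a single ledger using metadata alone, before any entry is read.

```java
import java.util.List;
import java.util.OptionalLong;

/** Illustrative sketch only; these types are hypothetical, not Pulsar classes. */
final class OffsetIndexSketch {

    /** Hypothetical per-ledger metadata holding the index of the ledger's first entry. */
    static final class LedgerMeta {
        final long ledgerId;
        final long firstEntryIndex; // assumption: -1 means "not recorded" (older ledger)

        LedgerMeta(long ledgerId, long firstEntryIndex) {
            this.ledgerId = ledgerId;
            this.firstEntryIndex = firstEntryIndex;
        }
    }

    /**
     * Returns the id of the last ledger whose firstEntryIndex is <= offset.
     * Returns empty when the metadata cannot narrow the search (an index is
     * missing, or the offset precedes the first ledger), so that a caller
     * could fall back to the existing full search.
     */
    static OptionalLong findLedgerForOffset(List<LedgerMeta> ledgers, long offset) {
        // In-memory metadata check only; no entry data is read here.
        for (LedgerMeta m : ledgers) {
            if (m.firstEntryIndex < 0) {
                return OptionalLong.empty();
            }
        }
        int lo = 0;
        int hi = ledgers.size() - 1;
        int found = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (ledgers.get(mid).firstEntryIndex <= offset) {
                found = mid;      // candidate; a later ledger may also qualify
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return found < 0
                ? OptionalLong.empty()
                : OptionalLong.of(ledgers.get(found).ledgerId);
    }
}
```

Once the ledger is located this way, any entry-level search only needs to cover that one ledger rather than the whole topic history. How the real implementation searches within the ledger is not shown here.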
Verifying this change
This change added tests and can be verified as follows:
- `ManagedLedgerInterceptorImplTest`, including:
  - `testSetFirstEntryIndex`: Verifies proper setting of `firstEntryIndex` during ledger creation
  - `testFindPositionByOffsetWithMissingFirstEntryIndex`: Tests backward compatibility when `firstEntryIndex` is not available
  - `testFindPositionByOffset`: Tests optimized position finding with various offset scenarios

Does this pull request potentially affect one of the following parts:
If the box was checked, please highlight the changes
- Adds a `firstEntryIndex` field to the LedgerInfo metadata structure

Documentation
- `doc`
- `doc-required`
- `doc-not-needed`
- `doc-complete`

Matching PR in forked repository
gaozhangmin#13