fix(datacell): fix memory leak in FlattenDataCell query on exception#1680
fix(datacell): fix memory leak in FlattenDataCell query on exception#1680
Conversation
…ception Co-authored-by: wxyucs <12595343+wxyucs@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Fixes an exception-path memory leak in FlattenDataCell queries by ensuring buffers returned from MemoryBlockIO::DirectReadImpl are always released even when downstream distance computations (or subsequent reads) throw.
Changes:
- Add null-safe cleanup lambdas and
try/catchguards around the batch-of-4 query loop to guaranteeRelease()runs on exceptions. - Add exception-safe cleanup to the single-item query loop with consistent
release && codesguarding. - Wrap
ComputePairVectors()body to ensure both code buffers are released on any exception.
| try { | ||
| codes1 = this->GetCodesById(idx[i], release1); | ||
| codes2 = this->GetCodesById(idx[i + 1], release2); | ||
| codes3 = this->GetCodesById(idx[i + 2], release3); | ||
| codes4 = this->GetCodesById(idx[i + 3], release4); | ||
| computer->ComputeDistsBatch4(codes1, | ||
| codes2, | ||
| codes3, | ||
| codes4, | ||
| result_dists[i], | ||
| result_dists[i + 1], | ||
| result_dists[i + 2], | ||
| result_dists[i + 3]); | ||
| } catch (...) { | ||
| release_batch(); | ||
| throw; | ||
| } | ||
| release_batch(); | ||
| } | ||
| for (; i < id_count; ++i) { | ||
| bool release = false; | ||
| const auto* codes = this->GetCodesById(idx[i], release); | ||
| computer->ComputeDist(codes, result_dists + i); | ||
| if (release) { | ||
| const uint8_t* codes = nullptr; | ||
| try { | ||
| codes = this->GetCodesById(idx[i], release); | ||
| computer->ComputeDist(codes, result_dists + i); | ||
| } catch (...) { | ||
| if (release && codes) { | ||
| this->io_->Release(codes); | ||
| } | ||
| throw; | ||
| } | ||
| if (release && codes) { | ||
| this->io_->Release(codes); | ||
| } |
There was a problem hiding this comment.
Missing regression test coverage for the new exception-safety paths. The leak fix is specifically about exceptions thrown between GetCodesById() and Release(), but current unit tests for FlattenDataCell appear to exercise only the non-throwing query path. Please add a Catch2 test that forces need_release=true (e.g., by using block_memory_io with a small block size / boundary-crossing reads) and then injects an exception from ComputeDist/ComputeDistsBatch4 (or an equivalent hook) to assert the allocated buffers are released (e.g., via a counting allocator or LSAN/ASAN expectation).
Co-authored-by: wxyucs <12595343+wxyucs@users.noreply.github.com>
Codecov Report❌ Patch coverage is @@ Coverage Diff @@
## main #1680 +/- ##
==========================================
+ Coverage 90.66% 90.91% +0.25%
==========================================
Files 329 329
Lines 19455 19474 +19
==========================================
+ Hits 17639 17705 +66
+ Misses 1816 1769 -47
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
…1680) * Initial plan * Fix memory leak in FlattenDataCell query and ComputePairVectors on exception Co-authored-by: wxyucs <12595343+wxyucs@users.noreply.github.com> * fix(datacell): fix memory leak in FlattenDataCell query on exception Co-authored-by: wxyucs <12595343+wxyucs@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: wxyucs <12595343+wxyucs@users.noreply.github.com>
…1680) * Initial plan * Fix memory leak in FlattenDataCell query and ComputePairVectors on exception Co-authored-by: wxyucs <12595343+wxyucs@users.noreply.github.com> * fix(datacell): fix memory leak in FlattenDataCell query on exception Co-authored-by: wxyucs <12595343+wxyucs@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: wxyucs <12595343+wxyucs@users.noreply.github.com>
…1680) * Initial plan * Fix memory leak in FlattenDataCell query and ComputePairVectors on exception Co-authored-by: wxyucs <12595343+wxyucs@users.noreply.github.com> * fix(datacell): fix memory leak in FlattenDataCell query on exception Co-authored-by: wxyucs <12595343+wxyucs@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: wxyucs <12595343+wxyucs@users.noreply.github.com>
When
MemoryBlockIO::DirectReadImplallocates a buffer across block boundaries (need_release = true), exceptions thrown by subsequentGetCodesByIdcalls orComputeDistsBatch4/ComputeDistskip the correspondingRelease()calls, leaking memory.Changes
query()batch-of-4 loop: Initialize all four code pointers tonullptr; wrapGetCodesById× 4 +ComputeDistsBatch4in try/catch with arelease_batchlambda that null-guards each pointer before releasing.query()single-item loop: WrapGetCodesById+ComputeDistin try/catch; release on both exception and normal paths with consistentrelease && codesguard.ComputePairVectors(): Same pattern — initialize tonullptr,release_pairlambda, try/catch guarantees cleanup on exception.Original prompt
💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.