fix: add INVALID_ENTRY_POINT guard to iterative KnnSearch and RangeSearch#1772
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a helper function make_empty_dataset_with_stats to ensure that empty search results include search statistics. This helper is integrated into KnnSearch, RangeSearch, and SearchWithRequest. Feedback suggests updating an additional early exit condition in KnnSearch to use this new helper for better consistency.
| if (cur_count == 0) { | ||
| auto dataset_result = DatasetImpl::MakeEmptyDataset(); | ||
| return dataset_result; | ||
| return make_empty_dataset_with_stats(); | ||
| } |
There was a problem hiding this comment.
While this change correctly returns an empty dataset with statistics when cur_count is 0, there's an earlier check in this function (around line 322): if (GetNumElements() == 0). This check returns an empty dataset without statistics, which makes the behavior for an empty index inconsistent. To ensure the fix is complete, that initial check should also be updated to use make_empty_dataset_with_stats().
…arch Signed-off-by: Xiangyu Wang <wxy407827@antgroup.com>
ca68170 to
a5391fd
Compare
There was a problem hiding this comment.
Pull request overview
This PR backports a fix to prevent iterative KnnSearch and RangeSearch from crashing when the HGraph entry point is invalid (notably on empty indexes), and introduces a shared helper intended to return an empty dataset while preserving search statistics.
Changes:
- Add
make_empty_dataset_with_stats()helper for empty-result returns. - Guard iterative
KnnSearchagainstINVALID_ENTRY_POINT. - Guard
RangeSearchagainstINVALID_ENTRY_POINT.
src/algorithm/hgraph.cpp
Outdated
| SearchStatistics stats; | ||
| auto dataset_result = DatasetImpl::MakeEmptyDataset(); | ||
| dataset_result->Statistics(stats.Dump()); | ||
| return dataset_result; |
There was a problem hiding this comment.
make_empty_dataset_with_stats() currently won't compile on this branch: SearchStatistics is not defined anywhere in the repository, and vsag::Dataset/DatasetImpl has no Statistics(...) member (see include/vsag/dataset.h). Either backport the missing statistics types/APIs along with this change, or remove the statistics plumbing and just return DatasetImpl::MakeEmptyDataset() here.
| SearchStatistics stats; | |
| auto dataset_result = DatasetImpl::MakeEmptyDataset(); | |
| dataset_result->Statistics(stats.Dump()); | |
| return dataset_result; | |
| return DatasetImpl::MakeEmptyDataset(); |
| InnerSearchParam search_param; | ||
| search_param.ep = this->entry_point_id_; | ||
| search_param.topk = 1; | ||
| search_param.ef = 1; | ||
|
|
||
| if (search_param.ep == INVALID_ENTRY_POINT) { | ||
| return make_empty_dataset_with_stats(); | ||
| } |
There was a problem hiding this comment.
HGraph::RangeSearch still doesn't take global_mutex_ as a shared lock, unlike KnnSearch/SearchWithRequest. Since RangeSearch reads entry_point_id_ and iterates route_graphs_ (which are mutated under global_mutex_ during add/resize), this is a data race under the advertised concurrent add/search features; the new INVALID_ENTRY_POINT guard doesn’t address that. Consider adding a std::shared_lock on global_mutex_ before reading entry_point_id_ / traversing graphs.
Signed-off-by: Xiangyu Wang <wxy407827@antgroup.com>
Codecov Report❌ Patch coverage is ❌ Your patch status has failed because the patch coverage (25.00%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage. @@ Coverage Diff @@
## 0.16 #1772 +/- ##
==========================================
- Coverage 92.24% 92.07% -0.17%
==========================================
Files 295 295
Lines 15714 15718 +4
==========================================
- Hits 14496 14473 -23
- Misses 1218 1245 +27
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
Summary
Related