Skip to content

feat(pyramid): move topk_factor to IndexSearchParameter base class#1605

Merged
inabao merged 3 commits intomainfrom
feat-pyramid-topk-factor
Mar 2, 2026
Merged

feat(pyramid): move topk_factor to IndexSearchParameter base class#1605
inabao merged 3 commits intomainfrom
feat-pyramid-topk-factor

Conversation

@inabao
Copy link
Copy Markdown
Collaborator

@inabao inabao commented Feb 25, 2026

  • Extract topk_factor from HGraphSearchParameters and IVFSearchParameters
  • Add topk_factor to IndexSearchParameter for inheritance by all search parameters
  • Implement topk_factor support in Pyramid search to control reorder ratio
  • When topk_factor > 1.0, reorder_limit = limit * topk_factor

@inabao inabao self-assigned this Feb 25, 2026
@inabao inabao requested a review from wxyucs as a code owner February 25, 2026 12:02
@inabao inabao added kind/improvement Code improvements (variable/function renaming, refactoring, etc. ) version/1.0 needs-cherry-pick-release-0.18 labels Feb 25, 2026
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello @inabao, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors the topk_factor parameter by moving it to a common base class, IndexSearchParameter, making it accessible to all search algorithms. This change streamlines parameter management and enhances the flexibility of search configurations. Additionally, the Pyramid search implementation has been updated to leverage this centralized topk_factor for more precise control over the reordering process, ensuring that the reorder ratio is applied consistently.

Highlights

  • Refactoring topk_factor: The topk_factor parameter has been extracted from HGraphSearchParameters and IVFSearchParameters to promote code reuse and a more unified parameter handling.
  • Centralized topk_factor in IndexSearchParameter: The topk_factor member and its JSON parsing logic have been moved to the base class IndexSearchParameter, allowing all derived search parameter classes to inherit and utilize it.
  • Pyramid Search Reorder Control: The Pyramid search algorithm now explicitly uses the topk_factor to determine the reorder_limit, controlling the number of candidates passed to the reordering step based on the search limit.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • src/algorithm/hgraph_parameter.cpp
    • Removed topk_factor parsing logic from FromJson method.
  • src/algorithm/hgraph_parameter.h
    • Removed topk_factor member variable.
  • src/algorithm/index_search_parameter.h
    • Added topk_factor member variable with a default value.
    • Implemented parsing of SEARCH_PARAM_FACTOR into topk_factor within the constructor.
  • src/algorithm/ivf_parameter.cpp
    • Removed topk_factor parsing logic from FromJson method.
  • src/algorithm/ivf_parameter.h
    • Removed topk_factor member variable.
  • src/algorithm/pyramid.cpp
    • Modified KnnSearch and RangeSearch to pass parsed_param.topk_factor to search_impl.
    • Updated search_impl signature to accept topk_factor.
    • Implemented logic within search_impl to calculate reorder_limit using topk_factor before calling the reorder function.
  • src/algorithm/pyramid.h
    • Updated the declaration of search_impl to include a topk_factor parameter with a default value.
Activity
  • No human activity (comments, reviews) has been recorded on this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the topk_factor search parameter by moving it from HGraphSearchParameters and IVFSearchParameters to the base class IndexSearchParameter. This is a good improvement for code reuse. The topk_factor is also newly implemented in Pyramid search to control the reordering ratio. My review has identified a potential precision issue in the calculation of reorder_limit in pyramid.cpp and an unintended change in the default value of topk_factor for IVFSearchParameters which could alter search behavior.


public:
int64_t scan_buckets_count{30};
float topk_factor{2.0F};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

By removing topk_factor from IVFSearchParameters, it will now inherit this member from IndexSearchParameter, where its default value is 0.0F. Previously, the default value here was 2.0F. This change in the default value could alter the behavior for users of IVF who were relying on the default reordering factor, potentially affecting search quality.

If IVF search parameters should retain the default topk_factor of 2.0F, you could set it in the IVFSearchParameters constructor. Since the constructor is not in the diff, here is an example of how you could modify it:

// in src/algorithm/ivf_parameter.h
private:
    IVFSearchParameters() {
        topk_factor = 2.0F;
    }

if (use_reorder_) {
int64_t reorder_limit = limit;
if (topk_factor > 1.0F) {
reorder_limit = static_cast<int64_t>(static_cast<float>(limit) * topk_factor);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

When calculating reorder_limit, you're casting limit (which is an int64_t) to a float. A float has limited precision and may not accurately represent large int64_t values (it can only represent all integers exactly up to 2^24). In RangeSearch, limit can be very large, potentially leading to precision loss during the cast and an incorrect reorder_limit.

To avoid this, it's safer to perform the calculation using double, which has much higher precision.

Suggested change
reorder_limit = static_cast<int64_t>(static_cast<float>(limit) * topk_factor);
reorder_limit = static_cast<int64_t>(static_cast<double>(limit) * topk_factor);

@codecov
Copy link
Copy Markdown

codecov bot commented Feb 26, 2026

Codecov Report

❌ Patch coverage is 71.42857% with 2 lines in your changes missing coverage. Please review.

❌ Your patch status has failed because the patch coverage (71.42%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

@@            Coverage Diff             @@
##             main    #1605      +/-   ##
==========================================
+ Coverage   90.76%   90.95%   +0.19%     
==========================================
  Files         329      329              
  Lines       19477    19480       +3     
==========================================
+ Hits        17679    17719      +40     
+ Misses       1798     1761      -37     
Flag Coverage Δ
cpp 90.95% <71.42%> (+0.19%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
common 85.81% <ø> (ø)
datacell 92.20% <ø> (+0.51%) ⬆️
index 90.68% <71.42%> (+0.30%) ⬆️
simd 100.00% <ø> (ø)

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d8b78e1...57ed5c9. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

inabao and others added 3 commits February 27, 2026 15:02
- Extract topk_factor from HGraphSearchParameters and IVFSearchParameters
- Add topk_factor to IndexSearchParameter for inheritance by all search parameters
- Implement topk_factor support in Pyramid search to control reorder ratio
- When topk_factor > 1.0, reorder_limit = limit * topk_factor

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: jinjiabao.jjb <jinjiabao.jjb@antgroup.com>
Signed-off-by: jinjiabao.jjb <jinjiabao.jjb@antgroup.com>
Signed-off-by: jinjiabao.jjb <jinjiabao.jjb@antgroup.com>
@inabao inabao force-pushed the feat-pyramid-topk-factor branch from 2384016 to 57ed5c9 Compare February 27, 2026 09:11
Copy link
Copy Markdown
Collaborator

@wxyucs wxyucs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@inabao inabao merged commit 0bea8f9 into main Mar 2, 2026
29 of 30 checks passed
@inabao inabao deleted the feat-pyramid-topk-factor branch March 2, 2026 07:59
wxyucs pushed a commit that referenced this pull request Mar 2, 2026
…1605)

Signed-off-by: jinjiabao.jjb <jinjiabao.jjb@antgroup.com>
Co-authored-by: Claude <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants