fix(hnsw): fix concurrent access to max_level_ in HNSW search operations by inabao · Pull Request #1573 · antgroup/vsag

inabao · 2026-02-10T03:15:39Z

Fixed a race condition where max_level_ was read outside the protection of max_level_mutex_ in the searchKnn() and searchRange() functions.

The issue occurred when:

1. A search thread reads max_level_ without holding the lock
2. An add thread concurrently increases max_level_ and adds new nodes
3. The search thread may traverse uninitialized or invalid node IDs, causing 'cand error' exceptions
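
The interleaving in the steps above can be replayed deterministically in a single thread. The following is a toy model only, not the vsag code: ToyIndex, Add, CheckVisit and TornReadIsDetected are hypothetical names, and the exact failure path inside hnswalg.cpp is not shown in this thread. It assumes that the entry point and the maximum level are updated together by an add, and that visiting a node at a level it does not have is what surfaces as an error.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical toy model of the index state (not the vsag data structures).
struct ToyIndex {
    std::vector<int> node_levels;  // node_levels[i] = highest level of node i
    int enterpoint_node = -1;
    int max_level = -1;

    // Adds a node; a node with a higher level becomes the new entry point.
    void Add(int level) {
        node_levels.push_back(level);
        if (level > max_level) {
            enterpoint_node = static_cast<int>(node_levels.size()) - 1;
            max_level = level;
        }
    }

    // Throws if a search would visit a node at a level it does not have.
    void CheckVisit(int node, int level) const {
        const bool known =
            node >= 0 && static_cast<std::size_t>(node) < node_levels.size();
        if (!known || level > node_levels[static_cast<std::size_t>(node)]) {
            throw std::out_of_range("cand error (toy model)");
        }
    }
};

// Replays the race: the entry point is read first, an add runs in between,
// and only then is the maximum level read.
bool TornReadIsDetected() {
    ToyIndex index;
    index.Add(1);                       // node 0 is the entry point, level 1
    int entry = index.enterpoint_node;  // read that happened under the lock
    index.Add(3);                       // "concurrent" add, replayed inline
    int level = index.max_level;        // late read, outside the lock
    try {
        index.CheckVisit(entry, level); // old entry point, new level
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

In this model the search ends up with the old entry point paired with the new level, a combination that never existed in the index at any single moment.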

Changes:

- In searchKnn(): Read max_level_ inside the lock scope into a local copy
- In searchRange(): Read max_level_ inside the lock scope into a local copy

This ensures both enterpoint_node_ and max_level_ are read atomically under the same lock protection, preventing inconsistent state.
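
A minimal sketch of the pattern described here, under stated assumptions: only the names max_level_, enterpoint_node_, max_level_mutex_ and max_level_copy appear in this PR thread. The LevelState class, its methods, the use of std::shared_mutex, and the descent loop are hypothetical stand-ins, not the actual hnswalg.cpp code.

```cpp
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <utility>

// Hypothetical holder for the two fields that must be read as a pair.
class LevelState {
public:
    // Writer path (add): publish the new entry point and level together.
    void Promote(std::int64_t new_entry, int new_level) {
        std::unique_lock<std::shared_mutex> lock(max_level_mutex_);
        enterpoint_node_ = new_entry;
        max_level_ = new_level;
    }

    // Reader path (search): copy both fields inside one lock scope.
    std::pair<std::int64_t, int> Snapshot() const {
        std::shared_lock<std::shared_mutex> lock(max_level_mutex_);
        return {enterpoint_node_, max_level_};
    }

private:
    mutable std::shared_mutex max_level_mutex_;  // lock type is an assumption
    std::int64_t enterpoint_node_ = -1;
    int max_level_ = -1;
};

// Sketch of the search side: after the snapshot, only the local copies are
// used, so a concurrent Promote() cannot change the level mid-descent.
int CountLevelsToDescend(const LevelState& state) {
    auto [entry, max_level_copy] = state.Snapshot();
    if (entry < 0) {
        return 0;  // empty index, nothing to traverse
    }
    int visited = 0;
    for (int level = max_level_copy; level > 0; --level) {
        ++visited;  // a real search would run a greedy step at this level
    }
    return visited;
}
```

The point of the local copy is that the loop bound is fixed at the moment the entry point was read; later changes to max_level_ affect only searches that start afterwards.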

Fixes: cand error in concurrent add/search scenarios

Signed-off-by: jinjiabao.jjb <jinjiabao.jjb@antgroup.com>

gemini-code-assist · 2026-02-10T03:15:52Z

Summary of Changes

This pull request resolves a critical concurrency issue within the HNSW algorithm's search functions. By ensuring that the max_level_ variable is read under a shared lock, it eliminates a race condition that could cause search operations to access uninitialized or invalid data when new elements are concurrently added. This change significantly improves the stability and correctness of the HNSW index under high-concurrency workloads.

Highlights

- Race Condition Fix: Addressed a race condition in HNSW search operations where max_level_ was accessed without proper mutex protection, potentially leading to 'cand error' exceptions during concurrent add and search scenarios.
- Atomic Access to max_level_: Ensured that max_level_ is read atomically within the max_level_mutex_ scope in both searchKnn() and searchRange() functions by introducing a local copy, thereby preventing inconsistent state.

Changelog

src/algorithm/hnswlib/hnswalg.cpp
- Modified searchKnn() to read max_level_ into a local variable max_level_copy within the max_level_mutex_ lock scope, and subsequently use this local copy.
- Modified searchRange() to read max_level_ into a local variable max_level_copy within the max_level_mutex_ lock scope, and subsequently use this local copy.

gemini-code-assist

Code Review

This pull request addresses a race condition in the searchKnn and searchRange functions of the HNSW implementation. Previously, max_level_ was accessed outside the protection of max_level_mutex_, which could lead to inconsistent state and exceptions during concurrent add and search operations. The fix involves reading max_level_ into a local variable within the same critical section where enterpoint_node_ is read. This ensures both values are retrieved atomically. The changes are correct, minimal, and effectively resolve the identified concurrency issue.

LHT129

LGTM

wxyucs · 2026-02-10T03:37:17Z

@copilot, please help to review this pull request

Copilot · 2026-02-10T03:37:26Z

@wxyucs I've opened a new pull request, #1574, to work on those changes. Once the pull request is ready, I'll request review from you.

wxyucs

lgtm

codecov · 2026-02-10T05:03:06Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##             main    #1573      +/-   ##
==========================================
- Coverage   91.24%   91.01%   -0.23%     
==========================================
  Files         329      329              
  Lines       19396    19398       +2     
==========================================
- Hits        17697    17656      -41     
- Misses       1699     1742      +43

Flag	Coverage Δ
cpp	`91.01% <100.00%> (-0.23%)`	⬇️

Components	Coverage Δ
common	`85.81% <ø> (ø)`
datacell	`91.70% <ø> (-2.08%)`	⬇️
index	`90.65% <100.00%> (+0.03%)`	⬆️
simd	`100.00% <ø> (ø)`

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Codecov last update: 3dd2923...321aecb.

inabao self-assigned this Feb 10, 2026

inabao requested a review from wxyucs as a code owner February 10, 2026 03:15

inabao added the kind/bug, needs-cherry-pick-release-0.15, needs-cherry-pick-release-0.16, needs-cherry-pick-release-0.17, version/1.0, and needs-cherry-pick-release-0.18 labels Feb 10, 2026

pull-request-size bot added the size/XS label Feb 10, 2026

gemini-code-assist bot reviewed Feb 10, 2026

LHT129 approved these changes Feb 10, 2026

Copilot AI mentioned this pull request Feb 10, 2026

Review PR: fix concurrent access to max_level_ in HNSW search operations #1574

Closed

wxyucs approved these changes Feb 10, 2026

inabao merged commit 765e6cb into main Feb 10, 2026
29 of 30 checks passed

inabao deleted the fix/hnsw-concurrent-maxlevel branch February 10, 2026 09:52

wxyucs added the cherry-pick-0.15-done label Feb 11, 2026

wxyucs added the cherry-pick-0.16-done label Feb 27, 2026

wxyucs added the cherry-pick-0.17-done label Feb 27, 2026

wxyucs added the cherry-pick-0.18-done label Feb 27, 2026
