Fix UpdateHyperNode rebuild logic by hzxuzhonghu · Pull Request #5076 · volcano-sh/volcano

hzxuzhonghu · 2026-03-04T01:15:19Z

What type of PR is this?

/kind cleanup

Also makes an optimization: only rebuild when its members change.

What this PR does / why we need it:

Fixes UpdateHyperNode to properly apply spec changes, restore regex/label rebuild behavior, and rebuild caches when members change so scheduler topology state stays consistent. Updates related tests accordingly.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

volcano-sh-bot · 2026-03-04T01:15:28Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from hzxuzhonghu. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

pkg/scheduler/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

gemini-code-assist · 2026-03-04T01:15:39Z

Summary of Changes

This pull request addresses a bug in the UpdateHyperNode logic to ensure the scheduler's topology state remains consistent and accurate. It optimizes the update process by introducing intelligent conditional rebuilds, preventing full cache recomputations when only metadata changes, while guaranteeing that changes to HyperNode specifications, member lists, or dynamic selectors correctly trigger necessary updates. This leads to more efficient and reliable HyperNode management within the scheduler.

Highlights

- Refactored HyperNode Update Logic: The UpdateHyperNode function has been significantly refactored to introduce conditional rebuilds. It now uses reflect.DeepEqual to determine if the HyperNode's spec, members, or tier have changed, avoiding unnecessary full cache rebuilds.
- Improved Handling of Regex/Label Members: A new mechanism ensures that HyperNodes with regex or label-matching members always trigger a rebuild, even if their spec hasn't changed, to account for dynamic cluster node state changes that could affect their resolved node set.
- Enhanced Parent-Child Relationship Management: The updateParent function now returns a set of 'freed' members whose parent pointers were cleared. This allows for immediate rebuilding of other HyperNodes that might claim these newly available members, resolving potential adoption issues.
- Streamlined Cache Rebuild Process: The rebuildCache and updateAncestors functions have been updated to accept the current list of nodes as an argument, preventing redundant node list fetches. Error tracking via builtErrHyperNode has been removed.
- Expanded Test Coverage: New test cases have been added to validate the conditional rebuild logic for scenarios involving no spec change, regex members, tier-only changes, and member changes, ensuring the robustness of the updated UpdateHyperNode behavior.
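
For readers skimming the thread, the conditional-rebuild decision described in these highlights might look roughly like the sketch below. It is not the PR's code: `Spec`, `MemberSpec`, `decide`, and `hasDynamicMember` are hypothetical, simplified stand-ins, and the treatment of a tier-only change (store the spec, skip the member-cache rebuild) is an assumption, since the thread does not show what the real code does in that case.

```go
package main

import "fmt"

// MemberSpec and Spec are hypothetical, simplified stand-ins for the
// HyperNode API types; the real Volcano types carry more fields.
type MemberSpec struct {
	Exact     string // set for exact-match members
	Regex     string // set for regex-match members
	LabelExpr string // set for label-match members (simplified to a string)
}

type Spec struct {
	Tier    int
	Members []MemberSpec
}

// hasDynamicMember reports whether any member is resolved against live
// cluster state (regex or label selector) rather than by exact name.
func hasDynamicMember(s Spec) bool {
	for _, m := range s.Members {
		if m.Regex != "" || m.LabelExpr != "" {
			return true
		}
	}
	return false
}

// membersEqual compares two member lists element by element, in order.
func membersEqual(a, b []MemberSpec) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// decide returns whether the new spec should be stored and whether the
// member cache should be rebuilt. Assumption: a tier-only change stores the
// spec without rebuilding; dynamic selectors always rebuild.
func decide(old *Spec, updated Spec) (applySpec, rebuild bool) {
	if old == nil {
		return true, true // first time this HyperNode is seen
	}
	tierChanged := old.Tier != updated.Tier
	membersChanged := !membersEqual(old.Members, updated.Members)
	applySpec = tierChanged || membersChanged
	rebuild = membersChanged || hasDynamicMember(updated)
	return applySpec, rebuild
}

func main() {
	exact := Spec{Tier: 1, Members: []MemberSpec{{Exact: "node-0"}}}
	regex := Spec{Tier: 1, Members: []MemberSpec{{Regex: "^node-.*$"}}}

	apply, rebuild := decide(&exact, exact)
	fmt.Println("exact, unchanged:", apply, rebuild)

	apply, rebuild = decide(&regex, regex)
	fmt.Println("regex, unchanged:", apply, rebuild)
}
```

The point of separating the two return values is that "the spec differs" and "the resolved node set may differ" are different questions; a regex member can change the second without touching the first.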

Changelog

pkg/scheduler/api/hyper_node_info.go
- Imported the reflect package for deep equality comparisons.
- Removed the builtErrHyperNode field from the HyperNodesInfo struct.
- Refactored UpdateHyperNode to conditionally apply spec changes and rebuild caches based on specChanged, membersChanged, and tierChanged flags.
- Added a fast-path in UpdateHyperNode to skip rebuilds for exact-match HyperNodes when the spec is unchanged.
- Introduced the hyperNodeHasRegexOrLabelMember function to identify HyperNodes with dynamic member selectors.
- Modified updateParent to return a set of freed member names.
- Removed hni.builtErrHyperNode assignments from addChild and resetParent functions.
- Added the hyperNodesThatClaimMember function to find HyperNodes claiming a specific member.
- Updated updateAncestors to fetch the node list once and pass it to rebuildCache.
- Modified rebuildCache to accept a list of nodes and an ancestor set, and removed internal node list fetching.
pkg/scheduler/api/hyper_node_info_test.go
- Imported fmt and sync/atomic packages.
- Added TestHyperNodesInfo_UpdateHyperNode_NoSpecChange to verify no rebuild occurs for exact-match HyperNodes with unchanged specs.
- Added TestHyperNodesInfo_UpdateHyperNode_RegexMember_AlwaysRebuilds to confirm rebuilds for regex-matching HyperNodes even with unchanged specs.
- Added TestHyperNodesInfo_UpdateHyperNode_TierOnlyChange to test rebuild behavior when only the tier changes.
- Added TestHyperNodesInfo_UpdateHyperNode_MembersChangeTriggersRebuild to validate rebuilds upon member list changes.
- Removed the correctHyperNode field from the test struct in TestHyperNodesInfo_UpdateHyperNode_MultipleParents.
- Added a fmt.Printf statement for debugging in TestHyperNodesInfo_UpdateHyperNode_MultipleParents.
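
The changelog entries for updateAncestors and rebuildCache describe listing nodes once and threading the slice through. A minimal sketch of that pattern follows; `countingLister` and the string-based cache are hypothetical simplifications, and the function names are reused from the changelog only for readability, not because the bodies match the real ones.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// countingLister is a hypothetical node lister that records how many times
// List is called, so the "list once" property can be observed.
type countingLister struct {
	nodes []string
	calls atomic.Int64
}

func (l *countingLister) List() []string {
	l.calls.Add(1)
	return l.nodes
}

// rebuildCache receives the pre-listed nodes instead of listing them itself.
// Simplification: every HyperNode just records a copy of the full node list.
func rebuildCache(hyperNode string, nodes []string, cache map[string][]string) {
	cache[hyperNode] = append([]string(nil), nodes...)
}

// updateAncestors lists nodes once and reuses the slice for every ancestor.
func updateAncestors(ancestors []string, l *countingLister, cache map[string][]string) {
	nodes := l.List()
	for _, a := range ancestors {
		rebuildCache(a, nodes, cache)
	}
}

func main() {
	l := &countingLister{nodes: []string{"node-0", "node-1"}}
	cache := map[string][]string{}
	updateAncestors([]string{"s0", "s1", "s2"}, l, cache)
	fmt.Println(l.calls.Load(), len(cache)) // prints "1 3"
}
```

With the listing hoisted out of rebuildCache, the number of list calls stays at one no matter how deep the ancestor chain is.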

gemini-code-assist

Code Review

This pull request provides a significant and well-implemented refactoring of the UpdateHyperNode logic. The changes correctly handle spec modifications, member updates, and tier changes, introducing a performance-optimizing fast-path for updates that don't require a full cache rebuild. The new logic properly manages hypernodes with regex or label selectors that depend on the live cluster state. A key improvement is the handling of member removal, which now triggers a rebuild for other hypernodes that might adopt the freed members, resolving potential parent-child relationship issues. The accompanying tests are thorough and cover the new scenarios effectively. My review identified one minor issue: a leftover debug print statement in a test file that should be removed.
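
The freed-member handling mentioned in this review can be illustrated with a small sketch. Everything here is a hypothetical simplification: `topology`, `claimers`, and `adoptFreed` are invented for illustration, and "adoption" stands in for the cache rebuild the real code is described as performing on HyperNodes that claim a freed member.

```go
package main

import (
	"fmt"
	"sort"
)

// topology is a hypothetical, simplified stand-in for HyperNodesInfo.
type topology struct {
	members map[string][]string // HyperNode name -> child names it lists
	parent  map[string]string   // child name -> current parent
}

// updateParent applies hn's new member list, clears the parent pointer of
// children hn no longer lists, and returns the set of freed children.
func (t *topology) updateParent(hn string, newMembers []string) map[string]struct{} {
	keep := map[string]struct{}{}
	for _, m := range newMembers {
		keep[m] = struct{}{}
	}
	freed := map[string]struct{}{}
	for _, old := range t.members[hn] {
		if _, ok := keep[old]; !ok && t.parent[old] == hn {
			delete(t.parent, old)
			freed[old] = struct{}{}
		}
	}
	t.members[hn] = newMembers
	for _, m := range newMembers {
		if _, taken := t.parent[m]; !taken {
			t.parent[m] = hn
		}
	}
	return freed
}

// claimers returns the HyperNodes other than except that list member,
// sorted so the result is deterministic.
func (t *topology) claimers(member, except string) []string {
	var out []string
	for hn, ms := range t.members {
		if hn == except {
			continue
		}
		for _, m := range ms {
			if m == member {
				out = append(out, hn)
				break
			}
		}
	}
	sort.Strings(out)
	return out
}

// adoptFreed lets the first claimer adopt each freed member.
func (t *topology) adoptFreed(freed map[string]struct{}, except string) {
	for m := range freed {
		if c := t.claimers(m, except); len(c) > 0 {
			t.parent[m] = c[0]
		}
	}
}

func main() {
	t := &topology{
		members: map[string][]string{"s0": {"n0", "n1"}, "s1": {"n1"}},
		parent:  map[string]string{"n0": "s0", "n1": "s0"},
	}
	freed := t.updateParent("s0", []string{"n0"}) // s0 drops n1
	t.adoptFreed(freed, "s0")
	fmt.Println(t.parent["n1"]) // prints "s1"
}
```

Returning the freed set from updateParent is what makes the follow-up step cheap: only HyperNodes that list a freed member need to be revisited, not the whole topology.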

gemini-code-assist · 2026-03-04T01:17:44Z

pkg/scheduler/api/hyper_node_info_test.go

 			}
 			assert.Equal(t, false, hni.Ready())

+			fmt.Printf("%v\n", hni.HyperNodesInfo())


This fmt.Printf appears to be a leftover debugging statement. It should be removed to keep the test output clean.

Copilot

Pull request overview

Fixes UpdateHyperNode cache rebuild behavior so scheduler topology state stays consistent when HyperNode specs change, regex/label selectors are involved, or members are released and can be adopted by other HyperNodes.

Changes:

- Add spec-diffing and a fast-path to skip rebuilds when spec is unchanged and members are exact-match only.
- Reintroduce “always rebuild” behavior for regex/label selectors and rebuild caches when members change (including rebuilding claimers of newly-freed members).
- Update and expand UpdateHyperNode unit tests to cover no-op updates, tier-only updates, regex rebuilds, and member-change rebuilds.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| pkg/scheduler/api/hyper_node_info.go | Refactors `UpdateHyperNode` rebuild logic, adds regex/label rebuild detection, rebuilds affected claimers, and threads a pre-listed node slice into rebuild functions. |
| pkg/scheduler/api/hyper_node_info_test.go | Adds focused tests for new rebuild semantics and adjusts existing multiple-parent test coverage. |

Copilot · 2026-03-04T01:20:57Z

pkg/scheduler/api/hyper_node_info_test.go

+			fmt.Printf("%v\n", hni.HyperNodesInfo())
+


Avoid printing directly to stdout in tests (fmt.Printf). This adds noisy output to go test runs and forces an extra import. If you need diagnostics, use t.Logf/t.Log (or remove the line entirely) and drop the fmt import.

Forgot to delete it.

hzxuzhonghu · 2026-03-04T06:35:30Z

cc @Monokaix

hajnalmt

This PR looks overall good to me 😊 This is a really great improvement!
I had 2 minor comments, but they are really minor.

hajnalmt · 2026-03-04T08:18:54Z

pkg/scheduler/api/hyper_node_info.go

+	tierChanged := true
+	if exists && old.HyperNode != nil {
+		oldSpec := old.HyperNode.Spec
+		specChanged = !reflect.DeepEqual(oldSpec, hn.Spec)


I feel that reflect here is comparatively slow and a little overkill for a struct with 3 fields. Plus, these flags are dependent: specChanged will be true whenever membersChanged or tierChanged is true. I would prefer to use the k8s equality library for this instead of reflect.

	oldSpec := old.HyperNode.Spec
	newSpec := hn.Spec
	tierChanged := oldSpec.Tier != newSpec.Tier
	membersChanged := !equality.Semantic.DeepEqual(oldSpec.Members, newSpec.Members)
	specChanged := tierChanged || membersChanged || oldSpec.TierName != newSpec.TierName

This is somewhat more exact.
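
To make the suggestion above easier to try in isolation, here is a standard-library-only sketch of the same field-wise diff. The types are hypothetical, trimmed stand-ins for the real HyperNode spec, and `membersEqual` is only an approximation of `equality.Semantic.DeepEqual`: it uses `reflect.DeepEqual` plus an explicit check so that a nil and an empty member list compare as equal, which is how the Kubernetes semantic comparison is documented to behave and which plain `reflect.DeepEqual` does not do.

```go
package main

import (
	"fmt"
	"reflect"
)

// MemberSpec and HyperNodeSpec are hypothetical, trimmed stand-ins for the
// real API types, kept just large enough to show the comparison.
type MemberSpec struct {
	Type       string
	ExactMatch string
}

type HyperNodeSpec struct {
	Tier     int
	TierName string
	Members  []MemberSpec
}

// specDiff carries the three flags discussed in the review comment.
type specDiff struct {
	tierChanged    bool
	membersChanged bool
	specChanged    bool
}

// membersEqual approximates a semantic comparison with the standard library:
// nil and empty lists are equal, otherwise fall back to reflect.DeepEqual.
func membersEqual(a, b []MemberSpec) bool {
	if len(a) == 0 && len(b) == 0 {
		return true
	}
	return reflect.DeepEqual(a, b)
}

// diffSpec compares the fields one by one, so specChanged is derived from
// the narrower flags instead of a second whole-struct comparison.
func diffSpec(oldSpec, newSpec HyperNodeSpec) specDiff {
	d := specDiff{
		tierChanged:    oldSpec.Tier != newSpec.Tier,
		membersChanged: !membersEqual(oldSpec.Members, newSpec.Members),
	}
	d.specChanged = d.tierChanged || d.membersChanged || oldSpec.TierName != newSpec.TierName
	return d
}

func main() {
	oldSpec := HyperNodeSpec{Tier: 1, TierName: "rack", Members: nil}
	newSpec := HyperNodeSpec{Tier: 1, TierName: "rack", Members: []MemberSpec{}}
	fmt.Printf("%+v\n", diffSpec(oldSpec, newSpec))

	newSpec.TierName = "switch"
	fmt.Printf("%+v\n", diffSpec(oldSpec, newSpec))
}
```

Deriving specChanged from the individual flags keeps the three values consistent by construction, which is the dependency the comment points out.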

hajnalmt · 2026-03-04T09:06:52Z

pkg/scheduler/api/hyper_node_info_test.go

+			fmt.Printf("%v\n", hni.HyperNodesInfo())
+


Please delete it, this PR looks overall good to me.

hzxuzhonghu · 2026-03-05T03:45:02Z

Thank you for your review, @hajnalmt. I will try to address all the comments.

Signed-off-by: Zhonghu Xu <xuzhonghu@huawei.com>

Copilot AI review requested due to automatic review settings March 4, 2026 01:15

volcano-sh-bot requested review from JesseStutler and hajnalmt March 4, 2026 01:15

volcano-sh-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 4, 2026

Copilot started reviewing on behalf of hzxuzhonghu March 4, 2026 01:15

gemini-code-assist bot reviewed Mar 4, 2026

Copilot AI reviewed Mar 4, 2026

hajnalmt reviewed Mar 4, 2026

hajnalmt mentioned this pull request Mar 4, 2026

Optimize CI with reusable image workflow and skip-rebuild support #5081

Merged

Fix UpdateHyperNode rebuild logic

2f2fad4

Signed-off-by: Zhonghu Xu <xuzhonghu@huawei.com>

hzxuzhonghu force-pushed the fix/update-hypernode-rebuild branch from 5992776 to 2f2fad4 Compare March 16, 2026 12:19
