-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Add logic to force terminate the VM if the session lock can't be acquired for 30 seconds when the service stops #13493
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ired for 30 seconds when the service stops
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds timeout logic to force terminate the WSL VM when the session lock cannot be acquired within 30 seconds during service shutdown, addressing installation hangs caused by deadlocks on the session lock.
- Adds a new
ShutdownBehaviorenum with timeout options for VM shutdown - Changes the session mutex from
std::recursive_mutextostd::recursive_timed_mutexto support timeout operations - Implements timeout logic that force terminates the VM after 30 seconds if the lock can't be acquired
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| LxssUserSession.h | Adds ShutdownBehavior enum and changes mutex type to support timeouts |
| LxssUserSession.cpp | Implements timeout logic and refactors shutdown method to use new behavior enum |
| LxssUserSessionFactory.cpp | Updates shutdown call to use new ForceAfter30Seconds behavior |
| auto unlock = wil::scope_exit_log(WI_DIAGNOSTICS_INFO, [this, &locked]() { | ||
| if (locked) | ||
| { | ||
| m_instanceLock.unlock(); | ||
| } | ||
| }); |
Copilot
AI
Sep 18, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The scope_exit handler captures locked by reference, but locked will go out of scope when the function returns. This could lead to undefined behavior if the scope_exit handler is invoked after the function returns. Consider capturing locked by value or using a different synchronization approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is incorrect. Capturing locked by value would break the logic of this scope_exit
| bool locked = false; | ||
| auto unlock = wil::scope_exit_log(WI_DIAGNOSTICS_INFO, [this, &locked]() { | ||
| if (locked) | ||
| { | ||
| m_instanceLock.unlock(); | ||
| } | ||
| }); | ||
|
|
||
| if (Behavior == ShutdownBehavior::ForceAfter30Seconds) | ||
| { | ||
| if (m_instanceLock.try_lock_for(std::chrono::seconds(30))) | ||
| { | ||
| locked = true; | ||
| } | ||
| else | ||
| { | ||
| WSL_LOG("VmShutdownLockTimedOut"); | ||
| forceTerminate(); | ||
| } | ||
| } | ||
|
|
||
| if (!locked) | ||
| { | ||
| m_instanceLock.lock(); | ||
| locked = true; | ||
| } | ||
|
|
Copilot
AI
Sep 18, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If Behavior != ShutdownBehavior::ForceAfter30Seconds, the locked variable remains false and this block will always execute a blocking lock. This logic could be simplified by handling the different shutdown behaviors more explicitly to improve code clarity.
| bool locked = false; | |
| auto unlock = wil::scope_exit_log(WI_DIAGNOSTICS_INFO, [this, &locked]() { | |
| if (locked) | |
| { | |
| m_instanceLock.unlock(); | |
| } | |
| }); | |
| if (Behavior == ShutdownBehavior::ForceAfter30Seconds) | |
| { | |
| if (m_instanceLock.try_lock_for(std::chrono::seconds(30))) | |
| { | |
| locked = true; | |
| } | |
| else | |
| { | |
| WSL_LOG("VmShutdownLockTimedOut"); | |
| forceTerminate(); | |
| } | |
| } | |
| if (!locked) | |
| { | |
| m_instanceLock.lock(); | |
| locked = true; | |
| } | |
| auto unlock = wil::scope_exit_log(WI_DIAGNOSTICS_INFO, [this]() { | |
| m_instanceLock.unlock(); | |
| }); | |
| switch (Behavior) | |
| { | |
| case ShutdownBehavior::ForceAfter30Seconds: | |
| if (!m_instanceLock.try_lock_for(std::chrono::seconds(30))) | |
| { | |
| WSL_LOG("VmShutdownLockTimedOut"); | |
| forceTerminate(); | |
| // Lock not acquired, so skip further shutdown steps. | |
| return; | |
| } | |
| break; | |
| default: | |
| m_instanceLock.lock(); | |
| break; | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's the point. We should not access member variables without the lock
benhillis
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
![]()
Summary of the Pull Request
This change adds new logic to force terminate the VM if we can't acquire the user session lock for 30 seconds. This should help avoid MSI installation being stuck because wslservice is waiting on the lock to terminate the VM (See #11013).
Thiswill only help when upgrading into a build that contains this commit, so we should expect the new upgrade to show the same issues.
PR Checklist
Detailed Description of the Pull Request / Additional comments
Validation Steps Performed