Skip to content

Conversation

@kralka
Copy link

@kralka kralka commented Oct 25, 2025

Make Dict::lookup check if the dictionary has been modified. This is a workaround for a possible segfault. See bpo-46615 for the original discussion in CPython.

Summary by CodeRabbit

  • Bug Fixes
    • Improved dictionary iteration robustness: operations that modify a collection during iteration now produce a clear runtime error ("set changed size during iteration") instead of risking crashes, preventing silent corruption and improving stability.

Make Dict::lookup check if the dictionary has been modified. This is a
workaround for a possible segfault. See bpo-46615 for the original
discussion in CPython.
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 25, 2025

Walkthrough

Replaces unchecked access in DictInner::lookup with bound-checked get() and explicit concurrent-modification detection. Captures the original dictionary size and returns a runtime error if the size changes during iteration or when an expected index entry is missing; preserves main control flow otherwise.

Changes

Cohort / File(s) Summary
Dict lookup safety
vm/src/dict_inner.rs
Replaced unchecked indexing into inner.indices with get() checks in DictInner::lookup. Captures original_size and returns a runtime error ("set changed size during iteration") when the dict size changes during iteration or when an expected index slot is None. Inserts size-change checks after selecting FREE slots and before final return. No public signatures changed.

Sequence Diagram

sequenceDiagram
    participant Caller as caller
    participant Lookup as DictInner::lookup
    participant Indices as inner.indices

    Caller->>Lookup: start lookup
    Lookup->>Lookup: capture original_size
    loop iterate slots
        Lookup->>Indices: indices.get(idx)
        alt entry present
            Indices-->>Lookup: Some(entry)
            alt entry is FREE
                Lookup->>Lookup: check size == original_size
                alt size changed
                    Lookup-->>Caller: error "set changed size during iteration"
                else size same
                    Lookup->>Lookup: insert FREE slot / continue search
                end
            else entry occupied
                Lookup-->>Caller: return entry
            end
        else entry missing (None)
            Indices-->>Lookup: None
            Lookup-->>Caller: error "set changed size during iteration"
        end
    end
    Note over Lookup: final size-check before returning
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Single file but introduces control-flow and runtime-error semantics.
  • Areas to review closely:
    • Correctness of original_size capture and all size-change checks.
    • Error messages and error type consistency with surrounding runtime.
    • Behavior when indices.get() returns None versus retry semantics.
    • Any performance implications of added checks.

Poem

🐰
I hopped around the index tree,
I checked each nook with careful glee.
If sizes shift mid-lookup race,
I stop, I shout — no silent chase.
Safe burrows now, my code's a cozy place. 🥕

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "Enable TestMethodsMutating" is partially related to the changeset. The title accurately reflects one important outcome of the change—enabling a test that was previously blocked or failing. However, it doesn't capture the main technical work, which is adding concurrent modification detection to DictInner::lookup to prevent segfaults. The title communicates what the PR accomplishes at a high level (enabling the test) rather than the core mechanism that enables it (dictionary safety checks). A developer scanning the history would understand that this PR enables a specific test, though they wouldn't immediately grasp the underlying technical implementation.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3440811 and 71066ce.

⛔ Files ignored due to path filters (1)
  • Lib/test/test_set.py is excluded by !Lib/**
📒 Files selected for processing (1)
  • vm/src/dict_inner.rs (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • vm/src/dict_inner.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run tests under miri
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Check Rust code with rustfmt and clippy
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Ensure compilation on various targets

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1d23f6e and 3440811.

⛔ Files ignored due to path filters (1)
  • Lib/test/test_set.py is excluded by !Lib/**
📒 Files selected for processing (1)
  • vm/src/dict_inner.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Format Rust code with the default rustfmt style (run cargo fmt)
Run clippy and fix any warnings or lints introduced by your changes
Follow Rust best practices for error handling and memory management

Files:

  • vm/src/dict_inner.rs
{vm,stdlib}/**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use RustPython macros (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • vm/src/dict_inner.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run tests under miri
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Check Rust code with rustfmt and clippy
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Ensure compilation on various targets

Comment on lines 585 to 590
let index_entry_ptr = inner.indices.get(index_index);
if index_entry_ptr.is_none() {
// Dictionary was modified under our hands, see TestMethodsMutating.
continue 'outer;
}
let index_entry = *index_entry_ptr.unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Incomplete fix: idxs must be reset when concurrent modification is detected.

The safety check correctly detects when the dictionary is modified concurrently (e.g., when user's __eq__ method clears the dictionary, shrinking it from size 16 to 8), but idxs retains its stale mask. When continuing the outer loop, line 580-582 will reuse the existing GenIndexes via get_or_insert_with, causing idxs.next() to keep generating out-of-bounds indices and triggering an infinite retry loop.

Apply this diff to reset idxs when concurrent modification is detected:

                 let index_entry_ptr = inner.indices.get(index_index);
                 if index_entry_ptr.is_none() {
                     // Dictionary was modified under our hands, see TestMethodsMutating.
+                    idxs = None;
                     continue 'outer;
                 }
                 let index_entry = *index_entry_ptr.unwrap();
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let index_entry_ptr = inner.indices.get(index_index);
if index_entry_ptr.is_none() {
// Dictionary was modified under our hands, see TestMethodsMutating.
continue 'outer;
}
let index_entry = *index_entry_ptr.unwrap();
let index_entry_ptr = inner.indices.get(index_index);
if index_entry_ptr.is_none() {
// Dictionary was modified under our hands, see TestMethodsMutating.
idxs = None;
continue 'outer;
}
let index_entry = *index_entry_ptr.unwrap();
🤖 Prompt for AI Agents
In vm/src/dict_inner.rs around lines 585 to 590, when the
concurrent-modification check finds index_entry_ptr.is_none() you must reset the
cached GenIndexes (idxs) before continuing the 'outer loop so the next iteration
will recreate it with the current mask; modify the branch to clear/take the idxs
(set to None or take ownership) immediately prior to continue 'outer to avoid
reusing a stale mask that causes out-of-bounds indices and an infinite retry
loop.

@kralka
Copy link
Author

kralka commented Oct 25, 2025

I will investigate later this weekend. Locally (Linux -- rust docker container) the test passes: target/release/rustpython -m test -j 1 -u all test_set seems to work for me. Same RUSTPYTHONPATH=Lib cargo run --release Lib/test/test_set.py gives OK (skipped=63, expected failures=15).

@youknowone
Copy link
Member

Thank you!

// Safety: index_index is generated
inner.indices.get_unchecked(index_index)
};
let index_entry_ptr = inner.indices.get(index_index);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would something like this work?

let index_entry = match inner.indices.get(index_index) {
  None => continue 'outer,
  Some(v) => v,
};

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, done, thank you.

@kralka
Copy link
Author

kralka commented Oct 28, 2025

Ok, only the Ubuntu is running the platform independent tests (based on the workflow file). And the test is flaky so that it crashed always the same in a single run on my computer. I let it run 1000 times yesterday.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants