Skip to content

Conversation

@robtaylor
Copy link
Owner

Summary

  • On macOS, always pass -syslibroot to the linker pointing to the macOS SDK (detected via xcrun --show-sdk-path)
  • This allows openvaf-r to compile Verilog-A to OSDI without requiring the LLVM_SYS_181_PREFIX environment variable at runtime

Problem

Previously, linking would fail with ld: library 'System' not found when LLVM_SYS_181_PREFIX was not set because the system linker didn't know where to find system libraries.

Test plan

  • Tested locally on macOS (Apple Silicon) without LLVM_SYS_181_PREFIX set
  • Verified CCCS and EKV test models compile successfully
  • Verified output OSDI files have correct symbols

🤖 Generated with Claude Code

robtaylor and others added 30 commits November 23, 2025 13:21
…eprecation

The deprecated IndexSet::remove() method was replaced with swap_remove() in
the small-signal network analysis. However, this exposed a latent order-
dependency bug where the analysis results differed based on iteration order.

Root Cause:
The algorithm had a circular dependency where analyze_value() checks if
values are in small_signal_vals, but membership in that set depends on the
analysis results. With platform-specific hash ordering (ahash::RandomState),
this caused reactive/resistive contribution counts to swap on Windows MSYS2.

Solution:
Implemented an order-independent fixed-point algorithm with four phases:

1. Speculatively add ALL candidate nodes to small_signal_vals
2. Evaluate all candidates against this consistent set state
3. Remove speculative nodes that weren't confirmed
4. Add confirmed flows and remove resolved candidates

This ensures all candidates see the same set state during evaluation,
making the analysis deterministic regardless of iteration order while
still supporting circular dependencies (e.g., noise nodes).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
The compiler was producing non-deterministic output due to use of
ahash's RandomState, which uses random seeding for its hash function.
This caused iteration order of hash-based collections to vary between
runs, leading to different compilation results and sporadic CI failures.

Root cause: GVN (Global Value Numbering) used ahash::RandomState for
hashing expressions. Different hash values across runs caused different
equivalence class assignments and leader selections, which cascaded:
1. GVN picks different instruction leaders non-deterministically
2. Values get replaced differently via replace_uses()
3. op_dependent_insts BitSet gets populated differently
4. determine_evaluation() makes different linearization decisions
5. Noise sources/implicit equations are created vs. linearized differently

This manifested as entire noise sources or implicit equations appearing
or disappearing between runs - not just ordering differences.

Changes:
- Replace ahash::RandomState with BuildHasherDefault<FxHasher> in GVN
- Replace AHashMap/AHashSet with IndexMap/IndexSet (with FxHasher) or
  HashMap/HashSet (with FxHasher) throughout the codebase
- Use FxHasher consistently for deterministic hashing
- Remove unused ahash dependencies from several crates

Affected crates: mir_opt, mir_autodiff, sim_back, osdi, hir_def,
hir_lower, basedb, mir, vfs, typed_indexmap, sourcegen

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
In one single commit to audit the differences better
allowing compilation on apple darwin on arm64
This error was discovered by checking the openvaf-r check, thank you.
robtaylor and others added 24 commits November 26, 2025 22:40
LLVM 18's ld64.lld requires -platform_version flags on both x86_64 and ARM64.
Also fixes CI to use correct GitHub runners (macos-14 for ARM64, macos-13 for x86_64).

Changes:
- Add -platform_version flags to x86_64_apple_darwin target spec
- Update CI matrix to use macos-14 for ARM64 builds (macos-13 is x86_64 only)

This fixes the linker error:
  ld64.lld: error: must specify -platform_version
  ld64.lld: error: missing or unsupported -arch x86_64

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Root cause: snprintf was declared with ALL parameters as variadic `(...)`
instead of having its first 3 parameters fixed: `(ptr, i64, ptr, ...)`.

On ARM64, fixed and variadic parameters use different calling conventions:
- Fixed parameters: passed in x0-x7 registers
- Variadic parameters: special handling, different register/stack allocation

When snprintf was declared as `i32 (...)`, LLVM generated code expecting
all arguments to use the variadic calling convention. This caused:
1. Arguments passed in wrong registers/memory locations
2. Double values (like 0x3ff0000000000000 = 1.0) interpreted as pointers
3. SIGSEGV when snprintf tried to strlen() these bogus pointer values

Fix: Changed intrinsics.rs line 100 to pass `args` instead of `&[]` to
`ty_variadic_func()`. Now snprintf is correctly declared as:
  declare i32 @snprintf(ptr, i64, ptr, ...)

This makes the first 3 parameters use the standard calling convention
while only format arguments use varargs convention.

Also removed duplicate parameter addition code in compilation_unit.rs
(lines 452-455) which was likely a failed attempt to work around the
incorrect function signature.

Fixes: Integration test crashes on ARM64 macOS with LLVM 18
Affects: BSIM3, BSIM4, and other models with large instance structs
Testing: All 28 integration tests now pass on ARM64 macOS

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit resolves mostcompiler warnings across the codebase:

**Lifetime syntax warnings:**
- Fixed elided lifetime annotations in return types across mir, mir_build,
  mir_autodiff, mir_opt, verilogae, and melange crates
- Added explicit '_ or 'a lifetime parameters where the compiler couldn't
  infer them correctly

**Deprecated features:**
- Replaced deprecated `std::intrinsics::transmute` with `std::mem::transmute`
  in basedb, hir, and mir_interpret crates
- Removed deprecated `cargo-clippy` feature checks in bforest, replacing
  with direct clippy lint attributes

**Unused code:**
- Prefixed unused variable `b` with underscore in hir_def
- Removed unused imports in hir_lower and mir_llvm
- Added `#[allow(dead_code)]` to VoidAbortCallback struct in osdi

**Unexpected cfg conditions:**
- Declared `cross_compile` cfg in sourcegen/Cargo.toml
- Declared `nightly` feature in openvaf-driver/Cargo.toml
- Removed Py_3_8 conditionals as were using private libpython internals.
  TODO: add support for Python 13 METH_FASTCALL

**Other:**
- Removed unreachable pattern in osdi metadata matching
- Added `#[allow(static_mut_refs)]` to PyInit_verilogae for Python C API
  compatibility

The codebase now builds cleanly with one warning in release mode,
related to the deprecation of from IndexSet::remove, which requires
a more in-depth fix.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…eprecation

The deprecated IndexSet::remove() method was replaced with swap_remove() in
the small-signal network analysis. However, this exposed a latent order-
dependency bug where the analysis results differed based on iteration order.

Root Cause:
The algorithm had a circular dependency where analyze_value() checks if
values are in small_signal_vals, but membership in that set depends on the
analysis results. With platform-specific hash ordering (ahash::RandomState),
this caused reactive/resistive contribution counts to swap on Windows MSYS2.

Solution:
Implemented an order-independent fixed-point algorithm with four phases:

1. Speculatively add ALL candidate nodes to small_signal_vals
2. Evaluate all candidates against this consistent set state
3. Remove speculative nodes that weren't confirmed
4. Add confirmed flows and remove resolved candidates

This ensures all candidates see the same set state during evaluation,
making the analysis deterministic regardless of iteration order while
still supporting circular dependencies (e.g., noise nodes).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit updates the entire codebase to use LLVM 21 instead of LLVM 18:

- Updated llvm-sys dependency from 181.x to 211.0.0 in all Cargo.toml files
- Replaced LLVM_SYS_181_PREFIX with LLVM_SYS_210_PREFIX throughout codebase
- Updated CI workflows for Ubuntu and macOS to use LLVM 21
- Updated README documentation and build/test scripts
- Fixed deprecated LLVMConstStringInContext API (use LLVMConstStringInContext2)
- Updated build-macos.sh, test-macos.sh, and update-snapshots.sh
- Changed Homebrew package from llvm@18 to llvm (latest stable)

Build status: ✓ Successful
Unit tests: ✓ Pass (all tests pass)
Integration tests: ⚠ 2 tests fail with linker issue (dyld_stub_binder symbol)

The integration test failures appear to be related to macOS linker configuration
and may need additional investigation for ld64.lld with LLVM 21.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit expands LLVM support from just 18 and 21 to include all major
versions 18-21:

- Add llvm19 and llvm20 feature flags to all relevant Cargo.toml files
- Fix llvm-sys version for LLVM 20: use 201.0.1 (LLVM 20.1), not 200.0.0
- Update environment variable names to match llvm-sys conventions:
  - LLVM_SYS_181_PREFIX for LLVM 18.1
  - LLVM_SYS_191_PREFIX for LLVM 19.1
  - LLVM_SYS_201_PREFIX for LLVM 20.1
  - LLVM_SYS_211_PREFIX for LLVM 21.1
- Update linker and osdi/build.rs to check all LLVM env vars
- Update CI workflow to test all 4 LLVM versions on Ubuntu
- Update README.md and CLAUDE.md documentation

The llvm-sys crate versioning follows MAJORminor.patch format where 181 = LLVM
18.1, 191 = LLVM 19.1, 201 = LLVM 20.1, 211 = LLVM 21.1.

Build and tests verified with LLVM 21.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Use `brew install llvm` (currently LLVM 21) instead of pinning to llvm@18.
This simplifies the CI configuration and tests with the latest LLVM version.

Changes:
- Use simple image matrix instead of include blocks
- Install `llvm` package (latest) instead of `llvm@18`
- Use default llvm21 feature (no --features flag needed)
- Simplify cache keys

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit fixes multiple issues related to building and linking with
LLVM 18 on macOS ARM64:

1. Use LLVM 18's clang for bitcode generation (build.rs)
   - Bitcode generated by Apple's clang is incompatible with LLVM 18
   - Now uses clang from LLVM_SYS_181_PREFIX when available

2. Configure ld64.lld linker with proper flags (linker/src/lib.rs)
   - Detect and use ld64.lld from LLVM 18 when available
   - Automatically add -syslibroot flag using xcrun --show-sdk-path

3. Simplify macOS target configuration (aarch64_apple_darwin.rs)
   - Removed -lSystem from post_link_args
   - With -undefined dynamic_lookup, symbols are resolved at runtime
   - Added -platform_version flags required by ld64.lld

4. Update README with macOS build instructions
   - Add macOS dependency setup section with Homebrew instructions
   - Document build-macos.sh and test-macos.sh convenience scripts
   - Add section explaining integration tests
   - Clarify how to enable and run integration tests

These changes ensure the compiler works correctly when built against
LLVM 18, fixing "Unsupported stack probing method" and "library not
found for -lSystem" errors.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Updates:
- Configure macOS CI to use llvm@18 instead of latest llvm
- Add xtask for setup on Mac. Unfortunatly we can't fix for
  straight cargo build/test without patching llvm-sys

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This change makes integration test snapshots platform-agnostic, allowing
cross-platform CI testing without spurious failures due to different struct
layouts on ARM64 vs x86-64.

Removed values:
- residual offset numbers (4 values per node)
- react_ptr offset values in jacobian entries
- instance_size and model_size values

These values are calculated by LLVM using platform-specific ABI alignment
rules (LLVMABISizeOfType and LLVMOffsetOfElement), causing differences
between architectures. All semantic information (parameter names, types,
node names, jacobian flags, etc.) is preserved.

All integration test snapshots have been regenerated.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
LLVM 18's ld64.lld requires -platform_version flags on both x86_64 and ARM64.
Also fixes CI to use correct GitHub runners (macos-14 for ARM64, macos-13 for x86_64).

Changes:
- Add -platform_version flags to x86_64_apple_darwin target spec
- Update CI matrix to use macos-14 for ARM64 builds (macos-13 is x86_64 only)

This fixes the linker error:
  ld64.lld: error: must specify -platform_version
  ld64.lld: error: missing or unsupported -arch x86_64

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Root cause: snprintf was declared with ALL parameters as variadic `(...)`
instead of having its first 3 parameters fixed: `(ptr, i64, ptr, ...)`.

On ARM64, fixed and variadic parameters use different calling conventions:
- Fixed parameters: passed in x0-x7 registers
- Variadic parameters: special handling, different register/stack allocation

When snprintf was declared as `i32 (...)`, LLVM generated code expecting
all arguments to use the variadic calling convention. This caused:
1. Arguments passed in wrong registers/memory locations
2. Double values (like 0x3ff0000000000000 = 1.0) interpreted as pointers
3. SIGSEGV when snprintf tried to strlen() these bogus pointer values

Fix: Changed intrinsics.rs line 100 to pass `args` instead of `&[]` to
`ty_variadic_func()`. Now snprintf is correctly declared as:
  declare i32 @snprintf(ptr, i64, ptr, ...)

This makes the first 3 parameters use the standard calling convention
while only format arguments use varargs convention.

Also removed duplicate parameter addition code in compilation_unit.rs
(lines 452-455) which was likely a failed attempt to work around the
incorrect function signature.

Fixes: Integration test crashes on ARM64 macOS with LLVM 18
Affects: BSIM3, BSIM4, and other models with large instance structs
Testing: All 28 integration tests now pass on ARM64 macOS

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Test files were directly importing llvm_sys which doesn't work with
the extern crate aliasing used for multi-LLVM support.

Changes:
- Re-export LLVMCodeGenOptLevel from mir_llvm crate
- Update openvaf integration tests to use openvaf::LLVMCodeGenOptLevel
- Update osdi data_tests to use mir_llvm::LLVMCodeGenOptLevel

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Remove the complex dynamic matrix generation and simplify to testing:
- LLVM 18: Ubuntu Noble system package
- LLVM 21: Latest from apt.llvm.org

This reduces CI complexity and focuses on the two main use cases:
system LLVM and latest LLVM.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
The llvm-sys crate expects llvm-config at $LLVM_SYS_XXX_PREFIX/bin/llvm-config,
but Ubuntu installs it at /usr/bin/llvm-config-XX. Create a symlink to fix this.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Add || true to ignore error when symlink already exists and points
to the correct location.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
When building with --no-default-features --features llvm18, cargo still
compiled both llvm-sys v181 and v211 because workspace feature unification
activates default features on all workspace members.

Fix by building specific packages (openvaf-driver, openvaf, osdi, mir_llvm)
instead of the entire workspace. This ensures only the requested LLVM
version's llvm-sys crate is compiled.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Resolve conflicts in favor of multi-LLVM version support (18-21),
keeping the macOS linking fixes from macos-fixes branch.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@robtaylor robtaylor force-pushed the fix/macos-system-ld branch 3 times, most recently from 809db59 to ed8aea7 Compare December 16, 2025 01:03
On macOS, always pass -syslibroot to the linker pointing to the macOS
SDK (detected via xcrun --show-sdk-path). This allows openvaf-r to
compile Verilog-A to OSDI without requiring the LLVM_SYS_181_PREFIX
environment variable at runtime.

Previously, linking would fail with "ld: library 'System' not found"
when LLVM_SYS_181_PREFIX was not set because the system linker didn't
know where to find system libraries.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants