
fix(evaluator): inject configured LLM into custom metrics#134

Open
thesamet wants to merge 1 commit into arklexai:main from thesamet:fix/inject-llm-into-custom-metrics

Conversation


@thesamet thesamet commented Apr 4, 2026

Summary

Custom metrics previously loaded the LLM themselves by reading a hardcoded `config.yaml` path at import time, causing an immediate crash when the configured provider differed from what the metric expected (e.g. running with Anthropic while metrics tried to instantiate an OpenAI client).

  • The evaluator now inspects each custom metric's `__init__` signature and passes the already-configured LLM instance when an `llm` parameter is declared
  • Metrics without the parameter continue to work unchanged (fully backward-compatible)
  • Accessing `self.llm` when no LLM was injected raises a descriptive `RuntimeError` instead of an opaque `AttributeError`
  • Updated both built-in example metrics (bank-insurance, e-commerce) to use the injected LLM
  • Updated docs to show the `llm=None` injection pattern

Closes #131
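The opt-in detection described above can be sketched with `inspect.signature` (function and class names here are illustrative, not the actual implementation):

```python
import inspect

def instantiate_metric(metric_cls, llm):
    """Instantiate a custom metric class, injecting the configured LLM
    only when its __init__ declares an `llm` parameter."""
    params = inspect.signature(metric_cls.__init__).parameters
    if "llm" in params:
        return metric_cls(llm=llm)
    # Backward-compatible path: the metric did not opt in.
    return metric_cls()

class LegacyMetric:
    """A pre-existing metric with no `llm` parameter."""
    def __init__(self):
        pass

class LlmAwareMetric:
    """A metric that opts in to injection by declaring `llm`."""
    def __init__(self, llm=None):
        self.llm = llm
```

Legacy metrics are constructed exactly as before, so no existing user code needs to change.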

Changes

  • `arksim/evaluator/evaluator.py`: detect the `llm` parameter in custom metric constructors and inject the configured LLM instance
  • `arksim/evaluator/base_metric.py`: expose `llm` as a property on both base classes that raises a clear `RuntimeError` when accessed without injection; add `llm` parameter docs to both `QuantitativeMetric` and `QualitativeMetric` docstrings
  • `examples/bank-insurance/custom_metrics.py`, `examples/e-commerce/custom_metrics.py`: remove hardcoded config loading, accept the injected LLM via `self.llm`
  • `docs/main/evaluate-conversation.mdx`: update both metric type examples to show the `llm=None` injection pattern
  • `tests/unit/test_evaluator_class.py`: new tests covering injection for quantitative and qualitative metrics, the backward-compatible (no-parameter) case, and the `RuntimeError` raised when `self.llm` is accessed without injection
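The `llm` property guard in `base_metric.py` might look roughly like this (a minimal sketch; the attribute name and error wording are assumptions, not copied from the diff):

```python
class QuantitativeMetric:
    """Sketch of the base-class behavior: store the injected LLM and
    guard access to it with a descriptive error."""
    def __init__(self, llm=None):
        self._llm = llm

    @property
    def llm(self):
        # Fail loudly with guidance instead of an opaque AttributeError.
        if self._llm is None:
            raise RuntimeError(
                "This metric was constructed without an LLM. Declare an "
                "`llm` parameter in __init__ so the evaluator can inject "
                "the configured LLM instance."
            )
        return self._llm
```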

Documentation

  • Updated relevant docs in docs/ (if behavior, config, or API changed)
  • Updated README.md (if installation, quickstart, or usage changed)
  • No docs needed (explain why below)

How to Test

  • `ruff check .` passes
  • `ruff format --check .` passes
  • `pytest tests/` passes
  • Manual verification: run an evaluation with an Anthropic config against the example metrics -- no crash, and the LLM is injected correctly

Notes

Backward-compatible: any existing custom metric that does not declare an `llm` parameter in `__init__` is instantiated as before. Only metrics that opt in by declaring the parameter receive the injected LLM. Metrics that declare `llm` but try to use `self.llm` without injection now get a descriptive error pointing to the fix.
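A custom metric opting in via the `llm=None` pattern from the docs might look like this (the base class here is a minimal stand-in, and the metric and LLM method names are hypothetical):

```python
class QualitativeMetric:
    """Minimal stand-in for the library base class (assumed API)."""
    def __init__(self, llm=None):
        self._llm = llm

    @property
    def llm(self):
        if self._llm is None:
            raise RuntimeError(
                "No LLM injected; declare an `llm` parameter in __init__."
            )
        return self._llm

class PolitenessMetric(QualitativeMetric):
    """Hypothetical custom metric using the docs' llm=None pattern."""
    def __init__(self, llm=None):
        super().__init__(llm=llm)

    def evaluate(self, conversation: str) -> str:
        # Uses whichever provider the evaluator configured
        # (Anthropic, OpenAI, ...), with no hardcoded config.yaml read.
        return self.llm.complete(f"Rate the politeness of: {conversation}")

class FakeLLM:
    """Test double standing in for the configured LLM client."""
    def complete(self, prompt: str) -> str:
        return "polite"
```

Because the metric never loads a config itself, it works with whatever provider the CLI was given.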

Reviewers

/cc @arklexai/arksim-maintainers

Custom metrics previously loaded the LLM themselves by reading a
hardcoded `config.yaml` path at import time. This caused an immediate
crash when the configured provider differed from what the metric
expected (e.g. running with Anthropic while metrics tried to
instantiate an OpenAI client).

The evaluator now passes the already-configured LLM instance to any
custom metric whose __init__ declares an `llm` parameter. Metrics
without the parameter continue to work unchanged (backward-compatible).
Update both built-in example metrics to use the injected LLM.

Fixes arklexai#131
@thesamet thesamet requested a review from a team as a code owner April 4, 2026 00:18


Development

Successfully merging this pull request may close these issues.

Custom metrics always read config.yaml, ignoring the config passed to the CLI
