TensorZero

Technology, Information and Internet

About us

TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM gateway, observability, optimization, evaluation, and experimentation.

Website
https://www.tensorzero.com/
Industry
Technology, Information and Internet
Company size
2-10 employees
Type
Privately Held

Updates

  • TensorZero 2026.1.7 is out! 📌 This release introduces the preview of TensorZero Autopilot — our automated AI engineer (learn more on tensorzero.com).
    Full Changelog:
    🆕 [Preview] TensorZero Autopilot — an automated AI engineer that analyzes LLM observability data, optimizes prompts and models, sets up evals, and runs A/B tests.
    🆕 Support multi-turn reasoning for xAI (`reasoning_content` only).
    & multiple under-the-hood and UI improvements!
    https://lnkd.in/enKiwP8h

  • TensorZero 2026.1.6 is out! 📌 This release brings further improvements around reasoning models, error handling, and usage tracking.
    Full Changelog:
    🚨 [Breaking Change] Moving forward, TensorZero will use the OpenAI API's error format (`{"error": {"message": "Bad!"}}`) instead of TensorZero's error format (`{"error": "Bad!"}`) in the OpenAI-compatible endpoints.
    ⚠️ [Planned Deprecation] When using `unstable_error_json` with the OpenAI-compatible inference endpoint, use `tensorzero_error_json` instead of `error_json`. For now, TensorZero will emit both fields with identical data. The TensorZero inference endpoint is not affected.
    🆕 Add native support for provider tools (e.g. web search) to the Anthropic and GCP Vertex AI Anthropic model providers. Previously, clients had to use `extra_body` to handle these tools.
    🆕 Improve handling of reasoning content blocks when streaming with the OpenAI Responses API.
    🆕 Handle inferences with missing `usage` fields gracefully in the OpenAI model provider.
    🆕 Improve error handling across the UI.
    & multiple under-the-hood and UI improvements!
    https://lnkd.in/edNXnz_X

  • TensorZero 2026.1.5 is out! 📌 This release brings many improvements around error handling, reasoning models, rate limiting performance, and more.
    Full Changelog:
    🚨 [Breaking Change] TensorZero will normalize the reported `usage` from different model providers. Moving forward, `input_tokens` and `output_tokens` include all token variations (provider prompt caching, reasoning, etc.), just like OpenAI. Tokens cached by TensorZero remain excluded. You can still access the raw usage reported by providers with `include_raw_usage`.
    ⚠️ [Planned Deprecation] Migrate `include_original_response` to `include_raw_response`. For advanced variant types, the former only returned the last model inference, whereas the latter returns every model inference with associated metadata.
    ⚠️ [Planned Deprecation] Migrate `allow_auto_detect_region = true` to `region = "sdk"` when configuring AWS model providers. The behavior is identical.
    ⚠️ [Planned Deprecation] Provide the proper API base rather than the full endpoint when configuring custom Anthropic providers.
    🔨 Fix a regression that triggered incorrect warnings about usage reporting for streaming inferences with Anthropic models.
    🔨 Fix a bug in the TensorZero Python SDK that discarded some request fields in certain multi-turn inferences with tools.
    🆕 Improve error handling across many areas: TensorZero UI, JSON deserialization, AWS providers, streaming inferences, timeouts, etc.
    🆕 Support Valkey (Redis) for improving performance of rate limiting checks (recommended at 100+ QPS).
    🆕 Support `reasoning_effort` for Gemini 3 models (mapped to `thinkingLevel`).
    🆕 Improve handling of Anthropic reasoning models in TensorZero JSON functions. Moving forward, `json_mode = "strict"` will use the beta structured outputs feature; `json_mode = "on"` still uses the legacy assistant message prefill.
    🆕 Improve handling of reasoning content in the OpenRouter and xAI model providers.
    🆕 Add `extra_headers` support for embedding models. (thanks jonaylor89!)
    🆕 Support dynamic credentials for AWS Bedrock and AWS SageMaker model providers.
    & multiple under-the-hood and UI improvements (thanks ndoherty-xyz)!
    https://lnkd.in/ewY7izAh
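The usage normalization can be sketched roughly as below. The raw field names (`prompt_tokens`, `cached_prompt_tokens`, etc.) are illustrative assumptions, since the post only states that all token variations are folded into `input_tokens`/`output_tokens`:

```python
def normalize_usage(raw: dict) -> dict:
    """Fold provider-specific token variants into OpenAI-style totals.

    Illustrative only: the field names below are assumptions, not
    TensorZero's actual schema. The point is that cached and reasoning
    tokens now count toward input/output totals, as with OpenAI.
    """
    input_tokens = raw.get("prompt_tokens", 0) + raw.get("cached_prompt_tokens", 0)
    output_tokens = raw.get("completion_tokens", 0) + raw.get("reasoning_tokens", 0)
    return {"input_tokens": input_tokens, "output_tokens": output_tokens}
```

If you need the unfolded numbers, the release notes say `include_raw_usage` still returns the provider-reported usage verbatim.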

  • TensorZero 2026.1.2 is out! 📌 This is a small release that improves the developer experience of using long-tail LLM capabilities.
    🆕 Support appending to arrays with `extra_body` using the `/my_array/-` notation.
    🆕 Handle cross-model thought signatures in GCP Vertex AI Gemini and Google AI Studio.
    & multiple under-the-hood and UI improvements
    https://lnkd.in/ekXYH2nc
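The `/my_array/-` notation mirrors JSON Pointer's array-append token (`-`). A rough illustration of the semantics, not TensorZero's implementation:

```python
def apply_extra_body(body: dict, pointer: str, value) -> dict:
    """Apply one extra_body-style patch to a request body.

    A pointer ending in `/-` appends to the array at the parent path
    (creating it if absent); any other pointer assigns the value at
    that key. Minimal sketch of the semantics only.
    """
    keys = pointer.strip("/").split("/")
    if keys[-1] == "-":
        # e.g. "/my_array/-" appends `value` to the array at "/my_array"
        parent = body
        for key in keys[:-2]:
            parent = parent.setdefault(key, {})
        parent.setdefault(keys[-2], []).append(value)
    else:
        parent = body
        for key in keys[:-1]:
            parent = parent.setdefault(key, {})
        parent[keys[-1]] = value
    return body
```

Appending (rather than replacing) matters for provider fields like stop-sequence arrays, where you want to add an entry without clobbering what the gateway already set.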

  • TensorZero 2026.1.1 is out! 📌 This release brings improvements and bug fixes to token usage reporting.
    ⚠️ [Planned Deprecation] In a future release, the parameter `model` will be required when initializing `DICLOptimizationConfig`. The parameter remains optional (defaults to `openai::gpt-5-mini`) in the meantime.
    🔨 Stop buffering `raw_usage` when streaming with the OpenAI-compatible inference endpoint; instead, emit `raw_usage` as soon as possible, just like in the native endpoint.
    🔨 Stop reporting zero usage in every chunk when streaming a cached inference; instead, report zero usage only in the final chunk, as expected.
    🆕 Support `stream_options.include_usage` for every model under the Azure provider.
    & multiple under-the-hood and UI improvements!
    https://lnkd.in/e84XXYm8
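With usage emitted only where it belongs, a streaming consumer can simply sum `usage` across chunks. A minimal sketch assuming dict-shaped chunks (field names follow the normalized `usage` schema described in these notes):

```python
def total_usage(chunks: list[dict]) -> dict:
    """Sum `usage` across stream chunks.

    After this fix, a cached streaming inference reports (zero) usage
    only in the final chunk, so a straight sum over all chunks yields
    the correct total either way.
    """
    totals = {"input_tokens": 0, "output_tokens": 0}
    for chunk in chunks:
        usage = chunk.get("usage")
        if usage:
            totals["input_tokens"] += usage.get("input_tokens", 0)
            totals["output_tokens"] += usage.get("output_tokens", 0)
    return totals
```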

  • TensorZero 2026.1.0 is out! 📌 This release adds an optional `include_raw_usage` parameter to inference requests. If enabled, the gateway returns the raw usage objects from model provider responses in addition to the normalized `usage` response field.
    🚨 [Breaking Change] The Prometheus metric `tensorzero_inference_latency_overhead_seconds` will report a histogram instead of a summary. You can customize the buckets using `gateway.metrics.tensorzero_inference_latency_overhead_seconds_buckets` in the configuration (default: 1ms, 10ms, 100ms).
    ⚠️ [Planned Deprecation] Deprecate the `TENSORZERO_CLICKHOUSE_URL` environment variable from the UI. Moving forward, the UI will query data through the gateway and does not communicate directly with ClickHouse.
    ⚠️ [Planned Deprecation] Rename the Prometheus metric `tensorzero_inference_latency_overhead_seconds_histogram` to `tensorzero_inference_latency_overhead_seconds`. Both metrics will be emitted for now.
    ⚠️ [Planned Deprecation] Rename the configuration field `tensorzero_inference_latency_overhead_seconds_histogram_buckets` to `tensorzero_inference_latency_overhead_seconds_buckets`. Both fields are available for now.
    🆕 Add optional `include_raw_usage` parameter to inference requests. If enabled, the gateway returns the raw usage objects from model provider responses in addition to the normalized `usage` response field.
    🆕 Add optional `--bind-address` CLI flag to the gateway.
    🆕 Add optional `description` field to metrics in the configuration.
    🆕 Add option to fine-tune Fireworks models without automatic deployment.
    & multiple under-the-hood and UI improvements (thanks ecalifornica achaljhawar rguilmont)!
    https://lnkd.in/eKZVcUgZ
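The switch from a summary to a histogram means each observation is counted into cumulative buckets. A small sketch of Prometheus-style bucketing using the stated default bounds (1ms, 10ms, 100ms, plus the implicit +Inf bucket):

```python
def histogram_counts(latencies_s, buckets_s=(0.001, 0.01, 0.1)):
    """Count observations per cumulative (Prometheus-style) bucket.

    Each bucket `le` counts all observations <= its upper bound, so
    counts are monotonically non-decreasing across buckets. Default
    bounds mirror this release's stated defaults for the latency
    overhead metric: 1ms, 10ms, 100ms.
    """
    bounds = list(buckets_s) + [float("inf")]
    counts = {le: 0 for le in bounds}
    for latency in latencies_s:
        for le in bounds:
            if latency <= le:  # cumulative: counted in every bucket it fits
                counts[le] += 1
    return counts
```

Unlike summaries, histograms can be aggregated across gateway replicas and re-quantiled at query time, which is the usual motivation for this kind of migration.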

  • TensorZero 2025.12.6 is out! 📌 This release introduces Gateway Relay. With gateway relay, an LLM inference request can be routed through multiple independent TensorZero Gateway deployments before reaching a model provider. This enables you to enforce organization-wide controls (e.g. auth, rate limits, credentials) without restricting how teams build their LLM features. https://lnkd.in/e7JByKhS
    🚨 [Breaking Changes] Migrated the following optimization fields from the TensorZero Python SDK to the configuration:
    - `DICLOptimizationConfig`: removed `credential_location`.
    - `FireworksSFTConfig`: moved `account_id` to `[provider_types.fireworks.sft]`; removed `api_base` and `credential_location`.
    - `GCPVertexGeminiSFTConfig`: moved `bucket_name`, `bucket_path_prefix`, `kms_key_name`, `project_id`, `region`, and `service_account` to `[provider_types.gcp_vertex_gemini.sft]`.
    - `OpenAIRFTConfig`: removed `api_base` and `credential_location`.
    - `OpenAISFTConfig`: removed `api_base` and `credential_location`.
    - `TogetherSFTConfig`: moved `hf_api_token`, `wandb_api_key`, `wandb_base_url`, and `wandb_project_name` to `[provider_types.together.sft]`; removed `api_base` and `credential_location`.
    🆕 Support gateway relay.
    🆕 Add "Try with model" button to the datapoint page in the UI.
    🆕 Add `tensorzero_inference_latency_overhead_seconds_histogram` Prometheus metric for meta-observability.
    🆕 Add `concurrency` parameter to `experimental_render_samples` (defaults to 100).
    🆕 Add `otlp_traces_extra_attributes` and `otlp_traces_extra_resources` to the TensorZero Python SDK. (thanks jinnovation!)
    & multiple under-the-hood and UI improvements (thanks ecalifornica)!
    https://lnkd.in/e5DxBqqR

  • TensorZero 2025.12.5 is out! 📌 This version introduces a revamped dataset builder in the UI that supports complex queries (e.g. filter by tags, feedback, logical operators). 💡 We're skipping version 2025.12.4 due to an issue in our publishing process.
    ⚠️ [Planned Deprecation] The variant type `experimental_chain_of_thought` will be deprecated in `2026.2+`. As reasoning models are becoming prevalent, please use their native reasoning capabilities.
    ⚠️ [Planned Deprecation] The `timeout_s` configuration field for best/mixture-of-N variants will be deprecated in `2026.2+`. Please use the `[timeouts]` block in the configuration for their candidates instead.
    🆕 Expand the dataset builder in the UI to support complex queries (e.g. filter by tags, feedback).
    🆕 Export `tensorzero_inference_latency_overhead_seconds` Prometheus metric for meta-observability.
    🆕 Allow users to disable TensorZero API keys using `--disable-api-key` in the CLI. (thanks jinnovation!)
    & multiple under-the-hood and UI improvements (thanks ecalifornica)!
    https://lnkd.in/e-whxWeS
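A query combining tags, feedback, and logical operators can be modeled as a small recursive filter. The query shape below is invented for illustration; it is not the dataset builder's actual query format:

```python
def matches(inference: dict, query: dict) -> bool:
    """Evaluate a tiny tag/feedback filter with AND/OR operators.

    Hypothetical query shape: {"op": "and"/"or", "clauses": [...]}
    for logical operators, {"op": "tag", "key", "value"} and
    {"op": "feedback", "metric", "value"} for leaf predicates.
    """
    op = query.get("op")
    if op == "and":
        return all(matches(inference, q) for q in query["clauses"])
    if op == "or":
        return any(matches(inference, q) for q in query["clauses"])
    if op == "tag":
        return inference.get("tags", {}).get(query["key"]) == query["value"]
    if op == "feedback":
        return inference.get("feedback", {}).get(query["metric"]) == query["value"]
    raise ValueError(f"unknown op: {op}")
```

Nesting `and`/`or` clauses gives arbitrary boolean combinations, which is what makes such a builder useful for curating fine-tuning datasets from observability data.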

Funding

1 total round
Last round: Seed, US$ 7.3M

Data from Crunchbase