Releases: deepset-ai/haystack
v2.22.0
⭐️ Highlights
✂️ Smarter Document Chunking with Embedding-Based Splitting
Introducing the new EmbeddingBasedDocumentSplitter, a component that takes an embedder and splits documents based on semantic similarity rather than fixed sizes or rules.
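To make the idea concrete, here is a minimal, dependency-free sketch of percentile-based semantic splitting. It is not the component's implementation: the `split_by_distance` helper, the toy 2-D vectors, and the nearest-rank percentile rule are assumptions chosen purely for illustration.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def split_by_distance(groups, embeddings, percentile):
    """Hypothetical helper: start a new chunk wherever the distance between
    neighbouring group embeddings exceeds the chosen percentile of all distances."""
    distances = [
        cosine_distance(embeddings[i], embeddings[i + 1])
        for i in range(len(embeddings) - 1)
    ]
    if not distances:
        return [" ".join(groups)] if groups else []
    # Nearest-rank percentile (an assumption; other interpolation rules exist).
    ranked = sorted(distances)
    threshold = ranked[min(len(ranked) - 1, int(percentile * len(ranked)))]
    chunks, current = [], [groups[0]]
    for group, distance in zip(groups[1:], distances):
        if distance > threshold:
            # Large semantic jump: close the current chunk and start a new one.
            chunks.append(" ".join(current))
            current = []
        current.append(group)
    chunks.append(" ".join(current))
    return chunks


# Toy sentence groups with hand-written "embeddings" (two topics, two directions).
groups = ["Cats purr.", "Cats nap.", "Cats hunt.", "Stocks fell.", "Bonds rose."]
embeddings = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.1], [0.0, 1.0], [0.1, 0.9]]
chunks = split_by_distance(groups, embeddings, percentile=0.5)
print(chunks)
```

With only four distances, a low percentile is used here so that the single large jump between topics stands out; on real documents with many sentence groups, a high value such as 0.95 splits only at the sharpest topic changes.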
from haystack.components.embedders import SentenceTransformersDocumentEmbedder
from haystack.components.preprocessors import EmbeddingBasedDocumentSplitter
# Initialize an embedder to calculate semantic similarities
embedder = SentenceTransformersDocumentEmbedder()
# Configure the splitter with parameters that control splitting behavior
splitter = EmbeddingBasedDocumentSplitter(
    document_embedder=embedder,
    sentences_per_group=2,  # Group 2 sentences before calculating embeddings
    percentile=0.95,  # Split when cosine distance exceeds 95th percentile
    min_length=50,  # Merge splits shorter than 50 characters
    max_length=1000,  # Further split chunks longer than 1000 characters
)
result = splitter.run(documents=[doc])
🔥 warm_up Runs Automatically on First Use
Components that define a warm_up method now run it automatically on first execution, removing the need for manual calls and preventing errors in standalone usage.
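The general pattern can be sketched without Haystack. The `LazyComponent` class below is hypothetical and only illustrates an idempotent warm-up that is triggered from `run`; the library's actual mechanism may differ.

```python
class LazyComponent:
    """Hypothetical component that loads its resource on first use."""

    def __init__(self):
        self._model = None
        self.warm_up_calls = 0  # counts how often the expensive work really ran

    def warm_up(self):
        # Idempotent: the expensive initialization happens only once.
        if self._model is None:
            self.warm_up_calls += 1
            self._model = str.upper  # stand-in for loading a real model

    def run(self, text):
        # Warm up automatically so callers never have to remember to do it.
        self.warm_up()
        return {"result": self._model(text)}


component = LazyComponent()
first = component.run("hello")
second = component.run("again")
print(first, second, component.warm_up_calls)
```

Calling `run` twice initializes the stand-in model only once, which is why an explicit `warm_up()` call before the first `run` becomes optional rather than harmful.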
from haystack.components.embedders import SentenceTransformersTextEmbedder
text_embedder = SentenceTransformersTextEmbedder()
# text_embedder.warm_up() # ❌ Don't need this step anymore
print(text_embedder.run("I love pizza!"))
# {'embedding': [-0.07804739475250244, 0.1498992145061493, ...]}
🛠️ Multiple Tool String Outputs with outputs_to_string
Tools can now expose multiple string outputs via the new outputs_to_string configuration, giving you fine-grained control over how tool results are surfaced to the LLM, without changing the underlying tool logic.
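Conceptually, each entry in the configuration picks one key from the tool's raw result and passes it through a handler. The sketch below mimics that mapping in plain Python; `render_outputs` is a hypothetical helper written for this illustration, not part of Haystack, and plain strings stand in for Document objects.

```python
def render_outputs(result, outputs_to_string):
    """Hypothetical helper: build one string per configured output."""
    rendered = {}
    for name, spec in outputs_to_string.items():
        value = result[spec["source"]]  # pick the raw value named by "source"
        rendered[name] = spec["handler"](value)  # convert it with its handler
    return rendered


raw_result = {
    "documents": ["Haystack basics", "Pipelines in depth"],
    "metadata": {"count": 2},
    "debug_info": {"latency_ms": 12},  # not configured, so never stringified
}
config = {
    "formatted_docs": {
        "source": "documents",
        "handler": lambda docs: "\n".join(f"{i + 1}. {d}" for i, d in enumerate(docs)),
    },
    "summary": {
        "source": "metadata",
        "handler": lambda meta: f"Found {meta['count']} results",
    },
}
rendered = render_outputs(raw_result, config)
print(rendered)
```

Only the keys named in the configuration appear in the rendered output, which mirrors how unlisted keys are left out of the strings shown to the LLM.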
from haystack.tools import Tool

def format_documents(documents):
    return "\n".join(f"{i+1}. Document: {doc.content}" for i, doc in enumerate(documents))

def format_summary(metadata):
    return f"Found {metadata['count']} results"

tool = Tool(
    name="search",
    description="Search for documents",
    parameters={...},
    function=search_func,  # Returns {"documents": [Document(...)], "metadata": {"count": 5}, "debug_info": {...}}
    outputs_to_string={
        "formatted_docs": {"source": "documents", "handler": format_documents},
        "summary": {"source": "metadata", "handler": format_summary},
        # Note: "debug_info" is not included, so it won't be converted to a string
    },
)
# After the tool invocation, the tool result includes:
# {
#     "formatted_docs": "1. Document Title\n Content...\n2. ...",
#     "summary": "Found 5 results"
# }
🐍 Python 3.10+ Only
Haystack now requires Python 3.10 or later, as Python 3.9 reached End of Life (EOL) in October 2025.
⬆️ Upgrade Notes
HuggingFaceLocalChatGenerator now uses Qwen/Qwen3-0.6B as the default model, replacing the previous default.
⚡️ Enhancement Notes
- The parameters query_suffix and document_suffix have been added to SentenceTransformersSimilarityRanker to support the Qwen3 reranker model family. Here is an example of how to use these new parameters with Qwen3-Reranker-0.6B:
  from haystack import Document
  from haystack.components.rankers.sentence_transformers_similarity import SentenceTransformersSimilarityRanker
  ranker = SentenceTransformersSimilarityRanker(
      model="tomaarsen/Qwen3-Reranker-0.6B-seq-cls",
      query_prefix='<|im_start|>system\nJudge whether the Document meets the requirements based on the Query and the Instruct provided. Note that the answer can only be "yes" or "no".<|im_end|>\n<|im_start|>user\n<Instruct>: Given a web search query, retrieve relevant passages that answer the query\n<Query>: ',
      query_suffix="\n",
      document_prefix="<Document>: ",
      document_suffix="<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\n",
  )
  result = ranker.run(
      query="Which planet is known as the Red Planet?",
      documents=[
          Document(content="Venus is often called Earth's twin because of its similar size and proximity."),
          Document(content="Mars, known for its reddish appearance, is often referred to as the Red Planet."),
          Document(content="Jupiter, the largest planet in our solar system, has a prominent red spot."),
          Document(content="Saturn, famous for its rings, is sometimes mistaken for the Red Planet."),
      ],
  )
  print(result)
  NOTE: This only works with the Qwen3 reranker models that use the sequence classification architecture. For example, you can find some on tomaarsen's Hugging Face profile.
- Added reasoning content support to HuggingFaceAPIChatGenerator. The component now extracts reasoning content from models that support chain-of-thought reasoning (e.g., DeepSeek R1). Both streaming and non-streaming modes are supported. Access it via reply.reasoning.reasoning_text.
- When an Agent runs as part of a Pipeline, the agent's tracing span now uses the component span as its parent. This enables proper nested trace visualization in tracing tools like Datadog, Braintrust, or OpenTelemetry backends.
- The _handle_async_stream_response() method in OpenAIChatGenerator now handles asyncio.CancelledError exceptions. When a streaming task is cancelled mid-stream, the async for loop gracefully closes the stream using asyncio.shield() to ensure the cleanup operation completes even during cancellation.
- A new enable_thinking parameter has been added to enable thinking mode in chat templates for thinking-capable models, allowing them to generate intermediate reasoning steps before producing final responses.
- Added support for PEP 604 type syntax. This means that when defining types in components, you can use X | Y instead of Union[X, Y] and X | None instead of Optional[X]. The codebase has been migrated to the new syntax, but both syntaxes are fully supported.
- Support Multiple Tool String Outputs
  Added support for tools to define multiple string outputs using the outputs_to_string configuration. This allows users to specify how different parts of a tool's output should be converted to strings, enhancing flexibility in handling tool results.
  - Updated ToolInvoker to handle multiple output configurations.
  - Updated Tool to validate and store multiple output configurations.
  - Added tests to verify the functionality of multiple string outputs.
  This enables tools to provide rich, varied context to language models or downstream components without requiring multiple tool calls, while keeping full control over which outputs are stringified.
- Added validation for the inputs_from_state and outputs_to_state parameters in the Tool class. Tools now validate at construction time that state mappings reference valid tool parameters and outputs, catching configuration errors early instead of at runtime. The validation uses function introspection and JSON schema to ensure parameter names exist, and subclasses like ComponentTool validate against component input/output sockets.
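As a rough illustration of construction-time validation through function introspection, the sketch below checks a state mapping against a function signature. The `validate_inputs_from_state` helper and the assumption that the mapping goes from state keys to parameter names are illustrative only; this is not the library's code.

```python
import inspect


def validate_inputs_from_state(function, inputs_from_state):
    """Hypothetical check: every mapped parameter name must exist on the function."""
    valid = set(inspect.signature(function).parameters)
    unknown = sorted(set(inputs_from_state.values()) - valid)
    if unknown:
        raise ValueError(
            f"Unknown parameter(s) {unknown}; valid parameters are {sorted(valid)}"
        )


def search(query: str, top_k: int = 5) -> dict:
    return {"documents": [], "count": 0}


# A correct mapping passes silently.
validate_inputs_from_state(search, {"user_query": "query"})

# A typo in the parameter name is caught immediately, not at invocation time.
try:
    validate_inputs_from_state(search, {"user_query": "querry"})
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```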
🐛 Bug Fixes
- Improved error messages in ConditionalRouter when non-string values are provided as route outputs. Users now receive clear guidance (e.g., "use '2' instead of 2") instead of the cryptic "Can't compile non template nodes" error.
- Fixed Jinja2 variable detection in ConditionalRouter, ChatPromptBuilder, PromptBuilder and OutputAdapter by properly skipping variables that are assigned within the template. Previously, under specific scenarios, variables assigned within a template would falsely be picked up as input variables to the component. For more information, see the parent issue in the Jinja2 library: pallets/jinja#2069
- Fixed deserializing an instance of NamedEntityExtractor when pipeline_kwargs is stored in the deserialization dict with the value None.
- When creating an HTTP client object from a dictionary, we now convert the limits parameter to an httpx.Limits object to avoid an AttributeError.
- Raise a ValueError when an async function is passed to the Tool class. Async functions are not supported as tools. This change provides a clear error message instead of silent failures where coroutines are never awaited.
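The async-function check in the last item can be sketched with the standard library alone. The `ensure_sync` helper is hypothetical and only shows the kind of early rejection described; the actual error message in Haystack may be worded differently.

```python
import inspect


def ensure_sync(function):
    """Hypothetical guard: reject coroutine functions up front."""
    if inspect.iscoroutinefunction(function):
        raise ValueError(
            f"'{function.__name__}' is an async function; only sync functions are supported."
        )
    return function


def add(a, b):
    return a + b


async def add_async(a, b):
    return a + b


checked = ensure_sync(add)  # accepted unchanged

try:
    ensure_sync(add_async)
    rejected = False
except ValueError:
    rejected = True
print(checked(1, 2), rejected)
```

Without such a guard, calling `add_async(1, 2)` from synchronous code would just return a coroutine object that is never awaited, which is the silent failure the fix prevents.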
⚠️ Deprecation Notes
- The return_empty_on_no_match parameter of the RegexTextExtractor component is deprecated. The component now always returns a dictionary with the key "captured_text"; the value is the captured text, or an empty string if no match is found. Currently, the return_empty_on_no_match parameter is ignored. Starting from Haystack 2.23.0, initializing the component with this parameter will raise an error.
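The practical benefit of the fixed output shape is that downstream code never has to check whether the key exists. The `extract` function below is a stdlib stand-in written for illustration (it assumes a pattern with one capture group); it is not the component itself.

```python
import re


def extract(pattern, text):
    """Hypothetical stand-in that always returns the "captured_text" key."""
    match = re.search(pattern, text)
    if match is None:
        return {"captured_text": ""}  # same shape, empty value
    return {"captured_text": match.group(1)}


pattern = r"<answer>(.*?)</answer>"
hit = extract(pattern, "Result: <answer>42</answer>")
miss = extract(pattern, "No tagged answer here")

# Downstream code can index the key unconditionally.
answers = [result["captured_text"] or "(none)" for result in (hit, miss)]
print(answers)
```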
💙 Big thank you to everyone who contributed to this release!
@anakin87, @ArzelaAscoIi, @bilgeyucel, @Bobholamovic, @davidsbatista, @dfokina, @GunaPalanivel, @majiayu000, @OliverZhangA, @sjrl, @TaMaN2031A, @tommasocerruti, @tstadel, @vblagoje, @YassineGabsi
v2.22.0-rc1
v2.21.0
⭐️ Highlights
🔍 Smarter, Broader Retrieval with Multi-Query RAG
This release introduces three new components that significantly boost retrieval recall in RAG systems by expanding the user query and retrieving documents across multiple reformulations:
- QueryExpander generates semantically similar variations of a user query to broaden search coverage.
- MultiQueryTextRetriever runs multiple queries in parallel using a text-based retriever (e.g., BM25) and merges results by score.
- MultiQueryEmbeddingRetriever performs the same multi-query retrieval flow using embeddings, enabling richer semantic recall.
Used together, these components create a multi-query retrieval pipeline that improves recall, especially when queries are short or ambiguous.
🧪 Example: Expanding a Query and Retrieving More Relevant Documents
from haystack import Pipeline
from haystack.components.query import QueryExpander
from haystack.components.retrievers import InMemoryBM25Retriever
from haystack.components.retrievers import MultiQueryTextRetriever
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.writers import DocumentWriter
from haystack import Document
from haystack.document_stores.types import DuplicatePolicy
# Sample documents
docs = [
    Document(content="Renewable energy comes from natural sources like wind and sunlight."),
    Document(content="Geothermal energy is heat from beneath the Earth's surface."),
    Document(content="Hydropower generates electricity using flowing water."),
]
# Store documents
store = InMemoryDocumentStore()
writer = DocumentWriter(document_store=store, policy=DuplicatePolicy.SKIP)
writer.run(documents=docs)
# Components
expander = QueryExpander()
retriever = InMemoryBM25Retriever(document_store=store, top_k=1)
multi_retriever = MultiQueryTextRetriever(retriever=retriever)
# Expand and retrieve
expanded = expander.run(query="renewable energy")
results = multi_retriever.run(queries=expanded["queries"])
for doc in results["documents"]:
    print(doc.content)
This example expands "renewable energy" into multiple related queries, retrieves documents for each in parallel, and returns a richer set of relevant results, showing how multi-query retrieval improves recall with minimal effort.
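The merge step can be pictured with a small, library-free sketch. The `merge_results` function, the document ids, and the scores below are invented for illustration; keeping the best score per document is an assumption about how duplicates could be handled, not a description of Haystack's internals.

```python
def merge_results(results_per_query, top_k=None):
    """Hypothetical merge: dedupe by document id, keep the best score, sort descending."""
    best = {}
    for hits in results_per_query:
        for doc_id, score in hits:
            if doc_id not in best or score > best[doc_id]:
                best[doc_id] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k] if top_k is not None else ranked


# One hit list per expanded query, as (document id, score) pairs.
results_per_query = [
    [("wind", 0.82), ("solar", 0.40)],
    [("solar", 0.91), ("hydro", 0.35)],
    [("geothermal", 0.77), ("wind", 0.30)],
]
merged = merge_results(results_per_query)
print(merged)
```

A single query would have surfaced at most two of these documents; merging across reformulations returns all four, ranked by their best score.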
⬆️ Upgrade Notes
- Updated the default Azure OpenAI model from gpt-4o-mini to gpt-4.1-mini and the default API version from 2023-05-15 to 2024-12-01-preview for both AzureOpenAIGenerator and AzureOpenAIChatGenerator.
- The default OpenAI model has been changed from gpt-4o-mini to gpt-5-mini for OpenAIChatGenerator and OpenAIGenerator. If you rely on the default model and need to continue using gpt-4o-mini, explicitly specify it when initializing these components: OpenAIChatGenerator(model="gpt-4o-mini").
🚀 New Features
- Three new components have been added: QueryExpander, MultiQueryEmbeddingRetriever, and MultiQueryTextRetriever. When used together, they allow a query to be expanded, with each expansion used to retrieve a potentially different set of documents.
⚡️ Enhancement Notes
- Added a return_empty_on_no_match parameter (default True) to RegexTextExtractor.__init__(). When set to False, the component returns {"captured_text": ""} instead of {} when no regex match is found. Provides a consistent output structure for pipeline integration.
- The FilterRetriever and AutoMergingRetriever components now support asynchronous execution.
- Previously, when using tracing with objects like ByteStream and ImageContent, the payload sent to the tracing backend could become too large, hitting provider limits or causing performance degradation. We now replace these objects with string placeholders to avoid oversized payloads.
- The default OpenAI model for OpenAIChatGenerator and OpenAIGenerator has been updated from gpt-4o-mini to gpt-5-mini.
🐛 Bug Fixes
- Ensure request header keys are unique in link_content to prevent 400 Bad Request errors.
  Some image providers return a 400 Bad Request when using ImageContent.from_url() because the User-Agent header appears multiple times with different casing (e.g., user-agent, User-Agent). This update normalizes header keys in a case-insensitive way, removes duplicates, and preserves only the last occurrence.
- Fixed a bug where components explicitly listed in include_outputs_from would not appear in the pipeline results if they returned an empty dictionary. Now, any component specified in include_outputs_from will be included in the results regardless of whether its output is empty.
- Fixed the serialization and deserialization of pipeline_outputs in pipeline_snapshot so that it uses the same schema as the rest of the pipeline state when running pipelines with breakpoints. Deserialization of the older pipeline_outputs format, without a serialization schema, is supported until Haystack 2.23.0.
- Fixed ToolInvoker missing tools after warmup for lazy-initialized toolsets. The invoker now refreshes its tool registry post-warmup, ensuring replaced placeholders (e.g., MCPToolset with eager_connect=False) resolve to the actual tool names at invocation time.
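The header fix in the first item boils down to a case-insensitive merge in which the last value wins. A minimal sketch follows; the `dedupe_headers` helper is hypothetical, and lowercasing the keys is an assumption about the normalization, since the note does not say which casing is kept.

```python
def dedupe_headers(pairs):
    """Keep one entry per header name, compared case-insensitively; the last value wins."""
    merged = {}
    for name, value in pairs:
        merged[name.lower()] = value  # later duplicates overwrite earlier ones
    return merged


headers = dedupe_headers(
    [
        ("user-agent", "python-requests"),
        ("Accept", "*/*"),
        ("User-Agent", "haystack-fetcher"),  # same header, different casing
    ]
)
print(headers)
```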
💙 Big thank you to everyone who contributed to this release!
@Amnah199, @anakin87, @davidsbatista, @dfokina, @mrchtr, @OscarPindaro, @schwartzadev, @sjrl, @TaMaN2031A, @vblagoje, @YassineGabsi, @ZeJ0hn
v2.21.0-rc1
v2.20.0
⭐️ Highlights
Support for OpenAI's Responses API
Haystack now integrates OpenAI's Responses API through the new OpenAIResponsesChatGenerator and AzureOpenAIResponsesChatGenerator components.
This unlocks several advanced capabilities like:
- Retrieving concise summaries of the model’s reasoning process.
- Using native OpenAI or MCP tool formats alongside Haystack Tool objects and Toolset instances.
Example with reasoning and a web search tool:
from haystack.components.generators.chat import AzureOpenAIResponsesChatGenerator, OpenAIResponsesChatGenerator
from haystack.dataclasses import ChatMessage
# with `OpenAIResponsesChatGenerator`
chat_generator = OpenAIResponsesChatGenerator(
    model="o3-mini",
    generation_kwargs={"reasoning": {"summary": "auto", "effort": "low"}},
    tools=[{"type": "web_search"}],
)
response = chat_generator.run(messages=[ChatMessage.from_user("What's a positive news story from today?")])
# with `AzureOpenAIResponsesChatGenerator`
chat_generator = AzureOpenAIResponsesChatGenerator(
    azure_endpoint="https://example-resource.azure.openai.com/",
    azure_deployment="gpt-5-mini",
    generation_kwargs={"reasoning": {"effort": "low", "summary": "auto"}},
)
response = chat_generator.run(messages=[ChatMessage.from_user("What's Natural Language Processing?")])
print(response["replies"][0].text)
🚀 New Features
- Added the AzureOpenAIResponsesChatGenerator, a new component that integrates Azure OpenAI's Responses API into Haystack.
- Added the OpenAIResponsesChatGenerator, a new component that integrates OpenAI's Responses API into Haystack.
- If logprobs are enabled in the generation kwargs, they are returned in ChatMessage.meta for OpenAIChatGenerator and OpenAIResponsesChatGenerator.
- Added an extra field to ToolCall and ToolCallDelta to store provider-specific information.
- Updated serialization and deserialization of PipelineSnapshots to work with pydantic BaseModels.
- Added async support to SentenceWindowRetriever with a new run_async() method, allowing the retriever to be used in async pipelines and workflows.
- Added a warm_up() method to all ChatGenerator components (OpenAIChatGenerator, AzureOpenAIChatGenerator, HuggingFaceAPIChatGenerator, HuggingFaceLocalChatGenerator, and FallbackChatGenerator) to properly initialize tools that require warm-up before pipeline execution. The warm_up() method is idempotent and follows the same pattern used in the Agent and ToolInvoker components. This enables proper tool initialization in pipelines that use ChatGenerators with tools but without an Agent component.
- The AnswerBuilder component now exposes a new parameter return_only_referenced_documents (default: True) that controls whether only documents referenced in the replies are returned. Returned documents include two new fields in the meta dictionary:
  - source_index: the 1-based index of the document in the input list
  - referenced: a boolean value indicating whether the document was referenced in the replies (only present if the reference_pattern parameter is provided).
  These additions make it easier to display references and other sources within a RAG pipeline.
⚡️ Enhancement Notes
- Added generation_kwargs to the Agent component, allowing for more fine-grained control at run-time over chat generation.
- Added a revision parameter to all Sentence Transformers embedder components (SentenceTransformersDocumentEmbedder, SentenceTransformersTextEmbedder, SentenceTransformersSparseDocumentEmbedder, and SentenceTransformersSparseTextEmbedder) to allow users to specify a specific model revision/version from the Hugging Face Hub. This enables pinning to a particular model version for reproducibility and stability.
- Updated the components Agent, LLMMetadataExtractor, LLMMessagesRouter, and LLMDocumentContentExtractor to automatically call self.warm_up() at runtime if they have not been warmed up yet. This ensures that the components are ready for use without requiring an explicit warm-up call. This differs from previous behavior, where warm-up had to be manually invoked before use; otherwise a RuntimeError was raised.
- Improved log-trace correlation for DatadogTracer by using the official ddtrace.tracer.get_log_correlation_context() method.
- Improved Toolset warm-up architecture for better encapsulation. The base Toolset.warm_up() method now warms up all tools by default, while subclasses can override it to customize initialization (e.g., setting up shared resources instead of warming individual tools). The warm_up_tools() utility function has been simplified to delegate to Toolset.warm_up().
🐛 Bug Fixes
- Fixed deserialization of state schema when it is None in Agent.from_dict.
- Fixed a bug where components explicitly listed in include_outputs_from would not appear in the pipeline results if they returned an empty dictionary. Now, any component specified in include_outputs_from will be included in the results regardless of whether its output is empty.
- Fixed a type compatibility issue where passing list[Tool] to components with a tools parameter (such as ToolInvoker) caused static type checker errors.
  In version 2.19, ToolsType was changed to Union[list[Union[Tool, Toolset]], Toolset] to support mixing Tools and Toolsets. However, due to Python's list invariance, list[Tool] was no longer considered compatible with list[Union[Tool, Toolset]], breaking type checking for the common pattern of passing a list of Tool objects.
  The fix explicitly lists all valid type combinations in ToolsType: Union[list[Tool], list[Toolset], list[Union[Tool, Toolset]], Toolset]. This preserves backward compatibility for existing code while still supporting the new functionality of mixing Tools and Toolsets.
  Users who encountered type errors like "Argument of type 'list[Tool]' cannot be assigned to parameter 'tools'" should no longer see these errors after upgrading. No code changes are required on the user side.
- When creating a pipeline snapshot, we now use _deepcopy_with_exceptions when copying component inputs. This avoids deep copies of items like components and tools, which often contain attributes that are not deep-copyable. For example, the LinkContentFetcher has an httpx.Client as an attribute, which throws an error if deep-copied.
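The idea behind copying with exceptions can be sketched in a few lines of standard-library Python. The `deepcopy_or_keep` helper below is hypothetical and is not the private Haystack function named above; a `threading.Lock` stands in for an attribute such as an HTTP client that cannot be deep-copied.

```python
import copy
import threading


def deepcopy_or_keep(value):
    """Hypothetical helper: deep-copy when possible, otherwise reuse the original object."""
    if isinstance(value, dict):
        return {key: deepcopy_or_keep(item) for key, item in value.items()}
    if isinstance(value, list):
        return [deepcopy_or_keep(item) for item in value]
    try:
        return copy.deepcopy(value)
    except Exception:
        # Locks, clients, and similar objects cannot be deep-copied; share them instead.
        return value


lock = threading.Lock()
inputs = {"messages": [{"text": "hi"}], "client_lock": lock}
snapshot = deepcopy_or_keep(inputs)

# Plain data is copied, so later mutations do not leak into the snapshot.
inputs["messages"][0]["text"] = "changed"
print(snapshot["messages"], snapshot["client_lock"] is lock)
```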
💙 Big thank you to everyone who contributed to this release!
@Amnah199, @anakin87, @cmnemoi, @davidsbatista, @dfokina, @HamidOna, @Hansehart, @jdb78, @mrchtr, @sjrl, @swapniel99, @TaMaN2031A, @tstadel, @vblagoje
v2.20.0-rc2
v2.20.0-rc1
v2.19.0
⭐️ Highlights
🛡️ Try Multiple LLMs with FallbackChatGenerator
Introduced FallbackChatGenerator, a resilient chat generator that runs multiple LLMs sequentially and automatically falls back when one fails. It tries each generator in order until one succeeds, handling errors like timeouts, rate limits, or server issues. Ideal for building robust, production-grade chat systems that stay responsive across providers.
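The control flow is simple enough to sketch without any provider SDKs. In the snippet below, plain functions stand in for chat generators, and `run_with_fallback` is a hypothetical helper that only illustrates the try-in-order behaviour described above.

```python
def run_with_fallback(generators, prompt):
    """Hypothetical helper: try each generator in order and return the first success."""
    failures = []
    for generator in generators:
        try:
            reply = generator(prompt)
        except Exception as exc:
            # Record the failure and move on to the next provider.
            failures.append(f"{generator.__name__}: {exc}")
            continue
        return {
            "reply": reply,
            "successful_generator": generator.__name__,
            "failures": failures,
        }
    raise RuntimeError("All generators failed: " + "; ".join(failures))


def slow_provider(prompt):
    raise TimeoutError("request timed out")


def misconfigured_provider(prompt):
    raise ValueError("model not found")


def working_provider(prompt):
    return f"echo: {prompt}"


outcome = run_with_fallback(
    [slow_provider, misconfigured_provider, working_provider], "hello"
)
print(outcome)
```

Ordering the list from most to least preferred provider keeps the common case fast while still recording why the earlier attempts failed.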
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.google_genai import GoogleGenAIChatGenerator
from haystack_integrations.components.generators.anthropic import AnthropicChatGenerator
from haystack.components.generators.chat.openai import OpenAIChatGenerator
from haystack.components.generators.chat.fallback import FallbackChatGenerator
anthropic_generator = AnthropicChatGenerator(model="claude-sonnet-4-5", timeout=1) # force failure with low timeout
google_generator = GoogleGenAIChatGenerator(model="gemini-2.5-flashy") # force failure with typo in model name
openai_generator = OpenAIChatGenerator(model="gpt-4o-mini") # success
chat_generator = FallbackChatGenerator(chat_generators=[anthropic_generator, google_generator, openai_generator])
response = chat_generator.run(messages=[ChatMessage.from_user("What is the plot twist in Shawshank Redemption?")])
print("Successful ChatGenerator: ", response["meta"]["successful_chat_generator_class"])
print("Response: ", response["replies"][0].text)
Output:
WARNING:haystack.components.generators.chat.fallback:ChatGenerator AnthropicChatGenerator failed with error: Request timed out or interrupted...
WARNING:haystack.components.generators.chat.fallback:ChatGenerator GoogleGenAIChatGenerator failed with error: Error in Google Gen AI chat generation: 404 NOT_FOUND...
Successful ChatGenerator: OpenAIChatGenerator
Response: In "The Shawshank Redemption," ....
🛠️ Mix Tool and Toolset in Agents
You can now combine both Tool and Toolset objects in the same tools list for Agent and ToolInvoker components. This update brings more flexibility, letting you organize tools into logical groups while still adding standalone tools in one go.
from haystack.components.agents import Agent
from haystack.tools import Tool, Toolset
math_toolset = Toolset([add_tool, multiply_tool])
weather_toolset = Toolset([weather_tool, forecast_tool])
agent = Agent(
    chat_generator=generator,
    tools=[math_toolset, weather_toolset, calendar_tool],  # ✨ Now supported!
)
⚙️ Faster Agents with Tool Warmup
Tool and Toolset objects can now perform initialization during Agent or ToolInvoker warmup. This allows setup tasks such as connecting to databases, loading models, or initializing connection pools before the first use.
from haystack.tools import Toolset
from haystack.components.agents import Agent
# Custom toolset with initialization needs
class DatabaseToolset(Toolset):
    def __init__(self, connection_string):
        self.connection_string = connection_string
        self.pool = None
        super().__init__([query_tool, update_tool])

    def warm_up(self):
        # Initialize connection pool
        self.pool = create_connection_pool(self.connection_string)
🚀 New Features
- Updated our serialization and deserialization of PipelineSnapshots to work with Python Enum classes.
- Added FallbackChatGenerator, which automatically tries different chat generators in turn and returns the first successful response, with detailed information about which providers were tried.
- Added pipeline_snapshot and pipeline_snapshot_file_path parameters to BreakpointException to provide more context when a pipeline breakpoint is triggered. Added a pipeline_snapshot_file_path parameter to PipelineRuntimeError to include a reference to the stored pipeline snapshot so it can be easily found.
- Added a new component, RegexTextExtractor, which extracts text from chat messages or string input based on a custom regex pattern.
- CSVToDocument: added conversion_mode='row' with an optional content_column; each row becomes a Document, and the remaining columns are stored in meta. The default 'file' mode is preserved.
- Added the ability to resume an Agent from an AgentSnapshot while specifying a new breakpoint in the same run call. This allows stepwise debugging and precise control over chat generator inputs and tool inputs before execution, improving flexibility when inspecting intermediate states. This addresses a previous limitation where passing both a snapshot and a breakpoint simultaneously would throw an exception.
- Introduced SentenceTransformersSparseTextEmbedder and SentenceTransformersSparseDocumentEmbedder components. These components embed text and documents using sparse embedding models compatible with Sentence Transformers. Sparse embeddings are interpretable, efficient when used with inverted indexes, combine classic information retrieval with neural models, and are complementary to dense embeddings. Currently, the produced SparseEmbedding objects are compatible with the QdrantDocumentStore. Usage example:
from haystack.components.embedders import SentenceTransformersSparseTextEmbedder
text_embedder = SentenceTransformersSparseTextEmbedder()
text_embedder.warm_up()
print(text_embedder.run("I love pizza!"))
# {'sparse_embedding': SparseEmbedding(indices=[999, 1045, ...], values=[0.918, 0.867, ...])}
- Added a warm_up() function to the Tool dataclass, allowing tools to perform resource-intensive initialization before execution. Tools and Toolsets can now override the warm_up() method to establish connections to remote services, load models, or perform other preparatory operations. The ToolInvoker and Agent automatically call warm_up() on their tools during their own warm-up phase, ensuring tools are ready before use.
- Fixed a serialization issue related to function objects in a pipeline; they are now converted to None, since functions cannot be serialized. This issue was preventing breakpoints from being set in agents and used as a resume point. If an error occurs during an Agent execution, for instance during tool calling, a snapshot of the last successful step is raised, allowing the caller to catch it, inspect the possible reason for the crash, and resume the pipeline execution from that point onwards.
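To make the CSVToDocument row mode from the list above more concrete, here is a minimal sketch of the described behavior using only the standard library. It is not the Haystack implementation: the helper name rows_to_documents and the plain dict output are assumptions made for illustration, and it only covers the case where a content column is given.

```python
import csv
import io


def rows_to_documents(csv_text, content_column):
    """Hypothetical helper: turn each CSV row into a dict with 'content' and 'meta'."""
    reader = csv.DictReader(io.StringIO(csv_text))
    documents = []
    for row in reader:
        # The chosen column becomes the document content ...
        content = row[content_column]
        # ... and every remaining column is kept as metadata.
        meta = {key: value for key, value in row.items() if key != content_column}
        documents.append({"content": content, "meta": meta})
    return documents


csv_text = "id,title,body\n1,Intro,Hello world\n2,Outro,Goodbye\n"
docs = rows_to_documents(csv_text, content_column="body")
print(docs[0])
```

In the real component, the equivalent settings would be conversion_mode='row' and content_column, as named in the release note.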
⚡️ Enhancement Notes
- Added tools to agent run parameters to enhance the agent's flexibility. Users can now choose a subset of tools for the agent at runtime by providing a list of tool names, or supply an entirely new set by passing Tool objects or a Toolset.
- Enhanced the tools parameter across all tool-accepting components (Agent, ToolInvoker, OpenAIChatGenerator, AzureOpenAIChatGenerator, HuggingFaceAPIChatGenerator, HuggingFaceLocalChatGenerator) to accept either a mixed list of Tool and Toolset objects or just a Toolset object. Previously, components required either a list of Tool objects OR a single Toolset, but not both in the same list. Now users can organize tools into logical Toolsets while also including standalone Tool objects, providing greater flexibility in tool organization. For example: Agent(chat_generator=generator, tools=[math_toolset, weather_toolset, standalone_tool]). This change is fully backward compatible and preserves structure during serialization/deserialization, enabling proper round-trip support for mixed tool configurations.
- Refactored _save_pipeline_snapshot to consolidate try-except logic and added a raise_on_failure option to control whether save failures raise an exception or are logged. _create_pipeline_snapshot now wraps _serialize_value_with_schema in try-except blocks to prevent failures from non-serializable pipeline inputs.
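The two tool-related notes above (mixed Tool and Toolset lists, and choosing tools at run time) can be pictured with a small standalone sketch. It does not use Haystack's classes: SketchTool, SketchToolset, flatten, and select_tools are hypothetical stand-ins, and the resolution rules shown are an assumption about how such a selection could work, not the library's actual logic.

```python
from dataclasses import dataclass, field


@dataclass
class SketchTool:
    """Hypothetical stand-in for a tool; only the name matters here."""
    name: str


@dataclass
class SketchToolset:
    """Hypothetical stand-in for a group of tools."""
    tools: list = field(default_factory=list)


def flatten(tools):
    """Expand a toolset, or a mixed list of tools and toolsets, into a flat list."""
    if isinstance(tools, SketchToolset):
        return list(tools.tools)
    flat = []
    for item in tools:
        if isinstance(item, SketchToolset):
            flat.extend(item.tools)
        else:
            flat.append(item)
    return flat


def select_tools(configured, requested=None):
    """Pick the tools for one run: all of them, a subset by name, or a new set."""
    available = flatten(configured)
    if requested is None:
        return available
    if isinstance(requested, list) and all(isinstance(r, str) for r in requested):
        by_name = {tool.name: tool for tool in available}
        unknown = [name for name in requested if name not in by_name]
        if unknown:
            raise ValueError(f"Unknown tool names: {unknown}")
        return [by_name[name] for name in requested]
    # Anything else is treated as a replacement set of tools or toolsets.
    return flatten(requested)


math_toolset = SketchToolset([SketchTool("add"), SketchTool("multiply")])
configured = [math_toolset, SketchTool("calendar")]

all_names = [t.name for t in select_tools(configured)]
subset_names = [t.name for t in select_tools(configured, ["calendar", "add"])]
replaced_names = [t.name for t in select_tools(configured, [SketchTool("search")])]
print(all_names, subset_names, replaced_names)
```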
🐛 Bug Fixes
- Fixed the Agent run_async method to correctly handle async streaming callbacks, which previously triggered errors.
- Prevented duplication of the last assistant message in the chat history when initializing from an AgentSnapshot.
- OpenAIChatGenerator previously set response_format to None by default, which does not follow the API spec. The parameter is now omitted if the user does not pass response_format.
- Ensured that the OpenAIChatGenerator is properly serialized when response_format in generation_kwargs is provided as a dictionary (for example, {"type": "json_object"}). Previously, this caused serialization errors.
- Fixed parameter schema generation in ComponentTool when using inputs_from_state. Previously, parameters were only removed from the schema if the state key and parameter name matched exactly. For example, inputs_from_state={"text": "text"} removed text as expected, but inputs_from_state={"state_text": "text"} did not. This is now resolved, and such cases work as intended.
- Refactored SentenceTransformersEmbeddingBackend to ensure unique embedding IDs by incorporating all relevant arguments.
- Fixed Agent to correctly raise a BreakpointException when a ToolBreakpoint with a specific tool_name is provided in an assistant chat message containing multiple tool calls.
- The OpenAIChatGenerator implementation uses ChatCompletionMessageCustomToolCall, which is only available in OpenAI client >=1.99.2. We now require openai>=1.99.2.
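The inputs_from_state fix in the list above comes down to which side of the mapping is used to hide parameters from the schema. The sketch below illustrates that idea in plain Python; visible_parameters is a hypothetical helper, not ComponentTool's code, and it assumes the mapping goes from state key to tool parameter name, as the example in the note suggests.

```python
def visible_parameters(parameters, inputs_from_state):
    """Hypothetical helper: return the parameter names the LLM still has to supply."""
    # The mapping is state key -> tool parameter name, so the parameters to
    # hide are the mapping's values, not its keys.
    filled_from_state = set(inputs_from_state.values())
    return [name for name in parameters if name not in filled_from_state]


params = ["text", "top_k"]
same_name = visible_parameters(params, {"text": "text"})
different_name = visible_parameters(params, {"state_text": "text"})
print(same_name, different_name)
```

With this rule, both mappings hide text, whether or not the state key shares the parameter's name.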
💙 Big thank you to everyone who contributed to this release!
@anakin87, @bilgeyucel, @davidsbatista, @dfokina, @...
v2.19.0-rc1
v2.18.1
Release Notes
⚡️ Enhancement Notes
- Added tools to agent run parameters to enhance the agent's flexibility. Users can now choose a subset of tools for the agent at runtime by providing a list of tool names, or supply an entirely new set by passing Tool objects or a Toolset.
🐛 Bug Fixes
- Fixed the Agent run_async method to correctly handle async streaming callbacks, which previously triggered errors.
- Prevented duplication of the last assistant message in the chat history when initializing from an AgentSnapshot.
- OpenAIChatGenerator previously set response_format to None by default, which does not follow the API spec. The parameter is now omitted if the user does not pass response_format.