Introducing voyage-context-3: Focused Chunk-Level Details with Global Document Context

Voyage AI

Note to readers: voyage-context-3 is currently available through the Voyage AI API directly. For access, sign up for Voyage AI.

TL;DR: We’re excited to introduce voyage-context-3, a contextualized chunk embedding model whose chunk vectors capture the full document context without any manual metadata or context augmentation, delivering higher retrieval accuracy than standard embeddings with or without such augmentation. It’s also simpler, faster, and cheaper, works as a drop-in replacement for standard embeddings without downstream workflow changes, and reduces sensitivity to chunking strategy.

On chunk-level and document-level retrieval tasks, voyage-context-3 outperforms OpenAI-v3-large by 14.24% and 12.56%, Cohere-v4 by 7.89% and 5.64%, Jina-v3 late chunking by 23.66% and 6.76%, and contextual retrieval by 20.54% and 2.40%, respectively.

It also supports multiple dimensions and multiple quantization options enabled by Matryoshka learning and quantization-aware training, saving vectorDB costs while maintaining retrieval accuracy. For example, voyage-context-3 (binary, 512) outperforms OpenAI-v3-large (float, 3072) by 0.73% while reducing vector database storage costs by 99.48%—virtually the same performance at 0.5% of the cost.

We’re excited to introduce voyage-context-3, a novel contextualized chunk embedding model, where each chunk embedding encodes not only the chunk's own content but also contextual information from the full document. voyage-context-3 is a seamless drop-in replacement for standard, context-agnostic embedding models used in existing retrieval-augmented generation (RAG) pipelines, while offering improved retrieval quality through its ability to capture relevant contextual information.

Compared with both context-agnostic models that embed chunks in isolation (e.g., OpenAI-v3-large, Cohere-v4) and existing methods that add context to chunks—such as overlapping chunks and attaching metadata—voyage-context-3 delivers significant gains in retrieval performance while simplifying the tech stack.

On chunk-level (retrieving the most relevant chunk) and document-level retrieval (retrieving the document containing the most relevant chunk), voyage-context-3 outperforms on average:

  • OpenAI-v3-large and Cohere-v4 by 14.24% and 12.56%, and 7.89% and 5.64%, respectively.

  • Context augmentation methods Jina-v3 late chunking¹ and contextual retrieval² by 23.66% and 6.76%, and 20.54% and 2.40%, respectively.

  • voyage-3-large by 7.96% and 2.70%, respectively.

Figure: Chunk-level average and document-level average retrieval quality across models; voyage-context-3 scores highest on both.

Chunking challenges in RAG

Focused detail vs. global context. Chunking—breaking large documents into smaller segments, or chunks—is a common and often necessary step in RAG systems. Chunking was originally driven primarily by models’ limited context windows (which recent models, including Voyage’s, have extended significantly). More importantly, it allows each embedding to capture precise, fine-grained information about its passage, so the search system can pinpoint precisely relevant passages. However, this focus can come at the expense of broader context. Finally, without chunking, users must pass complete documents to downstream large language models (LLMs), driving up costs because many tokens may be irrelevant to the query.

For instance, if a 50-page legal document is vectorized into a single embedding, detailed information—such as the sentence “All data transmissions between the Client and the Service Provider’s infrastructure shall utilize AES-256 encryption in GCM mode”—is likely to be buried or lost in the aggregate. By chunking the document into paragraphs and vectorizing each one separately, the resulting embeddings can better capture localized details like “AES-256 encryption.” However, such a paragraph may not contain global context—such as the Client’s name—which is necessary to answer queries like “What encryption methods does Client VoyageAI want to use?”

Ideally, we want both focused detail and global context—without tradeoffs. Common workarounds—such as chunk overlaps, context summaries using LLMs (e.g., Anthropic’s contextual retrieval), or metadata augmentation—can introduce extra steps into an already complex AI application pipeline. These steps often require further experimentation to tune, resulting in increased development time and serving cost overhead.
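For illustration, here is a minimal sketch of the kind of per-chunk augmentation step these workarounds require; the summarize_with_llm helper and its behavior are hypothetical placeholders rather than any specific provider's API:

```python
# Hypothetical sketch of manual context augmentation -- the extra per-chunk
# step that contextualized chunk embeddings are designed to make unnecessary.

def summarize_with_llm(document: str, chunk: str) -> str:
    """Placeholder: ask an LLM how `chunk` fits into `document` (one call per chunk)."""
    raise NotImplementedError  # e.g., a call to your LLM provider of choice

def augment_chunks(document: str, chunks: list[str]) -> list[str]:
    # Prepend an LLM-written context blurb to each chunk before embedding it.
    # This adds one LLM call per chunk and must be re-run whenever the document
    # or the chunking strategy changes.
    return [summarize_with_llm(document, c) + "\n\n" + c for c in chunks]
```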

Introducing contextualized chunk embeddings

We’re excited to introduce contextualized chunk embeddings that capture both focused detail and global context. Our model processes the entire document in a single pass and generates a distinct embedding for each chunk. Each vector encodes not only the specific information within its chunk but also coarse-grained, document-level context, enabling richer and more semantically aware retrieval. The key is that the neural network sees all the chunks at the same time and decides intelligently what global information from other chunks should be injected into the individual chunk embeddings.
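Concretely, you send all chunks of a document in a single request and receive one vector back per chunk. A minimal sketch with the voyageai Python client is below; the contextualized_embed method name, the inputs layout (one list of chunks per document), and the response fields are assumptions drawn from the quickstart, so verify the exact signature in the Voyage AI documentation:

```python
import voyageai

vo = voyageai.Client()  # reads VOYAGE_API_KEY from the environment

# One document, pre-split into ordered chunks. The model sees all chunks together
# and returns one context-aware embedding per chunk.
chunks = [
    "1. Scope. This Agreement is entered into by Client VoyageAI and the Service Provider...",
    "4.2 Security. All data transmissions shall utilize AES-256 encryption in GCM mode...",
]

# Assumed method and response shape -- check the official quickstart.
result = vo.contextualized_embed(
    inputs=[chunks],            # a list of documents; each document is a list of chunks
    model="voyage-context-3",
    input_type="document",
)
chunk_embeddings = result.results[0].embeddings  # one vector per chunk, in order
```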

Figure: Diagram of a document split into three chunks; each chunk's vector contains detailed information about that chunk and high-level information about the whole document.
  1. Automatic full-document context awareness: Contextualized chunk embeddings capture the full context of the document without requiring the user to manually or explicitly provide contextual information. This leads to improved retrieval performance compared to isolated chunk embeddings, while remaining simpler, faster, and cheaper than other context-augmentation methods.

  2. Seamless drop-in replacement and storage cost parity: voyage-context-3 is a seamless drop-in replacement for standard, context-agnostic embedding models used in existing search systems, RAG pipelines, and agentic systems. It accepts the same input chunks and produces vectors with identical output dimensions and quantization—now enriched with document-level context for better retrieval performance. In contrast to ColBERT, which introduces many more vectors and far higher storage costs, voyage-context-3 generates the same number of vectors and is fully compatible with any existing vector database.

  3. Less sensitive to chunking strategy: While chunking strategy still influences RAG system behavior—and the optimal approach depends on data and downstream tasks—our contextualized chunk embeddings are empirically shown to reduce the system's sensitivity to these strategies, because the model intelligently supplements overly short chunks with global context.

Contextualized chunk embeddings outperform manual or LLM-based contextualization because neural networks are trained to capture context intelligently from large datasets, surpassing the limitations of ad hoc efforts. voyage-context-3 was trained using both document-level and chunk-level relevance labels, along with a dual objective that teaches the model to preserve chunk-level granularity while incorporating global context.

| | Context Preservation | Engineering Complexity | Retrieval Accuracy |
|---|---|---|---|
| Standard Embeddings (e.g., OpenAI-v3-large) | None | Low | Moderate |
| Metadata Augmentation & Contextual Retrieval (e.g., Jina-v3 late chunking) | Partial | High | Moderate-High |
| Contextualized Chunk Embeddings (e.g., voyage-context-3) | Full, Principled | Low | Highest |

Evaluation details

Chunk-level and document-level retrieval

For a given query, chunk-level retrieval returns the most relevant chunks, while document-level retrieval returns the documents containing those chunks. The figure below illustrates both retrieval levels across chunks from n documents. The most relevant chunk, often referred to as the “golden chunk,” is bolded and shown in green. Its corresponding parent document is shown in blue.
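One simple way to reuse the same chunk embeddings for both levels is to score each document by its best-matching chunk; this max-over-chunks aggregation is an illustrative convention, not necessarily the exact procedure used in our evaluation:

```python
import numpy as np

def chunk_level_topk(query_vec, chunk_vecs, k=10):
    # Cosine similarity, assuming all vectors are L2-normalized.
    sims = chunk_vecs @ query_vec
    return np.argsort(-sims)[:k]                 # indices of the top-k chunks

def document_level_topk(query_vec, chunk_vecs, doc_ids, k=10):
    # Score each document by its best-matching ("golden") chunk.
    sims = chunk_vecs @ query_vec
    best = {}                                    # doc_id -> best chunk similarity
    for doc_id, sim in zip(doc_ids, sims):
        best[doc_id] = max(best.get(doc_id, float("-inf")), float(sim))
    return sorted(best, key=best.get, reverse=True)[:k]
```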

Figure: Diagram illustrating chunk-level retrieval (the golden chunk, in green) and document-level retrieval (its parent document, in blue) for a query over chunks from n documents.

Datasets

We evaluate on 93 domain-specific retrieval datasets, spanning nine domains: web reviews, law, medical, long documents, technical documentation, code, finance, conversations, and multilingual, which are listed in this spreadsheet. Every dataset contains a set of queries and a set of documents. Each document consists of an ordered sequence of chunks, which we create with a reasonable chunking strategy. As usual, every query has a number of relevant documents, with scores indicating the degree of relevance; we call these document-level relevance labels and use them to evaluate document-level retrieval. Each query also has a list of most relevant chunks with relevance scores, curated in various ways, including labeling by LLMs; these are referred to as chunk-level relevance labels and are used for chunk-level retrieval evaluation.
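To make the label structure concrete, a miniature dataset might look like the following; the field names and values are illustrative only, not the actual schema of the evaluation spreadsheet:

```python
# Illustrative layout of one evaluation dataset (names and values are made up).
dataset = {
    "documents": {
        "doc_1": ["chunk 1 text ...", "chunk 2 text ...", "chunk 3 text ..."],  # ordered chunks
        "doc_2": ["chunk 1 text ...", "chunk 2 text ..."],
    },
    "queries": {
        "q_1": {
            "document_labels": {"doc_1": 2, "doc_2": 1},         # document-level relevance
            "chunk_labels": {("doc_1", 0): 2, ("doc_2", 1): 1},  # chunk-level relevance (doc, chunk index)
        },
    },
}
```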

We also include proprietary real-world datasets, such as technical documentation and documents containing header metadata. Finally, we assess voyage-context-3 across different embedding dimensions and various quantization options, on standard single-embedding retrieval evaluation, using the same datasets as in our previous retrieval-quality-versus-storage-cost analysis.

Models

We evaluate voyage-context-3 alongside several alternatives, including: OpenAI-v3-large (text-embedding-3-large), Cohere-v4 (embed-v4.0), Jina-v3 late chunking (jina-embeddings-v3), contextual retrieval, voyage-3.5, and voyage-3-large.

Metrics

Given a query, we retrieve the top 10 documents based on cosine similarity and report the normalized discounted cumulative gain (NDCG@10), a standard metric for retrieval quality.
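For reference, a minimal NDCG@k sketch is below; the gain and discount conventions shown are one common formulation and may differ in detail from the exact variant used in our evaluation:

```python
import math

def ndcg_at_k(ranked_relevances, all_relevances, k=10):
    """ranked_relevances: graded relevance of each retrieved item, in rank order.
    all_relevances: every relevance label for the query (used to build the ideal ranking)."""
    def dcg(rels):
        return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels[:k]))
    ideal = dcg(sorted(all_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal if ideal > 0 else 0.0

# Example: relevance labels of the top-10 retrieved items vs. the query's full label set.
score = ndcg_at_k([3, 0, 2, 0, 1, 0, 0, 0, 0, 0], [3, 2, 1, 1])
```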

Results

All the evaluation results are available in this spreadsheet, and we analyze the data below.

Domain-specific quality. The bar charts below show the average retrieval quality of voyage-context-3 with full-precision 2048 embeddings for each domain. In the following chunk-level retrieval chart, we can see that voyage-context-3 outperforms all other models across all domains. As noted earlier, for chunk-level retrieval, voyage-context-3 outperforms on average OpenAI-v3-large, Cohere-v4, Jina-v3 late chunking, and contextual retrieval by 14.24%, 7.89%, 23.66%, and 20.54%, respectively.

Figure: Chunk-level retrieval quality by domain (web, law, medical, long documents, technical documentation, code, finance, conversation, and multilingual); voyage-context-3 outperforms all other models in every domain.

voyage-context-3 also outperforms all other models across all domains in document-level retrieval, as shown in the corresponding chart below. On average, voyage-context-3 outperforms OpenAI-v3-large, Cohere-v4, Jina-v3 late chunking, and contextual retrieval by 12.56%, 5.64%, 6.76%, and 2.40%, respectively.

Figure: Document-level retrieval quality by domain; voyage-context-3 outperforms all other models in every domain.

Real-world datasets. voyage-context-3 performs strongly on our proprietary real-world technical documentation and in-house datasets, outperforming all other models. The bar chart below shows chunk-level retrieval results. Document-level retrieval results are provided in the evaluation spreadsheet.

Figure: Chunk-level retrieval quality on the technical documentation and in-house datasets; voyage-context-3 outperforms all other models on both.

Chunking sensitivity. Compared to standard, context-agnostic embeddings, voyage-context-3 is less sensitive to variations in chunk size and delivers stronger performance with smaller chunks. For example, on document-level retrieval, voyage-context-3 shows only a 2.06% variance, compared to 4.34% for voyage-3-large, and outperforms voyage-3-large by 6.63% when using 64-token chunks.

Figure: Retrieval quality of voyage-3-large and voyage-context-3 as chunk size increases; voyage-context-3 is markedly better at small chunk sizes, and the gap narrows as chunks grow larger.

Context metadata. We also evaluate performance when context metadata is prepended to chunks. Even with metadata prepended to chunks embedded by voyage-3-large, voyage-context-3 outperforms it by up to 5.53%, demonstrating better retrieval performance without the extra work and resources required to prepend metadata.

Figure: Retrieval quality for voyage-3-large and voyage-context-3, each with and without context metadata; voyage-context-3 is best on technical documentation, and voyage-context-3 with context metadata is best on the document-header evaluation datasets.

Matryoshka embeddings and quantization. voyage-context-3 supports 2048-, 1024-, 512-, and 256-dimensional embeddings, enabled by Matryoshka learning, along with multiple embedding quantization options—including 32-bit floating point, signed and unsigned 8-bit integer, and binary precision—while minimizing quality loss. Unlike the previous figures, the chart below reports standard single-embedding retrieval on documents. Compared with OpenAI-v3-large (float, 3072), voyage-context-3 (int8, 2048) reduces vector database costs by 83% with 8.60% better retrieval quality. Further, comparing OpenAI-v3-large (float, 3072) with voyage-context-3 (binary, 512), vector database costs are reduced by 99.48% with 0.73% better retrieval quality; that’s virtually the same retrieval performance at 0.5% of the cost.
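As a rough illustration of where the storage numbers come from, the sketch below truncates a 2048-dimensional float vector to 512 dimensions and binarizes it by sign; the re-normalization step and sign-based binarization are common conventions assumed here for illustration:

```python
import numpy as np

def truncate_and_binarize(vec_2048: np.ndarray, dim: int = 512) -> np.ndarray:
    # Matryoshka-style truncation: keep the leading `dim` coordinates, then re-normalize.
    v = vec_2048[:dim]
    v = v / np.linalg.norm(v)
    # Binary quantization by sign, packed into dim / 8 bytes.
    return np.packbits(v > 0)

# Storage per vector:
#   OpenAI-v3-large, float32, 3072 dims: 3072 * 4 bytes = 12,288 bytes
#   voyage-context-3, binary,  512 dims:  512 / 8 bytes =     64 bytes
# 64 / 12,288 is roughly 0.52%, i.e., about a 99.48% reduction in vector storage.
```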

Figure: Retrieval quality versus relative vector storage cost across embedding dimensions and quantization options.

Try voyage-context-3

voyage-context-3 is available today! The first 200 million tokens are free. Get started with this quickstart tutorial.

You can swap voyage-context-3 into any existing RAG pipeline without any downstream changes (see the query-time sketch after the list below). Contextualized chunk embeddings are especially effective for:

  1. Long, unstructured documents such as white papers, legal contracts, and research reports.

  2. Cross-chunk reasoning, where queries require information that spans multiple sections.

  3. High-sensitivity retrieval tasks—such as in finance, medical, or legal domains—where missing context can lead to costly errors.
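As referenced above, here is a minimal query-time sketch. As with the indexing example earlier, the contextualized_embed call, the input_type="query" parameter, and the response fields are assumptions to verify against the Voyage AI documentation:

```python
import numpy as np
import voyageai

vo = voyageai.Client()

# Assumed query-embedding convention for voyage-context-3: the query is passed as a
# single-chunk "document" -- confirm this in the official docs.
query = "What encryption methods does Client VoyageAI want to use?"
q_result = vo.contextualized_embed(
    inputs=[[query]],
    model="voyage-context-3",
    input_type="query",
)
query_vec = np.array(q_result.results[0].embeddings[0])

# `chunk_vecs` holds the chunk embeddings stored at indexing time (in any vector database).
# With normalized vectors, cosine similarity reduces to a dot product:
# top10 = np.argsort(-(chunk_vecs @ query_vec))[:10]
```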

To learn more about building AI applications with MongoDB, visit the MongoDB AI Learning Hub.


1 Jina. “Late Chunking in Long-Context Embedding Models.” August 22, 2024.

2 Anthropic. “Introducing Contextual Retrieval.” September 19, 2024.