Starting April 29, 2025, Gemini 1.5 Pro and Gemini 1.5 Flash models are not available in projects that have no prior usage of these models, including new projects. For details, see Model versions and lifecycle.
Vertex AI RAG Engine, a component of the Vertex AI Platform, is a data framework for developing applications that use Retrieval-Augmented Generation (RAG). RAG augments the context of a large language model (LLM) with your own data.
A common challenge with LLMs is that they can't access private knowledge, such as your organization's data. With Vertex AI RAG Engine, you can enrich the LLM's context with your private information. This process helps the model reduce hallucinations and answer questions more accurately.
Combining your knowledge sources with an LLM's existing knowledge provides the model with better context. The improved context, along with the user's query, enhances the quality of the LLM's response. For example, to answer a question about a company's internal policy, a RAG system first retrieves the relevant policy document and then uses an LLM to generate an answer based on that document.
The following image illustrates the key concepts of the RAG process in Vertex AI RAG Engine.
The RAG process includes the following steps:
Data ingestion: Ingests data from various sources, such as local files, Cloud Storage, and Google Drive.
Data transformation: Transforms data in preparation for indexing, for example, by splitting it into chunks.
Embedding: Converts text into numerical representations (embeddings) that capture semantic meaning. Text with similar meanings has similar embeddings.
Data indexing: Creates an index, called a corpus, to structure the knowledge base for optimized searching.
Retrieval: Searches the indexed knowledge base to find information relevant to a user's query or prompt.
Generation: Adds the retrieved information as context to the original user query, which guides the generative AI model to produce factually grounded and relevant responses.
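The steps above can be sketched end to end with a toy example. The word-based chunker, bag-of-words "embedding", and cosine-similarity lookup below are illustrative stand-ins for Vertex AI RAG Engine's managed ingestion, embedding models, and corpus index, not the service's actual implementation.

```python
# Toy RAG pipeline: chunk -> embed -> index -> retrieve -> build prompt.
# The "embedding" is a simple word-count vector; a real system uses a learned
# embedding model (for example, Vertex AI text embeddings).
import math
from collections import Counter

def chunk(text, size=8, overlap=2):
    """Data transformation: split a document into overlapping word chunks."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]

def embed(text):
    """Embedding stand-in: map text to word counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Data ingestion + indexing: the "corpus" is a list of (chunk, embedding) pairs.
docs = [
    "Employees may work remotely up to three days per week with manager approval.",
    "Expense reports must be filed within thirty days of the purchase date.",
]
corpus = [(c, embed(c)) for d in docs for c in chunk(d)]

# Retrieval: find the chunk most similar to the user's query.
query = "How many remote days are allowed per week?"
best_chunk, _ = max(corpus, key=lambda pair: cosine(embed(query), pair[1]))

# Generation: the retrieved chunk becomes context added to the original query.
prompt = f"Context: {best_chunk}\n\nQuestion: {query}\nAnswer based on the context."
print(best_chunk)  # -> "three days per week with manager approval."
```

In a real deployment, the index lives in a managed corpus and the final prompt is sent to a generative model; the point here is only how each stage feeds the next.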
Supported regions
Vertex AI RAG Engine is supported in the following regions:
| Region | Location | Description | Launch stage |
| --- | --- | --- | --- |
| us-central1 | Iowa | v1 and v1beta1 versions are supported. | Allowlist |
| us-east4 | Virginia | v1 and v1beta1 versions are supported. | GA |
| europe-west3 | Frankfurt, Germany | v1 and v1beta1 versions are supported. | GA |
| europe-west4 | Eemshaven, Netherlands | v1 and v1beta1 versions are supported. | GA |
Access to us-central1 requires you to be on an allowlist. To experiment with Vertex AI RAG Engine, you can use other available regions. If you need to use us-central1 for production traffic, contact vertex-ai-rag-engine-support@google.com to request access.
Vertex AI RAG Engine supports VPC Service Controls and CMEK. Data residency and AXT security controls aren't supported.

Submit feedback

To chat with Google support, go to the Vertex AI RAG Engine support group (https://groups.google.com/a/google.com/g/vertex-ai-rag-engine-support).

To send an email, use the email address vertex-ai-rag-engine-support@google.com.

What's next

To learn how to use the Vertex AI SDK to run Vertex AI RAG Engine tasks, see RAG quickstart for Python.

To learn about grounding, see Grounding overview.

To learn more about the responses from RAG, see Retrieval and Generation Output of Vertex AI RAG Engine.

To learn about the RAG architecture, see Infrastructure for a RAG-capable generative AI application using Vertex AI and Vector Search, and Infrastructure for a RAG-capable generative AI application using Vertex AI and AlloyDB for PostgreSQL.

Last updated 2025-08-28 UTC.