Simulate data ingestion Technical preview; Added in 8.12.0

POST /_ingest/{index}/_simulate

All methods and paths for this operation:

GET /_ingest/_simulate
POST /_ingest/_simulate
GET /_ingest/{index}/_simulate
POST /_ingest/{index}/_simulate

Run ingest pipelines against a set of provided documents, optionally with substitute pipeline definitions, to simulate ingesting data into an index.

This API is meant to be used for troubleshooting or pipeline development, as it does not actually index any data into Elasticsearch.

The API runs the default and final pipelines for that index against a set of documents provided in the body of the request. If a pipeline contains a reroute processor, the API follows that processor to the new index and runs that index's pipelines as well, the same way a non-simulated ingest would. No data is indexed into Elasticsearch. Instead, the transformed document is returned, along with the list of pipelines that were run and the name of the index where the document would have been indexed if this were not a simulation. The transformed document is validated against the mappings that would apply to this index, and any validation error is reported in the result.

This API differs from the simulate pipeline API in that you specify a single pipeline for that API, and it runs only that one pipeline. The simulate pipeline API is more useful for developing a single pipeline, while the simulate ingest API is more useful for troubleshooting the interaction of the various pipelines that get applied when ingesting into an index.

By default, the pipeline definitions that are currently in the system are used. However, you can supply substitute pipeline definitions in the body of the request. These will be used in place of the pipeline definitions that are already in the system. This can be used to replace existing pipeline definitions or to create new ones. The pipeline substitutions are used only within this request.
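The override behavior described above can be pictured as a per-request dictionary merge. The following Python sketch is a conceptual model only, not how Elasticsearch implements it; `effective_pipelines`, `system_pipelines`, and the sample definitions are hypothetical names chosen for illustration.

```python
def effective_pipelines(system_pipelines, substitutions):
    """Return the pipeline definitions a single simulate request would see.

    Substitutes win over system pipelines of the same name, and names that
    do not exist in the system are added. The inputs are not modified, which
    mirrors the fact that substitutions apply only within the request.
    """
    merged = dict(system_pipelines)
    merged.update(substitutions or {})
    return merged


# Hypothetical definitions used only for illustration.
system_pipelines = {
    "my-pipeline": {"processors": [{"lowercase": {"field": "foo"}}]},
    "my-final-pipeline": {"processors": []},
}
substitutions = {
    "my-pipeline": {"processors": [{"uppercase": {"field": "foo"}}]},
    "brand-new": {"processors": []},
}

effective = effective_pipelines(system_pipelines, substitutions)
```

In this model, `my-pipeline` is replaced, `brand-new` is created, and `my-final-pipeline` is left as defined in the system.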

Required authorization

  • Index privileges: index

Path parameters

  • index string Required

    The index to simulate ingesting into. This value can be overridden by specifying an index on each document. If you specify this parameter in the request path, it is used for any documents that do not explicitly specify an index argument.
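The precedence between the path parameter and a document's own `_index` can be sketched as follows. This is a minimal illustration, and `resolve_target_index` is a hypothetical helper, not part of any client library. It covers only the initial target index, before any reroute processor changes it.

```python
def resolve_target_index(doc, path_index=None):
    """Return the index a document initially targets.

    A document-level `_index` takes precedence. The index from the request
    path is used only as a fallback. Returns None when neither is present.
    """
    return doc.get("_index", path_index)


from_doc = resolve_target_index({"_index": "other-index", "_source": {}}, "my-index")
from_path = resolve_target_index({"_source": {"foo": "bar"}}, "my-index")
```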

Query parameters

  • pipeline string

    The pipeline to use as the default pipeline. This value can be used to override the default pipeline of the index.
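As a rough sketch of how the optional path and query parameters combine, the helper below builds the request path using only the Python standard library. `simulate_ingest_path` is a hypothetical name used for illustration.

```python
from urllib.parse import quote, urlencode


def simulate_ingest_path(index=None, pipeline=None):
    """Build the request path for the simulate ingest API.

    `index` selects the /_ingest/{index}/_simulate form, and `pipeline`
    is appended as the `pipeline` query parameter.
    """
    if index is None:
        path = "/_ingest/_simulate"
    else:
        path = f"/_ingest/{quote(index, safe='')}/_simulate"
    if pipeline is not None:
        path += "?" + urlencode({"pipeline": pipeline})
    return path


path = simulate_ingest_path("my-index", "my-pipeline")
```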

application/json

Body Required

  • docs array[object] Required

    Sample documents to test in the pipeline.

    • _id string

      Unique identifier for the document. This ID must be unique within the _index.

    • _index string

      Name of the index containing the document.

    • _source object Required

      JSON body for the document.

  • component_template_substitutions object

    A map of component template names to substitute component template definition objects.

    • * object
      • template object Required
        • _meta object
          • * object Additional properties
        • version number
        • settings object
        • mappings object
          • date_detection boolean
          • dynamic_date_formats array[string]
          • dynamic_templates array[object]
          • numeric_detection boolean
          • properties object
          • runtime object
          • enabled boolean
        • aliases object
          • * object Additional properties
            • index_routing string

              Value used to route indexing operations to a specific shard. If specified, this overwrites the routing value for indexing operations.

            • is_write_index boolean

              If true, the index is the write index for the alias.

              Default value is false.

            • routing string

              Value used to route indexing and search operations to a specific shard.

            • search_routing string

              Value used to route search operations to a specific shard. If specified, this overwrites the routing value for search operations.

            • is_hidden boolean Generally available; Added in 7.16.0

              If true, the alias is hidden. All indices for the alias must have the same is_hidden value.

              Default value is false.

        • lifecycle object

          Data stream lifecycle with rollover. It can be used to display the configuration, including the default rollover conditions, if requested.

        • data_stream_options object | string | null

      • version number
      • _meta object
        • * object Additional properties
      • deprecated boolean
  • index_template_substitutions object

    A map of index template names to substitute index template definition objects.

    • * object
      • index_patterns string | array[string] Required

        Wildcard (*) patterns used to match the names of data streams and indices during creation.

      • composed_of array[string] Required

        An ordered list of component template names. Component templates are merged in the order specified, meaning that the last component template specified has the highest precedence.

      • template object

        Template to be applied. It may optionally include an aliases, mappings, or settings configuration.

        • aliases object

          Aliases to add. If the index template includes a data_stream object, these are data stream aliases. Otherwise, these are index aliases. Data stream aliases ignore the index_routing, routing, and search_routing options.

          • * object Additional properties
            • is_hidden boolean

              If true, the alias is hidden. All indices for the alias must have the same is_hidden value.

              Default value is false.

            • is_write_index boolean

              If true, the index is the write index for the alias.

              Default value is false.

        • mappings object

          Mapping for fields in the index. If specified, this mapping can include field names, field data types, and mapping parameters.

          • date_detection boolean
          • dynamic_date_formats array[string]
          • dynamic_templates array[object]
          • numeric_detection boolean
          • properties object
          • runtime object
          • enabled boolean
        • settings object

          Configuration options for the index.

          Index settings
        • lifecycle object

          Data stream lifecycle with rollover. It can be used to display the configuration, including the default rollover conditions, if requested.

        • data_stream_options object | string | null

      • version number

        Version number used to manage index templates externally. This number is not automatically generated by Elasticsearch.

      • priority number

        Priority to determine index template precedence when a new data stream or index is created. The index template with the highest priority is chosen. If no priority is specified, the template is treated as though it has priority 0 (the lowest priority). This number is not automatically generated by Elasticsearch.

      • _meta object

        Optional user metadata about the index template. May have any contents. This map is not automatically generated by Elasticsearch.

        • * object Additional properties
      • allow_auto_create boolean
      • data_stream object

        If this object is included, the template is used to create data streams and their backing indices. Supports an empty object. Data streams require a matching index template with a data_stream object.

        • hidden boolean

          If true, the data stream is hidden.

          Default value is false.

        • allow_custom_routing boolean

          If true, the data stream supports custom routing.

          Default value is false.

      • deprecated boolean Generally available; Added in 8.12.0

        Marks this index template as deprecated. When creating or updating a non-deprecated index template that uses deprecated components, Elasticsearch will emit a deprecation warning.

      • ignore_missing_component_templates string | array[string]

        A list of component template names that are allowed to be absent.

  • mapping_addition object
    • all_field object
      • analyzer string Required
      • enabled boolean Required
      • omit_norms boolean Required
      • search_analyzer string Required
      • similarity string Required
      • store boolean Required
      • store_term_vector_offsets boolean Required
      • store_term_vector_payloads boolean Required
      • store_term_vector_positions boolean Required
      • store_term_vectors boolean Required
    • date_detection boolean
    • dynamic string

      Values are strict, runtime, true, or false.

    • dynamic_date_formats array[string]
    • dynamic_templates array[object]
    • _field_names object
      • enabled boolean Required
    • index_field object
      • enabled boolean Required
    • _meta object
      • * object Additional properties
    • numeric_detection boolean
    • properties object
    • _routing object
      • required boolean Required
    • _size object
      • enabled boolean Required
    • _source object
      • compress boolean
      • compress_threshold string
      • enabled boolean
      • excludes array[string]
      • includes array[string]
      • mode string

        Supported values include:

        • disabled
        • stored
        • synthetic: Instead of storing source documents on disk exactly as you send them, Elasticsearch can reconstruct source content on the fly upon retrieval.

        Values are disabled, stored, or synthetic.

    • runtime object
      • * object Additional properties
        • fields object

          For type composite

          • * object Additional properties
        • fetch_fields array[object]

          For type lookup

          • field
          • format string
        • format string

          A custom format for date type runtime fields.

        • input_field string

          For type lookup

        • target_field string

          For type lookup

        • target_index string

          For type lookup

        • script object

          Painless script executed at query time.

          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • type string Required

          Field type.

          Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

    • enabled boolean
    • subobjects string

      Values are true or false.

    • _data_stream_timestamp object
      • enabled boolean Required
  • pipeline_substitutions object

    A map of pipeline names to substitute pipeline definition objects. These definitions are used in place of the pipelines with the same names that are already in the system, and only within this request.

    • * object Additional properties
      • description string

        Description of the ingest pipeline.

      • on_failure array[object]

        Processors to run immediately after a processor failure.

        • append object
        • attachment object
        • bytes object
        • circle object
        • community_id object
        • convert object
        • csv object
        • date object
        • date_index_name object
        • dissect object
        • dot_expander object
        • drop object
        • enrich object
        • fail object
        • fingerprint object
        • foreach object
        • ip_location object
        • geo_grid object
        • geoip object
        • grok object
        • gsub object
        • html_strip object
        • inference object
        • join object
        • json object
        • kv object
        • lowercase object
        • network_direction object
        • pipeline object
        • redact object
        • registered_domain object
        • remove object
        • rename object
        • reroute object
        • script object
        • set object
        • set_security_user object
        • sort object
        • split object
        • terminate object
        • trim object
        • uppercase object
        • urldecode object
        • uri_parts object
        • user_agent object
      • processors array[object]

        Processors used to perform transformations on documents before indexing. Processors run sequentially in the order specified.

        • append object
        • attachment object
        • bytes object
        • circle object
        • community_id object
        • convert object
        • csv object
        • date object
        • date_index_name object
        • dissect object
        • dot_expander object
        • drop object
        • enrich object
        • fail object
        • fingerprint object
        • foreach object
        • ip_location object
        • geo_grid object
        • geoip object
        • grok object
        • gsub object
        • html_strip object
        • inference object
        • join object
        • json object
        • kv object
        • lowercase object
        • network_direction object
        • pipeline object
        • redact object
        • registered_domain object
        • remove object
        • rename object
        • reroute object
        • script object
        • set object
        • set_security_user object
        • sort object
        • split object
        • terminate object
        • trim object
        • uppercase object
        • urldecode object
        • uri_parts object
        • user_agent object
      • version number

        Version number used by external systems to track ingest pipelines.

      • deprecated boolean

        Marks this ingest pipeline as deprecated. When a deprecated ingest pipeline is referenced as the default or final pipeline when creating or updating a non-deprecated index template, Elasticsearch will emit a deprecation warning.

        Default value is false.

      • _meta object

        Arbitrary metadata about the ingest pipeline. This map is not automatically generated by Elasticsearch.

        • * object Additional properties
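The `priority` rule described for index templates above (highest priority wins, and a missing priority counts as 0) can be sketched locally. This is an approximation for illustration, not Elasticsearch's matching code: `pick_index_template` is a hypothetical helper, and `fnmatchcase` also treats `?` and `[...]` as special characters, whereas index patterns use only the `*` wildcard.

```python
from fnmatch import fnmatchcase


def pick_index_template(index_name, templates):
    """Return the name of the matching template with the highest priority.

    `templates` maps template names to definitions shaped like the values
    of `index_template_substitutions`. Returns None if nothing matches.
    """
    best_name, best_priority = None, None
    for name, definition in templates.items():
        patterns = definition["index_patterns"]
        if isinstance(patterns, str):
            patterns = [patterns]  # the field accepts a string or a list
        if not any(fnmatchcase(index_name, p) for p in patterns):
            continue
        priority = definition.get("priority", 0)  # unspecified means 0
        if best_priority is None or priority > best_priority:
            best_name, best_priority = name, priority
    return best_name


# Hypothetical templates used only for illustration.
templates = {
    "generic": {"index_patterns": "my-*", "composed_of": []},
    "specific": {"index_patterns": ["my-index-*"], "composed_of": [], "priority": 10},
}
chosen = pick_index_template("my-index-000001", templates)
```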

Responses

  • 200 application/json
    • docs array[object] Required
      • doc object

        The results of ingest simulation on a single document. The _source of the document contains the results after running all pipelines listed in executed_pipelines on the document. The list of executed pipelines is derived from the pipelines that would be executed if this document had been ingested into _index.

        • _id string Required

          Identifier for the document.

        • _index string Required

          Name of the index that the document would be indexed into if this were not a simulation.

        • _source object Required

          JSON body for the document.

          • * object Additional properties
        • _version
        • executed_pipelines array[string] Required

          A list of the names of the pipelines executed on this document.

        • ignored_fields array[object]

          A list of the fields that would be ignored at the indexing step. For example, a field whose value is larger than the allowed limit would make it through all of the pipelines, but would not be indexed into Elasticsearch.

        • error object

          Any error resulting from simulating ingest on this doc. This can be an error generated by executing a processor, or a mapping validation error when simulating indexing the resulting doc.
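Because each entry in `docs` wraps its result in a `doc` object, a small helper can flatten the response for inspection. The sketch below works on a plain dictionary shaped like the response described above. `summarize_simulation` and `failed_documents` are hypothetical helpers, and the contents of the sample `error` object are made up for illustration.

```python
def summarize_simulation(response):
    """Return one flat summary dict per simulated document."""
    rows = []
    for item in response.get("docs", []):
        doc = item.get("doc") or {}
        rows.append(
            {
                "id": doc.get("_id"),
                "index": doc.get("_index"),
                "pipelines": doc.get("executed_pipelines", []),
                "ignored_fields": doc.get("ignored_fields", []),
                "error": doc.get("error"),
            }
        )
    return rows


def failed_documents(response):
    """Return the summaries of documents that reported an error."""
    return [row for row in summarize_simulation(response) if row["error"] is not None]


# Illustrative response; the `error` contents are invented for this sketch.
sample_response = {
    "docs": [
        {"doc": {"_id": "123", "_index": "my-index", "_version": -3,
                 "_source": {"foo": "BAR"},
                 "executed_pipelines": ["my-pipeline", "my-final-pipeline"]}},
        {"doc": {"_id": "456", "_index": "my-index", "_version": -3,
                 "_source": {"bar": "rab"}, "executed_pipelines": [],
                 "error": {"type": "example_error", "reason": "illustrative only"}}},
    ]
}
failures = failed_documents(sample_response)
```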

POST /_ingest/_simulate
{
  "docs": [
    {
      "_id": "123",
      "_index": "my-index",
      "_source": {
        "foo": "bar"
      }
    },
    {
      "_id": "456",
      "_index": "my-index",
      "_source": {
        "foo": "rab"
      }
    }
  ]
}
resp = client.simulate.ingest(
    docs=[
        {
            "_id": "123",
            "_index": "my-index",
            "_source": {
                "foo": "bar"
            }
        },
        {
            "_id": "456",
            "_index": "my-index",
            "_source": {
                "foo": "rab"
            }
        }
    ],
)
const response = await client.simulate.ingest({
  docs: [
    {
      _id: "123",
      _index: "my-index",
      _source: {
        foo: "bar",
      },
    },
    {
      _id: "456",
      _index: "my-index",
      _source: {
        foo: "rab",
      },
    },
  ],
});
response = client.simulate.ingest(
  body: {
    "docs": [
      {
        "_id": "123",
        "_index": "my-index",
        "_source": {
          "foo": "bar"
        }
      },
      {
        "_id": "456",
        "_index": "my-index",
        "_source": {
          "foo": "rab"
        }
      }
    ]
  }
)
$resp = $client->simulate()->ingest([
    "body" => [
        "docs" => array(
            [
                "_id" => "123",
                "_index" => "my-index",
                "_source" => [
                    "foo" => "bar",
                ],
            ],
            [
                "_id" => "456",
                "_index" => "my-index",
                "_source" => [
                    "foo" => "rab",
                ],
            ],
        ),
    ],
]);
curl -X POST -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"docs":[{"_id":"123","_index":"my-index","_source":{"foo":"bar"}},{"_id":"456","_index":"my-index","_source":{"foo":"rab"}}]}' "$ELASTICSEARCH_URL/_ingest/_simulate"
client.simulate().ingest(i -> i
    .docs(List.of(Document.of(d -> d
            .id("123")
            .index("my-index")
            .source(JsonData.fromJson("{\"foo\":\"bar\"}"))),Document.of(d -> d
            .id("456")
            .index("my-index")
            .source(JsonData.fromJson("{\"foo\":\"rab\"}")))))
);
In this example the index `my-index` has a default pipeline called `my-pipeline` and a final pipeline called `my-final-pipeline`. Since both documents are being ingested into `my-index`, both pipelines are run using the pipeline definitions that are already in the system.
{
  "docs": [
    {
      "_id": "123",
      "_index": "my-index",
      "_source": {
        "foo": "bar"
      }
    },
    {
      "_id": "456",
      "_index": "my-index",
      "_source": {
        "foo": "rab"
      }
    }
  ]
}
In this example the index `my-index` has a default pipeline called `my-pipeline` and a final pipeline called `my-final-pipeline`. But a substitute definition of `my-pipeline` is provided in `pipeline_substitutions`. The substitute `my-pipeline` will be used in place of the `my-pipeline` that is in the system, and then the `my-final-pipeline` that is already defined in the system will run.
{
  "docs": [
    {
      "_index": "my-index",
      "_id": "123",
      "_source": {
        "foo": "bar"
      }
    },
    {
      "_index": "my-index",
      "_id": "456",
      "_source": {
        "foo": "rab"
      }
    }
  ],
  "pipeline_substitutions": {
    "my-pipeline": {
      "processors": [
        {
          "uppercase": {
            "field": "foo"
          }
        }
      ]
    }
  }
}
In this example, imagine that the index `my-index` has a strict mapping with only the `foo` keyword field defined. Say that field mapping came from a component template named `my-mappings-template`. You want to test adding a new field, `bar`. So a substitute definition of `my-mappings-template` is provided in `component_template_substitutions`. The substitute `my-mappings-template` will be used in place of the existing mapping for `my-index` and in place of the `my-mappings-template` that is in the system.
{
  "docs": [
    {
      "_index": "my-index",
      "_id": "123",
      "_source": {
        "foo": "foo"
      }
    },
    {
      "_index": "my-index",
      "_id": "456",
      "_source": {
        "bar": "rab"
      }
    }
  ],
  "component_template_substitutions": {
    "my-mappings-template": {
      "template": {
        "mappings": {
          "dynamic": "strict",
          "properties": {
            "foo": {
              "type": "keyword"
            },
            "bar": {
              "type": "keyword"
            }
          }
        }
      }
    }
  }
}
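With a `strict` mapping, a document that contains a field missing from the mapping is rejected. The API performs that validation itself, but the effect of the substitute template on the two sample documents can be previewed with a rough local check. `unmapped_top_level_fields` is a hypothetical helper that looks only at top-level fields, so it is a sketch, not a replacement for the simulation.

```python
def unmapped_top_level_fields(source, mappings):
    """List top-level source fields that a strict mapping does not define.

    Returns an empty list when the mapping is not strict, because unknown
    fields are not rejected in that case.
    """
    if mappings.get("dynamic") != "strict":
        return []
    known = mappings.get("properties", {})
    return sorted(field for field in source if field not in known)


# Mapping as described for my-index before the substitution.
current = {"dynamic": "strict", "properties": {"foo": {"type": "keyword"}}}
# Mapping from the substitute my-mappings-template.
substitute = {
    "dynamic": "strict",
    "properties": {"foo": {"type": "keyword"}, "bar": {"type": "keyword"}},
}

before = unmapped_top_level_fields({"bar": "rab"}, current)
after = unmapped_top_level_fields({"bar": "rab"}, substitute)
```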
The pipeline, component template, and index template substitutions replace the existing definitions with the same names for the duration of this request.
{
  "docs": [
    {
      "_id": "id",
      "_index": "my-index",
      "_source": {
        "foo": "bar"
      }
    },
    {
      "_id": "id",
      "_index": "my-index",
      "_source": {
        "foo": "rab"
      }
    }
  ],
  "pipeline_substitutions": {
    "my-pipeline": {
      "processors": [
        {
          "set": {
            "field": "field3",
            "value": "value3"
          }
        }
      ]
    }
  },
  "component_template_substitutions": {
    "my-component-template": {
      "template": {
        "mappings": {
          "dynamic": true,
          "properties": {
            "field3": {
              "type": "keyword"
            }
          }
        },
        "settings": {
          "index": {
            "default_pipeline": "my-pipeline"
          }
        }
      }
    }
  },
  "index_template_substitutions": {
    "my-index-template": {
      "index_patterns": [
        "my-index-*"
      ],
      "composed_of": [
        "component_template_1",
        "component_template_2"
      ]
    }
  },
  "mapping_addition": {
    "dynamic": "strict",
    "properties": {
      "foo": {
        "type": "keyword"
      }
    }
  }
}
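A request body like the one above can be assembled programmatically. The following sketch builds a plain dictionary and leaves out any optional section that was not supplied. `build_simulate_body` is a hypothetical helper, and its checks cover only the attributes marked Required in the body description (`docs` and each document's `_source`).

```python
import json


def build_simulate_body(
    docs,
    pipeline_substitutions=None,
    component_template_substitutions=None,
    index_template_substitutions=None,
    mapping_addition=None,
):
    """Assemble a simulate ingest request body as a dictionary."""
    if not docs:
        raise ValueError("docs is required and must not be empty")
    for doc in docs:
        if "_source" not in doc:
            raise ValueError("every document needs a _source object")

    body = {"docs": list(docs)}
    optional = {
        "pipeline_substitutions": pipeline_substitutions,
        "component_template_substitutions": component_template_substitutions,
        "index_template_substitutions": index_template_substitutions,
        "mapping_addition": mapping_addition,
    }
    # Keep only the sections the caller actually provided.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


body = build_simulate_body(
    docs=[{"_id": "123", "_index": "my-index", "_source": {"foo": "bar"}}],
    pipeline_substitutions={
        "my-pipeline": {"processors": [{"set": {"field": "field3", "value": "value3"}}]}
    },
)
payload = json.dumps(body)
```

The resulting `payload` string can be sent as the JSON body of the request, for example with the curl command shown earlier.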
A successful response when the simulation uses pipeline definitions that are already in the system.
{
  "docs": [
    {
      "doc": {
        "_id": "123",
        "_index": "my-index",
        "_version": -3,
        "_source": {
          "field1": "value1",
          "field2": "value2",
          "foo": "bar"
        },
        "executed_pipelines": [
          "my-pipeline",
          "my-final-pipeline"
        ]
      }
    },
    {
      "doc": {
        "_id": "456",
        "_index": "my-index",
        "_version": -3,
        "_source": {
          "field1": "value1",
          "field2": "value2",
          "foo": "rab"
        },
        "executed_pipelines": [
          "my-pipeline",
          "my-final-pipeline"
        ]
      }
    }
  ]
}
A successful response when the simulation uses pipeline substitutions.
{
  "docs": [
    {
      "doc": {
        "_id": "123",
        "_index": "my-index",
        "_version": -3,
        "_source": {
          "field2": "value2",
          "foo": "BAR"
        },
        "executed_pipelines": [
          "my-pipeline",
          "my-final-pipeline"
        ]
      }
    },
    {
      "doc": {
        "_id": "456",
        "_index": "my-index",
        "_version": -3,
        "_source": {
          "field2": "value2",
          "foo": "RAB"
        },
        "executed_pipelines": [
          "my-pipeline",
          "my-final-pipeline"
        ]
      }
    }
  ]
}
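Comparing the `_source` returned with and without substitutions is a quick way to see what a substitute pipeline changes. The sketch below diffs two `_source` objects taken from the example responses for document 123. `source_changes` is a hypothetical helper that compares top-level fields only.

```python
def source_changes(before, after):
    """Describe how one document's _source differs between two runs."""
    changes = {}
    for key in sorted(set(before) | set(after)):
        if key not in after:
            changes[key] = ("removed", before[key])
        elif key not in before:
            changes[key] = ("added", after[key])
        elif before[key] != after[key]:
            changes[key] = ("changed", before[key], after[key])
    return changes


# _source for document 123 using the pipelines already in the system.
system_source = {"field1": "value1", "field2": "value2", "foo": "bar"}
# _source for document 123 using the substitute my-pipeline.
substituted_source = {"field2": "value2", "foo": "BAR"}

diff = source_changes(system_source, substituted_source)
```

Here `diff` reports that `field1` is no longer set and that `foo` changed from `bar` to `BAR`, while `field2`, which the unchanged final pipeline would set, is not reported.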
A successful response when the simulation uses component template substitutions.
{
  "docs": [
    {
      "doc": {
        "_id": "123",
        "_index": "my-index",
        "_version": -3,
        "_source": {
          "foo": "foo"
        },
        "executed_pipelines": []
      }
    },
    {
      "doc": {
        "_id": "456",
        "_index": "my-index",
        "_version": -3,
        "_source": {
          "bar": "rab"
        },
        "executed_pipelines": []
      }
    }
  ]
}