Create a trained model Generally available; Added in 7.10.0

PUT /_ml/trained_models/{model_id}

Enables you to supply a trained model that was not created by data frame analytics.

Required authorization

  • Cluster privileges: manage_ml

Path parameters

  • model_id string Required

    The unique identifier of the trained model.

Query parameters

  • defer_definition_decompression boolean Generally available; Added in 8.0.0

    If set to true and a compressed_definition is provided, the request defers definition decompression and skips relevant validations.

  • wait_for_completion boolean Generally available; Added in 8.8.0

    Whether to wait for all child operations (e.g. model download) to complete.

application/json

Body Required

  • compressed_definition string

    The compressed (GZipped and Base64 encoded) inference definition of the model. If compressed_definition is specified, then definition cannot be specified.

  • definition object

    The inference definition for the model. If definition is specified, then compressed_definition cannot be specified.

    • preprocessors array[object]

      Collection of preprocessors

      • frequency_encoding object
        • field string Required
        • feature_name string Required
        • frequency_map object Required
      • one_hot_encoding object
        • field string Required
        • hot_map object Required
      • target_mean_encoding object
        • field string Required
        • feature_name string Required
        • target_map object Required
        • default_value number Required
    • trained_model object Required

      The definition of the trained model.

      • tree object

        The definition for a binary decision tree.

        • classification_labels array[string]
        • feature_names array[string] Required
        • target_type string
        • tree_structure array[object] Required
      • tree_node object

        The definition of a node in a tree. There are two major types of nodes: leaf nodes and non-leaf nodes.

        • Leaf nodes only need node_index and leaf_value defined.
        • All other nodes need split_feature, left_child, right_child, threshold, decision_type, and default_left defined.

        • decision_type string
        • default_left boolean
        • leaf_value number
        • left_child number
        • node_index number Required
        • right_child number
        • split_feature number
        • split_gain number
        • threshold number
      • ensemble object

        The definition for an ensemble model

        • classification_labels array[string]
        • feature_names array[string]
        • target_type string
        • trained_models array[object] Required
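The leaf versus non-leaf rule for tree_node can be illustrated with a small Python sketch. The field names come from the attribute list above. The validate_node helper is hypothetical (it is not part of any Elasticsearch client), and the decision_type value "lt" and the feature name are assumptions made only for the example.

```python
# Hypothetical helper: checks a node against the leaf / non-leaf rule described above.
# Keys a non-leaf node needs according to the tree_node description.
SPLIT_FIELDS = {
    "node_index", "split_feature", "left_child", "right_child",
    "threshold", "decision_type", "default_left",
}


def validate_node(node: dict) -> str:
    """Return "leaf" or "split"; raise ValueError when required keys are missing."""
    if "node_index" not in node:
        raise ValueError("node_index is required on every node")
    if "leaf_value" in node:
        # Leaf nodes only need node_index and leaf_value.
        return "leaf"
    missing = SPLIT_FIELDS - node.keys()
    if missing:
        raise ValueError(
            f"non-leaf node {node['node_index']} is missing {sorted(missing)}"
        )
    return "split"


# A three-node tree: one split at the root and two leaves.
tree_structure = [
    {
        "node_index": 0,
        "split_feature": 0,
        "threshold": 10.0,
        "decision_type": "lt",  # assumed value, for illustration only
        "default_left": True,
        "left_child": 1,
        "right_child": 2,
    },
    {"node_index": 1, "leaf_value": 1.0},
    {"node_index": 2, "leaf_value": 2.0},
]

kinds = [validate_node(node) for node in tree_structure]

# Only the attributes marked Required for tree are included here.
trained_model = {
    "tree": {
        "feature_names": ["feature_0"],  # hypothetical feature name
        "tree_structure": tree_structure,
    }
}
print(kinds)
```

A client-side check like this only catches missing keys early; the server remains the authority on whether a definition is accepted.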
  • description string

    A human-readable description of the inference trained model.

  • inference_config object

    The default configuration for inference. This can be either a regression or classification configuration. It must match the underlying definition.trained_model's target_type. For pre-packaged models such as ELSER the config is not required.

    • regression object

      Regression configuration for inference.

      • results_field string

        The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • num_top_feature_importance_values number

        Specifies the maximum number of feature importance values per document.

        Default value is 0.

    • classification object

      Classification configuration for inference.

      • num_top_classes number

        Specifies the number of top class predictions to return. Defaults to 0.

      • num_top_feature_importance_values number

        Specifies the maximum number of feature importance values per document.

        Default value is 0.

      • prediction_field_type string

        Specifies the type of the predicted field to write. Acceptable values are: string, number, boolean. When boolean is provided 1.0 is transformed to true and 0.0 to false.

      • results_field string

        The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • top_classes_results_field string

        Specifies the field to which the top classes are written. Defaults to top_classes.

    • text_classification object

      Text classification configuration for inference.

      • num_top_classes number

        Specifies the number of top class predictions to return. Defaults to 0.

      • tokenization object

        The tokenization options

      • results_field string

        The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • classification_labels array[string]

        Classification labels to apply other than the stored labels. Must have the same dimensions as the default configured labels.

      • vocabulary object
    • zero_shot_classification object

      Zero-shot classification configuration for inference.

      • tokenization object

        The tokenization options to update when inferring

      • hypothesis_template string

        Hypothesis template used when tokenizing labels for prediction

        Default value is "This example is {}.".

      • classification_labels array[string] Required

        The zero-shot classification labels indicating entailment, neutral, and contradiction. Must contain exactly and only entailment, neutral, and contradiction.

      • results_field string

        The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • multi_label boolean

        Indicates if more than one true label exists.

        Default value is false.

      • labels array[string]

        The labels to predict.

    • fill_mask object

      Fill mask configuration for inference.

      • mask_token string

        The string/token which will be removed from incoming documents and replaced with the inference prediction(s). In a response, this field contains the mask token for the specified model/tokenizer. Each model and tokenizer has a predefined mask token which cannot be changed. Thus, it is recommended not to set this value in requests. However, if this field is present in a request, its value must match the predefined value for that model/tokenizer, otherwise the request will fail.

      • num_top_classes number

        Specifies the number of top class predictions to return. Defaults to 0.

      • tokenization object

        The tokenization options to update when inferring

      • results_field string

        The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • vocabulary object Required
    • learning_to_rank object
      • default_params object
        • * object Additional properties
      • feature_extractors array[object]
      • num_top_feature_importance_values number Required
    • ner object

      Named entity recognition configuration for inference.

      • tokenization object

        The tokenization options

      • results_field string

        The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • classification_labels array[string]

        The token classification labels. Must be IOB formatted tags

      • vocabulary object
    • pass_through object

      Pass through configuration for inference.

      • tokenization object

        The tokenization options

      • results_field string

        The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • vocabulary object
    • text_embedding object

      Text embedding configuration for inference.

      • embedding_size number

        The number of dimensions in the embedding output

      • tokenization object

        The tokenization options

      • results_field string

        The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • vocabulary object Required
    • text_expansion object

      Text expansion configuration for inference.

      • tokenization object

        The tokenization options

      • results_field string

        The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • vocabulary object Required
    • question_answering object

      Question answering configuration for inference.

      • num_top_classes number

        Specifies the number of top class predictions to return. Defaults to 0.

      • tokenization object

        The tokenization options to update when inferring

      • results_field string

        The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • max_answer_length number

        The maximum answer length to consider
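As a rough illustration of the inference_config shapes listed above, the following Python sketch builds a classification config and a zero_shot_classification config as plain dictionaries. The key names and defaults are taken from the attribute descriptions. The check_zero_shot_labels helper, the example label values, and the choice to ignore label order are assumptions for the sake of the example.

```python
import json

# The three labels that classification_labels must contain, per the description above.
ENTAILMENT_LABELS = {"entailment", "neutral", "contradiction"}


def check_zero_shot_labels(config: dict) -> None:
    """Hypothetical client-side check of zero_shot_classification.classification_labels."""
    labels = config.get("classification_labels")
    if labels is None:
        raise ValueError("classification_labels is required")
    # Order is not checked here; whether the server cares about order is not stated.
    if len(labels) != 3 or set(labels) != ENTAILMENT_LABELS:
        raise ValueError(
            "classification_labels must contain exactly entailment, neutral, and contradiction"
        )


# Classification config that spells out the documented default field names.
classification_config = {
    "classification": {
        "num_top_classes": 2,
        "prediction_field_type": "string",
        "results_field": "predicted_value",
        "top_classes_results_field": "top_classes",
    }
}

# Zero-shot config using the documented default hypothesis template.
zero_shot_config = {
    "zero_shot_classification": {
        "classification_labels": ["entailment", "neutral", "contradiction"],
        "hypothesis_template": "This example is {}.",
        "multi_label": False,
        "labels": ["sports", "politics"],  # hypothetical labels to predict
    }
}

check_zero_shot_labels(zero_shot_config["zero_shot_classification"])
print(json.dumps(classification_config, indent=2))
```

Each dictionary holds a single configuration type, which matches the description of inference_config as one configuration that has to agree with the model's target_type.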

  • input object

    The input field names for the model definition.

    • field_names string | array[string] Required
  • metadata object

    An object map that contains metadata about the model.

  • model_type string

    The model type.

    Supported values include:

    • tree_ensemble: The model definition is an ensemble model of decision trees.
    • lang_ident: A special type reserved for language identification models.
    • pytorch: The stored definition is a PyTorch (specifically a TorchScript) model. Currently only NLP models are supported.

    Values are tree_ensemble, lang_ident, or pytorch.

  • model_size_bytes number

    The estimated memory usage in bytes to keep the trained model in memory. This property is supported only if defer_definition_decompression is true or the model definition is not supplied.

  • platform_architecture string

    The platform architecture (if applicable) of the trained model. If the model only works on one platform, because it is heavily optimized for a particular processor architecture and OS combination, then this field specifies which. The format of the string must match the platform identifiers used by Elasticsearch, so it must be one of linux-x86_64, linux-aarch64, darwin-x86_64, darwin-aarch64, or windows-x86_64. For portable models (those that work independent of processor architecture or OS features), leave this field unset.

  • tags array[string]

    An array of tags to organize the model.

  • prefix_strings object

    Optional prefix strings applied at inference

    • ingest string

      String prepended to input at ingest
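The body parameters above describe compressed_definition as GZipped and Base64 encoded, and state that definition and compressed_definition are mutually exclusive. A minimal Python sketch of both points follows. It assumes the definition is serialized as UTF-8 JSON before compression. The helper names (compress_definition, decompress_definition, build_body) and the sample values are hypothetical.

```python
import base64
import gzip
import json


def compress_definition(definition: dict) -> str:
    """GZip the JSON-serialized definition, then Base64 encode it."""
    raw = json.dumps(definition).encode("utf-8")
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


def decompress_definition(compressed: str) -> dict:
    """Inverse of compress_definition, useful for checking a payload locally."""
    return json.loads(gzip.decompress(base64.b64decode(compressed)))


def build_body(*, definition=None, compressed_definition=None, **other_fields) -> dict:
    """Assemble a request body, refusing to set both definition variants."""
    if definition is not None and compressed_definition is not None:
        raise ValueError("specify definition or compressed_definition, not both")
    body = dict(other_fields)
    if definition is not None:
        body["definition"] = definition
    if compressed_definition is not None:
        body["compressed_definition"] = compressed_definition
    return body


# Tiny single-leaf tree, used only to have something to compress.
definition = {
    "trained_model": {
        "tree": {
            "feature_names": ["feature_0"],  # hypothetical feature name
            "tree_structure": [{"node_index": 0, "leaf_value": 1.0}],
        }
    }
}

body = build_body(
    compressed_definition=compress_definition(definition),
    model_type="tree_ensemble",
    input={"field_names": ["feature_0"]},
    tags=["example"],
)
print(sorted(body))
```

Round-tripping through decompress_definition is a cheap way to confirm the payload decodes to the intended JSON before sending it.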

Responses

  • 200 application/json
    • model_id string Required

      Identifier for the trained model.

    • model_type string

      The model type

      Supported values include:

      • tree_ensemble: The model definition is an ensemble model of decision trees.
      • lang_ident: A special type reserved for language identification models.
      • pytorch: The stored definition is a PyTorch (specifically a TorchScript) model. Currently only NLP models are supported.

      Values are tree_ensemble, lang_ident, or pytorch.

    • tags array[string] Required

      An array of tags. A trained model can have many tags, or none.

    • version string

      The Elasticsearch version number in which the trained model was created.

    • compressed_definition string
    • created_by string

      Information on the creator of the trained model.

    • create_time string | number

      The time when the trained model was created.

    • default_field_map object

      Any field map described in the inference configuration takes precedence.

      • * string Additional properties
    • description string

      The free-text description of the trained model.

    • estimated_heap_memory_usage_bytes number

      The estimated heap usage in bytes to keep the trained model in memory.

    • estimated_operations number

      The estimated number of operations to use the trained model.

    • fully_defined boolean

      True if the full model definition is present.

    • inference_config object

      The default configuration for inference. This can be either a regression, classification, or one of the many NLP focused configurations. It must match the underlying definition.trained_model's target_type. For pre-packaged models such as ELSER the config is not required.

      • regression object

        Regression configuration for inference.

        • results_field string

          The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

        • num_top_feature_importance_values number

          Specifies the maximum number of feature importance values per document.

          Default value is 0.

      • classification object

        Classification configuration for inference.

        • num_top_classes number

          Specifies the number of top class predictions to return. Defaults to 0.

        • num_top_feature_importance_values number

          Specifies the maximum number of feature importance values per document.

          Default value is 0.

        • prediction_field_type string

          Specifies the type of the predicted field to write. Acceptable values are: string, number, boolean. When boolean is provided 1.0 is transformed to true and 0.0 to false.

        • results_field string

          The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

        • top_classes_results_field string

          Specifies the field to which the top classes are written. Defaults to top_classes.

      • text_classification object

        Text classification configuration for inference.

        • num_top_classes number

          Specifies the number of top class predictions to return. Defaults to 0.

        • tokenization object

          The tokenization options

        • results_field string

          The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

        • classification_labels array[string]

          Classification labels to apply other than the stored labels. Must have the same dimensions as the default configured labels.

        • vocabulary object
      • zero_shot_classification object

        Zero-shot classification configuration for inference.

        • tokenization object

          The tokenization options to update when inferring

        • hypothesis_template string

          Hypothesis template used when tokenizing labels for prediction

          Default value is "This example is {}.".

        • classification_labels array[string] Required

          The zero-shot classification labels indicating entailment, neutral, and contradiction. Must contain exactly and only entailment, neutral, and contradiction.

        • results_field string

          The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

        • multi_label boolean

          Indicates if more than one true label exists.

          Default value is false.

        • labels array[string]

          The labels to predict.

      • fill_mask object

        Fill mask configuration for inference.

        • mask_token string

          The string/token which will be removed from incoming documents and replaced with the inference prediction(s). In a response, this field contains the mask token for the specified model/tokenizer. Each model and tokenizer has a predefined mask token which cannot be changed. Thus, it is recommended not to set this value in requests. However, if this field is present in a request, its value must match the predefined value for that model/tokenizer, otherwise the request will fail.

        • num_top_classes number

          Specifies the number of top class predictions to return. Defaults to 0.

        • tokenization object

          The tokenization options to update when inferring

        • results_field string

          The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

        • vocabulary object Required
      • learning_to_rank object
        • default_params object
          • * object Additional properties
        • feature_extractors array[object]
        • num_top_feature_importance_values number Required
      • ner object

        Named entity recognition configuration for inference.

        • tokenization object

          The tokenization options

        • results_field string

          The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

        • classification_labels array[string]

          The token classification labels. Must be IOB formatted tags

        • vocabulary object
      • pass_through object

        Pass through configuration for inference.

        • tokenization object

          The tokenization options

        • results_field string

          The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

        • vocabulary object
      • text_embedding object

        Text embedding configuration for inference.

        • embedding_size number

          The number of dimensions in the embedding output

        • tokenization object

          The tokenization options

        • results_field string

          The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

        • vocabulary object Required
      • text_expansion object

        Text expansion configuration for inference.

        • tokenization object

          The tokenization options

        • results_field string

          The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

        • vocabulary object Required
      • question_answering object

        Question answering configuration for inference.

        • num_top_classes number

          Specifies the number of top class predictions to return. Defaults to 0.

        • tokenization object

          The tokenization options to update when inferring

        • results_field string

          The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

        • max_answer_length number

          The maximum answer length to consider

    • input object Required

      The input field names for the model definition.

      • field_names array[string] Required

        An array of input field names for the model.

    • license_level string

      The license level of the trained model.

    • metadata object

      An object containing metadata about the trained model. For example, models created by data frame analytics contain analysis_config and input objects.

      • model_aliases array[string]
      • feature_importance_baseline object

        An object that contains the baseline for feature importance values. For regression analysis, it is a single value. For classification analysis, there is a value for each class.

        • * string Additional properties
      • hyperparameters array[object]

        List of the available hyperparameters optimized during the fine_parameter_tuning phase as well as specified by the user.

        • absolute_importance number

          A positive number showing how much the parameter influences the variation of the loss function. For hyperparameters with values that are not specified by the user but tuned during hyperparameter optimization.

        • name string Required

          Name of the hyperparameter.

        • relative_importance number

          A number between 0 and 1 showing the proportion of influence on the variation of the loss function among all tuned hyperparameters. For hyperparameters with values that are not specified by the user but tuned during hyperparameter optimization.

        • supplied boolean Required

          Indicates if the hyperparameter is specified by the user (true) or optimized (false).

        • value number Required

          The value of the hyperparameter, either optimized or specified by the user.

      • total_feature_importance array[object]

        An array of the total feature importance for each feature used from the training data set. This array of objects is returned if data frame analytics trained the model and the request includes total_feature_importance in the include request parameter.

        • feature_name string Required

          The feature for which this importance was calculated.

        • importance array[object] Required

          A collection of feature importance statistics related to the training data set for this particular feature.

        • classes array[object] Required

          If the trained model is a classification model, feature importance statistics are gathered per target class value.

    • model_size_bytes number | string

    • model_package object
      • Time unit for milliseconds

      • description string
      • inference_config object
        • * object Additional properties
      • metadata object
        • * object Additional properties
      • minimum_version string
      • model_repository string
      • model_type string
      • packaged_model_id string Required
      • platform_architecture string
      • prefix_strings object
        • ingest string

          String prepended to input at ingest

      • size number | string

      • sha256 string
      • tags array[string]
      • vocabulary_file string
    • location object
      • index object Required
        • name string Required
    • platform_architecture string
    • prefix_strings object
      • ingest string

        String prepended to input at ingest

PUT /_ml/trained_models/{model_id}
curl \
 --request PUT 'http://api.example.com/_ml/trained_models/{model_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"compressed_definition":"string","definition":{"preprocessors":[{"frequency_encoding":{"field":"string","feature_name":"string","frequency_map":{}},"one_hot_encoding":{"field":"string","hot_map":{}},"target_mean_encoding":{"field":"string","feature_name":"string","target_map":{},"default_value":42.0}}],"trained_model":{"tree":{"classification_labels":["string"],"feature_names":["string"],"target_type":"string","tree_structure":[{}]},"tree_node":{"decision_type":"string","default_left":true,"leaf_value":42.0,"left_child":42.0,"node_index":42.0,"right_child":42.0,"split_feature":42.0,"split_gain":42.0,"threshold":42.0},"ensemble":{"classification_labels":["string"],"feature_names":["string"],"target_type":"string","trained_models":[{}]}}},"description":"string","inference_config":{"regression":{"results_field":"string","num_top_feature_importance_values":0},"classification":{"num_top_classes":42.0,"num_top_feature_importance_values":0,"prediction_field_type":"string","results_field":"string","top_classes_results_field":"string"},"text_classification":{"num_top_classes":42.0,"tokenization":{},"results_field":"string","classification_labels":["string"],"vocabulary":{}},"zero_shot_classification":{"tokenization":{},"hypothesis_template":"\"This example is {}.\"","classification_labels":["string"],"results_field":"string","multi_label":false,"labels":["string"]},"fill_mask":{"mask_token":"string","num_top_classes":42.0,"tokenization":{},"results_field":"string","vocabulary":{}},"learning_to_rank":{"default_params":{"additionalProperty1":{},"additionalProperty2":{}},"feature_extractors":[{}],"num_top_feature_importance_values":42.0},"ner":{"tokenization":{},"results_field":"string","classification_labels":["string"],"vocabulary":{}},"pass_through":{"tokenization":{},"results_field":"string","vocabulary":{}},"text_embedding":{"embedding_size":42.0,"tokenization":{},"results_field":"string","vocabulary":{}},"text_expansion":{"tokenization":{},"results_field":"string","vocabulary":{}},"question_answering":{"num_top_classes":42.0,"tokenization":{},"results_field":"string","max_answer_length":42.0}},"input":{"field_names":"string"},"metadata":{},"model_type":"tree_ensemble","model_size_bytes":42.0,"platform_architecture":"string","tags":["string"],"prefix_strings":{"ingest":"string","search":"string"}}'
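The curl call above could be mirrored in Python roughly as follows, using only the standard library. This sketch builds the request object without sending it. The base URL http://localhost:9200, the model ID, the API key value, and the build_put_request helper are assumptions for illustration. The Authorization header value is passed through unchanged, as in the curl sample.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_put_request(base_url, model_id, body, api_key,
                      defer_definition_decompression=None,
                      wait_for_completion=None):
    """Build (but do not send) a PUT /_ml/trained_models/{model_id} request."""
    params = {}
    # Only include query parameters that the caller explicitly set.
    if defer_definition_decompression is not None:
        params["defer_definition_decompression"] = str(defer_definition_decompression).lower()
    if wait_for_completion is not None:
        params["wait_for_completion"] = str(wait_for_completion).lower()

    url = f"{base_url.rstrip('/')}/_ml/trained_models/{quote(model_id, safe='')}"
    if params:
        url += "?" + urlencode(params)

    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": api_key,  # passed as-is, mirroring the curl sample
            "Content-Type": "application/json",
        },
    )


request = build_put_request(
    "http://localhost:9200",          # assumed cluster address
    "my-model",                       # hypothetical model ID
    {"model_type": "tree_ensemble", "tags": ["example"]},
    "placeholder-api-key",            # placeholder credential
    wait_for_completion=True,
)
print(request.get_method(), request.full_url)
```

Sending the request would be a matter of passing it to urllib.request.urlopen against a reachable cluster; that step is left out so the sketch runs without network access.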