# Count tokens for Claude models

This page shows you how to use the `count-tokens` endpoint to get the number of tokens in a message before you send it to a Claude model. You can use the token count to ensure your prompts don't exceed the model's context window.

There is no charge for using the `count-tokens` endpoint.
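For example, once you have the count returned by the endpoint (the request and response formats are shown later on this page), you can check it against the model's context window before sending the real request. The following is a minimal sketch only; the 200,000-token context window and the reserved output budget are assumptions that you should confirm for the specific Claude model you use.

```python
# Minimal sketch: gate a request on the token count returned by count-tokens.
# ASSUMPTION: a 200,000-token context window; confirm the limit for your model.
CONTEXT_WINDOW = 200_000
RESERVED_OUTPUT_TOKENS = 1_024  # room to leave for the model's reply


def fits_in_context(input_tokens: int) -> bool:
    """Return True if the counted prompt leaves room for the reply."""
    return input_tokens + RESERVED_OUTPUT_TOKENS <= CONTEXT_WINDOW


# input_tokens comes from the count-tokens response, e.g. {"input_tokens": 14}.
if not fits_in_context(input_tokens=14):
    raise ValueError("Prompt is too long; shorten it before calling the model.")
```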
Supported Claude models
-----------------------

The following models support count tokens:

- [Claude Opus 4.1](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-opus-4-1)
- [Claude Opus 4](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-opus-4)
- [Claude Sonnet 4](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-sonnet-4)
- [Claude 3.7 Sonnet](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-7-sonnet)
- [Claude 3.5 Sonnet v2](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-5-sonnet-v2)
- [Claude 3.5 Haiku](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-5-haiku)
- [Claude 3.5 Sonnet](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-5-sonnet)
- [Claude 3 Opus](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-opus)
- [Claude 3 Haiku](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-haiku)

Supported regions
-----------------

The following regions support count tokens:

- `us-east5`
- `europe-west1`
- `asia-east1`
- `asia-southeast1`
- `us-central1`
- `europe-west4`

Count tokens in basic messages
------------------------------

To count tokens, send a `rawPredict` request to the `count-tokens` endpoint. The body of the request must contain the model ID of the model you want to count tokens against.

### REST

Before using any of the request data, make the following replacements:

- `LOCATION`: A [region](#regions) that supports Anthropic Claude models. To use the global endpoint, see [Specify the global endpoint](/vertex-ai/generative-ai/docs/partner-models/use-partner-models#global).
- `PROJECT_ID`: Your Google Cloud project ID.
- `MODEL`: The [model](#model-list) to count tokens against.
- `ROLE`: The role associated with a message. You can specify a `user` or an `assistant`. The first message must use the `user` role. Claude models operate with alternating `user` and `assistant` turns. If the final message uses the `assistant` role, then the response content continues immediately from the content in that message. You can use this to constrain part of the model's response.
- `CONTENT`: The content, such as text, of the `user` or `assistant` message.
HTTP method and URL:

```
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/anthropic/models/count-tokens:rawPredict
```
Request JSON body:

```
{
  "model": "MODEL",
  "messages": [
    {
      "role": "user",
      "content": "how many tokens are in this request?"
    }
  ]
}
```
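As described in the `ROLE` replacement above, a request body can also contain alternating `user` and `assistant` turns, with a final `assistant` message acting as a prefill that the model continues. The sketch below, which writes such a body to `request.json` for use with the commands that follow, is illustrative only; the message content and the `MODEL` placeholder are not from a real request.

```python
# Minimal sketch: write a multi-turn request body (with an assistant prefill)
# to request.json so it can be sent with the curl or PowerShell commands below.
# The model ID and message content are illustrative placeholders.
import json

body = {
    "model": "MODEL",
    "messages": [
        {"role": "user", "content": "Summarize the plot of Hamlet in one sentence."},
        {"role": "assistant", "content": "Hamlet, Prince of Denmark,"},
    ],
}

with open("request.json", "w", encoding="utf-8") as f:
    json.dump(body, f, indent=2)
```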
To send your request, choose one of these options:

#### curl

**Note:** The following command assumes that you have logged in to the `gcloud` CLI with your user account by running [`gcloud init`](/sdk/gcloud/reference/init) or [`gcloud auth login`](/sdk/gcloud/reference/auth/login), or by using [Cloud Shell](/shell/docs), which automatically logs you into the `gcloud` CLI. You can check the currently active account by running [`gcloud auth list`](/sdk/gcloud/reference/auth/list).

Save the request body in a file named `request.json`, and execute the following command:
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-08-27 UTC."],[],[],null,["# Count tokens for Claude models\n\nThe `count-tokens` endpoint lets you determine the number of tokens in a\nmessage before sending it to Claude, helping you make informed decisions about\nyour prompts and usage.\n\nThere is no cost for using the `count-tokens` endpoint.\n\nSupported Claude models\n-----------------------\n\nThe following models support count tokens:\n\n- [Claude Opus 4.1](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-opus-4-1)\n- [Claude Opus 4](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-opus-4)\n- [Claude Sonnet 4](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-sonnet-4)\n- [Claude 3.7 Sonnet](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-7-sonnet)\n- [Claude 3.5 Sonnet v2](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-5-sonnet-v2)\n- [Claude 3.5 Haiku](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-5-haiku)\n- [Claude 3.5 Sonnet](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-5-sonnet)\n- [Claude 3 Opus](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-opus)\n- [Claude 3 Haiku](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-haiku)\n\n\u003cbr /\u003e\n\nSupported regions\n-----------------\n\nThe following regions support count tokens:\n\n- `us-east5`\n- `europe-west1`\n- `asia-east1`\n- `asia-southeast1`\n- `us-central1`\n- `europe-west4`\n\nCount tokens in basic messages\n------------------------------\n\nTo count tokens, send a `rawPredict` request to the `count-tokens` endpoint. The\nbody of the request must contain the model ID of the model you want to count\ntokens against. \n\n### REST\n\n\nBefore using any of the request data,\nmake the following replacements:\n\n- \u003cvar class=\"edit\" scope=\"LOCATION\" translate=\"no\"\u003eLOCATION\u003c/var\u003e: A [region](#regions) that supports Anthropic Claude models. To use the global endpoint, see [Specify\n the global endpoint](/vertex-ai/generative-ai/docs/partner-models/use-partner-models#global).\n- \u003cvar class=\"edit\" scope=\"MODEL\" translate=\"no\"\u003eMODEL\u003c/var\u003e: The [model](#model-list) to count tokens against.\n- \u003cvar translate=\"no\"\u003eROLE\u003c/var\u003e: The role associated with a message. You can specify a `user` or an `assistant`. The first message must use the `user` role. Claude models operate with alternating `user` and `assistant` turns. If the final message uses the `assistant` role, then the response content continues immediately from the content in that message. 
You can use this to constrain part of the model's response.\n- \u003cvar translate=\"no\"\u003eCONTENT\u003c/var\u003e: The content, such as text, of the `user` or `assistant` message.\n\n\nHTTP method and URL:\n\n```\nPOST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/anthropic/models/count-tokens:rawPredict\n```\n\n\nRequest JSON body:\n\n```\n{\n \"model\": \"MODEL\",\n \"messages\": [\n {\n \"role\": \"user\",\n \"content\":\"how many tokens are in this request?\"\n }\n ],\n}\n```\n\nTo send your request, choose one of these options: \n\n#### curl\n\n| **Note:** The following command assumes that you have logged in to the `gcloud` CLI with your user account by running [`gcloud init`](/sdk/gcloud/reference/init) or [`gcloud auth login`](/sdk/gcloud/reference/auth/login) , or by using [Cloud Shell](/shell/docs), which automatically logs you into the `gcloud` CLI . You can check the currently active account by running [`gcloud auth list`](/sdk/gcloud/reference/auth/list).\n\n\nSave the request body in a file named `request.json`,\nand execute the following command:\n\n```\ncurl -X POST \\\n -H \"Authorization: Bearer $(gcloud auth print-access-token)\" \\\n -H \"Content-Type: application/json; charset=utf-8\" \\\n -d @request.json \\\n \"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/anthropic/models/count-tokens:rawPredict\"\n```\n\n#### PowerShell\n\n| **Note:** The following command assumes that you have logged in to the `gcloud` CLI with your user account by running [`gcloud init`](/sdk/gcloud/reference/init) or [`gcloud auth login`](/sdk/gcloud/reference/auth/login) . You can check the currently active account by running [`gcloud auth list`](/sdk/gcloud/reference/auth/list).\n\n\nSave the request body in a file named `request.json`,\nand execute the following command:\n\n```\n$cred = gcloud auth print-access-token\n$headers = @{ \"Authorization\" = \"Bearer $cred\" }\n\nInvoke-WebRequest `\n -Method POST `\n -Headers $headers `\n -ContentType: \"application/json; charset=utf-8\" `\n -InFile request.json `\n -Uri \"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/anthropic/models/count-tokens:rawPredict\" | Select-Object -Expand Content\n```\n\nYou should receive a JSON response similar to the following.\n\n#### Response\n\n```\n{ \"input_tokens\": 14 }\n```\n\n\u003cbr /\u003e\n\nFor information on how to count tokens in messages with tools, images, and PDFs,\nsee [Anthropic's documentation](https://docs.anthropic.com/en/docs/build-with-claude/token-counting).\n\nQuotas\n------\n\nBy default, the quota for the `count-tokens` endpoint is 2000 requests per\nminute."]]
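If you prefer to call the endpoint from a script rather than from curl or PowerShell, the following Python sketch sends the same `rawPredict` request using the `google-auth` and `requests` libraries. It is a minimal, unofficial example: the project ID, region, and model ID are placeholders you must replace, and error handling beyond `raise_for_status` is omitted.

```python
# Minimal sketch: call the count-tokens endpoint with google-auth + requests.
# Placeholders (PROJECT_ID, LOCATION, MODEL) must be replaced with your values.
import google.auth
import google.auth.transport.requests
import requests

PROJECT_ID = "your-project-id"   # placeholder
LOCATION = "us-east5"            # a region that supports count tokens
MODEL = "claude-sonnet-4"        # placeholder model ID; confirm the exact ID

# Obtain an access token for the active credentials (the equivalent of
# `gcloud auth print-access-token` in the curl example above).
credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
credentials.refresh(google.auth.transport.requests.Request())

url = (
    f"https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}"
    f"/locations/{LOCATION}/publishers/anthropic/models/count-tokens:rawPredict"
)
body = {
    "model": MODEL,
    "messages": [
        {"role": "user", "content": "how many tokens are in this request?"}
    ],
}

response = requests.post(
    url,
    headers={
        "Authorization": f"Bearer {credentials.token}",
        "Content-Type": "application/json; charset=utf-8",
    },
    json=body,
    timeout=30,
)
response.raise_for_status()
print(response.json()["input_tokens"])  # e.g. 14
```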