README.md: 3 additions & 3 deletions
@@ -192,7 +192,7 @@ See [examples/models/llama](examples/models/llama/README.md) for complete workflows
| macOS | XNNPACK, MPS, Metal *(experimental)* |
| Embedded / MCU | XNNPACK, ARM Ethos-U, NXP, Cadence DSP |

-See [Backend Documentation](https://docs.pytorch.org/executorch/main/backends-overview.html) for detailed hardware requirements and optimization guides.
+See [Backend Documentation](https://docs.pytorch.org/executorch/main/backends-overview.html) for detailed hardware requirements and optimization guides. For desktop/laptop GPU inference with CUDA and Metal, see the [Desktop Guide](desktop/README.md). For Zephyr RTOS integration, see the [Zephyr Guide](zephyr/README.md).

## Production Deployments

@@ -204,9 +204,9 @@ ExecuTorch powers on-device AI at scale across Meta's family of apps, VR/AR devices

**Multimodal:** [Llava](examples/models/llava/README.md) (vision-language), [Voxtral](examples/models/voxtral/README.md) (audio-language), [Gemma](examples/models/gemma3) (vision-language)

-**Vision/Speech:** [MobileNetV2](https://github.com/meta-pytorch/executorch-examples/tree/main/mv2), [DeepLabV3](https://github.com/meta-pytorch/executorch-examples/tree/main/dl3), [Whisper](https://github.com/meta-pytorch/executorch-examples/tree/main/whisper/android/WhisperApp)
+**Vision/Speech:** [MobileNetV2](https://github.com/meta-pytorch/executorch-examples/tree/main/mv2), [DeepLabV3](https://github.com/meta-pytorch/executorch-examples/tree/main/dl3), [Whisper](examples/models/whisper/README.md) <!-- @lint-ignore -->

-**Resources:** [`examples/`](examples/) directory • [executorch-examples](https://github.com/meta-pytorch/executorch-examples) out-of-tree demos • [Optimum-ExecuTorch](https://github.com/huggingface/optimum-executorch) for HuggingFace models
+**Resources:** [`examples/`](examples/) directory • [executorch-examples](https://github.com/meta-pytorch/executorch-examples) out-of-tree demos • [Optimum-ExecuTorch](https://github.com/huggingface/optimum-executorch) for HuggingFace models • [Unsloth](https://docs.unsloth.ai/new/deploy-llms-phone) for fine-tuned LLM deployment <!-- @lint-ignore -->

## Key Features

docs/source/backends-cadence.md: 2 additions & 0 deletions
@@ -10,6 +10,8 @@ In this tutorial we will walk you through the process of getting set up to build

In addition to the chip, the HiFi4 Neural Network Library ([nnlib](https://github.com/foss-xtensa/nnlib-hifi4)) offers an optimized set of library functions commonly used in NN processing that we utilize in this example to demonstrate how common operations can be accelerated.

+For an overview of the Cadence ExecuTorch integration with performance benchmarks, see the blog post: [Running Optimized PyTorch Models on Cadence DSPs with ExecuTorch](https://community.cadence.com/cadence_blogs_8/b/ip/posts/running-optimized-pytorch-models-on-cadence-dsps-with-executorch).
+
On top of being able to run on the Xtensa HiFi4 DSP, another goal of this tutorial is to demonstrate how portable ExecuTorch is: it can run even on a low-power embedded device such as the Xtensa HiFi4 DSP. This workflow does not require any delegates; it uses custom operators and compiler passes to make the model more suitable for running on Xtensa HiFi4 DSPs. A custom [quantizer](https://pytorch.org/tutorials/prototype/quantization_in_pytorch_2_0_export_tutorial.html) is used to represent activations and weights as `uint8` instead of `float` and to call the appropriate operators. Finally, custom kernels optimized with Xtensa intrinsics provide runtime acceleration.
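
To make the quantization step concrete, the following is a minimal sketch of the PT2E flow described above: export the model to a graph, annotate it with a quantizer, calibrate, and convert. It is illustrative only; the `quantize_pt2e` entry points have moved between PyTorch releases, and the `CadenceDefaultQuantizer` import path is an assumption, so consult the Cadence backend sources for the exact names.

```python
import torch
from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e

# Assumed import path for the Cadence quantizer; the actual location in the
# ExecuTorch tree may differ across versions.
from executorch.backends.cadence.aot.quantizer.quantizer import CadenceDefaultQuantizer


class SmallModel(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.linear = torch.nn.Linear(16, 16)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.nn.functional.relu(self.linear(x))


model = SmallModel().eval()
example_inputs = (torch.randn(1, 16),)

# Export the eager model to a graph module. Newer releases use
# torch.export.export(...); older ones used export_for_training or
# capture_pre_autograd_graph.
graph_module = torch.export.export(model, example_inputs).module()

# Annotate with the quantizer, run a calibration pass over representative
# inputs, then convert so activations and weights are represented as uint8.
prepared = prepare_pt2e(graph_module, CadenceDefaultQuantizer())
prepared(*example_inputs)  # calibration
quantized = convert_pt2e(prepared)
```

From here, the quantized graph module enters the usual ExecuTorch export and lowering flow covered in the rest of this tutorial.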

::::{grid} 2
docs/source/success-stories.md: 32 additions & 2 deletions
@@ -30,6 +30,7 @@ Powers Instagram, WhatsApp, Facebook, and Messenger with real-time on-device AI
**Hardware:** Quest 3, Ray-Ban Meta Smart Glasses, Meta Ray-Ban Display

Enables real-time computer vision, hand tracking, voice commands, and translation on power-constrained wearable devices.
+[Read Blog →](https://ai.meta.com/blog/executorch-reality-labs-on-device-ai/)
:::

:::{grid-item-card} **Liquid AI: Efficient, Flexible On-Device Intelligence**
@@ -106,14 +107,39 @@ PyTorch-native quantization and optimization library for preparing efficient models

Optimize LLM fine-tuning with faster training and reduced VRAM usage, then deploy efficiently with ExecuTorch.

-[Example Model →](https://huggingface.co/metascroy/Qwen3-4B-int8-int4-unsloth) [Blog →](https://docs.unsloth.ai/new/quantization-aware-training-qat)
+[Example Model →](https://huggingface.co/metascroy/Qwen3-4B-int8-int4-unsloth) [Blog →](https://docs.unsloth.ai/new/quantization-aware-training-qat) • [Doc →](https://docs.unsloth.ai/new/deploy-llms-phone)
:::

:::{grid-item-card} **Ultralytics**
:class-header: bg-secondary text-white

Deploy on-device inference for Ultralytics YOLO models using ExecuTorch.
-[Explore →](https://docs.ultralytics.com/integrations/executorch/)
+
+[Explore →](https://docs.ultralytics.com/integrations/executorch/) • [Blog →](https://www.ultralytics.com/blog/deploy-ultralytics-yolo-models-using-the-executorch-integration)
:::

+:::{grid-item-card} **Arm ML Embedded Evaluation Kit**
+:class-header: bg-secondary text-white
+
+Build and deploy ML applications on Arm Cortex-M (M55, M85) and Ethos-U NPUs (U55, U65, U85) using ExecuTorch.
+
+[Explore →](https://gitlab.arm.com/artificial-intelligence/ethos-u/ml-embedded-evaluation-kit)
+:::
+
+:::{grid-item-card} **Alif Semiconductor Ensemble**
+:class-header: bg-secondary text-white
+
+Run generative AI on Ensemble E4/E6/E8 MCUs with Arm Ethos-U85 NPU acceleration.
+
+[Learn More →](https://alifsemi.com/press-release/alif-semiconductor-elevates-generative-ai-with-support-for-executorch-runtime/)
+:::
+
+:::{grid-item-card} **Digica AI SDK**
+:class-header: bg-secondary text-white
+
+Automate PyTorch model deployment to iOS, Android, and edge devices with an ExecuTorch-powered SDK.
+
+[Blog →](https://www.digica.com/blog/effortless-edge-deployment-of-ai-models-with-digicas-ai-sdk-feat-executorch.html)
+:::

::::
@@ -126,8 +152,12 @@ Deploy on-device inference for Ultralytics YOLO models using ExecuTorch.

- **Voxtral** - Deploy audio-text-input LLM on CPU (via XNNPACK) and on CUDA. [Try →](https://github.com/pytorch/executorch/blob/main/examples/models/voxtral/README.md)

+- **Whisper** - Deploy OpenAI's Whisper speech recognition model on CUDA and Metal backends. [Try →](https://github.com/pytorch/executorch/blob/main/examples/models/whisper/README.md) <!-- @lint-ignore -->
+
- **LoRA adapter** - Export two LoRA adapters that share a single foundation weight file, saving memory and disk space. [Try →](https://github.com/meta-pytorch/executorch-examples/tree/main/program-data-separation/cpp/lora_example)

- **OpenVINO from Intel** - Deploy [Yolo12](https://github.com/pytorch/executorch/tree/main/examples/models/yolo12), [Llama](https://github.com/pytorch/executorch/tree/main/examples/openvino/llama), and [Stable Diffusion](https://github.com/pytorch/executorch/tree/main/examples/openvino/stable_diffusion) on [OpenVINO from Intel](https://www.intel.com/content/www/us/en/developer/articles/community/optimizing-executorch-on-ai-pcs.html).

- **Audio Generation** - Generate audio from text prompts using Stable Audio Open Small on Arm CPUs with XNNPACK and KleidiAI. [Try →](https://github.com/Arm-Examples/ML-examples/tree/main/kleidiai-examples/audiogen-et) • [Video →](https://www.youtube.com/watch?v=q2P0ESVxhAY) <!-- @lint-ignore -->

*Want to showcase your demo? [Submit here →](https://github.com/pytorch/executorch/issues)*
examples/models/whisper/README.md: 4 additions & 0 deletions
@@ -166,3 +166,7 @@ cmake-out/examples/models/whisper/whisper_runner \
--processor_path whisper_preprocessor.pte \
--temperature 0
```
+
+## Mobile Demo
+
+For an Android demo app, see the [Whisper Android App](https://github.com/meta-pytorch/executorch-examples/tree/main/whisper/android/WhisperApp) in the executorch-examples repository. <!-- @lint-ignore -->
examples/qualcomm/oss_scripts/llama/README.md: 3 additions & 0 deletions
@@ -1,6 +1,9 @@
# Summary

+## Overview
+
+**Video Tutorial:** [Build Along: Run LLMs Locally on Qualcomm Hardware Using ExecuTorch](https://www.youtube.com/watch?v=41PKDlGM3oU)

This file provides instructions for running LLM decoder models with different parameters via the Qualcomm HTP backend. We currently support the following models:
<!-- numbered list will be automatically generated -->
1. LLAMA2 Stories 110M
website/index.html: 4 additions & 4 deletions
@@ -918,16 +918,16 @@ <h2 class="section-title">Success <span class="highlight">Stories</span></h2>
<h3 style="font-size: 1.5rem; margin-bottom: 1.5rem; color: var(--text-dark);">Production Deployments</h3>
<ul style="color: var(--text-gray); line-height: 2; list-style: none; padding-left: 0;">
<li><strong class="highlight"><a href="https://engineering.fb.com/2025/07/28/android/executorch-on-device-ml-meta-family-of-apps/" style="color: var(--primary); text-decoration: none;">Meta Family of Apps</a>:</strong> Production deployment across Instagram, Facebook, and WhatsApp serving billions of users</li>
<li><strong class="highlight">Meta Reality Labs:</strong> Powers Quest 3 VR and Ray-Ban Meta Smart Glasses AI experiences</li>
<li><strong class="highlight"><a href="https://ai.meta.com/blog/executorch-reality-labs-on-device-ai/" style="color: var(--primary); text-decoration: none;">Meta Reality Labs</a>:</strong> Powers Quest 3 VR and Ray-Ban Meta Smart Glasses AI experiences</li>
</ul>
</div>

<div style="margin-bottom: 3rem;">
<h3 style="font-size: 1.5rem; margin-bottom: 1.5rem; color: var(--text-dark);">Ecosystem Integration</h3>
<ul style="color: var(--text-gray); line-height: 2; list-style: none; padding-left: 0;">
<li><strong class="highlight"><a href="https://github.com/huggingface/optimum-executorch" style="color: var(--primary); text-decoration: none;">Hugging Face</a>:</strong> Optimum-ExecuTorch for direct transformer model deployment</li>
<li><strong class="highlight">LiquidAI:</strong> Next-generation Liquid Foundation Models optimized for edge deployment</li>
<li><strong class="highlight">Software Mansion:</strong> React Native ExecuTorch bringing edge AI to mobile apps</li>
<li><strong class="highlight"><a href="https://www.liquid.ai/blog/how-liquid-ai-uses-executorch-to-power-efficient-flexible-on-device-intelligence" style="color: var(--primary); text-decoration: none;">LiquidAI</a>:</strong> Next-generation Liquid Foundation Models optimized for edge deployment</li>
<li><strong class="highlight"><a href="https://docs.swmansion.com/react-native-executorch/" style="color: var(--primary); text-decoration: none;">Software Mansion</a>:</strong> React Native ExecuTorch bringing edge AI to mobile apps</li>
</ul>
</div>

@@ -936,7 +936,7 @@ <h3 style="font-size: 1.5rem; margin-bottom: 1.5rem; color: var(--text-dark);">E
<ul style="color: var(--text-gray); line-height: 2; list-style: none; padding-left: 0;">
<li><strong class="highlight">LLMs:</strong> <a href="https://github.com/pytorch/executorch/blob/main/examples/models/llama/README.md" style="color: var(--primary); text-decoration: none;">Llama 3.2/3.1/3</a>, <a href="https://github.com/pytorch/executorch/blob/main/examples/models/qwen3/README.md" style="color: var(--primary); text-decoration: none;">Qwen 3</a>, <a href="https://github.com/pytorch/executorch/blob/main/examples/models/phi_4_mini/README.md" style="color: var(--primary); text-decoration: none;">Phi-4-mini</a>, <a href="https://github.com/pytorch/executorch/blob/main/examples/models/lfm2/README.md" style="color: var(--primary); text-decoration: none;">LiquidAI LFM2</a></li>
<li><strong class="highlight">Multimodal:</strong> <a href="https://github.com/pytorch/executorch/blob/main/examples/models/llava/README.md" style="color: var(--primary); text-decoration: none;">Llava</a> (vision-language), <a href="https://github.com/pytorch/executorch/blob/main/examples/models/voxtral/README.md" style="color: var(--primary); text-decoration: none;">Voxtral</a> (audio-language)</li>
<li><strong class="highlight">Vision/Speech:</strong> <a href="https://github.com/meta-pytorch/executorch-examples/tree/main/mv2" style="color: var(--primary); text-decoration: none;">MobileNetV2</a>, <a href="https://github.com/meta-pytorch/executorch-examples/tree/main/dl3" style="color: var(--primary); text-decoration: none;">DeepLabV3</a>, <a href="https://github.com/meta-pytorch/executorch-examples/tree/main/whisper/android/WhisperApp" style="color: var(--primary); text-decoration: none;">Whisper</a></li>
<li><strong class="highlight">Vision/Speech:</strong> <a href="https://github.com/meta-pytorch/executorch-examples/tree/main/mv2" style="color: var(--primary); text-decoration: none;">MobileNetV2</a>, <a href="https://github.com/meta-pytorch/executorch-examples/tree/main/dl3" style="color: var(--primary); text-decoration: none;">DeepLabV3</a>, <a href="https://github.com/pytorch/executorch/blob/main/examples/models/whisper/README.md" style="color: var(--primary); text-decoration: none;">Whisper</a></li> <!-- @lint-ignore -->
</ul>
</div>
