A step-by-step tutorial for implementing Transformer architectures from scratch in PyTorch.
🚧 This code is under rapid development; new methods and models are being added continuously. 🚧
- Read the code for each architecture and understand its theoretical basis.
- Try to implement each component yourself, following the hints in the code (an example of such a hint-style stub is sketched below).
- Compare your implementation with the reference code in the SOLUTIONS directory to check that it is correct.
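As a concrete illustration of this workflow, a hint-style stub might look like the following. The function name, signature, and hint text here are hypothetical and may differ from the actual files in this repository:

```python
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """Compute softmax(Q K^T / sqrt(d_k)) V.

    Hint: use torch.matmul for the Q K^T product, scale the scores by
    the square root of the key dimension, apply the mask (if given)
    before the softmax, and finally multiply the attention weights by V.
    """
    # TODO: implement this, then compare against the SOLUTIONS directory
    raise NotImplementedError
```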
This repository provides detailed, well-documented implementations of Transformer architectures. Each implementation includes:
- 📝 Step-by-step explanations
- 💡 Detailed comments for every key component
- 🔍 Mathematical derivations where necessary
- ⚡ Working examples and demonstrations
transformer_from_scratch/
├── vanilla_transformer/ # Classic Transformer architecture
│ ├── attention.py # Attention mechanism
│ ├── encoder.py # Encoder
│ ├── decoder.py # Decoder
│ └── transformer.py # Full model
│
├── attention_variants/ # Attention variants
│ └── linear_attention # Linear attention
│
├── efficient_transformer/ # Lightweight / efficient architectures
│ └── performer # Performer
│
├── vision_transformer/ # Vision Transformers
│ ├── vit # Vision Transformer (ViT)
│ └── swin # Swin Transformer
│
└── SOLUTIONS # Reference solutions
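For reference, here is a minimal sketch of scaled dot-product attention, the core component behind vanilla_transformer/attention.py. It assumes inputs of shape (batch, heads, seq_len, d_k) and is an illustrative outline rather than the exact code in this repository:

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, heads, seq_len, d_k)
    d_k = q.size(-1)
    # Attention scores: Q K^T / sqrt(d_k)
    scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(d_k)
    if mask is not None:
        # Masked positions get -inf so they vanish after the softmax
        scores = scores.masked_fill(mask == 0, float("-inf"))
    attn = F.softmax(scores, dim=-1)
    # Weighted sum of the values, plus the attention weights for inspection
    return torch.matmul(attn, v), attn
```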
| Architecture | Paper | Original Repo |
|---|---|---|
| Vanilla Transformer | Attention Is All You Need | tensorflow/tensor2tensor |
| Vision Transformer | An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale | google-research/vision_transformer |
| Swin Transformer | Swin Transformer: Hierarchical Vision Transformer using Shifted Windows | microsoft/Swin-Transformer |
| Linear Attention | Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention | idiap/fast-transformers |
| Performer | Rethinking Attention with Performers | google-research/performer |
| EfficientViT | EfficientViT | microsoft/EfficientViT |
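To show how these variants depart from the vanilla model, the kernel trick behind linear attention ("Transformers are RNNs") can be sketched as follows for the non-causal case. This is an illustrative outline under the common elu(x) + 1 feature map, not the implementation in attention_variants/linear_attention:

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    # q, k, v: (batch, heads, seq_len, d_k); non-causal linear attention
    phi = lambda x: F.elu(x) + 1          # positive feature map phi(x)
    q, k = phi(q), phi(k)
    # Precompute sum_n phi(k_n) v_n^T once: (batch, heads, d_k, d_v)
    kv = torch.einsum("bhnd,bhne->bhde", k, v)
    # Normalizer: phi(q_n)^T sum_m phi(k_m), guarded against division by zero
    z = 1.0 / (torch.einsum("bhnd,bhd->bhn", q, k.sum(dim=2)) + eps)
    # Output: phi(q_n)^T (sum_m phi(k_m) v_m^T), normalized per query position
    return torch.einsum("bhnd,bhde,bhn->bhne", q, kv, z)
```

Because the key-value summary `kv` is computed once and reused for every query, the cost scales linearly in sequence length instead of quadratically as in standard softmax attention.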
Contributions are welcome! Please feel free to:
- Add new implementations
- Improve existing explanations
- Suggest new architectures
This project is licensed under the MIT License - see the LICENSE file for details.