Text Processing API v3

基于 FastAPI + LangChain 的高性能文本处理服务，支持语言检测、情感分析、文本分类、翻译等多种文本处理功能。

✨ 核心功能

功能	描述	状态
🔍 语言检测	自动识别文本语言，准确率 > 98%	✅
💭 情感分析	分析文本情感倾向（正面/负面/中性）	✅
🏷️ 文本分类	自定义类别分类，支持多标签	✅
🌐 文本翻译	多语言翻译支持	✅
📦 批量处理	批量文本处理，性能提升 15.5x	✅
🔄 集成处理	一次调用完成多项任务	✅

📊 性能指标

响应时间: 平均 < 500ms (P95)
并发处理: 支持 100+ 并发请求
吞吐量: 批量处理性能提升 15.5x
缓存命中率: 75%+ (启用缓存时)
成功率: 99.5%+

🚀 快速开始

前置要求

Python 3.13+
OpenAI API Key (获取地址)
Git

安装步骤

1. 克隆项目

git clone https://github.com/your-org/text-processor-v3.git
cd text-processor-v3
git checkout 001-llm-text-api

2. 创建虚拟环境

python -m venv venv
source venv/bin/activate  # Linux/Mac
# 或
venv\Scripts\activate  # Windows

3. 安装依赖

pip install -e ".[dev]"
# 或使用 uv (推荐)
uv pip install -e ".[dev]"

4. 配置环境变量

cp .env.example .env
# 编辑 .env 文件，填入您的 OpenAI API Key

必需配置 (.env):

# OpenAI 配置
OPENAI_API_KEY=sk-your-openai-api-key-here
OPENAI_MODEL=gpt-4o-mini

# 应用配置
APP_ENV=development
APP_DEBUG=true

5. 启动服务

# 开发模式
uvicorn src.main:app --reload --host 0.0.0.0 --port 8000

# 或使用 Python
python -m uvicorn src.main:app --reload --host 0.0.0.0 --port 8000

启动成功后，访问：

API 文档: http://localhost:8000/docs
ReDoc: http://localhost:8000/redoc

📖 使用示例

基础 API 调用

1. 语言检测

curl -X POST http://localhost:8000/api/v1/detect-language \
  -H "Content-Type: application/json" \
  -d '{"text": "Bonjour le monde!"}'

响应:

{
  "language": "fr",
  "confidence": 0.98
}

2. 情感分析

curl -X POST http://localhost:8000/api/v1/analyze-sentiment \
  -H "Content-Type: application/json" \
  -d '{"text": "I love this product!"}'

响应:

{
  "sentiment": "positive",
  "confidence": 0.95
}

3. 集成处理 (核心功能)

curl -X POST http://localhost:8000/api/v1/process \
  -H "Content-Type: application/json" \
  -d '{
    "text": "今天天气真好！",
    "tasks": ["language_detection", "sentiment_analysis"]
  }'

响应:

{
  "request_id": "550e8400-e29b-41d4-a716-446655440000",
  "results": {
    "language_detection": {
      "task": "language_detection",
      "result": "zh",
      "processing_time": 0.145,
      "confidence": 0.98
    },
    "sentiment_analysis": {
      "task": "sentiment_analysis",
      "result": {
        "sentiment": "positive",
        "confidence": 0.92
      },
      "processing_time": 0.234
    }
  },
  "processing_time": 0.379
}

4. 批量处理

curl -X POST http://localhost:8000/api/v1/batch-process \
  -H "Content-Type: application/json" \
  -d '{
    "texts": ["Hello", "Bonjour", "Hola"],
    "task": "language_detection"
  }'

Python 客户端示例

import requests

BASE_URL = "http://localhost:8000"

def detect_language(text):
    response = requests.post(
        f"{BASE_URL}/api/v1/detect-language",
        json={"text": text}
    )
    return response.json()

def process_text(text, tasks):
    response = requests.post(
        f"{BASE_URL}/api/v1/process",
        json={"text": text, "tasks": tasks, "use_cache": True}
    )
    return response.json()

# 使用示例
if __name__ == "__main__":
    # 检测语言
    result = detect_language("Hello, world!")
    print(f"Language: {result['language']}")

    # 集成处理
    result = process_text(
        "今天天气真好！",
        ["language_detection", "sentiment_analysis"]
    )
    print(f"Results: {result['results']}")

🛠️ 技术栈

技术	版本	用途
Python	3.13+	编程语言
FastAPI	0.125+	Web 框架
LangChain	1.2+	LLM 集成
Pydantic	2.12+	数据验证
Structlog	25.5+	日志记录
Uvicorn	0.38+	ASGI 服务器
Tenacity	9.1+	重试机制
Psutil	6.1+	系统监控

📁 项目结构

text-processor-v3/
├── src/                          # 源代码
│   ├── api/                      # API 路由
│   │   ├── deps.py              # 依赖注入
│   │   ├── individual.py        # 单项处理接口
│   │   ├── integrated.py        # 集成处理接口
│   │   └── monitoring.py        # 监控接口
│   ├── config/                  # 配置管理
│   │   ├── settings.py          # 应用设置
│   │   └── prompts.py           # 提示词模板
│   ├── middleware/              # 中间件
│   │   ├── request_context.py   # 请求上下文
│   │   └── rate_limiting.py     # 限流
│   ├── models/                  # 数据模型
│   │   ├── requests.py          # 请求模型
│   │   └── responses.py         # 响应模型
│   ├── services/                # 业务服务
│   │   ├── llm_service.py       # LLM 服务
│   │   ├── text_processor.py    # 文本处理
│   │   ├── batch_processor.py   # 批量处理
│   │   └── cache_service.py     # 缓存服务
│   ├── utils/                   # 工具模块
│   │   ├── exceptions.py        # 异常定义
│   │   ├── logging.py           # 日志工具
│   │   ├── validation.py        # 验证工具
│   │   ├── request_context.py   # 请求上下文
│   │   ├── performance.py       # 性能监控
│   │   └── security.py          # 安全工具
│   └── main.py                  # 应用入口
├── tests/                       # 测试代码
│   ├── unit/                    # 单元测试
│   ├── integration/             # 集成测试
│   └── benchmark/               # 性能测试
├── specs/                       # 项目规范
│   └── 001-llm-text-api/        # 功能规范
│       ├── spec.md              # 需求规范
│       ├── plan.md              # 实施计划
│       ├── tasks.md             # 任务列表
│       ├── quickstart.md        # 快速开始
│       └── contracts/           # API 合同
├── .env.example                 # 环境变量模板
├── pyproject.toml              # 项目配置
└── README.md                   # 项目文档

⚙️ 配置说明

环境变量

变量名	必需	默认值	描述
`OPENAI_API_KEY`	✅	-	OpenAI API 密钥
`OPENAI_MODEL`	❌	gpt-4o-mini	默认模型
`APP_ENV`	❌	development	运行环境
`BATCH_DEFAULT_SIZE`	❌	10	默认批处理大小
`CACHE_ENABLED`	❌	true	是否启用缓存
`LOG_LEVEL`	❌	INFO	日志级别

性能调优

# 生产环境
APP_ENV=production
APP_DEBUG=false
LOG_LEVEL=WARNING

# 高并发配置
BATCH_MAX_CONCURRENT_REQUESTS=20
BATCH_DEFAULT_SIZE=20

# 缓存优化
CACHE_ENABLED=true
CACHE_TTL=7200

🧪 测试

运行测试

# 单元测试
pytest tests/unit/ -v

# 集成测试
pytest tests/integration/ -v

# 覆盖率测试
pytest --cov=src tests/

# 性能测试
pytest tests/benchmark/ --benchmark-only

代码质量

# 代码格式化
ruff format .

# 代码检查
ruff check .

# 类型检查
mypy src/

📦 部署

Docker 部署

# 构建镜像
docker build -t text-processor-v3 .

# 运行容器
docker run -p 8000:8000 --env-file .env text-processor-v3

生产环境

# 使用 Gunicorn
gunicorn src.main:app \
  -w 4 \
  -k uvicorn.workers.UvicornWorker \
  --bind 0.0.0.0:8000

📊 监控

健康检查

curl http://localhost:8000/health

性能指标

curl http://localhost:8000/metrics

日志查看

# 实时日志
tail -f logs/app.log

# 错误日志
tail -f logs/error.log

🔧 开发

添加新功能

创建功能分支
编写代码和测试
运行测试套件
提交 Pull Request

代码规范

遵循 PEP 8
使用 type hints
添加文档字符串
编写单元测试

🤝 贡献

欢迎提交 Issue 和 Pull Request！

贡献流程

Fork 项目
创建功能分支 (git checkout -b feature/AmazingFeature)
提交更改 (git commit -m 'Add some AmazingFeature')
推送到分支 (git push origin feature/AmazingFeature)
打开 Pull Request

📄 许可证

本项目基于 MIT 许可证开源 - 查看 LICENSE 文件了解详情

🆘 获取帮助

🗺️ 路线图

📈 版本历史

v1.0.0 (2025-12-19)
- ✅ 初始版本发布
- ✅ 语言检测、情感分析、文本分类
- ✅ 批量处理和集成处理
- ✅ 完整的 API 文档

🙏 致谢

感谢以下开源项目：

FastAPI - 现代化的 Python Web 框架
LangChain - LLM 应用开发框架
Pydantic - 数据验证库

⭐ 如果这个项目对您有帮助，请给它一个星标！

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
.claude/commands		.claude/commands
.serena		.serena
.specify		.specify
doc		doc
docs		docs
examples		examples
specs/001-llm-text-api		specs/001-llm-text-api
src		src
tests		tests
.env.example		.env.example
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
.python-version		.python-version
CLAUDE.md		CLAUDE.md
LOG_IMPLEMENTATION_SUMMARY.md		LOG_IMPLEMENTATION_SUMMARY.md
README.md		README.md
main.py		main.py
mypy.ini		mypy.ini
pyproject.toml		pyproject.toml
ruff.toml		ruff.toml
test_logging.py		test_logging.py
uv.lock		uv.lock

camelfire/text-processor-v3

Folders and files

Latest commit

History

Repository files navigation

Text Processing API v3

✨ 核心功能

📊 性能指标

🚀 快速开始

前置要求

安装步骤

1. 克隆项目

2. 创建虚拟环境

3. 安装依赖

4. 配置环境变量

5. 启动服务

📖 使用示例

基础 API 调用

1. 语言检测

2. 情感分析

3. 集成处理 (核心功能)

4. 批量处理

Python 客户端示例

🛠️ 技术栈

📁 项目结构

⚙️ 配置说明

环境变量

性能调优

🧪 测试

运行测试

代码质量

📦 部署

Docker 部署

生产环境

📊 监控

健康检查

性能指标

日志查看

🔧 开发

添加新功能

代码规范

🤝 贡献

贡献流程

📄 许可证

🆘 获取帮助

🗺️ 路线图

📈 版本历史

🙏 致谢

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages