OpenClaw-RL: Train any agent simply by talking
Updated Mar 13, 2026 - TypeScript
🛠️ Apply on-policy distillation to improve Qwen3-0.6B's performance on GSM8K by training it on its own outputs, reducing bias during inference.
Train and customize OpenClaw agents using reinforcement learning with simple language feedback and fully asynchronous optimization.