Source code for our paper:
CARFT: Boosting LLM Reasoning via Contrastive Learning with Annotated Chain-of-Thought-based Reinforced Fine-Tuning
Please use cl_trainer_v4 to reproduce the results in the paper.
https://mp.weixin.qq.com/s/E7H3DCri6_xmBLwUv485Zw
Feel free to contact me if you have any questions:
zhuwnq@outlook.com