Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision".
[ACL 2024] Code for the paper "ALaRM: Align Language Models via Hierarchical Rewards Modeling"
Source code for 'Understanding impacts of human feedback via influence functions'
Experiments for the Neural Interactive Proofs paper
Adversarial Deliberation Trees with Mechanistic Verification for scalable LLM oversight