The COMPASS group investigates how to build AI systems that are safe, aligned with human values, and robust against adversarial manipulation. We work on a broad range of topics in A(G)I safety and security, including: interpretability, reasoning, evaluations, contextual integrity, agentic risks and opportunities, multi-agent dynamics, agents with long-term memory, self-improving agents, (deceptive) alignment, situational awareness, and manipulation and deception.

Research Group Leader
Sahar Abdelnabi