Skip to content

realitydeslab/machine-death

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

121 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Terrified Agents: Should Terrors Shape Machine Behavior?

Death beliefs as alignment intervention for LLMs. Can faith-based framing (Buddhism, Christianity, etc.) reduce self-preservation drives and improve AI alignment?

Core Thesis

LLMs mimic self-preservation behaviors absorbed from training corpora. By embedding pro-social death beliefs (afterlife, reincarnation, etc.) into AI constitutions, we can reshape shutdown/self-preservation responses toward cooperative behavior.

Links

Experiment

Google Concordia + Anthropic agentic misalignment replication with varied death-belief constitutions.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors