DDigimon/GraphGhost

Example code for decoder-only transformer on graph tasks and interpretation with circuit tracer

This is the code for Uncovering Graph Reasoning in Decoder-only Transformers with Circuit Tracing (NeurIPS 2025 Efficient Reasoning Workshop, Spotlight) and GraphGhost: Tracing Structures Behind Large Language Models (arXiv).

We provide a summary of the Circuit Tracer code to offer a simplified tutorial for constructing your own datasets and pretrained/finetuned models.

We build this repo based on nanoGPT (https://github.com/karpathy/nanoGPT), circuit_tracer (https://github.com/safety-research/circuit-tracer/tree/main) and transcoder_tracer (https://github.com/jacobdunefsky/transcoder_circuits/tree/master)

Here is the blog for this tutorial: English Chinese

The code is divided into two parts:

1. How do decoder-only transformers perform graph reasoning tasks?

simplified_graph_task

The simplified graph task offers a lightweight setting for applying circuit tracing. The entire pipeline can be executed on a single 24 GB GPU, and reducing the number of layers or the maximum sequence length further lowers the resource requirements.
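To get a rough feel for how the number of layers affects model size, the sketch below counts the parameters of a standard GPT-2 style model with tied input and output embeddings. The function name and the small configuration are illustrative assumptions, not values taken from this repo. The maximum sequence length contributes only a small positional-embedding term to the parameter count; its main cost is in activation memory, since attention scales quadratically with sequence length.

```python
def gpt2_param_count(n_layer: int, n_embd: int, vocab_size: int, block_size: int) -> int:
    """Count parameters of a standard GPT-2 style model (tied embeddings, biases on)."""
    token_emb = vocab_size * n_embd   # token embedding, shared with the output head
    pos_emb = block_size * n_embd     # learned positional embedding
    # Per block: attention (4*d^2 + 4*d), MLP (8*d^2 + 5*d), two LayerNorms (4*d).
    per_block = 12 * n_embd ** 2 + 13 * n_embd
    final_ln = 2 * n_embd             # final LayerNorm weight and bias
    return token_emb + pos_emb + n_layer * per_block + final_ln


# Sanity check against the public GPT-2 small configuration.
print(gpt2_param_count(n_layer=12, n_embd=768, vocab_size=50257, block_size=1024))
# → 124439808

# A hypothetical small configuration for a synthetic graph vocabulary.
small = gpt2_param_count(n_layer=4, n_embd=256, vocab_size=128, block_size=256)
print(f"small model: {small:,} parameters")
```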

The codebase provides a basic pipeline for generating synthetic data and exploring circuit tracers within self-defined Transformer models (we use the GPT-2 architecture as an example). We hope this pipeline will support future theoretical investigations into how LLMs internalize and reason about structured information.

We further illustrate how a decoder-only Transformer can interpret explicit graph structures. For example, given a graph represented as an edge list, we ask the model to identify a path from a start node (e.g., 4) to an end node (e.g., 7). After training, the Transformer can correctly output the path. However, as suggested in Figure (b), the model does not form a corresponding implicit structure that reflects the underlying graph topology.
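The path-finding task described above can be sketched as follows. This is a minimal illustration, not the format produced by generate_graphs.py: the undirected-graph assumption, the helper names, and the `u-v ... | start end : path` text layout are all hypothetical choices made for the example.

```python
from collections import deque


def shortest_path(edges, start, end):
    """Breadth-first search over an undirected edge list.

    Returns the list of nodes from start to end, or None if end is unreachable.
    """
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)

    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == end:
            # Walk the parent pointers back to the start node.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in adjacency.get(node, []):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None


def format_example(edges, start, end, path):
    """Serialize one training example as text (hypothetical layout)."""
    edge_text = " ".join(f"{u}-{v}" for u, v in edges)
    path_text = " ".join(str(node) for node in path)
    return f"{edge_text} | {start} {end} : {path_text}"


edges = [(4, 2), (2, 9), (9, 7), (4, 5), (5, 1)]
path = shortest_path(edges, start=4, end=7)
print(format_example(edges, 4, 7, path))
# → 4-2 2-9 9-7 4-5 5-1 | 4 7 : 4 2 9 7
```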

[Figure: Intro]

To run the code, you can start with tutorial.ipynb, and view the visualization results in vis_example.ipynb.

Alternatively, you may run the full pipeline through the Python scripts:

Step 1. Generate data

python generate_graphs.py

Step 2. Train your own transformer

python train_graph_transformer.py

Step 3. Train your own circuit tracer

python train_transcoders.py

Finally, you can open vis_example.ipynb to visualize the attribution graphs and reasoning traces.
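If you prefer a single entry point, the three script steps could be chained with a small wrapper like the one below. This is a sketch under two assumptions: that it is run from inside simplified_graph_task, and that each script works without extra command-line arguments, as in the commands shown above. The wrapper itself (`build_commands`, `run_pipeline`) is not part of the repo.

```python
import subprocess
import sys

# Script names as listed in the steps of this README.
PIPELINE = [
    "generate_graphs.py",
    "train_graph_transformer.py",
    "train_transcoders.py",
]


def build_commands(scripts=PIPELINE):
    """Build one command per script, using the current Python interpreter."""
    return [[sys.executable, script] for script in scripts]


def run_pipeline(dry_run=False):
    """Run each step in order; stop at the first step that fails."""
    for command in build_commands():
        print("Running:", " ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(command, check=True)


# Dry run only prints the commands; call run_pipeline() to execute them.
run_pipeline(dry_run=True)
```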

2. What do circuit tracers show on pretrained LLMs?

LLM_applications

We also apply circuit tracing to LLM reasoning tasks. We first train the transcoder for the given dataset (tutorial.ipynb or train_transcoder.py). Then, in addition to interpreting how LLMs predict the next token, we provide interpretations of:

How LLMs generate a chain of thought (analsyis.ipynb (Analysis1))

How LLMs organize and structure information from a given dataset (analsyis.ipynb (Analysis2))

How to perturb LLMs based on insights from the previous analyses (analsyis.ipynb (Analysis3 and Analysis4))

By modifying the source code, you can select any architecture listed in TransformerLens. You can also load your own fine-tuned model by adjusting the model loading logic (refer to model_load to see how we load the DeepSeek model).
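As a rough illustration of what adjusting the loading logic can look like, the sketch below uses the public TransformerLens and Hugging Face Transformers APIs. It is not the repo's model_load code: the function name `load_hooked_model` and its arguments are assumptions for this example, and `model_name` must be an architecture name that TransformerLens recognizes.

```python
def load_hooked_model(model_name="gpt2", finetuned_path=None, device="cpu"):
    """Load a TransformerLens model, optionally with fine-tuned weights.

    model_name:     an architecture name supported by TransformerLens.
    finetuned_path: optional local path or hub id of a fine-tuned checkpoint
                    that shares the architecture of model_name (assumption).
    """
    # Imported lazily so this module can be inspected without the dependencies.
    from transformer_lens import HookedTransformer

    if finetuned_path is None:
        return HookedTransformer.from_pretrained(model_name, device=device)

    from transformers import AutoModelForCausalLM, AutoTokenizer

    hf_model = AutoModelForCausalLM.from_pretrained(finetuned_path)
    tokenizer = AutoTokenizer.from_pretrained(finetuned_path)
    # hf_model supplies the weights; model_name tells TransformerLens
    # which architecture to convert them into.
    return HookedTransformer.from_pretrained(
        model_name, hf_model=hf_model, tokenizer=tokenizer, device=device
    )
```

For example, `model = load_hooked_model("gpt2")` followed by `logits, cache = model.run_with_cache("4 7")` returns the logits together with the cached activations of a forward pass.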


Overall, our goal is to adopt a graph-based perspective to better understand—and ultimately control—the behaviors of LLMs.


Download the example model at: https://huggingface.co/DDigimon/graph_ghost_model

Download the example graphs at: https://huggingface.co/datasets/DDigimon/graph_ghost_data


Contact: daipigeon1[at]gmail


TODO:

  1. Evaluation for transcoder training part
