This repository contains a Brain Language Model implemented for a research project in my graduate program. The project implements fMRI-to-text translation using prompt tuning: a small brain adapter projects fMRI embeddings into the text embedding space of a frozen LLM. The notebooks test various configurations for the brain adapter.
See the paper for more details.
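Conceptually, the brain adapter is a small trainable projection that maps an fMRI feature vector to a handful of "soft prompt" embeddings consumed by the frozen LLM. Below is a minimal pure-Python sketch of that idea; the dimensions, initialization, and function names are illustrative, not taken from the paper:

```python
import random

random.seed(0)

# Illustrative sizes only; the real feature and embedding dims differ
D_FMRI, D_LLM, N_TOKENS = 4, 6, 2

# Trainable adapter weights: one linear map from fMRI features
# to N_TOKENS soft-prompt embeddings for the frozen LLM
W = [[random.gauss(0, 0.02) for _ in range(N_TOKENS * D_LLM)]
     for _ in range(D_FMRI)]

def brain_adapter(fmri_vec):
    """Project one fMRI feature vector into N_TOKENS prompt embeddings."""
    # Matrix-vector product: W is D_FMRI x (N_TOKENS * D_LLM)
    flat = [sum(f * w for f, w in zip(fmri_vec, col)) for col in zip(*W)]
    # Reshape the flat output into N_TOKENS rows of D_LLM values
    return [flat[t * D_LLM:(t + 1) * D_LLM] for t in range(N_TOKENS)]

prompt = brain_adapter([random.gauss(0, 1) for _ in range(D_FMRI)])
print(len(prompt), len(prompt[0]))  # 2 6
```

During training, only `W` would be updated; the LLM's own weights stay frozen, which is what makes this a prompt-tuning approach.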
Install the necessary dependencies:

```shell
pip install -q -r environment.txt
```
If you do not already have `HF_TOKEN` set as an environment variable, run `cp example.env .env` and add your Hugging Face token.
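The resulting `.env` file only needs the token line; the format below is an assumption based on the `example.env` template, and the value is a placeholder, not a real token:

```shell
# .env — read by the notebooks for Hugging Face authentication
HF_TOKEN=hf_your_token_here
```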
The data should be stored at the root of the project in a narratives_subset/ directory.
Use the following steps to clone the dataset and build a new subset:
Install git-annex to pull a clone of the narratives dataset:

```shell
sudo apt-get -qq update
sudo apt-get -qq install git git-annex
```
Note: a git user name and email must be configured for datalad. To check whether they are configured:

```shell
git config --global user.name
git config --global user.email
```

If these return nothing, configure them:

```shell
git config --global user.name "Your Name"
git config --global user.email "you@example.com"
```
Run the following script with optional args to clone the data and set up the data subset:

```shell
python3 get_narratives_subset.py --clean_root
```
The script has the following optional args:
- `--path` (str, default=`/narratives`): the path to the git-annex clone of the narratives directory. The script clones to this path if it does not yet exist.
- `--out` (str, default=`narratives_subset`): the output directory for the subset.
- `--clean_root`: set this flag to clean up the git-annex clone of the narratives directory after creating the subset.
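The CLI described above can be sketched with `argparse`; this is an assumed reconstruction of the script's argument handling, not the actual implementation in `get_narratives_subset.py`:

```python
import argparse

def build_parser():
    # Flags and defaults mirror the README's documented options
    p = argparse.ArgumentParser(
        description="Clone the narratives dataset and build a subset")
    p.add_argument("--path", type=str, default="/narratives",
                   help="path to the git-annex clone; cloned here if missing")
    p.add_argument("--out", type=str, default="narratives_subset",
                   help="output directory for the subset")
    p.add_argument("--clean_root", action="store_true",
                   help="remove the git-annex clone after creating the subset")
    return p

# Example: no flags → documented defaults
args = build_parser().parse_args([])
print(args.path, args.out, args.clean_root)  # /narratives narratives_subset False
```

For instance, `python3 get_narratives_subset.py --path /tmp/narratives --clean_root` would clone into `/tmp/narratives`, write the subset to `narratives_subset/`, and then delete the clone.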
Alternatively, you can download the data from Google Drive. However, this approach may take longer.
Run the notebooks after installing dependencies and downloading data.
There is a separate notebook for each subject corresponding to the experiments below. Note: all `*_tune.ipynb` notebooks include hyperparameter tuning done with `sub-052` and a smaller subset of stories. See the paper for a more detailed description of the brain adapter used in each experiment.
- Experiment 01A-B: PCA/MLP Brain Adapter with and without pretraining stage.
- Experiment 02A-B: 3D CNN Brain Adapter with and without pretraining stage.
- Experiment 03A-B: 3D CNN + MLP Brain Adapter with and without pretraining stage.