Releases: ArcInstitute/state
v0.10.2
v0.10.1
updates the infer script to fix a bug where models trained on X_hvg would incorrectly use a randomly initialized "gene decoder MLP". this effectively randomized the model's predictions
note that this only occurred with src/state/_cli/_tx/_infer.py and did not affect src/state/_cli/_tx/_train.py or src/state/_cli/_tx/_predict.py scripts
v0.9.30
Updates code to allow:
data.kwargs.output_space == "embedding" to train on only embeddings without a gene decoder
update tx predict to clip values to the range [0, 14]
update model hyperparameters for better preset values
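A minimal sketch of what clipping to [0, 14] looks like. It assumes the clip is applied elementwise to predicted expression values with inclusive bounds; the function name and the example array are hypothetical and not taken from the repository.

```python
import numpy as np

# Bounds taken from the release note; that they apply to predicted
# expression values is an assumption.
CLIP_MIN, CLIP_MAX = 0.0, 14.0


def clip_predictions(pred: np.ndarray) -> np.ndarray:
    """Return a copy of `pred` with every value forced into [CLIP_MIN, CLIP_MAX]."""
    return np.clip(pred, CLIP_MIN, CLIP_MAX)


# Hypothetical predictions: rows are cells, columns are genes.
pred = np.array([[-0.3, 2.5, 15.2], [0.0, 14.0, 7.1]])
clipped = clip_predictions(pred)
```

Values already inside the range pass through unchanged, and the input array is not modified.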
updates to st infer, se fit/transform/eval
several improvements in this overdue change log.
- infer now correctly groups control cells by shared covariates. see here for examples of running inference: https://colab.research.google.com/drive/1bq5v7hixnM-tZHwNdgPiuuDo6kuiwLKJ?authuser=1#scrollTo=ENfnF6ofAz1M
- the embedding model checkpoints are now completely packaged (so no more model folder is needed). we retain backwards compatibility. here is an example for running transform: https://colab.research.google.com/drive/1uJinTJLSesJeot0mP254fQpSxGuDEsZt
- the auxiliary files to train the SE model, as well as the complete training data, are available here: https://huggingface.co/datasets/arcinstitute/SE-167M-Human
some notes on how to train SE yourself: #138 (comment)
- new eval scripts for SE checkpoints: use uv run state emb eval
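To illustrate the idea behind grouping control cells by shared covariates, here is a small sketch in plain Python. It is not the repository's implementation: the record fields (`cell_type`, `batch`, `perturbation`), the `control` label, and the function name are all hypothetical, chosen only to show the grouping step.

```python
from collections import defaultdict

# Hypothetical cell records; the field names are illustrative only.
cells = [
    {"barcode": "c1", "cell_type": "T", "batch": "b1", "perturbation": "control"},
    {"barcode": "c2", "cell_type": "T", "batch": "b1", "perturbation": "geneA"},
    {"barcode": "c3", "cell_type": "B", "batch": "b1", "perturbation": "control"},
    {"barcode": "c4", "cell_type": "T", "batch": "b2", "perturbation": "control"},
    {"barcode": "c5", "cell_type": "T", "batch": "b1", "perturbation": "control"},
]

COVARIATES = ("cell_type", "batch")  # assumed covariate columns


def group_controls(cells, covariates=COVARIATES, control_label="control"):
    """Map each covariate combination to the barcodes of its control cells."""
    groups = defaultdict(list)
    for cell in cells:
        if cell["perturbation"] == control_label:
            key = tuple(cell[c] for c in covariates)
            groups[key].append(cell["barcode"])
    return dict(groups)


control_groups = group_controls(cells)

# A perturbed cell is paired only with controls that share its covariates.
query = cells[1]
matched = control_groups.get(tuple(query[c] for c in COVARIATES), [])
```

In this toy data the perturbed cell `c2` is matched to controls `c1` and `c5`, which share both its cell type and its batch, and not to `c3` or `c4`.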
remote installation fix
this version fixes a bug in the repo .gitignore that prevented the wandb folder for the configs from being distributed.
half precision for inference
the model is trained in bf16-mixed precision, so this patch uses bfloat16 for inference when CUDA is available. empirically, for the 600m model, this speeds up inference by roughly 8-9x
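One common way to run inference in bfloat16 only when CUDA is available is PyTorch's `torch.autocast`. The sketch below assumes that approach; the release note does not say how the patch is implemented, and the small `nn.Sequential` model is a stand-in, not the actual checkpoint.

```python
import torch
from torch import nn

# Stand-in model; the real checkpoint and architecture are not shown here.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device).eval()

batch = torch.randn(4, 16, device=device)

# Autocast to bfloat16 only on CUDA; on CPU the context is disabled
# and the forward pass stays in float32.
with torch.inference_mode(), torch.autocast(
    device_type=device.type, dtype=torch.bfloat16, enabled=device.type == "cuda"
):
    out = model(batch)

# Output dtype we expect for the device that was selected.
expected_dtype = torch.bfloat16 if device.type == "cuda" else torch.float32
```

The weights stay in float32 here and only the computation is cast, which mirrors mixed-precision training; converting the whole model with `model.to(torch.bfloat16)` is an alternative with different memory and accuracy trade-offs.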
fixes inference with the hugging face checkpoints
Merge pull request #139 from ArcInstitute/fix_broken_main: fixes inference
remove double nesting from wandb
fix for fresh install to work with or without wandb integration
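A generic pattern for making an integration like wandb optional is to check whether the package is importable before selecting it. The sketch below shows that pattern only; the helper names are hypothetical and it is not the code used in this repository.

```python
import importlib.util


def wandb_available() -> bool:
    """Return True if the wandb package can be imported in this environment."""
    return importlib.util.find_spec("wandb") is not None


def select_logger(use_wandb: bool) -> str:
    """Pick a logging backend name, falling back to "none" when wandb is
    disabled or not installed (hypothetical helper)."""
    if use_wandb and wandb_available():
        return "wandb"
    return "none"


backend = select_logger(use_wandb=True)
```

Because nothing imports wandb at module load time, a fresh install without wandb still runs, and users who have it installed get it when they ask for it.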
0.9.2: Merge pull request #114 from ArcInstitute/update_example
adds example command
adds cell barcodes as an optional parameter to the data loaders
this upgrades the version of cell load so that the perturbed cell and control cell barcodes are available in the final predicted outputs