src/dnadiffusion rewrite #150

Merged
cameronraysmith merged 4 commits into pinellolab:main from ssenan:main on Jun 10, 2023

@ssenan commented Jun 7, 2023

This is a large change that removes the loss spike that appeared after the single-script version of the code was split into separate files.

Resolves #149

Code changes

  • Moves the code back towards built-in PyTorch, with the exception of Hugging Face Accelerate, which is used to assist with distributed training
  • The hydra-zen train loop is still a WIP, so train_hf.py contains the main train call, which can be linked to the Slurm script for distributed training
  • sample.py loads a checkpoint and generates cell-specific sequences for validation
  • dnadiffusion.py contains all the code in a single script that can also be used for training
  • The top directory of notebooks now contains two notebooks, master_dataset.ipynb and filter_master.ipynb, which show how our original data was collated into the complete table and then filtered down to our current working set
  • The final major change is that the diffusion functions have been collected into a class, and this class has been integrated into the main train loop (src/dnadiffusion/utils/train_util.py)
  • Many other small changes were made to accommodate these larger ones

The code should now be more readable in the single-script version, dnadiffusion.py, and more extensible in src/dnadiffusion.
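
For readers unfamiliar with the pattern, here is a minimal sketch of what collecting diffusion functions into a class can look like. It is not the code from this PR: the class name, method names, default values, and the linear beta schedule are all assumptions made for illustration, and it uses plain Python lists where the repository uses PyTorch tensors.

```python
import math
import random


class Diffusion:
    """Illustrative only: keeps the noise schedule and forward-noising step in one object."""

    def __init__(self, timesteps=50, beta_start=1e-4, beta_end=0.02):
        # Hypothetical defaults; a linear beta schedule is assumed (needs timesteps >= 2).
        self.timesteps = timesteps
        step = (beta_end - beta_start) / (timesteps - 1)
        self.betas = [beta_start + i * step for i in range(timesteps)]
        # Cumulative product of alpha_t = 1 - beta_t, computed once and reused.
        self.alphas_cumprod = []
        running = 1.0
        for beta in self.betas:
            running *= 1.0 - beta
            self.alphas_cumprod.append(running)

    def q_sample(self, x_start, t, noise=None):
        """Forward process: x_t = sqrt(acp_t) * x_0 + sqrt(1 - acp_t) * noise."""
        if noise is None:
            noise = [random.gauss(0.0, 1.0) for _ in x_start]
        signal_scale = math.sqrt(self.alphas_cumprod[t])
        noise_scale = math.sqrt(1.0 - self.alphas_cumprod[t])
        return [signal_scale * x + noise_scale * n for x, n in zip(x_start, noise)]

    def training_pair(self, x_start):
        """Draw a random timestep and return (t, noised input, noise target)."""
        t = random.randrange(self.timesteps)
        noise = [random.gauss(0.0, 1.0) for _ in x_start]
        return t, self.q_sample(x_start, t, noise), noise
```

With this shape, a train loop only needs to hold one object and call something like `training_pair` per batch, instead of passing schedule tensors between free functions.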

@ssenan added the enhancement (New feature or request), codebase, and breaking (Breaking Changes) labels Jun 7, 2023
@ssenan added this to the 0.0.0 milestone Jun 7, 2023
@ssenan requested a review from cameronraysmith June 7, 2023 22:36
@ssenan self-assigned this Jun 7, 2023
@ssenan added the refactoring (Refactoring) label Jun 7, 2023
codecov bot commented Jun 10, 2023

Codecov Report

Merging #150 (2203dc3) into main (b4ba5aa) will increase coverage by 1.04%.
The diff coverage is 0.00%.

@@           Coverage Diff            @@
##            main    #150      +/-   ##
========================================
+ Coverage   1.72%   2.76%   +1.04%     
========================================
  Files         18      12       -6     
  Lines       1278     795     -483     
  Branches     117      88      -29     
========================================
  Hits          22      22              
+ Misses      1256     773     -483     

Impacted Files                            Coverage Δ
src/dnadiffusion/data/dataloader.py       0.00% <0.00%> (ø)
src/dnadiffusion/metrics/metrics.py       0.00% <0.00%> (ø)
src/dnadiffusion/models/diffusion.py      0.00% <0.00%> (ø)
src/dnadiffusion/models/layers.py         0.00% <ø> (ø)
src/dnadiffusion/models/unet.py           0.00% <0.00%> (ø)
src/dnadiffusion/utils/sample_util.py     0.00% <0.00%> (ø)
src/dnadiffusion/utils/train_util.py      0.00% <0.00%> (ø)
src/dnadiffusion/utils/utils.py           0.00% <0.00%> (ø)

@cameronraysmith self-requested a review June 10, 2023 03:26
@cameronraysmith merged commit 8776054 into pinellolab:main Jun 10, 2023
Labels

breaking (Breaking Changes), codebase, enhancement (New feature or request), refactoring (Refactoring)

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Resolve training loss spike

2 participants