motif composition metric notebook conversion#222

Merged
mergify[bot] merged 14 commits into pinellolab:main from ssenan:motif_composition
Oct 3, 2023

@ssenan (Collaborator) commented Oct 3, 2023

  • Resolves add motif composition/trajectory validation metric #212
  • Adds the main functions from the motif composition notebook to the main dnadiffusion library, in motif_composition.py
  • Adds a function, motif_composition_helper, that scans an input dataframe for motifs. It uses a methodology similar to extract_motifs, so the two functions can likely be combined in the future
  • Introduces seq_extract, a function version of the previously used SEQ_EXTRACT class, so we may look to remove the class in a future PR (this depends on integration of the rest of our validation metrics).
  • Tests are provided for all the new functions except motif_composition_matrix. We still need to work out the best way to verify that these metrics behave correctly (this may require storing a copy of hg38.fa and checking whether gimme motifs can run in CI)
  • Reuses some fixtures across multiple files, which can probably be consolidated to one test_helper.py file in the future
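
One possible way to exercise composition-style logic in CI without hg38.fa or gimme motifs is a small deterministic fixture. The sketch below is only an illustration under loud assumptions: `toy_motif_counts`, its arguments, and the exact-substring matching are hypothetical stand-ins, not the functions added in this PR or the scanner they rely on, and plain lists are used in place of a dataframe to keep it dependency-free.

```python
def toy_motif_counts(sequences: list[str], motifs: dict[str, str]) -> dict[str, list[int]]:
    """Count overlapping exact matches of each motif string in each sequence.

    Hypothetical stand-in for a real motif scanner; not the library's API.
    """

    def count(seq: str, pattern: str) -> int:
        # str.count skips overlapping hits, so check every start position instead.
        return sum(seq.startswith(pattern, i) for i in range(len(seq) - len(pattern) + 1))

    # One list of per-sequence counts for each motif name.
    return {name: [count(seq, pattern) for seq in sequences] for name, pattern in motifs.items()}


# Toy fixture: three short sequences and three exact-match "motifs" (all assumed).
toy_sequences = ["ACGTACGT", "TTTT", "GATAAGATA"]
toy_motifs = {"GATA": "GATA", "ACGT": "ACGT", "TT": "TT"}
counts = toy_motif_counts(toy_sequences, toy_motifs)
```

A fixture like this only checks the counting and bookkeeping around a scanner; whether the real motif scan behaves correctly would still need the genome and gimme motifs, as noted above.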

@ssenan added the enhancement (New feature or request) and metrics (modifies definition or measurement of model metrics) labels Oct 3, 2023
@ssenan added this to the 0.1.0 milestone Oct 3, 2023
@ssenan self-assigned this Oct 3, 2023
codecov bot commented Oct 3, 2023

Codecov Report

Merging #222 (3aa8282) into main (c2648c2) will increase coverage by 1.98%.
The diff coverage is 59.20%.

@@            Coverage Diff             @@
##             main     #222      +/-   ##
==========================================
+ Coverage   35.12%   37.11%   +1.98%     
==========================================
  Files          21       23       +2     
  Lines        1358     1482     +124     
  Branches      180      202      +22     
==========================================
+ Hits          477      550      +73     
- Misses        874      924      +50     
- Partials        7        8       +1     
| Files | Coverage Δ |
| --- | --- |
| src/dnadiffusion/utils/data_util.py | 49.66% <84.61%> (+3.34%) ⬆️ |
| tests/test_data_util.py | 92.94% <85.00%> (-7.06%) ⬇️ |
| tests/test_motif_composition.py | 41.37% <41.37%> (ø) |
| src/dnadiffusion/metrics/motif_composition.py | 39.53% <39.53%> (ø) |

@ssenan requested a review from cameronraysmith October 3, 2023 19:14

@cameronraysmith (Collaborator) left a comment

perfect!

@mergify mergify bot merged commit 33ca235 into pinellolab:main Oct 3, 2023
@ssenan deleted the motif_composition branch October 3, 2023 20:33