Dataloader Draft #24

Merged
ssenan merged 9 commits into pinellolab:codebase from ssenan:codebase
Oct 19, 2022

Conversation

@ssenan (Collaborator) commented on Oct 17, 2022

The dataloader is now up to date with all changes to the one-hot encoding of components, and it has been renamed to fit our new folder structure.

See #17 for earlier discussion.

@ssenan changed the title from "Codebase" to "Dataloader Draft" on Oct 17, 2022
@IhabBendidi linked an issue on Oct 17, 2022 that may be closed by this pull request
from torch.utils.data import Dataset, DataLoader

class SequenceDatasetBase(Dataset):
    def __init__(self, data_path, transform=None):
Collaborator review comment:

Maybe add sequence_length=200 here so we have flexibility later.
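A minimal sketch of what this suggestion could look like, assuming the constructor simply stores its arguments. Only the signature in the diff and the proposed `sequence_length=200` default come from this thread; the attribute names, the `"sequences.csv"` path, and the import fallback are illustrative assumptions.

```python
try:
    from torch.utils.data import Dataset
except ImportError:  # fallback so the sketch also runs without PyTorch installed
    Dataset = object


class SequenceDatasetBase(Dataset):
    def __init__(self, data_path, transform=None, sequence_length=200):
        # data_path and transform mirror the signature shown in the diff;
        # sequence_length is the keyword argument proposed in the review.
        self.data_path = data_path
        self.transform = transform
        self.sequence_length = sequence_length


# Existing callers keep the 200-nucleotide default ...
default_ds = SequenceDatasetBase("sequences.csv")  # hypothetical path
# ... while new callers can choose another length without touching the class.
short_ds = SequenceDatasetBase("sequences.csv", sequence_length=100)
```

Because the new argument has a default, adding it would not break code that already constructs the dataset with positional `data_path` and `transform`.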

# Iterating through DNA sequences from dataset and one-hot encoding all nucleotides
current_seq = self.data["raw_sequence"][index]
if 'N' not in current_seq:
    X_seq = np.array(self.one_hot_encode(current_seq, ['A','C','T','G'], 200))
Collaborator review comment:

Here we can replace 200 with self.sequence_length.

return X_seq, X_cell_type

# Function for one hot encoding each line of the sequence dataset
def one_hot_encode(self, seq, alphabet, max_seq_len):
Collaborator review comment:

Replace max_seq_len with sequence_length.
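Taken together, the review suggestions in this thread could be sketched roughly as follows. This is not the merged implementation: the class name `SequenceDatasetSketch`, the in-memory dict in place of a file loaded from `data_path`, the `"cell_type"` column, the example cell type labels, raising on sequences that contain `N`, and the zero-padding and truncation behaviour of the encoder are all assumptions made for illustration. The sketch also returns plain nested lists, whereas the diff wraps the result in `np.array`.

```python
class SequenceDatasetSketch:
    """Stand-in for SequenceDatasetBase that keeps its rows in memory (assumption)."""

    ALPHABET = ['A', 'C', 'T', 'G']  # same order as in the diff

    def __init__(self, data, sequence_length=200):
        self.data = data  # e.g. {"raw_sequence": [...], "cell_type": [...]}
        self.sequence_length = sequence_length

    def __len__(self):
        return len(self.data["raw_sequence"])

    def __getitem__(self, index):
        current_seq = self.data["raw_sequence"][index]
        if 'N' in current_seq:
            # The diff does not show this branch; raising here is an assumption.
            raise ValueError(f"sequence {index} contains an ambiguous base 'N'")
        # The hard-coded 200 is replaced by the instance attribute.
        X_seq = self.one_hot_encode(current_seq, self.ALPHABET, self.sequence_length)
        X_cell_type = self.data["cell_type"][index]  # hypothetical label column
        return X_seq, X_cell_type

    def one_hot_encode(self, seq, alphabet, sequence_length):
        # Third parameter renamed from max_seq_len to sequence_length.
        # One row per position; positions past the end of seq stay all-zero.
        rows = [[0] * len(alphabet) for _ in range(sequence_length)]
        for i, base in enumerate(seq[:sequence_length]):
            rows[i][alphabet.index(base)] = 1
        return rows


toy = {"raw_sequence": ["ACGT", "TTGA"], "cell_type": ["K562", "HepG2"]}
ds = SequenceDatasetSketch(toy, sequence_length=6)
X_seq, X_cell_type = ds[0]
print(len(X_seq), len(X_seq[0]), X_cell_type)  # prints: 6 4 K562
```

With the length stored once on the instance, changing `sequence_length` at construction time changes the encoded shape everywhere, with no literal `200` left in the indexing code.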

@ssenan ssenan merged commit 57e9bf5 into pinellolab:codebase Oct 19, 2022

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Create a Data Loader Class with Pytorch Lightning

4 participants