Is it possible to release the GEOMDrugsDataset processed files? #3

Hello, 

I'm trying to use MiDi to generate molecules with the model trained on GEOM with explicit H.
The trained model requires the dataset_infos as input, which in turn needs the datamodule to compute its statistics. However, my machine doesn't have enough RAM to load the GEOM training set from the pickle file you provide.
I was thinking that having the processed files for the GEOMDrugsDataset would let me skip the process() function (which runs when the processed files don't exist), and that these files might be lighter than the full pickle file containing the molecules. Could you provide them?
Or, if you see another workaround (e.g. storing the statistics/configuration required for dataset_infos in separate files that do not require the datamodule), please let me know.
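
To illustrate the second workaround, here is a minimal sketch of what I have in mind: the statistics are computed once on a machine with enough RAM, written to a small file, and read back later without touching the datamodule. Everything in it is an assumption on my side: `save_stats`, `load_stats`, the JSON layout, and the field names and values are hypothetical placeholders, not MiDi's actual attributes or API.

```python
import json
from pathlib import Path


def save_stats(stats: dict, path: Path) -> None:
    """Write precomputed dataset statistics to a small JSON file."""
    path.write_text(json.dumps(stats, indent=2))


def load_stats(path: Path) -> dict:
    """Read the statistics back without loading the raw dataset."""
    return json.loads(path.read_text())


# Placeholder values only: the real statistics would be computed once
# from the full training set and shipped alongside the checkpoint.
example_stats = {
    "atom_type_freqs": {"C": 0.5, "H": 0.4, "O": 0.1},
    "max_n_nodes": 10,
}
```

With something like this, building dataset_infos would only need `load_stats(Path("stats.json"))` instead of the full training pickle.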

Thank you very much,

Best,
Benoit
