Is it possible to release the processed GEOMDrugsDataset files? #3

@bbaillif

Hello,

I'm trying to use MiDi to generate molecules with the model trained on GEOM with explicit H.
The trained model requires dataset_infos as input, which in turn needs the datamodule to get the statistics. However, my machine does not have enough RAM to load the GEOM training set from the pickle file you provide.
If the processed files for GEOMDrugsDataset were available, the process() function (which runs when the processed files don't exist) could be skipped, and those files might be lighter than the whole pickle file containing the molecules. Could you provide them?
Alternatively, if you see another workaround (e.g. storing the statistics/configuration required for dataset_infos in separate files, so that the datamodule is not always needed), please let me know.
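
To illustrate the second workaround, here is a minimal sketch of what I have in mind. It is not MiDi's actual code: `DatasetStatistics`, `compute_statistics`, `save_statistics`, `load_statistics` and the file name `geom_stats.json` are hypothetical names, the molecules are toy lists of element symbols, and the real dataset_infos probably needs more fields than these two. The idea is only that the statistics are computed once on a machine with enough RAM, written to a small file, and read back later without touching the datamodule.

```python
import json
import tempfile
from collections import Counter
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class DatasetStatistics:
    """Hypothetical container for the statistics that dataset_infos needs."""
    num_nodes: dict   # molecule size (as a string key) -> number of molecules
    atom_types: dict  # element symbol -> number of atoms


def compute_statistics(molecules):
    """Single pass over molecules, each given as a list of element symbols."""
    sizes, atoms = Counter(), Counter()
    for mol in molecules:
        sizes[str(len(mol))] += 1  # string keys survive the JSON round trip
        atoms.update(mol)
    return DatasetStatistics(dict(sizes), dict(atoms))


def save_statistics(stats, path):
    """Write the statistics to a small JSON file."""
    Path(path).write_text(json.dumps(asdict(stats), sort_keys=True))


def load_statistics(path):
    """Read the statistics back without loading any molecules."""
    return DatasetStatistics(**json.loads(Path(path).read_text()))


# Toy data standing in for the training set (assumption, not GEOM).
toy_molecules = [["C", "H", "H", "H", "H"], ["O", "H", "H"]]
stats = compute_statistics(toy_molecules)

with tempfile.TemporaryDirectory() as tmp:
    stats_path = Path(tmp) / "geom_stats.json"
    save_statistics(stats, stats_path)
    reloaded = load_statistics(stats_path)
```

With something like this, only the small statistics file would be needed at sampling time, and the pickle would only have to be loaded once by whoever produces that file.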

Thank you very much,

Best,
Benoit
