
Working with a large dataset #4

@akhaldi

Hello Mr. Ahmed,

I'm working with a large dataset (~70GB of raw logs) for training, but I'm limited to 128GB of system RAM and 24GB of GPU RAM. How can I efficiently build a dataset and train this model in stages, given these memory constraints? I'm currently considering data generators and K-fold splits, but I'd appreciate any insights or alternative approaches you may have tested.
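To make the generator idea concrete, here is a minimal sketch of what I have in mind: stream the log files from disk one at a time and yield fixed-size batches, so the full ~70GB corpus never has to fit in RAM. The `log_dir` layout, the `*.log` glob, and the `parse_fn` callback (which would turn one log line into a `(features, label)` pair) are all placeholders, not anything from your code:

```python
import glob
import os

import numpy as np


def log_batch_generator(log_dir, batch_size, parse_fn):
    """Stream raw log files from disk and yield fixed-size batches.

    Only one file is open at a time and at most `batch_size` parsed
    examples are buffered, so memory use stays flat regardless of
    corpus size.
    """
    while True:  # loop forever so the trainer can run multiple epochs
        for path in sorted(glob.glob(os.path.join(log_dir, "*.log"))):
            features, labels = [], []
            with open(path) as f:
                for line in f:
                    x, y = parse_fn(line)  # caller-supplied line parser
                    features.append(x)
                    labels.append(y)
                    if len(features) == batch_size:
                        yield np.asarray(features), np.asarray(labels)
                        features, labels = [], []
```

A generator like this could be passed straight to a Keras `model.fit(...)` call or wrapped in a framework-specific dataset class, and K-fold splits could be handled by partitioning the file list before it reaches the generator. Does that match roughly what you would suggest, or is there a better pattern for this codebase?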
