So I'm working with a rather large dataset (perhaps not really by ML standards - but too big to fit into my computer's RAM at any rate). And so, I train the model by successively loading a subsample of the dataset and calling model.fit using that data. This means I call model.fit quite regularly and it seems to take significantly more time to initialize model.fit than it does to perform 10 epochs of training (on the data subset).
I'm wondering if there are any solutions or tricks to reduce this initialization time. Should I feed the data in a different way (say, a TensorFlow Dataset object instead of NumPy arrays), or employ model.train_on_batch in some fashion instead?
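For concreteness, here is a rough sketch of the train_on_batch idea I have in mind, where `load_chunk(i)` is just a placeholder for however I load one subsample from disk; the model is built and compiled once, then updated chunk by chunk without re-entering fit:

```python
import tensorflow as tf

# Build and compile the model a single time.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

n_chunks = 100      # however many subsamples the dataset is split into
batch_size = 32

for epoch in range(10):
    for i in range(n_chunks):
        x_chunk, y_chunk = load_chunk(i)   # placeholder loader, returns numpy arrays
        # Iterate over mini-batches of the in-memory chunk.
        for start in range(0, len(x_chunk), batch_size):
            xb = x_chunk[start:start + batch_size]
            yb = y_chunk[start:start + batch_size]
            loss = model.train_on_batch(xb, yb)
```

Is something along these lines the intended use of train_on_batch, or is there a better-supported pattern?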
You could use a generator (for example a tf.data.Dataset or ImageDataGenerator, depending on your data), which will load data into memory as needed.
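A minimal sketch of that idea using tf.data.Dataset.from_generator, assuming hypothetical `load_chunk`, `n_chunks` and `n_features` that stand in for your own loading code, might look like this; a single fit call then streams through all chunks, so the fit setup cost is paid only once:

```python
import tensorflow as tf

def chunk_generator():
    """Yield one (features, label) pair at a time, loading chunks lazily."""
    for i in range(n_chunks):                 # n_chunks: number of subsamples (assumed)
        x_chunk, y_chunk = load_chunk(i)      # placeholder loader, returns numpy arrays
        for x, y in zip(x_chunk, y_chunk):
            yield x, y

dataset = (
    tf.data.Dataset.from_generator(
        chunk_generator,
        output_signature=(
            tf.TensorSpec(shape=(n_features,), dtype=tf.float32),  # n_features assumed
            tf.TensorSpec(shape=(), dtype=tf.float32),
        ),
    )
    .batch(32)
    .prefetch(tf.data.AUTOTUNE)
)

# Epochs are handled by fit itself; only one fit call is made.
model.fit(dataset, epochs=10)
```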