I have a custom file containing the paths to all my images and their labels, which I load into a dataframe using:
```python
MyIndex = pd.read_table('./MySet.txt')
```

`MyIndex` has two columns of interest, `ImagePath` and `ClassName`.
Next I do a train/test split and encode the output labels:
```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator
from keras.utils import to_categorical

# Load every image into memory up front
images = []
for index, row in MyIndex.iterrows():
    img_path = basePath + row['ImagePath']
    img = image.load_img(img_path, target_size=(299, 299))
    images.append(image.img_to_array(img))

images = np.array(images, dtype="float") / 255.0

Classes = Sample['ClassName']
OutputClasses = Classes.unique().tolist()
labels = Sample['ClassName']

(trainX, testX, trainY, testY) = train_test_split(images, labels,
                                                  test_size=0.10, random_state=42)
(trainX, valX, trainY, valY) = train_test_split(trainX, trainY,
                                                test_size=0.10, random_state=41)

encoder = LabelEncoder()
encoder = encoder.fit(OutputClasses)

# convert integer labels to one-hot vectors
trainY = to_categorical(encoder.transform(trainY), num_classes=len(OutputClasses))
valY = to_categorical(encoder.transform(valY), num_classes=len(OutputClasses))
testY = to_categorical(encoder.transform(testY), num_classes=len(OutputClasses))

datagen = ImageDataGenerator(rotation_range=90,
                             horizontal_flip=True,
                             vertical_flip=True,
                             width_shift_range=0.25,
                             height_shift_range=0.25)
datagen.fit(trainX, augment=True)

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

batch_size = 128
model.fit_generator(datagen.flow(trainX, trainY, batch_size=batch_size),
                    epochs=500,
                    steps_per_epoch=trainX.shape[0] // batch_size,
                    validation_data=(valX, valY))
```

The problem I face is that the data loaded in one go is too large to fit in the current machine's memory, so I am unable to work with the complete dataset.
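To avoid decoding every image up front, I assume the split should happen on the dataframe of paths rather than on the `images` array; something like the following sketch (the `trainIndex`/`valIndex`/`testIndex` names are mine, not from my code above):

```python
from sklearn.model_selection import train_test_split

# Split the lightweight index of paths, not the decoded images, so the full
# dataset never has to sit in memory at once.
trainIndex, testIndex = train_test_split(MyIndex, test_size=0.10, random_state=42)
trainIndex, valIndex = train_test_split(trainIndex, test_size=0.10, random_state=41)
```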
I have tried to work with `ImageDataGenerator`, but I do not want to follow the class-per-directory convention it expects, and I also cannot drop the augmentation part.
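From the docs I understand that `ImageDataGenerator.flow_from_dataframe` (available in newer Keras versions) can read batches from disk using the paths in a dataframe instead of a directory layout, while still applying the augmentation. A minimal sketch, assuming the hypothetical `trainIndex` dataframe from above, whose `ImagePath` values are relative to `basePath`:

```python
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1. / 255,      # replaces the manual /255.0
                             rotation_range=90,
                             horizontal_flip=True,
                             vertical_flip=True,
                             width_shift_range=0.25,
                             height_shift_range=0.25)

# Batches are read from disk on the fly; no class-per-directory layout needed.
train_generator = datagen.flow_from_dataframe(
    dataframe=trainIndex,          # dataframe slice with ImagePath/ClassName columns
    directory=basePath,            # prepended to the relative paths in x_col
    x_col='ImagePath',
    y_col='ClassName',
    target_size=(299, 299),
    class_mode='categorical',      # one-hot encodes ClassName
    batch_size=128)
```

A second generator built without the augmentation arguments could serve the validation split the same way.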
The question is: is there a way to load batches from disk while satisfying the two conditions stated above?
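If `flow_from_dataframe` turns out not to be flexible enough, another route I am considering is a custom `keras.utils.Sequence` that reads only one batch of paths from disk per step and reuses the `ImageDataGenerator` for augmentation. This is only a sketch under my assumptions; `DataFrameBatchSequence` and `trainIndex` are hypothetical names, not an existing API:

```python
import numpy as np
from keras.preprocessing import image
from keras.utils import Sequence, to_categorical

class DataFrameBatchSequence(Sequence):
    """Yields one batch per step, reading images from disk on demand."""

    def __init__(self, df, base_path, datagen, encoder, n_classes, batch_size=128):
        self.df = df.reset_index(drop=True)
        self.base_path = base_path
        self.datagen = datagen        # ImageDataGenerator, reused for augmentation
        self.encoder = encoder        # the fitted LabelEncoder
        self.n_classes = n_classes
        self.batch_size = batch_size

    def __len__(self):
        # number of batches per epoch
        return int(np.ceil(len(self.df) / self.batch_size))

    def __getitem__(self, idx):
        rows = self.df.iloc[idx * self.batch_size:(idx + 1) * self.batch_size]
        batch = []
        for _, row in rows.iterrows():
            img = image.load_img(self.base_path + row['ImagePath'],
                                 target_size=(299, 299))
            x = image.img_to_array(img) / 255.0
            batch.append(self.datagen.random_transform(x))   # keep the augmentation
        y = to_categorical(self.encoder.transform(rows['ClassName']),
                           num_classes=self.n_classes)
        return np.array(batch), y

# fit_generator accepts a Sequence directly and infers steps from __len__:
train_seq = DataFrameBatchSequence(trainIndex, basePath, datagen, encoder,
                                   len(OutputClasses))
model.fit_generator(train_seq, epochs=500)
```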
Note: `Sample` in the snippet above comes from `MyIndex`.