I have been searching for an answer as to how I should go about this for quite some time and can't seem to find anything that works.
I am following a tutorial on using the tf.data API found here. My scenario is very similar to the one in that tutorial (i.e. I have 3 directories containing all of the training/validation/test files); however, my files are not images, they're spectrograms saved as CSVs.
I have found a couple of solutions for reading a CSV where each line is a training instance (e.g., How to *actually* read CSV data in TensorFlow?). My issue with that implementation is the required record_defaults parameter, since the CSVs are 500x200.
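For reference, the line-by-line approach would look roughly like this with tf.data (untested sketch, assuming plain float rows), and it doesn't really fit my case because each whole file, not each line, is one instance:

```
import tensorflow as tf

# One default value per column; for 200-column rows this alone is a 200-entry list.
record_defaults = [[0.0]] * 200

# Each *line* of each file becomes an element, which is not what I want here,
# since each whole file is a single 500x200 spectrogram.
lines = tf.data.TextLineDataset(training_files)  # training_files: list of CSV paths
lines = lines.map(
    lambda line: tf.stack(tf.decode_csv(line, record_defaults=record_defaults)))
```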
Here is what I was thinking:
```
import tensorflow as tf
import pandas as pd

def load_data(path, label):
    # This obviously doesn't work because path and label
    # are Tensors, but this is what I had in mind...
    data = pd.read_csv(path, index_col=0).values
    return data, label

X_train = tf.constant(training_files)   # training_files is a list of the file names
Y_train = tf.constant(training_labels)  # training_labels is a list of labels for each file

train_data = tf.data.Dataset.from_tensor_slices((X_train, Y_train))

# Here is where I thought I would do the mapping of 'load_data' over each batch
train_data = train_data.batch(64).map(load_data)

iterator = tf.data.Iterator.from_structure(train_data.output_types,
                                           train_data.output_shapes)
next_batch = iterator.get_next()
train_op = iterator.make_initializer(train_data)
```

I have only used TensorFlow's feed_dict in the past, but I need a different approach now that my data has grown too large to fit in memory.
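One idea I have not tried yet is wrapping the pandas read in tf.py_func so it can run on the string tensor that map passes in. A rough, untested sketch (reusing X_train and Y_train from above; _read_csv is just a helper name I made up):

```
import numpy as np
import pandas as pd
import tensorflow as tf

def _read_csv(path, label):
    # Runs as ordinary Python; tf.py_func passes the path in as bytes.
    data = pd.read_csv(path.decode('utf-8'), index_col=0).values.astype(np.float32)
    return data, label

def load_data(path, label):
    data, label = tf.py_func(_read_csv, [path, label], [tf.float32, label.dtype])
    data.set_shape([500, 200])  # the spectrogram shape; py_func loses shape info
    return data, label

train_data = (tf.data.Dataset.from_tensor_slices((X_train, Y_train))
              .map(load_data)  # map per file, since _read_csv handles one path at a time
              .batch(64))
```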
Any thoughts? Thanks.
Comments:

- "... record_defaults. Can you elaborate?"
- "You would have to define record_defaults = [[0],[0],...,[0]] and then do something like cols = tf.decode_csv(csv_row, record_defaults=record_defaults) and data = tf.stack(cols). Which seemed like a lot of overhead for every file."
- "You could read the whole file with tf.read_file, then split it appropriately (see tf.string_split) or directly interpret it as CSV using tf.decode_csv."
- "... tf.constant(0) tensor. I would definitely give it a try."
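A rough, untested sketch of the whole-file approach suggested in the comments (assuming each CSV has an index column plus 200 float columns and a header row, as pandas writes them):

```
import tensorflow as tf

# One default per column: the index column plus the 200 value columns.
record_defaults = [[0.0]] * 201

def load_data(path, label):
    raw = tf.read_file(path)                              # whole CSV as a single string
    rows = tf.string_split([raw], delimiter='\n').values  # one string per line
    rows = rows[1:]                                       # skip the header row
    cols = tf.decode_csv(rows, record_defaults=record_defaults)
    data = tf.stack(cols[1:], axis=1)                     # drop the index column -> [500, 200]
    return data, label
```

This keeps everything in the graph (no py_func), at the cost of spelling out the 201-entry record_defaults list.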