I have a custom dataset that I stored as a tfrecord, doing:

    import numpy as np
    import tensorflow as tf

    # toy example data
    label = np.asarray([[1,2,3], [4,5,6]]).reshape(2, 3, -1)
    sample = np.stack((label + 200).reshape(2, 3, -1))

    def bytes_feature(values):
        """Returns a TF-Feature of bytes.

        Args:
          values: A string.

        Returns:
          A TF-Feature.
        """
        return tf.train.Feature(bytes_list=tf.train.BytesList(value=[values]))

    def labeled_image_to_tfexample(sample_binary_string, label_binary_string):
        return tf.train.Example(features=tf.train.Features(feature={
            'sample/image': bytes_feature(sample_binary_string),
            'sample/label': bytes_feature(label_binary_string)
        }))

    def _write_to_tf_record():
        with tf.Graph().as_default():
            image_placeholder = tf.placeholder(dtype=tf.uint16)
            encoded_image = tf.image.encode_png(image_placeholder)

            label_placeholder = tf.placeholder(dtype=tf.uint16)
            encoded_label = tf.image.encode_png(image_placeholder)

            with tf.python_io.TFRecordWriter("./toy.tfrecord") as writer:
                with tf.Session() as sess:
                    feed_dict = {image_placeholder: sample,
                                 label_placeholder: label}

                    # Encode image and label as binary strings to be written to tf_record
                    image_string, label_string = sess.run(fetches=(encoded_image, encoded_label),
                                                          feed_dict=feed_dict)

                    # Define structure of what is going to be written
                    file_structure = labeled_image_to_tfexample(image_string, label_string)

                    writer.write(file_structure.SerializeToString())
                    return

However, I cannot read it. First I tried the following (based on http://www.machinelearninguru.com/deep_learning/tensorflow/basics/tfrecord/tfrecord.html , https://medium.com/coinmonks/storage-efficient-tfrecord-for-images-6dc322b81db4 and https://medium.com/mostly-ai/tensorflow-records-what-they-are-and-how-to-use-them-c46bc4bbb564):

    def read_tfrecord_low_level():
        data_path = "./toy.tfrecord"
        filename_queue = tf.train.string_input_producer([data_path], num_epochs=1)
        reader = tf.TFRecordReader()
        _, raw_records = reader.read(filename_queue)

        decode_protocol = {
            'sample/image': tf.FixedLenFeature((), tf.int64),
            'sample/label': tf.FixedLenFeature((), tf.int64)
        }
        enc_example = tf.parse_single_example(raw_records, features=decode_protocol)
        recovered_image = enc_example["sample/image"]
        recovered_label = enc_example["sample/label"]

        return recovered_image, recovered_label

I also tried variations that cast enc_example and decode it, such as in the question "Unable to read from Tensorflow tfrecord file". However, when I try to evaluate the returned tensors, my Python session just freezes and gives no output or traceback.
Then I tried using eager execution to see what is happening, but apparently it is only compatible with the tf.data API. However, as far as I understand, transformations in the tf.data API are applied to the whole dataset. https://www.tensorflow.org/api_guides/python/reading_data mentions that a decode function must be written, but doesn't give an example of how to do that. All the tutorials I have found are written for TFRecordReader (which doesn't work for me).
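For reference, here is a minimal sketch of what such a decode function might look like with tf.data. It assumes both features are stored as PNG byte strings (as in the writer above) and a TensorFlow version that provides `tf.io` and eager execution (roughly 1.13+ or 2.x). The names `parse_example`, `read_tfrecord` and `write_toy_record` are illustrative only and not part of any API. The function passed to `map` is applied to each record individually.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Assumption: eager execution is needed to iterate the dataset directly (default in TF 2.x).
if not tf.executing_eagerly():
    tf.compat.v1.enable_eager_execution()


def parse_example(serialized):
    """Decode one serialized tf.train.Example into (image, label) tensors."""
    # PNG-encoded data is stored as a byte string, so the feature type is tf.string.
    features = {
        'sample/image': tf.io.FixedLenFeature((), tf.string),
        'sample/label': tf.io.FixedLenFeature((), tf.string),
    }
    parsed = tf.io.parse_single_example(serialized, features)
    image = tf.image.decode_png(parsed['sample/image'], dtype=tf.uint16)
    label = tf.image.decode_png(parsed['sample/label'], dtype=tf.uint16)
    return image, label


def read_tfrecord(path):
    """Build a dataset that applies parse_example to every record in the file."""
    return tf.data.TFRecordDataset([path]).map(parse_example)


def write_toy_record(path, sample, label):
    """Hypothetical helper: write a single PNG-encoded (sample, label) pair."""
    def bytes_feature(value):
        return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

    example = tf.train.Example(features=tf.train.Features(feature={
        'sample/image': bytes_feature(tf.image.encode_png(sample).numpy()),
        'sample/label': bytes_feature(tf.image.encode_png(label).numpy()),
    }))
    with tf.io.TFRecordWriter(path) as writer:
        writer.write(example.SerializeToString())


# Round trip on the toy data: write one record, then read it back.
label = np.asarray([[1, 2, 3], [4, 5, 6]], dtype=np.uint16).reshape(2, 3, 1)
sample = (label + 200).astype(np.uint16)
path = os.path.join(tempfile.mkdtemp(), "toy.tfrecord")
write_toy_record(path, sample, label)

image, decoded_label = next(iter(read_tfrecord(path)))
print(image.numpy().squeeze())
print(decoded_label.numpy().squeeze())
```

The key difference from the TFRecordReader attempt is that the features are declared as `tf.string` and then decoded with `tf.image.decode_png`, instead of being parsed as `tf.int64`.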
Any help (pinpointing what I am doing wrong, explaining what is happening, or pointers on how to decode tfrecords with the tf.data API) is highly appreciated.
According to https://www.youtube.com/watch?v=4oNdaQk0Qv4 and https://www.youtube.com/watch?v=uIcqeP7MFH0 tf.data is the best way to create input pipelines, so I am very interested in learning that approach.
Thanks in advance!