
I'm struggling with the basics of writing a TensorFlow TFRecord file. I'm writing a simple example with an ndarray in Python, but for some reason when I read it back I'm forced to declare the features as variable-length, and they come back as SparseTensors.

Here's the example

    def serialize_tf_record(features, targets):
        record = {
            'shape': tf.train.Int64List(value=features.shape),
            'features': tf.train.FloatList(value=features.flatten()),
            'targets': tf.train.Int64List(value=targets),
        }
        return build_tf_example(record)

    def deserialize_tf_record(record):
        tfrecord_format = {
            'shape': tf.io.VarLenFeature(tf.int64),
            'features': tf.io.VarLenFeature(tf.float32),
            'targets': tf.io.VarLenFeature(tf.int64),
        }
        features_tensor = tf.io.parse_single_example(record, tfrecord_format)
        return features_tensor

Can anybody explain to me why this writes a variable-length record? The shape is fixed in the code, but I can't seem to write it in a way that TensorFlow knows it's fixed. The TensorFlow documentation is pretty horrific here. Can anybody clarify the API for me?

1 Answer


You should provide more contextual code, like your build_tf_example function and examples of your features and targets.

Here is an example which returns dense tensors:

    import numpy as np
    import tensorflow as tf


    def build_tf_example(record):
        return tf.train.Example(features=tf.train.Features(feature=record)).SerializeToString()


    def serialize_tf_record(features, targets):
        record = {
            'shape': tf.train.Feature(int64_list=tf.train.Int64List(value=features.shape)),
            'features': tf.train.Feature(float_list=tf.train.FloatList(value=features.flatten())),
            'targets': tf.train.Feature(int64_list=tf.train.Int64List(value=targets)),
        }
        return build_tf_example(record)


    def deserialize_tf_record(record):
        tfrecord_format = {
            'shape': tf.io.FixedLenSequenceFeature((), dtype=tf.int64, allow_missing=True),
            'features': tf.io.FixedLenSequenceFeature((), dtype=tf.float32, allow_missing=True),
            'targets': tf.io.FixedLenSequenceFeature((), dtype=tf.int64, allow_missing=True),
        }
        features_tensor = tf.io.parse_single_example(record, tfrecord_format)
        return features_tensor


    def main():
        features = np.zeros((3, 5, 7))
        targets = np.ones((4,), dtype=int)
        tf.print(deserialize_tf_record(serialize_tf_record(features, targets)))


    if __name__ == '__main__':
        main()
  • I converted record to a dictionary of tf.train.Feature objects (to easily serialize it)
  • From what I understood, each of your features can be an array (as opposed to a scalar value), so you can parse it with FixedLenSequenceFeature to build a dense tensor instead of a sparse one; see the sketch below for getting the original layout back.
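As for why your original version comes back sparse: tf.io.VarLenFeature always parses into a tf.sparse.SparseTensor, regardless of whether the stored data happens to have a fixed length, because the parser has no way of knowing the length is constant.

If it helps, here is a minimal sketch of how the functions above could be used end to end: writing a few serialized examples to a file, reading them back with tf.data, and using the stored shape entry to restore the original array layout. The file name example.tfrecord and the restore_shape helper are illustrative names I made up, not anything from your code or from the TensorFlow API; the snippet assumes serialize_tf_record and deserialize_tf_record are defined as above.

    import numpy as np
    import tensorflow as tf

    # Write a few serialized examples to a TFRecord file.
    # 'example.tfrecord' is just an illustrative path.
    with tf.io.TFRecordWriter('example.tfrecord') as writer:
        for _ in range(3):
            features = np.zeros((3, 5, 7))
            targets = np.ones((4,), dtype=int)
            writer.write(serialize_tf_record(features, targets))

    # Reshape the flat 'features' vector using the stored 'shape'.
    def restore_shape(example):
        example['features'] = tf.reshape(example['features'], example['shape'])
        return example

    dataset = (
        tf.data.TFRecordDataset('example.tfrecord')
        .map(deserialize_tf_record)  # dense tensors, thanks to FixedLenSequenceFeature
        .map(restore_shape)          # back to the original (3, 5, 7) layout
    )

    for example in dataset:
        print(example['features'].shape, example['targets'].numpy())

If the shape really never changes, you could also skip the reshape entirely and declare tf.io.FixedLenFeature([3, 5, 7], tf.float32) for features, but storing the shape keeps the record format flexible.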