I trained a small CNN on my GPU using the NCHW data format. Now I want to export a .pb file that I can use for inference in other applications.
I wrote a small helper function that calls TensorFlow's freeze_graph function with default values, given a directory containing the checkpoint files and graph.pbtxt:

    import os
    import argparse
    #os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
    import tensorflow as tf
    dir(tf.contrib)  # fix for tf.contrib undefined ops bug
    from tensorflow.python.tools.freeze_graph import freeze_graph


    def my_freeze_graph_2(model_dir, output_node_names):
        """Extract the subgraph defined by the output nodes and convert
        all its variables into constants.

        Args:
            model_dir: the root folder containing the checkpoint state file
            output_node_names: a string containing all the output node names,
                               comma separated
        """
        if not tf.gfile.Exists(model_dir):
            raise AssertionError(
                "Export directory doesn't exist. Please specify an export "
                "directory: %s" % model_dir)

        if not output_node_names:
            print("You need to supply the name of a node to --output_node_names.")
            return -1

        # Retrieve the full path of the latest checkpoint
        checkpoint = tf.train.get_checkpoint_state(model_dir)
        input_checkpoint = checkpoint.model_checkpoint_path

        # Full path of the frozen graph file to write
        absolute_model_dir = os.path.abspath(model_dir)
        output_graph = os.path.join(absolute_model_dir, "frozen_model.pb")

        freeze_graph(input_graph=os.path.join(model_dir, 'graph.pbtxt'),
                     input_saver='',
                     input_binary=False,
                     input_checkpoint=input_checkpoint,
                     output_node_names=output_node_names,
                     restore_op_name="save/restore_all",
                     filename_tensor_name="save/Const:0",
                     output_graph=output_graph,
                     clear_devices=True,
                     initializer_nodes='')

I then have a small script that attempts to build the graph from frozen_model.pb to test that the freezing actually worked:

    import os
    #os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
    import argparse
    import tensorflow as tf
    from freeze_graph import load_graph
    import cv2

    if __name__ == '__main__':
        # Let the user pass the filename as an argument
        parser = argparse.ArgumentParser()
        parser.add_argument("--frozen_model_filename",
                            default="model-multiple_starts/frozen_model.pb",
                            type=str,
                            help="Frozen model file to import")
        args = parser.parse_args()

        # Use our "load_graph" function
        graph = load_graph(args.frozen_model_filename)

        # Verify that we can access the list of operations in the graph
        for op in graph.get_operations():
            print(op.name)

        # Access the input and output nodes
        x = graph.get_tensor_by_name('prefix/Reshape:0')
        y = graph.get_tensor_by_name('prefix/softmax_tensor:0')

        # Launch a Session
        with tf.Session(graph=graph,
                        config=tf.ConfigProto(log_device_placement=True)) as sess:
            # Note: we don't need to initialize/restore anything.
            # There are no Variables in this graph, only hardcoded constants.

            # Load an image to use as a test
            im = cv2.imread('57_00000000.png', cv2.IMREAD_GRAYSCALE)
            im = im.T
            im = im / 255 - 0.5
            im = im[None, :, :, None]

            y_out = sess.run(y, feed_dict={x: im})
            print(y_out)

If I try to run my test script, I get the following error:

    InvalidArgumentError: CPU BiasOp only supports NHWC.
    [[Node: prefix/conv2d/BiasAdd = BiasAdd[T=DT_FLOAT, data_format="NCHW", _device="/job:localhost/replica:0/task:0/cpu:0"](prefix/conv2d/convolution, prefix/conv2d/bias/read)]]

I tried different configurations:
- generate .pb file from CPU-only script, run on CPU-only
- generate .pb file from script with GPU visible, run with GPU visible
- generate .pb file from CPU-only script, run with GPU visible
All of them raise the same error.
The problem lies in the fact that the checkpoint which I want to freeze has operations defined with data_format='NCHW'. How do I freeze the checkpoint with NHWC data format?
Update:
Poking around the files, I see that in graph.pbtxt for many operations data_format is hardcoded to NCHW. I guess, then, I'll need to make a new model with NHWC format, selectively load from the checkpoint the weights for the layers and use that graph to manually save out a .pb file... I'd assume there would be a process to do this already, but I can't find any documentation about this, nor examples.
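To see which operations carry a hardcoded data format, the text graph can be scanned directly. The following is only a rough sketch: `find_data_formats` is a hypothetical helper, not part of TensorFlow, and it assumes the usual protobuf text layout of graph.pbtxt (one `node { ... }` block per op, with the node's `name:` field at the top level of the block) and that no string values in the file contain braces.

```python
def find_data_formats(pbtxt_text):
    """Return (node_name, data_format) pairs found in a GraphDef text dump.

    Hypothetical helper: a line-based scan, not a real protobuf parser.
    """
    results = []
    depth = 0            # current brace nesting level
    node_name = None     # name of the node block we are inside
    expect_value = False # True after seeing key: "data_format"
    for raw in pbtxt_text.splitlines():
        line = raw.strip()
        if depth == 1 and line.startswith('name:'):
            node_name = line.split('"')[1]
        elif line == 'key: "data_format"':
            expect_value = True
        elif expect_value and line.startswith('s:'):
            results.append((node_name, line.split('"')[1]))
            expect_value = False
        depth += line.count('{') - line.count('}')
    return results


# Tiny hand-written sample in the same layout as graph.pbtxt
sample = '''
node {
  name: "conv2d/BiasAdd"
  op: "BiasAdd"
  attr {
    key: "T"
    value {
      type: DT_FLOAT
    }
  }
  attr {
    key: "data_format"
    value {
      s: "NCHW"
    }
  }
}
'''
print(find_data_formats(sample))  # [('conv2d/BiasAdd', 'NCHW')]
```

On the real file, the same function could be called on the contents of graph.pbtxt read with `open(...).read()`.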
Update 2:
After trying to import the .pb files in OpenCV's DNN module, I found out the following:
- Freezing together a checkpoint from a training with data format NCHW and a graph.pbtxt with data format NHWC leads to an unusable .pb file. I haven't found the exact reason yet, but after converting the .pb to .pbtxt and comparing it to a working frozen graph, the files differ only in the values stored in the weights and biases constants.
- Freezing together a checkpoint from a training and a graph.pbtxt both with data format NHWC produces a working frozen graph.
It seems, then, that checkpoints are not transferable between graphs with different data formats (even if no error or warning is raised during the freezing process).
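One possible explanation, which is an assumption about my model rather than something I have verified: if the network flattens the convolutional features before a dense layer, the flattened feature order is (channel, height, width) under NCHW but (height, width, channel) under NHWC, so the rows of the first dense kernel no longer line up with the inputs. The NumPy sketch below illustrates that effect and a row permutation that compensates for it. The helper name `nchw_dense_kernel_to_nhwc` and the shapes are made up for illustration.

```python
import numpy as np


def nchw_dense_kernel_to_nhwc(kernel, channels, height, width):
    """Reorder the rows of a dense kernel trained on flattened NCHW features
    so that it can be applied to flattened NHWC features.

    Hypothetical helper; kernel has shape (channels*height*width, units).
    """
    units = kernel.shape[1]
    k = kernel.reshape(channels, height, width, units)  # rows ordered (c, h, w)
    k = k.transpose(1, 2, 0, 3)                         # reorder to (h, w, c)
    return k.reshape(height * width * channels, units)


# Illustrative shapes only
C, H, W, U = 3, 4, 5, 2
rng = np.random.RandomState(0)

feat_nchw = rng.rand(1, C, H, W)             # feature map as seen during training
feat_nhwc = feat_nchw.transpose(0, 2, 3, 1)  # same values in NHWC layout
kernel_nchw = rng.rand(C * H * W, U)         # dense kernel from the NCHW checkpoint

# Reference output: NCHW features with the original kernel
out_nchw = feat_nchw.reshape(1, -1).dot(kernel_nchw)

# Reusing the kernel unchanged on NHWC features mixes up the rows
out_naive = feat_nhwc.reshape(1, -1).dot(kernel_nchw)

# Permuting the kernel rows restores the original output
kernel_nhwc = nchw_dense_kernel_to_nhwc(kernel_nchw, C, H, W)
out_fixed = feat_nhwc.reshape(1, -1).dot(kernel_nhwc)

print(np.allclose(out_fixed, out_nchw))
print(np.allclose(out_naive, out_nchw))
```

If this is the cause, the convolution kernels and biases themselves would not need conversion, only the first dense layer after the flatten.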