To serve models with TensorFlow Serving we need to export the model as a protobuf file. There are two ways to do this.
We first trained the model and saved it as a checkpoint, which created the following files:
model.ckpt-22001-00000-of-00001
model.ckpt-22001.meta
checkpoint
On their own these files cannot be used with TensorFlow Serving; we have to generate a protobuf file from them. The following code can be used to generate the protobuf file:
from tensorflow.python.framework import graph_util
import tensorflow as tf

# This directory contains the checkpoint files mentioned above.
export_dir = '/home/cdpdemo/tensorflow/reterival-based-model/chatbot-retrieval/runs/1494400825'
output_graph = '/home/cdpdemo/tensorflow/reterival-based-model/chatbot-retrieval/runs/1494400825/export.pb'
clear_devices = True  # make sure we don't carry device placement information into the protobuf file

checkpoint = tf.train.get_checkpoint_state(export_dir)
input_checkpoint = checkpoint.model_checkpoint_path
saver = tf.train.import_meta_graph(input_checkpoint + '.meta', clear_devices=clear_devices)
graph = tf.get_default_graph()
input_graph_def = graph.as_graph_def()
output_graph_node = []

with tf.Session() as sess:
    saver.restore(sess, input_checkpoint)
    # We are exporting all the operations/nodes existing in our model to the protobuf file.
    for op in graph.get_operations():
        output_graph_node.append(op.name)
    output_graph_def = graph_util.convert_variables_to_constants(
        sess, input_graph_def, output_graph_node)
    with tf.gfile.GFile(output_graph, "wb") as f:
        f.write(output_graph_def.SerializeToString())
    print("%d ops in the final graph." % len(output_graph_def.node))
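As a sanity check, it helps to load the exported protobuf back and confirm the graph imports cleanly. Below is a minimal round-trip sketch using a toy graph and a hypothetical /tmp path rather than the real export; on TensorFlow 2.x the 1.x API used here lives under tf.compat.v1:

```python
try:
    import tensorflow.compat.v1 as tf  # TF 2.x: the 1.x API lives here
except ImportError:
    import tensorflow as tf            # plain TF 1.x

pb_path = "/tmp/export_check.pb"  # hypothetical path for this sketch

# Build a toy graph and write its GraphDef to a .pb file,
# mirroring the SerializeToString() call in the export script above.
g = tf.Graph()
with g.as_default():
    x = tf.constant(3.0, name="x")
    y = tf.multiply(x, 2.0, name="prediction")

with tf.gfile.GFile(pb_path, "wb") as f:
    f.write(g.as_graph_def().SerializeToString())

# Read the .pb back and import it into a fresh graph.
graph_def = tf.GraphDef()
with tf.gfile.GFile(pb_path, "rb") as f:
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as loaded:
    tf.import_graph_def(graph_def, name="")

print([op.name for op in loaded.get_operations()])
```

If the import succeeds and the expected op names show up, the frozen graph is at least structurally intact before handing it to the model server.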
Initially I was not sure which nodes to export to the protobuf file, so I first tried exporting with
output_graph_node = ['prediction']
which produced the following error:
Traceback (most recent call last):
  File "convert2.py", line 21, in <module>
    output_graph_def = graph_util.convert_variables_to_constants(sess,input_graph_def,["prediction"])
  File "/home/cdpdemo/tensorflow/reterival-based-model/tensorflow-old-version-for-this-use-case/local/lib/python2.7/site-packages/tensorflow/python/framework/graph_util.py", line 234, in convert_variables_to_constants
    inference_graph = extract_sub_graph(input_graph_def, output_node_names)
  File "/home/cdpdemo/tensorflow/reterival-based-model/tensorflow-old-version-for-this-use-case/local/lib/python2.7/site-packages/tensorflow/python/framework/graph_util.py", line 158, in extract_sub_graph
    assert d in name_to_node_map, "%s is not in graph" % d
AssertionError: prediction is not in graph
This error occurred because "prediction" is a name scope, not an actual operation in the graph; the real operations live under it (e.g. prediction/logistic_loss). output_graph_node is a list, and it must contain the exact names of the nodes/operations from the TensorFlow graph that we want to export to the "pb" file.
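The scope-versus-operation distinction can be sketched in plain Python over a list of op names (the same logic applies to the names returned by graph.get_operations()); resolve_output_nodes here is a hypothetical helper for illustration, not a TensorFlow API:

```python
def resolve_output_nodes(op_names, requested):
    """For each requested name, accept it if it is an exact op name;
    otherwise treat it as a name scope and expand to every op under it.
    Hypothetical helper -- not part of TensorFlow."""
    resolved = []
    for name in requested:
        if name in op_names:  # exact operation name -> usable as-is
            resolved.append(name)
        else:
            # Treat the name as a scope prefix like "prediction/".
            scoped = [n for n in op_names if n.startswith(name + "/")]
            if not scoped:
                # Mirrors the '%s is not in graph' assertion in the traceback above.
                raise ValueError("%s is not in graph" % name)
            resolved.extend(scoped)
    return resolved

ops = ["global_step", "prediction/logistic_loss",
       "prediction/logistic_loss/Log", "mean_loss"]
print(resolve_output_nodes(ops, ["prediction"]))
# -> ['prediction/logistic_loss', 'prediction/logistic_loss/Log']
```

Passing "prediction" alone fails exactly as in the traceback, while expanding the scope to its member operations gives names convert_variables_to_constants can actually find.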
For example, if I list all the operations in the TensorFlow graph using the code below,

for op in graph.get_operations():
    print(op.name)

I get the following nodes:
[u'global_step',
u'global_step/Initializer/zeros',
u'global_step/Assign',
u'global_step/read',
u'read_batch_features_train/file_name_queue/input',
u'read_batch_features_train/file_name_queue/Size',
u'read_batch_features_train/file_name_queue/Greater/y',
u'read_batch_features_train/file_name_queue/Greater',
u'read_batch_features_train/file_name_queue/Assert/AssertGuard/Switch',
u'read_batch_features_train/file_name_queue/Assert/AssertGuard/switch_t',
u'read_batch_features_train/file_name_queue/Assert/AssertGuard/switch_f',
u'read_batch_features_train/file_name_queue/Assert/AssertGuard/pred_id',
u'read_batch_features_train/file_name_queue/Assert/AssertGuard/NoOp',
u'read_batch_features_train/file_name_queue/Assert/AssertGuard/control_dependency',
u'read_batch_features_train/file_name_queue/Assert/AssertGuard/Assert/data_0',
u'read_batch_features_train/file_name_queue/Assert/AssertGuard/Assert/Switch',
u'read_batch_features_train/file_name_queue/Assert/AssertGuard/Assert',
...
...
u'prediction/logistic_loss/add/x',
u'prediction/logistic_loss/add',
u'prediction/logistic_loss/Log',
u'prediction/logistic_loss',
u'Const_3',
u'mean_loss']
Among these nodes, I have to decide which ones to export to the "pb" file. For now I am exporting everything.
After exporting successfully, I get the message below:
Converted 7 variables to const ops.
169 ops in the final graph.
Once everything is exported into the "pb" file, I rename and move the files as follows:
model.ckpt-22001.meta --> export.meta
model.ckpt-22001-00000-of-00001 --> export-00000-of-00001
export.pb as it is
# Now moving files to a directory structure as follows,
# <base dir>/runs/<version>/
mv export.meta export-00000-of-00001 export.pb /home/cdpdemo/tensorflow/reterival-based-model/chatbot-retrieval/runs/1494400825/tf_serving/1/
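The expected version-directory layout can be sketched with a throwaway directory (the /tmp base here is a hypothetical stand-in for the real run directory; touch just creates empty placeholder files):

```shell
base=/tmp/tf_serving_layout_demo   # hypothetical stand-in for the real base dir
mkdir -p "$base/runs/1494400825/tf_serving/1"
# The three renamed files live together under <base dir>/.../tf_serving/<version>/:
touch "$base/runs/1494400825/tf_serving/1/export.meta" \
      "$base/runs/1494400825/tf_serving/1/export-00000-of-00001" \
      "$base/runs/1494400825/tf_serving/1/export.pb"
ls "$base/runs/1494400825/tf_serving/1"
```

The model server is pointed at the tf_serving/ directory, and it picks up the numeric subdirectory (1 here) as the servable version.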
After this, to test whether I am able to load this model into TensorFlow Serving, I use the command below:
bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --port=9000 --model_name=export --model_base_path=/home/cdpdemo/tensorflow/reterival-based-model/chatbot-retrieval/runs/1494400825/tf_serving/
Below is the output from the above command:
2017-05-19 04:31:49.719962: I tensorflow_serving/model_servers/main.cc:155] Building single TensorFlow model file config: model_name: export model_base_path: /home/cdpdemo/tensorflow/reterival-based-model/chatbot-retrieval/runs/1494400825/tf_serving/ model_version_policy: 0
2017-05-19 04:31:49.720457: I tensorflow_serving/model_servers/server_core.cc:375] Adding/updating models.
2017-05-19 04:31:49.720486: I tensorflow_serving/model_servers/server_core.cc:421] (Re-)adding model: export
2017-05-19 04:31:49.824347: I tensorflow_serving/core/basic_manager.cc:698] Successfully reserved resources to load servable {name: export version: 1}
2017-05-19 04:31:49.824393: I tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: export version: 1}
2017-05-19 04:31:49.824422: I tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: export version: 1}
2017-05-19 04:31:49.824513: I external/org_tensorflow/tensorflow/contrib/session_bundle/bundle_shim.cc:366] Attempting to up-convert SessionBundle to SavedModelBundle in bundle-shim from: /home/cdpdemo/tensorflow/reterival-based-model/chatbot-retrieval/runs/1494400825/tf_serving/1
2017-05-19 04:31:49.824559: I external/org_tensorflow/tensorflow/contrib/session_bundle/session_bundle.cc:161] Attempting to load a SessionBundle from: /home/cdpdemo/tensorflow/reterival-based-model/chatbot-retrieval/runs/1494400825/tf_serving/1
2017-05-19 04:31:49.824643: I external/org_tensorflow/tensorflow/contrib/session_bundle/session_bundle.cc:162] Using RunOptions:
2017-05-19 04:31:49.843385: W external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-19 04:31:49.843444: W external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-19 04:31:49.843458: W external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-05-19 04:31:49.843464: W external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-19 04:31:49.843473: W external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-05-19 04:31:49.970640: I external/org_tensorflow/tensorflow/contrib/session_bundle/session_bundle.cc:135] Running restore op for SessionBundle: save/restore_all, save/Const:0
2017-05-19 04:31:50.219595: I external/org_tensorflow/tensorflow/contrib/session_bundle/session_bundle.cc:244] Loading SessionBundle: success. Took 395024 microseconds.
2017-05-19 04:31:50.320556: I tensorflow_serving/core/loader_harness.cc:86] Successfully loaded servable version {name: export version: 1}
2017-05-19 04:31:50.383813: I tensorflow_serving/model_servers/main.cc:298] Running ModelServer at 0.0.0.0:9000 ...
If there is no "pb" file in the version directory, running the above command generates the following error:
2017-05-19 04:30:39.187463: I tensorflow_serving/model_servers/main.cc:155] Building single TensorFlow model file config: model_name: export model_base_path: /home/cdpdemo/tensorflow/reterival-based-model/chatbot-retrieval/runs/1494400825/tf_serving/ model_version_policy: 0
2017-05-19 04:30:39.187827: I tensorflow_serving/model_servers/server_core.cc:375] Adding/updating models.
2017-05-19 04:30:39.187850: I tensorflow_serving/model_servers/server_core.cc:421] (Re-)adding model: export
2017-05-19 04:30:39.288368: I tensorflow_serving/core/basic_manager.cc:698] Successfully reserved resources to load servable {name: export version: 1}
2017-05-19 04:30:39.288421: I tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: export version: 1}
2017-05-19 04:30:39.288435: I tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: export version: 1}
2017-05-19 04:30:39.288567: E tensorflow_serving/util/retrier.cc:38] Loading servable: {name: export version: 1} failed: Not found: Session bundle or SavedModel bundle not found at specified export location
If I use a completely wrong base path, I get the error below:
bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --port=9000 --model_name=export --model_base_path=/home/cdpdemo/tensorflow/reterival-based-model/chatbot-retrieval/runs/temp/
2017-05-19 04:27:42.683155: I tensorflow_serving/model_servers/main.cc:155] Building single TensorFlow model file config: model_name: export model_base_path: /home/cdpdemo/tensorflow/reterival-based-model/chatbot-retrieval/runs/temp/ model_version_policy: 0
2017-05-19 04:27:42.683973: I tensorflow_serving/model_servers/server_core.cc:375] Adding/updating models.
2017-05-19 04:27:42.684148: I tensorflow_serving/model_servers/server_core.cc:421] (Re-)adding model: export
2017-05-19 04:27:42.695901: E tensorflow_serving/sources/storage_path/file_system_storage_path_source.cc:306] FileSystemStoragePathSource encountered a file-system access error: Could not find base path /home/cdpdemo/tensorflow/reterival-based-model/chatbot-retrieval/runs/temp/ for servable export
2017-05-19 04:27:43.695941: E tensorflow_serving/sources/storage_path/file_system_storage_path_source.cc:306] FileSystemStoragePathSource encountered a file-system access error: Could not find base path /home/cdpdemo/tensorflow/reterival-based-model/chatbot-retrieval/runs/temp/ for servable export
2017-05-19 04:27:44.696045: E tensorflow_serving/sources/storage_path/file_system_storage_path_source.cc:306] FileSystemStoragePathSource encountered a file-system access error: Could not find base path /home/cdpdemo/tensorflow/reterival-based-model/chatbot-retrieval/runs/temp/ for servable export
Open issue: saving the frozen model was a problem because I didn't understand which part of the graph to save.
Next: loading the saved frozen model.
https://github.com/tensorflow/tensorflow/issues/3628
http://www.nqbao.com/blog/2017/02/tensorflow-exporting-model-for-serving/
<TODO>