Chapter 2
AutoGraph
In this chapter, we will introduce how to convert Python functions into executable TensorFlow graphs.
0. Setup for this section
import inspect
import time
import numpy as np
import tensorflow as tf
from pprint import pprint
print(tf.__version__)
tf.random.set_seed(42)
np.random.seed(42)
true_weights = tf.constant(list(range(5)), dtype=tf.float32)[:, tf.newaxis]
x = tf.constant(tf.random.uniform((32, 5)), dtype=tf.float32)
y = tf.constant(x @ true_weights, dtype=tf.float32)
2.1.0
1. AutoGraph
A computational graph contains two things: computation and data. A TensorFlow graph tf.Graph
is a computational graph made from operations as the computation units and tensors as the data units. TensorFlow has many optimizations built around graphs, so executing in graph mode results in better utilization of distributed computing. Instead of making us write complicated graph-mode code by hand, TensorFlow 2 provides a tool, AutoGraph, which automatically analyzes and converts Python code into graph code that can then be traced by a Function to create a graph.
The following is a Python function we defined in chapter 1. It involves relatively simple math operations on tensors, so it is pretty much graph ready. Let's see whether tf.autograph does anything interesting to it.
def f(a, b, power=2, d=3):
    return tf.pow(a, power) + d * b
converted_f = tf.autograph.to_graph(f)
print(inspect.getsource(converted_f))
def tf__f(a, b, power=None, d=None):
    do_return = False
    retval_ = ag__.UndefinedReturnValue()
    with ag__.FunctionScope('f', 'fscope', ag__.ConversionOptions(recursive=True, user_requested=True, optional_features=(), internal_convert_user_code=True)) as fscope:
        do_return = True
        retval_ = fscope.mark_return_value(ag__.converted_call(tf.pow, (a, power), None, fscope) + d * b)
    do_return,
    return ag__.retval(retval_)
The generated function is a regular Python function. Even though it may look much more complicated than the original, most of it is boilerplate that handles the details of function scopes and overloads function calls with converted_call
. The line retval_ = fscope.mark_return_value(ag__.converted_call(tf.pow, (a, power), None, fscope) + d * b)
tells us that the generated code performs the same computation.
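Since the converted function is still ordinary Python, we can call it directly and compare it with the original. Here is a quick sanity check (a small sketch of my own, not from the original notebook):
pprint(f(tf.constant(2.), tf.constant(3.)))            # 2 ** 2 + 3 * 3 = 13
pprint(converted_f(tf.constant(2.), tf.constant(3.)))  # same computation, same result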
Let's move on to another Python function, this time with a slightly graph-unfriendly construct.
def cube(x):
    o = x
    for _ in range(2):
        o *= x
    return o
converted_cube = tf.autograph.to_graph(cube)
print(inspect.getsource(converted_cube))
def tf__cube(x):
    do_return = False
    retval_ = ag__.UndefinedReturnValue()
    with ag__.FunctionScope('cube', 'fscope', ag__.ConversionOptions(recursive=True, user_requested=True, optional_features=(), internal_convert_user_code=True)) as fscope:
        o = x

        def get_state():
            return ()

        def set_state(_):
            pass

        def loop_body(iterates, o):
            _ = iterates
            o *= x
            return o,
        o, = ag__.for_stmt(ag__.converted_call(range, (2,), None, fscope), None, loop_body, get_state, set_state, (o,), ('o',), ())
        do_return = True
        retval_ = fscope.mark_return_value(o)
    do_return,
    return ag__.retval(retval_)
We see that the body of the for
loop is converted into a loop_body
function, which is invoked by ag__.for_stmt
. This for_stmt
operation, in a sense, overloads the for
statement. The purpose of this transformation is to turn the code into a functional style so that it can be executed in a graph.
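To make the idea concrete, here is a minimal sketch of my own (using tf.while_loop directly rather than AutoGraph's internal for_stmt) of what the functional form of cube corresponds to once it lands in a graph: the loop body becomes a function handed to a graph-level loop construct.
def cube_graph_style(x):
    # loop state is (iteration counter, running product)
    def cond(i, o):
        return i < 2
    def body(i, o):
        return i + 1, o * x
    _, o = tf.while_loop(cond, body, (tf.constant(0), x))
    return o

pprint(cube_graph_style(tf.constant(3.)))  # 27.0, same as cube(3.)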
Let’s look at yet another function, this time with a conditional statement.
def g(x):
    if tf.reduce_any(x < 0):
        return tf.square(x)
    return x
converted_g = tf.autograph.to_graph(g)
print(inspect.getsource(converted_g))
def tf__g(x):
    do_return = False
    retval_ = ag__.UndefinedReturnValue()
    with ag__.FunctionScope('g', 'fscope', ag__.ConversionOptions(recursive=True, user_requested=True, optional_features=(), internal_convert_user_code=True)) as fscope:

        def get_state():
            return ()

        def set_state(_):
            pass

        def if_true():
            do_return = True
            retval_ = fscope.mark_return_value(ag__.converted_call(tf.square, (x,), None, fscope))
            return do_return, retval_

        def if_false():
            do_return = True
            retval_ = fscope.mark_return_value(x)
            return do_return, retval_
        cond = ag__.converted_call(tf.reduce_any, (x < 0,), None, fscope)
        do_return, retval_ = ag__.if_stmt(cond, if_true, if_false, get_state, set_state, ('do_return', 'retval_'), ())
    do_return,
    return ag__.retval(retval_)
Here we see a similar thing happen with the conditional: it has also been converted into a functional form by overloading the if
statement.
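Again, to connect this back to familiar TensorFlow operations, here is a minimal sketch of my own of the functional form of g written directly with tf.cond: each branch becomes a function, and tf.cond picks one inside the graph.
def g_graph_style(x):
    return tf.cond(tf.reduce_any(x < 0),
                   lambda: tf.square(x),  # the if_true branch
                   lambda: x)             # the if_false branch

pprint(g_graph_style(tf.constant([-1, 1, -2], dtype=tf.float32)))  # [1., 1., 4.]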
The big idea behind AutoGraph is that it translates the Python code we wrote into a style that can be traced to create TensorFlow graphs. During the transformation, great effort goes into rewriting data-dependent conditionals and control flow, since these statements cannot be straightforwardly overloaded with TensorFlow operations. There is a lot more to AutoGraph; the interested reader can refer to the paper for more details.
2. Functions
Once the code is graph friendly, we can trace the operations in the code to create a graph. The created graph then gets wrapped up in a ConcreteFunction
object so that we can execute computations backed by the graph. The details of how tracing works are omitted here, due to my limited understanding of them.
Let's walk through how this works. First, we provide TensorFlow our graph-friendly code to create a Function
object.
tf_func_f = tf.function(autograph=False)(f)
tf_func_g = tf.function(autograph=False)(converted_g)
tf_func_g2 = tf.function(autograph=True)(g)
print(tf_func_f.python_function is f)
print(tf_func_g.python_function is converted_g)
print(tf_func_g2.python_function is g)
True
True
True
As of now, no graph has been created yet; we have only attached our functions to these Function objects. Notice that we specified autograph=False
when constructing the Function in the first two cases, because we know both f
and converted_g
are already graph friendly. But good old g
is not, so we need to turn on autograph
for it. In fact, tf.function(autograph=False)(tf.autograph.to_graph(g))
is roughly equivalent to tf.function(autograph=True)(g)
.
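We can check this rough equivalence ourselves. The following is a small sketch of my own (with the freshly created manual_g and automatic_g) showing that both constructions give the same results on the same input:
manual_g = tf.function(autograph=False)(tf.autograph.to_graph(g))
automatic_g = tf.function(autograph=True)(g)
sample = tf.constant([-1, 1, -2], dtype=tf.float32)
pprint(manual_g(sample))     # converted by hand with to_graph, then traced
pprint(automatic_g(sample))  # converted and traced by tf.function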
Note that we can create a Function with tf.function(autograph=False)(g)
, and this will succeed without error. But we won't be able to create any graphs with this Function in the next step.
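As a quick illustration (a sketch of my own; the exact exception type may vary by TensorFlow version), tracing such a Function fails when it hits the data-dependent if, typically with an error about using a tf.Tensor as a Python bool:
no_autograph_g = tf.function(autograph=False)(g)
try:
    no_autograph_g.get_concrete_function(tf.TensorSpec(shape=[3], dtype=tf.float32))
except Exception as e:
    print(type(e).__name__)  # e.g. OperatorNotAllowedInGraphError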
The next step is to provide TensorFlow with a signature, i.e. a description of the inputs. This allows it to create a graph by tracing how the input tensors would flow through the operations in the function, and to record that graph in a callable object.
concrete_g = tf_func_g.get_concrete_function(x=tf.TensorSpec(shape=[3], dtype=tf.float32))
print(concrete_g)
<tensorflow.python.eager.function.ConcreteFunction object at 0x7f7cf8736240>
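If you are curious, the underlying tf.Graph is accessible on the concrete function (via its graph attribute in TF 2.x), so we can peek at the operations that tracing recorded. A small sketch:
for op in concrete_g.graph.get_operations()[:5]:
    print(op.name, op.type)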
We can use this concrete function directly as if it were an operation shipped with TensorFlow, or we can call the Function object, which will look up the concrete function and use it. Either way, the computation is executed with the created graph.
pprint(concrete_g(tf.constant([-1, 1, -2], dtype=tf.float32)))
pprint(tf_func_g(tf.constant([-1, 1, -2], dtype=tf.float32)))
<tf.Tensor: shape=(3,), dtype=float32, numpy=array([1., 1., 4.], dtype=float32)>
<tf.Tensor: shape=(3,), dtype=float32, numpy=array([1., 1., 4.], dtype=float32)>
The Function object is like a graph factory. When a detailed input specification is provided, it uses the graph code as a recipe to create a new graph. When called with a known specification, it digs up the stored graph and serves it. When called with an unknown signature, it triggers the creation of the corresponding concrete function first. Let's try to make a bunch of graphs.
concrete_f = tf_func_f.get_concrete_function(a=tf.TensorSpec(shape=[1], dtype=tf.float32), b=tf.TensorSpec(shape=[1], dtype=tf.float32))
print(concrete_f)
pprint(concrete_f(tf.constant(1.), tf.constant(2.)))
pprint(tf_func_f(1., 2.))
pprint(tf_func_f(a=tf.constant(1., dtype=tf.float32), b=2, power=2.))
pprint(tf_func_f(a=tf.constant(1., dtype=tf.float32), b=2., d=3))
pprint(tf_func_f(a=tf.constant(1., dtype=tf.float32), b=2., d=3., power=3.))
<tensorflow.python.eager.function.ConcreteFunction object at 0x7fb8a40bf080>
<tf.Tensor: shape=(), dtype=float32, numpy=7.0>
<tf.Tensor: shape=(), dtype=float32, numpy=7.0>
<tf.Tensor: shape=(), dtype=float32, numpy=7.0>
<tf.Tensor: shape=(), dtype=float32, numpy=7.0>
<tf.Tensor: shape=(), dtype=float32, numpy=7.0>
How many graphs did we create in the block of code above? The answer is 4. Can you figure out which call created which graph?
print(tf_func_f._get_tracing_count())
4
for i, f in enumerate(tf_func_f._list_all_concrete_functions_for_serialization()):
    print(i, f.structured_input_signature)
0 ((TensorSpec(shape=(), dtype=tf.float32, name='a'), 2.0, 3.0, 3.0), {})
1 ((TensorSpec(shape=(1,), dtype=tf.float32, name='a'), TensorSpec(shape=(1,), dtype=tf.float32, name='b'), 2, 3), {})
2 ((1.0, 2.0, 2, 3), {})
3 ((TensorSpec(shape=(), dtype=tf.float32, name='a'), 2, 2.0, 3), {})
We can see that with slight differences in the input signature, TensorFlow creates multiple graphs, which may end up tying up a lot of resources if not managed properly.
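One way to keep retracing under control, if I understand the API correctly, is to pin down an input_signature when constructing the Function; calls that do not match the signature then raise an error instead of silently creating yet another graph. A small sketch with a hypothetical helper scaled_square:
@tf.function(input_signature=(tf.TensorSpec(shape=None, dtype=tf.float32),))
def scaled_square(x):
    return 2. * tf.square(x)

pprint(scaled_square(tf.constant(3.)))        # the first call traces a graph
pprint(scaled_square(tf.constant([1., 2.])))  # shape=None matches any shape, so the same graph is reused
print(scaled_square._get_tracing_count())     # 1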
Lastly, tf.function is also available as a decorator, which makes life easier. The following two are equivalent.
@tf.function(autograph=False)
def square(x):
    return x * x
def square(x):
    return x * x
square = tf.function(autograph=False)(square)
3. Linear Regression Revisited
Now let's go back to our linear regression example from last time and try to sugar it up with tf.function. Let's run the baseline.
t0 = time.time()
weights = tf.Variable(tf.random.uniform((5, 1)), dtype=tf.float32)
for iteration in range(1001):
    with tf.GradientTape() as tape:
        y_hat = tf.linalg.matmul(x, weights)
        loss = tf.reduce_mean(tf.square(y - y_hat))
    if not (iteration % 200):
        print('mean squared loss at iteration {:4d} is {:5.4f}'.format(iteration, loss))
    gradients = tape.gradient(loss, weights)
    weights.assign_add(-0.05 * gradients)
pprint(weights)
print('time took: {} seconds'.format(time.time() - t0))
mean squared loss at iteration    0 is 16.9626
mean squared loss at iteration  200 is 0.0363
mean squared loss at iteration  400 is 0.0062
mean squared loss at iteration  600 is 0.0012
mean squared loss at iteration  800 is 0.0003
mean squared loss at iteration 1000 is 0.0001
<tf.Variable 'Variable:0' shape=(5, 1) dtype=float32, numpy=
array([[-1.9201761e-03],
       [ 1.0043812e+00],
       [ 2.0000753e+00],
       [ 3.0021193e+00],
       [ 3.9955370e+00]], dtype=float32)>
time took: 1.3443846702575684 seconds
Let's see if @tf.function can speed things up.
t0 = time.time()
weights = tf.Variable(tf.random.uniform((5, 1)), dtype=tf.float32)
@tf.function
def train_step():
    with tf.GradientTape() as tape:
        y_hat = tf.linalg.matmul(x, weights)
        loss = tf.reduce_mean(tf.square(y - y_hat))
    gradients = tape.gradient(loss, weights)
    weights.assign_add(-0.05 * gradients)
    return loss

for iteration in range(1001):
    loss = train_step()
    if not (iteration % 200):
        print('mean squared loss at iteration {:4d} is {:5.4f}'.format(iteration, loss))
pprint(weights)
print('time took: {} seconds'.format(time.time() - t0))
mean squared loss at iteration    0 is 18.7201
mean squared loss at iteration  200 is 0.0325
mean squared loss at iteration  400 is 0.0017
mean squared loss at iteration  600 is 0.0002
mean squared loss at iteration  800 is 0.0000
mean squared loss at iteration 1000 is 0.0000
<tf.Variable 'Variable:0' shape=(5, 1) dtype=float32, numpy=
array([[-3.1365452e-03],
       [ 1.0065703e+00],
       [ 2.0000944e+00],
       [ 3.0032609e+00],
       [ 3.9934685e+00]], dtype=float32)>
time took: 0.4259765148162842 seconds
Even with an example as simple as ours, we get a nice speedup by computing in graph mode.