
More specifically, I have a simple fprop that is a composition of tf operations. I want to override the TensorFlow gradient computation with my own gradient method using RegisterGradient.

What's wrong with this code?

import tensorflow as tf
from tensorflow.python.framework import ops

@ops.RegisterGradient("MyopGrad")
def frop_grad(op, grad):
    x = op.inputs[0]
    return 0 * x  # zero out to see the difference:

def fprop(x):
    x = tf.sqrt(x)
    out = tf.maximum(x, .2)
    return out

a = tf.Variable(tf.constant([5., 4., 3., 2., 1.], dtype=tf.float32))
h = fprop(a)
h = tf.identity(h, name="Myop")
grad = tf.gradients(h, a)

g = tf.get_default_graph()
with g.gradient_override_map({'Myop': 'MyopGrad'}):
    with tf.Session() as sess:
        sess.run(tf.initialize_all_variables())
        result = sess.run(grad)

print(result[0])

I want to see all zeros in the print, but instead I am getting:

[ 0.2236068   0.25000003  0.28867513  0.35355341  0.5       ]

3 Answers


You need to define the op inside the with g.gradient_override_map(...) block, not before it.

Also, you need to map the op type Identity, rather than the op name Myop, to your new gradient: gradient_override_map works on op types, not op names.

Here is the full code:

import tensorflow as tf
from tensorflow.python.framework import ops

@ops.RegisterGradient("MyopGrad")
def frop_grad(op, grad):
    x = op.inputs[0]
    return 0 * x  # zero out to see the difference:

def fprop(x):
    x = tf.sqrt(x)
    out = tf.maximum(x, .2)
    return out

a = tf.Variable(tf.constant([5., 4., 3., 2., 1.], dtype=tf.float32))
h = fprop(a)

g = tf.get_default_graph()
with g.gradient_override_map({'Identity': 'MyopGrad'}):
    h = tf.identity(h, name="Myop")
    grad = tf.gradients(h, a)

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    result = sess.run(grad)

print(result[0])

Output:

[ 0.  0.  0.  0.  0.]
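
The same Identity wrapper can carry any gradient transformation, not just zeroing it out. Below is a minimal sketch (the op name ClipGrad and the clipping range are purely illustrative) that clips the gradient flowing back through the wrapped subgraph; sqrt and maximum still use their normal gradients, they just receive the clipped value through the chain rule:

import tensorflow as tf
from tensorflow.python.framework import ops

@ops.RegisterGradient("ClipGrad")
def clip_grad(op, grad):
    # Only the gradient arriving at this Identity op is transformed;
    # it then propagates upstream through maximum and sqrt as usual.
    return tf.clip_by_value(grad, -0.1, 0.1)

a = tf.Variable(tf.constant([5., 4., 3., 2., 1.], dtype=tf.float32))
h = tf.maximum(tf.sqrt(a), .2)

g = tf.get_default_graph()
with g.gradient_override_map({'Identity': 'ClipGrad'}):
    h = tf.identity(h, name="Myop")
    grad = tf.gradients(h, a)

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    print(sess.run(grad)[0])  # 0.1 * the original gradient values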
– MZHm
  • Doesn't this define a custom gradient function for the identity op and not the fprop function? If you do not multiply x by zero, you won't see [5., 4., 3., 2., 1.]; instead you'll see the input to the identity() op. – Milad Dec 28 '17 at 21:10
  • @Milad that's probably why MZHm ignores the original gradient in MyopGrad, though I suspect there are very few practical use cases for doing that ... – cutsoy Jun 06 '18 at 15:13

If you want to use tf.RegisterGradient() for this purpose, I'm not sure it is the proper solution, because the official documentation (https://www.tensorflow.org/api_docs/python/tf/RegisterGradient) says:

This decorator is only used when defining a new op type.

This means you need to define a new op, either written in C++ or wrapped in py_func. I'm not sure it can be applied to the group of tf operations you describe.
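
For reference, a rough sketch of the py_func route (the helper name py_func_with_grad is my own, not an official API): register the Python gradient under a fresh name, then override the gradient of the PyFunc op that tf.py_func creates:

import uuid
import numpy as np
import tensorflow as tf

def py_func_with_grad(func, inp, Tout, grad_fn, name=None):
    # Register grad_fn under a unique name, then make the PyFunc op
    # created below use it as its gradient.
    grad_name = "PyFuncGrad_" + str(uuid.uuid4())
    tf.RegisterGradient(grad_name)(grad_fn)
    g = tf.get_default_graph()
    with g.gradient_override_map({"PyFunc": grad_name}):
        return tf.py_func(func, inp, Tout, stateful=True, name=name)

def forward(x):
    # forward pass runs as plain numpy
    return np.maximum(np.sqrt(x), 0.2).astype(np.float32)

def backward(op, grad):
    # custom gradient: zero everything, as in the question
    return 0 * op.inputs[0]

a = tf.Variable(tf.constant([5., 4., 3., 2., 1.], dtype=tf.float32))
h = py_func_with_grad(forward, [a], [tf.float32], backward)[0]
grad = tf.gradients(h, a)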


However, you can also refer to the "trick" mentioned in this thread:

How Can I Define Only the Gradient for a Tensorflow Subgraph?

where you can combine tf.stop_gradient() and gradient_override_map() to redefine the gradients for a group of operations.
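
As a rough illustration of that trick (forward_fn and surrogate_fn are made-up names, and this is only a sketch of the pattern from the linked thread): the forward value comes from forward_fn, while tf.stop_gradient makes the backward pass see only surrogate_fn:

import tensorflow as tf

def forward_fn(x):
    # op group whose gradient we want to bypass
    return tf.maximum(tf.sqrt(x), .2)

def surrogate_fn(x):
    # subgraph whose gradient is used instead
    return x

x = tf.constant([5., 4., 3., 2., 1.], dtype=tf.float32)
s = surrogate_fn(x)
# value of y == forward_fn(x); the stop_gradient term contributes nothing
# to the backward pass, so tf.gradients differentiates surrogate_fn only
y = s + tf.stop_gradient(forward_fn(x) - s)

grad = tf.gradients(y, x)  # all ones: the gradient of surrogate_fn

with tf.Session() as sess:
    print(sess.run(grad)[0])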


See this answer (note that different questions might be satisfactorily answered by the same answer).
