Is it possible to minimise a loss function by changing only some elements of a variable? In other words, if I have a variable `X` of length 2, how can I minimise my loss function by changing `X[0]` and keeping `X[1]` constant?

Hopefully this code I have attempted will describe my problem:

```
import tensorflow as tf
import tensorflow.contrib.opt as opt
X = tf.Variable([1.0, 2.0])
X0 = tf.Variable([3.0])
Y = tf.constant([2.0, -3.0])
scatter = tf.scatter_update(X, [0], X0)
with tf.control_dependencies([scatter]):
    loss = tf.reduce_sum(tf.squared_difference(X, Y))
opt = opt.ScipyOptimizerInterface(loss, [X0])
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    opt.minimize(sess)
    print("X: {}".format(X.eval()))
    print("X0: {}".format(X0.eval()))
```

which outputs:

```
INFO:tensorflow:Optimization terminated with:
Message: b'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
Objective function value: 26.000000
Number of iterations: 0
Number of functions evaluations: 1
X: [3. 2.]
X0: [3.]
```

where I would like to find the optimal value of `X0 = 2` and thus `X = [2, 2]`.

**edit**

Motivation for doing this: I would like to import a trained graph/model and then tweak various elements of some of the variables depending on some new data I have.

·

Santiago Trujillo

You can use this trick to restrict the gradient calculation to one index:

```
import tensorflow as tf
import tensorflow.contrib.opt as opt
X = tf.Variable([1.0, 2.0])
part_X = tf.scatter_nd([[0]], [X[0]], [2])
X_2 = part_X + tf.stop_gradient(-part_X + X)
Y = tf.constant([2.0, -3.0])
loss = tf.reduce_sum(tf.squared_difference(X_2, Y))
opt = opt.ScipyOptimizerInterface(loss, [X])
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    opt.minimize(sess)
    print("X: {}".format(X.eval()))
```

`part_X` holds the value you want to change, placed in an otherwise-zero vector of the same shape as `X`. `part_X + tf.stop_gradient(-part_X + X)` is the same as `X` in the forward pass, since `part_X - part_X` is 0. In the backward pass, however, `tf.stop_gradient` blocks the gradient through the second term, so only the element kept in `part_X` (here `X[0]`) receives a nonzero gradient.
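To see why this works without running TensorFlow, here is a minimal framework-free sketch that works through the forward and backward pass by hand for the numbers above. The names `mask`, `frozen`, `upstream`, and `grad_X` are made up for illustration; the mask stands in for what `tf.scatter_nd` and `tf.stop_gradient` achieve in the graph.

```python
# Hand-worked forward/backward pass for X_2 = part_X + stop_gradient(X - part_X).
X = [1.0, 2.0]
Y = [2.0, -3.0]
mask = [1.0, 0.0]  # 1 where the element may change (index 0), 0 elsewhere

# Forward pass: part_X keeps X[0] and zeros the rest.
part_X = [m * x for m, x in zip(mask, X)]
frozen = [x - p for x, p in zip(X, part_X)]    # treated as a constant in the backward pass
X_2 = [p + f for p, f in zip(part_X, frozen)]  # numerically identical to X

# Backward pass: d(loss)/d(X_2) for loss = sum((X_2 - Y)^2).
upstream = [2.0 * (x2 - y) for x2, y in zip(X_2, Y)]
# Only part_X carries gradient back to X, so the upstream gradient is masked.
grad_X = [m * u for m, u in zip(mask, upstream)]

print(X_2)     # same values as X
print(grad_X)  # the frozen element gets a zero gradient
```

Because the gradient for `X[1]` is exactly zero, any gradient-based optimizer that is handed the whole of `X` will leave that element where it started.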

·

I'm not sure if it is possible with the SciPy optimizer interface, but using one of the regular `tf.train.Optimizer` subclasses you can do something like that by calling `compute_gradients` first, then masking the gradients and then calling `apply_gradients`, instead of calling `minimize` (which, as the docs say, basically calls the previous ones).

```
import tensorflow as tf
X = tf.Variable([3.0, 2.0])
# Select updatable parameters
X_mask = tf.constant([True, False], dtype=tf.bool)
Y = tf.constant([2.0, -3.0])
loss = tf.reduce_sum(tf.squared_difference(X, Y))
opt = tf.train.GradientDescentOptimizer(learning_rate=0.1)
# Get gradients and mask them
((X_grad, _),) = opt.compute_gradients(loss, var_list=[X])
X_grad_masked = X_grad * tf.cast(X_mask, dtype=X_grad.dtype)
# Apply masked gradients
train_step = opt.apply_gradients([(X_grad_masked, X)])
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for i in range(10):
        _, X_val = sess.run([train_step, X])
        print("Step {}: X = {}".format(i, X_val))
    print("Final X = {}".format(X.eval()))
```

Output:

```
Step 0: X = [ 2.79999995 2. ]
Step 1: X = [ 2.63999987 2. ]
Step 2: X = [ 2.51199985 2. ]
Step 3: X = [ 2.40959978 2. ]
Step 4: X = [ 2.32767987 2. ]
Step 5: X = [ 2.26214385 2. ]
Step 6: X = [ 2.20971513 2. ]
Step 7: X = [ 2.16777205 2. ]
Step 8: X = [ 2.13421774 2. ]
Step 9: X = [ 2.10737419 2. ]
Final X = [ 2.10737419 2. ]
```
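As a sanity check on those numbers, the same masked update can be reproduced in plain Python. This is only a sketch of the arithmetic, not of TensorFlow itself: `masked_gradient_descent` is a hypothetical helper, and it uses 64-bit floats, so the trailing digits differ slightly from the float32 output above.

```python
def masked_gradient_descent(x, y, mask, learning_rate=0.1, steps=10):
    """Minimise sum((x - y)^2), updating only the entries where mask is 1."""
    x = list(x)
    for step in range(steps):
        # Gradient of the squared difference with respect to each element.
        grad = [2.0 * (xi - yi) for xi, yi in zip(x, y)]
        # Zero out the gradient of every element that must stay fixed.
        masked = [g * m for g, m in zip(grad, mask)]
        # Plain gradient descent step on the masked gradient.
        x = [xi - learning_rate * g for xi, g in zip(x, masked)]
        print("Step {}: X = {}".format(step, x))
    return x


final_x = masked_gradient_descent([3.0, 2.0], [2.0, -3.0], [1.0, 0.0])
```

Each step shrinks the distance between `X[0]` and its target 2 by a factor of 0.8, which is why ten steps end near 2.107 rather than at 2; more steps or a larger learning rate would get closer.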

·

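This should be pretty easy to do by using the `var_list` parameter of the `minimize` function. Note that `var_list` selects whole variables: as far as I can tell, a slice such as `X[0]` is a new tensor that the loss does not depend on, so the element you want to train has to live in its own `tf.Variable`.

```
# Assumes X is assembled from two variables, e.g. X = tf.concat([X0, X1], axis=0),
# so that X0 can be trained while X1 stays fixed.
trainable_var = X0
train_op = tf.train.GradientDescentOptimizer(learning_rate=1e-3).minimize(loss, var_list=[trainable_var])
```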

You should note that by convention all trainable variables are added to the TensorFlow default collection `GraphKeys.TRAINABLE_VARIABLES`, so you can get a list of all trainable variables using:

```
all_trainable_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
```

This is just a list of variables which you can manipulate as you see fit and use as the `var_list` parameter.

As a tangent to your question, if you ever want to take customizing the optimization process a step further, you can also compute the gradients manually using `grads = tf.gradients(loss, var_list)`, manipulate the gradients as you see fit, then call `tf.train.GradientDescentOptimizer(...).apply_gradients(grads_and_vars_as_list_of_tuples)`, where the list of tuples can be built with `zip(grads, var_list)`. Under the hood, `minimize` is just doing these two steps for you.

Also note that you are perfectly free to create different optimizers for different collections of variables. You could create an SGD optimizer with learning rate 1e-4 for some variables, and another Adam optimizer with learning rate 1e-2 for another set of variables. Not that there's any specific use case for this, I'm just pointing out the flexibility you now have.
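To make the idea of one optimizer per group of variables concrete, here is a small framework-free sketch. It is an illustration under assumptions, not TensorFlow code: `sgd_step` and `adam_step` are hypothetical helpers implementing the textbook update rules, and the two scalars `slow` and `fast` stand in for two sets of variables.

```python
import math


def sgd_step(param, grad, learning_rate):
    """One plain gradient descent update."""
    return param - learning_rate * grad


def adam_step(param, grad, state, learning_rate, beta1=0.9, beta2=0.999, eps=1e-8):
    """One textbook Adam update; `state` holds the step count and moment estimates."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    m_hat = state["m"] / (1 - beta1 ** state["t"])  # bias-corrected first moment
    v_hat = state["v"] / (1 - beta2 ** state["t"])  # bias-corrected second moment
    return param - learning_rate * m_hat / (math.sqrt(v_hat) + eps)


# Two parameter groups sharing one loss: (slow - 2)^2 + (fast + 3)^2.
slow, fast = 3.0, 2.0
adam_state = {"t": 0, "m": 0.0, "v": 0.0}

grad_slow = 2.0 * (slow - 2.0)
grad_fast = 2.0 * (fast + 3.0)

# Each group is updated by its own optimizer with its own learning rate.
slow = sgd_step(slow, grad_slow, learning_rate=1e-4)
fast = adam_step(fast, grad_fast, adam_state, learning_rate=1e-2)
print(slow, fast)
```

In TensorFlow 1.x the equivalent would be to call `minimize(loss, var_list=...)` on each optimizer with its own list of variables and run both resulting ops in the training step.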
