The compute_gradient function has an unnecessary requirement that enforces the presence of the "labels" and "logits" attributes in the provided loss function (loss_fn). #1238
HaimFisher started this conversation in Ideas
The current implementation of the compute_gradient function in the library assumes that the loss function used in adversarial attacks accepts the keyword arguments "labels" and "logits." However, I encountered a situation where this assumption proved unnecessary and caused issues.
In my case, I attempted to attack the tf.keras.applications.InceptionV3 model using the tf.keras.losses.CategoricalCrossentropy() loss function. This loss function does not accept the expected "labels" and "logits" keyword arguments, yet compute_gradient enforces them at line 187 of cleverhans/tf2/utils.py, leading to errors.
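For context, the two calling conventions involved look roughly like this (a minimal, self-contained illustration; only the keyword-style call is taken from the traceback below, the rest is example code):
import tensorflow as tf

y = tf.one_hot([235], 1000)            # example one-hot label, shape (1, 1000)
logits = tf.random.uniform((1, 1000))  # example model output, shape (1, 1000)

# Convention expected by compute_gradient (see the traceback below):
# keyword arguments, as used by tf.nn.softmax_cross_entropy_with_logits.
loss_a = tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits)

# Convention used by tf.keras.losses objects: positional (y_true, y_pred),
# with no "labels"/"logits" keywords accepted.
cce = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
loss_b = cce(y, logits)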
The requirement for the "labels" and "logits" keyword arguments in compute_gradient appears to be an unnecessary limitation that prevents support for other loss functions. This conclusion is supported by the fact that when I removed the enforcement of these arguments, the targeted attack worked successfully.
By assuming that loss functions always accept the "labels" and "logits" keyword arguments, the current implementation limits the library's compatibility with a broader range of loss functions. Removing this unnecessary enforcement would enhance the flexibility and usability of the library, allowing users to employ various loss functions seamlessly in their adversarial attacks.
In light of these findings, I strongly recommend revisiting the implementation of compute_gradient and eliminating the requirement for the "labels" and "logits" keyword arguments. This bug fix would improve the library's compatibility and effectiveness, providing users with a more robust tool for conducting adversarial attacks.
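As a sketch of what dropping the requirement could look like (illustrative only, not cleverhans' actual code; the helper name is made up), the call at line 187 could keep the existing keyword form and fall back to the positional (y_true, y_pred) convention that Keras losses use:
import tensorflow as tf

def call_loss(loss_fn, y, logits):
    # Hypothetical helper: try the keyword form compute_gradient uses today,
    # then fall back to the positional convention of tf.keras.losses objects.
    try:
        return loss_fn(labels=y, logits=logits)
    except TypeError:
        return loss_fn(y, logits)

# Both conventions work:
y = tf.one_hot([235], 1000)
logits = tf.random.uniform((1, 1000))
call_loss(tf.nn.softmax_cross_entropy_with_logits, y, logits)
call_loss(tf.keras.losses.CategoricalCrossentropy(from_logits=True), y, logits)
Inside compute_gradient this would amount to replacing loss = loss_fn(labels=y, logits=model_fn(x)) with something like loss = call_loss(loss_fn, y, model_fn(x)).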
Reproduce
Steps to reproduce the behavior:
Run the code below:
import numpy as np
import tensorflow as tf
from cleverhans.tf2.attacks.fast_gradient_method import fast_gradient_method

model = tf.keras.applications.InceptionV3(include_top=True, weights='imagenet')
model.trainable = False

# Define the FGSM attack parameters
logits_model = tf.keras.Model(model.input, model.layers[-1].output)
eps = 0.01
target_class = 235
one_hot_tensor = tf.one_hot(target_class, 1000)
reshaped_tensor = tf.reshape(one_hot_tensor, (1, 1000))
cce = tf.keras.losses.CategoricalCrossentropy()

# Placeholder input (InceptionV3 expects 299x299 RGB); substitute a real preprocessed image
x = tf.random.uniform((1, 299, 299, 3))

adv_x = fast_gradient_method(logits_model, x, eps, np.inf, loss_fn=cce, y=reshaped_tensor, targeted=True)
See the error showing that loss_fn is called with the "labels" and "logits" keyword arguments, which it does not accept:
File "c:\Users\davidfis\Downloads_MBA\thesis\code\Adversary-Armor.venv\lib\site-packages\cleverhans\tf2\utils.py", line 187, in compute_gradient *
loss = loss_fn(labels=y, logits=model_fn(x))
TypeError: Loss.call() got an unexpected keyword argument 'labels'
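Until such a change lands, one possible user-side workaround (purely illustrative, reusing the variables from the snippet above) is to wrap the Keras loss in a small adapter that exposes the keywords compute_gradient currently passes:
# Adapter: accept the labels=/logits= keywords that compute_gradient passes
# and delegate to the positional Keras loss. Illustrative workaround only.
adv_x = fast_gradient_method(
    logits_model,
    x,
    eps,
    np.inf,
    loss_fn=lambda labels, logits: cce(labels, logits),
    y=reshaped_tensor,
    targeted=True,
)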
Wanted behavior
Support all types of loss_fn.
System configuration
OS: Win
Python version: 3.10.9
TensorFlow version: