eric-wallace / universal-triggers Goto Github PK
View Code? Open in Web Editor NEWUniversal Adversarial Triggers for Attacking and Analyzing NLP (EMNLP 2019)
License: MIT License
Universal Adversarial Triggers for Attacking and Analyzing NLP (EMNLP 2019)
License: MIT License
Hi Eric, I'm looking at your paper for a school project and was wondering if you had the code to filter positive/negative sentiment words for the triggers generated during the sst attack. Any help would be appreciated!
Traceback (most recent call last):
File "/universal-triggers/gpt2/create_adv_token.py", line 10, in
import utils
File "/universal-triggers/gpt2/utils.py", line 8, in
from allennlp.data.iterators import BucketIterator
File "/usr/local/lib/python3.8/dist-packages/allennlp/data/init.py", line 1, in
from allennlp.data.dataset_readers.dataset_reader import DatasetReader
File "/usr/local/lib/python3.8/dist-packages/allennlp/data/dataset_readers/init.py", line 10, in
from allennlp.data.dataset_readers.ccgbank import CcgBankDatasetReader
File "/usr/local/lib/python3.8/dist-packages/allennlp/data/dataset_readers/ccgbank.py", line 9, in
from allennlp.data.dataset_readers.dataset_reader import DatasetReader
File "/usr/local/lib/python3.8/dist-packages/allennlp/data/dataset_readers/dataset_reader.py", line 8, in
from allennlp.data.instance import Instance
File "/usr/local/lib/python3.8/dist-packages/allennlp/data/instance.py", line 3, in
from allennlp.data.fields.field import DataArray, Field
File "/usr/local/lib/python3.8/dist-packages/allennlp/data/fields/init.py", line 7, in
from allennlp.data.fields.array_field import ArrayField
File "/usr/local/lib/python3.8/dist-packages/allennlp/data/fields/array_field.py", line 10, in
class ArrayField(Field[numpy.ndarray]):
File "/usr/local/lib/python3.8/dist-packages/allennlp/data/fields/array_field.py", line 51, in ArrayField
def empty_field(self): # pylint: disable=no-self-use
File "/usr/local/lib/python3.8/dist-packages/overrides/overrides.py", line 88, in overrides
return _overrides(method, check_signature, check_at_runtime)
File "/usr/local/lib/python3.8/dist-packages/overrides/overrides.py", line 114, in _overrides
_validate_method(method, super_class, check_signature)
File "/usr/local/lib/python3.8/dist-packages/overrides/overrides.py", line 135, in _validate_method
ensure_signature_is_compatible(super_method, method, is_static)
File "/usr/local/lib/python3.8/dist-packages/overrides/signature.py", line 93, in ensure_signature_is_compatible
ensure_return_type_compatibility(super_type_hints, sub_type_hints, method_name)
File "/usr/local/lib/python3.8/dist-packages/overrides/signature.py", line 287, in ensure_return_type_compatibility
raise TypeError(
TypeError: ArrayField.empty_field: return type None
is not a <class 'allennlp.data.fields.field.Field'>
.
Hi Eric! Thanks for sharing this work. I've implemented this in Tensorflow to use with a dupe of the 124M GPT-2 model and was wondering if you could provide some details on the range of final "best loss" #s you were seeing with the smallest model and the triggers which worked (I'm working under the assumption that on a vocab size of 50k that cross entropy of ~10.8 ish would be equivalent to "random"). My current process isn't producing triggers which are successfully adversarial and I'm wondering if perhaps I'm just not finding very good triggers. Thanks!
Hello, I'd like to ask what your code License is.
Hi,
would appreciate code examples to use character embeddings of ELMO for SST -LSTM or SNLI -DA
Hi Eric, I'm looking at your paper for a school project and was wondering if you had any tips for adapting the code to generate triggers for a BERT model during the sst attack. Any help would be appreciated!
Traceback (most recent call last):
File "sst.py", line 184, in
main()
File "sst.py", line 155, in main
averaged_grad = utils.get_average_grad(model, batch, trigger_token_ids)
File "../utils.py", line 84, in get_average_grad
loss.backward()
File "/anaconda3/envs/triggers/lib/python3.6/site-packages/torch/tensor.py", line 102, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/anaconda3/envs/triggers/lib/python3.6/site-packages/torch/autograd/init.py", line 90, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: cudnn RNN backward can only be called in training mode
Hello Eric, in your paper you wrote that a sentiment word list is used to filter the words to substitute, but I could not find related code in the repository. Maybe the reason is that it is not important in generating the trigger words?
When I run pip install -r requirements.txt
with the default requirements.txt file, I get the following error:
Traceback (most recent call last): File "squad/squad.py", line 2, in <module> from allennlp.data.dataset_readers.reading_comprehension.squad import SquadReader File "/opt/conda/lib/python3.6/site-packages/allennlp/data/__init__.py", line 1, in <module> from allennlp.data.dataset_readers.dataset_reader import DatasetReader File "/opt/conda/lib/python3.6/site-packages/allennlp/data/dataset_readers/__init__.py", line 10, in <module> from allennlp.data.dataset_readers.ccgbank import CcgBankDatasetReader File "/opt/conda/lib/python3.6/site-packages/allennlp/data/dataset_readers/ccgbank.py", line 9, in <module> from allennlp.data.dataset_readers.dataset_reader import DatasetReader File "/opt/conda/lib/python3.6/site-packages/allennlp/data/dataset_readers/dataset_reader.py", line 8, in <module> from allennlp.data.instance import Instance File "/opt/conda/lib/python3.6/site-packages/allennlp/data/instance.py", line 3, in <module> from allennlp.data.fields.field import DataArray, Field File "/opt/conda/lib/python3.6/site-packages/allennlp/data/fields/__init__.py", line 10, in <module> from allennlp.data.fields.knowledge_graph_field import KnowledgeGraphField File "/opt/conda/lib/python3.6/site-packages/allennlp/data/fields/knowledge_graph_field.py", line 14, in <module> from allennlp.data.token_indexers.token_indexer import TokenIndexer, TokenType File "/opt/conda/lib/python3.6/site-packages/allennlp/data/token_indexers/__init__.py", line 5, in <module> from allennlp.data.token_indexers.dep_label_indexer import DepLabelIndexer File "/opt/conda/lib/python3.6/site-packages/allennlp/data/token_indexers/dep_label_indexer.py", line 9, in <module> from allennlp.data.tokenizers.token import Token File "/opt/conda/lib/python3.6/site-packages/allennlp/data/tokenizers/__init__.py", line 8, in <module> from allennlp.data.tokenizers.pretrained_transformer_tokenizer import PretrainedTransformerTokenizer File "/opt/conda/lib/python3.6/site-packages/allennlp/data/tokenizers/pretrained_transformer_tokenizer.py", line 5, in <module> from pytorch_transformers.tokenization_auto import AutoTokenizer ModuleNotFoundError: No module named 'pytorch_transformers.tokenization_auto'
So I install the latest version of allennlp with pip install allennlp --upgrade
and get the following versions:
allennlp-0.9.0
pytorch-transformers-1.1.0
However, I still can't run the squad script because I get the following error:
Traceback (most recent call last): File "squad.py", line 118, in <module> main() File "squad.py", line 19, in main model.eval().cuda() File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 265, in cuda return self._apply(lambda t: t.cuda(device)) File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 193, in _apply module._apply(fn) File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 193, in _apply module._apply(fn) File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 127, in _apply self.flatten_parameters() File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 123, in flatten_parameters self.batch_first, bool(self.bidirectional)) RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED
I am running the code on an AWS instance with CUDA and Python 3.6.9. I would appreciate it if you could help me figure out what I'm doing wrong.
Hi Thanks for sharing the code. I have a quick question about the implementation of hotflip_attack (equation 2 of paper) (
Line 8 in 2e4bc93
Hi,
In snli.py file, I'm wondering why get_best_candidates() function extracts the best candidate with the largest loss but at the same time, the model extracts candidates that can minimize model's loss (hotflip_attack() function with increase_loss=False)? Shouldn't the best candidate also minimize the loss? Thanks!
When comparing the accuracy of esim-glove-snli
with decomposable-attention
, it seems like the the accuracy computed for the second model is not computed correctly.
Am I missing something here?
Here's a Colab.
# Load model and vocab
model = load_archive('https://allennlp.s3-us-west-2.amazonaws.com/models/esim-glove-snli-
2019.04.23.tar.gz').model
model.eval().cuda()
vocab = model.vocab
# Get original accuracy before adding universal triggers
utils.get_accuracy(model, subset_dev_dataset, vocab, trigger_token_ids=None, snli=True)
100%|██████████| 53676932/53676932 [00:00<00:00, 62451888.15B/s]
Without Triggers: 0.9095824571943527
# Load model and vocab
model2 = load_archive('https://s3-us-west-2.amazonaws.com/allennlp/models/decomposable-attention-2017.09.04.tar.gz').model
model2.eval().cuda()
vocab2 = model2.vocab
# Get original accuracy before adding universal triggers
utils.get_accuracy(model2, subset_dev_dataset, vocab2, trigger_token_ids=None, snli=True)
100%|██████████| 38907176/38907176 [00:00<00:00, 77210263.22B/s]
Did not use initialization regex that was passed: .*token_embedder_tokens\\\\._projection.*weight
Without Triggers: 0.42805647341544006
Hi Wallace, thanks for sharing this work! As I am trying the different token replace strategies in the atttack.py I find a little hard to understand why for increase_loss = False, we apply, e(t+1)=e(t)+g (gradient ascent), while for increase_loss = True, we apply, e(t+1)=e(t)-g. In my understanding, the gradient descent tries to minimize the loss function? And I also try to flipping the sign the original code which gets better attack accuracy compared to the source code? Thanks!
when I run "python sst.py",this error always exist.So,please tell me how to slove this error.
error:
" from pytorch_transformers.tokenization_auto import AutoTokenizer
ModuleNotFoundError: No module named 'pytorch_transformers.tokenization_auto' "
thank you @Eric-Wallace
For snli task, if i sample the subset of which the labels are 'entailment' and attack toward the same class(increase_loss=Flase), the accuracy should get higher right? cause we minimize the loss between the input and the target 'entailment', which is the true label.
That wasn't the case though, the accuracy drops.
Hello
I have a small question about sst task code. Should the increase_loss
parameter be set to False
when dataset_label_filter = "1"
?
Hi Eric,
I notice that you generated triggers separately for each class (e.g., generate a trigger for positive examples, and generate another one for negative examples). Have you tried generating a common trigger for all different classes (just set the target label to a different label than the original label)? Any chance it could work well?
Thank you for your wonderful work!
I'd like to ask How are “target_texts” generated, in "Conditional Text Generation" ?
Is there a strategy in this step?
In the attacks.py file, line number 30 (hot flip attack), according to the commented part of the code (line 19), shouldn't it be multiplying the gradient dot embedding matrix with -1 when increase loss is True?
hence line 30 should be [ **if increase_loss :**
] ?
Hi there, a little background on my project. I am currently doing a benign/malware app classifier based on API sequences, which can be quite similar to text classification (positive/negative).
I am running the code based on sst.py. To prepare my dataset, I followed the allennlp to create instances for train and dev data. Everything seems fine when I run the main() function, the training is done but when it comes to the trigger part, the "words" seems to be changing but accuracy is not dropping. Do you have any idea why is this happening? The same behaviour can be seen with the different attacks (e.g hotflip, nearest_neighbor_grad etc..)
Without Triggers: 0.9994070560332049
Current Triggers: landroid/content/context;->sendbroadcast, landroid/content/context;->sendbroadcast, landroid/content/context;->sendbroadcast, landroid/content/context;->sendbroadcast, landroid/content/context;->sendbroadcast, landroid/content/context;->sendbroadcast, landroid/content/context;->sendbroadcast, : 0.9997035280166024
Current Triggers: landroid/content/context;->sendbroadcast, landroid/os/bundle;->putsparseparcelablearray, landroid/app/activity;->databaselist, landroid/app/activity;->databaselist, landroid/app/activity;->databaselist, landroid/app/activity;->databaselist, landroid/os/bundle;->putsparseparcelablearray, : 0.9997035280166024
Current Triggers: landroid/content/context;->sendbroadcast, landroid/content/intent;->replaceextras, landroid/content/intent;->replaceextras, landroid/content/intent;->replaceextras, landroid/content/intent;->replaceextras, landroid/content/intent;->replaceextras, landroid/content/intent;->replaceextras, : 0.9997035280166024
Current Triggers: landroid/content/context;->sendbroadcast, landroid/content/res/assetmanager;->opennonassetfdnative, landroid/content/res/assetmanager;->opennonassetfdnative, landroid/content/res/assetmanager;->opennonassetfdnative, landroid/media/audiomanager;->adjuststreamvolume, landroid/content/res/assetmanager;->opennonassetfdnative, landroid/content/res/assetmanager;->opennonassetfdnative, : 0.9997035280166024
Current Triggers: landroid/content/context;->sendbroadcast, ljava/lang/runtime;->runfinalization, ljava/lang/runtime;->runfinalization, ljava/lang/runtime;->runfinalization, lorg/apache/cordova/directorymanager;->gettempdirectorypath, landroid/hardware/sensormanager;->getsensorlist, ljava/lang/runtime;->runfinalization, : 0.9997035280166024
Current Triggers: landroid/content/context;->sendbroadcast, landroid/content/intent;->getcomponent, landroid/content/intent;->getcomponent, landroid/content/intent;->getcomponent, landroid/content/intent;->getcomponent, landroid/os/environment;->isexternalstorageemulated, landroid/content/intent;->getcomponent, : 0.9997035280166024
Current Triggers: landroid/app/activity;->databaselist, landroid/os/bundle;->getparcelablearraylist, landroid/os/bundle;->getparcelablearraylist, landroid/os/bundle;->getparcelablearraylist, landroid/accounts/accountmanager;->getauthtoken, landroid/location/locationmanager;->removeproximityalert, landroid/os/bundle;->getparcelablearraylist, : 0.9997035280166024
Current Triggers: landroid/content/intent;->replaceextras, landroid/net/uri;->getfragment, landroid/net/uri;->getfragment, landroid/app/activitymanager;->killbackgroundprocesses, landroid/content/clipboardmanager;->getservice, landroid/os/bundle;->putparcelablearraylist, ljava/lang/system;->setsecuritymanager, : 0.9997035280166024
Current Triggers: landroid/content/res/assetmanager;->opennonassetfdnative, ljava/net/urlconnection;->getfilenamemap, ljava/net/urlconnection;->getfilenamemap, lorg/apache/xerces/impl/xmlentitymanager;->isentitydeclinexternalsubset, landroid/content/clipboardmanager;->reportprimaryclipchanged, landroid/app/fragmentmanager;->begintransaction, landroid/net/uri;->getencodedpath, : 0.9997035280166024
Current Triggers: ljava/lang/runtime;->runfinalization, landroid/net/uri;->tostring, lorg/apache/xerces/impl/xmlentitymanager;->closereaders, landroid/hardware/camera;->cancelautofocus, landroid/app/activitymanager;->getlocktaskmodestate, landroid/webkit/cookiesyncmanager;->resetsync, ljava/net/urlconnection;->getdooutput, : 0.9997035280166024
Current Triggers: landroid/content/intent;->getcomponent, landroid/bluetooth/rfcommsocket;->waitforasyncconnectnative, landroid/app/activity;->finalize, landroid/hardware/sensor;->getreportingmode, landroid/content/intent;->setdataandtype, landroid/hardware/camera;->startsmoothzoom, lorg/apache/cordova/file/directorymanager;->getfreediskspace, : 0.9997035280166024
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.