Hierarchical Explanations for Neural Sequence Model Predictions

This repo includes the implementation of the SOC and SCD algorithms, along with scripts for visualization and evaluation. Paper: Towards Hierarchical Importance Attribution: Explaining Compositional Semantics for Neural Sequence Models, ICLR 2020.
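
At a high level, SOC scores a phrase by the expected change in the model prediction when the phrase is occluded, while the words around it are resampled from a language model. A minimal sketch of that idea, where model.predict and lm.sample_neighborhood are hypothetical stand-ins for the repo's classifier and LM interfaces:

def soc_score(model, lm, tokens, p_start, p_end, nb_range=10, sample_n=20):
    # Sampling and Occlusion (SOC), sketched: the score of the phrase
    # tokens[p_start:p_end] is the expected drop in the model output when
    # the phrase is replaced by padding, averaged over LM samples of its
    # nb_range-word neighborhood.
    total = 0.0
    for _ in range(sample_n):
        # Resample up to nb_range words on each side of the phrase
        # from the language model (hypothetical helper).
        ctx = lm.sample_neighborhood(tokens, p_start, p_end, nb_range)
        # Occlude the phrase with padding tokens.
        masked = ctx[:p_start] + ['<pad>'] * (p_end - p_start) + ctx[p_end:]
        total += model.predict(ctx) - model.predict(masked)
    return total / sample_n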

Installation

conda create -n hiexpl-env python==3.7.4
conda activate hiexpl-env
# modify the CUDA version to match yours
conda install pytorch=0.4.1 cuda90 -c pytorch
pip install nltk numpy scikit-learn scikit-image matplotlib torchtext
# requirements from pytorch-transformers
pip install tokenizers==0.0.11 boto3 filelock requests tqdm regex sentencepiece sacremoses

Pipeline

SST-2 (LSTM)

Train an LSTM classifier. The SST-2 dataset will be downloaded automatically.

mkdir models
export model_path=sst_lstm
python train.py --task sst --save_path models/${model_path} --no_subtrees --lr 0.0005

Pretrain a language model on the training set.

export lm_path=sst_lstm_lm
python -m lm.lm_train --task sst --save_path models/${lm_path} --no_subtrees --lr 0.0002

Use SOC/SCD to interpret the first 50 predictions on the dev set. nb_range=10 and sample_n=20 are recommended.

export algo=soc # or scd
export exp_name=.sst_demo
mkdir -p sst/results
python explain.py --resume_snapshot models/${model_path} --method ${algo} --lm_path models/${lm_path} --batch_size 1 --start 0 --stop 50 --exp_name ${exp_name} --task sst --explain_model lstm --nb_range 10 --sample_n 20

Check outputs/sst/soc_results/ or outputs/sst/scd_results/ for the explanation outputs. Each line contains one instance; the score attributed to each word/phrase is tab-separated. For example:

it 0.142656	 's 0.192409	 the 0.175471	 best 0.829247	 film 0.095305	 best film 0.805854	 the best film 1.004583 ...
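
Each tab-separated field has the form "<word-or-phrase> <score>", so a line can be read back with a small helper like this (parse_line is illustrative, not part of the repo):

def parse_line(line):
    # Split the tab-separated fields; the score is the last token of each.
    pairs = []
    for field in line.strip().split('\t'):
        field = field.strip()
        if not field:
            continue
        phrase, score = field.rsplit(' ', 1)
        pairs.append((phrase, float(score)))
    return pairs

line = "it 0.142656\t 's 0.192409\t the best film 1.004583"
print(parse_line(line))
# [('it', 0.142656), ("'s", 0.192409), ('the best film', 1.004583)]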

Alternatively, use the --agg flag to construct hierarchical explanations automatically, without needing ground-truth parse trees.

python explain.py --resume_snapshot models/${model_path} --method ${algo} --lm_path models/${lm_path} --batch_size 1 --start 0 --stop 50 --exp_name ${exp_name} --task sst --explain_model lstm --nb_range 10 --sample_n 20 --agg

The output can then be passed to visualize.py to generate visualizations.

SST-2 (BERT)

Download the SST-2 dataset from https://gluebenchmark.com/ and unzip it into bert/glue_data. Then fine-tune the BERT model to build a classifier.

python -m bert.run_classifier \
  --task_name SST-2 \
  --do_train \
  --do_eval \
  --do_lower_case \
  --data_dir bert/glue_data/SST-2 \
  --bert_model bert-base-uncased \
  --max_seq_length 128 \
  --train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 3 \
  --output_dir bert/models_sst 

Then train a language model on BERT-tokenized inputs and run the explanations: simply add the --use_bert_tokenizer flag and replace --explain_model lstm with --explain_model bert in the LSTM commands above, as in the example below.
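
For example, the SOC run from the LSTM pipeline becomes something like the following; note that the checkpoint path bert/models_sst and the BERT-tokenized LM path models/sst_bert_lm are assumptions here, so point them at your own artifacts:

python explain.py --resume_snapshot bert/models_sst --method ${algo} --lm_path models/sst_bert_lm --batch_size 1 --start 0 --stop 50 --exp_name ${exp_name} --task sst --explain_model bert --use_bert_tokenizer --nb_range 10 --sample_n 20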

Note that you need to filter out the subtree rows in train.tsv if you are interested in evaluating explanations, since the SST-2 training set mixes full sentences with phrase fragments.
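
One hedged way to do this in Python, assuming exact string matching against the full sentences in the raw SST release (downloaded in the next section as ./.data/sst_raw); tokenization mismatches between the two releases may require extra normalization:

# Hypothetical filter: keep only rows of train.tsv whose text is a
# complete sentence from the raw SST release; subtree fragments are dropped.
full = set()
with open('.data/sst_raw/datasetSentences.txt') as f:
    next(f)  # skip the "sentence_index<TAB>sentence" header
    for row in f:
        full.add(row.split('\t', 1)[1].strip().lower())

with open('bert/glue_data/SST-2/train.tsv') as src, \
     open('bert/glue_data/SST-2/train_filtered.tsv', 'w') as dst:
    dst.write(src.readline())  # keep the "sentence<TAB>label" header
    for row in src:
        if row.split('\t', 1)[0].strip().lower() in full:
            dst.write(row)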

Evaluating explanations

To evaluate word-level explanations, a bag-of-words (BOW) linear classifier is required.

python -m nns.linear_model --task sst --save_path models/${model_path}
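
The word-level evaluation compares attribution scores against the per-word coefficients of this linear classifier; a self-contained sketch of one such comparison (the scores below are made up for illustration):

from scipy.stats import pearsonr

# Hypothetical example values; in practice, load word attributions from
# the explanation output and coefficients from the trained BOW model.
attr = {'best': 0.83, 'film': 0.10, 'dull': -0.71}   # explanation scores
coef = {'best': 1.20, 'film': 0.05, 'dull': -1.10}   # BOW weights
words = sorted(set(attr) & set(coef))
r, _ = pearsonr([attr[w] for w in words], [coef[w] for w in words])
print(f'word-level correlation: r = {r:.3f}')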

To evaluate phrase-level explanations, you also need to download the original SST dataset.

wget -P ./.data http://nlp.stanford.edu/~socherr/stanfordSentimentTreebank.zip
unzip ./.data/stanfordSentimentTreebank.zip -d ./.data/
mv ./.data/stanfordSentimentTreebank ./.data/sst_raw

Then run the evaluation script:

python eval_explanations.py --eval_file outputs/soc${exp_name}.txt
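
For reference, phrase-level scores can be checked against the human phrase sentiment labels shipped with the raw SST release unpacked above; a rough sketch of loading those labels (the file formats follow stanfordSentimentTreebank, and matching phrases to the explanation output is left out):

# Map a phrase string to its [0, 1] human sentiment label.
phrase_id = {}
with open('.data/sst_raw/dictionary.txt') as f:
    for row in f:
        phrase, pid = row.rstrip('\n').rsplit('|', 1)
        phrase_id[phrase] = pid

sentiment = {}
with open('.data/sst_raw/sentiment_labels.txt') as f:
    next(f)  # skip the "phrase ids|sentiment values" header
    for row in f:
        pid, value = row.strip().split('|')
        sentiment[pid] = float(value)

def phrase_sentiment(phrase):
    return sentiment.get(phrase_id.get(phrase, ''), None)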

Contact

If you have any questions about the paper or the code, please feel free to contact Xisen Jin (xisenjin usc edu).

Issues

Should we perform softmax before computing the prediction difference?

logits_enb = self.model(inp_enb, segment_ids[:, :inp_enb.size(1)], inp_mask_enb)
logits_ex = self.model(inp_ex, segment_ids[:, :inp_ex.size(1)], inp_mask_ex)
contrib_logits = logits_enb - logits_ex  # [1 * C]

Hi, logits_enb and logits_ex may take very different values even when softmax(logits_enb) and softmax(logits_ex) remain similar. Should we perform softmax before computing the prediction difference?
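
For reference, here is what differencing normalized scores instead of raw logits would look like (a sketch continuing the snippet above, not a statement of which variant the authors endorse):

import torch.nn.functional as F

# Compare class probabilities (or log-probabilities) instead of raw
# logits; softmax normalizes away shifts shared by all logits.
contrib_probs = F.softmax(logits_enb, dim=-1) - F.softmax(logits_ex, dim=-1)
contrib_logprobs = F.log_softmax(logits_enb, dim=-1) - F.log_softmax(logits_ex, dim=-1)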
