
SDNet

This is the official code for Microsoft's submission of the SDNet model to the CoQA leaderboard. It is implemented in the PyTorch framework. The paper to cite is:

SDNet: Contextualized Attention-based Deep Network for Conversational Question Answering, by Chenguang Zhu, Michael Zeng and Xuedong Huang, at https://arxiv.org/abs/1812.03593.

For usage of this code, please follow the Microsoft Open Source Code of Conduct.

Directory structure:

  • main.py: the starter code

  • Models/

    • BaseTrainer.py: Base class for trainer
    • SDNetTrainer.py: Trainer for SDNet, including training and predicting procedures
    • SDNet.py: The SDNet network structure
    • Layers.py: Related network layer functions
    • Bert/
      • Bert.py: Customized class to compute BERT contextualized embedding
      • modeling.py, optimization.py, tokenization.py: from Hugging Face's PyTorch implementation of BERT
  • Utils/

    • Arguments.py: Process argument configuration file
    • Constants.py: Define constants used
    • CoQAPreprocess.py: preprocesses CoQA raw data into intermediate binary/json files, including tokenization and history prepending (see the sketch after this list)
    • CoQAUtils.py, GeneralUtils.py: utility functions used in SDNet
    • Timing.py: Logging time
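For intuition, "history prepending" follows the paper's input construction: the question at turn t is prefixed with the previous rounds of questions and answers (the prev_ques/prev_ans settings visible in the training logs below, both 2 by default). The following is an illustrative sketch of that idea, not the actual code in CoQAPreprocess.py; the function name and the plain-space concatenation are assumptions.

    # Illustrative sketch of history prepending (not the repo's actual implementation).
    def prepend_history(questions, answers, turn, prev_rounds=2):
        """Prefix the question at `turn` with the previous `prev_rounds` QA pairs."""
        parts = []
        for t in range(max(0, turn - prev_rounds), turn):
            parts.extend([questions[t], answers[t]])  # interleave past Q and A
        parts.append(questions[turn])                 # current question comes last
        return " ".join(parts)

    # Example: turn 2 sees Q0 A0 Q1 A1 Q2 as its input question.
    qs = ["Who wrote it?", "Who is he?", "Where was he born?"]
    ans = ["Dickens", "An English novelist"]
    print(prepend_history(qs, ans, turn=2))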

How to run

Requirements: PyTorch 0.4.1 and spaCy 2.0.16. The Docker image we used is available on Docker Hub at https://hub.docker.com/r/zcgzcgzcg/squadv2/tags; please use v3.0 or v4.0.
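To confirm the environment matches, a minimal check (a sketch; the version strings are the ones stated above):

    # Minimal environment check against the versions stated above.
    import torch
    import spacy

    assert torch.__version__.startswith("0.4.1"), "expected PyTorch 0.4.1, got " + torch.__version__
    assert spacy.__version__ == "2.0.16", "expected spaCy 2.0.16, got " + spacy.__version__
    print("Environment OK")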

  1. Create a folder (e.g. coqa) to contain data and running logs;
  2. Create folder coqa/data to store CoQA raw data: coqa-train-v1.0.json and coqa-dev-v1.0.json;
  3. Copy the file conf from the repo into folder coqa;
  4. If you want to use BERT-Large, download the pretrained model into coqa/bert-large-uncased; if you want to use BERT-base, download it into coqa/bert-base-cased;
  5. Create a folder glove in the same parent directory as coqa and download the GloVe embeddings glove.840B.300d.txt into it.

Your directory should look like this:

  • coqa/
    • data/
      • coqa-train-v1.0.json
      • coqa-dev-v1.0.json
    • bert-large-uncased/
      • bert-large-uncased-vocab.txt
      • bert_config.json
      • pytorch_model.bin
    • conf
  • glove/
    • glove.840B.300d.txt
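Before launching training, a small sanity check can confirm this layout (a sketch; it assumes you run it from the parent directory containing coqa/ and glove/, with the BERT-Large variant):

    import os

    # Files required by the layout shown above (BERT-Large variant).
    required = [
        "coqa/data/coqa-train-v1.0.json",
        "coqa/data/coqa-dev-v1.0.json",
        "coqa/bert-large-uncased/bert-large-uncased-vocab.txt",
        "coqa/bert-large-uncased/bert_config.json",
        "coqa/bert-large-uncased/pytorch_model.bin",
        "coqa/conf",
        "glove/glove.840B.300d.txt",
    ]
    missing = [p for p in required if not os.path.exists(p)]
    print("Missing files: " + ", ".join(missing) if missing else "Layout OK")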

Then, execute python main.py train path_to_coqa/conf.

The first time you run the code, CoQAPreprocess.py will automatically create the folder conf~/spacy_intermediate_features~ inside coqa to store intermediate tokenization results; this takes a few hours.

Each time you run the code, a new folder run_idx is created inside coqa/conf~, containing the running logs, predictions on the dev set, and the best model.

Contact

If you have any questions, please contact Chenguang Zhu, [email protected]


SDNet's Issues

Inference/Prediction module

Is there a way to test the trained model on our own data, e.g., passing a passage and questions as input and getting back the answer span and its score?

Mapping BERT model contents

I have followed all the steps in the README. In the file structure given, the bert-large-uncased folder is listed as containing:

bert-large-uncased-vocab.txt
bert_config.json
pytorch_model.bin

but the actual model downloaded from the BERT repo contains the following:
vocab.txt --> bert-large-uncased-vocab.txt
bert_config.json --> bert_config.json
bert_model.ckpt.index
bert_model.ckpt.meta
bert_model.ckpt.data-00000-of-00001

I tried renaming each of the other three files to pytorch_model.bin, but each one gives a pickling error; only the key value displayed in the error changes.

Traceback (most recent call last):
  File "main.py", line 33, in <module>
    model.train()
  File "/home/crm-di/SDNet/Models/SDNetTrainer.py", line 66, in train
    self.setup_model(vocab_embedding)
  File "/home/crm-di/SDNet/Models/SDNetTrainer.py", line 137, in setup_model
    self.network = SDNet(self.opt, vocab_embedding)
  File "/home/crm-di/SDNet/Models/SDNet.py", line 63, in __init__
    self.Bert = Bert(self.opt)
  File "/home/crm-di/SDNet/Models/Bert/Bert.py", line 24, in __init__
    self.bert_model = BertModel.from_pretrained(model_file)
  File "/home/crm-di/SDNet/Models/Bert/modeling.py", line 505, in from_pretrained
    state_dict = torch.load(weights_path)
  File "/home/crm-di/SDNet/env/lib/python3.7/site-packages/torch/serialization.py", line 358, in load
    return _load(f, map_location, pickle_module)
  File "/home/crm-di/SDNet/env/lib/python3.7/site-packages/torch/serialization.py", line 532, in _load
    magic_number = pickle_module.load(f)
_pickle.UnpicklingError: invalid load key, '\x0a'.

Could more details about the BERT models be added?
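Note on the traceback above: torch.load() expects a pickled PyTorch state dict, so pointing it at any of the TensorFlow bert_model.ckpt.* files raises exactly this UnpicklingError. Checkpoints from the Google BERT repo must first be converted. A sketch, assuming the old pytorch-pretrained-bert package (from which the bundled modeling.py derives) is installed; SDNet itself does not ship this step, and the import path should be verified against your installed version:

    # Hypothetical conversion of a TF checkpoint to the pytorch_model.bin SDNet expects.
    from pytorch_pretrained_bert.convert_tf_checkpoint_to_pytorch import (
        convert_tf_checkpoint_to_pytorch,
    )

    convert_tf_checkpoint_to_pytorch(
        "bert_model.ckpt",    # TF checkpoint prefix (covers the .index/.meta/.data files)
        "bert_config.json",   # config from the same download
        "pytorch_model.bin",  # output file SDNet loads
    )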

Download BERT models

Hi,

I have downloaded bert-large-uncased from "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased.tar.gz", which only gives me pytorch_model.bin and bert_config.json.
To get the missing bert-large-uncased-vocab.txt, I downloaded the model from https://github.com/google-research/bert and renamed vocab.txt to bert-large-uncased-vocab.txt.

I guess that was not the right solution, because I got this error:

Using BERT Large model
Loading tokenizer from ../coqa/bert-large-uncased/bert-large-uncased-vocab.txt
02/27/2019 14:02:02 - INFO - Models.Bert.tokenization -   loading vocabulary file ../coqa/bert-large-uncased/bert-large-uncased-vocab.txt
*****************
prev_ques   : 2
prev_ans    : 2
ques_max_len: 140
****************
/path/SDNet/Models/SDNet.py:286: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  alpha_softmax = F.softmax(alpha)
Traceback (most recent call last):
  File "main.py", line 33, in <module>
    model.train()
  File "/path/SDNet/Models/SDNetTrainer.py", line 126, in train
    self.update(batch)
  File "/path/SDNet/Models/SDNetTrainer.py", line 182, in update
    targets = torch.LongTensor(np.array(targets))
TypeError: can't convert np.ndarray of type numpy.object_. The only supported types are: double, float, float16, int64, int32, and uint8.

Any idea where I can get the right data, or how to fix this?
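Note: one plausible mechanism for this TypeError (an assumption, not confirmed in the thread) is that the collected targets are ragged or contain None, so NumPy falls back to an object-dtype array, which torch.LongTensor cannot convert:

    import numpy as np

    # With the NumPy versions of that era, ragged input silently yields dtype=object,
    # the exact dtype torch.LongTensor rejects in the traceback above.
    a = np.array([[1, 2], [3]])
    print(a.dtype)  # object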

Train error with the BERT model in PyTorch 0.4.0 or 0.4.1

In PyTorch 0.4.1, when I train the model with BERT-Large, the following error appears at line 288 of Models/SDNet.py:
t = output[i] * alpha_softmax[i] * gamma

/da1/home/berton/project/SDNet/Models/SDNet.py:286: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
alpha_softmax = F.softmax(alpha)
Traceback (most recent call last):
  File "main.py", line 35, in <module>
    model.train()
  File "/da1/home/berton/project/SDNet/Models/SDNetTrainer.py", line 126, in train
    self.update(batch)
  File "/da1/home/berton/project/SDNet/Models/SDNetTrainer.py", line 165, in update
    query, query_mask, query_char, query_char_mask, query_bert, query_bert_mask, query_bert_offsets, len(context_words))
  File "/home/berton/py35/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/da1/home/berton/project/SDNet/Models/SDNet.py", line 192, in forward
    x_cemb_mid = self.linear_sum(x_bert_output, self.alphaBERT, self.gammaBERT)
  File "/da1/home/berton/project/SDNet/Models/SDNet.py", line 288, in linear_sum
    t = output[i] * alpha_softmax[i] * gamma
RuntimeError: sizes must be non-negative

However, I guess this is due to version issues (pytorch/pytorch#11478), so I switched to PyTorch 0.4.0, which shows a different error:

/da1/home/berton/project/SDNet/Models/SDNet.py:286: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
alpha_softmax = F.softmax(alpha)
Traceback (most recent call last):
  File "main.py", line 35, in <module>
    model.train()
  File "/da1/home/berton/project/SDNet/Models/SDNetTrainer.py", line 126, in train
    self.update(batch)
  File "/da1/home/berton/project/SDNet/Models/SDNetTrainer.py", line 165, in update
    query, query_mask, query_char, query_char_mask, query_bert, query_bert_mask, query_bert_offsets, len(context_words))
  File "/home/berton/anaconda3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/da1/home/berton/project/SDNet/Models/SDNet.py", line 192, in forward
    x_cemb_mid = self.linear_sum(x_bert_output, self.alphaBERT, self.gammaBERT)
  File "/da1/home/berton/project/SDNet/Models/SDNet.py", line 292, in linear_sum
    res += t
RuntimeError: The expanded size of the tensor (1024) must match the existing size (388) at non-singleton dimension 2
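Note: the UserWarning itself is silenced by passing dim explicitly; whether that also resolves the RuntimeError depends on alpha's shape and is not confirmed here. A minimal sketch, assuming alpha is a 1-D vector of per-layer weights as the linear_sum context suggests:

    import torch
    import torch.nn.functional as F

    alpha = torch.zeros(25)                  # e.g., one weight per BERT layer (illustrative)
    alpha_softmax = F.softmax(alpha, dim=0)  # explicit dim avoids the deprecation warning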

Please add support for pytorch 1.0.0 and msgpack-numpy 0.4.4.1

I used msgpack-numpy==0.4.4.1, but the code fails when storing and loading the preprocessed file.

I used pytorch==1.0.0, but it fails when training starts.

So I had to downgrade to msgpack-numpy==0.4.3.2 and pytorch==0.4.1, and those worked!

I would be very grateful if you could add support for the newer versions, thanks~

Python 3.7 compatibility

async became a fully reserved keyword in Python 3.7, so it would be nice if you could update your code to reflect this change. Simply changing async=True in your code to non_blocking=True makes the code compatible with newer Python versions.
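For concreteness, the change looks like this (non_blocking has been the accepted keyword since PyTorch 0.4.0):

    import torch

    x = torch.randn(4, 4)
    if torch.cuda.is_available():
        # Old spelling, a SyntaxError on Python 3.7+ where async is reserved:
        #   x = x.cuda(async=True)
        x = x.cuda(non_blocking=True)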

Why does training with ELMo instead of BERT not improve results?

Hello,

I have tried to use ELMo instead of BERT, as you can see on my fork.
The training works, but the results are very similar to training without any contextual embedding (just GloVe).
Do you have any idea why, or how to fix it?
I think I might have forgotten something in my code...

Moreover, I notice that x_cemb and ques_cemb are never instantiated; they are always None. Could this be part of the issue?

Thanks in advance

Train with the BERT model

The requirement is PyTorch 0.4.0, but line 518 of Models/Bert/modeling.py is:
module._load_from_state_dict(state_dict, prefix, local_metadata, True, missing_keys, unexpected_keys, error_msgs)
This code is for PyTorch 0.4.1 (huggingface/transformers#122).
When I removed "local_metadata", the error below appears. Could anyone help?

Loading train json...
Loading dev json...
Epoch 0
Using BERT Large model
Loading tokenizer from test_coqa/bert-large-uncased/bert-large-uncased-vocab.txt
04/18/2019 11:27:58 - INFO - Models.Bert.tokenization - loading vocabulary file test_coqa/bert-large-uncased/bert-large-uncased-vocab.txt


prev_ques : 2
prev_ans : 2
ques_max_len: 140


Using BERT Large model
Loading tokenizer from test_coqa/bert-large-uncased/bert-large-uncased-vocab.txt
04/18/2019 11:27:58 - INFO - Models.Bert.tokenization - loading vocabulary file test_coqa/bert-large-uncased/bert-large-uncased-vocab.txt


prev_ques : 2
prev_ans : 2
ques_max_len: 140


/da1/home/berton/project/SDNet/Models/SDNet.py:286: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
alpha_softmax = F.softmax(alpha)
Traceback (most recent call last):
  File "main.py", line 35, in <module>
    model.train()
  File "/da1/home/berton/project/SDNet/Models/SDNetTrainer.py", line 126, in train
    self.update(batch)
  File "/da1/home/berton/project/SDNet/Models/SDNetTrainer.py", line 165, in update
    query, query_mask, query_char, query_char_mask, query_bert, query_bert_mask, query_bert_offsets, len(context_words))
  File "/home/berton/anaconda3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/da1/home/berton/project/SDNet/Models/SDNet.py", line 192, in forward
    x_cemb_mid = self.linear_sum(x_bert_output, self.alphaBERT, self.gammaBERT)
  File "/da1/home/berton/project/SDNet/Models/SDNet.py", line 292, in linear_sum
    res += t
RuntimeError: The expanded size of the tensor (1024) must match the existing size (388) at non-singleton dimension 2

A question about fine-tuning BERT

In your experiments, did you try fine-tuning BERT's parameters? If so, what GPU type did you use, how much GPU memory did it have, and how long did training take?

Is there a TensorFlow version of the code?

Hi, I am very interested in the encoding layer of the SDNet model, especially its novel use of BERT. Do you plan to release a TensorFlow version of the code, or at least TF code for the encoding part?

CUDA error: out of memory

While training, I am getting this error:

Traceback (most recent call last):
  File "main.py", line 33, in <module>
    model.train()
  File "/home/SDNet/Models/SDNetTrainer.py", line 126, in train
    self.update(batch)
  File "/home/SDNet/Models/SDNetTrainer.py", line 164, in update
    query, query_mask, query_char, query_char_mask, query_bert, query_bert_mask, query_bert_offsets, len(context_words))
  File "/home/SDNet/env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/SDNet/Models/SDNet.py", line 253, in forward
    x_highlvl_output = self.high_lvl_context_rnn(torch.cat([x_rnn_after_inter_attn, x_self_attn_output], 2), x_mask)
  File "/home/raisudeen/SDNet/env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/SDNet/Models/Layers.py", line 163, in forward
    rnn_output = self.rnns[i](rnn_input)[0]
  File "/home/SDNet/env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/crm-di/raisudeen/SDNet/env/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 192, in forward
    output, hidden = func(input, self.all_weights, hx, batch_sizes)
  File "/home/SDNet/env/lib/python3.7/site-packages/torch/nn/_functions/rnn.py", line 324, in forward
    return func(input, *fargs, **fkwargs)
  File "/home/SDNet/env/lib/python3.7/site-packages/torch/nn/_functions/rnn.py", line 288, in forward
    dropout_ts)
RuntimeError: CUDA error: out of memory

I have tried changing the mini-batch size from 32 all the way down to 2 (I tried 16, 8, and 4 as well). I am using a GTX 1080 Ti GPU. Is there anything else I can do?
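Note: with BERT-Large and full-length CoQA passages, an 11 GB card may run out of memory regardless of batch size; switching to BERT-base or shortening the context are the obvious levers (an assumption, not a confirmed answer). The PyTorch 0.4-era memory counters can at least show how close the process is to the limit:

    import torch

    # GPU memory diagnostics available in the PyTorch 0.4-era API.
    print(torch.cuda.memory_allocated() / 1024 ** 3, "GiB currently allocated")
    print(torch.cuda.max_memory_allocated() / 1024 ** 3, "GiB peak allocation")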

A question about a TypeError

[screenshot of the TypeError]

We set up the file structure following the steps in the README and started training. After a while, checking the process, we found this TypeError. We would appreciate any pointers.

TF version

Hi!
I used TF 1.10.0 to convert the BERT checkpoint into pytorch_model.bin. Could you tell me which TF version you used?
When I run the source code, this error occurred:
TypeError: _load_from_state_dict() takes 7 positional arguments but 8 were given

Thank you!
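Note: this TypeError is a PyTorch version mismatch rather than a TensorFlow one. Module._load_from_state_dict gained a local_metadata parameter in PyTorch 0.4.1, and the bundled Models/Bert/modeling.py calls it with the newer signature (see the "Train with the BERT model" issue above). A minimal guard:

    import torch

    # modeling.py passes local_metadata, a parameter added in PyTorch 0.4.1; on 0.4.0
    # this raises "takes 7 positional arguments but 8 were given".
    assert torch.__version__.startswith("0.4.1"), (
        "SDNet's bundled modeling.py expects PyTorch 0.4.1, found " + torch.__version__)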

Pre-trained model?

Hi

Thanks for your amazing work on this project. Would it be possible for you to share your pre-trained model?

Thanks in advance.
