
zen's People

Contributors

guiminchen, wixette, yuanhetian

zen's Issues

Bus error (core dumped)

A small corpus (250k lines) caused no problems, but pre-training on a large corpus (8.4M lines) crashes with a bus error (core dumped). The log is below:

[screenshot of the crash log]

Could you tell me whether you know the cause, and any possible fixes?
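If the crash comes from memory pressure on the 8.4M-line corpus rather than a code bug (an assumption, not a confirmed diagnosis), one common mitigation is to shard the corpus and pre-train on the shards in turn. A minimal sketch, with an arbitrary shard size:

# Split a one-sentence-per-line corpus into smaller shards; the
# 200,000-lines-per-shard figure is purely illustrative.
def shard_corpus(path, lines_per_shard=200_000):
    shard_idx, buf = 0, []
    with open(path, encoding="utf-8") as f:
        for line in f:
            buf.append(line)
            if len(buf) >= lines_per_shard:
                with open(f"{path}.shard{shard_idx}", "w", encoding="utf-8") as out:
                    out.writelines(buf)
                shard_idx, buf = shard_idx + 1, []
    if buf:  # flush the final partial shard
        with open(f"{path}.shard{shard_idx}", "w", encoding="utf-8") as out:
            out.writelines(buf)

shard_corpus("corpus.txt")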

Dataset preprocessing

Hello, and thank you for your work! After downloading the Chinese dataset THUCNews, what should I do (that is, which commands should I run) to make ./examples/create_pre_train_data.py run correctly and produce a proper training set?

Looking forward to your reply. Many thanks!
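A minimal conversion sketch, assuming THUCNews ships one document per .txt file and that create_pre_train_data.py expects BERT-style input (one sentence per line, a blank line between documents); the sentence splitter and paths are illustrative assumptions:

import glob
import re

# Crude THUCNews -> pre-training-corpus conversion: split each document
# on Chinese sentence-final punctuation, one sentence per line, blank
# line between documents.
with open("pretrain_corpus.txt", "w", encoding="utf-8") as out:
    for path in glob.glob("THUCNews/**/*.txt", recursive=True):
        with open(path, encoding="utf-8") as f:
            text = f.read().strip()
        for sent in re.split(r"(?<=[。!?])", text):
            sent = sent.strip()
            if sent:
                out.write(sent + "\n")
        out.write("\n")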

about pre-training time

We have the same configuration (NVIDIA Tesla V100 GPUs with 16 GB memory) and plan to switch to Baidu Baike for pre-training. Roughly how long does one epoch take?

size mismatch for classifier.bias: copying a param with shape torch.Size([3])

Can the classification task be run directly, or must the model be fine-tuned first?
I downloaded all the data, and running the following directly produces this error:

python run_sequence_level_classification.py \
    --task_name ChnSentiCorp \
    --do_train \
    --do_eval \
    --do_lower_case \
    --data_dir /path/to/dataset/ChnSentiCorp \
    --bert_model /path/to/zen_model \
    --max_seq_length 512 \
    --train_batch_size 32 \
    --learning_rate 2e-5 \
    --num_train_epochs 30.0

07/20/2020 22:14:06 - INFO - ZEN.tokenization - loading vocabulary file /data/ceph/arikchen/TitleScoring_withData/zen_ngram/ZEN_ft_NLI_v0.1.0/vocab.txt
07/20/2020 22:14:06 - INFO - ZEN.ngram_utils - loading ngram frequency file /data/ceph/arikchen/TitleScoring_withData/zen_ngram/ZEN_ft_NLI_v0.1.0/ngram.txt
07/20/2020 22:14:08 - INFO - ZEN.modeling - loading weights file /data/ceph/arikchen/TitleScoring_withData/zen_ngram/ZEN_ft_NLI_v0.1.0/pytorch_model.bin
07/20/2020 22:14:08 - INFO - ZEN.modeling - loading configuration file /data/ceph/arikchen/TitleScoring_withData/zen_ngram/ZEN_ft_NLI_v0.1.0/config.json
07/20/2020 22:14:08 - INFO - ZEN.modeling - Model config {
"attention_probs_dropout_prob": 0.1,
"directionality": "bidi",
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"layer_norm_eps": 1e-12,
"max_position_embeddings": 512,
"num_attention_heads": 12,
"num_hidden_layers": 12,
"num_hidden_word_layers": 6,
"pooler_fc_size": 768,
"pooler_num_attention_heads": 12,
"pooler_num_fc_layers": 3,
"pooler_size_per_head": 128,
"pooler_type": "first_token_transform",
"type_vocab_size": 2,
"vocab_size": 21128,
"word_size": 104089
}

Traceback (most recent call last):
File "examples/run_sequence_level_classification.py", line 396, in
main()
File "examples/run_sequence_level_classification.py", line 361, in main
if task_name not in processors:
File "/data/anaconda3/lib/python3.6/site-packages/ZEN-0.1.0-py3.6.egg/ZEN/modeling.py", line 839, in from_pretrained
RuntimeError: Error(s) in loading state_dict for ZenForSequenceClassification:
size mismatch for classifier.weight: copying a param with shape torch.Size([3, 768]) from checkpoint, the shape in current model is torch.Size([2, 768]).
size mismatch for classifier.bias: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([2]).
sh-4.2$

Thanks a lot.
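The traceback shows a fine-tuned NLI checkpoint (ZEN_ft_NLI_v0.1.0) with a 3-class head being loaded into a 2-class ChnSentiCorp model, so fine-tuning (or at least replacing the head) is required. One workaround sketch, assuming you want to keep the encoder weights and retrain the classifier, is to strip the mismatched head before loading; the key names come from the error above, and the paths are placeholders:

import os
import torch

# Drop the 3-class NLI classifier head so the remaining encoder weights
# fit a freshly initialized 2-class model.
state_dict = torch.load("ZEN_ft_NLI_v0.1.0/pytorch_model.bin", map_location="cpu")
for key in ("classifier.weight", "classifier.bias"):
    state_dict.pop(key, None)
# Save into a model directory whose pytorch_model.bin is then passed
# via --bert_model (copy config.json, vocab.txt, ngram.txt alongside).
os.makedirs("zen_model_noclassifier", exist_ok=True)
torch.save(state_dict, "zen_model_noclassifier/pytorch_model.bin")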

Building the n-gram lexicon

Hello, may I ask what tool the ZEN model used to build its n-gram lexicon? I would like to build an n-gram lexicon from text in my own domain, but I am not sure what a good way to do that is.
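The ZEN paper describes the lexicon as n-grams selected by corpus frequency; assuming that approach (the authors' exact tooling is not confirmed here), a minimal counting sketch with an illustrative n range and threshold:

from collections import Counter

# Count character n-grams (n = 2..4) and keep those above a frequency
# threshold; both settings are illustrative, not ZEN's actual values.
def build_ngram_lexicon(corpus_path, min_n=2, max_n=4, min_freq=10):
    counts = Counter()
    with open(corpus_path, encoding="utf-8") as f:
        for line in f:
            chars = line.strip()
            for n in range(min_n, max_n + 1):
                for i in range(len(chars) - n + 1):
                    counts[chars[i:i + n]] += 1
    return [ng for ng, c in counts.items() if c >= min_freq]

with open("ngram.txt", "w", encoding="utf-8") as out:
    out.write("\n".join(build_ngram_lexicon("corpus.txt")))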

n-gram lexicon question

Where can I get the n-gram lexicon? Could you provide a link? Thanks!

Is the fine-tuning data in the official dataset format? Could you provide it directly? It is not very convenient for individuals to obtain.

python run_token_level_classification.py \
    --task_name cwsmsra \
    --do_train \
    --do_eval \
    --do_lower_case \
    --data_dir data/msra_ner \
    --bert_model data/ZEN_pretrain_base_v0.1.0 \
    --max_seq_length 256 \
    --train_batch_size 96 \
    --num_train_epochs 30 \
    --warmup_proportion 0.1

For example, to run the fine-tuning above: what training data format does the cwsmsra task expect, and where can the data be conveniently obtained?
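For reference, token-level scripts of this kind commonly read CoNLL-style files: one character and its tag per line, with a blank line between sentences. The B/I/E/S segmentation tags below are an assumed illustration, not a confirmed spec of ZEN's cwsmsra processor:

我 S
爱 S
北 B
京 E

(blank line, then the next sentence)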

ModuleNotFoundError: No module named 'ZEN'

Running python run_pre_train.py fails with the error below:
Traceback (most recent call last):
File "run_pre_train.py", line 33, in
from ZEN import WEIGHTS_NAME, CONFIG_NAME
ModuleNotFoundError: No module named 'ZEN'
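This usually means the ZEN package is not importable from the working directory; the egg path in another issue's traceback suggests the project is meant to be installed (e.g., pip install . from the repository root). A minimal workaround sketch, assuming the script sits in examples/ inside an uninstalled checkout, is to put the repository root on sys.path before the import:

import os
import sys

# Make the cloned repository root importable (assumes this file lives
# one directory below it, e.g. in examples/).
sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), "..")))

from ZEN import WEIGHTS_NAME, CONFIG_NAME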

how to initialize the n-gram tower and embeddings?

Hi~

1. Is ZEN trained from a base BERT (e.g., Google's release) or from scratch? If from scratch, I guess the n-gram embeddings are randomly initialized; if from a base BERT, are the n-gram embeddings perhaps the average of the characters they contain? (A sketch of that scheme appears below.)

2. The paper says "We use the same parameter setting for the n-gram encoder as in BERT". Are the n-gram encoder's parameters shared with the BERT tower (perhaps its bottom six layers?), or are they initialized and trained independently?

thank you~
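For question 1, if one wanted to try the character-averaging initialization the question hypothesizes (not confirmed as ZEN's actual scheme), a sketch might look like:

import torch

# Hypothetical initialization: each n-gram embedding starts as the mean
# of the character embeddings it spans. char_emb is the [vocab, hidden]
# character embedding matrix; ngram_to_char_ids maps each n-gram row to
# the ids of its characters.
def init_ngram_embeddings(char_emb, ngram_to_char_ids):
    ngram_emb = torch.empty(len(ngram_to_char_ids), char_emb.size(1))
    for row, char_ids in enumerate(ngram_to_char_ids):
        ngram_emb[row] = char_emb[torch.tensor(char_ids)].mean(dim=0)
    return ngram_emb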

Fine-tuning datasets preparation

Firstly, thanks a lot for your open-source contribution.
Could you please provide some Python scripts for converting the original official dataset formats to the TSV format, e.g., XML to TSV for the MSRA NER task? That would let us use your project much more conveniently.

Thanks a lot again.
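Until such scripts are published, a generic sketch with the standard library may help; the <sentence> element and its label attribute are hypothetical placeholders, not the real MSRA schema:

import csv
import xml.etree.ElementTree as ET

# Generic XML -> TSV conversion; adapt the element/attribute names to
# the actual dataset schema.
tree = ET.parse("dataset.xml")
with open("train.tsv", "w", encoding="utf-8", newline="") as f:
    writer = csv.writer(f, delimiter="\t")
    for sent in tree.getroot().iter("sentence"):
        writer.writerow([(sent.text or "").strip(), sent.get("label", "")])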

hyperparameters for pre-training

Hi, this is nice work!

Could you give some more details about the hyperparameters used in pre-training?

ZEN (P) is trained on top of Google's BERT. How many epochs were used for the additional pre-training?

Thanks!

Can you evaluate the ZEN model on the CLUE benchmark?

Thank you for ZEN! Researchers now have another great choice of pretrained NLP model. We have seen that ZEN compares favorably with BERT on many NLP tasks. Would you consider evaluating ZEN on the CLUE benchmark?

Our group, CLUE, is also devoted to advancing Chinese NLP; we have chosen 9 representative Chinese tasks, and the leaderboard (including human performance) is now open.

We hope to see ZEN on this leaderboard :)

CLUE Group: https://github.com/CLUEbenchmark/CLUE
CLUE Benchmark: https://www.cluebenchmarks.com/

a question about the fine-tuned models

Excuse me, does the "fine-tuned model for NER" mean that I can directly test on the datasets (OntoNotes, Resume, MSRA, ...) without training, or that I can only test on MSRA without training?
Thanks!

Which BERT implementation did you base ZEN on?

Hello,

We really like your work!
Since we want to follow up on it, we would like to know which BERT implementation ZEN is based on, so that we can conveniently use BERT for comparison later.
Could you give the BERT implementation you used, with a link?

Thanks!
