leesureman / flat-lattice-transformer


code for ACL 2020 paper: FLAT: Chinese NER Using Flat-Lattice Transformer

Python 100.00%


flat-lattice-transformer's Issues

Shouldn't there be separate links for the yj lexicon and the ls lexicon? From the Baidu Pan links it is impossible to tell which is which.

Cannot download embeddings from Google Drive

There are no Google Drive links for gigaword_chn.all.a2b.bi.ite50.vec and sgns.merge.word.bz2.

The Baidu Pan links return a 404 error, so I cannot download the files from there either.

I would very much appreciate it if you could upload the word embeddings.

Thank you very much.

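BERT implementation

Hello, how does the paper combine Flat-Lattice with BERT? (1) By modifying BERT's Transformer structure? (2) Or does BERT only provide char-level embeddings, with the FLAT operations applied to the hidden representations that BERT outputs?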

Are there comparison results without bigram embeddings?

Could you provide links to each of the test datasets? If any visitor knows, please share them as well.

As in the title.

Running the code as described in the README produces the following error. What is causing it?
Traceback (most recent call last):
File "flat_main.py", line 306, in <module>
only_train_min_freq=args.only_train_min_freq)
File "E:\Program\Anaconda\Install\envs\pytorch-1.2.0\lib\site-packages\fastNLP\core\utils.py", line 160, in wrapper
with open(cache_filepath, 'wb') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'cache\weibo_lattice_only_train:False_trainClip:True_norm_num:0char_min_freq1bigram_min_freq1word_min_freq1only_train_min_freqTruenumber_norm0lexicon_yjload_dataset_seed100'
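
One possible cause, assuming the `cache` directory does not exist in the working directory: `open(path, 'wb')` creates the file but not its missing parent directories. A minimal sketch of a workaround; `ensure_parent_dir` is a hypothetical helper, not part of fastNLP or this repository, and the file name is shortened for illustration:

```python
import os
import tempfile


def ensure_parent_dir(filepath):
    """Create the parent directory of filepath if it is missing (hypothetical helper)."""
    parent = os.path.dirname(filepath)
    if parent:
        os.makedirs(parent, exist_ok=True)
    return filepath


# Demonstration inside a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as workdir:
    cache_fp = os.path.join(workdir, "cache", "weibo_lattice")
    ensure_parent_dir(cache_fp)          # without this, open() raises FileNotFoundError
    with open(cache_fp, "wb") as f:
        f.write(b"cached")
    cache_written = os.path.exists(cache_fp)
```

Creating the `cache` folder by hand before running `flat_main.py` has the same effect. On Windows the colons in the generated cache name can still make `open` fail, which is a separate problem from the missing directory.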

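After modifying path.py, the related variables cannot be used in process

After changing the paths in path.py to local paths, the related variables cannot be used in process.
yangjie_rich_pretrain_unigram_path and the others are reported as undefined.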

Concatenation

Hello, what is the final shape of Rij?

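Question about the residual structure. On the resume dataset I only reach 95.15%, not yet 95.5%.

1. I suspect the residual structure in the author's transformer_encoder is written incorrectly, but after I changed it the results got worse, and I do not know why.
class Layer_Process(nn.Module):
    def __init__(self, process_sequence, hidden_size, dropout=0,
                 use_pytorch_dropout=True):
        ...

    def forward(self, inp):
        output = inp
        for op in self.process_sequence:  # process_sequence = 'an'
            if op == 'a':
                output = output + inp  # isn't this equivalent to inp * 2?
            elif op == 'd':
                output = self.dropout(output)
            elif op == 'n':
                output = self.layer_norm(output)
        return output
2. I changed a few hyperparameters: batch is set to 5, and k_proj is changed to True (the author sets it to False). In addition, when fusing the position embeddings I used all of ss, se, es and ee, whereas the author's hyperparameters use only ss and ee. To enlarge each attention head I slightly increased the hidden size. Setting k_proj to True alone reaches 95.0%; using all four relative positions did not seem to help; increasing the hidden size raised the score to 95.15%. My laptop has limited capacity, so I did not increase the hidden size any further.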

Runtime error

File "../V0/modules.py", line 93, in forward
pe_ss = self.pe_ss[(pos_ss).view(-1)+self.max_seq_len].view(size=[batch,max_seq_len,max_seq_len,-1])
IndexError: index 363 is out of bounds for dimension 0 with size 351
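
The numbers in the message are consistent with a lookup table of `2 * max_seq_len + 1` rows that is indexed by `offset + max_seq_len`, which is how the line in the traceback shifts `pos_ss`. That table layout is an assumption inferred from the traceback, not confirmed from the source. Under it, the arithmetic can be sketched as follows (all function names are illustrative):

```python
def table_size(max_seq_len):
    """Rows needed to index relative offsets in [-max_seq_len, max_seq_len]."""
    return 2 * max_seq_len + 1


def max_seq_len_from_table(size):
    """Invert table_size (assumes the 2 * L + 1 layout above)."""
    return (size - 1) // 2


def check_offsets(offsets, max_seq_len):
    """Return the offsets whose shifted index would fall outside the table."""
    size = table_size(max_seq_len)
    return [d for d in offsets if not 0 <= d + max_seq_len < size]


L = max_seq_len_from_table(351)   # 175 under the assumed layout
bad_offset = 363 - L              # the offset behind the reported index
out_of_range = check_offsets([0, -175, 175, bad_offset], L)
```

If the assumption holds, some sample (characters plus matched lexicon words) is longer than the `max_seq_len` the table was built for, so either the table needs a larger `max_seq_len` or the overlong samples need to be shortened.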

Error: OSError: [Errno 22] Invalid argument

Do you know how to fix this error? It occurs for both the resume and weibo datasets.

Traceback (most recent call last):
File "flat_main.py", line 313, in <module>
only_train_min_freq=args.only_train_min_freq)
File "D:\develop\python\Anaconda3\lib\site-packages\fastNLP\core\utils.py", line 160, in wrapper
with open(cache_filepath, 'wb') as f:
OSError: [Errno 22] Invalid argument: 'cache\weibo_lattice_only_train:False_trainClip:True_norm_num:0char_min_freq1bigram_min_freq1word_min_freq1only_train_min_freqTruenumber_norm0lexicon_yjload_dataset_seed100'

Relevant code

datasets,vocabs,embeddings = equip_chinese_ner_with_lexicon(datasets,vocabs,embeddings,
                                                            w_list,yangjie_rich_pretrain_word_path,
                                                            _refresh=refresh_data,_cache_fp=cache_name,
                                                            only_lexicon_in_train=args.only_lexicon_in_train,
                                                            word_char_mix_embedding_path=yangjie_rich_pretrain_char_and_word_path,
                                                            number_normalized=args.number_normalized,
                                                            lattice_min_freq=args.lattice_min_freq,
                                                            only_train_min_freq=args.only_train_min_freq)

The error is probably caused by this decorator.

@cache_results(_cache_fp='need_to_defined_fp',_refresh=True)
def equip_chinese_ner_with_lexicon(datasets,vocabs,embeddings,w_list,word_embedding_path=None,
                                   only_lexicon_in_train=False,word_char_mix_embedding_path=None,
                                   number_normalized=False,
                                   lattice_min_freq=1,only_train_min_freq=0):
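
A likely cause, assuming the script runs on Windows (the paths in the traceback suggest so): the generated cache file name contains `:`, which Windows does not allow in file names, so `open` fails with Errno 22. One workaround is to sanitize the name before it is passed as `_cache_fp`. The sketch below is illustrative; `sanitize_cache_name` is a hypothetical helper, not part of fastNLP or this repository:

```python
import re

# Characters that Windows does not allow in file names.
_WINDOWS_FORBIDDEN = re.compile(r'[<>:"/\\|?*]')


def sanitize_cache_name(name, replacement="_"):
    """Replace characters that are invalid in Windows file names (hypothetical helper).

    Apply this to the file name only, not to a full path, because path
    separators are replaced as well.
    """
    return _WINDOWS_FORBIDDEN.sub(replacement, name)


raw_name = "weibo_lattice_only_train:False_trainClip:True_norm_num:0"
safe_name = sanitize_cache_name(raw_name)
```

The sanitized name can then be joined with the `cache` directory, for example with `os.path.join('cache', safe_name)`. Existing caches written under the old name on other platforms would not be found under the new one.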

Typo in the paper?

I may have found a typo in the paper.

Prediction code

Is there corresponding prediction (inference) code?

Where can the yangjie_word_char_mix file be downloaded?

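Is Equation (11) written incorrectly?

While working through the derivation in the paper, I began to wonder whether Equation (11) is written correctly.
Taking one of its terms as an example, the computation is (dhead, dmodel)*(dmodel, 1)*(1, dmodel)*(dmodel, dhead), whose result is a (dhead, dhead) matrix rather than a scalar. If Aij is a matrix, A* cannot replace A in the attention formula.

Perhaps I have misunderstood something; I hope the author can clarify.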
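
For comparison, and assuming Equation (11) is meant to follow the relative attention of Transformer-XL (Dai et al., 2019), that formulation keeps the embeddings on the outside of each product. With column vectors $E_{x_i}, E_{x_j}, R_{ij} \in \mathbb{R}^{d_{model}}$, projections $W_q, W_{k,E}, W_{k,R} \in \mathbb{R}^{d_{head} \times d_{model}}$ and biases $u, v \in \mathbb{R}^{d_{head}}$ (the shapes are assumptions made for this check), every term is a scalar:

```latex
A^{*}_{i,j} = E_{x_i}^{\top} W_q^{\top} W_{k,E} E_{x_j}
            + E_{x_i}^{\top} W_q^{\top} W_{k,R} R_{ij}
            + u^{\top} W_{k,E} E_{x_j}
            + v^{\top} W_{k,R} R_{ij}
```

The first term, for example, has shape (1, dmodel)*(dmodel, dhead)*(dhead, dmodel)*(dmodel, 1) = (1, 1). Whether the paper intends this ordering is for the author to confirm.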

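Reproducing MSRA: the best F1 so far is 94.12%, which falls short of the 96+% reported in the paper

Could the author share the training parameters behind the MSRA results? norm = 0/1/2/3? Learning rate, batch size and so on.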

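Where is the file sgns.merge.word?

From the README I can find these four: gigaword_chn.all.a2b.uni.ite50.vec, gigaword_chn.all.a2b.bi.ite50.vec, ctb.50d.vec and sgns.merge.word. The yangjie_word_char_mix.txt mentioned above is still missing, and I am not sure whether I overlooked something. Could you share a link?

Where is the file sgns.merge.word?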

Originally posted by @Eason-zz in https://github.com/LeeSureman/Flat-Lattice-Transformer/issue_comments/705606773

A question for the author

Cannot find files with the _deseg suffix in the weibo dataset

Does the original dataset only contain the .train/.dev/.test files?

Traceback (most recent call last):
  File "flat_main.py", line 254, in <module>
    only_train_min_freq=args.only_train_min_freq,
  File "/Users/user/.pyenv/versions/env-mkwPXnF--py3.7/lib/python3.7/site-packages/fastNLP/core/utils.py", line 344, in wrapper
    results = func(*args, **kwargs)
  File "../load_data.py", line 646, in load_weibo_ner
    bundle = loader.load(v)
  File "/Users/user/.pyenv/versions/env-mkwPXnF--py3.7/lib/python3.7/site-packages/fastNLP/io/loader/loader.py", line 68, in load
    paths = check_loader_paths(paths)
  File "/Users/user/.pyenv/versions/env-mkwPXnF--py3.7/lib/python3.7/site-packages/fastNLP/io/utils.py", line 63, in check_loader_paths
    raise FileNotFoundError(f"{paths} is not a valid file path.")
FileNotFoundError: /Users/user/Downloads/Flat-Lattice-Transformer/V0/WeiboNER/weiboNER_2nd_conll.train_deseg is not a valid file path.

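How is the train.char.bmes_clip dataset generated?

I tried to run the code, but it fails because the train.char.bmes_clip dataset is missing. I already have the train.char.bmes dataset from OntoNote4NER. How are the files with the _clip suffix generated? Thanks.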

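Model inference speed

Hello, have you measured the model's prediction speed? I am using a single Tesla P100, and prediction takes about 5 s per sample. Is this normal?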

Compatibility with BERT

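In the final experiments that combine the model with BERT, the BERT tokenizer splits the sequence into characters, so how are the words matched from the lexicon fed into BERT?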

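Do the current default parameters in flat_main.py give the model structure described in the paper?

With the current default parameters in flat_main.py, is the model structure the one described in Section 3.2 of the paper, "Relative Position Encoding of Spans"?
flat_main.py, line 116: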

parser.add_argument('--four_pos_fusion',default='ff_two',choices=['ff','attn','gate','ff_two','ff_linear'],

modules.py, line 110:

if self.four_pos_fusion == 'ff_two':
    pe_2 = torch.cat([pe_ss,pe_ee],dim=-1)

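My understanding is that only two of the four relative position features ([pe_ss, pe_se, pe_es, pe_ee]) are used here, namely [pe_ss, pe_ee], whereas Section 3.2 of the paper, "Relative Position Encoding of Spans", uses all four. Have I misread the code?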
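
To make the four features concrete: each token in the flat lattice (a character or a matched word) has a start and an end index, and ss, se, es and ee are the pairwise differences between them. The following is a minimal pure-Python sketch; the function name, the toy spans and the reading of `se` as start_i minus end_j are assumptions made for illustration, and the real code operates on batched tensors.

```python
def span_distances(starts, ends):
    """Return the four n x n relative-distance matrices between spans."""
    n = len(starts)
    d_ss = [[starts[i] - starts[j] for j in range(n)] for i in range(n)]
    d_se = [[starts[i] - ends[j] for j in range(n)] for i in range(n)]
    d_es = [[ends[i] - starts[j] for j in range(n)] for i in range(n)]
    d_ee = [[ends[i] - ends[j] for j in range(n)] for i in range(n)]
    return d_ss, d_se, d_es, d_ee


# Three characters at positions 0, 1, 2 followed by one word covering positions 0..1.
starts = [0, 1, 2, 0]
ends = [0, 1, 2, 1]
d_ss, d_se, d_es, d_ee = span_distances(starts, ends)

# Row 3 is the word, column 2 is the last character.
word_vs_last_char = (d_ss[3][2], d_se[3][2], d_es[3][2], d_ee[3][2])
```

Each matrix is n x n, and each distance is then mapped to a position embedding. The `ff_two` branch quoted above concatenates only the embeddings looked up from the ss and ee distances.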

Error when running on the weibo dataset

Running on the weibo dataset raises an error.
Command: python flat_main.py --dataset weibo

OSError: [Errno 22] Invalid argument: 'cache\weibo_lattice_only_train:False_trainClip:True_norm_num:0char_min_freq1bigram_min_freq1word_min_freq1only_train_min_freqTruenumber_norm0lexicon_yjload_dataset_seed100'

Why are all the loaded datasets put under train? Is this a bug?

In load_data.py, lines 489-491 (function load_weibo_ner()):

    for k,v in paths.items():
        bundle = loader.load(v)
        datasets[k] = bundle.datasets['train']

Why is all the data put under train? Shouldn't it go into dataset[train], dataset[dev] and dataset[test] respectively?
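
A probable explanation, assuming fastNLP's loader stores a file under the key 'train' whenever `load` is given a single file path rather than a directory or a dict: each split file is loaded on its own, so its contents always sit under 'train' in that per-file bundle, and the loop then stores them under the correct split key `k`. The sketch below mimics this with stand-in classes; `FakeLoader`, `FakeBundle` and the file names are hypothetical and are not fastNLP's API.

```python
class FakeBundle:
    """Stand-in for a data bundle: holds a dict of named datasets."""

    def __init__(self, datasets):
        self.datasets = datasets


class FakeLoader:
    """Stand-in loader: a single file path is always stored under 'train'."""

    def load(self, path):
        return FakeBundle({"train": f"contents of {path}"})


paths = {
    "train": "weiboNER.train",
    "dev": "weiboNER.dev",
    "test": "weiboNER.test",
}

loader = FakeLoader()
datasets = {}
for k, v in paths.items():
    bundle = loader.load(v)                 # one bundle per file
    datasets[k] = bundle.datasets["train"]  # 'train' is only the single-file key
```

After the loop, `datasets` has the keys train, dev and test, each holding the contents of its own file, so under this assumption the quoted code is not mixing the splits.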

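Could you send a copy of the code to my email?

I read your paper and think the idea is excellent. I am eager to look at the source code. Could you send me a copy in advance, for study purposes only? Email: [email protected]
Thank you very much!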

Error when running the code

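Add and norm question

Hello, while reading the code I noticed that when class Layer_Process performs Add+Norm, the Add does not seem to be a residual connection. What is the reason?
The code is as follows:
def forward(self, inp):
    output = inp
    for op in self.process_sequence:
        if op == 'a':
            output = output + inp
        elif op == 'd':
            output = self.dropout(output)
        elif op == 'n':
            output = self.layer_norm(output)

    return output

Here the add amounts to input + input rather than fun(input) + input. I checked TENER, and it uses a residual connection. Have I misunderstood something?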
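
A small numeric check of the observation, assuming the quoted loop is the whole computation and ignoring dropout, the epsilon term and the learnable scale and shift of layer normalization: with the sequence 'an', the add step doubles `inp`, and because layer normalization is invariant to a positive rescaling of its input, the result equals normalizing `inp` alone. The helper names below are illustrative.

```python
import math


def layer_norm(x, eps=0.0):
    """Plain layer normalization over a list, without learnable parameters."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def layer_process(inp, process_sequence):
    """Mirror of the quoted loop, restricted to the 'a' and 'n' operations."""
    output = list(inp)
    for op in process_sequence:
        if op == "a":
            output = [o + i for o, i in zip(output, inp)]
        elif op == "n":
            output = layer_norm(output)
    return output


x = [1.0, 2.0, 4.0]
added = layer_process(x, "a")       # the add step alone gives 2 * x
add_norm = layer_process(x, "an")
norm_only = layer_process(x, "n")
same = all(math.isclose(a, b) for a, b in zip(add_norm, norm_only))
```

So in this simplified setting the 'a' step has no effect on the output of 'an', and the input of the sub-layer never enters the sum, which matches the point that this is not a residual connection in the usual sense.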

import issue

Hello, after I run flat_main.py, fastNLP_module reports that it cannot import name 'get_file_name_base_on_postfix' from the fastNLP.modules.utils package. What is going on? I do not really understand the cause.

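About Lattice-LSTM

The Lattice-LSTM structure shown in the paper differs slightly from what I saw in Chinese NER Using Lattice LSTM, and I have a question. Following the construction method in Chinese NER Using Lattice LSTM, 重庆人和药店 should yield four words: [重庆, 重庆人, 人和药店, 药店]. How is the word 重庆人 removed? The paper only says "Some words in lattice may be important for NER." Could you explain how these important words are selected?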

A length error is raised even though the input length does not exceed 512

Traceback (most recent call last):
File "flat_main.py", line 787, in <module>
trainer.train()
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/fastNLP/core/trainer.py", line 613, in train
self.callback_manager.on_exception(e)
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/fastNLP/core/callback.py", line 309, in wrapper
returns.append(getattr(callback, func.__name__)(*arg))
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/fastNLP/core/callback.py", line 505, in on_exception
raise exception # raise unfamiliar Error
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/fastNLP/core/trainer.py", line 609, in train
self._train()
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/fastNLP/core/trainer.py", line 664, in _train
prediction = self._data_forward(self.model, batch_x)
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/fastNLP/core/trainer.py", line 752, in _data_forward
y = network(**x)
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "../V1/models.py", line 440, in forward
bert_embed = self.bert_embedding(char_for_bert)
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "../fastNLP_module.py", line 389, in forward
outputs = self.model(words)
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/fastNLP/embeddings/bert_embedding.py", line 339, in forward
"After split words into word pieces, the lengths of word pieces are longer than the "
RuntimeError: After split words into word pieces, the lengths of word pieces are longer than the maximum allowed sequence length:512 of bert. You can set auto_truncate=True for BertEmbedding to automatically truncate overlong input.
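
One possible explanation, offered as an assumption rather than a diagnosis: BERT's limit of 512 counts the special [CLS] and [SEP] tokens, so a sentence has room for at most 510 word pieces, and a single token can also split into more than one piece. A rough pre-check under the simplifying assumption of one word piece per character (the helper names are illustrative):

```python
BERT_MAX_LEN = 512
NUM_SPECIAL_TOKENS = 2  # [CLS] and [SEP]


def fits_in_bert(num_word_pieces, max_len=BERT_MAX_LEN):
    """True if the word pieces plus the special tokens fit within max_len."""
    return num_word_pieces + NUM_SPECIAL_TOKENS <= max_len


def too_long(sentences, max_len=BERT_MAX_LEN):
    """Indices of sentences over budget, assuming one word piece per character."""
    return [i for i, s in enumerate(sentences) if not fits_in_bert(len(s), max_len)]


sentences = ["字" * 510, "字" * 511, "字" * 512]
overlong = too_long(sentences)
```

Here the 511- and 512-character sentences are flagged even though neither exceeds 512 characters. As the error message itself suggests, setting auto_truncate=True for BertEmbedding truncates overlong input automatically; splitting long samples beforehand is the alternative if no text should be dropped.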

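Why didn't the author pre-train FLAT as a pre-trained language model?

The FLAT model structure should lend itself to pre-training.

However, the author does not seem to have run experiments in this direction. Why not?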

About the weibo dataset

Hello, which dataset did you use in the paper: weiboNER.conll.train or weiboNER_2nd_conll.train?

A simple bug

Traceback (most recent call last):
File "E:/yuan/Flat-Lattice-Transformer-master/V0/flat_main.py", line 306, in <module>
only_train_min_freq=args.only_train_min_freq)
File "C:\ProgramData\Anaconda3\envs\flat\lib\site-packages\fastNLP\core\utils.py", line 160, in wrapper
with open(cache_filepath, 'wb') as f:
OSError: [Errno 22] Invalid argument: 'cache\resume_lattice_only_train:False_trainClip:True_norm_num:0char_min_freq1bigram_min_freq1word_min_freq1only_train_min_freqTruenumber_norm0lexicon_yjload_dataset_seed100'
How can this be solved?