leesureman / flat-lattice-transformer
code for ACL 2020 paper: FLAT: Chinese NER Using Flat-Lattice Transformer
Could you share a link to the datasets? The ones I found myself don't seem to be in the right format.
There are no Google Drive links for gigaword_chn.all.a2b.bi.ite50.vec and sgns.merge.word.bz2.
The Baidu Pan links return a 404 error, so I cannot download the files from there.
I would very much appreciate it if you could upload the word embeddings.
Thank you very much.
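What results did you get when reproducing the final model?
Hello, how does the paper combine Flat-Lattice with BERT? (1) By modifying BERT's Transformer structure? (2) Or does BERT only provide character-level embeddings, with the FLAT operations applied after BERT outputs its hidden representations?
Hello, could you share the dataset format, along with a few sample records?
Does it really use this much GPU memory? With 80,000 training examples it won't run on 16 GB.
As in the title.
WeChat ID: lawsonabs
Running the code as described in the README raises the following error. What is causing it?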
Traceback (most recent call last):
File "flat_main.py", line 306, in
only_train_min_freq=args.only_train_min_freq)
File "E:\Program\Anaconda\Install\envs\pytorch-1.2.0\lib\site-packages\fastNLP\core\utils.py", line 160, in wrapper
with open(cache_filepath, 'wb') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'cache\weibo_lattice_only_train:False_trainClip:True_norm_num:0char_min_freq1bigram_min_freq1word_min_freq1only_train_min_freqTruenumber_norm0lexicon_yjload_dataset_seed100'
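After changing the paths in path.py to local paths, the related variables cannot be used in process.
Variables such as yangjie_rich_pretrain_unigram_path are reported as undefined.
Hello, what is the final shape of Rij?
1. I suspect the residual structure in the author's transformer_encoder is written incorrectly, but after I changed it the results got worse. I don't know why.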
class Layer_Process(nn.Module):
    def __init__(self, process_sequence, hidden_size, dropout=0,
                 use_pytorch_dropout=True):
        ...  # constructor body omitted in the original post

    def forward(self, inp):
        output = inp
        for op in self.process_sequence:  # process_sequence = 'an'
            if op == 'a':
                output = output + inp  # isn't this equivalent to inp * 2?
            elif op == 'd':
                output = self.dropout(output)
            elif op == 'n':
                output = self.layer_norm(output)
        return output
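2. I changed a few hyperparameters: batch is set to 5, and k_proj is changed to True (the author sets it to False). When fusing the position embeddings I used all of ss, se, es, and ee, whereas the author's hyperparameters use only ss and ee. To enlarge each attention head, I slightly increased the hidden size. Changing k_proj to True alone reaches 95.0%; using all four relative positions in the fusion did not seem to help; increasing the hidden size raised the score to 95.15%. My laptop has limited capacity, so I did not increase the hidden size any further.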
File "../V0/modules.py", line 93, in forward
pe_ss = self.pe_ss[(pos_ss).view(-1)+self.max_seq_len].view(size=[batch,max_seq_len,max_seq_len,-1])
IndexError: index 363 is out of bounds for dimension 0 with size 351
Do you know how to fix this error? It occurs on both the resume and weibo datasets.
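One way to read the IndexError above, as a sketch under assumptions: if the relative-position table holds 2 * max_seq_len + 1 rows (an assumption, though it matches the reported size 351, which would give max_seq_len = 175), then the lookup `pos_ss + self.max_seq_len` goes out of range as soon as a relative distance exceeds max_seq_len. The helper names below are hypothetical and do not exist in the repository.

```python
def table_size(max_seq_len):
    # Assumed layout: one row per relative distance in [-max_seq_len, max_seq_len].
    return 2 * max_seq_len + 1


def lookup_index(rel_pos, max_seq_len):
    # Mirrors the shift `pos_ss + self.max_seq_len` seen in the traceback.
    idx = rel_pos + max_seq_len
    size = table_size(max_seq_len)
    if not 0 <= idx < size:
        raise IndexError(
            f"index {idx} is out of bounds for dimension 0 with size {size}"
        )
    return idx


# Values inferred from the error message, under the assumed table layout.
max_seq_len = (351 - 1) // 2             # 175
offending_distance = 363 - max_seq_len   # 188
```

If that reading is right, some input spans more positions than max_seq_len allows, so the two things to check would be how max_seq_len is derived for the dataset and whether overly long sentences should be truncated or split.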
Traceback (most recent call last):
File "flat_main.py", line 313, in
only_train_min_freq=args.only_train_min_freq)
File "D:\develop\python\Anaconda3\lib\site-packages\fastNLP\core\utils.py", line 160, in wrapper
with open(cache_filepath, 'wb') as f:
OSError: [Errno 22] Invalid argument: 'cache\weibo_lattice_only_train:False_trainClip:True_norm_num:0char_min_freq1bigram_min_freq1word_min_freq1only_train_min_freqTruenumber_norm0lexicon_yjload_dataset_seed100'
Relevant code:
datasets,vocabs,embeddings = equip_chinese_ner_with_lexicon(datasets,vocabs,embeddings,
                                                            w_list,yangjie_rich_pretrain_word_path,
                                                            _refresh=refresh_data,_cache_fp=cache_name,
                                                            only_lexicon_in_train=args.only_lexicon_in_train,
                                                            word_char_mix_embedding_path=yangjie_rich_pretrain_char_and_word_path,
                                                            number_normalized=args.number_normalized,
                                                            lattice_min_freq=args.lattice_min_freq,
                                                            only_train_min_freq=args.only_train_min_freq)
This decorator is probably what produces the error.
@cache_results(_cache_fp='need_to_defined_fp',_refresh=True)
def equip_chinese_ner_with_lexicon(datasets,vocabs,embeddings,w_list,word_embedding_path=None,
                                   only_lexicon_in_train=False,word_char_mix_embedding_path=None,
                                   number_normalized=False,
                                   lattice_min_freq=1,only_train_min_freq=0):
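A possible workaround for the cache-file errors, offered as a sketch rather than the repository's own fix: the cache name in the message contains ':' characters, which Windows does not allow in file names, and the earlier FileNotFoundError may simply mean that the cache folder does not exist yet. The helper `safe_cache_path` below is hypothetical; it sanitizes the name and creates the folder before the path is handed to `_cache_fp`.

```python
import os
import re


def safe_cache_path(cache_dir, name):
    # Replace characters that Windows rejects in file names (':' among them).
    safe_name = re.sub(r'[<>:"/\\|?*]', "_", name)
    # Create the cache folder up front so open(path, 'wb') does not fail on it.
    os.makedirs(cache_dir, exist_ok=True)
    return os.path.join(cache_dir, safe_name)
```

The returned path could then be passed wherever the script builds cache_name, for example `cache_name = safe_cache_path('cache', raw_name)`, where raw_name stands for the string the script already constructs.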
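Is there corresponding prediction code?
Could the author publish the training parameters behind the MSRA results? norm = 0/1/2/3? Learning rate, batch size, and so on.
I was able to find these four in the README: gigaword_chn.all.a2b.uni.ite50.vec, gigaword_chn.all.a2b.bi.ite50.vec, ctb.50d.vec, and sgns.merge.word. Still missing is yangjie_word_char_mix.txt from above. I'm not sure whether I overlooked something; could you provide a link?
Where can I find the file sgns.merge.word?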
Originally posted by @Eason-zz in https://github.com/LeeSureman/Flat-Lattice-Transformer/issue_comments/705606773
Does the original dataset contain only the .train/.dev/.test files?
Traceback (most recent call last):
File "flat_main.py", line 254, in <module>
only_train_min_freq=args.only_train_min_freq,
File "/Users/user/.pyenv/versions/env-mkwPXnF--py3.7/lib/python3.7/site-packages/fastNLP/core/utils.py", line 344, in wrapper
results = func(*args, **kwargs)
File "../load_data.py", line 646, in load_weibo_ner
bundle = loader.load(v)
File "/Users/user/.pyenv/versions/env-mkwPXnF--py3.7/lib/python3.7/site-packages/fastNLP/io/loader/loader.py", line 68, in load
paths = check_loader_paths(paths)
File "/Users/user/.pyenv/versions/env-mkwPXnF--py3.7/lib/python3.7/site-packages/fastNLP/io/utils.py", line 63, in check_loader_paths
raise FileNotFoundError(f"{paths} is not a valid file path.")
FileNotFoundError: /Users/user/Downloads/Flat-Lattice-Transformer/V0/WeiboNER/weiboNER_2nd_conll.train_deseg is not a valid file path.
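Hello, have you measured the model's prediction speed? I am using a single Tesla P100, and prediction takes about 5 s per sample. Is that normal?
In the final experiments that combine the model with BERT, BERT_Tokenizer splits the sequence into characters, so how are the words matched from the lexicon fed into BERT?
With the current default parameters in flat_main.py, is the model structure the one described in "3.2 Relative Position Encoding of Spans" of the paper?
flat_main.py, line 116: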
parser.add_argument('--four_pos_fusion',default='ff_two',choices=['ff','attn','gate','ff_two','ff_linear'],
if self.four_pos_fusion == 'ff_two':
    pe_2 = torch.cat([pe_ss,pe_ee],dim=-1)
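My understanding is that only two of the four relative-position features ([pe_ss, pe_se, pe_es, pe_ee]) are used here, namely [pe_ss, pe_ee], whereas "3.2 Relative Position Encoding of Spans" in the paper uses all four. Am I misreading the code?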
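For reference, the four relative distances that Section 3.2 of the paper defines between two spans can be sketched as below. The mapping of the code's suffixes (ss, se, es, ee) to head/tail differences is an assumption based on the names, and the function and the span values are illustrative only.

```python
def relative_distances(span_i, span_j):
    # Each span is a (head, tail) pair of character positions in the sentence.
    head_i, tail_i = span_i
    head_j, tail_j = span_j
    return {
        "ss": head_i - head_j,  # start of i relative to start of j
        "se": head_i - tail_j,  # start of i relative to end of j
        "es": tail_i - head_j,  # end of i relative to start of j
        "ee": tail_i - tail_j,  # end of i relative to end of j
    }


# Illustrative spans: a word covering characters 0-1 and a word covering 2-5.
distances = relative_distances((0, 1), (2, 5))
```

With the quoted 'ff_two' branch, only the embeddings looked up for the "ss" and "ee" distances would be concatenated; the "se" and "es" entries would go unused in that setting.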
In load_data.py, lines 489-491 (function load_weibo_ner()):
for k,v in paths.items():
    bundle = loader.load(v)
    datasets[k] = bundle.datasets['train']
Why is all the data put into train? Shouldn't it go into dataset[train], dataset[dev], and dataset[test] respectively?
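One plausible reading of the loop, sketched with a stand-in loader: if loading a single file always stores the result under the key 'train' (an assumption about fastNLP's behavior, not verified here), then `bundle.datasets['train']` is just "the dataset read from this file", and the outer key k is what separates the splits. StubLoader, StubBundle, and the file names are hypothetical.

```python
class StubBundle:
    def __init__(self, datasets):
        self.datasets = datasets


class StubLoader:
    # Hypothetical stand-in: a single path is always stored under 'train'.
    def load(self, path):
        return StubBundle({"train": f"<dataset read from {path}>"})


# Hypothetical file names, one per split.
paths = {"train": "weibo.train", "dev": "weibo.dev", "test": "weibo.test"}

loader = StubLoader()
datasets = {}
for k, v in paths.items():
    bundle = loader.load(v)                  # one file per call
    datasets[k] = bundle.datasets["train"]   # inner key is fixed, outer key is the split
```

Under that assumption, datasets ends up with separate 'train', 'dev', and 'test' entries even though the inner key is always 'train'.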
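I read your paper and think it is a very good idea. I can't wait to look at the source code. Could you send a copy of the source code to my email in advance, for study purposes only? email: [email protected]
Thank you very much!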
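Hello, while reading the code I noticed that when class Layer_Process performs Add+Norm, the Add does not appear to be a residual connection. What is the reason?
The code is as follows: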
def forward(self, inp):
    output = inp
    for op in self.process_sequence:
        if op == 'a':
            output = output + inp
        elif op == 'd':
            output = self.dropout(output)
        elif op == 'n':
            output = self.layer_norm(output)
    return output
The add here amounts to input + input, not fun(input) + input. I see that TENER also uses a residual connection. Have I misunderstood something?
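The difference being asked about can be illustrated with plain lists in place of tensors. This is a minimal sketch of the two behaviors, not a statement about how the repository calls the layer; both function names are hypothetical.

```python
def layer_process_add(inp, process_sequence):
    # Follows the quoted forward(): 'a' adds the layer's own input to the running output.
    output = inp
    for op in process_sequence:
        if op == 'a':
            output = [o + i for o, i in zip(output, inp)]
    return output


def residual_add(x, sublayer):
    # Conventional residual connection: sublayer(x) + x.
    return [s + xi for s, xi in zip(sublayer(x), x)]


doubled = layer_process_add([1.0, 2.0], 'a')                       # input added to itself
skipped = residual_add([1.0, 2.0], lambda v: [3 * t for t in v])   # sublayer output plus input
```

When the value passed in is already the sublayer's output, the first form doubles it rather than adding the sublayer's original input, which is the behavior the question describes.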
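Hello, when I run flat_main.py, fastNLP_module reports "can not import name 'get_file_name_base_on_postfix'" from the fastNLP.modules.utils package. What is going on? I don't really understand the cause.
The lattice-LSTM structure shown in the paper differs a little from what I saw in Chinese NER Using Lattice LSTM, so I have a question. Following the construction method in Chinese NER Using Lattice LSTM, 重庆人和药店 should yield four words: [重庆, 重庆人, 人和药店, 药店]. How is the word 重庆人 removed? The paper only says "Some words in lattice may be important for NER." Could you explain how these important words are selected?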
Traceback (most recent call last):
File "flat_main.py", line 787, in
trainer.train()
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/fastNLP/core/trainer.py", line 613, in train
self.callback_manager.on_exception(e)
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/fastNLP/core/callback.py", line 309, in wrapper
returns.append(getattr(callback, func.__name__)(*arg))
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/fastNLP/core/callback.py", line 505, in on_exception
raise exception # 抛出陌生Error
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/fastNLP/core/trainer.py", line 609, in train
self._train()
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/fastNLP/core/trainer.py", line 664, in _train
prediction = self._data_forward(self.model, batch_x)
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/fastNLP/core/trainer.py", line 752, in _data_forward
y = network(**x)
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "../V1/models.py", line 440, in forward
bert_embed = self.bert_embedding(char_for_bert)
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "../fastNLP_module.py", line 389, in forward
outputs = self.model(words)
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "/raid/ztt/anaconda2/envs/p3/lib/python3.6/site-packages/fastNLP/embeddings/bert_embedding.py", line 339, in forward
"After split words into word pieces, the lengths of word pieces are longer than the "
RuntimeError: After split words into word pieces, the lengths of word pieces are longer than the maximum allowed sequence length:512 of bert. You can set auto_truncate=True
for BertEmbedding to automatically truncate overlong input.
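The FLAT model structure should be amenable to pre-training.
However, the author does not seem to have run experiments on this. I wonder why?
Hello, which dataset did you use in the paper: weiboNER.conll.train or weiboNER_2nd_conll.train?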
Traceback (most recent call last):
File "E:/yuan/Flat-Lattice-Transformer-master/V0/flat_main.py", line 306, in
only_train_min_freq=args.only_train_min_freq)
File "C:\ProgramData\Anaconda3\envs\flat\lib\site-packages\fastNLP\core\utils.py", line 160, in wrapper
with open(cache_filepath, 'wb') as f:
OSError: [Errno 22] Invalid argument: 'cache\resume_lattice_only_train:False_trainClip:True_norm_num:0char_min_freq1bigram_min_freq1word_min_freq1only_train_min_freqTruenumber_norm0lexicon_yjload_dataset_seed100'
How can this be solved?
This error is raised when calling fastNLP_module.py. The failing line is:
from fastNLP.modules.utils import _get_file_name_base_on_postfix
I looked in fastNLP.modules.utils and this function is indeed not there. Is it a fastNLP version issue, or something else?
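I keep getting RuntimeError: The size of tensor a (94) must match the size of tensor b (214) at non-singleton dimension 1
I found that running on multiple GPUs splits a dimension, which causes the mismatch. How should multi-GPU training be set up correctly?
After running the fla_python.py file in V0 as described in the README, this error appeared. What is going on?
Why can't the txt file generated by the preprocessing be opened? The editor stops responding as soon as I open it.
I can't find the ontonotes, msra, weibo, and resume datasets. Could you add dataset download links to the README?
Also, load_data.py and preprocess.py look incomplete. Are they not fully implemented yet?
Hello, to use the downloaded embeddings, do I modify the paths in path.py?
Which one exactly should I change? Sorry, I'm a beginner.
I'd like to ask about this part, which is not very clear to me: it seems that neither the code nor my own dataset actually uses the relative-position feature?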