- 👋 Hi, I’m @king-yyf
- 👀 I’m interested in programming
- 🌱 I’m currently learning CP
- 💞️ I’m looking to collaborate on ...
- 📫 How to reach me ...
king-yyf / cmekg_tools
License: MIT License
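I saw the relation extraction part. If I want to train the word segmenter, what does the dataset look like? Thanks.
Hello, could you upload the front-end code?
Hi, when we tested the NER task we had no test set. Could you release a test set for the NER task so that we can run tests? Thank you very much.
I downloaded the model files from Baidu Cloud and updated the model path in medical_cws.py, but running medical_cws.py raises an error. How can this be fixed? The log follows: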
(base) ubuntu@ubuntu-test3:~/knowledgegraph/CMeKG_tools/CMeKG_tools-main$ python medical_cws.py
Some weights of the model checkpoint at /home/ubuntu/knowledgegraph/CMeKG_tools/CMeKG_tools-main/models/medical_cws were not used when initializing BertModel: ['cls.predictions.transform.dense.weight', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.bias']
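Hello author, for medical_re.py we currently load your trained model and test it on the train_example.json data. If we want to use our own dataset, how should we train our own model? Could you describe the training procedure? Thank you very much; we look forward to your reply.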
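Hello, I read your code carefully and would like to discuss two points about the RE part:
1. In extract_spoes(), L280-L291, I understand that the intent is this: when a single input text contains multiple subject spans, iterate over each of them and pass it to the model4po model as a mask that is added to hidden_state, so that extraction of the object and the relation attends only to that subject's start position, which removes the need for dependency parsing. However, this loop only ever picks up the first group. It merely appears to behave correctly because get_triples splits on "。", and a single sentence usually contains only one subject.
2. Likewise, in the definition of the model4po model, s appears to be filled directly into every valid token position, all_s[b, :cue_len, :] = s, so it cannot act as a mask over long text, and adding this step contributes nothing to training the second-stage po extraction.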
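The second point can be illustrated with a toy example. This sketch uses plain Python lists instead of tensors, and the helper names `fill_all_valid` and `fill_subject_span` are hypothetical; it only contrasts the two fill strategies described above and is not the repository's code.

```python
def fill_all_valid(seq_len, valid_len, s):
    """Copy subject vector s to every valid token position (the pattern described in point 2)."""
    return [list(s) if i < valid_len else [0.0] * len(s) for i in range(seq_len)]


def fill_subject_span(seq_len, start, end, s):
    """Copy s only to positions start..end (inclusive), so it marks where the subject is."""
    return [list(s) if start <= i <= end else [0.0] * len(s) for i in range(seq_len)]


s = [1.0, 2.0]  # made-up subject representation
all_valid = fill_all_valid(seq_len=6, valid_len=5, s=s)
span_only = fill_subject_span(seq_len=6, start=1, end=2, s=s)

# Count how many positions carry the subject vector in each variant.
count_all_valid = sum(row == s for row in all_valid)
count_span_only = sum(row == s for row in span_only)
print(count_all_valid, count_span_only)
```

In the first variant every valid token receives the same vector, so the added signal does not distinguish where the subject is; the second variant is one possible way to make it position-specific.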
Hello,
Could you please provide the specific version of transformers?
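Please!
I don't quite understand the model-loading part. I would also like to use your code to train on my own non-medical data. Is that possible? Would paid guidance be an option?
Excuse me, where can I get the train_data.json file mentioned in medical_re.py?
It won't open, whether I access it directly, through a VPN, or over mobile data.
Hello, I ran into the following error: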
Traceback (most recent call last):
File "medical_ner.py", line 184, in <module>
res = my_pred.predict_sentence(sentence)
File "medical_ner.py", line 103, in predict_sentence
self.model.load_state_dict(torch.load(self.NEWPATH, map_location=device))
File "/home/amax/.conda/envs/torch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1407, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for BERT_LSTM_CRF:
Missing key(s) in state_dict: "word_embeds.embeddings.position_ids".
Here is what I did:
self.NEWPATH = '/Users/yangyf/workplace/model/medical_ner/model.pkl'
self.vocab = load_vocab('/Users/yangyf/workplace/model/medical_ner/vocab.txt')
self.vocab_reverse = {v: k for k, v in self.vocab.items()}
self.model = BERT_LSTM_CRF('/Users/yangyf/workplace/model/medical_ner', tagset_size, 768, 200, 2,
dropout_ratio=0.5, dropout1=0.5, use_cuda=use_cuda)
medical_ner.py
I checked, and the current model requires the following parameters:
word_embeds.embeddings.position_ids torch.Size([1, 512])
word_embeds.embeddings.word_embeddings.weight torch.Size([21128, 768])
word_embeds.embeddings.position_embeddings.weight torch.Size([512, 768])
word_embeds.embeddings.token_type_embeddings.weight torch.Size([2, 768])
word_embeds.embeddings.LayerNorm.weight torch.Size([768])
word_embeds.embeddings.LayerNorm.bias torch.Size([768])
word_embeds.encoder.layer.0.attention.self.query.weight torch.Size([768, 768])
word_embeds.encoder.layer.0.attention.self.query.bias torch.Size([768])
word_embeds.encoder.layer.0.attention.self.key.weight torch.Size([768, 768])
word_embeds.encoder.layer.0.attention.self.key.bias torch.Size([768])
word_embeds.encoder.layer.0.attention.self.value.weight torch.Size([768, 768])
word_embeds.encoder.layer.0.attention.self.value.bias torch.Size([768])
word_embeds.encoder.layer.0.attention.output.dense.weight torch.Size([768, 768])
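... (omitted) ...
My guess is that the loaded checkpoint (that is, model.pkl) may not contain the word_embeds.embeddings.position_ids entry. Could you take a look when you have time? Is my procedure wrong, or is there a problem with the trained model checkpoint? Thank you!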
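One way to check this guess is to compare the parameter names the model expects with the names stored in the checkpoint. The sketch below uses plain Python sets and a hypothetical helper, `diff_state_dict_keys`; the two key lists are shortened stand-ins for what `model.state_dict().keys()` and the loaded checkpoint's `.keys()` would return, and the checkpoint contents shown are an assumption, not something confirmed in this thread.

```python
def diff_state_dict_keys(expected_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, each sorted alphabetically."""
    expected = set(expected_keys)
    found = set(checkpoint_keys)
    missing = sorted(expected - found)      # needed by the model, absent from the file
    unexpected = sorted(found - expected)   # present in the file, unknown to the model
    return missing, unexpected


# Shortened, illustrative key lists (assumed, not read from the real model.pkl).
expected_keys = [
    "word_embeds.embeddings.position_ids",
    "word_embeds.embeddings.word_embeddings.weight",
    "word_embeds.embeddings.position_embeddings.weight",
]
checkpoint_keys = [
    "word_embeds.embeddings.word_embeddings.weight",
    "word_embeds.embeddings.position_embeddings.weight",
]

missing, unexpected = diff_state_dict_keys(expected_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

If `position_ids` turns out to be the only missing entry, a commonly used workaround in PyTorch is `load_state_dict(state_dict, strict=False)`, which tolerates missing keys. Whether that is appropriate here depends on the installed transformers version, which this thread does not state.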
Hello, which version of torch is this? And how do I open the pkl file? When I parse it with Python I get a sequence of int values.
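A possible explanation for the integers, assuming the file was read as raw bytes: iterating over a `bytes` object in Python yields `int` values, so a serialized file has to be deserialized rather than read byte by byte. The sketch below shows this with the standard `pickle` module and an in-memory stand-in for the file; the dictionary contents are made up. For a checkpoint saved by PyTorch, the usual loader is `torch.load(path, map_location="cpu")` rather than plain `pickle`.

```python
import pickle

# Made-up stand-in for a saved object; a real checkpoint would hold tensors.
original = {"layer.weight": [0.1, 0.2], "layer.bias": [0.0]}
raw = pickle.dumps(original)

# Iterating over raw bytes yields ints, one possible source of the
# "numbers of type int" described above.
first_bytes = list(raw[:4])
print(first_bytes)

# Deserializing restores the original object.
restored = pickle.loads(raw)
print(restored)
```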
Hello, this website is down: http://cmekg.pcl.ac.cn/
Thanks a lot, it ran once I made the change!
Can the usage examples in medical_cws.py and medical_ner.py run successfully?
Originally posted by @gouyulang in #8 (comment)
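The website http://cmekg.pcl.ac.cn/ cannot be opened, so the knowledge graph cannot be retrieved.
I would like to do research on knowledge graph construction. May I use your CMeKG dataset? I will cite the source in any later work.
Can this pkl file be left empty so that I can train my own model? Setting it to null raises an error. Is there a way around this?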