Comments (8)
Hi @wjczf123, yes. If you remove lines #384-385 and #447 of learner.py, the code will skip fine-tuning on the meta-test support set.
from vert-papers.
Thanks for your reply.
I ran it once under the inter 5-way 1-shot setting and the results looked very bad.
2022-09-20 22:57:35 INFO: - span_f1 = 0.7218073781712385
2022-09-20 22:57:35 INFO: - span_p = 0.7370060346505719
2022-09-20 22:57:35 INFO: - span_r = 0.7072229140722269
2022-09-20 22:57:35 INFO: - type_f1 = 0.156973848019738
2022-09-20 22:57:35 INFO: - type_p = 0.156973848069738
2022-09-20 22:57:35 INFO: - type_r = 0.156973848069738
2022-09-20 22:57:35 INFO: - 9.445,9.063,9.250,73.701,70.722,72.181,15.697,15.697,15.697,0.000,0.000,0.000
I understand the performance should drop, but this seems far too low.
Sorry, I made a mistake earlier. You can't directly remove #447, because the type-classification stage relies on some logic there to generate the type embeddings. The solution is to keep #447 and change #165 to self.model.eval(). You may also need to remove #191-192.
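The suggested edits refer to specific lines of the authors' learner.py, which aren't reproduced here. As a rough, hypothetical sketch of the idea only (the TinyTagger model and meta_test_episode function below are illustrative, not the repo's code): the ablation skips the gradient-based fine-tuning on the meta-test support set but keeps the forward pass that builds type embeddings, and switching to eval() makes that forward pass deterministic by disabling dropout.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the real model; not the repo's architecture.
class TinyTagger(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Dropout(0.5))
        self.head = nn.Linear(8, 3)

    def forward(self, x):
        return self.head(self.encoder(x))

def meta_test_episode(model, support_x, support_y, query_x,
                      finetune=True, steps=5):
    if finetune:
        # Inner-loop fine-tuning on the support set (what the ablation removes).
        model.train()
        opt = torch.optim.SGD(model.parameters(), lr=0.1)
        for _ in range(steps):
            opt.zero_grad()
            loss = nn.functional.cross_entropy(model(support_x), support_y)
            loss.backward()
            opt.step()
    # eval() disables dropout, so the support/query forward passes used for
    # prediction (and, in the real code, for type embeddings) are deterministic.
    model.eval()
    with torch.no_grad():
        return model(query_x).argmax(dim=-1)

torch.manual_seed(0)
m = TinyTagger()
preds = meta_test_episode(m, torch.randn(4, 8), torch.tensor([0, 1, 2, 0]),
                          torch.randn(2, 8), finetune=False)
print(preds.shape)  # torch.Size([2])
```

With finetune=False the model is evaluated purely with its meta-trained weights, which is the ablation discussed in this thread.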
Thanks. The new result seems to be correct.
2022-09-24 20:43:14 INFO: - ***** Eval results inter-test *****
2022-09-24 20:43:14 INFO: - f1 = 0.6104350036041772
2022-09-24 20:43:14 INFO: - f1_threshold = 0.6133144703132174
2022-09-24 20:43:14 INFO: - loss = tensor(4.1757, device='cuda:0')
2022-09-24 20:43:14 INFO: - precision = 0.6232885601193933
2022-09-24 20:43:14 INFO: - precision_threshold = 0.6340790479672884
2022-09-24 20:43:14 INFO: - recall = 0.5981008717310069
2022-09-24 20:43:14 INFO: - recall_threshold = 0.5938667496886657
2022-09-24 20:43:14 INFO: - span_f1 = 0.7218073781712385
2022-09-24 20:43:14 INFO: - span_p = 0.7370060346505719
2022-09-24 20:43:14 INFO: - span_r = 0.7072229140722269
2022-09-24 20:43:14 INFO: - type_f1 = 0.8474159401741568
2022-09-24 20:43:14 INFO: - type_p = 0.8474159402241568
2022-09-24 20:43:14 INFO: - type_r = 0.8474159402241568
2022-09-24 20:43:14 INFO: - 62.329,59.810,61.044,73.701,70.722,72.181,84.742,84.742,84.742,63.408,59.387,61.331
Sorry to bother you again. Why is the 5-shot performance worse than 1-shot after ablating fine-tuning in meta-test?
For example, the F1 under inter 5-way 5-shot is about 54, while under 1-shot it is 61.04. Have you observed this phenomenon before? It doesn't seem normal. Thanks.
Hi @wjczf123, this may be reasonable, although we have not run the corresponding ablation on 5-shot. First, the 5-shot and 1-shot datasets cannot be compared directly: each is just a sampled subset of Few-NERD, though our results on inter 5-1 and inter 5-5 do suggest the 5-shot setting scores higher. Second, we found in our experiments that inter 5-5 and inter 10-5 need more fine-tuning steps in meta-test, so removing fine-tuning may hurt 5-shot more. Hope this helps.
Thanks. Hope you have a good day.