piqn's Issues

Confusion about a formula in the paper

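Hello, I would like to ask about a formula in the paper. In the paper, Sti, i.e. "The boundary-aware representation", is obtained by concatenating three vectors of size 1*h, so its size is 3h. But the formula below multiplies Sti by a 1*h weight matrix and adds a bias. Shouldn't that produce a vector of shape (3,)? It looks as though the result here is supposed to be a single number.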
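The dimension question can be checked with a small sketch. It assumes, purely for illustration, that the weight has 3h entries (matching the concatenated size) rather than h; this is an assumption about what would make the shapes agree, not a statement about the paper's definition. The hidden size `h` and the names `part_a`, `part_b`, `part_c` are hypothetical.

```python
h = 4  # hypothetical hidden size, not taken from the paper

# Three vectors of size h, standing in for the three concatenated parts.
part_a = [0.1] * h
part_b = [0.2] * h
part_c = [0.3] * h

# Concatenation yields one vector of length 3h.
s = part_a + part_b + part_c

# Assumption: a weight with 3h entries, so the dot product is defined.
w = [0.5] * (3 * h)
b = 0.1

# Dot product plus bias gives a single number (a scalar score).
score = sum(wi * si for wi, si in zip(w, s)) + b
print(len(s), score)
```

Under this assumption the result is one number per token. A weight with only h entries could not be multiplied with a 3h vector without first reshaping it to (3, h), which is the case that would give a (3,) result.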

[matrix contains invalid numeric entries] Hungarian algorithm optimization

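Hello authors, we really like your work on locate and label and piqn. While reproducing it we ran into the problem below. If you have encountered something similar, could you share the solution?
Due to hardware limitations, we used the latest version of torchtransformer. We hope this does not affect the results.
During reproduction we hit the following error. After searching GitHub, CSDN, and Stack Overflow, we found that it is a problem in the Hungarian algorithm optimization.
The error occurs in a very late epoch, which makes debugging very difficult for us.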

Process SpawnProcess-1:
Traceback (most recent call last):
  File "/home/kk/anaconda3/envs/piqn/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/kk/anaconda3/envs/piqn/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/kk/piqn/piqn.py", line 15, in __train
    trainer.train(train_path=run_args.train_path[:-5]+ '_' + run_args.index + '.json',
  File "/home/kk/piqn/piqn/piqn_trainer.py", line 188, in train
    self._train_epoch(model, compute_loss, optimizer, train_dataset, updates_epoch, epoch)
  File "/home/kk/piqn/piqn/piqn_trainer.py", line 285, in _train_epoch
    batch_loss = compute_loss.compute(entity_logits, p_left, p_right, output, gt_types=batch['gt_types'], gt_spans = batch['gt_spans'], entity_masks=batch['entity_masks'], epoch = epoch,  deeply_weight = args.deeply_weight, seq_logits = masked_seq_logits, gt_seq_labels=batch['gt_seq_labels'], batch = batch)
  File "/home/kk/piqn/piqn/loss.py", line 59, in compute
    loss_dict = self.criterion(outputs, targets, epoch)
  File "/home/kk/anaconda3/envs/piqn/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/kk/piqn/piqn/loss.py", line 261, in forward
    indices = self.matcher(outputs, targets)
  File "/home/kk/anaconda3/envs/piqn/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/kk/anaconda3/envs/piqn/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/kk/piqn/piqn/matcher.py", line 76, in forward
    indices = [linear_sum_assignment(c[i]) for i, c in enumerate(C.split(sizes, -1))]
  File "/home/kk/piqn/piqn/matcher.py", line 76, in <listcomp>
    indices = [linear_sum_assignment(c[i]) for i, c in enumerate(C.split(sizes, -1))]
  File "/home/kk/anaconda3/envs/piqn/lib/python3.8/site-packages/scipy/optimize/_lsap.py", line 93, in linear_sum_assignment
    raise ValueError("matrix contains invalid numeric entries")
ValueError: matrix contains invalid numeric entries

Reference links:
stack overflow
scipy noticed this problem and seems to have improved it, but the error still occurs
scipy issue
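One way to narrow this down is to inspect the cost matrix for non-finite values right before it is passed to `linear_sum_assignment`. The sketch below works on plain nested lists so that it runs without extra dependencies. The helper names and the `fill` value are hypothetical and not part of the PIQN code, and replacing bad entries only hides the symptom: if NaN appears late in training, the underlying cause (for example a diverging loss) still needs to be found.

```python
import math


def find_invalid_entries(cost):
    """Return (row, col, value) for every NaN or infinite entry."""
    return [
        (i, j, value)
        for i, row in enumerate(cost)
        for j, value in enumerate(row)
        if not math.isfinite(value)
    ]


def sanitize_cost_matrix(cost, fill=1e6):
    """Return a copy with non-finite entries replaced by a large finite cost.

    `fill` is a hypothetical placeholder, not a value from the PIQN code.
    """
    return [[v if math.isfinite(v) else fill for v in row] for row in cost]


# Toy cost matrix with one NaN and one infinite entry.
cost = [[0.2, float("nan")], [float("inf"), 0.1]]
bad = find_invalid_entries(cost)
clean = sanitize_cost_matrix(cost)
print(bad)
print(clean)
```

Logging `bad` together with the epoch and batch index when it is non-empty would show which batch first produces the invalid entries.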

OntoNotes 5.0 Version

Dear authors,
thanks a lot for the nice work and the git repository.

I have access to the original/raw OntoNotes 5.0 data, yet it is not clear to me how the preprocessing was done on your side.

  1. Your repository offers this link for preprocessing; it mentions two versions, v4 and v12. May I know which one you are using?
  2. According to the paper, it seems that you use neither v4 nor v12. Instead, you seem to use the flat version of OntoNotes 5.0. If so, may I know the reason for that? Also, I don't know how to access the flat version of the OntoNotes 5.0 dataset, even in their official repository.
  3. In general, do you know which OntoNotes version we should use: v4, v12, or flat?

Thanks a lot!

[multiprocessing]

Traceback (most recent call last):
  File "/home/deeplearning/anaconda3/envs/acl/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/deeplearning/anaconda3/envs/acl/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/data/kk/GEEK/piqn/piqn.py", line 14, in __train
    trainer.train(train_path=run_args.train_path, valid_path=run_args.valid_path,
  File "/data/kk/GEEK/piqn/piqn/piqn_trainer.py", line 164, in train
    if args.local_rank != -1:
  File "/home/deeplearning/anaconda3/envs/acl/lib/python3.8/site-packages/torch/nn/modules/module.py", line 612, in to
    return self._apply(convert)
  File "/home/deeplearning/anaconda3/envs/acl/lib/python3.8/site-packages/torch/nn/modules/module.py", line 359, in _apply
    module._apply(fn)
  File "/home/deeplearning/anaconda3/envs/acl/lib/python3.8/site-packages/torch/nn/modules/module.py", line 359, in _apply
    module._apply(fn)
  File "/home/deeplearning/anaconda3/envs/acl/lib/python3.8/site-packages/torch/nn/modules/module.py", line 359, in _apply
    module._apply(fn)
  File "/home/deeplearning/anaconda3/envs/acl/lib/python3.8/site-packages/torch/nn/modules/module.py", line 381, in _apply
    param_applied = fn(param)
  File "/home/deeplearning/anaconda3/envs/acl/lib/python3.8/site-packages/torch/nn/modules/module.py", line 610, in convert
    return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
  File "/home/deeplearning/anaconda3/envs/acl/lib/python3.8/site-packages/torch/cuda/__init__.py", line 163, in _lazy_init
    raise RuntimeError(
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
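The error message itself points to the 'spawn' start method. A minimal sketch of requesting it through the standard library is shown below; the helper names `make_spawn_context` and `run_in_spawned_process` are hypothetical and are not taken from the PIQN code.

```python
import multiprocessing as mp


def make_spawn_context():
    """Return a multiprocessing context that starts children with 'spawn'."""
    return mp.get_context("spawn")


def run_in_spawned_process(target, *args):
    """Run `target(*args)` in a spawned child process and return its exit code.

    `target` must be a picklable, module-level function.
    """
    ctx = make_spawn_context()
    process = ctx.Process(target=target, args=args)
    process.start()
    process.join()
    return process.exitcode


ctx = make_spawn_context()
print(ctx.get_start_method())  # spawn
```

With 'spawn', the child starts a fresh interpreter instead of inheriting the parent's state, so `run_in_spawned_process` should be called from code guarded by `if __name__ == "__main__":`.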

Different scores between train and eval mode on the same split

Hello,
Thank you for sharing your model. We tried PIQN on another dataset. During training, we got the following evaluation:

2023-07-08 14:00:22,582 [MainThread  ] [INFO ]  Evaluate: valid
2023-07-08 14:02:53,269 [MainThread  ] [INFO ]  Evaluation
2023-07-08 14:02:53,270 [MainThread  ] [INFO ]  
2023-07-08 14:02:53,270 [MainThread  ] [INFO ]  --- NER ---
2023-07-08 14:02:53,270 [MainThread  ] [INFO ]  
2023-07-08 14:02:53,327 [MainThread  ] [INFO ]                  type    precision       recall     f1-score      support
2023-07-08 14:02:53,327 [MainThread  ] [INFO ]                   GPE        95.90        96.66        96.28         2154
2023-07-08 14:02:53,327 [MainThread  ] [INFO ]                 EVENT        82.68        78.95        80.77          266
2023-07-08 14:02:53,327 [MainThread  ] [INFO ]              CARDINAL        87.91        88.89        88.40          180
2023-07-08 14:02:53,327 [MainThread  ] [INFO ]               PERCENT        92.86       100.00        96.30           13
2023-07-08 14:02:53,327 [MainThread  ] [INFO ]               WEBSITE        60.47        57.78        59.09           45
2023-07-08 14:02:53,327 [MainThread  ] [INFO ]              LANGUAGE        82.35        93.33        87.50           15
2023-07-08 14:02:53,327 [MainThread  ] [INFO ]               ORDINAL        95.04        96.18        95.61          498
2023-07-08 14:02:53,327 [MainThread  ] [INFO ]                  DATE        95.77        95.89        95.83         1653
2023-07-08 14:02:53,327 [MainThread  ] [INFO ]                   OCC        86.50        89.19        87.83          546
2023-07-08 14:02:53,327 [MainThread  ] [INFO ]                   ORG        93.85        94.15        94.00         1830
2023-07-08 14:02:53,327 [MainThread  ] [INFO ]                  PERS        93.65        95.59        94.61          725
2023-07-08 14:02:53,328 [MainThread  ] [INFO ]               PRODUCT        60.00        60.00        60.00            5
2023-07-08 14:02:53,328 [MainThread  ] [INFO ]                  CURR       100.00       100.00       100.00           21
2023-07-08 14:02:53,328 [MainThread  ] [INFO ]                  UNIT       100.00       100.00       100.00            3
2023-07-08 14:02:53,328 [MainThread  ] [INFO ]                  TIME        79.55        66.04        72.16           53
2023-07-08 14:02:53,328 [MainThread  ] [INFO ]                  NORP        76.37        79.84        78.07          506
2023-07-08 14:02:53,328 [MainThread  ] [INFO ]                 MONEY        89.47        85.00        87.18           20
2023-07-08 14:02:53,328 [MainThread  ] [INFO ]                   LAW        89.13        93.18        91.11           44
2023-07-08 14:02:53,328 [MainThread  ] [INFO ]                   FAC        74.38        81.08        77.59          111
2023-07-08 14:02:53,328 [MainThread  ] [INFO ]                   LOC        81.08        78.95        80.00           76
2023-07-08 14:02:53,328 [MainThread  ] [INFO ]              QUANTITY       100.00       100.00       100.00            3
2023-07-08 14:02:53,328 [MainThread  ] [INFO ]  
2023-07-08 14:02:53,328 [MainThread  ] [INFO ]                 micro        92.14        92.95        92.54         8767
2023-07-08 14:02:53,328 [MainThread  ] [INFO ]                 macro        86.52        87.18        86.78         8767
2023-07-08 14:02:53,328 [MainThread  ] [INFO ]  
2023-07-08 14:02:53,328 [MainThread  ] [INFO ]  --- NER on Localization ---
2023-07-08 14:02:53,328 [MainThread  ] [INFO ]  
2023-07-08 14:02:53,373 [MainThread  ] [INFO ]                  type    precision       recall     f1-score      support
2023-07-08 14:02:53,374 [MainThread  ] [INFO ]                Entity        93.17        94.02        93.59         8764
2023-07-08 14:02:53,374 [MainThread  ] [INFO ]  
2023-07-08 14:02:53,374 [MainThread  ] [INFO ]                 micro        93.17        94.02        93.59         8764
2023-07-08 14:02:53,374 [MainThread  ] [INFO ]                 macro        93.17        94.02        93.59         8764
2023-07-08 14:02:53,374 [MainThread  ] [INFO ]  
2023-07-08 14:02:53,374 [MainThread  ] [INFO ]  --- NER on Classification ---
2023-07-08 14:02:53,374 [MainThread  ] [INFO ]  
2023-07-08 14:02:53,418 [MainThread  ] [INFO ]                  type    precision       recall     f1-score      support
2023-07-08 14:02:53,418 [MainThread  ] [INFO ]                   GPE        98.53        96.66        97.59         2154
2023-07-08 14:02:53,418 [MainThread  ] [INFO ]                 EVENT       100.00        78.95        88.24          266
2023-07-08 14:02:53,418 [MainThread  ] [INFO ]              CARDINAL        98.16        88.89        93.29          180
2023-07-08 14:02:53,418 [MainThread  ] [INFO ]               PERCENT        92.86       100.00        96.30           13
2023-07-08 14:02:53,418 [MainThread  ] [INFO ]               WEBSITE        86.67        57.78        69.33           45
2023-07-08 14:02:53,418 [MainThread  ] [INFO ]              LANGUAGE        82.35        93.33        87.50           15
2023-07-08 14:02:53,418 [MainThread  ] [INFO ]               ORDINAL        98.76        96.18        97.46          498
2023-07-08 14:02:53,418 [MainThread  ] [INFO ]                  DATE        99.81        95.89        97.81         1653
2023-07-08 14:02:53,418 [MainThread  ] [INFO ]                   OCC        99.59        89.19        94.11          546
2023-07-08 14:02:53,418 [MainThread  ] [INFO ]                   ORG        98.80        94.15        96.42         1830
2023-07-08 14:02:53,418 [MainThread  ] [INFO ]                  PERS        99.43        95.59        97.47          725
2023-07-08 14:02:53,418 [MainThread  ] [INFO ]               PRODUCT       100.00        60.00        75.00            5
2023-07-08 14:02:53,418 [MainThread  ] [INFO ]                  CURR       100.00       100.00       100.00           21
2023-07-08 14:02:53,418 [MainThread  ] [INFO ]                  UNIT       100.00       100.00       100.00            3
2023-07-08 14:02:53,418 [MainThread  ] [INFO ]                  TIME        97.22        66.04        78.65           53
2023-07-08 14:02:53,418 [MainThread  ] [INFO ]                  NORP        99.26        79.84        88.50          506
2023-07-08 14:02:53,418 [MainThread  ] [INFO ]                 MONEY       100.00        85.00        91.89           20
2023-07-08 14:02:53,418 [MainThread  ] [INFO ]                   LAW       100.00        93.18        96.47           44
2023-07-08 14:02:53,418 [MainThread  ] [INFO ]                   FAC        92.78        81.08        86.54          111
2023-07-08 14:02:53,418 [MainThread  ] [INFO ]                   LOC        96.77        78.95        86.96           76
2023-07-08 14:02:53,418 [MainThread  ] [INFO ]              QUANTITY       100.00       100.00       100.00            3
2023-07-08 14:02:53,419 [MainThread  ] [INFO ]  
2023-07-08 14:02:53,419 [MainThread  ] [INFO ]                 micro        98.90        92.95        95.83         8767
2023-07-08 14:02:53,419 [MainThread  ] [INFO ]                 macro        97.19        87.18        91.41         8767
2023-07-08 14:03:15,726 [MainThread  ] [INFO ]  Best F1 score update, from 92.49047808538458 to 92.5444324569871
2023-07-08 14:03:22,836 [MainThread  ] [INFO ]  Best F1 score: 92.5444324569871, achieved at Epoch: 91

However, upon running the evaluation script, using the best_model, the exact same parameters and the exact same data (our val.json file), we get the following:

2023-07-24 18:04:43,382 [MainThread  ] [INFO ]  Evaluation
2023-07-24 18:04:43,382 [MainThread  ] [INFO ]  
2023-07-24 18:04:43,382 [MainThread  ] [INFO ]  --- NER ---
2023-07-24 18:04:43,382 [MainThread  ] [INFO ]  
2023-07-24 18:04:43,451 [MainThread  ] [INFO ]                  type    precision       recall     f1-score      support
2023-07-24 18:04:43,451 [MainThread  ] [INFO ]              QUANTITY       100.00       100.00       100.00            3
2023-07-24 18:04:43,451 [MainThread  ] [INFO ]                   OCC        85.61        84.98        85.29          546
2023-07-24 18:04:43,451 [MainThread  ] [INFO ]                 EVENT        84.21        78.20        81.09          266
2023-07-24 18:04:43,451 [MainThread  ] [INFO ]              LANGUAGE        82.35        93.33        87.50           15
2023-07-24 18:04:43,451 [MainThread  ] [INFO ]               WEBSITE        60.47        57.78        59.09           45
2023-07-24 18:04:43,451 [MainThread  ] [INFO ]                  NORP        75.98        76.88        76.42          506
2023-07-24 18:04:43,451 [MainThread  ] [INFO ]                  PERS        92.38        90.34        91.35          725
2023-07-24 18:04:43,451 [MainThread  ] [INFO ]               PRODUCT        75.00        60.00        66.67            5
2023-07-24 18:04:43,451 [MainThread  ] [INFO ]                   LAW        90.70        88.64        89.66           44
2023-07-24 18:04:43,451 [MainThread  ] [INFO ]                   FAC        73.91        76.58        75.22          111
2023-07-24 18:04:43,451 [MainThread  ] [INFO ]               PERCENT        92.86       100.00        96.30           13
2023-07-24 18:04:43,451 [MainThread  ] [INFO ]              CARDINAL        86.11        86.11        86.11          180
2023-07-24 18:04:43,451 [MainThread  ] [INFO ]                   LOC        79.41        71.05        75.00           76
2023-07-24 18:04:43,451 [MainThread  ] [INFO ]                 MONEY        84.21        80.00        82.05           20
2023-07-24 18:04:43,451 [MainThread  ] [INFO ]                   GPE        94.38        80.32        86.78         2154
2023-07-24 18:04:43,451 [MainThread  ] [INFO ]                  TIME        75.56        64.15        69.39           53
2023-07-24 18:04:43,452 [MainThread  ] [INFO ]                  CURR       100.00       100.00       100.00           21
2023-07-24 18:04:43,452 [MainThread  ] [INFO ]                  UNIT       100.00       100.00       100.00            3
2023-07-24 18:04:43,452 [MainThread  ] [INFO ]                  DATE        96.09        93.77        94.92         1653
2023-07-24 18:04:43,452 [MainThread  ] [INFO ]               ORDINAL        95.07        92.97        94.01          498
2023-07-24 18:04:43,452 [MainThread  ] [INFO ]                   ORG        91.50        73.50        81.52         1830
2023-07-24 18:04:43,452 [MainThread  ] [INFO ]  
2023-07-24 18:04:43,452 [MainThread  ] [INFO ]                 micro        91.01        82.92        86.78         8767
2023-07-24 18:04:43,452 [MainThread  ] [INFO ]                 macro        86.47        83.27        84.68         8767
2023-07-24 18:04:43,452 [MainThread  ] [INFO ]  
2023-07-24 18:04:43,452 [MainThread  ] [INFO ]  --- NER on Localization ---
2023-07-24 18:04:43,452 [MainThread  ] [INFO ]  
2023-07-24 18:04:43,496 [MainThread  ] [INFO ]                  type    precision       recall     f1-score      support
2023-07-24 18:04:43,496 [MainThread  ] [INFO ]                Entity        92.14        83.98        87.87         8764
2023-07-24 18:04:43,496 [MainThread  ] [INFO ]  
2023-07-24 18:04:43,496 [MainThread  ] [INFO ]                 micro        92.14        83.98        87.87         8764
2023-07-24 18:04:43,496 [MainThread  ] [INFO ]                 macro        92.14        83.98        87.87         8764
2023-07-24 18:04:43,496 [MainThread  ] [INFO ]  
2023-07-24 18:04:43,496 [MainThread  ] [INFO ]  --- NER on Classification ---
2023-07-24 18:04:43,496 [MainThread  ] [INFO ]  
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]                  type    precision       recall     f1-score      support
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]              QUANTITY       100.00       100.00       100.00            3
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]                   OCC        99.57        84.98        91.70          546
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]                 EVENT       100.00        78.20        87.76          266
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]              LANGUAGE        82.35        93.33        87.50           15
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]               WEBSITE        86.67        57.78        69.33           45
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]                  NORP        98.98        76.88        86.54          506
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]                  PERS        99.39        90.34        94.65          725
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]               PRODUCT       100.00        60.00        75.00            5
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]                   LAW       100.00        88.64        93.98           44
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]                   FAC        92.39        76.58        83.74          111
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]               PERCENT        92.86       100.00        96.30           13
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]              CARDINAL        98.10        86.11        91.72          180
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]                   LOC        96.43        71.05        81.82           76
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]                 MONEY       100.00        80.00        88.89           20
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]                   GPE        98.13        80.32        88.33         2154
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]                  TIME        97.14        64.15        77.27           53
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]                  CURR       100.00       100.00       100.00           21
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]                  UNIT       100.00       100.00       100.00            3
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]                  DATE        99.87        93.77        96.72         1653
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]               ORDINAL        98.93        92.97        95.86          498
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]                   ORG        98.61        73.50        84.22         1830
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]  
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]                 micro        98.78        82.92        90.16         8767
2023-07-24 18:04:43,537 [MainThread  ] [INFO ]                 macro        97.12        83.27        89.11         8767

In short, the micro NER f1 score went from 92.54 to 86.78.
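Both micro F1 values follow from the logged micro precision and recall as their harmonic mean. The check below uses only numbers from the two logs above and suggests that the drop comes mostly from recall (92.95 to 82.92), while precision changes much less (92.14 to 91.01).

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Micro precision and recall from the "--- NER ---" tables in the two logs.
during_training = f1_score(92.14, 92.95)
eval_script = f1_score(91.01, 82.92)

print(f"{during_training:.2f} {eval_script:.2f}")  # 92.54 86.78
```

Since the logged precision and recall are already rounded, the recomputed values agree with the logged F1 only to about two decimal places.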

We also get the following warning:
[Image: screenshot of the warning]

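Some questions about the entity pointing stage

First of all, thank you for such excellent work. I have a few questions.
The paper states that each query can point to one entity, that is, to the left and right boundaries of a span. How does the model guarantee that each query always selects both a left and a right boundary? My understanding is that each query queries every token to obtain a p, and the two tokens with the largest p values are then chosen as the boundaries for that query. Is this understanding correct?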
