pluslabnlp / eventplus

[NAACL'21 Demo] EventPlus: a temporal event understanding pipeline that integrates various state-of-the-art event understanding components including event trigger and type detection, event argument detection, event duration and temporal relation extraction

Home Page: https://kairos-event.isi.edu

License: Apache License 2.0

Languages: Python 80.25%, Shell 0.08%, Jupyter Notebook 16.51%, CSS 0.26%, JavaScript 1.84%, HTML 1.06%
Topics: event-extraction, temporal-relation, nlp-toolkit, event-duration, temporal-event-detection, ace2005

eventplus's Introduction

[NAACL'21] EventPlus: A Temporal Event Understanding Pipeline

This is the codebase for the system demo EventPlus: A Temporal Event Understanding Pipeline in NAACL 2021.

Please refer to our paper for details. [PDF] [Talk] [Demo]

Quick Start

0 - Clone the codebase with all submodules

git clone --recurse-submodules https://github.com/PlusLabNLP/EventPlus.git
# or use the following commands
git clone https://github.com/PlusLabNLP/EventPlus.git
cd EventPlus
git submodule init
git submodule update

1 - Environment Installation

Change the prefix (last line) of env.yml to match your path, then run:

conda env create -f env.yml
conda activate event-pipeline
pip install https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.2.4/en_ner_jnlpba_md-0.2.4.tar.gz
python -m spacy download en_core_web_sm
pip install git+https://github.com/hltcoe/PredPatt.git

2 - Download trained model for components

For component/BETTER module, download the trained model [Link], unzip and place it under component/BETTER/joint/worked_model_ace.

For component/TempRel module, download the trained model [Link], unzip and place it under component/TempRel/models.

For component/Duration module, download the scripts zip file [Link], unzip and place it under component/Duration/scripts.

For component/NegationDetection module, download the trained model [Link], unzip and place it under component/NegationDetection/models.
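
Before moving on, it can help to confirm that each archive landed in the right place. The following is a minimal sketch, not part of the EventPlus codebase: the function name `missing_component_dirs` is hypothetical, and it only checks that the four directories named in this step exist and are non-empty, relative to the repository root.

```python
from pathlib import Path

# Directories named in step 2 of the Quick Start, relative to the repository root.
EXPECTED_DIRS = [
    "component/BETTER/joint/worked_model_ace",
    "component/TempRel/models",
    "component/Duration/scripts",
    "component/NegationDetection/models",
]


def missing_component_dirs(repo_root):
    """Return the expected directories that are absent or empty under repo_root.

    Hypothetical helper for illustration; it does not validate file contents.
    """
    root = Path(repo_root)
    missing = []
    for rel in EXPECTED_DIRS:
        path = root / rel
        # A directory that exists but holds nothing usually means the archive
        # has not been unzipped into it yet.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    # Run from the repository root.
    for rel in missing_component_dirs("."):
        print(f"missing or empty: {rel}")
```

An empty result means all four locations contain at least one file; it does not confirm that the files are the correct model versions.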

3 - In background: Run REST API for event duration detection module for faster processing

# (optional) start a tmux session
tmux new -s duration_rest_api
conda activate event-pipeline
cd component/REST_service
python main.py
# (optional) exit tmux window

4 - Application 1: Raw Text Annotation. The input is a multi-line raw text file; the output pickle and JSON files are saved to the designated paths.

cd YOUR_PREFERRED_PATH/project
python APIs/test_on_raw_text.py -data YOUR_RAW_TEXT_FILE -save_path SAVE_PICKLE_PATH -save_path_json SAVE_JSON_PATH -negation_detection
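
The layout of the saved annotations is not documented here, so a first step after running the command is often just to look at what was written. The sketch below makes no assumption about the schema: `describe` and `load_outputs` are hypothetical helpers that load the files passed as -save_path and -save_path_json and summarize their nesting. Unpickling may require the project's modules to be importable if the pickle stores custom classes.

```python
import json
import pickle


def describe(obj, max_depth=2, _depth=0):
    """Return a short, nested description of obj's container structure."""
    if isinstance(obj, dict):
        if _depth >= max_depth:
            return f"dict({len(obj)} keys)"
        return {k: describe(v, max_depth, _depth + 1) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        if not obj or _depth >= max_depth:
            return f"{type(obj).__name__}({len(obj)} items)"
        # Describe only the first element and report the total count.
        return [describe(obj[0], max_depth, _depth + 1),
                f"... {len(obj)} items total"]
    return type(obj).__name__


def load_outputs(pickle_path, json_path):
    """Load the pickle and JSON files written by the annotation script."""
    with open(pickle_path, "rb") as f:
        from_pickle = pickle.load(f)
    with open(json_path, encoding="utf-8") as f:
        from_json = json.load(f)
    return from_pickle, from_json
```

Calling `describe` on each object returned by `load_outputs(SAVE_PICKLE_PATH, SAVE_JSON_PATH)` gives a compact outline of the keys and list sizes without printing the full annotation.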

5 - Application 2: Web App for Interaction and Visualization. This starts a web app where the user can enter a piece of text and get the annotation results and a visualization.

cd YOUR_PREFERRED_PATH/project
tmux new -s serve
python manage.py runserver 8080

Components

The code for data processing and for integrating the different components is in project/APIs/main.py. Please refer to the README file of each component for more details about training and inference.

1- Event Extraction on ACE Ontology: component/BETTER

2- Joint Event Trigger and Temporal Relation Extraction: component/TempRel for inference, this codebase for training

3- Event Duration Detection: component/Duration

4- Negation and Speculation Cue Detection and Scope Resolution: component/NegationDetection

5- Biomedical Event Extraction: component/BioMedEventEx for inference, this codebase for training

Quick Start with ISI shared NAS

If you are using the system on a machine with access to the ISI shared NAS, you can activate the existing environment, copy the code, and start using it right away.

# 1 - Environment Installation: Activate existing environment
conda activate /nas/home/mingyuma/miniconda3/envs/event-pipeline-dev

# 2 - Prepare Components (Submodules): Copy the whole codebase
cp -R /nas/home/mingyuma/event-pipeline/event-pipeline-dev YOUR_PREFERRED_PATH

# 3 - In background: Run REST API for event duration detection module for faster processing
# (optional) start a tmux session
tmux new -s duration_rest_api
conda activate /nas/home/mingyuma/miniconda3/envs/event-pipeline-dev
cd component/REST_service
python main.py
# (optional) exit tmux window

# To use it for raw text annotation or the web app, please follow steps 4 and 5 in the Quick Start section.

Deployment as Web Service

Here are instructions for deploying the web application on a server.

Set up web server

pip install uwsgi

If you hit an error like "error while loading shared libraries: libssl.so.1.1", refer to this link and run the following:

export LD_LIBRARY_PATH=/nas/home/mingyuma/miniconda3/envs/event-pipeline/lib:$LD_LIBRARY_PATH

Server port setting

External port: 443 (for HTTPS)

Django will forward traffic from port 443 to internal port 8080.

Internal ports

  • 8080: runs the Django main process
  • 17000: runs the duration service (only if a REST API is run for the duration module; the newer version does not need such a separate service)
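
When debugging a deployment, it is useful to confirm which of these internal ports actually has a process listening. This is a minimal sketch using only the Python standard library; the helper name `is_port_open` and the choice of "localhost" as the host are assumptions, while the port numbers come from the list above.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Port numbers from the list above; "localhost" is an assumption.
    for name, port in [("Django main process", 8080), ("duration service", 17000)]:
        state = "listening" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} (port {port}): {state}")
```

A successful connection only shows that something accepts TCP connections on the port, not that it is the expected service.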

Citation

@inproceedings{ma-etal-2021-eventplus,
    title = "{E}vent{P}lus: A Temporal Event Understanding Pipeline",
    author = "Ma, Mingyu Derek  and
      Sun, Jiao  and
      Yang, Mu  and
      Huang, Kung-Hsiang  and
      Wen, Nuan  and
      Singh, Shikhar  and
      Han, Rujun  and
      Peng, Nanyun",
    booktitle = "Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Demonstrations",
    month = jun,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2021.naacl-demos.7",
    pages = "56--65",
    abstract = "We present EventPlus, a temporal event understanding pipeline that integrates various state-of-the-art event understanding components including event trigger and type detection, event argument detection, event duration and temporal relation extraction. Event information, especially event temporal knowledge, is a type of common sense knowledge that helps people understand how stories evolve and provides predictive hints for future events. EventPlus as the first comprehensive temporal event understanding pipeline provides a convenient tool for users to quickly obtain annotations about events and their temporal information for any user-provided document. Furthermore, we show EventPlus can be easily adapted to other domains (e.g., biomedical domain). We make EventPlus publicly available to facilitate event-related information extraction and downstream applications.",
}

eventplus's People

Contributors

derekmma, g1eb

eventplus's Issues

Duration train: some class objects not defined

VOCContextDataset and FCN8s classes aren't imported / defined in the Duration module. Also, within the VOCContextDataset, two parameters are undefined: img_size and transforms_train. Am I missing it?

TempRel inference breaks if no relations returned from construct_relations()

During inference, if a sentence does not produce probabilities > 0.5 for any candidate event, the TempRelAPI.pred() pipeline breaks.

Specifically, if the line below in construct_relations() returns an empty list, the TempRel pipeline breaks at a downstream line:

# select event based on prob > 0.5, but eliminate ent_pred > context length
ent_locs = [[x for x in (ent_probs[b, :, 1] > 0.5).nonzero().view(-1).tolist()
             if x < lengths[b]] for b in range(batch_size)]

If ent_locs is an empty list, the TempRel EventEvaluator.evaluate() call breaks at:

predicted_rels = [self.model._id_to_label[ridx] for ridx in torch.argmax(preds, dim=1).tolist()]

NNClassifier.predict() also fails when probs is an empty list, which is what happens whenever ent_locs in construct_relations() is empty, because rel_idxs then comes back empty as well:

probs = torch.cat(probs,dim=0)
labels = torch.cat(labels,dim=0)
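
One way to picture a fix is an early return when no candidate events survive the threshold. The sketch below is a plain-Python illustration with lists standing in for tensors, not the TempRel code: `select_event_locs`, `candidate_pairs`, and `predict_relations` are hypothetical names, and the pairing rule (every ordered pair of selected events within a sentence) is an assumption made only to keep the example self-contained.

```python
def select_event_locs(event_probs, lengths, threshold=0.5):
    """Per sentence, keep token positions whose event probability exceeds
    threshold and that fall inside the sentence's true length."""
    return [
        [i for i, p in enumerate(probs) if p > threshold and i < length]
        for probs, length in zip(event_probs, lengths)
    ]


def candidate_pairs(ent_locs):
    """Build (sentence, left, right) triples for each pair of selected events.

    The pairing rule is an assumption for illustration only.
    """
    return [
        (b, locs[i], locs[j])
        for b, locs in enumerate(ent_locs)
        for i in range(len(locs))
        for j in range(i + 1, len(locs))
    ]


def predict_relations(event_probs, lengths, classify):
    """Return one label per candidate pair, or [] when nothing can be paired."""
    pairs = candidate_pairs(select_event_locs(event_probs, lengths))
    if not pairs:
        # Guard: skip the classifier instead of concatenating or taking an
        # argmax over an empty batch.
        return []
    return [classify(pair) for pair in pairs]
```

In the real pipeline the equivalent guard would sit before the torch.cat and torch.argmax calls quoted above, so that an input with no detected events yields an empty prediction list rather than an exception.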

Where do train/dev/test pickle files come from in BETTER/joint/main.py?

I'm trying to find out how the ACE 2005 data gets written to the following pickle files which get loaded in the training pipeline of the BETTER component of this codebase. I do have a license for ACE 2005, but can't identify how it gets encoded for the training task.

p.add_argument('-train_pkl', type=str, default='./all_liz/train_w-pairs_xlmroberta.pkl')
p.add_argument('-dev_pkl', type=str, default='./all_liz/dev_w-pairs_xlmroberta.pkl')
p.add_argument('-test_pkl', type=str, default='./all_liz/test_w-pairs_xlmroberta.pkl')

Unable to create conda environment

Hi, I was following your guide but I got stuck on conda env create -f env.yml

The output says

Collecting package metadata (repodata.json): done
Solving environment: failed

ResolvePackageNotFound: 
  - gurobi==9.0.1=py37_0

I tried tinkering a bit with packages and versions but with no luck. Any idea how to solve this?

(Also, I tried for several days to reach your demo at https://kairos-event.isi.edu/ but apparently the site is down)

Thanks a lot

checkpoint for TempRel inference

The TempRelAPI class references a matres_pipeline_best_hid90_dropout0.4_ew15.0.pth.tar file for the entity model. Is this checkpoint shared anywhere in particular?

You also mention the https://github.com/rujunhan/EMNLP-2019 codebase for training. Which model type did you guys use to train the above checkpoint:

  1. Single task
  2. Multitask
  3. Joint
  4. Global

Thanks for any insight!

AttributeError: 'Elmo' object has no attribute 'batch_to_embeddings'

  File "main.py", line 50, in response_pred
    json_list = durationAPI.pred(events)
  File "../Duration/inference_api.py", line 84, in pred
    outputs = compute_predictions(self.model, test_loader)
  File "/home/tianhao/anaconda3/envs/event-pipeline/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
    return func(*args, **kwargs)
  File "../Duration/utils_duration.py", line 40, in compute_predictions
    p1_dur, p2_dur, fine, rel = model(words, span, root)
  File "/home/tianhao/anaconda3/envs/event-pipeline/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/tianhao/relation_extraction/EventPlus/component/Duration/scripts/src/factslab/factslab/pytorch/temporalmodule.py", line 241, in forward
    inputs, masks = self.elmo_class.batch_to_embeddings(structures)
  File "/home/tianhao/anaconda3/envs/event-pipeline/lib/python3.7/site-packages/torch/nn/modules/module.py", line 576, in __getattr__
    type(self).__name__, name))
AttributeError: 'Elmo' object has no attribute 'batch_to_embeddings'

not able to run

hi, I am not able to install and run the package. Is there another way to use it? thanks!

env.yml

Is there any chance you guys could expose an env.yml file that is cross platform compatible? If you export your conda env with --no-builds and --from-history, it'll only show the dependencies you asked for instead of all the packages that also get included alongside the packages in the original conda recipe. Right now the env.yml exposed in this repo works for Linux but not Mac OS.

conda env export --no-builds --from-history
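
For an env.yml that has already been exported with build strings, a rough alternative is to strip them after the fact. The sketch below is an assumption-laden illustration, not a replacement for re-exporting: `make_portable` is a hypothetical helper that treats lines of the form "- name=version=build" as conda entries, leaves pip entries (which use "==") untouched, and drops the machine-specific prefix line.

```python
def make_portable(env_yml_text):
    """Drop build strings and the prefix line from `conda env export` output.

    Hypothetical helper; it edits text line by line and does not parse YAML.
    """
    out = []
    for line in env_yml_text.splitlines():
        if line.startswith("prefix:"):
            continue  # machine-specific install path
        stripped = line.strip()
        # Conda entries look like "- name=version=build"; pip entries use "==".
        is_conda_pin = (
            stripped.startswith("- ")
            and "==" not in stripped
            and stripped.count("=") == 2
        )
        if is_conda_pin:
            line = line.rsplit("=", 1)[0]  # keep "- name=version"
        out.append(line)
    return "\n".join(out) + "\n"
```

Removing build strings lets the solver pick a build for the current platform, but a package that is not published for that platform at the pinned version will still fail to resolve.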
