
kogito

A Python NLP Commonsense Knowledge Inference Toolkit

System Description available here: https://arxiv.org/abs/2211.08451

Installation

Installation with pip

kogito can be installed using pip.

pip install kogito

kogito requires Python 3.8 or later.

Setup

Inference

kogito uses spaCy under the hood for various text-processing tasks, so a spaCy language package must be installed before running the inference module.

python -m spacy download en_core_web_sm

By default, the CommonsenseInference module uses en_core_web_sm to initialize its spaCy pipeline, but a different language pipeline can be specified as well.

Evaluation

If you would also like to evaluate knowledge models using the METEOR score, you need to download the following nltk resources:

import nltk

nltk.download("punkt")
nltk.download("wordnet")
nltk.download("omw-1.4")

Quickstart

kogito provides an easy interface for interacting with knowledge inference (commonsense reasoning) models such as COMET to generate inferences from a text input. Here is a sample usage of the library in which you initialize an inference module and a custom commonsense reasoning model, then generate a knowledge graph from text on the fly.

from kogito.models.bart.comet import COMETBART
from kogito.inference import CommonsenseInference

# Load pre-trained model from HuggingFace
model = COMETBART.from_pretrained("mismayil/comet-bart-ai2")

# Initialize inference module with a spacy language pipeline
csi = CommonsenseInference(language="en_core_web_sm")

# Run inference
text = "PersonX becomes a great basketball player"
kgraph = csi.infer(text, model)

# Save output knowledge graph to a JSONL file
kgraph.to_jsonl("kgraph.json")

Here is an excerpt from the result of the above code sample:

{"head": "PersonX becomes a great basketball player", "relation": "Causes", "tails": [" PersonX practices every day.", " PersonX plays basketball every day", " PersonX practices every day"]}
{"head": "basketball", "relation": "ObjectUse", "tails": [" play with friends", " play basketball with", " play basketball"]}
{"head": "player", "relation": "CapableOf", "tails": [" play game", " win game", " play football"]}
{"head": "great basketball player", "relation": "HasProperty", "tails": [" good at basketball", " good at sports", " very good"]}
{"head": "become player", "relation": "isAfter", "tails": [" play game", " become coach", " play with"]}
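Since each line of the output is a standalone JSON object, the graph can be post-processed with the standard library alone. Here is a minimal sketch (the flattening into (head, relation, tail) triples is just one possible post-processing step, not a kogito API):

```python
import json

def load_triples(path):
    """Read a kogito JSONL knowledge graph and flatten it into
    (head, relation, tail) triples, stripping whitespace from tails."""
    triples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            for tail in record["tails"]:
                triples.append((record["head"], record["relation"], tail.strip()))
    return triples
```

For example, applied to the excerpt above, the ObjectUse line yields one triple per generated tail.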

This is just one way to generate commonsense inferences; kogito offers much more. For complete documentation, check out the kogito docs.

Development

Setup

kogito uses Poetry to manage its dependencies.

Install poetry from the official repository first:

curl -sSL https://install.python-poetry.org | python3 -

Then run the following command to install package dependencies:

poetry install

Data

If you need the ATOMIC or ATOMIC 2020 data to train your knowledge models, you can download them from AI2:

For ATOMIC:

wget https://storage.googleapis.com/ai2-mosaic/public/atomic/v1.0/atomic_data.tgz

For ATOMIC 2020:

wget https://ai2-atomic.s3-us-west-2.amazonaws.com/data/atomic2020_data-feb2021.zip
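The ATOMIC releases distribute their train/dev/test splits as tab-separated files with one head, relation, tail triple per line (the exact file names and the three-column layout are assumptions about the public archives; check the extracted contents). A minimal stdlib sketch for loading such a split:

```python
import csv

def load_atomic_split(path):
    """Load an ATOMIC-style TSV split into (head, relation, tail) triples,
    skipping rows that do not have exactly three columns."""
    triples = []
    with open(path, encoding="utf-8", newline="") as f:
        for row in csv.reader(f, delimiter="\t"):
            if len(row) == 3:
                triples.append(tuple(row))
    return triples
```

The resulting triples can then be converted to CSV for KnowledgeGraph.from_csv or inspected directly.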

Paper

If you want to learn more about the library design, models and data used for this toolkit, check out our paper. The paper can be cited as:

@article{Ismayilzada2022kogito,
  title={kogito: A Commonsense Knowledge Inference Toolkit},
  author={Mete Ismayilzada and Antoine Bosselut},
  journal={ArXiv},
  volume={abs/2211.08451},
  year={2022}
}

If you work with knowledge models, consider citing the following papers:

@inproceedings{Hwang2020COMETATOMIC,
 author = {Jena D. Hwang and Chandra Bhagavatula and Ronan Le Bras and Jeff Da and Keisuke Sakaguchi and Antoine Bosselut and Yejin Choi},
 booktitle = {Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI)},
 title = {COMET-ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs},
 year = {2021}
}

@inproceedings{Bosselut2019COMETCT,
 author = {Antoine Bosselut and Hannah Rashkin and Maarten Sap and Chaitanya Malaviya and Asli Çelikyilmaz and Yejin Choi},
 booktitle = {Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL)},
 title = {COMET: Commonsense Transformers for Automatic Knowledge Graph Construction},
 year = {2019}
}

Acknowledgements

A significant portion of the model training and evaluation code has been adapted from the original codebase for the paper (COMET-)ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs.


kogito's Issues

DistilBERTRelationMatcher - TypeError: __new__() missing 1 required positional argument: 'task'

When attempting to use the DistilBERTRelationMatcher class from the kogito.core.processors.relation module, I encountered the following error:

from kogito.core.processors.relation import DistilBERTRelationMatcher

DistilBERTRelationMatcher(name='distilbert_matcher')

Error: TypeError: __new__() missing 1 required positional argument: 'task'

Could you please advise on how to solve this issue? It seems that the DistilBERTRelationMatcher class does not accept a task argument.

I encountered the same problem when attempting to use the BERTRelationMatcher and SWEMRelationMatcher classes as well.

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Hi, thanks for the fantastic work. I have a problem: I want to train COMET on my own dataset, following the user guide from the docs. My code is as follows:

import os

from kogito.core.knowledge import KnowledgeGraph
from kogito.models.bart.comet import COMETBART, COMETBARTConfig, AutoTokenizer

path = 'data/csv'

config = COMETBARTConfig(
    output_dir="bart",
    num_workers=16,
    learning_rate=1e-5,
    gpus=1,
    sortish_sampler=True,
    atomic=True,
    max_epochs=56,
    pretrained_model="facebook/bart-large",
)
model = COMETBART(config)
model.tokenizer = AutoTokenizer.from_pretrained("mismayil/comet-bart-ai2")

train_graph = KnowledgeGraph.from_csv(os.path.join(path, 'train.csv'))
val_graph = KnowledgeGraph.from_csv(os.path.join(path, 'val.csv'))
test_graph = KnowledgeGraph.from_csv(os.path.join(path, 'test.csv'))

model.train(train_graph=train_graph, val_graph=val_graph, test_graph=test_graph)

# Save as a pretrained model
model.save_pretrained("comet-bart/v1")

However, I got the following error when I ran the code:

  File "/home/wzh/work/comet_ft/cometbart_train.py", line 25, in <module>
    model.train(train_graph=train_graph, val_graph=val_graph, test_graph=test_graph)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/kogito/models/bart/comet.py", line 109, in train
    trainer: pl.Trainer = generic_train(
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/kogito/models/bart/utils.py", line 262, in generic_train
    trainer.fit(model)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 740, in fit
    self._call_and_handle_interrupt(
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 685, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 777, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1199, in _run
    self._dispatch()
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1279, in _dispatch
    self.training_type_plugin.start_training(self)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 202, in start_training
    self._results = trainer.run_stage()
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1289, in run_stage
    return self._run_train()
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1319, in _run_train
    self.fit_loop.run()
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/loops/base.py", line 145, in run
    self.advance(*args, **kwargs)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 234, in advance
    self.epoch_loop.run(data_fetcher)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/loops/base.py", line 145, in run
    self.advance(*args, **kwargs)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 193, in advance
    batch_output = self.batch_loop.run(batch, batch_idx)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/loops/base.py", line 145, in run
    self.advance(*args, **kwargs)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 88, in advance
    outputs = self.optimizer_loop.run(split_batch, optimizers, batch_idx)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/loops/base.py", line 145, in run
    self.advance(*args, **kwargs)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 215, in advance
    result = self._run_optimization(
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 266, in _run_optimization
    self._optimizer_step(optimizer, opt_idx, batch_idx, closure)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 378, in _optimizer_step
    lightning_module.optimizer_step(
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/core/lightning.py", line 1652, in optimizer_step
    optimizer.step(closure=optimizer_closure)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/core/optimizer.py", line 164, in step
    trainer.accelerator.optimizer_step(self._optimizer, self._optimizer_idx, closure, **kwargs)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/accelerators/accelerator.py", line 339, in optimizer_step
    self.precision_plugin.optimizer_step(model, optimizer, opt_idx, closure, **kwargs)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 163, in optimizer_step
    optimizer.step(closure=closure, **kwargs)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/torch/optim/lr_scheduler.py", line 68, in wrapper
    return wrapped(*args, **kwargs)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/torch/optim/optimizer.py", line 140, in wrapper
    out = func(*args, **kwargs)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/transformers/optimization.py", line 457, in step
    loss = closure()
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 148, in _wrap_closure
    closure_result = closure()
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 160, in __call__
    self._result = self.closure(*args, **kwargs)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 155, in closure
    self._backward_fn(step_output.closure_loss)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 327, in backward_fn
    self.trainer.accelerator.backward(loss, optimizer, opt_idx)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/accelerators/accelerator.py", line 314, in backward
    self.precision_plugin.backward(self.lightning_module, closure_loss, *args, **kwargs)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 91, in backward
    model.backward(closure_loss, optimizer, *args, **kwargs)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/pytorch_lightning/core/lightning.py", line 1434, in backward
    loss.backward(*args, **kwargs)
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/torch/_tensor.py", line 488, in backward
    torch.autograd.backward(
  File "/home/wzh/anaconda3/envs/kogito/lib/python3.10/site-packages/torch/autograd/__init__.py", line 197, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Here is my environment:

  • Nvidia RTX A6000
  • Cuda 11.6
  • Use conda to create python environment

I do not know why; any reply would be appreciated.
Thanks.
