izuna385 / zero-shot-entity-linking


Zero-shot entity linking with a quick start in 3 minutes. Hard negative mining and an encoder for all entities are also included in this implementation.

License: MIT License



Dual-Encoder-Based Zero-Shot Entity Linking

Quick Start in 3 Minutes

git clone https://github.com/izuna385/Zero-Shot-Entity-Linking.git
cd Zero-Shot-Entity-Linking
python -m spacy download en_core_web_sm

# Multiprocessing sentence boundary detection takes about 2 hours on an 8-core CPU.
sh preprocessing.sh
python3 ./src/train.py -num_epochs 1

To speed things up further and check that the entire script runs end to end, use debug mode:

python3 ./src/train.py -num_epochs 1 -debug True

Multi-GPU training is also supported.

CUDA_VISIBLE_DEVICES=0,1 python3 ./src/train.py -num_epochs 1 -cuda_devices 0,1

Descriptions

  • These experiments aim to confirm whether fine-tuning pretrained BERT (more specifically, the encoders for mentions and entities) is effective even on unknown domains.
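As a rough illustration of the dual-encoder idea, the sketch below scores each mention vector against each entity vector by dot product and computes a cross-entropy loss with in-batch negatives. The vectors are hypothetical stand-ins for BERT encoder outputs, and the in-batch-negative loss is a common choice for dual encoders, assumed here rather than taken from this repository's training code.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def in_batch_loss(mention_vecs, entity_vecs):
    """Mean cross-entropy, assuming entity i is the gold entity for mention i.

    Every other entity in the batch serves as a negative for that mention.
    """
    total = 0.0
    for i, mention in enumerate(mention_vecs):
        scores = [dot(mention, entity) for entity in entity_vecs]
        # Numerically stable log-sum-exp over all entity scores.
        max_score = max(scores)
        log_z = max_score + math.log(sum(math.exp(s - max_score) for s in scores))
        total += log_z - scores[i]
    return total / len(mention_vecs)


# Hypothetical 2-dimensional embeddings (real encoder outputs would be much larger).
mentions = [[1.0, 0.0], [0.0, 1.0]]
entities = [[2.0, 0.0], [0.0, 2.0]]

aligned_loss = in_batch_loss(mentions, entities)
misaligned_loss = in_batch_loss(mentions, entities[::-1])
print(aligned_loss, misaligned_loss)
```

The loss is lower when each mention points toward its gold entity, which is the signal fine-tuning would push the two encoders to produce.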

Requirements

  • torch, allennlp, transformers, and faiss are required. See also requirements.txt.

  • About 3 GB of CPU memory and about 1.1 GB of GPU memory are needed to run the script.

How to run experiments

1. Preprocessing

2. Training and Evaluating the Bi-Encoder Model

  • python3 ./src/train.py

    • This script trains the encoders for mentions and entities.
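Hard negative mining, mentioned in the project summary, can be pictured as retrieving the entities closest to a mention and keeping those that are not the gold entity. The sketch below is a minimal pure-Python version under that assumption. The function names and toy vectors are hypothetical, and the brute-force search stands in for the approximate nearest-neighbor search that a library such as faiss would provide. It is not the repository's actual code.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def top_k(query, entity_vecs, k):
    """Return indices of the k entities with the highest dot-product score."""
    scored = sorted(
        ((dot(query, vec), idx) for idx, vec in enumerate(entity_vecs)),
        key=lambda pair: (-pair[0], pair[1]),  # highest score first, ties by index
    )
    return [idx for _, idx in scored[:k]]


def mine_hard_negatives(query, gold_idx, entity_vecs, k):
    """Return up to k highly ranked entities that are not the gold entity."""
    candidates = top_k(query, entity_vecs, k + 1)
    return [idx for idx in candidates if idx != gold_idx][:k]


# Hypothetical entity embeddings; entity 0 is the gold entity for the mention.
entity_vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.7, 0.3]]
mention_vec = [1.0, 0.0]

ranked = top_k(mention_vec, entity_vecs, 3)
negatives = mine_hard_negatives(mention_vec, 0, entity_vecs, 2)
print(ranked, negatives)
```

Negatives chosen this way are the entities the current model most easily confuses with the gold one, so they tend to be more informative than randomly sampled negatives.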

3. Logging Each Experiment

  • See ./src/experiment_logdir/.

    • Each log directory is named after the time the experiment started.
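A timestamp-named log directory can be created along these lines. This is only a sketch: the helper name make_log_dir and the "%Y%m%d_%H%M%S" format are assumptions for illustration, and the format the repository actually uses may differ.

```python
import os
import tempfile
from datetime import datetime


def make_log_dir(base_dir, now=None):
    """Create and return a directory under base_dir named after the start time."""
    now = now or datetime.now()
    path = os.path.join(base_dir, now.strftime("%Y%m%d_%H%M%S"))
    os.makedirs(path, exist_ok=True)
    return path


# A temporary directory stands in for ./src/experiment_logdir/ in this example.
base = tempfile.mkdtemp()
log_dir = make_log_dir(base, datetime(2020, 1, 2, 3, 4, 5))
print(log_dir)
```

Naming runs by start time keeps them sorted chronologically in a directory listing and avoids overwriting earlier experiments.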

TODO

  • Preprocess with stricter sentence boundary detection.

LICENSE

  • MIT
