
NPEL

NPEL (Neural Paired Entity Linking) is the entity linking method proposed in "Neural Paired Entity Linking in Web Tables".

Datasets

Three datasets are used in the NPEL experiments: Web_Manual_fixed_2020 (347 tables), Wiki_Links_Large (168 tables), and Wiki_Links_Random_2020 (2335 tables).

The Web_Manual_Fixed_2020 directory has two subdirectories. One holds the table data in JSON format; the other holds the labeled entities, each stored at the same position as the table cell it corresponds to.

The Wiki_Links_Large and Wiki_Links_Random_2020 directories share the same file structure. Each has a single tables directory, and both the table data and the labels are stored in JSON files. The labeled entity of each table mention is annotated as target in surfaceLinks.
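As a rough illustration of reading these labels, the sketch below walks a parsed JSON table and collects every target value found under a surfaceLinks key. Only the names surfaceLinks and target come from the description above; the surrounding layout of the sample (rows, text) and the helper names are assumptions made for the example, so adjust them to the real files.

```python
import json


def _targets_in(node):
    """Collect every `target` value anywhere inside a surfaceLinks subtree."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "target":
                found.append(value)
            else:
                found.extend(_targets_in(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(_targets_in(item))
    return found


def collect_targets(node):
    """Recursively find `surfaceLinks` keys and gather the targets below them."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "surfaceLinks":
                found.extend(_targets_in(value))
            else:
                found.extend(collect_targets(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_targets(item))
    return found


def load_table_targets(path):
    """Load one table file (hypothetical path) and return its labeled targets."""
    with open(path, encoding="utf-8") as handle:
        return collect_targets(json.load(handle))


# Made-up cell layout, for demonstration only.
sample = {
    "rows": [
        [{"text": "Paris", "surfaceLinks": [{"target": {"title": "Paris"}}]}],
        [{"text": "unlinked cell", "surfaceLinks": []}],
    ]
}
targets = collect_targets(sample)
```

Because the walk is recursive, it does not depend on how deeply the cells are nested, which helps when the exact schema has not been checked yet.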

Code

The code directory contains the mention-entity semantics model and the implementation of the pair-linking algorithm.

The mention-entity semantics model is mainly contained in nn_models, and the pair-linking algorithm is implemented in pair_linking.py.

Both the neural network model and the pair-linking algorithm require the same input format: each mention in a table carries its own embedding, its candidates, the candidates' embeddings, and the candidates' abstracts. The model and the algorithm access these elements directly as dict entries.

In addition, we used this work to generate the prior data.
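A minimal sketch of what such a per-mention dict could look like is given below. The key names (embedding, candidates, candidate_embeddings, candidate_abstracts) and the check_mention helper are hypothetical, chosen only to mirror the four elements listed above; the actual keys used in the code may differ.

```python
# Hypothetical key names; the repository may use different ones.
REQUIRED_KEYS = (
    "embedding",
    "candidates",
    "candidate_embeddings",
    "candidate_abstracts",
)


def check_mention(mention):
    """Validate one mention dict and return its number of candidates."""
    missing = [key for key in REQUIRED_KEYS if key not in mention]
    if missing:
        raise KeyError(f"mention is missing keys: {missing}")
    count = len(mention["candidates"])
    if (len(mention["candidate_embeddings"]) != count
            or len(mention["candidate_abstracts"]) != count):
        raise ValueError("candidates, embeddings and abstracts must align one-to-one")
    return count


# Toy values, for illustration only.
example_mention = {
    "embedding": [0.2, 0.8],
    "candidates": ["Paris", "Paris_Hilton"],
    "candidate_embeddings": [[0.1, 0.9], [0.7, 0.3]],
    "candidate_abstracts": ["Capital of France.", "American media personality."],
}
candidate_count = check_mention(example_mention)
```

Running a check like this before training or linking makes misaligned candidate lists fail early instead of surfacing as index errors inside the model.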

Run steps

This project requires enwiki as its knowledge base, which can be downloaded from this. All enwiki articles are used to train the entity and word embeddings. All entity abstracts need to be extracted for entity encoding. All anchor texts are used to calculate the prior of a mention with respect to its entities.
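The prior computation can be sketched as follows, assuming the common count-based definition: the prior of entity e for mention m is the number of anchors with text m that link to e, divided by the number of anchors with text m. The function name and the toy anchor list are assumptions for illustration; the prior data used here was produced with a separate tool, so this is not the repository's code.

```python
from collections import Counter, defaultdict


def mention_entity_prior(anchors):
    """Estimate p(entity | mention) from (anchor_text, linked_entity) pairs."""
    counts = defaultdict(Counter)
    for mention, entity in anchors:
        counts[mention][entity] += 1
    prior = {}
    for mention, entity_counts in counts.items():
        total = sum(entity_counts.values())
        prior[mention] = {
            entity: count / total for entity, count in entity_counts.items()
        }
    return prior


# Toy anchors, for illustration only: the text "Paris" links to the city
# three times and to a person once.
anchors = [
    ("Paris", "Paris"),
    ("Paris", "Paris"),
    ("Paris", "Paris"),
    ("Paris", "Paris_Hilton"),
]
prior = mention_entity_prior(anchors)
```

With these toy counts, the city receives a prior of 3/4 and the person 1/4 for the mention "Paris".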

data_generate.py/gen_abstract extracts the abstracts needed for the tables' entities.

data_generate.py/gen_embedding extracts the embeddings needed for the tables' entities and mentions/words.

data_generate.py/dump_mention_abstract_emb merges the entity abstracts and embeddings into one file.

data_generate.py/rebuild_? prepares the formatted data for the deep learning model and the pair-linking algorithm.

pair_linking.py implements the pair-linking algorithm; it requires the formatted table data.
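To give a feel for the idea, here is a simplified greedy sketch in the spirit of pair-linking: it repeatedly picks the highest-scoring pair of mention-entity assignments, fixes it, and continues until every mention is linked. It is not the implementation in pair_linking.py. The scoring (sum of two local scores plus cosine similarity of the two entity embeddings), the brute-force search over pairs, and the dict keys candidates, candidate_embeddings and local_scores are all assumptions, with local_scores standing in for the output of the mention-entity semantics model.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pair_link(mentions):
    """Greedily link mentions; returns {mention index: chosen candidate}."""
    assigned = {}  # mention index -> index of the chosen candidate

    def options(i):
        # A mention that is already linked keeps its single choice.
        if i in assigned:
            return [assigned[i]]
        return range(len(mentions[i]["candidates"]))

    def pair_score(i, a, j, b):
        mi, mj = mentions[i], mentions[j]
        coherence = cosine(mi["candidate_embeddings"][a], mj["candidate_embeddings"][b])
        return mi["local_scores"][a] + mj["local_scores"][b] + coherence

    # Mentions without candidates cannot be linked and are skipped.
    linkable = [i for i, m in enumerate(mentions) if m["candidates"]]
    while len(assigned) < len(linkable):
        best = None
        for i, j in combinations(linkable, 2):
            if i in assigned and j in assigned:
                continue
            for a in options(i):
                for b in options(j):
                    score = pair_score(i, a, j, b)
                    if best is None or score > best[0]:
                        best = (score, i, a, j, b)
        if best is None:
            # Only one linkable mention exists: use its best local score.
            i = linkable[0]
            scores = mentions[i]["local_scores"]
            assigned[i] = max(range(len(scores)), key=scores.__getitem__)
            continue
        _, i, a, j, b = best
        assigned[i], assigned[j] = a, b
    return {i: mentions[i]["candidates"][c] for i, c in assigned.items()}


# Toy table with two mentions, for illustration only.
toy_mentions = [
    {"candidates": ["Paris", "Paris_Hilton"],
     "candidate_embeddings": [[1.0, 0.0], [0.0, 1.0]],
     "local_scores": [0.5, 0.5]},
    {"candidates": ["France", "Hilton_Hotels"],
     "candidate_embeddings": [[0.9, 0.1], [0.1, 0.9]],
     "local_scores": [0.6, 0.4]},
]
links = pair_link(toy_mentions)
```

In the toy table the first mention is ambiguous on local scores alone; the coherence term with the second mention is what separates its two candidates. The exhaustive search over pairs is kept for readability and would need a priority queue or pruning on larger tables.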

Contributors

npel-ll, llgithubll
