Giter VIP home page Giter VIP logo

ontoea's Introduction

OntoEA: Ontology-guided Entity Alignment via Joint Knowledge Graph Embedding

The code and benchmark of paper OntoEA: Ontology-guided Entity Alignment via Joint Knowledge Graph Embedding [arxiv][acl] in Findings of ACL-IJCNLP 2021.

Code

The source code of OntoEA is implemented based on OpenEA and we follow the same design features of OpenEA to achieve functionality while maintaining extensibility. We retain the minimal fraction of OpenEA in the implementation of OntoEA, and we list the source code (along with comments) of as follows.

code
|__ run
|   |__ args
|   |   |__ ontoea_args_15K.json    // OntoEA config file for 15K benchmark
|   |   |__ ontoea_args_100K.json   // OntoEA config file for 100K benchmark
|   |__ main_from_args.py           // OntoEA interface with args input
|   |__ run_OntoEA.sh               // bash script to run experiments of OntoEA
|__ src
|   |__ openea
|   |   |__ approaches
|   |   |   |__ ontoea.py           // OntoEA core code
|   |   |__ models
|   |   |   |__ basic_model.py      // same as OpenEA
|   |   |__ modules
|   |   |   |__ args
|   |   |   |   |__ args_hander.py  // same as OpenEA
|   |   |   |__ base
|   |   |   |   |__ initializers.py // same as OpenEA
|   |   |   |   |__ losses.py       // same as OpenEA
|   |   |   |   |__ mapping.py      // same as OpenEA (with minor updates)
|   |   |   |   |__ optimizers.py   // same as OpenEA
|   |   |   |__ finding
|   |   |   |   |__ alignment.py    // same as OpenEA
|   |   |   |   |__ evaluation.py   // same as OpenEA (with minor updates)
|   |   |   |   |__ similarity.py   // same as OpenEA
|   |   |   |__ load
|   |   |   |   |__ kg.py           // same as OpenEA
|   |   |   |   |__ kgs.py          // same as OpenEA (with updates)
|   |   |   |   |__ read.py         // same as OpenEA (with updates)
|   |   |   |__ train
|   |   |   |   |__ batch.py        // same as OpenEA (with minor updates)
|   |   |   |   |__ sample.py       // same as OpenEA (with minor updates)
|   |   |   |__ utils
|   |   |   |   |__ check.py        // same as OpenEA (with updates)
|   |   |   |   |__ inference.py    // same as OpenEA (with updates)
|   |   |   |   |__ util.py         // same as OpenEA

How to run OntoEA

Since we adopt the OpenEA codebase to implement OntoEA, the way to run OntoEA is much similar to the way to use OpenEA. We provide two ways to run OntoEA:

1. using bash scripts (recommended)

Change into ./code/run and run the bash script

./run_OntoEA.sh

2. using off-the-shelf approaches with python interface

Change into ./code/run and run OntoEA as

python main_from_args.py <predefined_arguments> <benchmark_name> <dataset_splits>

and for example

python main_from_args.py ./args/ontoea_args_15K.json MED_BBK_9K 721_5fold/1/

Configuration of OntoEA

Different settings of OntoEA can be configured in the ontoea_args_15K.json and ontoea_args_100K.json. The key configurations are listed as follows.

{
  "training_data": "../../benchmarks/",       // where the benchmark is
  "output": "../../output/results/",          // where the output results go
  "dataset_division": "721_5fold/1/",         // where the train/valid/test splits are
  "word_embed": "../../benchmarks/wiki-news-300d-1M.vec",     // where the word embeddings are
  "use_word_embed_init": 0,                   // whether or not to use word embeddings in OntoEA (i.e., 1 == OntoEA w/ SI; 0 == OntoEA w/o SI)
  "use_word_embed_init_onto": 0,              // whether or not to use word embeddings in OntoEA (i.e., 1 == OntoEA w/ SI; 0 == OntoEA w/o SI)
  "word_embed_init_zh": 0,                    // whether or not Chinese word embeddings (only for MED_BBK_9K)
  "use_alter_label": 0,                       // whether or not to use alternative labels for entity names (only for D_W benchmarks)
}

Some other configurations on the hyperparameters (lambda_1, lambda_2, lambda_3) that balance different losses are listed as follows.

{
  "alpha": 5,		// controlling L_E (entity embedding)
  "gamma": 1,		// controlling L_M (membership loss)
  "sigma": 1,		// controlling L_C (ontology embedding)
}

Therefore, the experiments of Model Component Analysis in Sec. 4.3 of our paper can be re-produced using these settings:

w/o L_C   ==> to set sigma = 0
w/o L_M   ==> to set gamma = 0
w/o Onto. ==> to set sigma = 0 and gamma = 0

Dependencies of OntoEA

  • Python 3.x (tested on Python 3.6.12)
  • Tensorflow 1.x (tested on Tensorflow 1.14.0)
  • Scipy
  • Numpy
  • Pandas
  • Scikit-learn
  • Gensim

All the dependencies can be installed using the provided requirements.

Benchmarks

The newly constructed benchmarks extends the original KG alignment benchmarks (OpenEA and MED_BBK_9K) with ontological information, including ontology triples (both disjointWith and subClassOf), membership links (or cross-view links), ontology hierarchy.

The data files are identical for each benchmark, and we list all the benchmarks with MED_BBK_9K benchmark details as follows.

benchmarks
|__ EN_FR_15K_V1    // EN_FR_15K_V1 from OpenEA with appended ontological information
|__ EN_FR_15K_V2    // EN_FR_15K_V2 from OpenEA with appended ontological information
|__ EN_DE_15K_V1    // EN_DE_15K_V1 from OpenEA with appended ontological information
|__ EN_DE_15K_V2    // EN_DE_15K_V2 from OpenEA with appended ontological information
|__ D_W_15K_V1_A    // D_W_15K_V1 from OpenEA with appended ontological information and ontologies pre-aligned with PARIS
|__ D_W_15K_V1_M    // D_W_15K_V1 from OpenEA with appended ontological information and ontologies pre-aligned with manual annotation
|__ D_W_15K_V2_A    // D_W_15K_V2 from OpenEA with appended ontological information and ontologies pre-aligned with PARIS
|__ D_W_15K_V2_M    // D_W_15K_V2 from OpenEA with appended ontological information and ontologies pre-aligned with  manual annotation
|__ MED_BBK_9K      // MED_BBK_9K with appended ontological information
|   |__ 721_5fold/1         // train/valid/test splits
|   |__ attr_triples_1      // attribute triples of KG1
|   |__ attr_triples_2      // attribute triples of KG2
|   |__ rel_triples_1       // relation triples of KG1
|   |__ rel_triples_2       // relation triples of KG2
|   |__ ent_links           // all entity mappings between KG1 and KG2
|   |__ crossview_link_1    // membership links between KG1 and its ontology
|   |__ crossview_link_2    // membership links between KG2 and its ontology
|   |__ onto_attr_triples   // attribute triples of the aligned/shared ontology
|   |__ onto_disjointWith_triples   // disjointWith relation triples of the aligned/shared ontology
|   |__ onto_subClassOf_triples     // subClassOf relation triples of the aligned/shared ontology
|   |__ class_path.json     // class hierarchy of the aligned/shared ontology

Note that D_W_15K_V1/V2_A are constructed with the ontologies pre-aligned with alignment system PARIS and D_W_15K_V1/V2_M are constructed with the ontologies pre-aligned with manual annotation. These two different settings correspond to the Analysis of Ontology Alignment Methods in Sec. 4.3 of our paper.

Citation

If you have any difficulty or question in running code and reproducing experimental results, please email to [email protected]

If you use this model or code, please cite it as follows:

@inproceedings{xiang-etal-2021-ontoea,
    title = "{O}nto{EA}: Ontology-guided Entity Alignment via Joint Knowledge Graph Embedding",
    author = "Xiang, Yuejia  and
      Zhang, Ziheng  and
      Chen, Jiaoyan  and
      Chen, Xi  and
      Lin, Zhenxi  and
      Zheng, Yefeng",
    booktitle = "Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.findings-acl.96",
    doi = "10.18653/v1/2021.findings-acl.96",
    pages = "1117--1128",
}

ontoea's People

Contributors

zihengzzh avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

ontoea's Issues

openea 路径问题

cd 到code/run 下运行run_OntoEa.sh 或者python 运行 都提示openea 找不到 这个是不是要修改以下import openea 的路径?

pytoch version

Hi,
could you please share the code of pytorch version?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.