GNNQ: A Neuro-Symbolic Approach to Query Answering over Incomplete Knowledge Graphs

About

The GNNQ repository contains the source code for the GNNQ system presented in the paper "GNNQ: A Neuro-Symbolic Approach to Query Answering over Incomplete Knowledge Graphs", accepted at ISWC 2022.

GNNQ is a neuro-symbolic system for answering monadic tree-like conjunctive queries over incomplete KGs. GNNQ first symbolically augments the input KG (formally, a set of facts) with additional facts representing subsets of the KG that match connected fragments of the query, and then applies a generalisation of the Relational Graph Convolutional Networks (RGCN) model to the augmented KG to produce the predicted query answers.
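
As a rough illustration (and not the authors' implementation), the following Python sketch shows this two-stage idea on a toy KG given as (subject, relation, object) triples: a hypothetical augmentation fact is appended to the triple set, and a single, heavily simplified RGCN-style message-passing layer is then applied to score every entity as a candidate query answer. All names, sizes and the augmentation fact are illustrative assumptions.

    # Illustrative sketch only -- not the GNNQ implementation.
    # Stage 1: augment a toy KG (a set of triples) with a hypothetical extra fact.
    # Stage 2: apply one simplified RGCN-style layer to the augmented KG and
    #          score every entity as a candidate query answer.
    import torch

    def rgcn_layer(h, triples, rel_w, self_w):
        # h: (num_entities, dim) entity embeddings; rel_w: (num_rels, dim, dim)
        out = h @ self_w                          # self-loop messages
        for s, r, o in triples:                   # relation-specific messages
            out[o] = out[o] + h[s] @ rel_w[r]
        return torch.relu(out)

    # Toy KG with 4 entities and 2 original relations.
    kg = [(0, 0, 1), (1, 1, 2), (2, 0, 3)]
    # Hypothetical augmentation fact (relation id 2) marking a matched query fragment.
    augmented_kg = kg + [(0, 2, 2)]

    dim = 8
    h = torch.randn(4, dim)
    rel_w = torch.randn(3, dim, dim)              # one matrix per relation, incl. the augmentation relation
    self_w = torch.randn(dim, dim)

    h = rgcn_layer(h, augmented_kg, rel_w, self_w)
    answer_scores = torch.sigmoid(h.sum(dim=1))   # toy per-entity answer scores
    print(answer_scores)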

Source Code

Clone the GNNQ repository.

git clone https://github.com/KRR-Oxford/GNNQ.git or git clone git@github.com:KRR-Oxford/GNNQ.git

Requirements

We assume that the following are pre-installed. We used the versions specified in brackets.

  • python (3.8.10 or higher)
  • pip (19.2.3 or higher)
  • venv

Instructions for the installation of the requirements can be found here.
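
For example, you can check the installed versions with the following commands (on some systems they are named python3 and pip3 instead):

    python --version
    pip --version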

Please follow the steps outlined below to reproduce the experiments.

Dependencies

To install all dependencies required for our experiments, follow the instructions below:

  • Navigate to the GNNQ/ directory.
    cd path/to/download/GNNQ
  • Create a virtual environment.
    python -m venv env
  • Activate the virtual environment.
    source env/bin/activate
  • Install PyTorch. Replace ${CUDA} with cpu or cu113.
    pip install torch==1.11.0+${CUDA} --extra-index-url https://download.pytorch.org/whl/${CUDA}
  • Install PyTorch Scatter. Replace ${CUDA} with cpu or cu113.
    pip install torch-scatter -f https://data.pyg.org/whl/torch-1.11.0+${CUDA}.html
  • Install all other dependencies.
    pip install -r requirements.txt
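
To check that PyTorch and PyTorch Scatter were installed correctly, you can, for example, run the following command inside the virtual environment:

    python -c "import torch, torch_scatter; print(torch.__version__)"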

Datasets

The datasets/ directory, containing both the WatDiv-Qi and the FB15k237-Qi benchmarks, can be downloaded from here (https://portal.sds.ox.ac.uk/ndownloader/files/36445044). Unzip the downloaded .zip file and place the datasets/ directory in the GNNQ/ directory.
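
For example, on Linux the benchmark archive can be downloaded and unpacked as follows (assuming wget and unzip are available; adjust the target path to your GNNQ/ directory, and move the extracted datasets/ directory into GNNQ/ if it does not end up there directly):

    wget -O datasets.zip https://portal.sds.ox.ac.uk/ndownloader/files/36445044
    unzip datasets.zip -d path/to/download/GNNQ/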

Run Experiments

To train and evaluate a 4-layer GNNQ instance on the WatDiv-Q1 benchmark, run the following command from the GNNQ/ directory. Please remember that the virtual environment needs to be active.

python main.py --log_dir watdiv_q1_4l_aug/ --num_layers 4 --aug --test --train_data datasets/watdiv/train_samples --val_data datasets/watdiv/val_samples --test_data datasets/watdiv/test_samples --query_string "SELECT distinct ?v0 WHERE { ?v0 <http://schema.org/caption> ?v1 . ?v0 <http://schema.org/text> ?v2 . ?v0 <http://schema.org/contentRating> ?v3 . ?v0 <http://purl.org/stuff/rev#hasReview> ?v4 . ?v4 <http://purl.org/stuff/rev#title> ?v5 . ?v4 <http://purl.org/stuff/rev#reviewer> ?v6 . ?v7 <http://schema.org/actor> ?v6 . ?v7 <http://schema.org/language> ?v8 }"

To train and evaluate an instance on the other WatDiv benchmarks, replace the query specified by the --query_string parameter and specify a new logging directory using the --log_dir parameter. All benchmark queries can be found in the datasets/benchmark_queries.txt file. To train and evaluate a 4-layer GNNQ- baseline instance, remove the --aug parameter. The number of layers for all models can be specified using the --num_layers parameter.

To train and evaluate a 4-layer GNNQ instance on the FB15k237-Q1 benchmark, run the following command from the GNNQ/ directory. Please remember again that the virtual environment needs to be active.

python main.py --log_dir fb15k237_q1_4l_aug/ --num_layers 4 --aug --test --batch_size 40 --train_data datasets/fb15k237/org_train_samples --val_data datasets/fb15k237/org_val_samples --test_data datasets/fb15k237/org_test_samples --query_string "select distinct ?org where { ?org <http://dummyrel.com/organization/organization/headquarters./location/mailing_address/state_province_region> ?region . ?biggerregion <http://dummyrel.com/location/location/contains> ?region . ?biggerregion <http://dummyrel.com/location/location/adjoin_s./location/adjoining_relationship/adjoins> ?neighbourregion . ?biggerregion <http://dummyrel.com/location/country/capital> ?capital . ?neighbourregion <http://dummyrel.com/location/country/official_language> ?lang . ?capital <http://dummyrel.com/common/topic/webpage./common/webpage/category> ?category . ?capital <http://dummyrel.com/travel/travel_destination/climate./travel/travel_destination_monthly_climate/month> ?month }"

To train and evaluate a 4-layer GNNQ instance on the other FB15k237-Qi benchmarks, replace the query specified by the --query_string parameter and specify the training, validation and testing samples for the respective query using the --train_data, --val_data and --test_data parameters (the sample files for the FB15k237 benchmarks are named after the answer variable of the respective query). All benchmark queries can be found in the datasets/benchmark_queries.txt file and all samples can be found in the datasets/fb15k237/ directory. Furthermore, specify a new logging directory using the --log_dir parameter. To train and evaluate a 4-layer GNNQ- baseline instance, remove the --aug parameter. The number of layers for all models can be specified using the --num_layers parameter.

Hyperparameter Tuning and Training with GPU

To tune hyperparameters for a benchmark, use the --tune parameter. This starts an Optuna study with 100 trials. If you installed PyTorch and PyTorch Scatter with CUDA support, i.e. you replaced ${CUDA} with cu113, you can use the --gpu parameter to train an instance on an available GPU.
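
For example, a hyperparameter-tuning run on the WatDiv-Q1 benchmark using a GPU might look as follows, where "<WatDiv-Q1 query>" stands for the full query string from the command above; the logging directory name is illustrative, and the exact combination of parameters may need adjusting for your setup:

    python main.py --log_dir watdiv_q1_tune/ --tune --gpu --aug --train_data datasets/watdiv/train_samples --val_data datasets/watdiv/val_samples --test_data datasets/watdiv/test_samples --query_string "<WatDiv-Q1 query>"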
