Eigenthemes

Source code for "Low-rank Subspaces for Unsupervised Entity Linking"

Detailed instructions to run the code

  1. Clone this repository using git clone https://github.com/blind-anonymous/eigenthemes.git
  2. Download Anaconda (64-bit Python 3.7 version)
    • The Anaconda installer shows the following prompt: 'Do you wish the installer to initialize Anaconda3 by running conda init? [yes|no]'. Answering 'yes' is the simpler option, because your 'bashrc'/'bash_profile' is then updated automatically with the required environment variables.
    • If you answered 'yes' in the previous step, run source <path-to-your bashrc or bash_profile> to apply those environment variables in your currently active terminal.
  3. Set up the virtual environment named el, which installs all the required dependencies: conda env create -f el.yml
  4. Activate the installed environment: conda activate el
  5. Download the resources (data and embeddings) available via Google Drive (no sign-in required)
    1. Unzip the data.zip file into the empty data directory provided with the code repository
    2. Unzip the deepwalk_wikidata.pickle.zip file into the empty embeddings directory provided with the code repository
  6. Download the resources for Le and Titov (pretrained models) available via Google Drive (no sign-in required)
    1. Unzip the tau-MILND_models.zip file into the empty models directory provided with the code repository
      Important Note: If you want to train the model from scratch, first remove any saved models using rm -rf models/*. Then retrain using bash train_taumilnd.sh, which trains five different models on the train set.
  7. Reproducing the results presented in Table 2
    • NameMatch Baseline: Run python namematch.py. This script produces the results for the name-matching baseline described in the paper for each of the four datasets considered in this study.
    • MIL-ND by Le and Titov: Run bash evaluate_taumilnd.sh. This script produces the results for the state-of-the-art MIL-ND for each of the four datasets. It also prints to the terminal the mean and standard deviation of precision@1 and MRR over five independent runs of MIL-ND.
    • Eigen (Proposed Technique): Run python unsupervised_el.py. This script produces the results for Eigen for all four datasets. Eigenthemes (Eigen) is described in the paper.
    • The overall micro Precision@1 and MRR are in the 12th and 13th columns of the results files. The remaining columns are labeled by the descriptive header in each output file.
      Important Note: Results are stored in the results directory provided with the code repository. Precomputed results for the techniques above on all datasets are already included in that directory, and the result filenames are self-explanatory.
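For readers unfamiliar with the two metrics reported in the results files, the following is a minimal sketch of the standard definitions of micro Precision@1 and MRR, pooled over all mentions, plus a mean and standard deviation over runs. It is not taken from the repository's scripts. The function names micro_metrics and summarize_runs, the input format (the 1-based rank of the gold entity per mention, or None when the gold entity is not ranked), and the use of the sample standard deviation are assumptions made for illustration.

```python
from statistics import mean, stdev


def micro_metrics(gold_ranks):
    """Return (precision@1, MRR) pooled over all mentions.

    gold_ranks: 1-based rank of the gold entity for each mention, or None
    when the gold entity was not ranked (hypothetical input format).
    """
    if not gold_ranks:
        raise ValueError("need at least one mention")
    n = len(gold_ranks)
    # Precision@1: fraction of mentions whose gold entity is ranked first.
    p_at_1 = sum(1 for r in gold_ranks if r == 1) / n
    # MRR: mean of 1/rank; unranked gold entities contribute 0.
    mrr = sum(1.0 / r for r in gold_ranks if r is not None) / n
    return p_at_1, mrr


def summarize_runs(values):
    """Mean and sample standard deviation over independent runs.

    Assumption: whether the evaluation script reports the sample or the
    population standard deviation is not stated in this README.
    """
    return mean(values), stdev(values)


# Toy ranks for four mentions (made-up numbers, not results from the paper).
p1, mrr = micro_metrics([1, 2, None, 1])
print(f"P@1={p1:.3f} MRR={mrr:.3f}")  # prints "P@1=0.500 MRR=0.625"
```

Because the averages are taken over all mentions at once, datasets with more mentions weigh more in the overall figure, which is what "micro" refers to here.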

eigenthemes's People

Contributors

akhilarora

eigenthemes's Issues

Missing test_avg.sh

Hello!

Thank you so much for publishing your code!
I came across your paper and found it very interesting. I'm trying to reproduce the results and got an error when reproducing the MIL-ND results. It seems that test_avg.sh is missing?

Thank you!

Some packages in el.yml cannot be found

Hi, when creating a Conda environment from the el.yml file, the versions of the intel-openmp, openssl, and mkl packages specified in the file could not be found. Removing the exact version requirements for these packages from the .yml file solves the problem.
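The workaround described in this issue can be scripted. The following is a rough sketch, assuming el.yml lists its conda dependencies one per line in the usual "- name=version=build" form. The function name unpin and the version strings in the sample are placeholders for illustration and are not taken from the repository.

```python
import re

# Matches a dependency line such as "  - mkl=1.2.3=build_0" and captures
# the list prefix, the package name, and the optional "=version..." suffix.
_DEP_LINE = re.compile(r"^(\s*-\s*)([A-Za-z0-9_.\-]+)(=.*)?$")


def unpin(yml_text, packages):
    """Drop the version pin from the named packages, leaving others as-is."""
    out = []
    for line in yml_text.split("\n"):
        m = _DEP_LINE.match(line)
        # Compare the full package name so that, for example, "mkl" does
        # not also match "mkl-service".
        if m and m.group(2) in packages:
            line = m.group(1) + m.group(2)
        out.append(line)
    return "\n".join(out)


# Placeholder content: the version strings below are invented for the demo.
sample = """dependencies:
  - intel-openmp=0.0.0=placeholder_0
  - mkl=0.0.0=placeholder_0
  - mkl-service=0.0.0=placeholder_0
  - openssl=0.0.0=placeholder_0
"""
print(unpin(sample, {"intel-openmp", "openssl", "mkl"}))
```

To apply this to the real file, read el.yml, pass its text through unpin with the three package names from the issue, and write the result back before running conda env create -f el.yml. Unpinned packages may resolve to newer versions than the authors used.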
