txt2graphLLMs

On General and Biomedical Text-to-Graph Large Language Models.

Environment setup

We recommend conda as the package manager. The environment requirements are listed in environment.yml. Note that this environment evolved from an earlier project, so it is not minimal and may contain packages that this repository does not need.

To create the environment, run the following commands.

conda env create -f environment.yml
conda activate txt2graph_llms

Load and pre-process data

Load the Web NLG (version 3.0) dataset and pre-process both Web NLG and Bio Event by running the following command. The results can be found in the \data folder.

python scripts/pre_process_data.py

Run experiments for in-context learning (iCL) and fine-tuning (FT)

To run the main experiments (using 8 in-context examples), run the following commands. The results can be found in the \output folder.

You will be prompted for your Hugging Face login key. In the fine-tuning case you will also be prompted for your Hugging Face username. All fine-tuned models are saved both locally and in your personal Hugging Face repository.

python scripts/run_experiments_icl.py --models all
python scripts/run_experiments_ft.py --models all 

To run the next set of experiments, which vary the number of in-context examples, run the following commands.

python scripts/run_experiments_icl.py --models mistral --icl_examples 0 2 4 8 16 32
python scripts/run_experiments_icl.py --models llama --icl_examples 0 2 4 8
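To make the role of `--icl_examples` concrete, the following is a minimal sketch of how a prompt with k in-context examples could be assembled. It is not taken from this repository: the function names, the instruction sentence, and the `(subject | relation | object)` linearization are all assumptions made for illustration, and the actual scripts may format prompts differently.

```python
from typing import List, Tuple

# A graph is represented here as a list of (subject, relation, object) triples.
Triple = Tuple[str, str, str]
Example = Tuple[str, List[Triple]]


def format_graph(triples: List[Triple]) -> str:
    """Linearize triples as '(s | r | o)' strings (hypothetical format)."""
    return " ".join(f"({s} | {r} | {o})" for s, r, o in triples)


def build_prompt(examples: List[Example], query_text: str, k: int) -> str:
    """Build a prompt from the first k demonstrations followed by the query."""
    if k < 0 or k > len(examples):
        raise ValueError("k must be between 0 and the number of examples")
    # Hypothetical instruction line; k = 0 yields a zero-shot prompt.
    parts = ["Extract the graph triples from the text."]
    for text, triples in examples[:k]:
        parts.append(f"Text: {text}\nGraph: {format_graph(triples)}")
    # The query ends with an open 'Graph:' slot for the model to complete.
    parts.append(f"Text: {query_text}\nGraph:")
    return "\n\n".join(parts)


demos: List[Example] = [
    ("Alan Bean was born in Wheeler.", [("Alan Bean", "birthPlace", "Wheeler")]),
    ("Wheeler is in Texas.", [("Wheeler", "isPartOf", "Texas")]),
]
prompt = build_prompt(demos, "Texas is in the United States.", k=2)
print(prompt)
```

With k set to 0 the prompt contains only the instruction and the query, which corresponds to the zero-shot end of the sweep.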

In both cases, to run a single model on a single dataset, edit the options in the following commands so that only the model and dataset you need remain:

python scripts/run_experiments_icl.py --models mistralai/Mistral-7B-v0.1 mistralai/Mistral-7B-Instruct-v0.1 Open-Orca/Mistral-7B-OpenOrca meta-llama/Llama-2-7b-chat-hf meta-llama/Llama-2-13b-chat-hf epfl-llm/meditron-7b --icl_examples 0 2 4 8 16 32 --datasets web_nlg bio_event --batch_size 24 --seed 41 --debug False
python scripts/run_experiments_ft.py --models t5-small t5-base t5-large facebook/bart-large ibm/knowgl-large --datasets web_nlg bio_event --batch_size 24 --epochs 10 --seed 41 --debug False
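If you want one invocation per model and dataset pair, the command lines can be generated rather than edited by hand. The sketch below only builds and prints the commands; it uses the flags shown above and assumes that `--models` and `--datasets` each accept a single value, which has not been verified against the scripts. The helper name `build_icl_commands` is hypothetical, and the `--debug` flag is left out because its parsing is not documented here.

```python
import shlex
from itertools import product
from typing import List


def build_icl_commands(
    models: List[str],
    datasets: List[str],
    icl_examples: List[int],
    batch_size: int = 24,
    seed: int = 41,
) -> List[List[str]]:
    """Return one argument list per (model, dataset) combination."""
    commands = []
    for model, dataset in product(models, datasets):
        commands.append([
            "python", "scripts/run_experiments_icl.py",
            "--models", model,
            "--icl_examples", *[str(k) for k in icl_examples],
            "--datasets", dataset,
            "--batch_size", str(batch_size),
            "--seed", str(seed),
        ])
    return commands


commands = build_icl_commands(
    models=["mistralai/Mistral-7B-v0.1", "epfl-llm/meditron-7b"],
    datasets=["web_nlg", "bio_event"],
    icl_examples=[0, 2, 4, 8],
)
for cmd in commands:
    # shlex.join quotes arguments so each line can be pasted into a shell.
    print(shlex.join(cmd))
```

Each argument list can also be passed to `subprocess.run(cmd, check=True)` from the repository root to execute the runs one after another.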

Demo notebook

Check out demo_results.ipynb to recreate the graphs in the main body of our paper. The commands above must be run first to produce the required model output. This repository is a cleaned-up version of the codebase used to produce the graphs in the paper; the results should be replicable, but this has not been tested.

Code

The main scripts can be found in \scripts with various util functions in \scripts\utils.

Citing our work
