A Knowledge Graph of Clinical Trials (CTKG)

Clinical Trials Knowledge Graph (CTKG) is a knowledge graph constructed from the clinical trial data in the Access to Aggregate Content of ClinicalTrials.gov (AACT) database [1]. CTKG includes nodes representing medical entities in clinical trials (e.g., studies, drugs, conditions) and edges representing the relations among these entities (e.g., drugs used in studies). It contains 1,496,684 nodes of 18 node types and 3,667,750 triplets of 21 relation types. The repository also provides three notebooks showing how to explore and analyze CTKG using knowledge graph embeddings.
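To make the node-type and relation-type counts concrete, here is a minimal sketch of how such statistics could be computed from a list of (head, relation, tail) triplets. The triplets, the identifiers, and the "study-drug" relation name are made up for illustration; they are not taken from the CTKG files, whose actual layout may differ.

```python
from collections import Counter

# Toy triplets in (head, relation, tail) form; each node is a (node_type, id) pair.
# All identifiers and the "study-drug" relation name are hypothetical.
triplets = [
    (("study", "S1"), "study-condition", ("condition", "C1")),
    (("study", "S1"), "study-drug", ("drug", "D1")),
    (("study", "S2"), "study-condition", ("condition", "C1")),
]


def summarize(triplets):
    """Count distinct nodes per node type and triplets per relation type."""
    nodes = set()
    for head, _, tail in triplets:
        nodes.add(head)
        nodes.add(tail)
    nodes_per_type = Counter(node_type for node_type, _ in nodes)
    triplets_per_relation = Counter(rel for _, rel, _ in triplets)
    return nodes_per_type, triplets_per_relation


nodes_per_type, triplets_per_relation = summarize(triplets)
print(dict(nodes_per_type))
print(dict(triplets_per_relation))
```

Summing the first counter gives the total number of nodes, and summing the second gives the total number of triplets.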

This work has been published in Scientific Reports (https://www.nature.com/articles/s41598-022-08454-z).

Schema

CTKG dataset

The directory rawdata contains all the entities and relations:

  • attributes.zip : the attributes of entities (e.g., "study").
  • relations.zip : the attributes of relations between two types of entities (e.g., "study" --- study-condition --- "condition").
  • reverse.zip : the attributes of reverse relations between two types of entities (e.g., "condition" --- condition-study --- "study").
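The relationship between a forward relation and its reverse can be sketched as below. This assumes, based only on the two examples above, that a relation is named "<head type>-<tail type>" with a single separating hyphen; the function name and the placeholder identifiers are hypothetical and are not part of the repository.

```python
def reverse_triplet(head, relation, tail):
    """Swap head and tail, and flip a relation named "<a>-<b>" into "<b>-<a>".

    Assumes the relation name contains one hyphen separating the two entity types.
    """
    left, right = relation.split("-", 1)
    return tail, f"{right}-{left}", head


# Placeholder identifiers, not real CTKG entries.
forward = ("study_example", "study-condition", "condition_example")
print(reverse_triplet(*forward))
```

Applying the function twice returns the original triplet, which is a quick sanity check on the naming assumption.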

Embedding analysis

The directory scripts contains the Jupyter notebooks for the embedding analysis:

  • loading_ctkg_in_dgl.ipynb is a notebook to load CTKG as a graph using the Deep Graph Library (https://www.dgl.ai/).
  • Train_embeddings.ipynb is a notebook to generate the embeddings for nodes and relations in CTKG.
  • Subtype_entity_similarity_analysis.ipynb is a notebook to retrieve similar nodes of a certain node type.
  • Crosstype_entity_similarity_analysis.ipynb is a notebook for the drug repurposing analysis in the manuscript.
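As a rough illustration of what retrieving similar nodes from embeddings involves, the sketch below ranks nodes of one type by cosine similarity to a query node. Cosine similarity is an assumption here; the notebooks may use a different score. The embedding values and drug names are invented toy data.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(query, embeddings, k=2):
    """Return the k (name, score) pairs most similar to `query`, excluding the query itself."""
    scores = [
        (name, cosine(embeddings[query], vec))
        for name, vec in embeddings.items()
        if name != query
    ]
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:k]


# Toy embeddings for nodes of a single (hypothetical) node type.
drug_embeddings = {
    "drug_a": [1.0, 0.0],
    "drug_b": [0.9, 0.1],
    "drug_c": [0.0, 1.0],
}
print(most_similar("drug_a", drug_embeddings, k=1))
```

Restricting the `embeddings` dictionary to one node type mirrors the within-type analysis; passing embeddings of a different node type as candidates would be the cross-type analogue.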

Before running the notebooks, unzip rawdata/ctkg.zip and rawdata/attributes.zip, and install DGL (https://www.dgl.ai/) and PyTorch. If the embedding-training command fails when run from inside the notebook, run the same command in a terminal with DGL 0.4.3 installed.
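The unzip step can also be done from Python with the standard library. The sketch below assumes the archives should be extracted in place inside rawdata; the notebooks may expect a different location, so adjust the target directory if needed. The function name is hypothetical.

```python
import zipfile
from pathlib import Path


def extract_archives(rawdata_dir, archive_names):
    """Extract each named zip archive into rawdata_dir; return the names that were extracted."""
    rawdata_dir = Path(rawdata_dir)
    extracted = []
    for name in archive_names:
        archive = rawdata_dir / name
        if not archive.is_file():
            # Report and skip rather than fail, so a partial checkout is easy to diagnose.
            print(f"skipping missing archive: {archive}")
            continue
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(rawdata_dir)
        extracted.append(name)
    return extracted


if __name__ == "__main__":
    # Archive names as given in the instructions above.
    extract_archives("rawdata", ["ctkg.zip", "attributes.zip"])
```

Run it from the repository root so that the relative path rawdata resolves correctly.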

Citation

@Article{ctkg,
  author    = {Ziqi Chen and Bo Peng and Vassilis N. Ioannidis and Mufei Li and George Karypis and Xia Ning},
  journal   = {Scientific Reports},
  title     = {A knowledge graph of clinical trials (CTKG)},
  year      = {2022},
  month     = {mar},
  number    = {1},
  volume    = {12},
  doi       = {10.1038/s41598-022-08454-z},
  publisher = {Springer Science and Business Media {LLC}},
}

Reference

Footnotes

  1. Tasneem A, Aberle L, Ananth H, Chakraborty S, Chiswell K, McCourt BJ, et al. (2012) The Database for Aggregate Analysis of ClinicalTrials.gov (AACT) and Subsequent Regrouping by Clinical Specialty. PLoS ONE 7(3): e33677. https://doi.org/10.1371/journal.pone.0033677
