
ramp-db's Introduction


New! RaMP 2.0!

RaMP 2.0 is now released and includes an updated backend database with expanded annotations for >150,000 metabolites and ~14,000 genes/proteins. Annotations include biological pathways, chemical classes and structures (metabolites only), ontologies (metabolites only), and enzyme-metabolite relationships based on chemical reactions. Annotations are drawn from HMDB, KEGG (through HMDB), LIPID MAPS, WikiPathways, Reactome, and ChEBI.

This R package includes functions that allow users to interface with this up-to-date and comprehensive resource. Functionalities include 1) simple and batch queries for pathways, ontologies, chemical annotations, and reaction-level gene-metabolite relationships; 2) pathway and chemical enrichment analyses.

The code used to build the backend RaMP database is freely available at https://github.com/ncats/RaMP-Backend.

Please click here to view our latest manuscript.

Web Interface

Our new revamped web interface can be found at https://rampdb.nih.gov/. The code is publicly available at https://github.com/ncats/RaMP-Client/.

APIs

API access is now available here.

Why RaMP (Relational Database of Metabolomic Pathways)

The purpose of RaMP is to provide a publicly available database that integrates biological, chemical, and other annotations for metabolites and genes/proteins from multiple sources. The database structure and data are available as a MySQL dump file that can be downloaded directly from Figshare for integration into any tool. Please see the Installation Instructions for the database download link. Please note that this project is in continuous development and we appreciate any feedback.

Contact Info:

For any questions or feedback, please send us a note at [email protected].

If you find a bug, please submit an issue through this GitHub repo.

Basic Features:

The R packages and associated app perform the following queries:

1. Retrieve analytes (genes, proteins, metabolites) given pathway(s) as input.
2. Retrieve pathway annotations given analytes as input.
3. Retrieve chemical annotations/structures given metabolites as input.
4. Retrieve analytes involved in the same reaction (e.g. enzymes catalyzing reactions involving input metabolites).
5. Retrieve ontologies (e.g. biospecimen location, disease, etc.) given input metabolites.

The following analyses are also supported:

1. Multi-omic pathway enrichment analysis
2. Chemical enrichment analyses

Last date of dump file update: 03/02/2023

Vignette

Detailed instructions for installing RaMP locally are below. We've also put together a vignette to get you started on the analyses. Click here for vignette.

Citation

If you use RaMP-DB, please cite the following work:

Braisted J, Patt A, Tindall C, Sheils T, Neyra J, Spencer K, Eicher T, Mathé EA. RaMP-DB 2.0: a renovated knowledgebase for deriving biological and chemical insight from metabolites, proteins, and genes. Bioinformatics. 2023 Jan 1;39(1):btac726. doi: 10.1093/bioinformatics/btac726. PMID: 36373969; PMCID: PMC9825745. To access, click here

Zhang, B., et al., RaMP: A Comprehensive Relational Database of Metabolomics Pathways for Pathway Enrichment Analysis of Genes and Metabolites. Metabolites, 2018. 8(1). PMID: 29470400; PMCID: PMC5876005; DOI: 10.3390/metabo8010016 To access, click here

Installation Instructions

In order to use this R package locally, you will need the following:

  • The R code under this repo
  • The mysql dump file that contains the RaMP database. Download here.

If you would like to know how to build the RaMP database from scratch, please see the RaMP-Backend repository at https://github.com/ncats/RaMP-Backend.

MySQL set-up

RaMP requires that MySQL and the RaMP database be set up on the machine that you will be running the R package from. To download MySQL, you can go to the MySQL Downloads page.

When installing, you will be prompted to create a password for the user "root", or one will be created automatically for you. Importantly, remember your MySQL password! You will need it to log in to MySQL and to pass as an argument to the RaMP R Shiny web application.

If you want to reset your password, see MySQL Reference Manual 5.7, How to Reset the Root Password (https://dev.mysql.com/doc/refman/5.7/en/resetting-permissions.html).

Please note that you will need administrator privileges for this step.

If you are using a Mac, we recommend using brew to install MySQL. Here's a good tutorial: https://www.novicedev.com/blog/how-install-mysql-macos-homebrew.

Creating the database locally

Once your MySQL environment is in place, creating the RaMP database locally is trivial. First, launch MySQL and create the database:

> mysql -u root -p
mysql> create database ramp;
mysql> exit;

Here, we are naming the database "ramp", but you can use any name you'd like. Note, though, that the R package assumes by default that the database is named "ramp". If you change the name, remember to pass that name as an argument to the R package functions.

Second, download and unzip the latest RaMP database. Download here.

Third, populate the named database with the MySQL dump file. Supply the path and file name of the unzipped SQL file that you've downloaded.

> mysql -u root -p ramp < /your/file/path/here/ramp_<current_version_id_here>.sql  

You're done!

Your "ramp" database should contain the following 12 tables:

  1. analyte
  2. analytehasontology
  3. analytehaspathway
  4. analytesynonym
  5. catalyzed
  6. chem_props
  7. db_version
  8. metabolite_class
  9. ontology
  10. pathway
  11. source
  12. version_info

If you want to explore this in MySQL, you can try:

mysql -u root
use ramp;
show tables;
select * from source limit 4; 
select * from source where commonName = "creatine riboside";
select distinct(HMDBOntologyType) from ontology;

Install and load the RaMP package

You can install this package directly from GitHub using the install_github() function available through the devtools package. In the R Console, type the following:

# Locally install RaMP
install.packages("devtools")
library(devtools)
install_github("ncats/RAMP-DB")

# Load the package
library(RaMP)

# Set up your connection to the RaMP2.0 database:
pkg.globals <- setConnectionToRaMP(dbname="ramp",username="root",conpass="",host = "localhost")

Note that prior to using RaMP functions, you must establish the parameters required to connect to your local database (if you are not using the web app). This is done with a single function call (the last line in the code snippet above).

If the username is different from "root", specify it in the "username" parameter. Similarly, if the name of the database is different from "ramp", specify it in the "dbname" parameter.

Important Notes

If you reinstall the latest version of the RaMP package, be sure to also install the latest version of the MySQL RaMP dump file.

Also, when gene or metabolite IDs are input for queries, IDs should be prepended with their database of origin, e.g. kegg:C02712, hmdb:HMDB04824, or CAS:2566-39-4. A list of metabolite or gene/protein IDs may mix sources. Remember to include the colon in the prefix. The ID prefixes currently included in RaMP are:

Analyte Type    | ID Prefix Types
Metabolites     | hmdb, pubchem, chebi, chemspider, kegg, CAS, LIPIDMAPS, swisslipids, lipidbank, wikidata, plantfa, kegg_glycan
Genes/Proteins  | ensembl, entrez, gene_symbol, uniprot, hmdb, ncbiprotein, EN, wikidata, chebi

To query the supported ID types in MySQL:

mysql> select distinct(IDtype) from source where geneOrCompound = "compound";
mysql> select distinct(IDtype) from source where geneOrCompound = "gene";
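
Because a missing or misspelled prefix is an easy mistake to make, it can help to screen IDs before running a query. The sketch below uses base R only and is not part of the RaMP package: `has_known_prefix` is a hypothetical helper, the prefix vectors are copied from the table above, and the exact, case-sensitive match is an assumption about how strict the check should be.

```r
# Prefixes copied from the ID prefix table in this README.
metabolite_prefixes <- c("hmdb", "pubchem", "chebi", "chemspider", "kegg", "CAS",
                         "LIPIDMAPS", "swisslipids", "lipidbank", "wikidata",
                         "plantfa", "kegg_glycan")
gene_prefixes <- c("ensembl", "entrez", "gene_symbol", "uniprot", "hmdb",
                   "ncbiprotein", "EN", "wikidata", "chebi")

# Hypothetical helper (not a RaMP function): returns TRUE for IDs of the
# form "prefix:value" whose prefix appears in the supplied prefix list.
has_known_prefix <- function(ids,
                             prefixes = union(metabolite_prefixes, gene_prefixes)) {
  has_colon <- grepl(":", ids, fixed = TRUE)
  prefix <- sub(":.*$", "", ids)  # text before the first colon
  has_colon & prefix %in% prefixes
}

ids <- c("kegg:C02712", "hmdb:HMDB04824", "CAS:2566-39-4", "HMDB04824", "foo:123")
ok <- has_known_prefix(ids)

# IDs that would need fixing before being passed to a query
print(ids[!ok])
```

In this example the last two IDs are flagged: one has no prefix at all and the other uses a prefix that is not in the table.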

Current Authors and Testers

Previous Authors/Testers

  • Cole Tindall
  • Bofei Zhang - Bofei5675
  • Shunchao Wang
  • Rohith Vanam
  • Jorge Neyra - Jorgeso

ramp-db's People

Contributors

amvirdev, andyptt21, bofei5675, ewymathe, jalalsiddiqui, johnbraisted, jorgeso, kartiksl, mapleknight, mathelab, tsheils, wangk8


ramp-db's Issues

Error installing RaMP locally with R

Hi,
I am using a Windows laptop + Rstudio and I am having the following error when installing RaMP as a remote db.
Could you please take a look?

remotes::install_github("Mathelab/RaMP-DB")
....
── R CMD build ───────────────────────────────────────────────────────────────
WARNING: Rtools is required to build R packages, but is not currently installed.

Please download and install Rtools 4.2 from https://cran.r-project.org/bin/windows/Rtools/ or https://www.r-project.org/nosvn/winutf8/ucrt3/.
✔  checking for file 'C:\Users\krist\AppData\Local\Temp\Rtmpy6pOjk\remotes2cf01a0a77eb\Mathelab-RaMP-DB-5acaded/DESCRIPTION' ...
─  preparing 'RaMP':
✔  checking DESCRIPTION meta-information ... 
   Warning in grepl(e, files, perl = TRUE, ignore.case = TRUE) :
     PCRE pattern compilation error
   	'unrecognized character follows \'
   	at 'img/.*$'
   Error in grepl(e, files, perl = TRUE, ignore.case = TRUE) : 
     invalid regular expression '^\img/.*$'
   Execution halted
Error: Failed to install 'RaMP' from GitHub:
  ! System command 'Rcmd.exe' failed

I am not sure if that's related to Rtools that cannot be detected.
Thank you in advance
Kristina

504 Server Error: Gateway Time-out

Hi,

Thank you for making such a great resource, that's really handy for the metabolomics research community.

I am using the API through python and when requesting the "analytes-from-pathways" I am occasionally having the following error:

....
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 504 Server Error: Gateway Time-out for url: https://rampdb.nih.gov/api/analytes-from-pathways?pathway=2...

My list includes about 80 pathways.

Do you maybe have any suggestion that can make my request working? Have you experienced a previous issue like this?

Thank you in advance
Kristina
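
One thing that may be worth trying with long pathway lists is to split the list into smaller batches and send one request per batch. The base-R sketch below shows only the splitting step. `chunk_vector` is a hypothetical helper, the batch size of 20 and the placeholder pathway names are arbitrary, and whether smaller requests actually avoid the 504 on the RaMP server is an assumption that this thread does not confirm.

```r
# Hypothetical helper: split a vector into consecutive batches of at most
# `size` elements, preserving the original order.
chunk_vector <- function(x, size) {
  unname(split(x, ceiling(seq_along(x) / size)))
}

# Placeholder names standing in for the ~80 pathways mentioned above.
pathways <- paste0("pathway_", seq_len(80))
batches <- chunk_vector(pathways, size = 20)

# Each element of `batches` would be sent as its own request,
# and the per-batch results combined afterwards.
print(lengths(batches))
```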

Install failed

I successfully loaded the mySQL database, but am getting an error when I try to download the R package from this website:

> library(devtools)
> install_github("ncats/RAMP-DB")
Downloading GitHub repo ncats/RAMP-DB@HEAD
Error in utils::download.file(url, path, method = method, quiet = quiet, :
download from 'https://api.github.com/repos/ncats/RAMP-DB/tarball/HEAD' failed

I haven't had problems using the install_github() function with other packages. Would appreciate any advice. Thanks!

Option to supply a 'population' of analytes.

Fisher exact scores build a 2x2 contingency matrix that tally the items (analytes) being tested for associations. If the analytes are genes, for instance, the population of genes of interest isn't the full genome, but rather the list of genes that were quantified from which the set of interest was selected (by stats or clustering, etc.).
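
To make the request concrete, here is a minimal sketch of how a user-supplied population changes the 2x2 table. It uses base R's `fisher.test` and is not RaMP's implementation: `fisher_with_background` is a hypothetical helper, and the gene names are made up for illustration.

```r
# Hypothetical helper (not a RaMP function): one-sided Fisher's exact test
# for over-representation of a pathway in a set of interest, where the
# population is a user-supplied background rather than the whole database.
fisher_with_background <- function(interest, pathway, background) {
  background <- unique(background)
  # Analytes outside the background were never measured, so drop them.
  interest <- intersect(interest, background)
  pathway <- intersect(pathway, background)

  in_both <- length(intersect(interest, pathway))
  interest_only <- length(setdiff(interest, pathway))
  pathway_only <- length(setdiff(pathway, interest))
  neither <- length(background) - in_both - interest_only - pathway_only

  # Rows: in pathway yes/no. Columns: in set of interest yes/no.
  tab <- matrix(c(in_both, interest_only, pathway_only, neither),
                nrow = 2,
                dimnames = list(in_pathway = c("yes", "no"),
                                in_interest = c("yes", "no")))
  list(table = tab, test = fisher.test(tab, alternative = "greater"))
}

background <- paste0("g", 1:20)          # genes quantified in the assay
pathway <- paste0("g", 1:5)              # pathway members
interest <- c("g1", "g2", "g3", "g10")   # genes selected by stats or clustering

res <- fisher_with_background(interest, pathway, background)
print(res$table)
print(res$test$p.value)
```

With the full genome as the population, the "neither" cell would be far larger and the p-value correspondingly smaller, which is the bias the option is meant to avoid.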

Cannot install RaMP R package on NIH laptop: PCRE compilation error

I am running RStudio Version 1.3.959 on an NIH laptop with Windows 10 Enterprise. I have saved the RaMP database file, installed MySQL version 5.7, and set up the RaMP database in MySQL as specified in the instructions. I have also installed the devtools package in R. However, when I try to install the RaMP R package, I get a PCRE compilation error as shown below:

library(devtools)
Loading required package: usethis
install_github("ncats/RAMP-DB")
Downloading GitHub repo ncats/RAMP-DB@master
√ checking for file 'C:\Users\eichertd\AppData\Local\Temp\RtmpemI9FW\remotes297c1e725c1b\ncats-RaMP-DB-1e7a7d2/DESCRIPTION' (365ms)
-- preparing 'RaMP': (339ms)
√ checking DESCRIPTION meta-information ...
Warning in grepl(e, files, perl = TRUE, ignore.case = TRUE) :
PCRE pattern compilation error
'unrecognized character follows \'
at 'img/.*$'
Error in grepl(e, files, perl = TRUE, ignore.case = TRUE) :
invalid regular expression '^\img/.*$'
Execution halted
Error: Failed to install 'RaMP' from GitHub:
System command 'Rcmd.exe' failed, exit status: 1, stdout + stderr:
E> * checking for file 'C:\Users\eichertd\AppData\Local\Temp\RtmpemI9FW\remotes297c1e725c1b\ncats-RaMP-DB-1e7a7d2/DESCRIPTION' ... OK
E> * preparing 'RaMP':
E> * checking DESCRIPTION meta-information ... OK
E> Warning in grepl(e, files, perl = TRUE, ignore.case = TRUE) :
E> PCRE pattern compilation error
E> 'unrecognized character follows \'
E> at 'img/.*$'
E> Error in grepl(e, files, perl = TRUE, ignore.case = TRUE) :
E> invalid regular expression '^\img/.*$'
E> Execution halted

Biological Pathway Enrichment Issues with Web GUI

Hi,
I am analyzing around 115 HMDB numbers from metabolites identified in blood samples and was trying to perform biological pathway enrichment. However, the biological pathway enrichment only completes when no sample type is selected and no p-values are ever calculated, even with a subset of data. Additionally, I cannot download the output for the full dataset, only a subset. I am currently attempting to switch to running this analysis in R on my local device. Will that help? Any other suggestions?
Thanks for your help!

SQL Call issue only_full_group_by

Running into a MySQL issue when calling any RaMP functions that connect with the ramp MySQL database:

> fisher.results <- runCombinedFisherTest(analytes = c("hmdb:HMDB0000033","hmdb:HMDB0000052"))
[1] "Running Fisher's tests on metabolites"
[1] "Fisher Testing ......"
[1] "Starting getPathwayFromAnalyte()"
[1] "Working on ID List..."
Error: Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'ramp2.p.pathwayName' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by [1055]

The issue is due to the ONLY_FULL_GROUP_BY SQL mode, which is enabled by default in some MySQL versions.
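
A possible workaround, sketched here under assumptions, is to remove that flag from the server's `sql_mode`. The base-R snippet below only builds the statement to run in the mysql client. `drop_sql_mode_flag` is a hypothetical helper, and the example mode string is illustrative rather than taken from a real server.

```r
# Hypothetical helper: remove one flag from a comma-separated sql_mode string.
drop_sql_mode_flag <- function(sql_mode, flag = "ONLY_FULL_GROUP_BY") {
  modes <- trimws(strsplit(sql_mode, ",", fixed = TRUE)[[1]])
  paste(modes[toupper(modes) != toupper(flag)], collapse = ",")
}

# Illustrative value; on a real server, read it with: SELECT @@GLOBAL.sql_mode;
current_mode <- "ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE"
new_mode <- drop_sql_mode_flag(current_mode)

# Statement to run in the mysql client.
statement <- sprintf("SET GLOBAL sql_mode = '%s';", new_mode)
print(statement)
```

Note that `SET GLOBAL` requires administrator privileges, applies only to connections opened afterwards, and does not survive a server restart unless the mode is also set in the MySQL configuration file.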

Issue with RaMP when performing Pathway Enrichment

I am trying to return pathways from given analytes by using a batch query for multiple analytes on the server hosted by NCATS at https://rampdb.ncats.io. When I paste the list of analytes and hit submit query everything appears to be fine. When I hit run pathway analyses the following error in the attached screenshot occurs. I have also provided an Excel file with the list of analytes (one on each row). Any help would be greatly appreciated.

Analytes_for_GitHub_Issue.xlsx
Screen Shot 2020-05-28 at 8 33 22 PM

Biological pathway enrichment only include metabolites

I'm using the web tool for biological pathway enrichment analysis using inputs from genes, proteins, metabolites and lipids. However, the analytes included in the enriched pathways only include metabolites and lipids.
Is there a bug in the code that only considers metabolites and lipids for the analysis?

Issue with Data Tables Implementation on web based server.

Kyle 8:34 PM
When using the RaMP tool online I keep getting this error. Does anyone know what it means?
"DataTables warning: table id = DataTables_Table_2 - Requested unknown parameter for '1' for row 44, column 1."

Garrett 11:22 AM
DataTables is an html/javascript package for displaying tables. The R package is convenient wrapper, but all the logic/display is in javascript.
11:23
That error message (https://datatables.net/manual/tech-notes/4) seems to be coming from the javascript. Looks like shiny is producing a table with a missing element somewhere.
11:23
I'd check to make sure row 44 column 1 of the requested table doesn't have an NA or something weird in it

Kyle 11:35 AM
I have confirmed the input file looks fine

Question about RaMP database installing

I am trying to install RaMP database according to the instruction. I am not familiar with mysql language, but it looks like it does not work.
Could you please help me out? Below is my screenshot in R. Many thanks for your help.

mysql -u root -p ramp < myramp.sql
Error: unexpected symbol in "mysql -u root"

Thanks,
Yuanyuan

Proposed Table Enhancements for RaMP

1.) Express stat p-values in scientific notation with a limit on the number of significant figures shown to 3 or 4.
2.) Consider moving the RaMP pathway name to be the second column next to Ramp Pathway ID so that viewers hit that first as they look left to right.
3.) Double check initial sorting behavior. Do we want to focus on results with multi-omics support, sorted by adjusted p-values?
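
Items 1 and 3 can be illustrated with base R. This is only a sketch of the proposed behavior, not the web app's code: the data frame, its column names, and the choice of Benjamini-Hochberg adjustment are all assumptions made for the example.

```r
# Made-up enrichment results; the column names are hypothetical.
res <- data.frame(
  pathway = c("A", "B", "C"),
  pval = c(0.0312, 1.2345e-7, 0.00045678)
)

# Adjust p-values (Benjamini-Hochberg, chosen here only as an example)
# and sort so the most significant pathways come first.
res$padj <- p.adjust(res$pval, method = "BH")
res <- res[order(res$padj), ]

# Scientific notation with 3 significant figures (digits = 2 after the point).
res$pval_fmt <- formatC(res$pval, format = "e", digits = 2)
res$padj_fmt <- formatC(res$padj, format = "e", digits = 2)

print(res[, c("pathway", "pval_fmt", "padj_fmt")], row.names = FALSE)
```

Keeping the numeric columns alongside the formatted ones lets the table sort on the true values while displaying the shortened strings.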

Pathway Annotation Survey

Supply a collection of analytes representing your identified set (not set of interest) and receive a report indicating RaMP pathway coverage. So if there's a pathway of interest, this would tell you if you have good representation in the assay relative to the total number of analytes possible for a pathway.
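
A rough sketch of what such a report could compute is shown below, in base R. `pathway_coverage` is a hypothetical helper, not an existing RaMP function, and the pathway membership list is toy data standing in for what would come from the database.

```r
# Hypothetical helper: for each pathway, count how many of its members are
# present in the identified set and report the fraction covered.
pathway_coverage <- function(identified, pathway_members) {
  identified <- unique(identified)
  out <- data.frame(
    pathway = names(pathway_members),
    n_total = vapply(pathway_members,
                     function(m) length(unique(m)), integer(1)),
    n_identified = vapply(pathway_members,
                          function(m) length(intersect(m, identified)), integer(1)),
    row.names = NULL
  )
  out$coverage <- out$n_identified / out$n_total
  out[order(-out$coverage), ]
}

# Toy data: pathway name -> member analytes.
pathway_members <- list(
  glycolysis = c("m1", "m2", "m3", "m4"),
  tca = c("m3", "m5", "m6", "m7", "m8")
)
identified <- c("m1", "m3", "m9")  # everything detected in the assay

coverage <- pathway_coverage(identified, pathway_members)
print(coverage)
```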

Are there plans to incorporate Metacyc

Hi RaMP DB team,

Are there any future plans to incorporate data from Metacyc to the current RaMP DB? If not, have you attempted to quantify how much information we would lose by depending solely on RaMP?

Thanks and this is an awesome resource

issue with load MySQL dump file

I’m trying to load the MySQL dump file downloaded here: https://figshare.com/ndownloader/files/34941486.

I got the following error:
$ mysql -u **** -p**** -h **** PubChem < ramp_2.0.7_20220428.sql
ERROR 1146 (42S02) at line 30: Table 'PubChem.analyte' doesn't exist

I have already created the PubChem db. It looks like there is no schema defined in the above SQL file, i.e. no CREATE TABLE statements, so MySQL has no idea where to insert the data. Could you please check?
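
A quick way to confirm whether a dump file defines a schema is to count its CREATE TABLE statements before loading it. The base-R sketch below runs on a small toy file written to a temporary location. `count_create_table` is a hypothetical helper, and because it reads the whole file into memory it may be impractical for a very large dump.

```r
# Hypothetical helper: count lines that start a CREATE TABLE statement.
count_create_table <- function(path) {
  lines <- readLines(path, warn = FALSE)
  sum(grepl("^[[:space:]]*CREATE TABLE", lines, ignore.case = TRUE))
}

# Toy dump written to a temp file for illustration only.
toy_dump <- tempfile(fileext = ".sql")
writeLines(c(
  "-- toy dump for illustration",
  "CREATE TABLE `analyte` (`id` INT);",
  "INSERT INTO `analyte` VALUES (1);"
), toy_dump)

n_tables <- count_create_table(toy_dump)
print(n_tables)
```

A count of zero on the real file would support the diagnosis above that the dump contains data but no table definitions.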

Error with call to RaMP() on Mac

The version of RaMP on the sqlite branch was installed. The below traceback occurred with github actions when running R CMD check on the RcometsAnalytics R package. The error only occurs on a Mac.

*** caught segfault ***
address 0x0, cause 'unknown'

Traceback:
1: getLoadedDLLs()
2: get_lib_path()
3: extension_load(db@ptr, get_lib_path(), paste0("sqlite3_", extension, "_init"))
4: initExtension(conn)
5: .local(drv, ...)
6: dbConnect(SQLite(), dbname = dbfile, cache_size = 64000L, synchronous = "off", flags = SQLITE_RO, vfs = "unix-none")
7: dbConnect(SQLite(), dbname = dbfile, cache_size = 64000L, synchronous = "off", flags = SQLITE_RO, vfs = "unix-none")
8: .sql_connect_RO(.sql_dbfile(bfc))
9: tryCatchList(expr, classes, parentenv, handlers)
10: tryCatch({ info <- .sql_connect_RO(.sql_dbfile(bfc)) con <- info$con src <- src_dbi(con) tbl <- tbl(src, "metadata") %>% collect(n = Inf)}, finally = { .sql_disconnect(info)})
11: .sql_schema_version(bfc)
12: .sql_validate_version(bfc)
13: .sql_create_db(bfc)
14: BiocFileCache(cache = getBFCOption("CACHE"), ask = FALSE)
15: listRaMPVersions(local = TRUE)
16: RaMP()

Calculate Effect Size from Chemical Class Enrichment

This is not necessarily an issue, but how would someone calculate effect size from chemical class enrichment results? I've looked at a few methods for fisher exact test such as Cramer's V or odds ratio, but it's not clear to me how to apply the output from chemical class enrichment to these methods. Perhaps you have a better suggestion for calculating an effect size?

Thank you for your help!
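
As one possible approach, an odds ratio can be computed once the four counts behind a class's Fisher test are known. The sketch below uses base R with made-up counts. How to obtain these counts from the chemical class enrichment output is not shown here and would need to be checked against the actual result columns.

```r
# Made-up counts for a single chemical class (assumptions for illustration).
in_list_in_class <- 8      # input metabolites that belong to the class
bg_in_class <- 40          # background metabolites in the class, not in the input
in_list_not_class <- 12    # input metabolites outside the class
bg_not_class <- 940        # background metabolites outside the class, not in the input

# Rows: in input list yes/no. Columns: in class yes/no.
tab <- matrix(c(in_list_in_class, bg_in_class, in_list_not_class, bg_not_class),
              nrow = 2,
              dimnames = list(in_list = c("yes", "no"),
                              in_class = c("yes", "no")))

# Sample odds ratio: (a * d) / (b * c).
sample_or <- (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])

# fisher.test reports a conditional maximum-likelihood estimate of the
# odds ratio, plus a confidence interval.
ft <- fisher.test(tab)

print(sample_or)
print(ft$estimate)
print(ft$conf.int)
```

The two odds ratio estimates usually differ slightly because `fisher.test` uses a conditional estimate rather than the simple cross-product ratio.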

RAMP not recognizing known HMDBs

Hello,

I recently input a list of HMDBs into both "Chemical Classes from Metabolites" and "Biological Pathway Enrichment" into the GUI version and was surprised to see how many metabolites were not recognized, even though they seem to be present in HMDB 5.0, dating back to before the date you have listed on the Source Data section. Any help/explanation you can provide for this?
Examples: "There were no matches for the following pathways: hmdb:hmdb00532, hmdb:hmdb00821, hmdb:hmdb32055, hmdb:hmdb62551, hmdb:hmdb61115, hmdb:hmdb04983"
Thanks!
