omixer-rpmR

An R interface to omixer-rpm, the tool for metabolic module profiling of microbiome samples

Dependencies

R and Java 8 (Docker users: please make sure Java 8 is part of the R image)

Installation

Binaries for Linux only (tested on 16.04.1-Ubuntu)

Download the latest binary (omixeRpm_x.y.z.tar.gz) from the release page, then install it as follows, replacing x.y.z with the correct version:

R CMD INSTALL omixeRpm_x.y.z.tar.gz

From source

Download the latest source from the release page, then install it as follows, replacing x.y.z with the correct version:

R CMD INSTALL omixer-rpmR-x.y.z.tar.gz

Docker Shiny [experimental]

Start the docker container

docker run --rm -ti -p 3838:3838 -v $PWD:/workspace omixer/shinyrpm:0.1

then open the browser at http://localhost:3838/sample-apps/rpm/

Usage

Mapping example

Download the example matrix.tsv (raw link) from the test directory.

library(omixerRpm)
# Read a functional profile matrix into R, or create it inside R.
# Note: do not use row.names when reading the matrix.
dat <- read.table("matrix.tsv", header=TRUE, sep="\t")
# Run the module mapping on the loaded table.
mods <- rpm(dat, minimum.coverage=0.3, annotation = 1)

# alternatively run the mapping without loading the table into R.
mods <- rpm("matrix.tsv", minimum.coverage=0.3, annotation = 1)

# Load the default mapping database
db <- loadDefaultDB()
# get the name of the first predicted module
getNames(db, mods@annotation[1,])

# get the abundance|coverage as a data.frame with module id and description
coverage <- asDataFrame(mods, "coverage")
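
As a rough illustration of creating the matrix inside R instead of reading matrix.tsv, the sketch below builds a small table whose first column holds the function identifiers and whose remaining columns hold per-sample abundances, then round-trips it through a tab-separated file written without row names. The values are borrowed from a head(dat) output reported in one of the issues on this page; the column layout is an assumption based on that output, not a documented specification.

```r
# Small functional profile: identifiers first, one numeric column per sample.
# Layout assumed from the head(dat) output quoted in an issue on this page.
dat <- data.frame(
  entry = c("K00001", "K00002", "K00003"),
  S1 = c(47.162155, 335.849057, 277.056277),
  S2 = c(329.953621, 0, 0),
  stringsAsFactors = FALSE
)

# Write it as TSV without row names, so the identifiers stay a regular column.
path <- tempfile(fileext = ".tsv")
write.table(dat, path, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back the same way the mapping example reads matrix.tsv.
back <- read.table(path, header = TRUE, sep = "\t")

# Either object could then be handed to the mapping call shown above:
# mods <- rpm(back, minimum.coverage = 0.3, annotation = 1)
```

Keeping the identifiers as the first ordinary column, rather than as row names, matches the note in the example about not using row.names.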

Using an alternative database, several options are available:
  1. Load one of the bundled databases. Type listDB() to check the list of available databases.
db <- loadDB("GBMs.v1.0")
  2. Load an external database. Please refer to this module.list and module.names for examples.
db <- ModuleDB(directory = "/path/to/moduledb/", modules = "module.list", module.names.file="module.names")

Bundled databases

  1. Gut Brain Modules, Valles-Colomer et al. 2019, The neuroactive potential of the human gut microbiota in quality of life and depression, Nature Microbiology 2019.
  2. Gut Metabolic Modules, Vieira-Silva et al. 2016, Species-function relationships shape ecological properties of the human gut microbiome, Nature Microbiology 2016.

Citing omixer-rpmR

omixer-rpmR was developed as part of GOmixer. If you use omixer-rpmR in your work please cite:

Youssef Darzi, Gwen Falony, Sara Silva, Jeroen Raes. Towards biome-specific analysis of meta-omics data, The ISME journal, 2015.

License

GNU General Public License v3.0. The bundled omixer-rpm.jar is licensed under an Academic Non-commercial Software License Agreement

omixer-rpmr's Issues

missing file?

Hi,

I tried using the R package on macOS Monterey. I installed omixer-rpmR first via CMD and then later using install_github.

When I run my code:

library(tidyverse)
library(omixerRpm)

# load the KEGG orthologue table 
ko <- read.delim(
    "picrust2/data/KO_metagenome_out/pred_metagenome_unstrat_descrip.tsv",
    header = TRUE
    )

# pick most recent database 
listDB()
db <- loadDB(name = listDB()[1])

# calculate GBM abundance and store in df
gbm <- rpm(x = ko, module.db = db)

I get the following error:

Exception in thread "main" java.lang.NumberFormatException: For input string: "E1.1.1.1, adh; alcohol dehydrogenase [EC:1.1.1.1]"
        at java.base/jdk.internal.math.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2054)
        at java.base/jdk.internal.math.FloatingDecimal.parseDouble(FloatingDecimal.java:110)
        at java.base/java.lang.Double.parseDouble(Double.java:735)
        at java.base/java.lang.Double.valueOf(Double.java:698)
        at org.omixer.rpm.parsers.FunctionLineProcessor.process(FunctionLineProcessor.java:49)
        at org.omixer.utils.utils.FileUtils.readMatrix(FileUtils.java:235)
        at org.omixer.rpm.service.impl.ModuleManagerImpl.inferModules(ModuleManagerImpl.java:173)
        at org.omixer.rpm.core.InferenceApp.main(InferenceApp.java:252)
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file '/var/folders/f5/rkv6j_sj3sd8_m591syph__r0000gn/T//Rtmp7iwQds/file104e361195d7c/modules.tsv': No such file or directory

I tried using threads = 16, java.mem = 16 based on an earlier issue that was reported here but that does not help. Any ideas what could be the issue?
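
One plausible reading of the stack trace above is that the table contains a text description column, which the Java parser then tries to read as a number (the failing input string is a KO description). This is an assumption, not a confirmed diagnosis. Under that assumption, a minimal sketch of dropping every non-numeric column except the identifier column might look as follows; the data frame and the helper name keep_numeric are hypothetical stand-ins, not part of omixerRpm.

```r
# Hypothetical stand-in for a KO table that carries a description column.
ko <- data.frame(
  ko_id = c("K00001", "K00002"),
  description = c("E1.1.1.1, adh; alcohol dehydrogenase [EC:1.1.1.1]",
                  "placeholder description"),
  S1 = c(47.2, 335.8),
  S2 = c(330.0, 0),
  stringsAsFactors = FALSE
)

# Keep the first (identifier) column plus every numeric column.
keep_numeric <- function(tab) {
  is_num <- vapply(tab, is.numeric, logical(1))
  is_num[1] <- TRUE
  tab[, is_num, drop = FALSE]
}

ko_clean <- keep_numeric(ko)

# The cleaned table could then be passed on as in the report:
# gbm <- rpm(x = ko_clean, module.db = db)
```

The missing modules.tsv reported afterwards would then simply be a consequence of the Java step stopping before it wrote any output.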

could not find function "asDataFrame"

Hi there,

I went through the example and ran into the following problem with v 0.3.1.

coverage <- asDataFrame(mods, "coverage")
Error in asDataFrame(mods, "coverage") :
could not find function "asDataFrame"

Installation looked fine and everything went smoothly up until this point.

Thanks.
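
When a function from the README cannot be found, one way to narrow things down is to check whether the installed package version exports it at all. The helper below is a small base R sketch; has_export is a hypothetical name introduced here, and whether asDataFrame is exported in v0.3.1 is not known from this page.

```r
# Returns TRUE only if the package is installed and exports the given name.
has_export <- function(pkg, fun) {
  requireNamespace(pkg, quietly = TRUE) && fun %in% getNamespaceExports(pkg)
}

# Sanity check against a package that ships with R.
has_export("stats", "median")

# The same check for the function from the report (FALSE if the package
# is not installed or does not export it).
has_export("omixerRpm", "asDataFrame")
```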

Error when working with large datasets

Hi,

I'm using your R package with a (seemingly) too large dataset. It has > 6,900 columns (different bacterial genomes) and > 10,000 rows (different KOs present in the bacterial genomes). When I run the rpm function with that dataset, after spending some time computing, it crashes and shows the following error:

Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
	at sun.misc.FloatingDecimal.readJavaFormatString(Unknown Source)
	at sun.misc.FloatingDecimal.parseDouble(Unknown Source)
	at java.lang.Double.parseDouble(Unknown Source)
	at java.lang.Double.valueOf(Unknown Source)
	at org.omixer.rpm.parsers.FunctionLineProcessor.process(FunctionLineProcessor.java:49)
	at org.omixer.utils.utils.FileUtils.readMatrix(FileUtils.java:235)
	at org.omixer.rpm.service.impl.ModuleManagerImpl.inferModules(ModuleManagerImpl.java:173)
	at org.omixer.rpm.core.InferenceApp.main(InferenceApp.java:252)

Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file 'C:\Users\BVALDE~1\AppData\Local\Temp\RtmpqcBd0h\file1e406c471c1f/modules.tsv': No such file or directory

I have been using your package for around 2 months with smaller datasets and this never happened before. After a quick search on the internet, I found that the cause may be that the computation time needed to analyze the entire dataset is too long.

Is it possible to overcome this difficulty within R? I ran the same script on my personal computer and on a server, and in both cases the script crashes after working for a while.


R and OS info:

platform x86_64-w64-mingw32
arch x86_64
os mingw32
crt ucrt
system x86_64, mingw32
status Patched
major 4
minor 2.0
year 2022
month 05
day 19
svn rev 82383
language R
version.string R version 4.2.0 Patched (2022-05-19 r82383 ucrt)
nickname Vigorous Calisthenics

Also, I'm using the 0.3.2 version of the omixerRpm package
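
The OutOfMemoryError in the trace points at the Java heap rather than at running time as such. One possible workaround within R, sketched below, is to split the table into smaller groups of sample columns and process each group separately. This assumes that the module results for one sample do not depend on the other samples in the table, which is not confirmed on this page; split_samples is a hypothetical helper, not part of omixerRpm. The java.mem argument mentioned in another report on this page may also be relevant.

```r
# Split a table into chunks of at most `chunk_size` sample columns,
# repeating the identifier column (column 1) in every chunk.
split_samples <- function(tab, chunk_size) {
  sample_cols <- seq_len(ncol(tab))[-1]
  groups <- split(sample_cols, ceiling(seq_along(sample_cols) / chunk_size))
  lapply(groups, function(cols) tab[, c(1, cols), drop = FALSE])
}

# Toy table: one identifier column and five sample columns.
tab <- data.frame(
  entry = c("K00001", "K00002"),
  S1 = c(1, 2), S2 = c(3, 4), S3 = c(5, 6), S4 = c(7, 8), S5 = c(9, 10),
  stringsAsFactors = FALSE
)

chunks <- split_samples(tab, chunk_size = 2)

# Each chunk could then be mapped on its own, for example:
# results <- lapply(chunks, function(chunk) rpm(chunk, minimum.coverage = 0.3, annotation = 1))
```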

windows compatibilities

I tried to run your example on Windows.

(I installed the jar file in the java folder myself, see 1)

However, I now get this error:

> library(omixerRpm)
> # read a functional profile matrix into R or create it inside R
> dat <- read.table("test/matrix.tsv", header=T, sep="\t")
> # Run the module mapping on the loaded table.
> mods <- rpm(dat, minimum.coverage=0.3, annotation = 1)
Error in file(file, "rt") : cannot open the connection
In addition: Warning messages:
1: running command 'java -jar C:/Users/jtap/Documents/R/R-3.4.1/library/omixerRpm/java/omixer-rpm.jar -c 0.3 -s median -d C:/Users/jtap/Documents/R/R-3.4.1/library/omixerRpm/extdata/GMMs.v1.07.txt -i C:\Users\jtap\AppData\Local\Temp\RtmpG83sYA\file28d41b752272/input.tsv -o C:\Users\jtap\AppData\Local\Temp\RtmpG83sYA\file28d44d086a43 -a 1 -t 1 -e 2' had status 127 
2: In file(file, "rt") :
  cannot open file 'C:\Users\jtap\AppData\Local\Temp\RtmpG83sYA\file28d44d086a43/modules.tsv': No such file or directory

I suspect something about the path is not handled correctly on Windows systems: "/" vs "\", together with the status 127.
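
In R, a status of 127 from system() commonly means that the command itself could not be run, so besides the path separators it may be worth checking whether a java executable is visible to the R session at all. The check below is a base R sketch of that idea, offered as one possible diagnostic rather than a confirmed cause.

```r
# Look up java on the PATH as seen by R; an empty string means "not found".
java_path <- unname(Sys.which("java"))

if (nzchar(java_path)) {
  message("java found at: ", java_path)
} else {
  message("java is not on the PATH visible to R")
}
```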

Tutorial data fails

I have installed omixer-rpmR and downloaded the test data.

This works:

library(omixerRpm)
dat<-read.table("example.tsv", header=TRUE, sep="\t")
head(dat)
entry S1 S2
K00001 47.162155 329.953621
K00002 335.849057 0.000000
K00003 277.056277 0.000000
K00004 9.492025 18.155431
K00005 265.656566 10.068284
K00006 238.396625 6.329114

But then:
mods <- rpm(dat, minimum.coverage=0.3, annotation = 1)

[1] "Loaded GMMs.v1.07"
<simpleWarning in system(command): error in running command>

Warning message in system(command):
“error in running command”

Error in invokeRestart("muffleWarning"): no 'restart' 'muffleWarning' found
Traceback:

1. rpm(dat, minimum.coverage = 0.3, annotation = 1)
2. tryCatch({
 .     system(command)
 . }, warning = function(e) {
 .     print(e)
 .     if (e$message == "error in running command") {
 .         stop(e)
 .     }
 . }, error = function(e) {
 .     print(geterrmessage())
 .     stop(e)
 . })
3. tryCatchList(expr, classes, parentenv, handlers)
4. tryCatchOne(tryCatchList(expr, names[-nh], parentenv, handlers[-nh]), 
 .     names[nh], parentenv, handlers[[nh]])
5. doTryCatch(return(expr), name, parentenv, handler)
6. tryCatchList(expr, names[-nh], parentenv, handlers[-nh])
7. tryCatchOne(expr, names, parentenv, handlers[[1L]])
8. value[[3L]](cond)
9. stop(e)
10. (function (wn) 
  . {
  .     if (getOption("warn") >= 2) 
  .         return()
  .     if (getOption("warn") >= 0) {
  .         handle_condition(wn)
  .         output_handler$warning(wn)
  .     }
  .     invokeRestart("muffleWarning")
  . })(structure(list(message = "error in running command", call = system(command)), class = c("simpleWarning", 
  . "warning", "condition")))
11. invokeRestart("muffleWarning")
12. stop(gettextf("no 'restart' '%s' found", as.character(r)), domain = NA)

I get the same if I try reading the table directly into the rpm function. Any clues or advice? Thanks!

java folder missing?

Hi,

I tried to run your example code:

library(omixerRpm)
# read a functional profile matrix into R or create it inside R
dat <- read.table("test/matrix.tsv", header=T, sep="\t")
# Run the module mapping on the loaded table.
mods <- rpm(dat, minimum.coverage=0.3, annotation = 1)

but I have this error

> mods <- rpm(dat, minimum.coverage=0.3, annotation = 1)
Error: Unable to access jarfile 0.3

When I look into your code, it seems that it needs to call a jar file from a java folder

 system.file("java", "omixer-rpm.jar", package = "omixerRpm")

which is not provided in this repo. Where can I find "omixer-rpm.jar"?

Thank you,

Julien
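
A note on the lookup quoted above: system.file() returns an empty string when the requested file (or the package) is not found, so a missing jar would leave the -jar argument empty. That may be why Java ends up treating a later argument such as 0.3 as the jar file, although this is an inference from the error message. A minimal check, using only the call quoted in the report:

```r
# Empty string if the jar (or the package) is not installed where expected.
jar <- system.file("java", "omixer-rpm.jar", package = "omixerRpm")

if (nzchar(jar)) {
  message("bundled jar found at: ", jar)
} else {
  message("omixer-rpm.jar not found in the installed omixerRpm package")
}
```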
