cdk-r / cdkr
Integrating R and the CDK
Home Page: https://cdk-r.github.io/cdkr/
mols <- load.molecules(molfiles=c("thisFileDoesNotExist.sdf"))
does not fail with a clear message but instead produces this error:
Error in if (!file.exists(f) && !grep("http://", f)) stop(paste(f, ": Does not exist", :
missing value where TRUE/FALSE needed
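The error text points at the likely cause: `grep()` returns a zero-length result when nothing matches, so `!grep("http://", f)` is `logical(0)` and the `if` condition can end up as `NA`. A minimal sketch of a more robust check follows, using `grepl()`, which returns a single TRUE/FALSE per string. The helper name `check_mol_source` is hypothetical and not part of rcdk.

```r
# Hypothetical helper, not part of rcdk: validate a file name or URL
# before handing it to a molecule reader.
check_mol_source <- function(f) {
  is_url <- grepl("^https?://", f)   # a single TRUE/FALSE for one string
  if (!is_url && !file.exists(f)) {
    stop(paste(f, ": Does not exist"))
  }
  invisible(TRUE)
}

# A missing local file now yields the intended message.
msg <- tryCatch(check_mol_source("thisFileDoesNotExist.sdf"),
                error = function(e) conditionMessage(e))
print(msg)
```

URLs skip the `file.exists()` test entirely, so they are passed through unchanged.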
Hi,
thanks for your rcdk packages. I've just started trying them and have encountered the following bug in rpubchem.
x <- get.cid(3197)
x$CanonicalSmile
>[1] "C51H64N12O12S2"
It looks like every field after IUPACName is off, so the XML parsing goes wrong somewhere, maybe around line 280.
Just built the package from source and it worked fine. This was probably due to #17 Maybe this can get bumped to CRAN?
zachcp
The first call returns the distance; the second call causes a segmentation fault:
library('rcdk', 'fingerprint')
a <- parse.smiles('CCC')
b <- parse.smiles('CCCO')
af <- get.fingerprint(a[[1]])
bf <- get.fingerprint(b[[1]])
fingerprint::distance(af, bf)
[1] 0.4285714
fingerprint::distance(af, bf)
Segmentation fault (core dumped)
This happens even if I use a new set of feature vectors.
See this post on cdk-user: https://sourceforge.net/p/cdk/mailman/message/36278438/
The new depiction with kekulise=TRUE looks awesome, but with kekulise=FALSE rather bizarre.
In earlier versions this would have been the aromatic delocalised ring representation - any reason for the change? Should I "block" the kekulise=FALSE option? I'd rather keep it in for backwards compatibility...
smiles <- "OS(=O)(=O)c1ccc(cc1)C(CC(=O)O)CC(=O)O"
plot.new()
plot.window(xlim=c(0,200), ylim=c(0,100))
mol <- parse.smiles(smiles,kekulise=TRUE)[[1]]
img <- view.image.2d(mol)
rasterImage(img, 0,0, 100,100)
mol <- parse.smiles(smiles,kekulise=FALSE)[[1]]
img <- view.image.2d(mol)
rasterImage(img, 100,0, 200,100)
The depictions in latest rcdk don't yet take advantage of the latest developments shown at https://cdkdepict-openchem.rhcloud.com/depict.html - if these could be upgraded that would be wonderful! Thank you. See emails earlier today :-)
Hello,
For a specific set of molecules I cannot seem to calculate the mcs:
mol1 <- rcdk::parse.smiles("CC(=C)[C@@H]1CC[C@]2([C@H]1[C@H]3CC[C@H]4[C@]([C@@]3(CC2)C)(CC[C@H]([C@]4(C)CCOC(=O)C)C(C)(C)COC(=O)C)C)COC(=O)C")[[1]]
mol2 <- rcdk::parse.smiles("C[C@H]1[C@H](C2CC[C@@]3(C(=C2[C@]([C@@H]1C)(C)O)C=CC4[C@]3(CCC5[C@@]4(CCC(C5(C)C)O[C@H]6[C@@H]([C@H]([C@H](CO6)O)O)O)C)C)C)C")[[1]]
mcs <- rcdk::get.mcs(mol1, mol2)
Error in .jcall("org.guha.rcdk.util.Misc", "Lorg/openscience/cdk/interfaces/IAtomContainer;", :
java.lang.OutOfMemoryError: Java heap space
The calculation takes a long time and usually fails with the above error, although once in a while it succeeds. Any thoughts? I am using the latest versions of rcdk and rcdklibs from CRAN.
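One thing that may be worth trying for the `OutOfMemoryError` is giving the JVM a larger heap. With rJava this is done through the `java.parameters` option, which has to be set before the JVM starts, that is, before rcdk or rJava is loaded in the session. The value `-Xmx4g` below is only an illustrative assumption, and a bigger heap may not help if the MCS search itself is too expensive for these structures.

```r
# Must run at the top of a fresh session, before rJava/rcdk start the JVM.
# "-Xmx4g" (a 4 GB heap) is an arbitrary illustrative value.
options(java.parameters = "-Xmx4g")

# library(rcdk)   # load rcdk only after the option has been set
getOption("java.parameters")
```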
Rajarshi,
I am using .get.desc.values() in a derived package, but I can no longer access it in the new R NAMESPACE world. Could you make it public, or is there an exported alternative that I should be using instead?
Egon
With an R API something like:
> elementRanges <- matrix(c("C", 1, 10, "H", 0, 22), ncol=3, byrow=T)
> formulas <- rcdk.getFormulas(mass, tolerance, elementRanges)
The CDK API is described in https://github.com/cdk/cdk-paper-3/blob/master/formula_generator_benchmark/CDK/CDKFormulaGeneratorCLI.java
Hi there,
Thanks for a great R package. I'm encountering an issue with fp.sim.matrix() wherein fplist2 seems to be always interpreted as null even when I provide a second list of fingerprints.
Eg:
fp1 is a list of 8500 fingerprints
fp2 is a list of 2500 fingerprints
fp.sim <- fp.sim.matrix(fplist = fp1, fplist2 = fp2, method='tanimoto')
fp.sim ends up being a 8500x8500 matrix, rather than 8500x2500.
I am running R 3.4.0, fingerprint version 3.3.8.
Cheers,
Robert
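To make the expected shape concrete, here is a small base-R sketch of a cross-similarity matrix between two fingerprint lists. It is not the fingerprint package's implementation: fingerprints are modelled simply as integer vectors of set-bit positions, and `tanimoto` and `cross_similarity` are hypothetical helper names.

```r
# Tanimoto similarity of two fingerprints given as vectors of set-bit positions.
tanimoto <- function(a, b) {
  union_size <- length(union(a, b))
  if (union_size == 0) return(NA_real_)
  length(intersect(a, b)) / union_size
}

# Rows index the first list, columns index the second list.
cross_similarity <- function(list1, list2) {
  sim <- matrix(NA_real_, nrow = length(list1), ncol = length(list2))
  for (i in seq_along(list1)) {
    for (j in seq_along(list2)) {
      sim[i, j] <- tanimoto(list1[[i]], list2[[j]])
    }
  }
  sim
}

fp1 <- list(c(1L, 2L, 3L), c(2L, 4L), c(5L))
fp2 <- list(c(1L, 2L), c(4L, 5L))
sim <- cross_similarity(fp1, fp2)
dim(sim)   # 3 rows (fp1) by 2 columns (fp2)
```

With lists of 8500 and 2500 fingerprints, the same layout gives the 8500x2500 matrix the report expects.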
I am having issues with get.fingerprint. As far as I can tell, the problem comes from the call to the get.property method. I built the package from the latest source on master. See the following snippet:
library(rcdk)
a = parse.smiles('CCCO')
f = get.fingerprint(a[[1]])
Error in .jcall("org/guha/rcdk/util/Misc", "Ljava/lang/Object;", "getProperty", :
RcallMethod: cannot determine object class
Line 43 in props.R has following call:
value <- .jcall('org/guha/rcdk/util/Misc', 'Ljava/lang/Object;', 'getProperty',
molecule, as.character(key), check=FALSE)
jClassPath seems to be okay and has path for rcdk.jar.
Is the error caused by a difference in the JRE version?
.jnew('org.guha.rcdk.util.Misc')
Error in .jnew("org.guha.rcdk.util.Misc") :
java.lang.UnsupportedClassVersionError: org/guha/rcdk/util/Misc : Unsupported major.minor version 52.0
Though it seems that cdk-2.0.jar is also built using Java 1.8 and seems to work okay on my system.
library(rJava)
.jinit(classpath = "/home/varun/R/x86_64-pc-linux-gnu-library/3.4/rcdklibs/cont/cdk-2.0.jar")
.jclassPath()
[1] "/home/varun/R/x86_64-pc-linux-gnu-library/3.4/rJava/java"
[2] "/home/varun/R/x86_64-pc-linux-gnu-library/3.4/rcdklibs/cont/cdk-2.0.jar"
.jcall("org.openscience.cdk.CDK", "S", "getVersion")
[1] "2.0"
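For reading the `UnsupportedClassVersionError`: the number in the message is the class-file major version, and for Java 5 and later the Java release is that number minus 44, so 52 means the class was compiled for Java 8 and is being loaded by an older JVM. A small sketch that extracts this from the message (the helper name `class_version_to_java` is hypothetical):

```r
# Map the class-file major version in an UnsupportedClassVersionError
# message to the Java release it was compiled for (release = major - 44).
class_version_to_java <- function(msg) {
  m <- regmatches(msg, regexpr("major\\.minor version [0-9]+", msg))
  if (length(m) == 0) return(NA_integer_)
  major <- as.integer(sub(".* ", "", m))
  major - 44L
}

msg <- "org/guha/rcdk/util/Misc : Unsupported major.minor version 52.0"
class_version_to_java(msg)   # 8
```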
Hi Rcdk team
I have a small question regarding isotope annotation for generate.formula.iter().
I want to annotate MS peaks with possible formulae, limited by the number of atoms in the parent compound. If I use the attached example without adding anything to the element list:
elements
[[1]]
[1] "C" "0" "7"
[[2]]
[1] "H" "0" "4"
[[3]]
[1] "Br" "0" "2"
[[4]]
[1] "O" "0" "3"
Using this list, the formula of the monoisotopic peak (M-H) at 292.8454296 can easily be annotated as "C7H3Br2O3" by the generate.formula.iter() function.
The problem is that the M+2 peak at 294.8432039 is not annotated, because the [81Br] isotope is not in the list.
If I modify the list to use 81Br:
elements
[[1]]
[1] "C" "0" "7"
[[2]]
[1] "H" "0" "4"
[[3]]
[1] "Br" "0" "2" "81"
[[4]]
[1] "O" "0" "3"
I can only annotate the peak with 2 81Br atoms.
If I add an additional line with the 81Br (as shown in the example):
elements
[[1]]
[1] "C" "0" "7"
[[2]]
[1] "H" "0" "4"
[[3]]
[1] "Br" "0" "2"
[[4]]
[1] "O" "0" "3"
[[5]]
[1] "Br" "0" "2" "81"
I can annotate all 3 peaks as "C7H3Br2O3". Unfortunately, the annotation does not distinguish between 79Br and 81Br in the symbol.
My question is whether there is a way (or whether one could be created) to store the isotope entry in the list with a different symbol (like [81Br]), so that the annotated isotopes can be told apart.
Thank you in advance
Benedikt Lauper
Eawag Dübendorf
Uchem
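As an illustration of the labelling being asked for, the sketch below builds the five-entry element list from the last example and derives a display symbol per entry, using the optional fourth field as the mass number. This is only a post-processing idea on the R side; `isotope_label` is a hypothetical helper and nothing here changes what generate.formula.iter() returns.

```r
# Element list in the form shown above: symbol, min count, max count,
# and an optional mass number as a fourth field.
elements <- list(
  c("C",  "0", "7"),
  c("H",  "0", "4"),
  c("Br", "0", "2"),
  c("O",  "0", "3"),
  c("Br", "0", "2", "81")
)

# Hypothetical helper: "[81Br]" for entries with a mass number, else the symbol.
isotope_label <- function(el) {
  if (length(el) == 4) paste0("[", el[4], el[1], "]") else el[1]
}

labels <- vapply(elements, isotope_label, character(1))
print(labels)
```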
Hi,
we get a rather uninformative error message when passing garbage into get.inchi()
> get.inchi("x")
Error in .jcall("org/guha/rcdk/util/Misc", "S", "getInChi", molecule, :
method getInChi with signature ()Ljava/lang/String; not found
It would be better to return a clearer error message, such as "invalid SMILES", back to R.
Yours, Steffen
So that I do not have to wrap that single file name into a c()
Hi,
library(rcdk)
m <- parse.smiles("c1ccccc1")[[1]]
view.molecule.2d(m)
do.typing(m)
view.molecule.2d(m)
The first display is kinda correct:
while after typing I get dual aromaticity depiction:
It would be great to have just one of them. (Or even a choice :-)
And I love the github issue tracker. Very well done.
Yours,
Steffen
Using r-devel version I get a segfault when loading rcdk.
See also the CRAN checks.
sessionInfo()
R Under development (unstable) (2016-11-14 r71659)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=de_DE.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=de_DE.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices datasets utils methods base
other attached packages:
[1] clisymbols_1.0.0 prompt_1.0.0 gitty_1.0.0
loaded via a namespace (and not attached):
[1] compiler_3.4.0 parr_3.3.0 whisker_0.3-2 crayon_1.3.2 memuse_3.0-1
Note that there are no problems under current R:
❯ library(rcdk)
Loading required package: fingerprint
❯ sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=de_DE.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=de_DE.UTF-8 LC_NAME=de_DE.UTF-8
[9] LC_ADDRESS=de_DE.UTF-8 LC_TELEPHONE=de_DE.UTF-8
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=de_DE.UTF-8
attached base packages:
[1] stats graphics grDevices datasets utils methods base
other attached packages:
[1] rcdk_3.3.6 fingerprint_3.5.4 clisymbols_1.0.0 prompt_1.0.0
[5] gitty_1.0.0
loaded via a namespace (and not attached):
[1] parr_3.3.0 parallel_3.3.2 whisker_0.3-2 crayon_1.3.2
[5] rcdklibs_1.5.13 memuse_3.0-1 iterators_1.0.8 itertools_0.1-3
[9] rJava_0.9-8 png_0.1-7
I don't see this, when running on Travis-CI, which runs R-devel on Ubuntu 12.04 LTS.
Maybe it is something with my installation... Investigating...
Maybe a different jdk version
openjdk version "1.8.0_111"
vs
oraclejdk8
Nope. Segfault also with oracle.
Dear rcdk-Developers,
I'm using rcdk to predict isotope patterns for mass spectrometric analysis. I noticed some problems when working with charged formulas: the function get.isotopes.pattern returns the masses without correction for charge.
Please find an example for the [M+Na]+ adduct of glucose below.
Best regards,
Michael
library(rcdk)
glucose <- "C6H12O6"
glucoseFormula <- get.formula(glucose, charge = 0)
sodium <- "Na"
sodiumFormula <- get.formula(sodium, charge = 1)
glucoseFormula@mass + sodiumFormula@mass
glucoseSodium <- "C6H12O6Na"
glucoseSodiumFormula <- get.formula(glucoseSodium, charge = 1)
glucoseSodiumFormula@mass
get.isotopes.pattern(glucoseSodiumFormula, minAbund = 0.001)[1]
get.formula(glucoseSodium, charge = 0)@mass
The results are:
glucoseFormula@mass + sodiumFormula@mass
[1] 203.0526
glucoseSodiumFormula@mass
[1] 203.0526
get.isotopes.pattern(glucoseSodiumFormula, minAbund = 0.001)[1]
[1] 203.0532
get.formula(glucoseSodium, charge = 0)@mass
[1] 203.0532
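A quick arithmetic check is consistent with this reading: the two reported values differ by about one electron mass. The sketch below recomputes both numbers by hand from commonly tabulated monoisotopic masses (rounded values typed in here as assumptions, not taken from rcdk).

```r
# Monoisotopic masses in Da (rounded, commonly tabulated values).
mass <- c(C = 12.000000, H = 1.00782503, O = 15.99491462, Na = 22.98976928)
electron <- 0.00054858

# C6H12O6Na as a neutral sum of atoms.
counts  <- c(C = 6, H = 12, O = 6, Na = 1)
neutral <- sum(mass[names(counts)] * counts)

# Singly charged cation: one electron removed.
cation <- neutral - electron

round(neutral, 4)   # 203.0532, the value without charge correction
round(cation, 4)    # 203.0526, the charge-corrected value
```

So the first isotope-pattern peak matches the uncorrected sum of atomic masses, while the `@mass` slot of the charged formula includes the electron correction.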
I ran parse.smiles on a long list (14,000+) of SMILES, some of which I later discovered were actually invalid. However, the function doesn't stop or give warnings as I'd have expected, and it took a while to trace which SMILES were problematic (so that I can remove/fix them).
Small snippet to reproduce:
smi <- c('CCC', 'c1ccccc1', 'N/A', 'C(C)(C=O)C(CCNC)C1CC1C(=O)', 'foo')
parse.smiles(smi)
Output:
$CCC
[1] "Java-Object{AtomContainer(171493374, #A:3, Atom(1876682596, S:C, H:3, AtomType(1876682596, FC:0, ..."
$c1ccccc1
[1] "Java-Object{AtomContainer(806511723, #A:6, Atom(1250442005, S:C, H:1, AtomType(1250442005, FC:0, ..."
$`N/A`
NULL
$`C(C)(C=O)C(CCNC)C1CC1C(=O)`
[1] "Java-Object{AtomContainer(2032079962, #A:14, Atom(953082513, S:C, H:1, AtomType(953082513, FC:0..."
$foo
NULL
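Given output of this shape, the failed entries can be located after the fact by looking for NULL elements. The sketch below uses a stand-in list with the same structure so that it runs without rcdk; in real use `mols` would be the result of `parse.smiles(smi)`.

```r
# Stand-in for the result of parse.smiles(smi): a named list in which
# entries that failed to parse are NULL.
mols <- list(
  "CCC"      = "mol-1",
  "c1ccccc1" = "mol-2",
  "N/A"      = NULL,
  "C(C)(C=O)C(CCNC)C1CC1C(=O)" = "mol-3",
  "foo"      = NULL
)

failed  <- vapply(mols, is.null, logical(1))
invalid <- names(mols)[failed]    # SMILES that did not parse
valid   <- mols[!failed]          # keep only usable molecules

print(invalid)
```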
Hi,
I am trying to iterate over ChEBI using the code below on a current rcdk git snapshot,
and want to calculate fingerprints. The iteration fails as soon as rcdk hits a compound
with an R# "atom", e.g. CHEBI:15489 because the hasNext(moliter) fails with an NPE:
Error in .jcall(sreader, "Z", "hasNext") : java.lang.NullPointerException
I have no problem if the molecule is a NULL, but iteration should be able to continue
to the end of the file.
Yours,
Steffen
P.S. That github Markdown for code looks cool!
library(rcdk)
sessionInfo()
chebifile <- "ChEBI_complete.sdf"
# iterate over a large file
moliter <- iload.molecules(chebifile, type="sdf")
i <- 1
chebifp <- c(new("fingerprint"))
while(hasNext(moliter)) {
mol <- nextElem(moliter)
}
> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] rcdk_3.1.5 iterators_1.0.3 png_0.1-3 fingerprint_3.4.6
[5] rcdklibs_1.4.5 rJava_0.9-2
I received an email from CRAN about fixing an rcdk version check before the release of Java 11. I didn't submit a fix in time and noticed that as of this morning I have been removed as maintainer of rcdk (see https://cran.r-project.org/web/packages/rcdk/rcdk.pdf).
I will look into a fix of the Java version test to patch an updated version, but I don't know if I will be able to submit the patch. The build error and the requested correction are below. A fix will need to either:
Any preferences on which route to take?
zach cp
* testing if installed package can be loaded
Warning in fun(libname, pkgname) : NAs introduced by coercion
Error: package or namespace load failed for ‘rcdk’:
.onLoad failed in loadNamespace() for 'rcdk', details:
call: if (isjavagood == FALSE) {
error: missing value where TRUE/FALSE needed
jversion evaluates as "11-ea+22".
The code in 'Writing R Extensions' does work portably,
Please correct ASAP and before Sep 25 (the currently expected release
date for Java 11).
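The log suggests that the version string is coerced to a number, which yields NA for a value like "11-ea+22" and then breaks the `if`. As a sketch of a more tolerant parse (not the code from 'Writing R Extensions' and not the fix that was eventually applied), the hypothetical helper `java_major_version` below extracts the major release from both the old "1.x" scheme and the newer scheme.

```r
# Hypothetical helper: extract the Java major release from a version string.
# Handles "1.8.0_151" (old scheme, major is the second number) as well as
# "9", "10.0.1" and early-access strings such as "11-ea+22".
java_major_version <- function(jversion) {
  parts <- strsplit(jversion, "[^0-9]+")[[1]]
  nums  <- as.integer(parts[nzchar(parts)])
  if (length(nums) == 0) return(NA_integer_)
  if (nums[1] == 1L && length(nums) >= 2) nums[2] else nums[1]
}

java_major_version("1.8.0_151")   # 8
java_major_version("11-ea+22")    # 11

# A version test that cannot produce NA for these inputs.
isjavagood <- isTRUE(java_major_version("11-ea+22") >= 8L)
```

Wrapping the comparison in `isTRUE()` means an unparseable string gives FALSE rather than the "missing value where TRUE/FALSE needed" error.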
Hi, I can confirm @schymane's problem with rcdk-3.4.9 (see MassBank/RMassBank#199). I haven't checked in detail yet, but running the example from the get.exact.mass manpage does not work:
m <- parse.smiles('c1ccccc1')[[1]]
## Need to configure the molecule
do.aromaticity(m)
do.typing(m)
do.isotopes(m)
get.exact.mass(m)
> get.exact.mass(m)
[1] "Java-Object{java.lang.NullPointerException}"
Error in get.exact.mass(m) :
Couldn't get exact mass. Maybe you have not performed aromaticity, atom type or isotope configuration?
So either an issue with the rcdk code, my environment or the documentation.
Yours,
Steffen
> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=en_US.UTF-8
[9] LC_ADDRESS=en_US.UTF-8 LC_TELEPHONE=en_US.UTF-8
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] rcdk_3.4.9 rcdklibs_2.2 rJava_0.9-10
loaded via a namespace (and not attached):
[1] compiler_3.4.4 tools_3.4.4 parallel_3.4.4 fingerprint_3.5.7
[5] iterators_1.0.10 itertools_0.1-3 png_0.1-7
and
java --version
openjdk 10.0.1 2018-04-17
OpenJDK Runtime Environment (build 10.0.1+10-Ubuntu-3ubuntu1)
OpenJDK 64-Bit Server VM (build 10.0.1+10-Ubuntu-3ubuntu1, mixed mode)
Loading rinchi first causes the following error. If I load rcdk first, everything works.
library(rinchi)
library(rcdk)
Loading required package: rcdklibs
Loading required package: rJava
Warning messages:
1: package ‘rcdk’ was built under R version 3.4.3
2: package ‘rcdklibs’ was built under R version 3.4.3
> m <- parse.smiles('C1C=CCC1N(C)c1ccccc1')[[1]]
> get.smiles(m)
Error in .jnew("org/openscience/cdk/smiles/SmilesGenerator", flavor) :
java.lang.NoSuchMethodError: <init>
Hi,
I am back to using rCDK for some stuff, and there are some display issues
in the current release. Here are some test cases, most by Emma:
library(rcdk)
m <- parse.smiles("[CH2+]")[[1]]
get.total.charge(m)
view.molecule.2d(m)
As the screenshots show, there is no + charge:
rcdklibs_1.5.4 , rcdk_3.2.4
Hi,
if running the example from #14, I get 'cellx' not found:
> library(rcdk)
Loading required package: fingerprint
> m <- parse.smiles("[CH2+]")[[1]]
> get.total.charge(m)
[1] 1
> view.molecule.2d(m)
Error in .jnew("org/guha/rcdk/view/ViewMolecule2D", molecule, as.integer(cellx), :
object 'cellx' not found
> sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.2 LTS
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=de_DE.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=de_DE.UTF-8 LC_NAME=de_DE.UTF-8
[9] LC_ADDRESS=de_DE.UTF-8 LC_TELEPHONE=de_DE.UTF-8
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=de_DE.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] rcdk_3.4.1 fingerprint_3.5.4
loaded via a namespace (and not attached):
[1] parallel_3.2.3 rcdklibs_1.5.14 iterators_1.0.8 itertools_0.1-3
[5] rJava_0.9-8 png_0.1-7
Update the fingerprint package to support similarity calculations for count fingerprints (that actually use the counts). See the Tanimoto class in the CDK sources. Also, the one named with the number 2 is the one that performed the best for me in virtual screening.
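For orientation, two common ways to generalise Tanimoto to count vectors are sketched below in base R: a dot-product form and a min/max form. Which of these corresponds to which method in the CDK Tanimoto class is not confirmed here, so the function names are descriptive placeholders only.

```r
# Dot-product form: xy / (xx + yy - xy)
tanimoto_dot <- function(x, y) {
  xy <- sum(x * y)
  xy / (sum(x^2) + sum(y^2) - xy)
}

# Min/max form: sum of element-wise minima over sum of element-wise maxima
tanimoto_minmax <- function(x, y) {
  sum(pmin(x, y)) / sum(pmax(x, y))
}

# Two count fingerprints over the same four features.
x <- c(2, 0, 1, 3)
y <- c(1, 1, 0, 3)

tanimoto_dot(x, y)      # 11 / 14
tanimoto_minmax(x, y)   # 4 / 7
```

For binary vectors both forms reduce to the usual bit-based Tanimoto coefficient.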
Hi, I am trying to compile with an updated cdk.jar.
While it works with the included cdk-1.5.2, it does not work
with the download of cdk-1.5.5.jar from sf.net nor with the nightly.
This is with javac 1.7.0_51 on Linux.
[javac] /vol/R/rguha/cdkr/rcdkjar/src/org/guha/rcdk/util/Misc.java:188: error: cannot find symbol
[javac] IsotopeFactory ifac = IsotopeFactory.getInstance(DefaultChemObjectBuilder.getInstance());
Source: https://www.biostars.org/p/100384/
Reproducible with rcdk 3.2.3.2 with:
#! /usr/bin/Rscript
require(rcdk)
drug.mols <- load.molecules(molfiles="./CID_175540.sdf")
descNames <- unique(unlist(sapply(get.desc.categories(), get.desc.names)))
drug.descs <- eval.desc(drug.mols, descNames, verbose=T)
The error you get:
Processing BasicGroupCountDescriptor
Error in if (is.na(dval)) return(NA) : argument is of length zero
In addition: Warning message:
In is.na(dval) : is.na() applied to non-(list or vector) of type 'NULL'
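The message itself explains the mechanism: `is.na(NULL)` returns `logical(0)`, and `if` cannot work with a zero-length condition. A defensive check along the following lines would avoid that; `safe_value` is a hypothetical helper, not the code in rcdk.

```r
# Hypothetical guard: treat NULL, empty and all-NA results as missing.
safe_value <- function(dval) {
  if (is.null(dval) || length(dval) == 0 || all(is.na(dval))) {
    return(NA)
  }
  dval
}

safe_value(NULL)   # NA instead of "argument is of length zero"
safe_value(3.5)    # 3.5
```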
Hi,
Both my local R CMD check and Travis report an issue:
https://travis-ci.org/sneumann/cdkr/builds/447702470#L3022
Error: processing vignette 'molform.Rmd' failed with diagnostics:
RcallMethod: invalid object parameter
This comes from
> library(rcdk)
Loading required package: rcdklibs
Loading required package: rJava
> sp <- get.smiles.parser()
> molecule <- parse.smiles('N')[[1]]
> convert.implicit.to.explicit(molecule)
> formula <- get.mol2formula(molecule,charge=0)
Error in .jcall(ch, "D", "doubleValue") :
RcallMethod: invalid object parameter
in
> traceback()
3: .jcall(ch, "D", "doubleValue")
2: .cdkFormula.createObject(.jcast(moleculaJT, .IMolecularFormula))
1: get.mol2formula(molecule, charge = 0)
Can someone confirm ? Ideas ? Yours, Steffen
> sessionInfo()
R Under development (unstable) (2018-10-17 r75450)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.1 LTS
Matrix products: default
BLAS: /vol/R/R-devel/lib/R/lib/libRblas.so
LAPACK: /vol/R/R-devel/lib/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=en_US.UTF-8
[9] LC_ADDRESS=en_US.UTF-8 LC_TELEPHONE=en_US.UTF-8
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] rcdk_3.4.9 rcdklibs_2.2.1 rJava_0.9-10
loaded via a namespace (and not attached):
[1] compiler_3.6.0 tools_3.6.0 parallel_3.6.0 fingerprint_3.5.7
[5] iterators_1.0.10 itertools_0.1-3 png_0.1-7 tcltk_3.6.0
Hi, I was trying to follow the YouTube tutorial video on how to retrieve SMILES from PubChem, using the following script:
library(rpubchem)
library(rcdk)
library(ggplot2)
aids <- find.assay.id("dihydroorotate+dehyogenase+and+Malaria")
aidsdata <- data.frame()
for (i in 1:50) {
  assay <- get.assay(aids[i], quiet=TRUE)
  assaydata <- assay[, c("PUBCHEM.CID", "PUBCHEM.ACTIVITY.OUTCOME")]
  # accumulate into aidsdata (the original appended to an undefined `data`)
  aidsdata <- rbind(aidsdata, assaydata)
}
I receive the following error.
In addition: Warning message:
In open.connection(file, "rt") :
cannot open: HTTP status was '400 Bad Request'
Here is the sessionInfo():
R version 3.1.2 (2014-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ggplot2_1.0.1 rcdk_3.3.2 fingerprint_3.5.2 rpubchem_1.5.0.2 webchem_0.0.1 dplyr_0.4.1
loaded via a namespace (and not attached):
[1] assertthat_0.1 bitops_1.0-6 car_2.0-25 colorspace_1.2-6 DBI_0.3.1 digest_0.6.8 grid_3.1.2
[8] gtable_0.1.2 iterators_1.0.7 lattice_0.20-31 lazyeval_0.1.10 lme4_1.1-7 magrittr_1.5 MASS_7.3-40
[15] Matrix_1.2-0 mgcv_1.8-6 minqa_1.2.4 munsell_0.4.2 nlme_3.1-120 nloptr_1.0.4 nnet_7.3-9
[22] parallel_3.1.2 pbkrtest_0.4-2 plyr_1.8.2 png_0.1-7 proto_0.3-10 quantreg_5.11 rcdklibs_1.5.8.4
[29] Rcpp_0.11.5 RCurl_1.95-4.6 reshape2_1.4.1 rJava_0.9-6 RJSONIO_1.3-0 scales_0.2.4 SparseM_1.6
[36] splines_3.1.2 stringr_0.6.2 tools_3.1.2 XML_3.98-1.1
Hei,
I cloned the latest version of the repository and tried to compile the library using the command line:
R CMD build rcdklibs
R CMD INSTALL rcdklibs_*gz
cd rcdkjar
ant clean jar
cd ../
R CMD build rcdk # <-- Produces error
R CMD INSTALL rcdk_*gz
The R CMD build rcdk command fails with the following error message:
creating vignettes ... ERROR
Quitting from lines 110-115 (molform.Rmd)
Error: processing vignette 'molform.Rmd' failed with diagnostics:
Elements must be 3-tuples or 4-tuples
I fixed the molform.Rmd file by changing lines 110-111:
mit <- generate.formula.iter(100, charge=0, window=0.1,
elements=list(C=c(0,50), H=c(0,50), N=c(0,50)))
to:
mit <- generate.formula.iter(100, charge=0, window=0.1,
elements=list(c("C",0,50), c("H",0,50), c("N",0,50)))
to be compatible with the generate.formula.iter function definition in rcdk/R/formula.R.
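The change from the named-range form to the tuple form can also be done programmatically. The sketch below converts one into the other in base R; `as_element_tuples` is a hypothetical helper, and note that combining a symbol with numbers in `c()` coerces everything to character.

```r
# Hypothetical helper: turn list(C = c(0, 50), ...) into
# list(c("C", "0", "50"), ...), i.e. one 3-tuple per element.
as_element_tuples <- function(ranges) {
  unname(Map(function(sym, r) c(sym, r), names(ranges), ranges))
}

old_style <- list(C = c(0, 50), H = c(0, 50), N = c(0, 50))
new_style <- as_element_tuples(old_style)
str(new_style)
```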
However, I then get another build error that I have not been able to fix yet:
creating vignettes ... ERROR
Quitting from lines 110-115 (molform.Rmd)
Error: processing vignette 'molform.Rmd' failed with diagnostics:
method getString with signature (Lorg/openscience/cdk/interfaces/IMolecularFormula;ZZ)Ljava/lang/String; not found
This seems to be related to the command in lines 245-246 of rcdk/R/formula.R:
return(.jcall("org/openscience/cdk/tools/manipulator/MolecularFormulaManipulator",
"S", "getString", formula, FALSE, TRUE))
Can you help with that?
Best regards,
Eric
I have problems loading the rcdk package on macOS High Sierra with JDK 1.9. The package fails to install with the error:
Error: package or namespace load failed for ‘rcdk’:
.onLoad failed in loadNamespace() for 'rcdk', details:
call: if (isjavagood == FALSE) {
error: missing value where TRUE/FALSE needed
Error: loading failed
Hi Rajarsh,
I found some errors this morning while running my code built on the rcdk package. Zach mentioned you are doing a major update of the packages. I dug into my code and it seems that get.desc.categories() generates errors like the one shown below:
Error in .jcall("org/guha/rcdk/descriptors/DescriptorUtilities", "[Ljava/lang/String;", :
java.lang.UnsupportedClassVersionError: org/guha/rcdk/descriptors/DescriptorUtilities : Unsupported major.minor version 52.0
I wonder if this is something you guys noticed?
Thank you,
Tao
hi @rajarshi,
thanks for your excellent package. I have been writing some utilities that depend on rcdk/rcdklibs and I would like to use the most recent features of 1.5.12 including the newer smiles parsing and possibly image depiction. Therefore, I am hoping that you can cut another CRAN release of rcdk/rcdklibs. I'd be happy to help in any way.
(FYI: just put together a chemdoodle widget for drawing molecules in html and using CDK as the backend for parsing them https://github.com/zachcp/chemdoodle)
zach cp
I'm getting a lot of output trying a fresh install of rcdk_v3.4.8 from github that seems to persist for all subsequent installations as well (and also appears for rcdk_libs)
*** arch - x64
0 [main] DEBUG org.openscience.cdk.DynamicFactory - registered 'IAtom' with 'Atom' implementation
15 [main] DEBUG org.openscience.cdk.DynamicFactory - registered 'IPseudoAtom' with 'PseudoAtom' implementation
[... another 38 or so lines ... you get the picture ]
Then ...
Is there a way to make it quiet? :-) Thanks!
R version 3.4.2 (2017-09-28)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_Australia.1252 LC_CTYPE=English_Australia.1252 LC_MONETARY=English_Australia.1252
[4] LC_NUMERIC=C LC_TIME=English_Australia.1252
attached base packages:
[1] grid stats graphics grDevices utils datasets methods base
other attached packages:
[1] ReSOLUTION_0.1.5 nontarget_1.9 mgcv_1.8-20 nlme_3.1-131 nontargetData_1.1
[6] mzR_2.10.0 readxl_1.1.0 OrgMassSpecR_0.5-3 RChemMass_0.1.12 enviPat_2.2
[11] rsvg_1.2 curl_3.2 rcdk_3.4.8 rcdklibs_2.2 rJava_0.9-9
[16] RMassBank_2.4.0 Rcpp_0.12.13 devtools_1.13.3
loaded via a namespace (and not attached):
[1] lattice_0.20-35 colorspace_1.3-2 stats4_3.4.2 fingerprint_3.5.7
[5] yaml_2.1.14 vsn_3.44.0 XML_3.98-1.9 rlang_0.1.2
[9] withr_2.0.0 MSnbase_2.2.0 BiocParallel_1.10.1 affy_1.54.0
[13] BiocGenerics_0.22.1 affyio_1.46.0 foreach_1.4.3 plyr_1.8.4
[17] mzID_1.14.0 ProtGenerics_1.8.0 zlibbioc_1.22.0 cellranger_1.1.0
[21] munsell_0.4.3 pcaMethods_1.68.0 gtable_0.2.0 codetools_0.2-15
[25] memoise_1.1.0 Biobase_2.36.2 knitr_1.17 IRanges_2.10.5
[29] doParallel_1.0.11 BiocInstaller_1.26.1 parallel_3.4.2 itertools_0.1-3
[33] preprocessCore_1.38.1 scales_0.5.0 limma_3.32.10 S4Vectors_0.14.7
[37] impute_1.50.1 rjson_0.2.15 ggplot2_2.2.1 png_0.1-7
[41] digest_0.6.12 tools_3.4.2 bitops_1.0-6 lazyeval_0.2.0
[45] RCurl_1.95-4.8 tibble_1.3.4 Matrix_1.2-11 httr_1.3.1
[49] iterators_1.0.9 R6_2.2.2 MALDIquant_1.16.4 git2r_0.19.0
[53] compiler_3.4.2
Following the example for "eval.desc"
smiles <- c('CCC', 'c1ccccc1', 'CC(=O)C')
mols <- sapply(smiles, parse.smiles)
dnames <- get.desc.names('topological')
descs <- eval.desc(mols, dnames, verbose=TRUE)
you can do
lapply(dnames,function(x){ require(rcdk); eval.desc(mols, x, verbose=FALSE)})
which works but this
cl <- makeCluster(detectCores()-1)
clusterExport(cl, "mols")
rcdk_desc <- parLapply(cl,dnames,function(x){ require(rcdk); eval.desc(mols, x, verbose=FALSE)})
stopCluster(cl)
fails with
Error in checkForRemoteErrors(val) :
3 nodes produced errors; first error: java.lang.NullPointerException
Applying over "mols" instead of "dnames" does not produce an error but all are NA.
I realize this is probably some java/parallel interaction but do you have any idea of a workaround?
Hi, I am using Python to call the R package through rpy2. This is actually a Django project. cdkr works correctly only the first time; when I refresh the website, I get this error message every time:
Error in .jcall(cn, "Ljava/lang/Object;", "get", as.integer(i - 1)) :
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
I am not good at R, so I would like to know what kind of situation may cause the size to equal zero (in the cdkr package source code, line 68). Thanks!
Hi,
I updated rcdk today to the newest version and view.molecule.2d() stopped working. I tried to execute the chunk from the vignette and got the following error:
library(rcdk)
Loading required package: rcdklibs
Loading required package: rJava
smiles <- c('CCC', 'CCN', 'CCN(C)(C)',
'c1ccccc1Cc1ccccc1','C1CCC1CC(CN(C)(C))CC(=O)CC')
mols <- parse.smiles(smiles)
view.molecule.2d(mols[[1]])
Error in .jnew("org/guha/rcdk/view/ViewMolecule2D", molecule, as.integer(width), :
java.lang.NoSuchMethodError: <init>
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] rcdk_3.4.5 rcdklibs_2.0 rJava_0.9-9
loaded via a namespace (and not attached):
[1] compiler_3.4.3 parallel_3.4.3 fingerprint_3.5.6 tools_3.4.3 iterators_1.0.9
[6] itertools_0.1-3 png_0.1-7
My Java build is (build 1.8.0_151-b12)
load.molecules(molfiles="http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=6442441&disopt=SaveSDF")
Error in as.character.default(X[[1L]], ...) :
no method for coercing this S4 class to a vector
The same error occurs for a local copy of this file.
In version 3.2.3.2, however, the error is the one described at https://www.biostars.org/p/100384/ (in my comments)
for CID: 6442441, http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=6442441&disopt=SaveSDF
descriptors: "org.openscience.cdk.qsar.descriptors.molecular.RuleOfFiveDescriptor"; "org.openscience.cdk.qsar.descriptors.molecular.XLogPDescriptor"
require(rcdk)
drug.mols <- load.molecules(molfiles="./CID_6442441.sdf")
drug.descs <- eval.desc(drug.mols, "org.openscience.cdk.qsar.descriptors.molecular.RuleOfFiveDescriptor", verbose=T)
drug.descs <- eval.desc(drug.mols, "org.openscience.cdk.qsar.descriptors.molecular.XLogPDescriptor", verbose=T)
Error in .jcall(b, "Lorg/openscience/cdk/qsar/DescriptorValue;", "calculate", :
java.lang.NullPointerException
Thank you. rcdk is very helpful to me.
Dear rcdk-Team,
is it possible to have the IsotopePatternSimilarity function in rcdk?
http://cdk.github.io/cdk/1.5/docs/api/org/openscience/cdk/formula/IsotopePatternSimilarity.html
Best regards,
Michael
1 Test Suite :
rcdk rcdk Unit Tests - 27 test functions, 0 errors, 1 failure
FAILURE in test.get.smiles2: Error in checkEquals("N([CH2])CC", get.smiles(mcs)) : 1 string mismatch
checkEquals("N([CH2])CC", get.smiles(mcs))
In the new code, the MCS is returned as "[CH2]NCC", which is equivalent to "N([CH2])CC". Patch for test coming.
Yours, Steffen
Hi,
another problem occurs for other exceptions:
Error in .jcall(.jnew("org/openscience/cdk/ChemObject"), "Lorg/openscience/cdk/interfaces/IChemObjectBuilder;", :
org.openscience.cdk.exception.NoSuchAtomTypeException: The AtomType Se.2 could not be found
Can these be caught with rJava ? And an option added to iload.molecules()
to just jump over them, because we can't do anything with them anyway ?
Yours,
Steffen
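On the first question: rJava normally surfaces an uncaught Java exception as an R error, so a `tryCatch()` around the failing call should be able to intercept it. The sketch below shows the skip-and-continue pattern with a stand-in function that only simulates the failure; `risky_parse` and `safe_parse` are hypothetical names, and whether the underlying reader can continue after such an exception is a separate question.

```r
# Stand-in for a call that may raise a Java exception through rJava;
# it only simulates the failure for one input.
risky_parse <- function(x) {
  if (x == "Se.2") {
    stop("NoSuchAtomTypeException: The AtomType Se.2 could not be found")
  }
  toupper(x)
}

# Return NULL instead of aborting when a record fails.
safe_parse <- function(x) {
  tryCatch(risky_parse(x), error = function(e) NULL)
}

records <- c("c.sp3", "Se.2", "n.sp2")
results <- lapply(records, safe_parse)
ok <- !vapply(results, is.null, logical(1))
print(ok)
```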
* checking Rd \usage sections ... WARNING
Undocumented arguments in documentation object 'parse.smiles'
‘kekulise’
Documented arguments not in \usage in documentation object 'parse.smiles':
‘kekulize’
Patch coming.
Here's the file:
The rcdk code I use:
mols <- load.molecules(c("Fragments2.sdf"), verbose=T)
Seems it is included with the cdk: http://cdk.github.io/cdk/1.5/docs/api/org/openscience/cdk/stereo/Stereocenters.html
what I would like to be able to do is detect which atoms are potential chiral centers.
It's not available from CRAN...
I've been struggling to view the 2d structure of any molecule. I tried with R version 3.3.1, version 3.3.2 and R version 3.4.2
Latest sessionInfo():
R version 3.4.2 (2017-09-28), png_0.1-7 , fingerprint_3.5.6, rcdk_3.4.3, rcdklibs_2.0, rJava_0.9-9
root@eb2bc2d30d3d:/# javac -version
javac 1.8.0_121
curcumin = parse.smiles("O=C(\\C=C\\c1ccc(O)c(OC)c1)CC(=O)\\C=C\\c2cc(OC)c(O)cc2")[[1]]
dep <- get.depictor(width = 200, height = 200, zoom = 1.3, style = "cow",
imp <- view.image.2d(curcumin, dep)
Error in .jcall(mi, "[B", "getBytes", as.integer(depictor$getWidth()), :
java.lang.NoClassDefFoundError: Could not initialize class sun.awt.X11GraphicsEnvironment
view.image.2d(curcumin, dep)
Error in .jcall(mi, "[B", "getBytes", as.integer(depictor$getWidth()), :
java.lang.NoClassDefFoundError: Could not initialize class sun.awt.X11GraphicsEnvironment
view.molecule.2d(curcumin, dep)
Error in .jnew("org/guha/rcdk/view/ViewMolecule2D", molecule, as.integer(width), :
java.lang.NoSuchMethodError:
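The `X11GraphicsEnvironment` error typically shows up when Java is asked to draw on a machine without a display, which would fit the root shell in a container shown above. One thing that may help for image rendering is running AWT in headless mode; this is an assumption about the environment, not a confirmed fix for this report, and it would not make the interactive view.molecule.2d() window work. The option must be set before the JVM starts.

```r
# Run at the top of a fresh session, before rJava/rcdk are loaded.
# Appends the standard AWT headless flag to any JVM options already set.
options(java.parameters = c(getOption("java.parameters"),
                            "-Djava.awt.headless=true"))

# library(rcdk)   # load rcdk only after the option has been set
getOption("java.parameters")
```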
Hydrogens are not included in plots made with the view.image.2d or view.molecule.2d functions. This is confusing when plotting structures containing, for example, hydroxyl groups. I suspect this is caused by the following default assignment in these functions:
molecule = AtomContainerManipulator.removeHydrogens(molecule)
This assignment should be updated to include the following options:
Hello,
As the title says.
Example:
sm1 <- "C1=CC=CC=C1"
sm2 <- "[O-]P(=O)([O-])[O-]"
rcdk::get.mcs(rcdk::parse.smiles(sm1)[[1]], rcdk::parse.smiles(sm2)[[1]])
Gives
Error in .jcall("org.guha.rcdk.util.Misc", "Lorg/openscience/cdk/interfaces/IAtomContainer;", :
java.lang.NullPointerException
Perhaps it should return an empty molecule (if that is possible...)?
Regards,
Rick