Bioconductor / AnnotationDbi
Manipulation of SQLite-based annotations in Bioconductor
Home Page: https://bioconductor.org/packages/AnnotationDbi
I am trying to convert rat genes to their human orthologs using idConverter(). When I run the code, I get an error saying that hom.Rn.inp.db must be loaded, but that package has been removed from Bioconductor. I want my ortholog retrieval to be up to date, trustworthy, and reproducible, so I would rather not install a deprecated package.
Code:
orthologs=idConverter(ids=allrats_sigs$NOG,
srcSpecies = "RATNO",
destSpecies = "HOMSA",
srcIDType ="ENSEMBL" )
Error:
Loading required package: hom.Rn.inp.db
Error in get(paste0("hom.", srcSpcAbrv, ".inp", destSpecies)) :
object 'hom.Rn.inpHOMSA' not found
In addition: Warning message:
In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, :
there is no package called 'hom.Rn.inp.db'
https://github.com/Bioconductor/AnnotationDbi/blob/devel/vignettes/AnnotationDbi.Rnw is noted as deprecated; however
From Slack, with @hpages:
Maybe we should keep and ignore that vignette as proposed by Vince. It's true that users are no longer supposed to use the 'bimaps' interface but we don't know whether other Bioconductor packages are still using it or not. So we might want to keep the vignette around until we know for sure that all packages have migrated to the 'select' interface (this is the replacement for the 'bimaps' interface). Thx
After updating to R 4.0, I now get an error with the following code:
OrgDb = 'org.Hs.eg.db'
res@organism <- AnnotationDbi::species(OrgDb)
Unable to find an inherited method for function ‘species’ for signature ‘"character"’
I am guessing it has something to do with the stringsAsFactors changes?
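One possible reading of the error, offered as a sketch rather than a confirmed diagnosis: the message says that species() was dispatched on a character string, so passing the annotation object itself instead of the package name may avoid it. The helper orgdb_from_name below is hypothetical and assumes the package exports an object named after itself, as org.Hs.eg.db does. The lookup only runs if both packages are installed.

```r
# Hypothetical helper: given a package name as a string, return the
# annotation object that the package exports under the same name.
# Assumption: the package exports such an object (true for org.Hs.eg.db).
orgdb_from_name <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("package '", pkg, "' is not installed")
  }
  getExportedValue(pkg, pkg)
}

OrgDb <- "org.Hs.eg.db"

# Only attempt the lookup when both packages are available.
if (requireNamespace("AnnotationDbi", quietly = TRUE) &&
    requireNamespace(OrgDb, quietly = TRUE)) {
  db <- orgdb_from_name(OrgDb)
  # Dispatches on the annotation object rather than on a character string.
  print(AnnotationDbi::species(db))
}
```

Whether the stringsAsFactors change plays any role is not established here; the sketch only addresses the dispatch on "character" named in the error.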
Hello,
Love the package. Saw an issue I want to discuss.
Okay, so I was using the Mus.musculus annotation package to make a genomic ranges object containing all of the promoters. Because I have access to your wonderful package through the AnnotationDbi framework, I can attach all of that metadata to this object in one go instead of having to merge three different databases together via ENTREZID.
Here was the code I ran.
# Package setup
BiocManager::install("OrganismDbi")
library(OrganismDbi)
BiocManager::install("Mus.musculus")
library(Mus.musculus)
BiocManager::install("GenomicFeatures")
library(GenomicFeatures)
# Making this object just for comparison
Mm_gene <- transcriptsBy(Mus.musculus, by="gene", columns=c("SYMBOL", "ENTREZID", "TXCHROM", "TXSTART", "TXSTRAND", "CDSSTART"))
Mm_gene
# Here is the promoter object. You can see I'm calling 1500 bp upstream of the transcription start and 500 bp downstream of the transcription start site my "promoter region" for this analysis.
Mm_gene_promoters <- promoters(transcriptsBy(Mus.musculus, by="gene", columns=c("SYMBOL", "ENTREZID", "TXCHROM", "TXSTART", "TXSTRAND", "CDSSTART")), upstream = 1500, downstream = 500)
Mm_gene_promoters
Below are screenshots of the outputs.
Mm_gene
Even though the transcripts are on the minus strand, TXSTART is reported as the first base pair of the genomic range.
Here you can see that the promoters() function from GenomicFeatures gets it right. For Zglp1, which is on the minus strand, it places the promoter region 1500 bases upstream and 500 bases downstream of the transcription start site, which means adding 1500 bp to, and subtracting 500 bp from, the last base pair of the range.
I was curious whether the TXSTART metadata coming from the Mus.musculus package is simply taken from the first base pair of the genomic range. It would be simple to add a strand check that takes the last base pair of the range instead for transcripts on the minus strand. Otherwise this is going to confuse people who use this metadata without knowing where these numbers come from.
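The strand check described above can be sketched in base R. This is a minimal illustration, not the Mus.musculus or GenomicFeatures implementation: the coordinates are made up, the helper strand_aware_tss is hypothetical, and the promoter arithmetic assumes the convention that the transcription start site itself counts as the first downstream base.

```r
# Toy transcript table; the coordinates are invented for illustration.
tx <- data.frame(
  tx_name = c("txA", "txB"),
  start   = c(10000L, 50000L),
  end     = c(12000L, 56000L),
  strand  = c("+", "-"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the TSS is the leftmost base on the plus strand
# and the rightmost base on the minus strand.
strand_aware_tss <- function(start, end, strand) {
  ifelse(strand == "-", end, start)
}

tx$tss <- strand_aware_tss(tx$start, tx$end, tx$strand)

# Promoter window: 1500 bp upstream and 500 bp downstream of the TSS.
# Assumption: the TSS is counted as the first downstream base.
upstream <- 1500L
downstream <- 500L
minus <- tx$strand == "-"
tx$prom_start <- ifelse(minus, tx$tss - downstream + 1L, tx$tss - upstream)
tx$prom_end   <- ifelse(minus, tx$tss + upstream, tx$tss + downstream - 1L)

print(tx)
```

With this convention both windows are 2000 bp wide, and for the minus-strand transcript the window is anchored on the end of the range rather than its start.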
In my analysis code, I have not-uncommon occurrences of:
library(AnnotationHub)
ens.mm.v97 <- AnnotationHub()[["AH73905"]]
anno <- select(ens.mm.v97, keys=rownames(se),
keytype="GENEID", columns=c("SYMBOL", "SEQNAME"))
rowData(se) <- anno[match(rownames(se), anno$GENEID),]
It would be nice to do something like:
anno <- select(ens.mm.v97, keys=rownames(se), multiVals="first",
keytype="GENEID", columns=c("SYMBOL", "SEQNAME"))
... and save myself an extra line of code (and improve robustness to changes to the annotation object), much like how I get an integer vector if I ask for findOverlaps(..., select="first").
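Until something like that exists, the match() idiom from the first snippet can be wrapped in a small function. The sketch below uses a toy data frame in place of the result of select(); the helper select_first and the IDs are hypothetical, and nothing here is part of AnnotationDbi.

```r
# Toy stand-in for a select() result with a 1:many mapping; IDs are made up.
anno <- data.frame(
  GENEID = c("g1", "g1", "g2", "g3"),
  SYMBOL = c("A1", "A2", "B", "C"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return exactly one row per key, in key order,
# keeping the first hit and filling NA where a key has no hit.
select_first <- function(tbl, keys, keytype) {
  out <- tbl[match(keys, tbl[[keytype]]), , drop = FALSE]
  out[[keytype]] <- keys   # keep the key even when nothing matched
  rownames(out) <- NULL
  out
}

keys <- c("g3", "g1", "missing")
res <- select_first(anno, keys, "GENEID")
print(res)
```

The result always has one row per key, so it can be assigned to rowData() directly without a separate match() step.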
> class(AnnotationDbi::mapIds(TxDb.Hsapiens.UCSC.hg38.knownGene, "1", "TXID", "GENEID", multiVals="asNA"))
'select()' returned 1:many mapping between keys and columns
[1] "logical"
One solution is to change line
AnnotationDbi/R/methods-geneCentricDbs.R
Line 1142 in 00cc5c5
as.character(unlist(data))
or better, in the line above, explicitly use NA_character_
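The type behaviour behind this report can be reproduced in base R. The snippet is a toy illustration using a hand-built list, not the actual AnnotationDbi code; it shows why a result consisting only of NA comes back as logical and how the two proposed fixes differ.

```r
# A bare NA is logical, so unlisting a result made only of NA gives logical.
all_na <- list(`1` = NA)
plain <- unlist(all_na)
class(plain)      # "logical"

# Coercing after unlist() restores the character type but drops the names.
coerced <- as.character(unlist(all_na))
class(coerced)    # "character"
names(coerced)    # NULL

# Using NA_character_ at the source keeps both the type and the names.
typed_na <- list(`1` = NA_character_)
typed <- unlist(typed_na)
class(typed)      # "character"
names(typed)      # "1"
```

This is one reason the NA_character_ variant may be preferable: the key names on the returned vector survive.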
AnnotationDbi/R/createAnnObjs.ORGANISM_DB.R
Line 140 in 50cba36
makeAnnDbMapSeeds()
is defined in a comment in the same R file, but I don't see it defined anywhere else.
Should this be removed?
Hi,
I'm trying to make an organism package from annotations using makeOrgPackage(), following the Bioconductor vignette.
Unfortunately, when I use the code example from the vignette, I get the following error, which flags a problem with AnnotationDbi:
Error in installed.packages()["AnnotationDbi", "Version"] :
subscript out of bounds
The final package isn't built, although the rest appears to work fine:
Populating genes table:
genes table filled
Populating gene_info table:
gene_info table filled
Populating chromosome table:
chromosome table filled
Populating go table:
go table filled
table metadata filled
'select()' returned many:1 mapping between keys and columns
Dropping GO IDs that are too new for the current GO.db
Populating go table:
go table filled
Populating go_bp table:
go_bp table filled
Populating go_cc table:
go_cc table filled
Populating go_mf table:
go_mf table filled
'select()' returned many:1 mapping between keys and columns
Populating go_bp_all table:
go_bp_all table filled
Populating go_cc_all table:
go_cc_all table filled
Populating go_mf_all table:
go_mf_all table filled
Populating go_all table:
go_all table filled
Error in installed.packages()["AnnotationDbi", "Version"] :
subscript out of bounds
In addition: There were 50 or more warnings (use warnings() to see the first 50)
> warnings()
Warning messages:
1: In result_fetch(res@ptr, n = n) :
SQL statements must be issued with dbExecute() or dbSendStatement() instead of dbGetQuery() or dbSendQuery().
...
I would appreciate any help on how to solve this error.
Many thanks in advance!
Jan
> sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] AnnotationForge_1.28.0 GenomeInfoDb_1.22.1 biomaRt_2.42.1 GO.db_3.10.0
[5] org.Pf.plasmo.db_3.10.0 pkgconfig_2.0.3 AnnotationDbi_1.48.0 IRanges_2.20.2
[9] S4Vectors_0.24.4 Biobase_2.46.0 BiocGenerics_0.32.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.4.6 pillar_1.4.3 compiler_3.6.3 dbplyr_1.4.2
[5] bitops_1.0-6 prettyunits_1.1.1 tools_3.6.3 progress_1.2.2
[9] digest_0.6.25 bit_1.1-15.2 lifecycle_0.2.0 RSQLite_2.2.0
[13] memoise_1.1.0 BiocFileCache_1.10.2 tibble_3.0.0 rlang_0.4.5
[17] cli_2.0.2 DBI_1.1.0 curl_4.3 GenomeInfoDbData_1.2.2
[21] stringr_1.4.0 httr_1.4.1 dplyr_0.8.5 rappdirs_0.3.1
[25] vctrs_0.2.4 askpass_1.1 hms_0.5.3 tidyselect_1.0.0
[29] bit64_0.9-7 glue_1.4.0 R6_2.4.1 fansi_0.4.1
[33] XML_3.99-0.3 purrr_0.3.3 blob_1.2.1 magrittr_1.5
[37] ellipsis_0.3.0 assertthat_0.2.1 stringi_1.4.6 RCurl_1.98-1.1
[41] openssl_1.4.1 crayon_1.3.4
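The error text in the report above is what R produces when a matrix is indexed by a row name it does not contain. The sketch below reproduces that with a toy matrix standing in for installed.packages() and then lists two checks that could be run in the affected session. It is a diagnostic aid under stated assumptions, not a confirmed explanation: the helper version_or_na is hypothetical, and why AnnotationDbi would be missing from the matrix while attached in sessionInfo() is not known from the report.

```r
# Toy stand-in for the matrix returned by installed.packages(); row names
# are package names, and "AnnotationDbi" is deliberately absent.
pkgs <- matrix(
  c("1.28.0", "2.2.0"),
  ncol = 1,
  dimnames = list(c("AnnotationForge", "RSQLite"), "Version")
)

# Indexing by a missing row name raises the same kind of error.
err <- tryCatch(pkgs["AnnotationDbi", "Version"], error = conditionMessage)
print(err)

# Hypothetical helper: return NA instead of failing when the row is missing.
version_or_na <- function(tbl, pkg) {
  if (pkg %in% rownames(tbl)) tbl[pkg, "Version"] else NA_character_
}
print(version_or_na(pkgs, "RSQLite"))
print(version_or_na(pkgs, "AnnotationDbi"))

# Checks that could be run in the real session:
print("AnnotationDbi" %in% rownames(installed.packages()))
print(.libPaths())
```

If the first check returns FALSE even though the package loads, comparing .libPaths() with the location the package was loaded from may be a useful next step.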
In Bioconductor 3.17, when I call library(AnnotationDbi), or load a package that imports AnnotationDbi, I get this warning:
Warning message:
replacing previous import ‘utils::findMatches’ by ‘S4Vectors::findMatches’ when loading ‘AnnotationDbi’
I'm running R 4.3.0 on macOS 13.3.1.