ropensci / medrxivr
Access and search medRxiv and bioRxiv preprint data
Home Page: https://docs.ropensci.org/medrxivr/
License: Other
According to the docs, the default from_date is "2019-06-01" (lines 6 to 7 in bdec02c).
However, the actual default from_date is "2013-01-01".
For example, mx_search("dementia", auto_caps = TRUE) would find both Dementia and dementia.
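One way such an option could work is to rewrite the first letter of the term as a two-character class before matching. The helper below is a minimal sketch in base R; the name caps_pattern() and the toy titles are made up for illustration and this is not necessarily how the package implements it.

```r
# Hypothetical helper (not the package's implementation): make the first
# letter of a search term match in either case.
caps_pattern <- function(term) {
  first <- substr(term, 1, 1)
  rest <- substr(term, 2, nchar(term))
  if (!grepl("[[:alpha:]]", first)) {
    return(term)  # nothing to do if the term does not start with a letter
  }
  paste0("[", toupper(first), tolower(first), "]", rest)
}

pattern <- caps_pattern("dementia")  # "[Dd]ementia"

# Toy titles, invented for this example
titles <- c("Dementia risk in older adults",
            "Lipids and dementia",
            "Stroke outcomes")
grepl(pattern, titles)
```

Only the first letter is relaxed here, so a fully upper-case "DEMENTIA" would still be missed; ignore.case = TRUE in grepl() is the blunter alternative.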
results <- medrxivr::mx_search(medrxivr::mx_snapshot(),
medrxivr::mx_caps("mendelian randomi*ation"))
This seems to happen because the "total" metadata field reports more records than are actually available.
As of 14:39 on 04/01/2021, the number of records given by "total" is 148231. However, if you set the counter to any record within 31 of this figure (e.g. https://api.biorxiv.org/details/biorxiv/2013-01-01/2021-01-04/148201), you get a "No posts found" message. As medrxivr uses the "total" metadata field to work out how many pages it needs to cycle through to download the whole database, this sometimes leads to an error when the last page, expected by medrxivr based on the "total" field, is empty.
Note that as more records are added to the API, the hard-coded figures above will no longer demonstrate the issue.
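A possible workaround is to treat an empty page as the end of the data rather than as an error. The sketch below shows the idea in base R with a simulated API: fetch_page() and download_all() are hypothetical names, the page size of 100 is an assumption, and no real request is made.

```r
# Hypothetical stand-in for an API request: returns the records starting at
# `cursor`, or an empty data frame when the API has nothing there.
fetch_page <- function(cursor, available = 250, page_size = 100) {
  if (cursor >= available) {
    return(data.frame(id = integer(0)))
  }
  last <- min(cursor + page_size, available)
  data.frame(id = seq(cursor + 1, last))
}

# Walk through the pages implied by the reported total, but stop quietly
# at the first empty page instead of failing.
download_all <- function(reported_total, page_size = 100, ...) {
  cursors <- seq(0, max(reported_total - 1, 0), by = page_size)
  pages <- list()
  for (cursor in cursors) {
    page <- fetch_page(cursor, page_size = page_size, ...)
    if (nrow(page) == 0) {
      break  # the reported total overshoots what is actually available
    }
    pages[[length(pages) + 1]] <- page
  }
  do.call(rbind, pages)
}

# The simulated API claims 331 records but only serves 250
records <- download_all(reported_total = 331, available = 250)
nrow(records)
```

Stopping at the first empty page assumes that records are served contiguously, which seems reasonable here but is not something the API documentation is quoted as guaranteeing.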
Requested by editor:
Prepare for release:
devtools::build_readme()
usethis::use_cran_comments()
devtools::check(remote = TRUE, manual = TRUE)
devtools::check_win_devel()
rhub::check_for_cran(env_vars=c(R_COMPILE_AND_INSTALL_PACKAGES = "always"))
cran-comments.md
Submit to CRAN:
usethis::use_version('major')
devtools::submit_cran()
Wait for CRAN...
usethis::use_github_release()
usethis::use_dev_version()
While preparing the workflow directory and required actions in a GitHub Actions workflow, the following error is encountered:
Error: Unable to resolve action `r-lib/actions@master`, unable to find version `master`
This prevents the workflow from running.
Via https://ropensci.r-universe.dev/ui#builds we see
Missing images in 'README.md': 'articles/data_sources.png'
ℹ pkgdown can only use images in 'man/figures' and 'vignettes'
Any proximity value of NEAR10 or above fails.
It seems like the bioRxiv/medRxiv APIs are down entirely. I'm experiencing the following error with every request:
Error : (2002) Connection refused
Can you confirm this? All examples provided in the docs (https://api.biorxiv.org) are failing.
For example: https://api.biorxiv.org/details/biorxiv/2018-08-21/2018-08-28/45 results in the described error.
Are you experiencing the same? Posting here, since it might affect your entire package.
Wondering if there's an option to add altmetrics to the search function and, additionally, the number of times a preprint's PDF has been clicked.
It is often useful to be able to see the number of "hits" (records returned) by each individual element of the search, so that when designing the search strategy you can interrogate which elements are influencing the returned records the most. So for example, if the search is:
topic1 <- c("dementia", "Alzheimer's") # Combined with Boolean "OR"
topic2 <- c("lipids", "cholesterol") # Combined with Boolean "OR"
query <- list(topic1,topic2) # Combined with Boolean "AND"
results <- mx_search(mx_snapshot(), query)
Then passing query to the proposed mx_reporter() function would return something like the below:
# Total number of records found by your search: XX
# Total topic 1 records: XX
# - dementia: XX
# - Alzheimer's: XX
# Total topic 2 records: XX
# - lipids: XX
# - cholesterol: XX
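To make the proposal concrete, here is a minimal sketch of what such a reporter could look like in base R. Everything in it is an assumption for illustration: the function name mx_reporter_sketch(), the toy data frame standing in for a snapshot, the single "abstract" column, and matching terms as plain regular expressions.

```r
# Hypothetical sketch of the proposed reporter; `records` is a toy stand-in
# for a snapshot, and terms are matched as plain regex against one column.
mx_reporter_sketch <- function(records, query, field = "abstract") {
  text <- records[[field]]
  topic_hits <- lapply(query, function(terms) {
    # Hits for each individual term in the topic
    per_term <- vapply(terms, function(term) sum(grepl(term, text)), integer(1))
    # Records matching any term in the topic (Boolean "OR")
    any_term <- grepl(paste(terms, collapse = "|"), text)
    list(per_term = per_term, any_term = any_term)
  })
  # Records matching every topic (Boolean "AND")
  all_topics <- Reduce(`&`, lapply(topic_hits, `[[`, "any_term"))
  cat("Total number of records found by your search:", sum(all_topics), "\n")
  for (i in seq_along(topic_hits)) {
    cat("Total topic", i, "records:", sum(topic_hits[[i]]$any_term), "\n")
    for (term in names(topic_hits[[i]]$per_term)) {
      cat(" -", paste0(term, ":"), topic_hits[[i]]$per_term[[term]], "\n")
    }
  }
  invisible(topic_hits)
}

# Toy records, invented for this example
records <- data.frame(
  abstract = c("dementia and cholesterol",
               "Alzheimer's disease and lipids",
               "dementia in primary care",
               "cholesterol lowering drugs"),
  stringsAsFactors = FALSE
)
topic1 <- c("dementia", "Alzheimer's")
topic2 <- c("lipids", "cholesterol")
hits <- mx_reporter_sketch(records, list(topic1, topic2))
```

Returning the per-topic counts invisibly keeps the printed report as the main output while still letting the counts be reused programmatically.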
setting value
version R version 4.2.2 (2022-10-31)
os macOS Ventura 13.2.1
system x86_64, darwin17.0
ui RStudio
language (EN)
collate en_US.UTF-8
ctype en_US.UTF-8
tz America/Toronto
date 2024-03-20
rstudio 2023.09.1+494 Desert Sunflower (desktop)
pandoc NA
> mx_api_content(
+ from_date = "2013-01-01",
+ to_date = as.character(Sys.Date()),
+ clean = TRUE,
+ server = "medrxiv",
+ include_info = FALSE
+ )
Error in count/100 : non-numeric argument to binary operator
> mx_data <- mx_api_content(from_date = "2020-01-01",
+ to_date = "2020-01-07")
Error in count/100 : non-numeric argument to binary operator
> if(interactive()){
+ mx_data <- mx_api_content(from_date = "2020-01-01",
+ to_date = "2020-01-07")
+ }
Error in count/100 : non-numeric argument to binary operator
> preprint_data <- mx_api_content(server = "biorxiv")
Error in count/100 : non-numeric argument to binary operator
> preprint_data <- mx_api_content()
Error in count/100 : non-numeric argument to binary operator
Offhand, I can see that this is caused by a bad value of count, probably NA, NULL, or 0. I believe the same error has been included automatically in the docs.
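For what it's worth, in base R the message "non-numeric argument to binary operator" is raised when an operand is not numeric at all (for example a character string or a list); NA / 100, NULL / 100 and 0 / 100 all evaluate without an error. So one possibility consistent with the output above is that count arrives as text. The sketch below shows a defensive conversion; pages_needed() is a hypothetical helper, and the page size of 100 is taken from the count/100 in the error message.

```r
# Hypothetical helper: coerce an API-supplied count to a number and fail with
# a clearer message if it cannot be used to compute the number of pages.
pages_needed <- function(count, page_size = 100) {
  count <- suppressWarnings(as.numeric(unlist(count)))
  if (length(count) != 1 || is.na(count)) {
    stop("API did not return a usable record count", call. = FALSE)
  }
  ceiling(count / page_size)
}

# A count delivered as a string is converted before the division
pages_needed("148231")

# A missing count now produces an explicit, readable error
result <- try(pages_needed(NULL), silent = TRUE)
inherits(result, "try-error")
```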
I am guessing either something changed server-side or in a dependency that invalidated the lib's logic. I am writing a Python version now; will let you know if I find the problem and solution.
The pkgdown GitHub action is failing, but the Jenkins version works fine. It gives the error message:
-- Building function reference -------------------------------------------------
Error in check_missing_topics(rows, pkg) :
Topics missing from index: medrxivr
For example: if a user searches for "randomi*ation", convert this to "randomi([[:alpha:]])ation", where the ([[:alpha:]]) element matches any single alphabetic character. In this case, the regex will find both randomisation and randomization.
The idea is to prevent users from having to use unfamiliar regex terms, in favour of common MEDLINE/EMBASE/Ovid syntax.
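The conversion described above can be sketched in a couple of lines of base R. wildcard_to_regex() is a hypothetical name used only for this illustration.

```r
# Hypothetical helper: replace each "*" wildcard with a regex that matches
# exactly one alphabetic character.
wildcard_to_regex <- function(term) {
  # fixed = TRUE treats "*" literally rather than as a regex quantifier
  gsub("*", "([[:alpha:]])", term, fixed = TRUE)
}

pattern <- wildcard_to_regex("randomi*ation")  # "randomi([[:alpha:]])ation"

# The third string has no letter in the wildcard position
matches <- grepl(pattern, c("randomisation", "randomization", "randomiation"))
matches
```

Because the wildcard stands for exactly one character, a spelling with no letter in that position is not matched; that mirrors the single-character behaviour described above.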
E.g. "attitude NEAR2 data" won't find "attitudes to data"
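A plausible explanation is that the first term must be followed directly by whitespace, so a suffix such as the "s" in "attitudes" breaks the match. The sketch below illustrates this with one possible translation of NEARn into a regex; near_regex() and its stem argument are hypothetical, and the package's actual pattern may differ.

```r
# Hypothetical translation of "A NEARn B" into a regex: A, then at most n
# intervening words, then B. `stem = TRUE` also lets A carry a suffix.
near_regex <- function(first, second, n, stem = FALSE) {
  suffix <- if (stem) "[[:alpha:]]*" else ""
  paste0(first, suffix,
         "[[:space:]]+([[:graph:]]+[[:space:]]+){0,", n, "}",
         second)
}

text <- "attitudes to data"

# Strict version: "attitude" must be followed directly by whitespace
strict <- grepl(near_regex("attitude", "data", 2), text)

# Relaxed version: trailing letters on the first term are allowed
relaxed <- grepl(near_regex("attitude", "data", 2, stem = TRUE), text)

c(strict = strict, relaxed = relaxed)
```

Whether stemming should be implicit is a design question: allowing a suffix by default finds more records but also matches unrelated words that happen to share the stem.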
Some topics are missing from the configuration file.
Error in check_missing_topics(rows, pkg) :
All topics must be included in reference index
• Missing topics: medrxivr, mx_caps
Note that for topics you do not want to include in the index, you can create an "internal" section (see https://pkgdown.r-lib.org/reference/build_reference.html?q=internal#missing-topics).
You can also use the @keywords internal tag and redocument for, say, the package manual page.
To check all topics are listed, after editing the configuration file you can run pkgdown::check_pkgdown().