springerquarantinebooksr's Introduction

springerQuarantineBooksR: download all Springer books made available during the COVID-19 quarantine

"With the Coronavirus outbreak having an unprecedented impact on education, Springer Nature is launching a global program to support learning and teaching at higher education institutions worldwide."

Source: https://group.springernature.com/gp/group/media/press-releases/freely-accessible-textbook-initiative-for-educators-and-students/17858180?utm_medium=social&utm_content=organic&utm_source=facebook&utm_campaign=SpringerNature_&sf232256230=1

This package provides the download_springer_book_files function, which downloads all (or a subset) of the freely available Springer book files. With the default parameters, it downloads the latest PDF versions of the English books and generates a repo of about 7.27 GB.

An excellent blog post with some nice usage examples can be found at https://www.statsandr.com/blog/a-package-to-download-free-springer-books-during-covid-19-quarantine/.

This is still a work in progress, so any help and feedback are welcome!

Installation

Assuming you have devtools installed, you can install springerQuarantineBooksR with the following code (you might also try the force = T argument inside the install_github function):

devtools::install_github("renanxcortes/springerQuarantineBooksR")
library(springerQuarantineBooksR)

Download all books in any repo of your choice

setwd('path_of_your_choice')
download_springer_book_files()

You'll get output similar to this:

Repo Structure generated

A repo named springer_quarantine_books will be generated with a specific structure:

Download only specific books

For example, if you'd like to download only books with "Data Science" in the title, you can run:

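library(dplyr)    # provides %>%, filter() and pull()
library(stringr)  # provides str_detect()

springer_table <- download_springer_table()

# Keep only the titles that contain "Data Science"
specific_titles_list <- springer_table %>%
  filter(str_detect(book_title, 'Data Science')) %>%
  pull(book_title)

download_springer_book_files(springer_books_titles = specific_titles_list)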

Download .epub extension of the books:

If you'd like to download the books in .epub format (alternatively, you can download both formats by setting filetype = 'both'), you can run:

setwd('path_of_your_choice_for_epub_books')
download_springer_book_files(filetype = 'epub')

Download books in German

If you'd like to download German books (more info in #16), you can run:

setwd('path_of_your_choice_for_german_books')
download_springer_book_files(lan = 'ger')

Acknowledgments

This project draws inspiration from the springer_free_books project available at https://github.com/alexgand/springer_free_books.

I would also like to thank @AntoineSoetewey for the constant help with feedback and with spreading the package!

Thank you, Springer!

springerquarantinebooksr's People

Contributors

renanxcortes, paeselhz, abichat, amanj131989, gabrielmagno, vinhtantran

springerquarantinebooksr's Issues

Installation error

After updating all packages, including dplyr, I am still getting the same error:
Error: (converted from warning) package 'dplyr' was built under R version 3.6.3
Execution halted
ERROR: lazy loading failed for package 'springerQuarantineBooksR'

* removing 'C:/Program Files/R/R-3.6.1/library/springerQuarantineBooksR'
Error: Failed to install 'springerQuarantineBooksR' from GitHub:
  (converted from warning) installation of package ‘C:/Users/n9675230/AppData/Local/Temp/Rtmpw3buJ7/file251c4f0151/springerQuarantineBooksR_0.1.0.tar.gz’ had non-zero exit status

Any suggestion how to proceed?

Thanks for your great work!!
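Several of the installation reports in this list share the "(converted from warning)" message, which means the installer treated a warning as an error. A commonly used workaround, sketched here as an assumption rather than a fix confirmed in this thread, is to set the remotes environment variable R_REMOTES_NO_ERRORS_FROM_WARNINGS before installing:

```r
# Assumption: the failure is only the "built under R version ..." warning
# being promoted to an error by the remotes package (used by devtools).
Sys.setenv(R_REMOTES_NO_ERRORS_FROM_WARNINGS = "true")

# Same install call as in the README.
devtools::install_github("renanxcortes/springerQuarantineBooksR")
```

Updating R itself so that it matches the version the dependencies were built under is the other option the warning text points to.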

All books show as 14KB in length

I am having an issue downloading all the Springer books. Running download_springer_book_files() gets me all the book files; however, they are all 14KB.
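One way to tell whether such files are real PDFs is to check their first bytes, since a valid PDF starts with the signature "%PDF". The sketch below is only illustrative: is_pdf_file is a hypothetical helper, not a function of the package, and the folder name is the default one mentioned in the README.

```r
# Hypothetical helper (not part of springerQuarantineBooksR): returns TRUE when
# a file starts with the "%PDF" signature that valid PDF files begin with.
is_pdf_file <- function(path) {
  header <- readBin(path, what = "raw", n = 4L)
  identical(header, charToRaw("%PDF"))
}

# List every downloaded .pdf and flag the ones that are not real PDFs.
# "springer_quarantine_books" is the default output folder named in the README.
pdf_paths <- list.files("springer_quarantine_books", pattern = "\\.pdf$",
                        recursive = TRUE, full.names = TRUE)
is_valid <- vapply(pdf_paths, is_pdf_file, logical(1))
pdf_paths[!is_valid]
```

Files that fail this check could then be opened in a text editor to see what was actually saved.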

Evaluation error: file '...\file....xlsx' cannot be opened

Springer seems to have updated their list of books, so the xlsx file that contains the list of books is not downloaded correctly.

Changing the end of line 21 in download_springer_table.R from

books_list_url <- 'https://resource-cms.springernature.com/springer-cms/rest/v1/content/17858272/data/v4/'

to

books_list_url <- 'https://resource-cms.springernature.com/springer-cms/rest/v1/content/17858272/data/v5/'

fixes the problem (i.e., replace the v4 with v5)

Can't find function "download_springer_book_files"

I'm running R 3.6.3, Mac 10.12.6.
Output file:
The downloaded binary packages are in
/var/folders/vd/td5slcr12ts4rfxgx06vvrww0000gp/T//RtmpmdFVWg/downloaded_packages
checking for file ‘/private/var/folders/vd/td5slcr12ts4rfxgx06vvrww0000gp/T/RtmpmdFVWg/remotes1c94e592bbe/renanxcortes-springerQuarantineBooksR-73b513e/DESCRIPTION’ ...

* installing *source* package ‘springerQuarantineBooksR’ ...
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (springerQuarantineBooksR)

setwd("Zipped 2")
Error in setwd("Zipped 2") : cannot change working directory
download_springer_book_files()
Error in download_springer_book_files() :
could not find function "download_springer_book_files"

Thanks in advance for any suggestions

Failed installation because of "rlang" package

I tried to install this package on a brand-new PC with the latest version of R just installed. I first installed the devtools package, and then ran the following code
devtools::install_github("renanxcortes/springerQuarantineBooksR", force=T)

It returns the error message:

Error: Failed to install 'springerQuarantineBooksR' from GitHub:
(converted from warning) cannot remove prior installation of package ‘rlang’

It is a brand-new PC with freshly installed R. What can the problem be? Thanks!

http_error(get_file)

Good job with the package! By the way, when trying to run "download_springer_book_files()" I get an error. The code I run:
devtools::install_github("renanxcortes/springerQuarantineBooksR")
library(springerQuarantineBooksR)
setwd('G:/.../Springer Ebooks')
download_springer_book_files()

And the error:
Downloading title latest editions.
Processing... Fundamentals of Power Electronics (1 out of 391)
Error in http_error(get_file) : could not find function "http_error"

Is this common? Is there a way to solve it? By the way, I use R version 4.0.0 (2020-04-24).

Thank you,

Pau

Invalid Color Space error from Adobe Reader

Thanks for creating this package and helping people with bulk downloading.

I installed, then loaded the package and then I executed download_springer_book_files(parallel = TRUE) as suggested in your README

It downloads almost 7.5 Gb of PDFs in the default folder 'springer_quarantine_books', however, trying to open some of them with Adobe Acrobat Reader DC version 2020.006.20042, I get the following errors, from case to case, e.g.:

  • for './springer_quarantine_books/Computer Science/A Beginners Guide to Python 3 Programming - 1st ed. 2019.pdf':
  • "There was an error processing a page. Invalid ColorSpace"
    I click OK and the PDF is blank with a gray background.
  • for './springer_quarantine_books/Behavioral Science/A Clinical Guide to the Treatment of the Human Stress Response - 3rd ed. 2013.pdf':
  • "Cannot extract the embedded font 'Times-Roman'. Some characters may not display or print correctly."
    I click OK then I get further:
  • "A drawing error occurred.". Click OK, then:
  • "An error exists on this page. Acrobat may not display the page correctly. Please contact the person who created the PDF document to correct the problem."
    In the end the PDF is blank with a white background.
    I get a similar error for './springer_quarantine_books\Business and Economics/Corporate Social Responsibility - 2013.pdf', but instead of referring to font 'Times-Roman', it mentions font IHMLIA+AdvP6975 (I presume this will vary from PDF to PDF).

When I download these books directly from Springer's link, I do not get the errors and the PDF can be read.

Any idea what might have happened?

Thanks so much for your support and help.

sessionInfo()
#> R version 3.6.2 (2019-12-12)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 18362)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=English_United States.1252 
#> [2] LC_CTYPE=English_United States.1252   
#> [3] LC_MONETARY=English_United States.1252
#> [4] LC_NUMERIC=C                          
#> [5] LC_TIME=English_United States.1252    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] compiler_3.6.2  magrittr_1.5    tools_3.6.2     htmltools_0.4.0
#>  [5] yaml_2.2.1      Rcpp_1.0.4.6    stringi_1.4.6   rmarkdown_2.1  
#>  [9] highr_0.8       knitr_1.28      stringr_1.4.0   xfun_0.13      
#> [13] digest_0.6.25   rlang_0.4.5     evaluate_0.14

Created on 2020-04-26 by the reprex package (v0.3.0)

Same book is downloaded (all books)

Great package!
I tried the function download_springer_book_files() but it seems that the same book is downloaded (the file name is correct). An issue with the URL maybe?
Package installed 1 hour ago.

Add possibility of .epub

Currently, the package downloads only the .pdf versions of the books. It would be worth adding the .epub versions as well, probably through some kind of filetype = 'epub' argument inside generate_springer_book_files.

Warnings appearance

Hi! First of all, thanks for the code!
I work with R version 3.6.3 and I used these lines to download the ebooks:

setwd("my path")
download_springer_book_files()

At the end of the process, the following appears on the console:

Warning messages:
1: In if (!dir.exists(current_folder)) { :
the condition has length > 1 and only the first element will be used
2: In if (!dir.exists(current_folder)) { :
the condition has length > 1 and only the first element will be used

And I didn't get the books. What am I supposed to do?
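This warning appears when `if` receives a condition with more than one element; dir.exists() is vectorized, so it returns one value per path (from R 4.2.0 onward the same situation is an error). The sketch below assumes that current_folder held several paths at once, which the warning suggests but the thread does not confirm. ensure_folders is a hypothetical helper and the folder names are made up.

```r
# Hypothetical helper: create each folder in a character vector if it is missing.
ensure_folders <- function(folders) {
  for (folder in folders) {
    # `folder` is a single path here, so `if` gets a length-one condition.
    if (!dir.exists(folder)) {
      dir.create(folder, recursive = TRUE)
    }
  }
  invisible(dir.exists(folders))
}

# Example with two made-up subject folders under a temporary directory.
base_dir <- file.path(tempdir(), "springer_quarantine_books")
ensure_folders(file.path(base_dir, c("Computer Science", "Mathematics")))
```

Looping over the paths keeps every `if` test scalar, which is why no warning is raised here.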

dplyr error when installing

Hello,

I'm trying to install this repo and it's showing some errors:

> devtools::install_github("renanxcortes/springerQuarantineBooksR", force = T)
Downloading GitHub repo renanxcortes/springerQuarantineBooksR@master
√  checking for file 'C:\Users\Dunnlab\AppData\Local\Temp\Rtmp4AYpjB\remotes290446b5124b\renanxcortes-springerQuarantineBooksR-6278667/DESCRIPTION' (533ms)
-  preparing 'springerQuarantineBooksR':
√  checking DESCRIPTION meta-information ... 
-  checking for LF line-endings in source and make files and shell scripts
-  checking for empty or unneeded directories
-  building 'springerQuarantineBooksR_0.1.0.tar.gz'
   
Installing package into ‘C:/Users/Dunnlab/Documents/R/win-library/3.6’
(as ‘lib’ is unspecified)
* installing *source* package 'springerQuarantineBooksR' ...
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
Error: (converted from warning) package 'janitor' was built under R version 3.6.3
Execution halted
ERROR: lazy loading failed for package 'springerQuarantineBooksR'
* removing 'C:/Users/Dunnlab/Documents/R/win-library/3.6/springerQuarantineBooksR'
Error: Failed to install 'springerQuarantineBooksR' from GitHub:
  (converted from warning) installation of package ‘C:/Users/Dunnlab/AppData/Local/Temp/Rtmp4AYpjB/file290433322c75/springerQuarantineBooksR_0.1.0.tar.gz’ had non-zero exit status

below is the session info

sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Matrix products: default

locale:
[1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252    LC_MONETARY=English_Canada.1252
[4] LC_NUMERIC=C                    LC_TIME=English_Canada.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] rstudioapi_0.11   magrittr_1.5      usethis_1.6.1     devtools_2.3.0    pkgload_1.0.2     R6_2.4.1         
 [7] rlang_0.4.6       fansi_0.4.1       tools_3.6.1       pkgbuild_1.0.8    sessioninfo_1.1.1 cli_2.0.2        
[13] withr_2.2.0       ellipsis_0.3.0    remotes_2.1.1     assertthat_0.2.1  digest_0.6.25     rprojroot_1.3-2  
[19] crayon_1.3.4      processx_3.4.2    callr_3.4.3       fs_1.4.1          ps_1.3.2          curl_4.3         
[25] testthat_2.3.2    memoise_1.1.0     glue_1.4.0        compiler_3.6.1    desc_1.2.0        backports_1.1.6  
[31] prettyunits_1.1.1

Unable to install

Using force = T didn't help either.

Here's the full log from the console:

Installing package into ‘C:/Users/Ariel Karlinsky/Documents/R/win-library/3.6’
(as ‘lib’ is unspecified)
* installing *source* package 'springerQuarantineBooksR' ...
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
Error: (converted from warning) package 'httr' was built under R version 3.6.1
Execution halted
ERROR: lazy loading failed for package 'springerQuarantineBooksR'
* removing 'C:/Users/Ariel Karlinsky/Documents/R/win-library/3.6/springerQuarantineBooksR'
Error: Failed to install 'springerQuarantineBooksR' from GitHub:
  (converted from warning) installation of package ‘C:/Users/ARIELK~1/AppData/Local/Temp/Rtmpyw2fkh/file271c4d4a64e/springerQuarantineBooksR_0.1.0.tar.gz’ had non-zero exit status

issues to install

Hi, I cannot install the package. The error:

WARNING: Rtools is required to build R packages, but is not currently installed.

Please download and install Rtools custom from https://cran.r-project.org/bin/windows/Rtools/.
√ checking for file 'C:\Users\javie\AppData\Local\Temp\RtmpiazejU\remotes3c1c50251966\renanxcortes-springerQuarantineBooksR-befa138/DESCRIPTION' ...

-  preparing 'springerQuarantineBooksR':
√  checking DESCRIPTION meta-information ...
-  checking for LF line-endings in source and make files and shell scripts
-  checking for empty or unneeded directories
-  building 'springerQuarantineBooksR_0.1.0.tar.gz'

Installing package into ‘C:/Users/javie/Documents/R/win-library/3.6’
(as ‘lib’ is unspecified)
* installing *source* package 'springerQuarantineBooksR' ...
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
Error: (converted from warning) package 'readxl' was built under R version 3.6.3
Execution halted
ERROR: lazy loading failed for package 'springerQuarantineBooksR'
* removing 'C:/Users/javie/Documents/R/win-library/3.6/springerQuarantineBooksR'
Error: Failed to install 'springerQuarantineBooksR' from GitHub:
  (converted from warning) installation of package ‘C:/Users/javie/AppData/Local/Temp/RtmpiazejU/file3c1c205e2eb2/springerQuarantineBooksR_0.1.0.tar.gz’ had non-zero exit status

sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

Random number generation:
RNG: Mersenne-Twister
Normal: Inversion
Sample: Rounding

locale:
[1] LC_COLLATE=Spanish_Spain.1252 LC_CTYPE=Spanish_Spain.1252
[3] LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C
[5] LC_TIME=Spanish_Spain.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] readxl_1.3.1

If you have any suggestions,
thanks for your help!

Allow download different editions of the same title?

Currently, because the package works with the book title (to make it easier to create subgroups of books by title), it fetches only the latest edition of each book. That is why the number of books generated is lower than the number of rows in the .xlsx catalog that Springer made available. It may be worth looking into an alternative that downloads all editions of the same book.
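The effect described here can be illustrated with a toy catalog. This is a minimal sketch in base R, not the package's actual code; the `year` column and the sample rows are assumptions made up for the example, while `book_title` is the column name used in the README.

```r
# Toy catalog; the `year` column and the rows are assumptions for illustration.
catalog <- data.frame(
  book_title = c("Book A", "Book A", "Book B"),
  year = c(2013, 2019, 2017),
  stringsAsFactors = FALSE
)

# Sort newest first, then keep the first row seen for each title.
newest_first <- catalog[order(catalog$year, decreasing = TRUE), ]
latest_only <- newest_first[!duplicated(newest_first$book_title), ]

nrow(catalog)      # one row per edition
nrow(latest_only)  # one row per title
```

Keeping all editions would amount to skipping the duplicated() step and making the file names unique, for example by including the edition in the name.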

Can not install the package

I am using R 3.6.1 and tried to install the new version, but it didn't work. How can I use your code with R 3.6.1?

installation non-zero exit status

I am trying to install the package (on multiple different Windows PCs) but encounter the error message:

installation of package ‘C:/Users/Tim/AppData/Local/Temp/RtmpSETARB/file3ce4752013ff/springerQuarantineBooksR_0.1.0.tar.gz’ had non-zero exit status

Any help is greatly appreciated.

Feature request: Provide an argument to resume downloading at a certain position

Allow to restart downloading titles from a certain position. I didn't fully dig into the code, but assuming that the order of the results from download_springer_table is stable, it would be sufficient to change the definition of download_springer_book_files to something like this:

download_springer_book_files <- function ([...], start = 1) {
    [...]
    for (title in springer_books_titles[start:length(springer_books_titles)]) {
        [...]
    }
    [...]
}
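The slicing step in this proposal can be tried on its own. The sketch below uses a hypothetical helper, titles_from, and made-up titles; it is not code from the package. It uses tail() rather than start:length(...) because the latter counts backwards when start is larger than the number of titles.

```r
# Hypothetical helper: return the titles from position `start` onward.
titles_from <- function(titles, start = 1) {
  stopifnot(start >= 1)
  # `start:length(titles)` counts backwards when start > length(titles);
  # tail() with a negative n drops the first `start - 1` items instead and
  # returns an empty vector in that case.
  if (start == 1) titles else tail(titles, -(start - 1))
}

titles <- c("Title 1", "Title 2", "Title 3", "Title 4")
for (title in titles_from(titles, start = 3)) {
  message("Downloading ", title)
}
```

As the proposal notes, resuming by position only works if the order of the titles is stable between runs.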

Error: Failed to install 'springerQuarantineBooksR'

package ‘rlang’ successfully unpacked and MD5 sums checked
Error: Failed to install 'springerQuarantineBooksR' from GitHub:
(converted from warning) cannot remove prior installation of package ‘rlang’
In addition: Warning messages:
1: In untar2(tarfile, files, list, exdir) :
skipping pax global extended headers
2: In untar2(tarfile, files, list, exdir) :
skipping pax global extended headers
