
redshelf-bypass-drm-pdfdownload's Introduction

RedShelf Virdocs - Bypass DRM protection PDF Download

On the web platform "RedShelf Virdocs" you are allowed to print your ePub/eReader documents to PDF, but sometimes the contributors or owners add DRM (Digital Rights Management) protection to prevent the downloading and printing of these eBooks, in order to avoid possible leaks or unauthorized third-party distribution.

This is not a vulnerability or something not legitimate, simply an exploitation of how these eBooks published on this platform are being offered. Although the direct download or print functionality is not allowed for some eBooks by the contributor or owner, it is possible to "download" them in this alternative way.

This Python script allows us to "bypass" this DRM protection, download ePub/eReader documents, and export them to PDF format. First, all pages are downloaded in jpg image format, then resized to A4 size, and finally a PDF document is generated.

Requirements

pip install requests
pip install reportlab

Get URL ePub/eReader images

Inside the RedShelf Virdocs ePub/eReader viewer, inspect the page and look for the iframe corresponding to the current page. In its body there is a div containing the internal URL in the src attribute of an img tag. Copy the absolute path and substitute it in the variable "base_url".

redshelf-virdocs-url-img

Variables

  • target_directory_img and pdf_file: the directory where the jpg images will be saved and the name of the exported PDF file.
  • base_url: replace XXXXXXX with the values from the URL.
  • numpag: set this to the number of jpg page images in the document.
target_directory_img = "TARGET_DIRECTORY_IMG"
base_url = "https://platform.virdocs.com/rscontent/epub/XXXXXXX/XXXXXXX/OEBPS/images/page-{}.jpg"
pdf_file = "EXPORT_PDF_FILE.pdf"
numpag = 350

Cookies

Replace session cookie settings. Use any browser addon for editing and exporting cookies.

{
    "session_id": "value1",
    "csrftoken": "value2"
}

Usage

python3 RedShelf-BypassDRM-PDFDownload.py

redshelf-virdocs-export-pdf

redshelf-bypass-drm-pdfdownload's People

Contributors

adrianlois


redshelf-bypass-drm-pdfdownload's Issues

Keeps downloading first page over and over

It’s possible I’ve made a mistake, since I’m new to Python (and programming in general), but when I run the file in Terminal it just repeatedly downloads the first page while changing the filename. I’ve attached screenshots of the download folder for reference.
download folder screenshot #1
download folder screenshot #2
download folder screenshot #3
I can share the modified Python and JSON files if that helps. I can also screenshot my Terminal window and share that too. Just let me know.

Downloads 0 bytes

Sometimes when downloading a page, it fails and leaves a 0-byte file. Probably best to check the file size after downloading to make sure it's greater than 0 bytes, and redownload it if it isn't.

A bit of helpful advice to people trying to get the IDs

Just wanted to give a bit of helpful advice to people
I use FireFox, so it likely won't be different for Chrome.

Variables
base_url = "https://platform.virdocs.com/rscontent/epub/XXXXXXX/XXXXXXX/OEBPS/images/page-{}.jpg"

To find the base URL in inspect element, search for

../images/page-1.jpg

This will show up as a link, so right-click it click on "Open Link In New Tab" then look at the URL. It should look something like the URL below.

https://platform.virdocs.com/private/manifest_item/XXXXXXX/XXXXXXX/YYYYYYYYYYYYYYYYY.jpg?AWSAccessKeyId=YYYYYYYYYYYYYY&Signature=YYYYYYYY&Expires=YYYYYYYYYY

The spot where I marked it as XXXXXXX will be the two variables that you need.

Cookies
So, for the session_ID and csrftoken. For the session_ID, there's only one, don't confuse it with "sessionid". On my end I get multiple IDs for the csrftoken, just be sure you get the long csrftoken, it will be basically the same length of the session_ID.

Hope that helps someone else out.
