doc_downloader's Introduction

Multi-site document downloader

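This tool downloads previewable documents from 豆丁 (Docin), 道客巴巴 (Doc88), 淘豆网 (Taodocs), 原创力 (Book118), 新浪爱问 (Sina iAsk), and 金锄头 (Jinchutou). Any document that can be previewed can be downloaded. Pages are downloaded as images and then converted to a PDF with the reportlab library.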
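The image-to-PDF step could look roughly like the sketch below. It is not taken from the project: the function names `page_sort_key` and `images_to_pdf` are hypothetical, and it assumes the page images carry a page number in their file names (for example `page1.png`, `page2.png`). It uses reportlab's `canvas.Canvas` and `ImageReader`, and sizes each PDF page to match its image.

```python
import re
from pathlib import Path


def page_sort_key(path):
    """Sort key that orders 'page2.png' before 'page10.png'."""
    digits = re.findall(r"\d+", Path(path).stem)
    # Files without any number sort first; ties fall back to the name.
    return (int(digits[-1]) if digits else -1, str(path))


def images_to_pdf(image_paths, pdf_path):
    """Write one PDF page per image, each page sized to its image."""
    # Imported here so the sorting helper works without reportlab installed.
    from reportlab.lib.utils import ImageReader
    from reportlab.pdfgen import canvas

    pdf = canvas.Canvas(str(pdf_path))
    for path in sorted(image_paths, key=page_sort_key):
        image = ImageReader(str(path))
        width, height = image.getSize()  # pixel size, reused as points
        pdf.setPageSize((width, height))
        pdf.drawImage(image, 0, 0, width=width, height=height)
        pdf.showPage()  # finish the current page
    pdf.save()
```

A call such as `images_to_pdf(Path("pages").glob("*.png"), "document.pdf")` would then assemble the pages in numeric order. The `pages` directory name is only an example.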

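Because the 新浪爱问 (Sina iAsk) site serves its documents as SVG files, converting them to images requires a third-party library: svg2png on Windows, or rsvg on Linux. rsvg can also be installed on Windows; this requires downloading CRAN and installing rsvg through it to perform the SVG conversion.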
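As an illustration of choosing a converter per platform, here is a minimal sketch that builds the command line for one SVG file. It is an assumption, not the project's code: the function name `svg_to_png_command` is hypothetical, and on non-Windows systems it substitutes the `rsvg-convert` command-line tool that ships with librsvg for the rsvg library mentioned above. Both tools are assumed to be on `PATH`.

```python
import platform


def svg_to_png_command(svg_path, png_path, system=None):
    """Return the argv list for converting one SVG file to PNG."""
    system = system or platform.system()
    if system == "Windows":
        # svg2png CLI installed through npm (assumed to be on PATH).
        return ["svg2png", svg_path, f"--output={png_path}"]
    # rsvg-convert from librsvg (assumed to be on PATH).
    return ["rsvg-convert", svg_path, "-o", png_path]
```

The result can be passed to `subprocess.run(..., check=True)`. On Windows, npm installs command-line tools as `.cmd` wrappers, so running the command may require `shell=True` or the explicit name `svg2png.cmd`.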

The project also provides a simple online download page: [click to enter]

Installing the rsvg library

Binary packages for OS-X or Windows can be installed directly from CRAN:

install.packages("rsvg")

Installation from source on Linux or OSX requires librsvg2. On Debian or Ubuntu install librsvg2-dev:

sudo apt-get install -y librsvg2-dev

On Fedora, CentOS or RHEL we need librsvg2-devel:

sudo yum install librsvg2-devel

On OS-X use rsvg from Homebrew:

brew install librsvg

Installing svg2png (Windows only)

1. Install Node.js.
2. At the command prompt, enter: npm install -g svg2png
3. At the command prompt, enter: Set-ExecutionPolicy -ExecutionPolicy 

Using this project

In a terminal, enter:

pip install -r requirements.txt
python docDownloader.py

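If the tool reports an error, first check whether the chromedriver version is compatible with the installed Chrome version. If it is not, replace the chromedriver.exe in the project with a compatible version.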
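A quick way to compare the two versions is sketched below. It assumes that equal major version numbers are a sufficient compatibility check, which is a rule of thumb rather than a guarantee. The helper names `major_version` and `versions_compatible` are hypothetical and not part of the project.

```python
import re


def major_version(version_output):
    """Extract the major version from text such as 'ChromeDriver 114.0.5735.90'."""
    match = re.search(r"(\d+)\.\d+\.\d+", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return int(match.group(1))


def versions_compatible(chrome_output, driver_output):
    """Treat equal major versions as compatible (a rule of thumb)."""
    return major_version(chrome_output) == major_version(driver_output)
```

The driver string can be obtained by running `chromedriver.exe --version`. The Chrome string can be copied from the browser's About page.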

doc_downloader's People

Contributors

rty813
