download_ss_pdf's Introduction

Download_SS_PDF

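This is a Python script that downloads books from the Chaoxing library (http://www.sslibrary.com ) as PDFs and automatically adds the table of contents. The author originally wrote this README in Chinese only, expecting few non-Chinese-speaking users.

I wrote this script as a Python beginner, googling as I went. It does not, of course, circumvent Chaoxing's copyright restrictions: it is only a web scraper, and all it does is save you from right-clicking "Save image" a few hundred times.

Please use it responsibly. I take no responsibility if anyone uses this project for illegal purposes.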

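Changelog

  • 2022-04-21 19:30:35 ver1.1: Added detection and re-download of bad pages.
  • 2022-04-22 16:10:35 ver1.2: Added compression to pure black-and-white PDF, which greatly reduces file size.
  • 2022-04-22 18:30:23 ver1.3: Improved bad-page detection and re-download; if a page still cannot be downloaded, there is nothing more I can do.
  • 2022-04-24 15:28:26 ver1.4: Minor detail improvements.
  • 2022-04-28 21:07:23 ver1.5: Restructured the code to run more efficiently.
  • 2022-04-29 18:35:56 ver1.6: Further improved black-and-white compression; file size now shrinks by roughly a factor of 15.
  • 2022-05-31 20:45:54 ver1.7: Output PDF pages now have a uniform width of 18 cm, which looks tidier. exe users can now change parameters as well.
  • 2022-10-06 17:31:46 ver1.8: Added downloading of selected pages; added spoofed headers to requests, but Chaoxing's anti-scraping measures seem to have been upgraded and the success rate is not high.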

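Features

  1. Downloads the cover, copyright page, preface, table-of-contents pages, etc. and merges them into a complete book PDF;
  2. Uses the highest image quality, the same as the official pdz download (zoom=3);
  3. Also downloads the table of contents and embeds it properly as PDF bookmarks;
  4. (New in ver1.2) Can compress the downloaded PDF to pure black and white, greatly reducing file size while keeping the text legible; ver1.6 further improved compression efficiency.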

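Environment and usage

A ready-to-run exe file is now available to lower the barrier to entry. The exe may hang for a few seconds after launch; be patient. The "no publisher" warning appears because I have not paid Microsoft its certification ("protection") fee; don't panic.

(Packaging Python as an exe turned out to be much simpler than I expected. The price is that the resulting package is far from lean...)

The environment is Python 3.x. The required modules are: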

import requests,time,os,shutil,img2pdf,sys,re,numpy,cv2,glob
from PyPDF2 import PdfFileReader,PdfFileWriter
from PIL import Image
from io import BytesIO

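If you are completely new to Python, all you need to do is download Visual Studio Code, install the Python extension, then open the directory where Python is installed (probably somewhere like \Program Files (x86)\Microsoft Visual Studio\Shared\Python39_64\), Shift+right-click the Scripts folder → "Open PowerShell window here", and run the following command:

pip3 install requests PyPDF2 Pillow img2pdf numpy opencv-python

(glob is part of the Python standard library and does not need to be installed. The PdfFileReader and PdfFileWriter names imported above were removed in PyPDF2 3.0.0, so a PyPDF2 release older than 3.0 is needed for the script as written.)

Then open this script in VS Code and run it.

Usage is very simple: open a book on the Chaoxing website, copy the URL of the reading page into the command line, press Enter, and wait for the download to finish.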

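Settings

The main settings are image quality and download interval:

  1. Image quality zoom: Chaoxing's highest-resolution images are zoom=3, but at that level they are always grayscale; to download a color book and keep its colors, change to zoom=2.
  2. Download interval: downloading too fast will get you banned! The default is therefore interval=1, i.e. a 1 s pause after each page, so downloads are somewhat slow. If you trust your IP, you can shorten it.

[Updated in 1.8] You can also set the number of download retries.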
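
How the interval and the retry count might interact can be sketched as follows. This is a minimal illustration, not the script's actual code: `download_with_retry`, `fetch_page`, and the injectable `sleep` parameter are hypothetical names, and the real script may structure its loop differently.

```python
import time
from typing import Callable, Optional


def download_with_retry(
    fetch_page: Callable[[int], bytes],
    page: int,
    retries: int = 3,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[bytes]:
    """Try to fetch one page up to `retries` times.

    Pauses `interval` seconds after every attempt and returns the page
    bytes, or None if every attempt failed.
    """
    for _attempt in range(retries):
        try:
            data = fetch_page(page)
        except OSError:
            # Network errors count as a failed attempt
            # (requests' exceptions derive from OSError).
            data = None
        sleep(interval)  # pause after every request, successful or not
        if data:
            return data
    return None
```

Passing `sleep` as a parameter is only a convenience for testing; in normal use the default `time.sleep` applies.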

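Notes on the new-style reader!

This script can only handle legacy reader pages whose URL starts with img.sslibrary.com. Some books published in recent years are offered as native e-book PDFs and use the new-style reader at ssj.sslibrary.com. In that reader the pages are no longer images, so this script cannot process them.

Even for these newer books, however, Chaoxing still produces a legacy scanned edition. You can visit Duxiu ( https://book.duxiu.com ), which links to both the new-style reader (labeled 「书世界」) and the legacy reader (labeled 「汇雅电子书」). The 「汇雅电子书」 page is one this script can handle.

Alternatively, on the Chaoxing library's own search page, the URL copied from 「PDF阅读」 (PDF reading) looks like this:
https://www.sslibrary.com/reader/pdf/pdfreader?ssid=...

Just replace both occurrences of pdf in it with jpath to enter the legacy reader:
https://www.sslibrary.com/reader/jpath/jpathreader?ssid=...

Image quality naturally cannot match the native e-book... but... it is still usable, right? (;´д`)ゞ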
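
The URL substitution described above can be sketched as a small helper. `to_legacy_reader_url` is a hypothetical name, and the `ssid` value in the example is a placeholder rather than a real book identifier.

```python
def to_legacy_reader_url(pdf_reader_url: str) -> str:
    """Rewrite a "PDF reading" URL into the legacy (jpath) reader URL."""
    old = "/reader/pdf/pdfreader"
    new = "/reader/jpath/jpathreader"
    if old not in pdf_reader_url:
        raise ValueError("not a pdfreader URL: " + pdf_reader_url)
    # Replace only the first occurrence so the query string is left untouched.
    return pdf_reader_url.replace(old, new, 1)


# Placeholder ssid, for illustration only.
print(to_legacy_reader_url("https://www.sslibrary.com/reader/pdf/pdfreader?ssid=12345"))
# prints https://www.sslibrary.com/reader/jpath/jpathreader?ssid=12345
```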

Credit

This script was inspired by https://github.com/0NG/sslibrary-pdf-downloader and completes the work its author planned but did not finish.

download_ss_pdf's People

Contributors

dertahsama

download_ss_pdf's Issues

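Request: resumable downloads

Because of my school's network and other factors, a slider CAPTCHA frequently appears after part of a book has been downloaded. I hope a resume feature can be added to work around this.
Here is one idea:
Create a file under RAW that contains the book's identifier and stores the current download progress (page number).
Update that file after each page finishes downloading.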
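
The idea above could look roughly like this. It is a sketch under assumptions: the JSON layout and the names `load_progress`, `save_progress`, and `progress.json` are hypothetical, and a temporary directory stands in for the RAW folder.

```python
import json
import os
import tempfile


def load_progress(path: str, book_id: str) -> int:
    """Return the last completed page for `book_id`, or 0 if there is no usable record."""
    try:
        with open(path, encoding="utf-8") as f:
            record = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return 0
    # A record left over from a different book must not be reused.
    if not isinstance(record, dict) or record.get("book_id") != book_id:
        return 0
    return int(record.get("last_page", 0))


def save_progress(path: str, book_id: str, last_page: int) -> None:
    """Record that all pages up to `last_page` have been downloaded."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump({"book_id": book_id, "last_page": last_page}, f)
    os.replace(tmp, path)  # swap in the new file so a crash cannot leave it half-written


# Usage: resume from the page after the last recorded one.
with tempfile.TemporaryDirectory() as raw_dir:  # stands in for the RAW folder
    progress_file = os.path.join(raw_dir, "progress.json")
    start = load_progress(progress_file, "book-A") + 1
    for page in range(start, 4):
        # ... download `page` here ...
        save_progress(progress_file, "book-A", page)
    print(load_progress(progress_file, "book-A"))  # prints 3
```

Writing to a temporary file and renaming it keeps the record readable even if the program is interrupted mid-write, which is exactly the situation a resume feature has to survive.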

Pages are unsorted since the result of glob.glob is not always sorted

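In my environment the PDF pages produced by this program are out of order, because the result of glob.glob() is not guaranteed to be sorted. On my Windows machine glob.glob() does return sorted results, but on my Linux machine it does not. Specifically:

  • Distribution: Manjaro
  • Python 3.10.4

Running this in the project root: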

>>> import glob
>>> glob.glob("./RAW/*.png")
['./RAW/000248.png', './RAW/000250.png', './RAW/000246.png', './RAW/000308.png', './RAW/000263.png', './RAW/000198.png', './RAW/000390.png', './RAW/000200.png', './RAW/000476.png', './RAW/000016.png', './RAW/000048.png', './RAW/000197.png', './RAW/000265.png', './RAW/000143.png', './RAW/000154.png', './RAW/000116.png', './RAW/000414.png', './RAW/000445.png', './RAW/000115.png', './RAW/000411.png', './RAW/000227.png', './RAW/000355.png', './RAW/000288.png', './RAW/000408.png', './RAW/000428.png', './RAW/000118.png', './RAW/000468.png', './RAW/000273.png', './RAW/000243.png', './RAW/000096.png', './RAW/000094.png', './RAW/000359.png', './RAW/000222.png', './RAW/000302.png', './RAW/000417.png', './RAW/000237.png', './RAW/000073.png', './RAW/000258.png', './RAW/000337.png', './RAW/000239.png', './RAW/000413.png', './RAW/000454.png', './RAW/000435.png', './RAW/000368.png', './RAW/000160.png', './RAW/000278.png', './RAW/000036.png', './RAW/000234.png', './RAW/000087.png', './RAW/000209.png', './RAW/000419.png', './RAW/000460.png', './RAW/000071.png', './RAW/000055.png', './RAW/000077.png', './RAW/000341.png', './RAW/000204.png', './RAW/000274.png', './RAW/000085.png', './RAW/000449.png', './RAW/000346.png', './RAW/000141.png', './RAW/000119.png', './RAW/000422.png', './RAW/000076.png', './RAW/000126.png', './RAW/000380.png', './RAW/000295.png', './RAW/000280.png', './RAW/000191.png', './RAW/000331.png', './RAW/000245.png', './RAW/000472.png', './RAW/000038.png', './RAW/000478.png', './RAW/000135.png', './RAW/000074.png', './RAW/000336.png', './RAW/000107.png', './RAW/000181.png', './RAW/000082.png', './RAW/000229.png', './RAW/000215.png', './RAW/000348.png', './RAW/000132.png', './RAW/000104.png', './RAW/000272.png', './RAW/000434.png', './RAW/000481.png', './RAW/000375.png', './RAW/000170.png', './RAW/000098.png', './RAW/000385.png', './RAW/000153.png', './RAW/000022.png', './RAW/000241.png', './RAW/000180.png', './RAW/000026.png', './RAW/000425.png', 
'./RAW/000221.png', './RAW/000159.png', './RAW/000438.png', './RAW/000122.png', './RAW/000418.png', './RAW/000360.png', './RAW/000338.png', './RAW/000430.png', './RAW/000416.png', './RAW/000303.png', './RAW/000108.png', './RAW/000297.png', './RAW/000062.png', './RAW/000008.png', './RAW/000168.png', './RAW/000283.png', './RAW/000345.png', './RAW/000024.png', './RAW/000410.png', './RAW/000152.png', './RAW/000439.png', './RAW/000253.png', './RAW/000056.png', './RAW/000376.png', './RAW/000015.png', './RAW/000277.png', './RAW/000109.png', './RAW/000236.png', './RAW/000137.png', './RAW/000354.png', './RAW/000175.png', './RAW/000014.png', './RAW/000162.png', './RAW/000161.png', './RAW/000112.png', './RAW/000230.png', './RAW/000088.png', './RAW/000304.png', './RAW/000081.png', './RAW/000211.png', './RAW/000035.png', './RAW/000092.png', './RAW/000451.png', './RAW/000440.png', './RAW/000356.png', './RAW/000005.png', './RAW/000343.png', './RAW/000167.png', './RAW/000075.png', './RAW/000281.png', './RAW/000034.png', './RAW/000060.png', './RAW/000479.png', './RAW/000030.png', './RAW/000322.png', './RAW/000467.png', './RAW/000378.png', './RAW/000452.png', './RAW/000477.png', './RAW/000350.png', './RAW/000266.png', './RAW/000006.png', './RAW/000271.png', './RAW/000179.png', './RAW/000127.png', './RAW/000100.png', './RAW/000019.png', './RAW/000448.png', './RAW/000225.png', './RAW/000174.png', './RAW/000054.png', './RAW/000186.png', './RAW/000306.png', './RAW/000313.png', './RAW/000431.png', './RAW/000182.png', './RAW/000442.png', './RAW/000218.png', './RAW/000455.png', './RAW/000009.png', './RAW/000256.png', './RAW/000032.png', './RAW/000333.png', './RAW/000157.png', './RAW/000144.png', './RAW/000091.png', './RAW/000002.png', './RAW/000031.png', './RAW/000282.png', './RAW/000397.png', './RAW/000124.png', './RAW/000318.png', './RAW/000446.png', './RAW/000007.png', './RAW/000483.png', './RAW/000106.png', './RAW/000474.png', './RAW/000261.png', './RAW/000342.png', './RAW/000041.png', 
'./RAW/000158.png', './RAW/000156.png', './RAW/000223.png', './RAW/000012.png', './RAW/000457.png', './RAW/000361.png', './RAW/000335.png', './RAW/000286.png', './RAW/000164.png', './RAW/000240.png', './RAW/000401.png', './RAW/000441.png', './RAW/000369.png', './RAW/000267.png', './RAW/000420.png', './RAW/000247.png', './RAW/000405.png', './RAW/000183.png', './RAW/000177.png', './RAW/000196.png', './RAW/000291.png', './RAW/000367.png', './RAW/000427.png', './RAW/000120.png', './RAW/000268.png', './RAW/000409.png', './RAW/000027.png', './RAW/000349.png', './RAW/000466.png', './RAW/000469.png', './RAW/000090.png', './RAW/000371.png', './RAW/000437.png', './RAW/000262.png', './RAW/000042.png', './RAW/000314.png', './RAW/000289.png', './RAW/000129.png', './RAW/000084.png', './RAW/000456.png', './RAW/000362.png', './RAW/000392.png', './RAW/000429.png', './RAW/000394.png', './RAW/000064.png', './RAW/000117.png', './RAW/000393.png', './RAW/000194.png', './RAW/000353.png', './RAW/000190.png', './RAW/000004.png', './RAW/000187.png', './RAW/000226.png', './RAW/000166.png', './RAW/000332.png', './RAW/000072.png', './RAW/000424.png', './RAW/000293.png', './RAW/000193.png', './RAW/000029.png', './RAW/000347.png', './RAW/000169.png', './RAW/000130.png', './RAW/000329.png', './RAW/000443.png', './RAW/000086.png', './RAW/000358.png', './RAW/000235.png', './RAW/000063.png', './RAW/000003.png', './RAW/000207.png', './RAW/000377.png', './RAW/000053.png', './RAW/000251.png', './RAW/000395.png', './RAW/000372.png', './RAW/000059.png', './RAW/000099.png', './RAW/000139.png', './RAW/000045.png', './RAW/000320.png', './RAW/000111.png', './RAW/000315.png', './RAW/000264.png', './RAW/000453.png', './RAW/000078.png', './RAW/000210.png', './RAW/000192.png', './RAW/000089.png', './RAW/000021.png', './RAW/000387.png', './RAW/000459.png', './RAW/000482.png', './RAW/000217.png', './RAW/000260.png', './RAW/000316.png', './RAW/000254.png', './RAW/000220.png', './RAW/000339.png', './RAW/000040.png', 
'./RAW/000228.png', './RAW/000269.png', './RAW/000047.png', './RAW/000049.png', './RAW/000461.png', './RAW/000432.png', './RAW/000284.png', './RAW/000383.png', './RAW/000471.png', './RAW/000103.png', './RAW/000475.png', './RAW/000423.png', './RAW/000095.png', './RAW/000325.png', './RAW/000249.png', './RAW/000370.png', './RAW/000389.png', './RAW/000184.png', './RAW/000148.png', './RAW/000140.png', './RAW/000028.png', './RAW/000403.png', './RAW/000101.png', './RAW/000205.png', './RAW/000384.png', './RAW/000244.png', './RAW/000433.png', './RAW/000150.png', './RAW/000165.png', './RAW/000470.png', './RAW/000131.png', './RAW/000310.png', './RAW/000464.png', './RAW/000123.png', './RAW/000364.png', './RAW/000102.png', './RAW/000340.png', './RAW/000238.png', './RAW/000450.png', './RAW/000199.png', './RAW/000066.png', './RAW/000300.png', './RAW/000133.png', './RAW/000017.png', './RAW/000216.png', './RAW/000176.png', './RAW/000473.png', './RAW/000065.png', './RAW/000285.png', './RAW/000068.png', './RAW/000391.png', './RAW/000252.png', './RAW/000373.png', './RAW/000114.png', './RAW/000290.png', './RAW/000203.png', './RAW/000458.png', './RAW/000214.png', './RAW/000178.png', './RAW/000415.png', './RAW/000402.png', './RAW/000406.png', './RAW/000039.png', './RAW/000357.png', './RAW/000020.png', './RAW/000399.png', './RAW/000202.png', './RAW/000305.png', './RAW/000400.png', './RAW/000125.png', './RAW/000447.png', './RAW/000463.png', './RAW/000057.png', './RAW/000083.png', './RAW/000134.png', './RAW/000208.png', './RAW/000051.png', './RAW/000279.png', './RAW/000465.png', './RAW/000224.png', './RAW/000407.png', './RAW/000366.png', './RAW/000327.png', './RAW/000149.png', './RAW/000023.png', './RAW/000025.png', './RAW/000319.png', './RAW/000058.png', './RAW/000242.png', './RAW/000069.png', './RAW/000436.png', './RAW/000185.png', './RAW/000013.png', './RAW/000462.png', './RAW/000231.png', './RAW/000138.png', './RAW/000232.png', './RAW/000398.png', './RAW/000219.png', './RAW/000276.png', 
'./RAW/000396.png', './RAW/000321.png', './RAW/000147.png', './RAW/000365.png', './RAW/000270.png', './RAW/000188.png', './RAW/000255.png', './RAW/000324.png', './RAW/000195.png', './RAW/000018.png', './RAW/000326.png', './RAW/000275.png', './RAW/000301.png', './RAW/000296.png', './RAW/000171.png', './RAW/000421.png', './RAW/000374.png', './RAW/000105.png', './RAW/000330.png', './RAW/000379.png', './RAW/000097.png', './RAW/000142.png', './RAW/000136.png', './RAW/000080.png', './RAW/000426.png', './RAW/000292.png', './RAW/000311.png', './RAW/000061.png', './RAW/000307.png', './RAW/000050.png', './RAW/000001.png', './RAW/000323.png', './RAW/000113.png', './RAW/000079.png', './RAW/000317.png', './RAW/000386.png', './RAW/000412.png', './RAW/000299.png', './RAW/000033.png', './RAW/000163.png', './RAW/000257.png', './RAW/000212.png', './RAW/000382.png', './RAW/000067.png', './RAW/000444.png', './RAW/000351.png', './RAW/000213.png', './RAW/000363.png', './RAW/000151.png', './RAW/000201.png', './RAW/000043.png', './RAW/000189.png', './RAW/000388.png', './RAW/000010.png', './RAW/000146.png', './RAW/000480.png', './RAW/000155.png', './RAW/000121.png', './RAW/000352.png', './RAW/000309.png', './RAW/000404.png', './RAW/000052.png', './RAW/000206.png', './RAW/000046.png', './RAW/000259.png', './RAW/000145.png', './RAW/000287.png', './RAW/000294.png', './RAW/000128.png', './RAW/000298.png', './RAW/000037.png', './RAW/000312.png', './RAW/000110.png', './RAW/000328.png', './RAW/000093.png', './RAW/000172.png', './RAW/000070.png', './RAW/000233.png', './RAW/000344.png', './RAW/000334.png', './RAW/000381.png', './RAW/000173.png', './RAW/000044.png', './RAW/000011.png']

For details, see this issue: python/cpython#77456

To keep the program reliable, please consider sorting the list of file names before writing the PDF file. Change

with open("temp.pdf", "wb+") as pdf_temp:
    pdf_temp.write(img2pdf.convert(glob.glob(files_to_save)))  # build the PDF

to

with open("temp.pdf", "wb+") as pdf_temp:
    images = glob.glob(files_to_save)
    images.sort()  # zero-padded names sort into page order
    pdf_temp.write(img2pdf.convert(images))  # build the PDF

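Add an open-source license

Hello, and thank you very much for this project (it is very useful!). Could you add an open-source license to make it easier to use and contribute to?

"EOF occurred in violation of protocol" error, cannot download; can this be fixed?

The program reports "EOF occurred in violation of protocol" and cannot download. Could you release an update? Thanks!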

[code]
PS D:\Downloads> .\Download_SS_PDF.ver1.8.exe
Download_SS_PDF-ver1.8, by DertahSama, 2022.10.5
这是一个从超星图书馆(http://www.sslibrary.com )下载PDF并且自动添加目录和压缩、并且支持选页下载的python脚本,然后打包成了exe。
本项目地址:https://github.com/DertahSama/Download_SS_PDF

输入阅读页面网址(页面不要关):https://img.sslibrary.com/n/slib/book/slib/10253709/373cc3111f99438f95ac0fa5af4bd670/ec8de8a852b0df94e81d781e6f882466.shtml?dxbaoku=false&moocbaoku=false&deptid=147&fav=https%3A%2F%2Fwww.sslibrary.com%2Freader%2Fpdg%2Fpdgreader%3Fd%3D3eabcf181fd8507a983e18fba70763dd%26ssid%3D10253709&fenlei=130408010201&spage=1&t=5&username=219.142.99.12&view=-1
开始获取信息……
Traceback (most recent call last):
File "urllib3\connectionpool.py", line 700, in urlopen
File "urllib3\connectionpool.py", line 994, in prepare_proxy
File "urllib3\connection.py", line 364, in connect
File "urllib3\connection.py", line 499, in connect_tls_proxy
File "urllib3\util\ssl_.py", line 453, in ssl_wrap_socket
File "urllib3\util\ssl_.py", line 495, in _ssl_wrap_socket_impl
File "ssl.py", line 500, in wrap_socket
File "ssl.py", line 1040, in _create
File "ssl.py", line 1309, in do_handshake
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:1129)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "requests\adapters.py", line 440, in send
File "urllib3\connectionpool.py", line 785, in urlopen
File "urllib3\util\retry.py", line 592, in increment
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='img.sslibrary.com', port=443): Max retries exceeded with url: /n/slib/book/slib/10253709/373cc3111f99438f95ac0fa5af4bd670/ec8de8a852b0df94e81d781e6f882466.shtml?dxbaoku=false&moocbaoku=false&deptid=147&fav=https%3A%2F%2Fwww.sslibrary.com%2Freader%2Fpdg%2Fpdgreader%3Fd%3D3eabcf181fd8507a983e18fba70763dd%26ssid%3D10253709&fenlei=130408010201&spage=1&t=5&username=219.142.99.12&view=-1&Upgrade-Insecure-Requests=1&User-Agent=Mozilla%2F5.0+%28Windows+NT+10.0%3B+WOW64%29+AppleWebKit%2F537.36+%28KHTML%2C+like+Gecko%29+Chrome%2F86.0.4240.198+Safari%2F537.36&Accept=text%2Fhtml%2Capplication%2Fxhtml%2Bxml%2Capplication%2Fxml%3Bq%3D0.9%2Cimage%2Favif%2Cimage%2Fwebp%2Cimage%2Fapng%2C%2A%2F%2A%3Bq%3D0.8%2Capplication%2Fsigned-exchange%3Bv%3Db3%3Bq%3D0.9&Accept-Encoding=gzip%2C+deflate%2C+br&Accept-Language=zh-CN%2Czh%3Bq%3D0.9 (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1129)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "Download_SS_PDF ver1.8.py", line 434, in <module>
File "Download_SS_PDF ver1.8.py", line 404, in main
File "Download_SS_PDF ver1.8.py", line 65, in GetData
File "requests\api.py", line 75, in get
File "requests\api.py", line 61, in request
File "requests\sessions.py", line 529, in request
File "requests\sessions.py", line 645, in send
File "requests\adapters.py", line 517, in send
requests.exceptions.SSLError: HTTPSConnectionPool(host='img.sslibrary.com', port=443): Max retries exceeded with url: /n/slib/book/slib/10253709/373cc3111f99438f95ac0fa5af4bd670/ec8de8a852b0df94e81d781e6f882466.shtml?dxbaoku=false&moocbaoku=false&deptid=147&fav=https%3A%2F%2Fwww.sslibrary.com%2Freader%2Fpdg%2Fpdgreader%3Fd%3D3eabcf181fd8507a983e18fba70763dd%26ssid%3D10253709&fenlei=130408010201&spage=1&t=5&username=219.142.99.12&view=-1&Upgrade-Insecure-Requests=1&User-Agent=Mozilla%2F5.0+%28Windows+NT+10.0%3B+WOW64%29+AppleWebKit%2F537.36+%28KHTML%2C+like+Gecko%29+Chrome%2F86.0.4240.198+Safari%2F537.36&Accept=text%2Fhtml%2Capplication%2Fxhtml%2Bxml%2Capplication%2Fxml%3Bq%3D0.9%2Cimage%2Favif%2Cimage%2Fwebp%2Cimage%2Fapng%2C%2A%2F%2A%3Bq%3D0.8%2Capplication%2Fsigned-exchange%3Bv%3Db3%3Bq%3D0.9&Accept-Encoding=gzip%2C+deflate%2C+br&Accept-Language=zh-CN%2Czh%3Bq%3D0.9 (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1129)')))
[56184] Failed to execute script 'Download_SS_PDF ver1.8' due to unhandled exception!
PS D:\Downloads> .\Download_SS_PDF.ver1.8.exe
Download_SS_PDF-ver1.8, by DertahSama, 2022.10.5
这是一个从超星图书馆(http://www.sslibrary.com )下载PDF并且自动添加目录和压缩、并且支持选页下载的python脚本,然后打包成了exe。
本项目地址:https://github.com/DertahSama/Download_SS_PDF

输入阅读页面网址(页面不要关):https://img.sslibrary.com/n/slib/book/slib/10253709/373cc3111f99438f95ac0fa5af4bd670/ec8de8a852b0df94e81d781e6f882466.shtml
开始获取信息……
Traceback (most recent call last):
File "urllib3\connectionpool.py", line 700, in urlopen
File "urllib3\connectionpool.py", line 994, in prepare_proxy
File "urllib3\connection.py", line 364, in connect
File "urllib3\connection.py", line 499, in connect_tls_proxy
File "urllib3\util\ssl_.py", line 453, in ssl_wrap_socket
File "urllib3\util\ssl_.py", line 495, in _ssl_wrap_socket_impl
File "ssl.py", line 500, in wrap_socket
File "ssl.py", line 1040, in _create
File "ssl.py", line 1309, in do_handshake
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:1129)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "requests\adapters.py", line 440, in send
File "urllib3\connectionpool.py", line 785, in urlopen
File "urllib3\util\retry.py", line 592, in increment
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='img.sslibrary.com', port=443): Max retries exceeded with url: /n/slib/book/slib/10253709/373cc3111f99438f95ac0fa5af4bd670/ec8de8a852b0df94e81d781e6f882466.shtml?Upgrade-Insecure-Requests=1&User-Agent=Mozilla%2F5.0+%28Windows+NT+10.0%3B+WOW64%29+AppleWebKit%2F537.36+%28KHTML%2C+like+Gecko%29+Chrome%2F86.0.4240.198+Safari%2F537.36&Accept=text%2Fhtml%2Capplication%2Fxhtml%2Bxml%2Capplication%2Fxml%3Bq%3D0.9%2Cimage%2Favif%2Cimage%2Fwebp%2Cimage%2Fapng%2C%2A%2F%2A%3Bq%3D0.8%2Capplication%2Fsigned-exchange%3Bv%3Db3%3Bq%3D0.9&Accept-Encoding=gzip%2C+deflate%2C+br&Accept-Language=zh-CN%2Czh%3Bq%3D0.9 (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1129)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "Download_SS_PDF ver1.8.py", line 434, in <module>
File "Download_SS_PDF ver1.8.py", line 404, in main
File "Download_SS_PDF ver1.8.py", line 65, in GetData
File "requests\api.py", line 75, in get
File "requests\api.py", line 61, in request
File "requests\sessions.py", line 529, in request
File "requests\sessions.py", line 645, in send
File "requests\adapters.py", line 517, in send
requests.exceptions.SSLError: HTTPSConnectionPool(host='img.sslibrary.com', port=443): Max retries exceeded with url: /n/slib/book/slib/10253709/373cc3111f99438f95ac0fa5af4bd670/ec8de8a852b0df94e81d781e6f882466.shtml?Upgrade-Insecure-Requests=1&User-Agent=Mozilla%2F5.0+%28Windows+NT+10.0%3B+WOW64%29+AppleWebKit%2F537.36+%28KHTML%2C+like+Gecko%29+Chrome%2F86.0.4240.198+Safari%2F537.36&Accept=text%2Fhtml%2Capplication%2Fxhtml%2Bxml%2Capplication%2Fxml%3Bq%3D0.9%2Cimage%2Favif%2Cimage%2Fwebp%2Cimage%2Fapng%2C%2A%2F%2A%3Bq%3D0.8%2Capplication%2Fsigned-exchange%3Bv%3Db3%3Bq%3D0.9&Accept-Encoding=gzip%2C+deflate%2C+br&Accept-Language=zh-CN%2Czh%3Bq%3D0.9 (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1129)')))
[53512] Failed to execute script 'Download_SS_PDF ver1.8' due to unhandled exception!
PS D:\Downloads>

[/code]
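
One detail in the log above may be worth checking. The browser-style headers (User-Agent, Accept, and so on) appear URL-encoded inside the query string of the requested URL. A possible explanation, inferred from the log only and not confirmed against the script's source, is that the header dictionary was passed as the second positional argument of `requests.get`, which is `params`, instead of as `headers=`. The sketch below shows the difference offline using `requests.Request(...).prepare()`; the URL and header value are placeholders. The traceback also passes through the proxy code path (`prepare_proxy`), so the handshake failure itself may be related to a proxy configured on the machine, which this sketch does not address.

```python
import requests

URL = "https://img.sslibrary.com/example.shtml"  # placeholder, not a real book page
HEADERS = {"User-Agent": "Mozilla/5.0 (placeholder)"}

# Suspected mistake: requests.get(URL, HEADERS) binds the dictionary to `params`,
# because the signature is requests.get(url, params=None, **kwargs).
as_params = requests.Request("GET", URL, params=HEADERS).prepare()

# Intended call: requests.get(URL, headers=HEADERS) sends real HTTP headers.
as_headers = requests.Request("GET", URL, headers=HEADERS).prepare()

print(as_params.url)                     # the query string now contains User-Agent=...
print(as_headers.url)                    # the URL is unchanged
print(as_headers.headers["User-Agent"])  # the header is set as intended
```

If headers really do end up in the query string, the server never sees them as headers, which would be consistent with the low success rate noted for ver1.8.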

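Problem caused by timeouts

The links to the PDF pages are dynamic and expire after a certain time, after which the whole page shows "⚠数据加载失败,请稍后重试!" ("Data failed to load, please try again later!"). Some work should be done to detect this condition and abort.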
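
One way to detect such a page is sketched below, under the assumption that a failed response is not a valid image file. If the server instead returns a rendered image of the warning, this check will not catch it and a different heuristic is needed. `looks_like_image`, `check_page`, and `PageExpiredError` are hypothetical names.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


class PageExpiredError(Exception):
    """Raised when a downloaded page does not look like an image."""


def looks_like_image(data: bytes) -> bool:
    """Return True if `data` starts with a PNG or JPEG file signature."""
    return data.startswith(PNG_SIGNATURE) or data.startswith(JPEG_SIGNATURE)


def check_page(data: bytes, page: int) -> bytes:
    """Return `data` unchanged, or abort the download with PageExpiredError."""
    if not looks_like_image(data):
        raise PageExpiredError(
            f"page {page}: response is not a PNG/JPEG image; "
            "the reader link may have expired"
        )
    return data
```

Raising an exception rather than saving the bad page lets the caller stop early and ask for a fresh reading-page URL instead of producing a PDF full of error pages.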
