
duitangloader's People

Contributors

ivanlon30000


duitangloader's Issues

I have a question: I also want to scrape dynamically, but I don't know how the JSON changes

```python
import os
import requests
from lxml import etree  # requests and lxml are third-party packages; install them with pip first

url = "https://www.duitang.com/search/?kw=橘猫&type=feed"
headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36"
}

response = requests.get(url, headers=headers)  # fetch the search page
html_str = response.content.decode()  # decode the response body into a string
html = etree.HTML(html_str)  # build an XPath-capable tree; lxml repairs malformed HTML automatically

# Collect the thumbnail URLs. The attribute is lowercase @class, not @Class.
img_list = html.xpath("//a[@target='_blank'][@class='a']/img/@src")
# print(img_list)  # uncomment to check that the URL list looks right
# Take the first title and strip the spaces out of it
img_name = html.xpath("//div[@class='g']/text()")[0].replace(" ", "")

print(img_name)  # check that the title was extracted correctly; can be commented out later

# Create a folder named after the title, next to the script, to hold this set of images
os.makedirs(img_name, exist_ok=True)

for img_url in img_list:
    img_url = img_url.replace("thumb.400_0.", "")  # turn the thumbnail URL into the full-size URL
    filename = img_url.split("/")[-1]  # keep only the part after the last "/"
    # print(filename)  # uncomment to check the file name
    f = requests.get(img_url, headers=headers)  # download the image
    with open(os.path.join(img_name, filename), "wb") as code:  # save it to disk
        code.write(f.content)
    print("({}) full-size image downloaded.".format(filename))

print("All images downloaded and saved in the ({}) folder under the program directory.".format(img_name))
```

That is my code. I think I could step through the JSON by hand, appending values with params.append({}), but that is not very practical. Can I use your approach? If so, how? :D
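One possible way to avoid building the parameter list by hand is to generate each page's query parameters in a loop and stop when a page comes back empty. The sketch below is only an illustration under assumptions. `API_URL`, the parameter names `kw`, `start` and `limit`, the page size of 24, and the `data` / `object_list` response keys are all hypothetical placeholders. The real values would have to be read from the request the site makes in the browser's network tab, and this is not necessarily how duitangloader itself does it.

```python
from typing import Callable, Dict, Iterator, List

# Hypothetical placeholders: replace with what the browser's network tab shows.
API_URL = "https://example.com/api/search"
PAGE_SIZE = 24


def build_params(keyword: str, start: int, limit: int = PAGE_SIZE) -> Dict[str, object]:
    """Query parameters for one page. The key names are assumptions."""
    return {"kw": keyword, "start": start, "limit": limit}


def iter_items(
    keyword: str,
    fetch_page: Callable[[Dict[str, object]], List[dict]],
    max_items: int = 100,
) -> Iterator[dict]:
    """Yield items page by page until a page is empty or max_items is reached."""
    start = 0
    while start < max_items:
        page = fetch_page(build_params(keyword, start))
        if not page:  # an empty page means there is nothing left
            break
        # Never yield more than max_items in total
        yield from page[: max_items - start]
        start += len(page)  # the next offset is derived, not typed in by hand


def fetch_page_http(params: Dict[str, object]) -> List[dict]:
    """Fetch one page over HTTP. The response layout here is an assumption."""
    import requests  # imported lazily so the paging logic can be tried offline

    resp = requests.get(API_URL, params=params, timeout=10)
    resp.raise_for_status()
    return resp.json().get("data", {}).get("object_list", [])
```

With the real endpoint filled in, `for item in iter_items("橘猫", fetch_page_http): ...` would walk the results without any manual `params.append({})`. Passing the fetch function in as an argument also makes the paging logic easy to try with a fake fetcher before touching the network.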
