getdouyin's Introduction

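The original Java version has been deprecated; the tool is now implemented as a Python script.

The code is adapted from another GitHub developer's project, with minor modifications.

Notes:

  • Node.js and Python 3 are required; install both yourself.
  • Currently the script can only download all videos of a specified Douyin user (including favorites), or all videos under a specified topic (challenge) or music track.
  • When I have spare time, I plan to change it to crawl all user information and all videos automatically.
  • If you have time, feel free to fork the code, modify it, and submit a pull request.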

How to discuss and get in touch

Environment setup

Programmers, see here

Set up your Python and Node.js environment, then run pip install -r requirements.txt.

Or

$ git clone https://github.com/liupeng328/GetDouYin.git
$ cd GetDouYin
$ pip install -r requirements.txt
$ python amemv-video-ripper.py

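All done. Skip ahead to the next section, Configuration and running.

Configuration and running

There are two ways to specify the Douyin share links you want to download: edit share-url.txt, or pass command-line arguments.

Method 1: Edit the share-url.txt file

Open share-url.txt in a text editor and enter the Douyin share links you want to download, separated by commas, spaces, tabs, or newlines; multiple lines are allowed. For example, the file might look like this: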

https://www.douyin.com/share/user/85860189461?share_type=link&tt_from=weixin&utm_source=weixin&utm_medium=aweme_ios&utm_campaign=client_share&uid=97193379950&did=30337873848,

https://www.iesdouyin.com/share/challenge/1593608573838339?utm_campaign=clien,

https://www.iesdouyin.com/share/music/6536362398318922509?utm_campaign=client_share&app=aweme&utm_medium=ios&iid=30337873848&utm_source=copy
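
The separator rule for share-url.txt (commas, spaces, tabs, or newlines) can be sketched in Python as follows. This is a minimal illustration, not the script's actual code; the function name parse_share_urls is hypothetical.

```python
import re


def parse_share_urls(text):
    """Split file content on commas and any whitespace, dropping empty entries.

    parse_share_urls is a hypothetical helper written for illustration.
    """
    return [item for item in re.split(r"[,\s]+", text) if item]


# Sample content mixing commas, spaces, tabs, and newlines.
sample = (
    "https://www.douyin.com/share/user/85860189461,\n"
    "\n"
    "http://v.douyin.com/cDo2P/ http://v.douyin.com/cFuAN/\thttp://v.douyin.com/cMdjU/\n"
)
urls = parse_share_urls(sample)
for url in urls:
    print(url)
```

Splitting on one combined pattern means a trailing comma followed by a blank line, as in the example file above, produces no empty entries.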

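How to get a user's share link (challenges and music work similarly)

Share directly from the Douyin app, then copy the link by sending it to QQ or WeChat, or simply paste it somewhere yourself.

Then save the file and double-click amemv-video-ripper.py, or run python amemv-video-ripper.py in a terminal.

Method 2: Use command-line arguments (only for users comfortable with an operating-system terminal)

If you are familiar with the Windows or Unix command line, you can specify the share links to download through command-line arguments:

On some platforms, remember to wrap the URLs in quotes.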

python amemv-video-ripper.py URL1,URL2

Separate the share links with commas, with no spaces.
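
As a rough sketch of how a comma-separated argument like the one above could be read in Python (the helper name urls_from_argv is hypothetical, and the real script may parse its arguments differently):

```python
def urls_from_argv(argv):
    """Return the share links passed as one comma-separated argument.

    argv is expected to look like sys.argv: script name first, then arguments.
    urls_from_argv is a hypothetical helper written for illustration.
    """
    if len(argv) < 2:
        return []
    return [part.strip() for part in argv[1].split(",") if part.strip()]


# Simulated command line: python amemv-video-ripper.py URL1,URL2
links = urls_from_argv([
    "amemv-video-ripper.py",
    "http://v.douyin.com/cDo2P/,http://v.douyin.com/cFuAN/",
])
print(links)
```

The quoting advice matters because long share links contain the & character, which most shells treat as a command separator unless the URL is quoted.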

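Downloading and saving videos

After the program runs, it creates a folder in the current directory, named after the Douyin ID by default, and places all videos in that folder.

Running the script does not re-download videos that were already downloaded, so there is no need to worry about duplicates. Running it multiple times can also help you recover lost or deleted videos.

If local videos are lost or deleted, re-run the download command.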
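
One common way to get this skip-if-present behavior is to check whether the target file already exists before downloading. The sketch below assumes that approach; the helper name needs_download and the file names are hypothetical, and the script's actual check may differ.

```python
import os
import tempfile


def needs_download(folder, filename):
    """Return True when the target file is absent from the folder.

    needs_download is a hypothetical helper written for illustration.
    """
    return not os.path.isfile(os.path.join(folder, filename))


# Demonstration in a temporary folder: one video is present, one is missing.
with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "a.mp4"), "wb") as handle:
        handle.write(b"video bytes")
    present_needs_download = needs_download(folder, "a.mp4")
    missing_needs_download = needs_download(folder, "b.mp4")

print(present_needs_download, missing_needs_download)
```

With a check like this, deleting a file locally is enough to make the next run fetch it again.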

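Advanced usage

To download an entire challenge topic, add the challenge's share URL to share-url.txt.

To download by music, add the music's share URL to share-url.txt.

The following shows the three crawl modes: Douyin account, challenge topic, and music. Note that the crawler only downloads the first search result, so write out the topic or music name as completely as possible.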

https://www.douyin.com/share/user/85860189461?share_type=link&tt_from=weixin&utm_source=weixin&utm_medium=aweme_ios&utm_campaign=client_share&uid=97193379950&did=30337873848,

https://www.iesdouyin.com/share/challenge/1593608573838339?utm_campaign=clien,

https://www.iesdouyin.com/share/music/6536362398318922509?utm_campaign=client_share&app=aweme&utm_medium=ios&iid=30337873848&utm_source=copy
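
The three link types differ only in their path segment (share/user, share/challenge, share/music), so they can be told apart with regular expressions. The following is a minimal sketch under that assumption; classify_share_url is a hypothetical name, and raw strings are used so that \d is not treated as a string escape.

```python
import re

# One pattern per crawl mode; each captures the numeric ID in the path.
PATTERNS = {
    "user": re.compile(r"share/user/(\d+)"),
    "challenge": re.compile(r"share/challenge/(\d+)"),
    "music": re.compile(r"share/music/(\d+)"),
}


def classify_share_url(url):
    """Return (kind, id) for a recognized share link, or None otherwise.

    classify_share_url is a hypothetical helper written for illustration.
    """
    for kind, pattern in PATTERNS.items():
        match = pattern.search(url)
        if match:
            return kind, match.group(1)
    return None


examples = [
    "https://www.douyin.com/share/user/85860189461?share_type=link",
    "https://www.iesdouyin.com/share/challenge/1593608573838339?utm_campaign=clien",
    "https://www.iesdouyin.com/share/music/6536362398318922509?utm_campaign=client_share",
]
results = [classify_share_url(url) for url in examples]
print(results)
```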

Short URLs

http://v.douyin.com/cDo2P/,

http://v.douyin.com/cFuAN/,

http://v.douyin.com/cMdjU/
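
A short link expands to a long share link when opened in a browser, which suggests it is served as an HTTP redirect. The sketch below follows redirects with the standard library and returns the final address. It is an illustration only: the function names and the User-Agent value are assumptions, and resolve_short_link needs network access, so only is_short_link is exercised here.

```python
import urllib.request
from urllib.parse import urlparse


def is_short_link(url):
    """Return True for links on the v.douyin.com short-link host."""
    return urlparse(url).hostname == "v.douyin.com"


def resolve_short_link(url, timeout=10):
    """Follow HTTP redirects and return the final URL (requires network access).

    The User-Agent header value is an assumption, not taken from the script.
    """
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.geturl()


short = is_short_link("http://v.douyin.com/cDo2P/")
long_form = is_short_link("https://www.iesdouyin.com/share/music/6536362398318922509")
print(short, long_form)
```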

About signature cracking

node fuck-byted-acrawler.js <argument>, where the argument is a user ID or a video ID
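
From Python, the Node.js helper could be invoked through subprocess, roughly as sketched below. This assumes the script prints the signature to standard output, which the text above does not confirm; build_sign_command and generate_signature are hypothetical names. Only the command construction is exercised here, since running it requires Node.js and the script file.

```python
import subprocess


def build_sign_command(value, script="fuck-byted-acrawler.js"):
    """Build the argument list for the Node.js signing script."""
    return ["node", script, str(value)]


def generate_signature(value):
    """Run the signing script and return its output.

    Assumption: the script writes the signature to standard output.
    Requires Node.js and the script in the current directory.
    """
    result = subprocess.run(
        build_sign_command(value),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


command = build_sign_command(85860189461)
print(command)
```

Passing the arguments as a list avoids shell quoting problems with the ID value.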

Join the discussion group

QQ group: 967073790. Click the link to join the group chat [Developer Exchange Group]: https://jq.qq.com/?_wv=1027&k=5l6VTXa

getdouyin's People

Contributors

dakuohao


getdouyin's Issues

Error when running python amemv-video-ripper.py

With a short URL in share-url.txt: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x98 in position 1: invalid start byte
With a long URL: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe8 in position 1: invalid continuation byte

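Fixed a few small problems; the code finally runs

1. In HEADERS, the entry with the key accept-encoding needs to be commented out; otherwise decoding the fetched content raises an error.
2. get_dytk appears to be unnecessary. Douyin works fine without it, and the regex "dytk: '(.*)'" matches nothing anyway, so commenting out the dytk-related checks has no effect. I have not tested TikTok, so I do not know whether the parameter matters there.
3. In the _join_download_queue method used to download Douyin videos, download_params only needs to keep video_id, ratio, and line to download watermark-free videos; adding other parameters actually breaks the download!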

Error when running python amemv-video-ripper.py

F:\GetDouYin-master>python amemv-video-ripper.py
F:\GetDouYin-master\amemv-video-ripper.py:169: SyntaxWarning: invalid escape sequence '\d'
  challenge = re.findall('share/challenge/(\d+)', url)
Traceback (most recent call last):
  File "F:\GetDouYin-master\amemv-video-ripper.py", line 14, in <module>
    import requests
  File "C:\Users\1001\AppData\Local\Programs\Python\Python312\Lib\site-packages\requests\__init__.py", line 43, in <module>
    import urllib3
  File "C:\Users\1001\AppData\Local\Programs\Python\Python312\Lib\site-packages\urllib3\__init__.py", line 8, in <module>
    from .connectionpool import (
  File "C:\Users\001\AppData\Local\Programs\Python\Python312\Lib\site-packages\urllib3\connectionpool.py", line 11, in <module>
    from .exceptions import (
  File "C:\Users\001\AppData\Local\Programs\Python\Python312\Lib\site-packages\urllib3\exceptions.py", line 2, in <module>
    from .packages.six.moves.http_client import (
ModuleNotFoundError: No module named 'urllib3.packages.six.moves'

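share/video links cannot be downloaded

Some short links resolve to a downloadable address. For example, the short link http://v.douyin.com/9GEGSp/ works and its videos download, but another link, https://v.douyin.com/CdC8kQ/, does not.

The working short link http://v.douyin.com/9GEGSp/ expands to the following address when pasted into a browser: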
https://www.iesdouyin.com/share/user/73838190950?u_code=128dfi636&sec_uid=MS4wLjABAAAAHmQ4DqHKN8IdfWWd52sYaGS6zaZaOTghOZ4ysZ0z_YM&timestamp=1571884619&utm_source=copy&utm_campaign=client_share&utm_medium=android&share_app_name=douyin
The address contains the keyword share/user.

The failing short link https://v.douyin.com/CdC8kQ/ expands to the following address when pasted into a browser:
https://www.iesdouyin.com/share/video/6760586675262328072/?region=CN&mid=6758364479400266499&u_code=m9f2m5mg&titleType=title&utm_source=copy_link&utm_campaign=client_share&utm_medium=android&app=aweme
The address contains the keyword share/video.

The Dockerfile does not work

The command '/bin/sh -c apt-get install python3 python3-pip curl git vim' returned a non-zero code: 1

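Any advice would be appreciated.

How can "IndexError: list index out of range" be resolved?

The international version (TikTok) does not work

The mainland China version of Douyin fails after crawling a few videos,
while the international version, TikTok, only creates an empty folder.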
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

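Some videos fail to download

When downloading a user's videos through a short link, only some of the videos come out correctly; most are zero-byte files. I suspect the server was upgraded recently so that the current requests no longer work, or that the server now rate-limits downloads. The problem was first noticed on September 18, 2019; it worked fine before that.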
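
To see how many downloads came out empty, a small helper can list the zero-byte files under a download folder. This is only a diagnostic sketch: find_empty_files is a hypothetical name, and whether removing such files makes the script fetch them again depends on how it detects existing downloads.

```python
import os
import tempfile


def find_empty_files(root):
    """Return the sorted paths of all zero-byte files under root.

    find_empty_files is a hypothetical helper written for illustration.
    """
    empty = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getsize(path) == 0:
                empty.append(path)
    return sorted(empty)


# Demonstration: one valid file and one zero-byte file in a temporary folder.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "good.mp4"), "wb") as handle:
        handle.write(b"video bytes")
    open(os.path.join(root, "broken.mp4"), "wb").close()
    empty_names = [os.path.basename(p) for p in find_empty_files(root)]

print(empty_names)
```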
