Comments (10)
The fix has been made. Let me know if you run into another issue, @dayvsonsales; if not, this issue can be closed.
from m3u8-dl.
@excalibur-kvrv it works now, thank you. One more question: is the download done entirely in memory? I ask because I downloaded a playlist totaling 730 MB and noticed via top that the Python process's memory kept growing.
It is designed to write the data as soon as it is downloaded (this may be a problem if each individual chunk is big). I did notice the growing size, and I'm still working on identifying where the memory is growing. By the way, @dayvsonsales, what download speed (megabytes per second) were you getting while using m3u8-dl? Was it close to your internet bandwidth (megabytes per second)?
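One way to narrow down where memory grows is Python's standard tracemalloc module. The following is only a minimal sketch: top_growth and fake_download are hypothetical names used for illustration, not m3u8-dl code, and fake_download merely imitates a download that buffers every segment in memory.

```python
import tracemalloc


def top_growth(work, limit=5):
    """Run work() and return its result plus the source lines whose allocations grew most."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    result = work()  # keep the result alive so its memory is still allocated
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # compare_to returns StatisticDiff entries, largest absolute size change first.
    stats = after.compare_to(before, "lineno")
    return result, stats[:limit]


def fake_download():
    # Hypothetical stand-in: holds eight 1 MiB segments in memory at once.
    return [bytes(1024 * 1024) for _ in range(8)]


if __name__ == "__main__":
    _, stats = top_growth(fake_download)
    for stat in stats:
        print(stat)
```

Each printed entry names a file and line together with how many bytes that line's allocations grew, which should point at the place where whole segments are being held.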
@excalibur-kvrv the download speed is fine. The only problem is memory usage: my playlists have big single files (usually more than 100 MB). I have a simple script that I wrote using only curl, and it works fine (no memory usage problem), but it doesn't scale the way your script does (yours uses 4 parallel processes).
100 MB per file in the playlist? @dayvsonsales, then I think I know what the issue is. The playlists I have encountered so far only contained small files (10 MB max), so I designed my program to download the entire file and then write it. The fix is quite simple: I just need to write the data in chunks. It'll take a few hours to fix.
@excalibur-kvrv I think I solved my problem. Inspecting the fetch.py file, I noticed that you call session.get without passing the stream option. So I added that option and handled the chunks myself, writing them to file_path with Python's built-in file handling (not your write_file_no_gil). The code is below:
with session.get(download_url, timeout=timeout, stream=True) as r:
    r.raise_for_status()
    if r.status_code == 302:
        r = redirect_handler(session, r.content)
    with open(file_path, "wb") as f:
        for chunk in r.iter_content(1024):
            if not chunk:
                break
            f.write(chunk)
Memory usage now seems lower than before.
But there's a check that I had to skip:
if type(request_data) == bytes:
    data = request_data
else:
    data = request_data.content
I don't know what you were trying to do with this type check. Could you explain to me, please?
The type check was simply for compatibility in case redirect_handler runs, since it returns bytes. The if/else ensures that .content is never called on a bytes object, only on a response object. Also, try experimenting with the number of bytes passed to r.iter_content: a small value increases the overall file write time, and writing to the OS is faster with a larger value. The custom write_file_no_gil was meant to give faster writes by taking advantage of the fact that the GIL gets dropped.
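The two points above (bytes versus response object, and a tunable chunk size) could be combined roughly as follows. This is only a sketch under assumptions: write_payload and FakeResponse are hypothetical names, the 64 KiB default is an arbitrary starting point rather than a measured optimum, and the only real API relied on is iter_content(chunk_size), which requests.Response provides.

```python
import os
import tempfile


def write_payload(payload, file_path, chunk_size=64 * 1024):
    """Write payload to file_path and return the number of bytes written.

    payload is either raw bytes (what redirect_handler is said to return)
    or a response-like object exposing iter_content(chunk_size).
    """
    written = 0
    with open(file_path, "wb") as f:
        if isinstance(payload, (bytes, bytearray)):
            # Already fully in memory, so there is nothing to stream.
            f.write(payload)
            return len(payload)
        for chunk in payload.iter_content(chunk_size):
            if chunk:  # skip empty chunks
                f.write(chunk)
                written += len(chunk)
    return written


class FakeResponse:
    """Hypothetical stand-in for a streamed response, used only for the demo."""

    def __init__(self, data):
        self.data = data

    def iter_content(self, chunk_size):
        for start in range(0, len(self.data), chunk_size):
            yield self.data[start:start + chunk_size]


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "segment.ts")
    print(write_payload(FakeResponse(b"x" * 200_000), target, chunk_size=8192))
```

With a real session, the response passed in would need to come from a request made with stream=True; otherwise the whole body is already loaded before iter_content is called and the memory benefit is lost.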
Nice. So, @dayvsonsales, I take it your issue has been resolved?
@excalibur-kvrv it's solved. Just to clarify: if redirect_handler returns bytes, then there's no iter_content, right? I think removing this check could cause more problems. I'll open a pull request just to keep a record of the code from this issue, but I think it should be investigated further before it is merged.
Well, if you removed the type check, it would cause a lot of problems whenever the redirect handler runs. But you are on the right path: with a few more changes and a bit of restructuring, your fix would work. I'll take a look and let you know which changes you need to make. Oh, and do make sure that your code passes the Codacy checks.