Comments (14)
Since some YouTube channels simply consist of people talking to a camera for long periods, it's useful to be able to convert the video to an audio file and listen to it using podcast software. Podsync already has this feature.
from podsync.
I found the internal architecture of podsync to be… rather excessively complicated. (A testament to the power of ingenuity, but too complicated for me to be comfortable setting up locally.) DynamoDB? Lambda? Golang! Docker. But also Node… (How many programming languages does one need at a single time? Node does not touch my machines.) I looked at all that, then just dove into the code looking for the ultimate youtube-dl invocation. I then extracted and isolated that, and automated it using GNU parallel. That Gist also describes the patches made to youtube-dl to avoid excessive numbers of HTTP requests. (E.g. actually calling sys.exit() when failing a video because it is too old, since all subsequent videos will be older.)
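For anyone following along, the early-exit idea can be sketched roughly as below. This is an illustration only, not the actual patch from the Gist: the function name and the sample listing are hypothetical, and it assumes the listing arrives newest first, with dates in youtube-dl's YYYYMMDD form.

```python
from itertools import takewhile


def videos_to_fetch(entries, cutoff):
    """Return the leading entries that are not older than cutoff.

    Assumes entries are ordered newest first, so everything after the
    first too-old entry is older still and need not be requested at all.
    YYYYMMDD strings compare correctly as plain strings.
    """
    return list(takewhile(lambda e: e["upload_date"] >= cutoff, entries))


# Hypothetical listing, newest first.
listing = [
    {"id": "c", "upload_date": "20200301"},
    {"id": "b", "upload_date": "20200215"},
    {"id": "a", "upload_date": "20191201"},
    {"id": "z", "upload_date": "20190101"},
]

recent = videos_to_fetch(listing, cutoff="20200101")
print([e["id"] for e in recent])
```

The point is that the scan stops at the first too-old video instead of checking every remaining entry.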
I've resolved the issue with playlists being named after their origin channel instead of the actual name of the playlist, and will continue to keep this tiny little shell script updated. I also added optional lines for rate limiting, randomized sleep periods, and SOCKS5 proxy configuration; that is, with ssh -D 8088 example.com running, the proxy would be --proxy "socks5://localhost:8088/". The only real remaining issue is feed thumbnails. With this setup, it took YouTube two weeks to begin throwing up Captcha challenges, after ingesting 5,895 episodes totalling 1.6 TiB across 141 channels / playlists. (Each run takes, on average, around 10 minutes, and runs via cron every few hours.)
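As a rough sketch of how those optional settings fit together, the snippet below assembles a youtube-dl argument list. The helper name, the rate, the sleep bounds, and the placeholder URL are assumptions for illustration; only the flags themselves (--proxy, --limit-rate, --sleep-interval, --max-sleep-interval) come from youtube-dl.

```python
import shlex


def build_command(url, proxy=None, rate_limit=None, sleep_range=None):
    """Assemble a youtube-dl argument list with optional throttling."""
    cmd = ["youtube-dl"]
    if proxy:
        # e.g. a local SOCKS5 tunnel opened with: ssh -D 8088 example.com
        cmd += ["--proxy", proxy]
    if rate_limit:
        # Maximum download rate, e.g. "2M"
        cmd += ["--limit-rate", rate_limit]
    if sleep_range:
        # Randomized pause before each download, between low and high seconds
        low, high = sleep_range
        cmd += ["--sleep-interval", str(low), "--max-sleep-interval", str(high)]
    cmd.append(url)
    return cmd


cmd = build_command(
    "https://www.youtube.com/playlist?list=PLACEHOLDER",  # hypothetical URL
    proxy="socks5://localhost:8088/",
    rate_limit="2M",          # assumed value
    sleep_range=(5, 30),      # assumed values
)
print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be handed to subprocess.run() unchanged, which avoids shell quoting problems.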
from podsync.
Direct use of youtube-dl (the command-line program powering all of this media ingest) permits retrieval of just the audio. My little automation script wins again: it can already do this! ;P
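A minimal sketch of an audio-only invocation, for reference. The helper name and the placeholder URL are made up; the -x / --extract-audio and --audio-format options are youtube-dl's own, and the conversion step needs ffmpeg (or avconv) installed.

```python
def audio_command(url, audio_format="mp3"):
    """Build a youtube-dl argument list that keeps only the audio track."""
    return [
        "youtube-dl",
        "--extract-audio",               # same as -x
        "--audio-format", audio_format,  # e.g. mp3, m4a, opus
        url,
    ]


audio_cmd = audio_command("https://www.youtube.com/watch?v=PLACEHOLDER")
print(audio_cmd)
```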
> Some of us want audio-only…
It really is a bit flabbergasting to be repeatedly asked for something the user already has the ability to do… and search for.
from podsync.
Hello Max, and thanks for your work so far and for your effort to make a self-hosted version!
Just wondering, why the mp3 format? Isn't that for audio only? Do you mean mp4?
Keep it up!
from podsync.
Ah, adding a second comment as it's an important note: my shell script there (basically a text file containing one channel or playlist URL per line…) explicitly gives you control over per-channel quality settings (see line 41; split that up with multiple formats if needed, I do) as well as extended video selection criteria, such as title exclusions (see line 74). (Run youtube-dl -h to see the many, many options available.)
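To make the per-channel idea concrete, here is a rough sketch, not the script itself. The "URL [format]" line layout, the helper names, and the sample URLs are all assumptions for illustration; --format and --reject-title are real youtube-dl options.

```python
DEFAULT_FORMAT = "best"


def parse_channel_list(text, default_format=DEFAULT_FORMAT):
    """Parse lines of 'URL [format]'; blank lines and '#' lines are skipped.

    This line layout is hypothetical, chosen only for the example.
    """
    channels = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        url = parts[0]
        fmt = parts[1].strip() if len(parts) > 1 else default_format
        channels.append((url, fmt))
    return channels


def channel_command(url, fmt, reject_title=None):
    """Build the youtube-dl argument list for one channel or playlist."""
    cmd = ["youtube-dl", "--format", fmt]
    if reject_title:
        # Skip videos whose title matches this regex
        cmd += ["--reject-title", reject_title]
    cmd.append(url)
    return cmd


sample = """\
# one channel or playlist per line
https://www.youtube.com/channel/PLACEHOLDER1 bestvideo[height<=1080]+bestaudio/best
https://www.youtube.com/playlist?list=PLACEHOLDER2
"""

channels = parse_channel_list(sample)
commands = [channel_command(u, f, reject_title="(?i)trailer") for u, f in channels]
for c in commands:
    print(c)
```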
from podsync.
Hello Alice,
Thanks for this work! I had started to dive into Max's code, starting with the early commits. Did you know podsync started as a .NET project? :)
I can understand why Max used a database, because he had to store every user's playlists. For a self-hosted, single-user version, generating and serving just one file may indeed be a simpler and better solution.
As for Node, I feel you.
As for Docker, it could be a nice feature to add to your code, as it could pin down the right environment, especially for versions of parallel and python.
Anyway, I'm grateful because you saved me some time and effort.
from podsync.
> especially for versions of parallel and python.
Any version will do:
brew install parallel
Python 3 is already a pretty universal standard; the given code will work with any Python 3.3 or newer, that is, virtually any Python released in the last 10 years in that series. Including the version that comes pre-installed on macOS.
Edited to add: thus, in this particular case, Docker would simplify nothing, and complicate everything. Like a Spartan soldier taking everything and giving nothing.
> store every user playlist
On-disk directories are the database, in my case. My Python script and template will transform any directory containing youtube-dl .info.json files into a podcast. (Future improvement: only regenerate the index.xml if there are actually new/updated episodes, but feed generation is so minor compared to content collection, that's a low priority.)
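A minimal sketch of that directory-to-feed idea, using only the standard library. This is not my actual script or template: the function name, the base URL, and the assumption that each media file sits next to its .info.json under the same stem are all illustrative. The fields read here (title, id, ext, upload_date, description) are ones youtube-dl writes into .info.json.

```python
import json
import mimetypes
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime
from pathlib import Path
from urllib.parse import quote

SUFFIX = ".info.json"


def build_feed(directory, base_url, title):
    """Return RSS 2.0 XML (as a string) for every *.info.json in directory."""
    directory = Path(directory)
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = base_url
    ET.SubElement(channel, "description").text = title

    episodes = []
    for path in directory.glob("*" + SUFFIX):
        info = json.loads(path.read_text(encoding="utf-8"))
        episodes.append((info, path.name[: -len(SUFFIX)]))
    # Newest first; upload_date is a YYYYMMDD string
    episodes.sort(key=lambda pair: pair[0].get("upload_date", ""), reverse=True)

    for info, stem in episodes:
        # Assumption: media file shares the stem and uses the recorded extension
        media = directory / "{}.{}".format(stem, info.get("ext", "mp4"))
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = info.get("title", stem)
        ET.SubElement(item, "guid", isPermaLink="false").text = info.get("id", stem)
        ET.SubElement(item, "description").text = info.get("description", "")
        if info.get("upload_date"):
            when = datetime.strptime(info["upload_date"], "%Y%m%d")
            when = when.replace(tzinfo=timezone.utc)
            ET.SubElement(item, "pubDate").text = format_datetime(when)
        ET.SubElement(
            item,
            "enclosure",
            url=base_url.rstrip("/") + "/" + quote(media.name),
            length=str(media.stat().st_size if media.exists() else 0),
            type=mimetypes.guess_type(media.name)[0] or "application/octet-stream",
        )
    return ET.tostring(rss, encoding="unicode")
```

Calling build_feed("/srv/podcasts/some-channel", "https://example.invalid/some-channel/", "Some Channel") (both paths hypothetical) and writing the result to index.xml would give a feed that a podcast client can subscribe to.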
Edited to add: ingest (the pull.sh invocations of youtube-dl) is one half of the problem: actually getting the content. The other half is tackled entirely separately: turning those collected media files into podcast feeds.
from podsync.
I understand you want to keep things simple. Loved the 300 reference and can't help imagining Docker as a bare-torso warrior now.
As I already said, I haven't read the podsync code, but does it store every mp4 on their server?
I'm using your script right now (I will also check these YouTube channels of yours, just curious).
I see the content (mp4) is directly "youtube-dl-ed" right here on the machine.
I can see the dl.podsync.net/* URLs link to googlevideo.com. Is there an upload wrapper somewhere that could avoid using space on the podsync self-hosted server?
from podsync.
> …does it store every mp4 on their server?
Yes, as part of the background "updater" process. That updater is Python code, so it invokes youtube_dl directly as a library, whereas my shell script invokes the youtube-dl command itself. One layer out. ;)
> «googlevideo.com links» … Is there an upload wrapper somewhere that could avoid using space on the podsync self-hosted server?
Well, youtube-dl on the command line will, by default, download the video content. But if you are careful to pick a video format that comes "pre-muxed" (that is, audio and video together), you can hypothetically avoid downloading the video and pull the actual origin links from the .info.json for use in the RSS feed. Or, in Podsync's case, after a 302 redirect, it is likely looking up the local cache status vs. YouTube's availability of the pre-muxed version link.
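A hypothetical sketch of picking such a pre-muxed link out of a parsed .info.json. The function name and the sample data are invented; the formats list and its acodec, vcodec, height, and url keys are what youtube-dl records, with "none" marking a missing stream. Treating a format with no codec information as unusable is my assumption here. Keep in mind that these origin links are typically time-limited.

```python
def best_premuxed(info):
    """Return the tallest format carrying both audio and video, or None."""
    candidates = [
        f for f in info.get("formats", [])
        if (f.get("acodec") or "none") != "none"
        and (f.get("vcodec") or "none") != "none"
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda f: f.get("height") or 0)


# Invented sample, shaped like the "formats" list in an .info.json file.
sample_info = {
    "formats": [
        {"format_id": "140", "acodec": "mp4a.40.2", "vcodec": "none",
         "url": "https://example.invalid/140"},
        {"format_id": "137", "acodec": "none", "vcodec": "avc1.640028",
         "height": 1080, "url": "https://example.invalid/137"},
        {"format_id": "18", "acodec": "mp4a.40.2", "vcodec": "avc1.42001E",
         "height": 360, "url": "https://example.invalid/18"},
        {"format_id": "22", "acodec": "mp4a.40.2", "vcodec": "avc1.64001F",
         "height": 720, "url": "https://example.invalid/22"},
    ]
}

chosen = best_premuxed(sample_info)
print(chosen["format_id"], chosen["url"])
```

In this sample the 1080p stream is video-only, so the best pre-muxed choice is the 720p one, which matches the resolution trade-off of skipping the re-mux.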
That's a key difference, I think. I get 1080p episodes, as I re-mux the independent streams. Hypothetically I could choose a 4K --format. (But ye gods, the storage space, then!)
from podsync.
> Just wondering, why the mp3 format? Isn't that for audio only? Do you mean mp4?
Some of us want audio-only, since we like listening to the audio of podcasts posted to YouTube, but we don't have the time to watch the video, since we're doing other things when we listen to the audio, such as driving, or we don't care to see the podcaster's studio, when what they say is more important than their studio.
I consider it a welcome addition.
Self-hosting seems like the way to go too, eliminating single points of failure.
from podsync.
> > Just wondering, why the mp3 format? Isn't that for audio only? Do you mean mp4?
> Some of us want audio-only, since we like listening to the audio of podcasts posted to YouTube, but we don't have the time to watch the video, since we're doing other things when we listen to the audio, such as driving, or we don't care to see the podcaster's studio, when what they say is more important than their studio.
> I consider it a welcome addition.
> Self-hosting seems like the way to go too, eliminating single points of failure.
Seconding the audio-only option. I don't know what APIs you're using, but I can tell you that as a YouTube Red subscriber (actually a Google Play Music subscriber, but that's the same thing now), there is a way to stream only audio, since this is a premium feature specifically offered as part of Red.
from podsync.
Is it expected that docker-compose pull produces the following?
ERROR: for api pull access denied for mxpv/podsync_api, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
ERROR: for updater pull access denied for mxpv/updater, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
ERROR: for resolver pull access denied for mxpv/podsync_lambda, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
ERROR: for nginx pull access denied for mxpv/nginx, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
from podsync.
CLI docker images are not yet published.
from podsync.
New functionality, docs, and tutorials will be added in follow-up PRs.
from podsync.