Comments (5)
I would be happy to add useful features to the srt command.
But before that, I need to understand exactly what your script intends to do.
It builds hybrid subtitles with mixed languages, right?
Can you provide me with 2 test subtitles to see it in action?
from pysrt.
OK, I have to admit that I was too quick in sharing my code.
So yes, indeed, I wanted to have 2 languages.
I'm currently learning a new language, and it is useful to listen to and read the language you are learning while also having the subtitles in your mother tongue.
Here is my new code and 2 input files, as well as the result.
They are just 2 test files.
subtitle_merge.py
#!/usr/bin/env python
import argparse

from pysrt import SubRipFile, SubRipItem, SubRipTime

parser = argparse.ArgumentParser(description='Merge 2 srt files.')
parser.add_argument('fin', type=str, nargs=2,
                    help='input files (master first, then slave)')
parser.add_argument(dest='fout', type=str, nargs=1,
                    help='the output file')
args = parser.parse_args()

master = args.fin[0]
slave = args.fin[1]
result = args.fout[0]

msubs = SubRipFile.open(master, encoding='iso-8859-1')
ssubs = SubRipFile.open(slave, encoding='iso-8859-1')
rsubs = SubRipFile.from_string("", encoding='iso-8859-1')

# tp is the time up to which the current master sub has been written out.
tp = SubRipTime()
for msub in msubs:
    # Slave starts at or after tp and ends no later than the master start.
    for ssub in ssubs.slice(starts_after=tp.ordinal - 1, ends_before=msub.start.ordinal + 1):
        rsubs.append(SubRipItem(index=len(rsubs), start=ssub.start, end=ssub.end, text=ssub.text))
        ssubs.remove(ssub)  # remove as completely handled.
    tp = msub.start

    # Slave starts before the master start and ends before the master end.
    for ssub in ssubs.slice(starts_before=msub.start.ordinal, ends_before=msub.end.ordinal):
        rsubs.append(SubRipItem(index=len(rsubs), start=ssub.start, end=msub.start, text=ssub.text))
        rsubs.append(SubRipItem(index=len(rsubs), start=msub.start, end=ssub.end, text=ssub.text))
        rsubs.append(SubRipItem(index=len(rsubs), start=msub.start, end=ssub.end, text=msub.text))
        ssubs.remove(ssub)  # remove as completely handled.
        tp = ssub.end

    # Slave starts before the master start and ends after the master end.
    for ssub in ssubs.slice(starts_before=msub.start, ends_after=msub.end):
        rsubs.append(SubRipItem(index=len(rsubs), start=ssub.start, end=msub.start, text=ssub.text))
        rsubs.append(SubRipItem(index=len(rsubs), start=msub.start, end=msub.end, text=ssub.text))
        rsubs.append(SubRipItem(index=len(rsubs), start=msub.start, end=msub.end, text=msub.text))
        rsubs.append(SubRipItem(index=len(rsubs), start=msub.end, end=ssub.end, text=ssub.text))
        ssubs.remove(ssub)  # remove as completely handled.
        tp = msub.end

    # Slave starts and ends during the master sub.
    for ssub in ssubs.slice(starts_after=msub.start.ordinal - 1, ends_before=msub.end.ordinal + 1):
        if tp != ssub.start:
            # Master text alone fills the gap before this slave sub.
            rsubs.append(SubRipItem(index=len(rsubs), start=tp, end=ssub.start, text=msub.text))
        rsubs.append(SubRipItem(index=len(rsubs), start=ssub.start, end=ssub.end, text=ssub.text))
        rsubs.append(SubRipItem(index=len(rsubs), start=ssub.start, end=ssub.end, text=msub.text))
        tp = ssub.end
        ssubs.remove(ssub)  # remove as completely handled.

    # Slave starts during the master sub and ends after it.
    for ssub in ssubs.slice(starts_before=msub.end, ends_after=msub.end):
        rsubs.append(SubRipItem(index=len(rsubs), start=ssub.start, end=msub.end, text=ssub.text))
        rsubs.append(SubRipItem(index=len(rsubs), start=msub.end, end=ssub.end, text=ssub.text))
        rsubs.append(SubRipItem(index=len(rsubs), start=ssub.start, end=msub.end, text=msub.text))
        ssubs.remove(ssub)  # remove as completely handled.
        tp = msub.end

    # Write whatever part of the master sub has not been emitted yet.
    if tp != msub.end:
        if tp.ordinal < msub.start.ordinal:
            tp = msub.start
        rsubs.append(SubRipItem(index=len(rsubs), start=tp, end=msub.end, text=msub.text))
        tp = msub.end

# Note: slave subs that start after the last master sub are not copied by the loop above.

# The sort is stable, so items with the same start keep the order in which they were appended.
rsubs.sort(key=lambda item: item.start.ordinal)
for idx, rsub in enumerate(rsubs):
    rsub.index = idx
rsubs.save(result, encoding='iso-8859-1')
master.srt
1
00:00:01,000 --> 00:00:03,000
Master sub title!

2
00:00:10,000 --> 00:00:15,000
First master sub.

3
00:00:20,000 --> 00:00:22,000
2nd master sub.

4
00:00:30,000 --> 00:00:40,000
3rd master sub.

5
00:00:45,000 --> 00:01:00,000
4th master sub.
slave.srt
1
00:00:01,000 --> 00:00:03,000
Slave sub title!

2
00:00:04,000 --> 00:00:05,500
First slave sub before 1st msub.

3
00:00:06,500 --> 00:00:08,000
Second slave sub before 1st msub.

4
00:00:09,000 --> 00:00:10,000
Third slave sub until 1st msub.

5
00:00:15,000 --> 00:00:17,000
Fourth slave sub right after 1st msub.

6
00:00:18,000 --> 00:00:24,000
Fifth slave sub @ before and after 2nd msub.

7
00:00:44,000 --> 00:00:46,000
Sixth slave sub @ 4th msub entry.

8
00:00:47,000 --> 00:00:48,000
Seventh slave sub during 4th msub.

9
00:00:49,000 --> 00:00:50,000
Eighth slave sub during 4th msub.
After ./subtitle_merge.py master.srt slave.srt merge.srt
you get:
merge.srt
0
00:00:01,000 --> 00:00:03,000
Slave sub title!

1
00:00:01,000 --> 00:00:03,000
Master sub title!

2
00:00:04,000 --> 00:00:05,500
First slave sub before 1st msub.

3
00:00:06,500 --> 00:00:08,000
Second slave sub before 1st msub.

4
00:00:09,000 --> 00:00:10,000
Third slave sub until 1st msub.

5
00:00:10,000 --> 00:00:15,000
First master sub.

6
00:00:15,000 --> 00:00:17,000
Fourth slave sub right after 1st msub.

7
00:00:18,000 --> 00:00:20,000
Fifth slave sub @ before and after 2nd msub.

8
00:00:20,000 --> 00:00:22,000
Fifth slave sub @ before and after 2nd msub.

9
00:00:20,000 --> 00:00:22,000
2nd master sub.

10
00:00:22,000 --> 00:00:24,000
Fifth slave sub @ before and after 2nd msub.

11
00:00:30,000 --> 00:00:40,000
3rd master sub.

12
00:00:44,000 --> 00:00:45,000
Sixth slave sub @ 4th msub entry.

13
00:00:45,000 --> 00:00:46,000
Sixth slave sub @ 4th msub entry.

14
00:00:45,000 --> 00:00:46,000
4th master sub.

15
00:00:46,000 --> 00:00:47,000
4th master sub.

16
00:00:47,000 --> 00:00:48,000
Seventh slave sub during 4th msub.

17
00:00:47,000 --> 00:00:48,000
4th master sub.

18
00:00:48,000 --> 00:00:49,000
4th master sub.

19
00:00:49,000 --> 00:00:50,000
Eighth slave sub during 4th msub.

20
00:00:49,000 --> 00:00:50,000
4th master sub.

21
00:00:50,000 --> 00:01:00,000
4th master sub.
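The merged file above cuts a cue wherever a cue of the other track starts or ends, so that the overlapping part is shown in both languages. A rough sketch of that idea in plain Python follows. It is not the script above and does not use pysrt: cues are assumed to be simple (start_ms, end_ms, text) tuples, and `merge_tracks` is a hypothetical name chosen for illustration.

```python
def merge_tracks(master, slave):
    """Cut both tracks at every cue boundary.

    Cues are (start_ms, end_ms, text) tuples (an assumption for this
    sketch, not the pysrt data model). For each resulting segment the
    slave text is emitted first, then the master text.
    """
    # Every start or end time of any cue is a potential cut point.
    bounds = sorted({t for start, end, _ in master + slave for t in (start, end)})
    merged = []
    for seg_start, seg_end in zip(bounds, bounds[1:]):
        for track in (slave, master):  # slave first, then master
            for start, end, text in track:
                # Keep the cue if it covers the whole segment.
                if start <= seg_start and seg_end <= end:
                    merged.append((seg_start, seg_end, text))
    return merged


master = [(20000, 22000, "2nd master sub.")]
slave = [(18000, 24000, "Fifth slave sub.")]
for cue in merge_tracks(master, slave):
    print(cue)
```

For this pair of cues the sketch yields the same four segments as entries 7 to 10 of merge.srt: the slave text alone, then slave and master together, then the slave text alone again.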
from pysrt.
And as you can see, I had a quick look at the flavors of GitHub! :-)
from pysrt.
And can you tell me if this simpler version works for you?
import argparse

from pysrt import SubRipFile

parser = argparse.ArgumentParser(description='Merge 2 srt files.')
parser.add_argument('fin', type=str, nargs=2,
                    help='input file')
parser.add_argument(dest='fout', type=str, nargs=1,
                    help='the output file')
args = parser.parse_args()

master, slave = args.fin
result = args.fout[0]

msubs = SubRipFile.open(master, encoding='iso-8859-1')
ssubs = SubRipFile.open(slave, encoding='iso-8859-1')

# Concatenate both files, order the items by time, then renumber them.
rsubs = msubs + ssubs
rsubs.sort()
rsubs.clean_indexes()
rsubs.save(result, encoding='iso-8859-1')
from pysrt.
This is what I did first, or at least something similar. I'm relatively new to Python, so I was adding the items one by one instead of using '+'; the sorting and the index cleanup I did as in my shared code.
I was not happy with the output, since I always got a mixture: sometimes the master subtitle first and then the slave,
sometimes the opposite,
and sometimes it needs up to 6 lines to display all the subtitles.
I like your solution more: it has cleaner code and it executes faster. But I prefer my version for its output.
Thanks for the challenge anyhow!
PS. I also tried your version; it suffers from the same issue as my very first version, at least in VLC player 2.0.1.
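A small illustration of why a plain concatenate-and-sort merge can give that mixture, using hypothetical (start_ms, end_ms, text) tuples instead of pysrt items. It assumes that the player stacks overlapping cues in file order, which is not verified here.

```python
# Hypothetical cues as (start_ms, end_ms, text); not the pysrt data model.
master = [(20000, 22000, "master A"), (45000, 60000, "master B")]
slave = [(18000, 24000, "slave A"), (47000, 48000, "slave B")]

# Concatenate and sort by (start, end), as a plain merge would.
merged = sorted(master + slave, key=lambda cue: (cue[0], cue[1]))
order = [text for _, _, text in merged]
print(order)  # ['slave A', 'master A', 'master B', 'slave B']
```

In the first overlap the slave cue comes first because it starts earlier; in the second the master cue comes first. Sorting alone therefore cannot keep one language consistently on top, which is what splitting the cues at each other's boundaries addresses.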
from pysrt.