asticode / go-astisub
Manipulate subtitles in GO (.srt, .ssa/.ass, .stl, .ttml, .vtt (webvtt), teletext, etc.)
License: MIT License
Hi,
I have an .srt file with position properties, like the following:
1
00:00:12,000 --> 00:00:15,123
This is the first subtitle
2
00:00:16,000 --> 00:00:18,000
Another subtitle demonstrating tags:
<b>bold</b>, <i>italic</i>, <u>underlined</u>
<font color="#ff0000">red text</font>
3
00:00:20,000 --> 00:00:22,000 X1:40 X2:600 Y1:20 Y2:50
Another subtitle demonstrating position.
I got this error when trying to open the file:
astisub: line 15: parsing srt duration 00:00:22,000 X1:40 X2:600 Y1:20 Y2:50 failed: astisub: Invalid number of millisecond digits detected in 00:00:22,000 X1:40 X2:600 Y1:20 Y2:50
It seems that astisub does not recognize such position properties.
Would it be possible to support them?
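For illustration, here is a minimal sketch of how a timing line could be split so that trailing coordinates do not end up inside the end timestamp. It uses only the standard library. splitSRTTimeLine is a hypothetical helper, not part of astisub, and I have not checked how the library itself would want to store the coordinates.

```go
package main

import (
	"fmt"
	"strings"
)

// splitSRTTimeLine splits an SRT timing line into its start timestamp, its
// end timestamp and any trailing tokens (such as X1:40 X2:600 Y1:20 Y2:50).
// Hypothetical helper for illustration only, not an astisub function.
func splitSRTTimeLine(line string) (start, end string, extra []string, err error) {
	left, right, found := strings.Cut(line, "-->")
	if !found {
		return "", "", nil, fmt.Errorf("no --> separator in %q", line)
	}
	start = strings.TrimSpace(left)
	fields := strings.Fields(right)
	if start == "" || len(fields) == 0 {
		return "", "", nil, fmt.Errorf("missing timestamp in %q", line)
	}
	// The first field after the separator is the end timestamp; everything
	// after it is kept separately instead of being parsed as a duration.
	return start, fields[0], fields[1:], nil
}

func main() {
	start, end, extra, err := splitSRTTimeLine("00:00:20,000 --> 00:00:22,000 X1:40 X2:600 Y1:20 Y2:50")
	fmt.Println(start, end, extra, err)
}
```

The two timestamps can then be parsed as usual, and the extra tokens either ignored or mapped to a position.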
The formatDuration function builds the ASS timestamp from a formatted string, so when rounding carries into the next second it outputs a wrong timestamp.
Example:
00:48:35,039 --> 00:48:35,999
The timestamp 00:48:35,999, converted to an ASS timestamp, gives:
00:48:35.04,00:48:35.00
The end time is now before the start time.
This is because fmt.Sprintf("%.2f", 0.999) returns "1.00", and slicing that string to keep only the fractional digits drops the carried second.
t.Log(fmt.Sprintf("%."+strconv.Itoa(2)+"f", 0.999)[2:])
outputs 00.
Some invalid files are parsed anyway and cannot be read properly. Is it possible to suppress the output produced by the code below?
Lines 178 to 199 in aa9412f
Also, a method to verify whether the format is correct before parsing would make this even better.
Hi,
I tried to convert some SRT files to VTT. If the SRT file contains a special character such as &, it is not escaped in the output VTT.
Test case, converting SRT to VTT:
Input, SRT sample:
1
00:00:03,400 --> 00:00:06,177
This is a sentence with &
Expected VTT:
WEBVTT
1
00:00:03.400 --> 00:00:06.177
This is a sentence with &amp;
Actual VTT:
WEBVTT
1
00:00:03.400 --> 00:00:06.177
This is a sentence with &
The same problem happens when I convert from TTML to VTT.
And when I convert from VTT to SRT, I expect &amp; in the VTT to become & in the SRT, but nothing changes: &amp; is carried over to the SRT without unescaping.
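As a rough sketch of the behavior I would expect, the text could be escaped when writing WebVTT and unescaped when reading it. The helper names escapeCueText and unescapeCueText are hypothetical, and handling only &, < and > is an assumption made to keep the example short.

```go
package main

import (
	"fmt"
	"strings"
)

// The replacers work in a single pass, so already replaced text is never
// rewritten a second time.
var (
	cueEscaper   = strings.NewReplacer("&", "&amp;", "<", "&lt;", ">", "&gt;")
	cueUnescaper = strings.NewReplacer("&amp;", "&", "&lt;", "<", "&gt;", ">")
)

// escapeCueText prepares plain text (for example from SRT) for a WebVTT cue.
// Hypothetical helper, not part of astisub.
func escapeCueText(s string) string { return cueEscaper.Replace(s) }

// unescapeCueText turns WebVTT cue text back into plain text.
// Hypothetical helper, not part of astisub.
func unescapeCueText(s string) string { return cueUnescaper.Replace(s) }

func main() {
	vtt := escapeCueText("This is a sentence with &")
	fmt.Println(vtt)
	fmt.Println(unescapeCueText(vtt))
}
```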
const (
srtTimeBoundariesSeparator = " --> "
)
When the SRT content looks like this:
1
00:08:29,079 -->00:08:30,119
hello
Because a space is missing after -->, this line is not identified as a time boundary line and the output is empty.
I would expect an error to be returned in this case.
Hello,
I am trying to change the color of the caption text in an STL subtitle file, but I can't.
Here is my use case:
s, _ := astisub.OpenFile(input)
I then set the s.Metadata object:
s.Metadata = &astisub.Metadata{
    Title: fileNameWithoutExtTrimSuffix(input),
    Comments: comments,
    STLDisplayStandardCode: "1",
    STLCharacterCodeTableNumber: 12336,
    STLCodePageNumber: 3683891,
    Framerate: 25,
    Language: "french",
    STLCountryOfOrigin: "FRA",
    STLCreationDate: &currentTime,
    STLRevisionDate: &currentTime,
    STLMaximumNumberOfDisplayableCharactersInAnyTextRow: astikit.IntPtr(40),
    STLMaximumNumberOfDisplayableRows: astikit.IntPtr(7),
    STLPublisher: "",
    STLSubtitleListReferenceCode: "",
    STLOriginalEpisodeTitle: "",
    STLTimecodeStartOfProgramme: tcp.PresentationTime(),
}
for idx, item := range s.Items {
    styleJustification := styles.Styles[idx].Justification
    styleVerticalPosition := styles.Styles[idx].VerticalPosition
    styleColor := styles.Styles[idx].Color
    c, _ := ParseHexColor(styleColor)
    justification := convertJustification(styleJustification)
    position := astisub.STLPosition{
        MaxRows: 7,
        Rows: 7,
        VerticalPosition: styleVerticalPosition,
    }
    color := astisub.Color{Red: c.Red, Green: c.Green, Blue: c.Blue}
    attr := astisub.StyleAttributes{
        STLJustification: &justification,
        STLPosition: &position,
        TeletextColor: &color,
        TTMLColor: astikit.StrPtr(color.TTMLString()),
    }
    item.InlineStyle = &attr
}
s.Write(output)
My problem: all of the GSI and TTI metadata works fine EXCEPT for the text color, which is missing from my final STL file, even though I did set TeletextColor and TTMLColor.
Can you help me make the text color work for STL files?
Hello @asticode
I'm currently troubleshooting issues with some STL files that lose all their text when converted to any other format.
So far I have tracked it down to the TTI-TF, which, if I understand the teletext.go:parseTeletextRow function correctly, must contain a byte with the value 0xB (the Start Box control character according to EBU Tech 3264) before any text is decoded.
What is unclear to me at this point is whether I'm looking at a bug in astisub or a faulty STL file. Do you have any insight into this? I couldn't find anything in the STL spec that says a Start Box character is required.
Cheers
Matt
spec: https://tech.ebu.ch/docs/tech/tech3264.pdf
suspicious STL file: https://github.com/asnapper/go-astisub/raw/6fad4995b64ef6fd09b57e6882fd96604476122e/testdata/CHPD0608.stl
Hi @asticode ,
I would like to request access to include features related to styling, timestamp preservation, and various bug fixes, such as language support.
In the meantime, I am just using a fork and making PRs from my fork.
Thanks for the consideration, and great work on the repo so far.
Cheers!
# go get -u github.com/asticode/go-astisub/...
go: golang.org/x/text upgrade => v0.3.6
go: golang.org/x/net upgrade => v0.0.0-20210410081132-afb366fc7cd1
go: github.com/asticode/go-astits upgrade => v1.8.0
go: github.com/asticode/go-astikit upgrade => v0.20.0
# github.com/asticode/go-astisub
../../../../pkg/mod/github.com/asticode/[email protected]/teletext.go:338:12: undefined: astits.New
../../../../pkg/mod/github.com/asticode/[email protected]/teletext.go:357:9: undefined: astits.Data
../../../../pkg/mod/github.com/asticode/[email protected]/teletext.go:409:26: undefined: astits.Data
../../../../pkg/mod/github.com/asticode/[email protected]/teletext.go:429:9: undefined: astits.Data
Hi,
Is there an easy way to modify the whole text content of an item in SRT file?
For example, I have a file with the following content:
1
00:00:00,480 --> 00:00:04,380
hello
world!
2
00:00:04,513 --> 00:00:08,346
this is
test
It is malformed. I want to replace "\n " (\n and a space right after it) with "\n" (just a \n), so the result would be this:
1
00:00:00,480 --> 00:00:04,380
hello
world!
2
00:00:04,513 --> 00:00:08,346
this is
test
I see that each item is split into lines, so a simple replace call won't work.
Any suggestions?
Thanks.
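One possible approach, sketched with the standard library only, is to trim leading spaces and tabs from every line of the text. With astisub the same trimming would presumably be applied to the Text of the first line item of each line in an item (that structure is my understanding of the package, so treat it as an assumption). trimLineStarts is a hypothetical helper, not a library function.

```go
package main

import (
	"fmt"
	"strings"
)

// trimLineStarts removes spaces and tabs at the start of every line in text,
// which turns "\n " into "\n". Hypothetical helper for illustration.
func trimLineStarts(text string) string {
	lines := strings.Split(text, "\n")
	for i, l := range lines {
		lines[i] = strings.TrimLeft(l, " \t")
	}
	return strings.Join(lines, "\n")
}

func main() {
	fmt.Printf("%q\n", trimLineStarts("hello\n world!"))
	fmt.Printf("%q\n", trimLineStarts("this is\n test"))
}
```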
no updates in 3 years:
https://github.com/asticode/go-astisub/blob/master/go.sum
go get -u
go mod tidy
Specification available at Apple.
Sample files: iTunes Timed Text.zip
I specify a font in the style, but it does not work, e.g.:
[V4 Styles]
Format: Name, Alignment, Angle, BackColour, Bold, BorderStyle, Encoding, Fontname, Fontsize, Italic, MarginL, MarginV, Outline, OutlineColour, PrimaryColour, ScaleX, ScaleY, SecondaryColour, Underline
Style: style_0_0,5,0.000,&H007f7f7f,0,0,134,SimHei,60.000,0,1090,520,0.000,&H007f7f7f,&H64000000,100.000,100.000,&H00ffffff,0
[Events]
Format: Start, End, MarginL, MarginV, Style, Text
Dialogue: 00:00:10.88,00:00:24.04,1090,520,style_0_0,产品计划
After testing with an ASS/SSA override code in the text, it worked, e.g.:
[V4 Styles]
Format: Name, Alignment, Angle, BackColour, Bold, BorderStyle, Encoding, Fontname, Fontsize, Italic, MarginL, MarginV, Outline, OutlineColour, PrimaryColour, ScaleX, ScaleY, SecondaryColour, Underline
Style: style_0_0,5,0.000,&H007f7f7f,0,0,134,SimHei,60.000,0,1090,520,0.000,&H007f7f7f,&H64000000,100.000,100.000,&H00ffffff,0
[Events]
Format: Start, End, MarginL, MarginV, Style, Text
Dialogue: 00:00:10.88,00:00:24.04,1090,520,style_0_0,{\fnApple Chancery\b0}产品计划
So how can I generate a line like the following?
Dialogue: 00:00:10.88,00:00:24.04,1090,520,style_0_0,{\fnApple Chancery\b0}产品计划
s2, _ := astisub.ReadFromSRT(bytes.NewReader([]byte("00:01:00.000 --> 00:02:00.000\nCredits")))
This line produces 2 errors.
The first is fixed by replacing the dots in the timestamps with commas:
s2, _ := astisub.ReadFromSRT(bytes.NewReader([]byte("00:01:00,000 --> 00:02:00,000\nCredits")))
The second occurs because the input string begins with the timestamp rather than with an index line.
Starting at line 53 in srt.go:
Prepending an index line to the input avoids the error:
s2, _ := astisub.ReadFromSRT(bytes.NewReader([]byte("1\n00:01:00,000 --> 00:02:00,000\nCredits")))
Helper methods like:
Looking forward to support of this format for my video application (https://github.com/5k3105/Trans-View).
One common operation on subtitles is to apply a linear correction to fix various sync issues. This basically means that you change one time in the beginning and one in the end and it then recalculates all subtitles in between in a linear fashion.
An example of this could be the linear correction of https://subshifter.bitsnbites.eu/.
~/git/go-astisub ./main --help
Usage of ./main:
-f duration
the fragment duration
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x6c0974]
goroutine 1 [running]:
github.com/asticode/go-astikit.FlagStrings.String(...)
/home/user/go/pkg/mod/github.com/asticode/[email protected]/flag.go:34
flag.isZeroValue(0xc0000d6400, 0x0, 0x0, 0x80b9b2)
/usr/lib/go-1.14/src/flag/flag.go:458 +0x104
flag.(*FlagSet).PrintDefaults.func1(0xc0000d6400)
/usr/lib/go-1.14/src/flag/flag.go:521 +0x20b
flag.(*FlagSet).VisitAll(0xc0000b6120, 0xc000125b40)
/usr/lib/go-1.14/src/flag/flag.go:388 +0x61
flag.(*FlagSet).PrintDefaults(0xc0000b6120)
/usr/lib/go-1.14/src/flag/flag.go:504 +0x4e
flag.PrintDefaults(...)
/usr/lib/go-1.14/src/flag/flag.go:555
flag.glob..func1()
/usr/lib/go-1.14/src/flag/flag.go:583 +0xe6
flag.commandLineUsage()
/usr/lib/go-1.14/src/flag/flag.go:1021 +0x27
flag.(*FlagSet).usage(0xc0000b6120)
/usr/lib/go-1.14/src/flag/flag.go:884 +0x2f
flag.(*FlagSet).parseOne(0xc0000b6120, 0xc000125dc8, 0xdf3f8fad, 0xce75dcd2148b0b58)
/usr/lib/go-1.14/src/flag/flag.go:926 +0x1d4
flag.(*FlagSet).Parse(0xc0000b6120, 0xc0000a6030, 0x1, 0x1, 0x1, 0x80b9b2)
/usr/lib/go-1.14/src/flag/flag.go:971 +0x62
flag.Parse(...)
/usr/lib/go-1.14/src/flag/flag.go:999
main.main()
/home/user/git/go-astisub/astisub/main.go:24 +0x122
When we display a subtitle, it may be broken into multiple lines.
Example:
"Solo se muestran controles que estén
en estado crítico o de advertencia."
Unfortunately, the call to subtitle.String() method converts this to:
"Solo se muestran controles que estén - en estado crítico o de advertencia."
// String implements the Stringer interface
func (i Item) String() string {
    var os []string
    for _, l := range i.Lines {
        os = append(os, l.String())
    }
    return strings.Join(os, " - ") // <-- could this please be changed to return strings.Join(os, "\n")?
}
Thank you.
Neo
Hi,
In version 0.26.1, the indexes for .srt or .vtt files consistently register as zero, whereas in version 0.26.0 the behavior is normal.
Here is simplified code and subtitle files:
s1, _ := astisub.OpenFile("test.srt")
for _, v := range s1.Items {
fmt.Printf("Index: %d\n", v.Index)
}
Also known as Spruce Technologies, the specification is available here.
Sample files: DVD Studio Pro.zip
Hello
I want to convert an SRT file to STL.
How can I set originalProgramTitle and originalEpisodeTitle?
KG
When I read this line:
Dialogue: 0,0:00:36.38,0:00:38.84,DX,NTP,0,0,0,!Effect,Even solo players need to\Ntake this more seriously{\fscx300}-{\r}
the result is nil. I read the code at:
https://github.com/asticode/go-astisub/blob/master/ssa.go#L1068
and found that this function does not handle the text before {\fscx300}.
Is this a feature or a bug?
Hi.
I'm trying to parse the following ttml snippet:
<?xml version="1.0" encoding="UTF-8"?><tt xmlns:smpte="http://www.smpte-ra.org/schemas/2052-1/2010/smpte-tt" xmlns="http://www.w3.org/ns/ttml" xmlns:ttm="http://www.w3.org/ns/ttml#metadata" xmlns:tts="http://www.w3.org/ns/ttml#styling" xml:space="default" xml:lang="eng"><head>
<metadata>
<ttm:title/>
</metadata>
<styling>
<style xml:id="style.center.outline" xmlns:tts="http://www.w3.org/ns/ttml#style" tts:fontFamily="Arial" tts:fontSize="100%" tts:fontStyle="normal" tts:fontWeight="normal" tts:backgroundColor="transparent" tts:color="white" tts:textOutline="black 2px" tts:textAlign="center"/>
</styling>
<layout>
<region xml:id="r0" tts:displayAlign="after" tts:origin="10% 75%" tts:extent="80% 20%"/>
</layout>
</head><body>
<div>
<p style="style.center.outline" begin="00:22:31.000" region="r0" xml:id="p264" end="00:22:33.720" ><span tts:direction="ltr">Got you!<br/>Steady on.</span></p>
</div></body></tt>
It seems that the subtitle text is parsed without a new line.
The text is unmarshalled as xml chardata:
type TTMLInItem struct {
Text string `xml:",chardata"`
...
}
This results in the following string: "Got you!Steady on."
ttml.go has the following comment in the code:
// New line decoded as a line break. This can happen if there's a "br" tag within the text since
// since the go xml unmarshaler will unmarshal a "br" tag as a line break if the field has the
// chardata xml tag.
But it doesn't really seem that the Go xml unmarshaler converts the br tag into a new line.
Perhaps this was true in older Go versions? (I'm using Go 1.18.5.)
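As a possible workaround, the text can be collected by walking the XML tokens instead of relying on chardata, emitting a new line for each br element. This is only a sketch with the standard library; textWithLineBreaks is a hypothetical helper and not how ttml.go is currently written.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// textWithLineBreaks decodes an XML fragment and returns its character data,
// inserting "\n" wherever a br element occurs.
// Hypothetical helper for illustration only.
func textWithLineBreaks(fragment string) (string, error) {
	dec := xml.NewDecoder(strings.NewReader(fragment))
	var sb strings.Builder
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return sb.String(), nil
		}
		if err != nil {
			return "", err
		}
		switch t := tok.(type) {
		case xml.CharData:
			sb.Write(t)
		case xml.StartElement:
			if t.Name.Local == "br" {
				sb.WriteByte('\n')
			}
		}
	}
}

func main() {
	text, err := textWithLineBreaks(`<p><span>Got you!<br/>Steady on.</span></p>`)
	fmt.Printf("%q %v\n", text, err)
}
```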
Hi.
I see there are different ttml structures which the parser doesn't seem to support.
For example, the parser expects the subtitle "items" in the body to be in p tags, with the styles and region attributes.
I came across this ttml, where the body contains div of region, and within this div all the related subtitles of that region, without any region attribute:
<tt xmlns:ttp="http://www.w3.org/ns/ttml#parameter" xmlns="http://www.w3.org/ns/ttml"
xmlns:tts="http://www.w3.org/ns/ttml#styling" xmlns:ttm="http://www.w3.org/ns/ttml#metadata"
xmlns:ebuttm="urn:ebu:metadata" xmlns:ebutts="urn:ebu:style"
xml:lang="eng" xml:space="default"
ttp:timeBase="media"
ttp:cellResolution="32 15">
<head>
<metadata>
<ttm:title>DASH-IF Live Simulator</ttm:title>
<ebuttm:documentMetadata>
<ebuttm:conformsToStandard>urn:ebu:distribution:2014-01</ebuttm:conformsToStandard>
<ebuttm:authoredFrameRate>30</ebuttm:authoredFrameRate>
</ebuttm:documentMetadata>
</metadata>
<styling>
<style xml:id="s0" tts:fontStyle="normal" tts:fontFamily="sansSerif" tts:fontSize="100%" tts:lineHeight="normal"
tts:color="#FFFFFF" tts:wrapOption="noWrap" tts:textAlign="center"/>
<style xml:id="s1" tts:color="#00FF00" tts:backgroundColor="#000000" ebutts:linePadding="0.5c"/>
<style xml:id="s2" tts:color="#ff0000" tts:backgroundColor="#000000" ebutts:linePadding="0.5c"/>
</styling>
<layout>
<region xml:id="r0" tts:origin="15% 80%" tts:extent="70% 20%" tts:overflow="visible" tts:displayAlign="before"/>
<region xml:id="r1" tts:origin="15% 20%" tts:extent="70% 20%" tts:overflow="visible" tts:displayAlign="before"/>
</layout>
</head>
<body style="s0">
<div region="r0">
<p xml:id="sub16000" begin="00:00:16.000" end="00:00:17.000" >
<span style="s1">eng : 00:00:16.000</span>
</p>
<p xml:id="sub17000" begin="00:00:17.000" end="00:00:18.000" >
<span style="s1">eng : 00:00:17.000</span>
</p>
</div>
</body>
</tt>
In some other cases, the <p> elements might have style attributes, but they are also children of a div element that has its own styles, which should be inherited. This package doesn't seem to look at any div inside the body:
<body ttm:role="caption">
<div style="autogenFontStyle_n_150_120 S1 StyleFillLineGapTrue fontFamilyStyle">
<p begin="00:00:01.000" end="00:00:02.000" region="R6" style="S4" ttm:role="sound" xml:id="C1">
<span style="S3">FIRST SUBTITLE, WHA!!!!!! C1</span>
</p>
<p begin="00:00:03.000" end="00:00:04.000" region="R6" style="S4" ttm:role="sound" xml:id="C2">
<span style="S3">PHONE RINGS C2</span>
</p>
<p begin="00:00:05.000" end="00:00:06.000" region="R6" style="S4" ttm:role="sound" xml:id="C3">
<span style="S3">PHONE RINGS C3</span>
</p>
In another TTML file I have, the structure is as follows and seems to be parsed fine (except for the <br/> tag, for which I raised another issue).
The style attribute is directly on the <p> element, and so is the region:
<body>
<div>
<p style="style.center.outline" begin="00:22:31.000" region="r0" xml:id="p264" end="00:22:33.720" ><span tts:direction="ltr">Got you!<br/>Steady on.</span></p>
</div></body>
Are there multiple known TTML formats or versions, so that I can tell which ones are supported by this package?
Parsing such structures seems to be a big problem, at least not something that can be achieved by simply mapping the tags onto Go structs.
TTML subtitles are frustrating compared to other subtitle formats, and there is no clear documentation of all the different structures a file may have.
Hello there- I apologize if this is the wrong venue for a simple question.
Do you plan to support CAP files in this library in the near future?
Thanks.
Hi, I have never worked with Go before (but needed to use this cool lib/tool), so I might be wrong, but I believe the installation instructions do not work with Go 1.17+.
From what I found, it is related to https://go.dev/doc/go-get-install-deprecation . I tried using go install instead, but then I got:
go install github.com/asticode/go-astisub@latest
go: downloading github.com/asticode/go-astisub v0.19.0
go: downloading github.com/asticode/go-astikit v0.20.0
go: downloading github.com/asticode/go-astits v1.8.0
go: downloading golang.org/x/net v0.0.0-20200904194848-62affa334b73
go: downloading golang.org/x/text v0.3.2
package github.com/asticode/go-astisub is not a main package
I have solved my issue by using 1.16, but it would probably be helpful to address this somehow (maybe at least mention the version).
Thanks for the great lib/tool :)
s1, err := astisub.OpenFile("Call.Me.by.Your.Name.2017.BluRay.ass")
if err != nil {
    return nil, err
}
// s1.Items is nil. Why? With an .srt subtitle file it works.
For example, for the srt file from Lark:
1
00:00:01.230 --> 00:00:05.790
说话人 1: 用资讯唤醒每一天,欢迎收听财马早评
2
00:00:07.900 --> 00:00:11.220
说话人 2: 各位喜马拉雅才怕早评的听众朋友们大家早上好
3
00:00:11.220 --> 00:00:21.580
说话人 2: 2022 年的 12 月 26 号,今天是周一,本周是比较特殊的一周,本周交易结束到本周六
It uses a dot "." instead of a comma "," in the timestamps, which fails in this package. Could you please add a fallback for this kind of SRT file?
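Until such a fallback exists, a pre-processing step could rewrite the timing lines before they reach the parser. The sketch below is one way to do it with a regular expression; normalizeSRTTimestamps is a hypothetical helper, and it only touches lines that contain the --> separator so that subtitle text is left alone.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// srtDotTimestamp matches hh:mm:ss.mmm, i.e. a timestamp using a dot.
var srtDotTimestamp = regexp.MustCompile(`(\d{2}:\d{2}:\d{2})\.(\d{3})`)

// normalizeSRTTimestamps replaces the dot with a comma in the timestamps of
// a timing line and returns every other line unchanged.
// Hypothetical helper for illustration only.
func normalizeSRTTimestamps(line string) string {
	if !strings.Contains(line, "-->") {
		return line
	}
	return srtDotTimestamp.ReplaceAllString(line, "${1},${2}")
}

func main() {
	fmt.Println(normalizeSRTTimestamps("00:00:01.230 --> 00:00:05.790"))
}
```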
Hi
I got the error "astisub: style *Default not found" when trying to open an .ass file.
I found that *Default is defined as the default style in ASS v4.00+.
Is this supported by the astisub lib?
Thanks.
Here is part of the content of my test .ass file, which I downloaded from a subtitle resource site.
[Script Info]
; // The sub is created by AssToolkit
; // AssToolkit is an ASS Converter designed by David C.
Title:BlendVision
Original Script:Test
Synch Point:0
ScriptType:v4.00+
Collisions:Normal
Timer:100.0000
ScaledBorderAndShadow: no
[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Default,方正黑体_GBK,20,&H00FFFFFF,&HF0000000,&H00000000,&H32000000,0,0,0,0,100,100,0,0,1,2,1,2,5,5,2,134
[Events]
Format: Layer, Start, End, Style, Actor, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:05.66,0:00:10.66,*Default,NTP,0000,0000,0000,,{\an5\fad(0,500)\p1\bord2\shad0\fscx150\fscy160\alpha&566\c&H000000&\3c&HECB000&\move(145,247,145,227,0,500)\clip(30,188,340,210)\t(0,500,\clip(30,168,340,190))}m 211 -8 b 217 -6 217 -4 217 -2 l 217 24 b 217 26 217 29 211 31 l 31 31 b 26 29 26 26 26 24 l 26 -2 b 26 -4 26 -6 31 -8{\p0}
Dialogue: 0,0:00:05.66,0:00:10.66,*Default,NTP,0000,0000,0000,,{\an5\fad(0,500)\p1\bord2\shad0\fscx150\fscy160\alpha&566\c&H000000&\3c&HECB000&\move(145,207,145,227,0,500)\clip(30,220,340,235)\t(0,500,\clip(30,240,340,255))}m 211 -8 b 217 -6 217 -4 217 -2 l 217 24 b 217 26 217 29 211 31 l 31 31 b 26 29 26 26 26 24 l 26 -2 b 26 -4 26 -6 31 -8{\p0}
Dialogue: 0,0:00:05.66,0:00:10.66,*Default,NTP,0000,0000,0000,,{\an5\fad(0,500)\p1\bord3\blur3\shad0\fscx150\fscy160\alpha&566\c&H000000&\pos(145,227)\clip(30,210,340,220)\t(0,500,\clip(30,190,340,240))}m 211 -8 b 217 -6 217 -4 217 -2 l 217 24 b 217 26 217 29 211 31 l 31 31 b 26 29 26 26 26 24 l 26 -2 b 26 -4 26 -6 31 -8{\p0}
....
cat /tmp/Armageddon.1998.ass
Title: CNXP
Original Script: lzqc
PlayResX: 384
PlayResY: 288
Timer: 100.0000
[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: chs,simhei,20,&H00ffffff,&H0000ffff,&H00000000,&H80000000,1,0,0,0,90,90,0,0.00,1,2,2,2,20,20,17,1
[V4 Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, TertiaryColour, BackColour, Bold, Italic, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, AlphaLevel, Encoding
Style: eng,Arial Narrow,12,&H00ffeedd,&H00ffc286,&H00000000,&H80000000,-1,0,1,1,0,2,20,20,4,0,1
[Events]
Format: Marked, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:53.11,0:00:55.23,*eng,,0000,0000,0000,,This is the Earth at a time...
Dialogue: 0,0:00:55.36,0:01:00.19,*eng,,0000,0000,0000,,when the dinosaurs roamed a lush and fertile planet.
Dialogue: 0,0:01:07.58,0:01:11.03,*eng,,0000,0000,0000,,A piece of rock just six miles wide...
Dialogue: 0,0:00:53.11,0:00:55.23,*chs,,0000,0000,0000,,这是地球
Dialogue: 0,0:00:55.36,0:01:00.19,*chs,,0000,0000,0000,,那是恐龙称霸的时代 万物滋长 欣欣向荣
Dialogue: 0,0:01:07.58,0:01:11.03,*chs,,0000,0000,0000,,一块只有六里宽的石头
To reproduce the bug:
go install github.com/asticode/go-astisub/astisub@latest
astisub convert -i /tmp/Armageddon.1998.ass -o /tmp/out.vtt
It shows:
2022/12/13 21:32:07 astisub: style *eng not found while opening /tmp/Armageddon.1998.ass
Are there any plans to add features for converting the styling in TTML input to various output formats (I'm hoping for VTT)?
Thanks for a very nice and useful library!
I've run into STL files where STLTimecodeStartOfProgramme is set to 10 hours while the actual timestamps start at 0. As far as I've understood, this is a "safety mechanism" to avoid non-processed insertion of subtitles.
Since the offset is subtracted from the timestamp at line 255 in stl.go, I end up with large negative timestamps:
Line 255 in 8c108a5
Looping through all cues and adding 10 hours to each of them should restore the correct values, but it would be much more efficient and more transparent to have an option to set STLTimecodeStartOfProgramme = 0 in the call to ReadFromSTL, or to have this done automatically if the timestamps are less than STLTimecodeStartOfProgramme.
It's not clear to me what is the best way to progress here, but in order to not break the interface, maybe some optional parameters like in
https://github.com/edgeware/mp4ff/blob/bb9320744777dc97f18034c8aed45a9bcdbaa995/mp4/file.go#L121
could be a viable alternative?
Inline timestamp tags are rendered incorrectly.
The following test illustrates this issue:
func TestWebVTTWithTimestampTag(t *testing.T) {
    testData := `WEBVTT

00:01:00.000 --> 00:02:00.000
Sentence with a timestamp<00:01:02.000> in the middle`

    s, err := astisub.ReadFromWebVTT(strings.NewReader(testData))
    require.NoError(t, err)
    require.Len(t, s.Items, 1)

    b := &bytes.Buffer{}
    err = s.WriteToWebVTT(b)
    require.NoError(t, err)
    require.Equal(t, `WEBVTT

1
00:01:00.000 --> 00:02:00.000
Sentence with a timestamp<00:01:02.000> in the middle
`, b.String())
}
This test passes in v0.25.1 and fails in 0.26.0.
I can't determine if this is the same bug as #94 as the original file is no longer available. I can verify that #96 does not fix this issue.
I was able to determine that this was introduced by 1e3a211
I was able to fix this with the following diff:
diff --git a/webvtt.go b/webvtt.go
index 3b1f5e4..4f56392 100644
--- a/webvtt.go
+++ b/webvtt.go
@@ -574,7 +574,7 @@ func (li LineItem) webVTTBytes() (c []byte) {
c = append(c, []byte(tag.startTag())...)
}
}
- c = append(c, []byte(escapeWebVTT(li.Text))...)
+ c = append(c, []byte(li.Text)...)
if li.InlineStyle != nil {
noTags := len(li.InlineStyle.WebVTTTags)
for i := noTags - 1; i >= 0; i-- {
But of course, this breaks the TestWebVTTEscape test. I'm not familiar enough with the code to determine a proper fix, but I'm happy to do a PR if you have an idea of how to address this.
When I use subtitles.ReadFromWebVTT, the style information is lost.
STYLE
::cue {
color: #fff;
text-shadow: 0 1px #000, 1px 0 #000, -1px 0 #000, 0 -1px #000;
font-size: 15vw;
}
Tick values higher than a certain amount will result in incorrect times.
For instance, take this test TTML:
<?xml version="1.0" encoding="UTF-8"?>
<tt xmlns="http://www.w3.org/ns/ttml" xmlns:tt="http://www.w3.org/ns/ttml" xmlns:ttm="http://www.w3.org/ns/ttml#metadata" xmlns:ttp="http://www.w3.org/ns/ttml#parameter" xmlns:tts="http://www.w3.org/ns/ttml#styling" ttp:tickRate="10000000" ttp:version="2" xml:lang="ja">
<head>
<styling>
<initial tts:backgroundColor="transparent" tts:color="white" tts:fontSize="6.000vh"/>
<style xml:id="style0" tts:textAlign="center"/>
<style xml:id="style1" tts:textAlign="start"/>
<style xml:id="style2" tts:ruby="container" tts:rubyPosition="auto"/>
<style xml:id="style3" tts:ruby="base"/>
<style xml:id="style4" tts:ruby="text"/>
<style xml:id="style5" tts:ruby="text"/>
</styling>
<layout>
<region xml:id="region0" tts:displayAlign="after"/>
</layout>
</head>
<body xml:space="preserve">
<div>
<p xml:id="subtitle1" begin="18637368750t" end="18676157500t" region="region0" style="style0"><span style="style1">テソプ<span style="style2"><span style="style3">の所だ</span><span style="style4">カン食ン</span></span>食おう<br/><span style="style2"><span style="style3">江陵</span><span style="style5">カンヌン</span></span>で刺身でも食おう</span></p>
</div>
</body>
</tt>
The resulting ASS is:
[Script Info]
[V4 Styles]
Format: Name
Style: italic
Style: span
[Events]
Format: Start, End, Text
Dialogue: 00:0-8:0-44..48,00:0-8:0-43..02,TEST
Probably something to do with ticks not being int64 but normal integers, not sure though.
I wanted to get the ID number of a specific item in an SRT file, but it returns nil.
Tried this code:
s1, _ := astisub.OpenFile("example-in.srt")
fmt.Println(s1.Items[0].Region)
The simplified code:
var sub *astisub.Subtitles
if strings.HasSuffix(file.Name, ".srt") {
sub, err = astisub.ReadFromSRT(fd)
} else if strings.HasSuffix(file.Name, ".ass") || strings.HasSuffix(file.Name, ".ssa") {
sub, err = astisub.ReadFromSSA(fd)
}
var buf = &bytes.Buffer{}
sub.WriteToWebVTT(buf)
This is a fantastic module thank you. I've noticed that when I read VTT files that it will set the VoiceName field. However, when I write the file back down it doesn't include the speaker.
I'm not sure if this affects other formats, as I've only tested with VTT.
Hi there!
I ran into an issue when parsing a WebVTT file with the ReadFromWebVTT method. Specifically, the parser incorrectly returns an error that the inline style is invalid in the scenario when there are two spaces between the cue timing and cue settings.
For example, parsing this file will return an error because of the two spaces after the end of the cue time and before the word "position":
WEBVTT
00:00:00.580 --> 00:00:03.438  position:50% align:middle
- [Reporter] Hello welcome to today's episode
00:00:03.438 --> 00:00:07.862  position:50% align:middle
Today we're going to talk about cats
The WebVTT spec suggests it's ok to have more than one space:
A WebVTT cue block consists of the following components, in the given order:
Optionally, a WebVTT cue identifier followed by a WebVTT line terminator.
WebVTT cue timings.
Optionally, one or more U+0020 SPACE characters or U+0009 CHARACTER TABULATION (tab) characters followed by a WebVTT cue settings list.
Thanks for help in addressing this.
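For what it's worth, splitting the line on any run of whitespace rather than on a single space would accept both variants. Here is a minimal sketch with the standard library; splitCueTimingLine is a hypothetical helper and not the parser's current code.

```go
package main

import (
	"fmt"
	"strings"
)

// splitCueTimingLine splits a WebVTT cue timing line into start, end and the
// list of cue settings, accepting any number of spaces or tabs in between.
// Hypothetical helper for illustration only.
func splitCueTimingLine(line string) (start, end string, settings []string, err error) {
	fields := strings.Fields(line)
	if len(fields) < 3 || fields[1] != "-->" {
		return "", "", nil, fmt.Errorf("invalid cue timing line %q", line)
	}
	return fields[0], fields[2], fields[3:], nil
}

func main() {
	start, end, settings, err := splitCueTimingLine("00:00:00.580 --> 00:00:03.438  position:50% align:middle")
	fmt.Println(start, end, settings, err)
}
```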