go-astisub's People

Contributors

anupcshan, arnestorksen, arun1587, asnapper, asticode, caryyu, chadnickbok, des-nerger, discovery-avishekgulshan, dlecorfec, eldersjavas, finn0, firodj, iheidashuai, j0sh, mernat, mholt, nhannguyen700, ping, rockjohnson503, rrooij, ruxton, samuel, tobbee, withoutpants

go-astisub's Issues

SRT with position properties

Hi,

I have a .SRT file with position properties, like following:

1
00:00:12,000 --> 00:00:15,123
This is the first subtitle

2
00:00:16,000 --> 00:00:18,000
Another subtitle demonstrating tags:
<b>bold</b>, <i>italic</i>, <u>underlined</u>
<font color="#ff0000">red text</font>

3
00:00:20,000 --> 00:00:22,000  X1:40 X2:600 Y1:20 Y2:50
Another subtitle demonstrating position.

Got this error when trying to open the file

astisub: line 15: parsing srt duration 00:00:22,000  X1:40 X2:600 Y1:20 Y2:50 failed: astisub: Invalid number of millisecond digits detected in 00:00:22,000  X1:40 X2:600 Y1:20 Y2:50

It seems that astisub does not recognize such position properties.
Would it be possible to support them?

reference:

formatDuration has a bug that produces wrong output

formatDuration builds the ASS timestamp from a formatted string, but when rounding to the target precision carries into the next unit, the carry is dropped and the output is wrong.
Example:

00:48:35,039 --> 00:48:35,999

Converting these timestamps (end time 00:48:35,999) to ASS outputs:

00:48:35.04,00:48:35.00

The end time is now before the begin time.
This is because fmt.Sprintf("%.2f", 0.999) returns "1.00", and slicing off the first two characters to keep only the fractional digits discards the carried 1.

t.Log(fmt.Sprintf("%."+strconv.Itoa(2)+"f", 0.999)[2:])

This logs 00.
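One way to avoid the lost carry is to skip float formatting entirely and work in whole centiseconds. The sketch below is only an illustration: formatCentiseconds is a hypothetical helper, not the library's formatDuration, and it assumes a non-negative duration, a fixed HH:MM:SS.cc layout, and truncation rather than rounding.

```go
package main

import (
	"fmt"
	"time"
)

// formatCentiseconds is a hypothetical helper (not astisub's formatDuration).
// It renders a non-negative duration as HH:MM:SS.cc using integer arithmetic
// only, truncating to centiseconds so no carry can be dropped.
func formatCentiseconds(d time.Duration) string {
	total := int64(d / (10 * time.Millisecond)) // whole centiseconds
	cs := total % 100
	s := (total / 100) % 60
	m := (total / 6000) % 60
	h := total / 360000
	return fmt.Sprintf("%02d:%02d:%02d.%02d", h, m, s, cs)
}

func main() {
	start := 48*time.Minute + 35*time.Second + 39*time.Millisecond
	end := 48*time.Minute + 35*time.Second + 999*time.Millisecond
	// The end time stays after the start time.
	fmt.Println(formatCentiseconds(start) + "," + formatCentiseconds(end))
}
```

Because it truncates, the start time above becomes .03 rather than the rounded .04. If rounding is preferred, the duration could be rounded to 10ms before the split, which lets the carry propagate into the seconds.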

Is there any way to suppress these annoying log.Printf calls?

Since some invalid files get parsed anyway and can't be read properly, I'd like to know whether it is possible to avoid the annoying output from the code below:

go-astisub/ssa.go

Lines 178 to 199 in aa9412f

			log.Printf("astisub: unknown section: %s", line)
			sectionName = ssaSectionNameUnknown
			continue
		}
	}
	// Unknown section
	if sectionName == ssaSectionNameUnknown {
		continue
	}
	// Comment
	if len(line) > 0 && line[0] == ';' {
		si.comments = append(si.comments, strings.TrimSpace(line[1:]))
		continue
	}
	// Split on ":"
	var split = strings.Split(line, ":")
	if len(split) < 2 || split[0] == "" {
		log.Printf("astisub: not understood: '%s', ignoring", line)
		continue

Also, if there were a way to verify whether the format is correct beforehand, that would be even better.

Escaping problem when converting from other formats to VTT, and from VTT to other formats

Hi,

I tried to convert some SRT files to VTT, and if the SRT file contains a special character such as &, it is not escaped in the output VTT:

  • Test case convert SRT to VTT:
    Input, SRT sample:

    1
    00:00:03,400 --> 00:00:06,177
    This is a sentence with &
    

    Expected VTT:

    WEBVTT
    
    1
    00:00:03.400 --> 00:00:06.177
    This is a sentence with &amp;
    

    Actual VTT:

    WEBVTT
    
    1
    00:00:03.400 --> 00:00:06.177
    This is a sentence with &
    

The same problem happens when I convert from TTML to VTT.

And when I convert from VTT to SRT, I expect &amp; in the VTT to become & in the SRT, but nothing changes: &amp; is carried over to the SRT without being unescaped.
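The expected round trip can be sketched with Go's standard html package. This is an approximation rather than what the library does: escapeCueText and unescapeCueText are hypothetical names, and html.EscapeString also escapes quotes, which is more than the test case above needs.

```go
package main

import (
	"fmt"
	"html"
)

// escapeCueText prepares plain text (for example from SRT) for a VTT cue.
func escapeCueText(s string) string { return html.EscapeString(s) }

// unescapeCueText turns VTT cue text back into plain text.
func unescapeCueText(s string) string { return html.UnescapeString(s) }

func main() {
	vtt := escapeCueText("This is a sentence with &")
	fmt.Println(vtt)                  // This is a sentence with &amp;
	fmt.Println(unescapeCueText(vtt)) // This is a sentence with &
}
```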

srtTimeBoundariesSeparator will lead to incorrect results

const (
srtTimeBoundariesSeparator = " --> "
)

When the SRT content looks like this:

1
00:08:29,079 -->00:08:30,119
hello

Because a space is missing after the arrow, this line is not identified as a time boundaries line, and the output is empty.

I would expect an error to be returned in this case.
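A more tolerant split could match the arrow with optional whitespace on both sides and return an error when nothing matches. The sketch below is an assumption about how that might look: splitTimeBoundaries and boundariesRe are hypothetical names, not part of astisub.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// boundariesRe matches "<start> --> <end>" with any amount of whitespace
// (including none) around the arrow.
var boundariesRe = regexp.MustCompile(`^\s*(\S+)\s*-->\s*(\S+)`)

// splitTimeBoundaries returns the raw start and end timestamps of a line,
// or an error if the line does not contain time boundaries.
func splitTimeBoundaries(line string) (start, end string, err error) {
	m := boundariesRe.FindStringSubmatch(line)
	if m == nil {
		return "", "", errors.New("no time boundaries found in: " + line)
	}
	return m[1], m[2], nil
}

func main() {
	fmt.Println(splitTimeBoundaries("00:08:29,079 -->00:08:30,119"))
	fmt.Println(splitTimeBoundaries("hello"))
}
```

Whether a malformed arrow should be accepted silently or reported is a design choice. This sketch accepts missing spaces and reports only lines with no arrow at all.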

Can't change fontColor for STL file

Hello,

I am trying to change the color of the caption text for STL subtitle file but I can't.

Here is my use case:

  1. I read an SRT file: s, _ := astisub.OpenFile(input)
  2. I add the GSI to my subtitle object (s.Metadata):
s.Metadata = &astisub.Metadata{
		Title:    fileNameWithoutExtTrimSuffix(input),
		Comments: comments,

		STLDisplayStandardCode:      "1",
		STLCharacterCodeTableNumber:    12336,
		STLCodePageNumber:           3683891,
		Framerate:                   25,
		Language:                    "french",
		STLCountryOfOrigin:          "FRA",

		STLCreationDate: &currentTime,
		STLRevisionDate: &currentTime,
		STLMaximumNumberOfDisplayableCharactersInAnyTextRow: astikit.IntPtr(40),
		STLMaximumNumberOfDisplayableRows:                   astikit.IntPtr(7),
		STLPublisher:                                        "",
		STLSubtitleListReferenceCode:                        "",
		STLOriginalEpisodeTitle:                             "",
		STLTimecodeStartOfProgramme:                         tcp.PresentationTime(),
}
  3. I add the TTI metadata to my subtitle object:
for idx, item := range s.Items {
		styleJustification := styles.Styles[idx].Justification
		styleVerticalPosition := styles.Styles[idx].VerticalPosition
		styleColor := styles.Styles[idx].Color

		c, _ := ParseHexColor(styleColor)

		justification := convertJustification(styleJustification)

		position := astisub.STLPosition{
			MaxRows:          7,
			Rows:             7,
			VerticalPosition: styleVerticalPosition,
		}

		color := astisub.Color{Red: c.Red, Green: c.Green, Blue: c.Blue}

		attr := astisub.StyleAttributes{
			STLJustification: &justification,
			STLPosition:      &position,

			TeletextColor: &color,
			TTMLColor:     astikit.StrPtr(color.TTMLString()),
		}

		item.InlineStyle = &attr

	}
  4. Finally, I create the STL subtitle file with s.Write(output)

My problem: all of the GSI and TTI metadata works fine EXCEPT for the text color, which is missing from my final STL file, even though I set both TeletextColor and TTMLColor.

Can you help me make the text color work for STL files?

STL Parsing TTI-TF without start box (0xb) character

Hello @asticode
I'm currently troubleshooting issues with some STL files that lose all their text when converted to any other format.

So far I have tracked it down to the TTI-TF which, if I understand the teletext.go:parseTeletextRow function correctly, must contain a byte with a value of 0xB (the Start Box control character according to EBU Spec 3264) before any text is decoded.

What is unclear to me at this point is whether I'm looking at a bug in astisub or a faulty STL file. Do you have any insight into this? I couldn't find anything in the STL spec that says a start box character is required.

Cheers
Matt


spec: https://tech.ebu.ch/docs/tech/tech3264.pdf
suspicious STL file: https://github.com/asnapper/go-astisub/raw/6fad4995b64ef6fd09b57e6882fd96604476122e/testdata/CHPD0608.stl

Access for adding new features to base

Hi @asticode ,
I would like to request access to include features related to styling, timestamp preservation, and various bug fixes, such as language support.
In the meantime, I am just using a fork and making PRs from my fork.
Thanks for the consideration, and great work on the repo so far.
Cheers!

Errors on update

# go get -u github.com/asticode/go-astisub/...

go: golang.org/x/text upgrade => v0.3.6
go: golang.org/x/net upgrade => v0.0.0-20210410081132-afb366fc7cd1
go: github.com/asticode/go-astits upgrade => v1.8.0
go: github.com/asticode/go-astikit upgrade => v0.20.0
# github.com/asticode/go-astisub
../../../../pkg/mod/github.com/asticode/[email protected]/teletext.go:338:12: undefined: astits.New
../../../../pkg/mod/github.com/asticode/[email protected]/teletext.go:357:9: undefined: astits.Data
../../../../pkg/mod/github.com/asticode/[email protected]/teletext.go:409:26: undefined: astits.Data
../../../../pkg/mod/github.com/asticode/[email protected]/teletext.go:429:9: undefined: astits.Data

Modify the text of an item

Hi,

Is there an easy way to modify the whole text content of an item in an SRT file?

For example, I have a file with the following content:

1
00:00:00,480 --> 00:00:04,380
hello
 world!

2
00:00:04,513 --> 00:00:08,346
this is
 test

It is malformed. I want to replace "\n " (\n and a space right after it) with "\n" (just a \n), so the results could be this:

1
00:00:00,480 --> 00:00:04,380
hello
world!

2
00:00:04,513 --> 00:00:08,346
this is
test

I see that each item is split into lines, so a simple replace call won't work.

Any suggestions?
Thanks.
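Since each item is already split into lines, one approach is to walk the lines and trim the leading whitespace of the first text fragment in each. The sketch below uses local stand-in types, because the exact shape of the library's Item, Line and LineItem is an assumption here, and trimLeadingSpaces is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// Stand-ins for the library's types. Their shape is assumed for this sketch.
type LineItem struct{ Text string }
type Line struct{ Items []LineItem }
type Item struct{ Lines []Line }

// trimLeadingSpaces removes leading spaces and tabs from the first text
// fragment of every line of every item, in place.
func trimLeadingSpaces(items []*Item) {
	for _, it := range items {
		for li := range it.Lines {
			if len(it.Lines[li].Items) == 0 {
				continue
			}
			first := &it.Lines[li].Items[0]
			first.Text = strings.TrimLeft(first.Text, " \t")
		}
	}
}

func main() {
	items := []*Item{{Lines: []Line{
		{Items: []LineItem{{Text: "hello"}}},
		{Items: []LineItem{{Text: " world!"}}},
	}}}
	trimLeadingSpaces(items)
	for _, l := range items[0].Lines {
		fmt.Printf("%q\n", l.Items[0].Text)
	}
}
```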

Can go-astisub generate ASS/SSA code?

I specified a font in the style but it does not work, e.g.:


[V4 Styles]
Format: Name, Alignment, Angle, BackColour, Bold, BorderStyle, Encoding, Fontname, Fontsize, Italic, MarginL, MarginV, Outline, OutlineColour, PrimaryColour, ScaleX, ScaleY, SecondaryColour, Underline
Style: style_0_0,5,0.000,&H007f7f7f,0,0,134,SimHei,60.000,0,1090,520,0.000,&H007f7f7f,&H64000000,100.000,100.000,&H00ffffff,0
[Events]
Format: Start, End, MarginL, MarginV, Style, Text
Dialogue: 00:00:10.88,00:00:24.04,1090,520,style_0_0,产品计划

When I tested with an inline ASS/SSA override tag, it worked, e.g.:


[V4 Styles]
Format: Name, Alignment, Angle, BackColour, Bold, BorderStyle, Encoding, Fontname, Fontsize, Italic, MarginL, MarginV, Outline, OutlineColour, PrimaryColour, ScaleX, ScaleY, SecondaryColour, Underline
Style: style_0_0,5,0.000,&H007f7f7f,0,0,134,SimHei,60.000,0,1090,520,0.000,&H007f7f7f,&H64000000,100.000,100.000,&H00ffffff,0
[Events]
Format: Start, End, MarginL, MarginV, Style, Text
Dialogue: 00:00:10.88,00:00:24.04,1090,520,style_0_0,{\fnApple Chancery\b0}产品计划

So how can I generate a line like Dialogue: 00:00:10.88,00:00:24.04,1090,520,style_0_0,{\fnApple Chancery\b0}产品计划

Example usage of ReadFromSRT in the README is incorrect

s2, _ := astisub.ReadFromSRT(bytes.NewReader([]byte("00:01:00.000 --> 00:02:00.000\nCredits")))

This line produces 2 errors.
The first is fixed by replacing the dots in the timestamps with commas.
s2, _ := astisub.ReadFromSRT(bytes.NewReader([]byte("00:01:00,000 --> 00:02:00,000\nCredits")))

The second is because the input string begins with the timestamp.
Starting at line 53 in srt.go:

https://github.com/asticode/go-astisub/blob/b6b18718ddb6ee0da08772d8ab310c9c3d2d0459/srt.go#L52C3-L58

Prepending an index line to the input []byte avoids the error.

s2, _ := astisub.ReadFromSRT(bytes.NewReader([]byte("1\n00:01:00,000 --> 00:02:00,000\nCredits")))

Add support for linear correction

One common operation on subtitles is to apply a linear correction to fix various sync issues. This basically means that you change one time in the beginning and one in the end and it then recalculates all subtitles in between in a linear fashion.

An example of this could be the linear correction of https://subshifter.bitsnbites.eu/.

CLI: --help crashes the application

~/git/go-astisub ./main  --help 
Usage of ./main:
  -f duration
    	the fragment duration
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x6c0974]

goroutine 1 [running]:
github.com/asticode/go-astikit.FlagStrings.String(...)
	/home/user/go/pkg/mod/github.com/asticode/[email protected]/flag.go:34
flag.isZeroValue(0xc0000d6400, 0x0, 0x0, 0x80b9b2)
	/usr/lib/go-1.14/src/flag/flag.go:458 +0x104
flag.(*FlagSet).PrintDefaults.func1(0xc0000d6400)
	/usr/lib/go-1.14/src/flag/flag.go:521 +0x20b
flag.(*FlagSet).VisitAll(0xc0000b6120, 0xc000125b40)
	/usr/lib/go-1.14/src/flag/flag.go:388 +0x61
flag.(*FlagSet).PrintDefaults(0xc0000b6120)
	/usr/lib/go-1.14/src/flag/flag.go:504 +0x4e
flag.PrintDefaults(...)
	/usr/lib/go-1.14/src/flag/flag.go:555
flag.glob..func1()
	/usr/lib/go-1.14/src/flag/flag.go:583 +0xe6
flag.commandLineUsage()
	/usr/lib/go-1.14/src/flag/flag.go:1021 +0x27
flag.(*FlagSet).usage(0xc0000b6120)
	/usr/lib/go-1.14/src/flag/flag.go:884 +0x2f
flag.(*FlagSet).parseOne(0xc0000b6120, 0xc000125dc8, 0xdf3f8fad, 0xce75dcd2148b0b58)
	/usr/lib/go-1.14/src/flag/flag.go:926 +0x1d4
flag.(*FlagSet).Parse(0xc0000b6120, 0xc0000a6030, 0x1, 0x1, 0x1, 0x80b9b2)
	/usr/lib/go-1.14/src/flag/flag.go:971 +0x62
flag.Parse(...)
	/usr/lib/go-1.14/src/flag/flag.go:999
main.main()
	/home/user/git/go-astisub/astisub/main.go:24 +0x122

Subtitles.go String() - why are subtitle lines joined with a " - ", instead of a new line ("\n")?

When we display a subtitle, it may be broken into multiple lines.
Example:
"Solo se muestran controles que estén
en estado crítico o de advertencia."

Unfortunately, the call to subtitle.String() method converts this to:
"Solo se muestran controles que estén - en estado crítico o de advertencia."

// String implements the Stringer interface
func (i Item) String() string {
	var os []string
	for _, l := range i.Lines {
		os = append(os, l.String())
	}
	return strings.Join(os, " - ") // <-- could this please be changed to return strings.Join(os, "\n")?
}

Thank you.
Neo

The indexes are always zero (v0.26.1 only)

Hi,
In version 0.26.1, the indexes for .srt or .vtt files consistently register as zero, whereas in version 0.26.0, the behavior is normal.
Here is simplified code and subtitle files:

s1, _ := astisub.OpenFile("test.srt")

for _, v := range s1.Items {
	fmt.Printf("Index: %d\n", v.Index)
}

TTML text parsing issues with new lines

Hi.

I'm trying to parse the following ttml snippet:

<?xml version="1.0" encoding="UTF-8"?><tt xmlns:smpte="http://www.smpte-ra.org/schemas/2052-1/2010/smpte-tt" xmlns="http://www.w3.org/ns/ttml" xmlns:ttm="http://www.w3.org/ns/ttml#metadata" xmlns:tts="http://www.w3.org/ns/ttml#styling" xml:space="default" xml:lang="eng"><head>
    <metadata>
      <ttm:title/>
    </metadata>
    <styling>
<style xml:id="style.center.outline" xmlns:tts="http://www.w3.org/ns/ttml#style" tts:fontFamily="Arial" tts:fontSize="100%" tts:fontStyle="normal" tts:fontWeight="normal" tts:backgroundColor="transparent" tts:color="white" tts:textOutline="black 2px" tts:textAlign="center"/>
    </styling>
    <layout>
      <region xml:id="r0" tts:displayAlign="after" tts:origin="10% 75%" tts:extent="80% 20%"/>
    </layout>
  </head><body>
  <div>
  <p style="style.center.outline" begin="00:22:31.000" region="r0" xml:id="p264" end="00:22:33.720" ><span tts:direction="ltr">Got you!<br/>Steady on.</span></p>
  </div></body></tt>


It seems that the subtitle text is parsed without the line break.
The text is unmarshalled as xml chardata:

type TTMLInItem struct {
	Text string `xml:",chardata"`
...
}

Which results in the following string: "Got you!Steady on."

ttml.go has the following comment in the code:

// New line decoded as a line break. This can happen if there's a "br" tag within the text since
// since the go xml unmarshaler will unmarshal a "br" tag as a line break if the field has the
// chardata xml tag.

But it doesn't seem that the Go xml unmarshaler actually converts the br tag into a new line.
Perhaps this used to be true in older Go versions? (I'm using Go 1.18.5.)
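One way to keep the line break is to walk the XML tokens instead of relying on chardata unmarshalling, emitting a newline whenever a br element is seen. This is a sketch of the technique, not the parser's actual code, and textWithLineBreaks is a hypothetical helper.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// textWithLineBreaks returns the character data of an XML fragment,
// inserting "\n" wherever a <br/> element occurs.
func textWithLineBreaks(fragment string) (string, error) {
	dec := xml.NewDecoder(strings.NewReader(fragment))
	var sb strings.Builder
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return sb.String(), nil
		}
		if err != nil {
			return "", err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			if t.Name.Local == "br" {
				sb.WriteString("\n")
			}
		case xml.CharData:
			sb.Write(t)
		}
	}
}

func main() {
	s, err := textWithLineBreaks(`<p><span>Got you!<br/>Steady on.</span></p>`)
	fmt.Printf("%q %v\n", s, err)
}
```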

What can be expected to be supported by the TTML parser?

Hi.
I see there are different TTML structures that the parser doesn't seem to support.
For example, the parser expects the subtitle items in the body to be p tags carrying the style and region attributes.
I came across this TTML, where the body contains a div with a region attribute, and that div holds all the subtitles for the region, none of which has a region attribute of its own:

<tt xmlns:ttp="http://www.w3.org/ns/ttml#parameter" xmlns="http://www.w3.org/ns/ttml"
    xmlns:tts="http://www.w3.org/ns/ttml#styling" xmlns:ttm="http://www.w3.org/ns/ttml#metadata"
    xmlns:ebuttm="urn:ebu:metadata" xmlns:ebutts="urn:ebu:style"
    xml:lang="eng" xml:space="default"
    ttp:timeBase="media"
    ttp:cellResolution="32 15">
  <head>
    <metadata>
      <ttm:title>DASH-IF Live Simulator</ttm:title>
      <ebuttm:documentMetadata>
        <ebuttm:conformsToStandard>urn:ebu:distribution:2014-01</ebuttm:conformsToStandard>
        <ebuttm:authoredFrameRate>30</ebuttm:authoredFrameRate>
      </ebuttm:documentMetadata>
    </metadata>
    <styling>
      <style xml:id="s0" tts:fontStyle="normal" tts:fontFamily="sansSerif" tts:fontSize="100%" tts:lineHeight="normal"
      tts:color="#FFFFFF" tts:wrapOption="noWrap" tts:textAlign="center"/>
      <style xml:id="s1" tts:color="#00FF00" tts:backgroundColor="#000000" ebutts:linePadding="0.5c"/>
      <style xml:id="s2" tts:color="#ff0000" tts:backgroundColor="#000000" ebutts:linePadding="0.5c"/>
    </styling>
    <layout>
      <region xml:id="r0" tts:origin="15% 80%" tts:extent="70% 20%" tts:overflow="visible" tts:displayAlign="before"/>
      <region xml:id="r1" tts:origin="15% 20%" tts:extent="70% 20%" tts:overflow="visible" tts:displayAlign="before"/>
    </layout>
  </head>
  <body style="s0">
    <div region="r0">
      
      <p xml:id="sub16000" begin="00:00:16.000" end="00:00:17.000" >
        <span style="s1">eng : 00:00:16.000</span>
      </p>
      
      <p xml:id="sub17000" begin="00:00:17.000" end="00:00:18.000" >
        <span style="s1">eng : 00:00:17.000</span>
      </p>
      
    </div>
  </body>
</tt>

In some other cases, the <p> elements might have style attributes, but these elements are also children of a div element that has its own styles, which should be inherited. This package doesn't seem to look at any div inside the body:

   <body ttm:role="caption">
      <div style="autogenFontStyle_n_150_120 S1 StyleFillLineGapTrue fontFamilyStyle">
         <p begin="00:00:01.000" end="00:00:02.000" region="R6" style="S4" ttm:role="sound" xml:id="C1">
            <span style="S3">FIRST SUBTITLE, WHA!!!!!! C1</span>
         </p>
         <p begin="00:00:03.000" end="00:00:04.000" region="R6" style="S4" ttm:role="sound" xml:id="C2">
            <span style="S3">PHONE RINGS C2</span>
         </p>
         <p begin="00:00:05.000" end="00:00:06.000" region="R6" style="S4" ttm:role="sound" xml:id="C3">
            <span style="S3">PHONE RINGS C3</span>
         </p>

Another TTML file I have is structured as follows, and it seems to be parsed fine (except for the <br/>, for which I raised another issue).
Here the style attribute is directly on the <p> element, and so is the region:

<body>
  <div>
  <p style="style.center.outline" begin="00:22:31.000" region="r0" xml:id="p264" end="00:22:33.720" ><span tts:direction="ltr">Got you!<br/>Steady on.</span></p>
  </div></body>

Are there multiple known TTML variants or versions, so that I can tell which ones this package supports?
Parsing such structures seems to be a big problem, and not something that can be achieved just by mapping the tags to Go structs.

TTML subtitles are frustrating compared to other subtitle formats, and there is no clear documentation about the format and all the different structures it may have.

Cheetah CAP Files?

Hello there- I apologize if this is the wrong venue for a simple question.

Do you plan to support CAP files in this library in the near future?

Thanks.

Installation instructions do not work with Go 1.17

Hi, I have never worked with Go before (but needed to use this cool lib/tool), so I might be wrong, but I believe the installation instructions do not work with Go 1.17+.

From what I found it is related to https://go.dev/doc/go-get-install-deprecation . I have tried using install but then I got:

go install github.com/asticode/go-astisub@latest
go: downloading github.com/asticode/go-astisub v0.19.0
go: downloading github.com/asticode/go-astikit v0.20.0
go: downloading github.com/asticode/go-astits v1.8.0
go: downloading golang.org/x/net v0.0.0-20200904194848-62affa334b73
go: downloading golang.org/x/text v0.3.2
package github.com/asticode/go-astisub is not a main package

I have solved my issue by using 1.16, but it would probably be helpful to address this somehow (maybe at least mention the version).

Thanks for the great lib/tool :)

Are .ass files not supported?

s1, err := astisub.OpenFile("Call.Me.by.Your.Name.2017.BluRay.ass")
if err != nil {
	return nil, err
}

// s1.Items is nil. Why? With an .srt subtitle file, it works.

Handle dot as well for SRT files

For example, for the srt file from Lark:

1
00:00:01.230 --> 00:00:05.790
说话人 1: 用资讯唤醒每一天,欢迎收听财马早评

2
00:00:07.900 --> 00:00:11.220
说话人 2: 各位喜马拉雅才怕早评的听众朋友们大家早上好

3
00:00:11.220 --> 00:00:21.580
说话人 2:  2022 年的 1226 号,今天是周一,本周是比较特殊的一周,本周交易结束到本周六

It uses a dot "." instead of a comma "," in the timestamps, which fails in this package. Could you please add a fallback for this kind of SRT file?
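Such a fallback could be as small as accepting either separator before the milliseconds. The following is a sketch under that assumption: parseLenientTimestamp and timestampRe are hypothetical names, and the library's own parser may be structured quite differently.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
	"time"
)

// timestampRe accepts HH:MM:SS followed by "," or "." and three millisecond digits.
var timestampRe = regexp.MustCompile(`^(\d{1,3}):(\d{2}):(\d{2})[,.](\d{3})$`)

// parseLenientTimestamp parses an SRT-style timestamp with either separator.
func parseLenientTimestamp(s string) (time.Duration, error) {
	m := timestampRe.FindStringSubmatch(strings.TrimSpace(s))
	if m == nil {
		return 0, fmt.Errorf("invalid timestamp %q", s)
	}
	var parts [4]int
	for i := range parts {
		// The regexp guarantees short, digit-only groups, so Atoi cannot fail.
		parts[i], _ = strconv.Atoi(m[i+1])
	}
	return time.Duration(parts[0])*time.Hour +
		time.Duration(parts[1])*time.Minute +
		time.Duration(parts[2])*time.Second +
		time.Duration(parts[3])*time.Millisecond, nil
}

func main() {
	for _, s := range []string{"00:00:01,230", "00:00:01.230"} {
		d, err := parseLenientTimestamp(s)
		fmt.Println(d, err)
	}
}
```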

*Default style not recognized for ASS v4.00+

Hi

I got the error astisub: style *Default not found when trying to open an .ass file.
I found that *Default is defined as the default style in ASS v4.00+.

Is this supported by the astisub lib?

Thanks.

Here is part of the content of my test .ass file, which I downloaded from a subtitle resource site.

[Script Info]
; // The sub is created by AssToolkit
; // AssToolkit is an ASS Converter designed by David C.
Title:BlendVision
Original Script:Test
Synch Point:0
ScriptType:v4.00+
Collisions:Normal
Timer:100.0000
ScaledBorderAndShadow: no

[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Default,方正黑体_GBK,20,&H00FFFFFF,&HF0000000,&H00000000,&H32000000,0,0,0,0,100,100,0,0,1,2,1,2,5,5,2,134

[Events]
Format: Layer, Start, End, Style, Actor, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:05.66,0:00:10.66,*Default,NTP,0000,0000,0000,,{\an5\fad(0,500)\p1\bord2\shad0\fscx150\fscy160\alpha&566\c&H000000&\3c&HECB000&\move(145,247,145,227,0,500)\clip(30,188,340,210)\t(0,500,\clip(30,168,340,190))}m 211 -8 b 217 -6 217 -4 217 -2 l 217 24 b 217 26 217 29 211 31 l 31 31 b 26 29 26 26 26 24 l 26 -2 b 26 -4 26 -6 31 -8{\p0}
Dialogue: 0,0:00:05.66,0:00:10.66,*Default,NTP,0000,0000,0000,,{\an5\fad(0,500)\p1\bord2\shad0\fscx150\fscy160\alpha&566\c&H000000&\3c&HECB000&\move(145,207,145,227,0,500)\clip(30,220,340,235)\t(0,500,\clip(30,240,340,255))}m 211 -8 b 217 -6 217 -4 217 -2 l 217 24 b 217 26 217 29 211 31 l 31 31 b 26 29 26 26 26 24 l 26 -2 b 26 -4 26 -6 31 -8{\p0}
Dialogue: 0,0:00:05.66,0:00:10.66,*Default,NTP,0000,0000,0000,,{\an5\fad(0,500)\p1\bord3\blur3\shad0\fscx150\fscy160\alpha&566\c&H000000&\pos(145,227)\clip(30,210,340,220)\t(0,500,\clip(30,190,340,240))}m 211 -8 b 217 -6 217 -4 217 -2 l 217 24 b 217 26 217 29 211 31 l 31 31 b 26 29 26 26 26 24 l 26 -2 b 26 -4 26 -6 31 -8{\p0}
....

Converting an ASS subtitle that contains more than one language to VTT fails

cat /tmp/Armageddon.1998.ass

Title: CNXP
Original Script: lzqc
PlayResX: 384
PlayResY: 288
Timer: 100.0000

[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: chs,simhei,20,&H00ffffff,&H0000ffff,&H00000000,&H80000000,1,0,0,0,90,90,0,0.00,1,2,2,2,20,20,17,1

[V4 Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, TertiaryColour, BackColour, Bold, Italic, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, AlphaLevel, Encoding
Style: eng,Arial Narrow,12,&H00ffeedd,&H00ffc286,&H00000000,&H80000000,-1,0,1,1,0,2,20,20,4,0,1

[Events]
Format: Marked, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:53.11,0:00:55.23,*eng,,0000,0000,0000,,This is the Earth at a time...
Dialogue: 0,0:00:55.36,0:01:00.19,*eng,,0000,0000,0000,,when the dinosaurs roamed a lush and fertile planet.
Dialogue: 0,0:01:07.58,0:01:11.03,*eng,,0000,0000,0000,,A piece of rock just six miles wide...
Dialogue: 0,0:00:53.11,0:00:55.23,*chs,,0000,0000,0000,,这是地球
Dialogue: 0,0:00:55.36,0:01:00.19,*chs,,0000,0000,0000,,那是恐龙称霸的时代 万物滋长 欣欣向荣
Dialogue: 0,0:01:07.58,0:01:11.03,*chs,,0000,0000,0000,,一块只有六里宽的石头

Steps to reproduce:

go install github.com/asticode/go-astisub/astisub@latest
astisub convert -i /tmp/Armageddon.1998.ass -o /tmp/out.vtt

It prints:
2022/12/13 21:32:07 astisub: style *eng not found while opening /tmp/Armageddon.1998.ass
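Both this report and the *Default one above involve a Dialogue line that references a style with a leading asterisk, while the style is declared without it. A possible workaround is to normalize the name before looking it up. This sketch assumes that a leading asterisk carries no meaning for the lookup, which should be checked against the ASS/SSA format notes. normalizeStyleName is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeStyleName strips leading asterisks from a style reference, so
// that "*Default" resolves to a style declared as "Default".
// Assumption: the asterisk is not significant for style lookup.
func normalizeStyleName(name string) string {
	return strings.TrimLeft(strings.TrimSpace(name), "*")
}

func main() {
	declared := map[string]bool{"Default": true, "eng": true, "chs": true}
	for _, ref := range []string{"*Default", "*eng", "chs"} {
		fmt.Println(ref, declared[normalizeStyleName(ref)])
	}
}
```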

TTML styling conversion

Are there any plans to add features for converting the styling in TTML input to various output formats? (I'm hoping for VTT.)

Setting STLTimecodeStartOfProgramme to zero before decoding STL?

Thanks for a very nice and useful library!

I've run into STL files where STLTimecodeStartOfProgramme is set to 10 hours while the actual timestamps start at 0. As far as I've understood, this is a "safety mechanism" to avoid non-processed insertion of subtitles.

Since the offset is subtracted from the timestamp at line 255 in stl.go, I end up with large negative timestamps:

StartAt: t.timecodeIn - g.timecodeStartOfProgramme,

Looping through all cues and adding 10 hours to each of them should give the correct values back, but it would be much more efficient and more transparent to have an option to set STLTimecodeStartOfProgramme = 0 in the call to ReadFromSTL, or to have this done automatically if the timestamps are less than the STLTimecodeStartOfProgramme.

It's not clear to me what is the best way to progress here, but in order to not break the interface, maybe some optional parameters like in

https://github.com/edgeware/mp4ff/blob/bb9320744777dc97f18034c8aed45a9bcdbaa995/mp4/file.go#L121

could be a viable alternative?

Timestamp tag regression in 0.26.0

Inline timestamp tags are rendered incorrectly.

The following test illustrates this issue:

func TestWebVTTWithTimestampTag(t *testing.T) {
	testData := `WEBVTT

	00:01:00.000 --> 00:02:00.000
	Sentence with a timestamp<00:01:02.000> in the middle`

	s, err := astisub.ReadFromWebVTT(strings.NewReader(testData))
	require.NoError(t, err)

	require.Len(t, s.Items, 1)

	b := &bytes.Buffer{}
	err = s.WriteToWebVTT(b)
	require.NoError(t, err)
	require.Equal(t, `WEBVTT

1
00:01:00.000 --> 00:02:00.000
Sentence with a timestamp<00:01:02.000> in the middle
`, b.String())
}

This test passes in v0.25.1 and fails in 0.26.0.

I can't determine if this is the same bug as #94 as the original file is no longer available. I can verify that #96 does not fix this issue.

I was able to determine that this was introduced by 1e3a211

I was able to fix this with the following diff:

diff --git a/webvtt.go b/webvtt.go
index 3b1f5e4..4f56392 100644
--- a/webvtt.go
+++ b/webvtt.go
@@ -574,7 +574,7 @@ func (li LineItem) webVTTBytes() (c []byte) {
                        c = append(c, []byte(tag.startTag())...)
                }
        }
-       c = append(c, []byte(escapeWebVTT(li.Text))...)
+       c = append(c, []byte(li.Text)...)
        if li.InlineStyle != nil {
                noTags := len(li.InlineStyle.WebVTTTags)
                for i := noTags - 1; i >= 0; i-- {

But of course, this breaks the TestWebVTTEscape test. I'm not familiar enough with the code to determine a proper fix, but happy to do a PR if you have an idea on how to address this.

How to support VTT styles?

When using subtitles.ReadFromWebVTT, the style information is lost.

STYLE
::cue {
color: #fff;
text-shadow: 0 1px #000, 1px 0 #000, -1px 0 #000, 0 -1px #000;
font-size: 15vw;
}

TTML ticks higher than 15 minutes display invalid values

Tick values higher than a certain amount will result in incorrect times.

For instance, take this test TTML:

<?xml version="1.0" encoding="UTF-8"?>
<tt xmlns="http://www.w3.org/ns/ttml" xmlns:tt="http://www.w3.org/ns/ttml" xmlns:ttm="http://www.w3.org/ns/ttml#metadata" xmlns:ttp="http://www.w3.org/ns/ttml#parameter" xmlns:tts="http://www.w3.org/ns/ttml#styling" ttp:tickRate="10000000" ttp:version="2" xml:lang="ja">
 <head>
  <styling>
   <initial tts:backgroundColor="transparent" tts:color="white" tts:fontSize="6.000vh"/>
   <style xml:id="style0" tts:textAlign="center"/>
   <style xml:id="style1" tts:textAlign="start"/>
   <style xml:id="style2" tts:ruby="container" tts:rubyPosition="auto"/>
   <style xml:id="style3" tts:ruby="base"/>
   <style xml:id="style4" tts:ruby="text"/>
   <style xml:id="style5" tts:ruby="text"/>
  </styling>
  <layout>
   <region xml:id="region0" tts:displayAlign="after"/>
  </layout>
  </head>
 <body xml:space="preserve">
  <div>
   <p xml:id="subtitle1" begin="18637368750t" end="18676157500t" region="region0" style="style0"><span style="style1">テソプ<span style="style2"><span style="style3">の所だ</span><span style="style4">カン食ン</span></span>食おう<br/><span style="style2"><span style="style3">江陵</span><span style="style5">カンヌン</span></span>で刺身でも食おう</span></p>
  </div>
 </body>
</tt>

The resulting ASS is:

[Script Info]

[V4 Styles]
Format: Name
Style: italic
Style: span

[Events]
Format: Start, End, Text
Dialogue: 00:0-8:0-44..48,00:0-8:0-43..02,TEST

Probably something to do with ticks not being int64 but normal integers, not sure though.
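One plausible cause, offered as a guess rather than a diagnosis of the library's code: if the tick count is multiplied by one second in nanoseconds before dividing by the tick rate, the product exceeds the int64 range once the time passes roughly 922 seconds at a tick rate of 10,000,000, which is a little over 15 minutes. Splitting the value into whole seconds and a remainder avoids that. ticksToDuration below is a hypothetical helper.

```go
package main

import (
	"fmt"
	"time"
)

// ticksToDuration converts a tick count to a duration without overflowing:
// whole seconds and the sub-second remainder are converted separately.
// It assumes tickRate is positive and small enough that
// (tickRate-1)*1e9 fits in an int64.
func ticksToDuration(ticks, tickRate int64) time.Duration {
	secs := ticks / tickRate
	rem := ticks % tickRate
	return time.Duration(secs)*time.Second +
		time.Duration(rem)*time.Second/time.Duration(tickRate)
}

func main() {
	// begin and end values from the TTML sample, with ttp:tickRate="10000000"
	fmt.Println(ticksToDuration(18637368750, 10000000))
	fmt.Println(ticksToDuration(18676157500, 10000000))
}
```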

Broken VTT styling in output (0.26.0 only)

2_Eng.zip

The simplified code:

var sub *astisub.Subtitles
if strings.HasSuffix(file.Name, ".srt") {
	sub, err = astisub.ReadFromSRT(fd)
} else if strings.HasSuffix(file.Name, ".ass") || strings.HasSuffix(file.Name, ".ssa") {
	sub, err = astisub.ReadFromSSA(fd)
}
var buf = &bytes.Buffer{}
sub.WriteToWebVTT(buf)

Speaker not included when writing VTT

This is a fantastic module, thank you. I've noticed that when I read VTT files, it sets the VoiceName field. However, when I write the file back out, the speaker is not included.

I'm not sure if this affects other formats, as I've only tested with VTT.

ReadFromWebVTT incorrectly returns empty Items if cue timings have two spaces before cue settings

Hi there!

I ran into an issue when parsing a WebVTT file with the ReadFromWebVTT method. Specifically, the parser incorrectly returns an error that the inline style is invalid in the scenario when there are two spaces between the cue timing and cue settings.

For example, parsing this file will return an error because of the two spaces after the end of the cue time and before the word "position":

WEBVTT

00:00:00.580 --> 00:00:03.438  position:50% align:middle
- [Reporter] Hello welcome to today's episode

00:00:03.438 --> 00:00:07.862  position:50% align:middle
Today we're going to talk about cats

The WebVTT spec suggests it's ok to have more than one space:

A WebVTT cue block consists of the following components, in the given order:

Optionally, a WebVTT cue identifier followed by a WebVTT line terminator.
WebVTT cue timings.
Optionally, one or more U+0020 SPACE characters or U+0009 CHARACTER TABULATION (tab) characters followed by a WebVTT cue settings list.

Thanks for help in addressing this.
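Splitting the cue timing line on runs of whitespace, rather than on a single space, would tolerate the double space and tabs that the quoted spec text allows. A minimal sketch, with splitCueLine as a hypothetical helper rather than the library's parser:

```go
package main

import (
	"fmt"
	"strings"
)

// splitCueLine splits a WebVTT cue timing line into its start time, end
// time, and cue settings, treating any run of spaces or tabs as a single
// separator.
func splitCueLine(line string) (start, end string, settings []string, err error) {
	fields := strings.Fields(line)
	if len(fields) < 3 || fields[1] != "-->" {
		return "", "", nil, fmt.Errorf("not a cue timing line: %q", line)
	}
	return fields[0], fields[2], fields[3:], nil
}

func main() {
	fmt.Println(splitCueLine("00:00:00.580 --> 00:00:03.438  position:50% align:middle"))
}
```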
