Giter VIP home page Giter VIP logo

Comments (17)

RazrFalcon avatar RazrFalcon commented on June 24, 2024

so I assume it should be fine to just always return None for zero offsets

I would not assume anything in case of TrueType.

As you correctly noted, not all offsets are allowed to be NULL and for now you have to check them manually.
I would suggest adding parse_at_non_null_offset16 and use it only where the spec allows NULL. Because technically, a NULL value set for an offset that doesn't allow NULL is a hard error and we have to skip the whole table.

In other parts of the codebase I specifically use s.read::<Option<Offset16>>() for this. Ugly, but this is how it is.

from ttf-parser.

laurmaedje avatar laurmaedje commented on June 24, 2024

From the spec:

A NULL subtable offset always indicates that the given subtable is not present. Applications should never interpret a NULL offset value as the offset to subtable data. For cases in which subtable offset fields are not documented as permitting NULL values, font compilers must include a subtable of the indicated format, even if it is a header stub without further data (for example, a coverage table with no glyph IDs). Applications parsing font data should, however, anticipate non-conformant font data that has a NULL subtable offset where only a non-NULL value is expected.

Adding a new parse_at_non_null_offset16 and keeping the current parse_at_offset16 as-is conflicts with "applications should never interpret a NULL offset value as the offset to subtable data", doesn't it? And if we adapt parse_at_offset16 to conform with this, what would be the difference between the two methods?

from ttf-parser.

RazrFalcon avatar RazrFalcon commented on June 24, 2024

Can you give a link to this doc? I'm not sure if it's talking about the specific table or all of them.

Ok, my bad. I got confused with parse_at_offset16 and read_at_offset16. The first one isn't used anywhere except for MATH. But I guess you're proposing to change read_at_offset16 as well?
This would be a breaking change with unknown consequences.

from ttf-parser.

laurmaedje avatar laurmaedje commented on June 24, 2024

In the data types sections: https://learn.microsoft.com/en-us/typography/opentype/spec/otff
Yes, that would have been my proposal.

from ttf-parser.

RazrFalcon avatar RazrFalcon commented on June 24, 2024

harfbuzz has a clear separation between nullable and non-nullable offsets: link

Sadly, I can't read CRTP, so I'm not sure how exactly they handle NULL for non-NULL. But we should do the same.
Maybe you will be able to unravel it.

Also, this is OpenType spec. Apple can do whatever they want with their fonts, aka AAT.

from ttf-parser.

laurmaedje avatar laurmaedje commented on June 24, 2024

I'm also not able to parse that with confidence. From these lines, it seems that Harfbuzz produces null-pointers for non-nullable offsets (maybe this is never reached due to sanitization) and empty nulled objects for nullable offsets (which are reached). This makes sense insofar as that the GlyphAssembly table, whose offset may be null, is accessed as if it was always present. But I also can't find code in the sanitization that does that.

from ttf-parser.

RazrFalcon avatar RazrFalcon commented on June 24, 2024

@behdad Hi! Sorry for bothering you, but maybe you could clarify how NNOffsetNTo works in harfbuzz? Specifically, do you mark just the offset as missing/null, or do you mark the whole table as invalid? I've tried reading the code, but it's too advance for me.
And maybe you have some other insight regarding offsets handling in TrueType.

from ttf-parser.

behdad avatar behdad commented on June 24, 2024

NNOffsetTo

That's something else. That's used for a few places (mostly AAT tables and a couple of places in CFF) where a 0 offset is allowed.

In other places we always use OffsetTo. Even when the spec says a NULL offset is disallowed, we always allow it, and return a null object.

from ttf-parser.

behdad avatar behdad commented on June 24, 2024

In other places we always use OffsetTo. Even when the spec says a NULL offset is disallowed, we always allow it, and return a null object.

Ie. we never treat those as hard error.

from ttf-parser.

RazrFalcon avatar RazrFalcon commented on June 24, 2024

Thanks a lot!

@laurmaedje I guess we can invert the default behavior then. And those methods are not used in AAT anyway, at least for now.

from ttf-parser.

laurmaedje avatar laurmaedje commented on June 24, 2024

To summarize:

  • read_at_offset16 is used across the OpenType tables, but not for AAT
  • parse_at_offset16 is only used for MATH, but could be used for lots of OpenType parsing
  • read_at_offset32 is only used in AAT, and only twice

Do you want to only change parse_at_offset16 or also read_at_offset16? Changing read_at_offset32 breaks some test for the ankr table and the Apple TrueType spec talks about how 0 is a valid value.

from ttf-parser.

RazrFalcon avatar RazrFalcon commented on June 24, 2024

I would suggest to:

  1. read_at_offset16 should handle 0 now.
  2. parse_at_offset16 should be moved to MATH as a private Stream extension used only by MATH.
  3. read_at_offset32 should be moved toankr as a private Stream extension used only by ankr.

After updating everything, try running rustybuzz tests with those changes.

from ttf-parser.

laurmaedje avatar laurmaedje commented on June 24, 2024

Okay, I tried that. 786 rustybuzz tests passed, but a single one failed. :(

thread 'use_010' panicked at 'assertion failed: `(left == right)`
  left: `"Ga.icd=0+367|Gha.diag=0@100,0+386|O_dv.lg=0+190|Candrabindu.sm=0@48,32+0|Ga.icd=5+367|Gha.diag=5@100,0+386|AU_dv.lg=5+190|Candrabindu.sm=5@-64,187+0"`,
 right: `"E_dv.alt=0+275|Ga.icd=0+367|Gha.diag=0@100,0+386|AA_dv.alt=0+208|Candrabindu=0@17,-8+0|E_dv.alt=5+275|Ga.icd=5+367|Gha.diag=5@100,0+386|AU_dv_part.alt=5+213|Candrabindu.sm=5@-52,179+0"`', tests/shaping_in_house.rs:10712:5

I would have to take a closer look to see what's going on here as the assertion failure message isn't particularly helpful.

PS: Math doesn't really work differently from the rest of OpenType and both could make use of the parse_at_offset16 in the same way. Currently OpenType calls Coverage::parse(s.read_at_offset16(data)?)? manually, while MATH calls s.parse_at_offset16::<Coverage>(data)?

from ttf-parser.

RazrFalcon avatar RazrFalcon commented on June 24, 2024

rustybuzz has about 1600 tests, I guess cargo stoped earlier.

Anyway, as I've said before - there are no easy changes in TrueType.
I will take a look what might be wrong here.

from ttf-parser.

RazrFalcon avatar RazrFalcon commented on June 24, 2024

Can you publish your current changes so I could run them locally?

from ttf-parser.

laurmaedje avatar laurmaedje commented on June 24, 2024

Sure: https://github.com/laurmaedje/ttf-parser
Thanks for looking into it!

from ttf-parser.

RazrFalcon avatar RazrFalcon commented on June 24, 2024

Ok, so in case of use_010, ChainedContextLookup allows nulls. As per spec.

Specifically, in that font format2.backtrack_classes is set to 0, which is apparently just fine.

So instead of changing the old behavior, you could add a custom one just for MATH. Otherwise we should spend some time carefully checking all read_at_offset16 calls.

from ttf-parser.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.