Update OpenType to make AAT and Graphite completely obsolete #7

Open
davelab6 opened this issue Mar 4, 2019 · 22 comments
Comments

@davelab6

davelab6 commented Mar 4, 2019

Apple AAT and SIL Graphite are two SFNT-based font shaping technologies that can do things that GPOS/GSUB cannot.

Should a state-machine-driven lookup mechanism be added to OpenType, one that is super-fast and on par with what’s available in Graphite and Apple’s AAT, making them finally obsolete? :)

@behdad
Member

behdad commented Mar 4, 2019

Hint hint:
https://github.com/OpenType/opentype-layout/blob/master/proposals/complex_contextual.md

I was very excited when Martin and I came up with that. I promised him I would write the intro (aka rationale) to that proposal, which I never did. But that proposal includes the best of state machines and then some.

I'm still interested in pursuing that, if there's interest / commitment from MS.

Or just fold morx into GSUB and call it good.

@be5invis

I heard @clerkma has a great format that covers all needs.

@tiroj
Collaborator

tiroj commented Mar 20, 2019

I understand and appreciate the need to either specify the lookup format as in the Complex Contextual Chaining Lookup type proposal or define the behaviour of a substitution in a formal way, but speaking as a type designer, I would find it really useful to have some examples of the kinds of things one could do with these approaches, and how one would apply them.

@be5invis

be5invis commented Apr 2, 2019

@tiroj
Or, could we simply adopt Graphite and make it official?

@behdad
Member

behdad commented Apr 2, 2019

Or adopt AAT which will be universally supported in browsers later this year.

@be5invis

be5invis commented Apr 2, 2019

@behdad
To clarify, which "way" is your preference: extend OTL to give it the same abilities as AAT/Graphite, or throw away OTL completely and use a new shaping model?
Option 1 is better for something like OpenType 1.9, but option 2 is a good idea for “OpenType 2.0”.

@mhosken
Collaborator

mhosken commented Apr 2, 2019

For me it isn't the weaknesses of GSUB that make me cry, but the weaknesses of GPOS. And AAT doesn't help with that.

@tiroj
Collaborator

tiroj commented Apr 2, 2019

Agreed: GSUB occasionally requires me to do some funky stuff to get glyphs in the right order for subsequent processing, but GPOS actually prevents me from doing things. AAT has been a non-starter with the majority of type designers for a quarter of a century, and without solving adjacency and positioning issues it's unlikely to gain any more support now.

Since we're throwing around ideas for alternative layout approaches, I'm going to mention DecoType's ACE, which demonstrably does solve the adjacency and positioning issues, even though currently very few people know how.

@be5invis

be5invis commented Apr 2, 2019

@tiroj

GSUB occasionally requires me to do some funky stuff to get glyphs in the right order for subsequent processing, but GPOS actually prevents me from doing things.

Could you please elaborate? And can Graphite handle all the cases? If so, adopting Graphite would be a good option. I heard that Graphite can do very fine-tuned position adjustments, but I have not reviewed its semantics.

A note: I do not want the DWrite or HarfBuzz APIs to change after we change the shaping process, so some ideas, like mixing GPOS and GSUB, carry a risk of API change. Also, be careful about the cluster map, which is critical for editors.

@tiroj
Collaborator

tiroj commented Apr 2, 2019

I'm not familiar enough with Graphite's positioning model to comment on that.

With regard to GPOS, there are problems with the interaction between spacing and mark positioning that affect some writing systems very badly, notably cascading Arabic styles such as nastaliq and diwani*, and Telugu. I spoke about these issues at TypeCon a few years ago: Problems of Adjacency.

Recently, I've been working on Telugu again, and pushed things about as far as I could using massive numbers of contextual GPOS lookups to adjust spacing over lateral subscripts. It's ugly.

@tiroj
Collaborator

tiroj commented Apr 2, 2019

I'd probably revise some parts of that 2014 presentation based on my more recent Telugu experience in which I decided to try to support Sanskrit as well as Telugu language text. My characterisation of Telugu shaping in OT on page 29 of the presentation PDF — 'it is all reasonably do-able' — now seems rather optimistic.

@mhosken
Collaborator

mhosken commented Apr 3, 2019

I'll have a try at listing some of the key positioning capabilities of Graphite that enable us to produce fonts for different scripts, including Nastaliq.

  • Conditional Actions. Graphite has the ability to constrain actions more than on the glyph sequence. You can do calculations on relative glyph positions; variables associated with the particular glyph; feature settings; glyph attributes (shared by all glyphs of a given glyph id). These may be applied to any action. For example, you can test for overlapping bounding boxes.
  • Spacing Attachment. Automatic optional insertion of guard space around diacritics, on the left and/or right side.
  • Attachment as trees. Allowing movement of whole attachment trees, etc.
  • Dynamically calculated values for things like shift and advance. These can be based on attributes of other glyphs in the matched string in addition to the glyph being modified.
  • Setting glyph attribute values that can be used in this or subsequent passes to constrain actions.
  • Glyph reprocessing within the same pass.

Specifically to support Nastaliq we also added:

  • Non linear kerning based on octabox outline approximations (rather than glyph bounding boxes)
  • Shift collision avoidance for nuqtas and diacritics within a cluster
  • Limited glyph replacement (1:1 only) during positioning based on positional constraints. E.g. to swap in a smaller glyph when space is limited.

In answer to your questions, @be5invis: yes, Graphite handles the character-to-glyph and glyph-to-character mapping very well (better than OT), and there is code that can create OpenType clusters out of the results from Graphite, as evidenced in HarfBuzz.

Areas where OT does positioning stuff that Graphite doesn't do:

  • no device metrics
  • no shaper preprocessing. In effect Graphite is a DFLT shaper on steroids. Reordering, insertion and deletion is easy in Graphite and with glyph attributes, very sophisticated substitution is possible.

Graphite doesn't exist in opposition to OpenType, but as an alternative particularly for the most complex shaping and positional needs.

In my experience of working on complex script needs in OpenType, I have found that even with sophisticated compiler/macro layering on top of OpenType, it just cannot express what needs to be expressed with regard to positioning in some cases. So having another string to our bow would be nice.

While I'm sure you could come up with a use case that Graphite can't handle, I don't know of any and we have fonts in a wide range of scripts.

@tiroj
Collaborator

tiroj commented Apr 3, 2019

re. nastaliq:

Shift collision avoidance for nuqtas and diacritics within a cluster

Do you have recent documentation on this? The initial version that I saw at TUC a few years back did a kind of approximation of nuqta and diacritic positioning which was definitely an improvement over trying to use GPOS, but didn't always correspond to script-normal positioning. As I recall, this was partly due to performance optimisation constraining the allowable angles of movement. Is this still the case, or is it possible to customise the collision avoidance algorithm if one wants to try to target traditional positioning norms?

I wrote up the algorithms in an unpublished paper that I would be happy to share. The main point about working in 45 degree space is one of speed given the workload is at least O(N^2) by dimension. Given all movement is in units of 45 degrees it makes sense to have outline approximations in the same space. This means that a single move is an optimal move against a 45 degree approximation to the outline. If doing multiple steps helps, that is still only an O(N) increase in cost so worth taking that route. The results we have seen are excellent generally.

Another way of saying this is that if you want to solve the problem exactly, it is going to be prohibitively expensive to calculate for minimal gain in quality. Yes, this is an approximation, but it's a good one that enables us to get a solution in a timely fashion. Happy to discuss further offline if that would help.
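
To make the 45-degree idea above concrete, here is a minimal sketch of an octagonal bounding region and an overlap test. It is not Graphite's actual data structure or algorithm; the `Octabox` class, its field names, and the sample points are all assumptions made for illustration. The region is bounded in x, y, x + y and x - y, so every edge lies at a multiple of 45 degrees.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

Point = Tuple[float, float]
Interval = Tuple[float, float]


def _span(values: Iterable[float]) -> Interval:
    vals = list(values)
    return (min(vals), max(vals))


def _overlap(a: Interval, b: Interval) -> bool:
    # Closed intervals: touching counts as overlapping.
    return a[0] <= b[1] and b[0] <= a[1]


@dataclass(frozen=True)
class Octabox:
    """Hypothetical octagonal approximation of an outline."""
    x: Interval
    y: Interval
    s: Interval  # bounds on x + y (one diagonal)
    d: Interval  # bounds on x - y (the other diagonal)

    @classmethod
    def from_points(cls, points: Iterable[Point]) -> "Octabox":
        pts = list(points)
        return cls(
            x=_span(px for px, _ in pts),
            y=_span(py for _, py in pts),
            s=_span(px + py for px, py in pts),
            d=_span(px - py for px, py in pts),
        )

    def bbox_overlaps(self, other: "Octabox") -> bool:
        # Plain bounding-box test: only the x and y intervals.
        return _overlap(self.x, other.x) and _overlap(self.y, other.y)

    def overlaps(self, other: "Octabox") -> bool:
        # Octabox test: the two diagonal intervals must overlap as well.
        return (self.bbox_overlaps(other)
                and _overlap(self.s, other.s)
                and _overlap(self.d, other.d))


# Two parallel diagonal strokes: their bounding boxes collide,
# but the octagonal approximations do not.
stroke_a = Octabox.from_points([(0, 4), (6, 10)])
stroke_b = Octabox.from_points([(4, 0), (10, 6)])
print(stroke_a.bbox_overlaps(stroke_b))  # True
print(stroke_a.overlaps(stroke_b))       # False
```

Each test is four interval comparisons, which hints at why staying in 45-degree space keeps a pairwise collision pass cheap.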

@be5invis

be5invis commented Apr 3, 2019

@tiroj @mhosken

DISCLAIMER: This is NOT a proposal from Microsoft, just from me.

My idea: if we want to refactor OTL, we can simplify lookups into a flat list of rules:

matchState, (B, N, C), recognizer ▶ action

where B, N, C are natural numbers with N > 0, and recognizer is a function that takes (B+N+C) glyphs (in GSUB) or (B+N+C) glyph-position pairs (in GPOS) and returns either TRUE or FALSE.

When performing substitution or positioning, if the current state (an integer) is equal to matchState and the recognizer returns TRUE for the non-ignored glyphs, the action is performed. An action can:

  • Replace the input sequence (in GSUB) : N glyph-clusterMap pairs → N lists of glyph-clusterMap pairs
  • Change the position data (in GPOS) : N glyph-position pairs → N positions
  • Jump to another state
  • Advance by one glyph

Recognizers and actions have a huge space for flexibility and extensibility; they only need to conform to the “type”:

  • GSUB recognizer: (B + N + C) Glyphs → TRUE or FALSE
  • GPOS recognizer: (B + N + C) Glyph-Position pairs → TRUE or FALSE
  • GSUB primitive action: N Glyph-ClusterMap pairs → N lists of glyph-clusterMap pairs
  • GPOS primitive action: N Glyph-Position pairs → N Positions
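
As a rough illustration of the GSUB side of this model, here is a minimal sketch of a rule list and a driver loop. It is one possible reading of the description above, not a specification: the `Rule` and `apply_rules` names, the use of glyph names as strings, the sample rules, and the choice to skip past the replacement after a match are all assumptions. Cluster maps and ignored glyphs are left out.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Glyph = str  # assumption: glyph names stand in for glyph IDs


@dataclass
class Rule:
    match_state: int
    window: Tuple[int, int, int]  # (B, N, C) with N > 0
    recognizer: Callable[[Sequence[Glyph]], bool]  # sees B + N + C glyphs
    action: Callable[[Sequence[Glyph]], List[List[Glyph]]]  # N glyphs -> N lists
    next_state: Optional[int] = None  # None means "stay in the current state"


def apply_rules(glyphs: Sequence[Glyph], rules: Sequence[Rule],
                state: int = 0) -> List[Glyph]:
    out = list(glyphs)
    pos = 0
    while pos < len(out):
        step = 1  # no rule matched: advance by one glyph
        for rule in rules:
            b, n, c = rule.window
            if rule.match_state != state:
                continue
            if pos - b < 0 or pos + n + c > len(out):
                continue  # the window does not fit in the buffer
            if not rule.recognizer(out[pos - b:pos + n + c]):
                continue
            groups = rule.action(out[pos:pos + n])
            flat = [g for group in groups for g in group]
            out[pos:pos + n] = flat
            if rule.next_state is not None:
                state = rule.next_state
            step = len(flat)  # assumption: skip past the replacement
            break
        pos += step
    return out


# Hypothetical rules: an f+i ligature that jumps to state 1, a contextual
# alternate for "u" after "q", and a state-1 rule that returns to state 0.
RULES = [
    Rule(0, (0, 2, 0), lambda w: list(w) == ["f", "i"],
         lambda g: [["f_i"], []], next_state=1),
    Rule(0, (1, 1, 0), lambda w: list(w) == ["q", "u"],
         lambda g: [["u.alt"]]),
    Rule(1, (0, 1, 0), lambda w: w[0] == "x",
         lambda g: [["x.end"]], next_state=0),
]

print(apply_rules(["q", "u", "f", "i", "x"], RULES))
# ['q', 'u.alt', 'f_i', 'x.end']
```

A GPOS variant would have the same shape, with glyph-position pairs in place of glyphs and positions in place of replacement lists.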

@be5invis

be5invis commented Apr 3, 2019

@tiroj @mhosken
Image of my concept:
image

DISCLAIMER: This is NOT a proposal from Microsoft, just from me.

@NorbertLindenberg

@be5invis Is there serious interest at Microsoft in improving OpenType layout in any substantial way? Are you speaking for Microsoft with your proposals and questions, or just for yourself? My impression was that Microsoft as a company considers OpenType layout done and has moved key people on to other projects. I’d be happy to hear that my impression was wrong.

@be5invis

be5invis commented Apr 3, 2019

@NorbertLindenberg
I am speaking for myself, not the company.
However, the text people (like Andrew Glass; I am not sure whether he uses GitHub) are interested in improving OTL, but there are a lot of concerns, like API stability and performance. Also, they seldom express their ideas in public.

@behdad
Member

behdad commented Apr 25, 2019

@behdad
To clarify, which "way" is your preference: extend OTL to give it the same abilities as AAT/Graphite, or throw away OTL completely and use a new shaping model?
Option 1 is better for something like OpenType 1.9, but option 2 is a good idea for “OpenType 2.0”.

I don't support throwing away and restarting. There's no indication that we can do better, and it will just waste a lot :). I support either integrating AAT-compatible machines into GSUB, or adding something like the Complex Contextuals proposal that Martin and I produced, which is state-machine-equivalent but has several nice properties in terms of storage and integration with existing lookups. We probably will work more on it again later this year.

https://github.com/OpenType/opentype-layout/blob/master/proposals/complex_contextual.md

@behdad
Member

behdad commented Apr 25, 2019

Agreed: GSUB occasionally requires me to do some funky stuff to get glyphs in the right order for subsequent processing,

This attitude was what broke down advancement of layout when we tried back in 2015. Just because this is not the top difficulty, it doesn't mean it's not worth addressing.

@mhosken
Collaborator

mhosken commented Apr 26, 2019

The point I was making is not that we shouldn't develop GSUB and make it better, but that just doing that is insufficient, on its own, to meet the needs out there.

@aminanan

I was trying to implement a finite state machine lookup based on the Complex Contextual Chaining Lookup proposal. I've come to the following conclusion. Correct me if I'm wrong, please.

If we need to implement the full expressiveness of regular expressions with unbounded repetition (i.e., Kleene star), we should support submatch extraction. See, for example, the Tagged Deterministic Finite Automaton, an extension of the deterministic finite automaton that is capable of submatch extraction and parsing. (Consider a tag as an action or lookup to apply at a given position after a match.)

Without submatching, we will not be able to express a simple rule such as

@class1* @class1 ' lookup l1 @class1

which specifies to apply the lookup l1 to the second to last glyph of a series of glyphs belonging to @class1. Graphite avoids the problem by not implementing Kleene star. The proposed complex lookup and AAT would support Kleene star only if it does not interfere with some action or lookup such as

@class1* @class2 ' lookup l1
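
The difference between the two rules can be illustrated with ordinary regular expressions. The sketch below uses Python's `re` module, which is a backtracking engine rather than a tagged DFA, so it only shows what submatch extraction has to report, not how a font engine would compute it. The one-letter class encoding and the `marked_position` helper are assumptions made for this example.

```python
import re
from typing import List, Optional

# Hypothetical encoding: one letter per glyph class.
CLASS_LETTER = {"class1": "a", "class2": "b"}

# The capture group plays the role of the marked (') glyph.
SECOND_TO_LAST = r"a*(a)a"  # @class1* @class1' @class1
BEFORE_CLASS2 = r"a*(b)"    # @class1* @class2'


def marked_position(pattern: str, glyph_classes: List[str]) -> Optional[int]:
    """Return the index of the glyph captured by group 1, or None if no match."""
    text = "".join(CLASS_LETTER[c] for c in glyph_classes)
    m = re.match(pattern, text)
    return m.start(1) if m else None


run = ["class1"] * 5
print(marked_position(SECOND_TO_LAST, run))                # 3
print(marked_position(BEFORE_CLASS2, run[:2] + ["class2"]))  # 2
```

In the first rule the marked index depends on where the run of @class1 ends, so a left-to-right matcher cannot know it when it reads that glyph; it has to carry a tag forward or look ahead. In the second rule the marked glyph is the one being read when the match completes.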
