Feasibility of new combining characters? #1

stevengj · 2016-08-27T02:38:04Z

The most general form of this proposal, suggested by @StefanKarpinski, is to encode new combining characters: mathematical subscript and mathematical superscript, which indicate that the previous glyph is to be rendered as a subscript or superscript, respectively.

How difficult would this be to implement? Can it be done with font changes alone? Or does it require changes to text-rendering engines? In the latter case, can we get in contact with maintainers of prominent rendering software (Pango, Apple, Microsoft, ...) to gauge their interest?

ProfFan · 2016-08-27T04:56:59Z

Just a small suggestion: could this be implemented using ligatures? One example is https://github.com/tonsky/FiraCode

mpacer · 2016-08-27T05:02:14Z

It would likely require changes in text-rendering software, at the very least to map the characters to the value.

Also, there's something a little weird about a superscript combining character if you start thinking of it generatively, because what you're creating is itself a combining character, which would traditionally have scope over the previous character. But now, if you wanted to have both a superscript and a subscript on the same fullscript character, you need some way to let one of them reach two positions back.

If you change the meaning of subscript/superscript to be a category of letters, you still have a problem. Do you want one to ever come before the other when setting some type? Or do they combine directly above and below each other regardless of order? What then happens when you have more than one letter you want to be super- or subscripted in a sequence? Should this also cover that case? Then how does it know how to space them?

The general problem this is running up against is the expressivity of mathematical type conventions in terms of indicating referential scope on operators. Unicode is a huge code space but there will be no way to encode all that you would want in the semantics of this combining character.

However, if you want to specify a use case, my guess is that the way to define it is as a modification of the way diacritics combine for handling the superscript/subscript thing (in that case they'll always be one on top of the other), but then as something like a normal character when there is more than one in a row. That way you could emulate 1^{st}, for example. However, this introduces the problem of kerning and of using a smaller font size rather than just a scaled larger font size.

The simpler thing to do would be to just stack them like diacritics, but that will look terrible for superscripts, since it's like trying to have an acute, grave, hat, and diaeresis on the same character. They'll overlap and will generally look awful.
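A minimal sketch of that stacking case, assuming Python's standard `unicodedata` module. It only inspects the code points; how badly the marks overlap depends on the font and renderer. The variable names are illustrative.

```python
import unicodedata

# One base letter followed by four combining marks that all attach above it.
base = "e"
marks = [
    "\u0301",  # COMBINING ACUTE ACCENT
    "\u0300",  # COMBINING GRAVE ACCENT
    "\u0302",  # COMBINING CIRCUMFLEX ACCENT (the "hat")
    "\u0308",  # COMBINING DIAERESIS
]
stacked = base + "".join(marks)

# Each mark reports the same canonical combining class ("above"), so all four
# compete for the same space over the base letter.
for ch in stacked:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} ccc={unicodedata.combining(ch)}")

# NFC can fold only the first mark into a precomposed letter; the remaining
# marks stay as separate code points stacked on top of it.
composed = unicodedata.normalize("NFC", stacked)
print(len(stacked), len(composed))
```

A superscript treated purely as another "above" mark would join the same pile, which is the overlap problem described here.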

But regardless it's either going to require making strong constraints on interpretation or you'll be using Unicode's flat name space to effectively implement a mathematical typesetting language like LaTεχ.

As for font issues, look at GitHub's monospace rendering of subscript schwa (ₔ vs. ₔ). Not all fonts know how to handle the current situation; even generally good ones have difficulty supporting all the features of Unicode semantics now. It's unclear how a font would encode what amounts to an entire secondary font within itself. The way out is a major change to text renderers so that they look in more than one font file; otherwise the spacing is likely to be terrible.

stevengj · 2016-08-27T11:55:03Z

@michaelpacer, I was thinking a[super] = ᵃ, regardless of what comes before it, so if you did Aa[super]x[sub] you would get Aᵃₓ ... i.e. it wouldn't try to put subscripts below superscripts.

So the combining character would only affect the glyph immediately preceding it, and would be an alternative to defining lots of new subscript/superscript codepoints, not a more generalized typesetting system.
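A rough sketch of that model in Python, under loud assumptions: the two marker code points below are private-use placeholders standing in for the proposed combining characters (which do not exist in Unicode), and the lookup tables are tiny hand-picked samples of sub/superscripts that are already encoded. `render` is a hypothetical helper, not anything a real text engine exposes.

```python
# Hypothetical stand-ins for the proposed combining characters. These are
# private-use code points chosen only for illustration.
SUPER = "\uE000"
SUB = "\uE001"

# Small samples of superscript/subscript characters already in Unicode.
SUPERSCRIPTS = {"a": "\u1d43", "n": "\u207f", "2": "\u00b2"}  # ᵃ ⁿ ²
SUBSCRIPTS = {"a": "\u2090", "x": "\u2093", "2": "\u2082"}    # ₐ ₓ ₂


def render(text):
    """Replace base+marker pairs with an existing sub/superscript glyph.

    A marker affects only the single character immediately before it.
    If no glyph is known for that character, the base is left as-is and
    the marker is dropped (a deliberately crude fallback).
    """
    out = []
    for ch in text:
        if ch == SUPER or ch == SUB:
            table = SUPERSCRIPTS if ch == SUPER else SUBSCRIPTS
            if out and out[-1] in table:
                out[-1] = table[out[-1]]
            # Otherwise there is nothing to substitute; skip the marker.
        else:
            out.append(ch)
    return "".join(out)


print(render("Aa" + SUPER + "x" + SUB))
```

With these tables, `Aa[super]x[sub]` comes out as `Aᵃₓ`: the subscript follows the superscript rather than sitting beneath it.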

stevengj · 2016-08-27T11:56:12Z

The point is to look at the superscript/subscript characters already in Unicode, and to follow that model. It's only a question of how to encode them (as new characters or via a combining mark).
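One way to survey what is already there, sketched with Python's standard `unicodedata` module: scan every code point and collect those whose compatibility decomposition is tagged `<super>` or `<sub>` with a single base character. The helper name `existing_scripts` is made up for this example, and the exact results depend on the Unicode version bundled with the Python build.

```python
import sys
import unicodedata
from collections import defaultdict


def existing_scripts(tag):
    """Map base characters to encoded characters that decompose to '<tag> base'."""
    found = defaultdict(list)
    for cp in range(sys.maxunicode + 1):
        parts = unicodedata.decomposition(chr(cp)).split()
        # Keep only single-character decompositions such as '<super> 0061'.
        if len(parts) == 2 and parts[0] == tag:
            found[chr(int(parts[1], 16))].append(chr(cp))
    return found


supers = existing_scripts("<super>")
subs = existing_scripts("<sub>")

# Lowercase Latin letters with no encoded subscript form in this Unicode version.
missing_sub = [c for c in "abcdefghijklmnopqrstuvwxyz" if c not in subs]
print("superscript forms of 'a':", supers.get("a"))
print("letters without a subscript:", "".join(missing_sub))
```

The gaps in the second line of output are the cases that would need either new codepoints or a combining mark.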

stevengj · 2016-08-30T18:27:06Z

@michaelpacer (responding to #3 (comment)) if you have a generic combining character, couldn't a font encode all of the most important cases (e.g. Latin and Greek subscripts/superscripts), similar to ligature substitution? If it does a mediocre job at rendering 🐨 subscripts, I don't think it's a big deal.

This way you get the benefit of good sub/superscript glyphs in the important cases, combined with the flexibility to add more good sub/superscript glyphs as the need arises without changing the Unicode encoding.

asmeurer · 2016-08-30T18:55:44Z

Another question here: in which cases should subscript combined characters normalize to existing subscript characters?

stevengj · 2016-08-30T19:20:34Z

@asmeurer, yes, I wasn't sure about that. Unicode tends to favor using different codepoints for semantically distinct concepts, so a "mathematical subscript/superscript" character would tend to have a different codepoint from a subscript/superscript used for phonetic symbols. So, from this perspective one would have to go through the existing sub/superscripts in Unicode and identify their semantic origins.

Alternatively, a simple answer would be "all of them". (Or "none of them".)
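For reference, a short sketch of how the existing characters behave under today's normalization forms, using Python's standard `unicodedata` module. It says nothing about how a new combining character would be treated; it only shows the current baseline that any answer would have to be compatible with.

```python
import unicodedata

sample = "x\u00b2 a\u2099"  # x² aₙ

# Existing sub/superscripts carry only *compatibility* decompositions,
# tagged with <super> or <sub>.
print(unicodedata.decomposition("\u00b2"))  # SUPERSCRIPT TWO
print(unicodedata.decomposition("\u2099"))  # LATIN SUBSCRIPT SMALL LETTER N

# Canonical normalization (NFC) leaves them untouched, while compatibility
# normalization (NFKC) replaces them with the plain base characters.
nfc = unicodedata.normalize("NFC", sample)
nfkc = unicodedata.normalize("NFKC", sample)
print(nfc == sample, nfkc)
```

So the existing characters are never canonically equivalent to a base letter plus anything else, and NFKC already discards the sub/superscript distinction entirely.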

stevengj mentioned this issue Aug 27, 2016

Adds Julia-style latex->unicode tab completion ipython/ipython#6380

Merged

2 tasks

stevengj mentioned this issue Aug 30, 2016

Endorsements #3

Open
