Feasibility of new combining characters? #1
Comments
Just a little suggestion: can this be implemented using ligatures? An example is https://github.com/tonsky/FiraCode
It would likely require changes in text-rendering software, at the least to map the characters to the value. There's also something a little weird about a superscript combining character once you start thinking of it generatively. What you're creating is itself a combining character, which would traditionally have scope over the previous character. But if you want both a superscript and a subscript on the same full-size character, you need some way to let the second mark reach two positions back. If you instead change the meaning of subscript/superscript to be a category of letters, you still have a problem. Do you want one to ever come before the other when setting type? Or do they combine directly above and below each other regardless of order? What happens when you want more than one letter in a sequence to be superscripted or subscripted? Should this also cover that case? And how does the renderer know how to space them?
The general problem this runs up against is the expressivity of mathematical typesetting conventions in indicating the referential scope of operators. Unicode is a huge code space, but there is no way to encode everything you would want in the semantics of this combining character. If you want to specify a use case, though, my guess is that the way to define it is as a modification of the way diacritics combine for the superscript-plus-subscript case (there they will always be one on top of the other), but behaving like a normal character when there is more than one in a row. That way you could emulate 1^{st}, for example. However, this introduces the problem of kerning, and of using a smaller font size rather than just a scaled larger font size. The simpler thing would be to just stack them as with diacritics, but that will look terrible for superscripts, since it's like trying to put an acute, grave, hat, and diaeresis on the same character. They'll overlap and generally look awful.
But regardless, it's either going to require strong constraints on interpretation, or you'll be using Unicode's flat namespace to effectively implement a mathematical typesetting language like LaTεχ. As for font issues, look at GitHub's monospace rendering of subscript schwa (ₔ vs.
@michaelpacer, I was thinking So the combining character would only affect the glyph immediately preceding it, and would be an alternative to defining lots of new subscript/superscript codepoints, not a more generalized typesetting system.
The point is to look at the superscript/subscript characters already in Unicode, and to follow that model. It's only a question of how to encode them (as new characters or via a combining mark).
@michaelpacer (responding to #3 (comment)) if you have a generic combining character, couldn't a font encode all of the most important cases (e.g. Latin and Greek subscripts/superscripts), similar to ligature substitution? If it does a mediocre job at rendering 🐨 subscripts, I don't think it's a big deal. This way you get the benefit of good sub/superscript glyphs in the important cases, combined with the flexibility to add more good sub/superscript glyphs as the need arises without changing the Unicode encoding.
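To make the ligature-style idea above concrete, here is a minimal sketch in Python of how a renderer or font-like lookup table might treat "base character + combining mark" pairs. Everything named here is an assumption for illustration: no such combining marks exist in Unicode today, so the private-use code points U+E000 and U+E001 stand in for them, and `substitute` is a hypothetical helper, not part of any real shaping engine. Pairs with a known glyph are replaced; anything else is left untouched, which models the "mediocre fallback" case.

```python
# Hypothetical stand-ins: these combining marks do not exist in Unicode,
# so private-use code points are used purely for illustration.
COMBINING_SUB = "\uE000"    # assumed "combining mathematical subscript"
COMBINING_SUPER = "\uE001"  # assumed "combining mathematical superscript"

# Small lookup tables built from sub/superscript characters that
# Unicode already encodes (deliberately incomplete).
SUBSCRIPTS = dict(zip("0123456789aeox", "₀₁₂₃₄₅₆₇₈₉ₐₑₒₓ"))
SUPERSCRIPTS = dict(zip("0123456789in", "⁰¹²³⁴⁵⁶⁷⁸⁹ⁱⁿ"))


def substitute(text: str) -> str:
    """Replace each (base, mark) pair with a dedicated glyph when one is known.

    The mark only ever looks at the single character immediately before it.
    Unknown pairs are passed through unchanged, leaving the fallback
    rendering to whatever consumes the string next.
    """
    out = []
    for ch in text:
        if ch in (COMBINING_SUB, COMBINING_SUPER) and out:
            table = SUBSCRIPTS if ch == COMBINING_SUB else SUPERSCRIPTS
            base = out[-1]
            if base in table:
                out[-1] = table[base]  # swap the base for its dedicated glyph
                continue               # and consume the mark
        out.append(ch)
    return "".join(out)
```

For example, `substitute("H2" + COMBINING_SUB + "O")` gives `"H₂O"`, while a pair with no table entry, such as `"q" + COMBINING_SUB`, comes back unchanged. New glyphs can be added by extending the tables, without touching the encoding.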
Another question here: in which cases should subscript combined characters normalize to existing subscript characters?
@asmeurer, yes, I wasn't sure about that. Unicode tends to favor using different codepoints for semantically distinct concepts, so a "mathematical subscript/superscript" character would tend to have a different codepoint from a subscript/superscript used for phonetic symbols. So, from this perspective one would have to go through the existing sub/superscripts in Unicode and identify their semantic origins. Alternatively, a simple answer would be "all of them". (Or "none of them".)
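As a reference point for the normalization question, the existing sub/superscript characters already carry compatibility decompositions tagged `<sub>` or `<super>` in the Unicode Character Database, and Python's standard `unicodedata` module exposes them. The sketch below only inspects that existing data; `script_origin` is a hypothetical helper name, and whether a new combining sequence should be equivalent to these characters is exactly the open question, not something this code decides.

```python
import unicodedata


def script_origin(ch: str):
    """Return (kind, base) if ch decomposes with a <sub> or <super> tag.

    kind is "sub" or "super"; base is the character(s) it decomposes to.
    Returns None for characters without such a decomposition.
    """
    parts = unicodedata.decomposition(ch).split()
    if len(parts) >= 2 and parts[0] in ("<sub>", "<super>"):
        base = "".join(chr(int(code, 16)) for code in parts[1:])
        return parts[0].strip("<>"), base
    return None


# List which of a few existing characters map back to a plain base character.
for ch in "₁ₐₔ²ⁿx":
    print(f"U+{ord(ch):04X}", script_origin(ch))
```

Note that under today's rules these are compatibility mappings only: NFKC folds `x₁` to `x1` and discards the subscript, while NFC leaves it alone. A "normalize all of them" answer would need to say which of those two behaviours the new sequences should follow.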
The most general form of this proposal, suggested by @StefanKarpinski, is to encode new combining characters: mathematical subscript and mathematical superscript, which indicate that the previous glyph is to be rendered as a subscript or superscript, respectively.
How difficult would this be to implement? Can it be done with font changes alone? Or does it require changes to text-rendering engines? In the latter case, can we get in contact with maintainers of prominent rendering software (Pango, Apple, Microsoft, ...) to gauge their interest?