We are using text-icu in Balkón, an inline layout engine intended for a web browser.
In order to handle bidirectional text, Balkón needs to be able to run the BiDi algorithm for a given input text and retrieve the calculated levels, so that it can break this text into directional runs and pass each of them to HarfBuzz for shaping. For this, Balkón needs the output provided by the ubidi_getLevels() function in the ICU C API. Because Balkón allows associating formatting options and metadata with portions of the input text, and because the output of HarfBuzz has the form of glyphs, we cannot use the high-level reorderParagraph function, which only works on plain text and reorders it in such a way that preserving metadata would be very difficult. Fortunately, the reordering step is very simple to implement, so Balkón can take responsibility for it. We just need the output of the BiDi algorithm after rule I2, before reordering (https://www.unicode.org/reports/tr9/#Reordering_Resolved_Levels).
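To illustrate why the resolved levels alone are sufficient, here is a minimal sketch of how a list of levels could be split into directional runs and then put into visual order following rule L2. The names `Run`, `levelRuns`, `reverseFrom` and `visualOrder` are hypothetical and not part of text-icu or Balkón. The sketch works on whole runs, leaves the reversal of glyphs inside each right-to-left run to the shaper, and ignores line breaking and rule L1.

```haskell
import Data.Function (on)
import Data.List (group, groupBy)
import Data.Word (Word8)

-- | A maximal sequence of adjacent positions sharing one resolved level.
-- (Hypothetical type, for illustration only.)
data Run = Run
    { runStart  :: Int    -- ^ offset of the first position in the run
    , runLength :: Int    -- ^ number of positions in the run
    , runLevel  :: Word8  -- ^ resolved level; odd means right-to-left
    } deriving (Eq, Show)

-- | Split resolved levels into runs, in logical order.
levelRuns :: [Word8] -> [Run]
levelRuns = go 0 . group
  where
    go _ [] = []
    go start ([] : gs) = go start gs
    go start (g@(lvl : _) : gs) =
        Run start (length g) lvl : go (start + length g) gs

-- | Reverse every contiguous sequence of runs at the given level or higher.
reverseFrom :: Word8 -> [Run] -> [Run]
reverseFrom lvl = concatMap flipGroup . groupBy ((==) `on` atOrAbove)
  where
    atOrAbove run = runLevel run >= lvl
    flipGroup g@(run : _) | atOrAbove run = reverse g
    flipGroup g = g

-- | Rule L2 applied to whole runs: from the highest level down to the
-- lowest odd level, reverse sequences of runs at that level or higher.
visualOrder :: [Run] -> [Run]
visualOrder runs =
    case filter odd levels of
        [] -> runs  -- purely left-to-right, nothing to reorder
        odds -> foldl (flip reverseFrom) runs
                    (reverse [minimum odds .. maximum levels])
  where
    levels = map runLevel runs
```

For the levels `[0,1,1,2,2,1,0]`, `levelRuns` yields five runs starting at offsets 0, 1, 3, 5 and 6, and `visualOrder` arranges them as 0, 5, 3, 1, 6. Formatting metadata can stay attached to the logical offsets throughout, which is the property that `reorderParagraph` does not offer.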
It would also be useful if Balkón could supply the initial embedding levels to the algorithm, so that direction changes can be dictated by higher level protocols (typically HTML, since this is intended for a web browser) without having to insert explicit formatting characters into the input string (which would complicate working with text offsets for selections). This is permitted by BiDi rule HL3 (https://www.unicode.org/reports/tr9/#HL3). Initial embedding levels can be passed as the embeddingLevels parameter of the ubidi_setPara() function in the ICU C API, which is currently hardcoded as NULL in the Haskell bindings (https://hackage.haskell.org/package/text-icu-0.8.0.2/src/cbits/text_icu.c).
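As a rough sketch of what the caller's side could look like, the following expands markup-derived spans into one level per position, which is the shape ICU documents for `embeddingLevels`. The `LevelSpan` type and `expandLevels` function are hypothetical. The sketch assumes that span lengths are counted in UTF-16 code units (the unit ICU uses for its text indices) and that the override flag corresponds to `UBIDI_LEVEL_OVERRIDE` (0x80) from `ubidi.h`.

```haskell
import Data.Bits ((.|.))
import Data.Word (Word8)

-- | A stretch of text whose embedding level is dictated by markup.
-- (Hypothetical type, for illustration only.)
data LevelSpan = LevelSpan
    { spanLength   :: Int    -- ^ length, assumed to be in UTF-16 code units
    , spanLevel    :: Word8  -- ^ embedding level for the whole span
    , spanOverride :: Bool   -- ^ True for an override, as with HTML <bdo>
    }

-- | Value of UBIDI_LEVEL_OVERRIDE in ICU's ubidi.h.
overrideBit :: Word8
overrideBit = 0x80

-- | Expand spans into one level per position.
expandLevels :: [LevelSpan] -> [Word8]
expandLevels = concatMap expand
  where
    expand (LevelSpan n lvl override) =
        replicate n (if override then lvl .|. overrideBit else lvl)
```

The resulting list could be packed into a `ByteString` or a storable vector before being handed to the binding. ICU's documentation also requires each level, apart from the override bit, to be at least `paraLevel`; the sketch does not check this.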
It should also be possible to control the paraLevel parameter of the ubidi_setPara() function, which would typically reflect the direction set on HTML block elements. This is permitted by BiDi rule HL1 (https://www.unicode.org/reports/tr9/#HL1).
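A typed wrapper could keep callers from passing arbitrary numbers as `paraLevel`. The sketch below uses a hypothetical `BaseDirection` type; the numeric values are the ones ICU's `ubidi.h` defines for explicit LTR (0), explicit RTL (1), `UBIDI_DEFAULT_LTR` (0xfe) and `UBIDI_DEFAULT_RTL` (0xff). Mapping HTML `dir="auto"` to one of the default values is an assumption of this example, not something the bindings prescribe.

```haskell
import Data.Word (Word8)

-- | Base direction of a paragraph. (Hypothetical type, for illustration.)
data BaseDirection
    = BaseLTR      -- ^ explicit left-to-right, e.g. dir="ltr"
    | BaseRTL      -- ^ explicit right-to-left, e.g. dir="rtl"
    | BaseAutoLTR  -- ^ first strong character decides, LTR if there is none
    | BaseAutoRTL  -- ^ first strong character decides, RTL if there is none
    deriving (Eq, Show)

-- | Translate a base direction into ICU's paraLevel encoding.
paraLevelFor :: BaseDirection -> Word8
paraLevelFor BaseLTR     = 0
paraLevelFor BaseRTL     = 1
paraLevelFor BaseAutoLTR = 0xfe  -- UBIDI_DEFAULT_LTR
paraLevelFor BaseAutoRTL = 0xff  -- UBIDI_DEFAULT_RTL
```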
A high-level, pure Haskell function providing the required functionality might look something like this:
    textLevels :: Word8 -> ByteString -> Text -> ByteString
    textLevels paraLevel inputLevels inputText =
        unsafePerformIO $ do
            bidi <- open
            setParaWithLevels bidi inputText paraLevel inputLevels
            getLevels bidi
where setParaWithLevels is a foreign call to ubidi_setPara() including the embeddingLevels parameter, and getLevels is a foreign call to ubidi_getLevels(). Additional code may be necessary to handle memory allocation and deallocation. The levels may be stored in a data type other than ByteString if necessary.