Language: [tcf]
Text source: Mark Weathers
Text content: Epistle of James
Keyboard layout history: The Spanish Windows ANSI keyboard layout was co-opted with redundant Spanish characters replaced by Meꞌphaa characters. This was done in conjunction with replacing the glyphs in a special font, so that no keyboard setting needed to be changed on the computer. Simply type as normal on a Spanish keyboard and use the Mephaa file. This results in a keyboard layout that has been used for some years by some members of the Meꞌphaa community. The layout for this keyboard is presented below and is available as JSON from here.
Meꞌphaa keyboard layout images:
Base State
Shift State
Alt State
Mephaa Required Usage
Text processing steps:
- Text received as JAS_TCF.txt
- Moved characters from the hacked font's code points to proper Unicode values, using TECkit with me'phaa.map and me'phaa.tec
- Scripture texts have a very formal typesetting process. Markers for things like paragraphs, books, chapters, and verses are introduced by a reverse solidus (\). All of these were removed by hand.
- Replaced all characters in the Meꞌphaa text with their corresponding values as if they were English characters typed on a QWERTY keyboard (done by hand via search and replace). Resulting file: tcf-on-QWERTY.txt
This allows Typing to process the characters (in the mental model of typing, it is really processing keypresses, not characters). Typing only processes characters as if they are single-byte, so two- and three-byte characters do not work with the program. However, this means that if a language corpus is converted from its orthographic representation, it can be re-rendered as a keypress representation. This keypress representation just so happens to use QWERTY code points; the result is not English, but rather the language rendered as gobbledygook. Another way to think about this would be to use the ISO 9995 names of keys. tcf-on-QWERTY-UCC.txt is a quick check to show that all characters in the file are in the single-byte range.
- Typing requires a list of character bigrams and a list of character counts. The default method is to use an application by Michael Dickens called Frequency.
- Hugh had some difficulty getting Frequency to compile (and it was not guaranteed to work with multi-byte characters, which was another requirement). In lieu of using it, Hugh started down the path of step Seven.
- Typing assumes a one-to-one correspondence between each single-byte character and each keystroke. The processes in step three ensure that all multi-byte characters are converted to single-byte characters and their corresponding positions. This allows Typing to give us a fitness value (by running the tests against the existing QWERTY setting); it also allows Typing to project how to organize a keyboard layout, based on Typing's simulated annealing algorithm.
- To create the bigrams and character counts, the following scripts were used:
./bigrams.py tcf-on-QWERTY.txt > allDigrams.txt
Then, to get the character counts:
UnicodeCCount.pl -n tcf-on-QWERTY.txt | cut -f 2,3 | tr "\t" " " > allCharacters.txt && sed -i '1d' allCharacters.txt
Then the newline character had to be added to the top line as \n.
- Eventually, only KLA was used, with the text from tcf-on-QWERTY.txt. This text was then re-encoded back to Meꞌphaa and an image created via KLE. The KLA analysis is presented below.
- The assumed total keypress count for Meꞌphaa was computed as follows:
UnicodeCCount.pl tcf-on-QWERTY.txt | tail -n +2 | cut -f 3 | paste -sd+ | bc
This results in a total of 22235 keypresses. Because the text is counted after it has been converted to QWERTY, the assumption is that we are no longer counting characters but what they represent: keypresses. Using the following command
UnicodeCCount.pl mephaa3-unicode.txt | tail -n +2 | cut -f 3 | paste -sd+ | bc
we see that there are only 21294 characters (NFC) in the Meꞌphaa text. If we then take out the 1879 instances of U+0331 (a combining character), we get the total number of "reading characters" (like letters, but without evoking the idea of functional units, punctuation, or non-visible characters; diacritics are counted with their bases, and ñ is not counted as a separable character), for a grand total of 19415 letters. That is 22235 keypresses to produce 19415 letters, a ratio of 1.145 keys per letter. There are 4176 total diacritics, for a diacritic-to-letter density of 21.51%, or one diacritic per every 4.65 letters (not including ñ). If we include ñ, the total increases to 4296 and the density to 22.13%, or one diacritic per every 4.52 "letters".
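The counting steps above can be sketched in Python. This is only a minimal sketch, not the bigrams.py or UnicodeCCount.pl scripts themselves: all function names are hypothetical, "single byte" is assumed to mean ASCII (one byte in UTF-8), bigrams are assumed to include pairs that span spaces and newlines, and the only figures used are the counts quoted above.

```python
from collections import Counter


def char_counts(text: str) -> Counter:
    """Count every character; after the QWERTY conversion each stands for one keypress."""
    return Counter(text)


def bigram_counts(text: str) -> Counter:
    """Count adjacent character pairs, including pairs that span spaces or newlines."""
    return Counter(a + b for a, b in zip(text, text[1:]))


def is_single_byte(text: str) -> bool:
    """True if every character is ASCII, i.e. one byte in UTF-8 (assumed meaning)."""
    return all(ord(ch) < 128 for ch in text)


def density_stats(keypresses: int, letters: int, diacritics: int) -> dict:
    """Derive the ratios discussed in the text from three raw counts."""
    return {
        "keys_per_letter": keypresses / letters,
        "diacritic_density": diacritics / letters,
        "letters_per_diacritic": letters / diacritics,
    }


if __name__ == "__main__":
    # Counts quoted in the text: 21294 NFC characters, 1879 of them U+0331.
    letters = 21294 - 1879
    stats = density_stats(keypresses=22235, letters=letters, diacritics=4176)
    for name, value in stats.items():
        print(f"{name}: {value:.4f}")
```

Passing 4296 instead of 4176 as `diacritics` gives the variant that counts ñ as carrying a diacritic.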
Statistical analysis of existing and optimized Meꞌphaa keyboard using Keyboard Layout Analyzer (KLA)
Using the text transformation methods outlined above, the following keyboard statistics become available when using KLA.
KLA also suggests an "optimized" keyboard, which is the reference keyboard layout in the following graphs. This is contrasted with the existing Meꞌphaa keyboard, which is shown above. In the diagrams, the KLA-optimized keyboard is referenced as Personalized, while the existing Meꞌphaa layout is labeled QWERTY. Note that KLA itself acknowledges that its optimization is not very aggressive. One place where Hugh notices that further optimization could be considered is the tone marks: both could be moved to the right hand so that a better cadence can be achieved. As the tone marks are currently situated, a high level of outward rolls exists.
The distance that the typist's fingers will need to travel is greater for the existing Meꞌphaa layout.
As the previous heat map for the existing Meꞌphaa keyboard shows, the frequently used keys are on the periphery of the typing area, significantly overloading the weaker fingers. In actual typing of Meꞌphaa, Hugh has observed this to contribute to hunt-and-peck style typing.
However, how much more work it takes to type is not revealed until we compare the text input task on the Meꞌphaa keyboard with other keyboards. There are two ways to compare:
- Compare within the Meꞌphaa language to other keyboard layouts supporting Meꞌphaa.
- Compare to other language-based options that Meꞌphaa speakers might use to communicate the same message.
The following graphs illustrate both points.
In terms of workload percentages, we can see where on the hand the two keyboards are "balancing" the workload.
Finally, in terms of row usage, we can see where the high-frequency targets are.
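To make the hand and row balance concrete, here is a rough sketch that tallies keypress shares per hand and per row from a QWERTY-encoded text such as tcf-on-QWERTY.txt. It is not how KLA scores a layout. The key groupings are an assumption of this sketch: the unshifted keys of a US QWERTY keyboard with the conventional touch-typing split between hands. The function name is hypothetical.

```python
from collections import Counter

# Unshifted keys of a US QWERTY keyboard, by row (an assumption of this sketch).
ROWS = {
    "number": "`1234567890-=",
    "top": "qwertyuiop[]\\",
    "home": "asdfghjkl;'",
    "bottom": "zxcvbnm,./",
}
# Conventional touch-typing assignment: these keys belong to the left hand.
LEFT_HAND = set("`12345qwertasdfgzxcvb")


def usage_shares(text: str) -> dict:
    """Return the share of counted keypresses per row and per hand.

    Spaces, newlines, and shifted symbols are ignored; uppercase letters
    are folded to the key that produces them.
    """
    key_to_row = {key: row for row, keys in ROWS.items() for key in keys}
    rows, hands = Counter(), Counter()
    for ch in text.lower():
        row = key_to_row.get(ch)
        if row is None:
            continue  # not one of the unshifted keys tracked here
        rows[row] += 1
        hands["left" if ch in LEFT_HAND else "right"] += 1
    total = sum(rows.values())
    if total == 0:
        return {"rows": {}, "hands": {}}
    return {
        "rows": {name: count / total for name, count in rows.items()},
        "hands": {name: count / total for name, count in hands.items()},
    }
```

Calling `usage_shares` on the contents of the QWERTY-encoded file would give a coarse view of the row and hand load for the existing layout; remapping the text to another layout's key positions first would allow a side-by-side comparison.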