From 200197594353f33baf0a501023e06c17ea8e5731 Mon Sep 17 00:00:00 2001
From: r12a
Date: Thu, 15 Dec 2022 10:31:44 +0000
Subject: [PATCH] Apply https://github.com/r12a/scripts/issues/123

---
 bengali/bn-examples.js                |   10 +-
 bengali/bn.css                        |   22 +-
 bengali/bn.html                       | 1643 +++++++++++++------------
 bengali/images/fig_circumgraph.svg    |    1 +
 bengali/images/fig_prebase.svg        |    1 +
 bengali/images/fig_prebase1.svg       |    1 +
 bengali/images/fig_prebase2.svg       |    1 +
 bengali/images/fig_prebase_split1.svg |    1 +
 bengali/images/fig_prebase_split2.svg |    1 +
 bengali/images/fig_yophala.svg        |    1 +
 10 files changed, 880 insertions(+), 802 deletions(-)
 create mode 100644 bengali/images/fig_circumgraph.svg
 create mode 100644 bengali/images/fig_prebase.svg
 create mode 100644 bengali/images/fig_prebase1.svg
 create mode 100644 bengali/images/fig_prebase2.svg
 create mode 100644 bengali/images/fig_prebase_split1.svg
 create mode 100644 bengali/images/fig_prebase_split2.svg
 create mode 100644 bengali/images/fig_yophala.svg

diff --git a/bengali/bn-examples.js b/bengali/bn-examples.js
index 75207f35a..e74b2128a 100644
--- a/bengali/bn-examples.js
+++ b/bengali/bn-examples.js
@@ -63,7 +63,7 @@ autoExpandExamples.bn = `
 অকাল বীর্যপাত||
 অকাল|hard times|ɔkal
 অকৃতকার্য||
-অক্টোবর||
+অক্টোবর|October|ɔ§k.§ʈo.§bɔ§ɾ
 অক্ষর||
 অক্ষি||
 অগ্নি||
@@ -723,7 +723,7 @@ autoExpandExamples.bn = `
 কচুয়া||
 কচ্ছপ||
 কঠিন|hard, difficult|kɔʈʰin
-কড়া|strict|kɔɽa/kɔɾa
+কড়া|strict|kɔ§ɽa§/kɔɾa
 কড়ে আঙ্গুল||
 কত|how much|ˈkɔt̪o
 কথা||
@@ -757,7 +757,7 @@ autoExpandExamples.bn = `
 করতাল|cymbal|kɔ§r§tɑ§l
 করবী|oleander|kɔɾɔbi
 করলা||
-করা|to do|kɔɾa/kɔɹa/kɔɽa
+করা|to do|kɔ§ɾa§/kɔɹa/kɔɽa
 করানো||
 করার||
 করে||
@@ -1079,7 +1079,7 @@ autoExpandExamples.bn = `
 গপ|gossip|ɡɔp
 গম||
 গর-||
-গরম|warm, hot|ɡɔɾom|ɡôrôm|ɡɔɽom
+গরম|warm, hot|ɡɔ§ɾo§m§/ɡɔɽom|ɡôrôm
 গরিলা|gorilla|ɡɔɾila||ɡɔɹila
 গরু||
 গরুর দুধ||
@@ -2374,7 +2374,7 @@ autoExpandExamples.bn = `
 বাইরে||
 বাইশ||
 বাউল||
-বাকি|left over|baki
+বাকি|left over|ba§ki
 বাকী||
 বাগ|garden, orchard|baɡ
 বাগধারা||
diff --git a/bengali/bn.css b/bengali/bn.css
index bca174fe8..ff08d813f 100755
--- a/bengali/bn.css
+++ b/bengali/bn.css
@@ -1,16 +1,14 @@
 @font-face {
     font-family: 'Noto Sans Bengali WF';
     src: local('Noto Sans Bengali'),
-    url('../../shared/webfonts/NotoSansBengali-Regularwebfont.woff2') format('woff2'),
-    url('../../shared/webfonts/NotoSansBengali-Regularwebfont.woff') format('woff');
+    url('../../shared/webfonts/NotoSansBengali-Regularwebfont.woff2') format('woff2');
     font-weight: normal;
     font-style: normal;
     }
 @font-face {
     font-family: 'Noto Serif Bengali WF';
     src: local('Noto Serif Bengali'),
-    url('../../shared/webfonts/NotoSerifBengali-Regularwebfont.woff2') format('woff2'),
-    url('../../shared/webfonts/NotoSerifBengali-Regularwebfont.woff') format('woff');
+    url('../../shared/webfonts/notoserifbengali-regular-webfont.woff2') format('woff2');
     font-weight: normal;
     font-style: normal;
     }
@@ -28,3 +26,19 @@
 .iso15919 {
     font-family: "Gentium Plus", Gentium, "Lucida Sans Unicode", "Lucida Grande", 'lucida sans', "helvetica neue", sans-serif; font-size: 1.1em; font-weight: 300;
     }
+
+
+
+
+
+.useBlockExamples .charExample .ex {
+    font-size: 3.3rem;
+    line-height: 1.2;
+    }
+.useBlockExamples .charExample.inline .ex {
+    font-size: 1.4rem;
+    }
+
+
+
+
diff --git a/bengali/bn.html b/bengali/bn.html
index 8254e349f..34be3137e 100755
--- a/bengali/bn.html
+++ b/bengali/bn.html
@@ -59,7 +59,7 @@

Contents

Updated - 6 December, 2022 + 15 December, 2022

@@ -121,36 +121,46 @@

Usage & history

Basic features

-

The Bengali script is an abugida. Consonants carry an inherent vowel which can be modified by appending vowel-signs to the consonant. See the table to the right for a brief overview of features for Bangla.

+ +

The Bengali script is an abugida. Consonants carry an inherent vowel which can be modified by appending vowel signs to the consonant. See the table to the right for a brief overview of features for Bangla.

+

The orthographic letters of the Bengali script are derived from Sanskrit, and in some cases don't quite fit the needs of modern Bangla (eg. lack of simple vowels for the sounds ɛ and æ, letters for only 2 of many diphthongs, long and short letters where pronunciation no longer distinguishes those sounds, etc.)

+

Bengali runs left to right in horizontal lines.

+

Words are separated by spaces.

+

The 33 consonant letters used for Bangla are supplemented by repertoire extensions for 3 more sounds by applying the nukta diacritic to characters. ❯ consonants

+

Consonant clusters at any location are normally indicated using a virama (hasant) between consonants. This results in a large number of conjunct forms expressed using stacked consonants, conjoined consonants, and ligated glyphs. Conjuncts often have different pronunciations than might be expected from the letters involved, and in particular gemination is very common. Occasionally, a visible virama is used. However, clusters are often not marked at all. ❯ clusters

-

Bengali runs left to right in horizontal lines.

-

Words are separated by spaces.

-

The 33 consonant letters used for Bangla are supplemented by repertoire extensions for 3 more sounds by applying the nukta diacritic to characters.

-

Consonant clusters at any location are normally indicated using the virama between consonants. This results in a large number of conjunct forms expressed using stacked consonants, conjoined consonants, and ligated glyphs. Conjuncts often have different pronunciations than might be expected from the letters involved, and in particular gemination is very common. Occasionally, a visible virama is used. However, clusters are often not marked at all.

-

As part of a cluster, RA has special forms, for both cluster-initial and post-base positions.

-

Word-final consonant sounds may be represented by a special letter, , or by 2 dedicated combining marks (anusvara & visarga), but are generally ordinary consonants that are not marked by a virama.

-

Vowel harmony plays a significant role in the pronunciation of vowel-related code points.

-

The Bangla orthography has 2 inherent vowels, and represents other vowels using 9 vowel-signs, including 3 prescripts and 2 circumgraphs. All vowel-signs are combining marks, and are stored after the base character. Other vowels are represented by adaptations of the y consonant. The final sound of numerous diphthongs is represented using independent vowels.

-

There are 10 independent vowels, one for each vowel sound, including the inherent vowel, and these are used to write all standalone vowel sounds.

-

There are no composite vowels, in principle, however the 2 circumgraphs are decomposed into 2 parts each.

-

Vowels may be nasalised, using the candrabindu diacritic.

-

Bengali has native digit shapes.

+

As part of a cluster, RA has special forms, for both cluster-initial and post-base positions.

+ +

Word-final consonant sounds may be represented by a special letter, , or by 2 dedicated combining marks (anusvara & visarga), but are generally ordinary consonants that are not marked by a virama. ❯ finals

+ +

Vowel harmony plays a significant role in the pronunciation of vowel-related code points. ❯ vowelharmonydesc

+ +

The Bangla orthography has 2 inherent vowels, and represents other vowels using 9 vowel signs, including 3 pre-base and 2 circumgraph vowel signs. All vowel signs are combining marks, and are stored after the base character. Other vowels are represented by adaptations of the y consonant. The final sound of numerous diphthongs is represented using independent vowels. ❯ vowels

+ +

There are 10 independent vowels, one for each vowel sound, including the inherent vowel, and these are used to write all standalone vowel sounds. ❯ standalone

+ +

In principle there are no composite vowels; however, the 2 circumgraphs are decomposed into 2 parts each.

+ +

Vowels may be nasalised, using the candrabindu diacritic. ❯ nasalisation

+ +

Bengali has native digit shapes. ❯ numbers

-
-

Character index

+
+

Character index

+
@@ -175,7 +185,7 @@

Basic consonants

Extended consonants

-
ড়␣ঢ়␣য়
+
ড়␣ঢ়␣য়
@@ -206,7 +216,7 @@

Combining marks

-

Vowel-signs

+

Vowel signs

ি␣ে␣ৈ␣ো␣ৌ␣ী␣ু␣ূ␣া
@@ -235,7 +245,7 @@

Numbers

Not used for Bangla

-
৴␣৵␣৶␣৷␣৸␣৹
+
৴␣৵␣৶␣৷␣৸␣৹
@@ -275,7 +285,7 @@

Symbols

Not used for Bangla

-
৲␣৻
+
৲␣৻
@@ -331,19 +341,19 @@

Consonant-based orthographic syllables

C
Consonant.
Cn
Consonant followed by nukta.
h
Hasant.
-
v
Vowel-sign.
+
v
Vowel sign.
n
Nasalisation diacritic (candrabindu).
f
Final consonant (one of khanda ta, anusvara, or visarga).

The core of a consonant-based syllable is a base consonant character, which may or may not additionally represent an inherent vowel if it stands alone.

-

There is no inherent vowel if it is followed by a vowel-sign, eg. কী কি ki কো ko or hasant, eg. ক্ At the end of a word, there may or may not be an inherent vowel, even if there is no hasant.

+

There is no inherent vowel if it is followed by a vowel sign, eg. কী কি ki কো ko or hasant, eg. ক্ At the end of a word, there may or may not be an inherent vowel, even if there is no hasant.

Any base consonant may be a combination of consonant code point plus nukta.
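
As a rough illustration of the consonant-plus-nukta combination, the sketch below compares the two-code-point sequence for ড় with the precomposed code point U+09DC. The precomposed code point and its behaviour under normalisation are taken from the Unicode character data rather than from this page, and the variable names are illustrative only.

```javascript
// DDA + NUKTA, the sequence described above as consonant code point plus nukta.
const sequenceRRA = "\u09A1\u09BC";
// U+09DC BENGALI LETTER RRA, a precomposed equivalent (assumption: from Unicode data).
const precomposedRRA = "\u09DC";

// The two strings differ code point by code point...
console.log(precomposedRRA === sequenceRRA); // false

// ...but NFC maps the precomposed letter to the consonant + nukta sequence,
// because U+09DC is excluded from composition.
console.log(precomposedRRA.normalize("NFC") === sequenceRRA); // true
console.log(sequenceRRA.normalize("NFC") === sequenceRRA); // true
```

In other words, text normalised to NFC can be expected to carry the nukta as a separate code point after the consonant.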

The base consonant can be preceded by up to two consonant+hasant pairs (where the consonant may also be a combination of consonant+nukta), but only if those consonants form conjuncts (ie. the hasant is invisible), eg. ক্ক k͓k ম্প m͓p ক্ষ k͓ʃ̇ ন্ত্র n͓t͓r If the preceding consonants carry visible hasant symbols, those are treated as separate orthographic syllables.

Likewise, the variable use of the hasant in Bengali means that a phonetic cluster of consonants can constitute a larger series of orthographic syllables. For example, this word for cymbal has two phonetic syllables, but 3 orthographic since the rt combination is not combined: করতাল

-

A vowel-sign may optionally be followed by a nasalisation diacritic.

+

A vowel sign may optionally be followed by a nasalisation diacritic.

Unless the base consonant is followed by a hasant, the syllable may be terminated by a final consonant represented by khanda ta, anusvara, or visarga.

@@ -834,392 +844,90 @@

Consonant sounds

Vowels

-
-

Vowel sounds to characters

-

This section maps Bengali vowel sounds to common graphemes in the Bengali orthography, grouped by whether they are vowel-signs ( vs ), or standalone ( s ). Click on a grapheme to find other mentions on this page (links appear at the bottom of the page). Click on the character name to see examples and for detailed descriptions of the character(s) shown.

- -

Hyphens indicate a consonant with inherent vowel.

+
+

Vowel harmony plays a significant role in the pronunciation of vowel-related code points.

+

The Bangla orthography has 2 inherent vowels, and represents other vowels using 9 vowel signs, including 3 pre-base and 2 circumgraph vowel signs. All vowel signs are combining marks, and are stored after the base character. Other vowels are represented by adaptations of the y consonant. The final sound of numerous diphthongs is represented using independent vowels.

+

There are 10 independent vowels, one for each vowel sound, including the inherent vowel, and these are used to write all standalone vowel sounds.

-
-

Plain vowels

+

In principle there are no composite vowels; however, the 2 circumgraphs are decomposed into 2 parts each.

-
-
-
i
-
vs
-
-

ি [U+09BF BENGALI VOWEL SIGN I], eg. বিড়ি

-

[U+09C0 BENGALI VOWEL SIGN II], eg. বীর

-

[U+09C3 BENGALI VOWEL SIGN VOCALIC R] as part of the vocalic ri, eg. বৃহৎ

-
-
-
-
 
-
s
-
-

[U+0987 BENGALI LETTER I], eg. ইংরেজ

-

[U+0988 BENGALI LETTER II], eg. ঈদ

-

[U+098B BENGALI LETTER VOCALIC R] as part of the vocalic ri, eg. ঋতু

-
-
-
-
u
-
vs
-
-

[U+09C1 BENGALI VOWEL SIGN U], eg. বুক

-

[U+09C2 BENGALI VOWEL SIGN UU], eg. মূল

-

[U+09CB BENGALI VOWEL SIGN O] with vowel harmony before one of i u, eg. কোকিল

-
-
-
-
 
-
s
-
-

[U+0989 BENGALI LETTER U], eg. উঁচু

-

[U+098A BENGALI LETTER UU]

-

[U+0993 BENGALI LETTER O] with vowel harmony before one of i u, eg. ওদিকে

-
-
+

Vowels may be nasalised, using the candrabindu diacritic.

-
-
-
e
-
vs
-
-

[U+09C7 BENGALI VOWEL SIGN E], with vowel harmony before one of i u, eg. বেগুন.

-

ি [U+09BF BENGALI VOWEL SIGN I] with vowel harmony before one of ɔ o e a, eg. বিড়াল.

-

্য [U+09CD BENGALI SIGN VIRAMA + U+09AF BENGALI LETTER YA] with the inherent vowel before i, eg. ব্যক্তি.

-
-
-
-
 
-
s
-
-

[U+098F BENGALI LETTER E], with vowel harmony before one of i u, eg. একটু.

-

[U+0987 BENGALI LETTER I] with vowel harmony before one of ɔ o e a.

-
-
-
-
-
vs
-
-

য় [U+09AF BENGALI LETTER YA + U+09BC BENGALI SIGN NUKTA] as part of a diphthong, esp after ɔ a o, eg. ভয়.

-
-
-
-
o
-
vs
-
-

Inherent vowel

-

[U+09CB BENGALI VOWEL SIGN O], eg. বোন.

-

[U+09C1 BENGALI VOWEL SIGN U] with vowel harmony before one of ɔ o e a, eg. বুড়ো.

-
-
-
-
 
-
s
-
-

[U+0993 BENGALI LETTER O]

-

[U+0985 BENGALI LETTER A] with vowel harmony before one of i u, eg. অভিধান.

-

[U+0989 BENGALI LETTER U], with vowel harmony before one of ɔ o e a, eg. উচ্চারন.

-
-
-
-
-
-
ɛ
-
vs
- -
-
-
 
-
s
- -
-
-
ɔ
-
vs
-
-

Inherent vowel

-

[U+09CB BENGALI VOWEL SIGN O] with vowel harmony before one of ɔ o e a, eg. বোকা

-
-
-
-
 
-
s
-
-

[U+0985 BENGALI LETTER A] , eg. অঙ্ক.

-

[U+0993 BENGALI LETTER O] with vowel harmony before one of ɔ o e a, eg. ওড়া.

-
-
-
+
+

Inherent vowel

+

-
-
-
æ
-
vs
-
-

[U+09BE BENGALI VOWEL SIGN AA] after জ্ঞ [U+099C BENGALI LETTER JA + U+09CD BENGALI SIGN VIRAMA + U+099E BENGALI LETTER NYA], eg. জ্ঞান;

-

্যা [U+09CD BENGALI SIGN VIRAMA + U+09AF BENGALI LETTER YA + U+09BE BENGALI VOWEL SIGN AA] sometimes, eg. ব্যাঙ্ক.

-

্য [U+09CD BENGALI SIGN VIRAMA + U+09AF BENGALI LETTER YA] with the inherent vowel and not followed by i, eg. ব্যথা.

-

[U+09C7 BENGALI VOWEL SIGN E], with vowel harmony before one of ɔ o e a, eg. বেলা.

-
-
-
-
 
-
s
-
-

[U+098F BENGALI LETTER E] with vowel harmony before one of ɔ o e a, eg. একবার.

-
-
-
+

The inherent vowel is typically transcribed as a, and pronounced as ɔ or o. (And sometimes halfway between these two, when influenced by surrounding sounds.) Bengalis are not always aware of these sound differences – thinking of this as one sound. So or ko are written by simply using the consonant letter.

+

or ko [U+0995 BENGALI LETTER KA]

+

There is also a vowel sign pronounced o. This can lead to inconsistent spellings, eg. bhalo, good, well, can be spelled either ভালো or ভাল. Verb forms tend to be particularly inconsistent, sometimes basing the rationale on what looks good in a particular context.

+

The rules for determining the sound of the inherent vowel are not simple. Partly it is a question of vowel harmony. The following two +tendencies can help:

-
-
-
a
-
vs
-
-

[U+09BE BENGALI VOWEL SIGN AA], eg. কাটা.

-
-
-
-
 
-
s
-
-

[U+0986 BENGALI LETTER AA], eg. আকাশ.

-
-
-
- + +
    +
  • +

    In words with inherent vowels in two consecutive syllables, the sound will usually be ɔ..o, not o..ɔ, eg. গরম However, exceptions occur for prefixes, such as prɔ-, ɔ-, and sɔ-.r,8

    +
  • +
  • +

    When pronounced at the end of a word after a conjunct consonant, the inherent vowel is always o,r,8 eg. যুদ্ধ

    +
  • +
  • +

    The pronunciation tends to be o when followed by one of i, j, u, w either immediately or in the next syllable, but ɔ otherwise.d,400

    +
  • +
+
+

Vowel signs

+

+

Non-inherent vowel sounds that follow a consonant are represented using vowel signs, eg.

+

কী ki [U+0995 BENGALI LETTER KA + U+09C0 BENGALI VOWEL SIGN II]

-
-

Diphthongs and other combinations

+

Bengali vowel signs are nearly all combining characters. One consonant is also used in a special configuration described below. In principle a single character is used per base consonant, but 2 vowel signs decompose to more than one character (see circumgraphs). All vowel signs are typed and stored after the base consonant, whether or not they precede it when displayed. The glyph rendering system takes care of the positioning at display time.

-
-
-
ii̯
-
o
-
-

িই [U+09BF BENGALI VOWEL SIGN I + U+0987 BENGALI LETTER I], eg. পাখিই.

-
-
-
-
iu̯
-
o
-
-

িউ [U+09BF BENGALI VOWEL SIGN I + U+0989 BENGALI LETTER U], eg. পারফিউম.

-
-
-
-
ui̯
-
o
-
-

ুই [U+09C1 BENGALI VOWEL SIGN U + U+0987 BENGALI LETTER I], eg. বাবুই.

-
-
-
+

An orthography that uses vowel signs is different from one that uses simple diacritics or letters for vowels, in that the vowel signs are generally rendered relative to an orthographic syllable, rather than just applied to the letter of the immediately preceding consonant (see prebase_vowels for an example).

+

Almost all of the vowel signs are spacing combining characters, meaning that they consume horizontal space when added to a base consonant.

-
-
-
ei̯
-
o
-
-

েই [U+09C7 BENGALI VOWEL SIGN E + U+0987 BENGALI LETTER I], eg. পারেই.

-
-
-
-
eu̯
-
o
-
-

েউ [U+09C7 BENGALI VOWEL SIGN E + U+0989 BENGALI LETTER U], eg. যেকেউ.

-
-
-
-
oi̯
-
o
-
-

[U+09C8 BENGALI VOWEL SIGN AI], eg. তাথৈ.

-

-ই [U+0987 BENGALI LETTER I], eg. তেরই.

-
-
-
-
 
-
s
- -
-
-
ou̯
-
vs
-
-

[U+09CC BENGALI VOWEL SIGN AU], eg. ধৌত

-
-
-
-
 
-
s
-
-

[U+0994 BENGALI LETTER AU], eg. ঔষুধ.

-
-
-
-
oo̯
-
o
-
-

-ও [U+0993 BENGALI LETTER O], eg. দুঃখও

-
-
-
+

See also vowelligatures and vocalics.

+
-
-
-
ɔe̯
-
 o
-
-

-য় [U+09AF BENGALI LETTER YA + U+09BC BENGALI SIGN NUKTA], eg. ভয়

-
-
-
-
ɔo̯
-
o
-
-

-ও [U+0993 BENGALI LETTER O], ], eg. তওবার

-
-
-
+
+

Combining marks used for vowels

- -
-
-
ai̯
-
o
-
-

াই [U+09BE BENGALI VOWEL SIGN AA + U+0987 BENGALI LETTER I], eg. অনলাইন.

-
-
-
-
au̯
-
o
-
-

াউ [U+09BE BENGALI VOWEL SIGN AA + U+0989 BENGALI LETTER U], eg. একাউন্ট.

-
-
-
-
ae̯
-
o
- -
-
-
ao̯
-
o
-
-

াও [U+09BE BENGALI VOWEL SIGN AA + U+0993 BENGALI LETTER O], eg. এছাড়াও.

-
-
-
-
-
- - - - - - -
- - - - - -
-

Inherent vowel

-

-

The inherent vowel is typically transcribed as a, and pronounced as ɔ or o. (And sometimes halfway between these -two, when influenced by surrounding sounds.) Bengalis are not always aware of these sound differences – thinking of this as one sound. So or ko are written by simply using the consonant letter [U+0995 BENGALI LETTER KA].

-

There is also a vowel-sign pronounced o. This can lead to inconsistent spellings, eg. bhalo, good, well, can be spelled either ভালো or ভাল. Verb forms tend to be particularly inconsistent, sometimes basing the rationale on what looks good in a particular - context.

-

The rules for determining the sound of the inherent vowel are not simple. Partly it is a question of vowel harmony. The following two - tendencies can help:

-
    -
  • -

    In words with inherent vowels in two consecutive syllables, the sound will usually be ɔ..o, not o..ɔ, eg. গরম However, exceptions occur for prefixes, such as prɔ-, ɔ-, and sɔ-.r,8

    -
  • -
  • -

    When pronounced at the end of a word after a conjunct consonant, the inherent vowel is always o,r,8 eg. যুদ্ধ

    -
  • -
  • -

    The pronunciation tends to be o when followed by a one of i, j, u, w either immediately or in the next syllable, but ɔ otherwise.d,400

    -
  • -
-
- - - - -
-

Vowel-signs

-

- -

Non-inherent vowel sounds that follow a consonant are represented using vowel-signs, eg. ki is written কী [U+0995 BENGALI LETTER KA + U+09C0 BENGALI VOWEL SIGN II].

- - -

An orthography that uses vowel-signs is different from one that uses simple diacritics or letters for vowels, in that the vowel-signs are generally rendered relative to an orthographic syllable, rather than just applied to the letter of the immediately preceding consonant (see prebase_vowels for an example).

-

Bengali vowel signs are nearly all combining characters. One consonant is also used in a special configuration described below. In principle a single character is used per base consonant, but 2 vowel-signs decompose to more than one character (see circumgraphs). All vowel-signs are typed and stored after the base consonant, whether or not they precede it when displayed. The glyph rendering system takes care of the positioning at display time.

-

Almost all of the vowel-signs are spacing combining characters, meaning that they consume horizontal space when added to a base consonant.

-

See also vocalics.

-
- - - - - - - -
-

Combining marks used for vowels

- -

Bangla uses the following dedicated combining marks for vowels.

+

Bangla uses the following dedicated combining marks for vowels.

ি␣ী␣ু␣ূ␣ে␣ো␣া␣ ␣ৈ␣ৌ

Bengali has lost the distinction between short and long vowels in pronunciation, but retains the difference in spelling.

-

The variation in pronunciation for the vowel-signs can often be explained by vowel harmony.

-

[U+09CB BENGALI VOWEL SIGN O] was originally pronounced ʊ, and that pronunciation sometimes persists alongside the o that came from Sanskrit, eg. নোংরা

+ +

The variation in pronunciation for the vowel signs can often be explained by vowel harmony.

+ +

    [U+09CB BENGALI VOWEL SIGN O] was originally pronounced ʊ, and that pronunciation sometimes persists alongside the o that came from Sanskrit, eg. নোংরা

@@ -1236,8 +944,11 @@

Consonants used for vowels

When it occurs as the last member of a consonant cluster [U+09AF BENGALI LETTER YA] has a special shape seen in orange in fig_yophala, and is called ʤɔ-pfɔlɑ (য-ফলা). One of its functions is to create the sound æ.

-হ্যাঁ -
The word হ্যাঁ, which creates the sound æ with the sequence ্যা [U+09CD SIGN VIRAMA + U+09AF LETTER YA + U+09BE VOWEL SIGN AA].
+হ্যাঁ +
+
The sound æ, created with the sequence ্যা [U+09CD SIGN VIRAMA + U+09AF LETTER YA + U+09BE VOWEL SIGN AA].
+
details

হ্যাঁ

+

There are exceptions to the previous rule, when the a-kar produces its normal value, eg. @@ -1255,7 +966,7 @@

Consonants used for vowels

অ্যাটর্নি এ্যাডভোকেট

- +

See also composite_vowels, which explains how independent vowels are used for the off-glide of diphthongs.

@@ -1265,26 +976,43 @@

Consonants used for vowels

-

Pre-base vowel-signs

+

Pre-base vowel signs

ি␣ে␣ ␣ৈ
-

Three vowel-signs appear to the left of the base consonant letter or cluster.

+

Three vowel signs appear to the left of the base consonant letter or cluster, eg. +বাকি +

-
+ + +

These combining marks are always stored after the base consonant, ie. the codepoints follow the order in which the items are pronounced. The rendering process places the glyph before the base consonant without changing the code points.
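
The storage order can be checked with a small sketch like the following, which lists the code points of বাকি in memory order. The helper name `codePoints` is hypothetical and only serves this illustration.

```javascript
// List the code points of a string as "U+XXXX" labels, in stored order.
function codePoints(text) {
  return Array.from(text, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0"));
}

// বাকি: BA, VOWEL SIGN AA, KA, VOWEL SIGN I.
const baki = "\u09AC\u09BE\u0995\u09BF";
const stored = codePoints(baki);
console.log(stored.join(" ")); // U+09AC U+09BE U+0995 U+09BF

// The pre-base vowel sign I is stored after KA,
// although its glyph is drawn to the left of KA.
const kaIndex = stored.indexOf("U+0995");
const iIndex = stored.indexOf("U+09BF");
console.log(kaIndex < iIndex); // true
```

Any reordering seen on screen is therefore the work of the font and rendering system, not of the stored text.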

+ +

The vowel sign is actually placed before the start of an orthographic syllable. In fig_prebase the sequence of glyphs for the orthographic syllable is rendered VCC, whereas the pronunciation is CCV. In conjuncts with 3 consonants, it will still be rendered before all the consonants.

+ + +
+ব্যক্তি +ব্যক্তি +
+
Two examples of a prebase vowel, pronounced after a consonant cluster, but rendered to the left of the conjunct.
+
details

ব্যক্তি

প্লিজ

+
-

These combining marks are always stored after the base consonant. The font and rendering places the glyph before the base consonant.

-

Because vowel-signs are attached to the syllable. rather than a letter, a pre-base vowel glyph that represents a vowel sound after a consonant cluster is displayed before the whole cluster if that cluster is represented by a conjunct form, eg. compare -ব্যক্তি -বালতি -

-

However, note that if the cluster is split by a visible virama, this creates two syllables and the pre-base vowel-sign appears after the consonant with the virama. If you click on the examples below, you'll see that the characters and code point orders are the same, apart from the addition of the ZWNJ in the second instance to force the virama to appear. -প্লিজ -প্‌লিজ -

+ +

However, if the cluster doesn't form a conjunct, or is split by a visible virama, the cluster becomes two orthographic syllables and the pre-base vowel sign appears after the last consonant with the virama. The sequence of displayed glyphs is now CVC. If the conjunct contains 3 consonants, the displayed order will be CCVC.
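
One way to see the difference between the conjunct and split spellings is to build both strings explicitly. The sketch below assumes the split form is produced by placing U+200C ZERO WIDTH NON-JOINER after the virama, the usual Unicode mechanism for forcing a visible virama; the constant names are illustrative.

```javascript
const PA = "\u09AA";
const HASANT = "\u09CD"; // virama
const LA = "\u09B2";
const SIGN_I = "\u09BF";
const JA = "\u099C";
const ZWNJ = "\u200C"; // assumption: used here to force a visible virama

// প্লিজ with a conjunct, and the same word with the cluster split.
const conjunctForm = PA + HASANT + LA + SIGN_I + JA;
const splitForm = PA + HASANT + ZWNJ + LA + SIGN_I + JA;

// Apart from the ZWNJ, the code points and their order are identical.
console.log(splitForm.replace(ZWNJ, "") === conjunctForm); // true
console.log(Array.from(conjunctForm).length); // 5
console.log(Array.from(splitForm).length); // 6
```

The vowel sign sits at the same place in both code point sequences; only the rendered position of its glyph changes.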

+
+ব্যক্তি +ব্যক্তি +
+
Two examples of the same prebase vowel, pronounced after consonant clusters that don't form conjuncts, and rendered to the left of the last consonant in the cluster.
+
details

বালতি

প্‌লিজ

+
+
@@ -1296,16 +1024,25 @@

Pre-base vowel-signs

Circumgraphs

+
ো␣ ␣ৌ
+

Two vowels are represented by circumgraphs, producing glyphs on opposite sides of the consonant onset.

-
ো␣ ␣ৌ
-
-কৌ -
The single code point representing the vowel-sign in কৌ [U+0995 BENGALI LETTER KA + U+09CC VOWEL SIGN AU] is rendered on two sides of the base character.
+
+অক্টোবর +
+
The circumgraph U+09CB BENGALI VOWEL SIGN O is rendered on two sides of the consonant stack.
+
details

অক্টোবর

+
-

These circumgraph vowel-signs are normally a single code point, but in decomposed text the base consonant can be followed by two code points , eg. compare the following equivalent ways of writing ko:

+ + +

These circumgraph vowel signs are normally a single code point, but in decomposed text the base consonant can be followed by two code points, eg. compare the following equivalent ways of writing ko:

@@ -1326,7 +1063,7 @@

Circumgraphs

Composite vowels & diphthongs

Composite vowels in Bengali may occur when [U+09CB BENGALI VOWEL SIGN O] and [U+09CC BENGALI VOWEL SIGN AU] are decomposed, however this is not common. See encoding for more details.
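
The equivalence of composed and decomposed circumgraphs can be sketched with Unicode normalisation. The decomposition mappings used here (U+09CB to U+09C7 + U+09BE, and U+09CC to U+09C7 + U+09D7) come from the Unicode character data, not from this page, and `sameText` is a hypothetical helper.

```javascript
const KA = "\u0995";
const composedO = KA + "\u09CB";          // KA + VOWEL SIGN O
const decomposedO = KA + "\u09C7\u09BE";  // KA + VOWEL SIGN E + VOWEL SIGN AA
const composedAU = KA + "\u09CC";         // KA + VOWEL SIGN AU
const decomposedAU = KA + "\u09C7\u09D7"; // KA + VOWEL SIGN E + AU LENGTH MARK

// Compare two strings after bringing both to NFC.
function sameText(a, b) {
  return a.normalize("NFC") === b.normalize("NFC");
}

console.log(composedO === decomposedO);                  // false
console.log(sameText(composedO, decomposedO));           // true
console.log(sameText(composedAU, decomposedAU));         // true
console.log(composedO.normalize("NFD") === decomposedO); // true
```

A plain string comparison treats the two spellings as different, so searching or matching code may want to normalise text first.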

-

Although 2 of the vowel-signs (and independent vowels) represent diphthongs (oi̯ and ou̯) with a single code point, most of the many diphthongs are represented by a sequence of vowels,wa,#Vowels eg. কেউ

+

Although 2 of the vowel signs (and independent vowels) represent diphthongs (oi̯ and ou̯) with a single code point, most of the many diphthongs are represented by a sequence of vowels,wa,#Vowels eg. কেউ

Diphthongs typically represent the off-glide using one of the following:

  • [U+0987 BENGALI LETTER I]
  • @@ -1396,10 +1133,10 @@

    Nasalisation

    The candrabindu should be placed over the top of an independent vowel, but over the base consonant when a vowel sign is attached – not over the vowel sign. In the code point sequence, however, this should occur after any combining vowel sign associated with the same syllable. Note how the base consonant is identified correctly in the second word of fig_candrabindu, even though the candrabindu is 4 code points away. Some fonts do not position the candrabindu correctly.
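
A minimal sketch of a check for this ordering follows. It only flags a candrabindu that is immediately followed by a vowel sign; the function name and the choice of vowel-sign code points are assumptions made for the example.

```javascript
const CANDRABINDU = "\u0981";
// Bengali vowel signs AA..VOCALIC RR, E, AI, O, AU.
const VOWEL_SIGNS = /[\u09BE-\u09C4\u09C7\u09C8\u09CB\u09CC]/;

// Returns true when no candrabindu is immediately followed by a vowel sign,
// ie. the candrabindu comes after the vowel sign of its syllable.
function candrabinduFollowsVowelSign(text) {
  const chars = Array.from(text);
  return !chars.some((ch, i) =>
    ch === CANDRABINDU &&
    i + 1 < chars.length &&
    VOWEL_SIGNS.test(chars[i + 1]));
}

// হ্যাঁ: HA, VIRAMA, YA, VOWEL SIGN AA, CANDRABINDU.
const hyan = "\u09B9\u09CD\u09AF\u09BE\u0981";
// The same code points with candrabindu and vowel sign swapped.
const misordered = "\u09B9\u09CD\u09AF\u0981\u09BE";

console.log(candrabinduFollowsVowelSign(hyan));       // true
console.log(candrabinduFollowsVowelSign(misordered)); // false
// The candrabindu is 4 code points away from the base consonant HA.
console.log(Array.from(hyan).indexOf(CANDRABINDU));   // 4
```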

    -
    +
    -হাঁপান -হ্যাঁ +হাঁপান +হ্যাঁ
    The candrabindu is positioned over the base consonant, even though it is the last code point in the syllable. (The arrow gives the approximate location of the code point. Click to see the code point sequence.)
    @@ -1409,57 +1146,6 @@

    Nasalisation

    -
    -

    Vowel ligatures

    -

    Sometimes vowel signs (particularly U) form ligatures with a preceding base consonant. fig_lig shows ligated (top) and non-ligated (bottom) forms for several combinations. In certain contexts it may be less appropriate to ligate (eg. newspapers and modern typefaces). Both forms are equivalent in every way but visually.

    - - -
    -রু র‌ু -রূ র‌ূ -হৃ হ‌ৃ -হু হ‌ু -ন্তু ন্ত‌ু -শু শ‌ু -গু গ‌ু -
    Ligated (top) and non-ligated (bottom) forms for several combinations of consonant+vowel.
    -
    - - - - - - - - -

    The default behaviour of a given font can be modified using the zero-width non-joiner character before the vowel to produce the simple form. See fig_ligated_simplified.

    - -
    - -
    Example code sequences for ligated and simplified forms of a consonant+vowel combination.
    -
    - - - -

    See a matrix of consonants followed by vowel-signs for Bengali. A few clusters that are often ligated are pre-highlighted.

    -
    - - @@ -1475,28 +1161,6 @@

    Consonants with no following vowel

    Consonant clusters that are represented by conjunct forms use the hasant between consonants to invoke the shape changes. If the font has the glyphs needed to produce the conjunct the hasant is hidden (see clusters).

    Refs: Radice 3, 7-8, 21, 148; Daniels 400

-
- - - - - -
-

Vocalics

-

- -
ঋ␣ৃ
- -

Only one vocalic is in common use for modern Bangla. It is used in standalone and vowel-sign forms, eg. -বৃহৎ -ঋতু -

-
-

Three more vocalics, [U+098C BENGALI LETTER VOCALIC L], [U+09E0 BENGALI LETTER VOCALIC RR] and [U+09E1 BENGALI LETTER VOCALIC LL], are historic and only used to write Sanskrit in Bengali.

- -
ৠ␣ঌ␣ৡ␣ৄ␣ৢ␣ৣ
-
-
@@ -1504,204 +1168,179 @@

Vocalics

-
-

Consonants

+
+

Vowel sounds to characters

+

This section maps Bengali vowel sounds to common graphemes in the Bengali orthography, grouped by whether they are vowel signs ( vs ), or standalone ( s ). Click on a grapheme to find other mentions on this page (links appear at the bottom of the page). Click on the character name to see examples and for detailed descriptions of the character(s) shown.

+

Hyphens indicate a consonant with inherent vowel.

-
-

Consonant sounds to characters

- - -

This section maps Bengali consonant sounds to common graphemes in the Bengali orthography, grouped according to whether they are regular ( r ), conjuncts ( c ), or final ( f ). Click on a grapheme to find other mentions on this page (links appear at the bottom of the page). Click on the character name to see examples and for detailed descriptions of the character(s) shown.

+
+

Plain vowels

-
-

Stops

-
-
-
p
-
r
-
-

[U+09AA BENGALI LETTER PA], eg. পথ.

-
-
-
-
b
-
r
-
-

[U+09AC BENGALI LETTER BA] (when not in a conjunct), eg. বড়.

-
-
+
-
pʰ/pf
-
r
+
i
+
vs
-
-
r
+
 
+
s
-

[U+09AD BENGALI LETTER BHA], eg. ভাল.

+

[U+0987 BENGALI LETTER I]

+

[U+0988 BENGALI LETTER II]

+

[U+098B BENGALI LETTER VOCALIC R] as part of the vocalic ri

-
t
-
r
+
u
+
vs
-

[U+09A4 BENGALI LETTER TA], eg. তারা.

+

[U+09C1 BENGALI VOWEL SIGN U]

+

[U+09C2 BENGALI VOWEL SIGN UU]

+

[U+09CB BENGALI VOWEL SIGN O] with vowel harmony before one of i u

 
-
f
+
s
-

[U+09CE BENGALI LETTER KHANDA TA], in syllable-final positions, eg. হঠাৎ and উৎসব.

+

[U+0989 BENGALI LETTER U]

+

[U+098A BENGALI LETTER UU]

+

[U+0993 BENGALI LETTER O] with vowel harmony before one of i u

- + + +
-
d
-
r
+
e
+
vs
-

[U+09A6 BENGALI LETTER DA], eg. দাদী.

+

[U+09C7 BENGALI VOWEL SIGN E], with vowel harmony before one of i u

+

ি [U+09BF BENGALI VOWEL SIGN I] with vowel harmony before one of ɔ o e a

+

্য [U+09CD BENGALI SIGN VIRAMA + U+09AF BENGALI LETTER YA] with the inherent vowel before i

-
-
c
+
 
+
s
-

দ্ম [U+09A6 BENGALI LETTER DA + U+09CD BENGALI SIGN VIRAMA + U+09AE BENGALI LETTER MA], eg. পদ্ম.

-

দ্ব [U+09A6 BENGALI LETTER DA + U+09CD BENGALI SIGN VIRAMA + U+09AC BENGALI LETTER BA], eg. দ্বিতীয়.

-

দ্ব্য [U+09A6 BENGALI LETTER DA + U+09CD BENGALI SIGN VIRAMA + U+09AC BENGALI LETTER BA + U+09CD BENGALI SIGN VIRAMA + U+09AF BENGALI LETTER YA] 

+

[U+098F BENGALI LETTER E], with vowel harmony before one of i u

+

[U+0987 BENGALI LETTER I] with vowel harmony before one of ɔ o e a.

-
-
r
+
+
vs
-

[U+09A5 BENGALI LETTER THA], eg. থামা.

+

য় [U+09AF BENGALI LETTER YA + U+09BC BENGALI SIGN NUKTA] as part of a diphthong, esp after ɔ a o

-
tːʰ
-
c
+
o
+
vs
-

থ্ব [U+09A5 BENGALI LETTER THA + U+09CD BENGALI SIGN VIRAMA + U+09AC BENGALI LETTER BA]

+

Inherent vowel

+

[U+09CB BENGALI VOWEL SIGN O]

+

[U+09C1 BENGALI VOWEL SIGN U] with vowel harmony before one of ɔ o e a

-
-
r
+
 
+
s
-

[U+09A7 BENGALI LETTER DHA], eg. ধন্যবাদ.

+

[U+0993 BENGALI LETTER O]

+

[U+0985 BENGALI LETTER A] with vowel harmony before one of i u

+

[U+0989 BENGALI LETTER U], with vowel harmony before one of ɔ o e a

- + +
-
ɖ
-
r
+
 
+
s
-

[U+09A1 BENGALI LETTER DDA], eg. ডাক্তার.

+

[U+098F BENGALI LETTER E]

-
ʈʰ
-
r
+
ɔ
+
vs
-

[U+099F BENGALI LETTER TTA], eg. টাকা.

+

Inherent vowel

+

[U+09CB BENGALI VOWEL SIGN O] with vowel harmony before one of ɔ o e a 

-
ɖʰ
-
r
+
 
+
s
-

[U+09A2 BENGALI LETTER DDHA], eg. ঢেউ.

+

[U+0985 BENGALI LETTER A] 

+

[U+0993 BENGALI LETTER O] with vowel harmony before one of ɔ o e a

-
-
k
-
r
-
-

[U+0995 BENGALI LETTER KA], eg. কলম.

-
+ + + + +
-
-
r
+
æ
+
vs
-

[U+0996 BENGALI LETTER KHA],  eg. খবর.

+

[U+09BE BENGALI VOWEL SIGN AA] after জ্ঞ [U+099C BENGALI LETTER JA + U+09CD BENGALI SIGN VIRAMA + U+099E BENGALI LETTER NYA]

+

্যা [U+09CD BENGALI SIGN VIRAMA + U+09AF BENGALI LETTER YA + U+09BE BENGALI VOWEL SIGN AA] sometimes

+

্য [U+09CD BENGALI SIGN VIRAMA + U+09AF BENGALI LETTER YA] with the inherent vowel and not followed by i

+

[U+09C7 BENGALI VOWEL SIGN E], with vowel harmony before one of ɔ o e a

 
-
c
+
s
-

ক্ষ [U+0995 BENGALI LETTER KA + U+09CD BENGALI SIGN VIRAMA + U+09B7 BENGALI LETTER SSA] when initial, eg. ক্ষুদ্র.

-
+

[U+098F BENGALI LETTER E] with vowel harmony before one of ɔ o e a

- + + + + +
-
ɡ
-
r
+
a
+
vs
-

[U+0997 BENGALI LETTER GA], eg. গতকাল.

+

[U+09BE BENGALI VOWEL SIGN AA]

 
-
c
-
-

জ্ঞ [U+099C BENGALI LETTER JA + U+09CD BENGALI SIGN VIRAMA + U+099E BENGALI LETTER NYA] when word-initial, eg. জ্ঞান.

-
-
-
-
ɡː
-
c
-
-

জ্ঞ [U+099C BENGALI LETTER JA + U+09CD BENGALI SIGN VIRAMA + U+099E BENGALI LETTER NYA] when between vowels, eg. বিজ্ঞান.

-
-
-
-
ɡʰ
-
r
+
s
+
@@ -1709,258 +1348,188 @@

Stops

-
-

Affricates

-
-
-
t͡ʃ
-
r
-
-

[U+099A BENGALI LETTER CA], eg. চক্র.

-
-
-
-
t͡ʃʰ
-
r
-
-

[U+099B BENGALI LETTER CHA], eg. ছাতা.

-
-
+ +
+

Diphthongs and other combinations

+ + -
- - - -
-

Fricatives

-
+
-
f
-
 
+
ei̯
+
o
-
 
-
c
+
oi̯
+
o
-
ʃ
-
r
+
 
+
s
-

[U+09B6 BENGALI LETTER SHA], eg. শাপ.

-

[U+09B7 BENGALI LETTER SSA],  eg. ষড়যন্ত্র.

-

[U+09B8 BENGALI LETTER SA],  eg. সকাল.

+

[U+0990 BENGALI LETTER AI]

+

ওই [U+0993 BENGALI LETTER O + U+0987 BENGALI LETTER I]

-
h
-
r
+
 
+
s
-
 
-
f
+
oo̯
+
o
-

[U+0983 BENGALI SIGN VISARGA], eg. বাঃ.

+

-ও [U+0993 BENGALI LETTER O] 

-
-
-

Nasals

-
-
-
m
-
r
-
-

[U+09AE BENGALI LETTER MA], eg. মজার.

-
-
+
-
-
n
-
r
-
-

[U+09A8 BENGALI LETTER NA], eg. নদী.

-

[U+099E BENGALI LETTER NYA], very rare outside conjuncts, eg. গাঞি.

-

[U+09A3 BENGALI LETTER NNA], eg. হরিণ.

-
+ + + + +
+
+
+
+

Vocalics

+

-
-

Other

-
-
-
w
-
r
-
-

ওয় [U+0993 BENGALI LETTER O + U+09AF BENGALI LETTER YA + U+09BC BENGALI SIGN NUKTA], (light, like French 'oui') between o...a , eg. দাওয়াত.

-
-
-
-
r ɾ
-
r
-
-

[U+09B0 BENGALI LETTER RA], eg. রওয়া.

-

[U+09C3 BENGALI VOWEL SIGN VOCALIC R] as part of the vocalic ri, eg. বৃহৎ

-

[U+098B BENGALI LETTER VOCALIC R] as part of the standalone vocalic ri, eg. ঋতু

-
-
-
-
ɽ(ʰ)
-
r
-
-

ড় [U+09A1 BENGALI LETTER DDA + U+09BC BENGALI SIGN NUKTA], eg. ওড়া.

-

ঢ় [U+09A2 BENGALI LETTER DDHA + U+09BC BENGALI SIGN NUKTA], eg. আষাঢ়.

-
-
-
-
l
-
r
-
-

[U+09B2 BENGALI LETTER LA], eg. লওয়া.

-
-
- -
-
j
-
r
-
-

য় [U+09AF BENGALI LETTER YA + U+09BC BENGALI SIGN NUKTA], between i...e, a...u, or e...e, eg. আষাঢ়.

-
-
- +
ঋ␣ৃ
+ +

Only one vocalic is in common use for modern Bangla. It is used in standalone and vowel sign forms, eg. +বৃহৎ +ঋতু +

+
+

Three more vocalics, [U+098C BENGALI LETTER VOCALIC L], [U+09E0 BENGALI LETTER VOCALIC RR] and [U+09E1 BENGALI LETTER VOCALIC LL], are historic and only used to write Sanskrit in Bengali.

+ +
ৠ␣ঌ␣ৡ␣ৄ␣ৢ␣ৣ
-
-
+ + +
+

Consonants

+ + + +
+

The 33 consonant letters used for Bangla are supplemented by repertoire extensions for 3 more sounds, created by applying the nukta diacritic to existing characters.

+ +

Consonant clusters at any location are normally indicated using a virama (hasant) between consonants. This results in a large number of conjunct forms expressed using stacked consonants, conjoined consonants, and ligated glyphs. Conjuncts often have different pronunciations than might be expected from the letters involved, and in particular gemination is very common. Occasionally, a visible virama is used. However, clusters are often not marked at all.

+ +

As part of a cluster, RA has special forms, for both cluster-initial and post-base positions.

+ +

Word-final consonant sounds may be represented by a special letter, ৎ [U+09CE BENGALI LETTER KHANDA TA], or by 2 dedicated combining marks (anusvara & visarga), but are generally ordinary consonants that are not marked by a virama.

+
@@ -2285,7 +1854,7 @@

Ligating conjuncts

Special forms

-

Cluster-initial RA [U+09B0 BENGALI LETTER RA] at the start of a cluster is normally displayed as a mark above the following consonant(s), eg. rt in গর্ত gɔrtô hole. Unlike Devanagari, it doesn't appear to be displayed above the vowel-sign of the orthographic syllable, eg. +

Cluster-initial RA [U+09B0 BENGALI LETTER RA] at the start of a cluster is normally displayed as a mark above the following consonant(s), eg. rt in গর্ত gɔrtô hole. Unlike Devanagari, it doesn't appear to be displayed above the vowel sign of the orthographic syllable, eg. কুর্তা

@@ -2358,44 +1927,497 @@

Visible virama

- - - - -
-

Formatting characters

-

ZWNJ [U+200C ZERO WIDTH NON-JOINER] (ZWNJ) can be used to force the production of a visible virama, rather than a half-form (see visiblevirama). It can also be used to prevent the formation of vowel ligatures (see vowelligatures).

-

ZWJ [U+200D ZERO WIDTH JOINER] (ZWJ) is used to produce special joining forms for RA + YA (see consonant_syllable).

-
+ + + + +
+

Formatting characters

+

ZWNJ [U+200C ZERO WIDTH NON-JOINER] (ZWNJ) can be used to force the production of a visible virama, rather than a half-form (see visiblevirama). It can also be used to prevent the formation of vowel ligatures (see vowelligatures).

+

ZWJ [U+200D ZERO WIDTH JOINER] (ZWJ) is used to produce special joining forms for RA + YA (see consonant_syllable).
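The two uses of the joiners described above can be sketched as plain code point sequences. This is an illustration only: the constant and function names are invented for the example, the KA + TA pair is an arbitrary choice, and what is actually displayed depends on the font and the rendering engine.

```python
# Code points named in the text above.
KA = "\u0995"      # BENGALI LETTER KA (arbitrary example consonant)
TA = "\u09A4"      # BENGALI LETTER TA (arbitrary example consonant)
RA = "\u09B0"      # BENGALI LETTER RA
YA = "\u09AF"      # BENGALI LETTER YA
VIRAMA = "\u09CD"  # BENGALI SIGN VIRAMA (hasant)
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"     # ZERO WIDTH JOINER

# Default sequence: a font with a suitable glyph may render this as a conjunct.
conjunct = KA + VIRAMA + TA

# ZWNJ after the virama asks for a visible virama instead of a conjunct.
visible_virama = KA + VIRAMA + ZWNJ + TA

# ZWJ between RA and the virama asks for the special RA + YA joining form.
ra_with_ya = RA + ZWJ + VIRAMA + YA


def code_points(text):
    """Return the code points of text in U+XXXX notation, space separated."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


for label, seq in [("conjunct", conjunct),
                   ("visible virama", visible_virama),
                   ("RA + YA special form", ra_with_ya)]:
    print(f"{label}: {code_points(seq)}")
```

Printing the code points rather than the text itself keeps the example independent of the font used to view it.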

+
+
+ + + + + + + + + + + + + +
+

Consonant lengthening

+

There are a number of ways of producing a lengthened consonant sound in Bangla.

+

A straightforward approach is to duplicate the consonant sound in conjunct form. For example, a long l can be written ল্ল [U+09B2 BENGALI LETTER LA + U+09CD SIGN VIRAMA + U+09B2 LETTER LA], eg. +ঝিল্লি +

+

Another common way of doubling the length of a consonant is to use a conjunct ending with [U+09AC BENGALI LETTER BA] or [U+09AE BENGALI LETTER MA], eg. +ভস্ম +বিশ্ব +

+

The y̌ɔ-phɔla (্য [U+09CD BENGALI SIGN VIRAMA + U+09AF BENGALI LETTER YA]) can also lengthen the consonant it follows, eg. +জন্য +

+

[U+0983 BENGALI SIGN VISARGA] can also lengthen the following consonant, with no aspiration, eg. +নিঃশব্দ
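As a rough illustration of the first strategy above, the following sketch lists the code points of a sequence and checks whether a word contains a consonant doubled across a virama. The helper names are invented for the example, and the consonant range U+0995..U+09B9 is an assumption used to keep the check short. The check only covers literal doubling; it does not detect lengthening produced by BA, MA, y̌ɔ-phɔla or visarga.

```python
import unicodedata

VIRAMA = "\u09CD"  # BENGALI SIGN VIRAMA


def describe(text):
    """List each code point in text with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


def is_consonant(ch):
    """Assumption: treat U+0995..U+09B9 as the Bengali consonant letters."""
    return "\u0995" <= ch <= "\u09B9"


def has_doubled_consonant(word):
    """True if some consonant is followed by virama and the same consonant."""
    for i in range(len(word) - 2):
        first, mid, last = word[i], word[i + 1], word[i + 2]
        if is_consonant(first) and mid == VIRAMA and last == first:
            return True
    return False


print(describe("\u09B2\u09CD\u09B2"))             # LA + VIRAMA + LA
print(has_doubled_consonant("ঝিল্লি"))   # doubled LA
print(has_doubled_consonant("জন্য"))     # NA + y̌ɔ-phɔla, not a literal doubling
```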

+
+ + + + + + + +
+

Consonant sounds to characters

+ + +

This section maps Bengali consonant sounds to common graphemes in the Bengali orthography, grouped according to whether they are regular ( r ), conjuncts ( c ), or final ( f ). Click on a grapheme to find other mentions on this page (links appear at the bottom of the page). Click on the character name to see examples and detailed descriptions of the character(s) shown.

+ + + +
+

Stops

+
+
+
p
+
r
+ +
+
+
b
+
r
+
+

[U+09AC BENGALI LETTER BA] (when not in a conjunct)

+
+
+
+
pʰ/pf
+
r
+ +
+
+
+
r
+ +
+
+
t
+
r
+ +
+
+
 
+
f
+
+

[U+09CE BENGALI LETTER KHANDA TA], in syllable-final positions

+
+
+ +
+
d
+
r
+ +
+ +
+
+
r
+ +
+ +
+
+
r
+ +
+ +
+
ʈ
+
r
+ +
+
+
ɖ
+
r
+ +
+
+
ʈʰ
+
r
+ +
+
+
ɖʰ
+
r
+ +
+
+
k
+
r
+ +
+
+
+
r
+
+

[U+0996 BENGALI LETTER KHA],  eg. খবর.

+
+
+ + +
+
ɡ
+
r
+ +
+
+
 
+
c
+ +
+
+
ɡː
+
c
+ +
+
+
ɡʰ
+
r
+ +
+
+
+ + + + + + +
+

Affricates

+ +
+ + + + + +
+

Fricatives

+
+
+

Nasals

+ +
- - -
-

Consonant lengthening

-

There are a number of ways of producing a lengthened consonant sound in Bangla.

-

A straightforward approach is to duplicate the consonant sound in conjunct form. For example, a long l can be written ল্ল [U+09B2 BENGALI LETTER LA + U+09CD SIGN VIRAMA + U+09B2 LETTER LA], eg. -ঝিল্লি -

-

Another common way of doubling the length of a consonant is to use a conjunct ending with [U+09AC BENGALI LETTER BA] or [U+09AE BENGALI LETTER MA], eg. -ভস্ম -বিশ্ব -

-

The y̌ɔ-phɔla (্য [U+09CD BENGALI SIGN VIRAMA + U+09AF BENGALI LETTER YA]) can also lengthen the consonant it follows, eg. -জন্য -

-

[U+0983 BENGALI SIGN VISARGA] can also lengthen the following consonant, with no aspiration, eg. -নিঃশব্দ

+
+

Other

+
+
+
w
+
r
+
+

ওয় [U+0993 BENGALI LETTER O + U+09AF BENGALI LETTER YA + U+09BC BENGALI SIGN NUKTA], (light, like French 'oui') between o...a

+
+
+
+
r ɾ
+
r
+
+

[U+09B0 BENGALI LETTER RA]

+

[U+09C3 BENGALI VOWEL SIGN VOCALIC R] as part of the vocalic ri

+

[U+098B BENGALI LETTER VOCALIC R] as part of the standalone vocalic ri

+
+
+ +
+
l
+
r
+ +
+ +
+
j
+
r
+
+

য় [U+09AF BENGALI LETTER YA + U+09BC BENGALI SIGN NUKTA], between i...e, a...u, or e...e

+
+
+ +
+
@@ -2410,16 +2432,16 @@

Consonant lengthening

Encoding choices

-

This section looks at alternative strategies for typing and storing vowel-signs and independent vowels used by Bangla, taking into consideration the effects of normalising the text using Unicode Normalisation Form D (NFD), and Normalisation Form C (NFC).

+

This section looks at alternative strategies for typing and storing vowel signs and independent vowels used by Bangla, taking into consideration the effects of normalising the text using Unicode Normalisation Form D (NFD), and Normalisation Form C (NFC).

-

Vowel-signs

+

Vowel signs

The 2 circumgraphs can be written as a single character, or as two characters (in decomposed text).

-

The single code point per vowel-sign is the form preferred by the Unicode Standard and the form in common use for Bengali. The parts are separated, however, in Unicode Normalisation Form D (NFD), and recomposed in Unicode Normalisation Form C (NFC), so both approaches are canonically equivalent.

+

The single code point per vowel sign is the form preferred by the Unicode Standard and the form in common use for Bengali. The parts are separated, however, in Unicode Normalisation Form D (NFD), and recomposed in Unicode Normalisation Form C (NFC), so both approaches are canonically equivalent.

-

Whichever approach is used, the vowel-signs must be typed and stored after the consonant characters they surround. In the case of decomposed vowel-signs, the order is also important and must be as shown above.

+

Whichever approach is used, the vowel signs must be typed and stored after the consonant characters they surround. In the case of decomposed vowel signs, the order is also important and must be as shown above.
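The canonical equivalence of the single and two-part encodings of the circumgraphs can be checked with Python's standard `unicodedata` module. This is a minimal sketch: the variable names are illustrative and KA is an arbitrary base consonant.

```python
import unicodedata

KA = "\u0995"       # BENGALI LETTER KA (arbitrary base consonant)
O_SIGN = "\u09CB"   # BENGALI VOWEL SIGN O (single code point)
AU_SIGN = "\u09CC"  # BENGALI VOWEL SIGN AU (single code point)

composed = KA + O_SIGN

# NFD separates the circumgraph into its two parts, left part first.
decomposed = unicodedata.normalize("NFD", composed)

# NFC puts the two parts back together into the single code point.
recomposed = unicodedata.normalize("NFC", decomposed)

for label, text in [("composed", composed),
                    ("NFD", decomposed),
                    ("NFC of NFD", recomposed)]:
    print(label, [f"U+{ord(ch):04X}" for ch in text])
```

Because the two forms are canonically equivalent, normalising both sides to the same form before comparing or searching avoids mismatches between text typed in the two ways.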

@@ -2524,7 +2546,7 @@

Codepoint sequences

Nuktas must immediately follow the base consonant they modify.

-

When 2 vowel-signs are used for a circumgraph, the encoded order of the combining marks should match the displayed order, left to right.

+

When 2 vowel signs are used for a circumgraph, the encoded order of the combining marks should match the displayed order, left to right.
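The nukta rule above can be expressed as a small check. This is a sketch under a stated assumption: the range U+0995..U+09B9 is used as a convenient stand-in for "base consonant", and the function name is invented for the example.

```python
NUKTA = "\u09BC"  # BENGALI SIGN NUKTA


def nukta_follows_consonant(text):
    """True if every nukta is immediately preceded by a consonant letter.

    Assumption: U+0995..U+09B9 is treated as the set of base consonants.
    """
    for i, ch in enumerate(text):
        if ch == NUKTA:
            if i == 0 or not ("\u0995" <= text[i - 1] <= "\u09B9"):
                return False
    return True


good = "\u09A1\u09BC\u09BE"  # DDA + nukta + vowel sign AA
bad = "\u09A1\u09BE\u09BC"   # DDA + vowel sign AA + nukta (nukta misplaced)

print(nukta_follows_consonant(good))
print(nukta_follows_consonant(bad))
```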

@@ -2657,31 +2679,66 @@

Cursive text

The cursive treatment doesn't produce significant variations of the essential part of the glyph for a character (unlike Arabic).

Non-cursive fonts are sometimes used, mainly as display fonts for books and article titles.

- + -
-

Context-based shaping

+
+

Context-based shaping & positioning

-

tbd

-

@ None of the characters require special shaping based on the visual context.

-

@ Many of the subjoined and post-fixed consonant forms have different shapes from the standard glyph for that character, for example na becomes   ᭄ᬦ.

-

@ In addition, many conjunct clusters combine characters with special shapes, or subtly change parts of glyphs to join smoothly. Often the changes are significant, especially the medial consonants, ya, ra, wa and la. For example, see the sequence <ba, adeg-adeg, ra, adeg-adeg, ya> in ᬩ᭄ᬭ᭄ᬬᬕ᭄ b͓r͓yg͓ (briag) laughter.

-

@ Combining vowel signs can also have different shapes depending on the context. For example, the vowel sign tedung typically ligates with the preceding consonant, eg. ha is but <ha, tedong> is ᬳᬵ and subjoined ya is   ᭄ᬬ but <consonant, adeg-adeg, ya, tedong> can be rendered as   ᭄ᬬᬵ.

+ + +
+

Vowel ligatures

+ +

Sometimes vowel signs (particularly U) form ligatures with a preceding base consonant. fig_lig shows ligated (top) and non-ligated (bottom) forms for several combinations. In certain contexts it may be less appropriate to ligate (eg. newspapers and modern typefaces). Both forms are equivalent in every way but visually.

+ + +
রু র‌ু রূ র‌ূ হৃ হ‌ৃ হু হ‌ু ন্তু ন্ত‌ু শু শ‌ু গু গ‌ু +
Ligated (top) and non-ligated (bottom) forms for several combinations of consonant+vowel.
+
+ + + + +

The default behaviour of a given font can be modified using the zero-width non-joiner character before the vowel to produce the simple form. See fig_ligated_simplified.

+ + +
+ +
Example code sequences for ligated and simplified forms of a consonant+vowel combination.
+
+ + +

See a matrix of consonants followed by vowel signs for Bengali. A few clusters that are often ligated are pre-highlighted.
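The ZWNJ technique for requesting the simple, non-ligated form can be sketched as follows. The function names are invented for the example. The second helper, which strips a ZWNJ only when a vowel sign in the range U+09BE..U+09CC follows it, is an assumption about how one might compare the two spellings; it is not something the text prescribes.

```python
import re

RA = "\u09B0"      # BENGALI LETTER RA
U_SIGN = "\u09C1"  # BENGALI VOWEL SIGN U
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER


def unligated(base, vowel_sign):
    """Insert ZWNJ between the base and the vowel sign to request the simple form."""
    return base + ZWNJ + vowel_sign


def without_ligature_blockers(text):
    """Drop a ZWNJ only where it directly precedes a vowel sign (U+09BE..U+09CC).

    A ZWNJ used after a virama is left alone, since it serves a different purpose.
    """
    return re.sub("\u200C(?=[\u09BE-\u09CC])", "", text)


ligated = RA + U_SIGN
simple = unligated(RA, U_SIGN)

print([f"U+{ord(ch):04X}" for ch in ligated])
print([f"U+{ord(ch):04X}" for ch in simple])
print(without_ligature_blockers(simple) == ligated)
```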

-
+ @@ -2725,7 +2782,7 @@

Grapheme clusters

  1. Nukta [1] (see extendedC) Only one per grapheme cluster, typed and stored immediately after the base consonant.
  2. Dependent vowels [10] (see combiningvowels and vocalics) Usually a single code point, but in decomposed text can be two (see circumgraphs).
  3. -
  4. Nasalisation mark [1] (see nasalisation) Occurs over an independent vowel, or over a consonant when it is followed by a vowel-sign. It is nevertheless typed and stored after any vowel-signs.
  5. +
  6. Nasalisation mark [1] (see nasalisation) Occurs over an independent vowel, or over a consonant when it is followed by a vowel sign. It is nevertheless typed and stored after any vowel signs.
  7. Final consonant marks [2] (see finals) One of 2 possible combining marks, at the end of a grapheme cluster sequence. May also occur after independent vowels.
  8. Virama (hasant) (see clusters and novowel) Normally occurs immediately after a consonant (and optional nukta) at the beginning of a cluster, but also occurs after independent vowels, particularly when writing the sound æ. It may also occur after RA+ZWJ to force a particular rendered shape for RA (see below).
@@ -2733,7 +2790,7 @@

Grapheme clusters

In some cases, a ZWJ is inserted between RA + hasant followed by YA in order to specify special shaping rules (see special_forms).

-

A base consonant may be followed by ZWNJ before vowel-sign code points where the author wants to prevent ligation of the following vowel sign (see vowelligatures). A ZWNJ may also be used after a virama to prevent conjunct formation and force the virama to be rendered visibly to the reader (see visiblevirama).

+

A base consonant may be followed by ZWNJ before vowel sign code points where the author wants to prevent ligation of the following vowel sign (see vowelligatures). A ZWNJ may also be used after a virama to prevent conjunct formation and force the virama to be rendered visibly to the reader (see visiblevirama).
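A very rough approximation of this grouping attaches every combining mark (Unicode general category starting with `M`) and every ZWJ or ZWNJ to the character before it. This is a sketch only, with an invented function name: it is not the Unicode text segmentation algorithm, and it deliberately does not join the consonants of a conjunct across a virama.

```python
import unicodedata

JOINERS = {"\u200C", "\u200D"}  # ZWNJ, ZWJ


def mark_clusters(text):
    """Group each base character with the combining marks and joiners after it."""
    clusters = []
    for ch in text:
        attaches = unicodedata.category(ch).startswith("M") or ch in JOINERS
        if clusters and attaches:
            clusters[-1] += ch       # extend the current cluster
        else:
            clusters.append(ch)      # start a new cluster
    return clusters


# বিজ্ঞান: BA + I | JA + VIRAMA | NYA + AA | NA
for cluster in mark_clusters("\u09AC\u09BF\u099C\u09CD\u099E\u09BE\u09A8"):
    print([f"U+{ord(ch):04X}" for ch in cluster])
```

Note that the virama ends up at the end of the cluster for JA, so the conjunct is split into two units here regardless of how a font renders it.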

@@ -2819,7 +2876,7 @@

Complicating factors

Behaviour is font-dependent. If a font doesn't have a conjunct form for a particular combination of characters it will make the virama visible.

What's important to note here is that it is normally possible to break a line after the grapheme cluster containing the virama when the virama is visible. This is currently difficult to manage because the decision as to whether the text is segmented into 2 graphemes or one depends only on the capabilities of the font used (ie. the rendered result); the code point sequence is identical in both cases, and gives no clue as to which approach to segmentation applies.

-

Visible viramas can also affect vowel-sign positioning. For the purposes of illustration, see fig_prebase_position, where the placement of the pre-base vowel varies. In the conjunct form on the left, the vowel-sign is rendered to the left of the whole conjunct. If the sequence is not rendered as a conjunct, as in the second example, the pre-base glyph precedes the TA, not the SA.

+

Visible viramas can also affect vowel sign positioning. For the purposes of illustration, see fig_prebase_position, where the placement of the pre-base vowel varies. In the conjunct form on the left, the vowel sign is rendered to the left of the whole conjunct. If the sequence is not rendered as a conjunct, as in the second example, the pre-base glyph precedes the TA, not the SA.

@@ -3174,8 +3231,8 @@

Prefixes and suffixes

Examples:

-
-
+
+
১. ২. ৩. diff --git a/bengali/images/fig_circumgraph.svg b/bengali/images/fig_circumgraph.svg new file mode 100644 index 000000000..4bc8fe43d --- /dev/null +++ b/bengali/images/fig_circumgraph.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/bengali/images/fig_prebase.svg b/bengali/images/fig_prebase.svg new file mode 100644 index 000000000..a5627c61a --- /dev/null +++ b/bengali/images/fig_prebase.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/bengali/images/fig_prebase1.svg b/bengali/images/fig_prebase1.svg new file mode 100644 index 000000000..9408d34aa --- /dev/null +++ b/bengali/images/fig_prebase1.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/bengali/images/fig_prebase2.svg b/bengali/images/fig_prebase2.svg new file mode 100644 index 000000000..181a096c2 --- /dev/null +++ b/bengali/images/fig_prebase2.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/bengali/images/fig_prebase_split1.svg b/bengali/images/fig_prebase_split1.svg new file mode 100644 index 000000000..6dd6d1ee9 --- /dev/null +++ b/bengali/images/fig_prebase_split1.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/bengali/images/fig_prebase_split2.svg b/bengali/images/fig_prebase_split2.svg new file mode 100644 index 000000000..25c27f6ba --- /dev/null +++ b/bengali/images/fig_prebase_split2.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/bengali/images/fig_yophala.svg b/bengali/images/fig_yophala.svg new file mode 100644 index 000000000..dbcb8a701 --- /dev/null +++ b/bengali/images/fig_yophala.svg @@ -0,0 +1 @@ + \ No newline at end of file