Hangeul4s

A functional library for Hangeul transliteration

Quick start

A snapshot of version 0.0.1 is available from the Sonatype snapshots repository. It is cross-built for Scala 2.11, 2.12 and 2.13. To use it, add the following to your build.sbt:

resolvers += Resolver.sonatypeRepo("snapshots")
libraryDependencies += "com.github.sophiecollard" %% "hangeul4s" % "0.0.1-SNAPSHOT"

Status

This project is currently under development.

Roadmap to first release

  • Implement romanization of jamo
  • Implement romanization of syllables
  • Implement conversion between jamo and syllables
  • Implement parsing of Hangeul text
  • Add CircleCI integration
  • Add Codecov integration
  • Cross-build for Scala 2.11, 2.12 and 2.13
  • Add Apache 2.0 licence

Examples

Single-word transliteration example

import hangeul4s.implicits._
import hangeul4s.model.hangeul.HangeulTextElement
import hangeul4s.model.romanization.RomanizedTextElement

val input = "안녕하세요"
// input: String = 안녕하세요

val output = for {
  parsed <- input.parseTo[HangeulTextElement]
  transliterated <- parsed.transliterateTo[RomanizedTextElement]
} yield transliterated.unparseTo[String]
// output: scala.util.Either[hangeul4s.error.Hangeul4sError,String] = Right(annyeonghaseyo)

Text transliteration example

import cats.implicits._
import hangeul4s.implicits._
import hangeul4s.model.hangeul.HangeulTextElement
import hangeul4s.model.romanization.RomanizedTextElement

// First sentence of the second paragraph of the Korean Wikipedia article on Seoul (retrieved 2019-09-22)
// See https://ko.wikipedia.org/wiki/%EC%84%9C%EC%9A%B8%ED%8A%B9%EB%B3%84%EC%8B%9C
val input = "시청 소재지는 중구이며, 25개의 자치구로 이루어져 있다."
// input: String = 시청 소재지는 중구이며, 25개의 자치구로 이루어져 있다.

val output = for {
  parsed <- input.parseToF[Vector, HangeulTextElement]
  transliterated <- parsed.transliterateToF[Vector, RomanizedTextElement]
} yield transliterated.unparseTo[String]
// output: scala.util.Either[hangeul4s.error.Hangeul4sError,String] = Right(sicheong sojaejineun jungguimyeo, 25gaeui jachiguro irueojyeo itda.)

Transliteration rules

This project implements the Revised Romanization of Korean. The transliteration rules currently supported are detailed in the tables below.

Vowels

Hangeul: ㅏ ㅐ ㅑ ㅒ ㅓ ㅔ ㅕ ㅖ ㅗ ㅘ ㅙ ㅚ ㅛ ㅜ ㅝ ㅞ ㅟ ㅠ ㅡ ㅢ ㅣ
Romanization: a ae ya yae eo e yeo ye o wa wae oe yo u wo we wi yu eu ui i

Initial consonants

Hangeul: ㄱ ㄲ ㄴ ㄷ ㄸ ㄹ ㅁ ㅂ ㅃ ㅅ ㅆ ㅈ ㅉ ㅊ ㅋ ㅌ ㅍ ㅎ
Romanization: g kk n d tt r m b pp s ss j jj ch k t p h

Final consonants

Hangeul: ㄱ ㄲ ㄳ ㄴ ㄵ ㄶ ㄷ ㄹ ㄺ ㄻ ㄼ ㄽ ㄾ ㄿ ㅀ ㅁ ㅂ ㅄ ㅅ ㅆ ㅇ ㅈ ㅊ ㅋ ㅌ ㅍ ㅎ
Romanization: k k k n n n t l k m p l l p l m p p t t ng t t k t p t
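
The three tables above can be read as lookup tables indexed by a syllable's position in the Unicode Hangul Syllables block (U+AC00 to U+D7A3), where every precomposed syllable encodes one of 19 initials, 21 vowels and 28 finals (counting "no final"). The following is a minimal standalone sketch of that idea, not hangeul4s's actual implementation. The names SyllableSketch, romanize and romanizeWord are hypothetical, the sketch assumes the tables follow Unicode jamo order with a silent initial ㅇ, and it ignores the special provisions for final / initial consonant pairs.

```scala
object SyllableSketch {
  // Romanizations in Unicode jamo order. The empty string at index 11 of
  // `initials` stands for the silent initial ㅇ (an assumption of this sketch).
  private val initials: Vector[String] = Vector(
    "g", "kk", "n", "d", "tt", "r", "m", "b", "pp", "s",
    "ss", "", "j", "jj", "ch", "k", "t", "p", "h"
  )
  private val vowels: Vector[String] =
    "a ae ya yae eo e yeo ye o wa wae oe yo u wo we wi yu eu ui i".split(" ").toVector
  // Index 0 means "no final consonant".
  private val finals: Vector[String] =
    "" +: "k k k n n n t l k m p l l p l m p p t t ng t t k t p t".split(" ").toVector

  private val Base = 0xAC00 // first precomposed syllable (가)
  private val Last = 0xD7A3 // last precomposed syllable (힣)

  /** Romanizes one precomposed syllable; returns None for any other character. */
  def romanize(syllable: Char): Option[String] = {
    val code = syllable.toInt
    if (code < Base || code > Last) None
    else {
      val index = code - Base
      val initial = index / (21 * 28)
      val vowel = (index % (21 * 28)) / 28
      val fin = index % 28
      Some(initials(initial) + vowels(vowel) + finals(fin))
    }
  }

  /** Romanizes syllable by syllable, leaving non-Hangeul characters untouched. */
  def romanizeWord(word: String): String =
    word.map(c => romanize(c).getOrElse(c.toString)).mkString
}
```

With these definitions, SyllableSketch.romanizeWord("한글") evaluates to "hangeul". Because each syllable is handled in isolation, words whose romanization depends on an adjacent final / initial pair would need an extra lookup step.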

Special provisions for final / initial consonant pairs

Rows and columns correspond to final and initial consonants, respectively. Final / initial consonant pairs with irregular transliteration are displayed in bold.

F/I          g     n             d     r             m
g            kg    ngn           kd    ngn           ngm
n            ng    nn            nd    ll, nn [2]    nm
d, j [1]     tg    nn            td    nn            nm
r            lg    ll, nn [2]    ld    ll            lm
m            mg    mn            md    mn            mm
b            pg    mn            pd    mn            mm
s            tg    nn            td    nn            nm
ng           ngg   ngn           ngd   ngn           ngm
j            tg    nn            td    nn            nm
ch           tg    nn            td    nn            nm
t, ch [3]    tg    nn            td    nn            nm
h            k     nn            t     nn            nm

[1] Always transliterated as d in the current implementation
[2] Always transliterated as ll in the current implementation
[3] Always transliterated as t in the current implementation
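
One way to picture the pair table is as a lookup keyed by (final, initial) that overrides plain concatenation of the regular final and initial romanizations. The sketch below is a hedged illustration covering only the g, n, r and b rows. The names PairRulesSketch and romanizePair are hypothetical and not part of hangeul4s, and "irregular" is taken here to mean any cell that differs from simple concatenation, which is an assumption of this sketch. The ll / nn cells are resolved to ll, as the footnote describes for the current implementation.

```scala
object PairRulesSketch {
  // Cells from the g, n, r and b rows whose value differs from plain
  // concatenation of the regular final and initial romanizations.
  private val irregular: Map[(Char, Char), String] = Map(
    ('ㄱ', 'ㄴ') -> "ngn",
    ('ㄱ', 'ㄹ') -> "ngn",
    ('ㄱ', 'ㅁ') -> "ngm",
    ('ㄴ', 'ㄹ') -> "ll",
    ('ㄹ', 'ㄴ') -> "ll",
    ('ㄹ', 'ㄹ') -> "ll",
    ('ㅂ', 'ㄴ') -> "mn",
    ('ㅂ', 'ㄹ') -> "mn",
    ('ㅂ', 'ㅁ') -> "mm"
  )

  // Regular romanizations for the rows and columns this sketch covers.
  private val regularFinal: Map[Char, String] =
    Map('ㄱ' -> "k", 'ㄴ' -> "n", 'ㄹ' -> "l", 'ㅂ' -> "p")
  private val regularInitial: Map[Char, String] =
    Map('ㄱ' -> "g", 'ㄴ' -> "n", 'ㄷ' -> "d", 'ㄹ' -> "r", 'ㅁ' -> "m")

  /** Looks up an irregular pair first, then falls back to concatenation.
    * Returns None for consonants outside the rows and columns covered here. */
  def romanizePair(fin: Char, init: Char): Option[String] =
    irregular.get((fin, init)).orElse(
      for {
        f <- regularFinal.get(fin)
        i <- regularInitial.get(init)
      } yield f + i
    )
}
```

For example, PairRulesSketch.romanizePair('ㄱ', 'ㄴ') yields Some("ngn"), while the regular pair ('ㄱ', 'ㄷ') falls through to Some("kd").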

Licence

Copyright 2019 Sophie Collard <https://github.com/sophiecollard>

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this software except in compliance with the License.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.