Acta Infologica
Yazarlar: Züleyha YİNER, Atakan KURT
Konular:Bilgisayar Bilimleri, Bilgi Sistemleri
Anahtar Kelimeler:Kazakh Morphology,Natural Language Processing,Computational Linguistics
Özet: We present a comprehensive two level morphological analysis of contemporary Kazakh with implementation and a disambiguation test data set on the Nuve Framework. Our study differs from the similar studies in a number of ways: (i) Our study covers both derivational and inflectional morphology to a greater extend (ii) Our implementation consisting of orthographic rules, morphotactics, a root lexicon of roughly 24 thousand roots, a lexicon of roughly 150 suffixes is open source which can be downloaded, reviewed and tested. (ii) Roughly 10 thousand manually disambiguated parses are available as a morphological disambiguation data set. (iii) It is easily extensible meaning it can be modified or extended with new rules without any programming. (iv) we are able to tackle emerging problems quickly and easily since Nuve is maintained by our study group. (v) Our implementation can handle separately written morphemes or digraphs etc. directly. (vi) We also have a Turkish morphological parser/generator in Nuve for morphology based machine translation between Turkish and other Turkic languages since these closely related languages have a lot in common from lexical, morphological, and syntactic aspects.