Kocaeli Journal of Science and Engineering

Named Entity Recognition in Turkish Bank Documents

Authors: Osman KABASAKAL, Alev MUTLU

Volume 4, Issue 2, 2021, Pages 86-92

Subjects: Computer Science, Information Systems

DOI: 10.34088/kojose.871873

Keywords: Bank Document, Conditional Random Fields, Named Entity Recognition, Natural Language Processing, Turkish Documents

Abstract: Named Entity Recognition (NER) is the process of automatically recognizing entity names such as persons, organizations, and dates in a document. In this study, we focus on bank documents written in Turkish and propose a Conditional Random Fields (CRF) model to extract named entities. The contribution of this study is twofold: (i) we propose domain-specific features to extract entity names such as law, regulation, and reference, which frequently appear in bank documents; and (ii) we contribute to NER research on Turkish documents, which is less mature than research on languages such as English and German. Experimental results based on 10-fold cross-validation conducted on 551 real-life, anonymized bank documents show that the proposed CRF-NER model achieves a micro-averaged F1 score of 0.962. More specifically, the F1 score is 0.979 for law names, 0.850 for regulation names, and 0.850 for article numbers.
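The abstract reports both per-entity F1 scores and a micro-averaged F1 score. As a minimal sketch of how these two kinds of figures relate, the Python below computes both from sets of gold and predicted entities. Several things here are assumptions and are not taken from the paper: the exact-match criterion (an entity counts as correct only if document, span, and label all agree), the label names `LAW`, `REGULATION`, and `ARTICLE_NO`, and the toy data, which does not reproduce the reported results.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from raw counts, i.e. 2*TP / (2*TP + FP + FN); 0.0 if undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def score_entities(gold, predicted):
    """Return (per-label F1 dict, micro-averaged F1).

    gold and predicted are sets of (doc_id, start, end, label) tuples.
    Exact matching is an assumption of this sketch, not the paper's
    stated evaluation protocol.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for ent in predicted:
        # A predicted entity is a true positive only if it matches exactly.
        (tp if ent in gold else fp)[ent[3]] += 1
    for ent in gold:
        if ent not in predicted:
            fn[ent[3]] += 1
    labels = sorted(set(tp) | set(fp) | set(fn))
    per_label = {lab: f1(tp[lab], fp[lab], fn[lab]) for lab in labels}
    # Micro-averaging pools the counts over all labels before computing F1,
    # so frequent entity types weigh more than rare ones.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return per_label, micro


# Hypothetical toy annotations (illustration only).
gold = {
    (0, 0, 2, "LAW"),
    (0, 5, 6, "ARTICLE_NO"),
    (1, 3, 4, "LAW"),
    (1, 7, 9, "REGULATION"),
}
predicted = {
    (0, 0, 2, "LAW"),
    (0, 5, 6, "ARTICLE_NO"),
    (1, 3, 4, "LAW"),
    (1, 7, 8, "REGULATION"),  # wrong span boundary: one FP and one FN
}

per_label, micro = score_entities(gold, predicted)
print(per_label)
print(micro)
```

Because the counts are pooled, a micro-averaged score can stay high even when a less frequent entity type scores noticeably lower, which is consistent with an overall figure sitting above some of the per-entity figures.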



CITE
BibTeX
@article{2021, title={Named Entity Recognition in Turkish Bank Documents}, volume={4}, number={2}, pages={86--92}, journal={Kocaeli Journal of Science and Engineering}, author={Osman KABASAKAL and Alev MUTLU}, year={2021}, doi={10.34088/kojose.871873} }
APA
Kabasakal, O., & Mutlu, A. (2021). Named Entity Recognition in Turkish Bank Documents. Kocaeli Journal of Science and Engineering, 4(2), 86-92. https://doi.org/10.34088/kojose.871873
MLA
Kabasakal, Osman, and Alev Mutlu. "Named Entity Recognition in Turkish Bank Documents." Kocaeli Journal of Science and Engineering, vol. 4, no. 2, 2021, pp. 86-92.