Current Proceedings on Technology

Dictionary and Pattern-Based Recognition of Organization Names in Russian News Texts

Authors: V. D. Solovyev, R. M. Gareev, V. V. Ivanov

Volume 3, 2013

Keywords: Named entity recognition, Knowledge-based event extraction

Abstract: This paper describes part of an event extraction system developed in collaboration with HP Labs Russia. The input texts are business news feeds, where 'Organization' is one of the most important event participant types. The paper focuses on recognizing organization names in Russian news texts, and two approaches have been implemented. The first is dictionary-based: we propose an algorithm that builds a dictionary from the full names of legal bodies gathered from a government registry. The main problems with dictionary matching are incorrect stemming and a significant fraction of ambiguous names among the dictionary entries. The second approach relies on local context clues and on words internal to the name. These words form patterns that are intrinsic to organization names and enable recognition of names absent from the dictionary. We propose an algorithm to derive such patterns from the original dictionary.
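
The abstract does not spell out either algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions: registry names are normalized by lowercasing and stripping quotation marks (no stemming is attempted), and a "pattern" is simplified to a frequent internal word such as a legal-form abbreviation followed by a capitalized token. All function names, the `min_count` threshold, and the sample names are hypothetical illustrations, not the authors' method.

```python
from collections import Counter


def normalize(name):
    """Lowercase a name and strip quotation marks (an assumed normalization)."""
    for quote in ('"', "«", "»"):
        name = name.replace(quote, " ")
    return " ".join(name.lower().split())


def build_dictionary(full_names):
    """Dictionary approach: a set of normalized registry names."""
    return {normalize(n) for n in full_names}


def derive_clue_words(full_names, min_count=2):
    """Pattern approach (simplified): words that recur across registry names."""
    counts = Counter()
    for n in full_names:
        counts.update(set(normalize(n).split()))
    return {w for w, c in counts.items() if c >= min_count}


def recognize(tokens, dictionary, clue_words, max_len=4):
    """Return (start, end, method) spans; dictionary matches take priority."""
    spans = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate first so full names win over prefixes.
        for length in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = normalize(" ".join(tokens[i:i + length]))
            if candidate in dictionary:
                spans.append((i, i + length, "dictionary"))
                i += length
                matched = True
                break
        if matched:
            continue
        # Fallback: a clue word followed by a capitalized token.
        if (tokens[i].lower() in clue_words
                and i + 1 < len(tokens)
                and tokens[i + 1][:1].isupper()):
            spans.append((i, i + 2, "pattern"))
            i += 2
            continue
        i += 1
    return spans


# Hypothetical registry entries and a tokenized news sentence.
registry = ['ООО "Ромашка"', 'ООО "Вектор"', 'ОАО "Газпром"']
dictionary = build_dictionary(registry)
clue_words = derive_clue_words(registry)
tokens = ["Вчера", "ООО", "Ромашка", "и", "ООО", "Луч", "подписали", "договор"]
spans = recognize(tokens, dictionary, clue_words)
```

In this toy input the first name is in the dictionary, while the second is not and can only be picked up through the derived clue word. A real system would also have to handle Russian inflection and ambiguous entries, the two problems the abstract names for dictionary matching.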


