Current Proceedings on Technology

Improving Term Candidate Validation Using Ranking Metrics

Authors: Mercè Vàzquez, Antoni Oliver

Volume 3, Issue -, 2013, Pages -

Topics: -

Keywords: Term candidate validation, Ranking metrics, Term extraction, Token slot detection

Abstract: It is sometimes difficult to automatically identify the most representative terms in a specialized corpus and to validate them as correct, because words and terms resemble each other. To identify the most representative terms in a corpus with an approach that can be easily adapted to any language or terminology extraction tool, we explore the combination of token slot extraction and ranking metrics to select term candidates with a high likelihood of being terminological units. This paper presents the results obtained using four statistical measures. We observe high term detection in English corpora (a precision of 76.92% and a recall of 79.09%) and Spanish corpora (a precision of 60% and a recall of 70.48%) using token slot detection together with four ranking metrics: Dice, True Mutual Information, T-score and Log-likelihood. In conclusion, token slot detection extracts terminological patterns in term candidates to reduce lists of candidates, and ranking metrics improve results and reduce the number of candidates to be evaluated manually. We will evaluate the algorithm's performance in other domains and for other user profiles and needs.
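The abstract names the four ranking metrics but does not give their formulas. As a rough illustration only, the sketch below scores two-word term candidates with common textbook formulations of Dice, T-score, True Mutual Information and Log-likelihood, computed from a 2x2 contingency table. The function names, the candidate tuples and the counts are hypothetical, and the exact definitions used by the authors may differ.

```python
import math


def contingency(f_xy, f_x, f_y, n):
    """Observed and expected cell counts of the 2x2 table for a bigram (x, y).

    f_xy: count of the bigram, f_x: count of x in first position,
    f_y: count of y in second position, n: total number of bigrams.
    """
    observed = [
        f_xy,                  # x followed by y
        f_x - f_xy,            # x followed by another word
        f_y - f_xy,            # another word followed by y
        n - f_x - f_y + f_xy,  # neither x first nor y second
    ]
    row = [f_x, f_x, n - f_x, n - f_x]
    col = [f_y, n - f_y, f_y, n - f_y]
    expected = [r * c / n for r, c in zip(row, col)]
    return observed, expected


def dice(f_xy, f_x, f_y, n):
    # n is unused; it is kept so all metrics share one signature.
    return 2 * f_xy / (f_x + f_y)


def t_score(f_xy, f_x, f_y, n):
    return (f_xy - f_x * f_y / n) / math.sqrt(f_xy)


def true_mutual_information(f_xy, f_x, f_y, n):
    obs, exp = contingency(f_xy, f_x, f_y, n)
    # Empty cells contribute nothing to the sum.
    return sum(o / n * math.log2(o / e) for o, e in zip(obs, exp) if o > 0)


def log_likelihood(f_xy, f_x, f_y, n):
    obs, exp = contingency(f_xy, f_x, f_y, n)
    return 2 * sum(o * math.log(o / e) for o, e in zip(obs, exp) if o > 0)


def rank(candidates, n, metric):
    """Sort (term, f_xy, f_x, f_y) tuples by descending metric score."""
    return sorted(candidates, key=lambda c: metric(c[1], c[2], c[3], n), reverse=True)


# Hypothetical counts, for illustration only.
candidates = [("of the", 5, 400, 500), ("term extraction", 30, 40, 50)]
for term, *_ in rank(candidates, 10000, log_likelihood):
    print(term)
```

Ranking the list this way places candidates with strong word association first, so a reviewer can validate the top of the list and stop early. All four functions take the same arguments, which makes it easy to compare the orderings produced by different metrics on the same candidates.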



Cite
BibTeX
@article{2013, title={Improving Term Candidate Validation Using Ranking Metrics}, volume={3}, number={0}, publisher={Current Proceedings on Technology}, author={Mercè Vàzquez and Antoni Oliver}, year={2013} }
APA
Vàzquez, M., & Oliver, A. (2013). Improving Term Candidate Validation Using Ranking Metrics. Current Proceedings on Technology, 3.
MLA
Vàzquez, Mercè, and Antoni Oliver. "Improving Term Candidate Validation Using Ranking Metrics." Current Proceedings on Technology, vol. 3, 2013.