International Journal of Linguistics and Computational Applications


Rule Based Approach for Word Normalization in Transliterated Search Queries

Authors: Varsha M. Pathak and Manish R. Joshi

Volume 7, Issue 2, 2020


DOI:10.30726/ijlca/v7.i2.2020.72002

Keywords: Information Retrieval, SMS Based Information System, Vector Space Model, Minimum Edit Distance, Noisy Query, Transliterated Search

Abstract: SMS-based information systems are the need of the age. Most present SMS-based information systems send one-way informative text messages generated from their respective knowledge systems. By applying information retrieval methodology with models such as the Vector Space Model, such systems can let users send queries according to their information needs. This makes the system more useful from the user's point of view. This paper describes one such initiative for accessing relevant literature such as poems, phrases, rhymes, stories, abhang and much more. The mobile-based quick library access system MQuickLib allows users to access such literature by formulating transliterated queries. The Vector Space Model is used to create the system's knowledge base by processing the document terms, which are then matched with the query terms while allowing for variation in spelling due to the users' transliteration styles. The matching score is assigned by devising a set of rules that determine the distance between two terms: dk, the term from the document, and qj, the query term. The original Levenshtein minimum edit distance algorithm is modified by applying this rule-based approach. The rules were identified by collecting SMS queries from users for a given set of known queries in Marathi (Devanagari). Experiments were carried out on a collection of Marathi and Hindi literature that mainly includes songs, gazals, powadas, bharud and other types. These documents are available in a standard transliteration form such as ITRANS (an Indic transliteration system). This paper elaborates the rule-based approach and analyses the results to select an appropriate rule-based model, which is then applied in the development of the MQuickLib system.
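The abstract does not list the rules themselves, so the following is only a minimal sketch of the general idea of relaxing Levenshtein's edit distance for transliteration variants. The rule set (`CHEAP_SUBSTITUTIONS`, the doubled-letter rule), the cost value `CHEAP_COST`, and the function names are all hypothetical illustrations, not the rules or scores used in the paper.

```python
# Hypothetical rule set: letter pairs that SMS users might interchange when
# transliterating. These pairs are illustrative assumptions, NOT the paper's rules.
CHEAP_SUBSTITUTIONS = {
    frozenset(("v", "w")),
    frozenset(("i", "e")),
    frozenset(("u", "o")),
}
CHEAP_COST = 0.5  # assumed cost for an edit covered by a rule


def substitution_cost(a: str, b: str) -> float:
    """Cost of replacing character a with character b."""
    if a == b:
        return 0.0
    if frozenset((a, b)) in CHEAP_SUBSTITUTIONS:
        return CHEAP_COST
    return 1.0


def gap_cost(s: str, k: int) -> float:
    """Cost of inserting or deleting s[k].

    Assumed rule: a letter that repeats the previous one (e.g. "aa" vs "a")
    is a cheap edit.
    """
    return CHEAP_COST if k > 0 and s[k] == s[k - 1] else 1.0


def rule_based_distance(doc_term: str, query_term: str) -> float:
    """Weighted edit distance between a document term dk and a query term qj."""
    d, q = doc_term, query_term
    # Row 0: cost of building each prefix of q from the empty string.
    prev = [0.0]
    for j in range(len(q)):
        prev.append(prev[-1] + gap_cost(q, j))
    for i in range(len(d)):
        curr = [prev[0] + gap_cost(d, i)]
        for j in range(len(q)):
            curr.append(min(
                prev[j + 1] + gap_cost(d, i),              # delete d[i]
                curr[j] + gap_cost(q, j),                  # insert q[j]
                prev[j] + substitution_cost(d[i], q[j]),   # substitute
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    for dk, qj in [("vaara", "wara"), ("kitab", "kitap"), ("mana", "mana")]:
        print(dk, qj, rule_based_distance(dk, qj))
```

With every cost set to 1.0 the function reduces to the standard Levenshtein distance; the rule tables are the only place where transliteration knowledge enters, so a different rule-based model can be tried by swapping them.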


