Enhancing Malay stemming algorithm with background knowledge

Leong, Leow Ching and Surayaini Basri and Rayner Alfred (2012) Enhancing Malay stemming algorithm with background knowledge. In: 12th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2012, 3-7 September 2012, Kuching, Sarawak.

Full text not available from this repository.

Abstract

Stemming is the process of reducing inflected words to their root form. A stemming algorithm for the Malay language is especially important for building an effective information retrieval system. Although several Malay stemmers exist, such as Othman's and Fatimah's algorithms, they are not complete stemmers because they fail to stem all Malay words correctly, so there is still room for improvement. Implementing a perfect stemmer for Malay is difficult because of the complexity of Malay word morphology. This paper presents a new approach that stems Malay words with a higher percentage of correctly stemmed words. In the proposed approach, called a Malay stemmer with background knowledge, additional background knowledge is provided to increase stemming accuracy. Besides a reference dictionary that contains all root words, a second dictionary that contains all affixed words is added. These two files constitute the background knowledge that serves as a reference for the stemming process. Rule Frequency Order (RFO) is applied as the base stemming algorithm because of its high accuracy in stemming Malay words. The results show that the proposed stemmer with background knowledge produces fewer errors than previously published stemmers that do not apply any background knowledge in stemming Malay words.
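
The idea of consulting two dictionaries before and during rule-based affix removal can be illustrated with a minimal Python sketch. This is not the authors' implementation: the word lists, the affix rules, and the names ROOT_WORDS, AFFIXED_WORDS, AFFIX_RULES and stem are all hypothetical, and the abstract does not say how the affixed-word dictionary is consulted, so the sketch assumes it maps each listed word directly to its root. The rule order merely stands in for the frequency-based ordering that RFO implies.

```python
# Illustrative sketch only: the dictionaries and affix rules below are
# hypothetical toy data, not the resources or rule set used in the paper.

# Background knowledge 1: dictionary of root words.
ROOT_WORDS = {"makan", "ajar", "main", "baca"}

# Background knowledge 2: dictionary of affixed words.
# Assumption: each listed affixed word is mapped directly to its root.
AFFIXED_WORDS = {"pelajar": "ajar", "makanan": "makan"}

# Hypothetical (prefix, suffix) rules, tried in the order listed.
# In an RFO-style stemmer this order would reflect how often each rule applies.
AFFIX_RULES = [
    ("", "an"),
    ("ber", ""),
    ("me", ""),
    ("pem", "an"),
    ("di", "kan"),
]


def stem(word):
    """Return the root of a word, or the word unchanged if no root is found."""
    word = word.lower()

    # A word that is already a root is never stripped.
    if word in ROOT_WORDS:
        return word

    # A word listed in the affixed-word dictionary is resolved by lookup.
    if word in AFFIXED_WORDS:
        return AFFIXED_WORDS[word]

    # Otherwise try each affix rule and accept the first candidate
    # that is confirmed by the root-word dictionary.
    for prefix, suffix in AFFIX_RULES:
        if word.startswith(prefix) and word.endswith(suffix):
            candidate = word[len(prefix): len(word) - len(suffix)]
            if candidate in ROOT_WORDS:
                return candidate

    # No rule produced a known root: leave the word as it is.
    return word
```

With this toy data, stem("bermain") strips the prefix "ber" to reach "main", stem("pelajar") is resolved through the affixed-word dictionary, and an unlisted word such as "komputer" is returned unchanged. Checking every candidate against the root dictionary is what keeps a rule from stripping letters that only look like an affix.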

Item Type: Conference or Workshop Item (UNSPECIFIED)
Keyword: Affixes, Background Knowledge, Malay stemming, Rule Frequency Order, Rule-based Affix Elimination
Subjects: Q Science > QA Mathematics
Department: SCHOOL > School of Engineering and Information Technology
Depositing User: ADMIN ADMIN
Date Deposited: 31 Oct 2012 17:08
Last Modified: 08 Sep 2014 12:40
URI: https://eprints.ums.edu.my/id/eprint/5288
