An optimized term weights based clustering for bilingual parallel

Rayner Alfred (2009) An optimized term weights based clustering for bilingual parallel. In: International Conference on Advanced Computer Theory and Engineering (ICACTE), 25-27 September 2009, Cairo, Egypt.

Full text not available from this repository.

Abstract

Multilingual corpora are becoming an essential resource for work in multilingual natural language processing. This paper investigates the effects of applying a clustering technique to parallel multilingual texts. It examines the differences in the cluster mappings and in the tree structures of the clusters. It also studies the effect of reducing the set of terms considered when clustering parallel corpora. A genetic-based algorithm is then applied to optimize the weights of the terms used in clustering, so that unseen documents can be classified. Specifically, the work introduces the tools necessary for this task and presents a set of experimental results, along with the issues that became apparent.
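
As a rough illustration of the kind of pipeline the abstract describes, the following minimal sketch clusters a few short texts with average-linkage hierarchical agglomerative clustering over term-frequency vectors scaled by per-term weights. It is not the paper's implementation: the function names (`weighted_vector`, `cosine`, `agglomerate`), the whitespace tokenizer, the cosine similarity, the average-linkage criterion and the toy documents are all assumptions made for this example. The `weights` dictionary stands in for the values that an optimizer such as a genetic algorithm might tune; no optimizer is shown here.

```python
import math
from collections import Counter


def weighted_vector(text, weights):
    """Term-frequency vector scaled by per-term weights (default weight 1.0)."""
    counts = Counter(text.lower().split())
    return {term: count * weights.get(term, 1.0) for term, count in counts.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v.get(term, 0.0) for term, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def agglomerate(docs, weights, k):
    """Merge the most similar pair of clusters (average linkage) until k remain."""
    vecs = [weighted_vector(d, weights) for d in docs]
    clusters = [[i] for i in range(len(docs))]
    while len(clusters) > max(k, 1):
        best = None  # (similarity, index_a, index_b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                sims = [cosine(vecs[i], vecs[j])
                        for i in clusters[a] for j in clusters[b]]
                score = sum(sims) / len(sims)
                if best is None or score > best[0]:
                    best = (score, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy documents; a weight of 0.0 removes a term from consideration.
docs = [
    "stock market prices fall",
    "market prices rise again",
    "football match result",
    "football team wins match",
]
weights = {"again": 0.0}
print(agglomerate(docs, weights, 2))
```

Setting a term's weight to zero has the same effect as dropping it from the term set, so one weight vector can express both term reduction and term weighting. Running the same procedure on each language side of a parallel corpus would then allow the resulting cluster assignments to be compared.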

Item Type: Conference or Workshop Item (UNSPECIFIED)
Keyword: Bilingual corpora, Clustering, Hierarchical agglomerative clustering, NLP
Subjects: P Language and Literature > P Philology. Linguistics > P1-1091 Philology. Linguistics > P98-98.5 Computational linguistics. Natural language processing
Q Science > QA Mathematics > QA1-939 Mathematics > QA71-90 Instruments and machines > QA75.5-76.95 Electronic computers. Computer science
Department: SCHOOL > School of Engineering and Information Technology
Depositing User: ADMIN ADMIN
Date Deposited: 21 Mar 2011 18:31
Last Modified: 30 Dec 2014 14:35
URI: https://eprints.ums.edu.my/id/eprint/2476
