Aggregating multiple instances in relational database using semi-supervised genetic algorithm-based clustering technique.

Rayner Alfred and Dimitar Kazakov (2007) Aggregating multiple instances in relational database using semi-supervised genetic algorithm-based clustering technique. In: Communications of the Eleventh East-European Conference on Advances in Databases and Information Systems, September 29 - October 3, 2007, Varna, Bulgaria.

Abstract

In solving the classification problem in relational data mining, traditional methods such as C4.5 and its variants usually require the data to be transformed from multiple tables into a single table. Unfortunately, we may lose some information when we join tables with a high degree of one-to-many association. Data transformation therefore becomes tedious trial-and-error work, and the classification result is often not very promising, especially when the number of tables and the degree of one-to-many association are large. In this paper, we propose a genetic semi-supervised clustering technique as a means of aggregating data in multiple tables for the classification problem in relational databases. This algorithm is suitable for classifying datasets with a high degree of one-to-many association. It can be used in two ways. One is user-controlled clustering, where the user controls the result of clustering by varying the compactness of the spherical clusters. The other is automatic clustering, where a non-overlapping clustering strategy is applied. In this paper, we use the latter method to dynamically cluster multiple instances as a means of aggregating them, and we illustrate the effectiveness of this method using the semi-supervised genetic algorithm-based clustering technique.
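The general idea of aggregating one-to-many rows by clustering can be illustrated with a minimal Python sketch. It is not the algorithm proposed in the paper: plain k-means with fixed starting centroids stands in for the genetic semi-supervised clustering step, and the function names (`kmeans`, `assign`, `aggregate`) and the toy data are assumptions made only for illustration. Each target record that owns several rows in a related table is collapsed into one fixed-length vector of cluster counts, which a single-table learner could then consume.

```python
from collections import Counter, defaultdict
from math import dist


def assign(point, centroids):
    """Return the index of the centroid nearest to the point."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def kmeans(points, centroids, iterations=10):
    """Plain k-means with caller-supplied initial centroids (deterministic).

    Stand-in for the paper's clustering step, which is NOT reproduced here.
    """
    for _ in range(iterations):
        clusters = defaultdict(list)
        for p in points:
            clusters[assign(p, centroids)].append(p)
        # Recompute each centroid as the mean of its members;
        # keep the old centroid if a cluster is empty.
        centroids = [
            tuple(sum(axis) / len(clusters[i]) for axis in zip(*clusters[i]))
            if clusters[i] else centroids[i]
            for i in range(len(centroids))
        ]
    return centroids


def aggregate(rows, centroids):
    """Collapse one-to-many rows into one cluster-count vector per target id."""
    counts = defaultdict(Counter)
    for target_id, features in rows:
        counts[target_id][assign(features, centroids)] += 1
    k = len(centroids)
    return {tid: [c[i] for i in range(k)] for tid, c in counts.items()}


# Hypothetical non-target table: (target id, numeric features of one row).
rows = [
    ("a", (1.0, 1.0)), ("a", (1.2, 0.8)), ("a", (9.0, 9.0)),
    ("b", (8.8, 9.1)), ("b", (9.2, 8.9)),
]
centroids = kmeans([f for _, f in rows], [(0.0, 0.0), (10.0, 10.0)])
table = aggregate(rows, centroids)
print(table)
```

In this toy case, record `a` has two rows in the first cluster and one in the second, while record `b` has both of its rows in the second cluster, so the two records receive different aggregated vectors without a lossy join.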

Item Type: Conference or Workshop Item (UNSPECIFIED)
Uncontrolled Keywords: Aggregation, Clustering, Semi-supervised Clustering, Genetic Algorithm, Relational Data Mining
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: FACULTY > Faculty of Computing and Informatics
Depositing User: Unnamed user with email storage.bpmlib@ums.edu.my
Date Deposited: 17 Nov 2015 07:20
Last Modified: 10 Nov 2017 01:42
URI: http://eprints.ums.edu.my/id/eprint/12312
