A clustering approach to generalized pattern identification based on multi-instanced objects with DARA

Rayner Alfred and Dimitar Kazakov (2007) A clustering approach to generalized pattern identification based on multi-instanced objects with DARA. In: Communications of the Eleventh East-European Conference on Advances in Databases and Information Systems, September 29 - October 3, 2007, Varna, Bulgaria.




Clustering is an essential data mining task with a wide range of applications. Traditional clustering algorithms are based on a vector space model representation. A relational database system, however, often contains multi-relational information spread across multiple relations (tables). To cluster such data, one must either restrict the analysis to a single representation or construct a feature space comprising all possible representations of the data stored in multiple tables. In this paper, we present a data summarization approach, borrowed from Information Retrieval theory, to clustering in a multi-relational environment. We find that the data summarization technique can be used here to capture the typically high volume of multiple instances and the numerous forms of patterns. Our experiments demonstrate a technique for clustering data in a multi-relational environment and report evaluation results on the mutagenesis dataset. In addition, we evaluate how varying the number of features considered in clustering affects classification performance.
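The abstract does not spell out the summarization step, so the following is only a minimal sketch of the general idea, assuming a TF-IDF weighting and cosine similarity as commonly used in Information Retrieval. Each object in the target table is treated like a document, and the patterns found in its related rows are treated like terms. The table contents, the pattern strings, and the function names `tf_idf_vectors` and `cosine` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical data: each target object (e.g. a molecule) owns several
# related rows, each already encoded as a discrete pattern string.
instances = {
    "m1": ["c-22", "c-22", "h-3"],
    "m2": ["c-22", "h-3", "h-3"],
    "m3": ["o-40", "n-38", "o-40"],
}


def tf_idf_vectors(bags):
    """Summarize each object's bag of patterns as a sparse TF-IDF vector."""
    n = len(bags)
    # Document frequency: number of objects in which a pattern occurs.
    df = Counter(p for bag in bags.values() for p in set(bag))
    vectors = {}
    for obj, bag in bags.items():
        tf = Counter(bag)
        vectors[obj] = {p: tf[p] * math.log(n / df[p]) for p in tf}
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(p, 0.0) for p, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


vectors = tf_idf_vectors(instances)
sim_m1_m2 = cosine(vectors["m1"], vectors["m2"])  # objects sharing patterns
sim_m1_m3 = cosine(vectors["m1"], vectors["m3"])  # objects sharing none
```

Once every object is reduced to a single vector in this way, any clustering algorithm that works on a vector space model can be applied to the multi-relational data.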

Item Type: Conference or Workshop Item (UNSPECIFIED)
Uncontrolled Keywords: Relational Data Mining, Distance-based, Clustering, Multiple Instance, Relational Database
Subjects: Q Science > QA Mathematics > QA76 Computer software
Divisions: FACULTY > Faculty of Computing and Informatics
Depositing User: ADMIN ADMIN
Date Deposited: 17 Nov 2015 07:24
Last Modified: 10 Nov 2017 01:42
URI: http://eprints.ums.edu.my/id/eprint/12311
