Rayner Alfred (2009) Discovering knowledge from multi-relational data based on information retrieval theory. In: 5th International Conference on Advanced Data Mining and Applications (ADMA), 17-19 August 2009, Beijing, China.
Full text not available from this repository.
Abstract
Although the TF-IDF weighted frequency matrix (vector space model) has been widely studied and used in document clustering and document categorisation, there has been no attempt to extend this application to relational data that contain one-to-many associations between records. This paper explains the rationale for using TF-IDF (term frequency-inverse document frequency), a technique for weighting data attributes borrowed from Information Retrieval theory, to summarise datasets stored in a multi-relational setting with one-to-many relationships. A novel data summarisation algorithm based on TF-IDF is introduced, referred to as Dynamic Aggregation of Relational Attributes (DARA). The DARA algorithm applies clustering techniques in order to summarise these datasets. The experimental results show that the DARA algorithm finds solutions with much greater accuracy. © 2009 Springer.
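The weighting idea described in the abstract can be illustrated with a minimal sketch. This is not the DARA algorithm, whose details are not given in this record; it only assumes that each parent record is treated as a "document" and the attribute values of its associated child records as "terms". The function name `tfidf_by_parent`, the sample rows, and the choice of length-normalised term frequency with an unsmoothed logarithmic IDF are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tfidf_by_parent(rows):
    """Compute TF-IDF weights for (parent_id, term) pairs.

    Each parent record plays the role of a document; the values found in
    its child records (the "many" side of the relationship) are the terms.
    Hypothetical helper for illustration only.
    """
    # Bag of terms per parent record.
    bags = defaultdict(Counter)
    for parent_id, term in rows:
        bags[parent_id][term] += 1

    # Document frequency: number of parent records containing each term.
    n_parents = len(bags)
    df = Counter()
    for bag in bags.values():
        df.update(bag.keys())

    # TF (relative frequency within the parent) times IDF (log(N / df)).
    weights = {}
    for parent_id, bag in bags.items():
        total = sum(bag.values())
        weights[parent_id] = {
            term: (count / total) * math.log(n_parents / df[term])
            for term, count in bag.items()
        }
    return weights


# Made-up one-to-many data: customer id -> purchased item.
rows = [
    ("c1", "book"), ("c1", "book"), ("c1", "pen"),
    ("c2", "pen"),
    ("c3", "laptop"), ("c3", "pen"),
]
weights = tfidf_by_parent(rows)
for parent_id in sorted(weights):
    print(parent_id, weights[parent_id])
```

In this sketch a term that appears under every parent record (here `pen`) receives a weight of zero, while terms specific to one parent are weighted up. The resulting per-parent weight vectors are the kind of input a clustering step could then group.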
Item Type: | Conference or Workshop Item (UNSPECIFIED) |
---|---|
Keyword: | Clustering, Data summarization, Information retrieval, Knowledge discovery, Vector space model |
Subjects: | T Technology > T Technology (General) > T1-995 Technology (General) > T55.4-60.8 Industrial engineering. Management engineering > T58.5-58.64 Information technology |
Department: | SCHOOL > School of Engineering and Information Technology |
Depositing User: | ADMIN ADMIN |
Date Deposited: | 25 Mar 2011 09:08 |
Last Modified: | 30 Dec 2014 14:22 |
URI: | https://eprints.ums.edu.my/id/eprint/2568 |