Optimizing feature construction process for dynamic aggregation of relational attributes

Rayner Alfred (2009) Optimizing feature construction process for dynamic aggregation of relational attributes. Journal of Computer Science, 5 (11). pp. 864-877. ISSN 1549-3636

Abstract

Problem statement: The importance of input representation is well recognized in machine learning. Feature construction is one of the methods used to generate relevant features for learning from data. This study addressed the question of whether the descriptive accuracy of the DARA algorithm benefits from the feature construction process. Specifically, it discusses the application of a genetic algorithm to optimize the feature construction process that generates input data for the data summarization method called Dynamic Aggregation of Relational Attributes (DARA). Approach: The DARA algorithm was designed to summarize data stored in non-target tables by clustering them into groups, where multiple records stored in non-target tables correspond to a single record stored in a target table. Here, feature construction methods are applied in order to improve the descriptive accuracy of the DARA algorithm. The task therefore involves constructing a relevant set of features for the DARA algorithm by using a genetic-based algorithm. Results: The experimental results show that the quality of the summarized data is directly influenced by the methods used to create the patterns that represent records in the (n×p) TF-IDF weighted frequency matrix. The evaluation of the genetic-based feature construction algorithms showed that the data summarization results can be improved by constructing features with the Cluster Entropy (CE) genetic-based feature construction algorithm. Conclusion: This study showed that the data summarization results can be improved by constructing features with the cluster entropy genetic-based feature construction algorithm. © 2009 Science Publications.
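
The abstract names two quantities without defining them: the (n×p) TF-IDF weighted frequency matrix and cluster entropy. As a rough illustration only, the Python sketch below builds such a matrix from n target records described by p distinct patterns, and scores a clustering by the size-weighted class entropy of its clusters. The function names, the log(n/df) weighting, and the entropy formula are assumptions based on common textbook definitions, not the paper's exact formulation, and the genetic search itself is not shown.

```python
import math
from collections import Counter


def tfidf_matrix(records):
    """Build an (n x p) TF-IDF matrix (hypothetical helper).

    records: one list of pattern strings per target record, i.e. the
    patterns derived from its matching non-target rows.
    Assumes the common weighting tf * log(n / df).
    """
    n = len(records)
    vocab = sorted({p for rec in records for p in rec})
    # Document frequency: number of records containing each pattern.
    df = Counter(p for rec in records for p in set(rec))
    matrix = []
    for rec in records:
        tf = Counter(rec)
        matrix.append([tf[p] * math.log(n / df[p]) for p in vocab])
    return vocab, matrix


def cluster_entropy(cluster_ids, class_labels):
    """Size-weighted average class entropy over clusters (assumed form).

    Lower values mean purer clusters; 0.0 means every cluster holds a
    single class.
    """
    total = len(cluster_ids)
    by_cluster = {}
    for c, y in zip(cluster_ids, class_labels):
        by_cluster.setdefault(c, []).append(y)
    result = 0.0
    for labels in by_cluster.values():
        size = len(labels)
        h = -sum((k / size) * math.log2(k / size)
                 for k in Counter(labels).values())
        result += (size / total) * h
    return result
```

In a genetic-based search of the kind the abstract describes, a score such as `cluster_entropy` could serve as the fitness value for a candidate set of constructed features, with lower entropy indicating a better candidate.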

Item Type: Article
Keyword: Clustering, Data summarization, Feature construction, Feature transformation, Genetic algorithm
Subjects: Q Science > QA Mathematics > QA1-939 Mathematics > QA71-90 Instruments and machines > QA75.5-76.95 Electronic computers. Computer science
Department: SCHOOL > School of Engineering and Information Technology
Depositing User: ADMIN ADMIN
Date Deposited: 22 Mar 2011 16:21
Last Modified: 13 Oct 2017 11:12
URI: https://eprints.ums.edu.my/id/eprint/2520
