Identifying clusters structure of rare events using random forest clustering

Zaturrawiah A Omar and Chin, Su Na and Siti Rahayu Mohd. Hashim and Norhafiza Hamzah (2021) Identifying clusters structure of rare events using random forest clustering.

Abstract

Given highly imbalanced data, most learning algorithms struggle to predict rare events accurately, even though such cases are often the ones that carry important and useful knowledge. In a dataset with a binary class label, the rare events are those in the minority class. This study used a stroke dataset with a binary class label and a class imbalance ratio of 54:1. In addition, the dataset contained missing values and mixed data types. To identify the intrinsic structure of the minority class (the stroke group), Random Forest Clustering was used to produce a proximity matrix, which was then fed to the Partitioning Around Medoids (PAM) clustering method to identify the optimal number of clusters. The proximity plot suggested some cluster tendency, and k=2 was identified as the best choice compared with k=3 to k=5. Based on internal cluster validation, however, the silhouette width was small (0.1), indicating that many of the data objects lay close to the boundary of the neighbouring cluster. A plan for further investigation is suggested in this paper.
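
The pipeline described in the abstract (random forest proximity, then PAM, then silhouette-based choice of k) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes numeric data with no missing values, builds the forest in the usual unsupervised way by contrasting the real rows with column-permuted synthetic rows, takes dissimilarity as 1 - proximity, and uses a plain swap-based PAM written here for illustration. The function names `rf_dissimilarity` and `pam`, the parameter values, and the synthetic demo data are all hypothetical and are not taken from the paper or the stroke dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import silhouette_score


def rf_dissimilarity(X, n_trees=200, seed=0):
    """Unsupervised random forest dissimilarity (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Synthetic contrast data: permute each column independently, which keeps
    # the marginal distributions but breaks the dependence between features.
    X_synth = np.column_stack([rng.permutation(X[:, j]) for j in range(X.shape[1])])
    X_all = np.vstack([X, X_synth])
    y_all = np.concatenate([np.ones(n), np.zeros(n)])
    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(X_all, y_all)
    # leaves[i, t] is the leaf index reached by real observation i in tree t.
    leaves = forest.apply(X)
    # Proximity: fraction of trees in which two observations share a leaf.
    proximity = np.zeros((n, n))
    for t in range(leaves.shape[1]):
        proximity += leaves[:, t][:, None] == leaves[:, t][None, :]
    proximity /= leaves.shape[1]
    # Assumption: dissimilarity is taken as 1 - proximity.
    return 1.0 - proximity


def pam(D, k, seed=0, max_iter=100):
    """Swap-based k-medoids on a precomputed dissimilarity matrix D."""
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    medoids = [int(m) for m in rng.choice(n, size=k, replace=False)]
    cost = D[:, medoids].min(axis=1).sum()
    for _ in range(max_iter):
        best_cost, best_medoids = cost, None
        # Try every (medoid, non-medoid) swap and keep the best improvement.
        for i in range(k):
            for h in range(n):
                if h in medoids:
                    continue
                trial = medoids.copy()
                trial[i] = h
                trial_cost = D[:, trial].min(axis=1).sum()
                if trial_cost < best_cost:
                    best_cost, best_medoids = trial_cost, trial
        if best_medoids is None:
            break  # no swap lowers the total cost
        cost, medoids = best_cost, best_medoids
    labels = D[:, medoids].argmin(axis=1)
    return np.array(medoids), labels


# Synthetic demo data (two Gaussian groups), NOT the stroke dataset.
rng = np.random.default_rng(42)
X_demo = np.vstack([
    rng.normal(0.0, 1.0, size=(30, 4)),
    rng.normal(4.0, 1.0, size=(30, 4)),
])
D_demo = rf_dissimilarity(X_demo, n_trees=100)

# Compare k = 2..5 by average silhouette width, as in the abstract.
scores = {}
for k in range(2, 6):
    _, labels = pam(D_demo, k)
    scores[k] = silhouette_score(D_demo, labels, metric="precomputed")
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

An average silhouette width close to 0, such as the 0.1 reported above, means that observations are on average about as close to the neighbouring cluster as to their own, so even the best-scoring k should be read with caution.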

Item Type: Proceedings
Keyword: Clustering, Random forest, Stroke, Unsupervised learning, PAM
Subjects: Q Science > QA Mathematics > QA1-939 Mathematics > QA71-90 Instruments and machines > QA75.5-76.95 Electronic computers. Computer science > QA76.75-76.765 Computer software
Department: FACULTY > Faculty of Science and Natural Resources
Depositing User: DG MASNIAH AHMAD
Date Deposited: 31 Mar 2022 14:18
Last Modified: 31 Mar 2022 14:18
URI: https://eprints.ums.edu.my/id/eprint/32160
