Multi-stage filtering for improving confidence level and determining dominant clusters in clustering algorithms of gene expression data

Bibliographic Details
Main Authors: Kasim, Shahreen; Deris, Safaai; M. Othman, Razib
Format: Article
Published: Elsevier 2013
Online Access: http://www.journals.elsevier.com/computers-in-biology-and-medicine
http://eprints.uthm.edu.my/4108/1/shahreen_kasim.pdf
Description
Summary: A drastic improvement in the analysis of gene expression has led to new discoveries in bioinformatics research. Fuzzy clustering algorithms are widely used to analyse gene expression data. However, the results of these algorithms may lead to confusion in hypotheses about the dominant function of genes of interest. In addition, current fuzzy clustering algorithms do not conduct a thorough analysis of genes with low membership values. Therefore, we present a novel computational framework called the "multi-stage filtering-Clustering Functional Annotation" (msf-CluFA) for clustering gene expression data. The framework consists of four components: fuzzy c-means clustering (msf-CluFA-0), determination of dominant clusters (msf-CluFA-1), improvement of the confidence level (msf-CluFA-2), and a combination of msf-CluFA-0, msf-CluFA-1 and msf-CluFA-2 (msf-CluFA-3). By employing double filtering in msf-CluFA-1 and Apriori algorithms in msf-CluFA-2, our new framework is capable of determining the dominant clusters and improving the confidence level of genes with lower membership values, so that unknown genes can be predicted.
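To make the starting point of the framework concrete, the following is a minimal sketch of standard fuzzy c-means, the clustering step the summary names as msf-CluFA-0. It is not the authors' implementation. The toy expression profiles, the function names `fuzzy_c_means` and `dominant_clusters`, the fuzzifier `m = 2.0`, and the 0.7 membership threshold are all assumptions chosen for illustration, and the final step merely flags genes whose highest membership is low. The double filtering and Apriori stages described in the summary are not reproduced here.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Textbook fuzzy c-means; returns (centers, memberships)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Centers are means of the points weighted by membership**m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )
        # Memberships are proportional to distance**(-2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            dists = [math.dist(points[i], centers[j]) for j in range(c)]
            if any(d == 0.0 for d in dists):
                # A point sitting exactly on a center belongs fully to it.
                inv = [1.0 if d == 0.0 else 0.0 for d in dists]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            new_row = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:
            break
    return centers, u


def dominant_clusters(u, threshold=0.7):
    """For each row: (dominant cluster index, its membership, low-confidence flag)."""
    result = []
    for row in u:
        best = max(range(len(row)), key=row.__getitem__)
        result.append((best, row[best], row[best] < threshold))
    return result


# Hypothetical two-condition expression profiles (illustrative values only).
genes = {
    "g1": [0.0, 0.1],
    "g2": [0.1, 0.0],
    "g3": [5.0, 5.1],
    "g4": [5.1, 5.0],
    "g5": [2.5, 2.6],  # lies between the two groups
}

centers, u = fuzzy_c_means(list(genes.values()), c=2)
labels = dict(zip(genes, dominant_clusters(u)))

for name, (cluster, membership, low) in labels.items():
    note = "low membership" if low else "dominant"
    print(f"{name}: cluster {cluster}, membership {membership:.3f} ({note})")
```

In this toy data, a gene such as `g5` that sits between two groups receives a split membership and is flagged, which is the kind of low-membership case the summary says existing fuzzy clustering algorithms do not analyse thoroughly.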