Biometrics. 2005 Mar;61(1):10-6.
Tight clustering: a resampling-based approach for identifying stable and tight patterns in data.

Tseng GC, Wong WH.

Department of Biostatistics and Department of Human Genetics, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, USA. ctseng@pitt.edu

In this article, we propose a method for clustering that produces tight and stable clusters without forcing all points into clusters. The methodology is general but was initially motivated by cluster analysis of microarray experiments. Most current algorithms aim to assign all genes into clusters. For many biological studies, however, we are mainly interested in identifying the most informative, tight, and stable clusters of, say, 20-60 genes for further investigation. We want to avoid the contamination of tightly regulated expression patterns of biologically relevant genes by other genes whose expressions are only loosely compatible with these patterns. "Tight clustering" has been developed specifically to address this problem. It applies K-means clustering as an intermediate clustering engine. Early truncation of a hierarchical clustering tree is used to overcome the local minimum problem in K-means clustering. The tightest and most stable clusters are identified in a sequential manner through an analysis of the tendency of genes to be grouped together under repeated resampling. We validated this method in a simulated example and applied it to analyze a set of expression profiles in the study of embryonic stem cells.
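
The resampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It repeatedly clusters a random subsample with K-means, assigns every point to its nearest centroid, and records how often each pair of points lands in the same cluster. All function names and parameters (`comembership`, `stable_group`, `frac`, `alpha`) are hypothetical. The sketch also seeds K-means with a simple farthest-point rule, whereas the abstract describes seeding from a truncated hierarchical clustering tree.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))


def maximin_init(points, k):
    """Farthest-point seeding (an assumption made for this sketch; the
    abstract describes seeding from a truncated hierarchical tree)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, iters=50):
    """Plain Lloyd iterations; returns the final centroids."""
    centroids = maximin_init(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [
            tuple(sum(x) / len(g) for x in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def comembership(points, k, n_resamples=20, frac=0.7, seed=0):
    """Fraction of resamples in which each pair of points shares a cluster."""
    rng = random.Random(seed)
    n = len(points)
    counts = [[0] * n for _ in range(n)]
    for _ in range(n_resamples):
        sample = rng.sample(points, int(frac * n))   # subsample, no replacement
        centroids = kmeans(sample, k)
        labels = [nearest(p, centroids) for p in points]  # classify ALL points
        for i in range(n):
            for j in range(n):
                counts[i][j] += labels[i] == labels[j]
    return [[c / n_resamples for c in row] for row in counts]


def stable_group(D, start, alpha=0.1):
    """Greedily collect points whose co-membership with every current member
    is at least 1 - alpha (hypothetical helper, not the paper's procedure)."""
    group = [start]
    for j in range(len(D)):
        if j != start and all(D[j][m] >= 1 - alpha for m in group):
            group.append(j)
    return group
```

Calling `comembership(points, k=2)` on a list of coordinate tuples returns a square matrix with entries between 0 and 1. Entries close to 1 mark pairs that stay together regardless of which subsample was drawn, and `stable_group` reads one candidate group off that matrix. Points with no consistently high co-membership can simply be left unassigned, which matches the stated goal of not forcing every gene into a cluster.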

PMID: 15737073 [PubMed - indexed for MEDLINE]
