Tight clustering: a resampling-based approach for identifying stable and tight patterns in data.
Tseng GC, Wong WH.
Department of Biostatistics and Department of Human
Genetics, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, USA.
ctseng@pitt.edu
In this article, we propose a method for
clustering that produces tight and stable clusters without forcing all
points into clusters. The methodology is general but was initially
motivated by cluster analysis of microarray experiments. Most current
algorithms aim to assign all genes into clusters. For many biological
studies, however, we are mainly interested in identifying the most
informative, tight, and stable clusters of, say, 20-60 genes for
further investigation. We want to avoid the contamination of tightly
regulated expression patterns of biologically relevant genes due to
other genes whose expressions are only loosely compatible with these
patterns. "Tight clustering" has been developed specifically to address
this problem. It applies K-means clustering as an intermediate
clustering engine. Early truncation of a hierarchical clustering tree is
used to overcome the local minimum problem in K-means clustering. The
tightest and most stable clusters are identified in a sequential manner
through an analysis of the tendency of genes to be grouped together
under repeated resampling. We validated this method in a simulated
example and applied it to analyze a set of expression profiles in the
study of embryonic stem cells.
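
The resampling idea described above can be illustrated with a small sketch. This is not the authors' algorithm: it assumes plain K-means with random starting centers (the abstract instead uses early truncation of a hierarchical clustering tree to start K-means), and the function names `kmeans`, `comembership`, and `stable_groups`, along with the parameters `frac`, `alpha`, and `min_size`, are hypothetical choices made for illustration. It clusters repeated subsamples, records how often each pair of points lands in the same cluster, and then pulls out groups sequentially, leaving loosely attached points unassigned.

```python
import numpy as np


def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd's K-means with random initial centers (an assumption;
    the abstract describes a hierarchical-tree-based initialization)."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Keep the old center if a cluster happens to become empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers


def comembership(X, k, n_resamples=20, frac=0.7, seed=0):
    """Average co-membership matrix: entry (i, j) is the fraction of
    resamples in which points i and j fall in the same cluster."""
    rng = np.random.default_rng(seed)
    n = len(X)
    total = np.zeros((n, n))
    for _ in range(n_resamples):
        # Fit centers on a random subsample, then classify every point.
        idx = rng.choice(n, size=int(frac * n), replace=False)
        centers = kmeans(X[idx], k, rng)
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        total += labels[:, None] == labels[None, :]
    return total / n_resamples


def stable_groups(D, alpha=0.9, min_size=5):
    """Sequentially extract groups of points whose co-membership with a
    seed point is at least alpha; remaining points stay unassigned."""
    unassigned = np.ones(len(D), dtype=bool)
    groups = []
    while True:
        strong = (D >= alpha) & unassigned[None, :] & unassigned[:, None]
        sizes = strong.sum(axis=1)
        seed_point = int(sizes.argmax())
        if sizes[seed_point] < min_size:
            break
        members = np.flatnonzero(strong[seed_point])
        groups.append(members)
        unassigned[members] = False
    return groups
```

Calling `stable_groups(comembership(X, k=3))` on an expression matrix `X` (rows as genes) would return a list of index arrays. Genes that appear in none of them are the ones this sketch declines to cluster, which mirrors the abstract's goal of not forcing all points into clusters.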
PMID: 15737073 [PubMed - indexed for MEDLINE]