Bootstrapping cluster analysis: assessing the reliability of conclusions from microarray experiments.

Cluster-Analysis, Models-Statistical, Oligonucleotide-Array-Sequence-Analysis, SUPPORT-NON-U-S-GOVT, SUPPORT-U-S-GOVT-P-H-S

JAX Source

Proc Natl Acad Sci USA 2001 Jul; 98(16):8961-5.


CA88327/CA/NCI, HL66620/HL/NHLBI


We introduce a general technique for making statistical inference from clustering tools applied to gene expression microarray data. The approach utilizes an analysis of variance model to achieve normalization and estimate differential expression of genes across multiple conditions. Statistical inference is based on the application of a randomization technique, bootstrapping. Bootstrapping has previously been used to obtain confidence intervals for estimates of differential expression for individual genes. Here we apply bootstrapping to assess the stability of results from a cluster analysis. We illustrate the technique with a publicly available data set and draw conclusions about the reliability of clustering results in light of variation in the data. The bootstrapping procedure relies on experimental replication. We discuss the implications of replication and good design in microarray experiments.
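
The idea of bootstrapping a clustering result can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes several simplifying assumptions. Per-gene, per-condition means stand in for the full analysis of variance model. Residuals are pooled across all genes. The "clustering" step simply assigns each gene to the nearest of a few fixed template profiles. All function names, the toy data, and the templates are hypothetical.

```python
import random

def fit_means(data):
    """data[g][c] is a list of replicate log-expression values.
    Returns the per-gene, per-condition mean (a simplified stand-in
    for fitted values from an ANOVA model)."""
    return [[sum(reps) / len(reps) for reps in gene] for gene in data]

def pooled_residuals(data, means):
    """Observation minus fitted mean, pooled over genes and conditions."""
    return [y - means[g][c]
            for g, gene in enumerate(data)
            for c, reps in enumerate(gene)
            for y in reps]

def nearest_profile(profile, templates):
    """Index of the template closest in squared Euclidean distance."""
    dists = [sum((p - t) ** 2 for p, t in zip(profile, tpl))
             for tpl in templates]
    return dists.index(min(dists))

def bootstrap_stability(data, templates, n_boot=200, seed=0):
    """For each gene, return its original assignment and the fraction of
    bootstrap data sets in which it receives that same assignment."""
    rng = random.Random(seed)
    means = fit_means(data)
    pool = pooled_residuals(data, means)
    original = [nearest_profile(m, templates) for m in means]
    agree = [0] * len(data)
    for _ in range(n_boot):
        for g, gene in enumerate(data):
            boot_profile = []
            for c, reps in enumerate(gene):
                # Simulated replicates: fitted mean plus residuals
                # drawn with replacement from the pooled residuals.
                sim = [means[g][c] + rng.choice(pool) for _ in reps]
                boot_profile.append(sum(sim) / len(sim))
            if nearest_profile(boot_profile, templates) == original[g]:
                agree[g] += 1
    return original, [a / n_boot for a in agree]

# Toy data: 3 genes, 3 conditions, 2 replicates each (made-up values).
data = [
    [[0.0, 0.1], [1.0, 0.9], [2.0, 2.1]],  # clearly increasing
    [[2.0, 2.1], [1.0, 0.9], [0.0, 0.1]],  # clearly decreasing
    [[1.0, 1.2], [0.9, 1.1], [1.1, 0.9]],  # nearly flat, ambiguous
]
templates = [[0.0, 1.0, 2.0], [2.0, 1.0, 0.0]]  # "up" and "down" profiles

original, stability = bootstrap_stability(data, templates)
for g, (cluster, frac) in enumerate(zip(original, stability)):
    print(f"gene {g}: cluster {cluster}, stable in {frac:.0%} of bootstraps")
```

In this sketch the two genes with clear trends keep their assignment in every bootstrap sample, while the nearly flat gene can switch templates, which is the kind of instability the bootstrap is meant to expose. Because the residuals come from differences among replicates, the procedure has nothing to resample without experimental replication.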