Affycomp III

A Benchmark for Affymetrix GeneChip Expression Measures

Sponsored by: The Hopgene Program for Genomic Applications and The JHU Department of Biostatistics

Results as of August 7, 2003 presented at: The 2003 Affymetrix GeneChip Microarray Low-Level Workshop


The defining feature of oligonucleotide expression arrays is the use of several probes to assay each targeted transcript. This is a bonanza for the statistical geneticist, who can create probeset summaries with specific characteristics.

There are now several methods available for summarizing probe level data from the popular Affymetrix GeneChips, and it can be difficult to identify the method best suited to a given inquiry.

We have developed a graphical tool to evaluate summaries of Affymetrix probe level data. Plots and summary statistics offer a picture of how an expression measure performs in several important areas. This picture facilitates the comparison of competing expression measures and the selection of methods suitable for a specific investigation. The key is a benchmark consisting of one or two spike-in studies and, optionally, a dilution study (details below). Because the truth is known for these data, we can identify statistical features for which the expected outcome is known in advance. The features highlighted in our suite of graphs are justified by questions of biological interest and motivated by the availability of appropriate data.

In conjunction with the release of a graphics toolbox as part of the Bioconductor Project, we have created this web-based tool.

We invite all interested parties to put their probe summary methods to the test in a friendly competition. See the submission form below. Download the benchmark data and develop one or more probe summaries. Return the expression-level data, and we'll tell you how you did on this set of tasks. The new assessments (and the original assessments) show how everyone is doing.

Summaries need not be serious attempts at a complete expression measure. The submission form contains a check-box for exclusion from the competition. If you are interested in normalization, run competing normalization procedures, take a simple average over the probes in each set, and see how the different methods do. The goal is threefold. In addition to vetting the toolbox and competing for bragging rights, this will be an opportunity to systematically examine the strengths and weaknesses of the various approaches to probeset summary.
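
As a rough illustration of the "simple average over probes in a set" idea, the sketch below averages probe intensities within each probeset, array by array. The function name, the probe and probeset identifiers, and the in-memory data layout are all hypothetical; they are not part of the benchmark data or the Bioconductor toolbox.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_mean(probe_intensities, probe_to_set):
    """Average probe intensities within each probeset, one value per array.

    probe_intensities: dict of probe id -> list of intensities (one per array).
    probe_to_set: dict of probe id -> probeset id.
    Both layouts are assumptions made for this sketch.
    """
    grouped = defaultdict(list)
    for probe_id, values in probe_intensities.items():
        grouped[probe_to_set[probe_id]].append(values)
    # zip(*rows) walks the arrays (columns); mean() averages over the probes.
    return {
        set_id: [mean(column) for column in zip(*rows)]
        for set_id, rows in grouped.items()
    }


# Toy example: three probes, two arrays, two probesets (made-up values).
toy_intensities = {
    "p1": [100.0, 200.0],
    "p2": [300.0, 400.0],
    "p3": [50.0, 70.0],
}
toy_mapping = {"p1": "setA", "p2": "setA", "p3": "setB"}
toy_summary = summarize_by_mean(toy_intensities, toy_mapping)
```

Running different normalization procedures before this step and submitting each result would isolate the effect of normalization, since the summary step stays fixed.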

For more details, read the manuscript [pdf].

Data and instructions

  1. Download the spike-in and (optionally) dilution data sets.

  2. Obtain your expression measures (in original scale, NOT log scale), for any or all of the data types above, and write each as a comma-delimited text file, as follows:

  3. Choose a "unique" (original) nickname for your method, and submit your csv files and contact information using the form below. You need not submit all 3 files; any subset is permissible.

  4. Be patient. Uploading takes time, depending on internet connection, etc. You will be informed when the upload is complete (see sample Upload results page). Then the assessments begin and may take 1-2 minutes. See "Real-time log of assessment" and "Summary of assessment" on your Upload results page to monitor the progress.

  5. When all assessments are complete, you will get back via the "Summary" link several reports (e.g. one assessing your submission, another comparing it to MAS 5.0), depending on which studies you submitted.

  6. We will also keep some bottom-line results to put in one or more tables comparing all submissions. NOTE: If you do not want your results to appear in these tables, check the NO-COMP box in the submission form.
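
Step 2 asks for expression measures in original scale, written as comma-delimited text. The sketch below shows one way to back-transform log2 values and write a csv file. The exact file layout required by the benchmark is not reproduced in this text, so the layout used here (a header row of array names, then one row per probeset) is an assumption, as are the function names and the example identifiers.

```python
import csv
import io


def to_original_scale(log2_values):
    """Back-transform log2 expression values to the original scale."""
    return [2.0 ** value for value in log2_values]


def write_expression_csv(handle, array_names, expression_by_probeset):
    """Write one row per probeset, one column per array.

    This layout is an assumption for illustration; check the benchmark's
    own format description before submitting.
    """
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["probeset"] + list(array_names))
    for probeset_id, values in expression_by_probeset.items():
        writer.writerow([probeset_id] + list(values))


# Toy example with made-up log2 values for a single probeset.
buffer = io.StringIO()
write_expression_csv(
    buffer,
    ["array1", "array2"],
    {"setA": to_original_scale([3.0, 10.0])},
)
csv_text = buffer.getvalue()
```

Writing to a file instead of a string buffer only requires passing a handle opened with `open(path, "w", newline="")`.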

