High-throughput data analysis
DNA microarrays rely on the specificity of hybridization between complementary nucleic acid sequences: DNA fragments (termed probes) immobilized on a solid surface and labeled RNA fragments isolated from biological samples of interest. A typical DNA microarray consists of thousands of ordered sets of DNA fragments on a glass slide, filter, or silicon wafer. After hybridization, the signal intensity of each individual probe should correlate with the abundance of the labeled RNA complementary to that probe.
Microarray platforms
DNA microarrays fall into two types based on the DNA fragments used to build the array: complementary DNA (cDNA) arrays and oligonucleotide arrays. Although a number of subtypes exist for each array type, spotted cDNA arrays and Affymetrix oligonucleotide arrays are the major platforms currently in use. The choice of microarray platform depends on research needs, cost, available expertise, and accessibility.
For cDNA arrays, cDNA probes, which are usually generated by polymerase chain reaction (PCR) amplification of cDNA clone inserts (representing genes of interest), are robotically spotted on glass slides or filters. The immobilized cDNA probe sequences may range greatly in length, but are usually much longer than the corresponding oligonucleotide probes. The major advantage of cDNA arrays is the flexibility to design a custom array for a specific purpose. Numerous genes can be screened rapidly, which allows functional hypotheses to be developed quickly without any a priori assumptions. In addition, cDNA arrays typically cost approximately one-fourth as much as commercial oligonucleotide arrays. Flexibility and lower cost initially made cDNA arrays popular in academic research laboratories.
However, the major disadvantage of these arrays is the large amount of total input RNA needed. It is also difficult to have complete control over the design of the probe sequences. cDNA is generated by reverse transcriptase, an RNA-dependent DNA polymerase, which, like all DNA polymerases, cannot initiate synthesis de novo but requires a primer. It is therefore difficult to generate comprehensive coverage of all genes in a cell. Furthermore, managing large clone libraries, together with the relational database infrastructure needed for record keeping, sequence verification, and data extraction, is a challenge for most laboratories.
For oligonucleotide arrays, probes consist of short segments of DNA complementary to the RNA transcripts of interest and are synthesized directly on the surface of a silicon wafer. Compared with cDNA arrays, oligonucleotide arrays generally provide greater gene coverage, more consistency, and better quality control of the immobilized sequences. Other advantages include uniformity of probe length, the ability to discriminate gene splice variants, and the availability of carefully designed standard operating procedures. Another advantage particular to Affymetrix arrays is the ability to recover samples after hybridization to an array. This feature makes Affymetrix arrays attractive in situations where the amount of available tissue is limited. However, a major disadvantage is the high cost of the arrays.
Following hybridization, the array image is processed to obtain the hybridization signals. There are two different ways to measure signal intensity. In the two-color fluorescence hybridization scheme, RNA from the experimental and control samples (referred to as target RNAs) is differentially labeled with fluorescent dyes (Cy5, red, vs. Cy3, green) and hybridized to the same array. When the region of the probe is illuminated, both the experimental and control target RNAs fluoresce, and the relative balance of red versus green fluorescence indicates the relative expression levels of the experimental and control target RNAs. Therefore, gene expression values are reported as ratios between the two fluorescence values.
Affymetrix oligonucleotide arrays use a one-color fluorescence hybridization system in which experimental RNA is labeled with a single fluorescent dye and hybridized to an oligonucleotide array. After hybridization, the fluorescence intensity from each spot on the array provides a measurement of the abundance of the corresponding target RNA. A second array is then hybridized to the control RNA, allowing calculation of expression differences.
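The two measurement schemes described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not a platform-specific method: the function name `log2_ratios`, the `floor` parameter, and the intensity values are all hypothetical. It also assumes that intensities are already background-corrected, that the experimental sample carries the red (Cy5) label in the two-color case, and that the two one-color arrays have already been brought to a comparable scale. Ratios are taken on a log2 scale, a common convention that treats up- and down-regulation symmetrically.

```python
import math


def log2_ratios(numerator, denominator, floor=1.0):
    """Return the per-probe log2(numerator / denominator).

    Both arguments are equal-length sequences of probe intensities.
    Intensities below `floor` are raised to `floor` so that the ratio
    and its logarithm are always defined. The floor value is an
    assumption of this sketch, not a platform default.
    """
    if len(numerator) != len(denominator):
        raise ValueError("both inputs must cover the same probes")
    return [
        math.log2(max(n, floor) / max(d, floor))
        for n, d in zip(numerator, denominator)
    ]


# Two-color scheme: both channels are read from the SAME array.
# Hypothetical intensities for three probes.
red_cy5 = [800.0, 200.0, 400.0]     # experimental sample (assumed labeling)
green_cy3 = [200.0, 200.0, 1600.0]  # control sample (assumed labeling)
two_color = log2_ratios(red_cy5, green_cy3)
print(two_color)  # [2.0, 0.0, -2.0]

# One-color scheme: experimental and control RNA are hybridized to
# TWO separate arrays, and the per-probe intensities are compared.
experimental_array = [1200.0, 300.0]
control_array = [300.0, 300.0]
one_color = log2_ratios(experimental_array, control_array)
print(one_color)  # [2.0, 0.0]
```

A value of 2.0 corresponds to a four-fold higher signal in the experimental sample, 0.0 to no difference, and -2.0 to a four-fold lower signal. In the one-color case the comparison is only meaningful once the two arrays have been normalized to a common scale.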
Because Affymetrix array screening generally follows a standard protocol, results from different experiments in different laboratories can theoretically be combined. Following image processing, the digitized gene expression data need to be pre-processed for data normalization before carrying out further analyses.