TY - JOUR AU - Dieckgraefe, Brian K. AB - Summary Identification of factors involved in the initiation, amplification, and perpetuation of the chronic immune response and the identification of markers for the characterization of patient subgroups remain critical objectives for ongoing research in inflammatory bowel disease (IBD). The Human Genome Project and the development of the expressed sequence tag (EST) clone collection and database have made possible a new revolution in gene expression analysis. Instead of measuring one or a few genes, parallel DNA microarrays are capable of simultaneously measuring expression of thousands of genes, providing a glimpse into the logic and functional grouping of gene programs encoded by our genome. Applied to clinical specimens from affected and normal individuals, this methodology has the potential to provide a new level of information about disease pathogenesis not previously possible. Two dominant platforms for the construction of high-density microarrays have emerged: cDNA arrays and GeneChips. The first involves robotic spotting of DNA molecules, often derived from EST clone collections, onto a suitable solid phase matrix such as a glass slide. The second involves direct in situ synthesis of sets of gene-specific oligonucleotides on a silicon wafer by an elegant derivative of the photolithography process. Both cDNA and oligonucleotide arrays are interrogated by hybridization with a fluorescent-labeled cDNA or cRNA representation of the original tissue mRNA. This enables measurement of the expression levels for thousands of mucosal genes in a single experiment. These technologies have recently become less expensive and more widely accessible to researchers. This review details the principles and methods behind DNA array technology, data analysis and mining, and potential application to research and treatment of IBD. 
Microarray, Oligonucleotide arrays, Gene expression, Ulcerative colitis, Crohn's disease, Inflammatory bowel disease Introduction Progression from normal mucosal physiology to disease is accompanied by complex anatomic and biochemical alterations in association with corresponding changes in gene expression. Traditional techniques in molecular biology have provided a means to measure the expression of these genes individually. The new field of functional genomics uses techniques such as cDNA microarrays or GeneChips to provide a more comprehensive picture of the gene expression underlying a physiological process. In this manner, functional genomics provides an opportunity to study and appreciate programmed genetic responses and complex gene interrelationships. Contrasted with traditional serial techniques in molecular biology, this can be likened to the difference between listening to a complete symphony and hearing only the contribution of one instrument at a time. Completion of the draft sequence from the public Human Genome Project (1) and the parallel development of the expressed sequence tag (EST) clone collection and database (2) have made this new revolution in the field of genomics possible. EST libraries are composed of cDNA clones representative of the population of mRNAs expressed by a particular tissue. Individual EST clones may be known genes or novel genes. By necessity and design, application of functional genomics to the study of inflammatory bowel disease (IBD) is a discovery-driven process, owing in large part to our limited understanding of the function of so many genes. Undoubtedly, many genes involved in the pathogenesis of Crohn's disease (CD) and ulcerative colitis (UC) have yet to be identified. A number of powerful techniques in molecular biology have been developed for the purpose of identification and measurement of differentially expressed genes. 
Many familiar techniques are well suited for serial analysis and/or for single specimens, such as serial analysis of gene expression and differential display. However, array-based technologies are more amenable to the analysis of multiple samples (3–12). Basic Overview: What is a DNA Array? A DNA array is a tool that allows a simultaneous wide survey of gene expression. In this regard, an array reflects a simple parallel “scaling up” of standard hybridization techniques applied in molecular biology. In its simplest form, a DNA array is an orderly arrangement of individual polynucleotide sequences, each specific for a single gene, tethered onto a solid support. For the construction of these arrays, two leading technologies have emerged: immobilization of purified cDNA inserts to glass slide microarrays (11) and in situ synthesis of oligonucleotides directly on silica wafers (chips) (13). The latter is available commercially as the Affymetrix GeneChip (Santa Clara, CA, U.S.A.). Individual locations on the solid support (referred to as “features” on GeneChips) are spatially organized in a grid-like fashion and are linked to information corresponding to the identity of the gene sequence contained at that location. Gene sequences contained on the array can be chosen to represent the entire expressed genome or some selected subset of genes (e.g., genes related to inflammation or genes related to apoptosis). The term “microarray” typically has been applied to arrays containing sample spots or features 200 μm or smaller. Progressive reductions in feature size have permitted the generation of slides containing tens of thousands of individual polynucleotide elements. This solid phase matrix thus provides a means for capturing labeled complementary sequences in solution by hybridization. Is that the Target or the Probe? 
One difficulty facing the researcher reading microarray literature is the existence of conflicting nomenclature systems used to define hybridization partners: specifically, what constitutes the probe and the target. Although one tends to associate a labeled sequence with the term “probe,” the opposite is true in the case of the preferred microarray nomenclature. In the simplest sense, all hybridization reactions involve the use of a known polynucleotide sequence or “probe” to identify an unknown polynucleotide sequence or “target” by binding to its complementary sequence. One sequence, either the defined polynucleotide or the unknown, is usually tethered to a solid phase, while the other is free in solution. In conventional techniques like northern blot analysis, the defined sequence or “probe” is in solution, whereas the unknown “target” sequence(s) is immobilized on a solid phase matrix or membrane. Microarrays are designed in a manner similar to nuclear runoff assays, where this strategy is reversed. In these cases, the defined polynucleotide “probe” sequences are tethered to the solid phase, while the labeled targets (unknown polynucleotide sequences) are in the solution phase. For this review, we have adopted the preferred microarray nomenclature recommended by Phimister (14), where probe refers to the multiple defined individual nucleic acid sequences spatially organized and bound to the array matrix and the target sequences are free in solution and labeled. Basic Steps Involved in Using Arrays RNA is isolated from a tissue sample (e.g., UC mucosa) and used to generate labeled cRNA or cDNA (the target). The label may be a direct fluorescent tag or a chemical group that can be later labeled with a fluorescent group. The amount of each labeled target, representative of a single gene, is proportionate to the amount of mRNA for that gene in the original sample. Next, mixtures of labeled targets are incubated with the microarray or GeneChip. 
Labeled targets will hybridize to their specific complementary sequences contained at specific locations on the array. The fluorescence at each spot or feature on the microarray or chip is then measured. The amount of fluorescence is proportional to the number of copies of cRNA or cDNA bound to each location. Arrays constructed containing oligonucleotide or cDNA representation for thousands of genes on a single microscope slide or chip can therefore provide an impressive global perspective on events involved in the development and persistence of the immune response in IBD. This process is shown schematically in Figs. 1 and 2. The sheer volume and complexity of data generated by a typical array experiment, poorly accommodated by traditional analytical approaches, have also led to the parallel development of powerful new bioinformatic tools for the analysis of these unique mucosal expression profiles in microarray data. 
Fig. 1. Gene expression analysis using cDNA microarrays. In this illustration, individual steps involved with making cDNA arrays are shown in the upper left-hand box. After selection of an appropriate clone collection, cDNA inserts contained within individual plasmids are amplified by polymerase chain reaction (PCR). Products are then purified to remove salts and PCR primers and undergo validation by agarose gel electrophoresis. Final amplified products are spotted onto duplicate slides by direct contact using a robotic printer. cDNAs are covalently cross-linked to positively charged amine groups on the slide surface by ultraviolet irradiation. During the final step, cross-linked double-stranded cDNAs are denatured with heat or alkali. In preparation for interrogating the cDNA arrays, mRNA samples isolated from both experimental and reference samples are converted to cDNA in a reverse transcription reaction (upper right). An example of an experimental and reference sample would be mRNA isolated from monocytes stimulated with interleukin-1 (experimental sample) and from nonstimulated monocytes (reference sample). Fluorescent-labeled cDNA targets can be generated directly by reverse transcription in the presence of fluorescent nucleotide precursors. Alternative approaches for indirect labeling are discussed in the text. To facilitate comparison between the expression of each gene in the experimental and reference samples, fluorophores with distinct emission spectra are used to label each sample. Fluorescent-labeled targets are mixed together and hybridized to a single cDNA microarray. Hybridization reactions are conducted under conditions of a large excess of immobilized probe relative to the amount of labeled target added. Under stringent conditions, labeled cDNA sequences hybridize specifically with their corresponding complement (probe) sequence on the array. Less specifically bound cDNA sequences are removed by washing, and the array is subjected to two-color laser confocal microscopy. With use of appropriate filters, the fluorescent emission for each fluorophore is measured separately. Monochrome images generated by the scanner are pseudo-colored and merged for visualization. A grid specifying the location of individual probe sequences on the array is overlaid, and the fluorescence of both fluorophores at each location is measured and corrected by subtraction of the local background. To adjust for slight differences between the labeled targets, one of several normalization techniques is applied. While the normalized fluorescence at each spot is proportionate to the relative abundance of the original mRNA transcript, results are generally expressed as a fold change of the experimental sample relative to the reference. 
Fig. 2. Analysis of gene expression using Affymetrix oligonucleotide arrays. Oligonucleotide arrays are fabricated at Affymetrix using photolithography-directed in situ synthesis. This method involves the site-specific removal of a photoactive group by light projected through a photolithographic mask onto the surface of a silicon wafer. The series of photolithographic masks passed over the wafer define the site-specific removal of photoactive groups, generating 5´ hydroxy groups that are capable of coupling to the next added nucleoside. Individual genes are represented on the array by a set of 16–20 features, each containing a different 25-mer “perfect match” oligonucleotide sequence, selected for sequence complementarity and specificity for a particular gene. Along with each set of perfect match features, a paired set of mismatch features is also represented directly beneath the perfect match features. Oligonucleotide sequences used for the mismatch probe set are identical to the perfect match probe set except that they contain a single-base substitution in the central position. In contrast to cDNA arrays, Affymetrix GeneChips are designed to monitor expression from a single sample. mRNA derived from tissue or cells is converted to double-stranded cDNA by reverse transcription using an oligo-d(T) primer engineered to contain a T7 bacteriophage RNA polymerase promoter. Biotin-labeled nucleotides (bio-NTPs) are directly incorporated into cRNA target by in vitro transcription (IVT). Labeled cRNA targets are fragmented and hybridized to the GeneChip. Arrays are stained using a streptavidin/phycoerythrin conjugate that binds with high affinity to the biotin-labeled nucleotides incorporated into the target and are analyzed by confocal laser scanning. The specific hybridization signal for each individual gene is determined by subtraction of mismatch fluorescence from the perfect match fluorescence, averaged across all the probe pairs representing a particular gene. This strategy substantially reduces the contribution of background and cross-hybridization. 
Although microarray-based gene expression profiling has only recently been applied to the study of IBD (8,15,16), there are several important areas where this technology is likely to rapidly impact our understanding of the pathophysiology and treatment of this disease process. In this review, we will discuss the currently available options for performing genome-wide gene expression analyses and discuss individual steps involved in performing an array experiment, starting with preparation of sample RNA and ending with array scanning, analysis, and data interpretation. We will also discuss general areas where microarrays can be applied to address critical issues in IBD treatment and research. For this last section, we borrow heavily from proven concepts utilizing arrays in the investigation of other disease paradigms. 
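The quantification steps summarized in the legend to Fig. 1 (local background subtraction, normalization, and expression as a fold change relative to the reference) can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions and not the procedure of any particular analysis package: the function names, the `floor` parameter, and the choice of median centering of log2 ratios as the normalization step are all hypothetical, since the text states only that one of several normalization techniques is applied.

```python
import math
from statistics import median


def log2_ratios(exp_fg, exp_bg, ref_fg, ref_bg, floor=1.0):
    """Subtract local background per channel and return log2(exp/ref) per spot.

    `floor` (an assumption of this sketch) keeps background-corrected
    intensities positive so that the logarithm is defined.
    """
    ratios = []
    for ef, eb, rf, rb in zip(exp_fg, exp_bg, ref_fg, ref_bg):
        exp_signal = max(ef - eb, floor)
        ref_signal = max(rf - rb, floor)
        ratios.append(math.log2(exp_signal / ref_signal))
    return ratios


def median_center(log_ratios):
    """Assumed normalization: shift log ratios so that their median is zero."""
    center = median(log_ratios)
    return [value - center for value in log_ratios]


def fold_changes(log_ratios):
    """Convert normalized log2 ratios back to fold change versus the reference."""
    return [2.0 ** value for value in log_ratios]


# Made-up intensities for four spots (experimental and reference channels).
raw = log2_ratios(
    exp_fg=[1100, 2100, 4100, 600],
    exp_bg=[100, 100, 100, 100],
    ref_fg=[600, 1100, 1100, 350],
    ref_bg=[100, 100, 100, 100],
)
print(fold_changes(median_center(raw)))
```

In this toy example every spot shows a twofold excess in the experimental channel except the third, so median centering treats the twofold offset as a labeling difference between the two targets and reports only the third spot as changed.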
Specific Approaches for Performing Genome-Wide Gene Expression Analysis Investigators are faced with choosing between the two leading technologies that are now widely available. The first possibility is to use microarrays constructed by robotic spotting and immobilization of purified cDNA inserts or synthetic oligonucleotide sequences to glass slides (11). This approach had its genesis in the 1980s (17) and was subsequently refined into its present form in the early 1990s (18). Because actual cDNAs are spotted, this method requires cataloging, growth, and purification of plasmids from individual bacterial clones, followed by polymerase chain reaction (PCR)-based amplification and purification of the contained cDNA inserts. DNA microarrays are now manufactured and sold by several commercial vendors or may be generated locally in core facilities. The second approach involves purchasing arrays constructed by in situ synthesis of oligonucleotides directly on silica wafers or “chips” (13). This is available commercially as the Affymetrix GeneChip. GeneChips are based on an elegant modification of the photolithographic process used in the computer industry. Multiple oligonucleotide probes are selected to represent each individual gene. The actual sequences chosen for synthesis in situ on silicon wafers use computer-assisted design to optimize desired oligonucleotide physicochemical characteristics (e.g., predicted melting temperatures, GC content) and to avoid similarity to other known transcripts or repetitive sequences. Since oligonucleotide probes are designed and synthesized entirely based on sequence information, they do not require handling of thousands of individual cDNA clones. As cDNA microarrays and GeneChips have become widely available to most academic investigators, this review will focus on details germane to the application of the Affymetrix GeneChip and cDNA microarray to expression analysis. 
Both provide powerful platforms for gene expression experiments; however, there are important differences between these approaches in the way individual genes are represented and how expression is quantified. It is worth noting that a number of competing oligonucleotide-based chips are also being developed using alternative methodologies. One recent approach used standard phosphoramidite monomers positionally delivered in situ to a modified glass surface with commercial ink-jet printer heads (19). This approach was used to spatially synthesize features containing 60-mers representing 49,218 unique Unigene clusters. Another strategy uses small oligonucleotide or peptide nucleic acid probes (typically 25- to 80-mers) that are synthesized by conventional oligonucleotide synthesis followed by robotic spotting or printing and surface linkage. Affymetrix GeneChip Array Affymetrix GeneChips are synthesized using a process combining photolithography with a modified solid phase DNA synthesis chemistry (20). To spatially direct the synthesis of individual oligonucleotides, a set of photolithographic masks is manufactured that defines the sequential addition of specific deoxynucleosides to particular chip locations or features. In a series of repeating steps, light from a mercury lamp is shone through the photolithographic mask, permitting location-specific photodeprotection. This serves to remove a chemical group that otherwise prevents chemical coupling of the next deoxynucleoside. In the subsequent step, a reactive deoxynucleoside is added and can extend oligonucleotides at sites that were illuminated during the preceding step (21). This process is sequentially repeated to direct the synthesis of individual 25-mer oligonucleotides in hundreds of thousands of individual 24 × 24-μm areas or features contained on a 1.28 × 1.28-cm silicon wafer. Affymetrix GeneChips use a redundant set of probes for each gene. 
Individual genes are represented on the GeneChip using a series (typically 16–20) of different 25-mer “perfect match” oligonucleotides. Each probe sequence was selected for complementarity to the selected gene or EST, uniqueness relative to other related genes, and a lack of near complementarity to other highly expressed sequences (tRNA, rRNA, alu-like sequences, housekeeping genes). Along with each set of perfectly matched oligonucleotide probes (PM), a paired set of mismatch probes (MM) is also represented. The oligonucleotide sequences used for the MM probe set are identical to the PM probe set, except that they are engineered to contain a single-base substitution in the central position. MM probes act as internal controls and quantify the background signals (e.g., hybridization to nonspecific sequences) that are present in complex samples (Affymetrix technical note no. 701009). The “specific” hybridization signal is determined by subtraction of the MM fluorescence from the PM fluorescence, averaged over all the probe pairs representing a particular gene. This fluorescence “average difference” measurement is a reflection of transcript abundance. Experimentally this strategy permits high sensitivity at low target concentrations and preserves the ability to discriminate between closely related sequences. The Affymetrix software then makes a present (detected) or absent call for each individual gene by integration of hybridization pattern (PM and MM differences and ratios) and abundance across all probe pairs using the quantitative hybridization intensity as previously described (22). These analysis parameters are set to values that have been experimentally demonstrated to provide reliable detection of gene transcripts and spiked RNAs present at a low abundance level. Selected prokaryotic and bacteriophage genes are also represented on the arrays to serve as hybridization controls. 
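As a worked illustration of the PM/MM design and the “average difference” measurement described above, the following minimal Python sketch builds a mismatch probe from a perfect match sequence and averages the PM minus MM differences over a probe set. It is a sketch under stated assumptions, not the Affymetrix software: the function names are hypothetical, the text specifies only a single-base substitution at the central position (replacing that base with its complement is an assumption made here), and the present/absent call logic is not reproduced.

```python
# Assumed substitution rule: replace the central base with its complement.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def mismatch_probe(pm_sequence):
    """Return an MM probe identical to the PM probe except at the central base."""
    center = len(pm_sequence) // 2  # index 12 (position 13) for a 25-mer
    swapped = COMPLEMENT[pm_sequence[center]]
    return pm_sequence[:center] + swapped + pm_sequence[center + 1:]


def average_difference(pm_signals, mm_signals):
    """Mean of (PM - MM) fluorescence over all probe pairs for one gene."""
    if not pm_signals or len(pm_signals) != len(mm_signals):
        raise ValueError("PM and MM signals must be non-empty and paired")
    differences = [pm - mm for pm, mm in zip(pm_signals, mm_signals)]
    return sum(differences) / len(differences)


# Made-up 25-mer and made-up fluorescence values for four probe pairs.
pm = "ACGTACGTACGTACGTACGTACGTA"
print(mismatch_probe(pm))
print(average_difference([1200, 900, 1500, 1000], [200, 100, 300, 200]))
```

A real probe set would contain 16–20 probe pairs rather than four; the short lists are used only to keep the arithmetic easy to follow.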
In vitro transcribed (polyadenylated) control RNA transcripts are added to the starting sample RNA in defined concentrations to allow monitoring of the detection sensitivity. In this manner, GeneChip arrays offer an estimation of the absolute expression level when used with the appropriate spiked controls. Each GeneChip is designed to analyze a single RNA sample. In contrast, cDNA microarrays are usually used to perform a comparative analysis using a two-color hybridization that contains labeled target from both a control and an experimental sample. Experimental results are expressed as a fold change ratio relative to the control sample. This serves as an internal control for differences in the amount of cDNA printed on the microarray and allows two samples to be analyzed on a single array. A final difference between cDNA-based arrays and the Affymetrix oligonucleotide arrays relates to specificity differences offered by the different length probes. Longer probes present on cDNA microarrays are more likely to undergo stable cross-hybridization with related sequences present in the pool of labeled target molecules. This can result in interference from other members of closely related gene families (e.g., paralogs) (23). The Affymetrix GeneChips are less sensitive to this type of interference because mismatches here lead to greater instability between the oligonucleotide probe and labeled target and because the inherent mispairing signal is subtracted (MM probe set). Making cDNA Microarrays Steps involved in constructing cDNA microarrays can be performed in most molecular biology labs (see Fig. 1) (7). Bacteria containing individual cDNA clones are first grown, plasmids are isolated, and cDNA inserts are then PCR amplified and purified. For efficiency, these steps are usually carried out in a 96-well plate format. For our studies, we have used NucleoSpin Multi-96 plasmid kits (Clontech Laboratories, Palo Alto, CA, U.S.A.). 
This protocol involves harvesting 1.3-ml overnight cultures by centrifugation in the provided 96-well culture plate and purification by a modified alkaline lysis procedure utilizing a vacuum manifold. Plasmids containing cDNA inserts are not directly spotted onto arrays; instead, the individual cDNA inserts are generated by PCR. This serves to significantly increase the molar amount of probe sequence contained at each spot and reduce hybridization background. Some protocols use a small quantity of frozen bacterial culture directly (e.g., Research Genetics) instead of starting with purified plasmid DNA as the template for the PCR. However, this approach may be associated with a lower PCR efficiency and product yield and a higher clone failure rate. Therefore, we use a few microliters of the purified plasmid preparation as the template for PCR. Amplification of the cDNA insert is performed using primers directed against common sequences contained in the cloning vector. Integrated Molecular Analysis of Genomes (IMAGE) clones or cDNAs from other sources may be contained in multiple plasmid vectors, some requiring a different set of PCR primers. A list and sequence of many cloning vectors and possible primer sequences are available at http://image.llnl.gov/image/html/vectors.shtml. GF200 forward and reverse primers (Research Genetics) are compatible with many of the commonly used cloning vectors. Our PCR amplifications are performed in an MJ Research PCR machine in 96-well format with 100-μl reaction volumes. Products are then purified to remove salts and primers by Nucleospin Multi-96 PCR kit or by alcohol precipitation (protocols are available on-line at http://www.microarrays.org/ or at http://cmgm.stanford.edu/pbrown/protocols/2_DNA.html). 
If a commercial kit (e.g., Nucleospin, Clontech) is used, the final product should be eluted in water (pH 7.5–8.5) instead of a buffer containing Tris (hydroxymethyl aminomethane), which contains amine groups that could interfere with the coupling chemistry. One to 2 μl of the amplified final products is then assessed using 96-well agarose gel electrophoresis. Reactions showing multiple bands or poor or absent products are repeated. An ethidium bromide-stained gel of this quality control step is shown in Fig. 1. Printing is usually performed by a robot that uses a set of 16 or 32 printing tips. Each tip picks up approximately 0.5 μl of the purified cDNA insert (probe) and then repetitively spots approximately 3–10 nl onto multiple duplicate slides by direct contact. Robots for printing cDNAs are commercially available or can be built for around $25,000–30,000. Patrick Brown's lab at Stanford University has shared its technical expertise in the form of complete plans for constructing an arrayer, necessary software, a parts list, and supplier information on their web site (http://genome_www.stanford.edu/hardware). While some companies use slides designed for proprietary DNA surface attachment chemistry (e.g., Amersham Pharmacia Biotech microarray technology access program, Piscataway, NJ, U.S.A.), excellent results can be obtained using amine- or poly-l-lysine-coated slides. These can be directly purchased (e.g., SuperAmine slides, Telechem International, http://www.arrayit.com) or made by treatment of standard glass microscope slides. Protocols are provided at the DeRisi and Brown lab web sites (http://www.microarrays.org/ and http://cmgm.stanford.edu/pbrown/protocols/). After being printed onto treated slides, the cDNA probes are covalently cross-linked to the positively charged amine groups by ultraviolet light exposure using Stratagene's Stratalinker. 
Most protocols also include a postprocessing blocking step with succinic anhydride to reduce the positive charge on residual amine groups and reduce the nonspecific binding of labeled target. This additional step may not be necessary when using SuperAmine slides (technical note PBT-004, Packard Bioscience). As a last step, cross-linked cDNAs on the arrays are denatured in water at 95°C or with alkali before use in hybridization. Selection and Sources of Clone Collections for IBD Researchers The subset of genes expressed together with environmental influences determine the tissue phenotype. Analysis of genome-wide mucosal expression patterns therefore dictates that an array provide broad representation of genes expressed by that tissue. A key component of the public Human Genome Project has been the high-throughput, single-pass cDNA library sequencing as a means to identify transcribed regions of the genome. These projects, including the Merck/Washington University EST Project and Cancer Genome Anatomy Project, have generated and disseminated EST information and have provided a publicly available clone collection. Over 3.7 million ESTs corresponding with approximately 97,000 unique human transcripts (http://www.ncbi.nlm.nih.gov/UniGene/) have been identified. These sequence data are available and searchable in the GenBank's dbEST database (http://www.ncbi.nlm.nih.gov/dbEST/index.html), and physical clone collections are publicly available through the IMAGE Consortium (http://image.llnl.gov/) and authorized distributors (http://image.llnl.gov/image/html/idistributors.shtml). Commercial arrays are now available, being composed of either sequenced cDNA clones or more recently using 60- to 80-mer synthetic oligonucleotides based on EST sequences. However, since a given tissue or cell expresses only a small subset of the genes encoded by the genome, it may be preferable to use a more tissue- or disease-specific subset of genes. 
We have produced and characterized normalized UC and CD clone libraries to use in the construction of custom microarrays. Creation of this customized array enables identification of novel or disease-specific gene expression and directly focuses our investigation on genes expressed in the inflamed IBD mucosa. In addition, customized arrays provide additional benefits associated with both cost and flexibility. Once a custom array is created, the incremental cost for each additional duplicate array is minimal. This is particularly beneficial when one needs to examine a large number of patient samples, as is the requirement for clinical correlative studies. From the standpoint of flexibility, new genes of interest can be easily added to a custom array at any time. There have been reports of numerous discrepancies between the actual and designated clone sequences in existing clone collections (and on commercial arrays). Apparently, between the original single-pass cDNA library sequencing project and the maintenance, cataloging, and dissemination of bacterial cDNA clones to distributors, a number of errors were introduced. It is therefore recommended that users of IMAGE clones (and cDNAs from other sources as well) use stocks that have undergone restreaking and resequencing for validation. Authorized IMAGE clone distributors, including Incyte Genomics (http://www.incyte.com), Research Genetics (http://www.resgen.com/), and the American Type Culture Collection (ATCC) (http://www.atcc.org) offer sequence verification services and sell an already verified clone subcollection. Some of the early IMAGE clone collection also contained T1 bacterial phage contamination. To avoid this problem, T1-resistant bacterial strains should be used for growing and maintaining individual clone collections. Use of Disease-Specific Libraries Our strategy has been to establish a mucosa- and disease-specific gene set for representation on cDNA microarrays. 
Normalized UC and CD libraries were constructed from pooled mucosal specimens in collaboration with Dr. Bento Soares. Specimens were selected that broadly represented possible clinical presentations and histologic disease activity. These primary CD and UC libraries in pT7T3D-Pac I contained 1.3 million and 2.6 million respective recombinants. To balance representation of common and rare cDNAs, both libraries underwent normalization (24). Normalization reduces representation of highly expressed clones such as immunoglobulin, which may represent as much as 10–20% of the members of a typical UC library (unpublished data). Resulting plasmids were sent to the IMAGE Consortium at the Lawrence Livermore National Laboratory and transformed into DH10B phage-resistant bacteria. Individual colonies were robotically picked and arrayed into individual wells of 384-well plates, and single-pass (3´-EST) sequencing reactions were performed at the genome sequencing center at Washington University. Resulting cDNA clones (dbEST library id 1729, Soares_Dieckgraefe_colon_NH UC and dbEST library id 1728, Soares_Dieckgraefe_colon_NH CD) were placed in the public domain as a general resource for the IBD research community (available through IMAGE). Both libraries are a good source for novel genes and genes only rarely expressed in other tissues (EST representation in one or a few libraries). Since not all human genes are represented by an EST in existing public databases, normalized disease-specific libraries may be a preferred source of clones for gene expression profiling in IBD. Sample Preparation and Generation of Labeled Target Many studies have relied on specimens from surgical resections to obtain sufficient amounts of RNA for generating labeled target. More recent techniques facilitate generation of sufficient target using samples from endoscopic biopsies. 
RNA is extracted from tissue specimens by traditional approaches; however, limited quantities necessitate the use of an amplification step to generate sufficient target for interrogation of arrays. There are two common approaches for mRNA amplification. The first method has been applied for the generation of cDNA libraries from the starting mRNA contained within a single cell (25). Primers are incorporated during reverse transcription, allowing subsequent PCR amplification. While this method is highly efficient, it may not maintain a proportional relationship between starting mRNA transcripts and the resulting amount of cDNA product (26). Despite preferential amplification of some transcript sequences over others, this technique could still prove useful provided that amplification efficiencies for individual transcripts (in the control and experimental sample) are equal. A second method for amplification has been developed that avoids PCR-associated transcript bias. In this approach, the oligo-d(T) primer used in the reverse transcription reaction has been modified to contain the essential sequence elements of the T7 bacterial phage promoter. Resulting cDNA templates are then used to direct an in vitro transcription (IVT) reaction. In the IVT reaction, T7 RNA polymerase directs the synthesis of multiple cRNA copies of each starting cDNA template. Fluorophore-labeled nucleotides can be enzymatically incorporated during this step. This method also reliably produces large quantities of labeled material with less quantitative bias than PCR-based methods (22,26). 
A protocol for mRNA amplification and target preparation from as little as 1 μg of starting total RNA is available at http://cmgm.stanford.edu/pbrown/protocols/ampprotocol_3.html. Incorporation of a Fluorophore Since cDNA arrays are interrogated with a fluorescent-tagged cRNA or cDNA representation of the tissue mRNA pool from both an experimental and a reference sample, fluorescent reporters (fluors) are selected with nonoverlapping emission spectra, such as Cy3 (green emission at approximately 540 nm) and Cy5 (red emission at approximately 650 nm). Cy3- or Cy5-labeled cDNA targets can be generated by direct incorporation using Cy3-dUTP or Cy5-dUTP with an oligo-d(T) primer in a reverse transcription reaction. Protocols generating labeled target starting with an oligo-d(T) primer are ideal for clone sets derived from EST libraries that are biased toward 3´ representation. Cy-dUTP labels are incorporated with reasonably high efficiency by reverse transcription. Cy3 and Cy5 also exhibit well-separated excitation and emission spectra, allowing discrimination of the individual fluorescence signals, one being used for the control or reference sample and the other for the experimental sample. Between 50 and 200 μg of total RNA or between 2 and 5 μg of poly(A) RNA is ideal for labeling by direct incorporation (6). A detailed protocol for direct incorporation can be found at http://cmgm.stanford.edu/pbrown/protocols/5_hyb_human.html. The development of additional nonoverlapping dyes and multicolor laser scanners will facilitate the multiplex analysis of several samples simultaneously in the near future. Direct labeling requires compatibility between the fluorescent precursor and the enzyme (e.g., reverse transcriptase). This compatibility is sometimes impaired by steric hindrance, which can lead to inhibition of enzymatic activity. 
An indirect strategy for cDNA or cRNA labeling with Cy3 or Cy5 (and especially other fluors) using a two-step process has been applied to address this problem. Allylamine-derivatized nucleotides 5-(3-aminoallyl) thymidine 5´-triphosphate and 5-(3-aminoallyl) uridine 5´-triphosphate are enzymatically incorporated into cDNA or cRNAs. Allylamine-derivatized polynucleotide products are then purified and directly reacted with N-hydroxysuccinimide esters of Cy3, Cy5, or Alexa fluors (Amersham Pharmacia Biotech or Molecular Probes, Eugene, OR, U.S.A.). Details of these protocols are provided at http://cmgm.stanford.edu/pbrown/protocols/amino-allyl.htm and http://www.packardbioscience.com/reference_matl/reference_matl.asp?content_item_id=286. Another recently developed indirect method entails the use of dendrimer technology (Genisphere; http://www.genisphere.com) (27). Dendrimers are stable structures composed of complexes of double-stranded oligonucleotides that contain multiple free ends (>250), each coupled to a fluor label. First-strand cDNA synthesis takes place with an oligo-d(T) primer modified to contain a specific “capture” sequence of 31 bases at the 5´ end. Dendrimers are engineered with a sequence complementary to this capture sequence, thereby facilitating hybridization at high stringency. Unlabeled cDNAs are hybridized to the array, washed, and detected by incubation with fluor-labeled dendrimers that bind specifically to their respective capture sequences. This technique provides robust signal amplification and the flexibility of multiplexing fluors, each different dendrimer targeting a unique capture sequence. An application note describing this technique is available at http://www.packardbioscience.com/reference_matl/reference_matl.asp?content_item_id=284. Individual Affymetrix GeneChips are designed to monitor expression from a single RNA sample rather than a comparative multicolor hybridization reaction containing both experimental and control samples. 
For this reason, the protocol for generating labeled target differs from that used for cDNA arrays. RNA is isolated from the tissue of interest. Between 0.2 and 5 μg of poly(A)+ mRNA or 5 and 40 μg of high-quality total RNA are optimal for interrogation of Affymetrix GeneChip oligonucleotide arrays (Affymetrix Expression Analysis technical manual). cDNA synthesis takes place with an oligo-d(T) primer containing the T7 bacterial phage promoter as described previously. Biotin-labeled nucleotides are then directly incorporated during IVT. Following hybridization, arrays are stained using a fluorescent streptavidin conjugate that binds with high affinity to the biotin-labeled nucleotides incorporated into the target. Two staining methods have been established: a single stain protocol, utilizing direct staining with streptavidin/phycoerythrin, and a second protocol that incorporates antibody amplification using biotinylated antistreptavidin antibody (Affymetrix Expression Analysis technical manual). Hybridization, Detection, and Quantification of Signal Arrays are hybridized using traditional high-salt buffers (e.g., 3 × saline/sodium citrate solution, 0.45 M NaCl, 45 mM Na-Citrate, pH 7.0) such as those applied for Southern or northern blots (for examples see http://cmgm.stanford.edu/pbrown/protocols/). Commercially available solutions such as ExpressHyb (Clontech) also offer easy alternatives. Hybridizations are carried out under conditions of excess immobilized probe cDNA (or oligonucleotides) relative to the labeled target, avoiding interprobe competition. This results in pseudo-first-order kinetics, where the amount of individual hybridized targets is proportionate to the concentration in the hybridization solution. Recent protocols have begun exploring the use of larger hybridization chamber volumes and rotating hybridization ovens instead of stationary hybridization under a coverslip to ensure more uniform and reproducible hybridization results. 
Following hybridization and serial washes under increasing stringency, slides are scanned using a confocal or nonconfocal laser scanner. Fluorophores are excited with appropriate wavelength lasers, and the resulting fluorescence is recorded in the form of an image file. Each of the spectrally distinct fluorophores has a characteristic emission wavelength. Although the actual color of the emission is not recorded, the emission wavelength is used to guide the choice of filters to block the excitation wavelength and other emission wavelengths. Through the use of appropriate filters, an image file is generated for the emission intensity of each individual fluorophore. A grid specifying cDNA locations is overlaid onto the image by the analysis software and is then manually inspected and adjusted as needed to ensure separation of each spot or feature. The expression level for each arrayed cDNA is then determined by measuring the average or integrated signal intensities for each spot and subtracting the local background fluorescence. Data Collection, Visualization, and Normalization Two-color fluorescence hybridization uses a simultaneous control or common baseline and experimental samples, both hybridized simultaneously to the same array. These results are then expressed as a ratio measurement of the experimental result to the baseline. This approach increases the accuracy of the analysis and has previously been shown to reliably detect quantitative changes as small as twofold (11). An important role for array analysis software is to facilitate quick identification of the subset of arrayed genes that are differentially expressed. In a typical two-label hybridization (e.g., Cy3 and Cy5), the fluorescent signal from each fluorophore is pseudo-colored (red and green) for visual analysis and overlaid for inspection. 
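The quantification steps described above (local background subtraction, ratio of experimental to baseline signal, and pseudo-coloring) can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure of any particular analysis package: the function names, the intensity floor, the twofold display threshold, and the choice of red for the experimental channel are all hypothetical.

```python
import math

def net_signal(foreground, local_background, floor=1.0):
    """Spot intensity minus local background, clamped to a small positive floor
    (assumed here so that a ratio and its logarithm are always defined)."""
    return max(foreground - local_background, floor)

def log2_ratio(exp_fg, exp_bg, ref_fg, ref_bg):
    """log2(experimental / reference) for one spot after background subtraction."""
    return math.log2(net_signal(exp_fg, exp_bg) / net_signal(ref_fg, ref_bg))

def pseudo_color(log_ratio, threshold=1.0):
    """Label a spot as it would appear in a red/green overlay.
    Assumes the experimental channel is displayed in red; threshold=1.0 is twofold."""
    if log_ratio >= threshold:
        return "red"
    if log_ratio <= -threshold:
        return "green"
    return "yellow"

# Hypothetical spot: experimental channel 900 over background 100,
# reference channel 300 over background 100, i.e. a fourfold ratio.
ratio = log2_ratio(900, 100, 300, 100)
color = pseudo_color(ratio)
```

Working on a log2 scale makes induction and repression symmetric around zero, which is why a single threshold can serve both directions. Overlaying the two pseudo-colored channels is what produces the composite image inspected by eye.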
In the overlaid image, genes expressed at approximately the same level appear as yellow, while genes expressed at higher levels in one of the two samples show up as red or green. A web source for a number of free software packages that have been developed for quantifying and visualizing microarray data is included in the Appendix. For application of microarrays to large numbers of clinical specimens, reproducibility and comparability are critical. Slight differences in the efficiency at every step leading up to the final analysis are inevitable (e.g., array manufacture, preparation of labeled target and amounts added, hybridization and washing, and differences between fluorophores in emission and detection efficiency). To address these factors, the raw data are normalized to correct for these differences. Three common approaches to normalization have been applied. The first and often simplest approach is to normalize to the average expression of all genes represented on the array. This “global normalization” is valid when most genes will not be changing and the gene set is not biased toward a particular function. Therefore, this approach works well when comparing similar samples (where the expression of most genes contained on the arrays is not expected to vary markedly) and when using arrays that contain large numbers of genes. A second approach involves using a designated subset of genes for normalization (28,29). Khan et al. defined a set of approximately 80 genes that previously had been shown to have stable expression under the experimental models being studied (29). For validity, it is critical that the “housekeeping” sets of genes already have been shown not to substantially change expression under the conditions being studied. This approach is preferred when the samples are substantially divergent, resulting in a greater fraction of genes with altered expression levels. 
This approach is also preferable when using a small or function-centered gene array (e.g., genes involved in inflammation or apoptosis, etc.). If the experimental conditions led to marked changes in the expression of most genes contained on the array, then normalization using a reasonably sized set of housekeeping genes would be preferred over global normalization to function-centered genes on the array. A third approach to normalization would be to use the signal from a set of spiked genes. This method is practiced by adding known quantities of exogenous polyadenylated RNA transcripts that are represented by probes contained on the array but are not present in the native experimental or control RNA samples (30). In studies focused on human samples, this might involve the use of spiked plant or bacterial genes. A variety of Arabidopsis genes are available at http://aims.cps.msu.edu/aims/ and bacterial genes from the ATCC (http://www.atcc.org). This method requires that the genes selected for use as spiked controls not possess significant homology with other genes contained on the array. Data Analysis and Bioinformatics Measurement of the fold change of the experimental sample relative to the control group is a method commonly applied in biologic science. Applying rules that eliminate genes with less than a two- or threefold difference will undoubtedly miss biologically important genes that undergo a smaller but significant change. Similarly, some genes that have greater than threefold changes may simply reflect a greater biologic variation in expression of that particular gene, independent of experimental conditions or disease state. Determination of the level of significance of the observed changes is helpful to distinguish between significant biologic changes and random variation. The variance of expression levels in the normalizing gene set has also been applied to predict confidence intervals for genes significantly changing. 
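The three normalization strategies differ mainly in which genes define the correction factor, and a fixed fold-change cutoff is then applied to the corrected values. A minimal sketch follows, assuming the data are already log2 ratios and that spiked controls were added in equal amounts to both samples. The gene names, values, and function names are hypothetical and chosen only for illustration.

```python
from statistics import mean

def normalize(log_ratios, normalizing_genes=None):
    """Center log2 ratios on the mean of a normalizing gene set.

    normalizing_genes=None uses every gene (global normalization); passing a
    housekeeping set or a set of spiked controls gives the other two approaches.
    """
    genes = list(log_ratios) if normalizing_genes is None else list(normalizing_genes)
    offset = mean(log_ratios[g] for g in genes)
    return {gene: value - offset for gene, value in log_ratios.items()}

def fold_change_hits(log_ratios, min_fold=2.0):
    """Genes whose absolute change meets a fixed fold cutoff (e.g., twofold)."""
    return sorted(g for g, v in log_ratios.items() if 2 ** abs(v) >= min_fold)

# Hypothetical raw log2 ratios; every gene carries a systematic +0.5 dye bias.
raw = {"housekeeping_1": 0.5, "housekeeping_2": 0.5, "gene_c": 3.5, "gene_d": 1.5}

by_global = normalize(raw)                                          # offset 1.5
by_housekeeping = normalize(raw, ["housekeeping_1", "housekeeping_2"])  # offset 0.5

hits = fold_change_hits(by_housekeeping)
```

In this toy set half of the genes change, so the global mean overcorrects, while the housekeeping set removes only the systematic bias. That mirrors the point made above about divergent samples and small arrays. The cutoff itself carries the limitation already noted: a smaller but reproducible change in a gene would be discarded by `fold_change_hits`.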
Recently, a number of approaches have been advanced to better define the statistical significance of the fold change measurements (31–33). A microarray data set may provide information for the expression level of thousands of individual genes in multiple pathologic and normal clinical specimens. Understanding the relationships between genes contained within large microarray data sets is a formidable task. Data are often represented in a spreadsheet with each row corresponding to an individual gene and each column reflecting a different experimental condition or specimen. While the analytic tools discussed previously serve to identify genes differentially expressed between the control and experimental samples, a higher-level analysis is required to extract additional information from these data sets. This task has required the development of analytical and visualization tools that can organize and present the data in a manner that groups and logically emphasizes systematic patterns contained within the data. A number of these analytical techniques have been recently reviewed (34–36). Goals pertinent to the study of IBD include identifying genes that are coregulated in the samples analyzed, recognizing and interpreting similarities and differences in patient samples themselves on the basis of gene expression profiles, and determining the correlation of gene expression profiles with specific clinical phenotypes or distinct underlying disease processes. These goals require the use of analytical tools to reduce the complexity of these large data sets by grouping together similar elements (e.g., genes that are similarly expressed) and to eliminate data points that provide little information. A linked goal is then to organize and present the processed data in a format amenable to visualization and comprehension of these groupings. Most analytical approaches can be divided into methods that are unsupervised or supervised. 
Unsupervised methods use only the expression data themselves and are designed to discover expression patterns or profiles contained within the data set. This approach to data grouping is referred to as clustering. Clustering algorithms group together genes demonstrating a similar expression profile or pattern across samples. For example, this could be applied to study the gene expression from cells studied at different times following a stimulus or the gene expression across endoscopic colon biopsies taken from individuals with unique underlying inflammatory conditions. All clustering methods apply a distance metric such as the statistical correlation (e.g., Pearson or jackknife correlation) or Euclidean distance to measure the similarity between two expression profiles. The data are then partitioned, using one of several strategies [e.g., hierarchical clustering (37), self-organizing maps (38,39) or k-means clustering (40), and deterministic annealing (41)]. The advantages and weaknesses of these individual partitioning methods have recently been reviewed (36). Each leads to a functional grouping of genes sharing a related expression profile across multiple specimens (e.g., different samples or time points). Since each method differs in the approach used to group genes, individual genes may differ in their final assignment into clusters, each emphasizing different underlying features in the data. Accordingly, the best approach is often to investigate more than one method to organize the data sets. Two-way clustering may also be applied to reveal important differences between samples. In this case, individual genes are organized by clustering, and then the specimens themselves are grouped according to their similarity. This approach was applied to group normal colonic and adenocarcinoma specimens on the basis of their gene expression profiles, demonstrating the utility of clustering for achieving diagnostic separation (41). 
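To make the clustering idea concrete, the sketch below pairs a Pearson-correlation distance with a simple average-linkage agglomerative procedure. It is one elementary form of hierarchical clustering, not the algorithm of any cited package. The profile values and gene names are hypothetical, and the code assumes every profile varies across samples (a constant profile would give an undefined correlation).

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two expression profiles of equal length."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def distance(x, y):
    """Correlation distance: 0 for identical shape, 2 for opposite shape."""
    return 1.0 - pearson(x, y)

def hierarchical_clusters(profiles, n_clusters):
    """Merge the two closest clusters (average linkage) until n_clusters remain."""
    clusters = [[name] for name in profiles]

    def linkage(pair):
        a, b = pair
        total = sum(distance(profiles[g], profiles[h]) for g in a for h in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2), key=linkage)
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return clusters

# Hypothetical profiles across four specimens (or time points).
profiles = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.0, 6.0, 8.5],
    "gene_c": [4.0, 3.0, 2.0, 1.0],
    "gene_d": [8.0, 6.0, 4.0, 1.0],
}
groups = hierarchical_clusters(profiles, n_clusters=2)
```

Because the distance depends on profile shape rather than magnitude, `gene_a` and `gene_b` group together despite their different absolute levels. Swapping in Euclidean distance or a different partitioning strategy can change the assignments, which is the reason for trying more than one method. Transposing the table and clustering the specimens instead of the genes gives the second half of a two-way clustering.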
Since these approaches generally weight the contribution of all gene expression measurements equally, most investigators apply some form of prefiltering to remove the contribution of genes not thought to be relevant to their experimental model or those genes that are not significantly changed. This may prevent dilution of meaningful gene expression patterns by a larger set of less relevant genes. Software programs implementing many of these clustering methods are now widely available on the internet (see the Appendix). Supervised methods for analyzing microarray data provide a means to correlate gene expression differences with additional data. Such data might include a clinical response to therapy or the presence of a particular clinical phenotype. Supervised grouping is frequently termed “classification.” In supervised learning, the idea is to identify gene expression patterns in the data that are predictive of the nonexpression variable (e.g., response or nonresponse to infliximab). Analytical methods applied for this type of analysis include linear discriminant analysis, logistic regression, and several varieties of neural networks. Initial input could be the microarray-based gene expression measurements from a set of patients who later responded to infliximab and from a set of patients who later did not respond to infliximab. In this example, “learning” takes place as numeric weighting is differentially applied to those genes that are found to be the most informative for distinguishing between patients having a response or nonresponse to therapy. Often, a subset of the patient samples is used for training, and the results are validated using the remaining cases. An alternative strategy would be to test the predictor gene set in a second independent patient population. 
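The train-then-validate workflow can be illustrated with a deliberately simple weighted-voting classifier. This scheme stands in for the methods named above and is not linear discriminant analysis, logistic regression, or a neural network. Each gene receives a weight proportional to how well it separates the two training groups, and held-out samples are scored by the weighted sum. All expression values, labels, and function names below are hypothetical.

```python
from statistics import mean, pstdev

def train(samples, labels):
    """Learn a (weight, midpoint) pair per gene from labeled training samples.

    samples: list of expression vectors; labels: True for responders.
    """
    model = []
    for g in range(len(samples[0])):
        pos = [s[g] for s, is_pos in zip(samples, labels) if is_pos]
        neg = [s[g] for s, is_pos in zip(samples, labels) if not is_pos]
        spread = (pstdev(pos) + pstdev(neg)) or 1.0  # guard against zero spread
        weight = (mean(pos) - mean(neg)) / spread    # informative genes weigh more
        midpoint = (mean(pos) + mean(neg)) / 2
        model.append((weight, midpoint))
    return model

def predict(model, sample):
    """Weighted vote: a positive score predicts a responder."""
    return sum(w * (x - mid) for (w, mid), x in zip(model, sample)) > 0

def accuracy(model, samples, labels):
    """Fraction of samples whose predicted class matches the known class."""
    return sum(predict(model, s) == y for s, y in zip(samples, labels)) / len(samples)

# Hypothetical training set: gene 0 is informative, genes 1 and 2 much less so.
train_x = [[5.0, 1.0, 3.0], [6.0, 1.2, 3.1], [1.0, 1.1, 3.0], [2.0, 0.9, 2.9]]
train_y = [True, True, False, False]

# Held-out cases never seen during training.
test_x = [[5.5, 1.0, 3.0], [1.5, 1.0, 3.0]]
test_y = [True, False]

model = train(train_x, train_y)
held_out_accuracy = accuracy(model, test_x, test_y)
```

Reporting accuracy on the held-out cases rather than on the training samples is the essential design choice, since a model with thousands of gene weights can fit a small training set almost perfectly by chance.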
A web-based implementation of the linear discriminant analysis is available at http://classify.stanford.edu/. Having identified a large gene set that is predictive of the nongene expression data or phenotype, the next step is to perform dimension reduction. This refers to reducing what is generally a large predictive gene set into a minimal number of genes that are informative for classification purposes. This is achieved through incremental removal of genes from the predictor set, starting with the lowest weighted genes, followed by retesting the classification algorithm. This step is repeated until the predictive performance significantly deteriorates. Accurate classification between two phenotypes may ultimately be achieved with a very small number of genes (35). Although helpful for patient classification, dimension reduction may not be ideal for narrowing a list of candidate gene products linked to and potentially causative of a particular phenotype. Dimension reduction can result in the removal of genes that are highly associated with the phenotype. For example, consider the situation where there are several coregulated genes that are highly correlated with a patient having treatment failure. Despite their potential relevance to the underlying pathophysiology, if these genes did not provide independent predictive information (beyond that provided by the first gene), they would have been removed during dimension reduction. Applications of DNA Microarray Technology to IBD Progress in our understanding and treatment of IBD remains slowed by our limited knowledge of gene products involved in disease pathogenesis, coupled with the complex and heterogeneous nature of the disease itself. There are several areas where microarrays are poised to make a significant impact on IBD by addressing some of these important unanswered questions. 
Array-based investigations are naturally suited to expand our list of genes involved in disease pathogenesis and identify potential new therapeutic targets, aid in the dissection of signaling pathways that perpetuate mucosal inflammation or pathways that direct reparative programs, provide new tools for improved clinical diagnosis and patient characterization, identify pharmacogenomic markers indicative of the most appropriate therapeutic intervention for an individual patient, and support ongoing efforts to identify disease-linked genetic mutations and allelic polymorphisms. For this section, we will borrow heavily from proven concepts utilizing arrays in the investigation of other disease paradigms. Gene Discovery and Identification of New Therapeutic Targets Microarray-based investigations have proven to be an exceptionally powerful technique to expand our knowledge of genes involved in a biologic process. Of the approximately 97,000 unique human transcripts contained in the most recent version of Unigene, the function, expression, and regulation of a surprisingly small minority are known. The onset of mucosal disease is accompanied by complex anatomic and biochemical alterations in association with corresponding changes in gene expression. The central premise of applying gene expression analysis for identification of drug targets in IBD is that description of the expressed component of the genome in affected tissues can identify molecules and pathways involved in disease pathogenesis. The most widely applied method to identify candidate gene targets is to directly compare gene expression between normal or uninvolved tissue and diseased tissue. Genes significantly induced or repressed are cataloged for further evaluation. These points are illustrated by microarray-based studies of Saccharomyces cerevisiae, which has been a highly studied model for temporal gene regulation during different phases of cell growth. 
Despite the existing knowledge base, application of microarrays containing representation of the approximately 6,200 S. cerevisiae genes unexpectedly revealed a large number of additional genes involved in the process of sporulation (from 50 to >500) (42). Similarly, Iyer et al. discovered >200 previously unknown genes that were temporally induced following fibroblast stimulation (43). The catalog of genes differentially expressed in response to ionizing radiation was also considerably expanded by microarray experiments (44). Genes differentially regulated may be a result of the stimulus or disease process itself (e.g., in IBD markers for cell infiltration or molecules involved in mucosal repair), be involved in the pathophysiologic process (e.g., important inflammatory mediators), or play a causal or requisite role (e.g., a susceptibility gene). Once implicated, these genes warrant follow-up investigation as it relates to their putative involvement in these specific biologic events. Microarray data can also be used to construct hypotheses about the function of unknown genes. Genes that share regulatory patterns often participate in the same biochemical pathway. Analysis of patterns of gene expression therefore can provide insight as to the function of both novel genes and those that have been identified but whose role has been unclear [termed “guilt by association” (42)]. In this type of analysis, known genes are used to provide insight into the potential role of novel or poorly characterized coexpressed genes. This coordinated expression of genes into groups or clusters whose products function in a common pathway or process has been successfully used to predict the function of novel genes in several model systems (37,42,43). In one example, potential transcriptional regulators for these individual gene programs were inferred solely by analysis of common upstream sequences in sets of coordinately expressed gene sets (42). 
In diseases involving cell recruitment and migration, such as IBD, gene products from a particular cell type also tend to cluster together, providing clues to the cellular origin of novel gene products (15). Microarray investigations have significantly expanded the list of genes known to be differentially expressed in the mucosa of patients with IBD (8,15,16). One example from our studies is glia maturation factor-γ (GMFG). GMFG was named by shared homology with GMF-β, a growth and differentiation factor for neurons and glia, but had not been functionally characterized. We used self-organizing maps, based on an unsupervised neural network algorithm, to cluster genes and make functional predictions. GMFG mRNA levels were increased an average of sevenfold in UC specimens and clustered with other genes related to inflammatory disease activity. This clustering assignment would suggest its involvement in the immune response, an idea supported by the presence of GMFG transcripts in hematopoietic stem/progenitor cells (45) and representation in multiple lymphoid tissues in the expressed sequence tag (dbEST) database. Another area where microarray investigations may hold promise is in the study of diseases where the contributions of both genetic and environmental factors are important. DNA microarray analyses have recently been applied to identify specific gene expression signatures pointing to host-microbial interactions (46–49). Study of Biological Pathways and Signal Transduction Signal transduction involves the transfer of information from the extracellular environment to the nucleus, resulting in an appropriate cellular response, directed by regulating absolute and relative changes in gene expression. Alterations in gene expression eventuate in phenotypic changes such as proliferation, cellular activation, apoptosis, and resistance to apoptosis. 
Gene expression profiling has recently been shown to be a powerful tool in the study of signaling pathways activated by growth factors (50), retinoic acid-induced differentiation (51), the yeast pheromone response (via multiple mitogen-activated protein kinase pathways) (52), ionizing radiation (44), c-myc (53), and apoptotic stimuli (54). A single signaling factor or stimulus may activate numerous signaling pathways. Therefore, by assessing global gene expression, referred to as the transcriptome, the integrative function of these different signaling paths is captured. We have applied microarrays to our investigations of the regenerating (REG) gene family. These proteins are small C-lectin-like molecules that are highly expressed and secreted in IBD-affected mucosa (15). While a cell surface receptor has recently been described (55), little was known about the physiologic role of this protein family. Affymetrix GeneChips containing approximately 12,000 genes were interrogated using target generated from RNA samples from REG-treated and -untreated cells. The resulting transcriptome is being used to provide a biologic readout for REG activity, to provide a testable hypothesis concerning the biologic action of REG genes, and to implicate involvement of particular intracellular signaling pathways in REG-mediated signal transduction (unpublished data). A similar kind of analysis could be envisioned for studying the impact of specific allelic variations in the NOD2 gene on the monocyte response to intracellular bacteria or bacterial products. This could provide a direct test of the hypothesis that CD-associated mutations in NOD2 impair its role as an intracellular sensor for bacterial products, thereby blunting cellular signaling pathways and activation of appropriate transcriptional programs. 
Mechanistic studies using microarrays are a part of our ongoing placebo-controlled, multicenter Phase II trial of granulocyte-macrophage colony-stimulating factor (GM-CSF) in moderate to severe CD. In this case, gene expression profiling is being applied to better understand the molecular events underlying the therapeutic effect of GM-CSF in CD. In the drug discovery process, microarrays are being used to generate data beyond merely identifying candidate gene targets (56–58). As described above for our studies with REG, induced genes are being used as biomarkers of signaling pathway activation. For example, this could be a gene program stimulated by tumor necrosis factor-α (TNF). In a strategy termed “reverse engineering” (59), the molecular effects of drug candidates on TNF signaling are then studied by microarrays, with the gene expression profiles monitored as a readout. There has been a high degree of correlation between the mechanism of action of a compound and the resulting changes in gene expression (60–62). This approach also provides a powerful means to predict the mechanism of action of novel uncharacterized compounds (63).

Improved Clinical Characterization and Diagnosis

There is little doubt that currently held diagnostic subdivisions of UC and CD contain molecularly distinct diseases that share a related intestinal inflammatory phenotype. Efforts aimed at mapping and identifying specific disease loci and genes further support the clinical impression of genetic heterogeneity. The diverse group of monogenic murine knockout models (64,65), each leading to a similar IBD phenotype, suggests that human IBD also likely consists of phenocopies (different underlying pathophysiologic defects that share a related clinical phenotype). For both UC and CD, important subgroups are likely to exist, but they have not been identified or defined by molecular markers.
An important challenge for researchers investigating IBD is to now apply methods for gene expression monitoring and computational methods to facilitate class discovery and class prediction. Discovery refers to identification of previously unrecognized disease subtypes, and prediction refers to the assignment of individual patients to clinically significant groups reflective of significant parameters such as overall disease course or response to treatment. Once unique patterns of gene expression are found to correlate with specific clinical subgroups, the pattern itself can be used as a molecular tool to assist in patient classification, treatment, and prognosis. Proof of principle for this approach was provided by Alizadeh et al. (66). Previous attempts to identify common clinical subgroups in diffuse large B-cell lymphoma (DLBCL) by morphologic criteria had been unsuccessful. Sixty percent of individuals with this diagnosis respond poorly to treatment and succumb to their disease, while the remaining 40% respond well to current therapy with prolonged survival. In this landmark study, microarrays were used to investigate DLBCL gene expression. The patient group that fared better clinically had a gene expression signature consistent with a germinal center B-cell population, while the group that did poorly had expression of genes normally induced during activation of peripheral B cells (66). Thus, gene expression signatures were able to identify two different disease subgroups that warrant markedly different therapeutic strategies. A similar class discovery procedure has been applied to distinguish acute myeloid leukemia from acute lymphoblastic leukemia (67). 
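Class prediction, as defined above, can be illustrated with a minimal sketch. It ranks genes by a signal-to-noise score between two known classes, keeps the most informative ones, and lets each cast a weighted vote on an unknown sample. This is a simplified illustration under stated assumptions, not the exact procedure of any cited study; the function names, gene labels, and expression values are all hypothetical.

```python
from statistics import mean, stdev

def signal_to_noise(values_a, values_b):
    """(mean_a - mean_b) / (sd_a + sd_b); positive when the gene is higher in class A."""
    spread = stdev(values_a) + stdev(values_b)
    return (mean(values_a) - mean(values_b)) / spread if spread else 0.0

def train(samples_a, samples_b, n_genes):
    """Keep the n_genes with the largest absolute score.

    Each kept gene maps to (weight, midpoint), where the midpoint is the
    average of the two class means for that gene.
    """
    scored = {}
    for gene in samples_a[0]:
        a = [sample[gene] for sample in samples_a]
        b = [sample[gene] for sample in samples_b]
        scored[gene] = (signal_to_noise(a, b), (mean(a) + mean(b)) / 2.0)
    ranked = sorted(scored, key=lambda g: abs(scored[g][0]), reverse=True)
    return {g: scored[g] for g in ranked[:n_genes]}

def predict(model, sample):
    """Sum the weighted votes; a positive total assigns class A, otherwise class B."""
    total = sum(w * (sample[g] - mid) for g, (w, mid) in model.items())
    return "A" if total > 0 else "B"

# Hypothetical training profiles: g1 is high in class A, g2 is high in class B,
# and g3 does not discriminate between the classes.
class_a = [{"g1": 10, "g2": 1, "g3": 5}, {"g1": 12, "g2": 2, "g3": 6}, {"g1": 11, "g2": 1.5, "g3": 4}]
class_b = [{"g1": 2, "g2": 9, "g3": 5}, {"g1": 3, "g2": 11, "g3": 4}, {"g1": 1, "g2": 10, "g3": 6}]

model = train(class_a, class_b, n_genes=2)
call_1 = predict(model, {"g1": 9, "g2": 3, "g3": 5})
call_2 = predict(model, {"g1": 2.5, "g2": 8, "g3": 5})
print(sorted(model), call_1, call_2)
```

The uninformative gene g3 is dropped during training, and the two unknown samples are assigned to classes A and B, respectively. In practice, a predictor of this kind must be validated on samples that played no part in gene selection.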
In the acute leukemia study (67), the investigators constructed a class predictor from a set of 50 informative genes that, when challenged with expression data from an unknown acute leukemia sample, was able to accurately place that sample in the appropriate diagnostic class based on the gene expression data derived from the microarray hybridization. These results provide proof that molecular gene expression signatures can identify previously undetected and clinically significant disease subtypes. Molecular profiling of clinical tissue specimens has been the subject of a recent review (68). Similar studies have demonstrated the use of gene expression profiling to accurately classify patients with other cancers such as breast adenocarcinoma (69), melanoma (70), and oligodendroglioma (71). A similar analysis identified consistent tumor-specific gene expression differences associated with a characteristic chromosomal translocation that is present in a subset of cases of alveolar rhabdomyosarcoma (72). The investigators identified a group of 37 genes highly expressed in the setting of this specific chromosomal translocation. This approach has yet to be systematically applied to the study of IBD. Our initial GeneChip array experiments showed evidence of considerable disease heterogeneity but included too few cases to draw firm conclusions. Differences in the clinical presentations of CD are also often cited as evidence of disease heterogeneity or of different underlying immunologic mechanisms (73). A number of groups have advocated separation of CD into different disease subsets on the basis of anatomic distribution of disease (e.g., ileocolic, small intestine, colon/anorectal) or behavior (e.g., fistulizing or “perforating” versus fibrostenotic “nonperforating”).
Although controversial, a number of investigations have shown differences in clinical course or prognosis, such as the time to second surgery or steroid dependency, suggesting that these different disease presentations may indeed reflect different disease processes (74–76). Gilberts et al. argued that biochemical determinants of the host immune response, at the level of gene expression, would differ between these forms of disease (77). They investigated seven genes and found that perforating CD was associated with substantially lower levels of interleukin (IL)-1β and IL-1 receptor antagonist mRNAs. Analysis of more genome-wide mucosal gene expression patterns should serve to identify pathognomonic patterns of gene expression, or “signatures,” that distinguish different IBD groups and reflect differences in the underlying genetic and environmental pathogenesis. Thus, once characterized, “signatures” from microarrays should provide a basis for improved diagnosis and molecular classification of disease subgroups.

Pharmacogenomics

Advances in the field of genomics hold the promise that future therapeutic interventions can be more precisely tailored to the specific genetic makeup of the individual, thus minimizing medication toxicity and side effects while maximizing therapeutic efficacy. Individual pharmacokinetic, pharmacodynamic, and etiologic differences can lead to markedly varied patient outcomes. Pharmacokinetic variations refer to differences in drug absorption, distribution, metabolism, and excretion. A classic example is allelic variation in the TPMT (thiopurine S-methyltransferase) gene and the resulting variation in enzyme activity, which can be directly tied to an increased risk of toxicity related to 6-mercaptopurine or azathioprine therapy (78). Pharmacodynamic variations refer to differences in drug action, for example, those secondary to receptor or transporter polymorphisms.
Etiologic variation, likely to be of great importance to the field of IBD, is related to different underlying pathophysiological derangements. Crohn's disease is likely to consist of phenocopies, in which the same clinical phenotype results from variation at different genetic loci or from different environmental stimuli. Differences in the environmental and genetic factors that lead to a mucosal inflammatory phenotype will be reflected in different gene expression profiles. Thus, microarray-based analysis of genome-wide mucosal gene expression patterns should serve to identify pathognomonic patterns of gene expression or “signatures” that distinguish different IBD groups. These signatures could then provide a basis for matching the optimal form of therapy to a particular underlying disease group.

As a Complement to Allelic Analysis

Although the role for a genetic contribution is established, families with IBD generally lack the Mendelian segregation of the inflammatory phenotype that would be expected if the disease were caused by a mutation of a single gene (79,80). Like other human diseases, IBD appears to result from the complex interaction of multiple genes and the environment. Unlike in monogenic disorders, gene variants (polymorphisms) acting alone in polygenic disorders do not cause the disease phenotype. Selection against them can occur only when they are present in the disease-causing combination. Therefore, genetic variation underlying IBD is likely to be determined by common alleles at multiple loci (81). Indeed, only a small fraction of individuals with CD harbor one of the three linked allelic variants of the recently described CD susceptibility gene, NOD2 (82–84). Major barriers complicate additional IBD gene identification by linkage analysis, including the inability to control for environmental factors, the likely occurrence of phenocopies (genetic heterogeneity), poor phenotype definition, and the contribution of phenotype-modifying genes.
Allelic differences in the regulatory elements of genes may result in altered temporal patterns of expression or differences in absolute expression levels. Chronic granulomatous disease (CGD), caused by several well-defined mutations in NADPH oxidase, provides a model for understanding the potential contribution of gene expression profiling for identification of relevant disease alleles in the pathogenesis of intestinal inflammation. Many CGD patients develop gastrointestinal complications resembling CD, including ileocolitis, perirectal abscess, and stricture or fistula formation (85,86). Host polymorphisms for myeloperoxidase (MPO) and Fcγ receptor IIIb (CD16b) were strongly associated with the risk for development of gastrointestinal involvement (87). The MPO polymorphism associated with increased risk for immunologically mediated gastrointestinal disease resides within the promoter and results in increased MPO transcriptional activity detectable by increased MPO mRNA levels. This polymorphism has no effect in the general population yet becomes significant in the setting of a coinherited mutation in NADPH oxidase. cDNA microarrays also proved useful in the analysis of the spontaneously hypertensive SHR rat, often used as a model for human obesity, diabetes, hyperlipidemia, and hypertension. Microarray-based investigations of gene expression in adipose tissue showed significantly diminished expression of CD36. This led to subsequent investigations that revealed that the hyperlipidemic phenotype resulted from deletion of the 3´ untranslated region of CD36 (88). While not all disease-causing mutations will result in alterations of transcript levels such as in the SHR rat, gene profiling may still substantially contribute to identification and study of disease genes. 
In another recent study, cDNA microarrays were used to discover unique gene expression profiles in breast cancers from patients with an inherited BRCA1 mutation as compared with patients with an inherited BRCA2 mutation or patients with sporadic breast cancer (89). From the array data, a minimal set of 176 genes was identified that accurately stratified the 21 patients included in this study. One of the seven patients tested with sporadic breast cancer had a gene expression profile suggestive of an inherited BRCA1 mutation. When no mutation of the BRCA1 gene was found, the BRCA1 promoter region was analyzed to look for aberrant methylation, known to silence BRCA1 in sporadic breast cancers. This patient had hypermethylation of the BRCA1 promoter region, consistent with inactivation of BRCA1 expression. These findings were validated by results of the gene expression studies, showing that this tumor specimen had the lowest levels of BRCA1 mRNA expression. These results show that gene expression profiles can be used to subgroup patients and implicate involvement of a particular gene or pathway. By providing a sensitive method for measuring differences in gene expression, microarrays should prove to be a useful tool to suggest potential biologic determinants that influence the expression of specific disease behaviors.

Conclusions

The sequencing of the human genome now presents scientists with an unknown territory to explore and begin to make sense of. Genome-wide expression analyses promise to provide unprecedented insights into the gene programs that underlie both normal physiology and disease states. Although traditional scientific research is based on the hypothesis-driven approach to experimental design, the need to first gain better insight into the complex interrelationships that exist at the level of gene expression is better served by an approach that leaves open the door to discovery.
DNA microarrays are well suited for providing a simple, inexpensive, reliable, and systematic approach to that exploratory process. The reliable interrogation of this vast amount of information will lead to many specific questions that can be studied in a more traditional hypothesis-driven approach. Individual gene expression changes or profiles over pathologic samples can implicate new molecules associated with the disease process, provide important functional information, and lead to the generation of target hypotheses. Looked upon from the perspective afforded by the complete set of profiled genes, the resulting gene expression signatures provide a distinctive and detailed picture of the underlying molecular state of the sampled tissue. The microarray can provide data on genome-wide gene expression or, as will undoubtedly later be the case, serve as a disease-specific diagnostic tool. Its beauty is the global perspective it affords, regardless of how narrow a question is asked. The possibility of discovering previously unknown interrelationships among genes promises to extend our understanding beyond the questions that we are currently ready to ask. The subset of genes expressed defines how a cell or tissue functions, which biochemical and signaling pathways are operative, and how that tissue interacts with and responds to environmental influences. This information, which is important to understanding the pathophysiologic derangements in IBD and other diseases, can now be investigated with this new tool designed to look beyond one or a few genes and to provide a broader perspective into the functioning of our genome. Use of small endoscopic biopsies will enable routine addition of microarray-based investigations to cross-sectional IBD patient populations and facilitate longitudinal gene expression studies as part of ongoing treatment investigations.
Gene expression profiles, or “signatures,” identified on microarrays will form the basis for new molecular diagnostic markers predictive of disease subgroups, specific diagnosis, clinical behavior and course, and response to therapy. Informative molecular assays based on small microarrays will soon serve to supplement standard histopathology. The development and validation of techniques that allow interrogation of microarrays with small tissue samples obtainable through biopsy will enable the application of this technology for use in future patient care decisions and research investigations. Because involved disease tissue can easily be sampled by minimally invasive techniques, IBD provides an ideal disease model to advance this paradigm.

Acknowledgment

We thank Drs. Joshua Korzenik and William Stenson for their insightful suggestions and critique. This work was supported by NIH Digestive Disease Research Core Center (DDRCC) grant P30 DK52574 and AI48137.

Appendix

Web-Based Microarray Resources and Research Tools

http://www.deathstarinc.com/science/biology/chips.html : Andreas Matern's DNA microarray web site. Array background information and relevant links.
http://www.gene-chips.com/ : Leming Shi's DNA microarray (genome chip) site. Basics on DNA microarray technology, academic, and industrial links.
http://www.ncbi.nlm.nih.gov/ : National Center for Biotechnology Information, National Library of Medicine home page. Links to databases of interest to genomics researchers.
http://image.llnl.gov/image/html/vectors.shtml : Primers for commonly used plasmid vectors that can be used for amplification of the contained cDNA inserts.
http://image.llnl.gov/ : Integrated Molecular Analysis of Genomes (IMAGE) Consortium home page.
http://image.llnl.gov/image/html/idistributors.shtml : Authorized distributors of IMAGE cDNA clones and associated products.
http://www.ncbi.nlm.nih.gov/UniGene/ : UniGene database that partitions GenBank and novel expressed sequence tag sequences into a nonredundant set of gene-oriented clusters. Includes related information such as the tissue types in which the gene has been expressed and map location.
http://www.microarrays.org/ : DeRisi lab site containing microarray protocols and software for analysis of microarray data. Included protocols largely derived from the Cold Spring Harbor Laboratory Microarray Course manual (http://microarrays.org/protocols.html).
http://aims.cps.msu.edu/aims/ : Arabidopsis Information Management System directed by Prof. Sakti Pramanik of Michigan State University. Source of plant cDNAs that can be used as spiked controls.
http://www.atcc.org/ : American Type Culture Collection source for cDNA clones of interest and control genes.
http://rana.lbl.gov/ : Michael Eisen's lab home page at the Lawrence Berkeley National Lab. Includes protocols, data, and software for DNA microarray analysis.
http://cmgm.stanford.edu/pbrown/ : Patrick Brown's lab home page at the Howard Hughes Medical Institute at Stanford University.
http://cmgm.stanford.edu/pbrown/protocols/index.html : Patrick Brown web site listing current protocols.
http://genome-www4.stanford.edu/MicroArray/SMD/ : Stanford microarray database stores raw and normalized data from microarray experiments, as well as their corresponding image files. Interfaces for data retrieval, analysis, and visualization.
http://genome-www4.stanford.edu/MicroArray/SMD/resacademic.html : List of various academic microarray-associated laboratories, groups, and projects.
http://www.tigr.org/softlab/ : Site for programs including TIGR (The Institute for Genomic Research) MultipleExperimentViewer, ArrayViewer, and TIGR Spotfinder.
http://rana.lbl.gov/EisenSoftware.htm : Software available from Dr. Eisen's lab.
Cluster (implementation of hierarchical clustering, self-organizing maps, k-means clustering, principal component analysis).
TreeView (implementation of graphic interface to visualize results of clustering and other analyses from Cluster).
ScanAlyze (processes fluorescent images of microarrays).
GMEP (computes genome-mean expression profiles from expression and sequence data).
http://industry.ebi.ac.uk/alan/MicroArray/ : Alan Robinson's web page from the Industry Program of the European Bioinformatics Institute. Contains large-scale gene expression and microarray links and related resources.
http://classify.stanford.edu/ : Web site-based access to Classification of Expression Arrays, version 1.0 (CLEAVER). This program implements basic microarray analysis software for grouping and feature reduction.
http://www.cs.wustl.edu/jbuhler/research/dapple/ : Dapple is a program for quantitating spots on a two-color DNA microarray image.
http://genome-www4.Stanford.EDU/cgi-bin/sfgf/home.pl/$script/?page=protocols&sectiontitle=sfgf : Stanford functional genomics protocol home page.
http://linkage.rockefeller.edu/wli/microarray/core.html : List of academic core facilities.
http://linkage.rockefeller.edu/wli/microarray/soft.html : List of publicly available analysis programs.

References

1. International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature 2001; 409: 860–921.
2. Hillier LD, Lennon G, Becker M, et al. Generation and analysis of 280,000 human expressed sequence tags. Genome Res 1996; 6: 807–28.
3. Bowtell DD. Options available–from start to finish–for obtaining expression data by microarray. Nat Genet 1999; 21(suppl 1): 25–32.
4. Brown PO, Botstein D. Exploring the new world of the genome with DNA microarrays. Nat Genet 1999; 21(suppl 1): 33–7.
5. Cheung VG, Morley M, Aguilar F, et al. Making and reading microarrays. Nat Genet 1999; 21(suppl 1): 15–9.
6.
Duggan DJ, Bittner M, Chen Y, et al. Expression profiling using cDNA microarrays. Nat Genet 1999; 21(suppl 1): 10–4.
7. Eisen MB, Brown PO. DNA arrays for analysis of gene expression. Methods Enzymol 1999; 303: 179–205.
8. Heller RA, Schena M, Chai A, et al. Discovery and analysis of inflammatory disease-related genes using cDNA microarrays. Proc Natl Acad Sci USA 1997; 94: 2150–5.
9. Lockhart DJ, Dong H, Byrne MC, et al. Expression monitoring by hybridization to high-density oligonucleotide arrays. Nat Biotechnol 1996; 14: 1675–80.
10. Ramsay G. DNA chips: state-of-the-art. Nat Biotechnol 1998; 16: 40–4.
11. Schena M, Shalon D, Heller R, et al. Parallel human genome analysis: microarray-based expression monitoring of 1,000 genes. Proc Natl Acad Sci USA 1996; 93: 10614–9.
12. Wodicka L, Dong H, Mittmann M, et al. Genome-wide expression monitoring in Saccharomyces cerevisiae. Nat Biotechnol 1997; 15: 1359–67.
13. Fodor SP, Read JL, Pirrung MC, et al. Light-directed, spatially addressable parallel chemical synthesis. Science 1991; 251: 767–73.
14. Phimister B. Going global: a note on nomenclature. Nat Genet 1999; 21(suppl 1): 1.
15. Dieckgraefe BK, Stenson WF, Korzenik JR, et al. Analysis of mucosal gene expression in inflammatory bowel disease by parallel oligonucleotide arrays. Physiol Genom 2000; 4: 1–11.
16. Lawrance IC, Fiocchi C, Chakravarti S. Ulcerative colitis and Crohn's disease: distinctive gene expression profiles and novel susceptibility candidate genes. Hum Mol Genet 2001; 10: 445–56.
17. Ekins R, Chu FW.
Microarrays: their origins and applications. Trends Biotechnol 1999; 17: 217–8.
18. Schena M, Shalon D, Davis RW, et al. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 1995; 270: 467–70.
19. Hughes TR, Mao M, Jones AR, et al. Expression profiling using microarrays fabricated by an ink-jet oligonucleotide synthesizer. Nat Biotechnol 2001; 19: 342–7.
20. Pease AC, Solas D, Sullivan EJ, et al. Light-generated oligonucleotide arrays for rapid DNA sequence analysis. Proc Natl Acad Sci USA 1994; 91: 5022–6.
21. Lipshutz RJ, Morris D, Chee M, et al. Using oligonucleotide probe arrays to access genetic diversity. Biotechniques 1995; 19: 442–7.
22. Lockhart DJ, Dong H, Byrne MC, et al. Expression monitoring by hybridization to high-density oligonucleotide arrays. Nat Biotechnol 1996; 14: 1675–80.
23. Richmond CS, Glasner JD, Mau R, et al. Genome-wide expression profiling in Escherichia coli K-12. Nucleic Acids Res 1999; 27: 3821–35.
24. Bonaldo MF, Lennon G, Soares MB. Normalization and subtraction: two approaches to facilitate gene discovery. Genome Res 1996; 6: 791–806.
25. Dulac C. Cloning of genes from single neurons. Curr Top Dev Biol 1998; 36: 245–58.
26. Lockhart DJ, Winzeler EA. Genomics, gene expression and DNA arrays. Nature 2000; 405: 827–36.
27. Stears RL, Getts RC, Gullans SR. A novel, sensitive detection system for high-density microarrays using dendrimer technology. Physiol Genom 2000; 3: 93–9.
28. Rocha D, Carrier A, Naspetti M, et al.
Modulation of mRNA levels in the presence of thymocytes and genome mapping for a set of genes expressed in mouse thymic epithelial cells. Immunogenetics 1997; 46: 142–51.
29. Khan J, Lao HS, Bittner ML, et al. Expression profiling in cancer using cDNA microarrays. Electrophoresis 1999; 20: 223–9.
30. Bernard K, Auphan N, Granjeaud S, et al. Multiplex messenger assay: simultaneous, quantitative measurement of expression of many genes in the context of T cell activation. Nucleic Acids Res 1996; 24: 1435–42.
31. Wittes J, Friedman HP. Searching for evidence of altered gene expression: a comment on statistical analysis of microarray data. JNCI 1999; 91: 400–1.
32. Claverie JM. Computational methods for the identification of differential and coordinated gene expression. Hum Mol Genet 1999; 8: 1821–32.
33. Tanaka TS, Jaradat SA, Lim MK, et al. Genome-wide expression profiling of mid-gestation placenta and embryo using a 15,000 mouse developmental cDNA microarray. Proc Natl Acad Sci USA 2000; 97: 9127–32.
34. Gilbert DR, Schroeder M, van Helden J. Interactive visualization and exploration of relationships between biologic objects. Trends Biotechnol 2000; 18: 487–94.
35. Raychaudhuri S, Sutphin PD, Chang JT, et al. Basic microarray analysis: grouping and feature reduction. Trends Biotechnol 2001; 19: 189–93.
36. Sherlock G. Analysis of large-scale gene expression data. Curr Opin Immunol 2000; 12: 201–5.
37. Eisen MB, Spellman PT, Brown PO, et al. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA 1998; 95: 14863–8.
38. Tamayo P, Slonim D, Mesirov J, et al. Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation. Proc Natl Acad Sci USA 1999; 96: 2907–12.
39. Toronen P, Kolehmainen M, Wong G, et al. Analysis of gene expression data using self-organizing maps. FEBS Lett 1999; 451: 142–6.
40. Kohonen T. Self-Organizing Maps. Berlin: Springer, 1995.
41. Alon U, Barkai N, Notterman DA, et al. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc Natl Acad Sci USA 1999; 96: 6745–50.
42. Chu S, DeRisi J, Eisen M, et al. The transcriptional program of sporulation in budding yeast. Science 1998; 282: 699–705.
43. Iyer VR, Eisen MB, Ross DT, et al. The transcriptional program in the response of human fibroblasts to serum. Science 1999; 283: 83–7.
44. Fornace Jr AJ, Amundson SA, Bittner M, et al. The complexity of radiation stress responses: analysis by informatics and functional genomics approaches. Gene Exp 1999; 7: 387–400.
45. Mao M, Fu G, Wu JS, et al. Identification of genes expressed in human CD34(+) hematopoietic stem/progenitor cells by expressed sequence tags and efficient full-length cDNA cloning. Proc Natl Acad Sci USA 1998; 95: 8175–80.
46. Relman DA. Detection and identification of previously unrecognized microbial pathogens. Emerg Infect Dis 1998; 4: 382–9.
47. Nikkari S, Relman DA. Molecular approaches for identification of infectious agents in Wegener's granulomatosis and other vasculitides. Curr Opin Rheumatol 1999; 11: 11–6.
48. Zhu H, Cong JP, Mamtora G, et al. Cellular gene expression altered by human cytomegalovirus: global monitoring with oligonucleotide arrays. Proc Natl Acad Sci USA 1998; 95: 14470–5.
49. Ichikawa JK, Norris A, Bangera MG, et al. Interaction of Pseudomonas aeruginosa with epithelial cells: identification of differentially regulated genes by expression microarray analysis of human cDNAs. Proc Natl Acad Sci USA 2000; 97: 9659–64.
50. Fambrough D, McClure K, Kazlauskas A, et al. Diverse signaling pathways activated by growth factor receptors induce broadly overlapping, rather than independent, sets of genes. Cell 1999; 97: 727–41.
51. Liu TX, Zhang JW, Tao J, et al. Gene expression networks underlying retinoic acid-induced differentiation of acute promyelocytic leukemia cells. Blood 2000; 96: 1496–504.
52. Roberts CJ, Nelson B, Marton MJ, et al. Signaling and circuitry of multiple MAPK pathways revealed by a matrix of global gene expression profiles. Science 2000; 287: 873–80.
53. Coller HA, Grandori C, Tamayo P, et al. Expression analysis with oligonucleotide microarrays reveals that MYC regulates genes involved in growth, cell cycle, signaling, and adhesion. Proc Natl Acad Sci USA 2000; 97: 3260–5.
54. Voehringer DW, Hirschberg DL, Xiao J, et al. Gene microarray identification of redox and mitochondrial elements that control resistance or sensitivity to apoptosis. Proc Natl Acad Sci USA 2000; 97: 2680–5.
55. Kobayashi S, Akiyama T, Nata K, et al. Identification of a receptor for reg (regenerating gene) protein, a pancreatic beta-cell regeneration factor. J Biol Chem 2000; 275: 10723–6.
56. Zanders ED. Gene expression analysis as an aid to the identification of drug targets. Pharmacogenomics 2000; 1: 375–84.
57. Cunningham MJ. Genomics and proteomics: the new millennium of drug discovery and development. J Pharmacol Toxicol Methods 2000; 44: 291–300.
58. Ivanov I, Schaab C, Planitzer S, et al. DNA microarray technology and antimicrobial drug discovery. Pharmacogenomics 2000; 1: 169–78.
59. D'Haeseleer P, Liang S, Somogyi R. Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 2000; 16: 707–26.
60. Wilson M, DeRisi J, Kristensen HH, et al. Exploring drug-induced alterations in gene expression in Mycobacterium tuberculosis by microarray hybridization. Proc Natl Acad Sci USA 1999; 96: 12833–8.
61. Gray NS, Wodicka L, Thunnissen AM, et al. Exploiting chemical libraries, structure, and genomics in the search for kinase inhibitors. Science 1998; 281: 533–8.
62. Hughes TR, Marton MJ, Jones AR, et al. Functional discovery via a compendium of expression profiles. Cell 2000; 102: 109–26.
63. Weinstein JN, Myers TG, O'Connor PM, et al. An information-intensive approach to the molecular pharmacology of cancer. Science 1997; 275: 343–9.
64. Podolsky DK. Lessons from genetic models of inflammatory bowel disease. Acta Gastroenterol Belg 1997; 60: 163–5.
65. Elson CO, Sartor RB, Tennyson GS, et al. Experimental models of inflammatory bowel disease. Gastroenterology 1995; 109: 1344–67.
66. Alizadeh AA, Eisen MB, Davis RE, et al.
Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 2000; 403: 503–11. 67. Golub TR, Slonim DK, Tamayo P, et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 1999; 286: 531–7. 68. Emmert-Buck MR, Strausberg RL, Krizman DB, et al. Molecular profiling of clinical tissues specimens: feasibility and applications. J Mol Diagn 2000; 2: 60–6. 69. Perou CM, Sorlie T, Eisen MB, et al. Molecular portraits of human breast tumors. Nature 2000; 406: 747–52. 70. Bittner M, Meltzer P, Chen Y, et al. Molecular classification of cutaneous malignant melanoma by gene expression profiling. Nature 2000; 406: 536–40. 71. Watson MA, Perry A, Budhjara V, et al. Gene expression profiling with oligonucleotide microarrays distinguishes World Health Organization grade of oligodendrogliomas. Cancer Res 2001; 61: 1825–9. 72. Khan J, Simon R, Bittner M, et al. Gene expression profiling of alveolar rhabdomyosarcoma with cDNA microarrays. Cancer Res 1998; 58: 5009–13. 73. Sachar D. Crohn's disease: to lump or split? Inflammatory Bowel Diseases 1997; 3: 177. 74. Perri F, Napolitano G, Caruso N, et al. Subgroups of patients with Crohn's disease have different clinical outcomes. Inflammatory Bowel Diseases 1996; 2: 1–5. 75. Aeberhard P, Berchtold W, Riedtmann H-J, et al. Surgical recurrence of perforating and nonperforating Crohn's disease: a study of 101 surgically treated patients. Dis Colon Rectum 1996; 39: 80–7. 76. Franchimont DP, Louis E, Croes F, et al. Clinical pattern of corticosteroid dependent Crohn's disease.
Eur J Gastroenterol Hepatol 1998; 10: 821–5. 77. Gilberts EC, Greenstein AJ, Katsel P, et al. Molecular evidence for two forms of Crohn disease. Proc Natl Acad Sci USA 1994; 91: 12721–4. 78. Dubinsky MC, Lamothe S, Yang HY, et al. Pharmacogenomics and metabolite measurement for 6-mercaptopurine therapy in inflammatory bowel disease. Gastroenterology 2000; 118: 705–13. 79. Pena AS, Crusius JB. Genetics of inflammatory bowel disease: implications for the future. World J Surg 1998; 22: 390–3. 80. Satsangi J, Parkes M, Jewell DP, et al. Genetics of inflammatory bowel disease. Clin Sci 1998; 94: 473–8. 81. Chakravarti A. Population genetics–making sense out of sequence. Nat Genet 1999; 21(suppl 1): 56–60. 82. Hampe J, Cuthbert A, Croucher PJ, et al. Association between insertion mutation in NOD2 gene and Crohn's disease in German and British populations. Lancet 2001; 357: 1925–8. 83. Hugot JP, Chamaillard M, Zouali H, et al. Association of NOD2 leucine-rich repeat variants with susceptibility to Crohn's disease. Nature 2001; 411: 599–603. 84. Ogura Y, Bonen DK, Inohara N, et al. A frameshift mutation in NOD2 associated with susceptibility to Crohn's disease. Nature 2001; 411: 603–6. 85. Ament ME, Ochs HD. Gastrointestinal manifestations of chronic granulomatous disease. N Engl J Med 1973; 288: 382–7. 86. Ahlin A, De Boer M, Roos D, et al. Prevalence, genetics and clinical presentation of chronic granulomatous disease in Sweden. Acta Paediatr 1995; 84: 1386–94. 87. Foster CB, Lehrnbecher T, Mol F, et al.
Host defense molecule polymorphisms influence the risk for immune-mediated complications in chronic granulomatous disease. J Clin Invest 1998; 102: 2146–55. 88. Febbraio M, Abumrad NA, Hajjar DP, et al. A null mutation in murine CD36 reveals an important role in fatty acid and lipoprotein metabolism. J Biol Chem 1999; 274: 19055–62. 89. Hedenfalk I, Duggan D, Chen Y, et al. Gene-expression profiles in hereditary breast cancer. N Engl J Med 2001; 344: 539–48. © 2002 Crohn's & Colitis Foundation of America, Inc. TI - Application of Genome-Wide Gene Expression Profiling by High-Density DNA Arrays to the Treatment and Study of Inflammatory Bowel Disease JF - Inflammatory Bowel Diseases DO - 10.1097/00054725-200203000-00012 DA - 2002-03-01 UR - https://www.deepdyve.com/lp/oxford-university-press/application-of-genome-wide-gene-expression-profiling-by-high-density-8Zhf0MP7DE SP - 140 EP - 157 VL - 8 IS - 2 DP - DeepDyve ER -