S. Altschul, Thomas Madden, A. Schäffer, Jinghui Zhang, Zheng Zhang, W. Miller, D. Lipman (1997)
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research, 25(17)
D. Bartel (2004)
MicroRNAs: Genomics, Biogenesis, Mechanism, and Function. Cell, 116
C. Notredame, D. Higgins, J. Heringa (2000)
T-Coffee: A novel method for fast and accurate multiple sequence alignment. Journal of Molecular Biology, 302(1)
E. Allen, Zhixin Xie, Adam Gustafson, J. Carrington (2005)
microRNA-Directed Phasing during Trans-Acting siRNA Biogenesis in Plants. Cell, 121
S. Griffiths-Jones (2004)
The microRNA Registry. Nucleic Acids Research, 32(Database issue)
A. Adai, Cameron Johnson, Sizolwenkosi Mlotshwa, Sarah Archer-Evans, Varun Manocha, Vicki Vance, V. Sundaresan (2005)
Computational prediction of miRNAs in Arabidopsis thaliana. Genome Research, 15(1)
M. Axtell, D. Bartel (2005)
Antiquity of MicroRNAs and Their Targets in Land Plants. The Plant Cell Online, 17
Lin He, G. Hannon (2004)
MicroRNAs: small RNAs with a big role in gene regulation. Nature Reviews Genetics, 5
Petri Susi, M. Hohkuri, Tony Wahlroos, N. Kilby (2004)
Characteristics of RNA Silencing in Plants: Similarities and Differences Across Kingdoms. Plant Molecular Biology, 54
Temple Smith, M. Waterman (1981)
Identification of common molecular subsequences. Journal of Molecular Biology, 147(1)
Baohong Zhang, Xiaoping Pan, Qing-lian Wang, G. Cobb, T. Anderson (2005)
Identification and characterization of new plant microRNAs using EST analysis. Cell Research, 15
I. Hofacker, W. Fontana, P. Stadler, L. Bonhoeffer, M. Tacker, Philipp Schuster (1994)
Fast folding and comparison of RNA secondary structures. Monatshefte für Chemie / Chemical Monthly, 125
Xiaowo Wang, Jing Zhang, Fei Li, Jin Gu, Tao He, Xuegong Zhang, Yanda Li (2005)
MicroRNA identification based on sequence and structure alignment. Bioinformatics, 21(18)
C. Maher, M. Timmermans, L. Stein, D. Ware (2004)
Identifying microRNAs in plant genomes. Proceedings. 2004 IEEE Computational Systems Bioinformatics Conference, 2004. CSB 2004.
BIOINFORMATICS APPLICATIONS NOTE
Vol. 22 no. 3 2006, pages 359–360
doi:10.1093/bioinformatics/bti802

Sequence analysis

Identification of plant microRNA homologs

Tobias Dezulian¹,*, Michael Remmert¹, Javier F. Palatnik², Detlef Weigel² and Daniel H. Huson¹
¹Center for Bioinformatics Tübingen, Tübingen University, Germany and ²Max Planck Institute for Developmental Biology, Tübingen, Germany

Received on August 12, 2005; revised on November 10, 2005; accepted on November 24, 2005
Advance Access publication November 29, 2005
Associate Editor: Charlie Hodgman

ABSTRACT

Summary: MicroRNAs (miRNAs) are a recently discovered class of non-coding RNAs that regulate gene and protein expression in plants and animals. MiRNAs have so far been identified mostly by specific cloning of small RNA molecules, complemented by computational methods. We present a computational identification approach that is able to identify candidate miRNA homologs in any set of sequences, given a query miRNA. The approach is based on a sequence similarity search step followed by a set of structural filters.
Availability: microHARVESTER is offered as a web-service and additionally as source code upon request at http://www-ab.informatik.uni-tuebingen.de/software/microHARVESTER
Contact: dezulian@informatik.uni-tuebingen.de

*To whom correspondence should be addressed.
© The Author 2005. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

MicroRNAs (miRNAs) are small RNAs 20–24 nt in length. They perform important regulatory roles in both plants and animals. The miRNA biogenesis and effector pathways share components with those for another class of small RNAs, short interfering RNAs (siRNAs), and both are currently under intense scrutiny (Susi et al., 2004). Biogenesis of miRNAs starts with the synthesis of a large primary transcript (Bartel, 2004; He and Hannon, 2004), which contains a double-stranded miRNA precursor that adopts a fold-back structure by complementary base pairing. In plants, the miRNA precursor is degraded in the nucleus by the RNAse III enzyme DICER-LIKE1, which releases a short RNA duplex. This duplex is formed by the miRNA along with the complementary fragment, called miRNA*, from the other arm of the precursor. The miRNA and the miRNA* are offset by 2 nt owing to the staggered cuts of DICER-LIKE1. Finally, mature miRNAs are selected from the RNA duplex and incorporated into RNA induced silencing complexes (RISCs), to which they provide sequence specificity.

MiRNAs recognize completely or partially complementary sequences in target mRNAs and guide them to cleavage or translational arrest. Animal miRNAs typically recognize several target sequences located in the 3′-UTR and inhibit their translation, whereas plant miRNAs usually recognize one motif in the coding region of their targets and affect their stability. It is thought that the better complementarity between plant miRNAs and their targets favors the latter mechanism. In plants, miRNAs regulate diverse genes and pathways, such as development, hormone signaling, stress response and trans-acting siRNAs (Allen et al., 2005).

MiRNA genes in plants are grouped into families that yield (almost) identical miRNAs. Currently, 43 families containing 513 miRNA genes across 7 plant species are listed in release 7.0 of the MicroRNA registry (Griffiths-Jones, 2004, http://microrna.sanger.ac.uk) responsible for name assignment of published miRNAs.

Here, we present an approach and implementation ('microHARVESTER') that can identify candidate miRNA homologs based on a query miRNA, with excellent sensitivity and specificity. The microHARVESTER takes advantage of the conservation pattern typical for miRNA genes: the (mature) miRNA is most conserved since its sequence is crucial for target-interaction; the miRNA* is less conserved but restricted by the need to extensively base-pair with the miRNA; the rest of the miRNA gene can be less conserved. Our approach uses a BLAST sequence similarity search to first generate a set of candidates which is then rigorously refined by a series of filters, exploiting structural features specific to plant miRNAs to achieve specificity. The output of the tool consists of a PDF overview document that is generated for each miRNA query. It presents candidate miRNA homologs along with figures of their predicted structure and a color-coded alignment.

Given a known miRNA (miRNA precursor sequence plus mature miRNA sequence) as input for our search we use the precursor as a query for a sequence similarity search against a set of sequences (e.g. a set of EST sequences or read from a new plant genome) to generate a set of candidate homologs. Since the (mature) miRNA sequence is very much conserved across large evolutionary distances (Axtell and Bartel, 2005), using BLAST (Altschul et al., 1997) with the very large E-value cutoff of 10 and minimal word size of 7, one can generate a hit for almost all miRNA homologs at the price of many false positives. In the first filter step, we discard those sequences of the candidate set whose aligned segments do not span most of the mature segment of the query. In a second filter step, we apply a modified Smith–Waterman pairwise alignment algorithm (Smith and Waterman, 1981) to precisely determine the mature sequence in the candidate precursor from the optimal alignment of the query mature sequence against the corresponding segment of the BLAST hit. We discard a candidate if the length of the mature sequences differs by >2 nt. In a third filter step, we predict the minimal free energy structure of the candidate sequence using RNAfold (Hofacker, 1994) and determine its putative miRNA* sequence. We discard a candidate if more than six nucleotides of its miRNA* are not predicted to form bonds with its mature miRNA (keeping in mind the 2 nt offset between miRNA and miRNA*). From a selection of all candidates that pass each filter we construct a multiple sequence alignment, using T-Coffee (Notredame et al., 2000), of a region that includes the miRNA, the miRNA* and the 'loop' sequence in between the miRNA and the miRNA*. The reliability of each position of this multiple alignment is visualized using a color scheme. An overview PDF document is generated, which contains this multiple sequence alignment. In addition, it provides for each putative miRNA homolog: a figure of its minimal free energy structure with the miRNA and the miRNA* highlighted in dark and light shades, respectively, along with its database accession (Fig. 1).

Fig. 1. (A) The multiple sequence alignment shows the reliability of each alignment position. Darker colors indicate better alignment scores. Dark and light frames mark the positions coding for the miRNA and miRNA*, respectively. (B) The minimal free energy structure for an EST harboring a miRNA homolog candidate is depicted; in the enlarged section (C), miRNA and miRNA* are marked on the right and left hand side, respectively.

In order to assess sensitivity and specificity of this approach, we applied the microHARVESTER to the fully sequenced dicot Arabidopsis thaliana (Ath) genome using a set of query sequences from the monocot Zea mays (Zma). For each of the currently available (MicroRNA registry release 7.0) 18 miRNA families shared by Ath and Zma we selected one Zma miRNA gene at random. Using this query set, the microHARVESTER identified 67 of the 75 Ath miRNA genes of these families (at least one in each family) at the price of five false positives.

MicroHARVESTER is available as a web-service at www-ab.informatik.uni-tuebingen.de/software/microHARVESTER. Up to five miRNA queries may be submitted upon which a job id and URL will be issued and the resulting PDFs will be downloadable after job completion. Source code for the microHARVESTER is also available from the authors upon request. In order to run this standalone version on a standard linux operating system, additionally the following free software is needed: Java 1.5, NCBI BLAST, RNAfold, T-Coffee plus a standard LaTeX installation. Results can optionally be stored in a mySQL database. Note that when constructing the BLAST database, large input sequences are split into overlapping fragments for better retrieval efficiency.

MicroHARVESTER is able to identify plant miRNA homologs with good sensitivity and specificity in any set of sequences, for a given query miRNA. Using an EST database as the sequence pool offers the additional assurance that the predicted miRNA homologs are actually expressed (Zhang et al., 2005). Nevertheless, this approach has also proven useful on databases of genomic DNA. Successful approaches for plant miRNA homolog identification have previously been described (Maher et al., 2004; Adai et al., 2005). However, microHARVESTER is the first such tool that is available through a web interface. It complements a very recently published animal miRNA homolog identification approach (Wang et al., 2005).

In addition to the original purpose of miRNA homolog identification, microHARVESTER can be effectively used to screen candidate miRNA sets derived from comparative approaches to identify representatives of new miRNA families. In this setting, each candidate miRNA is used as the query and the number and divergence pattern of resulting putative homologs as well as their structure provides clues to the miRNA-likeness of the query.

Conflict of Interest: none declared.

REFERENCES

Adai,A., Johnson,C., Mlotshwa,S., Archer-Evans,S., Manocha,V., Vance,V. and Sundaresan,V. (2005) Computational prediction of miRNAs in Arabidopsis thaliana. Genome Res., 15, 78–91.
Allen,E., Xie,Z., Gustafson,A.M. and Carrington,J.C. (2005) microRNA-directed phasing during trans-acting siRNA biogenesis in plants. Cell, 121, 207–221.
Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.
Axtell,M.J. and Bartel,D.P. (2005) Antiquity of MicroRNAs and their targets in land plants. Plant Cell, 17, 1658–1673.
Bartel,D.P. (2004) MicroRNAs: genomics, biogenesis, mechanism, and function. Cell, 116, 281–297.
Griffiths-Jones,S. (2004) The microRNA Registry. Nucleic Acids Res., 32, D109–D111.
He,L. and Hannon,G.J. (2004) MicroRNAs: small RNAs with a big role in gene regulation. Nat. Rev. Genet., 5, 522–531.
Hofacker,I.L., Fontana,W., Stadler,P.F., Bonhoeffer,L.S., Tacker,M. and Schuster,P. (1994) Fast folding and comparison of RNA secondary structures. Monatsh. Chem., 125, 167–188.
Maher,C., Timmermans,M., Stein,L. and Ware,D. (2004) Identifying MicroRNAs in Plant Genomes. Proceedings of the 2004 IEEE Computational Systems Bioinformatics Conference (CSB 2004) Stanford, CA, pp. 718–723.
Notredame,C., Higgins,D.G. and Heringa,J. (2000) T-Coffee: A novel method for fast and accurate multiple sequence alignment. J. Mol. Biol., 302, 205–217.
Smith,T.F. and Waterman,M.S. (1981) Identification of common molecular subsequences. J. Mol. Biol., 147, 195–197.
Susi,P., Hohkuri,M., Wahlroos,T. and Kilby,N.J. (2004) Characteristics of RNA silencing in plants: similarities and differences across kingdoms. Plant Mol. Biol., 54, 157–174.
Wang,X., Zhang,J., Li,F., Gu,J., He,T., Zhang,X. and Li,Y. (2005) MicroRNA identification based on sequence and structure alignment. Bioinformatics, 21, 3610–3614.
Zhang,B.H., Pan,X.P., Wang,Q.L., Cobb,G.P. and Anderson,T.A. (2005) Identification and characterization of new plant microRNAs using EST analysis. Cell Res., 15, 336–360.
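
The three filter steps described in the text (coverage of the mature segment, a mature-length difference of at most 2 nt, and at most six miRNA* nucleotides not bonded to the mature miRNA) can be illustrated with a short sketch. This is not the authors' implementation, which is available only on request. All names here (`Candidate`, `covers_mature`, `length_ok`, `unpaired_star`, `passes_filters`) are hypothetical, and several points are assumptions: the 0.8 coverage fraction stands in for "most", which the text does not quantify; the miRNA* interval is taken as given rather than derived; and skipping the last 2 nt of the miRNA* is only one possible reading of the 2 nt offset. The structure is expected as a dot-bracket string such as the one RNAfold prints.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One BLAST hit reduced to what the filters need (hypothetical layout).

    All intervals are 0-based and end-exclusive.
    """
    query_start: int   # start of the aligned segment, in query coordinates
    query_end: int     # end of the aligned segment, in query coordinates
    structure: str     # dot-bracket secondary structure of the candidate
    mature: tuple      # (start, end) of the mature miRNA in the candidate
    star: tuple        # (start, end) of the putative miRNA* in the candidate


def covers_mature(c, q_mature, min_fraction=0.8):
    """Filter 1: the aligned segment must span most of the query's mature segment.

    min_fraction is an assumed value, not taken from the paper.
    """
    overlap = min(c.query_end, q_mature[1]) - max(c.query_start, q_mature[0])
    return overlap >= min_fraction * (q_mature[1] - q_mature[0])


def length_ok(c, q_mature, max_diff=2):
    """Filter 2: discard if the mature lengths differ by more than 2 nt."""
    cand_len = c.mature[1] - c.mature[0]
    query_len = q_mature[1] - q_mature[0]
    return abs(cand_len - query_len) <= max_diff


def pair_table(structure):
    """Return, for each position, the index of its partner or -1 if unpaired."""
    partner, stack = [-1] * len(structure), []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop()
            partner[i], partner[j] = j, i
    return partner


def unpaired_star(c, overhang=2):
    """Count miRNA* nucleotides that have no partner inside the mature miRNA.

    The last `overhang` positions of the miRNA* are skipped (assumed handling
    of the 2 nt offset).
    """
    partner = pair_table(c.structure)
    m_start, m_end = c.mature
    s_start, s_end = c.star
    return sum(1 for i in range(s_start, s_end - overhang)
               if not m_start <= partner[i] < m_end)


def passes_filters(c, q_mature, max_unpaired=6):
    """Filter 3 plus the two earlier checks: keep a candidate only if all hold."""
    return (covers_mature(c, q_mature)
            and length_ok(c, q_mature)
            and unpaired_star(c) <= max_unpaired)
```

For instance, a perfectly paired toy hairpin such as `"((" + "(" * 21 + "....." + ")" * 21 + "))"` with the mature miRNA at `(2, 23)` and the miRNA* at `(30, 51)` yields zero unpaired miRNA* nucleotides and passes, whereas the same intervals on an unstructured sequence fail the third filter.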
Bioinformatics – Oxford University Press
Published: Nov 29, 2005