Nucleic Acids Research, 1998, Vol. 26, No. 4, 865–878
© 1998 Oxford University Press

Genome structure and gene content in protist mitochondrial DNAs

Michael W. Gray*, B. Franz Lang, Robert Cedergren, G. Brian Golding, Claude Lemieux, David Sankoff, Monique Turmel, Nicolas Brossard, Eric Delage, Tim G. Littlejohn+, Isabelle Plante, Pierre Rioux, Diane Saint-Louis, Yun Zhu and Gertraud Burger

Program in Evolutionary Biology, Canadian Institute for Advanced Research, Department of Biochemistry, Dalhousie University, Halifax, Nova Scotia B3H 4H7, Canada, Département de Biochimie, Centre de Recherche Mathématique and OGMP Sequencing Unit, Université de Montréal, Montréal, Québec H3C 3J7, Canada, Department of Biology, McMaster University, Hamilton, Ontario L8S 4K1, Canada and Département de Biochimie, Université Laval, Québec, Québec G1K 7P4, Canada

Received October 23, 1997; Revised and Accepted November 21, 1997

*To whom correspondence should be addressed. Tel: +1 902 494 2521; Fax: +1 902 494 1355; Email: [email protected]
+Present address: Australian National Genomic Information Service (ANGIS), University of Sydney, Sydney, New South Wales 2006, Australia

ABSTRACT

Although the collection of completely sequenced mitochondrial genomes is expanding rapidly, only recently has a phylogenetically broad representation of mtDNA sequences from protists (mostly unicellular eukaryotes) become available. This review surveys the 23 complete protist mtDNA sequences that have been determined to date, commenting on such aspects as mitochondrial genome structure, gene content, ribosomal RNA, introns, transfer RNAs and the genetic code and phylogenetic implications. We also illustrate the utility of a comparative genomics approach to gene identification by providing evidence that orfB in plant and protist mtDNAs is the homolog of atp8, the gene in animal and fungal mtDNA that encodes subunit 8 of the F0 portion of mitochondrial ATP synthase. Although several protist mtDNAs, like those of animals and most fungi, are seen to be highly derived, others appear to have retained a number of features of the ancestral, proto-mitochondrial genome. Some of these ancestral features are also shared with plant mtDNA, although the latter have evidently expanded considerably in size, if not in gene content, in the course of evolution. Comparative analysis of protist mtDNAs is providing a new perspective on mtDNA evolution: how the original mitochondrial genome was organized, what genes it contained, and in what ways it must have changed in different eukaryotic phyla.

INTRODUCTION

Mitochondrial DNA (mtDNA) is extraordinarily diverse in size, gene content and genome organization (1–5) and it is a daunting task to attempt to elucidate the mechanisms and reconstruct the pathways by which this evolutionary diversification has occurred. The preferred approach to answering such evolutionary questions is through comparative analysis of complete mtDNA sequences, which provides a genome-level perspective on such issues as what genes are present, how they are arranged, whether there are introns (and, if so, what types), how spacer sequences are distributed and how large they are, whether segments of the genome are repeated and other relevant information.

Currently, 63 complete mtDNA sequences are available through public domain databases; however, the phylogenetic range that these sequences represent is both narrow and biased: 47 (75%) are from animal species (31 vertebrate, 16 invertebrate); five (8%) are from fungi; two (3%) are from plants; only nine (14%) are from protists, even though the latter group of organisms (mostly unicellular) comprises the bulk of the biological diversity of the eukaryotic lineage (6). This limited and highly non-representative data set has made it difficult to draw meaningful conclusions about the ancestral form of the mitochondrial genome, a necessary starting point for inferences about subsequent mitochondrial genome evolution.

To redress this imbalance, the Organelle Genome Megasequencing Program (OGMP) was established in 1992, having as a specific aim the systematic and comprehensive determination of complete protist mtDNA sequences. [Brief descriptions of the OGMP and two allied databases, the Protist Image Database (PID) and the Organelle Genome Database Project (GOBASE), appear at the end of this review]. At that time only three complete protist mitochondrial genome sequences had been published: the 6 kb mtDNA sequences of the apicomplexans Plasmodium yoelii (a rodent parasite) (7) and Plasmodium falciparum (the human malaria parasite) (8) and the 40 kb mtDNA sequence of the ciliate protozoan Paramecium aurelia (9). Partial but extensive mtDNA sequence information was also available for another ciliate protozoan, Tetrahymena pyriformis, several trypanosomatid protozoa (in the genera Trypanosoma, Leishmania and Crithidia) and the green alga (chlorophyte) Chlamydomonas reinhardtii. These limited data suggested that protist mtDNAs might be even more structurally variable than their counterparts in the multicellular eukaryotic lineages (1).

In the ensuing 5 years, a larger selection of complete protist mtDNA sequences has become available through the efforts of the OGMP, a complementary Fungal Mitochondrial Genome Project (FMGP) (5) and other research groups. This review summarizes and comments upon various aspects of protist mitochondrial genome structure, particularly gene content, that have emerged from these new sequences. In recent years comprehensive reviews of animal (10), fungal (5,11) and plant (12,13) mtDNAs have been published, but reviews of protist mtDNAs have been limited to specific groups, e.g. ciliates (14), trypanosomatids (15) and apicomplexans (16). Because protists encompass most of the phylogenetic breadth of the eukaryotic lineage and, by definition, contain a number of clades whose evolutionary depth exceeds that of the traditional animal, plant and fungal kingdoms, it is important to sample widely within this disparate assemblage to obtain a clear perspective on the range of mtDNA structural diversity in protists, in comparison with the more widely studied mitochondrial genomes from other eukaryotes.

The data assembled here emphasize that most non-protist mtDNAs, particularly those of animals, are substantially derived relative to most of their protist counterparts, having lost many genes that are commonly still found in protist mitochondrial genomes. The compilation provided here better defines the properties of a typical ancestral (i.e. minimally diverged) protist mtDNA and allows us to suggest with greater confidence what genes were likely contained in the proto-mitochondrial genome (i.e. the last common ancestor of contemporary mitochondrial genomes).

SCOPE OF THE REVIEW

Table 1 identifies the 23 complete protist mtDNA sequences that to our knowledge have been determined to date. These sequences encompass a reasonably broad selection of protist taxa, although they still represent only a fraction of recognized protist lineages (6). Nine of these sequences are in the public domain; the remainder are unpublished ones determined by the OGMP (eight), the FMGP (two) or other research groups (four). As well, we include complete mtDNA sequences from representative non-protists for purposes of comparison. Figure 1 displays the relative phylogenetic positions (to the extent that these can be inferred or proposed at present) of the protists listed in Table 1, together with other protist species, including future candidates selected by the OGMP for complete mtDNA sequencing.

Table 1. Completely determined mitochondrial genome sequences
Descriptions of and detailed information about many of these species may be found at the Protist Image Database (PID; URL http://megasun.bch.umontreal.ca/protists/). Where the complete sequence is reported in one or two papers, the references are listed here; otherwise, relevant citations can be obtained by consulting the annotation provided in the NCBI entry. Data from unpublished sequences were provided by: OGMP, Organelle Genome Megasequencing Program; FMGP, Fungal Mitochondrial Genome Project (URL http://megasun.bch.umontreal.ca/People/lang/FMGP); RWL, R.W.Lee (Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada); YT, Y.Tanaka (Institute of Biological Sciences, University of Tsukuba, Japan); CL/MT, C.Lemieux and M.Turmel (Département de Biochimie, Université Laval, Québec, Canada); PJM, P.J.Myler (Seattle Biomedical Research Institute, Seattle, WA); DRW, D.R.Wolstenholme (Department of Biology, University of Utah, Salt Lake City, UT). Data summaries and gene maps for the individual OGMP sequencing projects are available at URL http://megasun.bch.umontreal.ca/ogmp/. P.J.Myler, personal communication. A different sequence, assembled from a number of separate sources, is available as NCBI accession no. M94286. The sequence of the transcribed region of Leishmania tarentolae maxicircle DNA is also available (accession no. M10126).

METHODOLOGY

Data collection and analysis

In the case of complete mtDNA sequences published by other groups and deposited in the public domain we have used the standardized and corrected versions available in GOBASE (17; see below). Importantly, annotations accompanying these sequences have been unified with respect to gene and product nomenclature. These particular sequences have also been re-analyzed by us using informatics tools developed in-house and described below.

With the exception of BLAST (used for remote database searches) (18), FASTA (used for detailed sequence comparison) (19) and NIP (the Staden nucleotide sequence analysis package) (20), all of the informatics tools employed for this compilation have been developed by the OGMP Sequencing Unit. Many of the programs make use of the OGMP ‘masterfile’ (mf) concept, an ASCII-based sequence file format that integrates nucleotide sequence, gene annotations and technical notes.

The sequence retrieval and analysis tools developed by the OGMP have for the most part been written in the Perl programming language. These tools include: BBLAST [batch mode BLAST search of the National Center for Biotechnology Information (NCBI) GenBank database]; BOB (BLAST output browser); FERRET, BADGER and CLEVER, retrieval tools used in conjunction with the NCBI Entrez database; GOBASE2MF [a program for converting from sequence records stored in Sybase tables of GOBASE (17) into mf format]; CLEANMF (used to verify sequence files in mf format as to annotation syntax and logic); PEPPER (for translation of protein coding sequences and extraction of non-coding regions); ONIP (command line interface to the Staden NIP program, used in the creation of codon usage tables of various gene classes); CN (sequence counter and checker). For compiling the body of data presented in Table 2, a number of wrapper scripts were written in the Bourne shell script language; these programs call upon the above tools and produce output files of appropriate layout. Scripts that use genome sequence files in mf format as input include: CODAT (calculation of A+T content of coding and non-coding regions); COTAB [creation of codon usage tables of three types of protein coding regions: genes, intronic open reading frames (ORFs) and unique ORFs]; BFASTA (batch FASTA search, used in comparing the protein sequences of two library files); TRNLIST (which creates a list of tRNA genes present in a genome). Further information about these programs is available at the OGMP website (see below).

RESULTS AND DISCUSSION

Mitochondrial genome structure

Complete sequence analysis has provided evidence of both circular mapping and linear mapping protist mtDNAs, with circular mapping genomes predominating (Table 2). Among the protist mitochondrial genomes characterized as linear, no common end structures have been identified (see Table 2 for details).

The protist mtDNAs listed in Table 2 have a median size of ~40 kb, ranging from 6 kb in the three apicomplexan species (the smallest known mtDNAs) to 77 kb in the choanoflagellate Monosiga brevicollis. The majority of protist mtDNAs are compact, gene-rich genomes, with few or no large non-coding regions. Intergenic spacers are generally small and sparse, accounting in nine cases for <10% of the mtDNA, with coding regions sometimes overlapping. In Acanthamoeba castellanii, Dictyostelium discoideum, M.brevicollis, Chlamydomonas eugametos and Pedinomonas minor all genes are transcribed from the same strand of the mtDNA; otherwise, more than one potential transcription unit is present in protist mitochondrial genomes.

The overall A+T content is high (>70% in 15 cases) in protist mtDNAs and is usually elevated in non-coding intergenic regions compared with coding regions (up to 1.2-fold higher in M.brevicollis mtDNA). The numbers in Table 2 suggest that, in general, protist mtDNAs have evolved in the direction of higher A+T content.

In animals, as exemplified by Homo sapiens and Metridium senile in Table 2, the evolutionary trend has clearly been toward a further compaction of the mitochondrial genome, both by loss of genes and by virtual elimination of intergenic spacers. Conversely, in plants (e.g. Marchantia polymorpha) the trend has been in the opposite direction, with the mtDNA tending to increase in size, primarily by acquisition of a large amount of apparently non-coding DNA of currently unknown origin and function (Table 2). In the recently sequenced 366 924 bp mitochondrial genome of the angiosperm Arabidopsis thaliana (21), fewer genes are encoded than are found in M.polymorpha mtDNA, which is half the size (Table 2); overall <10% of the A.thaliana mtDNA has an assigned coding function. A key question is how and why evolution has produced such divergent mitochondrial genome patterns in different eukaryotic lines.

Gene content

In vertebrate animals, e.g. H.sapiens (Hsa), the mitochondrial genome contains genes for 13 inner mitochondrial membrane proteins involved in electron transport and coupled oxidative phosphorylation (nad1-6 and 4L, cob, cox1-3 and atp6 and 8) (Table 3), as well as genes for large subunit (LSU) and small subunit (SSU) rRNAs (rnl and rns respectively; Table 4). This ‘standard set’ of mtDNA-encoded genes (plus atp9) is also found in fungal (e.g. Allomyces macrogynus, Ama) mtDNAs, except that certain ascomycete fungi (e.g. Schizosaccharomyces pombe, Spo) lack all nad genes. Animal and fungal mtDNAs do not encode a 5S rRNA (Table 4) nor, with the exception of rps3 in A.macrogynus mtDNA (22), do they carry any ribosomal protein genes (Table 5). In land plant mtDNAs a few extra respiratory chain protein genes are found (e.g. nad9 and atp1 in M.polymorpha; Table 3); however, the most notable departure from animal and fungal mtDNAs is the presence in plant mtDNA of a set of ribosomal protein genes (Table 5) as well as a gene for 5S rRNA (rrn5; Table 4). In the case of M.polymorpha mtDNA several homologs of known mitochondrial genes (e.g. sdh3,4 and yejR,U,V; Tables 3 and 6) were initially considered to be unique ORFs (23).

With respect to gene content, protist mtDNAs generally resemble plant rather than animal or fungal mtDNAs. The largest gene repertoire so far identified in any mtDNA is that found in the mitochondrial genome of the heterotrophic flagellate Reclinomonas americana (Ram, Tables 3–7; 24). Genes in the other sequenced mtDNAs are all subsets of the R.americana set, implying that the R.americana pattern is closest to the ancestral pattern of genes carried by the proto-mitochondrial genome (24).
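The subset relationship described above can be checked mechanically once gene lists are written down as sets. The following is a minimal sketch, not the OGMP tooling: the gene lists contain only genes named in the text (they are not the full contents of Tables 3–7), `REFERENCE_PARTIAL` is a hypothetical stand-in for the much larger R.americana repertoire, and the function name `genes_absent` is an assumption made for illustration.

```python
# Illustrative gene sets built only from genes named in the text.
# They are NOT the complete gene lists of Tables 3-7.

# 13 respiratory chain genes plus the two rRNA genes of H.sapiens mtDNA.
HUMAN = {
    "nad1", "nad2", "nad3", "nad4", "nad4L", "nad5", "nad6",
    "cob", "cox1", "cox2", "cox3", "atp6", "atp8", "rnl", "rns",
}

# Per the text: the standard set plus atp9, but lacking all nad genes.
NAD_GENES = {g for g in HUMAN if g.startswith("nad")}
S_POMBE = (HUMAN - NAD_GENES) | {"atp9"}

# Hypothetical partial stand-in for the R.americana repertoire: the standard
# set plus extra genes that the text reports in mtDNAs whose gene content is
# described as a subset of the R.americana set.
REFERENCE_PARTIAL = HUMAN | {"atp9", "nad9", "atp1", "sdh3", "sdh4", "rrn5", "rnpB"}


def genes_absent(genome, reference):
    """Return reference genes missing from `genome`, sorted by name.

    Raises ValueError if `genome` is not a subset of `reference`, i.e. if it
    carries a gene that the presumed ancestral-like set lacks.
    """
    extra = genome - reference
    if extra:
        raise ValueError(f"not a subset; unexpected genes: {sorted(extra)}")
    return sorted(reference - genome)


if __name__ == "__main__":
    print("absent from H.sapiens:", genes_absent(HUMAN, REFERENCE_PARTIAL))
    print("absent from S.pombe:", genes_absent(S_POMBE, REFERENCE_PARTIAL))
```

Representing each genome as a set makes the two statements in the text (every genome is a subset of the reference, and lineages differ in which genes they lack) a direct subset test and a set difference.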
The R.americana results also indicate that gene loss (presumably by transfer to the nucleus) has occurred to different extents in different lineages (25), with many respiratory chain genes and almost all ribosomal protein genes having already been eliminated in the common ancestor of animal and fungal mtDNAs. In support of the view that R.americana mtDNA is ancestral (i.e. minimally diverged) is the highly eubacterial character of certain of its genes (e.g. rnpB, encoding the RNA component of RNase P) as well as the presence of putative eubacterial translation initiation signals (Shine–Dalgarno motifs; 24). In addition, as in the case of chloroplast genomes (3,26,27), R.americana mtDNA encodes subunits of a multi-component, eubacteria-like (α2ββ′) core RNA polymerase. In contrast, in other eukaryotes the core mitochondrial RNA polymerase is a single polypeptide, nuclear DNA-encoded enzyme homologous to bacteriophage T3 and T7 RNA polymerases (28–32). Although R.americana mtDNA has a larger number of genes than other sequenced protist mtDNAs, it is notable that these additional genes are all involved in mitochondrial biogenesis and/or function.

The emerging data suggest that loss of particular genes from mtDNA happened a number of times, independently, in the course of mitochondrial genome evolution. For example, sdh genes have only been found so far (Table 3) in the mtDNA of a cryptophyte [Rhodomonas salina (33)], rhodophytes [the red algae Porphyra purpurea (33), Chondrus crispus (34) and Cyanidium caldarium (35)] and land plants [M.polymorpha (33,36)], as well as in R.americana mtDNA (24,33). These genes are not present in A.thaliana mtDNA (21) and so far have not been identified in other, partially sequenced angiosperm mitochondrial genomes. Considering the proposed phylogenetic positions of these lineages (Fig. 1) and the current limited distribution of mtDNA-encoded sdh genes, we infer that these genes must have been lost from mtDNA on different occasions (33).

Figure 1. Phylogenetic hypothesis of the eukaryotic lineage based on ultrastructural and molecular data. Organisms are divided into three main groups distinguished by mitochondrial cristal shape (either discoidal, flattened or tubular). Unbroken lines indicate phylogenetic relationships that are firmly supported by available data; broken lines indicate uncertainties in phylogenetic placement, resolution of which will require additional data. Color coding of organismal genus names indicates mitochondrial genomes that have been completely (Table 1), almost completely (Jakoba, Naegleria and Thraustochytrium) or partially (*) sequenced by the OGMP (red), the FMGP (black) or other groups (green). Names in blue indicate those species whose mtDNAs are currently being sequenced by the OGMP or are future candidates for complete sequencing. Amitochondriate retortamonads are positioned at the base of the tree, with broken arrows denoting the endosymbiotic origin(s) of mitochondria from a Rickettsia-like eubacterium. Macrophar., Macropharyngomonas.

Table 2. Characteristics of sequenced mitochondrial genomes
C, circular mapping; L, linear mapping. Includes identified genes, unidentified ORFs, introns and intron ORFs. Includes 492 bp subterminal inverted repeats and terminal 40 nt 3′ single-strand extensions (78). Includes 2208 bp terminal inverted repeats (OGMP, unpublished results).
Sequence starts at the DNA replication initiation loop, which contains a tandem array of eleven 34 bp A+T-rich repeat units. Termination sequence at the other end of the linear DNA (estimated to be ~200 bp) remains unsequenced (14). Head-to-tail tandem repeats of a 6 kb unit (82). Length of repeat unit. Excluding tandemly arrayed telomeric sequences (31 bp repeat unit) of variable length (OGMP, unpublished results). 7.1 kb DNA element containing incompletely characterized terminal inverted repeats (79). Excludes terminal inverted repeat sequences (residues 1–59 and 5783–5895 of Z23263). Identification of fragmented and scrambled rRNA coding modules (see Table 4) is incomplete for these genomes; for that reason the proportion of coding versus non-coding DNA cannot be calculated at present.

As the sorts of comparative data being generated by complete protist mtDNA sequencing continue to accumulate, we should be able to document more precisely the number and timing of individual instances of mitochondrial gene loss, many of which undoubtedly involve mitochondrion to nucleus gene transfer. Even now, the results suggest that gene flux from mitochondrial to nuclear genomes is not only a widespread and on-going phenomenon, but that it has been both more gradual and more frequent than previously appreciated. The cox2 gene, as one example, appears to have been lost from mtDNA at least three times (see Table 3): in the lineage leading to the Apicomplexa, in the Pedinomonas/Chlamydomonas lineage of green algae and in certain legumes (dicotyledonous plants) (37,38).

Most protist mtDNAs contain a number of conserved but unidentified ORFs (Table 6). Especially notable in this regard are ymf16 (which has been shown to code for a membrane protein of unknown function; 39) and ymf39, which are present in the mtDNA of many protists and plants (but not in animal or fungal mtDNA). However, most of the unidentified ORFs encountered during mitochondrial genome sequencing are unique: they do not match any sequence in the protein databases. Considering the nature and distribution of identified respiratory chain (Table 3) and ribosomal protein genes (Table 5), we suspect that at least some of these unidentified ORFs may represent highly diverged versions of known mtDNA-encoded genes, no longer recognizable by similarity searches. Additional comparative data should help to address this question and may ultimately permit the functional assignment of conserved ORFs, as in the case of ymf19 (orfB; see below). Assuming that further gene assignments of this type can be made through this comparative approach, differences in protist mtDNA gene content could turn out to be less pronounced than they appear to be at the moment.

Table 3. Mitochondrial DNA-encoded genes involved in electron transport and coupled oxidative phosphorylation
Full organism names are listed in Table 1. Symbols in the table mark each gene as present, as a pseudogene or as absent. Arabidopsis thaliana mtDNA (accession nos Y08501 and Y08502) lacks sdh genes but encodes a functional copy of nad7 (21). Pyo and Tpa mtDNAs, which have the same gene content as Pfa mtDNA, are not listed in this table. The same genes are found in the maxicircle DNA of Leishmania tarentolae (accession no. M10126). Transcripts of trypanosomatid mitochondrial genes undergo post-transcriptional U addition/deletion RNA editing to generate translatable mRNAs (83). In both T.pyriformis and P.aurelia mitochondria the nad1 gene is split into two pieces and rearranged (OGMP, unpublished results). In T.pyriformis, corresponding transcripts have been identified, one (nad1_a) encoding the N-terminal portion and the other (nad1_b) specifying the C-terminal portion of NADH dehydrogenase subunit 1 (J.Edqvist and M.W.Gray, unpublished results). Identification of nad3 in trypanosomatid mtDNA (84) should be regarded as tentative (P.J.Myler, personal communication). Gene contains six in-frame TGA codons (23); transcript detected but not further processed (85). A single open reading frame (cox1_cox2) encodes both subunits 1 and 2 of cytochrome c oxidase in A.castellanii (86) and D.discoideum (87,88) mtDNAs. orf172 (ymf19; 89) in M.polymorpha mtDNA and orfB in angiosperm mtDNA (see text).

Ribosomal RNA

With only a few exceptions, protist mtDNAs encode LSU and SSU rRNAs whose potential secondary structures deviate minimally from their eubacterial counterparts (OGMP, unpublished results). This corresponds to what has been observed with plant mitochondrial rRNAs, but stands in marked contrast to most fungal but particularly animal mitochondrial rRNAs (40,41). Clearly recognizable in most protist mitochondrial LSU rRNAs are the 5′- and 3′-terminal regions corresponding to the ‘5.8S’ and ‘4.5S’ domains of a eubacterial counterpart such as Escherichia coli 23S rRNA. These terminal regions have largely been eliminated from animal mitochondrial LSU rRNAs (41). These observations reinforce the emerging view that the most ancestral (minimally derived) mitochondrial genomes will be found among the protists.

A minority of protist mtDNAs encode rRNA genes whose structure and/or the structure of their products is very unusual. The 9S (SSU) and 12S (LSU) mitochondrial rRNAs of trypanosomatid protozoa (e.g. Leishmania tarentolae and Trypanosoma brucei) are among the smallest and structurally most divergent of known rRNAs, having potential secondary structures in which only a few of the expected conserved structural elements are identifiable (40,41). Also unusual are the mitochondrial rnl genes of Paramecium aurelia (42,43), Tetrahymena pyriformis (43) and Pedinomonas minor (OGMP, unpublished results), which are split into two pieces that are separated in the genome and interspersed with other genes (Table 4). The Pedinomonas situation is particularly intriguing because a more extreme case of rnl fragmentation and scrambling is seen in the mtDNA of a phylogenetically later branching green algal genus, Chlamydomonas (44–46). Fragmented and dispersed rRNA gene elements, encoded on both strands of the mtDNA, have also been found in the small apicomplexan mtDNAs (8,47). Because most protist mtDNAs encode conventional, 16S-like and 23S-like rRNAs (the ancestral state), these deviant examples must represent derived patterns of mitochondrial rRNA gene structure and organization within the specific lineages in which they occur.

Like animal and fungal mtDNAs, most protist mtDNAs lack a 5S rRNA gene, the current exceptions (Table 4) being the chlorophyte algae Prototheca wickerhamii (48) and Nephroselmis olivacea (Nol) (M.Turmel, C.Otis and C.Lemieux, unpublished results), the red alga C.crispus (see Table 4, footnote g) and the jakobid flagellate R.americana (49). As in the case of sdh genes noted above, the sporadic phylogenetic distribution of mitochondrial rrn5 suggests that this gene was lost from mtDNA a number of times.

Table 4. RNA-encoding genes in mtDNA
Full organism names are listed in Table 1. Symbols in the table distinguish genes that are present from genes that are absent. The same genes are present in A.thaliana mtDNA (21). Pyo and Tpa mtDNAs have the same gene content as Pfa mtDNA. Multiply split and rearranged rnl and rns genes → multiply fragmented LSU and SSU rRNAs (44–47).
Split (2 piece) and rearranged rnl (42,43; OGMP, unpublished results). Split (2 piece) rns → split (2 piece) SSU rRNA (90,91). The original claim that C.crispus mtDNA encodes a 5S rRNA (34) has since been discounted (49; see also 4). However, re-analysis of the C.crispus mtDNA sequence has now revealed a gene for a bona fide 5S rRNA, different from the 5S rRNA-like structure originally proposed by Leblanc et al. (34). The C.crispus rrn5 (complement of residues 16043–16152 in Z47547) is located between and in the same transcriptional orientation as nad3 and rps11 (G.Burger, unpublished results). B.F.Lang, unpublished results. Small RNAs that function in U addition/deletion RNA editing (83). The number of guide RNAs encoded by the T.brucei and L.tarentolae maxicircle DNAs is three and 15 respectively. For a compilation of trypanosomatid guide RNAs see http://www.biochem.mpg.de/~goeringe/gRNA/gRNAseqs.html. Gene encoding a 129 nt RNA of unknown function is located immediately downstream of rnl (Y.Tanaka, personal communication).

Transfer RNAs and the genetic code

Complete sequencing of an organelle genome is the only way to determine unequivocally whether that genome encodes all of the tRNA species necessary to support organellar protein synthesis. Several protist mtDNAs (those of M.brevicollis, P.wickerhamii, R.salina and Malawimonas jakobiformis in Table 1) do appear to encode the minimal required tRNA set, if one allows that a single tRNA is able to decode the four-codon family specifying a given amino acid (see Table 7). However, in most cases, tRNAs recognizing one or more codons are evidently absent from the mitochondrial genome, and tRNA import from the cytosol is usually invoked as the mechanism for making up the deficit. Import of nuclear DNA-encoded cytosolic tRNAs into mitochondria is clearly required in the case of A.castellanii, D.discoideum, P.aurelia, T.pyriformis, Chlamydomonas spp. and P.minor, whose mtDNAs encode substantially fewer than the minimal required set (Table 7); in fact, import of tRNA into Tetrahymena mitochondria, long inferred on the basis of tRNA population studies (50), has recently been documented experimentally (51). No tRNA genes have been found in the mitochondrial genomes of apicomplexan or trypanosomatid protists, where import of a full set of tRNAs from the cytoplasm is assumed (52,53). The data in Table 7 indicate that mitochondrial tRNA import is not only likely to be widespread among protists [as it is also in plants (54) and several chytridiomycete fungi (5)], but that it emerged early in the evolution of the mitochondrial translation system, probably a number of times independently. Genes for certain tRNAs (e.g. Met and Trp) are encoded by the mitochondrial genomes of virtually all protists, whereas genes for other tRNAs (notably Thr) are found infrequently among protist mtDNAs (Table 7).

Several protist mitochondrial genomes, as well as that of M.polymorpha, lack only one or two of the minimal required set of tRNA genes. Again, in these cases it is generally held that import of cytosolic tRNAs makes up the deficit. Indeed, import into M.polymorpha mitochondria has recently been documented in the case of nucleus-encoded tRNA-Ile(aau) (55) and tRNA-Thr(agu) (56), genes for which have not been identified in M.polymorpha mtDNA (23). However, an alternative possibility that should be considered is that the anticodon sequence in a single mtDNA-encoded tRNA might be subject to partial editing, such that the unedited and edited versions accept different amino acids and pair with codons corresponding to these amino acids. Partial C→U editing of a tRNA-‘Gly’(gcc) to generate a tRNA-Asp(guc) in opossum mitochondria (57) serves as a precedent for this possibility.

In A.castellanii, sequencing of the mtDNA has provided evidence of a novel type of tRNA editing that affects most of the mtDNA-encoded tRNAs (58–62; D.H.Price and M.W.Gray, unpublished results). This editing is confined to one or more of the first three positions at the 5′-end of the tRNA (62). Except for the mismatching in the acceptor stem that is corrected by this editing, the secondary structures of Acanthamoeba mitochondrial tRNAs are quite conventional (58–62). What appears to be the same type of mitochondrial tRNA editing has recently been documented in the chytridiomycete fungus Spizellomyces punctatus (63) and several other primitive fungi (B.F.Lang, unpublished results); moreover, in the case of tRNAs encoded by D.discoideum mtDNA secondary structure modeling strongly suggests that several of these undergo a similar type of editing. Orthodox cloverleaf secondary structures are the rule for mitochondrial tRNAs throughout the protists, one notable variant being an unusual tRNA-Met in Tetrahymena mitochondria (64). The structurally aberrant tRNAs characteristic of animal mitochondria (65,66) are therefore exceptional, representing a highly derived form of mitochondrial tRNA which, nevertheless, is able to assume the required L-shaped tertiary structure (67).

In almost half of the protists listed in Table 7 we infer, on the basis of codon usage and the presence of a tRNA-Trp having a CCA anticodon, that the mitochondrial translation system uses the standard genetic code, as is the case in land plants. In the [...]

Table 5. Ribosomal protein genes encoded by mtDNA
Full organism names are listed in Table 1. Symbols in the table mark each gene as present, as a pseudogene or as absent. Small subunit-associated ribosomal proteins are also encoded by the mtDNAs of yeast (Saccharomyces cerevisiae; var1) and Neurospora crassa (S-5) (see table III in 2); however, these proteins share no obvious sequence similarity with any known eubacterial small subunit ribosomal protein. Several of these genes have not been identified in the completely sequenced A.thaliana mitochondrial genome (accession nos Y08501 and Y08502); these include rps1, rps2, rps8, rps10, rps11, rps13 and rpl6. Two additional genes (rps14 and rps19) are present as pseudogenes in A.thaliana mtDNA (21). Like the Pfa mitochondrial genome, Pyo and Tpa mtDNAs do not encode any ribosomal protein genes. Same ribosomal protein gene content in L.tarentolae maxicircle DNA (accession no. M10126). orf227 (previously named urfa; 92); G.Burger and B.F.Lang, unpublished results. No transcript detected (22). Not reported in the original publication describing this genome (34).

[...] choanoflagellate M.brevicollis. Prototheca wickerhamii and M.polymorpha mtDNAs share with one another (and with fungal mtDNA) positionally equivalent and structurally homologous cox1 introns, suggesting that these introns have been inherited vertically from a mitochondrial ancestor of fungi, green algae and plants (68).
On the other hand, horizontal transfer of other group remaining protists UGA appears to be decoded as tryptophan I introns is suggested by the fact that in the rnl gene of rather than as stop (Table 7), being the preferred Trp codon in all A.castellanii mtDNA and in the chloroplast DNA of certain but P.aurelia; in fact, UGA is used almost exlusively to encode Chlamydomonas species, several mobile group I introns are not Trp in M.brevicollis and T.pyriformis mitochondria. From the only positionally identical, but have homologous intron core phylogenetic distribution of this code variation it is evident that structures and intron ORFs (69). the change in UGA coding must have occurred on more than one Very few group II introns have been found in protist mtDNAs (a occasion. total of seven such introns in five of 23 completely sequenced protist mtDNAs). Again, we have some evidence suggesting acquisition of Introns certain of these introns by horizontal transfer (OGMP, unpublished results), as appears also to be the case for certain group II introns Compared with plant mtDNA, protist mtDNAs seem to have found in the rnl gene of the brown alga Pylaiella littoralis (70). In remarkably few introns (Table 8). At least half of these genomes our view the paucity of group II introns in protist mtDNAs coupled entirely lack group I and group II introns. So far, among the 23 with their sporadic distribution and evidence of horizontal transfer completely sequenced protist mtDNAs listed in Table 1, group I makes it quite unlikely that there was a wholesale acquisition of introns have only been found (and then only in small numbers) in the amoeboid protozoa A.castellanii and D.discoideum, the green group II introns by the eukaryotic cell via the α-proteobacteria-like algae P.wickerhamii, N.olivacea and C.eugametos and the proto-mitochondrial endosymbiont. 
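The practical consequence of the UGA reassignment discussed under 'Transfer RNAs and the genetic code' can be illustrated with a minimal Python sketch. It is not part of the original analysis: the function names build_code and translate and the example sequence are hypothetical, introduced only for illustration. The 64-letter string is the standard genetic code written in the conventional TCAG codon order, and the variant simply reassigns UGA (TGA in DNA) from stop to Trp.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, codons enumerated in TCAG order
# (TTT, TTC, TTA, TTG, TCT, ...). '*' marks a termination codon.
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def build_code(uga_is_trp=False):
    """Return a codon -> amino acid map; optionally decode TGA (UGA) as Trp."""
    codons = ("".join(c) for c in product(BASES, repeat=3))
    table = dict(zip(codons, STANDARD_AAS))
    if uga_is_trp:
        table["TGA"] = "W"  # assumption being illustrated: UGA read as tryptophan
    return table


def translate(seq, table):
    """Translate a coding sequence until the first stop codon under the given code."""
    seq = seq.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = table[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy ORF: Met, UGA, UGG, stop
orf = "ATGTGATGGTAA"
standard = translate(orf, build_code())
variant = translate(orf, build_code(uga_is_trp=True))
```

Under the standard code the toy ORF terminates at the in-frame UGA, whereas under the variant code it is read through as tryptophan. This is why in-frame UGA codons in conserved protein coding genes are taken as evidence for a modified code.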
Table 6. Additional protein genes encoded by mtDNA
Full organism names are listed in Table 1. n gene present; pseudogene; m gene absent. Intron ORFs not included (see Table 8).
Same in L.tarentolae maxicircle DNA.
Gene is split into three separate ORFs in both M.polymorpha (orf509 = ymf4; orf169 = ymf3; orf322 = ymf2) and A.thaliana (ccb382, ccb203 and ccb452). M.polymorpha orf509 is equivalent to A.thaliana ccb382 + ccb203, whereas A.thaliana ccb452 is homologous to M.polymorpha orf169 + orf322 (21).
orf228 = ymf5 (ccb256 in A.thaliana mtDNA; 21).
orf277 = ymf6 (ccb206 in A.thaliana mtDNA; 21).
orf244 in Mpo mtDNA.
orf183 in Mpo mtDNA (orf25 in angiosperms).
A putative mutS homolog, identified in a coral mtDNA (93), has not been found in any of the sequenced protist mtDNAs listed in Table 1.
ORF showing similarity to mitochondrial plasmid-encoded DNA polymerase.
Remnants of dpo gene (94).
Coding sequence distributed over three separate ORFs (OGMP, unpublished results).
ORF showing similarity to reverse transcriptase.
Oda et al. (23).
Boer and Gray (95).
Coding sequence distributed between two separate ORFs (OGMP, unpublished results).
ORF showing similarity to DNA endonuclease of type GIY-YIG (96).
Three ORFs of this type have been found in Ama mtDNA (22).
Comprising >60 codons and not overlapping one another or other identified genes.
Only 29 ORFs >60 codons were predicted as possible genes using a defined index of G+C content in the first, second and third positions of codons (23).
In the course of re-analyzing the Ccr mtDNA sequence one of the two previously annotated (34) unique ORFs, orf94, has been identified as rpl20 (G.Burger, unpublished results).
An additional 13 ORFs in Tpy (equivalent to 14 Pau ORFs) are defined as ‘ciliate-specific’ (shared between Tpy and Pau but not other mtDNAs).
Of the 25 ORFs (unique + ciliate-specific) in Pau mtDNA 12 were previously annotated (9), whereas an additional 13 have been found in the course of re-analyzing the Pau mtDNA sequence (G.Burger, unpublished results).

Table 7. Transfer RNA genes encoded by mtDNA
See Table 1 for complete organism names. n gene present; m gene absent. Aminoacylation specificity (a.a.) is indicated by the standard one letter symbols for amino acids (Me, elongator methionine; Mf, initiator methionine). The predicted anticodon of each tRNA is shown in lower case letters, with the predicted codon(s) that would be recognized shown in upper case letters (N = any nucleotide; R = A or G; Y = C or U). Expanded wobble base pairing is assumed, such that anticodons beginning with uridine are considered to recognize all codons in a four-codon family.
Duplicate identical genes.
Duplicate non-identical genes.
Triplicate genes, two of which are identical, the third differing by a single T→C transition.
Genome specifies a single trnM(cau).
C in the first position of the anticodon presumed to be modified to lysidine, which converts the tRNA to an AUA-decoding isoleucine acceptor (97).
A in the first position of the anticodon presumed to be modified to inosine, with the resulting tRNA able to pair with codons ending in C, U and A, and perhaps also G (see 98).
trnK(cuu), the corresponding tRNA of which would be expected to recognize AAG but not AAA (61).
Only UGG Trp codons appear in conserved protein coding genes in S.pombe mtDNA; however, several UGA codons occur in rps3 and intron ORFs (92).
Both UGG and UGA are decoded as Trp in A.castellanii mitochondria (61), whereas the tRNA specified by trnW(cca) would be expected to recognize only UGG.
Includes a trnL(aag) not listed in the table.
Includes a presumptive trnE pseudogene, unrelated in sequence to authentic trnE.
Includes a trnI(uau) not listed in the table.
Transcripts of most Aca mitochondrial tRNA genes (12 of 15) undergo substitutional RNA editing at one or more of the first three positions of the acceptor stem (61,64; D.H.Price and M.W.Gray, unpublished results).
Transcripts of at least half of the Ddi mitochondrial tRNA genes are predicted to undergo a similar type of editing.
Includes a trnX(uuua) pseudogene (D.H.Price and M.W.Gray, unpublished results), the transcript of which is predicted to have an 8 nt anticodon loop (61).
Includes an unusual tRNA-like element whose anticodon sequence would pair with UAA and UAG (99), which are normally termination codons.
Includes a trnI(aau) not listed in the table.
Includes a trnX(cua), the corresponding tRNA of which would be expected to recognize UAG (normally a termination codon).
Includes a trnL(gag) not listed.

Table 8. Introns and intron ORFs in mtDNA
Full organism names are listed in Table 1. A group I intron in nad5 contains nad1 and nad3 genes (100).

A comparative genomics approach to gene identification: the case of orfB and atp8

Accumulating sequence data are aiding in the identification of some of the unassigned ORFs that have been uncovered in the course of sequencing mitochondrial genomes. As an example we provide evidence here that orfB, a conserved gene of unknown function originally identified in plant mtDNA (see Table 3, footnote i), is the homolog of atp8, which encodes subunit 8 of the F0 portion of the ATP synthase. The latter gene has been found in a number of animal and fungal mtDNAs, but up to now has not been identified in plant or protist mitochondrial genomes. Conversely, orfB is found in almost all plant and protist mtDNAs, but not in those of animals or fungi. Both Atp8 and OrfB proteins are characterized by the same block of three identical amino acids at the N-terminus, followed by an otherwise quite variable sequence (Fig. 2). The known OrfB proteins of plants differ from Atp8 essentially in their increased length. Because there is also much length variation among OrfB homologs in some protist mtDNAs, we were prompted to assess the possibility that atp8 and orfB are homologous genes.

The N-terminal functional domain (71) of ATP synthase subunit 8 is well conserved in different fungi compared with the central hydrophobic domain (72) and the C-terminal domain (73). The latter domain contains a region enriched in positively charged amino acid residues (73), which are thought to play an important role in assembly of the F0 complex (see below). If OrfB is indeed homologous to Atp8, we should find similar amino acid signatures in a multiple alignment of a phylogenetically diverse collection of both types of sequences. Such a collection has recently become available through the sequencing efforts of the OGMP and FMGP.

As shown in Figure 2, the highly conserved N-terminal domain provides the best evidence for homology between orfB and atp8. Further evidence supporting this inference is the presence of perfectly aligned central hydrophobic and positively charged domains. Based on the alignment of the first 57 amino acids shown in Figure 2, we suggest that there is little basis for a distinction between the ‘Atp8’ and ‘OrfB’ classes of protein. With two notable exceptions, this sequence compilation further demonstrates that a long C-terminal extension (position 78 and beyond in Fig. 2) is only found among plants and protists. In the stramenopiles Cafeteria roenbergensis and Ochromonas danica the mtDNA codes for a shorter protein, about as long as the longest fungal sequences. This feature is not clade specific because in another stramenopile, Phytophthora infestans, the mitochondrial genome specifies an Atp8 protein that is rather typical in size for protists. The C-terminal extension is not only quite variable in size, but indeed is so divergent in sequence that it can only be reasonably well aligned among very closely related species (e.g. land plants). Thus the presence or absence of a C-terminal extension also does not distinguish between ‘Atp8’ and ‘OrfB’ classes.

Conserved sequence motifs within the hydrophobic and C-terminal domains of the Atp8/OrfB protein are restricted to the boundaries between these domains, the ‘LP motif’ (71), which is immediately followed by a region with one or several positively charged amino acids. Previous studies in fungi have shown that these positively charged amino acids play an important role in assembly of subunits 6, 8 and 9 (73).

In summary, plant and protist mitochondrial OrfB proteins contain all of the conserved sequence elements characteristic of animal and fungal Atp8 proteins. Thus the orfB gene represents the best candidate for the previously ‘missing’ atp8 homolog in plant and protist mtDNAs.

Phylogenetic implications

The mitochondrial gene content and genome organization data being generated by the OGMP and other groups are serving to further clarify our views about the origin and evolution of the mitochondrial genome. One example involves the relationship between land plant and Chlamydomonas mtDNAs, which are so different in structure, organization and mode of expression that they show little evidence of having a common evolutionary origin (1,2,74). In the absence of a phylogenetically broad database of comparative information we at one time entertained the possibility that the plant mitochondrial genome might have had a different, more recent evolutionary ancestry than Chlamydomonas and other mitochondrial genomes (75). However, sequencing of P.wickerhamii (48) and other (24,34,61) protist mtDNAs has clearly demonstrated that plant mtDNA has retained an ancestral pattern that has evidently been lost in the more rapidly evolving and highly derived Chlamydomonas mtDNA (74). It is worth emphasizing that the majority of the protist mtDNAs sequenced to date by the OGMP, particularly those from more obscure protists selected from the wild on the basis of ultrastructural or other phylogenetic considerations, retain a more or less ancestral pattern of gene content and organization. In contrast, most of the mtDNAs that had been sequenced prior to the inception of the OGMP (those from animals, most fungi, chlamydomonadalean green algae, ciliates and trypanosomatid protozoa) are highly derived. It is curious that the majority of the protists that have been selected as models for biochemical, genetic and molecular biological research happen to have mtDNAs that are the least representative of the ancestral form.

Descriptions

Organelle Genome Megasequencing Program (OGMP) (http://megasun.bch.umontreal.ca/ogmp/). The OGMP was initiated as a multi-disciplinary and inter-university consortium of Canadian investigators interested in organelle genome evolution and eukaryotic phylogeny. As currently constituted it consists of a Team (B.F.Lang, administrative coordinator; M.W.Gray, scientific coordinator; G.Burger, C.Lemieux and M.Turmel) and an Advisory Board (R.Cedergren, G.B.Golding, D.Sankoff, T.G.Littlejohn and C.J.O’Kelly), with external collaborators on some individual projects.
The experimental arm of the OGMP, the Sequencing Unit (directed by G.Burger), is located in the Département de Biochimie, Université de Montréal. The Sequencing Unit comprises two divisions: Molecular Biology (I.Plante, D.Saint-Louis and Y.Zhu), which constructs clone libraries, performs the actual sequencing and works out improved cloning and sequencing methods; Informatics (N.Brossard and P.Rioux), which develops and implements tools required for project management, data handling, sequence analysis and annotation. As the data production arm of the OGMP, the Sequencing Unit delivers analyzed and fully annotated mitochondrial genome sequences for submission to public domain databases. The OGMP website (URL given above) contains additional information about the program, as well as data summaries and gene maps for the individual OGMP sequencing projects completed to date (Table 1).

Protist Image Database (PID) (http://megasun.bch.umontreal.ca/protists/). The PID (T.G.Littlejohn and C.J.O’Kelly) is a compilation of images and short descriptions of selected protist genera, especially those whose species are frequently used as experimental organisms or are important in studies of organismal evolution. The intent of the PID is to provide integrated on-line information about the morphology, taxonomy and phylogenetic relationships of these organisms. The PID, which was initiated within the OGMP, contains descriptions of most of the species whose mtDNAs have been sequenced by the OGMP. The PID is being continued independently from but in close collaboration with the OGMP, with its web pages maintained on the OGMP web server.

Organelle Genome Database Project (GOBASE) (http://megasun.bch.umontreal.ca/gobase/). Shortly after the OGMP was established it became apparent that there were serious limitations in accessing all of the relevant information associated with organelles. Data are dispersed among a number of sources (World Wide Web, public data repositories, scientific journals and books) and in many cases are difficult even to locate. Usually only limited links exist among data sources (e.g. there is no easy way to connect from a GenBank record containing an rRNA sequence to the corresponding secondary structure contained in another database). It is even more difficult to perform the sort of cross-genome comparisons that were essential for the present review. Further, the data sets are often incomplete and/or contain errors, which are sometimes hard to identify and to rectify in the underlying data source. In such a disorganized state organelle genomic data constitute a major underexploited information resource. The GOBASE project (17) was initiated by a subset of OGMP members (B.F.Lang, M.W.Gray, G.Burger and T.G.Littlejohn) to rectify this situation. GOBASE, which is a taxonomically broad database that organizes and integrates diverse data related to organelles, has been constructed as a relational database with a web-based user interface. The current version focuses on the mitochondrial subset of data.

Figure 2. Alignment of Atp8 and OrfB amino acid sequences. Sequences from bacteria (B), protists (P), land plants (L), fungi (F) and animals (A) are compared. Three letter abbreviations of organism names are listed in Table 1. Additional abbreviations: Rru, Rhodospirillum rubrum; Bvu, Beta vulgaris; Rst, Rhizopus stolonifer; Rss, Rhizophydium ssp.; Hss, Harpochytrium ssp.; Sco, Schizophyllum commune; Spu, Spizellomyces punctatus; Ani, Aspergillus nidulans; Sce, Saccharomyces cerevisiae; Pli, Paracentrotus lividus. Sequences were obtained from the NCBI databases except for Pin, Mbr, Rst, Rru, Hss, Sco and Spu, which are unpublished FMGP sequences, and Mja, Rsa, Cro, Oda and Ppu, which are unpublished OGMP sequences. Color highlighting is as follows: blue, invariant amino acids; magenta, identical residues comprising at least 10 (40% or more) of the total number of residues in a given column (also colored in magenta are those residues that according to the PAM matrix are positive or neutral exchanges with reference to the most abundant residue in the column); yellow, positively charged amino acids. Dashes (–) denote a missing residue at this position in comparison with other sequence(s). Asterisks (*) mark translation termination codons; numbers preceding an asterisk indicate the remaining length of sequence that is not shown.

ACKNOWLEDGEMENTS

We thank J.Chesnick, R.W.Lee, P.J.Myler, K.Stuart, Y.Tanaka and D.R.Wolstenholme for providing unpublished sequence data and other information in advance of publication. We are also grateful to C.J.O’Kelly and T.Nerad for provision of organisms and for invaluable advice on phylogeny, taxonomy and culture conditions. The OGMP has been supported by a Special Project grant (SP-34) from the Medical Research Council of Canada and by a grant (G0-12323) from the Canadian Genome Analysis and Technology Program (CGAT), which has also provided funding to GOBASE (GO-12984). Generous grants of equipment from Sun Microsystems Inc. and LI-COR Inc. are gratefully acknowledged, as is salary and interaction support from the Canadian Institute for Advanced Research.

REFERENCES

1 Gray,M.W. (1989) Annu. Rev. Cell Biol., 5, 25–50.
2 Gray,M.W. (1992) Int. Rev. Cytol., 141, 233–357.
3 Gillham,N.W. (1994) Organelle Genes and Genomes. Oxford University Press, New York, NY.
4 Leblanc,C., Richard,O., Kloareg,B., Viehmann,S., Zetsche,K. and Boyen,C. (1997) Curr. Genet., 31, 193–207.
5 Paquin,B., Laforest,M.-J., Forget,L., Roewer,I., Wang,Z., Longcore,J. and Lang,B.F. (1997) Curr. Genet., 31, 380–395.
6 Patterson,D.J. and Sogin,M.L. (1992) In Hartman,H. and Matsuno,K. (eds), The Origin and Evolution of the Cell. World Scientific, Singapore, Singapore, pp. 13–46.
7 Vaidya,A.B., Akella,R. and Suplick,K. (1989) Mol. Biochem. Parasitol., 35, 97–108.
8 Feagin,J.E., Werner,E., Gardner,M.J., Williamson,D.H. and Wilson,R.J.M. (1992) Nucleic Acids Res., 20, 879–887.
9 Pritchard,A.E., Seilhamer,J.J., Mahalingam,R., Sable,C.L., Venuti,S.E. and Cummings,D.J. (1990) Nucleic Acids Res., 18, 173–180.
10 Wolstenholme,D.R. (1992) Int. Rev. Cytol., 141, 173–216.
11 Clark-Walker,G.D. (1992) Int. Rev. Cytol., 141, 89–127.
12 Hanson,M.R. and Folkerts,O. (1992) Int. Rev. Cytol., 141, 129–172.
13 Wolstenholme,D.R. and Fauron,C.M.-R. (1995) In Levings,C.S. and Vasil,I.K. (eds), The Molecular Biology of Plant Mitochondria. Kluwer Academic Publishers, Dordrecht, The Netherlands, pp. 1–59.
14 Cummings,D.J. (1992) Int. Rev. Cytol., 141, 1–64.
15 Stuart,K. and Feagin,J.E. (1992) Int. Rev. Cytol., 141, 65–88.
16 Feagin,J.E. (1994) Annu. Rev. Microbiol., 48, 81–104.
17 Korab-Laskowska,M., Rioux,P., Brossard,N., Littlejohn,T.G., Gray,M.W., Lang,B.F. and Burger,G. (1998) Nucleic Acids Res., 26, 139–146.
18 Altschul,S.F., Gish,W., Miller,W., Myers,E.W. and Lipman,D.J. (1990) J. Mol. Biol., 215, 403–410.
19 Pearson,W.R. (1990) Methods Enzymol., 183, 63–98.
20 Staden,R. (1990) Methods Enzymol., 183, 193–211.
21 Unseld,M., Marienfeld,J.R., Brandt,P. and Brennicke,A. (1997) Nature Genet., 15, 57–61.
22 Paquin,B. and Lang,B.F. (1996) J. Mol. Biol., 255, 688–701.
23 Oda,K., Yamato,K., Ohta,E., Nakamura,Y., Takemura,M., Nozato,N., Akashi,K., Kanegae,T., Ogura,Y., Kohchi,T. and Ohyama,K. (1992) J. Mol. Biol., 223, 1–7.
24 Lang,B.F., Burger,G., O’Kelly,C.J., Cedergren,R., Golding,G.B., Lemieux,C., Sankoff,D., Turmel,M. and Gray,M.W. (1997) Nature, 387, 493–497.
25 Palmer,J.D. (1997) Nature, 387, 454–455.
26 Bogorad,L. (1991) In Bogorad,L. and Vasil,I.K. (eds), The Molecular Biology of Plastids. Academic Press Inc., San Diego, CA, pp. 93–124.
27 Reith,M. (1995) Annu. Rev. Plant Physiol. Plant Mol. Biol., 46, 549–575.
28 Masters,B.S., Stohl,L.L. and Clayton,D.A. (1987) Cell, 51, 89–99.
29 Chen,B., Kubelik,A.R., Mohr,S. and Breitenberger,C.A. (1996) J. Biol. Chem., 271, 6537–6544.
30 Cermakian,N., Ikeda,T.M., Cedergren,R. and Gray,M.W. (1996) Nucleic Acids Res., 24, 648–654.
31 Weihe,A., Hedtke,B. and Börner,T. (1997) Nucleic Acids Res., 25, 2319–2325.
32 Hedtke,B., Börner,T. and Weihe,A. (1997) Science, 277, 809–811.
33 Burger,G., Lang,B.F., Reith,M. and Gray,M.W. (1996) Proc. Natl. Acad. Sci. USA, 93, 2328–2332.
34 Leblanc,C., Boyen,C., Richard,O., Bonnard,G., Grienenberger,J.-M. and Kloareg,B. (1995) J. Mol. Biol., 250, 484–495.
35 Viehmann,S., Richard,O., Boyen,C. and Zetsche,K. (1996) Curr. Genet., 29, 199–201.
36 Daignan-Fornier,B., Valens,M., Lemire,B.D. and Bolotin-Fukuhara,M. (1994) J. Biol. Chem., 269, 15469–15472.
37 Nugent,J.M. and Palmer,J.D. (1991) Cell, 66, 473–481.
38 Covello,P.S. and Gray,M.W. (1992) EMBO J., 11, 3815–3820.
39 Prioli,L.M., Huang,J. and Levings,C.S. (1993) Plant Mol. Biol., 23, 287–295.
40 Gutell,R.R. (1994) Nucleic Acids Res., 22, 3502–3507.
41 Gutell,R.R., Gray,M.W. and Schnare,M.N. (1993) Nucleic Acids Res., 21, 3055–3074.
42 Seilhamer,J.J., Gutell,R.R. and Cummings,D.J. (1984) J. Biol. Chem., 259, 5173–5172.
43 Heinonen,T.Y.K., Schnare,M.N., Young,P.G. and Gray,M.W. (1987) J. Biol. Chem., 262, 2879–2887.
44 Boer,P.H. and Gray,M.W. (1988) Cell, 55, 399–411.
45 Denovan-Wright,E.M. and Lee,R.W. (1994) J. Mol. Biol., 241, 298–311.
46 Nedelcu,A.M. (1997) Mol. Biol. Evol., 14, 506–517.
47 Feagin,J.E., Mericle,B.L., Werner,E. and Morris,M. (1997) Nucleic Acids Res., 25, 438–446.
48 Wolff,G., Plante,I., Lang,B.F., Kück,U. and Burger,G. (1994) J. Mol. Biol., 237, 75–86.
49 Lang,B.F., Goff,L.J. and Gray,M.W. (1996) J. Mol. Biol., 261, 607–613.
50 Suyama,Y. (1986) Curr. Genet., 10, 411–420.
51 Rusconi,C.P. and Cech,T.R. (1996) Genes Dev., 10, 2870–2880.
52 Simpson,A.M., Suyama,Y., Dewes,H., Campbell,D.A. and Simpson,L. (1989) Nucleic Acids Res., 17, 5427–5445.
53 Hancock,K. and Hajduk,S.L. (1990) J. Biol. Chem., 265, 19208–19215.
54 Dietrich,A., Weil,J.H. and Maréchal-Drouard,L. (1992) Annu. Rev. Cell Biol., 8, 115–131.
55 Akashi,K., Sakurai,K., Hirayama,J., Fukuzawa,H. and Ohyama,K. (1996) Curr. Genet., 30, 181–185.
56 Akashi,K., Hirayama,J., Takenaka,M., Yamaoka,S., Suyama,Y., Fukuzawa,H. and Ohyama,K. (1997) Biochim. Biophys. Acta, 1350, 262–266.
57 Börner,G.V., Mörl,M., Janke,A. and Pääbo,S. (1996) EMBO J., 15, 5949–5957.
58 Lonergan,K.M. and Gray,M.W. (1993) Science, 259, 812–816.
59 Lonergan,K.M. and Gray,M.W. (1993) Nucleic Acids Res., 21, 4402.
60 Gray,M.W. and Lonergan,K.M. (1993) In Brennicke,A. and Kück,U. (eds), Plant Mitochondria: With Emphasis on RNA Editing and Cytoplasmic Male Sterility. VCH, Weinheim, Germany, pp. 15–22.
61 Burger,G., Plante,I., Lonergan,K.M. and Gray,M.W. (1995) J. Mol. Biol., 245, 522–537.
62 Price,D.H. and Gray,M.W. (1998) In Grosjean,H. and Benne,R. (eds), Modification and Editing of RNA: The Alteration of RNA Structure and Function. American Society for Microbiology, Washington, DC, in press.
63 Laforest,M.-J., Roewer,I. and Lang,B.F. (1997) Nucleic Acids Res., 25, 626–632.
64 Schnare,M.N., Greenwood,S.J. and Gray,M.W. (1995) FEBS Lett., 362, 24–28.
65 Anderson,S., Bankier,A.T., Barrell,B.G., de Bruijn,M.H.L., Coulson,A.R., Drouin,J., Eperon,I.C., Nierlich,D.P., Roe,B.A., Sanger,F., Schreier,P.H., Smith,A.J.H., Staden,R. and Young,I.G. (1982) In Slonimski,P., Borst,P. and Attardi,G. (eds), Mitochondrial Genes. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, pp. 5–43.
66 Okimoto,R. and Wolstenholme,D.R. (1990) EMBO J., 9, 3405–3411.
67 Steinberg,S. and Cedergren,R. (1994) Nature Struct. Biol., 1, 507–510.
68 Wolff,G., Burger,G., Lang,B.F. and Kück,U. (1993) Nucleic Acids Res., 21, 719–726.
69 Turmel,M., Côté,V., Otis,C., Mercier,J.-P., Gray,M.W., Lonergan,K. and Lemieux,C. (1995) Mol. Biol. Evol., 12, 533–545.
70 Fontaine,J.M., Rousvoal,S., Leblanc,C., Kloareg,B. and Loiseaux-de Goër,S. (1995) J. Mol. Biol., 251, 378–389.
71 Devenish,R.J., Papakonstantinou,T., Galanis,M., Law,R.H., Linnane,A.W. and Nagley,P. (1992) Annls NY Acad. Sci., 671, 403–414.
72 Papakonstantinou,T., Law,R.H., Nesbitt,W.S., Nagley,P. and Devenish,R.J. (1996) Curr. Genet., 30, 12–18.
73 Papakonstantinou,T., Galanis,M., Nagley,P. and Devenish,R.J. (1993) Biochim. Biophys. Acta, 1144, 22–32.
74 Gray,M.W. (1995) In Levings,C.S. and Vasil,I.K. (eds), The Molecular Biology of Plant Mitochondria. Kluwer Academic, Dordrecht, The Netherlands, pp. 635–659.
75 Gray,M.W., Cedergren,R., Abel,Y. and Sankoff,D. (1989) Proc. Natl. Acad. Sci. USA, 86, 2267–2271.
76 Denovan-Wright,E.M., Nedelcu,A.M. and Lee,R.W. (1998) Plant Mol. Biol., 36, 285–295.
77 Boer,P.H. and Gray,M.W. (1991) Curr. Genet., 19, 309–312.
78 Vahrenholz,C., Rieman,G., Pratje,E., Dujon,B. and Michaelis,G. (1993) Curr. Genet., 24, 241–247.
79 Kairo,A., Fairlamb,A.H., Gobright,E. and Nene,V. (1994) EMBO J., 13, 898–905.
80 Anderson,S., Bankier,A.T., Barrell,B.G., de Bruijn,M.H.L., Coulson,A.R., Drouin,J., Eperon,I.C., Nierlich,D.P., Roe,B.A., Sanger,F., Schreier,P.H., Smith,A.J., Staden,R. and Young,I.G. (1981) Nature, 290, 457–465.
81 Beagley,C.T., Okimoto,R. and Wolstenholme,D.R. (1998) Genetics, in press.
82 Vaidya,A.B. and Arasu,P. (1987) Mol. Biochem. Parasitol., 22, 249–257.
83 Hajduk,S.L., Harris,M.E. and Pollard,V.W. (1993) FASEB J., 7, 54–63.
84 Read,L.K., Wilson,K.D., Myler,P.J. and Stuart,K. (1994) Nucleic Acids Res., 22, 1489–1495.
85 Takemura,M., Nozato,N., Oda,K., Kobayashi,Y., Fukuzawa,H. and Ohyama,K. (1995) Mol. Gen. Genet., 247, 565–570.
86 Lonergan,K.M. and Gray,M.W. (1996) J. Mol. Biol., 257, 1019–1030.
87 Ogawa,S., Matsuo,K., Angata,K., Yanagisawa,K. and Tanaka,Y. (1997) Curr. Genet., 31, 80–88.
88 Pellizzari,R., Anjard,C. and Bisson,R. (1997) Biochim. Biophys. Acta, 1320, 1–7.
89 Commission on Plant Gene Nomenclature (1994) Plant Mol. Biol. Rep., 12 (CPGN suppl.), S1–S109.
90 Seilhamer,J.J., Olsen,G.J. and Cummings,D.J. (1984) J. Biol. Chem., 259, 5167–5172.
91 Schnare,M.N., Heinonen,T.Y.K., Young,P.G. and Gray,M.W. (1986) J. Biol. Chem., 261, 5187–5193.
92 Lang,B.F., Ahne,F., Distler,S., Trinkl,H., Kaudewitz,F. and Wolf,K. (1983) In Schweyen,R.J., Wolf,K. and Kaudewitz,F. (eds), Mitochondria 1983, Nucleo-Mitochondrial Interactions. Walter de Gruyter, Berlin, Germany, pp. 313–329.
93 Pont-Kingdom,G.A., Okada,N.A., Macfarlane,J.L., Beagley,C.T., Wolstenholme,D.R., Cavalier-Smith,T. and Clark-Walker,G.D. (1995) Nature, 375, 109–111.
94 Weber,B., Börner,T. and Weihe,A. (1995) Curr. Genet., 27, 488–490.
95 Boer,P.H. and Gray,M.W. (1988) EMBO J., 7, 3501–3508.
96 Burger,G. and Werner,S. (1985) J. Mol. Biol., 186, 231–242.
97 Muramatsu,T., Nishikawa,K., Nemoto,F., Kuchino,Y., Nishimura,S., Miyazawa,T. and Yokoyama,S. (1988) Nature, 336, 179–181.
98 Pfitzinger,H., Weil,J.H., Pillay,D.T.N. and Guillemaut,P. (1990) Plant Mol. Biol., 14, 805–814.
99 Pi,M., Angata,K., Ikemura,T., Yanagisawa,K. and Tanaka,Y. (1996) J. Plant Res., 109, 1–6.
100 Beagley,C.T., Okada,N.A. and Wolstenholme,D.R. (1996) Proc. Natl. Acad. Sci. USA, 93, 5619–5623.
Nucleic Acids Research – Oxford University Press
Published: Feb 1, 1998