
Compositional features of eukaryotic genomes for checking predicted genes

Cruveiller, Stéphane; Jabbari, Kamel; Clay, Oliver; Bernardi, Giorgio
Briefings in Bioinformatics, Volume 4 (1): 43. Oxford University Press, Mar 1, 2003


Abstract

Gene prediction relies on the identification of characteristic features of coding sequences that distinguish them from non-coding DNA. The recent large-scale sequencing of entire genomes from higher eukaryotes, in conjunction with currently used gene prediction algorithms, has provided an abundance of putative genes that can now be analysed for their compositional properties. Strong, systematic differences still exist, in several species, between the compositional properties of sets of ex novo predicted genes and genes that have been experimentally detected and/or verified. This is particularly evident in the estimated gene set (>45,000 genes) of the recently sequenced rice genome, where roughly half the predicted genes are compositionally unusual and have no known orthologues in the dicot Arabidopsis. In a few cases such differences might suggest a bias in experimental gene-finding protocols, but the quasi-random nature of the compositionally aberrant predicted genes is a strong indication that many, if not most, of them are false positives. It therefore appears that some important features of coding regions have not yet been taken into account in existing gene prediction programs. Statistical base compositional properties of curated gene data sets from vertebrates, which we briefly review here, should therefore provide a useful benchmark for fine-tuning probabilistic gene models and model parameters that are currently in use.
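
As an illustration of the kind of compositional check the abstract describes, the following minimal Python sketch compares the base composition of predicted genes against a curated reference set. It is not the authors' method: the choice of GC fraction and third-codon-position GC (GC3) as the compositional measures, the z-score cutoff, and the function names `gc_fraction`, `gc3_fraction` and `flag_unusual` are all assumptions made for this example.

```python
from statistics import mean, stdev


def gc_fraction(seq: str) -> float:
    """Fraction of G or C among the unambiguous bases (A, C, G, T)."""
    counted = [b for b in seq.upper() if b in "ACGT"]
    if not counted:
        raise ValueError("sequence has no unambiguous bases")
    return sum(b in "GC" for b in counted) / len(counted)


def gc3_fraction(cds: str) -> float:
    """GC fraction at third codon positions.

    Assumes (hypothetically) that `cds` starts in frame at codon position 1.
    """
    return gc_fraction(cds.upper()[2::3])


def flag_unusual(predicted: dict, reference: dict, z_cutoff: float = 3.0) -> list:
    """Names of predicted genes whose GC fraction deviates from the reference.

    `reference` stands in for a curated, experimentally verified gene set;
    the z-score rule and its cutoff are illustrative assumptions only.
    """
    ref_values = [gc_fraction(s) for s in reference.values()]
    mu, sigma = mean(ref_values), stdev(ref_values)  # needs >= 2 reference genes
    return [
        name
        for name, seq in predicted.items()
        if abs(gc_fraction(seq) - mu) > z_cutoff * sigma
    ]
```

In practice the reference set would be far larger, and a single summary statistic such as overall GC is a crude proxy; the sketch only shows how a curated set can serve as a benchmark against which predicted genes are screened.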
Publisher
Oxford University Press
Copyright
Copyright © 2003 Oxford University Press
ISSN
1467-5463
eISSN
1477-4054
D.O.I.
10.1093/bib/4.1.43