Plant Molecular Biology 39: 927–932, 1999.
© 1999 Kluwer Academic Publishers. Printed in the Netherlands.
Analysis of Arabidopsis genome sequence reveals a large new gene family
, E.M. Davies
, F.C.H. Franklin
and D.F. Marshall
Wolfson Laboratory for Plant Molecular Biology, School of Biological Sciences, University of Birmingham, P.O.
Box 363, Birmingham B15 2TT, UK (author for correspondence);
Bioinformatics & IT Research Unit, Scottish Crop Research Institute, Invergowrie, Dundee, DD2 5DA, UK
Received 10 April 1998; accepted in revised form 17 November 1998
Key words: self-incompatibility, Arabidopsis thaliana, S-protein, Papaver rhoeas, poppy
A detailed analysis of the currently available Arabidopsis thaliana genomic sequence has revealed the presence of
a large number of open reading frames with homology to the stigmatic self-incompatibility (S) genes of Papaver
rhoeas. The products of these potential genes are all predicted to be relatively small, basic, secreted proteins
with similar predicted secondary structures. We have named these potential genes SPH (S-protein homologues).
Their presence appears to have been largely missed by the prediction methods currently used on the genomic
sequence. Equivalent homologues could not be detected in the human, microbial, Drosophila or C. elegans genomic
databases, suggesting a function specific to plants. Preliminary RT-PCR analysis indicates that at least two members
of the family (SPH1, SPH8) are expressed, with expression being greatest in floral tissues. The gene family may
total more than 100 members, and its discovery not only illustrates the importance of the genome sequencing
efforts, but also indicates the extent of information which remains hidden after the initial trawl for potential genes.
A detailed analysis of one of the first large contiguous regions of genomic sequence from Arabidopsis
thaliana has recently been published. The availability of such sequence should accelerate the process
of gene discovery and the definition of gene function in higher plants. Our own analysis of the currently
available Arabidopsis genomic sequence, in predicting the existence of a new plant-specific gene family
with around 100 members, suggests that the gene density may be higher than current predictions based on
preliminary analyses of the data.
Our analysis was stimulated by the discovery (G.
Murphy, personal communication) of an ORF in
Arabidopsis genomic sequence with some homology
to the self-incompatibility (S) alleles which we had
cloned from the field poppy (Papaver rhoeas) [5,
16]. Self-incompatibility (SI) is a genetic barrier used
by many species of flowering plants to prevent self-fertilisation. In P. rhoeas, SI is controlled by a single
multi-allelic S locus, which comprises S genes expressed in both pollen and stigma [6, 10]. Rejection
of pollen occurs when the S allele carried by the haploid pollen matches either of the alleles in the pistil.
The stigmatic S alleles encode small apoplastic proteins which control the recognition of pollen by the
stigma during fertilisation. S-proteins corresponding
to S alleles in Papaver rhoeas, and to an S allele in P. nudicaule, have been identified in
stigma extracts and the corresponding genes cloned
and characterised [5, 16; S. Kurup, personal communication]. Comparison of the sequences of the P.
rhoeas S genes with those of the Solanaceae reveals no detectable homology, whilst comparison with the
Brassica S genes (SRK and SLG) reveals only a very weak similarity. Thus it seems probable that several
mechanistically different SI systems exist within the plant kingdom.
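The rejection rule described above for P. rhoeas (pollen is rejected when its haploid S allele matches either allele in the pistil) can be illustrated with a minimal sketch. The allele labels ("Sa", "Sb", "Sc") and the function names are hypothetical, chosen only for illustration; the sketch models the genetic outcome alone, not the molecular recognition mechanism.

```python
from typing import List, Tuple


def pollen_is_rejected(pollen_allele: str, pistil_alleles: Tuple[str, str]) -> bool:
    """Return True when the haploid pollen S allele matches either pistil allele."""
    return pollen_allele in pistil_alleles


def compatible_pollen(pollen_parent: Tuple[str, str],
                      pistil_parent: Tuple[str, str]) -> List[str]:
    """List the S alleles of the diploid pollen parent whose pollen the pistil accepts."""
    # Each pollen grain carries one of the two alleles of the pollen parent.
    return sorted(a for a in set(pollen_parent)
                  if not pollen_is_rejected(a, pistil_parent))


if __name__ == "__main__":
    # Hypothetical genotypes: pollen parent Sa/Sc crossed onto pistil parent Sa/Sb.
    print(compatible_pollen(("Sa", "Sc"), ("Sa", "Sb")))
    # Self-pollination: both pollen types match a pistil allele.
    print(compatible_pollen(("Sa", "Sb"), ("Sa", "Sb")))
```

Under this rule a self-pollination yields no compatible pollen, whereas a cross between plants sharing one allele is partially compatible.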
Until now, therefore, the Papaver S genes appeared
to be largely unique, with little homology
to any other genes. This report now suggests, how-