What aDNA can (and cannot) tell us about the emergence of language and speech

Abstract

The genome sequencing of individuals belonging to extinct forms of the genus Homo has provided us with a detailed view of the genetic makeup of some of our close extinct relatives. In addition, the unprecedented depth of sequencing of modern Homo sapiens populations has given us a framework for interpreting minor changes at the DNA sequence level that are putatively relevant to a broad array of anatomical and behavioral characteristics. Here we discuss the genetic architecture of such complex characteristics, with a special focus on language and speech. We examine the extent of reported variation in the DNA sequences of genes that are thought to be involved in their production, both in H. sapiens populations and in our extinct relatives, and we discuss to what extent such sequence variations are relevant to making direct statements about the capacity of extinct hominids to generate and express language and speech. Because language is a highly complex behavioral character we stress the difficulties involved in using the ‘atomized’ DNA sequence data as indicators of its possession by extinct hominids, and we emphasize that such data should not be considered in isolation from other relevant information gleaned from comparative anatomy and the archaeological record.

1. Introduction

Deciphering the genetic basis of the emergence of language in the human lineage has become an interesting and potentially achievable endeavor. Several recently sequenced archaic nuclear human genomes (all belonging to Neanderthals or Denisovans) have allowed researchers to focus on genetic differences among those archaic humans and their modern counterparts, in the hope of discovering genes and genetic elements that might have been involved in the differentiation of modern humans from their archaic precursors.
However, language is a very complex behavior; and whether or not some, or any, of the differences documented between archaic genomes and modern ones are directly involved in language remains a mystery. In this contribution we take a systematic approach to examining whether any of the differences between the genomes of modern humans and those of Neanderthals and Denisovans can reasonably be associated with language. We start by discussing the range of possible genetic mechanisms or architectures involved in language as a phenotype. Because determining the kinds of genes that might be involved in the differentiation of Homo sapiens from its close extinct relatives requires an examination of archaic genomes, we next look at the state of ancient DNA research in relation to those archaic genomes, allowing the identification of genes that differ in archaic and living humans. We next describe some simple, objective methods of obtaining lists from databases of genes that are potentially related to language. By comparing overlap in the various lists, we were able to identify genes with potential relationships to language phenotypes. We conclude with a discussion of the implications of genomics for language, and attempt to integrate what we know about the genetics of language with the fossil record.

2. The range of the genetic architecture of language

Language and speech are perhaps the most striking of all human autapomorphies, and knowing their genetic architectures is critical to understanding their origins in our lineage. Graham and Fisher (2015) and Graham, Deriziotis, and Fisher (2015) have recently reviewed this topic, and point out that the primary difficulty with pinning down the genetic architectures of language and speech lies in obtaining precise and well-defined traits or phenotypes with which to begin the search for those structural underpinnings.
Phenotypes important to language (Graham and Fisher 2015) have been inferred from the comparative biology of communication in other organisms (birdsong), from neuroimaging (correlating brain structural anomalies with genes), and from the study of language and communication anomalies in humans. These latter conditions include dyslexia, reading and language disabilities, receptive language ability at age 12, expressive vocabulary in infancy, specific language impairment (SLI), social communication ability at age 17, and developmental problems with speech and language (including epilepsy-aphasia spectrum disorders, speech sound disorder, dysarthria, childhood apraxia of speech (CAS), and stuttering). Some of these conditions are inherited in Mendelian fashion; and exceptional abilities in language acquisition, evidently non-Mendelian, may also shed light on this issue. Graham, Deriziotis, and Fisher (2015) point out that a range of different genetic architectures is involved in these various language-related traits and phenotypes. Some disorders are dominant, others are recessive, while yet others are sporadic and caused by a de novo event. Some language-associated traits are evidently genetically complex. The variety of the language-associated conditions listed above emphasizes the difficulty of settling on a standard phenotype for language. Indeed, in practice language can be associated with many and varied kinds of traits, suggesting that the problem we have in defining language as a specific entity may result from the involvement of many different genetic architectures within the complex whole. It is merely the simplicity of the Mendelian interpretations that has garnered them so much attention by researchers involved in the study of the genetics of language. This means that, while interpretation of Mendelian traits is the lowest-hanging fruit in the search for candidate genes, we will also need to examine genes involved in complex genetic interactions. 
Many authors have discussed at length the range of potential architectures underlying the traits of language and speech in Homo sapiens. Specifically, a host of recent papers have looked at this question in a variety of contexts (Graham and Fisher 2015; Benítez-Burraco and Murphy 2016; Dediu and Christiansen 2016; Dediu, Janssen, and Moisik 2016; Elvevåg et al. 2016; Hippolyte et al. 2016; Jarvis 2016; Mozzi et al. 2016; Nuttle et al. 2016; Whalen and Griffiths 2017; Boeckx 2017; Fisher 2017; Reuter et al. 2017; Scott-Phillips 2017). Most of these publications focused on single genes that are thought to be associated with language, or on a few of them. In addition, Online Mendelian Inheritance in Man (OMIM; http://omim.org/) is a treasure trove of information for understanding single genes involved in the inheritance of specific traits, and both language and speech can be examined in this database. Because OMIM focuses on Mendelian inheritance, the genetic architectures of the traits in this database are, by default, those of single genes. However, several recent GWAS (genome-wide association studies) have examined language or speech (O'Roak et al. 2011; Roeske et al. 2011; Villanueva et al. 2011; Luciano et al. 2013; St Pourcain et al. 2013, 2014a,b; Becker et al. 2014; Harlaar et al. 2014; Nudel et al. 2014; Kornilov et al. 2016), and have the potential to discover more complex relationships amongst genes that are putatively involved. These papers all focus on genome-wide loci that are potentially associated with language, speech or verbal socialization, and on several specific genes that might be involved in the genetic architecture of language and speech production.

3. The scope of ancient human genome variation

The last five years have seen an explosion of nuclear genome sequencing in modern and archaic humans. Two recent reviews summarize the work up to 2016 (Llamas et al. 2016; Slatkin and Racimo 2016).
Here we will focus on the genomes reported in those two reviews, and Fig. 1 shows the distribution of these modern and archaic genomes in space and time as reported by Slatkin and Racimo (2016).

Figure 1. Chart showing the distribution of ancient genomes sequenced from the specified geographic areas. The number of genomes is shown on the Y axis and the geographic location is on the X axis. Shades of bars indicate age of fossil: lightest gray is whole genome sequencing from specimens 100 to 1,000 years old; medium light gray is whole genome sequencing from specimens 1,000 to 10,000 years old; medium dark gray is whole genome sequencing from specimens 10,000 to 100,000 years old; darkest gray is high density chip sequencing from specimens 1,000 to 10,000 years old. The asterisks indicate the locations of the four archaic human genomes used in this paper: lightest gray = Vindija Neanderthal; medium light gray = Altai Neanderthal; medium dark gray = Denisova; darkest gray = Ust Ishim Homo sapiens. Data from Slatkin and Racimo (2016), Allentoft et al. (2015), Green et al. (2010), Gamba et al. (2014), Haak et al. (2015), Mathieson et al. (2015), Rasmussen et al. (2010), Olalde et al. (2014) and Wilde et al. (2014).

Basically, two kinds of genome sequencing technologies produce ancient human genomes: whole genome shotgun sequencing, and SNP capture. Both can generate massive amounts of information, and both are required to span at least 1X coverage of the genome. Prüfer et al. (2014) sequenced the genome of a Neanderthal (N) from the Altai mountains. They combined the genome data from a previously sequenced Denisova (D) specimen (Reich et al. 2010), 1,094 present-day human genomes (H), and a chimpanzee (C) genome, and screened the data for suites of SNPs that showed the following patterns of sequence change distribution. The first pattern had a fixed base pair state in N, D and C, but was different from H (31,389 SNPs and 4,113 short indels). The second category had nearly fixed base pair changes in H (90% of modern humans) relative to N, D, C (105,757 SNPs and 3,900 indels). Castellano et al. (2014) added the sequences of two Neanderthals (one from Spain and one from Croatia) and performed a similar analysis, generating lists of SNPs in the same manner as Prüfer et al. (2014). While both the Castellano et al. and the Prüfer et al. studies discovered many SNPs that are unique to modern humans, these SNPs impact coding regions only very sparingly. The relatively few proteins impacted could then be examined using gene ontology terminology, and when this had been accomplished Castellano et al.
found that ‘changes in genes related to metabolism, the cardiovascular system, hair distribution, and morphology (genitals, palate, face, extremities, joints, digits, thorax, orbital, and occipital skull regions and general mobility) … [were] … involved’. There is no direct reference to language using the ontology approach, a potential problem for this line of attack. Here we prefer to examine genes with associations in the literature, and Mendelian or GWAS associations.

4. Using simple screens for potential genes involved in speech and language emergence in humans

We used the two lists presented by Castellano et al. (2014) and Prüfer et al. (2014), and label them C and P, respectively (Supplemental Table S1). Both studies included lists of genes in which SNPs are fixed, and lists in which the desired SNP pattern is found in more than 90% of the modern humans examined. As noted, several ways exist in which these two lists of genes (C and P) may potentially be compared to genes that might be involved in language. We also included a third primary list (Key et al. 2016), of genes in the genomes of ancient and living humans that are said to show indications of positive natural selection. We label this one the S list.

Table 1. Gene overlap of Prüfer et al. (2014) (P), Castellano et al. (2014) (C) and Key et al. (2016) (S) with the indicated study.

Citation                              C    P          S
Boeckx (2017)                         0    CNTNAP2    FOXP1
Fisher (2017)                         0    CNTNAP2    0
Dediu and Christiansen (2016)         0    CNTNAP2    FOXP1
Elvevåg et al. (2016)                 0    CNTNAP2    FOXP1
Benítez-Burraco and Murphy (2016)     0    CNTNAP2    FOXP1
Mozzi et al. (2016)                   0    CNTNAP2    0
Combined                              0    CNTNAP2    FOXP1

Genes from the literature: Using PubMed at NCBI and the associated query ‘language’, we found that the following studies from the past two years (papers with publication dates in 2016 and 2017) propose that specific genes are involved in language emergence and acquisition: Nuttle et al. (2016), Scott-Phillips (2017), Hippolyte et al. (2016), Reuter et al. (2017), Whalen and Griffiths (2017), Jarvis (2016), Boeckx (2017), Fisher (2017), Dediu and Christiansen (2016), Dediu, Janssen, and Moisik (2016), Elvevåg et al. (2016), Benítez-Burraco and Murphy (2016), and Mozzi et al. (2016). We compiled gene lists from these studies (Supplemental Table S2), and asked if there is any overlap with the P and C lists described above.

Table 2. Genes that are cross listed between the Prüfer et al. (2014) (P), Castellano et al. (2014) (C) and Key et al. (2016) (S) lists, and the OMIM lists generated in this article.

Keywords                 C        P          S
Aphasia                  0        0          NPC2
Brain size               0        0          *
Broca                    0        0          0
Communication defect     0        0          0
Communication            0        CNTNAP2    0
Dyslexia                 0        0          0
Language acquisition     0        CNTNAP2    0
Language perception      0        CNTNAP2    0
Larynx                   GLI3     GLI3       0
Linguistic               GLI3     GLI3       PAX2
Maternal behavior        0        0          0
Neuronal paring          0        0          0
Perfect pitch            0        0          0
Reading                  0        0          0
Speech difficulty        0        CNTNAP2    FOXP1
Speech perception        0        0          0
Tongue                   WDPCP    0          BNC2
Vocal cord               0        CNTNAP2    DCTN1
Vocal tract              0        0          DCTN1, PAX2
Voice                    0        CNTNAP2    0
Wernicke                 0        0          0

* = MCPH1, TSC1, CHRM5, DVL1, MYO16.

Genes targeted by GWAS in the literature and GWAS database: Several of the studies mentioned above were used to search the GWAS database to create gene lists (Supplemental Table S3). We used the GWAS Catalogue database (https://www.ebi.ac.uk/gwas/) to accomplish this step. This database now includes the NIH GWAS database (NHGRI-EBI GWAS). Other GWAS databases exist, but we made the arbitrary decision to focus on the GWAS Catalogue for this article. The search for GWAS studies was accomplished using the keywords ‘speech’ and ‘language’. The works referencing the genes obtained by this process are not all classical GWAS studies; but nonetheless, a circumscribed list of genes obtained in this way demonstrates the simplicity with which the GWAS database can be exploited.

Genes obtained from OMIM searches: Searches of the OMIM database (https://www.omim.org/) were conducted using a set of keywords (Supplemental Table S4). Extremely large numbers of hits were obtained with these search queries, due to the way in which searches are reported in OMIM. As a result, we altered some of our searches to ‘speech AND acquisition’, ‘language AND acquisition’ and so forth. The hits were then hand-curated by reading through the OMIM descriptions to ensure that the information given does indeed contain the desired keywords, and in the desired context of language acquisition. These searches gave us a preliminary view of the genes and kinds of syndromes researchers have associated with language acquisition and speech capacity, and they resulted in gene lists for many single genes potentially involved in speech and language acquisition.

The gene lists from studies in the literature were based on the most recent papers we could access, with dates in 2016 and 2017.
We used an agnostic approach to search two example databases (GWAS and OMIM) to obtain the gene lists. More than likely other studies exist that could add genes to these lists, but we limited ourselves to these two databases. We recognize that our lists are highly dependent on the query terms we used, but our main purpose here is to demonstrate how databases of this kind can be used to provide lists for cross-referencing. We also point out that variant mapping is often susceptible to false positive findings, which could also be a limitation of our sequence lists. We first examined the P and C lists, to assess overlap. The P list contains 114 unique genes, while the C list has 101 unique genes. The two lists overlap on 26 genes (Supplemental Table S5). The difference most likely results from the use of different archaic human genomes to obtain the lists. The studies taken from the literature also have considerable overlap amongst themselves, suggesting that the authors of the papers cited here recognized a considerable number of genes in common (Supplemental Table S5) that might potentially be involved in language. Suggestively, there is little overlap in the C and P lists of genes that other authors have targeted in previous studies. Table 1 shows a list of the studies and genes that overlap from the P and C lists. Only one gene, CNTNAP2, shows overlap with the P list. This gene is a member of the neurexin gene family. Genes in this family function in the nervous system, and CNTNAP2 has been associated with autism. Alarmingly, there is no overlap of genes in the literature list with the C list. Finally, there was a single gene in the literature-based lists that did overlap with the S list: FOXP1. This gene is important in muscle development, especially of the esophagus. We next examined the GWAS gene lists, and found no overlap among them (over seven different studies); nor is there any overlap of the lists from these studies with the P and C lists. 
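The list comparisons described above amount to set intersections over gene symbols. The following is a minimal sketch of that step, not the authors' actual procedure: the function names are hypothetical, and the three lists are small illustrative excerpts built only from genes named in the text (the full P, C and S lists are in the supplemental tables).

```python
from itertools import combinations


def unique_genes(symbols):
    """Normalize gene symbols (trim, upper-case) and drop duplicates."""
    return {s.strip().upper() for s in symbols if s.strip()}


def pairwise_overlap(named_lists):
    """Return {(name_a, name_b): shared genes} for every pair of lists."""
    names = sorted(named_lists)
    return {
        (a, b): named_lists[a] & named_lists[b]
        for a, b in combinations(names, 2)
    }


# Illustrative excerpts only (assumption): a few genes named in the text,
# not the full 114-gene P list or 101-gene C list.
gene_lists = {
    "P": unique_genes(["CNTNAP2", "GLI3"]),
    "C": unique_genes(["GLI3", "WDPCP"]),
    "S": unique_genes(["FOXP1", "PAX2", "BNC2", "PRKCH"]),
}

overlaps = pairwise_overlap(gene_lists)
for pair, shared in overlaps.items():
    print(pair, sorted(shared) if shared else "no overlap")
```

Normalizing symbols before comparison matters in practice, because the same gene can appear with different capitalization or stray whitespace in lists compiled from different sources.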
Part of the apparent problem with the GWAS lists might involve the difficulty of establishing strong phenotypic manifestations of language and language acquisition. Two genes from the GWAS studies matched with the S list: PRKCH and BNC2. The first is a protein kinase C (PKC) that is in the family of serine- and threonine-specific protein kinases. The second is a protein most commonly associated with skin color (BNC2). The final set of comparisons involved gene lists obtained from OMIM. While the list of OMIM-derived genes we present is not exhaustive, and includes some keywords that might have little to do with language, we suggest the exercise is useful because it permits a broader examination of both databases and a novel potential way of cross-referencing these gene lists. Table 2 shows the results of comparing the P, C, and S gene lists with these OMIM genes. There are fourteen genes on the three guide-lists that can be matched to OMIM-derived genes, three of which are related to language query terms found in the P and C lists. They are: CNTNAP2: This gene is one also targeted by several of the studies from the literature mentioned earlier. It is a neural development gene and hence could be of great importance in pursuing ideas about neural involvement in language. GLI3: This gene codes for a transcription factor ‘zinc finger’ protein. These proteins are important in regulating developmentally-important signaling proteins like Sonic hedgehog. The product of this gene is also thought to be important in embryogenesis. A previous connection of this gene to language was made by Boeckx and Benítez-Burraco (2014a,b; Boeckx 2017), but not using OMIM lists. WDPCP: This gene encodes a WD40 repeat protein that controls developmental polarity. It is important in hamartoma of the tongue. The comparison of OMIM genes with the natural selection S-list of Key et al. (2016) resulted in overlap of eleven genes, with only one of them found in more than one OMIM list (PAX2). 
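A keyword-by-list cross-reference of the kind summarized in Table 2 can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the function names are hypothetical, and both the OMIM keyword lists and the guide lists are excerpts restricted to genes that appear in three rows of Table 2.

```python
def cross_reference(keyword_lists, guide_lists):
    """Build {keyword: {guide_name: sorted shared genes}}."""
    table = {}
    for keyword, genes in keyword_lists.items():
        table[keyword] = {
            name: sorted(genes & guide) for name, guide in guide_lists.items()
        }
    return table


def format_row(keyword, row, order=("C", "P", "S")):
    """Render one row in the style of Table 2, using '0' for no overlap."""
    cells = [", ".join(row[name]) or "0" for name in order]
    return "  ".join([keyword] + cells)


# Excerpts only (assumption): genes limited to those shown in Table 2.
omim_excerpt = {
    "Larynx": {"GLI3"},
    "Linguistic": {"GLI3", "PAX2"},
    "Tongue": {"WDPCP", "BNC2"},
}
guide_excerpt = {
    "C": {"GLI3", "WDPCP"},
    "P": {"CNTNAP2", "GLI3"},
    "S": {"PAX2", "BNC2", "FOXP1"},
}

table = cross_reference(omim_excerpt, guide_excerpt)
for keyword, row in table.items():
    print(format_row(keyword, row))
```

With these excerpts the three printed rows agree with the corresponding rows of Table 2, which is a useful sanity check when the same code is run on the full lists.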
One other gene that is prevalent in the literature search lists (Table 1) is FOXP1. The products of these genes (FOXP1 and PAX2) are involved in development. Specifically, FOXP1 interacts with FOXP2, and has been shown to be involved in autism and intellectual disability. In general, it is difficult to see a clear causal connection with language of any of the selected genes in the Key et al. (2016) list. Creating more lists from OMIM of possible genes involved in speech or language acquisition, or in the anatomical structures required for speech, might uncover other candidate genes; but the bottom line is that this approach can at most identify whittled-down lists of genes that are putatively correlated with language. On the other hand, perhaps more Neanderthal and Denisovan genomes will reveal that the SNPs that appear to be fixed in this small group of genomes are actually not fixed. Under the cross-listing criteria we use here, if these SNPs reside in the genes we discuss above, they would have to be eliminated from potential involvement with language.

5. Genomic correlates with language

The earliest and most famous candidate for a ‘language gene’, FOXP2, appears to have no variation that might have differentiated Neanderthals/Denisovans from Homo sapiens. Still, this is a pretty weak basis for suggesting that the former may, like the latter, have had language. Prüfer et al. (2014) and Castellano et al. (2014) examined their lists closely, and do not appear to have made any strong connections between language and any of the genes they determined as important in the differentiation of the Neanderthal/Denisovan genomes. We show here that there are at least three genes with some loose connection to language that pass our cross-listing test, and that some single-gene changes identified in the candidate lists we have discussed might be involved in neural processes and anatomical innovations that distinguish the Neanderthal/Denisovan group from modern humans.
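The dependence of the candidate lists on the small number of archaic genomes, noted at the end of Section 4, can be made concrete with a sketch of the two SNP patterns described in Section 3. This is a simplified illustration, not the pipeline of Prüfer et al. (2014) or Castellano et al. (2014): the function name, the toy allele counts, and the choice to pool all non-ancestral alleles as ‘derived’ are assumptions.

```python
def classify_site(chimp, archaic, human_counts, near_fixed=0.90):
    """Classify one SNP site against the two screening patterns.

    chimp        -- chimpanzee (C) allele, taken as the ancestral state
    archaic      -- alleles observed in the archaic genomes (N, D)
    human_counts -- {allele: count} among present-day human chromosomes (H)
    Returns 'fixed', 'near_fixed', or None if the site fails the screen.
    """
    # Every archaic genome must share the chimpanzee state.
    if not archaic or any(allele != chimp for allele in archaic):
        return None
    total = sum(human_counts.values())
    if total == 0:
        return None
    # Assumption: any non-ancestral allele counts as derived.
    derived = total - human_counts.get(chimp, 0)
    if derived == total:
        return "fixed"
    if derived / total >= near_fixed:
        return "near_fixed"
    return None


# Toy counts (hypothetical), not real data.
print(classify_site("A", ["A", "A"], {"G": 100}))           # fixed
print(classify_site("A", ["A", "A"], {"G": 95, "A": 5}))    # near_fixed
# A newly sequenced archaic genome carrying the derived allele
# removes the site from the candidate list.
print(classify_site("A", ["A", "A", "G"], {"G": 100}))      # None
```

The last call shows why additional Neanderthal and Denisovan genomes can only shrink lists built this way: a single archaic individual carrying the derived allele is enough to disqualify a site.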
Of the three genes in the P and C lists that overlap with the OMIM lists, WDPCP is the least likely to be involved in language. It is found in only one OMIM list (genes associated with tongue development). It is involved in hamartoma of the tongue and causes abnormal tissue development and malfunction of the tongue, but more than likely is not involved in language acquisition or maintenance. Its role in developmental polarity is also more than likely not relevant to language. On the other hand, the connection of the role of WDPCP in developmental polarity to language is extremely complex. GLI3 is particularly interesting because it is found in both the P and C lists and it overlaps with the OMIM gene lists (keywords ‘larynx’ and ‘linguistic’). This gene is a zinc finger-containing developmental regulator that controls the normal patterning or shaping of organs and structure during embryogenesis. GLI3 as well as other genes in the GLI family are transcription factors and interact with Sonic Hedgehog to repress or activate other genes in that pathway. One of the human genetic disorder syndromes that GLI3 is involved in (Pallister-Hall syndrome) involves the development of the larynx and this anatomical structure is intricately involved in language. On the other hand, GLI3 is also involved in the development of the brain (specifically the thalamus) and some researchers have pointed to its involvement in the ‘human language ready brain’ (Boeckx and Benítez-Burraco 2014a,b; Boeckx 2017). However, the gene is highly pleiotropic, is involved in development of several organs, and in fact is lethal under some conditions in organs not related to the larynx or the brain. In this context, if GLI3 does influence language it does it in the same general way as other transcription factors that have been suggested to be involved in language like FOXP1 and FOXP2.
The final gene, CNTNAP2, is interesting because it is found in all of the literature lists (Table 1) and in several of the OMIM lists (Table 2). Oddly the C list does not contain this gene. The potential role of this gene in language has been discussed in detail in the publications in Table 1. Briefly, this gene is a member of the neurexin family which is important in vertebrate nervous system development. The gene itself is one of the largest genes in the human genome (it covers over 1.5% of chromosome 7) and is thought to be involved in certain forms of deafness. It has also been correlated with several neurological disorders (ADHD, schizophrenia and autism). Like GLI3, CNTNAP2 is also highly pleiotropic and has a broad spectrum of effects in human development. It is of interest to note that the two candidate genes that we discuss here with the most potential for a role in language (GLI3 and CNTNAP2) are both found on Chromosome 7, albeit on different arms of that chromosome. Mining OMIM and the literature for genes that are correlated with language is a straightforward but limited approach. It is not surprising that while our systematic use of OMIM in this context is unique, we identified several genes already thought to be involved in language (GLI3 and CNTNAP2) and other genes only peripherally involved. All this stops very far short of identifying a gene, or even a suite of genes, ‘for’ language. Both the neural capacity for language, and the anatomical apparatus needed to express it, result from some profound changes in major developmental pathways in the immediate ancestor of Homo sapiens that are unlikely to be simply related to any of the gene changes yet fingered. As with anything else in evolutionary reconstruction, the search for the genetic basis of those changes needs to be conducted within the framework both of a highly specific phylogeny and adequate aDNA sampling, neither of which has yet been achieved.

6. Integration with the fossil and archaeological records

Many have made the argument that language is such a complex attribute that its evolutionary roots must lie very deep indeed in time (e.g. Lieberman 2016; Corballis 2017). However, there is very little in the material (behavioral) record to substantiate this assertion (Tattersall 2012, 2017). Indeed, if we may legitimately associate language (as familiar to us) with the symbolic thought mode with which it appears virtually synonymous (Hinzen 2012), we find the first material intimations that humans were linguistic only well within the tenure on Earth of our anatomically distinctive species Homo sapiens. The first anatomical Homo apparently behaved effectively just like their hominid antecedents and contemporaries, exhibiting little if any of the zeal for change and innovation, and none of the ability to reconceptualize the world, that so richly characterize their modern language-endowed descendants. Most likely, the neural underpinnings for language (and certainly the vocal apparatus that permits speech) were acquired as byproducts of the radical developmental reorganization that resulted, some 200,000 years ago, in the highly distinctive modern skeletal anatomy that is all the fossil record bequeaths us (Tattersall 2012). Only later, after around 100,000 years ago, did any H. sapiens society begin to routinely show evidence of symbolic behaviors, an innovation that was plausibly sparked by the spontaneous invention of language among a population of individuals already possessing a ‘language-ready’ brain (Boeckx and Benítez-Burraco 2014a,b; Boeckx 2017). It is little short of mind-boggling that any organism could ever have crossed the qualitative gulf between the non-linguistic/non-symbolic cognitive state and the linguistic/symbolic one. But we know this crossing happened, for there is no legitimate dispute that Homo sapiens is descended at some remove from a non-symbolic and non-linguistic ancestor.
The transition was most likely possible only because the algorithmic basis for language is a simple one (Bolhuis et al. 2014; Berwick and Chomsky 2016), something also suggested by the apparent effortlessness with which structured sign-languages have been spontaneously invented (Senghas, Kita, and Özyürek 2005) by deaf modern children housed together. If, as seems most likely, the biological underpinnings for this transition were acquired as part of the event that resulted in the recent and apparently abrupt acquisition of the modern human morphology (something that we have as yet failed to find closely anticipated in the fossil record), then we have to look for them in a relatively simple and short-term genetic innovation that took place entirely at random to its eventual symbolic/linguistic ramifications. In other words, in an event that involved the exaptive co-optation of an existing structure to a new use. This scenario of language origin reinforces our hope that the genomic basis for the modern linguistic condition will indeed ultimately be found in genetic alterations of the kind researchers have sought by comparing ancient and modern hominid DNAs. But we are clearly not there yet.

Acknowledgements

We thank Dan Dediu and Antonio Benítez Burraco for the opportunity to participate in this fascinating special issue of JoLE, and two anonymous reviewers for their insightful comments.

Supplementary data

Supplementary data are available at Journal of Language Evolution online.

Conflict of interest statement. None declared.

References

Allentoft M. E. et al. (2015) ‘Population Genomics of Bronze Age Eurasia’, Nature, 522: 167–72.
Becker J. et al. (2014) ‘Genetic Analysis of Dyslexia Candidate Genes in the European Cross-Linguistic NeuroDys cohort’, European Journal of Human Genetics, 22/5: 675–80.
Benítez-Burraco A., Murphy E. (2016) ‘The Oscillopathic Nature of Language Deficits in Autism: From Genes to Language Evolution’, Frontiers in Human Neuroscience, 10: 120–9.
Berwick R., Chomsky N. (2016) Why Only Us. Cambridge, MA: MIT Press.
Boeckx C. (2017) ‘The Language-Ready Head: Evolutionary Considerations’, Psychonomic Bulletin & Review, 24: 1–6.
Boeckx C., Benítez-Burraco A. (2014a) ‘The Shape of the Human Language-ready Brain’, Frontiers in Psychology, 5: 282–90.
Boeckx C., Benítez-Burraco A. (2014b) ‘Globularity and Language-readiness: Generating New Predictions by Expanding the Set of Genes of Interest’, Frontiers in Psychology, 5: 1324–32.
Bolhuis J. J. et al. (2014) ‘How could Language have Evolved?’, PLoS Biology, 12/8: e101934.
Castellano S. et al. (2014) ‘Patterns of Coding Variation in the Complete Exomes of Three Neandertals’, Proceedings of the National Academy of Sciences, USA, 111/18: 6666–71.
Corballis M. (2017) ‘Language Evolution: A Changing Perspective’, Trends in Cognitive Sciences, 21: 229–36.
Dediu D., Christiansen M. (2016) ‘Language Evolution: Constraints and Opportunities from Modern Genetics’, Topics in Cognitive Science, 8/2: 361–70.
Dediu D., Janssen R., Moisik S. R. (2016) ‘Language is not Isolated from its Wider Environment: Vocal Tract Influences on the Evolution of Speech and Language’, Language & Communication, 54: 9–20.
Elvevåg B. et al. (2016) ‘An Examination of the Language Construct in NIMH’s Research Domain Criteria: Time for Reconceptualization!’, American Journal of Medical Genetics Part B: Neuropsychiatric Genetics, 171/6: 904–19.
Fisher S. E. (2017) ‘Evolution of Language: Lessons from the Genome’, Psychonomic Bulletin & Review, 24: 1–7.
Gamba C. et al. (2014) ‘Genome Flux and Stasis in a Five Millennium Transect of European Prehistory’, Nature Communications, 5: 52–57.
Graham S. A., Fisher S. E. (2015) ‘Understanding Language from a Genomic Perspective’, Annual Review of Genetics, 49: 131–60.
Graham S. A., Deriziotis P., Fisher S. E. (2015) ‘Insights into the Genetic Foundations of Human Communication’, Neuropsychology Review, 25/1: 3–26.
Green R. E. et al. (2010) ‘A Draft Sequence of the Neandertal Genome’, Science, 328: 710–22.
Haak W. et al. (2015) ‘Massive Migration from the Steppe was a Source for Indo-European Languages in Europe’, Nature, 522: 207–11.
Harlaar N. et al. (2014) ‘Genome-wide Association Study of Receptive Language Ability of 12-year-olds’, Journal of Speech, Language, and Hearing Research, 57/1: 96–105.
Hinzen W. (2012) ‘The Philosophical Significance of Universal Grammar’, Language Sciences, 34: 635–49.
Hippolyte L. et al. (2016) ‘The Number of Genomic Copies at the 16p11.2 Locus Modulates Language, Verbal Memory, and Inhibition’, Biological Psychiatry, 80/2: 129–39.
Jarvis E. D. (2016) ‘Evolution of Brain and Genes for Vocal Learning and Spoken Language’, International Journal of Psychology, 51: 825.
Key F. M. et al. (2016) ‘Human Adaptation and Population Differentiation in the Light of Ancient Genomes’, Nature Communications, 7: 10775–82.
Kornilov S. A. et al. (2016) ‘Genome-wide Association and Exome Sequencing Study of Language Disorder in an Isolated Population’, Pediatrics, 137: e20152469.
Lieberman P. (2016) ‘The Evolution of Language and Thought’, Journal of Anthropological Science, 94: 1–20.
Llamas B. et al. (2017) ‘Human Evolution: A Tale from Ancient Genomes’, Philosophical Transactions of the Royal Society B, 372: 20150484.
Luciano M. et al. (2013) ‘A Genome-wide Association Study for Reading and Language Abilities in Two Population Cohorts’, Genes, Brain and Behavior, 12/6: 645–52.
Mathieson I. et al. (2015) ‘Genome-wide Patterns of Selection in 230 Ancient Eurasians’, Nature, 528: 499–503.
Mozzi A. et al. (2016) ‘The Evolutionary History of Genes Involved in Spoken and Written Language: Beyond FOXP2’, Scientific Reports, 6: 22157–65.
Nudel R. et al. (2014) ‘Genome-wide Association Analyses of Child Genotype Effects and Parent-of-Origin Effects in Specific Language Impairment’, Genes, Brain and Behavior, 13/4: 418–29.
Nuttle X. et al. (2016) ‘Emergence of a Homo Sapiens-specific Gene Family and Chromosome 16p11.2 CNV Susceptibility’, Nature, 536: 205–10.
Olalde I. et al. (2014) ‘Derived Immune and Ancestral Pigmentation Alleles in a 7,000-Year-Old Mesolithic European’, Nature, 507: 225–8.
O'Roak B. J. et al. (2011) ‘Exome Sequencing in Sporadic Autism Spectrum Disorders Identifies Severe de novo Mutations’, Nature Genetics, 43/6: 585–9.
Prüfer K. et al. (2014) ‘The Complete Genome Sequence of a Neanderthal from the Altai Mountains’, Nature, 505/7481: 43–9.
Google Scholar CrossRef Search ADS PubMed  Rasmussen M. et al.   ( 2010) ‘ Ancient Human Genome Sequence of an Extinct Palaeo-Eskimo’, Nature , 463 / 7282: 757– 62. Google Scholar CrossRef Search ADS PubMed  Reich D. et al.   ( 2010) ‘ Genetic History of an Archaic Hominin Group from Denisova Cave, Siberia’, Nature , 468: 1053– 60. Google Scholar CrossRef Search ADS PubMed  Reuter M. et al.   ( 2017) ‘ FOXP2 Variants in 14 Individuals with Developmental Speech and Language Disorders Broaden the Mutational and Clinical Spectrum’, Journal of Medical Genetics , 54: 64– 72. Google Scholar CrossRef Search ADS PubMed  Roeske D. et al.   ( 2011) ‘ First Genome-wide Association Scan on Neurophysiological Endophenotypes Points to Trans-Regulation Effects on SLC2A3 in Dyslexic Children’, Molecular Psychiatry , 16: 97– 107. Google Scholar CrossRef Search ADS PubMed  Scott-Phillips T. C. ( 2017) ‘ Pragmatics and the Aims of Language Evolution’, Psychonomic Bulletin & Review , 24: 1– 4. Google Scholar CrossRef Search ADS PubMed  Senghas A., Kita S., Özyürek A. ( 2005) ‘ Children Creating Core Properties of Language: Evidence from an Emerging Sign Language in Nicaragua’, Science , 305: 1779– 82. Google Scholar CrossRef Search ADS   Slatkin M., Racimo F. ( 2016) ‘ Ancient DNA and Human History’, Proceedings of the National Academy of Sciences , 113: 6380– 7. Google Scholar CrossRef Search ADS   St Pourcain B. et al.   ( 2013) ‘ Common Variation Contributes to the Genetic Architecture of Social Communication Traits’, Molecular Autism , 4: 34. Google Scholar CrossRef Search ADS PubMed  St Pourcain B. ( 2014a) ‘ Common Variation near ROBO2 is Associated with Expressive Vocabulary in Infancy’, Nature Communications , 5: 4831. Google Scholar CrossRef Search ADS   St Pourcain B. ( 2014b) ‘ Variability in the Common Genetic Architecture of Social-Communication Spectrum Phenotypes during Childhood and Adolescence’, Molecular Autism , 5: 18. 
Google Scholar CrossRef Search ADS   Tattersall I. ( 2012) Masters of the Planet: The Search for Our Human Origins . New York: Palgrave Macmillan. Tattersall I. ( 2017) ‘ How can We Detect when Language Emerged?’, Psychonomic Bulletin & Review , 24: 64– 7. Google Scholar CrossRef Search ADS PubMed  Villanueva P. et al.   ( 2011) ‘ Genome-wide Analysis of Genetic Susceptibility to Language Impairment in an Isolated Chilean Population’, European Journal of Human Genetics , 19/ 6: 687– 95. Google Scholar CrossRef Search ADS PubMed  Whalen A., Griffiths T. L. ( 2017) ‘ Adding Population Structure to Models of Language Evolution by Iterated Learning’, Journal of Mathematical Psychology , 76 / 2017: 1– 6. Google Scholar CrossRef Search ADS   Wilde S. et al.   ( 2104) ‘ Direct Evidence for Positive Selection of Skin, Hair, and Eye Pigmentation in Europeans during the Last 5,000 y’, Proceedings of the National Academy of Sciences USA , 111: 4832– 7. Google Scholar CrossRef Search ADS   © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Journal of Language Evolution Oxford University Press

Publisher
Oxford University Press
ISSN
2058-4571
eISSN
2058-458X
D.O.I.
10.1093/jole/lzx018

Abstract

The genome sequencing of individuals belonging to extinct forms of the genus Homo has provided us with a detailed view of the genetic makeup of some of our close extinct relatives. In addition, the unprecedented depth of sequencing of modern Homo sapiens populations has given us a framework for interpreting minor changes at the DNA sequence level that are putatively relevant to a broad array of anatomical and behavioral characteristics. Here we discuss the genetic architecture of such complex characteristics, with a special focus on language and speech. We examine the extent of reported variation in the DNA sequences of genes that are thought to be involved in their production, both in H. sapiens populations and in our extinct relatives, and we discuss to what extent such sequence variations are relevant to making direct statements about the capacity of extinct hominids to generate and express language and speech. Because language is a highly complex behavioral character, we stress the difficulties involved in using the ‘atomized’ DNA sequence data as indicators of its possession by extinct hominids, and we emphasize that such data should not be considered in isolation from other relevant information gleaned from comparative anatomy and the archaeological record.

1. Introduction

Deciphering the genetic basis of the emergence of language in the human lineage has become an interesting and potentially achievable endeavor. Several recently sequenced archaic nuclear human genomes (all belonging to Neanderthals or Denisovans) have allowed researchers to focus on genetic differences between those archaic humans and their modern counterparts, in the hope of discovering genes and genetic elements that might have been involved in the differentiation of modern humans from their archaic precursors. However, language is a very complex behavior, and whether or not some, or any, of the differences documented between archaic genomes and modern ones are directly involved in language remains a mystery.

In this contribution we take a systematic approach to examining whether any of the differences between the genomes of modern humans and those of Neanderthals and Denisovans can reasonably be associated with language. We start by discussing the range of possible genetic mechanisms or architectures involved in language as a phenotype. Because determining the kinds of genes that might be involved in the differentiation of Homo sapiens from its close extinct relatives requires an examination of archaic genomes, we next look at the state of ancient DNA research in relation to those archaic genomes, allowing the identification of genes that differ in archaic and living humans. We next describe some simple, objective methods of obtaining lists from databases of genes that are potentially related to language. By comparing overlap in the various lists, we were able to identify genes with potential relationships to language phenotypes. We conclude with a discussion of the implications of genomics for language, and attempt to integrate what we know about the genetics of language with the fossil record.

2. The range of the genetic architecture of language

Language and speech are perhaps the most striking of all human autapomorphies, and knowing their genetic architectures is critical to understanding their origins in our lineage. Graham and Fisher (2015) and Graham, Deriziotis, and Fisher (2015) have recently reviewed this topic, and point out that the primary difficulty with pinning down the genetic architectures of language and speech lies in obtaining precise and well-defined traits or phenotypes with which to begin the search for those structural underpinnings.
Phenotypes important to language (Graham and Fisher 2015) have been inferred from the comparative biology of communication in other organisms (birdsong), from neuroimaging (correlating brain structural anomalies with genes), and from the study of language and communication anomalies in humans. These latter conditions include dyslexia, reading and language disabilities, receptive language ability at age 12, expressive vocabulary in infancy, specific language impairment (SLI), social communication ability at age 17, and developmental problems with speech and language (including epilepsy-aphasia spectrum disorders, speech sound disorder, dysarthria, childhood apraxia of speech (CAS), and stuttering). Some of these conditions are inherited in Mendelian fashion; and exceptional abilities in language acquisition, evidently non-Mendelian, may also shed light on this issue.

Graham, Deriziotis, and Fisher (2015) point out that a range of different genetic architectures is involved in these various language-related traits and phenotypes. Some disorders are dominant, others are recessive, while yet others are sporadic and caused by a de novo event. Some language-associated traits are evidently genetically complex. The variety of the language-associated conditions listed above emphasizes the difficulty of settling on a standard phenotype for language. Indeed, in practice language can be associated with many and varied kinds of traits, suggesting that the problem we have in defining language as a specific entity may result from the involvement of many different genetic architectures within the complex whole. It is merely the simplicity of the Mendelian interpretations that has garnered them so much attention from researchers studying the genetics of language. This means that, while interpretation of Mendelian traits is the lowest-hanging fruit in the search for candidate genes, we will also need to examine genes involved in complex genetic interactions.
Many authors have discussed at length the range of potential architectures underlying the traits of language and speech in Homo sapiens. Specifically, a host of recent papers have looked at this question in a variety of contexts (Graham and Fisher 2015; Benítez-Burraco and Murphy 2016; Dediu and Christiansen 2016; Dediu, Janssen, and Moisik 2016; Elvevåg et al. 2016; Hippolyte et al. 2016; Jarvis 2016; Mozzi et al. 2016; Nuttle et al. 2016; Boeckx 2017; Fisher 2017; Reuter et al. 2017; Scott-Phillips 2017; Whalen and Griffiths 2017). Most of these publications focused on a single gene, or on a few genes, thought to be associated with language. In addition, Online Mendelian Inheritance in Man (OMIM; http://omim.org/) is a treasure trove of information for understanding single genes involved in the inheritance of specific traits, and both language and speech can be examined in this database. Because OMIM focuses on Mendelian inheritance, the genetic architectures of the traits in this database are, by default, those of single genes.

However, several recent genome-wide association studies (GWAS) have examined language or speech (O'Roak et al. 2011; Roeske et al. 2011; Villanueva et al. 2011; Luciano et al. 2013; St Pourcain et al. 2013, 2014a,b; Becker et al. 2014; Harlaar et al. 2014; Nudel et al. 2014; Kornilov et al. 2016), and have the potential to discover more complex relationships among the genes that are putatively involved. These papers all focus on genome-wide loci that are potentially associated with language, speech or verbal socialization, and on several specific genes that might be involved in the genetic architecture of language and speech production.

3. The scope of ancient human genome variation

The last five years have seen an explosion of nuclear genome sequencing in modern and archaic humans. Two recent reviews summarize the work up to 2016 (Llamas et al. 2017; Slatkin and Racimo 2016).
Here we will focus on the genomes reported in those two reviews. Fig. 1 shows the distribution of these modern and archaic genomes in space and time as reported by Slatkin and Racimo (2016).

Figure 1. Chart showing the distribution of ancient genomes sequenced from the specified geographic areas. The number of genomes is shown on the Y axis and the geographic location is on the X axis. Shades of bars indicate age of fossil: lightest gray is whole genome sequencing from specimens 100 to 1,000 years old; medium light gray is whole genome sequencing from specimens 1,000 to 10,000 years old; medium dark gray is whole genome sequencing from specimens 10,000 to 100,000 years old; darkest gray is high density chip sequencing from specimens 1,000 to 10,000 years old. The asterisks indicate the locations of the four archaic human genomes used in this paper: lightest gray = Vindija Neanderthal; medium light gray = Altai Neanderthal; medium dark gray = Denisova; darkest gray = Ust Ishim Homo sapiens. Data from Slatkin and Racimo (2016), Allentoft et al. (2015), Green et al. (2010), Gamba et al. (2014), Haak et al. (2015), Mathieson et al. (2015), Rasmussen et al. (2010), Olalde et al. (2014) and Wilde et al. (2014).

Basically, two kinds of genome sequencing technologies produce ancient human genomes: whole genome shotgun sequencing, and SNP capture. Both can generate massive amounts of information, and both are required to span at least 1X coverage of the genome.

Prüfer et al. (2014) sequenced the genome of a Neanderthal (N) from the Altai mountains. They combined the genome data from a previously sequenced Denisova (D) specimen (Reich et al. 2010), 1,094 present-day human genomes (H), and a chimpanzee (C) genome, and screened the data for suites of SNPs that showed the following patterns of sequence change distribution. The first pattern had a fixed base pair state in N, D and C, but was different from H (31,389 SNPs and 4,113 short indels). The second category had nearly fixed base pair changes in H (90% of modern humans) relative to N, D, C (105,757 SNPs and 3,900 indels). Castellano et al. (2014) added the sequences of two Neanderthals (one from Spain and one from Croatia) and performed a similar analysis, generating lists of SNPs in the same manner as Prüfer et al. (2014).

While both the Castellano et al. and the Prüfer et al. studies discovered many SNPs that are unique to modern humans, these SNPs impact coding regions only very sparingly. The relatively few proteins impacted could then be examined using gene ontology terminology, and when this had been accomplished Castellano et al.
found that ‘changes in genes related to metabolism, the cardiovascular system, hair distribution, and morphology (genitals, palate, face, extremities, joints, digits, thorax, orbital, and occipital skull regions and general mobility) … [were] … involved’. There is no direct reference to language using the ontology approach, a potential problem for this line of attack. Here we prefer to examine genes with associations in the literature, and Mendelian or GWAS associations.

4. Using simple screens for potential genes involved in speech and language emergence in humans

We used the two lists presented by Castellano et al. (2014) and Prüfer et al. (2014), and label them C and P, respectively (Supplemental Table S1). Both studies included lists of genes in which SNPs are fixed, and lists in which the desired SNP pattern is found in more than 90% of the modern humans examined. As noted, several ways exist in which these two lists of genes (C and P) may potentially be compared to genes that might be involved in language. We also included a third primary list (Key et al. 2016), of genes in the genomes of ancient and living humans that are said to show indications of positive natural selection. We label this one the S list.

Table 1. Gene overlap of the Prüfer et al. (2014) (P), Castellano et al. (2014) (C) and Key et al. (2016) (S) lists with the indicated study.

Citation                             C    P          S
Boeckx (2017)                        0    CNTNAP2    FOXP1
Fisher (2017)                        0    CNTNAP2    0
Dediu and Christiansen (2016)        0    CNTNAP2    FOXP1
Elvevåg et al. (2016)                0    CNTNAP2    FOXP1
Benítez-Burraco and Murphy (2016)    0    CNTNAP2    FOXP1
Mozzi et al. (2016)                  0    CNTNAP2    0
Combined                             0    CNTNAP2    FOXP1

Genes from the literature: Using PubMed at NCBI and the associated query ‘language’, we found that the following studies from the past two years (papers with publication dates in 2016 and 2017) propose that specific genes are involved in language emergence and acquisition: Nuttle et al. (2016), Scott-Phillips (2017), Hippolyte et al. (2016), Reuter et al. (2017), Whalen and Griffiths (2017), Jarvis (2016), Boeckx (2017), Fisher (2017), Dediu and Christiansen (2016), Dediu, Janssen, and Moisik (2016), Elvevåg et al. (2016), Benítez-Burraco and Murphy (2016), and Mozzi et al. (2016). We compiled gene lists from these studies (Supplemental Table S2), and asked if there is any overlap with the P and C lists described above.

Table 2. Genes that are cross-listed between the Prüfer et al. (2014) (P), Castellano et al. (2014) (C) and Key et al. (2016) (S) lists, and the OMIM lists generated in this article.

Keywords                C        P          S
Aphasia                 0        0          NPC2
Brain size              0        0          *
Broca                   0        0          0
Communication defect    0        0          0
Communication           0        CNTNAP2    0
Dyslexia                0        0          0
Language acquisition    0        CNTNAP2    0
Language perception     0        CNTNAP2    0
Larynx                  GLI3     GLI3       0
Linguistic              GLI3     GLI3       PAX2
Maternal behavior       0        0          0
Neuronal paring         0        0          0
Perfect pitch           0        0          0
Reading                 0        0          0
Speech difficulty       0        CNTNAP2    FOXP1
Speech perception       0        0          0
Tongue                  WDPCP    0          BNC2
Vocal cord              0        CNTNAP2    DCTN1
Vocal tract             0        0          DCTN1, PAX2
Voice                   0        CNTNAP2    0
Wernicke                0        0          0

* = MCPH1, TSC1, CHRM5, DVL1, MYO16.

Genes targeted by GWAS in the literature and GWAS database: Several of the studies mentioned above were used to search the GWAS database to create gene lists (Supplemental Table S3). We used the GWAS Catalogue database (https://www.ebi.ac.uk/gwas/) to accomplish this step. This database now includes the NIH GWAS database (NHGRI-EBI GWAS). Other GWAS databases exist, but we made the arbitrary decision to focus on the GWAS Catalogue for this article. The search for GWAS studies was accomplished using the keywords ‘speech’ and ‘language’. The works referencing the genes obtained by this process are not all classical GWAS studies; nonetheless, a circumscribed list of genes obtained in this way demonstrates the simplicity with which the GWAS database can be exploited.

Genes obtained from OMIM searches: Searches of the OMIM database (https://www.omim.org/) were conducted using a set of keywords (Supplemental Table S4). Extremely large numbers of hits were obtained with these search queries, due to the way in which searches are reported in OMIM. As a result, we altered some of our searches to ‘speech AND acquisition’, ‘language AND acquisition’ and so forth. The hits were then hand-curated by reading through the OMIM descriptions to ensure that the information given does indeed contain the desired keywords, and in the desired context of language acquisition. These searches gave us a preliminary view of the genes and kinds of syndromes researchers have associated with language acquisition and speech capacity, and they resulted in gene lists for many single genes potentially involved in speech and language acquisition.

The gene lists from studies in the literature were based on the most recent papers we could access, with dates in 2016 and 2017.
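The narrowing of broad keyword queries to conjunctions such as ‘speech AND acquisition’ can be illustrated with a minimal Python sketch. It assumes that database entries have already been exported locally as (gene, description) pairs; the records, the placeholder gene symbols and the function names below are all invented for illustration and do not represent OMIM content or the OMIM search interface. Hand curation of context, as described above, would still be needed after such a filter.

```python
import re

def matches_all(description, keywords):
    """Return True if every keyword occurs as a whole word (case-insensitive)."""
    return all(
        re.search(r"\b" + re.escape(word) + r"\b", description, flags=re.IGNORECASE)
        for word in keywords
    )

def candidate_genes(records, keywords):
    """Collect the gene symbols whose description contains all keywords."""
    return sorted({gene for gene, text in records if matches_all(text, keywords)})

# Invented placeholder records (assumption): not real database entries.
records = [
    ("GENE_A", "Delayed speech acquisition and hearing loss."),
    ("GENE_B", "Speech is normal; skeletal anomalies only."),
    ("GENE_C", "Impaired language acquisition in childhood."),
]

broad_hits = candidate_genes(records, ["speech"])                  # broad query
speech_hits = candidate_genes(records, ["speech", "acquisition"])  # narrowed query
language_hits = candidate_genes(records, ["language", "acquisition"])
print(broad_hits, speech_hits, language_hits)
```

The broad query returns an entry that mentions speech only in passing, which is the kind of hit that the conjunctive query and the subsequent manual reading are meant to screen out.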
We used an agnostic approach to search two example databases (GWAS and OMIM) to obtain the gene lists. More than likely other studies exist that could add genes to these lists, but we limited ourselves to these two databases. We recognize that our lists are highly dependent on the query terms we used, but our main purpose here is to demonstrate how databases of this kind can be used to provide lists for cross-referencing. We also point out that variant mapping is often susceptible to false positive findings, which could also be a limitation of our sequence lists.

We first examined the P and C lists, to assess overlap. The P list contains 114 unique genes, while the C list has 101 unique genes. The two lists overlap on 26 genes (Supplemental Table S5). The difference most likely results from the use of different archaic human genomes to obtain the lists. The studies taken from the literature also have considerable overlap amongst themselves, suggesting that the authors of the papers cited here recognized a considerable number of genes in common (Supplemental Table S5) that might potentially be involved in language.

Suggestively, there is little overlap between the C and P lists and the genes that other authors have targeted in previous studies. Table 1 lists the studies and the genes that overlap with the P, C and S lists. Only one gene from the literature, CNTNAP2, shows overlap with the P list. This gene is a member of the neurexin gene family. Genes in this family function in the nervous system, and CNTNAP2 has been associated with autism. Alarmingly, there is no overlap of genes in the literature list with the C list. Finally, there was a single gene in the literature-based lists that did overlap with the S list: FOXP1. This gene is important in muscle development, especially of the esophagus.

We next examined the GWAS gene lists, and found no overlap among them (over seven different studies); nor is there any overlap of the lists from these studies with the P and C lists.
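The cross-listing described above amounts to taking set intersections between the guide lists (P, C and S) and each candidate list. The following Python sketch shows one way this could be done. The gene sets are small excerpts assembled from Tables 1 and 2 purely for illustration, not the full lists in the supplementary tables, and the function names are hypothetical.

```python
from itertools import combinations

# Excerpts only (assumption): a few genes per list, taken from Tables 1 and 2.
guide_lists = {
    "P": {"CNTNAP2", "GLI3"},
    "C": {"GLI3", "WDPCP"},
    "S": {"FOXP1", "PAX2", "BNC2", "DCTN1", "NPC2"},
}
candidate_lists = {
    "literature": {"CNTNAP2", "FOXP1"},
    "OMIM larynx": {"GLI3"},
    "OMIM tongue": {"WDPCP", "BNC2"},
}

def cross_list(guides, candidates):
    """For each candidate list, report the genes it shares with each guide list."""
    return {
        cand_name: {
            guide_name: sorted(cand_genes & guide_genes)
            for guide_name, guide_genes in guides.items()
        }
        for cand_name, cand_genes in candidates.items()
    }

def pairwise_overlap(lists):
    """Report the genes shared by every pair of lists."""
    return {
        (a, b): sorted(lists[a] & lists[b])
        for a, b in combinations(sorted(lists), 2)
    }

overlaps = cross_list(guide_lists, candidate_lists)
guide_pairs = pairwise_overlap(guide_lists)
print(overlaps["literature"])
print(guide_pairs[("C", "P")])
```

With these excerpts, the literature row reproduces the ‘Combined’ row of Table 1 (CNTNAP2 in P, nothing in C, FOXP1 in S). An empty intersection is reported as an empty list, corresponding to the ‘0’ entries in the tables.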
Part of the apparent problem with the GWAS lists might involve the difficulty of establishing strong phenotypic manifestations of language and language acquisition. Two genes from the GWAS studies matched with the S list: PRKCH and BNC2. The first is a protein kinase C (PKC) that is in the family of serine- and threonine-specific protein kinases. The second (BNC2) is a protein most commonly associated with skin color.

The final set of comparisons involved gene lists obtained from OMIM. While the list of OMIM-derived genes we present is not exhaustive, and includes some keywords that might have little to do with language, we suggest the exercise is useful because it permits a broader examination of both databases and a novel potential way of cross-referencing these gene lists. Table 2 shows the results of comparing the P, C, and S gene lists with these OMIM genes. There are fourteen genes on the three guide-lists that can be matched to OMIM-derived genes, three of which are related to language query terms found in the P and C lists. They are:

CNTNAP2: This gene is one also targeted by several of the studies from the literature mentioned earlier. It is a neural development gene and hence could be of great importance in pursuing ideas about neural involvement in language.

GLI3: This gene codes for a transcription factor ‘zinc finger’ protein. These proteins are important in regulating developmentally important signaling proteins like Sonic hedgehog. The product of this gene is also thought to be important in embryogenesis. A previous connection of this gene to language was made by Boeckx and Benítez-Burraco (2014a,b; Boeckx 2017), but not using OMIM lists.

WDPCP: This gene encodes a WD40 repeat protein that controls developmental polarity. It is important in hamartoma of the tongue.

The comparison of OMIM genes with the natural selection S-list of Key et al. (2016) resulted in overlap of eleven genes, with only one of them found in more than one OMIM list (PAX2).
One other gene that is prevalent in the literature search lists (Table 1) is FOXP1. The products of these genes (FOXP1 and PAX2) are involved in development. Specifically, FOXP1 interacts with FOXP2, and has been shown to be involved in autism and intellectual disability. In general, it is difficult to see a clear causal connection with language for any of the selected genes in the Key et al. (2016) list.

Creating more lists from OMIM of possible genes involved in speech or language acquisition, or in the anatomical structures required for speech, might uncover other candidate genes; but the bottom line is that this approach can at most identify whittled-down lists of genes that are putatively correlated with language. On the other hand, perhaps more Neanderthal and Denisovan genomes will reveal that the SNPs that appear to be fixed in this small group of genomes are actually not fixed. Under the cross-listing criteria we use here, if these SNPs reside in the genes we discuss above, those genes would have to be eliminated from potential involvement with language.

5. Genomic correlates with language

The earliest and most famous candidate for a ‘language gene’, FOXP2, appears to have no variation that might have differentiated Neanderthals/Denisovans from Homo sapiens. Still, this is a pretty weak basis for suggesting that the former may, like the latter, have had language. Prüfer et al. (2014) and Castellano et al. (2014) examined their lists closely, and do not appear to have made any strong connections between language and any of the genes they determined as important in the differentiation of the Neanderthal/Denisovan genomes. We show here that there are at least three genes with some loose connection to language that pass our cross-listing test, and that some single-gene changes identified in the candidate lists we have discussed might be involved in neural processes and anatomical innovations that distinguish the Neanderthal/Denisovan group from modern humans.
Of the three genes in the P and C lists that overlap with the OMIM lists, WDPCP is the least likely to be involved in language. It is found in only one OMIM list (genes associated with tongue development). It is involved in hamartoma of the tongue and causes abnormal tissue development and malfunction of the tongue, but more than likely is not involved in language acquisition or maintenance. Its role in developmental polarity is also more than likely not relevant to language, although any connection between that role and language would be extremely complex.

GLI3 is particularly interesting because it is found in both the P and C lists and it overlaps with the OMIM gene lists (keywords ‘larynx’ and ‘linguistic’). This gene is a zinc-finger-containing developmental regulator that controls the normal patterning or shaping of organs and structures during embryogenesis. GLI3, like other genes in the GLI family, encodes a transcription factor that interacts with Sonic hedgehog to repress or activate other genes in that pathway. One of the human genetic disorders in which GLI3 is involved (Pallister-Hall syndrome) affects the development of the larynx, and this anatomical structure is intricately involved in language. On the other hand, GLI3 is also involved in the development of the brain (specifically the thalamus), and some researchers have pointed to its involvement in the ‘human language-ready brain’ (Boeckx and Benítez-Burraco 2014a,b; Boeckx 2017). However, the gene is highly pleiotropic, is involved in the development of several organs, and in fact is lethal under some conditions in organs not related to the larynx or the brain. In this context, if GLI3 does influence language it does so in the same general way as other transcription factors that have been suggested to be involved in language, like FOXP1 and FOXP2.
The final gene, CNTNAP2, is interesting because it is found in all of the literature lists (Table 1) and in several of the OMIM lists (Table 2). Oddly, the C list does not contain this gene. The potential role of this gene in language has been discussed in detail in the publications in Table 1. Briefly, this gene is a member of the neurexin family, which is important in vertebrate nervous system development. The gene itself is one of the largest genes in the human genome (it covers over 1.5% of chromosome 7) and is thought to be involved in certain forms of deafness. It has also been correlated with several neurological disorders (ADHD, schizophrenia and autism). Like GLI3, CNTNAP2 is highly pleiotropic and has a broad spectrum of effects in human development. It is of interest to note that the two candidate genes that we discuss here with the most potential for a role in language (GLI3 and CNTNAP2) are both found on chromosome 7, albeit on different arms of that chromosome.

Mining OMIM and the literature for genes that are correlated with language is a straightforward but limited approach. It is not surprising that while our systematic use of OMIM in this context is unique, we identified several genes already thought to be involved in language (GLI3 and CNTNAP2) and other genes only peripherally involved. All this stops very far short of identifying a gene, or even a suite of genes, ‘for’ language. Both the neural capacity for language, and the anatomical apparatus needed to express it, result from some profound changes in major developmental pathways in the immediate ancestor of Homo sapiens that are unlikely to be simply related to any of the gene changes yet fingered. As with anything else in evolutionary reconstruction, the search for the genetic basis of those changes needs to be conducted within the framework both of a highly specific phylogeny and adequate aDNA sampling, neither of which has yet been achieved.

6. Integration with the fossil and archaeological records

Many have made the argument that language is such a complex attribute that its evolutionary roots must lie very deep indeed in time (e.g. Lieberman 2016; Corballis 2017). However, there is very little in the material (behavioral) record to substantiate this assertion (Tattersall 2012, 2017). Indeed, if we may legitimately associate language (as familiar to us) with the symbolic thought mode with which it appears virtually synonymous (Hinzen 2012), we find the first material intimations that humans were linguistic only well within the tenure on Earth of our anatomically distinctive species Homo sapiens. The first anatomical Homo sapiens apparently behaved effectively just like their hominid antecedents and contemporaries, exhibiting little if any of the zeal for change and innovation, and none of the ability to reconceptualize the world, that so richly characterize their modern language-endowed descendants. Most likely, the neural underpinnings for language (and certainly the vocal apparatus that permits speech) were acquired as byproducts of the radical developmental reorganization that resulted, some 200,000 years ago, in the highly distinctive modern skeletal anatomy that is all the fossil record bequeaths us (Tattersall 2012). Only later, after around 100,000 years ago, did any H. sapiens society begin to routinely show evidence of symbolic behaviors, an innovation that was plausibly sparked by the spontaneous invention of language among a population of individuals already possessing a ‘language-ready’ brain (Boeckx and Benítez-Burraco 2014a,b; Boeckx 2017). It is little short of mind-boggling that any organism could ever have crossed the qualitative gulf between the non-linguistic/non-symbolic cognitive state and the linguistic/symbolic one. But we know this crossing happened, for there is no legitimate dispute that Homo sapiens is descended at some remove from a non-symbolic and non-linguistic ancestor.
The transition was most likely possible only because the algorithmic basis for language is a simple one (Bolhuis et al. 2014; Berwick and Chomsky 2016), something also suggested by the apparent effortlessness with which structured sign-languages have been spontaneously invented (Senghas, Kita, and Özyürek 2005) by deaf modern children housed together. If, as seems most likely, the biological underpinnings for this transition were acquired as part of the event that resulted in the recent and apparently abrupt acquisition of the modern human morphology (something that we have as yet failed to find closely anticipated in the fossil record), then we have to look for them in a relatively simple and short-term genetic innovation that took place entirely at random with respect to its eventual symbolic/linguistic ramifications; in other words, in an event that involved the exaptive co-optation of an existing structure to a new use. This scenario of language origin reinforces our hope that the genomic basis for the modern linguistic condition will indeed ultimately be found in genetic alterations of the kind researchers have sought by comparing ancient and modern hominid DNAs. But we are clearly not there yet.

Acknowledgements

We thank Dan Dediu and Antonio Benítez Burraco for the opportunity to participate in this fascinating special issue of JoLE, and two anonymous reviewers for their insightful comments.

Supplementary data

Supplementary data are available at Journal of Language Evolution online.

Conflict of interest statement. None declared.

References

Allentoft M. E. et al. (2015) ‘Population Genomics of Bronze Age Eurasia’, Nature, 522: 167–72.
Becker J. et al. (2014) ‘Genetic Analysis of Dyslexia Candidate Genes in the European Cross-Linguistic NeuroDys cohort’, European Journal of Human Genetics, 22/5: 675–80.
Benítez-Burraco A., Murphy E. (2016) ‘The Oscillopathic Nature of Language Deficits in Autism: From Genes to Language Evolution’, Frontiers in Human Neuroscience, 10: 120–9.
Berwick R., Chomsky N. (2016) Why Only Us. Cambridge, MA: MIT Press.
Boeckx C. (2017) ‘The Language-Ready Head: Evolutionary Considerations’, Psychonomic Bulletin & Review, 24: 1–6.
Boeckx C., Benítez-Burraco A. (2014a) ‘The Shape of the Human Language-ready Brain’, Frontiers in Psychology, 5: 282–90.
Boeckx C., Benítez-Burraco A. (2014b) ‘Globularity and Language-readiness: Generating New Predictions by Expanding the Set of Genes of Interest’, Frontiers in Psychology, 5: 1324–32.
Bolhuis J. J. et al. (2014) ‘How could Language have Evolved?’, PLoS Biology, 12/8: e101934.
Castellano S. et al. (2014) ‘Patterns of Coding Variation in the Complete Exomes of Three Neandertals’, Proceedings of the National Academy of Sciences, USA, 111/18: 6666–71.
Corballis M. (2017) ‘Language Evolution: A Changing Perspective’, Trends in Cognitive Sciences, 21: 229–36.
Dediu D., Christiansen M. (2016) ‘Language Evolution: Constraints and Opportunities from Modern Genetics’, Topics in Cognitive Science, 8/2: 361–70.
Dediu D., Janssen R., Moisik S. R. (2016) ‘Language is not Isolated from its Wider Environment: Vocal Tract Influences on the Evolution of Speech and Language’, Language & Communication, 54: 9–20.
Elvevåg B. et al. (2016) ‘An Examination of the Language Construct in NIMH’s Research Domain Criteria: Time for Reconceptualization!’, American Journal of Medical Genetics Part B: Neuropsychiatric Genetics, 171/6: 904–19.
Fisher S. E. (2017) ‘Evolution of Language: Lessons from the Genome’, Psychonomic Bulletin & Review, 24: 1–7.
Gamba C. et al. (2014) ‘Genome Flux and Stasis in a Five Millennium Transect of European Prehistory’, Nature Communications, 5: 52–57.
Graham S. A., Fisher S. E. (2015) ‘Understanding Language from a Genomic Perspective’, Annual Review of Genetics, 49: 131–60.
Graham S. A., Deriziotis P., Fisher S. E. (2015) ‘Insights into the Genetic Foundations of Human Communication’, Neuropsychology Review, 25/1: 3–26.
Green R. E. et al. (2010) ‘A Draft Sequence of the Neandertal Genome’, Science, 328: 710–22.
Haak W. et al. (2015) ‘Massive Migration from the Steppe was a Source for Indo-European Languages in Europe’, Nature, 522: 207–11.
Harlaar N. et al. (2014) ‘Genome-wide Association Study of Receptive Language Ability of 12-year-olds’, Journal of Speech, Language, and Hearing Research, 57/1: 96–105.
Hinzen W. (2012) ‘The Philosophical Significance of Universal Grammar’, Language Sciences, 34: 635–49.
Hippolyte L. et al. (2016) ‘The Number of Genomic Copies at the 16p11.2 Locus Modulates Language, Verbal Memory, and Inhibition’, Biological Psychiatry, 80/2: 129–39.
Jarvis E. D. (2016) ‘Evolution of Brain and Genes for Vocal Learning and Spoken Language’, International Journal of Psychology, 51: 825.
Key F. M. et al. (2016) ‘Human Adaptation and Population Differentiation in the Light of Ancient Genomes’, Nature Communications, 7: 10775–82.
Kornilov S. A. (2016) ‘Genome-wide Association and Exome Sequencing Study of Language Disorder in an Isolated Population’, Pediatrics, 137: e20152469.
Lieberman P. (2016) ‘The Evolution of Language and Thought’, Journal of Anthropological Science, 94: 1–20.
Llamas B. et al. (2017) ‘Human Evolution: A Tale from Ancient Genomes’, Philosophical Transactions of Royal Society B, 372: 20150484.
Luciano M. et al. (2013) ‘A Genome-wide Association Study for Reading and Language Abilities in Two Population Cohorts’, Genes, Brain and Behavior, 12/6: 645–52.
Mathieson I. et al. (2015) ‘Genome-wide Patterns of Selection in 230 Ancient Eurasians’, Nature, 528: 499–503.
Mozzi A. et al. (2016) ‘The Evolutionary History of Genes Involved in Spoken and Written Language: Beyond FOXP2’, Scientific Reports, 6: 22157–65.
Nudel R. et al. (2014) ‘Genome-wide Association Analyses of Child Genotype Effects and Parent-of-Origin Effects in Specific Language Impairment’, Genes, Brain and Behavior, 13/4: 418–29.
Nuttle X. et al. (2016) ‘Emergence of a Homo Sapiens-specific Gene Family and Chromosome 16p11.2 CNV Susceptibility’, Nature, 536: 205–10.
Olalde I. et al. (2014) ‘Derived Immune and Ancestral Pigmentation Alleles in a 7,000-Year-Old Mesolithic European’, Nature, 507: 225–8.
O'Roak B. J. et al. (2011) ‘Exome Sequencing in Sporadic Autism Spectrum Disorders Identifies Severe de novo Mutations’, Nature Genetics, 43/6: 585–9.
Prüfer K. et al. (2014) ‘The Complete Genome Sequence of a Neanderthal from the Altai Mountains’, Nature, 505/7481: 43–9.
Rasmussen M. et al. (2010) ‘Ancient Human Genome Sequence of an Extinct Palaeo-Eskimo’, Nature, 463/7282: 757–62.
Reich D. et al. (2010) ‘Genetic History of an Archaic Hominin Group from Denisova Cave, Siberia’, Nature, 468: 1053–60.
Reuter M. et al. (2017) ‘FOXP2 Variants in 14 Individuals with Developmental Speech and Language Disorders Broaden the Mutational and Clinical Spectrum’, Journal of Medical Genetics, 54: 64–72.
Roeske D. et al. (2011) ‘First Genome-wide Association Scan on Neurophysiological Endophenotypes Points to Trans-Regulation Effects on SLC2A3 in Dyslexic Children’, Molecular Psychiatry, 16: 97–107.
Scott-Phillips T. C. (2017) ‘Pragmatics and the Aims of Language Evolution’, Psychonomic Bulletin & Review, 24: 1–4.
Senghas A., Kita S., Özyürek A. (2005) ‘Children Creating Core Properties of Language: Evidence from an Emerging Sign Language in Nicaragua’, Science, 305: 1779–82.
Slatkin M., Racimo F. (2016) ‘Ancient DNA and Human History’, Proceedings of the National Academy of Sciences, 113: 6380–7.
St Pourcain B. et al. (2013) ‘Common Variation Contributes to the Genetic Architecture of Social Communication Traits’, Molecular Autism, 4: 34.
St Pourcain B. (2014a) ‘Common Variation near ROBO2 is Associated with Expressive Vocabulary in Infancy’, Nature Communications, 5: 4831.
St Pourcain B. (2014b) ‘Variability in the Common Genetic Architecture of Social-Communication Spectrum Phenotypes during Childhood and Adolescence’, Molecular Autism, 5: 18.
Tattersall I. (2012) Masters of the Planet: The Search for Our Human Origins. New York: Palgrave Macmillan.
Tattersall I. (2017) ‘How can We Detect when Language Emerged?’, Psychonomic Bulletin & Review, 24: 64–7.
Villanueva P. et al. (2011) ‘Genome-wide Analysis of Genetic Susceptibility to Language Impairment in an Isolated Chilean Population’, European Journal of Human Genetics, 19/6: 687–95.
Whalen A., Griffiths T. L. (2017) ‘Adding Population Structure to Models of Language Evolution by Iterated Learning’, Journal of Mathematical Psychology, 76/2017: 1–6.
Wilde S. et al. (2014) ‘Direct Evidence for Positive Selection of Skin, Hair, and Eye Pigmentation in Europeans during the Last 5,000 y’, Proceedings of the National Academy of Sciences USA, 111: 4832–7.

© The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

Journal: Journal of Language Evolution, Oxford University Press

Published: Jan 1, 2018
