The genomic landscape of language: Insights into evolution

Abstract

Studies of severe, monogenic forms of language disorders have revealed important insights into the mechanisms that underpin language development and evolution. It is clear that monogenic mutations in genes such as FOXP2 and CNTNAP2 account for only a small proportion of the language disorders seen in children, and the genetic basis of language in modern humans is highly complex and poorly understood. In this review, we examine why we understand so little of the genetic landscape of language disorders, and how the genetic background of an individual greatly affects the way in which a genetic change is expressed. We discuss how the underlying genetics of language disorders has informed our understanding of language evolution, and how recent advances may provide a clearer picture of language capacity in ancient hominins.

1. Introduction

The ease with which most children acquire their native language has led researchers to propose that language acquisition is innate (Chomsky 1998), and to suggest that this reflects a genetically determined language-specific module (Pinker 1994). Others argue that it simply reflects higher order processing in humans and is facilitated by their existing cognitive skills (Locke 1836). Major questions remain as to the evolutionary and genetic mechanisms that underpin these proposed models: did language evolution rely upon a small number of ‘big-hit’ mutations which rapidly changed cognition, or did it proceed through a series of small-step changes in which many variants accumulated slowly over thousands of years? Did ancient hominins have the cognitive ability to use some form of language? The study of genetic variation that underpins language ability in modern humans can provide insights into how higher language function evolved in our ancient ancestors.
The application of next-generation sequencing technology means that we are now able to generate a near-complete picture of genetic variation with relative ease. The discovery of genetic variants associated with language disorders leads to the identification of the genes and molecular pathways necessary for the successful acquisition of language. Genetic studies of modern humans, therefore, have direct relevance to the study of how language evolved in our ancestors. Discussions of the evolution of language in fields outside of genetics still tend to consider ‘a gene for language’ as the principal driver of language evolution. While the consideration of single variants and genes has provided important insights, the field of human genetics has moved on. Here, we argue that in order to understand language evolution, we first need to consider the full genetic landscape in modern humans, then use this to inform our understanding of the forces that shaped language evolution in ancient hominins.

2. Language disorders

When considering which genetic pathways contribute to language, researchers often choose to study the extremes of language ability, most often cases in which a person’s ability to speak is severely impaired. So far, the greatest insights into the molecular biology of language have come from studying the genetics of families and individuals with persistent language disorders. A recent study found that over 7% of British children (n = 12,000, Surrey) at school entry had impaired language, either as part of a complex developmental disorder such as autism spectrum disorder (ASD), developmental delay or intellectual disability, or as a primary language disorder with no other explanatory features (Norbury et al. 2016). Previous, smaller studies of English-speaking children reported similar rates (Tomblin et al. 1997; Shriberg et al. 1999). In real terms, this means that a staggering three children in every class have a language disorder (Norbury et al. 2016).
Age-appropriate language acquisition is so important to a child’s development that receptive language ability at age 3 years is a predictor of an individual’s future economic burden (Caspi et al. 2016). Despite educational intervention, over half of children with language disorders have lasting difficulties with language throughout their childhood (Hulme and Snowling 2009). This means that a child who struggles to understand or produce language, even from an early age, has an increased risk of behavioural disorders, unemployment, and mental health issues later in life (Conti-Ramsden and Botting 2008). This importance is clearly demonstrated in a recent systematic review, which found a consistent, strong association between youth offending and language disorders (Anderson et al. 2016).
From a genetics point of view, it is of particular interest when language disorder occurs in isolation (so-called primary language disorder), with no other features such as autism spectrum disorder or developmental delay that may confound difficulties with language. Primary language disorders may represent domain-independent deficits and therefore provide an excellent opportunity to study the genetics that underpin speech. Two such primary language disorders are childhood apraxia of speech (CAS, OMIM #602081; previously called developmental verbal dyspraxia) and developmental language disorder (DLD; also known as specific language impairment, SLI; OMIM %606711, %606712, %607134, %612514). Although both conditions are primary language disorders, they are proposed to arise from different obstacles in language production pathways. CAS is primarily a motoric difficulty in which the brain cannot coordinate the fine muscles controlling the tongue, lips and mouth that are required to produce speech (Shriberg et al. 1999).
DLDs involve a persistent difficulty with more generalised aspects of speech and language, in the absence of any other explanatory medical condition such as hearing difficulties or developmental delay (Bishop et al. 2017). The diagnostic guidelines for DLDs are therefore less stringent than those for CAS and, accordingly, DLDs are an extremely common childhood developmental issue that can persist throughout the child’s life. In this review, we will focus on the primary language disorders DLD and CAS. There is little doubt as to the impact of language disorders on children but, despite their frequency and impact on society, we still understand little of the underlying neurobiology. It is clear that the risk of speech and language disorder is increased if a parent or sibling has a speech disorder (Stromswold 1998). Many studies indicate that language ability is highly heritable, and that genetic factors play a role in this familiality (Stromswold 1998; Bishop et al. 2006; Barry et al. 2007). The identification of genetic variants or risk factors for DLDs may explain why some children struggle with language acquisition. It may also help explain why language ability is so often affected in related disorders such as ASD, developmental dyslexia, intellectual learning disability or attention deficit hyperactivity disorder (ADHD), and help tease apart the phenotypic overlaps between these highly related disorders. Assuming that language impairments are at one end of a continuum of language ability, genetic studies are providing a better understanding of the molecular pathways that are important in language acquisition.

3. Genes involved in disorders of language development

When a language disorder recurs within multiple generations of a family, we often assume a strong genetic contribution. Such families have therefore traditionally been the obvious place to start when studying genetic inheritance.
The principal insights into the genetics of DLDs have come from such family studies, and several genes have been identified using genetic linkage and candidate gene sequencing in related family members (Table 1). These genes were often identified from single families or a number of related individuals, using genetic linkage to look for regions of the genome shared by language-impaired family members, or by testing for genetic association in large numbers of unrelated individuals with a similar phenotype (Table 1). Genetic linkage and association approaches have traditionally been the mainstay of neurodevelopmental genetics, with much success.

Table 1. Major genes implicated in language disorders, and associated overlapping phenotypes. The table shows genes identified through association or linkage with language disorders, and does not include a thorough review of other phenotypes (dyslexia, ASD, etc.). An asterisk indicates that the gene has been reported as monogenic.

Gene | Associated disorder(s) | Key references
ABCC13 | Language disorder | Luciano et al. (2013)
ARHGEF39 | Language disorder | Devanna et al. (2017)
ATP2C2 | Language disorder (short term memory) | Newbury et al. (2009); Smith et al. (2015)
BCL11A | Language disorder (specifically CAS) with expressive language and mild intellectual delay | Peter et al. (2014)
CMIP | Language disorder (short term memory); Language disorder and dyslexia; Dyslexia | Newbury et al. (2009); Scerri et al. (2011)
CNTNAP2 | Language disorder; Autism | Vernes et al. (2008); Arking et al. (2008); Bakkaloglu et al. (2008)
DCDC2 | Dyslexia; Language disorder and dyslexia | Schumacher et al. (2006); Marino et al. (2012); Scerri et al. (2011); Marino et al. (2011); Powers et al. (2013)
ERC1 | Language disorder (CAS) | Thevenon et al. (2013); Chen et al. (2017)
FLNC | Language disorder and reading difficulties | Gialluisi et al. (2014)
FOXP1* | Language disorder and intellectual delay | Horn et al. (2010); Hamdan et al. (2010); Le Fevre et al. (2013); Srivastava et al. (2014); Sollis et al. (2015)
FOXP2* | Language disorder (specifically CAS) | Lai et al. (2001); MacDermot et al. (2005); Tomblin et al. (2009); Turner et al. (2013); Moralli et al. (2015); Reuter et al. (2017)
GRIN2A | Focal epilepsy with speech disorder, with or without mental retardation | Chen et al. (2017); Endele et al. (2010); De Ligt et al. (2012); Carvill et al. (2013)
KIAA0319 | Dyslexia; Language disorder | Scerri et al. (2011); Kirsten et al. (2012); Newbury et al. (2011)
NDST4 | Language disorder | Eicher et al. (2013)
NFXL1 | Language disorder | Villanueva et al. (2015)
NOP9 | Language disorder | Nudel et al. (2014)
RBFOX2 | Language disorder and reading difficulties | Gialluisi et al. (2014)
ROBO1 | Dyslexia; Language disorder and dyslexia | Hannula-Jouppi et al. (2005); Bates et al. (2011)
ROBO2 | Language disorder | St Pourcain et al. (2014)
SETBP1 | Language disorder | Filges et al. (2010); Marseglia et al. (2012); Kornilov et al. (2016)
SRPX2 | Language disorder, rolandic seizures and intellectual delay | Chen et al. (2017); Roll et al. (2006)
TM4SF20* | Language disorder | Wiszniewski et al. (2013)

The most successful study in this field, to date, has been the identification of an arginine to histidine mutation at amino acid position 553 (denoted as p.R553H) in the FOXP2 gene, identified in a large, multigenerational family known as the KE family. Family members who carry this mutation have the CAS phenotype (Lai et al. 2001). In genetic terminology, the p.R553H change is a dominant, fully penetrant mutation: one mutated copy of the gene is enough to result in the disorder. Fully penetrant cases are rare and presumably differ from more ‘typical’ cases of DLD, where no single genetic change can be directly correlated with the disorder.
While FOXP2 remains the most studied and best characterised gene implicated in speech, mutations in FOXP2 account for only about 2% of CAS cases (Worthey et al. 2013), and as such, causative mutations in FOXP2 are still considered a rare cause of language disorders. FOXP2, dubbed a ‘molecular window’ into speech and language development, has been a launch pad for the identification of other genes and mechanisms involved in language [for example, CNTNAP2 (Vernes et al. 2008), as described below]. On its discovery, FOXP2 was hailed by the media as the ‘speech gene’, suggesting that this single protein is responsible for language development in humans. This headline tag is an overly simplistic interpretation, which has endured in fields outside of genetics and language biology. More recently, investigation into the molecular function of FOXP2 has slowly built a more detailed picture of its role in language development (Dediu and Christiansen 2016; Fitch 2017; Fisher 2017). The literature is clear: FOXP2 is not the sole explanatory factor for the presence of language. There are very few instances of monogenic inheritance, where the absence of a protein leads directly to language disorder. In Table 1, only FOXP2, FOXP1, and TM4SF20 have been described as monogenic drivers of language disorders. The remainder of the identified genes instead confer risk of language disorder through genetic variations that subtly alter the way in which genes and proteins work. The majority of genes have been implicated in language disorders through association with language-related phenotypes obtained from cohort studies. In contrast to FOXP2, where a mutation explains the observed language difficulties (monogenic model), these genes tend to play a role within a complex genetic model. Carrying a risk variant within these genes confers a ‘susceptibility’ to developing language disorder; however, this susceptibility remains difficult to quantify and is poorly understood.
Nonetheless, the study of cases and their families has provided an important window into the underlying mechanisms of language disorders. At present, FOXP2 and FOXP1 remain the best characterised of the genes implicated in language disorders. Clinical diagnosis of the underlying molecular cause of a language disorder is not usually possible, unless the causative mutation is within FOXP2, FOXP1, or TM4SF20. Mutations in these genes are rare, and therefore the majority of language disorder cases are unlikely to have an underlying molecular cause identified. Large-scale genome sequencing projects such as the 1000 Genomes Project (1000 Genomes Project Consortium 2015) and ExAC (Lek et al. 2016) have created a major shift in how we perceive human genetic variation and its contribution to disease. We have understood for decades that monogenic disorders usually involve rare mutations which impact upon the function of the protein. Such mutations usually lead to non-functional proteins, which manifest in a disease phenotype. Access to large numbers of control genomes through 1000 Genomes and ExAC has enabled us to more accurately identify and assess genetic risk factors, which tend to be more common in the population but may confer only a modest risk of developing a phenotype. These databases also provide unprecedented power to inform our understanding of gene function in modern humans and, by proxy, our ancestors. It is well established that Neanderthals and Denisovans shared the ‘humanised’ version of FOXP2, which differs from ancestral FOXP2 at two positions: chromosome 7, base-pair 114,282,597 (denoted as chr7:114,282,597), resulting in an asparagine rather than the ancestral threonine at amino acid position 303 (denoted as p.N303), and chromosome 7, base-pair 114,282,663 (denoted as chr7:114,282,663), resulting in a serine rather than the ancestral asparagine at amino acid position 325 (denoted as p.S325) (variant 1, hg19) (Krause et al. 2007).
This important finding gave rise to the idea that Neanderthals may have had a sophisticated level of cognitive processing to support some form of language (Krause et al. 2007). Interestingly, the significance of the ‘humanised’ FOXP2 amino acid at position 325 is somewhat called into question by the presence of two apparently healthy controls in the ExAC database. These two individuals carry one copy (heterozygous) of a T > G change at a neighbouring position (chr7:114,282,664), resulting in a serine to asparagine change (p.S325N) that essentially reverts the amino acid sequence to the ancestral form. This change is extremely rare (allele frequency = 0.00001648) and is seen in only 2 of more than 60,000 individuals, but it poses the question: did these apparently healthy individuals have language difficulties? Although ExAC participants were not specifically screened for cognitive function or language ability, it is unlikely that they had an overt phenotype, as this would have excluded them from the study. This presents an interesting line of thought: if these two amino acids define the hominin form of FOXP2, then there are at least two functioning humans who do not have a fully ‘humanised’ version of FOXP2. The presence of a non-human FOXP2 amino acid change in these two healthy individuals shows the power of these databases to identify extremely rare variants, here one with an allele frequency of roughly 0.0016%. It provides a more accurate snapshot of human variation with which we can more effectively predict which variants are likely to be important. Even in monogenic disorders, when it is clear that the trait is directly caused by a dominant mutation, we still observe a high degree of variability between individuals (incomplete penetrance). Such phenotypic variability is present even within the KE family, who have a ‘fully’ penetrant dominant FOXP2 mutation with a clear-cut phenotype (Lai et al. 2001; Watkins et al. 2002).
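As a rough check on the ExAC figures quoted above for the p.S325N change, the relationship between allele count, allele frequency and carrier fraction can be sketched in Python. This is only an illustrative calculation: the variable names are our own, the total number of sequenced alleles is not stated in the text and is back-calculated here from the reported frequency, and both carriers are assumed to be heterozygous, as described.

```python
# Figures taken from the text: two heterozygous carriers, and the
# reported allele frequency of the p.S325N change in ExAC.
reported_allele_frequency = 0.00001648
allele_count = 2  # each heterozygous carrier contributes one variant allele

# ASSUMPTION: the total allele number is not given in the text, so it is
# inferred from frequency = allele_count / allele_number.
implied_allele_number = round(allele_count / reported_allele_frequency)

# Each diploid individual contributes two alleles at an autosomal site.
implied_individuals = implied_allele_number / 2

# With heterozygous carriers only, the number of carriers equals the allele count.
carrier_fraction = allele_count / implied_individuals

print(f"Implied alleles sequenced at this site: {implied_allele_number}")
print(f"Implied individuals: {implied_individuals:.0f}")
print(f"Allele frequency: {reported_allele_frequency * 100:.4f}% of alleles")
print(f"Carrier fraction: {carrier_fraction * 100:.4f}% of individuals")
```

Under these assumptions the reported frequency implies roughly 121,000 sequenced alleles, which is consistent with ‘more than 60,000 individuals’. It also shows that, for heterozygous carriers, the fraction of individuals carrying the variant is twice the allele frequency.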
It is widely reported that some individuals of the KE family present with non-verbal difficulties. The performance IQ scores of five affected KE family members vary widely: one affected male (age 10 years) scored 112, whereas a second affected 10-year-old male scored 66. These individuals carry the p.R553H mutation, which explains their CAS phenotype, but the differences in performance IQ are likely due to genetic modifiers and not directly related to FOXP2. For the majority of language disorder loci discovered to date, it is likely that they explain only part of the risk, and that modifiers and additional variants have yet to be identified. We are only just beginning to understand the actions of modifiers and risk factors, but this concept underlies a shift away from the traditional genetic model, in which phenotypes are truly dominant or recessive. Instead, we now understand the importance of considering all variation against a genetic background.

4. Complex inheritance and genetic risk

Familial studies are a proven method for identifying contributory genes, but molecular genetics is increasingly focussing on the role of modifiers and risk factors in DLDs. The majority of the genes listed in Table 1 that have been associated with language disorders fall into this category. An example is an asparagine to lysine change at amino acid position 150 (denoted as p.N150K) in the NFXL1 gene. This variant (rs144169475), identified by sequencing five affected Islanders, was found to be associated with language impairment on Robinson Crusoe Island, which is home to an isolated Chilean population with an exceptionally high rate of language disorders (Villanueva et al. 2015). This variant likely forms a key part of a complex inheritance model in which a single variant explains only part of the DLD risk.
The variant is seen in 4.1% of South American control genomes, and is therefore considered common in Latin America, suggesting that it may confer susceptibility to DLD when inherited in combination with other variants that are yet to be identified. The study of complex genetic factors is primarily performed using large numbers of unrelated cases specifically selected to have a high degree of phenotypic similarity. Large-scale genome-wide association studies (GWAS) with several thousand participants may be able to identify common risk variants involved in DLDs; however, a large-scale study of this nature has not yet been attempted. A recent GWAS into the genetic basis of schizophrenia successfully identified more than 100 associated loci using 37,000 schizophrenia patients and 113,000 controls (Schizophrenia Working Group of the Psychiatric Genomics Consortium 2014). The application of these methods to clinical traits such as schizophrenia has shown that enormous sample sizes are required to enable the consistent replication of associated loci. A major limiting factor in performing a large-scale GWAS for language disorders remains the systematic phenotyping of enough participants to gain the statistical power required to detect contributory variants. This challenge is common to most large complex genetics studies, but is particularly pronounced in the field of language disorders, where there is little consensus on what constitutes a speech and language disorder, or how it should be diagnosed and classified. A recent report by the CATALISE consortium aims to establish exactly such a consensus (Bishop et al. 2017). Even the terminology used to describe language disorders and DLDs required standardisation across disciplines, and although these are the current approved terms, they are taking time to become standard in research and education. Establishing consistent terminology is the keystone to developing standardised diagnostic criteria.
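To make the point about sample size concrete, the following Python sketch estimates the approximate power of a simple case-control allele-frequency comparison. It is a back-of-envelope illustration rather than a method used in any of the cited studies. Several inputs are assumptions that do not come from the text: the odds ratio of 1.2 is hypothetical, the significance threshold of 5e-8 is the conventional genome-wide value, and the test is a normal approximation for comparing two allele proportions. The 4.1% frequency and the 37,000/113,000 sample sizes are borrowed from the examples above purely for scale.

```python
from math import sqrt
from statistics import NormalDist


def case_allele_frequency(control_freq: float, odds_ratio: float) -> float:
    """Allele frequency expected in cases, given the control frequency
    and a per-allele odds ratio (hypothetical input)."""
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    return odds / (1.0 + odds)


def approximate_power(n_cases: int, n_controls: int, control_freq: float,
                      odds_ratio: float, alpha: float = 5e-8) -> float:
    """Approximate power of a two-sided, two-proportion allelic test.

    ASSUMPTIONS: normal approximation, two alleles per person, and the
    conventional genome-wide threshold alpha = 5e-8 (not from the text).
    The negligible opposite tail is ignored.
    """
    p_control = control_freq
    p_case = case_allele_frequency(control_freq, odds_ratio)
    alleles_cases = 2 * n_cases
    alleles_controls = 2 * n_controls
    standard_error = sqrt(p_case * (1.0 - p_case) / alleles_cases
                          + p_control * (1.0 - p_control) / alleles_controls)
    z_effect = abs(p_case - p_control) / standard_error
    z_critical = -NormalDist().inv_cdf(alpha / 2.0)
    return 1.0 - NormalDist().cdf(z_critical - z_effect)


# Hypothetical variant: 4.1% control allele frequency, odds ratio 1.2.
small_study = approximate_power(1_000, 1_000, 0.041, 1.2)
large_study = approximate_power(37_000, 113_000, 0.041, 1.2)

print(f"Power with 1,000 cases and 1,000 controls: {small_study:.6f}")
print(f"Power with 37,000 cases and 113,000 controls: {large_study:.4f}")
```

Under these assumed inputs, a study with a thousand cases has essentially no chance of detecting such a variant at genome-wide significance, whereas a study on the scale of the schizophrenia GWAS would detect it almost every time.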
Once these definitions are consistent within and across disciplines, a large-scale study could be successfully developed. It would likely lead to the identification of novel pathways and gene networks involved in language production. Table 1 reveals the striking number of genes implicated in DLDs which are also implicated in other, closely related neurodevelopmental disorders. Vernes and colleagues identified an association between variants in the contactin-associated protein-like 2 gene (CNTNAP2) and DLDs through its interaction with the transcription factor FOXP2 (Vernes et al. 2008). Variants in CNTNAP2 are also associated with ASD (Alarcon et al. 2008; Arking et al. 2008), cortical dysplasia focal epilepsy syndrome (OMIM #610042) (Strauss et al. 2006), and Pitt–Hopkins-like syndrome (OMIM #610042) (Zweier et al. 2009). Another example of a gene implicated in language that overlaps with related disorders is ROBO1, which encodes an axon guidance receptor protein. It was first implicated as a candidate gene for dyslexia in a patient with a translocation involving the ROBO1 region (Hannula-Jouppi et al. 2005), and was subsequently found to be associated with short-term memory of words, a key feature of DLD (Bates et al. 2011). Other examples of genes involved in language disorders that overlap with a dyslexia phenotype include DCDC2, KIAA0319, and CMIP (Schumacher et al. 2006; Scerri et al. 2011). This observation suggests that the documented phenotypic overlap between developmental disorders like DLD, ASD, and dyslexia may be driven by shared genetic aetiology. We should note, however, that the level of shared aetiology is hard to ascertain objectively without genome-wide data. Technical and financial limitations mean that many studies of DLDs to date have been limited to candidate genes, leading to substantial ascertainment bias.
The factors that determine how a given genetic variant manifests as one phenotype rather than another are not fully understood, but they are likely to involve interactions between genetic variants. This emphasises the need to consider the genetic background of an individual within any candidate gene analysis. These multiple layers of complexity partly explain why genetic studies have so far struggled to elucidate the genetic basis of many neurodevelopmental disorders.

5. Limitations of current genomic studies

There are a number of reasons why we do not have a better picture of the genetics of speech and language disorders. As discussed above, the majority of studies have used relatively low resolution mapping methods within small sample sizes, with inconsistent characterisation between studies. Recent advances in DNA sequencing technology allow us to generate a more complete picture of genetic variation across the entire genome (whole-genome sequencing) or across all known genes in the genome (whole-exome sequencing). While such technologies afford better resolution and, to some extent, offset these problems, the identification of risk variants, which have only a small effect size, remains difficult. The average human genome contains between 4 and 5 million variants that differ from published reference sequences. Only about 1% of the human genome actually encodes genes, and these gene-encoding regions will contain about 150 coding mutations which result in loss of function of the protein. They will also contain around 10,000 ‘silent’ mutations that fall within genes but do not alter the amino acid sequence. Each person’s genome will contain about 120 novel coding variants which have not previously been reported (1000 Genomes Project Consortium 2015). The vast majority of the variation we see in the human genome is non-coding and does not directly change a protein.
Once we consider that these non-coding changes may have a function affecting gene expression (how much of each protein is made), the list of potential variants becomes vast, and extremely challenging to interpret. Exome sequencing studies investigate just the coding regions of the genes of individuals affected by DLD (Kornilov et al. 2016; Chen et al. 2017) or CAS (Worthey et al. 2013). These preliminary, small-scale investigations confirm the complexity of the underlying genetics in the majority of cases and reinforce the need for larger-scale screening studies. Even though they lie outside of gene sequences, non-coding variants can change gene function, for example by increasing or decreasing expression. It is highly likely that these non-coding variants will be involved in neurodevelopmental disorders. These variants represent a far greater challenge than coding variants. They are often not captured by whole-exome sequencing, meaning that we may simply be missing important mutations. Whole-genome sequencing is becoming more commonly used, but its cost is often prohibitive. Even when these variants are captured, their categorisation is difficult. A recent study demonstrated a role for variations within non-coding regulatory regions in DLD and other neurodevelopmental conditions, underlining the importance of this route of investigation (Devanna et al. 2017). Whole-genome sequencing produces vastly more data, its analysis can be more computationally expensive, and it requires a much greater level of analytical expertise. Since the effects of these variants are often indirect, their characterisation usually involves complex functional validation steps that are challenging to complete for a large number of variants. Genetic studies tend to be performed on European or American cohorts.
Findings in these participants may not be relevant in other populations, as some variants can be more or less common in a different population, and different groups may need their own specific studies if we are to gain a better global understanding. For example, the NFXL1 variant found to increase risk of DLD on Robinson Crusoe Island was found in 4.1% of Latin Americans, but 0% of Europeans (Villanueva et al. 2015). Similarly, investigations of an isolated Russian population have yielded novel loci in relation to DLD (Kornilov et al. 2016). The availability of 1000 Genomes data has improved power to detect variants that differ in allele frequency between populations; however, these data are still limited to relatively small numbers of individuals from a restricted set of countries. Another degree of complexity is added by tissue specificity: while present in the genomic DNA of every cell, some mutations may only have a detectable effect in a specific tissue, at a particular time in development. The function of a gene can vary between cell types and conditions, and many genes have multiple, and often surprisingly different, functions. FOXP2 is highly expressed not only in the brain but also in the lungs and many other tissue types, all of which will carry the mutation at a DNA level (Shu et al. 2007). The brain appears particularly sensitive to this particular change and, as far as we are aware, the lungs of the KE family are unaffected (Lai et al. 2001). It is therefore important to remember that although genomic technologies can give us a window into what is happening in a particular individual, it is far more challenging to predict the cellular context in which a variant will become important.

6. Paleogenetics and language

In evolutionary terms, the window to understand genetic effects on cognitive function and language ability in hominins is even narrower than in modern humans, and must be interpreted with extreme caution.
The humanised version of FOXP2 is thought to have become fixed in the population around 500 KYA, prior to the last shared common ancestor (370–450 KYA) (Green et al. 2010), and the presence of this version in Neanderthals supports the notion of cognitive function sophisticated enough to support language. More recently, a regulatory change in FOXP2 was identified exclusively in modern humans, at a binding site for the transcription factor POU3F2; this change is absent in Neanderthals (Maricic et al. 2013). This suggests that differences in gene regulation and expression may be involved in cognitive function, and that species differences are due to far more than just two variants in a single gene. We must be cautious when interpreting such information, as it is extremely unlikely that these FOXP2 changes are solely responsible for the presence (or absence) of language function, and any observations, modern or otherwise, should consider the entire genetic background (Mozzi et al. 2016). To further complicate the underlying assumptions in evolutionary studies, the small number of Neanderthals sequenced heavily biases the study findings. The difficulty of obtaining ancient DNA of suitable quality for sequencing means that sequenced individuals are not representative of time periods or geographical locations, and come from a small number of sites where preservation conditions were optimal. As discussed above, genome sequence studies clearly illustrate that small numbers of individuals from one or two geographical locations do not represent the entire population. This is the modern genetic equivalent of sequencing one family and assuming that everyone else is the same, which is not genetically plausible. There is not enough available population data to be able to accurately predict genetic effects, particularly with respect to complex cognitive processes like language function.
Paleogenetics researchers are slowly building a broader and more accurate picture of ancient hominin genetics by sequencing larger numbers of individuals from a range of geographical locations. A larger sample size will greatly improve statistical power, and increase confidence in the implications of findings for language and higher cognitive function. Genes that are implicated in language disorders in modern humans can inform investigation of language in ancient hominins, and there have been several efforts to investigate the impact of language-associated genes more broadly (Mozzi et al. 2016). Through the expansion of genetic technologies and a greater understanding of their applications and limitations, we will continue to build a more accurate picture of both modern and ancient language cognition slowly, piece by piece, applying the scientific rigour and multiple lines of evidence of molecular biology.
7. Discussion
The study of language disorders has been fruitful in implicating genes, and subsequently molecular pathways, that are involved in the mechanisms of language. While there have been many exciting discoveries spanning the past two decades, there remains much more to understand. We still do not fully understand the underlying causes of DLDs, or what makes some children more susceptible. Family studies can still provide novel insights into the underlying mechanisms of DLDs. There is strong potential for using a familial shared genetics-based approach, particularly when combined with recent advances in sequencing technologies that can investigate more of the genome than ever before. We increasingly recognise that genetic risk plays a key role in language disorders, and many current approaches are investigating a genetic background of susceptibility. To be statistically sound, these studies require much larger sample sizes and more consistently phenotyped datasets to generate sufficient statistical power.
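The dependence of statistical power on sample size can be sketched with a short calculation. The Python example below uses a textbook normal approximation for a two-sided comparison of allele frequencies between cases and controls. It is a simplified illustration rather than the method of any study cited here: the function name, the frequencies, and the group sizes are hypothetical, and real association studies must also correct for multiple testing and population structure.

```python
from statistics import NormalDist


def approx_power(p_cases: float, p_controls: float, n_per_group: int,
                 alpha: float = 0.05) -> float:
    """Approximate power of a two-sided two-proportion z-test.

    p_cases and p_controls are allele frequencies strictly between 0 and 1;
    n_per_group is the number of chromosomes sampled in each group.
    Uses the normal approximation and ignores the small probability of
    rejecting the null hypothesis in the wrong direction.
    """
    for p in (p_cases, p_controls):
        if not 0.0 < p < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
    if n_per_group <= 0:
        raise ValueError("n_per_group must be positive")
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    # Standard error of the difference between two independent proportions.
    variance = (p_cases * (1.0 - p_cases)
                + p_controls * (1.0 - p_controls)) / n_per_group
    z_effect = abs(p_cases - p_controls) / variance ** 0.5
    return std_normal.cdf(z_effect - z_crit)


if __name__ == "__main__":
    # Hypothetical risk allele: 6% of case chromosomes, 4% of control chromosomes.
    for n in (200, 1000, 5000, 20000):
        print(f"n per group = {n:>5}  approximate power = "
              f"{approx_power(0.06, 0.04, n):.2f}")
```

With a modest frequency difference of this kind, the approximation gives low power for a few hundred chromosomes per group and high power only for many thousands, which is consistent with the call above for much larger and more consistently phenotyped cohorts.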
The reality is that DLDs are likely to involve some high-impact rare mutations, genetic rearrangements and common sequence variations, all of which create a background of susceptibility. Family-based and association studies are still uncovering some unlikely pathways which play a role in language disorders, and it is clear that it will not be a simple story. The idea that a single gene has a distinct role or confers a single trait is an outdated concept. Similarly, the idea that a gene will have a single role in the cell has been dispelled. We understand that non-coding variants can play a crucial role in gene regulation, and are highly likely to have an important function in DLDs and other neurodevelopmental disorders. The genetic background and the regulation of gene expression and function are dynamic, and depend greatly on individual cell types. While this is still poorly understood, methods for detecting and experimentally validating such context-dependent states are in development. The function of a gene in a particular cellular circumstance can, and will, be validated by molecular biology in model animal or cellular systems. Genetic control is no longer beyond our testing capability, and we have a range of technologies to characterise gene function and expression across different cell types and under different conditions.
Environmental factors clearly play a role in language development, and poor life circumstances may impact the DLD phenotype. Nature versus nurture is a falsely binary concept, and the underlying genetics plays a key role within an environmental (nurture) context.
The theory that the presence of the ‘humanised’ FOXP2 gene in Neanderthals drove language ability is naive and overly simplistic. FOXP2 clearly plays an important role in speech evolution and production; however, we must be cautious to avoid making over-inflated statements about language in Neanderthals based on a single gene (Fitch 2017).
We are only just beginning to unravel the highly complex developmental processes that underlie speech in modern humans, and should be extremely cautious in extrapolating any findings to ancient hominins. The identification of risk factors for DLDs in modern humans will inform our understanding of the capacity for language in ancient hominins. We may be able to build a far clearer picture of how language evolved once we increase our understanding of the neuromolecular pathways involved in language development in modern humans.
References
1000 Genomes Project Consortium (2015) ‘A Global Reference for Human Genetic Variation’, Nature, 526: 68–74.
Alarcon M. et al. (2008) ‘Linkage, Association, and Gene-expression Analyses Identify CNTNAP2 as an Autism-susceptibility Gene’, The American Journal of Human Genetics, 82: 150–9.
Anderson S. A., Hawes D. J., Snow P. C. (2016) ‘Language Impairments Among Youth Offenders: A Systematic Review’, Children and Youth Services Review, 65: 195–203.
Arking D. E. et al. (2008) ‘A Common Genetic Variant in the Neurexin Superfamily Member CNTNAP2 Increases Familial Risk of Autism’, The American Journal of Human Genetics, 82: 160–4.
Bakkaloglu B. et al. (2008) ‘Molecular Cytogenetic Analysis and Resequencing of Contactin Associated Protein-like 2 in Autism Spectrum Disorders’, The American Journal of Human Genetics, 82: 165–73.
Barry J. G., Yasin I., Bishop D. V. (2007) ‘Heritable Risk Factors Associated with Language Impairments’, Genes, Brain and Behavior, 6: 66–76.
Bates T. C. et al. (2011) ‘Genetic Variance in a Component of the Language Acquisition Device: ROBO1 Polymorphisms Associated with Phonological Buffer Deficits’, Behavior Genetics, 41: 50–7.
Bishop D. V. M. et al. (2017) ‘Phase 2 of CATALISE: A Multinational and Multidisciplinary Delphi Consensus Study of Problems with Language Development: Terminology’, Journal of Child Psychology and Psychiatry, 58: 1068–80.
Bishop D. V., Adams C. V., Norbury C. F. (2006) ‘Distinct Genetic Influences on Grammar and Phonological Short-term Memory Deficits: Evidence from 6-year-old Twins’, Genes, Brain and Behavior, 5: 158–69.
Carvill G. L. et al. (2013) ‘GRIN2A Mutations Cause Epilepsy-aphasia Spectrum Disorders’, Nature Genetics, 45: 1073–6.
Caspi A. et al. (2016) ‘Childhood Forecasting of a Small Segment of the Population with Large Economic Burden’, Nature Human Behaviour, 1: 0005.
Chen X. S. et al. (2017) ‘Next-generation DNA Sequencing Identifies Novel Gene Variants and Pathways Involved in Specific Language Impairment’, Scientific Reports, 7: 46105.
Chomsky N. (1998) ‘On the Nature, Use and Acquisition of Language’. In: Toribio J., Clark A. (eds), Language and Meaning in Cognitive Science: Cognitive Issues and Semantic Theory, pp. 1–20. Taylor and Francis: New York and London.
Conti-Ramsden G., Botting N. (2008) ‘Emotional Health in Adolescents with and without a History of Specific Language Impairment (SLI)’, Journal of Child Psychology and Psychiatry, 49: 516–25.
De Ligt J. et al. (2012) ‘Diagnostic Exome Sequencing in Persons with Severe Intellectual Disability’, The New England Journal of Medicine, 367: 1921–9.
Dediu D., Christiansen M. H. (2016) ‘Language Evolution: Constraints and Opportunities From Modern Genetics’, Topics in Cognitive Science, 8: 361–70.
Devanna P. et al. (2017) ‘Next-gen Sequencing Identifies Non-coding Variation Disrupting miRNA-Binding Sites in Neurological Disorders’, Molecular Psychiatry, 1–10.
Eicher J. D. et al. (2013) ‘Genome-wide Association Study of Shared Components of Reading Disability and Language Impairment’, Genes, Brain and Behavior, 12: 792–801.
Endele S. et al. (2010) ‘Mutations in GRIN2A and GRIN2B Encoding Regulatory Subunits of NMDA Receptors Cause Variable Neurodevelopmental Phenotypes’, Nature Genetics, 42: 1021–6.
Filges I. et al. (2010) ‘Reduced Expression by SETBP1 Haploinsufficiency Causes Developmental and Expressive Language Delay Indicating a Phenotype Distinct from Schinzel–Giedion Syndrome’, Journal of Medical Genetics, 48: 117–22.
Fisher S. E. (2017) ‘Evolution of Language: Lessons from the Genome’, Psychonomic Bulletin & Review, 24: 34–40.
Fitch W. T. (2017) ‘Empirical Approaches to the Study of Language Evolution’, Psychonomic Bulletin & Review, 24: 3–33.
Gialluisi A. et al. (2014) ‘Genome-wide Screening for DNA Variants Associated with Reading and Language Traits’, Genes, Brain and Behavior, 13: 686–701.
Green R. E. et al. (2010) ‘A Draft Sequence of the Neandertal Genome’, Science, 328: 710–22.
Hamdan F. F. et al. (2010) ‘De Novo Mutations in FOXP1 in Cases with Intellectual Disability, Autism, and Language Impairment’, The American Journal of Human Genetics, 87: 671–8.
Hannula-Jouppi K. et al. (2005) ‘The Axon Guidance Receptor Gene ROBO1 is a Candidate Gene for Developmental Dyslexia’, PLoS Genetics, 1: e50.
Horn D. et al. (2010) ‘Identification of FOXP1 Deletions in Three Unrelated Patients with Mental Retardation and Significant Speech and Language Deficits’, Human Mutation, 31: E1851–60.
Hulme C., Snowling M. J. (2009) Developmental Disorders of Language Learning and Cognition. West Sussex, UK: John Wiley & Sons.
Kirsten H. et al. (2012) ‘Association Study of a Functional Genetic Variant in KIAA0319 in German Dyslexics’, Psychiatric Genetics, 22: 216–7.
Kornilov S. A. et al. (2016) ‘Genome-Wide Association and Exome Sequencing Study of Language Disorder in an Isolated Population’, Pediatrics, 137.
Krause J. et al. (2007) ‘The Derived FOXP2 Variant of Modern Humans was Shared with Neandertals’, Current Biology, 17: 1908–12.
Lai C. S. et al. (2001) ‘A Forkhead-domain Gene is Mutated in a Severe Speech and Language Disorder’, Nature, 413: 519–23.
Le Fevre A. K. et al. (2013) ‘FOXP1 Mutations Cause Intellectual Disability and a Recognizable Phenotype’, American Journal of Medical Genetics Part A, 161: 3166–75.
Lek M. et al. (2016) ‘Analysis of Protein-coding Genetic Variation in 60,706 Humans’, Nature, 536: 285–91.
Locke J. (1836) An Essay Concerning Human Understanding. London, UK: T. Tegg and Son.
Luciano M. et al. (2013) ‘A Genome-wide Association Study for Reading and Language Abilities in Two Population Cohorts’, Genes, Brain and Behavior, 12: 645–52.
MacDermot K. D. et al. (2005) ‘Identification of FOXP2 Truncation as a Novel Cause of Developmental Speech and Language Deficits’, The American Journal of Human Genetics, 76: 1074–80.
Maricic T. et al. (2013) ‘A Recent Evolutionary Change Affects a Regulatory Element in the Human FOXP2 Gene’, Molecular Biology and Evolution, 30: 844–52.
Marino C. et al. (2011) ‘Pleiotropic Effects of DCDC2 and DYX1C1 Genes on Language and Mathematics Traits in Nuclear Families of Developmental Dyslexia’, Behavior Genetics, 41: 67–76.
Marino C. et al. (2012) ‘DCDC2 Genetic Variants and Susceptibility to Developmental Dyslexia’, Psychiatric Genetics, 22: 25.
Marseglia G. et al. (2012) ‘372 kb Microdeletion in 18q12.3 Causing SETBP1 Haploinsufficiency Associated with Mild Mental Retardation and Expressive Speech Impairment’, European Journal of Medical Genetics, 55: 216–21.
Moralli D. et al. (2015) ‘Language Impairment in a Case of a Complex Chromosomal Rearrangement with a Breakpoint Downstream of FOXP2’, Molecular Cytogenetics, 8: 36.
Mozzi A. et al. (2016) ‘The Evolutionary History of Genes Involved in Spoken and Written Language: Beyond FOXP2’, Scientific Reports, 6: 22157.
Newbury D. F. et al. (2009) ‘CMIP and ATP2C2 Modulate Phonological Short-term Memory in Language Impairment’, The American Journal of Human Genetics, 85: 264–72.
Newbury D. F. et al. (2011) ‘Investigation of Dyslexia and SLI Risk Variants in Reading- and Language-impaired Subjects’, Behavior Genetics, 41: 90–104.
Norbury C. F. et al. (2016) ‘The Impact of Nonverbal Ability on Prevalence and Clinical Presentation of Language Disorder: Evidence from a Population Study’, Journal of Child Psychology and Psychiatry, 57: 1247–57.
Nudel R. et al. (2014) ‘Genome-wide Association Analyses of Child Genotype Effects and Parent-of-origin Effects in Specific Language Impairment’, Genes, Brain and Behavior, 13: 418–29.
Peter B. et al. (2014) ‘De Novo Microdeletion of BCL11A is Associated with Severe Speech Sound Disorder’, American Journal of Medical Genetics Part A, 164A: 2091–6.
Pinker S. (1994) The Language Instinct (1994/2007). New York, NY: Harper Perennial Modern Classics.
Powers N. R. et al. (2013) ‘Alleles of a Polymorphic ETV6 Binding Site in DCDC2 Confer Risk of Reading and Language Impairment’, The American Journal of Human Genetics, 93: 19–28.
Reuter M. S. et al. (2017) ‘FOXP2 Variants in 14 Individuals with Developmental Speech and Language Disorders Broaden the Mutational and Clinical Spectrum’, Journal of Medical Genetics, 54: 64–72.
Roll P. et al. (2006) ‘SRPX2 Mutations in Disorders of Language Cortex and Cognition’, Human Molecular Genetics, 15: 1195–207.
Scerri T. S. et al. (2011) ‘DCDC2, KIAA0319 and CMIP are Associated with Reading-related Traits’, Biological Psychiatry, 70: 237–45.
Schizophrenia Working Group of the Psychiatric Genomics Consortium (2014) ‘Biological Insights from 108 Schizophrenia-associated Genetic Loci’, Nature, 511: 421–7.
Schumacher J. et al. (2006) ‘Strong Genetic Evidence of DCDC2 as a Susceptibility Gene for Dyslexia’, The American Journal of Human Genetics, 78: 52–62.
Shriberg L. D., Tomblin J. B., McSweeny J. L. (1999) ‘Prevalence of Speech Delay in 6-year-old Children and Comorbidity with Language Impairment’, Journal of Speech, Language, and Hearing Research, 42: 1461–81.
Shu W. et al. (2007) ‘Foxp2 and Foxp1 Cooperatively Regulate Lung and Esophagus Development’, Development, 134: 1991–2000.
Smith A. W. et al. (2015) ‘Deletion of 16q24.1 Supports a Role for the ATP2C2 Gene in Specific Language Impairment’, Journal of Child Neurology, 30: 517–21.
Sollis E. et al. (2015) ‘Identification and Functional Characterization of De novo FOXP1 Variants Provides Novel Insights into the Etiology of Neurodevelopmental Disorder’, Human Molecular Genetics, 25: 546–57.
Srivastava S. et al. (2014) ‘Clinical Whole Exome Sequencing in Child Neurology Practice’, Annals of Neurology, 76: 473–83.
St Pourcain B. et al. (2014) ‘Common Variation near ROBO2 is Associated with Expressive Vocabulary in Infancy’, Nature Communications, 5: 4831.
Strauss K. A. et al. (2006) ‘Recessive Symptomatic Focal Epilepsy and Mutant Contactin-associated Protein-like 2’, The New England Journal of Medicine, 354: 1370–7.
Stromswold K. (1998) ‘Genetics of Spoken Language Disorders’, Human Biology, 70: 297–324.
Thevenon J. et al. (2013) ‘12p13.33 Microdeletion Including ELKS/ERC1, a New Locus Associated with Childhood Apraxia of Speech’, European Journal of Human Genetics, 21: 82.
Tomblin J. B. et al. (1997) ‘Prevalence of Specific Language Impairment in Kindergarten Children’, Journal of Speech, Language, and Hearing Research, 40: 1245–60.
Tomblin J. B. et al. (2009) ‘Language Features in a Mother and Daughter of a Chromosome 7;13 Translocation Involving FOXP2’, Journal of Speech, Language, and Hearing Research, 52: 1157–74.
Turner S. J. et al. (2013) ‘Small Intragenic Deletion in FOXP2 Associated with Childhood Apraxia of Speech and Dysarthria’, American Journal of Medical Genetics Part A, 161: 2321–6.
Vernes S. C. et al. (2008) ‘A Functional Genetic Link Between Distinct Developmental Language Disorders’, The New England Journal of Medicine, 359: 2337–45.
Villanueva P. et al. (2015) ‘Exome Sequencing in an Admixed Isolated Population Indicates NFXL1 Variants Confer a Risk for Specific Language Impairment’, PLoS Genetics, 11: e1004925.
Watkins K. E. et al. (2002) ‘Behavioural Analysis of an Inherited Speech and Language Disorder: Comparison with Acquired Aphasia’, Brain, 125: 452–64.
Wiszniewski W. et al. (2013) ‘TM4SF20 Ancestral Deletion and Susceptibility to a Pediatric Disorder of Early Language Delay and Cerebral White Matter Hyperintensities’, The American Journal of Human Genetics, 93: 197–210.
Worthey E. A. et al. (2013) ‘Whole-exome Sequencing Supports Genetic Heterogeneity in Childhood Apraxia of Speech’, Journal of Neurodevelopmental Disorders, 5: 29.
Zweier C. et al. (2009) ‘CNTNAP2 and NRXN1 are Mutated in Autosomal-recessive Pitt-Hopkins-like Mental Retardation and Determine the Level of a Common Synaptic Protein in Drosophila’, The American Journal of Human Genetics, 85: 655–66.
© The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Journal of Language Evolution, Oxford University Press

Publisher: Oxford University Press
ISSN: 2058-4571
eISSN: 2058-458X
DOI: 10.1093/jole/lzx019

Abstract

Abstract Studies of severe, monogenic forms of language disorders have revealed important insights into the mechanisms that underpin language development and evolution. It is clear that monogenic mutations in genes such as FOXP2 and CNTNAP2 only account for a small proportion of language disorders seen in children, and the genetic basis of language in modern humans is highly complex and poorly understood. In this review, we examine why we understand so little of the genetic landscape of language disorders, and how the genetic background of an individual greatly affects the way in which a genetic change is expressed. We discuss how the underlying genetics of language disorders has informed our understanding of language evolution, and how recent advances may obtain a clearer picture of language capacity in ancient hominins. 1. Introduction The ease with which most children acquire their native language has lead researchers to propose that language acquisition is innate (Chomsky 1998), and suggest that this reflects a genetically determined language-specific module (Pinker 1994). Others argue that it simply reflects higher order processing in humans and is facilitated by their existing cognitive skills (Locke 1836). Major questions remain as to the evolutionary and genetic mechanisms that underpin these proposed models; did language evolution rely upon a small number of ‘big-hit’ mutations which rapidly changed cognition, or through a series of small-step changes where many variants were accumulated slowly over thousands of years? Did ancient hominins have the cognitive ability to use some form of language? The study of genetic variation that underpins language ability in modern humans can provide insights into how higher language function evolved in our ancient ancestors. The application of next-generation sequencing technology means that we are now able to generate a near-complete picture of genetic variation with relative ease. 
The discovery of genetic variants associated with language disorders results in the identification of the genes and molecular pathways necessary for the successful acquisition of language. Genetic studies of modern humans, therefore, have direct relevance to the study of how language evolved in our ancestors. Discussion of the evolution of language in fields outside of genetics, still tend to consider ‘a gene for language’ as the principle driver of language evolution. While the consideration of single variants and genes has provided important insights, the field of human genetics has moved on. Here, we argue that in order to understand language evolution, we first need to consider the full genetic landscape in modern humans, then use this to inform our understanding of the forces that shaped language evolution in ancient hominins. 2. Language disorders When considering which genetic pathways contribute to language, researchers often choose to study the extremes of language ability—most often when a person’s ability to speak is severely impaired. So far, the greatest insights into the molecular biology of language have come from studying the genetics of families and individuals with persistent language disorders. A recent study found that over 7% of British children (n = 12,000, Surrey) at school entry had impaired language, either as part of a complex developmental disorder such as autism spectrum disorder (ASD), developmental delay or intellectual disability, or as a primary language disorder with no other explanatory features (Norbury et al. 2016). Previous smaller English-speaking studies concluded similar rates (Tomblin et al. 1997; Shriberg et al. 1999). In real terms, this means that a staggering three children in every class have a language disorder (Norbury et al. 2016). 
Age appropriate language acquisition is so important to a child’s development that receptive language ability at age 3 years is a predictor of an individuals’ future economic burden (Caspi et al. 2016). Despite educational intervention, over half of children with language disorders have lasting difficulties with language throughout their childhood (Hulme and Snowling 2009). This means that a child who struggles to understand or produce language, even from an early age, has an increased risk of behavioural disorders, unemployment, and mental health issues later in life (Conti-Ramsden and Botting 2008). This importance is clearly demonstrated in a recent systematic review which found that there was a consistent strong association between young offenders and language disorders (Anderson et al. 2016).From a genetics point-of-view, it is of particular interest when language disorder occurs in isolation (so-called primary language disorder), with no other features such as autism spectrum disorder or developmental delay that may confound difficulties with language. Primary language disorders may represent domain-independent deficits and therefore provide an excellent opportunity to study the genetics that underpin speech. Two such primary language disorders are childhood apraxia of speech (previously called developmental verbal dyspraxia) (CAS, OMIM #602081) and developmental language disorder (DLD) (also known as specific language impairment) (SLI, OMIM %606711, %606712, %607134, %612514). Although both conditions are primary language disorders, they are proposed to arise from different obstacles in language production pathways. CAS is primarily a motoric difficulty in which the brain cannot coordinate the fine muscles controlling the tongue, lips and mouth that are required to produce speech (Shriberg et al. 1999). 
DLDs are a persistent difficulty with more generalised aspects of speech and language, in the absence of any other explanatory medical condition such as hearing difficulties or developmental delay (Bishop et al. 2017). The diagnostic guidelines for DLDs are therefore less stringent than CAS and, accordingly, DLDs are an extremely common childhood developmental issue that can persist throughout the child’s life. In this review, we will focus on the primary language disorders DLD and CAS. There is little doubt as to the impact of language disorders on children, but despite the frequency and impact on society, we still understand little of the underlying neurobiology. It is clear that the risk of speech and language disorder is increased if a parent or sibling has a speech disorder (Stromswold 1998). Many studies indicate that language ability is highly heritable, and that that genetic factors play a role in this familiality (Stromswold 1998; Bishop et al. 2006; Barry et al. 2007). The identification of genetic variants or risk factors for DLDs may explain why some children struggle with language acquisition. It may also help explain why language ability is so often affected in related disorders such as ASD, developmental dyslexia, intellectual learning disability or attention deficit hyperactivity disorder (ADHD) and tease apart the phenotypic overlaps between these highly related disorders. Assuming that language impairments are at one end of a continuum of language ability, genetic studies are providing a better understanding of the molecular pathways that are important in language acquisition. 3. Genes involved in disorders of language development When a language disorder recurs within multiple generations of a family, we often assume a strong genetic contribution. Such families have therefore traditionally been the obvious place to start when studying genetic inheritance. 
The principal insights into the genetics of DLDs have come from such family studies, and several genes have been identified using genetic linkage and candidate gene sequencing in related family members (Table 1). These genes were often identified from single families or a number of related individuals, using genetic linkage to look for regions of the genome shared by language impaired family members, or by testing for genetic association between large numbers of unrelated individuals with a similar phenotype (Table 1). Genetic linkage and association approaches have traditionally been the mainstay of neurodevelopment genetics, with much success. Table 1. Major genes implicated in language disorders, and associated overlapping phenotypes The table shows genes from association or linkage of language disorders, and does not include a thorough review of other phenotypes (dyslexia, ASD, etc.). Asterisk indicates gene has been reported as monogenic. Gene  Associated disorder(s)  Key references  ABCC13  Language disorder  Luciano et al. (2013)  ARHGEF39  Language disorder  Devanna et al. (2017)  ATP2C2  Language disorder (short term memory)  Newbury et al. (2009); Smith et al. (2015)  BCL11A  Language disorder (specifically CAS) with expressive language and mild intellectual delay  Peter et al. (2014)  CMIP  Language disorder (short term memory) Language disorder and dyslexia Dyslexia  Newbury et al. (2009),Scerri et al. (2011)  CNTNAP2  Language disorder Autism  Vernes et al. (2008); Arking et al. (2008); Bakkaloglu et al. (2008)  DCDC2  DyslexiaLanguage disorder and dyslexia  Schumacher et al. (2006); Marino et al. (2012),Scerri et al. (2011); Marino et al. (2011); Powers et al. (2013)  ERC1  Language disorder (CAS)  Thevenon et al. (2013); Chen et al. (2017)  FLNC  Language disorder and reading difficulties  Gialluisi et al. (2014)  FOXP1*  Language disorder and intellectual delay  Horn et al. (2010); Hamdan et al. (2010); Le Fevre et al. (2013); Srivastava et al. 
(2014); Sollis et al. (2015)  FOXP2*  Language disorder (specifically CAS)  Lai et al. (2001); MacDermot et al. (2005); Tomblin et al. (2009); Turner et al. (2013); Moralli et al. (2015); Reuter et al. (2017)  GRIN2A  Focal epilepsy with speech disorder, with or without mental retardation  Chen et al. (2017); Endele et al. (2010); De Ligt et al. (2012); Carvill et al. (2013)  KIAA0319  Dyslexia Language disorder  Scerri et al. (2011),;Kirsten et al. (2012),Newbury et al. (2011)  NDST4  Language disorder  Eicher et al. (2013)  NFXL1  Language disorder  Villanueva et al. (2015)  NOP9  Language disorder  Nudel et al. (2014)  RBFOX2  Language disorder and reading difficulties  Gialluisi et al. (2014)  ROBO1  Dyslexia Language disorder and dyslexia  Hannula-Jouppi et al. (2005),Bates et al. (2011)  ROBO2  Language disorder  St Pourcain et al. (2014)  SETBP1  Language disorder  Filges et al. (2010); Marseglia et al. (2012); Kornilov et al. (2016)  SRPX2  Language disorder, rolandic seizures and intellectual delay  Chen et al. (2017); Roll et al. (2006)  TM4SF20*  Language disorder  Wiszniewski et al. (2013)  Gene  Associated disorder(s)  Key references  ABCC13  Language disorder  Luciano et al. (2013)  ARHGEF39  Language disorder  Devanna et al. (2017)  ATP2C2  Language disorder (short term memory)  Newbury et al. (2009); Smith et al. (2015)  BCL11A  Language disorder (specifically CAS) with expressive language and mild intellectual delay  Peter et al. (2014)  CMIP  Language disorder (short term memory) Language disorder and dyslexia Dyslexia  Newbury et al. (2009),Scerri et al. (2011)  CNTNAP2  Language disorder Autism  Vernes et al. (2008); Arking et al. (2008); Bakkaloglu et al. (2008)  DCDC2  DyslexiaLanguage disorder and dyslexia  Schumacher et al. (2006); Marino et al. (2012),Scerri et al. (2011); Marino et al. (2011); Powers et al. (2013)  ERC1  Language disorder (CAS)  Thevenon et al. (2013); Chen et al. 
(2017)  FLNC  Language disorder and reading difficulties  Gialluisi et al. (2014)  FOXP1*  Language disorder and intellectual delay  Horn et al. (2010); Hamdan et al. (2010); Le Fevre et al. (2013); Srivastava et al. (2014); Sollis et al. (2015)  FOXP2*  Language disorder (specifically CAS)  Lai et al. (2001); MacDermot et al. (2005); Tomblin et al. (2009); Turner et al. (2013); Moralli et al. (2015); Reuter et al. (2017)  GRIN2A  Focal epilepsy with speech disorder, with or without mental retardation  Chen et al. (2017); Endele et al. (2010); De Ligt et al. (2012); Carvill et al. (2013)  KIAA0319  Dyslexia Language disorder  Scerri et al. (2011),;Kirsten et al. (2012),Newbury et al. (2011)  NDST4  Language disorder  Eicher et al. (2013)  NFXL1  Language disorder  Villanueva et al. (2015)  NOP9  Language disorder  Nudel et al. (2014)  RBFOX2  Language disorder and reading difficulties  Gialluisi et al. (2014)  ROBO1  Dyslexia Language disorder and dyslexia  Hannula-Jouppi et al. (2005),Bates et al. (2011)  ROBO2  Language disorder  St Pourcain et al. (2014)  SETBP1  Language disorder  Filges et al. (2010); Marseglia et al. (2012); Kornilov et al. (2016)  SRPX2  Language disorder, rolandic seizures and intellectual delay  Chen et al. (2017); Roll et al. (2006)  TM4SF20*  Language disorder  Wiszniewski et al. (2013)  The most successful study in this field, to date, has been the identification of an arginine to histidine mutation at amino acid position 553 (denoted as p.R553H) in the FOXP2 gene, identified in a large, multigenerational family known as the KE family. Family members who carry this mutation have the CAS phenotype (Lai et al. 2001). In genetic terminology, the p.R553H change is a dominant, fully penetrant mutation—one mutated copy of the gene is enough to result in a particular disorder. Fully penetrant cases are rare and presumably differ from more ‘typical’ cases of DLD, where one genetic change cannot be directly correlated with their disorder. 
While FOXP2 remains the most studied and best characterised gene implicated in speech, mutations in FOXP2 account for only about 2% of CAS cases (Worthey et al. 2013), and causative mutations in FOXP2 are therefore still considered a rare cause of language disorders. FOXP2, dubbed a ‘molecular window’ into speech and language development, has been a springboard for the identification of other genes and mechanisms involved in language [for example, CNTNAP2 (Vernes et al. 2008), as described below]. On its discovery, FOXP2 was hailed by the media as the ‘speech gene’, suggesting that this single protein is responsible for language development in humans. This headline tag is an overly simplistic interpretation, which has endured in fields outside of genetics and language biology. More recently, investigation into the molecular function of FOXP2 has slowly built a more detailed picture of its role in language development (Dediu and Christiansen 2016; Fitch 2017; Fisher 2017). The literature is clear: FOXP2 is not the sole explanatory factor for the presence of language.

There are very few instances of monogenic inheritance, where the absence of a protein leads directly to language disorder. In Table 1, only FOXP2, FOXP1, and TM4SF20 have been described as monogenic drivers of language disorders. The remaining genes instead confer risk of language disorder through genetic variations that subtly alter the way in which genes and proteins work. The majority have been implicated in language disorders through association with language-related phenotypes obtained from cohort studies. In contrast to FOXP2, where a mutation explains the observed language difficulties (monogenic model), these genes tend to play a role within a complex genetic model. Carrying a risk variant within these genes confers a ‘susceptibility’ to develop language disorder; however, this susceptibility remains difficult to quantify and is poorly understood.
Nonetheless, the study of cases and their families has provided an important window into the underlying mechanisms of language disorders. At present, FOXP2 and FOXP1 remain the best characterised of the genes implicated in language disorders. Clinical diagnosis of the underlying molecular cause of a language disorder is not usually possible unless the causative mutation is within FOXP2, FOXP1, or TM4SF20. Mutations in these genes are rare, and therefore the majority of language disorder cases are unlikely to have an underlying molecular cause identified.

Large-scale genome sequencing projects such as the 1000 Genomes Project (1000 Genomes Project Consortium 2015) and ExAC (Lek et al. 2016) have created a major shift in how we perceive human genetic variation and its contribution to disease. We have understood for decades that monogenic disorders usually involve rare mutations which impact upon the function of the protein. Such mutations usually lead to non-functional proteins which manifest in a disease phenotype. Access to large numbers of control genomes through 1000 Genomes and ExAC has enabled us to more accurately identify and assess genetic risk factors, which tend to be more common in the population but confer only a modest risk of developing a phenotype. These databases also provide unprecedented power to inform our understanding of gene function in modern humans and, by proxy, our ancestors.

It is well established that Neanderthals and Denisovans shared the ‘humanised’ version of FOXP2, which differs from ancestral FOXP2 at two positions: chromosome 7, base-pair 114,282,597 (denoted chr7:114,282,597), resulting in an asparagine rather than the ancestral threonine at amino acid position 303 (denoted p.N303), and chromosome 7, base-pair 114,282,663 (denoted chr7:114,282,663), resulting in a serine rather than the ancestral asparagine at amino acid position 325 (denoted p.S325) (variant 1, hg19) (Krause et al. 2007).
This important finding gave rise to the idea that Neanderthals may have had a sophisticated level of cognitive processing to support some form of language (Krause et al. 2007). Interestingly, the significance of the ‘humanised’ FOXP2 amino acid at position 325 is somewhat called into question by two apparently healthy controls in the ExAC database. These two individuals carry one copy (heterozygous) of a T > G change at the neighbouring position (chr7:114,282,664), which essentially reverts the amino acid sequence to the ancestral form, a serine to asparagine change (p.S325N). This change is extremely rare (allele frequency = 0.00001648), seen in only 2 of more than 60,000 individuals, but it poses a question: did these apparently healthy individuals have language difficulties? Although ExAC participants were not specifically screened for cognitive function or language ability, it is unlikely that they had an overt phenotype, as this would have excluded them from the study. This presents an interesting line of thought: if these two amino acids define the hominin form of FOXP2, then there are at least two functioning humans who do not have a fully ‘humanised’ version of FOXP2. The presence of a non-human FOXP2 amino acid change in these two healthy individuals shows the power of these databases to identify extremely rare variants, here one with an allele frequency of about 0.0016%. Such databases provide a more accurate snapshot of human variation, with which we can more effectively predict which variants are likely to be important.

Even in monogenic disorders, when it is clear that the trait is directly caused by a dominant mutation, we still observe a high degree of phenotypic variability between individuals (variable expressivity). Such variability is even present within the KE family, who have a ‘fully’ penetrant dominant FOXP2 mutation with a clear-cut phenotype (Lai et al. 2001; Watkins et al. 2002).
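As a worked check on the ExAC figures quoted above for the p.S325N change, the following minimal Python sketch computes an allele frequency from an allele count and the number of chromosomes genotyped. It assumes, purely for illustration, that all 60,706 ExAC individuals (the cohort size given in the title of Lek et al. 2016) were genotyped at this autosomal site; the function and constant names are hypothetical. The slightly different reported value (0.00001648) would imply that marginally fewer chromosomes were actually covered at this position.

```python
# Illustrative arithmetic only; names and the full-coverage assumption are hypothetical.

EXAC_INDIVIDUALS = 60_706   # cohort size from the title of Lek et al. (2016)
OBSERVED_CARRIERS = 2       # heterozygous carriers described in the text


def allele_frequency(allele_count: int, chromosomes_genotyped: int) -> float:
    """Return the fraction of genotyped chromosomes that carry the allele."""
    if chromosomes_genotyped <= 0:
        raise ValueError("chromosomes_genotyped must be positive")
    return allele_count / chromosomes_genotyped


def carrier_fraction(carriers: int, individuals: int) -> float:
    """Return the fraction of individuals who carry at least one copy."""
    if individuals <= 0:
        raise ValueError("individuals must be positive")
    return carriers / individuals


# Each heterozygous carrier contributes one variant allele, and each diploid
# individual contributes two chromosomes at an autosomal site.
af = allele_frequency(OBSERVED_CARRIERS, 2 * EXAC_INDIVIDUALS)
cf = carrier_fraction(OBSERVED_CARRIERS, EXAC_INDIVIDUALS)

print(f"allele frequency: {af:.8f} ({af * 100:.4f}%)")
print(f"carrier fraction: {cf:.8f} ({cf * 100:.4f}%)")
```

Under this assumption the fraction of individuals carrying the variant is twice the allele frequency, because each carrier holds the variant on only one of two chromosomes.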
It is widely reported that some individuals of the KE family present with non-verbal difficulties. The performance IQ scores of five affected KE family members are varied: one affected male (aged 10 years) scored 112, whereas a second affected 10-year-old male scored 66. These individuals carry the p.R553H mutation, which explains their CAS phenotype, but the differences in performance IQ are likely due to genetic modifiers and not directly related to FOXP2. For the majority of language disorder loci discovered to date, it is likely that they explain only part of the risk, and that modifiers and additional variants have yet to be identified. We are only just beginning to understand the actions of modifiers and risk factors, but this concept underlies a shift away from the traditional genetic model, in which phenotypes are truly dominant or recessive. Instead, we now understand the importance of considering all variation on a genetic background.

4. Complex inheritance and genetic risk

Familial studies are a proven method for identifying contributory genes, but increasingly molecular genetics is focussing on the role of modifiers and risk factors in DLDs. The majority of genes listed in Table 1 that have been associated with language disorders fall into this category. An example is an asparagine to lysine change at amino acid position 150 (denoted p.N150K) in the NFXL1 gene. This variant (rs144169475), identified by sequencing five affected Islanders, was found to be associated with language impairment on Robinson Crusoe Island, an isolated Chilean population with an exceptionally high rate of language disorders (Villanueva et al. 2015). This variant likely forms a key part of a complex inheritance model in which a single variant explains only part of the DLD risk.
The variant is seen in 4.1% of South American control genomes and is therefore considered common in Latin America, suggesting that it may confer susceptibility to DLD when inherited in combination with other variants that are yet to be identified.

The study of complex genetic factors is primarily performed using large numbers of unrelated cases specifically selected to have a high degree of phenotypic similarity. Large-scale genome-wide association studies (GWAS) with several thousands of participants may be able to identify common risk variants involved in DLDs; however, a large-scale study of this nature has not yet been attempted. A recent GWAS into the genetic basis of schizophrenia identified more than 100 associated loci using 37,000 schizophrenia patients and 113,000 controls (Schizophrenia Working Group of the Psychiatric Genomics Consortium 2014). The application of these methods to clinical traits such as schizophrenia has shown that enormous sample sizes are required to enable the consistent replication of associated loci. A major limiting factor in performing a large-scale GWAS for language disorders remains the systematic phenotyping of enough participants to gain the statistical power required to detect contributory variants. This challenge is common to most large complex genetics studies, but is particularly pronounced in the field of language disorders, where there is little consensus on what constitutes a speech and language disorder, or how it should be diagnosed and classified. A recent report by the CATALISE consortium aims to establish exactly this consensus (Bishop et al. 2017). Even the terminology used to describe language disorders and DLDs required standardisation across disciplines, and although these are the currently approved terms, they are taking time to become standard in research and education. Establishing consistent terminology is the keystone of developing standardised diagnostic criteria.
Once these definitions are consistent within and across disciplines, a large-scale study could be successfully developed. It would likely lead to the identification of novel pathways and gene networks involved in language production.

Table 1 reveals the striking number of genes implicated in DLDs which are also implicated in other, closely related neurodevelopmental disorders. Vernes and colleagues identified an association between variants in the contactin-associated protein-like 2 gene (CNTNAP2) and DLDs through its interaction with the transcription factor FOXP2 (Vernes et al. 2008). Variants in CNTNAP2 are also associated with ASD (Alarcon et al. 2008; Arking et al. 2008), cortical dysplasia focal epilepsy syndrome (OMIM #610042) (Strauss et al. 2006), and Pitt–Hopkins-like syndrome (OMIM #610042) (Zweier et al. 2009). Another example of a gene implicated in language that overlaps with related disorders is ROBO1, which encodes an axon guidance receptor. It was first implicated as a candidate gene for dyslexia in a patient with a translocation involving the ROBO1 region (Hannula-Jouppi et al. 2005), and was subsequently found to be associated with short-term memory of words, a key feature of DLD (Bates et al. 2011). Other genes involved in language disorders that overlap with a dyslexia phenotype include DCDC2, KIAA0319, and CMIP (Schumacher et al. 2006; Scerri et al. 2011). This observation suggests that the documented phenotypic overlap between developmental disorders like DLD, ASD, and dyslexia may be driven by shared genetic aetiology. We should note, however, that the level of shared aetiology is hard to ascertain objectively without genome-wide data. Technical and financial limitations mean that many studies of DLDs to date are limited to candidate genes, leading to substantial ascertainment bias.
The factors that determine whether a given genetic variant manifests as one phenotype rather than another are not fully understood, but they are likely to involve interactions between genetic variants. This emphasises the need to consider the genetic background of an individual within any candidate gene analysis. These multiple layers of complexity partly explain why genetic studies have so far struggled to elucidate the genetic basis of many neurodevelopmental disorders.

5. Limitations of current genomic studies

There are a number of reasons why we do not have a better picture of the genetics of speech and language disorders. As discussed above, the majority of studies have used relatively low-resolution mapping methods in small samples, with inconsistent characterisation between studies. Recent advances in DNA sequencing technology allow us to generate a more complete picture of genetic variation across the entire genome (whole-genome sequencing) or across all known genes in the genome (whole-exome sequencing). While such technologies afford better resolution and, to some extent, offset these problems, the identification of risk variants that have only a small effect size remains difficult.

The average human genome contains between 4 and 5 million variants that differ from published reference sequences. Only about 1% of the human genome actually encodes genes, and these gene-encoding regions will contain about 150 coding mutations which result in loss of function of the protein. They will also contain around 10,000 ‘silent’ mutations that fall within genes but do not alter the amino acid sequence. Each person’s genome will contain about 120 novel coding variants which have not previously been reported (1000 Genomes Project Consortium 2015). The vast majority of variation we see in the human genome does not directly change the protein, and is non-coding.
Once we consider that these non-coding changes may affect gene expression (how much of each protein is made), the list of potential variants becomes vast and extremely challenging to interpret. Exome sequencing studies have investigated just the coding regions of the genes of individuals affected by DLD (Kornilov et al. 2016; Chen et al. 2017) or CAS (Worthey et al. 2013). These preliminary, small-scale investigations confirm the complexity of the underlying genetics in the majority of cases and reinforce the need for larger-scale screening studies.

Even though they lie outside of gene sequences, non-coding variants can change gene function, for example by increasing or decreasing expression. It is highly likely that these non-coding variants will be involved in neurodevelopmental disorders. These variants represent a far greater challenge than coding variants. They are often not captured by whole-exome sequencing, meaning that we may simply be missing important mutations. Whole-genome sequencing is becoming more commonly used, but its cost is often prohibitive. Even when these variants are captured, their categorisation is difficult. A recent study demonstrated a role for variations within non-coding regulatory regions in DLD and other neurodevelopmental conditions, underlining the importance of this route of investigation (Devanna et al. 2017). Whole-genome sequencing produces vastly more data, and its analysis is more computationally expensive and requires a much greater level of analytical expertise. Since the effects of these variants are often indirect, their characterisation usually involves complex functional validation steps that are challenging to complete for a large number of variants.

Genetic studies tend to be performed on European or American cohorts.
Findings in these participants may not be relevant in other populations, as some variants can be more or less common in a different population, and different groups may need their own specific studies to gain a better global understanding. For example, the NFXL1 variant found to increase risk of DLD on Robinson Crusoe Island was found in 4.1% of Latin Americans, but 0% of Europeans (Villanueva et al. 2015). Similarly, investigations of an isolated Russian population have yielded novel loci in relation to DLD (Kornilov et al. 2016). The availability of 1000 Genomes data has improved the power to detect variants that differ in allele frequency between populations; however, these data are still limited to relatively small numbers of individuals from a restricted set of countries.

Another degree of complexity is added by tissue specificity: while present in the genomic DNA of every cell, some mutations may only have a detectable effect in a specific tissue, at a particular time in development. The function of a gene can vary between cell types and conditions, and many genes have multiple, and often surprisingly different, functions. FOXP2 is highly expressed not only in the brain but also in the lungs and many other tissue types, all of which will carry the mutation at the DNA level (Shu et al. 2007). The brain appears particularly sensitive to this particular change and, as far as we are aware, the lungs of the KE family are unaffected (Lai et al. 2001). It is therefore important to remember that although genomic technologies can give us a window into what is happening in a particular individual, it is far more challenging to predict the cellular context in which a variant will become important.

6. Paleogenetics and language

In evolutionary terms, the window to understand genetic effects on cognitive function and language ability in hominins is even narrower than in modern humans, and must be interpreted with extreme caution.
The humanised version of FOXP2 is thought to have become fixed in the population around 500 KYA, prior to the last shared common ancestor (370–450 KYA) (Green et al. 2010), and the presence of this version in Neanderthals supports the notion of cognitive function sophisticated enough to support language. More recently, a change in a regulatory region of FOXP2, at a binding site of the transcription factor POU3F2, was identified exclusively in modern humans and is absent in Neanderthals (Maricic et al. 2013). This suggests that differences in gene regulation and expression may be involved in cognitive function, and that species differences are due to far more than two variants in a single gene. We must be cautious when interpreting such information, as it is extremely unlikely that these FOXP2 changes are solely responsible for the presence (or absence) of language function, and any observations, modern or otherwise, should consider the entire genetic background (Mozzi et al. 2016).

To further complicate the underlying assumptions in evolutionary studies, the small number of Neanderthals sequenced heavily biases study findings. The difficulty of obtaining ancient DNA of suitable quality for sequencing means that sequenced individuals are not representative of time periods or geographical locations, and come from a small number of sites where preservation conditions were optimal. As discussed above, genome sequence studies clearly illustrate that small numbers of individuals from one or two geographical locations do not represent the entire population. This is the modern genetic equivalent of sequencing one family and assuming that everyone else is the same, which is not genetically plausible. There are not enough population data available to accurately predict genetic effects, particularly with respect to complex cognitive processes like language function.
Paleogenetics researchers are slowly building a broader and more accurate picture of ancient hominin genetics by sequencing larger numbers of individuals from a range of geographical locations. A larger sample size will greatly improve the statistical robustness of findings, and increase confidence in their implications for language and higher cognitive function. Genes that are implicated in language disorders in modern humans can inform the investigation of language in ancient hominins, and there have been several efforts to investigate the impact of language-associated genes more broadly (Mozzi et al. 2016). Through the expansion of genetic technologies and a greater understanding of their application and limitations, we will continue to build a more accurate picture of both modern and ancient language cognition slowly, piece by piece, applying the scientific rigour and multiple lines of evidence of molecular biology.

7. Discussion

The study of language disorders has been fruitful in implicating genes, and subsequently molecular pathways, that are involved in the mechanisms of language. While there have been many exciting discoveries over the past two decades, there remains much more to understand. We still do not fully understand the underlying causes of DLDs, or what makes some children more susceptible. Family studies can still provide novel insights into the underlying mechanisms of DLDs. There is strong potential in a familial, shared genetics-based approach, particularly when combined with recent advances in sequencing technologies that can investigate more of the genome than ever before. We increasingly recognise that genetic risk plays a key role in language disorders, and many current approaches are investigating a genetic background of susceptibility. To be statistically sound, these studies require much larger sample sizes and more consistently phenotyped datasets to generate sufficient statistical power.
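To illustrate why modest genetic effects demand large cohorts, the sketch below applies a standard normal-approximation sample-size formula for comparing a risk-allele frequency between equal-sized case and control groups. It is an illustration under stated assumptions, not the method of any study cited here: the genome-wide significance threshold of 5e-8, the 80% power target, the example allele frequency and odds ratios, and all function names are assumptions chosen for the example.

```python
import math
from statistics import NormalDist


def case_allele_frequency(control_freq: float, odds_ratio: float) -> float:
    """Risk-allele frequency expected in cases, given the control frequency
    and an allelic odds ratio (both hypothetical inputs)."""
    case_odds = odds_ratio * control_freq / (1.0 - control_freq)
    return case_odds / (1.0 + case_odds)


def individuals_per_group(control_freq: float, odds_ratio: float,
                          alpha: float = 5e-8, power: float = 0.8) -> int:
    """Approximate number of cases (and of controls) needed to detect the
    allele-frequency difference with a two-sided two-proportion z-test."""
    case_freq = case_allele_frequency(control_freq, odds_ratio)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # significance quantile
    z_beta = NormalDist().inv_cdf(power)               # power quantile
    variance = (case_freq * (1.0 - case_freq)
                + control_freq * (1.0 - control_freq))
    alleles = (z_alpha + z_beta) ** 2 * variance / (case_freq - control_freq) ** 2
    # Each diploid individual contributes two alleles to the comparison.
    return math.ceil(alleles / 2.0)


# Hypothetical example: a common variant (20% in controls) at several effect sizes.
for odds_ratio in (2.0, 1.5, 1.2, 1.1):
    n = individuals_per_group(0.20, odds_ratio)
    print(f"odds ratio {odds_ratio}: about {n} cases and {n} controls")
```

The required cohort grows steeply as the odds ratio approaches 1, which is consistent with the observation that common risk variants of small effect are only detected reliably in very large, consistently phenotyped samples.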
The reality is that DLDs are likely to involve some high-impact rare mutations, genetic rearrangements and common sequence variations, all of which create a background of susceptibility. Family-based and association studies are still uncovering some unlikely pathways which play a role in language disorders, and it is clear that it will not be a simple story.

The idea that a single gene has a distinct role or confers a single trait is an outdated concept. Similarly, the idea that a gene will have a single role in the cell has been dispelled. We understand that non-coding variants can play a crucial role in gene regulation, and are highly likely to have an important function in DLDs and other neurodevelopmental disorders. The genetic background and the regulation of gene expression and function are dynamic, and depend greatly on individual cell types. While this is still poorly understood, methods for detecting and experimentally validating such context-dependent states are in development. The function of a gene in a particular cellular circumstance can, and will, be validated by molecular biology in model animal or cellular systems. Genetic control is no longer beyond our testing capability, and we have a range of technologies to characterise gene function and expression across different cell types and under different conditions.

Environmental factors clearly play a role in language development, and poor life circumstances may impact the DLD phenotype. Nature versus nurture is a falsely binary concept, and the underlying genetics plays a key role within an environmental (nurture) context.

The theory that the presence of the ‘humanised’ FOXP2 gene in Neanderthals drove language ability is naive and overly simplistic. FOXP2 clearly plays an important role in speech evolution and production; however, we must be cautious to avoid making over-inflated statements about language in Neanderthals based on a single gene (Fitch 2017).
We are only just beginning to unravel the highly complex developmental processes that underlie speech in modern humans, and should be extremely cautious in extrapolating any findings to ancient hominins. The identification of risk factors for DLDs in modern humans will inform our understanding of the capacity for language in ancient hominins. We may be able to build a far clearer picture of how language evolved once we increase our understanding of the neuromolecular pathways involved in language development in modern humans.

References

1000 Genomes Project Consortium (2015) ‘A Global Reference for Human Genetic Variation’, Nature, 526: 68–74.
Alarcon M. et al. (2008) ‘Linkage, Association, and Gene-expression Analyses Identify CNTNAP2 as an Autism-susceptibility Gene’, The American Journal of Human Genetics, 82: 150–9.
Anderson S. A., Hawes D. J., Snow P. C. (2016) ‘Language Impairments Among Youth Offenders: A Systematic Review’, Children and Youth Services Review, 65: 195–203.
Arking D. E. et al. (2008) ‘A Common Genetic Variant in the Neurexin Superfamily Member CNTNAP2 Increases Familial Risk of Autism’, The American Journal of Human Genetics, 82: 160–4.
Bakkaloglu B. et al. (2008) ‘Molecular Cytogenetic Analysis and Resequencing of Contactin Associated Protein-like 2 in Autism Spectrum Disorders’, The American Journal of Human Genetics, 82: 165–73.
Barry J. G., Yasin I., Bishop D. V. (2007) ‘Heritable Risk Factors Associated with Language Impairments’, Genes, Brain and Behavior, 6: 66–76.
Bates T. C. et al. (2011) ‘Genetic Variance in a Component of the Language Acquisition Device: ROBO1 Polymorphisms Associated with Phonological Buffer Deficits’, Behavior Genetics, 41: 50–7.
Bishop D. V. M. et al. (2017) ‘Phase 2 of CATALISE: A Multinational and Multidisciplinary Delphi Consensus Study of Problems with Language Development: Terminology’, Journal of Child Psychology and Psychiatry, 58: 1068–80.
Bishop D. V., Adams C. V., Norbury C. F. (2006) ‘Distinct Genetic Influences on Grammar and Phonological Short‐term Memory Deficits: Evidence from 6‐year‐old Twins’, Genes, Brain and Behavior, 5: 158–69.
Carvill G. L. et al. (2013) ‘GRIN2A Mutations Cause Epilepsy-aphasia Spectrum Disorders’, Nature Genetics, 45: 1073–6.
Caspi A. et al. (2016) ‘Childhood Forecasting of a Small Segment of the Population with Large Economic Burden’, Nature Human Behaviour, 1: 0005.
Chen X. S. et al. (2017) ‘Next-generation DNA Sequencing Identifies Novel Gene Variants and Pathways Involved in Specific Language Impairment’, Scientific Reports, 7: 46105.
Chomsky N. (1998) ‘On the Nature, Use and Acquisition of Language’. In: Toribio J., Clark A. (eds), Language and Meaning in Cognitive Science: Cognitive Issues and Semantic Theory, pp. 1–20. Taylor and Francis: New York and London.
Conti-Ramsden G., Botting N. (2008) ‘Emotional Health in Adolescents with and without a History of Specific Language Impairment (SLI)’, Journal of Child Psychology and Psychiatry, 49: 516–25.
De Ligt J. et al. (2012) ‘Diagnostic Exome Sequencing in Persons with Severe Intellectual Disability’, New England Journal of Medicine, 367: 1921–9.
Dediu D., Christiansen M. H. (2016) ‘Language Evolution: Constraints and Opportunities From Modern Genetics’, Topics in Cognitive Science, 8: 361–70.
Devanna P. et al. (2017) ‘Next-gen Sequencing Identifies Non-coding Variation Disrupting miRNA-Binding Sites in Neurological Disorders’, Molecular Psychiatry, 1–10.
Eicher J. D. et al. (2013) ‘Genome-wide Association Study of Shared Components of Reading Disability and Language Impairment’, Genes, Brain and Behavior, 12: 792–801.
Endele S. et al. (2010) ‘Mutations in GRIN2A and GRIN2B Encoding Regulatory Subunits of NMDA Receptors Cause Variable Neurodevelopmental Phenotypes’, Nature Genetics, 42: 1021–6.
Filges I. et al. (2010) ‘Reduced Expression by SETBP1 Haploinsufficiency Causes Developmental and Expressive Language Delay Indicating a Phenotype Distinct from Schinzel–Giedion Syndrome’, Journal of Medical Genetics, 48: 117–22.
Fisher S. E. (2017) ‘Evolution of Language: Lessons from the Genome’, Psychonomic Bulletin & Review, 24: 34–40.
Fitch W. T. (2017) ‘Empirical Approaches to the Study of Language Evolution’, Psychonomic Bulletin & Review, 24: 3–33.
Gialluisi A. et al. (2014) ‘Genome-wide Screening for DNA Variants Associated with Reading and Language Traits’, Genes, Brain and Behavior, 13: 686–701.
Green R. E. et al. (2010) ‘A Draft Sequence of the Neandertal Genome’, Science, 328: 710–22.
Hamdan F. F. et al. (2010) ‘De Novo Mutations in FOXP1 in Cases with Intellectual Disability, Autism, and Language Impairment’, The American Journal of Human Genetics, 87: 671–8.
Hannula-Jouppi K. et al. (2005) ‘The Axon Guidance Receptor Gene ROBO1 is a Candidate Gene for Developmental Dyslexia’, PLoS Genetics, 1: e50.
Horn D. et al. (2010) ‘Identification of FOXP1 Deletions in Three Unrelated Patients with Mental Retardation and Significant Speech and Language Deficits’, Human Mutation, 31: E1851–60.
Hulme C., Snowling M. J. (2009) Developmental Disorders of Language Learning and Cognition. West Sussex, UK: John Wiley & Sons.
Kirsten H. et al. (2012) ‘Association Study of a Functional Genetic Variant in KIAA0319 in German Dyslexics’, Psychiatric Genetics, 22: 216–7.
Kornilov S. A. et al. (2016) ‘Genome-Wide Association and Exome Sequencing Study of Language Disorder in an Isolated Population’, Pediatrics, 137.
Krause J. et al. (2007) ‘The Derived FOXP2 Variant of Modern Humans was Shared with Neandertals’, Current Biology, 17: 1908–12.
Lai C. S. et al. (2001) ‘A Forkhead-domain Gene is Mutated in a Severe Speech and Language Disorder’, Nature, 413: 519–23.
Le Fevre A. K. et al. (2013) ‘FOXP1 Mutations Cause Intellectual Disability and a Recognizable Phenotype’, American Journal of Medical Genetics Part A, 161: 3166–75.
Lek M. et al. (2016) ‘Analysis of Protein-coding Genetic Variation in 60,706 Humans’, Nature, 536: 285–91.
Locke J. (1836) An Essay Concerning Human Understanding. London, UK: T. Tegg and Son.
Luciano M. et al. (2013) ‘A Genome-wide Association Study for Reading and Language Abilities in Two Population Cohorts’, Genes, Brain and Behavior, 12: 645–52.
MacDermot K. D. et al. (2005) ‘Identification of FOXP2 Truncation as a Novel Cause of Developmental Speech and Language Deficits’, The American Journal of Human Genetics, 76: 1074–80.
Maricic T. et al. (2013) ‘A Recent Evolutionary Change Affects a Regulatory Element in the Human FOXP2 Gene’, Molecular Biology and Evolution, 30: 844–52.
Marino C. et al. (2011) ‘Pleiotropic Effects of DCDC2 and DYX1C1 Genes on Language and Mathematics Traits in Nuclear Families of Developmental Dyslexia’, Behavior Genetics, 41: 67–76.
Marino C. et al. (2012) ‘DCDC2 Genetic Variants and Susceptibility to Developmental Dyslexia’, Psychiatric Genetics, 22: 25.
Marseglia G. et al. (2012) ‘372 kb Microdeletion in 18q12.3 Causing SETBP1 Haploinsufficiency Associated with Mild Mental Retardation and Expressive Speech Impairment’, European Journal of Medical Genetics, 55: 216–21.
Moralli D. et al. (2015) ‘Language Impairment in a Case of a Complex Chromosomal Rearrangement with a Breakpoint Downstream of FOXP2’, Molecular Cytogenetics, 8: 36.
Mozzi A. et al. (2016) ‘The Evolutionary History of Genes Involved in Spoken and Written Language: Beyond FOXP2’, Scientific Reports, 6: 22157.
Newbury D. F. et al. (2009) ‘CMIP and ATP2C2 Modulate Phonological Short-term Memory in Language Impairment’, The American Journal of Human Genetics, 85: 264–72.
Newbury D. F. et al. (2011) ‘Investigation of Dyslexia and SLI Risk Variants in Reading- and Language-impaired Subjects’, Behavior Genetics, 41: 90–104.
Norbury C. F. et al. (2016) ‘The Impact of Nonverbal Ability on Prevalence and Clinical Presentation of Language Disorder: Evidence from a Population Study’, Journal of Child Psychology and Psychiatry, 57: 1247–57.
Nudel R. et al. (2014) ‘Genome-wide Association Analyses of Child Genotype Effects and Parent-of-origin Effects in Specific Language Impairment’, Genes, Brain and Behavior, 13: 418–29.
Peter B. et al. (2014) ‘De Novo Microdeletion of BCL11A is Associated with Severe Speech Sound Disorder’, American Journal of Medical Genetics Part A, 164A: 2091–6.
Pinker S. (1994) The Language Instinct (1994/2007). New York, NY: Harper Perennial Modern Classics.
Powers N. R. et al. (2013) ‘Alleles of a Polymorphic ETV6 Binding Site in DCDC2 Confer Risk of Reading and Language Impairment’, The American Journal of Human Genetics, 93: 19–28.
Reuter M. S. et al. (2017) ‘FOXP2 Variants in 14 Individuals with Developmental Speech and Language Disorders Broaden the Mutational and Clinical Spectrum’, Journal of Medical Genetics, 54: 64–72.
Roll P. et al. (2006) ‘SRPX2 Mutations in Disorders of Language Cortex and Cognition’, Human Molecular Genetics, 15: 1195–207.
Scerri T. S. et al. (2011) ‘DCDC2, KIAA0319 and CMIP are Associated with Reading-related Traits’, Biological Psychiatry, 70: 237–45.
Schizophrenia Working Group of the Psychiatric Genomics Consortium (2014) ‘Biological Insights from 108 Schizophrenia-associated Genetic Loci’, Nature, 511: 421–7.
Schumacher J. et al. (2006) ‘Strong Genetic Evidence of DCDC2 as a Susceptibility Gene for Dyslexia’, The American Journal of Human Genetics, 78: 52–62.
Shriberg L. D., Tomblin J. B., McSweeny J. L. (1999) ‘Prevalence of Speech Delay in 6-year-old Children and Comorbidity with Language Impairment’, Journal of Speech, Language and Hearing Research, 42: 1461–81.
Shu W. et al.
( 2007) ‘ Foxp2 and Foxp1 Cooperatively Regulate Lung and Esophagus Development’, Development , 134: 1991– 2000. Google Scholar CrossRef Search ADS PubMed  Smith A. W. et al.   ( 2015) ‘ Deletion of 16q24. 1 Supports a Role for the ATP2C2 Gene in Specific Language Impairment’, Journal of Child Neurology , 30: 517– 21. Google Scholar CrossRef Search ADS PubMed  Sollis E. et al.   ( 2015) ‘ Identification and Functional Characterization of De novo FOXP1 Variants Provides Novel Insights into the Etiology of Neurodevelopmental Disorder’, Human Molecular Genetics , 25: 546– 57. Google Scholar CrossRef Search ADS PubMed  Srivastava S. et al.   ( 2014) ‘ Clinical Whole Exome Sequencing in Child Neurology Practice’, Annals of Neurology , 76: 473– 83. Google Scholar CrossRef Search ADS PubMed  St Pourcain B. et al.   ( 2014) ‘ Common Variation near ROBO2 is Associated with Expressive Vocabulary in Infancy’, Nature Communications , 5: 4831. Google Scholar CrossRef Search ADS PubMed  Strauss K. A. et al.   ( 2006) ‘ Recessive Symptomatic Focal Epilepsy and Mutant Contactin-associated Protein-like 2’, The New England Journal of Medicine , 354: 1370– 7. Google Scholar CrossRef Search ADS PubMed  Stromswold K. ( 1998) ‘ Genetics of Spoken Language Disorders’, Human Biology , 70: 297– 324. Google Scholar PubMed  Thevenon J. et al.   ( 2013) ‘ 12p13. 33 Microdeletion Including ELKS/ERC1, a New Locus Associated with Childhood Apraxia of Speech’, European Journal of Human Genetics , 21: 82. Google Scholar CrossRef Search ADS PubMed  Tomblin J. B. et al.   ( 1997) ‘ Prevalence of Specific Language Impairment in Kindergarten Children’, Journal of Speech, Language and Hearing Research , 40: 1245– 60. Google Scholar CrossRef Search ADS   Tomblin J. B. et al.   ( 2009) ‘ Language Features in a Mother and Daughter of a Chromosome 7; 13 Translocation Involving FOXP2’, Journal of Speech, Language, and Hearing Research , 52: 1157– 74. Google Scholar CrossRef Search ADS   Turner S. J. et al.  
 ( 2013) ‘ Small Intragenic Deletion in FOXP2 Associated with Childhood Apraxia of Speech and Dysarthria’, American Journal of Medical Genetics Part A , 161: 2321– 6. Google Scholar CrossRef Search ADS   Vernes S. C. et al.   ( 2008) ‘ A Functional Genetic Link Between Distinct Developmental Language Disorders’, The New England Journal of Medicine , 359: 2337– 45. Google Scholar CrossRef Search ADS PubMed  Villanueva P. et al.   ( 2015) ‘ Exome Sequencing in an Admixed Isolated Population Indicates NFXL1 Variants Confer a Risk for Specific Language Impairment’, PLoS Genet , 11: e1004925. Google Scholar CrossRef Search ADS PubMed  Watkins K. E. et al.   ( 2002) ‘ Behavioural Analysis of an Inherited Speech and Language Disorder: Comparison with Acquired Aphasia’, Brain , 125: 452– 64. Google Scholar CrossRef Search ADS PubMed  Wiszniewski W. et al.   ( 2013) ‘ TM4SF20 Ancestral Deletion and Susceptibility to a Pediatric Disorder of Early Language Delay and Cerebral White Matter Hyperintensities’, The American Journal of Human Genetics , 93: 197– 210. Google Scholar CrossRef Search ADS PubMed  Worthey E. A. et al.   ( 2013) ‘ Whole-exome Sequencing Supports Genetic Heterogeneity in Childhood Apraxia of Speech’, Journal of Neurodevelopmental Disorders , 5: 29. Google Scholar CrossRef Search ADS PubMed  Zweier C. et al.   ( 2009) ‘ CNTNAP2 and NRXN1 are Mutated in Autosomal-recessive Pitt-Hopkins-like Mental Retardation and Determine the Level of a Common Synaptic Protein in Drosophila’, The American Journal of Human Genetics , 85: 655– 66. Google Scholar CrossRef Search ADS PubMed  © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

Journal: Journal of Language Evolution, Oxford University Press

Published: Jan 1, 2018
