Basque Lexicography and Purism

Abstract

‘Purism’ can characterise attitudes about a wide range of linguistic phenomena, but the most common forms of linguistic purism are those concerned with the lexicon. When standardisation of language is at issue, questions of purism are unavoidable. Are processes of standardisation necessarily motivated by puristic attitudes? Or is purism a consequence of standardisation? In this paper we consider lexical purism in the standardisation of Basque, a minoritised European language. We offer a rough periodisation of Basque lexicography through the lens of puristic attitudes towards the lexicon in terms of the classification by Thomas (1991). In 18th, 19th, and early 20th century Basque lexicography and terminology we find mostly playfulness, elitism and xenophobia as the salient characteristics of the puristic choices for the standard variety and for terminology modernisation. In the later 20th century, we find a shift to reformist purism. We examine puristic proposals for loanword replacement from the different periods, and we measure their success in contrast to their borrowed counterparts using frequency data extracted from large text corpora.

1. Introduction

According to different definitions given by scholars, linguistic purism covers a wide range of issues. Langer and Davies (2005: 4) compare four definitions of purism (those by Trask 1999, Thomas 1991, Crystal 1997 and Van der Sijs 1999) and conclude that they all ‘largely agree on what purism is: an (influential) part of the speech community voices objections to the presence of particular linguistic features and aims to remove them from their language’. Purism reflects folk-linguistic attitudes in general: members of a speech community share ideas about the degree of prestige of a certain variety or dialect, and about the relative desirability of certain linguistic features. In short: the existence of purism presupposes the existence of a prestige variety.
Furthermore, prestige is a social category that is often linked to a ‘standard variety’ (Milroy 2001: 532), which brings us to the relationship between purism and standardisation. Van der Sijs (1999: 11) argues that purism only affects languages that are standardised or are in the process of standardisation since, before one can remove elements from a linguistic norm, one has to have such a norm (Langer and Davies 2005: 4). Boeder et al. (2003: viii) disagree with this interdependence between purism and standardisation: ‘For many cases, purism need not be connected with conscious standardisation, and it should not be separated from a broader concept of “pure language”.’

Basque can be characterised as a late standard language (Vogl 2012: 25) as its standardisation process did not start until 1968. In contrast to other communities, such as Flanders or Finland, the rise of Basque nationalism at the end of the 19th century did not entail any promotion of uniformity. On the contrary, ‘the founder of the Basque Nationalist Party favored the development of a different written variety for each of the Basque provinces’ (Hualde and Zuazo 2007: 143), although some of the first authors to publish in Basque in the 16th and 17th centuries explicitly remarked on the difficulties brought about by dialectal diversity.[1] Basque was rarely used in prestigious domains, there was no truly socially dominant Basque dialect, and the need for a single written standard was not universally accepted. Even the foundation of the Academy of the Basque Language in 1918, with the unification of the written language as one of its main goals, turned out to be no help in the quest to achieve a standard variety (Hualde and Zuazo 2007, Salaburu and Alberdi 2012). A minority language could hardly survive in the world of today if it lacked a ‘shared common writing code’ (Salaburu and Alberdi 2012: 94).
Awareness of this fact arose in the Basque Country in the second half of the 20th century, and was the driving force behind standardisation. The motivation behind standardisation could thus be described as an empowerment of the language community. Mitxelena, the linguist and academician entrusted by the Academy with the task of drawing up a proposal for the unification of Basque, stated clearly what the primary goal was: ‘We believe that it is absolutely necessary, a matter of life or death, to put Basque on the path to unification. If one is teaching our children and young people in Basque - and if Basque is to survive, we must use it in teaching - it is indispensable that we teach them in a unified manner. The unification that we need is in written Basque, at least for the first few steps’ (Mitxelena 1968 [2008: 253]). Thus, the canonical forms were to be taught at school (Milroy 2001: 537). And, as is usually the case, from then on (and particularly after the acquisition of the status of an official language at regional level in some territories south of the Pyrenees in 1979[2]), Basque society felt the need for a standard, seen as ‘a variety that could fulfil every conceivable function for which its speakers could need it’ (Davies 2012: 56). Forty years later, the standard variety does have what Davies (2012: 49) calls ‘a privileged place in public and official domains, e.g. the media and the education system.’ Moreover, the cultivation of the standard variety was also linked to a nation-building project (Milroy 2001, Davies 2012): ‘the rapid acceptance of the new standard within Basque society is undoubtedly related to the strength of Basque nationalistic feeling at the time of its adoption’ (Hualde and Zuazo 2007: 160).
After Mitxelena’s foundational report, which included recommendations regarding orthography, morphology, lexical variants and the adaptation of neologisms, the Basque Academy has worked continuously towards the codification of the standard through the publication of a basic grammar of standard Basque along with a standard Basque Dictionary and by establishing rules of ‘good usage’[3]. The ‘correctness ideology’ (Milroy 2001, Vogl 2012) is a central component of the standard language ideology, as well as the claim of mutual intelligibility (Davies 2012): ‘The benefits that the Academy’s standard has brought to Basque society are widely recognised. First of all, it has made it possible for Basque speakers to discuss any topic in Basque. Secondly, it has eliminated the (sometimes serious) obstacles which previously existed in communication between speakers from different areas of the Basque Country’ (Hualde and Zuazo 2007: 162). Nevertheless, dialectal variety is not only allowed, but also promoted in informal registers. There is even a certain acceptance of code-switching in order to increase the use of the language; in other words, a move from formal correctness as the sole criterion to that of communicative and expressive quality.

However, although there was no Basque standard variety until the late 1960s, ‘the Early Modern striving for ‘correctness’ which was produced for vernacular languages all over Europe’ (Vogl 2012: 20) was to be reflected during the 18th century in Larramendi’s grammar, El imposible vencido (1729), and in his Trilingual Dictionary (1745). And, at the end of the 19th century, Sabino Arana, the founder of the Basque Nationalist Party, took a clear stance in favour of a language untouched by external influences (the ideology of linguistic isolationism, Davies 2012). In both cases, long before the Academy was founded, they had to deal with borrowed lexical items (cf. section 2).
In the following we shall analyse the influence of purism on the lexicon, and look at the criteria regarding what constitutes a Basque word for Basque lexicographers when compiling dictionaries. In a rough periodisation, we first group the standard reference works of Basque lexicography according to the stages of standardisation proposed by Thomas (1991: 115–122), i.e. pre-standardisation, standardisation, and post-standardisation. This model is complementary to two of the four steps in the standardisation of any language according to Haugen’s classic proposal, namely codification and elaboration (Haugen 1983). Second, we describe the type of purism that prevails in these works, according to a taxonomy of puristic orientations which we will discuss in detail in the following. Finally, we relate these works to the functional typology of dictionaries proposed by Bergenholtz and Gouws (2010, cf. section 4).

Thomas (1991: 75–83) distinguishes six types of purism, or six puristic orientations, as follows:

Archaising purism: a conservative approach that favours the language found in written text from a ‘golden past’ over any innovation.
Ethnographic purism: the lexicon of certain, typically rural dialects is favoured over modern, urban vocabularies.
Elitist purism: the sociolect of the educated urban elite is regarded as the purest.
Reformist purism: ‘a salient feature of most of the language renewals of the nineteenth century as well as the more recent efforts to create standard languages. It involves […] adapting the language for its role as a medium of communication in a modern society’ (Thomas 1991: 79).
Playful purism: most typically, the creation of neologisms by native means as a result of an individual activity that often replaces well-established foreign words.
Xenophobic purism: an attitude in favour of replacing elements identified as foreign with native elements.
With this taxonomic model, which encompasses all cases of purism in any language, we may also describe the purisms that have targeted the lexicon of the Basque language. In our case, however – that of a minority language in a situation of diglossia – further clarification is needed. As for ‘playful’ attitudes of word creation (creation of neologisms, as a result of an individual activity, that often replace well-established foreign words), we hold that the effort of individual lexicographers in a standard-creating attempt to propose solutions for lexical gaps may not always be termed playful, but it is always creative. In other words, although these words are not attested in corpora of Basque texts, they are Basque insomuch as they are created following Basque morphosyntactic conventions. Therefore, many neologisms, even if proposed by an individual or a small group, may be totally intelligible and transparent to the listener. We thus describe a purist attitude as playful only when a lexicographer’s term proposal is following his/her own personal taste rather than a strategy of combination and derivation of items in use. Strategies to fill lexical gaps may be more or less purist in the sense Thomas calls xenophobic, that is, purism to modulate the presence of foreign lexical items, using native elements. But what does ‘foreign’ mean in the case of Basque? Celtic, Greek, Latin, Arabic, Spanish or French? Purism that aims at the replacement of non-native elements in the Basque case always targets loans from Latin and its descendant languages (as well as Graeco-Roman internationalisms in terminology). Thus it is not against elements that have come from foreign peoples, but against well-known words that have been there for a considerable period of time, and come from the dominant partner in diglossia. 
Following Brunstad (2010: 67-77), who also discusses Thomas’ taxonomy of puristic orientations, we may therefore conclude that for an adequate interpretation of puristic orientations, the language’s status relative to other languages must be taken into account: e.g. state language vs. not a state language, majority language vs. minority language, language contact between two mutually intelligible languages vs. language contact between two unintelligible languages, language standardised between 1550 and 1800 vs. language standardised in the period after 1800.

Table 1 summarises our proposal for a periodisation. Until the most recent past, all Basque dictionaries were bilingual. That means that in their attempt to define Basque equivalents to lexical items of the high-prestige counterparts in diglossia – i.e. Latin, Spanish and French – pre-standard Basque lexicographers almost always had to face issues related to purism. This first period can be characterised as a 250-year search for a standard. The Basque lexicon itself was not codified, and Basque was not subject to institutionally-backed standardisation, until 1968. Basque dictionaries of all periods deserve to be analysed taking into account to what extent they influenced the codification of the lexicon of the Basque language.

Table 1. Periodisation of Basque Lexicography: Principal reference works.
| Dictionary | Languages | Standardisation stage | Purism types | Dictionary type |
| --- | --- | --- | --- | --- |
| LAR1745 | ES>EU-LA | pre-standardisation | reformist | prescriptive |
| AZK1905 | EU>ES-FR | pre-standardisation | ethnographic, reformist | descriptive |
| BEM1915 | ES<>EU | pre-standardisation | playful-elitist, nativist-xenophobic, reformist | prescriptive |
| KIN1977 | ES<>EU | standardisation | nativist-empowering, creative, reformist | prescriptive |
| MUG1977 | ES<>EU | standardisation | nativist-empowering, creative, reformist | prescriptive |
| OEH1986 | EU<>ES-FR | standardisation | reformist (institutional) | descriptive |
| SEH1996 | EU | post-standardisation | reformist | descriptive |
| ZEH2005 | ES>EU | post-standardisation | reformist | proscriptive |
| ELH2006 | ES<>EU | post-standardisation | reformist | proscriptive |
| ADO2009 | ES<>EU | post-standardisation | reformist | proscriptive |
| EEH2007- | EU | post-standardisation | reformist | descriptive |
| EHI2016 | EU | standardisation | reformist (institutional) | prescriptive |

A common factor shared by all of the dictionaries discussed here is a motivation to unify Basque, and hence, to create a standard, and to spread the knowledge about the lexical richness of Basque, which also
can be taken as a (softer) type of ‘reformist’ purism as described by Thomas (1991: 79). According to the stages of standardisation described above, Basque dictionaries can be classified in three groups: (a) those that, since the middle of the eighteenth century, in one way or another, were searching for the appropriate words for a standard (pre-standardisation stage); (b) those that codified the lexicon from the institutional background of the Basque Language Academy (standardisation stage); and (c) those that are contributing to the further elaboration and educational spread of the lexicon of Standard Basque in a post-standardisation setting, i.e. representatives of a reformist purism concerned with the quality of text production. In the following, we look at the lexicographic production in the three stages.

2. The search for a standard lexicon (1745-1968)

Three attempts deserve to be mentioned here. Each one set out its own criteria when choosing the most appropriate words for a standard lexicon, i.e. the most suitable words for a cultivated language. Each dealt with purism in a different way.

2.1 Larramendi 1745

The trilingual dictionary compiled by the Jesuit Manuel de Larramendi, Diccionario Trilingüe del Castellano, Bascuence y Latín (LAR1745), was the first printed Basque dictionary, and the main reference in Basque lexicography until the beginning of the 20th century. Larramendi’s dictionary exerted a great influence over the language used by Basque religious and secular writers for a century and a half. The publication of Larramendi’s Basque grammar (1729) and his dictionary is what ‘separates the new from the old age’ (Mitxelena 1984). Larramendi had a dual objective in mind when compiling his dictionary: on the one hand, his goal was to fight against the detractors of the Basque language by demonstrating that Basque is as rich in lexical resources as Spanish or Latin.
On the other hand, he was concerned with the quality of oral and written text production. Larramendi wanted Basque preachers and writers to embrace the conviction that Spanish borrowings should not be used anywhere, particularly when Basque-native words were available: unnecessary borrowings had to be rejected. Larramendi shows a concern for purity, for a language as pure and unmixed as possible, which had to be protected from corruption and decay, a concern typical of standard language cultures (Milroy 2001). To fulfil this dual objective, for every one of the 43,000 Spanish headwords in the main reference dictionary of the time (RAE1726) he provides a corresponding equivalent in Basque, thus ‘proving’ that Basque had the same range of semantic expressiveness as Spanish. The dictionary contains approximately 40,000 different Basque lemmata (Urgell 2000: 5).

In the preface to his Trilingual Dictionary, Larramendi provides the keys to his puristic process of choosing Basque equivalents. He would accept any word occurring in Basque literature, whatever its origin, i.e., whether it was identifiable as a loan or not, and from whatever dialect. Terms of foreign origin that had been borrowed from Latin in distant centuries, such as the vocabulary of Christianity – aingeru ‘angel’, eliza ‘church’, meza ‘mass’ or obispo, apezpiku ‘bishop’, etc. – are also entered as equivalents of Spanish ángel, iglesia, misa or obispo. Hence, Larramendi’s purism does not reject loanwords that are well-established in Basque. On the other hand, Larramendi rejects widely used borrowings that he felt to be unnecessary, preferring in many cases a synonym that was also in use but not identified as a loan, such as egiazko (and not berdadero, from Spanish verdadero) ‘true’, damu (and not dolore) ‘regret, repentance’, and irakurri (and not leitu) ‘to read’. As for terms, Larramendi also coined his own neologisms, compound and derived words.
For terms belonging to domains where Basque at the time was not normalised, he sometimes proposed a purist neologism, such as jainkokinde (jainko-kinde, ‘God’ and derivational suffix) ‘theology’, or izarkinde (izar- being ‘star’) ‘astrology’, instead of the internationally widely used terms teologia and astrologia. These newly-coined compound nouns would be more easily understood than the Greek loans. Thus, where his work is a contribution to a functional development of Basque, Larramendi’s purism is a stance in favour of self-intelligible, transparent assemblages of Basque morphological components. With this criterion in mind, he also rejects some loanwords that had been used in literature and replaces them with a purist but transparent synonym: kondaira (konda-era, ‘way of telling’, fairy tale) instead of historia, zenbate (‘number’, zenbat-te ‘how much’, ‘how many’ and collective suffix) instead of numero or biltoki (bil-toki, ‘gather place’) instead of teatro. Few of this last group are found in frequent use nowadays; but there is no doubt that Larramendi’s Trilingual Dictionary, in line with its intention, can be called a standard-creating and even modernising (Zgusta 1989) attempt. Furthermore, Larramendi’s reformist purism, his prejudice in favour of transparency, may be regarded as contrary to an elitist approach.

According to SEH1996, 3515 lemmata (including derived words and compounds) are documented in LAR1745 for the very first time. Of these, 1535 have 20 or more appearances in the ETC reference corpus, which is made up of contemporary texts (see section 3.4 for reference). All in all, Larramendi’s contribution is similar to that of other lexicographers in other minoritised languages, e.g. that of Evans for Welsh (Löffler 2003: 71).
2.2 Arana and his followers

Larramendi’s influence vanished in the late 19th and early 20th century, when Sabin Arana Goiri (the founder of the Basque Nationalist Party, EAJ-PNV) started a crusade to ‘clean’ Basque of any supposed loan, even of those adopted long ago. The uniqueness of the Basque language had to be emphasised; every foreign word banned. Widely used terms like aingeru ‘angel’, eliza ‘church’, meza ‘mass’ or apezpiku ‘bishop’ were substituted with new coinages gotzon < gogo huts on ‘spirit pure good’, txadon < etxe done ‘house holy’, jaupa < Jaun opa ‘Lord offering’ and gotzain < gogo zain ‘soul keeper’, respectively (cf. Pagola 2005). Thus, Arana Goiri’s coinages can be taken as examples of xenophobic (Thomas 1991) or sanitary purism (Milroy 2005). Arana Goiri coined new words by applying the morpho-phonological rules that occur in Basque composition and derivation (cf. Oñederra 1990) – but did so in such a peculiar way that the outcome is totally opaque for speakers. In these proposals, the main motivation was not transparency for an easy understanding, but a purist ‘renewal’ and rejection of everything ‘foreign’. This type of purism is closely linked to romanticism and nationalism, as was common in a wide range of language communities in Europe in the late 19th and early 20th century; but the approach can also be regarded as elitist, i.e. not intelligible for the lay person (cf. Thomas 1991: 43-45).

As for terms, Arana rejected almost all of Larramendi’s proposals. Although they had been created transparently by means of Basque word-formation rules, Arana did not spare them from being substituted with other new coinages: zenbaki ‘number’, lutelesti ‘geography’, edesti ‘history’ or antzoki ‘theatre’ (zenbat-ki ‘how much’, ‘how many’, and object marking suffix; ludi-eres-ti, ‘world-account’ and collective suffix; eres-ti, ‘tale’ and collective suffix; antze-(t)oki, ‘art-place’).
In addition to excluding any foreign word, Arana claimed that a dictionary must establish the exact form of each native word. In this pursuit, he took into account not the written tradition, but his own feelings. For example, he favoured forms such as odoldau, lotsage or argiztu over the widely-used odoldu (odol-du blood-PTCP, ‘stained with blood’), lotsagabe (lotsa-gabe, ‘shame-less’) and argitu (argi-tu light-PTCP, i.e. ‘lit (up)’). Arana himself did not compile any dictionary, but his proposals were gathered in Bera-Mendizabal’s dictionaries (BEM1916) and influenced literary production until the Civil War (1936-1939) and even afterwards, until the late 1950s. These dictionaries were re-edited several times as late as 1975; they contain mainly the headwords of Azkue’s dictionary (see below), and the proposals of Arana and his followers.

As regards the survival of Arana’s new coinages, those that were intended to replace accepted loan words mostly failed, although some of them survived, such as gotzain, ‘bishop’ (see above) or olerki, ‘poem’ (ol-eder-ki ‘thought-beautiful’ and object suffix). Those created to supplant Larramendi’s coinages, however, mostly succeeded and entered the standard dictionary. The purist neologisms proposed by Arana Goiri and his followers roughly fall into three domains: politics, religion and everyday life. Table 2 contains a list of the 25 most frequent neologisms proposed by Arana, their translations into English, and their relative frequency (i.e. the percentage value of occurrence) and rank in the ETC reference corpus (see section 3.4).

Table 2. Arana’s neologisms most used today, relative frequency and rank in ETC corpus.
| Basque | English | Part of Speech | Frequency (%) | Rank |
| --- | --- | --- | --- | --- |
| alderdi | party | noun | 0.0804 | 131 |
| idatzi | write | verb | 0.0628 | 188 |
| jaurlaritza | Government | noun | 0.0391 | 308 |
| abertzale | patriot | adjective, noun | 0.0335 | 367 |
| hauteskunde | election | noun | 0.0330 | 376 |
| ordezkari | representative | noun | 0.0308 | 401 |
| batzorde | council | noun | 0.0306 | 404 |
| argazki | photograph | noun | 0.0261 | 478 |
| askatasun | freedom | noun | 0.0252 | 495 |
| lehendakari | president | noun | 0.0200 | 623 |
| zenbaki | number | noun | 0.0192 | 649 |
| idazkari | secretary | noun | 0.0158 | 777 |
| espetxe | prison | noun | 0.0153 | 806 |
| antzerki | theatre (play) | noun | 0.0147 | 840 |
| aske | free | adjective | 0.0121 | 982 |
| ikastola | school | noun | 0.0119 | 996 |
| aldundi | (province) Government | noun | 0.0113 | 1,027 |
| antzoki | theatre (room, building) | noun | 0.0080 | 1,358 |
| hautetsi | elect | verb, noun, adjective | 0.0067 | 1,540 |
| ikur | sign; emblem | noun | 0.0060 | 1,648 |
| bazkide | member | noun | 0.0058 | 1,685 |
| abestu | sing | verb | 0.0047 | 1,925 |
| egutegi | calendar | noun | 0.0044 | 2,002 |
| aberri | homeland | noun | 0.0039 | 2,182 |
| eguazten | Wednesday | noun | 0.0029 | 2,612 |
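The frequency and rank columns of Table 2 can in principle be derived from raw lemma counts. The following Python sketch is only an illustration under stated assumptions, not the authors' actual procedure: it assumes a corpus already reduced to a flat list of lemmas, takes relative frequency to be the percentage of all tokens (as the text describes it), and ranks lemmas by descending count. The function names and the toy token list are hypothetical; only the three percentages in `table2` are copied from Table 2.

```python
from collections import Counter


def relative_frequencies(tokens):
    """Map each lemma to (relative frequency in %, rank by descending count)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    # Sort by descending count; ties are broken alphabetically so that
    # the ranking is deterministic.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return {
        lemma: (100.0 * count / total, rank)
        for rank, (lemma, count) in enumerate(ordered, start=1)
    }


def per_million(percentage):
    """Convert a percentage of tokens into occurrences per million tokens."""
    return percentage * 10_000


# Toy token list, invented purely for illustration (this is NOT ETC data).
toy_tokens = ["eta", "eta", "eta", "alderdi", "alderdi", "zenbaki"]
toy_stats = relative_frequencies(toy_tokens)

# Percentages copied from Table 2, re-expressed per million tokens.
table2 = {"alderdi": 0.0804, "zenbaki": 0.0192, "eguazten": 0.0029}
table2_per_million = {
    lemma: round(per_million(pct)) for lemma, pct in table2.items()
}

for lemma, value in table2_per_million.items():
    print(f"{lemma}: about {value} occurrences per million tokens")
```

Read this way, a relative frequency of 0.0804 for alderdi corresponds to roughly 804 occurrences per million tokens, which may be easier to compare with frequency lists for other languages.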
2.3 Azkue 1905-06

While Arana and his followers were ‘sowing’ the literary field with their new coinages, Azkue set out searching for the words really used by Basques: town by town, village by village, but also trawling for the words used in Basque literature. As a result of this search, he included in his Diccionario vasco-español-francés (AZK1905) loans such as meza ‘mass’, apezpiku ‘bishop’, eliza (Lat. ecclesia) ‘church’, aingeru ‘angel’, or lege (Lat. lex) ‘law’. When entering these headwords, he flagged them with double question marks, since he also found that the origin of these words made them only liminally Basque. Nevertheless, as Sarasola (2002) points out, other loan words with a long tradition in use were excluded from his dictionary, even, for instance, other religious words such as fede ‘faith, confidence’. Azkue placed oral use before literary use, and consequently rejected many words present in Larramendi’s dictionary because he had no means to ascertain which ones had been common words used by lay people in the 18th century, and which ones had simply been created by Larramendi. For similar reasons he rejected Arana’s neologisms. In other words, Azkue did not want to accept in his dictionary novel, rare coinages intended to replace others that were well-established in the language: he recorded the words really used by Basque people and by Basque writers. However, Sarasola (2002) considers Azkue a purist, in the sense that he weighted his findings from field research according to his personal preference. SEH1996 marks 1,422 lemmata, including derived and compound words, as first documented in AZK1905, 495 of which are found 20 or more times in the ETC reference corpus. That means that about two thirds of the lemmata first documented in AZK1905 are very rare words today, if used at all.
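The ‘about two thirds’ estimate can be checked against the figures given in the text, and set beside the corresponding figures for LAR1745 from section 2.1. The short Python sketch below does only that arithmetic. The function name and the dictionary layout are illustrative choices; the threshold of 20 or more occurrences in the ETC corpus is the one stated in the text.

```python
def share_in_use(first_documented, attested_20_plus):
    """Percentage of first-documented lemmata with 20+ corpus occurrences."""
    return 100.0 * attested_20_plus / first_documented


# (lemmata first documented, of which with 20+ occurrences in ETC),
# as reported in the text on the basis of SEH1996.
figures = {
    "LAR1745": (3515, 1535),
    "AZK1905": (1422, 495),
}

in_use = {name: share_in_use(total, kept) for name, (total, kept) in figures.items()}
rare = {name: 100.0 - share for name, share in in_use.items()}

for name in figures:
    print(f"{name}: {in_use[name]:.1f}% in current use, {rare[name]:.1f}% rare or unused")
```

On these figures roughly 65% of the lemmata first documented in AZK1905 fall below the threshold, which matches ‘about two thirds’, against roughly 56% for LAR1745.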
Azkue’s work influenced Basque lexicographers and linguists; but Bera-Mendizabal’s bilingual dictionary (BEM1916, see section 2.2) remained a reference for writers and for lay people until the late 1970s.

3. Codification of standard Basque (1968-2000)

Controversy over the status of loans and neologisms continued after the foundation of the Academy of the Basque Language, Euskaltzaindia, in 1919, until 1959, when the Academy approved a declaration stating that the use of words, not their language of origin, was the main criterion to be taken into account for the admission of new words. In light of this declaration, words widely used in Basque literature are deemed Basque words, no matter where they come from. This implies, for instance, that all the words in Table 2, in spite of their origin, were accepted for the Dictionary of the Academy (cf. section 3.2). As for new words required by modern life, the Academy preferred to resort to compounding and derivation, but did not forsake the other option, that is, loan words (Azkarate 2008).

After 1968, new bilingual dictionaries (with Spanish as their target language) faced the same challenges as Larramendi in the 18th century: to find a Basque equivalent for each Spanish word sense. These works sought, again, to promote and enrich the minority language, making it a standard language that could fulfil every function in formal and in informal settings. Nearly three quarters of a century separate the publication of Azkue’s and Bera-Mendizabal’s dictionaries in the 1900s-10s from the publication of the General Basque Dictionary (OEH1986). Late in this interval, the influence of Kintana’s (KIN1977) and Múgica Urdangarín’s (MUG1977) dictionaries deserves consideration. At this point, the first standardisation of the language had been undertaken, and the use of Basque in public had become legal in the Spanish territories: a new generation entered the Basque ‘arena’.
In this environment, Kintana’s proposals were well received: his neologisms were used in textbooks, in government, and in the media. To measure the extent to which the two dictionaries make proposals with no attestation in written texts, Sarasola has studied the rate at which Kintana and Múgica agree on equivalents for prefixed Spanish words, finding that they agree less than 50% of the time (Sarasola 2003). Thus, both dictionaries (and other works compiled during the 1980s and 1990s, such as a couple of Basque encyclopaedias) can be taken as standard-creating, with a creative approach to the enrichment and modernisation of the Basque lexicon. The approach is somewhat ‘playful’ as Thomas understands it (1991:75–83), since it cannot be entirely evidence-based in the sense that it does not consider corpus data. The terminological dictionaries compiled by UZEI (Basque Center for Terminology and Terminography) from 1977 on were likewise not based on corpus data. Rather, they were seen as instruments that would allow the introduction of Basque in university-level teaching.
3.1 Orotariko Euskal Hiztegia (General Basque Dictionary)
Compiled by Koldo Mitxelena and Ibon Sarasola for the Academy of the Basque Language, this dictionary is the first Basque dictionary based on an electronic text corpus. The corpus includes almost every word ever published in print, that is, the wealth of published text from any period and dialect until around 1980. The first edition was finished in 1986. The aim of the authors is to describe the Basque lexicon as a whole: to account for the words used by Basque writers, but also for the words contained in Basque dictionaries. This comparison allows the reader to realise which words belong almost exclusively to the realm of lexicography and which ones have really been used by Basque writers – and to what extent.
There are no new proposals, no new coinages: just an honest picture of Basque lexical practice, as complete, neutral, rich, and far-reaching as possible. The picture is drawn by an overwhelming number of hand-selected corpus-based usage examples, each with its bibliographical reference. The lemmata are not discussed according to any puristic criteria: it is just a matter of showing which words and expressions have been used in written language. OEH1986 contains 95,000 headwords (entries and subentries), 48,800 of which are found neither in the reference corpus ETC nor in the big web corpora (see section 10 for references), and thus must be regarded as historical or lexicographical. Updating of OEH is still ongoing:[4] the fifth version, enriched with entries corresponding to legal and administrative texts, is now available. The online version contains dictionary articles for 145,318 lemmata, and the entries also contain translation equivalents in Spanish and/or French. The lemmatisation criteria in OEH obey the guidelines for orthographical standardisation which Euskaltzaindia has been releasing from 1968 onwards (see section 9 below); in this regard, the OEH dictionary reflects reformist purism policies for a standardised unification of Basque.
3.2 Euskaltzaindiaren Hiztegia
Once the OEH corpus was in place, the Academy of the Basque Language faced both the task of choosing words and that of compiling its prescriptive dictionary of the standard variety: Euskaltzaindiaren Hiztegia (EHI2016, ‘the Academy’s Dictionary’), which in its present version contains 37,884 entries, 6,944 subentries and 61,398 word senses.[5] This lemma list represents the codification of the standard lexicon; Euskaltzaindiaren Hiztegia is thus to be characterised as a straightforwardly prescriptive dictionary. Following Mitxelena’s advice, the corpus created for OEH1986 enabled the selection of words according to their frequency in the literary tradition.
But words required by present-day discourse could not be neglected. What follows is a summary of the criteria followed by the lexicography board at Euskaltzaindia: Different dialectal forms of the same lemma were standardised in such a way that they could be understood by most speakers: in many cases, not the most frequent form, but an older form that was common to various dialects. For terms, new coinages (usually compounds or derived words) were accepted, provided that they were well formed (haragi-jale, literally ‘meat-eater’, ‘carnivorous’; gainzama ‘excess load’; historiaurre ‘prehistory’). International loanwords were accepted, since they would be easily understood by most Basque speakers (intsektizida ‘insecticide’, eskizofrenia ‘schizophrenia’, eszeptiko ‘sceptic’, neurona ‘neuron’…). Sometimes both forms coexist, the loanword and the new coinage: herbiboro and belarjale ‘grass eater’; akordeoi and eskusoinu ‘accordion’ (lit. ‘hand sound’). Occasionally, a newly coined transparent compound or derived term is selected and the loanword is rejected (apikultura* > erlezaintza ‘bee keeping’). The requirement for a loanword to be accepted is that it must be understandable to speakers on both sides of the Pyrenees: e.g. hipoteka ‘mortgage’ entered the Academy’s Dictionary as it is used in both Spanish and French. Problems arise when southern Basques use a Spanish loanword in contexts where northern Basques use a French loanword. In those cases, both loanwords entered the Academy’s Dictionary together with the corresponding marker North or South: koaderno Heg / kaier Ipar ‘notebook’. Unable to select a word based on the practices of a majority of dialect groups, the Academy preferred to select words that are truly used instead of coining terms that have had no tradition at all. This also served to inform Basques from one side of the Pyrenees about the words used on the other. When only a Spanish loanword was attested, two approaches were taken.
In some cases, the marker South indicates that the word is not suitable for the standard variety, but that it is used by southern Basques: teklatu ‘keyboard’, hormigoi ‘concrete’. In other cases, when a Spanish loanword is found to be very frequent, a new coinage is proposed for the standard variety: *bainera > bainuontzia (‘bath tub’), *boligrafo > bolaluma (‘pen’), *motxila > bizkarzakua (Sp. mochila, ‘backpack’), *labadora > garbigailua (Sp. lavadora, ‘washing machine’). We take a more detailed look at this type of proposal in the following section.
3.3 Terminology glossaries and dictionaries
Continuing the work on terminology started by UZEI in 1977, in 2003 the Basque Government created the Terminology Commission. Its members are experts from universities, government, the Academy or from organisations devoted to terminological tasks. The outcome of their work is gathered in the Basque Public Term Bank, which currently holds about 500,000 terminological records.[6] Terms are constantly updated according to the Euskaltzaindia guidelines and the proposals for terminology normalisation. New coinages follow the phonological and morphological structure of Basque and its rules for compounding and derivation. In addition, the Commission’s guiding criterion is real usage: if a term is widely used among experts of the relevant field, it should not be replaced by a new coinage. The terminographical task is based on corpus evidence whenever the existence of written documents in a field of specialisation allows it. Today, Basque is used in a normalised way in a wide range of domains, and specialised corpora are available. However, there is still room for a reformist purism, in order to promote the use of Basque in semi-normalised fields (sports, pottery, or social media, to cite but a few of the domains of recently compiled specialised dictionaries).
3.4 Success of purist proposals: Corpus-based measurements
As described by sources like Thomas (1991:164–170), there are many approaches to studying purism through quantitative data. One may compare lemma lists that indicate ‘foreign’ and ‘non-foreign’ in order to characterise the composition of a language’s vocabulary. With diachronically-indexed text corpora and relevant processing tools, it has also become possible to systematically measure variations in frequency of use over time. For big languages like German we can count on these resources and display charts such as Figure 1, where we show the frequency over time of a set of synonyms: Telephon/Telefon, an internationally-used coinage of a Greek term in two orthographic variants, vs. Fernsprecher (‘far-speaker’), a purist neologism.
Figure 1. Frequency of a set of synonyms over time (German).[7]
This allows us to judge the success of the purist proposal Fernsprecher, in absolute terms as well as in relation to its loan synonyms, and to consider their performance over time in relation to extralinguistic factors like language planning policies. When diachronically-indexed corpora of suitable composition and size are available, two methodological steps are necessary: first, the definition of synonym sets; second, the tagging of their components according to lexical features related to purism. Minimally speaking, and for simple comparisons like that in Figure 1, these features may be binary, such as ‘foreign’ versus ‘non-foreign’. But they may also involve more sophisticated taxonomies, such as those proposed by Thomas (1991:73). The resources available today for Basque do not enable comparative studies like those exemplified here for German, but they do allow us, for instance, to compare the counts of the same lemma in different corpora. Lexikoa atzo eta Gaur ‘lexicon yesterday and today’ (LAG, Sarasola et al.
2008)[8] is a web tool that shows counts in (1) subcorpora of the OEH corpus (18th century, 19th century, and 1900-1968) and (2) the EPG (Ereduzko Prosa Gaur ‘Exemplary Prose Today’) corpus, which contains hand-selected reference prose from 1968 onwards. The tool displays the counts of the queried lemma (a) for different authors, that is, the number of writers that use the lemma, and (b) as occurrences in the text and, consequently, the number of sentences in which the lemma is used. If we look at the Basque counterpart of the German example cited above, the LAG data for the synonym pair telefono and urritizkin (the literal ‘far-speaker’, an often cited purist proposal dating from 1915) show a constant frequency increase for telefono from the 19th century until today. In contrast, urritizkin, coined in 1915, did not last long after 1968, when it shows a clear decline. This purist neologism shares with German Fernsprecher the fact that, according to corpus data, there is no period in which it was used more frequently than its loan synonym, and that the promotion of its use was related to romanticist attitudes about national ‘renewal’. Today, the international term outnumbers its purist synonym by far, and both the German and the Basque purist ‘far-speaker’ neologisms can be characterised as markers of a marginal, traditionalist discourse. Following the methodology described above, we can define sets of synonyms and tag their constituents as ‘purist neologism’ vs. ‘loanword’. We can then compare the frequency of their appearance in a set of dictionaries (a ‘lexicographical corpus’) and in big electronic text corpora, like Egungo Testuen Corpusa (ETC ‘Present day Texts Corpus’, Sarasola et al.
2013) and the Elhuyar web corpora (Leturia 2012, Leturia 2014).[9] For a selection of everyday words like the ones mentioned in section 3.2, we obtain the results displayed in Table 3, where synonymous lexical items appear in order of frequency (the average of relative frequencies in ETC and the Elhuyar web corpora), together with their mention in some lexical resources (dictionaries and others). Orthographic variants of the loanword are indicated in bold type. The labels ‘beh.’ (behe-mailakoa, ‘low register’), ‘goi.’ (goi mailakoa ‘high register’), and ‘heg.’ (Hegoaldea ‘south-of-Pyrenees dialect’) are usage markers; the (*) refers to a mark in the Academy’s EHI2016 that indicates a preference for its deprecation.

Table 3. Variants of modern everyday life terms as synonym sets.
Lexical item  Rank  EHI2016  ZEH2005  ELH2006  ADO2009  OEH1986
‘dummy’, ‘pacifier’ (baby)
  txupete 1 x (heg.) x (beh.) x (heg.) x
  txupaki 2 x x
  chupete 3
  zupaki 4 x
  xurgaki 5 x x
  ttunttun 6 x
  edoskai 7 x
‘corkscrew’
  sakakortxo 1 x
  sacacorchos 2
  sakakortxos 3
  kortxokentzeko 4
  kortxo_torloju 5 x x
  kortxo_ateratzaile 6
  kortxokentzaile 7
  kortxo_irekigailu 8 x
  tapoi_irekigailu 9
  kortxo_kentzeko 10 x (heg.) x x
  kortxu_ateratzaile 11 x
  tapoi_kentzeko 12 x x
‘bathtub’
  bainuontzi 1 x x x x
  bainera 2 x (*)
  bañera 3
  bainu-ontzi 4
‘washing machine (clothes)’
  garbigailu 1 x x x x x
  labadora 2
  ikuzgailu 3 x x (goi) x x x
  labadoria 4
  ikuzmakina 5 x
  labadorie 6
  garbi-gailu 7

Looking at the examples, we observe that if the Academy and the other recent lexical resources agree on a neologism, it is likely to appear more often in the reference corpora than its loan counterpart.
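The ranking criterion used for Table 3 (the average of a variant’s relative frequencies in ETC and in the Elhuyar web corpora) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used to build the table: the class `Variant`, the function `rank_variants` and, above all, the raw counts are hypothetical placeholders. Only the corpus sizes (200 million tokens each, as given in the footnote on ETC and the 2014 Elhuyar web corpus) and the forms of the ‘bathtub’ set come from the article.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    form: str     # lemma or orthographic variant
    tag: str      # 'purist neologism' or 'loanword'
    counts: dict  # corpus name -> raw occurrences (placeholder values below)


def rank_variants(variants, corpus_sizes, per=1_000_000):
    """Rank the members of one synonym set by the mean of their relative
    frequencies (occurrences per `per` tokens) across the given corpora."""
    scored = []
    for v in variants:
        rel = [v.counts.get(name, 0) * per / size
               for name, size in corpus_sizes.items()]
        scored.append((sum(rel) / len(rel), v))
    # Highest mean relative frequency first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(rank, v.form, v.tag, mean)
            for rank, (mean, v) in enumerate(scored, start=1)]


# Corpus sizes as stated in the article's footnote (200 million tokens each).
CORPUS_SIZES = {"ETC": 200_000_000, "Elhuyar": 200_000_000}

# PLACEHOLDER counts, invented for illustration only;
# they are NOT the figures behind Table 3.
bathtub = [
    Variant("bainera", "loanword", {"ETC": 200, "Elhuyar": 300}),
    Variant("bainu-ontzi", "purist neologism", {"ETC": 10, "Elhuyar": 30}),
    Variant("bainuontzi", "purist neologism", {"ETC": 900, "Elhuyar": 700}),
    Variant("bañera", "loanword", {"ETC": 20, "Elhuyar": 120}),
]

ranking = rank_variants(bathtub, CORPUS_SIZES)
for row in ranking:
    print(row)
```

Averaging relative rather than raw frequencies keeps corpora of different sizes comparable; with two corpora of equal size, as here, the resulting order is the same as that given by the summed raw counts.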
If, on the other hand, the dictionaries do not agree on the term to be used, the most frequently-used word will be the loanword. For ‘comforter’, the Spanish loan txupete, in spite of being marked as ‘southern’ or ‘low register’, in its two graphical variants combined, is used far more often than any of the various purist proposals. In the case of ‘corkscrew’, EHI2016 offers a recommendation marked as ‘southern’, and ELH2006 is the only resource that clearly agrees, while the other sources show quite a diverse picture. On the other hand, ‘bathtub’ and ‘washing machine’ do have a consensus recommendation for a purist equivalent: in these cases, although the orthographic variants of the Spanish loan immediately follow the recommended term on the rank list, their combined frequencies are no match for the frequency of the purist leader. Nevertheless, neologisms such as those in Table 3 are mainly used when the standard variety is required, both in written and oral modalities. Informal oral speech is frequently characterised by loanwords.
4. Elaborating the code: From prescription to proscription
The standardisation of the Basque lexicon is an ongoing process. By the 1990s all published dictionaries – still limited to bilingual dictionaries – were edited according to Standard Basque. In 1994, Sarasola finished the first edition of his monolingual Basque dictionary, entitled from the second edition on Euskal Hiztegia, ‘Basque Dictionary’ (SEH1996[10] and SEH2007), which has become a main reference work. The entries of this dictionary contain the date of the first documented use of a lemma in lexicography and literature, grammatical information, definitions, synonyms and usage examples from the OEH corpus. In many ways, therefore, this dictionary is a monolingual descendant of the multilingual General Basque Dictionary OEH1986.
However, its lemma list contains more than 10,000 items not found in OEH1986, and among these around 5,000 have been documented after AZK1905.[11] Just like its multilingual antecedent, it is a purely descriptive dictionary: its aim is to give a picture of the written language in use, offering large numbers of referenced corpus-based usage examples. If we wanted to find any reformist motivation behind this piece of work, it would be the enrichment of Basque in use by promoting the whole range of the language’s lexicon as part of its lexical richness. Ibon Sarasola has also compiled a Spanish-Basque dictionary for translation and Basque text production addressed to a Spanish-Basque bilingual user (ZEH2005), also available online.[12] This dictionary cannot be called descriptive in the sense that the selection of its Basque lexical items is necessarily corpus-based. For translation purposes, all Spanish word senses must be furnished with Basque equivalents, so lexicographers must provide at least one equivalent, even if they fail to find any in existing reference corpora. In those cases where a word sense may be translated by more than one equivalent, lexicographers must decide whether to include all members of this synonym set, or whether to mark any of the lexical items according to register or regional distribution. Sarasola indicates this in the dictionary entry for chupete, ‘comforter’, shown in Figure 2, where the direct, non-assimilated loan is marked as beh., ‘low register’, while an assimilated version of the loan, txupaki, appears unmarked next to the more purist proposal, xurgaki, which does not carry any further mark either (cf. also the data in Table 3 above).
Figure 2. ZEH2005, s.v. chupete.
The other two main contemporary Spanish-Basque reference works follow a similar method. All equivalents listed in the dictionary entries are found in corpora.
Therefore, these dictionaries cannot be considered prescriptive. But, on the other hand, the classification ‘descriptive’ is not strictly suitable for these three bilingual dictionaries either, since the lexicographer marks some equivalents as preferable to others, or, as we have just remarked, has to fill lexical gaps in order to provide equivalents for all Spanish word senses. It seems to us that this ‘third path’ in between description and prescription runs close to what Bergenholtz and Gouws describe as ‘weighed description’ or proscription:
Proscription allows the same possibilities for the empirical basis as description […]. However, the results of empirical analysis are dealt with in a different way compared to a descriptive approach. In this regard the most salient distinction lies in the fact that the lexicographer does not only provide the results from the empirical analysis but goes further by indicating a specific variant that he/she regards as the recommended form. (Bergenholtz and Gouws 2010:36)
This paradigm shift from prescription to proscription seems to be characteristic of the post-standardisation stage that the Basque language has reached. In accordance with Thomas’ model (1991:121–122), both the stringency of purism and the effort in prescription at this stage have been relaxed. More than by any conviction that ‘the battle has been won’ in favour of Basque, a language formerly regarded as a vernacular and bound to disappear, this may be explained by a growing self-assurance among Basque speakers, who nowadays do not look to a dictionary as often when they write a text as they would have done twenty or more years ago. Consequently, the Basque lexicographer Sarasola urges a softening of institutional prescription efforts, along with a professionalisation of corpus-based proscriptive lexicography:
We used to propose correctness, and we didn’t achieve it. It is usage which is important, that is, the word that is used.
We have to develop the Basque language in the shade of Spanish. You may say, a certain word goes against unity, but what if nobody cares? What shall you do then? Break out in tears? We have to achieve unity, yes, but with the most used words. There was a time we proposed ‘krokodilo’ as correct form [sc. not ‘kokodrilo’ like in Spanish], and everybody took notice of that. But today, an approach like that doesn’t work. So, what shall we do? We shall resign and save what can be saved. Usage prevails, and we have to respect that. That is why corpora are so important. (Ibon Sarasola, 2015 speech at UPV/EHU; translation and brackets M.A./D.L.)
Egungo Euskararen Hiztegia, ‘The Dictionary of Contemporary Basque’[13] (EEH2007, see an example entry in Figure 3) is an online dictionary of Basque as it is used today, based entirely and only on EPG[14] (Sarasola et al. 2007), a hand-selected corpus of contemporary reference prose with 25 million tokens. The project to edit this dictionary has been ongoing since 2007, and will take several more years to complete.
Figure 3. EEH2007, s.v. hiztegigintza, ‘lexicography’.
With time, additional item types such as grammatical and diachronic information will be incorporated. In its present form, the dictionary offers entries consisting of definitions, synonyms and corpus-derived usage examples for the word senses of the lemma (cf. Figure 3, which shows the entry ‘lexicography’). This dictionary follows a purely descriptive approach, and consequently, only lemmata found in the corpus enter the dictionary as headwords.
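The principle that only lemmata attested in the corpus become headwords can be sketched in a few lines. The example is a toy illustration under our own assumptions: the function `select_headwords`, the threshold parameter `min_count` and the token list are hypothetical, and nothing here reproduces the actual EEH2007 workflow or EPG data.

```python
from collections import Counter


def select_headwords(lemmatised_tokens, min_count=1):
    """Return the sorted lemmata attested at least `min_count` times."""
    counts = Counter(lemmatised_tokens)
    return sorted(lemma for lemma, n in counts.items() if n >= min_count)


# Toy lemmatised corpus (hypothetical tokens, not drawn from EPG).
toy_corpus = ["hiztegi", "telefono", "hiztegigintza", "hiztegi", "telefono"]

# Candidate lemmata, e.g. taken from older dictionaries; one of them
# does not occur in the toy corpus at all.
candidates = ["hiztegi", "telefono", "urritizkin"]

headwords = select_headwords(toy_corpus)
excluded = [c for c in candidates if c not in headwords]

print(headwords)  # lemmata that would become headwords
print(excluded)   # candidates left out for lack of attestation
```

In this setting a word known only from lexicography never enters the lemma list, which is the practical difference between a corpus-driven dictionary and one that also records dictionary-only items, as OEH1986 does.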
Terminology is undergoing a shift from prescription to proscription with the program Terminologia Sareak Ehunduz (‘Weaving Terminology Networks’), which aims at making the real terminology and phraseology of specialised communication visible to experts, to linguists and to participants in language normalisation initiatives. Terms are extracted from real texts employed in university teaching.[15]
5. Conclusions and further work
In this article, we have revisited Basque lexicography from the point of view of purism. We have asked what kinds of purist intentions motivated Basque lexicographers. We have proposed a rough periodisation of Basque lexicography according to the stages of language standardisation, showing a relation of these stages to a functional classification of lexicographical products. Based on lemma frequency data from different corpora for small example sets of purist neologisms and their loanword synonyms, we have proposed two hypotheses on the success of purist terminology proposals. First, as regards the purist term production of Sabin Arana and his followers in the early 20th century, we have seen that newly-coined terms that filled lexical gaps, that is, that had no established loanword synonym to replace, were bound to succeed, while purist proposals that corresponded to already-established internationalisms are not among the most frequently used neologisms of this period. Second, we have compared the frequencies of some terms necessary in modern life, in their loanword and their purist variants dating from a period of standardisation backed by the Language Academy. Here we have observed that in cases where all main dictionaries agree on the same proposed neologism, this neologism is much more likely to succeed over its borrowed synonym than when the main reference resources do not propose the same term. In the latter cases, the loanword leads the frequency ranking.
Thanks to the ongoing introduction of Digital Humanities methodology into Basque philology, more and more of the textual and lexicographical sources referenced in this article can be found in structured representation formats. This is necessary for further research into the issues developed here, as are the creation of diachronically indexed subsets of the already available Basque reference corpora and the extraction of frequency data for each of them. As soon as the content of paper dictionaries has been retro-digitised and enriched with structural mark-up dealing with all types of information, we will be able to test our hypotheses about the success factors of different sorts of neologisms, in relation to loanword synonyms and as tendencies over time. In this manner we may move from isolated examples to a broad empirical scale: that is, towards taking into account the lexicographical corpus of Basque as a whole.[16]
Footnotes
1 For Basque dialectal variation, cf. Zuazo 2013.
2 The Statute of Autonomy of the Basque Autonomous Community dates from 1979. In that sense, it can be said that the cultivation of the standard variety was linked to a nation-building project (Davies 2012: 56). For the evolution of Basque during the last decades and its present-day situation, cf. the Basque Government's Fifth Sociolinguistic Survey, available online, see http://www.euskara.euskadi.eus/r59-734/en/.
3 For a more detailed account of the process of selection, codification and elaboration of standard Basque cf. Hualde and Zuazo 2007, Salaburu and Alberdi 2012.
4 In its present version, the OEH dictionary can be consulted only online: see http://www.euskaltzaindia.eus.
5 Euskaltzaindiaren Hiztegia can be consulted online: see http://www.euskaltzaindia.eus.
6 It can be consulted online: see http://www.euskara.euskadi.eus/r59-734/en/.
7 Query for Telefon, Telephon and Fernsprecher in the German Google Books corpus (1870-2008, graph smoothing factor 3), Google Ngram Viewer (accessed on March 21, 2018).
8 See http://www.ehu.eus/lag/.
9 ETC and the 2014 Elhuyar web corpus contain 200 million tokens each. While ETC contains hand-selected reference prose, the Elhuyar web corpora are built by automatic methods and thus reach out for a wider range of registers (cf. also Lindemann and San Vicente 2015). ETC can be queried at http://www.ehu.eus/etc/; the Elhuyar corpora at http://webcorpusak.elhuyar.eus/.
10 The contents of this dictionary have also been represented as XML (Arriola … Sarasola 2003), from where we have obtained the data we are referring to.
11 On the other hand, SEH1996 contains not more than 6,100 of the 48,800 historical lemmata and variants present in the corpora used for the compilation of OEH1986 but not in the big reference corpora available today (cf. section 8).
12 See http://www.ehu.eus/ehg/zehazki/.
13 See http://www.ehu.eus/eeh/.
14 See http://www.ehu.eus/euskara-orria/euskara/ereduzkoa/.
15 The outcomes can be consulted online, https://www.ehu.eus/ehusfera/tse/.
16 The research leading to these results has received funding from the European Union's Seventh Framework Programme for research, technological development and demonstration under grant agreement no. 613465, and from the Basque Government (IT665-13). Funding is gratefully acknowledged.
References
Adorez Taldea (ed.). 2009. Adorez 5000 hiztegia. Bostak Bat Lantaldea. (ADO2009)
Azkarate M., Kintana X., Mendiguren X. (eds). 2006. Elhuyar hiztegia: euskara-gaztelania, castellano-vasco. Usurbil: Elhuyar. (ELH2006)
Azkue R. M. 1984 [1905-06]. Diccionario vasco-español-francés. Bilbo: Euskaltzaindia. (AZK1905)
Bera E., López-Mendizabal I. 1916. Diccionario castellano-euzkera / Bera’tar Eroman Mirena aba, buruñurduna. Euzkel-erdel iztegia / López Mendizabal’dar Ixaka. Tolosa: E. Lopez. (BEM1916)
Euskaltzaindia (ed.). 2016. Euskaltzaindiaren Hiztegia. Bilbo: Euskaltzaindia. (EHI2016)
Kintana X. (ed.). 1977. Euskal Hiztegi Modernoa. Bilbo: Cinsa. (KIN1977)
Larramendi M. 1745. Diccionario trilingüe castellano, bascuence y latin dedicado a la M.N. y M.L. provincia de Guipuzcoa. San Sebastián: Bartholomé Riesgo y Montero. (LAR1745)
Mitxelena K., Sarasola I. 1986. Orotariko Euskal Hiztegia – Diccionario General Vasco. Bilbo: Euskaltzaindia; Desclée de Brouwer. (OEH1986)
Múgica Urdangarín L. M. 1977. Diccionario General Y Técnico. San Sebastián: Ed. Vascas. (MUG1977)
Real Academia. 1726. Diccionario de la lengua castellana en que se explica el verdadero sentido de las voces, su naturaleza y calidad, con las phrases o modos de hablar, los proverbios o rephranes, y otras cosas convenientes al uso de la lengua. Madrid: Francisco del Hierro. (RAE1726)
Sarasola I. 1996. Euskal Hiztegia. Donostia: Kutxa Gizarte eta Kultur Fundazioa. (SEH1996)
Sarasola I. 2005. Zehazki: gaztelania-euskara hiztegia, diccionario castellano-euskera. Irun: Alberdania. (ZEH2005)
Sarasola I. 2007. Egungo Euskararen Hiztegia. Bilbo: UPV/EHU. (EEH2007)
Arriola J., Artola X., Arregi X., Díaz De Ilarraza A., García E., Laskurain B., Sarasola K. 2003. ‘Semiautomatic Conversion of the Euskal Hiztegia Basque Dictionary to a Queryable Electronic Form’. T.A.L. journal 44.2: 107–124.
Azkarate M. 2008. ‘Hiztegigintza eta euskararen normalkuntza’. Euskalgintza XXI. mendeari buruz. Iker-19. Euskaltzaindia, 171–182.
Bergenholtz H., Gouws R. 2010. ‘A Functional Approach to the Choice between Descriptive, Prescriptive and Proscriptive Lexicography’. Lexikos 20.1: 26–51.
Boeder W., Brincat J., Stolz T. (eds). 2003. ‘Preface’. In Brincat, J., W. Boeder and T. Stolz (eds). Purism in minor languages, endangered languages, regional languages, mixed languages. Papers from the conference on ‘Purism in the Age of Globalisation’ Bremen, September 2001. Brockmeyer, vii–xiv.
Brunstad E. 2010. ‘Standard Language and Linguistic Purism’. Sociolinguistica 17.1: 52–70.
Crystal D. 1997. The Cambridge Encyclopedia of Language. Cambridge University Press.
Davies W.V. 2012. ‘Myths we live and speak by’. In Hüning M., Vogl U., Moliner O. (eds). Standard Languages and Multilingualism in European History. John Benjamins, 45–69.
Haugen E. 1983. ‘The Implementation of Corpus Planning: Theory and Practice’. In Cobarrubias J., Fishman J. A. (eds), Progress in Language Planning: International Perspectives. Walter de Gruyter, 269–289.
Hualde J. I., Zuazo K. 2007. ‘The standardization of the Basque language’. Language Problems and Language Planning 31.2: 142–168.
Langer N., Davies W. 2005. ‘An Introduction to Linguistic Purism’. In Langer N., Davies W. (eds), Linguistic Purism in the Germanic Languages. Walter de Gruyter, 1–17.
Larramendi M. de. 1729. El impossible vencido. Arte de la lengua bascongada. Salamanca: A.J. Villargordo Alcaráz.
Leturia I. 2012. ‘Evaluating Different Methods for Automatically Collecting Large General Corpora for Basque from the Web’. Proceedings of 24th International Conference on Computational Linguistics (COLING 2012). Mumbai, India, 1553–1570.
Leturia I. 2014. The Web as a Corpus of Basque. PhD Thesis. Donostia: UPV/EHU.
Lindemann D., San Vicente I. 2015. ‘Building Corpus-Based Frequency Lemma Lists’. Procedia - Social and Behavioral Sciences 198: 266–277.
Löffler M. 2003. ‘Purism and the Welsh Language: a matter of survival?’. In Brincat, J., W. Boeder and T. Stolz (eds). Purism in minor languages, endangered languages, regional languages, mixed languages. Papers from the conference on ‘Purism in the Age of Globalisation’ Bremen, September 2001. Brockmeyer, 61–90.
Milroy J. 2001. ‘Language ideologies and the consequences of standardization’. Journal of Sociolinguistics 5/4, 530–555.
Milroy J. 2005. ‘Some effects of purist ideologies on historical descriptions of English’. In Langer N., Davies W. (eds), Linguistic Purism in the Germanic Languages. Walter de Gruyter, 324–341.
Mitxelena K. 1968. ‘Orthography’. In Salaburu P. 2008. Koldo Mitxelena: Selected Writings of a Basque Scholar. University of Nevada, Reno, 253–271.
Mitxelena K. 1984. ‘Hauta-Lanerako Euskal Hiztegia-ren Aurkezpena’. In Sarasola I. (ed), Hauta-Lanerako Euskal Hiztegia. Donostia: Fundación Kutxa.
Oñederra M. L. 1990. ‘Morphonological Aspects of Word Compounding in Basque’. In Boretzky N. (ed). Spielarten der Natürlichkeit, Spielarten der Ökonomie. Brockmeyer.
Pagola I. 2005. Neologismos en la obra de Sabino Arana Goiri. (Iker-18). Bilbo: UPV/EHU; Euskaltzaindia.
Salaburu P., Alberdi X. 2012. ‘The Search for a Common Code’. In Salaburu P., Alberdi X. (eds). The Challenge of a Bilingual Society in the Basque Country. Center for Basque Studies, University of Nevada, Reno, 93–112.
Sarasola I. 2002. ‘Euskal Hiztegigintzaren Historiarako Oharrak: Añibarro, Iztueta eta Aizkibelen Hiztegiez, Eta Azkueren Hiztegigintzaz’. In Artiagoitia X., Goenaga P., Lakarra J. (eds), Festschrift for Rudolf P.G. de Rijk. (ASJU Separata XLIV). Bilbo: UPV/EHU, 611–628.
Sarasola I. 2003. ‘Lexicography in the last quarter of the 20th century up to the publication of the Orotariko Euskal Hiztegia’. In Gorrochategui J. (ed.), Basque and (Paleo)Hispanic Studies in the Wake of Michelena’s Work. UPV/EHU, 219–244.
Sarasola I., Landa J., Salaburu P. 2013. ‘Egungo Testuen Corpusa’. UPV-EHU. http://www.ehu.eus/etc/.
Sarasola I., Salaburu P., Landa J., Ugarteburu I.
2008. Lexikoa Atzo eta Gaur . Bilbo : UPV-EHU . http://www.ehu.eus/lag/. Sarasola I. , Salaburu P. , Landa J. , Zabaleta P. . 2007. ‘Ereduzko Prosa Gaur’ . UPV/EHU . http://www.ehu.eus/euskara-orria/euskara/ereduzkoa/. Thomas G. 1991. Linguistic Purism . Longman . Urgell B. 2000. Larramendiren Hiztegi Hirukoitza-ren osagaiez. PhD Thesis. Vitoria-Gasteiz: UPV/EHU. Van der Sijs N. 1999. ‘Inleiding. De rol van taalzuivering in de taalontwikkeling: historische en politieke aspecten’. In Van der Sijs N. (ed), Taaltrots . Contact , 11 – 36 Vogl U. 2012. ‘Multilingualism in a standard language culture’. In Hüning M. , Vogl U. , Moliner O. (eds). Standard Languages and Multilingualism in European History . John Benjamins , 1 – 42 Google Scholar CrossRef Search ADS Zgusta L. 1989. ‘The Role of Dictionaries in the Genesis and Development of the Standard’. In Hausmann F.J. , Reichmann O. , Wiegand H.E. , Zgusta L. (eds.), Wörterbücher. Dictionaries. Dictionnaires . Erster Teilband / First Volume / Tome Premier. Walter de Gruyter . 70 – 79 . [reprinted in Dolezal, F. and T. Creamer (eds). 2006. Ladislav Zgusta. Lexicography Then and Now. Selected Essays. Max Niemeyer Verlag, 186-197] Google Scholar CrossRef Search ADS Zuazo K. 2013. The Dialects of Basque . Center for Basque Studies, University of Nevada, Reno © 2018 Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices) http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png International Journal of Lexicography Oxford University Press

Basque Lexicography and Purism

International Journal of Lexicography, Volume Advance Article (2) – May 17, 2018

Publisher
Oxford University Press
ISSN
0950-3846
eISSN
1477-4577
DOI
10.1093/ijl/ecy003

Abstract

‘Purism’ can characterise attitudes about a wide range of linguistic phenomena, but the most common forms of linguistic purism are those concerned with the lexicon. When standardisation of language is at issue, questions of purism are unavoidable. Are processes of standardisation necessarily motivated by puristic attitudes? Or is purism a consequence of standardisation? In this paper we consider lexical purism in the standardisation of Basque, a minoritised European language. We offer a rough periodisation of Basque lexicography through the lens of puristic attitudes towards the lexicon in terms of the classification by Thomas (1991). In 18th, 19th, and early 20th century Basque lexicography and terminology we find mostly playfulness, elitism and xenophobia as the salient characteristics of the puristic choices for the standard variety and for terminology modernisation. In the later 20th century, we find a shift to reformist purism. We examine puristic proposals for loanword replacement from the different periods, and we measure their success in contrast to their borrowed counterparts using frequency data extracted from large text corpora.

1. Introduction

According to different definitions given by scholars, linguistic purism covers a wide range of issues. Langer and Davies (2005: 4) compare four definitions of purism (those by Trask 1999, Thomas 1991, Crystal 1997 and Van der Sijs 1999) and conclude that they all ‘largely agree on what purism is: an (influential) part of the speech community voices objections to the presence of particular linguistic features and aims to remove them from their language’. Purism reflects folk-linguistic attitudes in general: members of a speech community share ideas about the degree of prestige of a certain variety or dialect, and about the relative desirability of certain linguistic features. In short: the existence of purism presupposes the existence of a prestige variety.
Furthermore, prestige is a social category that is often linked to a ‘standard variety’ (Milroy 2001: 532), which brings us to the relationship between purism and standardisation. Van der Sijs (1999: 11) argues that purism only affects languages that are standardised or are in the process of standardisation since, before one can remove elements from a linguistic norm, one has to have such a norm (Langer and Davies 2005: 4). Boeder et al. (2003: viii) disagree with this interdependence between purism and standardisation: ‘For many cases, purism need not be connected with conscious standardisation, and it should not be separated from a broader concept of “pure language”.’ Basque can be characterised as a late standard language (Vogl 2012: 25) as its standardisation process did not start until 1968. In contrast to other communities, such as Flanders or Finland, the rise of Basque nationalism at the end of the 19th century did not entail any promotion of uniformity. On the contrary, ‘the founder of the Basque Nationalist Party favored the development of a different written variety for each of the Basque provinces’ (Hualde and Zuazo 2007: 143), although some of the first authors to publish in Basque in the 16th and 17th centuries explicitly remarked on the difficulties brought about by dialectal diversity.1 Basque was rarely used in prestigious domains, there was no truly socially dominant Basque dialect, and the need for a single written standard was not universally accepted. Even the foundation of the Academy of the Basque Language in 1918, with the unification of the written language as one of its main goals, turned out to be no help in the quest to achieve a standard variety (Hualde and Zuazo 2007, Salaburu and Alberdi 2012). A minority language could hardly survive in the world of today if it lacked a ‘shared common writing code’ (Salaburu and Alberdi 2012: 94). 
Awareness of this fact arose in the Basque Country in the second half of the 20th century, and was the driving force behind standardisation. The motivation behind standardisation could thus be described as an empowerment of the language community. Mitxelena, the linguist and academician entrusted by the Academy with the task of drawing up a proposal for the unification of Basque, stated clearly what the primary goal was: ‘We believe that it is absolutely necessary, a matter of life or death, to put Basque on the path to unification. If one is teaching our children and young people in Basque - and if Basque is to survive, we must use it in teaching - it is indispensable that we teach them in a unified manner. The unification that we need is in written Basque, at least for the first few steps’ (Mitxelena 1968 [2008: 253]). Thus, the canonical forms were to be taught at school (Milroy 2001: 537). And, as is usually the case, from then on (and particularly after the acquisition of the status of an official language at regional level in some territories south of the Pyrenees in 1979²), Basque society felt the need for a standard, seen as ‘a variety that could fulfil every conceivable function for which its speakers could need it’ (Davies 2012: 56). Forty years later, the standard variety does have what Davies (2012: 49) calls ‘a privileged place in public and official domains, e.g. the media and the education system.’ Moreover, the cultivation of the standard variety was also linked to a nation-building project (Milroy 2001, Davies 2012): ‘the rapid acceptance of the new standard within Basque society is undoubtedly related to the strength of Basque nationalistic feeling at the time of its adoption’ (Hualde and Zuazo 2007: 160).
After Mitxelena’s foundational report, which included recommendations regarding orthography, morphology, lexical variants and the adaptation of neologisms, the Basque Academy has worked continuously towards the codification of the standard through the publication of a basic grammar of standard Basque along with a standard Basque Dictionary and by establishing rules of ‘good usage’³. The ‘correctness ideology’ (Milroy 2001, Vogl 2012) is a central component of the standard language ideology, as is the claim of mutual intelligibility (Davies 2012): ‘The benefits that the Academy’s standard has brought to Basque society are widely recognised. First of all, it has made it possible for Basque speakers to discuss any topic in Basque. Secondly, it has eliminated the (sometimes serious) obstacles which previously existed in communication between speakers from different areas of the Basque Country’ (Hualde and Zuazo 2007: 162). Nevertheless, dialectal variety is not only allowed, but also promoted in informal registers. There is even a certain acceptance of code-switching in order to increase the use of the language; in other words, a move from formal correctness as the sole criterion to that of communicative and expressive quality. However, although there was no Basque standard variety until the late 1960s, ‘the Early Modern striving for ‘correctness’ which was produced for vernacular languages all over Europe’ (Vogl 2012: 20) was to be reflected during the 18th century in Larramendi’s grammar, El imposible vencido (1729), and in his Trilingual Dictionary (1745). And, at the end of the 19th century, Sabino Arana, the founder of the Basque Nationalist Party, took a clear stance in favour of a language untouched by external influences (the ideology of linguistic isolationism, Davies 2012). In both cases, long before the Academy was founded, they had to deal with borrowed lexical items (cf. section 2).
In the following we shall analyse the influence of purism on the lexicon, and look at the criteria regarding what constitutes a Basque word for Basque lexicographers when compiling dictionaries. In a rough periodisation, we first group the standard reference works of Basque lexicography according to the stages of standardisation proposed by Thomas (1991: 115–122), i.e. pre-standardisation, standardisation, and post-standardisation. This model is complementary to two of the four steps in the standardisation of any language according to Haugen’s classic proposal, namely codification and elaboration (Haugen 1983). Second, we describe the type of purism that prevails in these works, according to a taxonomy of puristic orientations which we will discuss in detail in the following. Finally, we relate these works to the functional typology of dictionaries proposed by Bergenholtz and Gouws (2010, cf. section 4).

Thomas (1991: 75–83) distinguishes six types of purism, or six puristic orientations, as follows:

Archaising purism: a conservative approach that favours the language found in written text from a ‘golden past’ over any innovation.

Ethnographic purism: the lexicon of certain, typically rural dialects is favoured over modern, urban vocabularies.

Elitist purism: the sociolect of the educated urban elite is regarded as the purest.

Reformist purism: ‘a salient feature of most of the language renewals of the nineteenth century as well as the more recent efforts to create standard languages. It involves […] adapting the language for its role as a medium of communication in a modern society’ (Thomas 1991: 79).

Playful purism: most typically, the creation of neologisms by native means as a result of an individual activity that often replaces well-established foreign words.

Xenophobic purism: an attitude in favour of replacing elements identified as foreign with native elements.
With this taxonomic model, which encompasses all cases of purism in any language, we may also describe the purisms that have targeted the lexicon of the Basque language. In our case, however – that of a minority language in a situation of diglossia – further clarification is needed. As for ‘playful’ attitudes of word creation (creation of neologisms, as a result of an individual activity, that often replace well-established foreign words), we hold that the effort of individual lexicographers in a standard-creating attempt to propose solutions for lexical gaps may not always be termed playful, but it is always creative. In other words, although these words are not attested in corpora of Basque texts, they are Basque insomuch as they are created following Basque morphosyntactic conventions. Therefore, many neologisms, even if proposed by an individual or a small group, may be totally intelligible and transparent to the listener. We thus describe a purist attitude as playful only when a lexicographer’s term proposal is following his/her own personal taste rather than a strategy of combination and derivation of items in use. Strategies to fill lexical gaps may be more or less purist in the sense Thomas calls xenophobic, that is, purism to modulate the presence of foreign lexical items, using native elements. But what does ‘foreign’ mean in the case of Basque? Celtic, Greek, Latin, Arabic, Spanish or French? Purism that aims at the replacement of non-native elements in the Basque case always targets loans from Latin and its descendant languages (as well as Graeco-Roman internationalisms in terminology). Thus it is not against elements that have come from foreign peoples, but against well-known words that have been there for a considerable period of time, and come from the dominant partner in diglossia. 
Following Brunstad (2010: 67-77), who also discusses Thomas’ taxonomy of puristic orientations, we may therefore conclude that for an adequate interpretation of puristic orientations, the language’s status relative to other languages must be taken into account: e.g. state language vs. not a state language, majority language vs. minority language, language contact between two mutually intelligible languages vs. language contact between two mutually unintelligible languages, language standardised between 1550 and 1800 vs. language standardised in the period after 1800. Table 1 summarises our proposal for a periodisation. Until the most recent past, all Basque dictionaries were bilingual. That means that in their attempt to define Basque equivalents to lexical items of the high-prestige counterparts in diglossia – i.e. Latin, Spanish and French – pre-standard Basque lexicographers almost always had to face issues related to purism. This first period can be characterised as a 250-year search for a standard. The Basque lexicon itself was not codified, and Basque was not subject to institutionally-backed standardisation, until 1968. Basque dictionaries of all periods deserve to be analysed with regard to the extent to which they influenced the codification of the lexicon of the Basque language.

Table 1. Periodisation of Basque Lexicography: Principal reference works.
| Dictionary | Languages | Standardisation stage | Purism types | Dictionary type |
|---|---|---|---|---|
| LAR1745 | ES>EU-LA | pre-standardisation | reformist | prescriptive |
| AZK1905 | EU>ES-FR | pre-standardisation | ethnographic, reformist | descriptive |
| BEM1916 | ES<>EU | pre-standardisation | playful-elitist, nativist-xenophobic, reformist | prescriptive |
| KIN1977 | ES<>EU | standardisation | nativist-empowering, creative, reformist | prescriptive |
| MUG1977 | ES<>EU | standardisation | nativist-empowering, creative, reformist | prescriptive |
| OEH1986 | EU<>ES-FR | standardisation | reformist (institutional) | descriptive |
| SEH1996 | EU | post-standardisation | reformist | descriptive |
| ZEH2005 | ES>EU | post-standardisation | reformist | proscriptive |
| ELH2006 | ES<>EU | post-standardisation | reformist | proscriptive |
| ADO2009 | ES<>EU | post-standardisation | reformist | proscriptive |
| EEH2007- | EU | post-standardisation | reformist | descriptive |
| EHI2016 | EU | standardisation | reformist (institutional) | prescriptive |

A common factor shared by all of the dictionaries discussed here is a motivation to unify Basque, and hence, to create a standard, and to spread the knowledge about the lexical richness of Basque, which also
can be taken as a (softer) type of ‘reformist’ purism as described by Thomas (1991: 79). According to the stages of standardisation described above, Basque dictionaries can be classified in three groups: (a) those that, since the middle of the eighteenth century, in one way or another, were searching for the appropriate words for a standard (pre-standardisation stage); (b) those that codified the lexicon from the institutional background of the Basque Language Academy (standardisation stage); and (c) those that are contributing to the further elaboration and educational spread of the lexicon of Standard Basque in a post-standardisation setting, i.e. representatives of a reformist purism concerned with the quality of text production. In the following, we look at the lexicographic production in the three stages.

2. The search for a standard lexicon (1745-1968)

Three attempts deserve to be mentioned here. Each one set out its own criteria when choosing the most appropriate words for a standard lexicon, i.e. the most suitable words for a cultivated language. Each dealt with purism in a different way.

2.1 Larramendi 1745

The trilingual dictionary compiled by the Jesuit Manuel de Larramendi, Diccionario Trilingüe del Castellano, Bascuence y Latín (LAR1745), was the first printed Basque dictionary, and the main reference in Basque lexicography until the beginning of the 20th century. Larramendi’s dictionary exerted a great influence over the language used by Basque religious and secular writers for a century and a half. The publication of Larramendi’s Basque grammar (1729) and his dictionary is what ‘separates the new from the old age’ (Mitxelena 1984). Larramendi had a dual objective in mind when compiling his dictionary: on the one hand, his goal was to fight against the detractors of the Basque language by demonstrating that Basque is as rich in lexical resources as Spanish or Latin.
On the other hand, he was concerned with the quality of oral and written text production. Larramendi wanted Basque preachers and writers to embrace a conviction that Spanish borrowings should not be used anywhere, particularly when Basque-native words were available: unnecessary borrowings had to be rejected. Larramendi shows a concern for purity, for a language as pure and unmixed as possible, which had to be protected from corruption and decay, typical of standard language cultures (Milroy 2001). To fulfil this dual objective, for every one of the 43,000 Spanish headwords in the main reference dictionary of the time (RAE1726) he provides a corresponding equivalent in Basque, thus ‘proving’ that Basque had the same range of semantic expressiveness as Spanish. The dictionary contains approximately 40,000 different Basque lemmata (Urgell 2000: 5). In the preface to his Trilingual Dictionary, Larramendi provides the keys to his puristic process of choosing Basque equivalents. He would accept any word occurring in Basque literature, no matter its origin, i.e., whether it was identifiable as a loan or not, or from whatever dialect. Terms of foreign origin that had been borrowed from Latin in distant centuries, such as the vocabulary of Christianity – aingeru ‘angel’, eliza ‘church’, meza ‘mass’ or obispo, apezpiku ‘bishop’, etc. – are also entered as equivalents of Spanish ángel, iglesia, misa or obispo. Hence, Larramendi’s purism does not reject loanwords that are well-established in Basque. On the other hand, Larramendi rejects widely used borrowings that he felt to be unnecessary, preferring in many cases a synonym that was also in use but not identified as a loan, such as egiazko (and not berdadero, from Spanish verdadero) ‘true’, damu (and not dolore) ‘regret, repentance’, and irakurri (and not leitu) ‘to read’. As for terms, Larramendi also coined his own neologisms, compound and derived words.
For terms belonging to domains where Basque at the time was not normalised, he sometimes proposed a purist neologism, as jainkokinde, (jainko-kinde, ‘God’ and derivational suffix, ‘theology’), or izarkinde (izar- being ‘star’), ‘astrology’, instead of the internationally widely used terms teologia and astrologia. These newly-coined compound nouns would be more easily understood than the Greek loans. Thus, where his work is a contribution to a functional development of Basque, Larramendi’s purism is a stance in favour of self-intelligible, transparent assemblages of Basque morphological components. With this criterion in mind, he also rejects some loanwords that had been used in literature and replaces them with a purist but transparent synonym: kondaira (konda-era, ‘way of telling’, fairy tale) instead of historia, zenbate (‘number’, zenbat-te ‘how much’, ‘how many’ and collective suffix) instead of numero or biltoki (bil-toki, ‘gather place’) instead of teatro. Few of this last group are found in frequent use nowadays; but there is no doubt that Larramendi’s Trilingual Dictionary, in line with its intention, can be called a standard-creating and even modernising (Zgusta 1989) attempt. Furthermore, Larramendi’s reformist purism, his prejudice in favour of transparency, may be regarded as contrary to an elitist approach. According to SEH1996, 3515 lemmata (including derived words and compounds) are documented in LAR1745 for the very first time. 1535 of these have 20 or more appearances in the ETC reference corpus, which is made up of contemporary texts (see section 3.4 for reference). All in all, Larramendi’s contribution is similar to that of other lexicographers in other minoritised languages, i.e. that of Evans for Welsh (Löffler 2003: 71). 
2.2 Arana and his followers

Larramendi’s influence vanished around the turn of the 20th century, when Sabin Arana Goiri (the founder of the Basque Nationalist Party, EAJ-PNV) started a crusade to ‘clean’ Basque of any supposed loan, even of those adopted long ago. The uniqueness of the Basque language had to be emphasised; every foreign word banned. Widely used terms like aingeru ‘angel’, eliza ‘church’, meza ‘mass’ or apezpiku ‘bishop’ were substituted with the new coinages gotzon < gogo huts on ‘spirit pure good’, txadon < etxe done ‘house holy’, jaupa < Jaun opa ‘Lord offering’ and gotzain < gogo zain ‘soul keeper’, respectively (cf. Pagola 2005). Thus, Arana Goiri’s coinages can be taken as examples of xenophobic (Thomas 1991) or sanitary purism (Milroy 2005). Arana Goiri coined new words by applying the morpho-phonological rules that occur in Basque composition and derivation (cf. Oñederra 1990) – but did so in such a peculiar way that the outcome is totally opaque for speakers. In these proposals, the main motivation was not transparency for an easy understanding, but a purist ‘renewal’ and rejection of everything ‘foreign’. This type of purism is closely linked to romanticism and nationalism, as was common in a wide range of language communities in Europe in the late 19th and early 20th century; but the approach can also be regarded as elitist, i.e. not intelligible for the lay person (cf. Thomas 1991: 43-45). As for terms, Arana rejected almost all of Larramendi’s proposals. Although they had been created transparently by means of Basque word-formation rules, Arana did not spare them from being substituted with other new coinages: zenbaki ‘number’, lutelesti ‘geography’, edesti ‘history’ or antzoki ‘theatre’ (zenbat-ki, ‘how much’, ‘how many’, and object marking suffix; ludi-eres-ti, ‘world-account’ and collective suffix; eres-ti, ‘tale’ and collective suffix; antze-(t)oki, ‘art-place’).
In addition to excluding any foreign word, Arana claimed that a dictionary must establish the exact form of each native word. In this pursuit, he took into account not the written tradition, but his own feelings. For example, he favoured forms such as odoldau, lotsage or argiztu over the widely-used odoldu (odol-du blood-PTCP, ‘stained with blood’), lotsagabe (lotsa-gabe, ‘shame-less’) and argitu (argi-tu light-PTCP, i.e. ‘lit (up)’). Arana himself did not compile any dictionary, but his proposals were gathered in Bera-Mendizabal’s dictionaries (BEM1916) and influenced literary production until the Civil War (1936-1939) and even afterwards, until the late 1950s. These dictionaries were re-edited several times as late as 1975; they contain mainly the headwords of Azkue’s dictionary (see below), and the proposals of Arana and his followers. As regards the survival of Arana’s new coinages, those that were intended to replace accepted loan words mostly failed, although some of them survived, such as gotzain, ‘bishop’ (see above) or olerki, ‘poem’ (ol-eder-ki ‘thought-beautiful’ and object suffix). Those created to supplant Larramendi’s coinages, however, mostly succeeded and entered the standard dictionary. The purist neologisms proposed by Arana Goiri and his followers roughly fall into three domains: politics, religion and everyday life. Table 2 contains a list of the 25 most frequent neologisms proposed by Arana, their translations into English, and their relative frequency (i.e. the percentage value of occurrence) and rank in the ETC reference corpus (see section 3.4).

Table 2. Arana’s neologisms most used today, relative frequency and rank in ETC corpus.
| Basque | English | Part of Speech | Relative frequency (%) | Rank |
|---|---|---|---|---|
| alderdi | party | noun | 0.0804 | 131 |
| idatzi | write | verb | 0.0628 | 188 |
| jaurlaritza | Government | noun | 0.0391 | 308 |
| abertzale | patriot | adjective, noun | 0.0335 | 367 |
| hauteskunde | election | noun | 0.0330 | 376 |
| ordezkari | representative | noun | 0.0308 | 401 |
| batzorde | council | noun | 0.0306 | 404 |
| argazki | photograph | noun | 0.0261 | 478 |
| askatasun | freedom | noun | 0.0252 | 495 |
| lehendakari | president | noun | 0.0200 | 623 |
| zenbaki | number | noun | 0.0192 | 649 |
| idazkari | secretary | noun | 0.0158 | 777 |
| espetxe | prison | noun | 0.0153 | 806 |
| antzerki | theatre (play) | noun | 0.0147 | 840 |
| aske | free | adjective | 0.0121 | 982 |
| ikastola | school | noun | 0.0119 | 996 |
| aldundi | (province) Government | noun | 0.0113 | 1,027 |
| antzoki | theatre (room, building) | noun | 0.0080 | 1,358 |
| hautetsi | elect | verb, noun, adjective | 0.0067 | 1,540 |
| ikur | sign; emblem | noun | 0.0060 | 1,648 |
| bazkide | member | noun | 0.0058 | 1,685 |
| abestu | sing | verb | 0.0047 | 1,925 |
| egutegi | calendar | noun | 0.0044 | 2,002 |
| aberri | homeland | noun | 0.0039 | 2,182 |
| eguazten | Wednesday | noun | 0.0029 | 2,612 |
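The two measures reported in Table 2 can be illustrated with a small sketch. It assumes that relative frequency is a lemma's share of all corpus tokens expressed as a percentage, and that rank is the lemma's position when all lemmata are sorted by descending frequency. The function name is hypothetical and the counts are invented toy numbers, not ETC data; how the ETC frequency list was actually built is not described here.

```python
from collections import Counter


def relative_frequency_and_rank(lemma_counts):
    """Return {lemma: (relative frequency in %, rank)} for a lemma frequency list."""
    total = sum(lemma_counts.values())
    # Sort by descending count; ties are broken alphabetically so ranks are stable.
    ordered = sorted(lemma_counts.items(), key=lambda item: (-item[1], item[0]))
    return {
        lemma: (100.0 * count / total, rank)
        for rank, (lemma, count) in enumerate(ordered, start=1)
    }


# Invented toy counts (NOT taken from the ETC corpus).
toy_counts = Counter(
    {"eta": 500, "izan": 300, "alderdi": 120, "idatzi": 60, "zenbaki": 20}
)
stats = relative_frequency_and_rank(toy_counts)
for lemma in ("alderdi", "idatzi", "zenbaki"):
    freq, rank = stats[lemma]
    print(f"{lemma}: {freq:.2f}% (rank {rank})")
```

In a real corpus the denominator runs to millions of tokens, which is why even a neologism as common as alderdi accounts for well under one per cent of all occurrences.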
2.3 Azkue 1905-06

While Arana and his followers were ‘sowing’ the literary field with their new coinages, Azkue set out in search of the words really used by Basques: town by town, village by village, and also by trawling Basque literature. As a result of this search, he included in his Diccionario vasco-español-francés (AZK1905) loans such as meza ‘mass’, apezpiku ‘bishop’, eliza ‘church’ (Lat. ecclesia), aingeru ‘angel’, or lege ‘law’ (Lat. lex). When entering these headwords, he flagged them with double question marks, since he also found that the origin of these words made them only liminally Basque. Nevertheless, as Sarasola (2002) points out, other loanwords with a long tradition of use were excluded from his dictionary, including other religious words such as fede ‘faith, confidence’. Azkue placed oral use before literary use, and consequently rejected many words present in Larramendi’s dictionary because he had no means of ascertaining which ones had been common words used by lay people in the 18th century, and which ones had simply been created by Larramendi. For similar reasons he rejected Arana’s neologisms. In other words, Azkue did not want to accept in his dictionary novel, rare coinages intended to replace words well established in the language: he recorded the words really used by Basque people and by Basque writers. However, Sarasola (2002) considers Azkue a purist, in the sense that he weighted his findings from field research according to his personal preference. SEH1996 marks 1,422 lemmata, including derived and compound words, as first documented in AZK1905, 495 of which are found 20 or more times in the ETC reference corpus. That means that about two thirds of the lemmata first documented in AZK1905 are very rare words today, if used at all.
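The ‘about two thirds’ estimate follows directly from the two counts just quoted. The short Python sketch below merely retraces that arithmetic; the function and constant names are illustrative and do not come from the study.

```python
# Figures quoted in the text (SEH1996 markup, ETC reference corpus).
FIRST_DOCUMENTED_IN_AZK1905 = 1422  # lemmata first documented in AZK1905
ATTESTED_20_PLUS_IN_ETC = 495       # of these, found 20 or more times in ETC


def rare_share(total: int, frequent: int) -> float:
    """Return the share of lemmata that fall below the frequency threshold."""
    if total <= 0 or not 0 <= frequent <= total:
        raise ValueError("expected total > 0 and 0 <= frequent <= total")
    return (total - frequent) / total


share = rare_share(FIRST_DOCUMENTED_IN_AZK1905, ATTESTED_20_PLUS_IN_ETC)
print(f"{share:.1%} of the lemmata occur fewer than 20 times in ETC")
```

The result, roughly 65%, is the figure rounded in the text to ‘about two thirds’.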
Azkue’s work influenced Basque lexicographers and linguists, but Bera-Mendizabal’s bilingual dictionary (BEM1916, see section 2.2) remained a reference for writers and for lay people until the late 1970s.

3. Codification of standard Basque (1968-2000)

Controversy over the status of loans and neologisms continued after the foundation of the Academy of the Basque Language, Euskaltzaindia, in 1919, and lasted until 1959, when the Academy approved a declaration stating that the use of words, not their language of origin, was the main criterion to be taken into account for the admission of new words. In light of this declaration, words widely used in Basque literature are deemed Basque words, no matter where they come from. This implies, for instance, that all the words in Table 2, in spite of their origin, were accepted for the Dictionary of the Academy (cf. section 3.2). As for new words required by modern life, the Academy preferred to resort to compounding and derivation, but did not forsake the other option, that is, loanwords (Azkarate 2008).

After 1968, new bilingual dictionaries (with Spanish as their target language) faced the same challenge as Larramendi in the 18th century: to find a Basque equivalent for each Spanish word sense. These works sought, once again, to promote and enrich the minority language so that it could become a standard language able to fulfil every function in formal and informal settings. Nearly three quarters of a century separate the publication of Azkue’s and Bera-Mendizabal’s dictionaries in the 1900s-10s from the publication of the General Basque Dictionary OEH1986. Late in this interval, the influence of Kintana’s (KIN1977) and Múgica Urdangarín’s (MUG1977) dictionaries deserves consideration. At this point, the first standardisation of the language had been undertaken, and the use of Basque in public had become legal in the Spanish territories: a new generation entered the Basque ‘arena’.
In this environment, Kintana’s proposals were well received: his neologisms were used in textbooks, in government, and in the media. To measure the extent to which the two dictionaries make proposals with no attestation in written texts, Sarasola studied the rate at which Kintana and Múgica agree on equivalents for prefixed Spanish words, finding that they agree less than 50% of the time (Sarasola 2003). Thus, both dictionaries (and other works compiled during the 1980s and 1990s, such as a couple of Basque encyclopaedias) can be taken as standard-creating, with a creative approach to the enrichment and modernisation of the Basque lexicon. The approach is somewhat ‘playful’ as Thomas understands it (1991:75–83), since it is not entirely evidence-based, in the sense that it does not consider corpus data. The terminological dictionaries compiled by UZEI (Basque Center for Terminology and Terminography) from 1977 on were not based on corpus data either. Rather, they were seen as instruments that would allow the introduction of Basque into university-level teaching.

3.1 Orotariko Euskal Hiztegia (General Basque Dictionary)

Compiled by Koldo Mitxelena and Ibon Sarasola for the Academy of the Basque Language, this dictionary is the first Basque dictionary based on an electronic text corpus. The corpus includes almost everything ever published in print, from any period and dialect, up to around 1980. The first edition was finished in 1986. The aim of the authors is to describe the Basque lexicon as a whole: to account for the words used by Basque writers, but also for the words contained in Basque dictionaries. This comparison allows the reader to see which words belong almost exclusively to the realm of lexicography and which ones have really been used by Basque writers, and to what extent.
There are no new proposals, no new coinages: just an honest picture of Basque lexical practice, as complete, neutral, rich, and far-reaching as possible. The picture is drawn by an overwhelming number of hand-selected corpus-based usage examples, each with its bibliographical reference. The lemmata are not discussed according to any puristic criteria: it is just a matter of showing which words and expressions have been used in written language. OEH1986 contains 95,000 headwords (entries and subentries), 48,800 of which are found neither in the reference corpus ETC nor in the big web corpora (see section 10 for references), and thus must be regarded as historical or lexicographical. Updating of OEH is still ongoing:4 the fifth version, enriched with entries corresponding to legal and administrative texts, is now available. The online version contains dictionary articles for 145,318 lemmata, and the entries also contain translation equivalents in Spanish and/or French. The lemmatisation criteria in OEH obey the guidelines for orthographical standardisation which Euskaltzaindia has been releasing from 1968 onwards (see section 9 below); in this regard, the OEH dictionary reflects reformist purism policies for a standardised unification of Basque.

3.2 Euskaltzaindiaren Hiztegia

Once the OEH corpus was in place, the Academy of the Basque Language faced both the task of choosing words and that of compiling its prescriptive dictionary of the standard variety: Euskaltzaindiaren Hiztegia (EHI2016, ‘the Academy’s Dictionary’), which in its present version contains 37,884 entries, 6,944 subentries and 61,398 word senses.5 This lemma list represents the codification of the standard lexicon; Euskaltzaindiaren Hiztegia is thus to be characterised as a straightforwardly prescriptive dictionary. Following Mitxelena’s advice, the corpus created for OEH1986 enabled the selection of words according to their frequency in the literary tradition.
But words required by present-day discourse could not be neglected. What follows is a summary of the criteria followed by the lexicography board at Euskaltzaindia.

Different dialectal forms of the same lemma were standardised in such a way that they could be understood by most speakers: in many cases, the form chosen was not the most frequent one, but an older form common to various dialects.

For terms, new coinages (usually compounds or derived words) were accepted, provided that they were well formed (haragi-jale, literally ‘meat-eater’, ‘carnivorous’; gainzama ‘excess load’; historiaurre ‘prehistory’). International loanwords were accepted, since they would be easily understood by most Basque speakers (intsektizida ‘insecticide’, eskizofrenia ‘schizophrenia’, eszeptiko ‘sceptic’, neurona ‘neuron’…). Sometimes both forms coexist, the loanword and the new coinage: herbiboro and belarjale ‘grass eater’; akordeoi and eskusoinu ‘accordion’ (lit. ‘hand sound’). Occasionally, a newly coined transparent compound or derived term is selected and the loanword is rejected (*apikultura > erlezaintza ‘bee keeping’).

The requirement for a loanword to be accepted is that it must be understandable to speakers on both sides of the Pyrenees: e.g. hipoteka ‘mortgage’ entered the Academy’s Dictionary as it is used in both Spanish and French. Problems arise when southern Basques use a Spanish loanword in contexts where northern Basques use a French loanword. In those cases, both loanwords entered the Academy’s Dictionary together with the corresponding marker North or South: koaderno Heg / kaier Ipar ‘notebook’. Unable to select a word based on the practices of a majority of dialect groups, the Academy preferred to select words that are truly used instead of coining terms with no tradition at all. This also served to inform Basques from one side of the Pyrenees about the words used on the other.

When only a Spanish loanword was attested, two approaches were taken.
In some cases, the marker South indicates that the word is not suitable for the standard variety, but that it is used by southern Basques: teklatu ‘keyboard’, hormigoi ‘concrete’. In other cases, when a Spanish loanword is found to be very frequent, a new coinage is proposed for the standard variety: *bainera > bainuontzia (‘bathtub’), *boligrafo > bolaluma (‘pen’), *motxila > bizkarzakua (‘mochila’, ‘backpack’), *labadora > garbigailua (‘lavadora’, ‘washing machine’). We take a more detailed look at this type of proposal in section 3.4.

3.3 Terminology glossaries and dictionaries

Continuing the work on terminology started by UZEI in 1977, the Basque Government created the Terminology Commission in 2003. Its members are experts from universities, government, the Academy, and organisations devoted to terminological tasks. The outcome of their work is gathered in the Basque Public Term Bank, which currently holds about 500,000 terminological records.6 Terms are constantly updated according to the Euskaltzaindia guidelines and the proposals for terminology normalisation. New coinages follow the phonological and morphological structure of Basque and its rules for compounding and derivation. In addition, the Commission’s guiding criterion is real usage: if a term is widely used among experts of the relevant field, it should not be replaced by a new coinage. The terminographical task is based on corpus evidence whenever the existence of written documents in a field of specialisation allows it. Today, Basque is used in a normalised way in a wide range of domains, and specialised corpora are available. However, there is still room for a reformist purism, in order to promote the use of Basque in semi-normalised fields (sports, pottery, or social media, to cite but a few of the domains of recently compiled specialised dictionaries).
3.4 Success of purist proposals: Corpus-based measurements

As described by Thomas (1991:164–170), among others, there are many approaches to studying purism through quantitative data. One may compare lemma lists that indicate ‘foreign’ and ‘non-foreign’ in order to characterise the composition of a language’s vocabulary. With diachronically indexed text corpora and relevant processing tools, it has also become possible to systematically measure variations in frequency of use over time. For big languages like German we can count on these resources and display charts such as Figure 1, which shows the frequency over time of a set of synonyms: Telephon/Telefon, an internationally used coinage from Greek in two orthographic variants, vs. Fernsprecher (‘far-speaker’), a purist neologism.

Figure 1. Frequency of a set of synonyms over time (German).7

This allows us to judge the success of the purist proposal Fernsprecher, in absolute terms as well as in relation to its loan synonyms, and to consider their performance over time in relation to extralinguistic factors like language planning policies.

When diachronically indexed corpora of suitable composition and size are available, two methodological steps are necessary: first, the definition of synonym sets; second, the tagging of their components according to lexical features related to purism. At a minimum, and for simple comparisons like that in Figure 1, these features may be binary, such as ‘foreign’ versus ‘non-foreign’. But they may also involve more sophisticated taxonomies, such as the ones proposed by Thomas (1991:73). The resources available today for Basque do not enable comparative studies like those exemplified here for German, but they do allow us, for instance, to compare the counts of the same lemma in different corpora. Lexikoa atzo eta Gaur ‘lexicon yesterday and today’ (LAG, Sarasola et al.
2008)8 is a web tool that shows counts in (1) subcorpora of the OEH corpus (18th century; 19th century; and 1900-1968), and (2) the EPG (Ereduzko Prosa Gaur ‘Exemplary Prose Today’) corpus, which contains hand-selected reference prose from 1968 onwards. The tool displays the counts of the queried lemma (a) by author, that is, the number of writers that use the lemma, and (b) as occurrences in the text and, consequently, the number of sentences in which the lemma is used. If we look at the Basque counterpart of the German example cited above, the LAG data for the synonym pair telefono and urritizkin (literally ‘far-speaker’, an often cited purist proposal dating from 1915) show a constant frequency increase for telefono from the 19th century until today. In contrast, urritizkin did not last long after 1968, when it shows a clear decline. This purist neologism shares with German Fernsprecher the fact that, according to corpus data, there is no period in which it was used more frequently than its loan synonym, and that the promotion of its use was related to romanticist attitudes about national ‘renewal’. Today, the international term outnumbers its purist synonym by far, and both the German and the Basque purist ‘far-speaker’ neologisms can be characterised as markers of a marginal, traditionalist discourse.

Following the methodology described above, we can define sets of synonyms and tag their constituents as ‘purist neologism’ vs. ‘loanword’. We can then compare the frequency of their appearance in a set of dictionaries (a ‘lexicographical corpus’) and in big electronic text corpora, like Egungo Testuen Corpusa (ETC ‘Present day Texts Corpus’, Sarasola et al.
2013) and the Elhuyar web corpora (Leturia 2012, Leturia 2014).9 For a selection of everyday words like the ones mentioned in section 3.2, we obtain the results displayed in Table 3, where synonymous lexical items appear in order of frequency (the average of relative frequencies in ETC and the Elhuyar web corpora), together with their mention in some lexical resources (dictionaries and others). Orthographic variants of the loanword are indicated in bold type. The labels ‘beh.’ (behe-mailakoa, ‘low register’), ‘goi.’ (goi mailakoa ‘high register’), and ‘heg.’ (Hegoaldea ‘south-of-Pyrenees dialect’) are usage markers; the (*) refers to a mark in the Academy’s EHI2016 that indicates a preference for its deprecation.

Table 3. Variants of modern everyday life terms as synonym sets.

Lexical item   Rank   EHI2016 ZEH2005 ELH2006 ADO2009 OEH1986

‘dummy’, ‘pacifier’ (baby)
  txupete   1   x (heg.) x (beh.) x (heg.) x
  txupaki   2   x x
  chupete   3
  zupaki   4   x
  xurgaki   5   x x
  ttunttun   6   x
  edoskai   7   x

‘corkscrew’
  sakakortxo   1   x
  sacacorchos   2
  sakakortxos   3
  kortxokentzeko   4
  kortxo_torloju   5   x x
  kortxo_ateratzaile   6
  kortxokentzaile   7
  kortxo_irekigailu   8   x
  tapoi_irekigailu   9
  kortxo_kentzeko   10   x (heg.) x x
  kortxu_ateratzaile   11   x
  tapoi_kentzeko   12   x x

‘bathtub’
  bainuontzi   1   x x x x
  bainera   2   x (*)
  bañera   3
  bainu-ontzi   4

‘washing machine (clothes)’
  garbigailu   1   x x x x x
  labadora   2
  ikuzgailu   3   x x (goi) x x x
  labadoria   4
  ikuzmakina   5   x
  labadorie   6
  garbi-gailu   7

Looking at the examples, we observe that if the Academy and the other recent lexical resources agree on a neologism, it is likely to appear more often in the reference corpora than its loan counterpart.
If, on the other hand, the dictionaries do not agree on the term to be used, the most frequently used word will be the loanword. For ‘comforter’, the Spanish loan txupete, in spite of being marked as ‘southern’ or ‘low register’, is used, in its two graphical variants combined, far more often than any of the various purist proposals. In the case of ‘corkscrew’, EHI2016 offers a recommendation marked as ‘southern’, and ELH2006 is the only resource that clearly agrees, while the other sources show quite a diverse picture. On the other hand, ‘bathtub’ and ‘washing machine’ do have a consensus recommendation for a purist equivalent: in these cases, although the orthographic variants of the Spanish loan immediately follow the recommended term on the rank list, their combined frequencies are no match for the frequency of the purist leader. Nevertheless, neologisms such as those in Table 3 are mainly used when the standard variety is required, in both written and oral modalities. Informal oral speech is frequently characterised by loanwords.

4. Elaborating the code: From prescription to proscription

The standardisation of the Basque lexicon is an ongoing process. By the 1990s all published dictionaries (still limited to bilingual dictionaries) were edited according to Standard Basque. In 1994, Sarasola finished the first edition of his monolingual Basque dictionary, entitled Euskal Hiztegia, ‘Basque Dictionary’, from the second edition on (SEH1996 and SEH2007),10 which has become a main reference work. The entries of this dictionary contain the date of the first documented use of a lemma in lexicography and literature, grammatical information, definitions, synonyms and usage examples from the OEH corpus. In many ways, therefore, this dictionary is a monolingual descendant of the multilingual General Basque Dictionary OEH1986.
However, its lemma list contains more than 10,000 items not found in OEH1986, and among these around 5,000 have been documented after AZK1905.11 Just like its multilingual antecedent, it is a purely descriptive dictionary: its aim is to give a picture of the written language in use, offering large numbers of referenced corpus-based usage examples. If we wanted to find any reformist motivation behind this piece of work, it would be the enrichment of Basque in use by promoting the whole range of the language’s lexicon as part of its lexical richness.

Ibon Sarasola also compiled a Spanish-Basque dictionary for translation and Basque text production, addressed to a Spanish-Basque bilingual user (ZEH2005) and also available online.12 This dictionary cannot be called descriptive in the sense that the selection of its Basque lexical items is necessarily corpus-based. For translation purposes, all Spanish word senses must be furnished with Basque equivalents, so lexicographers must provide at least one equivalent, even if they fail to find any in existing reference corpora. In those cases where a word sense may be translated by more than one equivalent, lexicographers must decide whether to include all members of this synonym set, and whether to mark any of the lexical items according to register or regional distribution. Sarasola indicates this in the dictionary entry for chupete, ‘comforter’, shown in Figure 2, where the direct, non-assimilated loan is marked as beh., ‘low register’, while an assimilated version of the loan, txupaki, appears unmarked next to the more purist proposal, xurgaki, which does not carry any further mark either (cf. also the data in Table 3 above).

Figure 2. ZEH2005, s.v. chupete.

The other two main contemporary Spanish-Basque reference works follow a similar method. All equivalents listed in the dictionary entries are found in corpora.
Therefore, these dictionaries cannot be considered prescriptive. But, on the other hand, the classification descriptive is not strictly suitable for these three bilingual dictionaries either, since the lexicographer marks some equivalents as preferable to others or, as we have just remarked, has to fill lexical gaps in order to provide equivalents for all Spanish word senses. It seems to us that this ‘third path’ between description and prescription runs close to what Bergenholtz and Gouws describe as ‘weighed description’ or proscription:

Proscription allows the same possibilities for the empirical basis as description […]. However, the results of empirical analysis are dealt with in a different way compared to a descriptive approach. In this regard the most salient distinction lies in the fact that the lexicographer does not only provide the results from the empirical analysis but goes further by indicating a specific variant that he/she regards as the recommended form. (Bergenholtz and Gouws 2010:36)

This paradigm shift from prescription to proscription seems to be characteristic of the post-standardisation stage that the Basque language has reached. In accordance with Thomas’ model (1991:121–122), both the stringency of purism and the effort in prescription have been relaxed at this stage. More than by any conviction that ‘the battle has been won’ in favour of Basque, a language formerly regarded as a vernacular and bound to disappear, this may be explained by a growing self-assurance among Basque speakers, who nowadays do not consult a dictionary as often when they write a text as they would have done twenty or more years ago. Consequently, the Basque lexicographer Sarasola urges a softening of institutional prescription efforts, along with a professionalisation of corpus-based proscriptive lexicography:

We used to propose correctness, and we didn’t achieve it. It is usage which is important, that is, the word that is used.
We have to develop the Basque language in the shade of Spanish. You may say, a certain word goes against unity, but what if nobody cares? What shall you do then? Break out in tears? We have to achieve unity, yes, but with the most used words. There was a time we proposed ‘krokodilo’ as correct form [sc. not ‘kokodrilo’ like in Spanish], and everybody took notice of that. But today, an approach like that doesn’t work. So, what shall we do? We shall resign and save what can be saved. Usage prevails, and we have to respect that. That is why corpora are so important. (Ibon Sarasola, 2015 speech at UPV/EHU; translation and brackets M.A./D.L.)

Egungo Euskararen Hiztegia, ‘The Dictionary of Contemporary Basque’13 (EEH2007, see an example entry in Figure 3) is an online dictionary of Basque as it is used today, based entirely and only on EPG14 (Sarasola et al. 2007), a hand-selected corpus of contemporary reference prose with 25 million tokens. The project to edit this dictionary has been ongoing since 2007, and will take several more years to complete.

Figure 3. EEH2007, s.v. hiztegigintza, ‘lexicography’.

With time, additional item types such as grammatical and diachronic information will be incorporated. In its present form, the dictionary offers entries consisting of definitions, synonyms and corpus-derived usage examples for the word senses of the lemma (cf. Figure 3, which shows the entry ‘lexicography’). This dictionary follows a purely descriptive approach, and consequently, only lemmata found in the corpus enter the dictionary as headwords.
Terminology is undergoing a shift from prescription to proscription with the program Terminologia Sareak Ehunduz (‘Weaving Terminology Networks’), which aims at making the real terminology and phraseology of specialised communication visible to experts, to linguists and to participants in language normalisation initiatives. Terms are extracted from real texts employed in university teaching.15

5. Conclusions and further work

In this article, we have revisited Basque lexicography from the point of view of purism. We have asked what kinds of purist intentions motivated Basque lexicographers. We have proposed a rough periodisation of Basque lexicography according to the stages of language standardisation, showing a relation of these stages to a functional classification of lexicographical products. Based on lemma frequency data from different corpora for small example sets of purist neologisms and their loanword synonyms, we have proposed two hypotheses on the success of purist terminology proposals. First, as regards the purist term production of Sabin Arana and his followers in the early 20th century, we have seen that newly coined terms that filled lexical gaps, that is, terms that had no established loanword synonym to replace, were bound to succeed, while purist proposals that corresponded to already established internationalisms are not among the most frequently used neologisms of this period. Second, we have compared the frequencies of some terms necessary in modern life, in their loanword and their purist variants dating from a period of standardisation backed by the Language Academy. Here we have observed that where all main dictionaries agree on the same proposed neologism, this neologism is much more likely to prevail over its borrowed synonym than where the main reference resources do not propose the same term. In the latter cases, the loanword leads the frequency ranking.
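The second observation can be retraced from the ranks in Table 3. The following Python sketch is a minimal illustration only, not the procedure used for this study. It takes the top-ranked item of each synonym set plus one counterpart, tags each item as loanword or purist neologism, and checks which type leads. The tags are our reading of the discussion in sections 3.2 and 3.4, and the consensus flag per set is copied from that discussion rather than derived from the dictionary columns; all names in the code are hypothetical.

```python
from dataclasses import dataclass

LOAN = "loanword"
NEOLOGISM = "purist neologism"


@dataclass(frozen=True)
class Variant:
    lemma: str
    tag: str   # LOAN or NEOLOGISM; an assumption of this sketch
    rank: int  # position in the set's frequency ranking, as given in Table 3


# Subset of Table 3. The boolean says whether the text reports a consensus
# recommendation among the dictionaries for that concept.
SYNONYM_SETS = {
    "dummy": (False, [Variant("txupete", LOAN, 1),
                      Variant("xurgaki", NEOLOGISM, 5)]),
    "corkscrew": (False, [Variant("sakakortxo", LOAN, 1),
                          Variant("kortxo_torloju", NEOLOGISM, 5)]),
    "bathtub": (True, [Variant("bainuontzi", NEOLOGISM, 1),
                       Variant("bainera", LOAN, 2)]),
    "washing machine": (True, [Variant("garbigailu", NEOLOGISM, 1),
                               Variant("labadora", LOAN, 2)]),
}


def leader(variants):
    """Return the variant with the best (lowest) frequency rank."""
    return min(variants, key=lambda v: v.rank)


def leaders_by_consensus(sets):
    """Collect the tag of each set's leader, grouped by the consensus flag."""
    grouped = {True: [], False: []}
    for consensus, variants in sets.values():
        grouped[consensus].append(leader(variants).tag)
    return grouped


summary = leaders_by_consensus(SYNONYM_SETS)
print("consensus sets led by:", summary[True])
print("no-consensus sets led by:", summary[False])
```

With only four sets the outcome is an illustration rather than a test of the hypothesis; a larger tagged lexicographical corpus would be needed before drawing conclusions.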
Thanks to the ongoing introduction of Digital Humanities methodology into Basque philology, more and more of the textual and lexicographical sources referenced in this article can be found in structured representation formats. This is necessary for further research into the issues developed here, as are the creation of diachronically indexed subsets of the already available Basque reference corpora and the extraction of frequency data for each of them. As soon as the content of paper dictionaries has been retro-digitised and enriched with structural mark-up covering all types of information, we will be able to test our hypotheses about the success factors of different sorts of neologisms, in relation to loanword synonyms and as tendencies over time. In this manner we may move from isolated examples to a broad empirical scale: that is, towards taking into account the lexicographical corpus of Basque as a whole.16

Footnotes

1 For Basque dialectal variation, cf. Zuazo 2013.
2 The Statute of Autonomy of the Basque Autonomous Community dates from 1979. In that sense, it can be said that the cultivation of the standard variety was linked to a nation-building project (Davies 2012: 56). For the evolution of Basque during the last decades and its present-day situation, cf. the Basque Government's Fifth Sociolinguistic Survey, available online, see http://www.euskara.euskadi.eus/r59-734/en/.
3 For a more detailed account of the process of selection, codification and elaboration of standard Basque cf. Hualde and Zuazo 2007, Salaburu and Alberdi 2012.
4 In its present version, the OEH dictionary can be consulted only online: see http://www.euskaltzaindia.eus.
5 Euskaltzaindiaren Hiztegia can be consulted online: see http://www.euskaltzaindia.eus.
6 It can be consulted online: see http://www.euskara.euskadi.eus/r59-734/en/.
7 Query for Telefon, Telephon and Fernsprecher in the German Google Books corpus (1870-2008, graph smoothing factor 3), Google Ngram Viewer (accessed on March 21, 2018).
8 See http://www.ehu.eus/lag/.
9 ETC and the 2014 Elhuyar web corpus contain 200 million tokens each. While ETC contains hand-selected reference prose, the Elhuyar web corpora are built by automatic methods and thus cover a wider range of registers (cf. also Lindemann and San Vicente 2015). ETC can be queried at http://www.ehu.eus/etc/; the Elhuyar corpora at http://webcorpusak.elhuyar.eus/.
10 The contents of this dictionary have also been represented as XML (Arriola … Sarasola 2003), from which we have obtained the data referred to here.
11 On the other hand, SEH1996 contains no more than 6,100 of the 48,800 historical lemmata and variants present in the corpora used for the compilation of OEH1986 but not in the big reference corpora available today (cf. section 8).
12 See http://www.ehu.eus/ehg/zehazki/.
13 See http://www.ehu.eus/eeh/.
14 See http://www.ehu.eus/euskara-orria/euskara/ereduzkoa/.
15 The outcomes can be consulted online at https://www.ehu.eus/ehusfera/tse/.
16 The research leading to these results has received funding from the European Union's Seventh Framework Programme for research, technological development and demonstration under grant agreement no. 613465, and from the Basque Government (IT665-13). Funding is gratefully acknowledged.

References

Adorez Taldea (ed.). 2009. Adorez 5000 hiztegia. Bostak Bat Lantaldea. (ADO2009)
Azkarate M., Kintana X., Mendiguren X. (eds). 2006. Elhuyar hiztegia: euskara-gaztelania, castellano-vasco. Usurbil: Elhuyar. (ELH2006)
Azkue R. M. 1984 [1905-06]. Diccionario vasco-español-francés. Bilbo: Euskaltzaindia. (AZK1905)
Bera E., López-Mendizabal I. 1916. Diccionario castellano-euzkera / Bera’tar Eroman Mirena aba, buruñurduna. Euzkel-erdel iztegia / López Mendizabal’dar Ixaka. Tolosa: E. Lopez. (BEM1916)
Euskaltzaindia (ed.). 2016. Euskaltzaindiaren Hiztegia. Bilbo: Euskaltzaindia. (EHI2016)
Kintana X. (ed.). 1977. Euskal Hiztegi Modernoa. Bilbo: Cinsa. (KIN1977)
Larramendi M. 1745. Diccionario trilingüe castellano, bascuence y latin dedicado a la M.N. y M.L. provincia de Guipuzcoa. San Sebastián: Bartholomé Riesgo y Montero. (LAR1745)
Mitxelena K., Sarasola I. 1986. Orotariko Euskal Hiztegia – Diccionario General Vasco. Bilbo: Euskaltzaindia; Desclée de Brouwer. (OEH1986)
Múgica Urdangarín L. M. 1977. Diccionario General Y Técnico. San Sebastián: Ed. Vascas. (MUG1977)
Real Academia. 1726. Diccionario de la lengua castellana en que se explica el verdadero sentido de las voces, su naturaleza y calidad, con las phrases o modos de hablar, los proverbios o rephranes, y otras cosas convenientes al uso de la lengua... Madrid: Francisco del Hierro. (RAE1726)
Sarasola I. 1996. Euskal Hiztegia. Donostia: Kutxa Gizarte eta Kultur Fundazioa. (SEH1996)
Sarasola I. 2005. Zehazki: gaztelania-euskara hiztegia, diccionario castellano-euskera. Irun: Alberdania. (ZEH2005)
Sarasola I. 2007. Egungo Euskararen Hiztegia. Bilbo: UPV/EHU. (EEH2007)

Arriola J., Artola X., Arregi X., Díaz De Ilarraza A., García E., Laskurain B., Sarasola K. 2003. ‘Semiautomatic Conversion of the Euskal Hiztegia Basque Dictionary to a Queryable Electronic Form’. T.A.L. journal 44.2: 107–124.
Azkarate M. 2008. ‘Hiztegigintza eta euskararen normalkuntza’. Euskalgintza XXI. mendeari buruz. Iker-19. Euskaltzaindia, 171–182.
Bergenholtz H., Gouws R. 2010. ‘A Functional Approach to the Choice between Descriptive, Prescriptive and Proscriptive Lexicography’. Lexikos 20.1: 26–51.
Boeder W., Brincat J., Stolz T. (eds). 2003. ‘Preface’. In Brincat, J., W. Boeder and T. Stolz (eds). Purism in minor languages, endangered languages, regional languages, mixed languages. Papers from the conference on ‘Purism in the Age of Globalisation’ Bremen, September 2001. Brockmeyer, vii–xiv.
Brunstad E. 2010. ‘Standard Language and Linguistic Purism’. Sociolinguistica 17.1: 52–70.
Crystal D. 1997. The Cambridge Encyclopedia of Language. Cambridge University Press.
Davies W.V. 2012. ‘Myths we live and speak by’. In Hüning M., Vogl U., Moliner O. (eds). Standard Languages and Multilingualism in European History. John Benjamins, 45–69.
Haugen E. 1983. ‘The Implementation of Corpus Planning: Theory and Practice’. In Cobarrubias J., Fishman J. A. (eds), Progress in Language Planning: International Perspectives. Walter de Gruyter, 269–289.
Hualde J. I., Zuazo K. 2007. ‘The standardization of the Basque language’. Language Problems and Language Planning 31.2: 142–168.
Langer N., Davies W. 2005. ‘An Introduction to Linguistic Purism’. In Langer N., Davies W. (eds), Linguistic Purism in the Germanic Languages. Walter de Gruyter, 1–17.
Larramendi M. de. 1729. El impossible vencido. Arte de la lengua bascongada. Salamanca: A.J. Villargordo Alcaráz.
Leturia I. 2012. ‘Evaluating Different Methods for Automatically Collecting Large General Corpora for Basque from the Web’. Proceedings of 24th International Conference on Computational Linguistics (COLING 2012). Mumbai, India, 1553–1570.
Leturia I. 2014. The Web as a Corpus of Basque. PhD Thesis. Donostia: UPV/EHU.
Lindemann D., San Vicente I. 2015. ‘Building Corpus-Based Frequency Lemma Lists’. Procedia - Social and Behavioral Sciences 198: 266–277.
Löffler M. 2003. ‘Purism and the Welsh Language: a matter of survival?’. In Brincat, J., W. Boeder and T. Stolz (eds). Purism in minor languages, endangered languages, regional languages, mixed languages. Papers from the conference on ‘Purism in the Age of Globalisation’ Bremen, September 2001. Brockmeyer, 61–90.
Milroy J. 2001. ‘Language ideologies and the consequences of standardization’. Journal of Sociolinguistics 5.4: 530–555.
Milroy J. 2005. ‘Some effects of purist ideologies on historical descriptions of English’. In Langer N., Davies W. (eds), Linguistic Purism in the Germanic Languages. Walter de Gruyter, 324–341.
Mitxelena K. 1968. ‘Orthography’. In Salaburu P. 2008. Koldo Mitxelena: Selected Writings of a Basque Scholar. University of Nevada, Reno, 253–271.
Mitxelena K. 1984. ‘Hauta-Lanerako Euskal Hiztegia-ren Aurkezpena’. In Sarasola I. (ed), Hauta-Lanerako Euskal Hiztegia. Donostia: Fundación Kutxa.
Oñederra M. L. 1990. ‘Morphonological Aspects of Word Compounding in Basque’. In Boretzky N. (ed). Spielarten der Natürlichkeit, Spielarten der Ökonomie. Brockmeyer.
Pagola I. 2005. Neologismos en la obra de Sabino Arana Goiri. (Iker-18). Bilbo: UPV/EHU; Euskaltzaindia.
Salaburu P., Alberdi X. 2012. ‘The Search for a Common Code’. In Salaburu P., Alberdi X. (eds). The Challenge of a Bilingual Society in the Basque Country. Center for Basque Studies, University of Nevada, Reno, 93–112.
Sarasola I. 2002. ‘Euskal Hiztegigintzaren Historiarako Oharrak: Añibarro, Iztueta eta Aizkibelen Hiztegiez, Eta Azkueren Hiztegigintzaz’. In Artiagoitia X., Goenaga P., Lakarra J. (eds), Festschrift for Rudolf P.G. de Rijk. (ASJU Separata XLIV). Bilbo: UPV/EHU, 611–628.
Sarasola I. 2003. ‘Lexicography in the last quarter of the 20th century up to the publication of the Orotariko Euskal Hiztegia’. In Gorrochategui J. (ed.), Basque and (Paleo)Hispanic Studies in the Wake of Michelena’s Work. UPV/EHU, 219–244.
Sarasola I., Landa J., Salaburu P. 2013. ‘Egungo Testuen Corpusa’. UPV-EHU. http://www.ehu.eus/etc/.
Sarasola I., Salaburu P., Landa J., Ugarteburu I. 2008. Lexikoa Atzo eta Gaur. Bilbo: UPV-EHU. http://www.ehu.eus/lag/.
Sarasola I., Salaburu P., Landa J., Zabaleta P. 2007. ‘Ereduzko Prosa Gaur’. UPV/EHU. http://www.ehu.eus/euskara-orria/euskara/ereduzkoa/.
Thomas G. 1991. Linguistic Purism. Longman.
Urgell B. 2000. Larramendiren Hiztegi Hirukoitza-ren osagaiez. PhD Thesis. Vitoria-Gasteiz: UPV/EHU.
Van der Sijs N. 1999. ‘Inleiding. De rol van taalzuivering in de taalontwikkeling: historische en politieke aspecten’. In Van der Sijs N. (ed), Taaltrots. Contact, 11–36.
Vogl U. 2012. ‘Multilingualism in a standard language culture’. In Hüning M., Vogl U., Moliner O. (eds). Standard Languages and Multilingualism in European History. John Benjamins, 1–42.
Zgusta L. 1989. ‘The Role of Dictionaries in the Genesis and Development of the Standard’. In Hausmann F.J., Reichmann O., Wiegand H.E., Zgusta L. (eds.), Wörterbücher. Dictionaries. Dictionnaires. Erster Teilband / First Volume / Tome Premier. Walter de Gruyter, 70–79. [reprinted in Dolezal, F. and T. Creamer (eds). 2006. Ladislav Zgusta. Lexicography Then and Now. Selected Essays. Max Niemeyer Verlag, 186-197]
Zuazo K. 2013. The Dialects of Basque. Center for Basque Studies, University of Nevada, Reno.

© 2018 Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices).

Journal

International Journal of Lexicography, Oxford University Press

Published: May 17, 2018
