An Experimental Study of Dictionary Use on Vocabulary Learning and Reading Comprehension in Different Task Conditions

Abstract

This study examines the effects of dictionary use and integrated skills on vocabulary learning and reading comprehension, and on learners’ perceived difficulty of reading texts. Sixty-two high school students were assigned to four groups according to task condition: read-aloud with dictionary use, listening with dictionary use, dictionary use only, and noticing (non-dictionary). The learners’ vocabulary learning and reading comprehension were measured through pre-, post-, and delayed tests. Their perceived text difficulty was checked with a seven-point Likert scale. The results showed substantial improvements in vocabulary scores over time, regardless of task condition. A sustained effect was also found for all task conditions in both vocabulary learning and reading comprehension, except for the noticing group in reading comprehension, supporting the positive influence of dictionary use. As for perceived text difficulty, the read-aloud group rated the texts as significantly easier than the noticing group did.

1. Introduction

Recent years have seen a steady increase in studies on dictionary use (Chen 2011a). Experimental studies have examined the effectiveness of dictionaries by dictionary type, such as monolingual, bilingual, and bilingualized dictionaries (Chen 2011b, Hayati and Pour-Mohammadi 2005, Laufer and Hadar 1997, Luppescu and Day 1993, Pujol, Corrius, and Masnou 2006, Tono 1989, Zarei and Naseri 2008). With the extensive use of electronic dictionaries, investigations comparing electronic with print dictionaries have also been on the rise (Chen 2010, 2011a, Dziemianko 2010, 2011, 2012, Koyama and Takeuchi 2003, 2004a, 2004b, 2007).
Other lines of research have been on the effects of dictionary consultation on vocabulary learning (Hulstijn, Hollander, and Greidanus 1996, Luppescu and Day 1993, Ronald 2002), on reading comprehension (Bensoussan, Sim, and Weiss 1984, Nesi and Meara 1991, Shieh and Freiermuth 2010, Tono 1989), and on both vocabulary and reading comprehension (Fraser 1999a, Knight 1994, Summers 1988). Research on dictionary use, in many cases, has been embedded in reading context, rather than context-free vocabulary learning conditions. It is possibly because learning words is inseparable from reading; we read newspaper, books, magazines, and e-mails every day from which we encounter many words, and one of the ultimate goals of learning new words is to gain more information through reading. Despite the interest in vocabulary development using dictionaries in the context of reading in the second or foreign language (hereafter L2) field, studies of dictionary use in association with other language skills have been rarely carried out. Furthermore, even though the development and consultation of dictionary apps increase due to a wide use of smartphones (Holmer, von Martens, and Sköldberg 2015), there has been no experimental research conducted on the effects of smartphone dictionary use on vocabulary learning and reading comprehension. The present study aims to explore the impact of dictionary use on vocabulary learning and reading comprehension in diverse task conditions. Additionally, this study attempts to examine the way L2 dictionary users perceive text difficulty. The following are four task conditions to be studied: read-aloud with dictionary use, listening with dictionary use, dictionary use only, and noticing with no dictionary reference. All the four conditions were embedded within the context of reading. With regard to dictionary use, a smartphone English-Korean bilingual dictionary (hereafter, SBD) application program was employed. 
For the current study, the three research questions below were formulated.

(1) What are the immediate effects of task conditions (read-aloud activity with dictionary use, listening activity with dictionary use, dictionary use only, and noticing with no dictionary use) on L2 vocabulary learning and reading comprehension?
(2) What are the delayed effects of task conditions (read-aloud activity with dictionary use, listening activity with dictionary use, dictionary use only, and noticing with no dictionary use) on L2 vocabulary learning and reading comprehension?
(3) How do L2 learners perceive text difficulty according to task conditions?

2. Literature review

2.1 Paper and electronic dictionaries and dictionary preference

Since the increasing use of electronic dictionaries in language classrooms around the 1990s (Midlane 2005, Stirling 2005, Tang 1997), one focus of dictionary research has been the comparison of paper dictionaries (PD) with electronic dictionaries (ED) (Nesi 2014). Koyama and Takeuchi’s study (2003), one of the first attempts to compare the two dictionary forms, found no significant difference in the number of words looked up, search time, and word retention between PD and ED groups. A subsequent study by the same authors (2004a) reported little difference in either search time or retention of words between PD and ED, while the words looked up in the PD condition were recognized significantly better than those looked up in the ED condition. Another empirical study by Koyama and Takeuchi (2004b) showed no significant difference in reading comprehension scores between the two dictionary types. They added that the ED group consulted dictionaries more frequently than the PD group and in a shorter time, suggesting that more frequent look-up does not ensure better comprehension of reading texts.
Two sub-studies conducted in Koyama and Takeuchi’s research (2007) confirmed the results that whether the participants had access to PD or ED did not make difference in reading comprehension, and ED users looked up more words than PD users in a briefer period of time. Chen (2010) compared the effects of PD and ED on vocabulary acquisition, and found no significant differences in vocabulary comprehension, production, and retention between the two. Another study conducted by the same author (Chen 2011a) included the guessing group as well as PD and ED groups. It was found that both dictionary groups gained significantly higher scores than the guessing group in vocabulary comprehension and retention, yet no significant difference was shown in the two dictionary groups. In contrast, Dziemianko (2010) found that online dictionary use was more effective than PD use in doing the receptive task, the productive task, and the retention of both target words and collocations. Two subsequent replication studies by Dziemianko (2011, 2012), however, did not confirm the superiority of ED in reception, production and retention of target items and collocations. The author mentioned that possible factors for different obtained results could have been structural features of different dictionaries such as organization, colours or shapes of information, and extent to effort to access entry. Despite the increase of comparative studies on paper and electronic versions of dictionaries, to our knowledge, experimental studies using smartphones are non-existent. Researchers also investigated L2 learners’ dictionary consulting frequency and preference for dictionary types. Studies indicate L2 learners commonly own and prefer to use bilingual dictionaries (Atkins and Knowles 1990, Atkins and Varantola 1997, Baxter 1980, Bensoussan et al. 
1984, Hsien-jen 2001, Nation 2001, Nation and Webb 2011, Nesi 2014, Rundell 1999), and even a large number of bilingualized dictionary users report that they mostly refer to the L1 part rather than the L2 information (Pujol et al. 2006). A questionnaire survey by Baxter (1980) of 342 Japanese university students displayed a strong predilection for bilingual dictionaries; all of the respondents said they had more than one bilingual dictionary, while 76.3% of the non-English majors had not even bought an L2 monolingual dictionary. Further, 88.3% responded that they referred to a bilingual dictionary daily or several times a week, whereas only 10.8% used a monolingual dictionary. Another extensive survey on dictionary use, conducted by Atkins and Knowles (1990), revealed that 57.9% of the 740 respondents consulted a bilingual dictionary almost every week and only 0.4% claimed they never used one. The preference for bilingual dictionaries was also revealed in the study of Bensoussan et al. (1984), which disclosed that 59% of 670 and 58% of 740 participants consulted bilingual dictionaries whereas 20% and 21% used monolingual dictionaries. Atkins and Varantola (1997) confirmed the preference; in their study, bilingual dictionaries were consulted 714 times (71%) out of 1,000 look-ups, while monolingual dictionaries were consulted 281 times (28%).

2.2 Impact of dictionary use on vocabulary learning and reading comprehension

A line of research into dictionary use has addressed the value of dictionaries for vocabulary learning and reading comprehension. As for vocabulary learning, results from previous studies support the role of the dictionary as a facilitator. Luppescu and Day (1993) found that dictionary users, around half of the 293 Japanese university students, gained significantly higher scores on a vocabulary test than non-dictionary users.
Ronald (2002) also found benefits of dictionary use to L2 vocabulary growth through a case study focusing on one motivated learner majoring in English. The evidence of values of dictionary use was partly supported by Hulstijn et al. (1996). They investigated the influence of three vocabulary strategies in reading context – marginal glosses, bilingual dictionary use, and reading text only – and one of their main findings was that when the dictionary group did consult dictionaries, their retention rate of words was greater than that of the other groups. Contrary to the generally accepted idea of the usefulness of dictionaries on vocabulary learning, a mixed bag of results has been displayed in reading comprehension. Bensoussan et al. (1984) investigated the effect of dictionary use on reading comprehension, and found no significant correlation. This result was in accordance with that of Nesi and Meara’s study (1991) which was conducted under two conditions, dictionary access and non-dictionary groups. In their study, the two groups did not significantly differ in reading test scores. In Tono’s study (1989), however, that was not the case; the bilingual English-Japanese dictionary group, which consisted of junior high school students, performed significantly better on a reading comprehension test. Shieh and Freiermuth (2010) also emphasized the effects of dictionary reference as a way of improving reading comprehension, showing higher scores of with-dictionary group. They added that the positive effects of dictionary use happen only if sufficient time is allowed for dictionary consultation. On both vocabulary learning and reading comprehension, Summers (1988) drew a comparison among four groups, including three different entry types of dictionary – definitions only, examples only, definitions and examples – and a control group. 
The author found that the non-dictionary group fared worst, while dictionary use, regardless of entry type, brought statistically significant improvements in both vocabulary learning and reading comprehension. The finding was corroborated by Knight’s study (1994), which used two vocabulary tests and one reading test. It was found that the dictionary group demonstrated significant differences in both vocabulary learning and reading comprehension. Different results, however, were revealed in an experimental study by Fraser (1999a). She investigated the impact of three lexical processing strategies (consulting a dictionary, inferencing, and ignoring) and reported that a high proportion (78%) of dictionary consultation led to full text comprehension, while no difference between consulting a dictionary and inferencing was shown in vocabulary retention. It was also stated that the combination of consulting a dictionary and inferencing, followed by consulting only, was most effective for vocabulary retention. The previous research on dictionary use, as noted above, has largely been conducted solely within a reading condition. In other words, few studies have handled the effects of dictionary consultation with integrated language skills. The present study aims to assess the effects of dictionary use on vocabulary learning and reading comprehension along with integrated skills, such as speaking and reading, and listening and reading. Directing the study towards the combination of dictionary consultation with integrated language skills would broaden the scope of future research and enrich our understanding of the relationship between dictionary use and language skill integration.

3. Method

3.1 Participants

The participants for this study were initially 96 students who answered the questionnaire survey on dictionary use. However, nineteen students were excluded in the process of selecting target words.
Eligibility depended on how many of the twenty target words a student already knew: qualified participants knew fewer than two. The remaining 77 participants were almost equally assigned to four groups: read-aloud, listening, dictionary only, and noticing. We named the groups after the distinctive feature of the task each group would perform. The read-aloud group did a read-aloud activity; the listening group did a listening activity; the dictionary only group consulted SBDs and did not perform any language skill-related activities; and the noticing group was given chances to notice the meaning of unknown target words without any dictionary use or language skill-related activities. After the experiment, another fifteen were excluded from the data analysis because they missed more than two of the reading and vocabulary tests, leaving 62 participants. They were in the 10th grade at a high school in South Korea. They were all born and raised in Korea, had no experience of studying abroad, and shared the same mother tongue and cultural background. They had also studied English as a required subject in the national curriculum for around eight years and took 5 hours of class a week on average. As Table 1 shows, the groups consisted of seventeen, thirteen, thirteen, and nineteen participants respectively, with 34 males (54.8%) and 28 females (45.2%) in total: nine males and eight females in the read-aloud group, five males and eight females in the listening group, seven males and six females in the dictionary only group, and thirteen males and six females in the noticing group.
Table 1 Distribution of the participants
Group            N   Male        Female
Read aloud       17  9 (52.9%)   8 (47.1%)
Listening        13  5 (38.5%)   8 (61.5%)
Dictionary only  13  7 (53.8%)   6 (46.2%)
Noticing         19  13 (68.4%)  6 (31.6%)
Total            62  34 (54.8%)  28 (45.2%)

3.2. Materials

3.2.1 Reading materials and target word items.

As for reading materials, the participants read four expository texts concerning Africa and the global environment, which were extracted from More Reading Power (2nd edition) by Mikulecky and Jeffries (2004). Several factors were considered in selecting the reading materials. First, expository texts were chosen with the students’ reading purpose in mind, since the participants were studying English for the Korean Scholastic Aptitude Test (KSAT), the university entrance exam, in which more than 20 of the 28 reading questions are based on expository texts. Next, all four texts comprised five to six passages containing similar numbers of tokens (running words in the text), ranging from 497 to 506 (see Table 2). Additionally, to ensure a similar level of text difficulty, readability for all the reading texts was computed and checked through the program Web VP Classic (www.lextutor.ca/vp/eng/). As shown in Table 2, the number of types (different words in the text) ranged from 219 to 250, and the number of word families (groups of cognate words) ranged from 162 to 190. The combined percentage of the most frequent 1000 words of English (1K words) and the second most frequent 1000 words of English (2K words) in the four expository texts ranged from 83.27% (N=418) to 89.53% (N=445). These figures are somewhat higher than the roughly 80% reported for typical written texts.
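The token, type, and coverage figures described above can be illustrated with a minimal sketch. It is not the Web VP Classic procedure itself: the tokenizer is a simple assumption, and `TOY_HIGH_FREQ` is a hypothetical stand-in for the 1K and 2K frequency lists, which are not reproduced in this paper.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens (a simplifying assumption;
    a profiler such as Web VP Classic may tokenize differently)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def lexical_profile(text, high_freq_words):
    """Count tokens (running words), types (distinct words), and the share
    of tokens that appear in a given high-frequency word list."""
    tokens = tokenize(text)
    covered = sum(1 for t in tokens if t in high_freq_words)
    return {
        "tokens": len(tokens),
        "types": len(set(tokens)),
        "covered": covered,
        "coverage_pct": 100 * covered / len(tokens),
    }


def coverage_pct(covered_tokens, total_tokens):
    """Coverage as a percentage of running words."""
    return 100 * covered_tokens / total_tokens


# Hypothetical toy word list standing in for the 1K + 2K lists.
TOY_HIGH_FREQ = {"the", "forest", "is", "a", "of", "in", "and",
                 "many", "are", "trees"}

sample = ("The forest is a home of many trees, "
          "and many amphibians are in the forest.")
profile = lexical_profile(sample, TOY_HIGH_FREQ)

# Recomputing one reported figure from Table 2: text 2 has 418 covered
# tokens out of 502.
text2_coverage = round(coverage_pct(418, 502), 2)
```

In the toy sentence, "home" and "amphibians" fall outside the list, so 13 of 15 tokens are covered. Percentages recomputed this way from the counts in Table 2 may differ slightly from the published ones, depending on how the tool counts and rounds.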
The Flesch-Kincaid Grade (FKG) Level indicated that all four texts were in a similar range of reading difficulty, from 8.70 to 11.14, though the FKG level of text 2 was somewhat higher than that of the others (see Table 2).¹ Fourthly, every reading text was followed by ten multiple-choice questions with five answer options each. Of the ten reading comprehension questions, eight were adapted from More Reading Power, with the number of answer options changed from four to five. Two questions were newly written to measure more accurately the global understanding of the text and the identification of specific information. All of the questions were solved and examined by two Korean teachers of English and one native English-speaking teacher. After we reviewed and double-checked the questions, disagreements were resolved through discussion and the final versions of the reading comprehension questions were produced. To establish internal consistency, Cronbach α coefficients were computed for the four reading posttests. The results for posttests 1 to 4 were 0.769, 0.753, 0.729, and 0.813 respectively.

Table 2 Text readability results from Web VP Classic
Text  Tokens  Types  Word families  1K and 2K words percentage (N)  FKG Grade level
1     497     238    190+?          89.53% (445)                    9.94
2     502     219    162+?          83.27% (418)                    11.14
3     503     239    185+?          89.27% (450)                    8.70
4     506     250    181+?          84.59% (428)                    9.40

For target word selection, three teachers who had taught the participants for two consecutive years discussed and picked out 40 words which seemed unknown to the students.
The reason for selecting 40 words was that the students were accustomed to the format of 40-word vocabulary tests, which had often been used in class. The 40 words were used as a pretest, and finally 20 target words were selected which were unknown to more than 96% (N=60) of the 62 participants. The target lexical items met the following criteria: a) proper nouns and technical terms were excluded because, as Chen (2011a) mentioned, they are of little experimental value; b) only single words of fewer than five syllables, not idiomatic expressions or word associates, were included, since multi-word lexical items could place a memory burden on the participants; and c) only content words were selected, with parts of speech limited to adjective, adverb, noun, and verb, which are of relatively more importance in comprehending texts than others. The selected target words were destructive, dramatically, worsen, preventable, conflict, up-to-date, afford, operator, correspondence, course, plantation, deforestation, ecology, coral, reef, amphibian, endanger, mass, substance, and ultraviolet.

3.2.2 Smartphone bilingual dictionary.

In order to select a dictionary for the study, we conducted a questionnaire survey on the dictionary types in use (bilingual, monolingual, or bilingualized) and the dictionary forms in use (paper, computer desktop, pocket electronic, or smartphone app dictionaries). The survey was handed out to 96 students. The results showed a preference for bilingual dictionaries among the types and for smartphone dictionary apps among the forms. Specifically, most of the students (N=92, 95.8%) responded that they were using English-Korean bilingual dictionaries; only a small number (N=4, 4.2%) were using monolingual dictionaries, and none bilingualized dictionaries. This penchant for bilingual dictionary use has been reported in many previous studies (Atkins and Knowles 1990, Atkins and Varantola 1997, Baxter 1980, Bensoussan et al.
1984, Hsien-jen 2001, Nation and Webb 2011, Rundell 1999). Interestingly, the survey conducted in the present study also revealed that when consulting dictionaries, all of the participants (N=96, 100%) used dictionary apps downloaded on their cell phones, rather than printed or pocket electronic dictionaries. Based on the survey results, and to control the variables of dictionary form and type, the dictionary used in the study was restricted to an English-Korean bilingual dictionary on the smartphone. To eliminate differences among SBDs in, for instance, letter size, layout, font, phonetic symbols, example sentences, idiomatic and lexical phrases, and translated meanings, a single English-Korean dictionary app was selected, one that can be freely downloaded from the Google Play Store. To make the selection, fifteen SBDs were searched and examined to find out which one was most appropriate for the study. Taking the variables suggested above into consideration, we selected one SBD application program, named Manneung ‘omnipotent’ English dictionary. All the participants, except the noticing group, downloaded the same SBD app on their smartphones and used it while engaging in the tasks. Despite the availability of the SBD, however, participants might not use it when confronting the target words while reading. The failure to consult dictionaries on target words, or their sparing use, has also been mentioned in other studies (Chen 2011a, Hulstijn et al. 1996), and investigations of the frequency of dictionary consultation have shown great individual differences (Fraser 1999a, 1999b). To prevent the underuse of dictionaries, in this study, the participants were asked to find the meaning of all the underlined target words and to put a check mark beside the words.

3.3. Procedure

To figure out students’ perceptions on dictionary use, we first administered the questionnaire to 96 students.
For approval from the participants as volunteers for the study, consent forms assuring their voluntary participation and confidentiality of identity were distributed and they all agreed. Next, to ensure the homogeneity of their vocabulary knowledge, a vocabulary pretest, containing 40 lexical items, was taken by all of the participants for 15 minutes. They were asked to write the meaning of the words in their L1, Korean. Among the 40 words, 20 words unknown to most of the students (N=92, 95.8%) were selected as target lexical items and only they were counted. As a reading comprehension pretest, fifteen reading questions on the 2012 National Assessment of Educational Achievement (NAEA) test were administered for 30 minutes.2 After both vocabulary and reading pretests, nineteen students were excluded who knew more than two target words in the vocabulary pretest, and the rest 77 participants were assigned to four groups, 19 for read-aloud, listening, dictionary only groups respectively and 20 for noticing group. The three dictionary groups, read-aloud, listening, and dictionary only, were directed to download the smartphone dictionary application selected and had time to practice using the SBD in order to prevent the effects of students’ inadequate dictionary consulting skills. After one week from the pretests, treatment of different task conditions for each group such as a read aloud activity with dictionary use, a listening activity with dictionary use, dictionary consultation, and opportunities for noticing target words was given every Tuesday and Thursday for two consecutive weeks. To set up a consistent and systematic instructional procedure, one of the researchers, a Korean teacher of English, taught all the four groups. During the experiment period, each group performed the given task paragraph by paragraph, not covering the whole reading texts at a time. 
This teaching procedure was intended for reducing the participants’ cognitive burden of remembering reading contents in the whole passage at once. For the read-aloud group, forty minutes were given in each treatment. Students were asked to read one paragraph, 80 to 100 words long, and to look up unknown words containing underlined target words in their SBD. Then they put a check mark beside the words consulted. After consulting dictionaries while reading, the instructor made the group read aloud the sentences including target words twice. Then they solved one to three reading comprehension check questions related to the paragraph they read. The same procedure applied to the other paragraphs as well. Shortly after all the tasks, the participants checked their perceptions on the text difficulty using a seven-point Likert scale, and took a vocabulary posttest consisting of five target words and five distractors. The listening group was given the same reading texts and questions. After reading and SBD use for a paragraph, however, instead of the read-aloud activity, the participants listened to the same paragraph recorded by a text to speech (TTS) program, panopreter basic (http://modangs.tistory.com/469). The use of the TTS program is for establishing the same quality of sound variables such as speed, pitch, and intonation. Listening activities were also done paragraph by paragraph. After all the tasks, self-perceptions for the text difficulty were checked with a seven-point Likert scale, and a vocabulary posttest was administered. In all, around forty minute experiment time was also given to this group. The same procedure was practiced for the dictionary only group, except read-aloud and listening activities. The participants used their SBDs for unknown words while reading a paragraph, and then solved the reading questions attached. The rest of the passage was also dealt with in the same way. 
Lastly, perceived text difficulty was marked on a seven-point Likert scale, and vocabulary posttests were taken. It took about thirty-five minutes. In the case of the noticing group, the participants were not allowed to consult any dictionaries. They just had the opportunity to notice the meaning of the underlined words during reading, after which they solved the reading comprehension questions. After the whole reading passage was handled as above, they were asked to rate text difficulty on a seven-point Likert scale and took a vocabulary posttest. Altogether, it took about thirty minutes. Finally, to discover whether sustained improvements in vocabulary or reading ability were made, delayed vocabulary and reading tests were given to all the participant groups two weeks after the last treatment day. On the delayed vocabulary test, the forty words used on the pretest were rearranged and presented in a different order to minimize the carryover effect from the pretest. As a delayed reading test, fifteen questions selected from the 2013 NAEA were used to measure the participants’ reading comprehension.

3.4. Data analysis

The data collected were analyzed with the Statistical Package for the Social Sciences (SPSS) 20.0. Firstly, Cronbach α coefficients were calculated for the internal consistency of the reading comprehension tests. Secondly, the pretest vocabulary and reading scores of the four groups were examined through a one-way ANOVA to make sure that all four groups were initially homogeneous in terms of vocabulary knowledge and reading comprehension. Thirdly, the four posttest vocabulary and reading scores were compared by a repeated-measures ANOVA to see whether short-term effects of task conditions existed, and if any difference among groups was shown, an LSD post hoc test was also conducted.
Fourthly, to determine whether there were delayed effects of task conditions, the pretests and delayed posttests of both vocabulary and reading were compared by a paired-samples t-test. Additionally, with the same purpose, delayed vocabulary scores and delayed reading scores were each compared among groups by a one-way ANOVA. The data for learners’ perceived text difficulty were also compared and analyzed, to see whether different task conditions could help the participants perceive the reading texts as easier. For all statistical tests, the alpha (α) level was set at .05, nondirectional.

4. Results and discussion

At the outset, to ensure the homogeneity of the four groups in terms of vocabulary and reading comprehension, we conducted pretests. The vocabulary pretest consisted of 40 words, of which only the 20 target lexical items were scored. Each word was counted as one point. For the reading comprehension pretest, fifteen multiple-choice reading questions from the 2012 NAEA were used. Each question was also counted as one point. The vocabulary and reading pretest scores of the four groups are presented in Tables 3 and 4. The vocabulary mean scores of the four groups were 0.65, 0.77, 0.54, and 0.84, and the reading comprehension mean scores were 5.76, 5.69, 6.38, and 5.05 respectively. The one-way ANOVAs found no statistically significant difference on either pretest (F = 0.381, p = 0.767 for vocabulary and F = 0.406, p = 0.749 for reading comprehension), which indicates that all the participant groups were homogeneous in their vocabulary and reading ability.

Table 3 Results of group comparison on vocabulary pretest
Group            N   M     SD     F      Sig.   η2
Read aloud       17  0.65  0.786  0.381  0.767  0.019
Listening        13  0.77  0.832
Dictionary only  13  0.54  0.776
Noticing         19  0.84  0.958
* p < 0.05

Table 4 Results of group comparison on reading comprehension pretest
Group            N   M     SD     F      Sig.   η2
Read aloud       17  5.76  4.085  0.406  0.749  0.020
Listening        13  5.69  3.276
Dictionary only  13  6.38  3.042
Noticing         19  5.05  2.990
* p < 0.05

4.1 Immediate effects of task conditions on vocabulary learning

For the immediate effects of task conditions on vocabulary learning, we examined the four vocabulary posttests using a repeated-measures ANOVA. Each vocabulary posttest comprised ten words, including five target words and five distractors. One point was awarded for each target word, so the maximum score on one posttest was five. The descriptive statistics revealed that the read-aloud, listening, dictionary only, and noticing groups scored 3.79, 3.38, 3.23, and 3.29 on average respectively (see Table 5). It was also found that all four groups gained gradually better vocabulary scores over time, with the single exception of the listening group between posttests 1 and 2.
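As a side check on the pretest comparisons in Tables 3 and 4, a one-way ANOVA F ratio and η2 can be recomputed from the group sizes, means, and standard deviations alone. The sketch below is not the SPSS procedure used in the study; the function name `anova_from_summary` is introduced here for illustration, and because the published means and SDs are rounded, the recomputed values only approximate the reported ones.

```python
def anova_from_summary(ns, means, sds):
    """One-way ANOVA from summary statistics.

    ns, means, sds: per-group sample sizes, means, and sample SDs.
    Returns (F, eta_squared, df_between, df_within).
    """
    n_total = sum(ns)
    k = len(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / n_total
    # Between-groups sum of squares: weighted squared deviations of group means.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    # Within-groups sum of squares recovered from the sample SDs.
    ss_within = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds))
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_ratio, eta_squared, df_between, df_within


# Group order: read aloud, listening, dictionary only, noticing.
ns = [17, 13, 13, 19]

# Table 3 (vocabulary pretest) and Table 4 (reading pretest) values.
vocab = anova_from_summary(ns, [0.65, 0.77, 0.54, 0.84],
                           [0.786, 0.832, 0.776, 0.958])
reading = anova_from_summary(ns, [5.76, 5.69, 6.38, 5.05],
                             [4.085, 3.276, 3.042, 2.990])

print("vocabulary: F=%.3f, eta2=%.3f, df=(%d, %d)" % vocab)
print("reading:    F=%.3f, eta2=%.3f, df=(%d, %d)" % reading)
```

Under these inputs the recomputed F ratios come out at roughly 0.37 and 0.41, close to the reported 0.381 and 0.406, with 3 and 58 degrees of freedom.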
Table 5 Descriptive statistics for vocabulary posttests
Group            N   Posttest 1   Posttest 2   Posttest 3   Posttest 4   Total
                     M     SD     M     SD     M     SD     M     SD     M     SD
Read aloud       17  3.59  1.121  3.65  1.367  3.94  1.298  4.00  1.118  3.79  1.216
Listening        13  3.00  1.472  2.77  0.927  3.54  1.330  4.23  0.832  3.38  1.270
Dictionary only  13  2.92  1.706  3.23  1.691  3.23  1.691  3.54  1.506  3.23  1.616
Noticing         19  2.89  1.595  3.37  1.257  3.37  1.257  3.53  1.467  3.29  1.393

The repeated-measures ANOVA results in Table 6 showed a significant main effect of time across the four posttests (F = 7.155, p < 0.001, partial η2 = 0.110), while no significant group difference was shown (F = 0.853, p = 0.471, partial η2 = 0.042). An interaction effect between the two variables was not seen either (F = 1.112, p = 0.357, partial η2 = 0.054).

Table 6 Repeated-measures ANOVA results for vocabulary posttests
Source          SS      df  MS     F      Sig.    Partial η2
Posttest        18.214  3   6.071  7.155  0.000*  0.110
Posttest*Group  8.492   9   0.944  1.112  0.357   0.054
Group           12.680  3   4.227  0.853  0.471   0.042
* p < 0.05

The significant increase in posttest scores over time suggests that all the different types of tasks (read-aloud, listening, dictionary only, and noticing) have a positive immediate effect on vocabulary learning.
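The partial η2 values in Table 6 can be recovered from the F ratios and degrees of freedom through the identity partial η2 = (F × df_effect) / (F × df_effect + df_error). The sketch below is a consistency check, not part of the original analysis. The error degrees of freedom are not reported in Table 6; they are assumed here from the design (62 participants, four groups, four posttests, no sphericity correction): 58 for the between-subjects effect and 3 × 58 = 174 for the within-subjects effects.

```python
def partial_eta_squared(f_ratio, df_effect, df_error):
    """Partial eta squared from an F ratio and its degrees of freedom."""
    return (f_ratio * df_effect) / (f_ratio * df_effect + df_error)


# Assumed error degrees of freedom (not reported in Table 6).
N_PARTICIPANTS, N_GROUPS, N_POSTTESTS = 62, 4, 4
DF_ERROR_BETWEEN = N_PARTICIPANTS - N_GROUPS               # 58
DF_ERROR_WITHIN = (N_POSTTESTS - 1) * DF_ERROR_BETWEEN     # 174

# (F, df_effect, df_error) for each source in Table 6.
table6 = {
    "Posttest": (7.155, 3, DF_ERROR_WITHIN),
    "Posttest*Group": (1.112, 9, DF_ERROR_WITHIN),
    "Group": (0.853, 3, DF_ERROR_BETWEEN),
}

recomputed = {
    source: round(partial_eta_squared(f, df_eff, df_err), 3)
    for source, (f, df_eff, df_err) in table6.items()
}
print(recomputed)
```

Under these assumed error degrees of freedom, the recomputed values agree with the reported 0.110, 0.054, and 0.042 to three decimal places.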
Another reason for the gradual improvement in scores over the experimental period might be the effect of repetition or task familiarity. As the experiment went on, the participants would have become increasingly familiar with the tasks; the more accustomed learners are to instructional procedures, the more likely they are to focus on the learning goals, leading to better test scores. They might also have recognized that they had to memorize the underlined words.

It was also found that task conditions made no immediate difference to vocabulary learning; in the short term, little difference between the dictionary and non-dictionary groups was shown. It can be inferred that noticing, as well as tasks with dictionary use, could exert a positive influence on vocabulary growth. This result conflicts with the findings of Luppescu and Day (1993), who demonstrated the effectiveness of dictionary reference for vocabulary learning, and of Hulstijn et al. (1996), who showed a greater retention rate for dictionary users than for the other groups (marginal glosses and reading only). The difference may be attributable to different approaches to vocabulary learning. Unlike the previous studies, we underlined all the target words, making students pay more attention to them, which might have resulted in vocabulary development in the non-dictionary group as well as the dictionary groups.

4.2 Immediate effects of task conditions on reading comprehension

To identify the immediate effects of task conditions on the learners’ reading comprehension, the four reading posttests were analyzed with a repeated-measures ANOVA. First, the descriptive statistics showed that the read-aloud, listening, dictionary only, and noticing groups scored on average 7.12, 6.73, 6.54, and 6.21 respectively (see Table 7).
Interestingly, in all four posttests, the read-aloud group achieved the highest scores, followed by listening, dictionary only, and noticing, in that order, though there was no significant difference among groups (see Table 8). One reason the noticing group achieved the lowest scores might be that noticing is less directly coupled with reading comprehension than language-skill activities such as read-aloud and listening. Further research is needed to obtain a clearer indication.

Table 7 Descriptive statistics for reading comprehension posttests

Group            N   Posttest 1    Posttest 2    Posttest 3    Posttest 4    Total
                     M     SD      M     SD      M     SD      M     SD      M     SD
Read aloud       17  6.94  1.749   7.29  1.724   7.18  1.510   7.06  1.853   7.12  1.680
Listening        13  6.77  1.235   6.69  1.437   6.77  1.235   6.69  1.377   6.73  1.285
Dictionary only  13  6.15  2.267   6.54  2.259   6.77  2.006   6.69  2.720   6.54  2.271
Noticing         19  5.53  2.816   6.47  2.195   6.74  1.851   6.11  1.883   6.21  2.223

Table 8 Repeated-measures ANOVA results for reading comprehension posttests

Source          SS      df     MS      F      Sig.   Partial η2
Posttest        8.857   2.598  3.409   2.267  0.092  0.038
Posttest*Group  8.748   7.798  1.122   0.746  0.647  0.037
Group           30.494  3      10.165  0.893  0.450  0.044
* p < 0.05

The Greenhouse-Geisser correction was applied in this repeated-measures analysis, since the assumption of sphericity was violated for posttest scores (p = 0.007). Table 8 shows that there was no significant main effect of either time (F = 2.267, p = 0.092, partial η2 = 0.038) or group (F = 0.893, p = 0.450, partial η2 = 0.044) on the posttests. No interaction effect was observed between the two variables (F = 0.746, p = 0.647, partial η2 = 0.037).

Unlike the results for immediate vocabulary learning, which showed a steady increase in scores over time, reading comprehension scores did not improve. This could imply that reading ability is not easily increased in a short time, and thus that a longer experimental period might have led to more meaningful results. In addition, the absence of a group difference in posttest scores suggests that, at least for immediate effects, diverse task conditions make little difference to reading comprehension. That is, the dictionary use groups did not perform any better than the non-dictionary group. Similar results were found in previous research (Bensoussan et al. 1984, Nesi and Meara 1991), which claimed that the availability of a dictionary did not benefit reading comprehension. Although some researchers have emphasized the value of dictionary use in reading (Shieh and Freiermuth 2010, Tono 1989), this was inferred only from descriptive statistics such as the mean and standard deviation (Shieh and Freiermuth 2010) or, as Tono himself pointed out, from tests that were possibly flawed in terms of reliability and validity (Tono 1989). Considering the steady posttest scores of both the dictionary and non-dictionary groups in the present study, more experimental research needs to be conducted in this reading domain.
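The fractional degrees of freedom in Table 8 follow from the Greenhouse-Geisser correction, which multiplies the uncorrected degrees of freedom by an estimated epsilon. The short sketch below is a reader's check rather than the authors' analysis: it assumes uncorrected degrees of freedom of 3 (four posttests) and 9 (four posttests by four groups), which are not printed in the table, and uses hypothetical helper names to recover the implied epsilon and the mean squares.

```python
# SS and corrected df are copied from Table 8. The uncorrected df values
# (3 and 9) are assumptions derived from the design, not printed in the table.
table8 = {
    "Posttest": {"ss": 8.857, "df": 2.598, "df_uncorrected": 3},
    "Posttest*Group": {"ss": 8.748, "df": 7.798, "df_uncorrected": 9},
}


def implied_epsilon(row):
    """Epsilon implied by the ratio of corrected to uncorrected df."""
    return row["df"] / row["df_uncorrected"]


def mean_square(row):
    """Mean square is the sum of squares divided by the (corrected) df."""
    return row["ss"] / row["df"]


for source, row in table8.items():
    print(
        f"{source}: epsilon = {implied_epsilon(row):.3f}, "
        f"MS = {mean_square(row):.3f}"
    )
```

Both rows imply roughly the same epsilon, as expected when a single correction factor is applied to every within-subject effect, and the mean squares round to the values reported in Table 8.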
4.3 Delayed effects of task conditions on vocabulary learning

The descriptive statistics in Table 9 show that the listening group exhibited the best performance in sustained vocabulary learning (M = 12.08), followed by noticing (M = 10.58), dictionary only (M = 10.46), and read-aloud (M = 8.65). To investigate which task condition has the most sustained effect on vocabulary learning, we compared delayed vocabulary test scores among groups with a one-way ANOVA. Table 10 shows that there was no significant difference among the four groups (F = 0.958, p = 0.419). This finding accords with that of the immediate vocabulary posttests, which showed no difference among groups. Thus, it could be concluded that tasks with dictionary use and noticing do not produce substantially different outcomes, at least in vocabulary learning.

Table 9 Descriptive statistics for delayed vocabulary test

Group            N   M      SD
Read aloud       17  8.65   5.061
Listening        13  12.08  4.518
Dictionary only  13  10.46  4.789
Noticing         19  10.58  6.955
Total            62  10.34  5.566

Table 10 Results of group comparison on delayed vocabulary test

Delayed test
Source          SS        df  MS      F      Sig.   η2
Between Groups  89.219    3   29.740  0.958  0.419  0.047
Within Groups   1800.668  58  31.046
Total           1889.887  61
* p < 0.05

To verify more precisely whether task conditions exert prolonged effects on vocabulary learning, pretest and delayed test vocabulary scores were compared for each group.
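Before turning to that comparison, the one-way ANOVA entries in Table 10 can be checked by hand, since the mean squares, F ratio, and η2 all follow from the sums of squares and degrees of freedom. The following is a minimal sketch with a hypothetical function name; it reproduces only the arithmetic and does not compute the p-value.

```python
def one_way_anova_summary(ss_between, df_between, ss_within, df_within):
    """Derive mean squares, F, and eta squared from sums of squares and df."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    f_ratio = ms_between / ms_within
    # Eta squared is the share of total variability attributed to group membership.
    eta_squared = ss_between / (ss_between + ss_within)
    return ms_between, ms_within, f_ratio, eta_squared


# Sums of squares and df copied from Table 10.
ms_b, ms_w, f_ratio, eta_sq = one_way_anova_summary(89.219, 3, 1800.668, 58)

print(f"MS between = {ms_b:.3f}")
print(f"MS within  = {ms_w:.3f}")
print(f"F          = {f_ratio:.3f}")
print(f"eta^2      = {eta_sq:.3f}")
```

The computed values round to those reported in Table 10. An η2 of about 0.047 means that group membership accounts for under 5% of the variance in delayed vocabulary scores.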
The results of a paired-samples t-test showed that, as illustrated in Table 11, significant improvements in vocabulary learning were found in all four groups (p < 0.001), with large effect sizes of 0.619, 0.813, 0.723, and 0.518. This finding suggests that the noticing condition, as well as the dictionary use conditions, has a positive impact on long-term vocabulary learning.

Table 11 Results of group comparison on vocabulary pretest and delayed test

Group            Test     N   M      SD     t        Sig.    η2
Read aloud       Pre      17  0.65   0.786  -7.219   0.000*  0.619
                 Delayed      8.65   5.061
Listening        Pre      13  0.77   0.832  -10.225  0.000*  0.813
                 Delayed      12.08  4.518
Dictionary only  Pre      13  0.54   0.776  -7.919   0.000*  0.723
                 Delayed      10.46  4.789
Noticing         Pre      19  0.84   0.958  -6.221   0.000*  0.518
                 Delayed      10.58  6.955
* p < 0.05

Ronald (2002) found a similar result: vocabulary development during extensive reading occurred both with and without a dictionary, though clearer benefits were shown with dictionary use. Fraser (1999a) likewise found no significant difference in vocabulary retention between consulting dictionaries and inferring. A different finding was reported by Chen (2011a): the bilingual dictionary groups, whether electronic or paper, exhibited significantly higher scores than the guessing group in both vocabulary comprehension and retention.
It is noticeable, however, that the delayed vocabulary test was given one week after the last experiment in Chen’s study, while it was given two weeks after in this study. This might suggest that the results of the present study are more representative of sustained effects.

4.4 Delayed effects of task conditions on reading comprehension

Descriptive statistics for delayed reading test scores in Table 12 showed that the listening group scored highest (M = 7.85), dictionary only second (M = 7.31), read-aloud third (M = 6.88), and noticing lowest (M = 4.89). It seems that, over the longer term, dictionary use has more positive effects on reading comprehension than noticing.

Table 12 Descriptive statistics for delayed reading test

Group            N   M     SD
Read aloud       17  6.88  3.462
Listening        13  7.85  2.577
Dictionary only  13  7.31  3.146
Noticing         19  4.89  3.143
Total            62  6.56  3.267

To draw a clearer picture of which task conditions are more effective for understanding reading texts over the longer term, delayed reading tests were analyzed with a one-way ANOVA. The results, summarized in Table 13, revealed a significant mean difference among the four groups (F = 2.833, p = 0.046). More specifically, LSD post hoc test results showed that two groups, listening (p = 0.011) and dictionary only (p = 0.036), performed significantly better than the noticing group.

Table 13 Results of group comparison on delayed reading test

Delayed posttest
Source          SS       df  MS      F      Sig.    η2
Between Groups  83.226   3   27.742  2.833  0.046*  0.127
Within Groups   568.016  58  9.793
Total           651.242  61
* p < 0.05

Among the three dictionary use groups, which did the extra tasks of read-aloud, listening, or dictionary consultation, only the read-aloud group did not differ significantly from the noticing group. This may indicate that a greater number of tasks does not necessarily guarantee higher scores. One explanation for the significant differences between the dictionary only and noticing groups, and between the listening and noticing groups, may lie in task characteristics. When looking up unknown words, the dictionary group would consider which meaning of each word fits the given reading text, which could connect word meanings to the context of reading. By listening to the whole reading passage, the listening group could also reflect more deeply on their initial understanding of the text and thus understand it more firmly. The noticing group, on the other hand, did not perform activities that could actively stimulate cognition linked to text understanding. As for the read-aloud group, the participants could have forgotten part of what they had read while doing the activity and thus might have had difficulty answering the reading questions. If the participants had read out the whole text, instead of a few sentences containing the target words, different experimental results might well have been obtained.

To ascertain the prolonged effects of task conditions on reading comprehension, reading pretest and delayed test scores for each group were analyzed with a paired-samples t-test. Table 14 shows that significant differences were found in the three dictionary use groups, read-aloud, listening, and dictionary only (p = 0.017, 0.001, and 0.008 respectively), whereas there was no statistically significant difference in the noticing group (p = 0.635).
This finding suggests that noticing alone could not lead to enduring positive effects on reading comprehension, and that noticing may not be directly associated with developing reading ability.

Table 14 Results of group comparison on reading pretest and delayed test

Group            Test     N   M     SD     t       Sig.    η2
Read aloud       Pre      17  5.76  4.085  -2.667  0.017*  0.181
                 Delayed      6.88  3.462
Listening        Pre      13  5.69  3.276  -4.635  0.001*  0.472
                 Delayed      7.85  2.577
Dictionary only  Pre      13  6.38  3.042  -3.207  0.008*  0.299
                 Delayed      7.31  3.146
Noticing         Pre      19  5.05  2.990  0.483   0.635   0.006
                 Delayed      4.89  3.143
* p < 0.05

From the results above, we could draw the following inferences. First, read-aloud with dictionary use could facilitate the understanding of reading passages in that it could make the participants integrate the underlined target words into their global understanding of the sentences. It could also stimulate and refresh their memory of what they had read in the texts. Secondly, listening to the whole texts with dictionary use seems to be most helpful for reading comprehension. This might be because diverse modes of input, not only visual input from reading but also auditory input from listening, were provided to the listening group. In addition, only the listening group participants were exposed to the whole texts twice, which could have helped them understand the texts more precisely. Lastly, dictionary use alone could contribute to enhancing reading ability over a long period.
It seems that smartphone dictionary use allowed learners to spend less time accessing dictionary information than other forms of dictionary would, and to focus more on comprehending texts. This interpretation is supported in part by the experimental study of Weschler and Pitts (2000), which revealed that learners with an ED looked up words about 23% faster than those with a PD, and by that of Loucky (2013), which found a shorter mean access time for a bilingual ED than for a bilingual PD. Spending less time on dictionary consultation and more on reading comprehension would lower the possibility of disrupting readers’ processing and prolong their memory of the reading content.

4.5 The relationship between task conditions and difficulty perception

The participants rated the perceived difficulty of the reading texts on a seven-point Likert scale, on which lower values mean easier. The results in Table 15 revealed that, on average, the read-aloud group found the reading passages easiest (M = 3.37), followed by the listening (M = 3.60), dictionary only (M = 3.62), and noticing (M = 3.97) groups. The group means ranged from 3.37 to 3.97, which means that, in general, the selected texts were readable for the participants, neither too easy nor too challenging, and thus it can be inferred that appropriate texts were selected.
Table 15 Descriptive statistics for perceived text difficulty

Group            N   Text 1        Text 2        Text 3        Text 4        Total
                     M     SD      M     SD      M     SD      M     SD      M     SD
Read aloud       17  3.41  0.795   3.24  1.147   3.59  0.939   3.24  1.033   3.37  0.976
Listening        13  3.31  1.315   3.62  0.961   3.77  1.092   3.69  1.182   3.60  1.125
Dictionary only  13  4.00  1.291   3.77  1.235   3.31  1.182   3.38  1.223   3.62  1.223
Noticing         19  4.16  1.214   3.95  1.129   3.79  1.032   4.00  1.054   3.97  1.095

The noticing group perceived every reading text as the most difficult. This finding corresponds to the reading posttest and delayed test scores of the noticing group; among the four groups, it had the lowest scores in all four posttests (see Tables 7 and 12). To identify a general group tendency in reading difficulty perceptions, we pooled all the perception ratings by group and ran a one-way ANOVA. As shown in Table 16, there was a significant difference in perceived reading difficulty among groups (F = 3.753, p = 0.012), and LSD post hoc test results indicated that the read-aloud group found the texts significantly easier than the noticing group (p = 0.001). Even though both the read-aloud and listening groups performed mixed-skill tasks with dictionary use, only the read-aloud group differed significantly from the noticing group. This could be explained by the types of skills that were mixed, i.e., reception (reading) and production (read-aloud) vs. reception (reading) and reception (listening).
It seems that the blending of reception and production, specifically reading with a read-aloud task in this study, allows learners to subjectively experience reading texts as easier or less effortful than the other combination does.

Table 16 Results of group comparisons on perceived text difficulty

Group            N   M     SD     F      Sig.    η2
Read aloud       17  3.37  0.976  3.753  0.012*  0.044
Listening        13  3.60  1.125
Dictionary only  13  3.62  1.223
Noticing         19  3.97  1.095
* p < 0.05

The results of the immediate tests also supported this pattern; that is, among the four groups, the read-aloud group had the highest scores on all immediate vocabulary and reading tests, with just one exception. Since the perceptions of reading difficulty were checked right before the vocabulary posttests and after the reading posttests, it is likely that the higher the participants' immediate vocabulary and reading posttest scores, the easier they felt the texts to be while reading.

5. Conclusion

The present study traces the impact of dictionary use and integrated skills on reading comprehension and vocabulary learning, and on learners’ perceptions of reading text difficulty. The major findings of the study are as follows. Firstly, there was no significant difference in vocabulary posttest scores among the four groups, which means that task conditions made no immediate difference to vocabulary learning. Despite the absence of a significant group difference, there were significant improvements in vocabulary posttest scores as time elapsed. It could be inferred from this result that, regardless of task type, repetition or task familiarity contributes to vocabulary growth.
The comparison of the vocabulary pretest and the delayed test also revealed statistically significant gains in all four groups, implying that all the task conditions, regardless of whether SBDs were used, could lead to sustained vocabulary development.

Next, the four groups did not show any significant differences in reading posttest scores, indicating that task conditions have no immediate influence on reading comprehension. There was also no significant effect of time, which could imply that reading ability cannot easily be enhanced in a short time. Thus, different results could be expected if a longer experimental period were given. The comparison between the reading pretest and the delayed test, however, showed a different picture: the three groups using SBDs (read-aloud, listening, and dictionary only) showed significant differences, while the non-dictionary noticing group did not show any statistically significant improvement. These outcomes were also supported by the one-way ANOVA, in which, among the three dictionary groups, the listening and dictionary only groups performed significantly better in comprehending texts than the noticing group. The following inferences were drawn: read-aloud with dictionary consultation serves as a facilitator of reading comprehension; among the tasks, listening to texts with dictionary use is most useful for understanding reading texts; and SBD use alone leads to enhanced reading ability.

Finally, as for learners’ perceptions of reading text difficulty, the participant groups perceived the texts as easier in the order of read-aloud, listening, dictionary only, and noticing. The general tendency in the one-way ANOVA results was that the read-aloud group differed significantly from the noticing group. That is, the dictionary group performing a read-aloud activity perceived significantly less difficulty in the reading texts than the non-dictionary group.
Considering that the listening group, another mixed-skill group, showed little difference from the non-dictionary group, the type of mixed skills appears to play a no less important role: combining a receptive skill (reading) with a productive one (read-aloud) makes learners experience reading passages as easier than combining two receptive skills (reading and listening).

Limitations of the study lie in the sample size and the rather short duration of the experimental period. To draw a clearer picture of the effects of task conditions, future research needs a longer experimental period with larger groups. Another suggested research direction is to include a dictionary group performing a speaking activity other than read-aloud, or one with a writing activity, as another type of mixed-skill combination. This study contributes to the re-evaluation of vocabulary instruction in practice, implying the importance of integrating language skills with dictionary use in vocabulary development and reading comprehension.

Notes

1 The Flesch-Kincaid Grade (FKG) Level, one of the two Flesch-Kincaid readability tests, presents a score as a US grade level to judge the readability level of various books and texts. The FKG level is extensively used in the field of education along with the other readability test, the Flesch reading ease.

2 Since 2010, the NAEA test has been administered to all students in the 11th grade in Korea, testing three subjects: Korean, mathematics, and English. According to the NAEA result reports, it judges students’ academic attainment with a distinct scoring system that allows score comparison with previous NAEA tests.

References

Joysoft. Manneung ‘omnipotent’ English dictionary: an English-Korean bilingual dictionary downloaded from the application software Google Play Store on the smartphone, https://play.google.com/store/apps/details?id=com.joysoft.wordBook
Atkins B. T., Knowles F. E. 1990. ‘Interim Report on the EURALEX/AILA Research Project into Dictionary Use.’ In Magay I., Zigany J. (eds), BudaLEX 88 Proceedings. Budapest: Akademiai Kiado, 381–392.
Atkins B. T., Varantola K. 1997. ‘Monitoring Dictionary Use.’ International Journal of Lexicography 10.1: 1–45.
Baxter J. 1980. ‘The Dictionary and Vocabulary Behavior: a Single Word or a Handful?’ TESOL Quarterly: 325–336.
Bensoussan M., Sim D., Weiss R. 1984. ‘The Effect of Dictionary Usage on EFL Test Performance Compared with Student and Teacher Attitudes and Expectations.’ Reading in a Foreign Language 2: 262–275.
Chen Y. Z. 2010. ‘Dictionary Use and EFL Learning. A Contrastive Study of Pocket Electronic Dictionaries and Paper Dictionaries.’ International Journal of Lexicography 23.3: 275–306.
Chen Y. Z. 2011a. ‘Dictionary Use and Vocabulary Learning in the Context of Reading.’ International Journal of Lexicography 25.2: 216–247.
Chen Y. Z. 2011b. ‘Studies on Bilingualized Dictionaries: the User Perspective.’ International Journal of Lexicography 24.2: 161–197.
Dziemianko A. 2010. ‘Paper or Electronic? The Role of Dictionary Form in Language Reception, Production and the Retention of Meaning and Collocations.’ International Journal of Lexicography 23.3: 257–273.
Dziemianko A. 2011. ‘Does Dictionary Form Really Matter?’ In Akasu K., Satoru U. (eds), ASIALEX 2011 Proceedings Lexicography: Theoretical and Practical Perspectives. Kyoto: Asian Association for Lexicography, 92–101.
Dziemianko A. 2012. ‘Why One and Two Do Not Make Three: Dictionary Form Revisited.’ Lexikos 22: 195–216.
Fraser C. A. 1999a. ‘Lexical Processing Strategy Use and Vocabulary Learning through Reading.’ Studies in Second Language Acquisition 21.2: 225–241.
Fraser C. A. 1999b. ‘The Role of Consulting a Dictionary in Reading and Vocabulary Learning.’ Canadian Journal of Applied Linguistics 2.1-2: 73–89.
Hayati A. M., Pour-Mohammadi M. 2005. ‘A Comparative Study of Using Bilingual and Monolingual Dictionaries in Reading Comprehension of Intermediate EFL Students.’ Reading 5.2: 61–66.
Holmer L., von Martens M., Sköldberg E. 2015. ‘Making a Dictionary App from a Lexical Database: the Case of the Contemporary Dictionary of the Swedish Academy.’ Proceedings of eLex conference, August 2015. Accessed on 21 January 2016. https://elex.link/elex2015/proceedings/eLex_2015_03_Holmer+vonMartens+Skoldberg.pdf
Hsien-jen C. 2001. ‘The Effects of Dictionary Use on the Vocabulary Learning Strategies Used by Language Learners of Spanish.’ Paper presented at the Annual Meeting of the Acquisition of Spanish and Portuguese as First and Second Languages. University of Illinois, 11-14 October 2001. Accessed on 21 January 2016. http://files.eric.ed.gov/fulltext/ED471315.pdf
Hulstijn J. H., Hollander M., Greidanus T. 1996. ‘Incidental Vocabulary Learning by Advanced Foreign Language Students: the Influence of Marginal Glosses, Dictionary Use, and Reoccurrence of Unknown Words.’ The Modern Language Journal 80.3: 327–339.
Knight S. 1994. ‘Dictionary Use while Reading: the Effects on Comprehension and Vocabulary Acquisition for Students of Different Verbal Abilities.’ The Modern Language Journal 78.3: 285–299.
Koyama T., Takeuchi O. 2003. ‘Printed Dictionaries vs. Electronic Dictionaries: a Pilot Study on How Japanese EFL Learners Differ in Using Dictionaries.’ Language Education & Technology 40: 61–79.
Koyama T., Takeuchi O. 2004a. ‘Comparing Electronic and Printed Dictionaries: How the Difference Affected EFL Learning.’ JACET Bulletin 38: 33–46.
Koyama T., Takeuchi O. 2004b. ‘How Look-up Frequency Affects EFL Learning?: an Empirical Study on the Use of Handheld-electronic Dictionaries.’ Proceedings of the CLaSIC 2004 Conference, 2004. Accessed on 21 January 2016. http://kuir.jm.kansai-u.ac.jp/dspace/bitstream/10112/5189/1/KU-1100-200400.pdf
Koyama T., Takeuchi O. 2007. ‘Does Look-up Frequency Help Reading Comprehension of EFL Learners? Two Empirical Studies of Electronic Dictionaries.’ CALICO Journal 25.1: 110–125.
Laufer B., Hadar L. 1997. ‘Assessing the Effectiveness of Monolingual, Bilingual, and “Bilingualised” Dictionaries in the Comprehension and Production of New Words.’ The Modern Language Journal 81.2: 189–196.
Loucky J. P. 2013. ‘Comparing Electronic Dictionary Functions and Use.’ CALICO Journal 28.1: 156–174.
Luppescu S., Day R. R. 1993. ‘Reading, Dictionaries, and Vocabulary Learning.’ Language Learning 43.2: 263–279.
Midlane V. 2005. Students’ Use of Portable Electronic Dictionaries in the EFL/ESL Classroom; a Survey of Teacher Attitudes. M.A. Thesis, University of Manchester.
Mikulecky B. S., Jeffries L. 2004. More Reading Power. New York: Longman.
Nation I. S. P. 2001. Learning Vocabulary in Another Language. Cambridge: Cambridge University Press.
Nation I. S. P., Webb S. 2011. Researching and Analyzing Vocabulary. Boston: Heinle.
Nesi H. 2014. ‘Dictionary Use by English Language Learners.’ Language Teaching 47.1: 38–55.
Nesi H., Meara P. 1991. ‘How Using Dictionaries Affects Performance in Multiple-Choice EFL Tests.’ Reading in a Foreign Language 8: 631–643.
Pujol D., Corrius M., Masnou J. 2006. ‘Print Deferred Bilingualised Dictionaries and their Implications for Effective Language Learning: a New Approach to Pedagogical Lexicography.’ International Journal of Lexicography 19.2: 197–215.
Ronald J. 2002. ‘L2 Lexical Growth through Extensive Reading and Dictionary Use: a Case Study.’ Proceedings of the Tenth EURALEX International Congress, EURALEX 2002, 13–17 August 2002.
Rundell M. 1999. ‘Dictionary Use in Production.’ International Journal of Lexicography 12.1: 35–53.
Shieh W., Freiermuth M. R. 2010. ‘Using the DASH Method to Measure Reading Comprehension.’ TESOL Quarterly 44.1: 110–128.
Stirling J. 2005. ‘The Portable Electronic Dictionary: Faithful Friend or Faceless Foe?’ Modern English Teacher 14.3: 64–72.
Summers D. 1988. ‘The Role of Dictionaries in Language Learning.’ In Carter R., McCarthy M. (eds), Vocabulary and Language Teaching. London: Longman, 111–125.
Tang G. M. 1997. ‘Pocket Electronic Dictionaries for Second Language Learning: Help or Hindrance?’ TESL Canada Journal 15.1: 39–57.
Tono Y. 1989. ‘Can a Dictionary Help One Read Better? On the Relationship between E.F.L. Learners’ Dictionary Reference Skills and Reading Comprehension.’ In James G. (ed.), Lexicographers and Their Works. Exeter: University of Exeter Press, 192–200.
Weschler R., Pitts C. 2000. ‘An Experiment Using Electronic Dictionaries with EFL Students.’ The Internet TESL Journal 6.8: 56–67.
Zarei A. A., Naseri D. 2008. ‘The Effect of Monolingual, Bilingual, and Bilingualized Dictionaries on Vocabulary Comprehension and Production.’ TELL 2.7: 42–69.

© 2016 Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com
International Journal of Lexicography, Oxford University Press

Publisher
Oxford University Press
Copyright
© 2016 Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com
ISSN
0950-3846
eISSN
1477-4577
D.O.I.
10.1093/ijl/ecw037

Abstract

This study examines the effects of dictionary use and integrated skills on vocabulary learning and reading comprehension, as well as learners’ perceived difficulty of reading texts. Sixty-two high school students were assigned to four groups according to task condition: read-aloud with dictionary use, listening with dictionary use, dictionary use only, and noticing (non-dictionary). The learners’ vocabulary learning and reading comprehension were measured through pre-, post-, and delayed tests. Their perceived text difficulty was checked with a seven-point Likert scale. The results showed substantial improvements in vocabulary scores over time, regardless of task condition. A sustained effect of all task conditions was also found in both vocabulary learning and reading comprehension, except for the noticing group in reading comprehension, which supports the positive influence of dictionary use. As for perceived text difficulty, the read-aloud group found the texts significantly easier than the noticing group did.

1. Introduction

Recent years have seen a steady increase in studies on dictionary use (Chen 2011a). Experimental studies have examined the effectiveness of dictionaries according to dictionary type, such as monolingual, bilingual, and bilingualized dictionaries (Chen 2011b, Hayati and Pour-Mohammadi 2005, Laufer and Hadar 1997, Luppescu and Day 1993, Pujol, Corrius, and Masnou 2006, Tono 1989, Zarei and Naseri 2008). With the extensive use of electronic dictionaries, investigations comparing electronic with print dictionaries have also been on the rise (Chen 2010, 2011a, Dziemianko 2010, 2011, 2012, Koyama and Takeuchi 2003, 2004a, 2004b, 2007).
Other lines of research have addressed the effects of dictionary consultation on vocabulary learning (Hulstijn, Hollander, and Greidanus 1996, Luppescu and Day 1993, Ronald 2002), on reading comprehension (Bensoussan, Sim, and Weiss 1984, Nesi and Meara 1991, Shieh and Freiermuth 2010, Tono 1989), and on both vocabulary and reading comprehension (Fraser 1999a, Knight 1994, Summers 1988). Research on dictionary use has, in many cases, been embedded in a reading context rather than in context-free vocabulary learning conditions. This is possibly because learning words is inseparable from reading: every day we read newspapers, books, magazines, and e-mails, in which we encounter many words, and one of the ultimate goals of learning new words is to gain more information through reading.

Despite the interest in vocabulary development using dictionaries in the context of reading in the second or foreign language (hereafter L2) field, studies of dictionary use in association with other language skills have rarely been carried out. Furthermore, even though the development and consultation of dictionary apps have increased with the wide use of smartphones (Holmer, von Martens, and Sköldberg 2015), no experimental research has been conducted on the effects of smartphone dictionary use on vocabulary learning and reading comprehension.

The present study aims to explore the impact of dictionary use on vocabulary learning and reading comprehension in diverse task conditions. Additionally, it attempts to examine the way L2 dictionary users perceive text difficulty. Four task conditions were studied: read-aloud with dictionary use, listening with dictionary use, dictionary use only, and noticing with no dictionary reference. All four conditions were embedded within the context of reading. For dictionary use, a smartphone English-Korean bilingual dictionary (hereafter SBD) application was employed.
For the current study, the three research questions below were formulated.

(1) What are the immediate effects of task conditions (read-aloud activity with dictionary use, listening activity with dictionary use, dictionary use only, and noticing with no dictionary use) on L2 vocabulary learning and reading comprehension?
(2) What are the delayed effects of task conditions (read-aloud activity with dictionary use, listening activity with dictionary use, dictionary use only, and noticing with no dictionary use) on L2 vocabulary learning and reading comprehension?
(3) How do L2 learners perceive text difficulty according to task conditions?

2. Literature review

2.1 Paper and electronic dictionaries and dictionary preference

Since the use of electronic dictionaries in language classrooms began to increase around the 1990s (Midlane 2005, Stirling 2005, Tang 1997), one focus of dictionary research has been the comparison of paper dictionaries (PD) with electronic dictionaries (ED) (Nesi 2014). Koyama and Takeuchi’s study (2003), one of the first attempts to compare the two dictionary forms, found no significant difference in the number of words looked up, search time, or word retention between PD and ED groups. A subsequent study by the same authors (2004a) reported little difference in either search time or retention of words between PD and ED, although the words looked up in the PD condition were recognized significantly better than those looked up in the ED condition. Another empirical study by Koyama and Takeuchi (2004b) showed no significant difference in reading comprehension scores between the two dictionary types. They added that the ED group consulted dictionaries more frequently than the PD group, and in a shorter time, suggesting that more frequent look-up does not ensure better comprehension of reading texts.
Two sub-studies in Koyama and Takeuchi’s later research (2007) confirmed that access to a PD or an ED made no difference in reading comprehension, and that ED users looked up more words than PD users in a shorter period of time. Chen (2010) compared the effects of PD and ED on vocabulary acquisition and found no significant differences in vocabulary comprehension, production, or retention between the two. Another study by the same author (Chen 2011a) included a guessing group as well as PD and ED groups. Both dictionary groups gained significantly higher scores than the guessing group in vocabulary comprehension and retention, yet no significant difference emerged between the two dictionary groups. In contrast, Dziemianko (2010) found that online dictionary use was more effective than PD use in the receptive task, the productive task, and the retention of both target words and collocations. Two subsequent replication studies by Dziemianko (2011, 2012), however, did not confirm the superiority of ED in reception, production, and retention of target items and collocations. The author suggested that the differing results might stem from structural features of the dictionaries, such as the organization, colours, or shapes of information and the amount of effort needed to access an entry. Despite the increase in comparative studies of paper and electronic dictionaries, experimental studies using smartphones are, to our knowledge, non-existent.

Researchers have also investigated L2 learners’ dictionary consulting frequency and preference for dictionary types. Studies indicate that L2 learners commonly own and prefer to use bilingual dictionaries (Atkins and Knowles 1990, Atkins and Varantola 1997, Baxter 1980, Bensoussan et al. 
1984, Hsien-jen 2001, Nation 2001, Nation and Webb 2011, Nesi 2014, Rundell 1999), and even a large number of bilingualized dictionary users report that they mostly refer to the L1 part rather than the L2 information (Pujol et al. 2006). A questionnaire survey of 342 Japanese university students by Baxter (1980) revealed a strong predilection for bilingual dictionaries; all of the respondents said they had more than one bilingual dictionary, while 76.3% of the non-English majors had never bought an L2 monolingual dictionary. Further, 88.3% responded that they referred to a bilingual dictionary daily or several times a week, whereas only 10.8% used a monolingual dictionary. Another extensive survey on dictionary use, conducted by Atkins and Knowles (1990), revealed that 57.9% of the 740 respondents consulted a bilingual dictionary almost every week and only 0.4% claimed they never used one. The preference for bilingual dictionaries was also evident in the study of Bensoussan et al. (1984), which reported that 59% of 670 and 58% of 740 participants consulted bilingual dictionaries, whereas 20% and 21% used monolingual dictionaries. Atkins and Varantola (1997) confirmed the preference; in their study, bilingual dictionaries were consulted 714 times (71%) out of 1,000 look-ups, while monolingual dictionaries were consulted 281 times (28%).

2.2 Impact of dictionary use on vocabulary learning and reading comprehension

A line of research into dictionary use has examined the value of dictionaries for vocabulary learning and reading comprehension. As for vocabulary learning, results from previous studies support the role of the dictionary as a facilitator. Luppescu and Day (1993) found that dictionary users, around half of the 293 Japanese university students in their study, gained significantly higher scores on a vocabulary test than non-dictionary users.
Ronald (2002) also found benefits of dictionary use for L2 vocabulary growth in a case study of one motivated learner majoring in English. The value of dictionary use was partly supported by Hulstijn et al. (1996). They investigated the influence of three vocabulary strategies in a reading context (marginal glosses, bilingual dictionary use, and reading the text only), and one of their main findings was that when the dictionary group did consult dictionaries, their retention rate of words was greater than that of the other groups.

In contrast to the generally accepted usefulness of dictionaries for vocabulary learning, results for reading comprehension have been mixed. Bensoussan et al. (1984) investigated the effect of dictionary use on reading comprehension and found no significant correlation. This result accords with that of Nesi and Meara’s study (1991), which compared a dictionary access group with a non-dictionary group; the two groups did not differ significantly in reading test scores. In Tono’s study (1989), however, that was not the case; the bilingual English-Japanese dictionary group, which consisted of junior high school students, performed significantly better on a reading comprehension test. Shieh and Freiermuth (2010) also emphasized the effects of dictionary reference as a way of improving reading comprehension, reporting higher scores for the with-dictionary group. They added that the positive effects of dictionary use occur only if sufficient time is allowed for dictionary consultation.

On both vocabulary learning and reading comprehension, Summers (1988) compared four groups: three using different dictionary entry types (definitions only, examples only, and definitions and examples) and a control group.
The author found that the non-dictionary group fared worst, while dictionary use, regardless of entry type, brought statistically significant improvements in both vocabulary learning and reading comprehension. The finding was corroborated by Knight’s study (1994), which used two vocabulary tests and one reading test. The dictionary group showed significant differences in both vocabulary learning and reading comprehension. Different results, however, emerged in an experimental study by Fraser (1999a). She investigated the impact of three lexical processing strategies (consulting a dictionary, inferencing, and ignoring) and reported that a high proportion (78%) of dictionary consultation led to full text comprehension, while no difference between consulting a dictionary and inferencing was shown in vocabulary retention. She also stated that the combination of consulting a dictionary and inferencing, followed by consulting only, was most effective for vocabulary retention.

As noted above, previous research on dictionary use has largely been confined to a reading-only condition. In other words, few studies have addressed the effects of dictionary consultation combined with integrated language skills. The present study aims to assess the effects of dictionary use on vocabulary learning and reading comprehension along with integrated skills, such as speaking and reading, and listening and reading. Directing the study towards the combination of dictionary consultation with integrated language skills should broaden the scope of future research and enrich our understanding of the relationship between dictionary use and language skill integration.

3. Method

3.1 Participants

The participants for this study were initially 96 students who answered the questionnaire survey on dictionary use. However, nineteen students were excluded in the process of selecting target words.
Participants qualified on the basis of how many target words they already knew; qualified participants knew fewer than two of the twenty target words. The remaining 77 participants were assigned in almost equal numbers to four groups: read-aloud, listening, dictionary only, and noticing. We named the groups after the distinctive feature of the task each group would perform. The read-aloud group did a read-aloud activity; the listening group did a listening activity; the dictionary only group consulted SBDs and did not perform any language skill-related activities; and the noticing group was given chances to notice the meaning of unknown target words without any dictionary use or language skill-related activities. After the experiment, another fifteen were excluded from the data analysis because they failed to take more than two reading and vocabulary tests, bringing the final number of participants to 62. The participants were 10th graders at a high school in South Korea. All were born and raised in Korea, had no experience of studying abroad, and shared the same mother tongue and cultural background. They had studied English as a required subject in the national curriculum for around eight years and took 5 hours of English class a week on average. As Table 1 shows, the groups consisted of seventeen, thirteen, thirteen, and nineteen participants, with 34 males (54.8%) and 28 females (45.2%) in total: nine males and eight females in the read-aloud group, five males and eight females in the listening group, seven males and six females in the dictionary only group, and thirteen males and six females in the noticing group.
Table 1 Distribution of the participants

| Group | N | Male | Female |
|---|---|---|---|
| Read aloud | 17 | 9 (52.9%) | 8 (47.1%) |
| Listening | 13 | 5 (38.5%) | 8 (61.5%) |
| Dictionary only | 13 | 7 (53.8%) | 6 (46.2%) |
| Noticing | 19 | 13 (68.4%) | 6 (31.6%) |
| Total | 62 | 34 (54.8%) | 28 (45.2%) |

3.2. Materials

3.2.1 Reading materials and target word items.

As reading materials, the participants read four expository texts concerning Africa and the global environment, extracted from More Reading Power (2nd edition) by Mikulecky and Jeffries (2004). Several factors were considered in selecting the reading materials. First, expository texts were chosen to take the students’ reading purpose into account: the participants were studying English for the Korean Scholastic Aptitude Test (KSAT), the university entrance exam, in which more than 20 of the 28 reading questions are based on expository texts. Next, all four texts comprised five to six passages containing similar numbers of tokens (words in the text), ranging from 497 to 506 (see Table 2). Additionally, to ensure a similar level of text difficulty, readability for all the reading texts was computed and checked with the program Web VP Classic (www.lextutor.ca/vp/eng/). As shown in Table 2, the number of types (different words in the text) ranged from 219 to 250, and the number of word families (groups of cognate words) ranged from 162 to 190. The combined percentage of the most frequent 1000 words of English (1K words) and the second most frequent 1000 words of English (2K words) in the four expository texts ranged from 83.27% (N=418) to 89.53% (N=445). These figures are somewhat higher than the 80% typical of written texts.
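The token, type, and coverage figures described above can be illustrated with a minimal sketch. It is not the Web VP Classic procedure: the function name `lexical_profile`, the tiny `FREQUENT` word set, and the sample sentence are all hypothetical, and the sketch matches exact word forms rather than word families.

```python
import re

def lexical_profile(text, frequent_words):
    """Return (token count, type count, % of tokens found in frequent_words)."""
    # Tokens are running words; hyphenated or apostrophized forms count as one word.
    tokens = re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())
    types = set(tokens)  # types are the distinct word forms
    covered = sum(1 for token in tokens if token in frequent_words)
    coverage = 100.0 * covered / len(tokens) if tokens else 0.0
    return len(tokens), len(types), coverage

# Hypothetical stand-in for a combined 1K + 2K frequency list.
FREQUENT = {"the", "forest", "is", "a", "home", "for", "many", "animals", "and"}

sample = "The forest is a home for many animals, and the reef is a home for coral."
tokens, types, coverage = lexical_profile(sample, FREQUENT)
print(tokens, types, round(coverage, 2))
```

In this toy sample, "reef" and "coral" fall outside the assumed frequency list, which is the same kind of gap that the 1K and 2K percentages in Table 2 quantify for the four texts.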
The Flesch-Kincaid Grade (FKG) Level indicated that all four texts were in a similar range of reading difficulty, from 8.70 to 11.14, though the FKG level of text 2 was somewhat higher than that of the others (see Table 2).1 Fourth, every reading text was followed by ten multiple-choice questions with five answer choices each. Of the ten reading comprehension questions, eight were adapted from More Reading Power, with the number of answer choices increased from four to five. Two questions were newly written to measure more accurately the global understanding of the text and the identification of specific information. All of the questions were solved and examined by two Korean teachers of English and one native English-speaking teacher. After the questions were reviewed and double-checked, disagreements were resolved through discussion and the final versions of the reading comprehension questions were produced. To establish internal consistency, Cronbach α coefficients were computed for the four reading posttests. The results for posttests 1 to 4 were 0.769, 0.753, 0.729, and 0.813, in that order.

Table 2 Text readability results from Web VP Classic

| Text | Tokens | Types | Word families | 1K and 2K words percentage (N) | FKG Grade level |
|---|---|---|---|---|---|
| 1 | 497 | 238 | 190+? | 89.53% (445) | 9.94 |
| 2 | 502 | 219 | 162+? | 83.27% (418) | 11.14 |
| 3 | 503 | 239 | 185+? | 89.27% (450) | 8.70 |
| 4 | 506 | 250 | 181+? | 84.59% (428) | 9.40 |

For target word selection, three teachers, who had taught the participants for two consecutive years, discussed and picked out 40 words that seemed unknown to the students.
Forty words were selected because the students were accustomed to the format of 40-word vocabulary tests, which had often been used in class. The 40 words were used as a pretest, and finally 20 target words were selected that were unknown to more than 96% (N=60) of the 62 participants. The target lexical items met the following criteria: a) proper nouns and technical terms, which are of little experimental value as Chen (2011a) noted, were excluded; b) only single words of fewer than five syllables, not idiomatic expressions or word associates, were included, since multi-word lexical items could impose a memory burden on the participants; and c) only content words were selected, with parts of speech limited to adjectives, adverbs, nouns, and verbs, which are relatively more important than others in comprehending texts. The selected target words were destructive, dramatically, worsen, preventable, conflict, up-to-date, afford, operator, correspondence, course, plantation, deforestation, ecology, coral, reef, amphibian, endanger, mass, substance, and ultraviolet.

3.2.2 Smartphone bilingual dictionary.

In order to select a dictionary for the study, we conducted a questionnaire survey on the dictionary types in use (bilingual, monolingual, or bilingualized) and the dictionary forms in use (paper, computer desktop, pocket electronic, or smartphone app dictionaries). The survey was handed out to 96 students. The results showed a preference for bilingual dictionaries among types and for smartphone dictionary apps among forms. Specifically, most of the students (N=92, 95.8%) responded that they were using English-Korean bilingual dictionaries; only a small number (N=4, 4.2%) used monolingual dictionaries, and none used bilingualized dictionaries. This preference for bilingual dictionary use has been reported in many previous studies (Atkins and Knowles 1990, Atkins and Varantola 1997, Baxter 1980, Bensoussan et al. 
1984, Hsien-jen 2001, Nation and Webb 2011, Rundell 1999). Interestingly, the survey conducted in the present study also revealed that when consulting dictionaries, all of the participants (N=96, 100%) used dictionary apps downloaded to their cell phones rather than printed or pocket electronic dictionaries. Based on the survey results, and to control the variables of dictionary form and type, the dictionary used in the study was restricted to an English-Korean bilingual dictionary on the smartphone. To eliminate differences arising from the use of different SBDs (for instance, in letter size, layout, font, phonetic symbols, example sentences, idiomatic and lexical phrases, and translated meanings), a single English-Korean dictionary app was selected that can be downloaded free of charge from the Google Play Store. Fifteen SBDs were searched and examined to find out which was most appropriate for the study. Taking the variables above into consideration, we selected one SBD application, named Manneung ‘omnipotent’ English dictionary. All the participants, except the noticing group, downloaded the same SBD app on their smartphones and used it while engaging in the tasks. Despite the availability of the SBD, however, participants might not consult it when encountering the target words while reading. Failure to consult dictionaries, or only sparing consultation, for target words has also been reported in other studies (Chen 2011a, Hulstijn et al. 1996), and investigations of the frequency of dictionary consultation have shown great individual differences (Fraser 1999a, 1999b). To prevent underuse of dictionaries in this study, the participants were asked to find the meaning of all the underlined target words and to put a check mark beside each word.

3.3. Procedure

To determine students’ perceptions of dictionary use, we first administered the questionnaire to 96 students.
To obtain the students’ approval as volunteers for the study, consent forms assuring voluntary participation and confidentiality of identity were distributed, and all of the students agreed. Next, to ensure the homogeneity of their vocabulary knowledge, all of the participants took a 15-minute vocabulary pretest containing 40 lexical items. They were asked to write the meaning of the words in their L1, Korean. Among the 40 words, 20 words unknown to most of the students (N=92, 95.8%) were selected as target lexical items, and only these were scored. As a reading comprehension pretest, fifteen reading questions from the 2012 National Assessment of Educational Achievement (NAEA) test were administered for 30 minutes.2 After both the vocabulary and reading pretests, nineteen students who knew more than two target words in the vocabulary pretest were excluded, and the remaining 77 participants were assigned to four groups: 19 each to the read-aloud, listening, and dictionary only groups and 20 to the noticing group. The three dictionary groups (read-aloud, listening, and dictionary only) were directed to download the selected smartphone dictionary application and were given time to practice using the SBD, in order to prevent effects of inadequate dictionary consulting skills. One week after the pretests, the treatment for each group (a read-aloud activity with dictionary use, a listening activity with dictionary use, dictionary consultation, or opportunities for noticing target words) was given every Tuesday and Thursday for two consecutive weeks. To keep the instructional procedure consistent and systematic, one of the researchers, a Korean teacher of English, taught all four groups. During the experiment period, each group performed the given task paragraph by paragraph rather than covering the whole reading text at once.
This teaching procedure was intended to reduce the participants’ cognitive burden of remembering the content of the whole passage at once.

For the read-aloud group, forty minutes were given in each treatment. Students were asked to read one paragraph, 80 to 100 words long, and to look up unknown words, including the underlined target words, in their SBD. They then put a check mark beside the words consulted. After the students had consulted their dictionaries while reading, the instructor had the group read aloud the sentences containing target words twice. They then solved one to three reading comprehension questions related to the paragraph they had read. The same procedure applied to the other paragraphs as well. Shortly after all the tasks, the participants rated the text difficulty on a seven-point Likert scale and took a vocabulary posttest consisting of five target words and five distractors.

The listening group was given the same reading texts and questions. After reading and SBD use for a paragraph, however, instead of the read-aloud activity, the participants listened to the same paragraph recorded with a text-to-speech (TTS) program, Panopreter Basic (http://modangs.tistory.com/469). The TTS program was used to keep sound variables such as speed, pitch, and intonation constant. Listening activities were also done paragraph by paragraph. After all the tasks, perceived text difficulty was rated on a seven-point Likert scale, and a vocabulary posttest was administered. In all, this group was also given around forty minutes.

The same procedure was followed for the dictionary only group, except for the read-aloud and listening activities. The participants used their SBDs for unknown words while reading a paragraph, and then solved the attached reading questions. The rest of the passage was dealt with in the same way.
Lastly, perceived text difficulty was marked on a seven-point Likert scale, and vocabulary posttests were taken. This took about thirty-five minutes.

The participants in the noticing group were not allowed to consult any dictionaries. They only had the opportunity to notice the meaning of the underlined words during reading, after which they solved the reading comprehension questions. After the whole reading passage had been handled in this way, they were asked to rate text difficulty on a seven-point Likert scale and took a vocabulary posttest. Altogether, this took about thirty minutes.

Finally, to discover whether sustained improvements in vocabulary or reading ability were made, delayed vocabulary and reading tests were given to all the participant groups two weeks after the last treatment day. On the delayed vocabulary test, the forty words used on the pretest were rearranged and presented in a different order to minimize the carryover effect from the pretest. As a delayed reading test, fifteen questions selected from the 2013 NAEA were used to measure the participants’ reading comprehension.

3.4. Data analysis

The data collected were analyzed with the Statistical Package for the Social Sciences (SPSS) 20.0. Firstly, Cronbach α coefficients were calculated for the internal consistency of the reading comprehension tests. Secondly, the pretest vocabulary and reading scores of the four groups were examined through a one-way ANOVA to make sure that all four groups were initially homogeneous in terms of vocabulary knowledge and reading comprehension. Thirdly, the four posttest vocabulary and reading scores were compared by a repeated-measures ANOVA to see whether short-term effects of task conditions existed, and if any difference among groups was shown, an LSD post hoc test was also conducted.
Fourthly, to determine whether there were delayed effects of task conditions, pretests and delayed posttests of both vocabulary and reading were compared by a paired-samples t-test. Additionally, for the same purpose, delayed vocabulary scores and delayed reading scores were each compared among groups by a one-way ANOVA. The data for learners’ perceived text difficulty were also compared and analyzed, to see whether different task conditions could help the participants perceive the reading texts as easier. For all statistical tests, the alpha (α) level was set at .05, nondirectional.

4. Results and discussion

At the outset, to ensure the homogeneity of the four groups in terms of vocabulary and reading comprehension, we conducted pretests. The vocabulary pretest consisted of 40 words, of which only the 20 target lexical items were scored. Each word was counted as one point. For the reading comprehension pretest, fifteen multiple-choice reading questions from the 2012 NAEA were used. Each question was also counted as one point. The vocabulary and reading pretest scores of the four groups are presented in Tables 3 and 4. The vocabulary mean scores of the four groups were 0.65, 0.77, 0.54, and 0.84, and the reading comprehension mean scores were 5.76, 5.69, 6.38, and 5.05 respectively. One-way ANOVA found no statistically significant difference in either pretest (F = 0.381, p = 0.767 for vocabulary and F = 0.406, p = 0.749 for reading comprehension), which indicates that all the participant groups were homogeneous in their vocabulary and reading ability.

Table 3 Results of group comparison on vocabulary pretest

| Group | N | M | SD | F | Sig. | η2 |
|---|---|---|---|---|---|---|
| Read aloud | 17 | 0.65 | 0.786 | 0.381 | 0.767 | 0.019 |
| Listening | 13 | 0.77 | 0.832 | | | |
| Dictionary only | 13 | 0.54 | 0.776 | | | |
| Noticing | 19 | 0.84 | 0.958 | | | |

* p < 0.05

Table 4 Results of group comparison on reading comprehension pretest

| Group | N | M | SD | F | Sig. | η2 |
|---|---|---|---|---|---|---|
| Read aloud | 17 | 5.76 | 4.085 | 0.406 | 0.749 | 0.020 |
| Listening | 13 | 5.69 | 3.276 | | | |
| Dictionary only | 13 | 6.38 | 3.042 | | | |
| Noticing | 19 | 5.05 | 2.990 | | | |

* p < 0.05

4.1 Immediate effects of task conditions on vocabulary learning

For the immediate effects of task conditions on vocabulary learning, we examined the four vocabulary posttests using a repeated-measures ANOVA. Each vocabulary posttest comprised ten words: five target words and five distractors. One point was awarded for each target word, so the maximum score on one posttest was five. The descriptive statistics showed that the read-aloud, listening, dictionary only, and noticing groups scored on average 3.79, 3.38, 3.23, and 3.29 respectively (see Table 5). All four groups also gained gradually better vocabulary scores over time, with the single exception of the listening group between posttests 1 and 2.
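The homogeneity check reported in Table 3 can be approximately retraced from the published summary statistics alone. The following is a minimal sketch under that assumption, not the SPSS procedure the study used; the function name `anova_from_summary` is hypothetical, and because the means and standard deviations in Table 3 are rounded, the recomputed F only approximates the reported value.

```python
def anova_from_summary(groups):
    """One-way ANOVA from (n, mean, sd) tuples; returns (F, eta squared)."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-groups sum of squares recovered from each group's SD.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, eta_squared

# (N, M, SD) per group, copied from Table 3 (vocabulary pretest).
table3 = [(17, 0.65, 0.786), (13, 0.77, 0.832), (13, 0.54, 0.776), (19, 0.84, 0.958)]
f_stat, eta_squared = anova_from_summary(table3)
print(round(f_stat, 3), round(eta_squared, 3))
```

With these inputs the recomputed F lands close to, but not exactly on, the reported 0.381, while η2 rounds to the reported 0.019; the small gap in F is what one would expect from working with rounded group means.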
Table 5 Descriptive statistics for vocabulary posttests

| Group | N | Posttest 1 M | Posttest 1 SD | Posttest 2 M | Posttest 2 SD | Posttest 3 M | Posttest 3 SD | Posttest 4 M | Posttest 4 SD | Total M | Total SD |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Read aloud | 17 | 3.59 | 1.121 | 3.65 | 1.367 | 3.94 | 1.298 | 4.00 | 1.118 | 3.79 | 1.216 |
| Listening | 13 | 3.00 | 1.472 | 2.77 | 0.927 | 3.54 | 1.330 | 4.23 | 0.832 | 3.38 | 1.270 |
| Dictionary only | 13 | 2.92 | 1.706 | 3.23 | 1.691 | 3.23 | 1.691 | 3.54 | 1.506 | 3.23 | 1.616 |
| Noticing | 19 | 2.89 | 1.595 | 3.37 | 1.257 | 3.37 | 1.257 | 3.53 | 1.467 | 3.29 | 1.393 |

The repeated-measures ANOVA results in Table 6 showed a significant change in posttest scores over time (F = 7.155, p < 0.001, Partial η2 = 0.110), while no significant group difference was found (F = 0.853, p = 0.471, Partial η2 = 0.042). There was no interaction effect between the two variables either (F = 1.112, p = 0.357, Partial η2 = 0.054).

Table 6 Repeated-measures ANOVA results for vocabulary posttests

| Source | SS | df | MS | F | Sig. | Partial η2 |
|---|---|---|---|---|---|---|
| Posttest | 18.214 | 3 | 6.071 | 7.155 | 0.000* | 0.110 |
| Posttest*Group | 8.492 | 9 | 0.944 | 1.112 | 0.357 | 0.054 |
| Group | 12.680 | 3 | 4.227 | 0.853 | 0.471 | 0.042 |

* p < 0.05

The significant increase in posttest scores over time implies that all the different types of tasks (read-aloud, listening, dictionary only, and noticing) have a positive immediate effect on vocabulary learning.
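The partial η2 values in Table 6 can be related to the reported F ratios with a short sketch. The error degrees of freedom used here (58 between subjects, 174 within subjects) are not stated in the source; they are an assumption derived from 62 participants, four groups, and four posttests with sphericity assumed, and the function name `partial_eta_squared` is hypothetical.

```python
def partial_eta_squared(f_stat, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_stat * df_effect) / (f_stat * df_effect + df_error)

# Assumed design constants (error df are not reported in Table 6).
N_PARTICIPANTS, N_GROUPS, N_POSTTESTS = 62, 4, 4
df_error_between = N_PARTICIPANTS - N_GROUPS               # 58
df_error_within = df_error_between * (N_POSTTESTS - 1)     # 174

# (F, effect df, error df) for each source in Table 6.
effects = {
    "Posttest": (7.155, 3, df_error_within),
    "Posttest*Group": (1.112, 9, df_error_within),
    "Group": (0.853, 3, df_error_between),
}
results = {name: round(partial_eta_squared(*args), 3) for name, args in effects.items()}
print(results)
```

Under these assumed error degrees of freedom, the recomputed values agree with the 0.110, 0.054, and 0.042 listed in Table 6, which suggests the table is internally consistent.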
Another reason for the gradual improvement in scores over the experimental period might be the effect of repetition or task familiarity. As the experiment went on, the participants would have become increasingly familiar with the tasks; the more accustomed learners are to the instructional procedures, the more readily they can focus on the learning goals, which leads to better test scores. They might also have recognized that they had to memorize the underlined words. No significant difference among task conditions was found in immediate vocabulary learning; in the short term, therefore, the dictionary groups and the non-dictionary group differed little. It can be inferred that noticing, as well as tasks with dictionary use, could exert a positive influence on vocabulary growth. This result conflicts with the finding of Luppescu and Day (1993), who demonstrated the effectiveness of dictionary reference for vocabulary learning, and with that of Hulstijn et al. (1996), who reported a greater retention rate for dictionary users than for the other groups (marginal glosses and reading only). The difference may be attributable to different approaches to vocabulary learning. Unlike the previous studies, we underlined all the target words so that students paid more attention to them, which might have produced vocabulary development in the non-dictionary group as well as in the dictionary groups.

4.2 Immediate effects of task conditions on reading comprehension

To identify the immediate effects of task conditions on the learners’ reading comprehension, the four reading posttests were analyzed with a repeated-measures ANOVA. First, the descriptive statistics showed that the read-aloud, listening, dictionary only, and noticing groups scored on average 7.12, 6.73, 6.54, and 6.21 respectively (see Table 7).
Interestingly, in each of the four posttests the read-aloud group achieved the highest scores, followed in order by listening, dictionary only, and noticing (the listening and dictionary only groups tied on posttests 3 and 4), though there was no significant difference among groups (see Table 8). One reason the noticing group achieved the lowest scores might be that noticing is less directly coupled with reading comprehension than language-skill activities such as read-aloud and listening. Further research is needed to obtain a clearer indication.

Table 7 Descriptive statistics for reading comprehension posttests

| Group | N | Posttest 1 M | Posttest 1 SD | Posttest 2 M | Posttest 2 SD | Posttest 3 M | Posttest 3 SD | Posttest 4 M | Posttest 4 SD | Total M | Total SD |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Read aloud | 17 | 6.94 | 1.749 | 7.29 | 1.724 | 7.18 | 1.510 | 7.06 | 1.853 | 7.12 | 1.680 |
| Listening | 13 | 6.77 | 1.235 | 6.69 | 1.437 | 6.77 | 1.235 | 6.69 | 1.377 | 6.73 | 1.285 |
| Dictionary only | 13 | 6.15 | 2.267 | 6.54 | 2.259 | 6.77 | 2.006 | 6.69 | 2.720 | 6.54 | 2.271 |
| Noticing | 19 | 5.53 | 2.816 | 6.47 | 2.195 | 6.74 | 1.851 | 6.11 | 1.883 | 6.21 | 2.223 |

Table 8 Repeated-measures ANOVA results for reading comprehension posttests

| Source | SS | df | MS | F | Sig. | Partial η² |
|---|---|---|---|---|---|---|
| Posttest | 8.857 | 2.598 | 3.409 | 2.267 | 0.092 | 0.038 |
| Posttest*Group | 8.748 | 7.798 | 1.122 | 0.746 | 0.647 | 0.037 |
| Group | 30.494 | 3 | 10.165 | 0.893 | 0.450 | 0.044 |

* p < 0.05

The Greenhouse-Geisser values in the repeated-measures design were applied for this analysis, since the assumption of sphericity was not met for posttest scores (p = 0.007). Table 8 shows that there was no significant main effect of either time (F = 2.267, p = 0.092, partial η² = 0.038) or group (F = 0.893, p = 0.450, partial η² = 0.044) on the posttests. No interaction effect was observed between the two variables (F = 0.746, p = 0.647, partial η² = 0.037). Unlike immediate vocabulary learning, where scores increased steadily over time, reading comprehension scores did not improve. This could imply that reading ability is not easily developed in a short time, and thus a longer experimental period might have led to more meaningful results. In addition, the absence of a group difference in posttest scores suggests that, at least for immediate effects, the different task conditions make little difference to reading comprehension. That is, the dictionary use groups did not perform any better than the non-dictionary group. Similar results were found in previous research (Bensoussan et al. 1984, Nesi and Meara 1991), which claimed that the availability of a dictionary did not benefit reading comprehension. Although some researchers have emphasized the value of dictionary use in reading (Shieh and Freiermuth 2010, Tono 1989), this was inferred only from descriptive statistics, such as means and standard deviations (Shieh and Freiermuth 2010), or, as Tono himself pointed out, from tests that were possibly flawed in terms of reliability and validity (Tono 1989). Considering the steady posttest scores of both the dictionary and non-dictionary groups in the present study, more experimental research needs to be conducted in this reading domain.
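As a rough cross-check on Table 8, the partial η² values can be recovered from the reported F ratios and degrees of freedom. The following Python sketch is illustrative only and is not part of the original analysis. It assumes the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error), an error df of 58 for the between-groups effect (62 participants minus 4 groups), and a Greenhouse-Geisser-corrected within-subjects error df of 2.598 × 58; neither error df is stated in the table. The function and variable names are hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: float, df_error: float) -> float:
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


N_PARTICIPANTS = 62  # 17 + 13 + 13 + 19
N_GROUPS = 4
N_POSTTESTS = 4

# Between-subjects error df (assumption: N - k, not printed in Table 8).
df_error_between = N_PARTICIPANTS - N_GROUPS

# Greenhouse-Geisser epsilon implied by the corrected time df in Table 8.
df_time_uncorrected = N_POSTTESTS - 1
df_time_corrected = 2.598
epsilon = df_time_corrected / df_time_uncorrected

# Within-subjects error df after the same correction (assumption).
df_error_within = epsilon * df_time_uncorrected * df_error_between

# F ratios and effect df as reported in Table 8.
eta_time = partial_eta_squared(2.267, df_time_corrected, df_error_within)
eta_interaction = partial_eta_squared(0.746, 7.798, df_error_within)
eta_group = partial_eta_squared(0.893, 3, df_error_between)

print(f"epsilon = {epsilon:.3f}")
print(f"time: {eta_time:.3f}, interaction: {eta_interaction:.3f}, group: {eta_group:.3f}")
```

Under these assumptions the three values round to the partial η² column of Table 8, and the implied epsilon of about 0.87 indicates a moderate departure from sphericity.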
4.3 Delayed effects of task conditions on vocabulary learning

The descriptive statistics in Table 9 show that the listening group exhibited the best performance in sustained vocabulary learning (M = 12.08), followed by noticing (M = 10.58), dictionary only (M = 10.46), and read-aloud (M = 8.65). To investigate whether any group showed more sustained effects of task conditions on vocabulary learning, we compared delayed vocabulary test scores among groups with a one-way ANOVA. Table 10 shows that there was no significant difference among the four groups (F = 0.958, p = 0.419). This finding accords with that of the immediate vocabulary posttests, which also showed no difference among groups. Thus, it could be concluded that tasks with dictionary use and noticing do not produce substantial differences, at least in vocabulary learning.

Table 9 Descriptive statistics for delayed vocabulary test

| Group | N | M | SD |
|---|---|---|---|
| Read aloud | 17 | 8.65 | 5.061 |
| Listening | 13 | 12.08 | 4.518 |
| Dictionary only | 13 | 10.46 | 4.789 |
| Noticing | 19 | 10.58 | 6.955 |
| Total | 62 | 10.34 | 5.566 |

Table 10 Results of group comparison on delayed vocabulary test

| | Source | SS | df | MS | F | Sig. | η² |
|---|---|---|---|---|---|---|---|
| Delayed test | Between Groups | 89.219 | 3 | 29.740 | 0.958 | 0.419 | 0.047 |
| | Within Groups | 1800.668 | 58 | 31.046 | | | |
| | Total | 1889.887 | 61 | | | | |

* p < 0.05

To verify more precisely whether task conditions exert prolonged effects on vocabulary learning, we compared pretest and delayed test vocabulary scores within each group.
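The one-way ANOVA in Table 10 can be approximately reconstructed from the group sizes, means, and standard deviations in Table 9. The Python sketch below is a reader's cross-check under stated assumptions, not the procedure used in the study: it assumes the reported SDs are sample standard deviations (n − 1 denominator) and works from the rounded values printed in Table 9, so small rounding differences from Table 10 are expected. All names are hypothetical.

```python
from typing import NamedTuple


class GroupSummary(NamedTuple):
    name: str
    n: int
    mean: float
    sd: float


def anova_from_summary(groups: list[GroupSummary]) -> dict[str, float]:
    """One-way ANOVA computed from per-group n, mean, and sample SD."""
    total_n = sum(g.n for g in groups)
    grand_mean = sum(g.n * g.mean for g in groups) / total_n

    # Between-groups SS: weighted squared deviations of group means.
    ss_between = sum(g.n * (g.mean - grand_mean) ** 2 for g in groups)
    # Within-groups SS: pooled from sample variances (assumes n - 1 denominator).
    ss_within = sum((g.n - 1) * g.sd ** 2 for g in groups)

    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)

    return {
        "grand_mean": grand_mean,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "F": f_value,
        "eta_squared": eta_squared,
    }


# Delayed vocabulary test summary statistics, copied from Table 9.
delayed_vocab = [
    GroupSummary("Read aloud", 17, 8.65, 5.061),
    GroupSummary("Listening", 13, 12.08, 4.518),
    GroupSummary("Dictionary only", 13, 10.46, 4.789),
    GroupSummary("Noticing", 19, 10.58, 6.955),
]

result = anova_from_summary(delayed_vocab)
for key, value in result.items():
    print(f"{key}: {value:.3f}")
```

With the Table 9 inputs, the reconstructed F is approximately 0.96 and η² approximately 0.047, in line with Table 10. Computing the p-value would additionally require an F-distribution function, which is omitted here to keep the sketch dependency-free.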
The results of paired-samples t-tests, given in Table 11, showed significant improvements in vocabulary learning in all four groups (p < 0.001 in each case), with large effect sizes (η² = 0.619, 0.813, 0.723, and 0.518 for the read-aloud, listening, dictionary only, and noticing groups respectively). This finding suggests that noticing, as well as the tasks with dictionary use, has a positive impact on long-term vocabulary learning.

Table 11 Results of group comparison on vocabulary pretest and delayed test

| Group | Test | N | M | SD | t | Sig. | η² |
|---|---|---|---|---|---|---|---|
| Read aloud | Pre | 17 | 0.65 | 0.786 | -7.219 | 0.000* | 0.619 |
| | Delayed | | 8.65 | 5.061 | | | |
| Listening | Pre | 13 | 0.77 | 0.832 | -10.225 | 0.000* | 0.813 |
| | Delayed | | 12.08 | 4.518 | | | |
| Dictionary only | Pre | 13 | 0.54 | 0.776 | -7.919 | 0.000* | 0.723 |
| | Delayed | | 10.46 | 4.789 | | | |
| Noticing | Pre | 19 | 0.84 | 0.958 | -6.221 | 0.000* | 0.518 |
| | Delayed | | 10.58 | 6.955 | | | |

* p < 0.05

Ronald (2002) reported a similar result: vocabulary development during extensive reading occurred both with and without a dictionary, though clearer benefits were shown with dictionary use. Fraser (1999a) likewise found no significant difference in vocabulary retention between consulting dictionaries and inferring. A different finding was reported by Chen (2011a): the bilingual dictionary groups, whether electronic or paper, achieved significantly higher scores than the guessing group in both vocabulary comprehension and retention.
It is worth noting, however, that the delayed vocabulary test was given one week after the last experiment in Chen’s study, whereas it was administered two weeks after in this study. This might suggest that the results of the present study are more representative of sustained effects.

4.4 Delayed effects of task conditions on reading comprehension

Descriptive statistics for delayed reading test scores in Table 12 show that the listening group scored highest (M = 7.85), dictionary only second (M = 7.31), read-aloud third (M = 6.88), and noticing lowest (M = 4.89). It seems that, over the longer term, dictionary use has more positive effects on reading comprehension than noticing.

Table 12 Descriptive statistics for delayed reading test

| Group | N | M | SD |
|---|---|---|---|
| Read aloud | 17 | 6.88 | 3.462 |
| Listening | 13 | 7.85 | 2.577 |
| Dictionary only | 13 | 7.31 | 3.146 |
| Noticing | 19 | 4.89 | 3.143 |
| Total | 62 | 6.56 | 3.267 |

To draw a clearer picture of which task conditions are more effective for understanding reading texts over the longer term, delayed reading tests were analyzed with a one-way ANOVA. The results, summarized in Table 13, revealed a significant mean difference among the four groups (F = 2.833, p = 0.046). More specifically, LSD post hoc test results showed that two groups, listening (p = 0.011) and dictionary only (p = 0.036), performed significantly better than the noticing group.

Table 13 Results of group comparison on delayed reading test

| | Source | SS | df | MS | F | Sig. | η² |
|---|---|---|---|---|---|---|---|
| Delayed Posttest | Between Groups | 83.226 | 3 | 27.742 | 2.833 | 0.046* | 0.127 |
| | Within Groups | 568.016 | 58 | 9.793 | | | |
| | Total | 651.242 | 61 | | | | |

* p < 0.05

Among the three dictionary use groups, which performed the extra tasks of read-aloud, listening, or dictionary consultation alone, only the read-aloud group did not differ significantly from the noticing group. This may indicate that a greater number of tasks does not necessarily guarantee higher scores. One explanation for the significant differences between the dictionary only and noticing groups, and between the listening and noticing groups, may lie in task characteristics. When looking up unknown words, the dictionary group would consider which meaning of each word fits the given reading text, which could help connect word meanings to the reading context. By listening to the whole reading passage, the listening group could reflect more deeply on their initial understanding of the text and thus understand it more firmly. The noticing group, on the other hand, did not perform activities that could actively stimulate cognition linked to text understanding. As for the read-aloud group, the participants could have forgotten part of what they had read while doing the activity and thus might have had difficulty answering the reading questions. If the participants had read aloud the whole text, instead of a few sentences containing the target words, different experimental results might well have been obtained.

To ascertain the prolonged effects of task conditions on reading comprehension, reading pretest and delayed test scores were analyzed by group with paired-samples t-tests. Table 14 shows that significant differences were found in the three dictionary use groups, read-aloud, listening, and dictionary only (p = 0.017, 0.001, and 0.008 respectively), whereas there was no statistically significant difference in the noticing group (p = 0.635).
This finding suggests that noticing alone could not lead to enduring positive effects on reading comprehension, and that noticing may not be directly associated with developing reading ability.

Table 14 Results of group comparison on reading pretest and delayed test

| Group | Test | N | M | SD | t | Sig. | η² |
|---|---|---|---|---|---|---|---|
| Read aloud | Pre | 17 | 5.76 | 4.085 | -2.667 | 0.017* | 0.181 |
| | Delayed | | 6.88 | 3.462 | | | |
| Listening | Pre | 13 | 5.69 | 3.276 | -4.635 | 0.001* | 0.472 |
| | Delayed | | 7.85 | 2.577 | | | |
| Dictionary only | Pre | 13 | 6.38 | 3.042 | -3.207 | 0.008* | 0.299 |
| | Delayed | | 7.31 | 3.146 | | | |
| Noticing | Pre | 19 | 5.05 | 2.990 | 0.483 | 0.635 | 0.006 |
| | Delayed | | 4.89 | 3.143 | | | |

* p < 0.05

From the results above, we could draw the following inferences. First, read-aloud with dictionary use could facilitate the understanding of reading passages in that it could help the participants integrate the underlined target words into their overall understanding of the sentences. It could also stimulate and refresh their memory of what they had read in the texts. Secondly, listening to the whole texts with dictionary use seems to be most helpful for reading comprehension. This might be because the listening group received input in more than one mode: visual input from reading as well as auditory input from listening. In addition, only the listening group participants were exposed to the whole texts twice, which could have helped them understand the texts more precisely. Lastly, dictionary use alone could contribute to enhancing reading ability over a longer period.
It seems that smartphone dictionary use allowed learners to spend less time accessing dictionary information than other forms of dictionary would, and to focus more on comprehending the texts. This interpretation is supported in part by the experimental study of Weschler and Pitts (2000), which found that learners with an ED looked up words about 23% faster than those with a PD, and by that of Loucky (2013), which found a shorter mean access time for bilingual EDs than for bilingual PDs. Spending less time on dictionary consultation and more on reading comprehension would lower the likelihood of disrupting readers’ processing and help them retain the reading content longer.

4.5 The relationship between task conditions and difficulty perception

The participants rated the perceived difficulty of the reading texts on a seven-point Likert scale, with lower ratings indicating easier texts. The results in Table 15 show that, on average, the read-aloud group found the reading passages easiest (M = 3.37), followed by the listening (M = 3.60), dictionary only (M = 3.62), and noticing (M = 3.97) groups. The group means ranged from 3.37 to 3.97, which indicates that the selected texts were generally readable for the participants, neither too easy nor too challenging, and thus that the text selection was appropriate.
Table 15 Descriptive statistics for perceived text difficulty

| Group | N | Text 1 M | Text 1 SD | Text 2 M | Text 2 SD | Text 3 M | Text 3 SD | Text 4 M | Text 4 SD | Total M | Total SD |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Read aloud | 17 | 3.41 | 0.795 | 3.24 | 1.147 | 3.59 | 0.939 | 3.24 | 1.033 | 3.37 | 0.976 |
| Listening | 13 | 3.31 | 1.315 | 3.62 | 0.961 | 3.77 | 1.092 | 3.69 | 1.182 | 3.60 | 1.125 |
| Dictionary only | 13 | 4.00 | 1.291 | 3.77 | 1.235 | 3.31 | 1.182 | 3.38 | 1.223 | 3.62 | 1.223 |
| Noticing | 19 | 4.16 | 1.214 | 3.95 | 1.129 | 3.79 | 1.032 | 4.00 | 1.054 | 3.97 | 1.095 |

The noticing group rated every one of the reading texts as the most difficult. This finding corresponds to the noticing group's reading posttest and delayed test scores: among the four groups, it had the lowest scores on all four posttests and on the delayed test (see Tables 7 and 12). To identify a general group tendency in perceived reading difficulty, we pooled all the ratings by group and ran a one-way ANOVA. As shown in Table 16, there was a significant difference in perceived reading difficulty among groups (F = 3.753, p = 0.012), and LSD post hoc test results indicated that the read-aloud group rated the texts as significantly easier than the noticing group did (p = 0.001). Even though both the read-aloud and listening groups performed mixed-skill tasks with dictionary use, only the read-aloud group differed significantly from the noticing group. This could be explained by the types of skills combined, i.e., reception (reading) and production (read-aloud) vs. reception (reading) and reception (listening).
It seems that the blending of reception and production, specifically reading with a read-aloud task in this study, allows learners to experience reading texts as easier or less effortful than the other combination does.

Table 16 Results of group comparisons on perceived text difficulty

| Group | N | M | SD | F | Sig. | η² |
|---|---|---|---|---|---|---|
| Read aloud | 17 | 3.37 | 0.976 | 3.753 | 0.012* | 0.044 |
| Listening | 13 | 3.60 | 1.125 | | | |
| Dictionary only | 13 | 3.62 | 1.223 | | | |
| Noticing | 19 | 3.97 | 1.095 | | | |

* p < 0.05

The results of the immediate tests also support this interpretation: among the four groups, the read-aloud group had the highest scores on all immediate vocabulary and reading tests, with just one exception. Since perceived reading difficulty was checked right before the vocabulary posttests and after the reading posttests, it is likely that the higher the participants' immediate vocabulary and reading posttest scores, the easier they perceived the texts to be.

5. Conclusion

The present study traces the impact of dictionary use and integrated skills on reading comprehension and vocabulary learning, and on learners’ perceptions of reading text difficulty. The major findings of the study are as follows. Firstly, there was no significant difference in vocabulary posttest scores among the four groups, which means that task conditions had no immediate differential effect on vocabulary learning. Despite the absence of a significant group difference, vocabulary posttest scores improved significantly over time. It could be inferred from this result that, regardless of task type, repetition or familiarity with the tasks contributes to vocabulary growth.
The comparison of the vocabulary pretest and delayed test also revealed statistically significant gains in all four groups, implying that all the task conditions, regardless of whether SBDs were used, could lead to sustained vocabulary development. Next, the four groups did not show any significant differences in reading posttest scores, indicating that task conditions have no immediate influence on reading comprehension. There was also no significant effect of time, which could imply that reading ability cannot easily be enhanced in a short time. Different results could thus be expected with a longer experimental period. The comparison between the reading pretest and delayed test, however, presented a different picture: the three groups using SBDs (read-aloud, listening, and dictionary only) showed significant differences, while the non-dictionary noticing group did not show any statistically significant improvement. These outcomes were also supported by the one-way ANOVA, which showed that, among the three dictionary groups, the listening and dictionary only groups performed significantly better in comprehending texts than the noticing group. The following inferences were drawn: read-aloud with dictionary consultation serves as a facilitator of reading comprehension; among the tasks, listening to texts with dictionary use is most useful for understanding reading texts; and SBD use alone leads to enhanced reading ability. Finally, as for learners’ perceptions of reading text difficulty, the groups perceived the texts as easier in the order of read-aloud, listening, dictionary only, and noticing. The general tendency in the one-way ANOVA results was that the read-aloud group differed significantly from the noticing group. That is, the dictionary group performing a read-aloud activity reported significantly less difficulty with the reading texts than the non-dictionary group.
Considering that the listening group, the other mixed-skill group, showed little difference from the non-dictionary group, the type of mixed skills appears to play an equally important role: combining a receptive skill (reading) with a productive one (read-aloud) leads learners to experience reading passages as easier than combining two receptive skills (reading and listening) does. The limitations of the study lie in the sample size and the rather short duration of the experimental period. To draw a clearer picture of the effects of task conditions, future research needs a longer experimental period and larger groups. Another suggested research direction is to include a dictionary group performing a speaking activity other than read-aloud, or one with a writing activity, as another type of mixed-skill combination. This study contributes to the re-evaluation of vocabulary instruction in practice, pointing to the importance of integrating language skills with dictionary use in vocabulary development and reading comprehension.

Notes

1 The Flesch-Kincaid Grade (FKG) Level, one of the two Flesch-Kincaid readability tests, presents a score as a US grade level to judge the readability level of various books and texts. The FKG level is extensively used in the field of education along with the other readability test, the Flesch reading ease.

2 Since 2010, the NAEA test has been administered to all students in the 11th grade in Korea, testing three subjects: Korean, mathematics, and English. According to the NAEA result reports, it judges students’ academic attainment with a distinct scoring system that allows score comparison with previous NAEA tests.

References

Joysoft. Manneung ‘omnipotent’ English dictionary: an English-Korean bilingual dictionary downloaded from the application software Google Play Store on the smartphone, https://play.google.com/store/apps/details?id=com.joysoft.wordBook
Atkins B. T., Knowles F. E. 1990. ‘Interim Report on the EURALEX/AILA Research Project into Dictionary Use’ In Magay I., Zigany J. (eds), BudaLEX 88 Proceedings. Budapest: Akademiai Kiado, 381–392.
Atkins B. T., Varantola K. 1997. ‘Monitoring Dictionary Use.’ International Journal of Lexicography 10.1: 1–45.
Baxter J. 1980. ‘The Dictionary and Vocabulary Behavior: a Single Word or a Handful?’ TESOL Quarterly: 325–336.
Bensoussan M., Sim D., Weiss R. 1984. ‘The Effect of Dictionary Usage on EFL Test Performance Compared with Student and Teacher Attitudes and Expectations.’ Reading in a Foreign Language 2: 262–275.
Chen Y. Z. 2010. ‘Dictionary Use and EFL Learning. A Contrastive Study of Pocket Electronic Dictionaries and Paper Dictionaries.’ International Journal of Lexicography 23.3: 275–306.
Chen Y. Z. 2011a. ‘Dictionary Use and Vocabulary Learning in the Context of Reading.’ International Journal of Lexicography 25.2: 216–247.
Chen Y. Z. 2011b. ‘Studies on Bilingualized Dictionaries: the User Perspective.’ International Journal of Lexicography 24.2: 161–197.
Dziemianko A. 2010. ‘Paper or Electronic? The Role of Dictionary Form in Language Reception, Production and the Retention of Meaning and Collocations.’ International Journal of Lexicography 23.3: 257–273.
Dziemianko A. 2011. ‘Does Dictionary Form Really Matter?’ In Akasu K., Satoru U. (eds), ASIALEX 2011 proceedings lexicography: Theoretical and Practical Perspectives. Kyoto: Asian Association for Lexicography, 92–101.
Dziemianko A. 2012. ‘Why One and Two Do Not Make Three: Dictionary Form Revisited.’ Lexikos 22: 195–216.
Fraser C. A. 1999a. ‘Lexical Processing Strategy Use and Vocabulary Learning through Reading.’ Studies in Second Language Acquisition 21.2: 225–241.
Fraser C. A. 1999b. ‘The Role of Consulting a Dictionary in Reading and Vocabulary Learning.’ Canadian Journal of Applied Linguistics 2.1-2: 73–89.
Hayati A. M., Pour-Mohammadi M. 2005. ‘A Comparative Study of Using Bilingual and Monolingual Dictionaries in Reading Comprehension of Intermediate EFL Students.’ Reading 5.2: 61–66.
Holmer L., von Martens M., Sköldberg E. 2015. ‘Making a Dictionary App from a Lexical Database: the Case of the Contemporary Dictionary of the Swedish Academy.’ Proceedings of eLex conference, August 2015. Accessed on 21 January 2016. https://elex.link/elex2015/proceedings/eLex_2015_03_Holmer+vonMartens+Skoldberg.pdf.
Hsien-jen C. 2001. ‘The Effects of Dictionary Use on the Vocabulary Learning Strategies Used by Language Learners of Spanish.’ Paper presented at the Annual Meeting of the Acquisition of Spanish and Portuguese as First and Second Languages. University of Illinois, 11-14 October 2001. Accessed on 21 January 2016. http://files.eric.ed.gov/fulltext/ED471315.pdf.
Hulstijn J. H., Hollander M., Greidanus T. 1996. ‘Incidental Vocabulary Learning by Advanced Foreign Language Students: the Influence of Marginal Glosses, Dictionary Use, and Reoccurrence of Unknown Words.’ The Modern Language Journal 80.3: 327–339.
Knight S. 1994. ‘Dictionary Use while Reading: the Effects on Comprehension and Vocabulary Acquisition for Students of Different Verbal Abilities.’ The Modern Language Journal 78.3: 285–299.
Koyama T., Takeuchi O. 2003. ‘Printed Dictionaries vs. Electronic Dictionaries: a Pilot Study on How Japanese EFL Learners Differ in Using Dictionaries.’ Language Education & Technology 40: 61–79.
Koyama T., Takeuchi O. 2004a. ‘Comparing Electronic and Printed Dictionaries: How the Difference Affected EFL Learning.’ JACET Bulletin 38: 33–46.
Koyama T., Takeuchi O. 2004b. ‘How Look-up Frequency Affects EFL Learning?: an Empirical Study on the Use of Handheld-electronic Dictionaries’ Proceedings of the CLaSIC 2004 Conference, 2004. Accessed on 21 January 2016. http://kuir.jm.kansai-u.ac.jp/dspace/bitstream/10112/5189/1/KU-1100-200400.pdf
Koyama T., Takeuchi O. 2007. ‘Does Look-up Frequency Help Reading Comprehension of EFL Learners? Two Empirical Studies of Electronic Dictionaries.’ CALICO Journal 25.1: 110–125.
Laufer B., Hadar L. 1997. ‘Assessing the Effectiveness of Monolingual, Bilingual, and “Bilingualised” Dictionaries in the Comprehension and Production of New Words.’ The Modern Language Journal 81.2: 189–196.
Loucky J. P. 2013. ‘Comparing Electronic Dictionary Functions and Use.’ CALICO Journal 28.1: 156–174.
Luppescu S., Day R. R. 1993. ‘Reading, Dictionaries, and Vocabulary Learning.’ Language Learning 43.2: 263–279.
Midlane V. 2005. Students’ Use of Portable Electronic Dictionaries in the EFL/ESL Classroom; a Survey of Teacher Attitudes. M.A. Thesis, University of Manchester.
Mikulecky B. S., Jeffries L. 2004. More Reading Power. New York: Longman.
Nation I. S. P. 2001. Learning Vocabulary in Another Language. Cambridge: Cambridge University Press.
Nation I. S. P., Webb S. 2011. Researching and Analyzing Vocabulary. Boston: Heinle.
Nesi H. 2014. ‘Dictionary Use by English Language Learners.’ Language Teaching 47.1: 38–55.
Nesi H., Meara P. 1991. ‘How Using Dictionaries Affects Performance in Multiple-Choice EFL Tests.’ Reading in a Foreign Language 8: 631–643.
Pujol D., Corrius M., Masnou J. 2006. ‘Print Deferred Bilingualised Dictionaries and their Implications for Effective Language Learning: a New Approach to Pedagogical Lexicography.’ International Journal of Lexicography 19.2: 197–215.
Ronald J. 2002. ‘L2 Lexical Growth through Extensive Reading and Dictionary Use: a Case Study’ Proceedings of the Tenth EURALEX International Congress, EURALEX 2002, 13–17 August 2002.
Rundell M. 1999. ‘Dictionary Use in Production.’ International Journal of Lexicography 12.1: 35–53.
Shieh W., Freiermuth M. R. 2010. ‘Using the DASH Method to Measure Reading Comprehension.’ TESOL Quarterly 44.1: 110–128.
Stirling J. 2005. ‘The Portable Electronic Dictionary: Faithful Friend or Faceless Foe?’ Modern English Teacher 14.3: 64–72.
Summers D. 1988. ‘The Role of Dictionaries in Language Learning’ In Carter R., McCarthy M. (eds), Vocabulary and Language Teaching, London: Longman, 111–125.
Tang G. M. 1997. ‘Pocket Electronic Dictionaries for Second Language Learning: Help or Hindrance?’ TESL Canada Journal 15.1: 39–57.
Tono Y. 1989. ‘Can a Dictionary Help One Read Better? On the Relationship between E.F.L. Learners’ Dictionary Reference Skills and Reading Comprehension’ In James G (eds), Lexicographers and Their Works. Exeter: University of Exeter Press, 192–200.
Weschler R., Pitts C. 2000. ‘An Experiment Using Electronic Dictionaries with EFL Students.’ The Internet TESL Journal 6.8: 56–67.
Zarei A. A., Naseri D. 2008. ‘The Effect of Monolingual, Bilingual, and Bilingualized Dictionaries on Vocabulary Comprehension and Production.’ TELL 2.7: 42–69.

© 2016 Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com

Journal

International Journal of Lexicography, Oxford University Press

Published: Mar 1, 2018
