TY - JOUR
AU - Goldstein, Mikael
AB - Abstract

Can readability on small screens be improved by using adaptive Rapid Serial Visual Presentation (RSVP) that adapts the presentation speed to the characteristics of the text instead of keeping it fixed? In this paper we introduce Adaptive RSVP, describe the design of a prototype on a mobile device, and report findings from a usability evaluation where the ability to read long and short texts was assessed. In a Latin-square balanced repeated-measurement experiment, employing 16 subjects, two variants of Adaptive RSVP were benchmarked against Fixed RSVP and traditional text presentation. For short texts, all RSVP formats increased reading speed by 33% with no significant differences in comprehension or task load. For long texts, no differences were found in reading speed or comprehension, but all RSVP formats increased task load significantly. Nevertheless, Adaptive RSVP decreased task load ratings for most factors compared to Fixed RSVP. Causes, implications, and effects of these findings are discussed.

1 Introduction

The mobile Internet has been widely predicted to cause a revolution in communications over the next few years and when the time is ripe it is likely to become an integral part of our everyday lives. However, before the revolution can start there must be devices, services, and applications available that people really want to use. Enabling these new technologies, bringing them to market, and making desirable content available to them, are the challenges facing the wireless business today. Nonetheless, when the mobile Internet eventually takes off, users of mobile devices are going to be able to access more or less the same information resources as desktop users do today. The fact that the mobile devices are much smaller does, however, constrain the usefulness of this advancement.
The limited input capabilities of mobile devices make them primarily suitable for information retrieval, but the limited screen space currently constitutes a bottleneck for such appliances (Ericsson et al., 2001). Since it is the customers' demand for small devices that limits the screen size, small screens are likely to remain a reality in the future as well. This notion, combined with the fact that readability has long been considered important as even small improvements can ease reading for large groups of people (Huey, 1908), has made the issues concerning readability on small screens progressively more important for mobile usability. Early research on screen reading showed that reading speed decreased by 20–30% when reading on large screens compared to reading on paper (Mills and Weldon, 1987). With time, as screen resolution improved and people got more used to reading from a screen, readability on large screens became more or less equal to paper (Muter and Maurutto, 1991). The evolution in readability on small screens is, however, not likely to follow the same pattern. The resolution will surely get better and thus improve legibility, but decreased readability will still be intrinsic to limited screen space (Duchnicky and Kolers, 1983). Readability may, however, still be increased by designing interfaces that display the text in a way more suitable for small screens. Focus has thus shifted from how screens display static texts to how texts can be dynamically displayed on the screen. Leading and Rapid Serial Visual Presentation (RSVP) are the two major techniques that have been proposed for dynamic text presentation (Mills and Weldon, 1987). Leading, or the Times Square Format, scrolls the text on one line horizontally across the screen, whereas RSVP presents the text as chunks of words or characters in rapid succession at a single visual location.
Both formats offer a way of reading texts on a very limited screen space (Bruijn and Spence, 2000; Juola et al., 1995; Mills and Weldon, 1987; Rahman and Muter, 1999). Comparisons between the formats have so far been inconclusive (Juola et al., 1982; Kang and Muter, 1989), but since the eye processes information during fixed gazes, RSVP seems the more natural choice: the text then moves successively rather than continuously. It is important to explore the possibilities of dynamic text presentation since improved readability on small screens also means improved usability of mobile devices. In this paper, we look at the potential enhancement of the RSVP format by letting the presentation speed adapt to some linguistic characteristics of the text.

2 RSVP

RSVP originated as a tool for studying reading behaviour (Forster, 1970; Juola et al., 1982; Potter, 1984), but has lately received more attention as a presentation technique with a promise of optimizing reading efficiency, especially when screen space is limited (Goldstein et al., 2001; Juola et al., 1995; Muter, 1996; Rahman and Muter, 1999; Sicheritz, 2000). The reason for the interest is that the process of reading works a little differently when RSVP is used and that it requires much less screen space than traditional text presentation. While you read this text on paper, three distinct visual tasks are performed: information is processed in fixed gazes or fixations, saccadic eye movements are executed to move between the fixations, and return sweeps are used to move to the next line. Whereas saccadic eye movements and return sweeps are performed very quickly (∼40 and ∼55 ms, respectively), the fixations take longer (∼230 ms for fast readers and ∼330 ms for average readers) (Robeck and Wallace, 1990). If you read this text using RSVP instead of paper, it would be successively displayed as small chunks within a small area.
Each chunk would typically contain one or a few words depending on the width of the text presentation window. When reading in this fashion the text proceeds by itself, which reduces the need for saccadic eye movements and return sweeps (Rahman and Muter, 1999). The speed of the text presentation when using RSVP is usually measured in words per minute (wpm). The exposure time of each text chunk is calculated on the basis of the set presentation speed and of how much can be displayed in the text presentation window. Unfortunately, there is little or no documentation in previous studies on exactly how the exposure times have been calculated (Juola et al., 1982; Juola et al., 1995; Masson, 1983; Muter, 1996; Rahman and Muter, 1999). What is known, however, is that the exposure times have generally been fixed. In this evaluation, the following formula has been employed for calculating the fixed text chunk exposure times (Eq. (1)):

\[ \mathit{time0} = \frac{\mathit{fchr}}{\mathit{wavg} \cdot \frac{\mathit{wpm}}{60}} \tag{1} \]

The average number of characters that can be displayed (fchr) is divided by the product of the average word length (wavg) for the current language and the presentation speed (wpm) divided by 60. The result is a fixed exposure time for each text chunk measured in seconds (time0).

2.1 Previous readability evaluations with RSVP

Juola et al. (1982) found comprehension equal between RSVP and traditional text presented on a screen whereas Masson (1981) found that comprehension of text read using RSVP was poorer. A possible explanation for the different results may be the insertion of a blank screen for 200–300 ms between the sentences in the Juola et al. study. In a repeated-measurement experiment where long texts were read on a Personal Digital Assistant (PDA) using RSVP with blank screens, Goldstein et al. (2001) found neither RSVP reading speed nor comprehension to differ from reading on paper. However, the NASA-TLX (Task Load Index) (Hart and Staveland, 1988) revealed significantly higher task load when using RSVP for most factors.
One explanation for the high task load may be the fact that the exposure times in previous RSVP implementations have been fixed although the reading speed actually varies (Just and Carpenter, 1980). The relation between reading speed and exposure time for what we from here on will refer to as Fixed RSVP can be visualized in the following speed–exposure plot (Fig. 1).

Fig. 1 Variations in reading speed for individual text chunks containing 1, 2, 3, 4, or 5 words (the dots from left to right), presented at a set speed of 300 wpm (indicated by the vertical arrow), when using RSVP with fixed exposure times (data derived from the training text).

The plot is a result of presenting the training text used in the usability evaluation at a constant speed of 300 wpm using the fixed exposure time formula (Eq. (1)). The width of the text presentation window was 25 characters; for Swedish this gives an average of 21 characters displayed in each window (fchr=21). The average word length was set to seven characters (wavg=7). These variables were obtained by corpora analysis and are language dependent (Öquist, 2001); if the algorithm is to be used with any other language, the variables would have to be substituted. As can be seen on the y-axis in Fig. 1, the fixed exposure time formula exposes each text chunk for a fixed time (600 ms), in order to present the text at the set speed of 300 wpm. The reading speed for the individual text chunks, which can be seen on the x-axis of Fig. 1, follows from dividing each text chunk's exposure time by the number of words it contains (1–5), which gives the time available per word.
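The arithmetic behind Fig. 1 can be checked with a short sketch. It assumes that the fixed exposure time is the quotient described for Eq. (1), and that the effective speed of a chunk is its word count divided by its exposure time; the function names are illustrative and do not come from the Bailando source.

```python
def fixed_exposure_time(fchr: float, wavg: float, wpm: float) -> float:
    """Fixed exposure time per text chunk in seconds, following Eq. (1)."""
    return fchr / (wavg * wpm / 60.0)


def chunk_reading_speed(exposure_s: float, n_words: int) -> float:
    """Effective reading speed (wpm) of a chunk of n_words shown for exposure_s seconds."""
    return n_words * 60.0 / exposure_s


# Values reported for Swedish: 21 characters per window, 7 characters per word.
time0 = fixed_exposure_time(fchr=21, wavg=7, wpm=300)
speeds = {n: chunk_reading_speed(time0, n) for n in range(1, 6)}

print(f"exposure per chunk: {time0 * 1000:.0f} ms")
for n_words, speed in speeds.items():
    print(f"{n_words} word(s): {speed:.0f} wpm")
```

With these values, only the three-word chunk is read at the set speed; one-word chunks are effectively slower and five-word chunks faster.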
This explains why there is a variation in reading speed (100–500 wpm), although the exposure time is fixed. Only the text chunks with three words match the selected reading speed (the vertical arrow in Fig. 1). If more words appear in a text chunk, the speed increases proportionally, and if fewer words appear, the speed decreases proportionally. The time available per word is thus inversely related to the number of words that each text chunk contains, which does not seem very natural. The reason for this is that the text chunks displaying several words are likely to convey more information than the text chunks displaying few words. Yet, when fixed exposure times are used, all text chunks must be processed equally fast, although the time it takes to digest the conveyed information is likely to differ.

3 Adaptive RSVP

Just and Carpenter (1980, p. 330) found that ‘there is a large variation in the duration of individual fixations as well as the total gaze duration on individual words’ when reading text from paper. Adaptive RSVP (Goldstein et al., 2001; Öquist, 2001) attempts to mimic the reader's cognitive text processing pace more adequately by adjusting each text chunk exposure time with respect to the text appearing in the RSVP text presentation window. By assuming the eye-mind hypothesis (Just and Carpenter, 1980), i.e. that the eye remains fixated on a text chunk as long as it is being processed, the needed exposure time of a text chunk can be assumed proportional to the predicted gaze duration of that text chunk. Since very common, known, or short words are usually processed faster than infrequent, unknown or long words, the text chunk exposure times can be adjusted accordingly (Just and Carpenter, 1980). Further, most new information tends to be introduced late in sentences and therefore ambiguity and references tend to be resolved there as well. A shorter sentence is also usually processed faster than a longer one since it conveys less information (Just and Carpenter, 1980).
Thus, processing time differs both within and between sentences and the text chunk exposure times can therefore be adjusted accordingly as well. On the basis of these findings, two adaptive algorithms intended to decrease task load were developed. Both were deliberately kept simple since mobile clients tend to be quite thin (i.e. have limited processing power). The first algorithm adapts the exposure time to the content of the text chunks whereas the second also looks to the context in the sentences. Both algorithms insert a blank window between each sentence if there is not enough space to begin on the next sentence in the same window; otherwise a delay is added to the sentence boundary instead.

3.1 Content adaptation

In content adaptive mode, the exposure time for each text chunk is based on the number of characters and words that are being exposed at the moment. Longer words are assumed to be more infrequent and to take longer to read than shorter words. A higher number of words is also assumed to take longer to read and should thus receive more exposure time. The following formula is used to calculate the exposure time for content adaptation (Eq. (2)):

\[ \mathit{time1} = \frac{\mathit{nwrd} + \mathit{nchr}}{\mathit{davg} \cdot \frac{\mathit{wpm}}{60}} \tag{2} \]

The formula uses the number of words (nwrd) and the number of characters (nchr) as a basis for the result. The two arguments are added and divided by the product of the average word length including delimiters (davg) and the currently set speed in words per minute (wpm) divided by 60. The result is a variable exposure time (time1) depending on the content of the current text chunk. The effect of using content adaptation compared to fixed exposure times can be visualized in a speed–exposure plot (Fig. 2).

Fig.
2 Variations in reading speed for individual text chunks containing 1, 2, 3, 4, or 5 words (the curved dotted lines from left to right), presented at a set speed of 300 wpm (indicated by the vertical arrow), when using content adaptation (data derived from the training text).

The plot is a result of presenting the training text at a constant speed of 300 wpm (vertical arrow in Fig. 2), but this time the formula for content adaptation is used instead (Eq. (2)). The average word length including delimiters was set to 7.8 whereas the other variables were the same as those used for Fixed RSVP. It is worth noting that even though the exposure times now vary, the variation in reading speed is actually smaller for content adaptation than for Fixed RSVP. When using content adaptation the exposure time for each text chunk is also directly related to the number of words and characters it contains. This approach is assumed to decrease cognitive demand while reading since the relation between conveyed information and time for digestion is more natural.

3.2 Context adaptation

In context adaptive mode the exposure time for each text chunk is based on the following: the result of content adaptation, the word frequencies of the words in the chunk, and the position of the chunk in the sentence being exposed. To begin with, each word in the chunk is looked up in a lexicon with word frequencies. If the word is common it receives a weight lower than one, and if it is rare or not in the lexicon it receives a weight higher than one. The following formula is used to calculate how the exposure time is affected by the word frequencies (Eq.
(3)):

\[ \mathit{time2} = \mathit{time1} \cdot \frac{\sum_{i=1}^{\mathit{nwrd}} \mathit{wfrq}_i}{\mathit{nwrd}} \tag{3} \]

The formula uses the exposure time for content adaptation (time1) and the word frequency weights for the words in the chunk (wfrq) as a basis for the result. The word frequency weights are added and divided by the number of words in the text chunk (nwrd). The resulting mean weight is then multiplied by the content adaptive exposure time to get the weighted exposure time (time2). The next step is to give the chunk less exposure time if it appears at the beginning of a sentence and more if it appears at the end. The following formula is used to calculate the text chunk exposure time depending on the position in and the length of the current sentence (Eq. (4)):

\[ \mathit{time3} = \frac{\mathit{time2} + \mathit{time2} \cdot \tanh\left(\frac{\mathit{swrd}}{\mathit{savg}}\right)}{2} \tag{4} \]

The formula uses the intermediary exposure time reached earlier (time2), the number of words in the sentence exposed so far (swrd) and the average sentence length (savg). In order to get a smooth drop-off in speed along the sentence, a mean is calculated of the previously calculated exposure time and its product with the hyperbolic tangent (tanh) of the number of exposed words divided by the average sentence length. The result is a varying text chunk exposure time (time3). The effect of using context adaptation is illustrated with a speed–exposure plot (Fig. 3).

Fig. 3 Variations in reading speed for individual text chunks containing 1, 2, 3, 4, or 5 words (the curved dotted lines from left to right) presented at a set speed of 300 wpm (indicated by the vertical arrow), when using context adaptation (data derived from the training text).

The plot is a result of presenting the training text at 300 wpm (vertical arrow in Fig. 3) when using context adaptation (Eq.
(4)). The average sentence length was set to 11.5 words and the word frequency weights ranged between 0.6 and 1.2. A lexicon with frequencies for the 10,000 most common words in a corpus of 11.9 million words (Press 97) was used to assign the weights according to a lognormal distribution (Öquist, 2001). Context adaptation causes the largest variations in exposure times, but is still assumed to decrease task load since the variations are supposed to match the actual cognitive demand while reading better. In an experiment with an approach similar to Adaptive RSVP, Castelhano and Muter (2001) found that the introduction of punctuation pauses within sentences was significantly favoured compared to reducing common word exposure times. These findings are, however, not necessarily pertinent to this evaluation since a combination of varying exposure times and punctuation pauses was used here. The number of words with reduced exposure times was also very small in the Castelhano and Muter evaluation (11 words).

4 The Bailando prototype

In order to evaluate the RSVP algorithms they had to be incorporated into a mobile device. Bailando is the resulting prototype, developed at Ericsson Research's Usability & Interaction Lab. Bailando is an acronym for Better Access to Information through Linguistic Analysis and New Display Organization. Bailando was developed for the Compaq iPAQ 3630 Pocket PC, a small PDA with a touch sensitive high-resolution colour display. The iPAQ offers far more screen space than the average mobile device, but it still shares many of the properties typical for smaller devices. The size and weight of the device were similar to those of a pocket book of approximately 250 pages (Fig. 4).

Fig. 4 The Bailando prototype running in reading mode on a Compaq iPAQ 3630.
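As a recap of what the prototype has to compute, the adaptive exposure times of Section 3 can be sketched as below. This is a minimal illustration based on the prose descriptions of Eqs. (2) to (4), not the Bailando implementation: the function names, the toy lexicon, the choice to count only non-space characters as nchr, and the choice to let swrd include the words of the current chunk are all assumptions.

```python
import math

DAVG = 7.8            # average word length incl. delimiters (value reported for Swedish)
SAVG = 11.5           # average sentence length in words
DEFAULT_WEIGHT = 1.2  # weight for words missing from the lexicon


def content_time(nwrd: int, nchr: int, wpm: float, davg: float = DAVG) -> float:
    """Content adaptive exposure time in seconds, following Eq. (2)."""
    return (nwrd + nchr) / (davg * wpm / 60.0)


def frequency_time(time1: float, weights: list[float]) -> float:
    """Scale time1 by the mean word frequency weight of the chunk, following Eq. (3)."""
    return time1 * sum(weights) / len(weights)


def context_time(time2: float, swrd: int, savg: float = SAVG) -> float:
    """Mean of time2 and time2 * tanh(swrd / savg), as Eq. (4) is described in the prose."""
    return (time2 + time2 * math.tanh(swrd / savg)) / 2.0


def exposure_time(chunk: str, swrd: int, wpm: float, lexicon: dict[str, float]) -> float:
    """Context adaptive exposure time for one chunk; swrd counts words shown so far."""
    words = chunk.split()
    nchr = sum(len(word) for word in words)  # assumption: delimiters are not counted
    weights = [lexicon.get(word.lower(), DEFAULT_WEIGHT) for word in words]
    time1 = content_time(len(words), nchr, wpm)
    time2 = frequency_time(time1, weights)
    return context_time(time2, swrd)


# Hypothetical lexicon weights, for illustration only.
toy_lexicon = {"the": 0.6, "small": 0.9}
print(f"{exposure_time('the small screen', swrd=3, wpm=300, lexicon=toy_lexicon):.3f} s")
```

Keeping the three steps as separate functions mirrors the way the text builds time3 from time2 and time1, and makes it easy to switch between content and context adaptation.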
4.1 System design

Exact timing in the application was important since the differences in exposure time when using Adaptive RSVP are quite small. The Compaq iPAQ is a Windows 32-bit platform; the available programming languages were primarily Visual Basic and Visual C++. It might have been possible to use Java as well, but there were just too many questions around the performance to make that choice feasible. In the end, Embedded Visual C++ was chosen as a programming environment since it offered reliable real-time capabilities. Bailando was implemented as a multi-threaded application with one thread to keep control over the graphical user interface, one thread to parse and process the text, and one thread to spawn processes for additional media playback. Widgets and the threads were implemented as objects whereas functions, like redrawing the screen or calculating the text exposure time, were implemented as methods.

4.2 Interaction design

It was important that the graphical interface was appealing and yet intuitive to use. It had to give a professional impression since it was supposed to be compared to other professional applications for traditional text presentation. Since ample screen space was available on the iPAQ, all the application controls were implemented in the graphical user interface. This also makes the Bailando software easier to run on other PDAs since button assignments differ between devices. In order to get a feel for how Bailando works, a step-by-step walkthrough will be presented, with design choices and explanations given along the way. The first thing that happens when Bailando is launched is that a start screen is shown (Fig. 5, left). The main interaction control of the application is the toolbar. It is always located at the bottom of the screen so that the hand does not obscure the screen while it is used. It always has the same appearance although unselectable items are greyed out.
As can be seen on the start screen, the only item selectable on the toolbar is the Menu.

Fig. 5 Bailando start screen (left) and the library menu (right).

When the Menu item is selected, a pop-up main menu is shown. From the main menu, it is possible to exit the application, to get information about the version, to change settings, and to retrieve texts from the library. If the library alternative is selected the library screen is shown. The Bailando application supports files either in a dedicated XML (eXtensible Mark-up Language) format or in plain text. A text is selected from the library by selecting it in the list (the file ‘Glöd’ is selected in Fig. 5, right). Once a text is selected in the library, the processing of the file starts. First, the file is checked to see if it can be viewed in the Bailando prototype at all. After that, the data format is determined. If it is an XML file it is parsed and the meta-data is stored as variables; if it is a plain text file some of the file properties are picked up and stored (i.e. filename and type). The next step is to analyse the text: the number of characters, words, long words and sentences are calculated in order to determine the LIX rating (Björnsson, 1968), a readability measure developed for Swedish that is comparable to the Flesch index for English (Tekfi, 1987). While the text is being processed the main reading screen is shown (Fig. 6), but not until the text is processed completely are all the toolbar items selectable. The complete toolbar contains the controls to start, pause, and resume the presentation. The presentation can also be paused and resumed by touching anywhere on the text presentation area of the screen. It is important that it is easy to make pauses since the user is likely to need to do that fast when interrupted in order not to get lost.
This important feature of the interface is introduced to the user by the following instruction when the text is ready to be viewed: ‘tap the screen to begin’.

Fig. 6 Bailando main screen in reading mode (left) and pause mode (right).

In Bailando, the text is presented in an area located slightly above the middle of the screen. The text is presented on one single line that utilizes two thirds of the screen width and the vertical alignment is similar to the text presentation area as a whole. Above the text presentation area there is a border for aesthetic reasons. Below the text presentation area there is an information area displaying the text title, the progress bar, and the current speed settings (Fig. 6). The progress bar is included in order to support memory of spatial location while reading; a completion meter has been found to increase the user preference for RSVP in a previous evaluation (Rahman and Muter, 1999). While reading, the toolbar shows the Pause button and the coloured part of the background is green (Fig. 6, left). If the user feels they missed or did not understand a presented text window it is possible to go backwards (<<). It is also possible to skip text by going forward (>>). These orientation features are included to make it easier for the user to browse through the text. It is also possible to browse through the text step-by-step in paused mode. The reading speed can be easily decreased (−) or increased (+) with the speed control buttons in steps of 10 wpm. When the text presentation is paused, the toolbar shows the Read button and the background is then also turned red as an extra affordance (Fig. 6, right).

Fig.
7 Bailando text information screen (left) and settings menu (right).

When a text is loaded the book item on the toolbar also becomes selectable. The idea was to add bookmark and table of contents functionality to this menu, but in the present version of Bailando this menu only lets the user choose to go to the beginning of the text, to the point furthest read in the text, or to view text information. The text information screen contains the information collected during the initial processing of the text (Fig. 7, left). Apart from showing the text's title, type, and metrics, it also displays a readability rating (based on the LIX value for Swedish). The suggested reading speed is calculated on the basis of what and how fast the reader has read earlier. The last screen-shot from Bailando shows the settings options available from the menu (Fig. 7, right). The only setting variable that could be changed through the GUI in the version used in the evaluation was the adaptation mode.

4.3 RSVP design

The RSVP display window was 25 characters wide with the text presented left justified in a 10-pt. sans-serif typeface. The window width was chosen on the basis of the results from a previous evaluation where a width of 25 characters gave the best results (Goldstein et al., 2001; Sicheritz, 2000). Font size has in previous evaluations been found to have a minor effect on RSVP (Russell and Chaparro, 2001), and 10-pt. therefore seemed good enough for our purposes. Bailando supports all three forms of RSVP that have been described in this paper. In all RSVP modes, the text chunks that contained punctuation marks received an addition of 250 ms to their exposure times and all words longer than the display width were hyphenated. In Fixed RSVP mode, a blank screen was inserted for 250 ms between each sentence.
In Content and Context Adaptive RSVP, a blank screen was inserted if there was not enough space to begin on the next sentence in the same window; otherwise a delay was added to the sentence boundary instead. Table 1 offers a summary of the variables used by Bailando for text presentation:

Table 1 Variables used by Bailando for RSVP with inapplicable values left blank

Variable                                          Fixed   Content   Context
Average characters displayed (n chars)            21      –         –
Average word length (n chars)                     7       –         –
Average word length incl. delimiters (n chars)    –       7.8       7.8
Average sentence length (n words)                 –       –         11.5
Default frequency weight (n)                      –       –         1.2
Blank window exposure time (ms)                   250     250       250
Sentence boundary delay (ms)                      –       250       250
Punctuation mark delay (ms)                       250     250       250

4.4 Distribution model

The discussion around the distribution model originated from the question of whether the texts were supposed to be processed on the client or on a server. The reason for raising this question is that mobile clients usually have limited processing power and that linguistic processing can become quite demanding. Still, if a server was used for processing, the adaptation could be made much more advanced, including parsers for elaborate segmentation and linguistic processing. The server approach would also make it possible to update the adaptation capabilities continuously without having to reprogram the clients. On the other hand, it is also appealing to have independent clients that can take a text, process it, and present it without being connected to a server. There are merits and pitfalls with both approaches and in the end a combination of both was chosen. In the distribution model applied here, the linguistic processing can be done on both the server and the client. The server is supposed to perform advanced linguistic processing whereas the client is supposed to do simple linguistic processing if no server is available or needed. This approach requires some form of intermediary document format in order to transfer the meta-information about the text. That format should ideally also be able to transfer information about the document structure and pointers to additional resources added to the text (e.g. sounds and bookmarks). The Adaptive RSVP formulas developed for this study are quite simple and can thus be considered as client processing formulas.
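To illustrate how light such client-side processing can be, the window segmentation and punctuation delay from Section 4.3 might be sketched as follows. The text does not say how Bailando packs words into windows, so the greedy packing, the set of punctuation marks, and the function names here are assumptions; sentence boundary handling (blank windows or delays) is left out.

```python
WINDOW_WIDTH = 25         # characters, as in Table 1's display window
PUNCTUATION_DELAY_MS = 250
PUNCTUATION = ",.;:!?"    # assumption: the marks that trigger the extra delay


def split_into_chunks(text: str, width: int = WINDOW_WIDTH) -> list[str]:
    """Greedily pack whole words into windows of at most `width` characters."""
    chunks: list[str] = []
    current = ""
    for word in text.split():
        # Words longer than the window are hyphenated across several windows.
        while len(word) > width:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[: width - 1] + "-")
            word = word[width - 1:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= width:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def extra_delay_ms(chunk: str, delay_ms: int = PUNCTUATION_DELAY_MS) -> int:
    """Additional exposure time for chunks that contain a punctuation mark."""
    return delay_ms if any(char in PUNCTUATION for char in chunk) else 0


sample = "Readability on small screens can be improved, perhaps."
for chunk in split_into_chunks(sample):
    print(f"{chunk!r:28} +{extra_delay_ms(chunk)} ms")
```

A server-side segmenter could replace `split_into_chunks` with linguistically informed chunking while the client keeps the same delay logic.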
A document mark-up schema for RSVP was designed in the XML format in order to facilitate the addition of external resources and encourage further development of the server side processing. The following excerpt from an XML tagged file will serve as an example of what sort of information an RSVP document might contain (Fig. 8).

Fig. 8 An example of a brief XML tagged RSVP file.

4.5 Sonification features

One of the less explored properties of the RSVP format is its capability to present other media simultaneously with exact timing with respect to the text being read. The Bailando prototype supports Sonified RSVP, which means that it can play sounds while it presents the text. This is accomplished by parsing XML tags in the text. When Bailando finds a start-sound tag (Fig. 8) in the text chunk to be displayed it plays the audio file linked to the tag. When a stop-sound tag is encountered, all playing sounds are stopped. Thus, exact synchronization between word and sound is attainable and works at any reasonable reading speed. The sonification features of Bailando have been evaluated in a user study, which showed that the addition of sound to text displayed via RSVP actually could improve the reading experience (Goldstein et al., 2002). More specifically it was found that the rating of perceived immersion was significantly higher when nomic auditory icons were played simultaneously with the text presentation. The addition of a more elaborate soundtrack to strengthen the dramatic presence of the text has been suggested as a possibility to further enhance the reading experience on small screens (Goldstein et al., 2002).

5 The usability evaluation

The aim of the evaluation was to see how traditional text presentation, Fixed RSVP, Content Adaptive RSVP and Context Adaptive RSVP affected the ability to read on a mobile device.
It was important that the same device was used for all conditions, since the look and feel of the hardware was likely to bias the assessment. Long and short texts were included in the evaluation since the experience of reading for an extended time was thought to differ from that of reading for a brief time.

5.1 Method

In order to assess the effects of reading long and short texts using the four presentation formats, a repeated-measurement experimental layout was adopted. The following null hypotheses were set for reading long and short texts: no difference in Reading speed; no difference in Comprehension; no difference in Task load. The hypotheses were tested in the SPSS V10.0 software using the repeated-measurement General Linear Model (GLM). The significance level was set to 5% and the multiple comparisons were Bonferroni adjusted.

Design. A within-subject Latin-square design was employed. Four experimental conditions were formed where each subject read one long and one short text using each presentation format. The combinations of long (A–D) and short (a–d) texts were fixed, creating four text pairs Aa, Bb, Cc and Dd (Table 1). The text pairs were balanced against condition and order, generating 16 combinations. Each subject was randomly assigned to one of the 16 combinations.

Subjects. Sixteen paid subjects participated in the experiment. All were recruited on the criteria that they were fluent in Swedish and had a self-reported interest in reading. The subjects had a mean age of 25 and half of them were male. All stated that they were computer literate and seven had some previous experience of using a PDA. Nine of the subjects had corrected vision and two were left-handed.

Apparatus. For all conditions, a Compaq iPAQ 3630 was used. Although the iPAQ offers far more screen space than the average mobile device, it was chosen since it is easier to prototype on and still shares many of the properties typical of smaller devices.
Bailando was used for all RSVP conditions and the initial speed of the text presentation was always set to 250 wpm, but the subjects were allowed to alter the speed at any time. Two commercial programs were chosen for traditional text presentation: Microsoft Reader for long texts and Microsoft Internet Explorer for short texts. It would probably have been more experimentally sound to use a single program for all traditional text presentation, but it would not have been realistic. The foremost reason for including two different programs was their intended context of use; the MS Reader is custom-made to present longer texts such as e-books, whereas the MS Explorer is designed to present shorter web content such as news articles. The MS Reader uses page-turn buttons to move between the pages, whereas the MS Explorer uses both page-turn buttons and a scroll-bar. In addition, the MS Reader utilizes a legibility-enhancing technique called ClearType (Fig. 9).

Fig. 9. The interface of the Microsoft Reader for long texts (left) and the Microsoft Internet Explorer for short texts (right).

Texts. Four long fiction texts and four short news articles were chosen to be included in the experiment. One shorter fiction text was also used as a training text. Texts in Swedish with different readability ratings were chosen (Table 2). The readability rating was measured with LIX (Björnsson, 1968), a readability rating developed for Swedish texts that is comparable to the Flesch index for English (Tekfi, 1987).

Table 2. Texts used in the experiment

| Text | Title (in Swedish) | Author/source | Words | LIX |
|---|---|---|---|---|
| Training text | Glöd | Annette Kullenberg, Chapter 1 | 705 | 25 |
| Long text A | Röda rummet | August Strindberg, Chapter 3 | 4272 | 37 |
| Long text B | Nils Holgersson | Selma Lagerlöf, Chapters 1 and 2 | 4230 | 27 |
| Long text C | Valarnas sång | Wally Lamb, Chapter 1 | 4326 | 31 |
| Long text D | Bara Alice | Maggie O'Farrell, Chapter 1 | 4170 | 29 |
| Short text a | Makedonien | Dagens Nyheter 2001-06-03 | 430 | 44 |
| Short text b | Mellanöstern | Dagens Nyheter 2001-05-27 | 384 | 54 |
| Short text c | Alcala | Dagens Nyheter 2001-06-02 | 628 | 40 |
| Short text d | Lundin Oil | Dagens Nyheter 2001-05-18 | 365 | 49 |

Setting. The experiment took place in a dedicated usability lab outfitted with audio and video-recording facilities. While reading, the subject was seated in a comfortable chair in a room separated from the experimenter by a one-way mirror. Before the experiment started, each subject had some time to get acquainted with the facilities in order to create a relaxed, and consequently controlled, setting.

Instructions. Each subject received instructions before the experiment that pointed out that it was the applications, and not the individual performance, that were being tested. All were encouraged to ask questions whenever they wanted and were told that they could terminate the experiment at any time if they felt uncomfortable. Written instructions were administered before each session that described the principal features of the current user interface, what kind of text they were going to read, and how long it was likely to take. The subjects were particularly instructed to read at a pace as comfortable to them as possible.

Training. To begin with, each subject participated in two training sessions. In the first session, the subject read the training text using the Microsoft Reader and in the second session, the subject read the same text again using Bailando in Content Adaptive RSVP mode. The subjects did not train on using the MS Explorer as all had prior experience of using it on desktop computers.
The idea behind reading the same text twice was to give the subjects an early experience of success and make them more willing to experiment with the interface. After the training sessions, the subject was introduced to the questions in the inventories and filled them in.

Procedure. Four experimental conditions were administered and each condition was divided into two sessions. In the first session, the subject read a long text and filled in the inventories; in the second session, the subject read a short text and filled in the inventories. Between the first and second condition the subject had a 15-min break, and between the second and third condition the subject had a 45-min lunch break. Between the third and fourth condition the subject had a 15-min break again. The total participation time for each subject was around 5 h.

Inventories. After each experimental session, there were two inventories to fill in. The first inventory was a comprehension test made up of multiple-choice questions with three alternatives. For long texts there were 10 questions, and for short texts there were five. All questions occurred in the same order as the answers were found in the text. The second inventory was the NASA-TLX Task Load Index (Hart and Staveland, 1988), which was administered to check Mental, Physical, and Temporal demands, as well as Performance, Effort, and Frustration levels. The NASA-TLX Task Load Index inventory was chosen as a measure of cognitive demands since the results would then be comparable to a previous evaluation where the measure was used successfully (Goldstein et al., 2001; Sicheritz, 2000).

5.2 Results

All subjects completed the experiment and there were few problems with understanding what to do or how to do it. The subjects that were left-handed experienced some minor problems using the scroll-bar in MS Explorer as the hand sometimes obscured the screen. However, these subjects quickly resorted to using the page-turn buttons instead.
The presentation of the results is divided into three sections: Reading speed, Comprehension and Task load. Under each section, the null hypotheses set for long and short texts are tested.

Reading speed. Reading speed was calculated as words read per minute based on the total time it took for the subjects to read a text, including all kinds of interruptions such as pauses, regressions, speed changes, etc.

Long texts: Adaptive RSVP improved reading speed somewhat, but the null hypothesis of no difference in reading speed between the conditions when reading long texts could not be rejected (Table 3).

Table 3. Reading speed in words per minute (wpm) for long and short texts

| Condition | Long texts, Avg. | Long texts, Std. Dev. | Short texts, Avg. | Short texts, Std. Dev. |
|---|---|---|---|---|
| MS Reader/MS Explorer | 242 | 80.4 | 157 | 53.2 |
| Fixed RSVP | 249 | 58.5 | 212 | 46.5 |
| Content adaptive | 260 | 51.2 | 213 | 36.8 |
| Context adaptive | 258 | 79.5 | 203 | 43.9 |

Short texts: The null hypothesis of no difference in reading speed between the conditions when reading short texts was rejected since the main factor for reading speed was significant (F[3,45]=8.4, p=0.04).
Pair-wise comparisons revealed that all RSVP conditions increased reading speed significantly (p≤0.002) compared to using traditional text presentation with the MS Explorer (Table 3).

Comprehension. Comprehension was computed as the percentage of correctly answered multiple-choice questions. For long texts there were 10 questions and for short texts there were five.

Long texts: The null hypothesis of no difference in comprehension between the conditions when reading long texts could not be rejected. Content Adaptive RSVP showed the best results but the differences between the conditions were small (Table 4).

Table 4. Comprehension scores in percentage correct (%) for long and short texts

| Condition | Long texts, Avg. | Long texts, Std. Dev. | Short texts, Avg. | Short texts, Std. Dev. |
|---|---|---|---|---|
| MS Reader/MS Explorer | 73 | 19.6 | 70 | 21.9 |
| Fixed RSVP | 75 | 17.9 | 66 | 26.0 |
| Content adaptive | 76 | 17.5 | 59 | 19.9 |
| Context adaptive | 71 | 21.9 | 66 | 18.9 |

Short texts: The null hypothesis of no difference in comprehension between the conditions when reading short texts could not be rejected. MS Explorer gave the best result but the differences between the conditions were small (Table 4).

Task load. Task load was calculated as the percentage of millimetres to the left of the tick mark on a 120-mm scale.
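The three measures defined in this section are simple ratios, and a minimal sketch may make the definitions concrete. The function names are hypothetical and the code only restates the definitions given in the text; it is not the analysis procedure actually used, which was run in SPSS.

```python
def words_per_minute(words: int, total_seconds: float) -> float:
    """Reading speed: words read per minute over the total reading time,
    including pauses, regressions and speed changes."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return words / (total_seconds / 60.0)


def comprehension_percent(correct: int, questions: int) -> float:
    """Comprehension: percentage of correctly answered questions
    (10 questions for long texts, five for short texts)."""
    if not 0 <= correct <= questions:
        raise ValueError("correct must lie between 0 and questions")
    return 100.0 * correct / questions


def task_load_percent(tick_mm: float, scale_mm: float = 120.0) -> float:
    """Task load: distance from the left end of the scale to the tick mark,
    as a percentage of the 120-mm scale length."""
    if not 0 <= tick_mm <= scale_mm:
        raise ValueError("tick mark must lie on the scale")
    return 100.0 * tick_mm / scale_mm
```

For example, a tick mark placed 60 mm from the left end of the scale corresponds to a task load rating of 50%, and seven correct answers out of 10 give a comprehension score of 70%.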
The factors were not weighted against each other.

Long texts: The null hypothesis of no difference in task load between the conditions when reading long texts was rejected, as all main factors except Physical demand were significant (F[3,45]≥5.2, p≤0.014). Pair-wise comparisons revealed that the use of RSVP resulted in significantly higher (p≤0.014) task loads compared to using traditional text presentation with the MS Reader. Content Adaptive RSVP decreased task load ratings, and the only factor that was rated significantly higher compared to the MS Reader was Frustration level (p=0.002). Context Adaptive RSVP also decreased task load, but in a different way. The only significantly higher factor compared to the MS Reader was Temporal demand (p=0.001) (Fig. 10).

Fig. 10. NASA-TLX Task Load Index ratings for long and short texts with median, 25- and 75-percentile represented. Lower ratings are better.

Short texts: The null hypothesis of no difference in task load between the conditions when reading short texts could not be rejected (Fig. 10).

6 Discussion

That no significant differences were found between the RSVP formats indicates that the effects caused by adaptation were quite small. Nevertheless, when the results obtained for RSVP were compared to those for traditional text presentation, some significant differences were found. The discussion will primarily be based on these findings for reading speed and task load. Since no significant differences in reading speed were found for long texts, RSVP appears to be just as fast as traditional text presentation with the MS Reader. The lower reading speeds obtained for short texts are not very surprising, as news articles are generally harder to read (Björnsson, 1968; Tekfi, 1987).
However, the significant difference between using RSVP and the MS Explorer is very surprising, since RSVP increased reading speed by 33%. In previous studies RSVP has not been much faster than traditional text presentation, but these findings indicate that RSVP really can offer a significant increase in reading speed on mobile devices. The task load ratings obtained for Fixed RSVP and the MS Reader were close to identical to those obtained for Fixed RSVP and paper-book in the Goldstein et al. (2001) evaluation. This is surprising since the subjects now selected a comfortable reading speed. This may imply that the size of the assumed trade-off between reading speed and cognitive demand is small for the RSVP format, quite contrary to the size of the well-established speed-accuracy trade-off (Wickens, 1992). Thus, decreasing the RSVP reading speed by approximately 60 wpm when reading long texts does not decrease task load. Adaptive RSVP was supposed to decrease task load and it seems to have worked as expected for long texts. Compared to the MS Reader, the only factor significantly higher for Content adaptation was Frustration level. Probably some words were not exposed for a duration that matched the time needed for cognitive processing; it is, however, encouraging that even the most straightforward form of adaptation actually decreased task load. In Context adaptive mode, the only factor significantly higher than for the MS Reader was Temporal demand. A probable cause for this is that the variations in exposure time were too large. However, the relation between what was exposed and the time for exposure was probably sound, since the Frustration level decreased compared to Content adaptation. It seems that although the variations were too large, they probably occurred at the right places. Surprisingly, there were no significant differences in task load when reading short texts.
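The 33% figure quoted in this discussion can be checked against the short-text means in Table 3. The sketch below assumes that the figure compares the mean of the three RSVP condition averages with the MS Explorer average; the text does not state how it was derived, so this is one plausible reading, and the function name is hypothetical.

```python
from typing import Sequence


def percent_increase(baseline: float, values: Sequence[float]) -> float:
    """Percentage by which the mean of `values` exceeds `baseline`."""
    mean_value = sum(values) / len(values)
    return 100.0 * (mean_value - baseline) / baseline


# Short-text reading speed averages in wpm, taken from Table 3.
ms_explorer = 157
rsvp_conditions = [212, 213, 203]  # Fixed, Content adaptive, Context adaptive

increase = percent_increase(ms_explorer, rsvp_conditions)
print(f"{increase:.1f}%")  # prints "33.3%"
```

Under that assumption the three RSVP formats average about 209 wpm against 157 wpm for the MS Explorer, which reproduces the reported increase of roughly one third.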
When RSVP was used, the task load ratings were almost equal to those for the MS Explorer although the reading speed was 33% higher. This confirms that traditional text presentation guarantees neither low task load nor high reading speed, and that RSVP actually can improve readability on mobile devices.

7 Conclusions

That RSVP gave the best results for short texts in this evaluation is encouraging, since the typical text read on a mobile device is likely to be short. Considering that RSVP can be used effectively on mobile devices with much smaller screens than the one used in this evaluation, it is also encouraging that it is possible to read longer texts effectively. However, the major drawback of the RSVP format appears to be the high cognitive demand placed on the subjects (Goldstein et al., 2001; Castelhano and Muter, 2001; Sicheritz, 2000). An increase in task load may actually be inherent to the RSVP format, since it evidently remains high independent of reading speed. Therefore, the most important finding in this evaluation is that task load can be decreased by using adaptation. Both adaptive algorithms were found to decrease task load for most factors, which suggests that better adaptation may decrease task load even further. There is really no reason to use RSVP when traditional text presentation can be used efficiently. In this evaluation, RSVP was found to be just as effective as the MS Reader but also significantly more demanding. However, when traditional text presentation becomes ineffective it seems to become more demanding as well. The MS Explorer was found to be just as demanding to use as RSVP but much slower. It is probably when traditional text presentation becomes ineffective, as it is on most mobile devices of today, that RSVP can offer a real improvement in readability. A slight increase in task load may then also be acceptable if it is compensated by an increase in efficiency, particularly since time often equals money in the mobile context.
Therefore, since adaptation evidently decreases task load, Adaptive RSVP can be seen as one small step towards an improved readability on mobile devices.

Acknowledgements

The authors would like to thank Staffan Björk and Peter Ljungstrand at the PLAY group of the Interactive Institute for rewarding discussions around the novel concepts put forward in this paper.

References

Björnsson, C.H., 1968. Läsbarhet. Liber, Stockholm, Sweden.
Bruijn, O., Spence, R., 2000. Rapid Serial Visual Presentation: A space-time trade-off in information presentation. In: Gesù, V.D., Levialdi, S., Tarantino, L. (Eds.), Proceedings of Advanced Visual Interfaces, AVI'00. ACM Press, New York, pp. 196–201.
Castelhano, M.S., Muter, P., 2001. Optimizing the reading of electronic text using rapid serial visual presentation. Behaviour and Information Technology 20 (4), 237–247.
Duchnicky, R.L., Kolers, P.A., 1983. Readability of text scrolled on visual display terminals as a function of window size. Human Factors 25, 683–692.
Ericsson, T., Chincholle, D., Goldstein, M., 2001. Both the device and the service influence WAP usability. In: Vanderdonckt, J., Blandford, A., Derycke, A. (Eds.), Usability in Practice, vol. II. Cépaduès-Editions, Toulouse, France, pp. 79–85.
Forster, K.I., 1970. Visual perception of rapidly presented word sequences of varying complexity. Perception and Psychophysics 8, 215–221.
Goldstein, M., Sicheritz, K., Anneroth, M., 2001. Reading from a small display using the RSVP technique. Proceedings of Nordic Radio Symposium, NRS'01 (Full paper available on CD-ROM only).
Goldstein, M., Öquist, G., Björk, S., 2002. Evaluating Sonified Rapid Serial Visual Presentation: An immersive reading experience on a mobile device. Proceedings of User Interfaces for All, UI4ALL'02 (To appear in the Springer series Lecture Notes in Computer Science).
Hart, S.G., Staveland, L.E., 1988. Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research. In: Hancock, P.A., Meshkati, N. (Eds.), Human Mental Workload. North-Holland, Amsterdam, pp. 139–183.
Huey, E.B., 1908. The Psychology and Pedagogy of Reading (Republished 1968). MIT Press, Cambridge, MA.
Juola, J.F., Ward, N.J., McNamara, T., 1982. Visual search and reading of rapid serial presentations of letter strings, words, and text. Journal of Experimental Psychology: General 111, 208–227.
Juola, J.F., Tiritoglu, A., Pleunis, J., 1995. Reading text presented on a small display. Applied Ergonomics 26, 227–229.
Just, M.A., Carpenter, P.A., 1980. A theory of reading: From eye fixations to comprehension. Psychological Review 87 (4), 329–354.
Kang, T.J., Muter, P., 1989. Reading dynamically displayed text. Behaviour and Information Technology 8 (1), 33–42.
Masson, M.E.J., 1983. Conceptual processing of text during skimming and rapid sequential reading. Memory and Cognition 11, 262–274.
Mills, C.B., Weldon, L.J., 1987. Reading text from computer screens. ACM Computing Surveys 19 (4), 329–358.
Muter, P., 1996. Interface design and optimization of reading of continuous text. In: van Oostendorp, H., de Mul, S. (Eds.), Cognitive Aspects of Electronic Text Processing. Ablex, Norwood, NJ, pp. 161–180.
Muter, P., Maurutto, P., 1991. Reading and skimming from computer screens and books: The paperless office revisited? Behaviour and Information Technology 10 (4), 257–266.
Öquist, G., 2001. Adaptive Rapid Serial Visual Presentation. Master's Thesis, Department of Linguistics, Uppsala University, Sweden. Retrieved January 21, 2003, from http://www.ling.uu.se.
Potter, M.C., 1984. Rapid Serial Visual Presentation (RSVP): A method for studying language processing. In: Kieras, D.E., Just, M.A. (Eds.), New Methods in Reading Comprehension Research. Erlbaum, Hillsdale, NJ, pp. 161–180.
Rahman, T., Muter, P., 1999. Designing an interface to optimize reading with small display windows. Human Factors 41 (1), 106–117.
Robeck, M.C., Wallace, R.R., 1990. The Psychology of Reading: An Interdisciplinary Approach, 2nd ed. Erlbaum, Hillsdale, NJ.
Russell, M.C., Chaparro, B.S., 2001. Proceedings of the Human Factors and Ergonomics Society 45th Annual Meeting. Human Factors and Ergonomics Society, Minneapolis, MN, pp. 640–645.
Sicheritz, K., 2000. Applying the Rapid Serial Presentation Technique to Personal Digital Assistants. Master's Thesis, Department of Linguistics, Uppsala University, Sweden. Retrieved January 21, 2003, from http://www.ling.uu.se.
Tekfi, C., 1987. Readability formulas: An overview. Journal of Documentation 43 (3), 261–273.
Wickens, C.D., 1992. Engineering Psychology and Human Performance, 2nd ed. Harper Collins, New York, NY.

Author notes 1 Tel.: +46-8-757-3679. © 2003 Elsevier B.V. All rights reserved. TI - Towards an improved readability on mobile devices: evaluating adaptive rapid serial visual presentation JF - Interacting with Computers DO - 10.1016/S0953-5438(03)00039-0 DA - 2003-08-01 UR - https://www.deepdyve.com/lp/oxford-university-press/towards-an-improved-readability-on-mobile-devices-evaluating-adaptive-738mz1g4MJ SP - 539 EP - 558 VL - 15 IS - 4 DP - DeepDyve ER -