Predicting user adherence to behavioral eHealth interventions in the real world: examining which aspects of intervention design matter most

Abstract

Existing frameworks have identified a range of intervention design features that may facilitate adherence to eHealth interventions; however, empirical data are lacking on whether intervention design features can predict user adherence in the real world, where the public accesses available tools, and whether some design aspects of behavioral eHealth interventions are more important than others in predicting adherence. This study examined whether intervention design qualities predict user adherence to behavioral eHealth interventions in real-world use and which qualities matter the most. We correlated the online activities of users of 30 web-based behavioral interventions, collected from a proprietary data set of anonymized logs from consenting users of a Microsoft Internet Explorer add-on, with the interventions' quality ratings obtained by trained raters prior to the empirical examination. The quality ratings included: Usability, Visual Design, User Engagement, Content, Therapeutic Persuasiveness (i.e., persuasive design and incorporation of behavior change techniques), and Therapeutic Alliance. We found Therapeutic Persuasiveness to be the most robust predictor of adherence (i.e., duration of use, number of unique sessions; .40 ≤ rs ≤ .58, ps ≤ .005), explaining 42% of the variance in user adherence in our regression model. Results indicated up to a sixfold difference, based on Therapeutic Persuasiveness, in the percentage of users who utilized the interventions for more than a minimum number of sessions and amount of time. Findings suggest the importance of incorporating persuasive design and behavior change techniques during the design and evaluation of digital behavioral interventions.

Implications

Practice: Findings suggest that criteria-based ratings, mainly those focused on the quality of persuasive design and behavior change principles, can help clinicians and health providers identify the most promising programs in terms of user engagement in real-world contexts.

Research: Future research is needed to examine whether criteria-based evaluation can also predict intervention efficacy, and how the multimodal relationships between different intervention design qualities affect engagement and efficacy metrics.

Policy: If quality ratings consistently show validity in predicting user adherence and intervention efficacy, such ratings can support the decision processes of different stakeholders when evaluating an intervention's potential prior to empirical trials.

INTRODUCTION

The potential reach of public access to behavioral health interventions has changed dramatically over the past 30 years. Tens of thousands of health, wellness, and medical applications are now available for download from online stores [1], and eHealth interventions are expected to play a substantial role in shaping the future of health care [2]. The more behavioral health interventions move from traditional to digital platforms [3], the more the focus shifts to individuals, who can now engage in self-care around the clock, outside of traditional health care settings [4–6]. However, individuals' engagement must compete with other events in their daily lives, making poor adherence a common issue [7–10].
As a result, few individuals engage proactively with mobile applications or websites across the behavior change spectrum for more than a couple of weeks without therapist contact [11].

Existing frameworks have identified a range of intervention design features that may influence user satisfaction and facilitate user adherence and behavior change [12–14]. In particular, systematic reviews examining the intervention design of empirical trials have suggested that adherence [15] and program efficacy [16] can be increased by embedding persuasive design and behavior change techniques within the digital intervention. However, empirical data are lacking on whether intervention design features can predict user adherence "in the wild," where the public accesses available tools, and whether some design aspects of digital behavioral interventions are more important than others in facilitating engagement.

One particular challenge in gathering such data revolves around the need to record user behaviors in different interventions using the same analytical framework; this approach enables the researcher to document variance in adherence across the spectrum of examined interventions. Although traditional study settings do not allow for comparison between a large set of intervention designs within the same empirical trial, this is feasible in real-world use of digital interventions. Moller et al. [3], for example, suggested that scholars can refine theories and understand processes and outcomes of behavioral eHealth interventions by "leveraging the proliferation of real-time objective measurement and big data commonly generated and stored by digital platforms." Another advantage of using commonly generated data is that, unlike traditional study settings in which subjects are proactively recruited, attend check-in appointments, and are paid to fill out assessments, this evaluation method does not interfere with natural user behaviors.

A second notable challenge revolves around the ability to reliably rate key intervention qualities (e.g., usability, use of behavior change principles) for each of the examined interventions. Various quality aspects influence the design, development, and implementation of behavioral health interventions in different ways. Therefore, examining these aspects with respect to user engagement may reveal the importance of accounting for them during the design and pre-empirical evaluation of programs, as is common in other areas of website development [17].

In this study, we sought to address these challenges and to examine how different design aspects of behavioral eHealth interventions affect user adherence. We collected data on user activity in a large number of web-based behavioral interventions and combined these with the interventions' previously rated quality scores. The quality scores were generated using the Enlight tool, which provides a comprehensive set of separate quality ratings on topics ranging from usability to persuasive design/behavior change concepts [18] (see the Methods section for more detail). We aimed to assess the correlation between the two data sets and to estimate the impact of relevant quality ratings on user adherence. To the best of our knowledge, this is the first study to use a large-scale data set of online user behaviors to evaluate the impact of different intervention design features on adherence in real-world use.
METHODS

Web-based eHealth intervention quality ratings

We used 42 web-based eHealth interventions available to the public, which were previously selected and scored for their quality during the development of Enlight [18]. The selection of interventions followed the PRISMA statement guidelines and involved the random selection of a sample of available behavioral eHealth interventions that were free to use and could be found through popular search engines. The clinical aims of the selected interventions spanned the behavioral health domain and could be grouped into interventions targeting health-related behaviors (e.g., diet, physical activity, smoking, and alcohol cessation) and those targeting mental health (depression, anxiety, well-being) [18].

Enlight is a comprehensive suite of criteria-based measurements developed through a rigorous process of content analysis, grouping, and classification of 476 identified criteria items by a multidisciplinary team. The tool covers six product quality domains: Usability, Visual Design, User Engagement, Content, Therapeutic Persuasiveness (i.e., persuasive design and incorporation of behavior change techniques), and Therapeutic Alliance (see Supplementary Appendix 1 for a detailed description of the scale and operational definitions of all items). Each quality domain score ranges from 1 to 5 and is based on averaging the criteria ratings produced by two independent trained raters on a Likert scale (ranging from 1 to 5) following program examination. For example, Therapeutic Persuasiveness assesses the extent to which the program is designed to encourage users to make positive behavior changes or to maintain positive aspects of their lives. It is calculated by averaging the raters' scores on the following criteria items: call to action (e.g., goal setting, prompting goals, encouragement to complete goals), load reduction of activities (e.g., set graded tasks), therapeutic rationale and pathway (e.g., reasoned action, providing information about the behavior–health link), rewards (e.g., contingent rewards), real data-driven/adaptive content (e.g., self-monitoring of behavior), ongoing feedback, and expectations and relevance. The quality domains exhibited excellent inter-rater reliability (intraclass correlations = .77–.98, median .91) and internal consistency between domain items (Cronbach's alphas = .83–.90, median .88) [18].

Behavioral data

Information on user behaviors (i.e., behavioral variables) was collected from a proprietary data set of anonymized logs from consenting users of a widely distributed add-on toolbar for the Microsoft Internet Explorer web browser [19]. These data were gathered over a 12-month period, from January 1 to December 31, 2015. Each log entry contained an anonymized user identifier, the time and date of the record, and the URL that the user visited. This study was deemed exempt by the Microsoft Institutional Review Board. The logs were screened to include only those URLs associated with the 42 rated intervention programs, and we documented only the URL pages relevant to the intervention itself. We further filtered the data to include only those users who reached a program's login or registration page and visited the program pages on at least 2 different days, to exclude users who never actually accessed the intervention program itself; a minimal sketch of this screening step follows.
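To make the screening rules concrete, the sketch below illustrates one way they could be applied. The log schema (user_id, url, timestamp), file name, and URL patterns are illustrative assumptions; the study's actual pipeline is not public.

```python
# Minimal sketch of the log-screening rules described above.
# The schema (user_id, url, timestamp) and URL patterns are hypothetical.
import pandas as pd

PROGRAM_PATTERNS = r"programA\.example\.org|programB\.example\.org"  # hypothetical

logs = pd.read_csv("toolbar_logs.csv", parse_dates=["timestamp"])

# 1. Keep only entries for pages belonging to the rated intervention
#    programs, and label each entry with its program.
logs = logs[logs["url"].str.contains(PROGRAM_PATTERNS)].copy()
logs["program"] = logs["url"].str.extract(r"(programA|programB)", expand=False)

# 2. Keep only users who reached a login or registration page.
registered = logs.loc[logs["url"].str.contains("login|regist"), "user_id"].unique()
logs = logs[logs["user_id"].isin(registered)]

# 3. Keep only users who visited program pages on at least 2 distinct days.
days = logs.groupby("user_id")["timestamp"].apply(lambda t: t.dt.date.nunique())
logs = logs[logs["user_id"].isin(days[days >= 2].index)]
```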
We also excluded websites with fewer than 10 users who passed this filter (i.e., utilized the intervention), in order to ensure statistically valid results. This left a total of 30 eHealth programs for analysis (see Supplementary Appendix 2 for a complete list of included programs).

In order to examine user adherence, we retrieved the number of unique sessions and the total time spent using each program [20, 21]. To avoid ambiguity, we defined a unique session (i.e., the difference between one session/visit and another) on the basis of the date the program was utilized; accordingly, different dates of utilization counted as different sessions. We did not examine the number of unique page views, because the ways in which content was embedded within the programs (i.e., the content architecture) varied [3]. For example, some programs utilized "active content," whereby users could receive and navigate content while technically remaining on the same URL, whereas other programs required users to navigate through a large number of web pages to achieve the same pattern of engagement.

We also defined a composite measure [21] to examine the percentage of users who utilized the intervention for more than a minimum number of sessions and hours (i.e., a threshold). This examination was designed to reflect the assumption that utilizing a digital behavioral intervention up to a certain threshold is sufficient to meet therapeutic aims, so programs do not receive better scores for extended utilization. By disregarding extended utilization, we also accounted for the possibility that heavy website browsing by a small number of users affected the results. To address the potential concern that different interventions might require different thresholds to successfully change user behaviors, we examined three composite measures incorporating three levels of thresholds: at least 3, 5, or 7 sessions with an accumulated time of 1.5, 3, or 5 hr, respectively (e.g., one composite measure counted the percentage of users who utilized the program for at least 7 different sessions and 5 hr of use).

Data analysis

For each program, we calculated the average number of unique sessions per user and the average duration of use. We used the median and interquartile range (IQR) to present the distribution of these variables. Means and standard deviations were used to describe the Enlight quality scores (which were based on averaging several items rated from 1 to 5 on a Likert scale). Spearman's correlations were used to examine the relationships between the variables. We conducted a sensitivity analysis to examine whether the pattern of correlations differed depending on the clinical aim of the website (i.e., health-related behaviors vs. mental health) and when only websites with a larger number of logged users (>100) were included. Differences between the correlations of these groups were tested using Fisher's Z-transformation. To account for the overlapping variance between quality domains in the prediction of behavioral variables, we calculated the partial correlations between each quality domain and the predicted variables while controlling for the effect of all other quality domains. Finally, a stepwise regression was applied to examine which variables best explain user engagement, without adding predictors that did not contribute significantly to the model.
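As a hedged illustration of the adherence measures and their correlation with quality ratings, the following sketch continues from the filtered logs above; the per-entry `seconds` duration column, the column names, and the scores file are assumptions for illustration.

```python
# Sketch: per-program adherence measures and a Spearman correlation with
# Enlight quality scores. Assumes the filtered `logs` from the previous
# sketch plus a hypothetical per-entry `seconds` duration column.
import pandas as pd
from scipy.stats import spearmanr

per_user = logs.groupby(["program", "user_id"]).agg(
    sessions=("timestamp", lambda t: t.dt.date.nunique()),  # unique dates = unique sessions
    hours=("seconds", lambda s: s.sum() / 3600.0),
)

def pct_meeting(min_sessions: int, min_hours: float) -> pd.Series:
    """Composite measure: % of each program's users at or above both thresholds."""
    ok = (per_user["sessions"] >= min_sessions) & (per_user["hours"] >= min_hours)
    return ok.groupby("program").mean() * 100

adherence = pd.DataFrame({
    "mean_sessions": per_user.groupby("program")["sessions"].mean(),
    "mean_hours": per_user.groupby("program")["hours"].mean(),
    "pct_3s_1p5h": pct_meeting(3, 1.5),
    "pct_5s_3h": pct_meeting(5, 3.0),
    "pct_7s_5h": pct_meeting(7, 5.0),
})

# Correlate one quality domain with one adherence measure (file name hypothetical).
quality = pd.read_csv("enlight_scores.csv", index_col="program")
rho, p = spearmanr(quality["therapeutic_persuasiveness"],
                   adherence.loc[quality.index, "mean_sessions"])
```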
RESULTS

The 30 web-based programs included in the analysis had a median of 110 users (IQR = 414); the program with the median number of sessions had an average of 4.31 unique sessions per user (IQR = 4.27); the program with the median duration of engagement had an average of 3.97 hr of use per user (IQR = 4.19). The mean quality scores and standard deviations of the website sample were as follows: Usability M = 3.21 (SD = 0.70); Visual Design M = 2.73 (SD = 0.80); User Engagement M = 2.81 (SD = 0.76); Content M = 3.56 (SD = 0.63); Therapeutic Persuasiveness M = 2.39 (SD = 0.58); and Therapeutic Alliance M = 2.44 (SD = 0.78).

Spearman's correlations between the study variables are presented in Table 1. A pattern of very strong positive correlations was found between the behavioral variables (i.e., the variables recording user adherence; rs > .85). The high correlations between these variables demonstrate that heavy website usage by outliers, which affects only the number of unique sessions and total engagement time (not the threshold variables), did not bias the results. Results point to a range of moderate to strong correlations between Therapeutic Persuasiveness and the behavioral variables (.40 ≤ rs ≤ .58). User Engagement showed a pattern of weak, positive correlations with the behavioral variables (.23 ≤ rs ≤ .35); however, only two correlations were significant (rs = .31, .35), while three approached significance (rs = .23, .24, .26; ps ≤ .11). Usability showed a pattern of weak, negative correlations with the behavioral variables; one correlation was significant (r = −.40), while the other four approached significance (−.29 ≤ rs ≤ −.26; ps ≤ .09). The sensitivity analysis revealed no significant differences in Spearman's correlations between clinical aims, or between the complete sample of websites (n = 30) and the subsample of websites with >100 consenting users (n = 17). The partial correlation analysis revealed significant positive correlations between Therapeutic Persuasiveness and all behavioral variables after controlling for all other quality domains (.48 ≤ rs ≤ .59; .002 ≤ ps ≤ .014). No other significant partial correlations were found.

Table 1 Spearman's Correlations Between Study Variables

                                 1.              2.              3.              4.              5.
Variables                        r      p        r      p        r      p        r      p        r      p
Adherence measures
 2. Total engagement time        .91    <.001
 3. % of users ≥3 & ≥1.5 hr      .86    <.001    .90    <.001
 4. % of users ≥5 & ≥3 hr        .97    <.001    .90    <.001    .90    <.001
 5. % of users ≥7 & ≥5 hr        .96    <.001    .90    <.001    .86    <.001    .96    <.001
Program quality scores
 Usability                       −.26   .08      −.40   .02      −.26   .09      −.29   .06      −.29   .06
 Visual Design                   −.02   .47      −.21   .13      −.18   .17      −.09   .32      −.06   .38
 User Engagement                 .35    .03      .24    .10      .23    .11      .31    .048     .26    .08
 Content                         .01    .48      −.01   .47      .07    .36      .01    .48      .03    .43
 Therapeutic Persuasiveness      .58    <.001    .46    .005     .40    .01      .51    .002     .48    .003
 Therapeutic Alliance            .11    .28      .02    .46      .10    .30      .10    .31      .07    .36

Column headers: 1 = number of unique sessions; 2 = total engagement time; 3–5 = % of users with ≥3 sessions & ≥1.5 hr, ≥5 sessions & ≥3 hr, and ≥7 sessions & ≥5 hr, respectively. Using Fisher's Z-transformation, no significant differences in Spearman's correlation values were found between mental health (n = 17) and health-related behaviors (n = 13) websites, or between the complete website sample and websites with >100 documented users (n = 17). While controlling for all other quality domains, positive partial correlations were found between Therapeutic Persuasiveness and the number of unique sessions (r = .58, p = .002), total engagement time (r = .57, p = .002), % of users ≥3 sessions & ≥1.5 hr (r = .48, p = .014), % of users ≥5 sessions & ≥3 hr (r = .59, p = .002), and % of users ≥7 sessions & ≥5 hr (r = .59, p = .002). No other significant partial correlations were found.
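As an aside, the subgroup comparison reported in the table note uses the standard Fisher r-to-z test for two independent correlations. The sketch below is a generic implementation; the r values in the usage line are placeholders, not subgroup figures reported in this study.

```python
# Fisher's Z-transformation test for comparing two independent correlations,
# as used in the sensitivity analysis above.
import math
from scipy.stats import norm

def compare_correlations(r1: float, n1: int, r2: float, n2: int):
    """Two-tailed test of H0: rho1 == rho2 for independent samples."""
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher's r-to-z transform
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))      # SE of the z difference
    z = (z1 - z2) / se
    return z, 2 * norm.sf(abs(z))                    # z statistic, two-tailed p

# Placeholder values for illustration only (not reported subgroup figures):
z_stat, p_value = compare_correlations(r1=0.55, n1=17, r2=0.60, n2=13)
```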
In light of the pattern of strong correlations found between Therapeutic Persuasiveness and the behavioral variables, we divided the websites into three groups based on their Therapeutic Persuasiveness percentile rank (0%–33.3%; 33.3%–66.6%; 66.6%–100%) in order to examine the median percentage of users meeting each of the thresholds in these groups (see Table 2). As can be seen in Table 2, websites with higher Therapeutic Persuasiveness scores had higher percentages of users who met the desired thresholds. In addition, the differences between website groups, in terms of the percentages of users meeting the desired thresholds, were larger at higher thresholds. For example, 19.2% of users in the lowest ranked Therapeutic Persuasiveness group met the lowest threshold compared with 36.6% of users in the highest ranked group (almost twice as many); moreover, only 2.8% of users in the lowest ranked group met the highest threshold compared with 16.2% of users in the highest ranked group (almost six times as many).

Table 2 The Proportion of Users Meeting Engagement Thresholds by Program Therapeutic Persuasiveness Ranks

                                  TP score      % of users         % of users       % of users
                                                ≥3 unique days     ≥5 unique days   ≥7 unique days
                                                & ≥1.5 hr          & ≥3 hr          & ≥5 hr
Lowest TP percentile (n = 9)      1.57 (0.36)   19.23 (19.73)      7.78 (10.09)     2.78 (8.49)
Middle TP percentile (n = 9)      2.29 (0.21)   32.98 (16.58)      15.37 (17.08)    12.82 (10.57)
Highest TP percentile (n = 12)    2.86 (0.57)   36.62 (35.46)      23.08 (31.63)    16.22 (29.48)

Values are medians (IQRs). TP = Therapeutic Persuasiveness.

The pattern of very strong correlations found between all engagement variables demonstrated that each variable could predict the others with a substantial degree of accuracy. Therefore, we examined the predictive potential of quality scores with respect to user adherence using one behavioral variable: the number of unique sessions.
We chose this variable because it descriptively had the strongest pattern of positive correlations with the other behavioral variables. A stepwise linear regression was applied using the number of unique sessions as the predicted variable and the different quality scores as the predicting variables. A log transformation of the predicted variable was applied to account for the non-normal distribution of the regression residuals (this transformation did not affect the Spearman's correlation results, since it did not change the rank order of the scores). As can be seen in Table 3, Therapeutic Persuasiveness was entered in the first step and accounted for 42% of the variance in the predicted variable; higher Therapeutic Persuasiveness scores contributed to a higher number of sessions. Usability was entered in the second step, accounting for an additional 12% of the variance; higher Usability scores contributed to a lower number of sessions.

Table 3 Standardized Regression Coefficients (Beta) Showing the Contribution of Therapeutic Persuasiveness and Usability in Predicting the Normalized Number of Unique Sessions in the Examined Programs

                                 Step 1               Step 2
Variables                        Beta      p          Beta      p
Therapeutic Persuasiveness       0.65      <.001      0.76      <.001
Usability                        –         –          −0.36     .02
R²                               0.42      <.001      0.54      <.001
ΔR²                              –         –          0.12      .02

A stepwise linear regression model was applied.

DISCUSSION

The results demonstrated that the objective quality rating for Therapeutic Persuasiveness was a stronger predictor of user adherence to web-based behavioral health interventions (in terms of duration of use, number of unique sessions, and the combined variables) than the quality ratings for Usability (ease of use), Visual Design, User Engagement, Content, and Therapeutic Alliance. These findings are congruent with previous empirical examinations that have demonstrated the importance of persuasive design and behavior change principles in the creation of engaging digital interventions [15, 16]. These results are also in line with theoretical studies that have argued for the importance of applying behavior change theoretical frameworks in the process of intervention design [3, 22]. While Therapeutic Persuasiveness is not itself a theoretical framework for behavior change, its link to important behavior change techniques suggests the value of discussing such frameworks during the design, development, and evaluation of digital behavioral interventions.

The negative association found between Usability and adherence requires further explanation. The previous Enlight study identified that lean programs with very limited content and features can sometimes be very easy to learn and use, consequently resulting in high Usability scores [18].
Therefore, the current finding does not imply that improving a product's usability leads people to use it less; rather, very high usability scores on criteria-based tools may unintentionally identify lean programs that lack engaging features. Using usability scores alone to compare different programs (rather than two versions of the same program) may not reveal which program is better overall, since "ease of use" without the incorporation of product features does not provide sufficient information. Our results provide further empirical justification for the notion that Usability may be a necessary, but not sufficient, condition for program effectiveness [23, 24].

Finally, our results demonstrate the predictive validity of a criteria-based tool (Enlight). Taking objective quality measures into account during intervention design and pre-empirical evaluation relies on the assumption that such views will correspond to user behavior and the success of the intervention in real-world use. From this perspective, our results shed light on the relationship between expert views on intervention design and desired user behaviors, making the case for further examination of this intersection in order to advance a feasible evaluation process for behavioral eHealth interventions [25].

This study has several limitations and considerations for future examination that should be addressed. While we point to Therapeutic Persuasiveness as a robust measure that predicts user adherence in real-world contexts better than other quality ratings, this is not meant to minimize the importance of other quality concepts for an intervention's success. The multimodal relationship between different intervention qualities implies that none of the quality ratings is independent of the others [18]; for example, a program needs to be usable and captivating (engaging) in order to work outside of traditional health care settings. In addition, because this study relied on analytic examination "in the wild," without interfering with user behaviors, we did not account for intervention efficacy. Although studies have demonstrated a strong relationship between adherence and efficacy [26, 27] and have shown that actual use of the intervention is a precondition for a program's success [28], more does not always equal better [29]. One future direction would be to directly test the correlations between intervention design and program efficacy "in the wild." Such testing should take into account fundamental questions related to participant consent and how to measure intervention outcomes. Finally, demographic information on (anonymous) users was unavailable. Such information would have allowed a more detailed breakdown of our analysis. Future work will infer demographic information from usage, as per Goel et al. [30].

Despite these limitations, our results ultimately indicate the potential to advance knowledge through the use of digital platforms that enable the gathering of analytic measures "in the wild," with the current data supporting the importance of Therapeutic Persuasiveness in predicting user adherence to digital health interventions in real-world use.

SUPPLEMENTARY MATERIAL

Supplementary material is available at Translational Behavioral Medicine online.

Acknowledgements

The findings reported have not been previously published, and the manuscript is not under consideration for publication elsewhere.
The data presented were not reported previously. The authors have full control of all primary data and agree to allow the journal to review the data if requested. All work was done as part of the respective authors' research positions, with no additional or external funding.

Compliance with Ethical Standards

Conflict of Interest: Dr. E. Yom-Tov is an employee of Microsoft Research. Dr. A. Baumel reports no potential conflict of interest.

Ethical Approval: This study was deemed exempt by the Microsoft Institutional Review Board. This article does not contain any studies with animals performed by any of the authors.

Informed Consent: For this type of study, informed consent is not required.

References

1. Aitken M, Gauntlett C. Patient Apps for Improved Healthcare: From Novelty to Mainstream. Parsippany, NJ: IMS Institute for Healthcare Informatics; 2013.
2. Catwell L, Sheikh A. Evaluating eHealth interventions: the need for continuous systemic evaluation. PLoS Med. 2009;6(8):e1000126.
3. Moller AC, Merchant G, Conroy DE, et al. Applying and advancing behavior change theories and techniques in the context of a digital health revolution: proposals for more effectively realizing untapped potential. J Behav Med. 2017;40(1):85–98.
4. Krishna S, Boren SA, Balas EA. Healthcare via cell phones: a systematic review. Telemed J E Health. 2009;15(3):231–240.
5. Norman GJ, Zabinski MF, Adams MA, Rosenberg DE, Yaroch AL, Atienza AA. A review of eHealth interventions for physical activity and dietary behavior change. Am J Prev Med. 2007;33(4):336–345.
6. Naslund JA, Marsch LA, McHugo GJ, Bartels SJ. Emerging mHealth and eHealth interventions for serious mental illness: a review of the literature. J Ment Health. 2015;24(5):321–332.
7. van Ballegooijen W, Cuijpers P, van Straten A, et al. Adherence to Internet-based and face-to-face cognitive behavioural therapy for depression: a meta-analysis. PLoS One. 2014;9(7):e100674.
8. Christensen H, Griffiths KM, Farrer L. Adherence in internet interventions for anxiety and depression. J Med Internet Res. 2009;11(2):e13.
9. Wangberg SC, Bergmo TS, Johnsen JA. Adherence in Internet-based interventions. Patient Prefer Adherence. 2008;2:57–65.
10. Eysenbach G. The law of attrition. J Med Internet Res. 2005;7(1):e11.
11. Kohl LF, Crutzen R, de Vries NK. Online prevention aimed at lifestyle behaviors: a systematic review of reviews. J Med Internet Res. 2013;15(7):e146.
12. Ritterband LM, Thorndike FP, Cox DJ, Kovatchev BP, Gonder-Frederick LA. A behavior change model for internet interventions. Ann Behav Med. 2009;38(1):18–27.
13. Kok G, Schaalma H, Ruiter RA, van Empelen P, Brug J. Intervention mapping: protocol for applying health psychology theory to prevention programmes. J Health Psychol. 2004;9(1):85–98.
14. Provost M, Koompalum D, Dong D, Martin BC. The initial development of the WebMedQual scale: domain assessment of the construct of quality of health web sites. Int J Med Inform. 2006;75(1):42–57.
15. Kelders SM, Kok RN, Ossebaard HC, Van Gemert-Pijnen JE. Persuasive system design does matter: a systematic review of adherence to web-based interventions. J Med Internet Res. 2012;14(6):e152.
16. Webb TL, Joseph J, Yardley L, Michie S. Using the internet to promote health behavior change: a systematic review and meta-analysis of the impact of theoretical basis, use of behavior change techniques, and mode of delivery on efficacy. J Med Internet Res. 2010;12(1):e4.
17. Lalmas M, O'Brien H, Yom-Tov E. Measuring user engagement. Synthesis Lectures on Information Concepts, Retrieval, and Services. 2014;6(4):1–132.
18. Baumel A, Faber K, Mathur N, Kane JM, Muench F. Enlight: a comprehensive quality and therapeutic potential evaluation tool for mobile and web-based eHealth interventions. J Med Internet Res. 2017;19(3):e82.
19. White RW, Horvitz E. From health search to healthcare: explorations of intention and utilization via query logs and user surveys. J Am Med Inform Assoc. 2014;21(1):49–55.
20. Couper MP, Alexander GL, Zhang N, et al. Engagement and retention: measuring breadth and depth of participant use of an online intervention. J Med Internet Res. 2010;12(4):e52.
21. Danaher BG, Seeley JR. Methodological issues in research on web-based behavioral interventions. Ann Behav Med. 2009;38(1):28–39.
22. Riley WT, Rivera DE, Atienza AA, Nilsen W, Allison SM, Mermelstein R. Health behavior models in the age of mobile interventions: are our theories up to the task? Transl Behav Med. 2011;1(1):53–71.
23. O'Brien HL, Toms EG. What is user engagement? A conceptual framework for defining user engagement with technology. J Assoc Inf Sci Technol. 2008;59(6):938–955.
24. Atkinson M, Kydd C. Individual characteristics associated with World Wide Web use: an empirical study of playfulness and motivation. ACM SIGMIS Database. 1997;28(2):53–62.
25. Baumel A. Making the case for a feasible evaluation method of available e-mental health products. Adm Policy Ment Health. 2016:1–4.
26. Cugelman B, Thelwall M, Dawes P. Online interventions for social marketing health behavior change campaigns: a meta-analysis of psychological architectures and adherence factors. J Med Internet Res. 2011;13(1):e17.
27. Danaher BG, Smolkowski K, Seeley JR, Severson HH. Mediators of a successful web-based smokeless tobacco cessation program. Addiction. 2008;103(10):1706–1712.
28. Makai P, Perry M, Robben SH, et al. Evaluation of an eHealth intervention in chronic care for frail older people: why adherence is the first target. J Med Internet Res. 2014;16(6):e156.
29. Christensen H, Griffiths K, Groves C, Korten A. Free range users and one hit wonders: community users of an Internet-based cognitive behaviour therapy program. Aust N Z J Psychiatry. 2006;40(1):59–62.
30. Goel S, Hofman JM, Sirer MI. Who does what on the web: a large-scale study of browsing behavior. Paper presented at: ICWSM 2012; Dublin, Ireland.

Predicting user adherence to behavioral eHealth interventions in the real world: examining which aspects of intervention design matter most

Loading next page...
 
/lp/ou_press/predicting-user-adherence-to-behavioral-ehealth-interventions-in-the-f3ydGAP0ld
Copyright
© Society of Behavioral Medicine 2018. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
ISSN
1869-6716
eISSN
1613-9860
D.O.I.
10.1093/tbm/ibx037
Publisher site
See Article on Publisher Site

Abstract

Abstract Existing frameworks have identified a range of intervention design features that may facilitate adherence to eHealth interventions; however, empirical data are lacking on whether intervention design features can predict user adherence in the real world—where the public access available tools—and whether some design aspects of behavioral eHealth interventions are more important than others in predicting adherence. This study examined whether intervention design qualities predict user adherence to behavioral eHealth interventions in real-world use and which qualities matter the most. We correlated the online activities of users of 30 web-based behavioral interventions—collected from a proprietary data set of anonymized logs from consenting users of Microsoft Internet Explorer add-on—with interventions’ quality ratings obtained by trained raters prior to empirical examination. The quality ratings included: Usability, Visual Design, User Engagement, Content, Therapeutic Persuasiveness (i.e., persuasive design and incorporation of behavior change techniques), and Therapeutic Alliance. We found Therapeutic Persuasiveness (i.e., the incorporation of persuasive design/behavior change principles) to be the most robust predictor of adherence (i.e., duration of use, number of unique sessions; 40 ≤ rs ≤ .58, ps ≤ .005), explaining 42% of the variance in user adherence in our regression model. Results indicated up to six times difference in the percentage of users utilizing the interventions for more than a minimum amount of time and sessions based on Therapeutic Persuasiveness. Findings suggest the importance of persuasive design and behavior change techniques incorporation during the design and evaluation of digital behavioral interventions. Implications Practice: Findings suggest that criteria-based ratings—mainly those focused on the quality of persuasive design and behavior change principles—can be used to enable clinicians and health providers identify the most promising programs in terms of user engagement in real-world context. Research: Future research is needed to examine the applicability to predict intervention efficacy using criteria-based evaluation, and how the multimodal relationship between different qualities of intervention design affects engagement and efficacy metrics. Policy: If quality ratings consistently show validity in predicting user adherence and efficacy of interventions, such ratings can be used to support the decision process of different stakeholders when evaluating intervention potential prior to empirical trials. INTRODUCTION The potential reach of public access to behavioral health interventions has changed dramatically over the past 30 years. Tens of thousands of health, wellness, and medical applications are now available for download from online stores [1], and eHealth interventions are expected to play a substantial role in shaping the future of health care [2]. The more behavioral health interventions move from traditional to digital platforms [3], the more the focus shifts to individuals, who can now engage in self-care around the clock outside of traditional health care settings [4–6]. However, individuals’ engagement must compete with other events in their daily lives, making poor adherence a common issue [7–10]. As a result, few individuals engage proactively in mobile applications or websites across the behavior change spectrum for more than a couple of weeks without therapist contact [11]. 
Existing frameworks have identified a range of intervention design features that may influence user satisfaction, and facilitate user adherence and behavior change [12–14]. In particular, systematic reviews examining the intervention design of empirical trials have suggested that adherence [15] and program efficacy [16] can be increased by embedding persuasive design and behavior change techniques within the digital intervention. However, empirical data are lacking on whether intervention design features can predict user adherence “in the wild”—where the public access available tools—and whether some design aspects of digital behavioral interventions are more important than others in facilitating engagement. One particular challenge in gathering such data revolves around the need to record user behaviors in different interventions utilizing the same analytical framework; this approach then enables the researcher to document variance in adherence across the spectrum of examined interventions. Although traditional study settings do not allow for the comparison between a large set of intervention designs within the same empirical trial, this is feasible in real-world use of digital interventions. Moller et al. [3], for example, suggested that scholars can refine theories and understand processes and outcomes of behavioral eHealth interventions by “leveraging the proliferation of real-time objective measurement and big data commonly generated and stored by digital platforms.” Another advantage of using commonly generated data is that, unlike with traditional study settings where subjects are being proactively recruited, have check-in appointments, and are paid for filling out assessments, this evaluation method does not interfere with natural user behaviors. A second notable challenge revolves around the ability to reliably rate key intervention qualities (e.g., usability, use of behavior change principles) for each of the examined interventions. Various quality aspects influence the design, development, and implementation of behavioral health interventions in different ways. Therefore, examining these aspects with respect to user engagement may reveal the importance of accounting for these aspects during the design and pre-empirical evaluation of programs, as is common in other areas of website development [17]. In this study, we sought to address these challenges and to study how different design aspects of behavioral eHealth interventions affect user adherence. We collected data on user activity of a large number of web-based behavioral interventions and combined these with the interventions’ previously rated quality scores. The quality scores were generated using the Enlight tool, which provides a comprehensive set of separate quality ratings on topics ranging from usability to persuasive design/behavior change concepts [18] (see Methods section for more detail). We aimed to assess the correlation between the two data sets and to estimate the impact of relevant quality ratings on user adherence. To the best of our knowledge, this is the first study to combine a large-scale data set of online user behaviors to evaluate the impact of different intervention design features on adherence in real-world use. METHODS Web-based eHealth intervention quality ratings We used 42 web-based eHealth interventions available to the public, which were previously selected and scored for their quality during the development of Enlight [18]. 
The selection of interventions followed the PRISMA statement guidelines and involved the random selection of a sample of available behavioral eHealth interventions that were free to use and could be found through popular search engines. The clinical aims of the selected interventions were broad across the behavioral health domain which could be grouped into interventions targeting health-related behaviors (e.g., diet, physical activity, smoking, and alcohol cessation) and those targeting mental health (depression, anxiety, well-being) [18]. Enlight is a comprehensive suite of criteria-based measurements developed through a rigorous process of content analysis, grouping, and classification of 476 identified criteria items by a multidisciplinary team. The tool covers six different product quality domains: Usability, Visual Design, User Engagement, Content, Therapeutic Persuasiveness (i.e., persuasive design and incorporation of behavior change techniques), and Therapeutic Alliance (see Supplementary Appendix 1 for a detailed description of the scale and operational definitions of all items). Each quality domain score ranges from 1 to 5 and is based on averaging the criteria ratings produced by two independent trained raters on a Likert scale (ranging from 1 to 5)—following program examination. For example, Therapeutic Persuasiveness assesses the extent to which the program is designed to encourage users to make positive behavior changes or to maintain positive aspects of their life. It is calculated by averaging the raters’ scores of the following criteria items: call to action (e.g., goal settings, prompting goals, encouragement to complete goals), load reduction of activities (e.g., set graded tasks), therapeutic rationale and pathway (e.g., reasoned action, provide information about behavioral health link), rewards (e.g., contingent rewards), real data-driven/adaptive content (e.g., self-monitoring of behavior), ongoing feedback, and expectations and relevance. The quality domains exhibited excellent inter-raters reliability scores (intraclass correlations = .77–.98, median .91) and internal consistencies between domain items (Cronbach’s alphas = .83–.90, median .88) [18]. Behavioral data Information on user behaviors (i.e., behavioral variables) was collected from a proprietary data set of anonymized logs from consenting users of a widely distributed web browser add-on toolbar associated with the Microsoft Internet Explorer web browser [19]. These data were gathered over a 12-month period from January 1 to December 31, 2015. Each log contained an anonymized user identifier, the time and date of the record, and the URL that the user visited. This study was deemed exempt by the “masked for blind review” Institutional Review Board. The logs were screened to include only those URLs associated with the 42 rated intervention programs and we only documented the URL pages relevant to the intervention itself. We further filtered the data to include only those users who reached a program login or registry pages and visited the program pages for at least 2 days, to exclude users who did not eventually access the intervention program itself. We also excluded websites with fewer than 10 users that passed this filter (i.e., utilized the intervention), in order to assure statistically valid results. This left a total of 30 eHealth programs for analysis (see Supplementary Appendix 2 for a complete list of included programs). 
In order to examine user adherence, we retrieved the number of unique sessions and the total time spent using each program [20, 21]. To avoid ambiguity, we defined a unique session (i.e., the difference between one session/visit and another) on the basis of the date the program was utilized; accordingly, different dates of utilization counted as different sessions. We did not examine the number of unique page views, as the ways in which content was embedded within the programs (i.e., the content architecture) varied [3]. For example, some programs utilized “active content,” whereby users could receive and navigate content while technically being on the same URL, whereas other programs required users to navigate through a large number of web pages to achieve the same pattern of engagement. We also defined a composite measure [21] to examine the percentage of users who utilized the intervention for more than the minimum number of sessions and hours (i.e., threshold). Such an examination was designed to account for the assumption that utilizing a digital behavioral intervention for a certain threshold is sufficient to meet therapeutic aims—where programs do not receive better scores for extended utilization. When disregarding extended utilization we also aimed to account for the possibility that heavy website browsing of a limited amount of users affected the results. To address the potential concern that different interventions might require different thresholds to successfully change user behaviors, we examined three different composite measures by incorporating three different levels of thresholds: more than 3, 5, or 7 sessions and for an accumulated time of 1.5, 3, or 5 hr, respectively (e.g., one composite measure counted the percentage of users who utilized the program for at least 7 different sessions and 5 hr of use). Data analysis For each program, we calculated the average number of unique sessions per user and the average duration of use. We used median and interquartile range (IQR) to present the distribution of these variables. Means and standard deviations were used to describe Enlight quality scores (which were based on averaging several items ranging from 1 to 5 on a Likert scale). Spearman’s correlations were used to examine the relationships between the variables. We conducted a sensitivity analysis to examine whether the pattern of correlations found differed depending on the clinical aim of the website (i.e., health-related behaviors, mental health) and if we only included websites with a larger number of logged users (>100 users). The differences between correlations of these groups were calculated using Fisher’s Z-transformation. To account for the overlapping variance between different quality domains in the prediction of behavioral variables, the partial correlations between each quality domain and the predicted variables while controlling for the effect of all other quality domains were calculated. Finally, a stepwise regression was applied to examine which variables best explain user engagement, without adding predicting variables that do not have a significant contribution to the model. RESULTS The 30 web-based programs included in the analysis had a median of 110 users (IQR = 414); the program with the median number of sessions had an average of 4.31 unique sessions per user (IQR = 4.27); the program with the median duration of engagement time had an average of 3.97 hr of use per user (IQR = 4.19). 
The mean quality scores and standard deviations of the website sample were as follows: Usability M = 3.21 (SD = 0.70); Visual Design M = 2.73 (SD = 0.80); User Engagement M = 2.81 (SD = 0.76); Content M = 3.56 (SD = 0.63); Therapeutic Persuasiveness M = 2.39 (SD = 0.58); and Therapeutic Alliance M = 2.44 (SD = 0.78). Spearman’s correlations between the study variables are presented in Table 1. A pattern of very strong positive correlations was found between the behavioral variables (i.e., the variables recording user adherence; rs > .85). The high correlations found between these variables demonstrate that heavy website usage of outliers—which only affects number of unique sessions and total engagement time (and not the threshold variables)—did not bias the results. Results point to a range of moderate to strong correlations between Therapeutic Persuasiveness and the behavioral variables (.40 ≤ rs ≤ .58). User Engagement showed a pattern of weak, positive correlations with the behavioral variables (.23 ≤ rs ≤ .35); however, only two correlations were significant (rs = .31, .35), while three were close to significant (r = .23, .24, .26; ps ≤ .11). Usability showed a pattern of weak, negative correlations with the behavioral variables; one correlation was significant (r = -.40), while the other four were close to significant (-.26 ≤ r ≤ -.29; ps ≤ .09). The sensitivity analysis revealed no significant differences in Spearman’s correlations between clinical aims, or between the complete sample of websites (n = 30) and the subsample of websites with >100 consenting users (n = 17). The partial correlation analysis revealed significant positive correlations between Therapeutic Persuasiveness and all behavioral variables after controlling for all other quality domains (.48 ≤ r ≤ .59; .002 ≤ ps ≤ .014). No other significant partial correlations were found. Table 1 Spearman’s Correlations Between Study Variables 1. Number of unique sessions 2. Total engagement time 3. % of users ≥3 sessions & ≥1.5 hr 4. % of users ≥5 sessions & ≥3 hr 5. % of users ≥7 sessions & ≥5 hr Variables r p r p r p r p r p Adherence measures #2 .91 <.001 #3 .86 <.001 .90 <.001 #4 .97 <.001 .90 <.001 .90 <.001 #5 .96 <.001 .90 <.001 .86 <.001 .96 <.001 Program quality scores Usability −.26 .08 −.40 .02 −.26 .09 −.29 .06 −.29 .06 Visual Design −.02 .47 −.21 .13 −.18 .17 −.09 .32 −.06 .38 User Engagement .35 .03 .24 .10 .23 .11 .31 .048 .26 .08 Content .01 .48 −.01 .47 .07 .36 .01 .48 .03 .43 Therapeutic Persuasiveness .58 <.001 .46 .005 .40 .01 .51 .002 .48 .003 Therapeutic Alliance .11 .28 .02 .46 .10 .30 .10 .31 .07 .36 1. Number of unique sessions 2. Total engagement time 3. % of users ≥3 sessions & ≥1.5 hr 4. % of users ≥5 sessions & ≥3 hr 5. % of users ≥7 sessions & ≥5 hr Variables r p r p r p r p r p Adherence measures #2 .91 <.001 #3 .86 <.001 .90 <.001 #4 .97 <.001 .90 <.001 .90 <.001 #5 .96 <.001 .90 <.001 .86 <.001 .96 <.001 Program quality scores Usability −.26 .08 −.40 .02 −.26 .09 −.29 .06 −.29 .06 Visual Design −.02 .47 −.21 .13 −.18 .17 −.09 .32 −.06 .38 User Engagement .35 .03 .24 .10 .23 .11 .31 .048 .26 .08 Content .01 .48 −.01 .47 .07 .36 .01 .48 .03 .43 Therapeutic Persuasiveness .58 <.001 .46 .005 .40 .01 .51 .002 .48 .003 Therapeutic Alliance .11 .28 .02 .46 .10 .30 .10 .31 .07 .36 Significant correlations are in bold. 
Using Fisher’s Z-transformation, no significant differences in Spearman’s correlation values were found between mental health (n = 17) and health-related behaviors (n = 13), or between the complete website sample and websites with >100 documented users (n = 17). While conrolling all other quality domains, positive partial correlations were found between Therapeutic Persuasiveness and number of unique sessions (r = .58, p = .002), total engagement time (r = .57, p = .002), % of users ≥3 sessions & ≥1.5 hr (r = .48, p = .014), % of users ≥5 sessions & ≥3 hr (r = .59, p = .002), % of users ≥7 sessions & ≥5 hr (r = .59, p = .002). No other significant partial correlations were found. View Large Table 1 Spearman’s Correlations Between Study Variables 1. Number of unique sessions 2. Total engagement time 3. % of users ≥3 sessions & ≥1.5 hr 4. % of users ≥5 sessions & ≥3 hr 5. % of users ≥7 sessions & ≥5 hr Variables r p r p r p r p r p Adherence measures #2 .91 <.001 #3 .86 <.001 .90 <.001 #4 .97 <.001 .90 <.001 .90 <.001 #5 .96 <.001 .90 <.001 .86 <.001 .96 <.001 Program quality scores Usability −.26 .08 −.40 .02 −.26 .09 −.29 .06 −.29 .06 Visual Design −.02 .47 −.21 .13 −.18 .17 −.09 .32 −.06 .38 User Engagement .35 .03 .24 .10 .23 .11 .31 .048 .26 .08 Content .01 .48 −.01 .47 .07 .36 .01 .48 .03 .43 Therapeutic Persuasiveness .58 <.001 .46 .005 .40 .01 .51 .002 .48 .003 Therapeutic Alliance .11 .28 .02 .46 .10 .30 .10 .31 .07 .36 1. Number of unique sessions 2. Total engagement time 3. % of users ≥3 sessions & ≥1.5 hr 4. % of users ≥5 sessions & ≥3 hr 5. % of users ≥7 sessions & ≥5 hr Variables r p r p r p r p r p Adherence measures #2 .91 <.001 #3 .86 <.001 .90 <.001 #4 .97 <.001 .90 <.001 .90 <.001 #5 .96 <.001 .90 <.001 .86 <.001 .96 <.001 Program quality scores Usability −.26 .08 −.40 .02 −.26 .09 −.29 .06 −.29 .06 Visual Design −.02 .47 −.21 .13 −.18 .17 −.09 .32 −.06 .38 User Engagement .35 .03 .24 .10 .23 .11 .31 .048 .26 .08 Content .01 .48 −.01 .47 .07 .36 .01 .48 .03 .43 Therapeutic Persuasiveness .58 <.001 .46 .005 .40 .01 .51 .002 .48 .003 Therapeutic Alliance .11 .28 .02 .46 .10 .30 .10 .31 .07 .36 Significant correlations are in bold. Using Fisher’s Z-transformation, no significant differences in Spearman’s correlation values were found between mental health (n = 17) and health-related behaviors (n = 13), or between the complete website sample and websites with >100 documented users (n = 17). While conrolling all other quality domains, positive partial correlations were found between Therapeutic Persuasiveness and number of unique sessions (r = .58, p = .002), total engagement time (r = .57, p = .002), % of users ≥3 sessions & ≥1.5 hr (r = .48, p = .014), % of users ≥5 sessions & ≥3 hr (r = .59, p = .002), % of users ≥7 sessions & ≥5 hr (r = .59, p = .002). No other significant partial correlations were found. View Large In light of the pattern of strong correlations found between Therapeutic Persuasiveness and behavioral variables, we divided the websites into three groups based on their Therapeutic Persuasiveness percentile rank (0%–33.3%; 33.3%–66.6%; 66.6%–100%) in order to examine the median percentage of users meeting each of the thresholds in these groups (see Table 2). As can be seen in Table 2, websites with higher Therapeutic Persuasiveness scores had higher percentages of users who met the desired thresholds. In addition, the differences between website groups, in terms of the percentages of users meeting the desired thresholds, were larger at higher thresholds. 
In light of the pattern of strong correlations found between Therapeutic Persuasiveness and the behavioral variables, we divided the websites into three groups based on their Therapeutic Persuasiveness percentile rank (0%–33.3%; 33.3%–66.6%; 66.6%–100%) and examined the median percentage of users meeting each adherence threshold within these groups (see Table 2). Websites with higher Therapeutic Persuasiveness scores had higher percentages of users who met the desired thresholds, and the differences between the website groups grew larger at higher thresholds. For example, 19.2% of users in the lowest-ranked Therapeutic Persuasiveness group met the lowest threshold, compared with 36.6% of users in the highest-ranked group (almost twice as many); at the highest threshold, only 2.8% of users in the lowest-ranked group qualified, compared with 16.2% of users in the highest-ranked group (almost six times as many).

Table 2 | The portion of users meeting engagement thresholds, by program Therapeutic Persuasiveness (TP) rank; cells show Median (IQR)

| TP rank | TP score | % of users ≥3 unique days & ≥1.5 hr | % of users ≥5 unique days & ≥3 hr | % of users ≥7 unique days & ≥5 hr |
| --- | --- | --- | --- | --- |
| Lowest TP percentile (n = 9) | 1.57 (0.36) | 19.23 (19.73) | 7.78 (10.09) | 2.78 (8.49) |
| Middle TP percentile (n = 9) | 2.29 (0.21) | 32.98 (16.58) | 15.37 (17.08) | 12.82 (10.57) |
| Highest TP percentile (n = 12) | 2.86 (0.57) | 36.62 (35.46) | 23.08 (31.63) | 16.22 (29.48) |
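A minimal sketch of this tertile summary, again assuming the hypothetical `df` used above: websites are ranked by Therapeutic Persuasiveness, cut into thirds, and each group is summarized by the median and IQR of the threshold-adherence percentages. With tied scores, group sizes can be unequal, as in the 9/9/12 split above.

```python
import pandas as pd

def iqr(s):
    """Interquartile range of a Series."""
    return s.quantile(0.75) - s.quantile(0.25)

def tp_tertile_summary(df):
    """Median and IQR of the threshold-adherence percentages within
    Therapeutic Persuasiveness tertiles (lowest/middle/highest third)."""
    tp_pct = df["therapeutic_persuasiveness"].rank(pct=True)
    tertile = pd.cut(tp_pct, bins=[0, 1 / 3, 2 / 3, 1.0],
                     labels=["lowest", "middle", "highest"],
                     include_lowest=True)
    cols = ["therapeutic_persuasiveness",
            "pct_3s_90m", "pct_5s_3h", "pct_7s_5h"]
    return df.groupby(tertile)[cols].agg(["median", iqr])
```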
The pattern of very strong correlations found among the engagement variables demonstrated that each variable could predict the others with substantial accuracy. We therefore examined the predictive potential of the quality scores with respect to user adherence using a single behavioral variable: the number of unique sessions, chosen because it descriptively showed the strongest pattern of positive correlations with the other behavioral variables. A stepwise linear regression was applied with the number of unique sessions as the predicted variable and the quality scores as predictors. The predicted variable was log-transformed to account for the non-normal distribution of the regression residuals (this transformation did not affect the Spearman’s correlation results, since it did not change the rank order of the scores). As shown in Table 3, Therapeutic Persuasiveness was entered in the first step and accounted for 42% of the variance in the predicted variable, with higher Therapeutic Persuasiveness scores predicting a higher number of sessions. Usability was entered in the second step, accounting for an additional 12% of the variance, with higher Usability scores predicting a lower number of sessions.

Table 3 | Standardized regression coefficients (beta) showing the contribution of Therapeutic Persuasiveness and Usability in predicting the log-transformed number of unique sessions adhered to by users of the examined programs

| Variables | Step 1 Beta (p) | Step 2 Beta (p) |
| --- | --- | --- |
| Therapeutic Persuasiveness | 0.65 (<.001) | 0.76 (<.001) |
| Usability | — | −0.36 (.02) |
| R² | 0.42 (<.001) | 0.54 (<.001) |
| ΔR² | — | 0.12 (.02) |

Note: A stepwise linear regression model was applied.
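Because statsmodels does not ship a built-in stepwise routine, the sketch below writes out a simple forward-selection loop over z-scored predictors with a log-transformed, z-scored outcome, so that coefficients read as standardized betas. It illustrates the general technique rather than reproducing the authors’ exact procedure (forward selection only, no removal step); column names follow the hypothetical `df` used above.

```python
import numpy as np
import statsmodels.api as sm

def forward_stepwise(df, outcome, predictors, alpha=0.05):
    """Forward-selection OLS on z-scored predictors with a log-transformed,
    z-scored outcome, so fitted coefficients read as standardized betas."""
    zscore = lambda s: (s - s.mean()) / s.std(ddof=1)
    y = zscore(np.log(df[outcome]))
    X = df[list(predictors)].apply(zscore)
    selected = []
    while True:
        remaining = [p for p in predictors if p not in selected]
        if not remaining:
            break
        # p-value of each candidate when added to the current model
        pvals = {}
        for p in remaining:
            fit = sm.OLS(y, sm.add_constant(X[selected + [p]])).fit()
            pvals[p] = fit.pvalues[p]
        best = min(pvals, key=pvals.get)
        if pvals[best] >= alpha:
            break
        selected.append(best)
    return sm.OLS(y, sm.add_constant(X[selected])).fit()

# e.g.: model = forward_stepwise(df, "unique_sessions", QUALITY)
#       print(model.summary())
```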
DISCUSSION

The results demonstrated that the objective quality rating for Therapeutic Persuasiveness was more important in predicting user adherence to web-based behavioral health interventions (in terms of duration of use, number of unique sessions, and the combined threshold variables) than the quality ratings for Usability (ease of use), Visual Design, User Engagement, Content, and Therapeutic Alliance. These findings are congruent with previous empirical examinations demonstrating the importance of persuasive design and behavior change principles in the creation of engaging digital interventions [15, 16], and they are in line with theoretical work arguing for the application of behavior change frameworks in the process of intervention design [3, 22]. While Therapeutic Persuasiveness is not itself a theoretical framework for behavior change, its link to important behavior change techniques underscores the value of drawing on such frameworks during the design, development, and evaluation of digital behavioral interventions.

The negative association found between Usability and adherence requires further explanation. The previous Enlight study found that lean programs with very limited content and features can be very easy to learn and use, and consequently receive high Usability scores [18]. The current finding therefore does not imply that improving a product's usability causes people to use it less; rather, very high usability scores obtained with criteria-based tools may inadvertently single out lean programs that lack engaging features. Using usability scores alone to compare different programs (as opposed to two versions of the same program) may not reveal which program is better overall, since "ease of use" considered apart from product features does not provide sufficient information. Our results thus provide further empirical support for the notion that usability may be a necessary, but not sufficient, condition for program effectiveness [23, 24].

Finally, our results demonstrate the predictive validity of a criteria-based tool (Enlight). Taking objective quality measures into account during intervention design and pre-empirical evaluation relies on the assumption that such ratings will correspond to user behavior and to the success of the intervention in real-world use. From this perspective, our results shed light on the relationship between expert views on intervention design and desired user behaviors, making the case for further examination of this intersection in order to advance a feasible evaluation process for behavioral eHealth interventions [25].

This study has several limitations and considerations that should be addressed in future work. Although we point to Therapeutic Persuasiveness as a robust measure that predicts user adherence in a real-world context better than the other quality ratings, this is not meant to minimize the importance of other quality concepts for an intervention's success. The multimodal relationship between different intervention qualities implies that none of the quality ratings is independent of the others [18]; for example, a program needs to be usable and captivating (engaging) in order to work outside of traditional health care settings. In addition, because this study relied on analytic examination "in the wild," without interfering with user behaviors, we did not account for intervention efficacy. Although studies have demonstrated a strong relationship between adherence and efficacy [26, 27], and actual use of an intervention is a precondition for a program's success [28], more does not always equal better [29]. One future direction would be to directly test the correlations between intervention design and program efficacy "in the wild"; such testing should take into account fundamental questions related to participant consent and the measurement of intervention outcomes. Finally, demographic information on (anonymous) users was unavailable; such information would have allowed a more detailed breakdown of our analysis. Future work could infer demographic information from usage patterns, as per Goel et al. [30]. Despite these limitations, our results indicate the potential to advance knowledge through digital platforms that enable the gathering of analytic measures "in the wild," with the current data supporting the importance of Therapeutic Persuasiveness in predicting user adherence to digital health interventions in real-world use.

SUPPLEMENTARY MATERIAL

Supplementary material is available at Translational Behavioral Medicine online.

Acknowledgements

The findings reported have not been previously published, and the manuscript is not under consideration for publication elsewhere. The data presented were not reported previously. The authors have full control of all primary data and agree to allow the journal to review the data if requested. All work was done as part of the respective authors' research positions, with no additional or external funding.

Compliance with Ethical Standards

Conflict of Interest: Dr. E. Yom-Tov is an employee of Microsoft Research. Dr. A. Baumel reports no potential conflict of interest.

Ethical Approval: This study was deemed exempt by the Microsoft Institutional Review Board. This article does not contain any studies with animals performed by any of the authors.

Informed Consent: For this type of study, informed consent is not required.

References
1. Aitken M, Gauntlett C. Patient Apps for Improved Healthcare: From Novelty to Mainstream. Parsippany, NJ: IMS Institute for Healthcare Informatics; 2013.
2. Catwell L, Sheikh A. Evaluating eHealth interventions: the need for continuous systemic evaluation. PLoS Med. 2009;6(8):e1000126.
3. Moller AC, Merchant G, Conroy DE, et al. Applying and advancing behavior change theories and techniques in the context of a digital health revolution: proposals for more effectively realizing untapped potential. J Behav Med. 2017;40(1):85–98.
4. Krishna S, Boren SA, Balas EA. Healthcare via cell phones: a systematic review. Telemed J E Health. 2009;15(3):231–240.
5. Norman GJ, Zabinski MF, Adams MA, Rosenberg DE, Yaroch AL, Atienza AA. A review of eHealth interventions for physical activity and dietary behavior change. Am J Prev Med. 2007;33(4):336–345.
6. Naslund JA, Marsch LA, McHugo GJ, Bartels SJ. Emerging mHealth and eHealth interventions for serious mental illness: a review of the literature. J Ment Health. 2015;24(5):321–332.
7. van Ballegooijen W, Cuijpers P, van Straten A, et al. Adherence to Internet-based and face-to-face cognitive behavioural therapy for depression: a meta-analysis. PLoS One. 2014;9(7):e100674.
8. Christensen H, Griffiths KM, Farrer L. Adherence in internet interventions for anxiety and depression. J Med Internet Res. 2009;11(2):e13.
9. Wangberg SC, Bergmo TS, Johnsen JA. Adherence in Internet-based interventions. Patient Prefer Adherence. 2008;2:57–65.
10. Eysenbach G. The law of attrition. J Med Internet Res. 2005;7(1):e11.
11. Kohl LF, Crutzen R, de Vries NK. Online prevention aimed at lifestyle behaviors: a systematic review of reviews. J Med Internet Res. 2013;15(7):e146.
12. Ritterband LM, Thorndike FP, Cox DJ, Kovatchev BP, Gonder-Frederick LA. A behavior change model for internet interventions. Ann Behav Med. 2009;38(1):18–27.
13. Kok G, Schaalma H, Ruiter RA, van Empelen P, Brug J. Intervention mapping: protocol for applying health psychology theory to prevention programmes. J Health Psychol. 2004;9(1):85–98.
14. Provost M, Koompalum D, Dong D, Martin BC. The initial development of the WebMedQual scale: domain assessment of the construct of quality of health web sites. Int J Med Inform. 2006;75(1):42–57.
15. Kelders SM, Kok RN, Ossebaard HC, Van Gemert-Pijnen JE. Persuasive system design does matter: a systematic review of adherence to web-based interventions. J Med Internet Res. 2012;14(6):e152.
16. Webb TL, Joseph J, Yardley L, Michie S. Using the internet to promote health behavior change: a systematic review and meta-analysis of the impact of theoretical basis, use of behavior change techniques, and mode of delivery on efficacy. J Med Internet Res. 2010;12(1):e4.
17. Lalmas M, O’Brien H, Yom-Tov E. Measuring user engagement. Synthesis Lectures on Information Concepts, Retrieval, and Services. 2014;6(4):1–132.
18. Baumel A, Faber K, Mathur N, Kane JM, Muench F. Enlight: a comprehensive quality and therapeutic potential evaluation tool for mobile and web-based eHealth interventions. J Med Internet Res. 2017;19(3):e82.
19. White RW, Horvitz E. From health search to healthcare: explorations of intention and utilization via query logs and user surveys. J Am Med Inform Assoc. 2014;21(1):49–55.
20. Couper MP, Alexander GL, Zhang N, et al. Engagement and retention: measuring breadth and depth of participant use of an online intervention. J Med Internet Res. 2010;12(4):e52.
21. Danaher BG, Seeley JR. Methodological issues in research on web-based behavioral interventions. Ann Behav Med. 2009;38(1):28–39.
22. Riley WT, Rivera DE, Atienza AA, Nilsen W, Allison SM, Mermelstein R. Health behavior models in the age of mobile interventions: are our theories up to the task? Transl Behav Med. 2011;1(1):53–71.
23. O’Brien HL, Toms EG. What is user engagement? A conceptual framework for defining user engagement with technology. J Assoc Inf Sci Technol. 2008;59(6):938–955.
24. Atkinson M, Kydd C. Individual characteristics associated with World Wide Web use: an empirical study of playfulness and motivation. ACM SIGMIS Database. 1997;28(2):53–62.
25. Baumel A. Making the case for a feasible evaluation method of available e-mental health products. Adm Policy Ment Health. 2016:1–4.
26. Cugelman B, Thelwall M, Dawes P. Online interventions for social marketing health behavior change campaigns: a meta-analysis of psychological architectures and adherence factors. J Med Internet Res. 2011;13(1):e17.
27. Danaher BG, Smolkowski K, Seeley JR, Severson HH. Mediators of a successful web-based smokeless tobacco cessation program. Addiction. 2008;103(10):1706–1712.
28. Makai P, Perry M, Robben SH, et al. Evaluation of an eHealth intervention in chronic care for frail older people: why adherence is the first target. J Med Internet Res. 2014;16(6):e156.
29. Christensen H, Griffiths K, Groves C, Korten A. Free range users and one hit wonders: community users of an Internet-based cognitive behaviour therapy program. Aust N Z J Psychiatry. 2006;40(1):59–62.
30. Goel S, Hofman JM, Sirer MI. Who does what on the web: a large-scale study of browsing behavior. Paper presented at: ICWSM 2012; Dublin, Ireland.

© Society of Behavioral Medicine 2018. All rights reserved.
