Background: The clinical course of ulcerative colitis (UC) and the effects of treatment are assessed through patient-reported signs and symptoms (S&S) and endoscopic evidence of inflammation. The Ulcerative Colitis Patient-Reported Outcomes Signs and Symptoms (UC-PRO/SS) measure was developed to standardize the quantification of gastrointestinal S&S of UC in clinical trials through direct patient ratings.
Design: The UC-PRO/SS was developed from concept elicitation data (focus groups and individual interviews) and then refined through cognitive interviews; 57 UC patients took part in this qualitative work. Measurement properties, including item-level statistics, scaling structure, reliability, and validity, were evaluated in an observational, four-week study of adults with mild to severe UC (N = 200).
Results: Findings from qualitative focus groups and interviews identified nine symptom items covering bowel and abdominal symptoms. The final UC-PRO/SS daily diary includes two scales: Bowel S&S (six items) and Abdominal Symptoms (three items), each scored separately. Each scale showed evidence of adequate reliability (α = 0.80 and 0.66, respectively), reproducibility (intraclass correlation coefficient = 0.81 and 0.71), and validity, including moderate-to-high correlations with the Partial Mayo Score (0.79; 0.45) and Inflammatory Bowel Disease Questionnaire (IBDQ) total score (−0.70; −0.61). Scores discriminated by level of disease severity, as defined by the Partial Mayo Score, Patient Global Rating, and Clinician Global Rating (p < 0.0001).
Conclusions: Results suggest that the UC-PRO/SS is a reliable and valid measure of gastrointestinal symptom severity in UC patients. Additional longitudinal data are needed to evaluate the responsiveness of UC-PRO/SS scores to change and to inform the selection of responder definitions.
Keywords: Ulcerative colitis, Patient-reported outcomes, Signs and symptoms, Reliability, Validity, Clinical trial endpoints

Significance of this study

What is already known about this subject?
The US Food and Drug Administration (FDA) has established a pathway for rigorous development of disease-specific Patient-reported Outcome (PRO) tools for clinical trials and clinical use.
Currently, there are no measures developed and validated according to the FDA PRO guidance available to assess the symptoms of ulcerative colitis (UC).

What are the new findings?
Using the US FDA pathway for rigorous development of disease-specific PRO tools, we have developed and validated a new patient-reported sign and symptoms measure for clinical trials and clinical use in UC.
This is the first symptom measure of UC to meet US FDA PRO guidelines.
This modular instrument can be used with appropriate individual modules customized to the mechanism of action of a candidate therapy, from purely anti-inflammatory medications to those targeting pain, dysmotility, or functional symptoms.

How might it impact on clinical practice in the foreseeable future?
Using electronic device systems, PROs in IBD can be routinely measured before and between appointments in order to identify response to therapies or failure of therapies.

* Correspondence: firstname.lastname@example.org
University of Michigan, Ann Arbor, MI, USA
Full list of author information is available at the end of the article

© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Higgins et al. Journal of Patient-Reported Outcomes (2018) 2:26

Background

Ulcerative colitis (UC) is a chronic, relapsing inflammatory disease of the colonic mucosa. Recent studies estimate that 700,000 people are afflicted in the United States (US) and Canada alone, with a global annual incidence ranging from 0.5 to 24.5 cases per 100,000. The characteristic signs and symptoms of UC include abdominal pain, frequent diarrhea, urgent bowel movements, and rectal bleeding, which are not only disconcerting to patients but can also adversely affect their quality of life.

Clinically, UC is monitored through signs and symptoms of disease activity and periodic objective assessment (e.g., an endoscopy) to evaluate mucosal inflammation. In clinical trial settings, the Mayo Score historically has been used to assess disease activity; it combines endoscopic findings with physician-rated signs and symptoms, based on information provided by the patient, in a single total score.

In 2009, the US Food and Drug Administration (FDA) released a guidance for the development of patient-reported outcome (PRO) measures to support labeling claims for new medical treatments and products. This guidance emphasizes the importance of conducting qualitative research throughout the process of instrument development to ensure that the content of the measure is consistent with the patient experience and covers what patients consider most important about a condition and/or treatment intervention. Quantitative work to assess the instrument's psychometric properties, such as reliability and validity, is also recommended. This standard in instrument development is an increasing regulatory requirement for efficacy evaluation and labeling purposes for treatment interventions [5, 6]. Composite measures that combine different aspects of a disease, such as clinically derived signs, patient symptoms, and/or clinical tests, are now viewed by the FDA as concepts that are best measured, scored, and reported separately. Furthermore, both the FDA and European Medicines Agency have recently released guidelines specific to clinical trials of ulcerative colitis, noting the importance of including an adequately validated PRO to assess symptomatic relief as a primary outcome measure in pivotal clinical trials of UC [7, 8]. For these reasons, a new patient-reported sign and symptom measure for UC was developed and validated according to the US FDA PRO Guidance and is the first symptom measure of UC to meet these guidelines.

The Ulcerative Colitis Patient-reported Outcomes (UC-PRO) instrument was designed to comprehensively assess the signs, symptoms, and impact of UC through six modules. Modules 1 (Bowel Signs and Symptoms) and 2 (Abdominal Symptoms) comprise the UC-PRO Signs and Symptoms (UC-PRO/SS) measure. Module 3 addresses Systemic Symptoms, Module 4 addresses Coping Strategies, Module 5 addresses Daily Life Impact, and Module 6 covers Emotional Impact. Any or all of these modules may be used in any given study.

The focus of this paper is on evaluating the UC-PRO/SS measure in terms of treatment-related outcomes and supporting potential labeling claims related to the gastrointestinal (GI) signs and symptoms of UC from the perspective of the patient. The UC-PRO/SS was developed to quantify the signs and symptoms in clinical trials of adults (18 years of age or older) with moderate-to-severe UC treated in outpatient settings. This paper describes the development and initial validation of this instrument. Given the variability in the symptomatic experience of this patient population, the UC-PRO/SS is completed as a daily diary and is designed for electronic administration.

As noted throughout the paper, details related to the UC-PRO/SS development and validation are provided in the online Supplementary Material. Also included in the Supplementary Material is information on the Systemic Symptoms scale (Module 3 of the UC-PRO), a five-item scale that can be included as part of the daily diary to evaluate the non-gastrointestinal systemic symptoms of UC. Based on the qualitative work, these symptoms were found to be relevant and important to the patient experience. However, systemic symptoms are generally not affected by current gut-specific agents. From a regulatory perspective, such symptoms are considered "distal" to the target disease activity and are therefore less suitable for testing treatment effects and/or inclusion in a product label. Because the intent is to use the UC-PRO/SS in drug development trials, with the qualification of the instrument as a Drug Development Tool for this purpose currently underway, Module 3, Systemic Symptoms, is not included in the UC-PRO/SS measure. At the discretion of the user/sponsor, it can be administered as part of a diary and serve as an exploratory assessment in clinical trials. This scale may also be useful in studies or clinical trials evaluating the systemic component of UC. Information in the online Supplementary Material is intended to facilitate use of this Module.

Methods

The research was conducted in two phases, consistent with the methodology outlined by the FDA PRO Guidance. Phase I addressed the content and structure of the measure, and the documentation of content validity through qualitative research methods. Phase II was a four-week observational study to address its measurement properties, including scoring and evaluation of reliability and validity. All data collection and recruitment procedures met institutional review board (IRB) and Health Insurance Portability and Accountability Act requirements, and all applicable state and federal laws and regulations. Study protocols were approved by an independent IRB, and written informed consent was obtained from study subjects prior to completing any study-related activities.

For each phase, subjects were recruited from US gastroenterology clinics and included ambulatory adult patients with clinician-confirmed UC, based on available biopsy. Patients participating in an interventional study were excluded, as were those with an ileostomy or colostomy, or who had had intra-abdominal surgery in the 4 months prior to screening. Patients represented a range of disease activity, from mild to severe, based on the Simple Clinical Colitis Activity Index (SCCAI) or Partial Mayo score.

Phase I: Qualitative – Development and content validity

A two-stage qualitative research process was used to determine instrument content, and to ensure clarity and understanding in the target patient population. Focus groups and interviews were conducted by experienced study team members using a semi-structured discussion guide, informed by clinical expert input and a review of the literature to cross-reference symptoms, and were audio-recorded and transcribed for analysis. Additionally, participants completed a sociodemographic questionnaire for use in characterizing the study sample. Additional methods are outlined below, with details provided in the online Supplementary Material.

Stage 1: Focus groups and one-to-one interviews

Six focus groups (n = 33) and one-to-one qualitative interviews (n = 9) were conducted to identify important UC symptoms, explore the frequency and variability of these symptoms, and inform the development of response options and appropriate recall for a symptom measure in the target population. Subjects were recruited from seven US gastroenterology sites to capture diversity in terms of race, ethnicity, and geographic location. In addition, subjects represented a range of disease activity, based on the SCCAI for focus group participants (SCCAI ≤5, n = 3; SCCAI 6–8, n = 5; and SCCAI ≥8, n = 23 [SCCAI data were missing for two participants]) and the Partial Mayo Scores for those participating in one-to-one interviews (Partial Mayo Score 2–4, n = 5; Partial Mayo Score ≥ 4, n = 4). Discussion focused on participants' current symptom experiences, their experiences during an episode or flare-up, and the impact of these symptoms on their daily life.

Content analyses were performed by independent coders, with data organized using qualitative software (NVivo or ATLAS.ti). At each stage of instrument development, participant quotes were grouped and summarized by thematic code to assess the saturation of concepts. Saturation is defined as the point at which no substantially new themes, descriptions of a concept, or terms are introduced as additional discussions are conducted.

Results were discussed with clinical experts and used to generate a list of relevant symptoms and a draft UC-PRO/SS measure, including instructions, items, and response options.

Stage 2: Cognitive interviews

Two rounds of cognitive interviews (n = 15) were conducted at three US clinical sites to examine the relevance, comprehensiveness, and clarity of the draft UC-PRO/SS (including systemic symptoms), and to refine the measure as needed. Subjects were asked to complete the questionnaire independently and were then interviewed about the content, including instructions, recall period, candidate items, and response options. Upon completion of 10 interviews (Round 1), the instrument was edited for clarity based on subject comments, and the revised instrument was evaluated by a new sample of UC patients (Round 2, n = 5). Round 2 also provided an opportunity to examine patient understanding of the scales formatted as ePRO screen shots for use as an electronic daily diary, with one item per page. Upon completion of this set of interviews, the instrument was finalized for psychometric evaluation.

Phase II: Quantitative – Reliability and validity

An observational, prospective, four-week study was conducted to examine the reliability and validity of the UC-PRO/SS in ambulatory adults with clinician-confirmed UC based on a biopsy obtained at least 3 months prior to study screening. Participants represented a range of disease severity based on the Partial Mayo Score (0–2, n = 56 [28%]; 3–5, n = 88 [44%]; ≥6, n = 56 [28%]). Subjects were recruited for the psychometric study (Phase II) from 22 study sites in diverse regions of the US. Each participated in three protocol-driven clinic visits: Day 1 (Enrollment: Visit 1), Day 7 ± 3 days (Visit 2), and Day 28 ± 4 days (Visit 3).

Measures

Subjects completed the UC-PRO/SS (9 candidate items) and Module 3 Systemic Symptoms (5 candidate items), a patient global rating of disease severity, and a single item to assess the "worst pain" each day during the 30-day study period, using an electronic hand-held device given to the subject upon enrollment; training was provided by clinical site personnel. In addition, subjects completed the paper-pen Partial Mayo Symptom Diary 7 days prior to Visits 2 and 3.

For score validation purposes, and to coincide with the clinician assessment, the following paper-pen questionnaires were completed by subjects at Visits 2 and 3, prior to seeing the clinician: Inflammatory Bowel Disease Questionnaire – 32 Items (IBDQ-32) [13, 14], Work Productivity and Activity Impairment – Specific Health Problem (WPAI-SHP), Patient-Reported Outcomes Measurement Information System (PROMIS) Global Health Scale, a patient global rating of disease severity, and a patient global rating of change in disease severity.

Clinicians completed the Partial Mayo Score at Visits 2 and 3 based on their assessment of patients' answers to the paper-pen Partial Mayo Symptom Diary and a clinical assessment after seeing the patient. This measure is highly correlated (0.71) with the full Mayo score, which includes flexible sigmoidoscopy. In addition, clinicians completed a clinician global rating of disease severity at each clinic visit, and a clinician global rating of changes in disease severity at clinic Visits 2 and 3.

Statistical analysis

Analyses were performed in accordance with a pre-specified statistical analysis plan. SAS version 9.2 was used for all statistical analyses, except the confirmatory factor analysis (conducted with Mplus) and the Rasch analysis (conducted with RUMM2030). Item-level analyses were evaluated using the "worst" day between (and inclusive of) Visit 1 and Visit 2, defined as the day with the worst rating on the patient global rating of disease severity. These analyses included measures of central tendency, floor and ceiling effects, and inter-item correlations. An item was flagged for potential problems if it showed a floor effect (minimum response > 25%) or ceiling effect (maximum response > 25%), or when the inter-item correlation was greater than 0.80. Confirmatory and exploratory factor analyses were performed to evaluate the structure of the measure and develop a scoring algorithm. Rasch analyses were conducted separately for each factor that consisted of a single dimension; items with fit residuals ≤ −3.0 or ≥ 3.0 were flagged for potential deletion.

After the items and scales were finalized, scores were evaluated for reliability and validity. Specifically, internal consistency reliability was assessed using Cronbach's alpha coefficient, with a target value of 0.7 indicating good internal consistency [20, 21]. Test-retest reliability was assessed between Day 1 and Day 7 among those reporting no change on the patient global rating of change in UC severity at Visit 2. Intraclass correlation coefficients (ICC) were computed, where ≥0.7 indicates adequate reproducibility [21, 22].

Score validity was assessed by examining correlations of the UC-PRO/SS with the Partial Mayo Score; IBDQ; WPAI-SHP; PROMIS measures of global physical health (GPH), global mental health (GMH), general health, and satisfaction with social role scores; worst pain; and patient and clinician global ratings of disease severity. The UC-PRO/SS was expected to be moderately-to-highly correlated (> 0.30) with Partial Mayo Scores and moderately correlated (0.30–0.50) with IBDQ scores, worst pain, PROMIS GPH, and patient-rated global ratings of disease severity, demonstrating convergent validity. Lower correlations (≤0.30) were anticipated between the UC-PRO/SS scales and PROMIS GMH and satisfaction scores, and WPAI-SHP scores (overall work impairment and activity impairment), as these concepts were thought to be more distal to the symptom experience.

Known-groups validity was examined to determine whether the UC-PRO/SS could distinguish between patients by disease severity, defined in three ways: 1) by Partial Mayo Scores (mild, score 0–4; moderate, score 5–7; severe, score ≥ 8); 2) by clinician-rated global assessment of disease severity; and 3) by patient-rated disease severity. Analysis of covariance models with baseline clinical measurement group as the main effect were used, adjusting for age and gender.

Results

Study samples

Demographics and clinical characteristics for the study samples by phase are shown in Table 1. The study subjects ranged in age from 21 to 80 years, representing a range of ethnicity, race, extent of disease, and disease severity (at baseline).

Table 1 Patient Demographic and Clinical Characteristics, by Study Phase*
Characteristics | Phase I: Qualitative Development and Content Validity (n = 57) | Phase II: Quantitative Score Reliability and Validity (n = 200)
Age in years, Mean (SD) [Range] | 44.1 (13.8) [21–77] | 45.7 (14.60) [21–80]
Gender, n (%) | |
  Female | 36 (63%) | 117 (59%)
Ethnicity, n (%) | |
  Hispanic or Latino | 6 (10%) | 42 (21%)
  Not Hispanic or Latino | 50 (88%) | 155 (78%)
  Missing | 1 (2%) | 3 (2%)
Race, n (%) | |
  American Indian or Alaska Native | 1 (2%) | 2 (1%)
  Asian | 1 (2%) | 11 (6%)
  Black or African American | 3 (5%) | 28 (14%)
  Native Hawaiian or other Pacific Islander | 1 (2%) | 2 (1%)
  White | 43 (75%) | 153 (77%)
  Other | 6 (10%) | 6 (3%)
  Missing | 2 (4%) | 0 (0%)
Extent of Disease, n (%) | |
  Ulcerative proctitis | 4 (7%) | 19 (10%)
  Proctosigmoiditis | 9 (16%) | 62 (31%)
  Left-sided colitis | 14 (25%) | 59 (30%)
  Extensive colitis | 2 (4%) | 18 (9%)
  Pancolitis | 25 (44%) | 42 (21%)
  Missing or unknown | 3 (5%) | 0 (0%)
Abbreviations: n number, SD standard deviation
Percents do not add to 100 due to rounding
Subjects able to choose more than one race

Phase I: Development and content validity

Findings from focus groups and individual interviews identified nine sign and symptom items covering bowel and abdominal symptoms. Important bowel-related symptoms from the perspective of the patient included frequency of bowel movements (BMs), consistency of BMs, the presence of blood, the presence of mucus, the urge/need to have a BM right away, and leakage/accidents. Key abdominal symptoms included pain in the stomach area, bloating, and gas. The symptoms that were most relevant during flare-ups included blood in BMs, frequency of BMs, consistency of BMs, and urge/need to have a BM right away. Patient descriptions of the symptoms they experienced during a flare were similar to the language they used to describe their everyday symptoms, just more severe and/or persistent. Patient descriptions of their symptom experience underline the variability not only within, but also between patients.

Additional details of the qualitative methods and results, along with evidence of saturation, are provided in the online Supplementary Material.

The final version, which was ready for quantitative testing, was a daily diary that comprised nine candidate symptom items covering all GI signs and symptoms identified by patients and confirmed by clinicians as relevant and important to the assessment of disease activity in UC. For number and consistency of bowel movements, response options were based on frequency. The number of bowel movements was queried on an 8-point scale with ranges considered reasonable and meaningful to patients and clinicians (0, 1–2, 3–4, 5–6, 7–9, 10–12, 13–17, 18–24, more than 24). The intent was to use quantitative data to evaluate these categories, with the possibility of combining and/or deleting categories, while maintaining a clinically meaningful and sensitive indicator of bowel movement frequency. For all other symptoms, response options were based on presence (yes/no) and severity or frequency of each, with scores ranging from 0 (none or not at all) to 4 (always or very severe).

Phase II: Reliability and validity

Item and factor analysis and scoring algorithm

Item-by-item descriptive statistics are shown in Table 2. Subjects used the full range of response options for each item, with the exception of the item concerning number of bowel movements; as anticipated, no study
participants reported zero BMs on their "worst" day between Visits 1 and 2. Six of nine items had a floor effect exceeding 25%, with about half or more of these outpatients reporting no leakage (62%), no mucus (53%), or no blood (47%) in their stool on their worst symptom day between Visit 1 and Visit 2, the one-week observation period. There were no ceiling effects.

Table 2 Item Descriptive Characteristics at Worst Day between Visit 1 and Visit 2 (N = 198)
Item | Mean (SD) | Range | Floor, n (%) | Ceiling, n (%) | Missing, n (%)
Number of bowel movements | 3.9 (1.60) | 1–8 | 0 (0.0%) | 6 (3.0%) | 0 (0.0%)
Number of liquid bowel movements | 1.8 (1.34) | 0–4 | 48 (24.2%) | 27 (13.6%) | 0 (0.0%)
Blood in bowel movements | 1.4 (1.48) | 0–4 | 92 (46.5%) | 23 (11.6%) | 0 (0.0%)
Mucus in bowel movements | 1.1 (1.36) | 0–4 | 104 (52.5%) | 14 (7.1%) | 0 (0.0%)
Leak before reaching toilet | 0.9 (1.21) | 0–4 | 124 (62.6%) | 6 (3.0%) | 0 (0.0%)
Passing gas | 2.2 (1.23) | 0–4 | 29 (14.6%) | 27 (13.6%) | 0 (0.0%)
Need to have bowel movement right away | 1.9 (1.35) | 0–4 | 55 (27.8%) | 20 (10.1%) | 0 (0.0%)
Pain in belly | 1.7 (1.25) | 0–4 | 51 (25.8%) | 14 (7.1%) | 0 (0.0%)
Bloating in belly | 1.5 (1.20) | 0–4 | 60 (30.3%) | 9 (4.5%) | 0 (0.0%)
Abbreviations: N number, SD standard deviation

Findings support a two-factor solution, with confirmatory factor analyses subsequently conducted to determine goodness-of-fit statistics. One factor represents "Bowel Signs and Symptoms" and includes six items (Comparative Fit Index [CFI] = 0.98, Root Mean Square Error of Approximation [RMSEA] = 0.068, Weighted Root Mean Square Residual [WRMR] = 0.563), while the other factor represents "Abdominal Symptoms" and includes three items (CFI = 1.0, RMSEA = 0.0, WRMR = 0.0). Rasch analysis indicated that all of the fit residuals for items in each of the two models fell within the acceptable range (≥ −3.0 and ≤ 3.0); however, several of the response categories were not ordered correctly, primarily due to very few responses for the "rarely" and "mild" categories.

Taking into consideration findings from both the qualitative and quantitative studies, several decisions were reached regarding the UC-PRO/SS. First, given that few subjects (n = 6, 3%) endorsed the response category "more than 24" for the item "number of BMs," this item response level was removed. Second, although a number of items demonstrated floor effects during the one-week observation, all were considered important from the perspective of the patient, based on qualitative studies, and clinically relevant, and all were retained. Finally, although the Rasch analyses suggested the number of response options for several items could be reduced from a 5- to a 4-point scale by combining responses, the distinction between "none" and "mild" and between "mild" and "moderate" was considered clinically important and the decision was made to retain the five-point scaling.

The final UC-PRO/SS assesses two important indicators of disease activity in UC: Bowel Signs and Symptoms (six items) and Abdominal Symptoms (three items), each scored as a simple mean across all items comprising the scale. There is no single total score that combines both scales.

Reliability

Adequate internal consistency was demonstrated with alpha coefficients of 0.80 for Bowel Signs and Symptoms and 0.66 for Abdominal Symptoms. Although findings indicate that the Cronbach's alpha for the domain of Abdominal Symptoms would increase to 0.79 with the deletion of "passing gas," the item was retained based on the importance of this symptom from the patient perspective and expert opinion. Seven-day test-retest reliability in stable patients (n = 77 reporting no change in symptoms between Day 1 and Day 7) was supported with ICC values of 0.81 and 0.71 for Bowel Signs and Symptoms and Abdominal Symptoms, respectively.

Validity

Correlations between the UC-PRO/SS domain scores and clinical and other relevant PRO measures are presented in Table 3. All relationships were confirmed based on a priori predictions, with both UC-PRO/SS scale scores demonstrating strong correlations with the Partial Mayo Score, IBDQ, worst pain, and patient and clinician global ratings of disease severity (convergent validity), and weaker correlations with measures of impact on daily life (discriminant validity).

The Bowel Signs and Symptoms and the Abdominal Symptoms scales were each able to differentiate patients by symptom severity (p < 0.0001) based on the Partial Mayo Score, patient global rating of disease severity, and clinician global rating of disease severity (Fig. 1). Scale scores for both the UC-PRO/SS scales increased (indicating worse symptoms) with increasing Partial Mayo scores (indicating higher disease severity). Similarly, UC-PRO/SS scale scores were higher among patients who had patient global ratings of disease severity scores ≥
median compared to those with scores below the median. Similar findings were demonstrated based on clinician global rating of disease severity. Known-group validity tables are included in the online Supplementary Material.

Table 3 Correlations between UC-PRO/SS Scores and Other Clinical Variables (a, b)
Clinical Variable | Bowel Signs and Symptoms Rating (p value) | Abdominal Symptoms Rating (p value)
Clinician Ratings: | |
  Partial Mayo Score | 0.79 (< 0.0001) | 0.45 (< 0.0001)
  Clinician Global Rating of Disease Severity | 0.69 (< 0.0001) | 0.44 (< 0.0001)
Patient Ratings: | |
  Patient Global Rating of Disease Severity | 0.67 (< 0.0001) | 0.52 (< 0.0001)
IBDQ | |
  Total score | −0.70 (< 0.0001) | −0.61 (< 0.0001)
  Bowel systems | −0.73 (< 0.0001) | −0.65 (< 0.0001)
  Emotional health | −0.61 (< 0.0001) | −0.53 (< 0.0001)
  Systemic systems | −0.51 (< 0.0001) | −0.53 (< 0.0001)
  Social function | −0.62 (< 0.0001) | −0.49 (< 0.0001)
WPAI-SHP | |
  Absenteeism | 0.35 (< 0.0001) | 0.22 (0.0107)
  Presenteeism | 0.63 (< 0.0001) | 0.45 (< 0.0001)
  Work productivity loss | 0.63 (< 0.0001) | 0.46 (< 0.0001)
  Activity impairment | 0.63 (< 0.0001) | 0.47 (< 0.0001)
PROMIS | |
  Global physical health | −0.21 (0.0031) | −0.28 (< 0.0001)
  Global mental health | −0.28 (< 0.0001) | −0.33 (< 0.0001)
  General health | −0.26 (0.0003) | −0.33 (< 0.0001)
  Satisfaction with social role | −0.25 (0.0006) | −0.28 (< 0.0001)
BPI–Worst Pain | 0.57 (< 0.0001) | 0.64 (< 0.0001)
Abbreviations: BM bowel movement, BPI Brief Pain Inventory, IBDQ Inflammatory Bowel Disease Questionnaire, PROMIS Patient Reported Outcomes Measurement Information System, UC-PRO/SS Ulcerative Colitis Patient-Reported Outcomes Signs and Symptoms, WPAI-SHP Work Productivity and Activity Impairment–Specific Health Problems
a Spearman's correlation coefficients
b Seven-day average scores used

Fig. 1 UC-PRO/SS Scores by Disease Activity

Discussion

The UC-PRO/SS measure was developed to standardize the quantification of GI signs and symptoms of UC in clinical trials through direct patient ratings. The methodology used to develop the UC-PRO/SS followed the US FDA Guidance on PRO instrument development, which conveys the agency's thinking on best practices for the development of measures and the evidence needed for the agency's evaluation. The UC-PRO/SS was developed based on data collection from concept elicitation and cognitive interviews of subjects with moderate to severe UC who were representative of the UC target population eligible for typical clinical trials. Measurement properties were tested in a four-week observational study of 200 adults with UC. The decision to retain or delete items for the final measure was an iterative process with consideration of floor and ceiling effects, results from the factor and Rasch analyses, previous qualitative results, and clinical considerations.

The psychometric evaluation study included patients with a history of very mild to moderately severe UC to capture responses from the full range of disease activity. The relatively large number of mild patients contributed to the floor effects observed across items and to the results of the Rasch analyses, which suggested little response distinction between "none" and "mild" or between "never" and "rarely." Given the importance of these response categories from a clinical perspective and to capture degrees of improvement in more severe patients, these response options were retained, with the understanding that further evaluation will be needed to confirm their suitability and utility across populations with severely active disease.

The final UC-PRO/SS includes two scales: Bowel Signs and Symptoms (six items) and Abdominal Symptoms (three items), each scored separately. Performance testing of the UC-PRO/SS demonstrated evidence of internal consistency and reproducibility. The UC-PRO/SS scale scores showed moderate to high correlations with other relevant measures identified a priori. In particular, the UC-PRO/SS Bowel Signs and Symptoms scale score was strongly correlated with the Partial Mayo Score (r = 0.79), IBDQ total score (r = −0.70), and IBDQ domain of bowel systems (r = −0.73). Both UC-PRO/SS scores also appear to have known-groups validity, with significant differences in scores between disease severity groups when defined by the Partial Mayo Score, patient global rating of disease severity, and the clinician global rating of disease severity.

Both scales of the UC-PRO/SS include multiple items to better capture the bowel and abdominal symptom experience of UC from the perspective of the patient, which allows for a more granular assessment of aspects of the disease that are relevant and important to patients. In clinical trials of therapies for UC, the UC-PRO/SS potentially can be used to collect data for a co-primary endpoint or a key secondary endpoint. Therapies targeting inflammation in induction studies could use an objective marker of inflammation (e.g., endoscopy, magnetic resonance enterography, fecal calprotectin) to assess the co-primary or primary endpoint, with the Bowel Signs and Symptoms module as the assessment of a co-primary or key secondary endpoint. Therapies expected to improve functional abdominal symptoms might use this module as the primary endpoint, while maintenance studies of anti-inflammatory therapies might use a co-primary endpoint of an objective marker of inflammation and the Bowel Signs and Symptoms and Abdominal Symptoms scales to demonstrate a long-term significant impact on multiple symptom domains important to patients.

Several limitations should be noted for this research. First, although all subjects included in the development and evaluation of the UC-PRO/SS were required to have clinician-confirmed UC based on biopsy, baseline endoscopy was not required for participation in the studies. Thus, it is unclear if subjects were experiencing active inflammation of the colon or rectal mucosa at the time of their participation. Second, the duration of the study was relatively short, limiting information on the variability in UC disease over time, and none of the participants experienced an acute flare-up of their condition, thus limiting the data on change, including worsening and improvement. Finally, this was an observational study and not an interventional clinical trial, precluding responsiveness analyses, including tests of sensitivity to change with treatment. Further research is needed to replicate the results presented here in new samples and to determine score sensitivity to change over time with flares and treatment.

Conclusions

In conclusion, the UC-PRO/SS is a daily diary to gather data on the GI signs and symptoms of UC directly from the patient. The instrument was developed to meet regulatory guidelines, with initial validation evidence suggesting that the UC-PRO/SS scores are reliable, valid, and ready for use and further testing in clinical trials. The UC-PRO/SS complements and extends information provided by the clinician, endoscopy, and biomarkers in clinical studies.

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Author details
1 University of Michigan, Ann Arbor, MI, USA. 2 Evidera, Bethesda, MD, USA. 3 Amgen Inc., Thousand Oaks, CA, USA. 4 University of Washington, Seattle, WA, USA. 5 Genentech Inc., South San Francisco, CA, USA. 6 Present address: Allergan Inc, Irvine, CA, USA. Present address: Office of New Drugs, CDER, Silver Spring, USA.

Received: 15 November 2017 Accepted: 24 April 2018

References
1. Irvine, E. J. (2008). Quality of life of patients with ulcerative colitis: Past, present, and future. Inflammatory Bowel Diseases, 14, 554–565.
2. Loftus, C. G., Loftus Jr., E. V., Harmsen, W. S., et al. (2007). Update on the incidence and prevalence of Crohn's disease and ulcerative colitis in Olmsted County, Minnesota, 1940–2000. Inflammatory Bowel Diseases, 13, 254–261.
3. Burisch, J., & Munkholm, P. (2013). Inflammatory bowel disease epidemiology. Current Opinion in Gastroenterology, 29, 357–362.
4. Waljee, A. K., Joyce, J. C., Wren, P. A., et al. (2009). Patient reported symptoms during an ulcerative colitis flare: A qualitative focus group study. European Journal of Gastroenterology & Hepatology, 21, 558–564.
5. Food and Drug Administration (FDA) (2009). Guidance for Industry: Patient-reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims. Available at: http://www.fda.gov/downloads/Drugs/Guidances/UCM193282.pdf.
17. Muthén, L. K., & Muthén, B. O. (1998–2010). Mplus user's guide (3rd ed.). Los Angeles: Muthén & Muthén.
18. Andrich, D., Lyne, A., Sheridan, B., et al. (2012). RUMM 2030: Rasch unidimensional measurement models. Australia: RUMM Laboratory Pty Ltd.
19. Rasch, G. (1980). Probabilistic models for some intelligence and attainment tests (Expanded ed.). Chicago: University of Chicago Press.
20. Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297–334.
21. Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). New York: McGraw-Hill.
22. Leidy, N. K., Revicki, D. A., & Geneste, B. (1999). Recommendations for evaluating the validity of quality of life claims for labeling and promotion. Value in Health, 2, 113–127.
23. Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale: Lawrence Erlbaum Associates Inc.
Fed Regist 74:65132–33. 6. Spiegel, B. M., Bolus, R., Agarwal, N., et al. (2010). Measuring symptoms in the irritable bowel syndrome: Development of a framework for clinical trials. Alimentary Pharmacology and Therapeutics, 32, 1275–1291. 7. European Medicines Agency (EMA) (July 2016). Guideline on the development of new medicinal products for the treatment of ulcerative colitis (Draft). Available at http://www.ema.europa.eu/docs/en_GB/ document_library/Scientific_guideline/2016/07/WC500211431.pdf. London, UK: European Medicines Agency. 8. FDA and CDER (August 2016). Ulcerative Colitis: Clinical Trial Endpoints Guidance for Industry. (Draft Guidance) Available at http://www.fda.gov/ downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ UCM515143.pdf Silver Spring, MD: Food and Drug Administration (FDA), Center for Drug Evaluation and Research (CDER). 9. FDA and CDER (January 2014). Guidance for Industry and FDA staff: Qualification Process for Drug Development Tools. Available at: http://www. fda.gov/downloads/drugs/guidancecomplianceregulatoryinformation/ guidances/ucm230597.pdf. Silver Spring, MD: Food & Drug Administration (FDA), Center for Drug Evaluation (CDER). 10. Lewis, J. D., Chuai, S., Nessel, L., et al. (2008). Use of the noninvasive components of the Mayo score to assess clinical response in ulcerative colitis. Inflammatory Bowel Diseases, 14, 1660–1666. 11. Leidy, N. K., & Vernon, M. (2008). Perspectives on patient-reported outcomes: Content validity and qualitative research in a changing clinical trial environment. PharmacoEconomics, 26, 363–370. 12. Farrar, J. T., Young Jr., J. P., LaMoreaux, L., et al. (2001). Clinical importance of changes in chronic pain intensity measured on an 11-point numerical pain rating scale. Pain, 94, 149–158. 13. Guyatt, G., Mitchell, A., Irvine, E. J., et al. (1989). A new measure of health status for clinical trials in inflammatory bowel disease. Gastroenterology, 96, 804–810. 14. Irvine, E. J. (1999). 
Development and subsequent refinement of the inflammatory bowel disease questionnaire: A quality-of-life instrument for adult patients with inflammatory bowel disease. Journal of Pediatric Gastroenterology and Nutrition, 28, S23–S27. 15. Reilly, M. C., Zbrozek, A. S., & Dukes, E. M. (1993). The validity and reproducibility of a work productivity and activity impairment instrument. PharmacoEconomics, 4, 353–365. 16. Hays, R. D., Bjorner, J. B., Revicki, D. A., et al. (2009). Development of physical and mental health summary scores from the patient-reported outcomes measurement information system (PROMIS) global items. Quality of Life Research, 18, 873–880.
Journal of Patient-Reported Outcomes – Springer Journals
Published: May 30, 2018