

Enriching context descriptions for enhanced LA scalability: a case study

Correspondence: Samuelsen@uib.no. Centre for the Science of Learning & Technology, University of Bergen, P.O. Box 7807, 5020 Bergen, Norway; Department of Information Science & Media Studies, University of Bergen, P.O. Box 7802, 5020 Bergen, Norway. Full list of author information is available at the end of the article.

Abstract

Learning analytics (LA) is a field that examines data about learners and their context, for understanding and optimizing learning and the environments in which it occurs. Integration of multiple data sources, an important dimension of scalability, has the potential to provide rich insights within LA. Using a common standard such as the Experience API (xAPI) to describe learning activity data across multiple sources can alleviate obstacles to data integration. Despite their potential, however, research indicates that standards are seldom used for integration of multiple sources in LA. Our research aims to understand and address the challenges of using current learning activity data standards for describing learning context with regard to interoperability and data integration. In this paper, we present the results of an exploratory case study involving in-depth interviews with stakeholders who have used xAPI in a real-world project. Based on the subsequent thematic analysis of the interviews, and an examination of xAPI, we identified challenges and limitations in describing learning context data, and developed recommendations (provided in this paper in summarized form) for enriching context descriptions and enhancing the expressibility of xAPI. By situating the research in a real-world setting, our work also contributes to bridging the gap between the academic community and practitioners with regard to learning activity data standards and scalability, focusing on the description of learning context.

Keywords: Learning context, Learning activity data specification, xAPI, Scalability, Interoperability, Data integration, Learning analytics

Introduction

A diversity of digital tools is used within education. They may, for instance, facilitate exam taking (e.g., an exam system), store student demographic and result data (e.g., a student information system), or make available lecture notes and videos (e.g., a learning management system [LMS]). When a student uses such systems, digital trace data may be generated and saved in individual data sources. These data can be used to gain insight into the student and their learning. Learning analytics (LA) is "the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs" (Siemens, 2011).
Within LA, "researchers have suggested that the true potential to offer meaningful insight comes from combining data from across different sources" (Bakharia, Kitto, Pardo, Gašević, & Dawson, 2016, p. 378). Data integration, the combination of data from different sources, also plays an important role in the scalability of LA (Samuelsen, Chen, & Wasson, 2019).

Throughout their various activities, learners are situated in different contexts. They move within physical space, at varying times of the day, using different tools on separate devices, leading to data being generated through different sensors. A context is defined as "any information that can be used to characterize the situation of an entity" (Dey, 2001, p. 5), and an entity can be a person, a place, or an object (Dey, 2001). To integrate data from different sources in LA, it is crucial to take the context of the data into account. Doing so can benefit interoperability, may be used to personalize learning for the individual learner, and can enable better querying and reporting of the data.

Data integration is closely related to interoperability, which involves semantic, technical, legal, and organizational levels (European Commission, 2017). Concerning technical and semantic interoperability, two well-known data specifications (de facto industry standards) target the educational domain and LA, namely the Experience API (xAPI; Advanced Distributed Learning, 2017a) and IMS Caliper Analytics (2020). Both specifications enable the exchange and integration of learning activity data originating from different tools and data sources, where an individual activity record describes a learner interacting with a learning object in a learning environment (modelled with the most basic structure of "actor verb object"; a minimal example statement is sketched below). The activity data can subsequently be stored in a Learning Record Store (LRS). Both specifications also provide mechanisms for adding vocabularies, through profiles, which can help structure the activity data and add semantics. Current profile specifications are expressed in the JSON for Linking Data (JSON-LD; https://json-ld.org/) format, which builds on JSON and semantic technologies to enable machine-readable data definitions. For xAPI, any community of practice can create a new profile, while for Caliper only organizations that are members of IMS may contribute to profiles (and other parts of the specification). As the latter may suggest, the usage of xAPI is generally more flexible than that of Caliper (Griffiths & Hoel, 2016). For a detailed comparison of xAPI and Caliper, please refer to Griffiths and Hoel (2016).

Despite the availability of these learning activity data specifications, previous research (Samuelsen et al., 2019) found that they are not widely used for integration of data coming from multiple sources for LA in the context of higher education; a few examples of xAPI use were found, while no examples of Caliper use were found.
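For reference, the following listing sketches a minimal xAPI statement illustrating the "actor verb object" structure; the learner and activity identifiers are hypothetical, while the verb and activity type IRIs shown are of the kind published by Advanced Distributed Learning.

    {
      "actor": {
        "objectType": "Agent",
        "name": "Example Learner",
        "mbox": "mailto:learner@example.org"
      },
      "verb": {
        "id": "http://adlnet.gov/expapi/verbs/answered",
        "display": { "en-US": "answered" }
      },
      "object": {
        "objectType": "Activity",
        "id": "http://example.org/math/quizzes/fractions-1/item-3",
        "definition": {
          "name": { "en-US": "Fractions quiz, item 3" },
          "type": "http://adlnet.gov/expapi/activities/cmi.interaction"
        }
      },
      "timestamp": "2019-10-22T10:15:00+02:00"
    }

A statement like this, once stored in an LRS, can later be retrieved and combined with statements from other tools, provided the vocabulary used across the tools is consistent.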
Thus, it should be of interest for researchers and practitioners in LA to know why there seems to be so little use of learning activity data specifications, and to understand the challenges and limitations of using the existing specifications. This paper reports on an exploratory case study where we look at the challenges and limitations of using a current learning activity data standard (i.e., xAPI) for describing the learning context. While previous research has identified some of the challenges and limitations of xAPI (Bakharia et al., 2016; Betts & Smith, 2019; Keehn & Claggett, 2019), to our knowledge no studies have systematically collected and analyzed data from stakeholders who have used xAPI in a real-world case and identified the gaps between xAPI and the needs of stakeholders with regard to interoperability and integration of learning activity data. Thus, we aim to contribute to the knowledge base through the systematic collection, analysis, and identification of xAPI gaps and needs with regard to interoperability and data integration as they have been experienced by stakeholders in a real-world case.

The case is the Activity data for Assessment and Adaptation (AVT) project (Morlandstø, Hansen, Wasson, & Bull, 2019), a Norwegian project exploring the use of activity data coming from multiple sources to adapt learning to individual learner needs and for use in assessment. AVT used the xAPI data specification for describing student activity data originating from different sources. Through in-depth interviews with AVT stakeholders with varying perspectives, and inspection of the xAPI specification (Advanced Distributed Learning, 2017a) and the xAPI profiles specification (Advanced Distributed Learning, 2018a), we identified challenges and limitations of xAPI, focusing on learning context description. Based on the identified challenges and limitations, we provide recommendations on how xAPI can be improved to enhance its expressibility (meaning it should be possible to describe data in a consistent way across data sources) in order to better support interoperability, data integration and, consequently, scalability. This paper answers the following research questions:

RQ1: Focusing on descriptions of xAPI context, what are the gaps and needs regarding interoperability and data integration?

RQ2: How should the identified gaps and needs be addressed in order to provide improved interoperability and data integration?

Background

In this section, we first look at several data models that attempt to formalize context. Next, we examine the constructs currently available in xAPI that enable the description of context. We conclude with a comparison of the context data models and xAPI.

Context Data Models

Jovanović, Gašević, Knight, and Richards (2007) developed an ontology-based framework, Learning Object Context Ontologies (LOCO), to formalize and record context related to learning objects (i.e., digital artifacts used for learning on digital platforms). Learning objects consist of learning content and are assigned to learning activities to achieve learning objectives. The LOCO framework integrates several ontologies, e.g., for learning object content structure and user modelling.
The learning object context that can be recorded, i.e., metadata originating from a learning process, includes information about the learning object domain, the learning situation, and the learner. Two tools were developed based on the LOCO framework. The first tool, LOCO-Analyst, can generate feedback for instructors based on analysis of context data collected from an online learning environment (e.g., an LMS). The second tool, TANGRAM, targets the learners and is a "Web-based application for personalized learning in the area of Intelligent information systems" (Jovanović et al., 2007, p. 57). It personalizes the assembly of learning content.

Schmitz, Wolpers, Kirschenmann, and Niemann (2011) detail a framework for collecting and analyzing contextual attention metadata (CAM) from digital environments. CAM expresses the data selection behaviors of users. The authors developed a schema to represent CAM that allows for the registration of aspects such as which data objects (e.g., file, video, email message) capture the attention of users, what actions are performed on the objects (e.g., a file was opened), and what the context of use was when a user interacted with an object (e.g., time, location). To enable the collection of CAM records, the approach is to add file system/application wrappers, thereby transforming the original data format into the CAM format in XML. The CAM schema is semi-structured for some properties, such as context. The context property is a container for data of varying types, i.e., an arbitrary number of key-value pairs can be stored within this container. The authors note that while the semi-structured approach is flexible and allows for registering different types of data, it also creates challenges for exchanging data, because data can be described in different ways. They state that an alternative would be to import different metadata schemas, which could be used to structure the different types of data. Such an approach would rely on pre-defined schemas, e.g., from FOAF (http://xmlns.com/foaf/spec/) and Dublin Core (https://dublincore.org/specifications/dublin-core/). To avoid redundancy of stored metadata, the authors describe a tentative approach where metadata are stored as triple representations (subject, predicate, object), and where pointers are added to other metadata descriptions. Regarding CAM, one prototype was implemented to collect, analyze, and visualize user communication. Different metadata were extracted and transformed into the CAM format, providing the basis for visualizing the social network of the user, including the type of communication that took place and the user's communication behavior. Another prototype using CAM was developed for an online learning environment. Here, data object metadata (e.g., number of object uses) were utilized for user recommendations. Usage and behavior data were also visualized for the individual user, adding the potential for providing metacognitive support.

The learning context project (Thüs et al., 2012) recognizes that devices such as mobile phones and tablets, which contain a diverse set of sensors, have possibilities for recording context. The project has developed the Learning Context Data Model (LCDM), a suggested standard to represent context data and enable increased interoperability and reusability of context models. The data model considers learners and events, where an event is categorized at either a higher or a lower level.
Available high-level categories are activity (e.g., writing a paper), environmental (e.g., noise level, location), and biological (e.g., heart rate level, level of motivation) (Muslim, Chatti, Mahapatra, & Schroeder, 2016). There are a limited number of low-level categories, and for each, the data model specifies required and recommended inputs. Context can be broadly categorized as extrinsic or intrinsic. Extrinsic context is related to the user's current environment, while intrinsic context relates to the user's internal state, such as knowledge, concentration, or motivational level (Thüs et al., 2012). The LCDM allows for the registration of both extrinsic and intrinsic context events. It can also register user interests and platforms, which specify where an event was captured, e.g., on a mobile phone. In addition to the data model, the learning context project provides an API that enables storage and retrieval of the context-related information, and visualizations for the collected data. For instance, one visualization shows how learner interests evolve over time, something that may enable self-reflection (Thüs, Chatti, Brandt, & Schroeder, 2015).

Lincke (2020) describes an approach for context modelling in her PhD dissertation, where she has developed a rich context model (RCM). The RCM approach models the user context according to specific context dimensions, each relating to a given application domain. The RCM was designed for generalizability in terms of application domains; thus, it can be utilized for different domains by providing separate configurations for the individual domains, removing the need to change the core of the model when adding new domains. The configurations can specify aspects such as expected data/data types and database configuration. In the research, context was modelled for different application domains, such as mobile learning, LA, and recommender systems. For instance, in the mobile learning application domain, dimensions were modelled for environment, device, and personal context. The environment context could include contextual information such as location, weather conditions, and nearby places; the device context could include information such as screen size, battery level, and Internet connectivity; and the personal context could include information such as demographics, courses, interests, and preferences. In the dissertation by Lincke (2020), much emphasis is placed on data analysis, especially with regard to recommending items relevant to the user's current situation, thus offering personalization/contextualization to the user. Analysis of context data, with resulting recommendations, has been implemented in tools within the mobile learning application domain. Data analysis results were visualized in mobile learning and several other application domains.

Context in xAPI

xAPI statements are made up of the most basic building blocks of "actor verb object" (see Fig. 1 for more information on available properties and structures). An xAPI activity is a type of object that an actor has interacted with. Used together with a verb, the activity may represent a unit of instruction, performance, or experience.
The interpretation of an activity is broad, meaning this concept can be used to represent not only virtual objects, but also physical/tangible objects (Advanced Distributed Learning, 2017b). The xAPI specification (currently in version 1.0.3), expressed in the JSON format, is quite flexible. For instance, users are free to define new verbs and activity types (an activity is an instance of an activity type) for use in statements, ideally publishing these vocabulary concepts in profiles shared with relevant communities of practice. Additionally, a number of the expected value formats have a flexible structure (e.g., JSON objects may contain an arbitrary number of properties with varying levels of nesting). Finally, a number of structures/properties that can be used for data description are optional.

In xAPI statements, the context structure is an optional structure that allows us to register context data. Since xAPI is a standard for learning activity data, context is related to the learner as they interact with a learning object in a (typically) digital environment. The context structure is on the same level in a statement as the actor, verb, and object structures. Another structure on this level, which also allows for registration of context data related to learning activity data, is the result structure, which "represents a measured outcome related to the Statement in which it is included" (Advanced Distributed Learning, 2017c); it may contain information on score, duration, response, success, completion, and other relevant (user-defined) attributes.

Fig. 1 xAPI statement (Vidal, Rabelo, & Lama, 2015)

There are nine properties that can be used within the xAPI context structure (see Table 1).

Table 1 Context structure properties (Advanced Distributed Learning, 2017c)

Property | Type | Description | Required
registration | UUID | The registration that the Statement is associated with. | Optional
instructor | Agent (MAY be a Group) | Instructor that the Statement relates to, if not included as the Actor of the Statement. | Optional
team | Group | Team that this Statement relates to, if not included as the Actor of the Statement. | Optional
contextActivities | contextActivities Object | A map of the types of learning activity context that this Statement is related to. Valid context types are: "parent", "grouping", "category" and "other". | Optional
revision | String | Revision of the learning activity associated with this Statement. Format is free. | Optional
platform | String | Platform used in the experience of this learning activity. | Optional
language | String (as defined in RFC 5646) | Code representing the language in which the experience being recorded in this Statement (mainly) occurred, if applicable and known. | Optional
statement | Statement Reference | Another Statement to be considered as context for this Statement. | Optional
extensions | Object | A map of any other domain-specific context relevant to this Statement. For example, in a flight simulator altitude, airspeed, wind, attitude, and GPS coordinates might all be relevant. | Optional
Seven of these properties are defined with keys that require a single value or an object representing a single entity, including registration (the value format is a UUID), instructor (the value is an agent, stored in a JSON object), statement (the value is another xAPI statement considered relevant to this one, stored in a JSON object), and team (the value is an xAPI group, stored in a JSON object). The properties language, platform, and revision (of a learning activity) all require a string as their value (Advanced Distributed Learning, 2017c). Context information that does not suit these seven single-value properties, and that is not related to a measured outcome (i.e., result), can be described with ContextActivities and extensions.

ContextActivities let us specify "a map of the types of learning activity context that this Statement is related to" (Advanced Distributed Learning, 2017c). The available context types are parent, grouping, category, and other. The parent structure is used to specify the parent(s) of the object activity of a statement. For instance, a quiz would be the parent if the object of a statement was a quiz question. Grouping can be used to specify activities with an indirect relation to the object activity of a statement. For instance, a qualification has an indirect relation to a class and can therefore be specified using the grouping structure. Category is used to add activities that categorize/tag a statement; the only example given in the xAPI specification is that the xAPI profile used when generating statements can be specified using category. The context type other can be used for activities that do not fit any of the parent, grouping, or category context types. The example given in the xAPI specification is that an actor studies a textbook for an exam, where the exam is stated to belong to the context type other.

Extensions, like ContextActivities, are organized in maps. They should include domain-specific information that is not covered by the other context properties. The map keys for extensions must be Internationalized Resource Identifiers (IRIs); the map values can be any valid JSON data structure, such as a string, array, or object. Thus, using extensions to express context information in an xAPI statement is more flexible than using ContextActivities. As such, the advice in the specification regarding interoperability is that built-in xAPI elements should be preferred to extensions for storing information, if available (Advanced Distributed Learning, 2017c). The xAPI specification gives the example of an actor using a flight simulator, where altitude, wind, and GPS coordinates can be expressed using extensions. Since xAPI allows for registration of such a diversity of context-related information, through both the context structure and the result structure, data described in xAPI may for instance be used for personalization, visualization, assessment, and prediction.
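To illustrate how these properties combine, the listing below sketches a possible context structure for a flight-simulator statement similar to the example in the specification; all course, programme, and extension IRIs are hypothetical.

    "context": {
      "registration": "ec531277-b57b-4c15-8d91-d292c5b2b8f7",
      "platform": "Example flight simulator",
      "language": "en-US",
      "contextActivities": {
        "parent": [
          { "id": "http://example.org/courses/instrument-flight-101" }
        ],
        "grouping": [
          { "id": "http://example.org/programmes/pilot-training" }
        ]
      },
      "extensions": {
        "http://example.org/xapi/extensions/altitude": 3000,
        "http://example.org/xapi/extensions/airspeed": 120,
        "http://example.org/xapi/extensions/gps-coordinates": "60.3913,5.3221"
      }
    }

Note how the single-value properties (registration, platform, language), the contextActivities map, and the extensions map each serve a different role; this is exactly the flexibility, and the potential for inconsistency, discussed in the remainder of this paper.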
Comparing the different context data models to xAPI

Having examined context in xAPI, we now look at its similarities to and differences from the previous research on context data models (see Table 2). xAPI collects data regarding the learners/agents and their activities. Of the context data models presented above, all except the LOCO model also have the learner (or user) as their unit of focus. LOCO, however, focuses on learning objects (in xAPI, learning object information would be represented in the object of the xAPI statement, rather than in the context).

In terms of flexibility, xAPI is quite flexible regarding data registration, similar to CAM. The other data models appear generally to be more rigid, for example due to stricter specification of available properties and data types.

Table 2 Context data model and xAPI comparison

Name | Unit of focus | Flexibility of data registration | Categorization of context | Interoperability | Usage
LOCO | Learning object | More rigid | No categorization | Targets interoperability | Personalization, examine object use
CAM | Learner | Flexible | No categorization | Targets interoperability | Personalization, visualization, examine object use
LCDM | Learner | More rigid | Two-level categorization (high/low level) | Targets interoperability | Visualization
RCM | Learner | More rigid | Two-level categorization (high/low level) | Does not target interoperability | Personalization/contextualization, visualization
xAPI | Learner/agent | Flexible | No categorization | Targets interoperability | Personalization, visualization, examine object use, assessment, prediction, etc.

Concerning classification of context, the LCDM has capabilities for two-level categorization of events, making a distinction between high-level and low-level categories, for example the high-level category "environment" and the low-level category "noise level." The RCM approach, similar to LCDM, suggests both high-level categorization of context (in the form of context dimensions) and low-level categorization (information belonging to the separate context dimensions). In contrast, xAPI and the other data models do not provide this type of classification of context.

Interoperability can be enabled in varying degrees through the usage of common data models/specifications, depending on how they are used/defined. As such, interoperability is a stated aim for all the data models except RCM. While the RCM approach is used for data analysis with regard to personalization and contextualization, it does not specifically target interoperability. Instead of using a standardized approach, such as requiring terms to be chosen from pre-established vocabularies when generalizing the RCM to a new application domain, the configuration is done on an ad hoc basis for each domain added (e.g., for each new domain, the data format must be specified).

Concerning the use of the context data models, there seems to be an emphasis on personalization (e.g., providing recommendations) and visualization. CAM and LOCO are also used to examine the use of learning/data objects. While data from xAPI may be used for personalization, visualization, and examining object use (data related to learning object use are typically stored in the object, i.e., the activity, of an xAPI statement), it also provides structures for describing data that cannot be described with the context data models (e.g., information about learner results). Thus, provided the xAPI data are sufficiently described, they may also be used for other purposes, such as assessment and prediction.

AVT project—the case

The case study examined the use of xAPI in the AVT project (Morlandstø et al., 2019; Wasson, Morlandstø, & Hansen, 2019), which ran from August 2017 to May 2019. We chose AVT as a case study subject because it was a real-world project that used a learning activity data standard (i.e., xAPI) for describing data originating from multiple sources, thereby having the potential to uncover challenges and
limitations of data description and integration as they unfold in practice. AVT, owned and funded by the Norwegian Association of Local and Regional Authorities (KS), was initiated by the Educational Authority in the Municipality of Oslo (Utdanningsetaten), and the Centre for the Science of Learning & Technology (SLATE), University of Bergen, was responsible for research and for leading the project. In addition, the project group comprised 9 vendors from the Norwegian EdTech sector and 4 of the schools in Oslo. The project consulted representatives of the Learning Committee (Læringskomiteen SN/K 186) under Standards Norway, the organization responsible for the majority of standardization work in Norway (Standards Norway, 2019), and representatives from the Norwegian Directorate for Education and Training (Utdanningsdirektoratet), as well as taking feedback from the Norwegian Data Protection Authority (Datatilsynet), the Norwegian Competition Authority (Konkurransetilsynet), and representatives from the parent organization for schools (Foreldreutvalget for grunnskolen) and the student organization (Elevorganisasjonen).

The AVT project explored possibilities for using activity data to adapt learning to individual learner needs, and for formative and summative assessment at the K–12 level. Since learners generate activity data in a number of tools from different vendors, a challenge is how to integrate such data to provide richer information on the activities of each individual learner. Therefore, AVT looked at data sharing among different EdTech vendors, resulting in the implementation of a framework that helped to standardize data originating from different educational tools and systems, and that enables secure data flow among vendors. xAPI was the chosen format for data description, integration, and exchange. To enable more consistent use of xAPI, the project used a number of concepts from a vocabulary that had been adapted and translated to Norwegian by Læringskomiteen SN/K 186, as they work with learning technology and e-learning (Standards Norway, 2020). SN/K 186 also participates in projects that develop artifacts based on standardization initiatives, such as AVT.

Method

This research adopted an exploratory case study methodology (Oates, 2006), see Fig. 2, using AVT as a real-world case. The subject of investigation was the challenges and limitations of using a current learning activity data standard (i.e., xAPI) for describing learning context with regard to interoperability and data integration, and how these might be addressed. Consequently, we involved stakeholders at the different stages of the research process. This paper addresses the first four steps of Fig. 2.

Initially, we studied the AVT project documents, prepared interview guides, did sampling and recruitment of participants, and prepared consent forms. Next, we interviewed stakeholders from the AVT project about the gaps and needs of xAPI, with emphasis on descriptions of context. To provide important background information, we also asked about the rationale for choosing xAPI and the process of using xAPI for data integration and data sharing. Using thematic data analysis, we then identified themes emerging from the interview data.
Subsequently, based on the interview data and inspection of the xAPI and xAPI profile specifications, we formulated recommendations for how xAPI context can be improved regarding interoperability and data integration. In this paper, we provide a summary of the recommendations. In future work, we will provide a detailed account of the recommendations, implement a number of them in two separate projects, and validate them through stakeholder examination and interviews.

Fig. 2 Research methodology

Participants

The selection of participants was done through purposive sampling (Bryman, 2012, p. 418). Purposive sampling is not done at random, but rather in a strategic way (Bryman, 2012). The point is to select participants who are appropriate for the research questions. Variety in the sample may be important, meaning that the sample members differ in characteristics relevant to the research questions.

Eight stakeholders in the AVT project were recruited for the interviews to identify gaps and needs of xAPI (see Table 3). The first seven interviews were conducted between October 22 and November 1, 2019. Through these seven interviews, it became clear that our sample was missing an important AVT member related to the questions we wished to answer; thus, an additional interview was conducted on February 7, 2020. The participants worked on a diverse set of tasks within the AVT project, and their roles represented different perspectives. Two participants represented a developer perspective (i.e., they had experience with implementation of xAPI methods and preparation of datasets), three participants represented a leader perspective (two related to decision making for AVT; one was the leader of an external organization associated with AVT), two participants had a vendor perspective (one of these vendors had delivered data to AVT and the other had not), and there were also two technical advisors in the sample. The advisors were knowledgeable about standardization within the educational domain and advised the rest of the project on how to use xAPI for describing activity data and context; they also made some examples of xAPI statements describing activity data related to AVT, which the developers subsequently followed/used as templates.
Data collection

Data collection was conducted using semi-structured interviews, where the objective was to find answers to the following overarching questions:

– What was the rationale for choosing xAPI in the AVT project?
– What was the process of using xAPI for data integration and data sharing?
– What challenges and limitations of xAPI were identified when describing context for data integration?

Table 3 Participants, sorted by interview order

Identifier | Gender | Perspective | Tasks (sample)
P1 | Female | Leader | Conducting meetings, delivery of documents, some technical work
P2 | Male | Developer | Technical tasks within AVT (e.g., server, database, data sharing, security); contributed to xAPI example statements
P3 | Male | Leader | School owner representative; specifying how vendors should code xAPI activity data
P4 | Male | Technical advisor | Vocabulary/profile work; detailed work on how to represent context for AVT activity data; created xAPI example statements
P5 | Male | Vendor (delivered data), developer | Planning and implementation of vendor solution
P6 | Female | Vendor (did not deliver data) | Project leadership and coordination for vendor
P7 | Female | Leader (external organization associated with AVT) | Conducted meetings where a number of AVT members participated, which fed into the AVT project; work related to vocabularies and their use in Norway
P8 | Male | Technical advisor | Vocabulary/profile work; detailed work on how to represent context for AVT activity data; participated in developing xAPI example statements; explored and informed vendors about tools and libraries for storage and exchange of activity data

Interview guides were developed based on the study of several documents, including the final report for the AVT project (Morlandstø et al., 2019), the xAPI specification (Advanced Distributed Learning, 2017a), and the xAPI profile specification (Advanced Distributed Learning, 2018a). The interview guides contained a list of topics to be covered in the individual interviews, which can be broadly categorized as follows:

– Background (e.g., regarding the role/tasks of the participant in the AVT project, and their previous experience with xAPI)
– Leadership (e.g., related to reasons for choosing xAPI for the AVT project, and other decisions made within the project)
– Technical development (e.g., practical experiences of describing data in xAPI)
– Context (details about how context was represented using xAPI, and reasons)
– High-level topics (e.g., benefits and challenges of using learning activity data specifications for data integration)

The participants were interviewed according to their perspectives, roles, the tasks they had worked on, and their areas of competence. Each topic in the interview guide was covered by at least two participants. Before the interviews started, all participants were presented with a consent form explaining aspects such as the purpose of the project, that audio would be recorded for subsequent transcription, that participant information would be anonymized upon transcription and stored securely, and that their participation was voluntary and could be withdrawn at any time. Because the audio recordings could theoretically be used to identify the participants, the project was reported to the Norwegian Centre for Research Data (2020), which approved the project based on the measures taken concerning privacy and research ethics.
Participants were informed that the interviews could take up to 90 min, although most interviews finished in less than an hour. All interviews were conducted in Norwegian.

Data analysis

The study used a thematic analysis approach (Bryman, 2012) to analyze the transcribed interview data. The transcripts were collated and read through several times for familiarity with the content. The interview data were coded at two levels, using NVivo (2020). First, the data were coded according to our overarching questions; next, the data for each overarching question were coded at a more fine-grained level, and themes were identified through an inductive process.

Findings

The analysis resulted in 32 codes during the second level of coding, which were further aggregated into seven themes, each pertaining to an overarching question (see Fig. 3).

Fig. 3 Questions and themes

The findings in each theme are summarized below. All quotations in this section have been translated from Norwegian to English.

Rationale for choosing xAPI

Open, flexible, and mature specification

A number of the responses given by the participants indicated that xAPI was chosen by AVT because it was open, flexible, and mature. P1 and P3 (representing a leadership perspective) explained that at the time of making the decision, xAPI had already been chosen as the preferred learning activity data specification for Norway by Standards Norway and the SN/K 186 committee, and this was the main reason that AVT chose to use xAPI. Another reason, mentioned by P1, was the openness and flexibility of xAPI.

Since the choice of xAPI by SN/K 186 was the main reason that AVT decided on xAPI, the rationale for choosing xAPI by SN/K 186 was also of interest during the interviews. As P3, P7, and P8 had actively taken part in or observed the choice of xAPI by Standards Norway and SN/K 186, they explained the reasons for the committee's choice. Openness and flexibility were mentioned by all three participants. P3 and P7 specifically remarked on the possibility of customizing xAPI for a particular use case/project (e.g., through profiles or extensions). P8 stated that while IMS Caliper was an alternative at the time of the SN/K 186 decision, it seemed less mature than xAPI. In particular, xAPI had more extensive documentation than Caliper. Two other reasons for choosing xAPI over Caliper, mentioned by P8, were that Caliper is more adjusted to the US educational system and that it is the vendors of the big EdTech systems that have the greatest influence on Caliper. Looking at the IMS members, it is clear that the majority are US companies and institutions (IMS Global, 2020). Griffiths and Hoel (2016) confirm that it is the member organizations of IMS that influence the use cases that Caliper can describe, and that the specification seems to target the needs of larger vendors and institutions. Interestingly, P7 explained that SN/K 186 had not taken a definite stand that they would only use xAPI, but that they would try out the specification.

Familiarity

Another reason for choosing xAPI for the AVT project, as revealed by P1, was that several project members were already familiar with the specification. P4 and P5 confirmed that they had both used the specification in their work for EdTech vendors (P4 functioned as an advisor in the AVT project).
The AVT project members also had some knowledge about other Norwegian users of xAPI. P7 and P8 were aware of a smaller EdTech vendor that was using xAPI. P7 mentioned that a large higher education institution in Norway had experimented with the specification.

Process of using xAPI for data integration and data sharing

Practical experimentation and learning by trial and error

Using xAPI for data integration was very much a process of practical experimentation and trial and error, as P3 explained. As a starting point, the project used data in xAPI format from math tests in Oslo municipality (P2 converted the data to xAPI format). Having access to these data, a group including the technical advisors (P4 and P8) and P2 made one simple and one advanced example statement (more examples were later added), as explained by P3. P1 and P3 stated that the vendors were asked to use the examples as templates or rules for how to construct xAPI statements. The examples used concepts (e.g., verbs) from the xAPI vocabulary, i.e., the collective vocabulary defined in the published xAPI profiles (Advanced Distributed Learning, 2020b), some of which had been translated to Norwegian by SN/K 186. Other concepts used in the examples, which were not available in the xAPI vocabulary, were defined in a separate xAPI profile. According to P4, the purpose of following the examples was to ensure more uniform data descriptions, thus making data more easily integrable. For storage of xAPI statements, vendors were encouraged to implement their own LRS, which would accept queries in a specific format and return statements. Concerning data sharing, a prototype was developed that could be queried for student data. The prototype had an LRS component (storing data from the Oslo tests), access control (to allow secure data sharing from other LRSs, e.g., from vendors), and a limited user interface that could display some data about students, as explained by P2.

Initially, a number of the participating vendors in the AVT project indicated a willingness to share data. To make it easier for vendors to share data, and thus to get more data for the project, the requirements for how statements should be formulated were eventually eased, as explained by P1. While P6's company did not manage to deliver data in time, she did state that this decision would have allowed her company to deliver data eventually, as the data generated by their tool were at a higher level compared to the xAPI examples they were given. Also, P5's company, which did deliver data, ended up delivering data in a format that differed slightly from the examples. This was due to internals of their application, since some generated data did not fit into the built-in structures of xAPI. As P1 explained about the benefits and drawbacks of easing the requirements for the statements: "It gives advantages since we might get some additional data, but it gives drawbacks related to subsequent analysis. Because the data consistency, i.e., the quality of the entire data set, may not be as good. So, we have to weigh [drawbacks and benefits of] this the entire time." Still, at the end of the AVT project, only one vendor managed to share data in xAPI format. The data shared by the vendor were technically integrated with the Oslo municipality test data, but there was no individual student whose data appeared in both data sources (as stated by P1).
Thus, more could have been said about data integration if more data had been successfully delivered.

Expert-driven technical process

The AVT work on using xAPI for data integration and data sharing was very much driven by the technical advisors (P4 and P8), who were knowledgeable about standardization and xAPI. When asked in depth about how the project chose to describe context using xAPI, and why that decision was made, those participants who had a leadership perspective in AVT (P1 and P3), and P2 (developer), all said to rather ask the advisors P4 and P8. As P3 put it: "P4 and P8 are the two masterminds behind the very technical parts of this [project]." He stated that the project relied heavily on P4 and P8 in the area of modelling context and that the other AVT members trusted their recommendations. P4 and P8 contributed greatly to the xAPI profile and to the example xAPI statements. In addition, P8 worked on specifying vendor LRS requirements. When asked if it was clear how to construct xAPI statements in terms of data integration, P4 stated that he had hoped more vendors would have shown an interest in how the statement examples were formulated, as this could have sparked interesting discussions. From the perspective of a vendor who delivered data, this view of vendors as showing a limited interest in statement formulation was confirmed by P5. He expressed that how statements were generated to allow for data integration from multiple sources was not a focus area for his company: "As a vendor this is not something we care about. Rather, it is those who consume our data that have to figure this out."

Challenges and limitations of xAPI

Data description constructs

In terms of describing context within the AVT project, concepts from the xAPI vocabulary were used. While this approach enabled the description of the data, P3 noted that the example xAPI statements did not use many of the concepts from the xAPI vocabulary, as correct use of the concepts was quite challenging. For instance, for the verb "answered," the many different types of items that could be solved or answered by a student could make it challenging to properly define the item. To add further concepts related to context that were not available in the xAPI vocabulary, an xAPI profile was developed for the AVT project, as explained by P3. The concepts were registered as metadata in the form of activity types. The activity types were used to represent concepts relevant for the AVT project, such as competence objective, school owner, and school. Thus, instances (i.e., activities) of these activity types could also be part of the statements. As P4 and P8 pointed out, using a profile allowed for restriction of the concepts that were used in xAPI statements. P8 also emphasized that profiles can explain the specific meaning of a concept.

Related to data description, two participants (P3 and P5) mentioned that mapping real-world data to the xAPI format could be difficult, due to the assumptions of xAPI. P5 mentioned a problem with registering the answer to an assignment in xAPI, where xAPI expected exactly one answer. The solution of the vendor that P5 represented, however, had multiple choice. Therefore, more than one answer could be correct (in combination).
To add contextual data to the context structure of the xAPI statements, beyond the data that could be added to the seven properties taking a single value or an object representing a single entity, P4 and P8 explained that the project had the choice between adding them to ContextActivities and extensions. In the end, it was decided to use ContextActivities, with the following structure: context -> ContextActivities -> grouping (a sketch of such a statement context is given below). Accordingly, activities which collectively constituted the context data were grouped together. P4 said that part of the reason was that the example statements were inspired by CMI-5, a specification that has as its goal to enable a consistent information model for xAPI (CMI-5, 2020). He also stated that ContextActivities is a more standard way of using the xAPI specification, while extensions are a more custom approach. As P4 explained regarding extensions: "No one can know anything about how that format is." While P4 was initially in doubt about the use of ContextActivities, he did feel it was a better choice than using extensions.

P8 pointed out that ContextActivities has a fixed structure, while extensions have a more flexible structure (e.g., they can contain a JSON object or a list). He stated two reasons for choosing ContextActivities over extensions. The first was related to the libraries available for generating xAPI statements: it was much simpler to add ContextActivities than extensions. The second reason was that vendors were to set up their own LRS according to the xAPI specification. One requirement was to set up a query API, so that the LRS could be queried in a standardized way. As P2 stated, however, it was a challenge that LRSs do not have good query capabilities; the focus seems rather to be on the data. Using ContextActivities made it easier to filter statements on contextual data (activity types). P8 explained that to filter based on the contextual data when using extensions, a new API would need to be built on top of the query API: "You would have to build another API on top of the existing, to enable those queries. So, in a way it [using ContextActivities] is a minor 'hack' (…). It was a way to enable more powerful queries or search." Thus, using ContextActivities enabled enhanced query capabilities without requiring extra development work.

P8 pointed out that using ContextActivities to group contextual information had a significant drawback in terms of semantics. The activity types added to the xAPI profile, such as school, school owner, and competence objective, were not really activity types. Thus, the activities added to ContextActivities, which were instances of the activity types added in the AVT xAPI profile, were not really activities.
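To make the chosen AVT structure concrete, the listing below sketches how contextual information such as school owner, school, and competence objective could be registered as activities under context -> ContextActivities -> grouping; the IRIs and activity type identifiers are hypothetical stand-ins, as the AVT profile itself is not reproduced here.

    "context": {
      "contextActivities": {
        "grouping": [
          {
            "id": "http://example.org/avt/school-owners/oslo",
            "definition": { "type": "http://example.org/avt/activity-types/school-owner" }
          },
          {
            "id": "http://example.org/avt/schools/example-school",
            "definition": { "type": "http://example.org/avt/activity-types/school" }
          },
          {
            "id": "http://example.org/avt/competence-objectives/numbers-and-algebra",
            "definition": { "type": "http://example.org/avt/activity-types/competence-objective" }
          }
        ]
      }
    }

Because each piece of contextual information is registered as an activity, statements of this form can be filtered through the standard LRS statements interface (e.g., the activity parameter combined with related_activities), which is the querying benefit P8 described; values placed in extensions would not be reachable through the same filters.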
In a report on xAPI, Learning Pool, the developers of the Learning Locker LRS, tell a related story about how not adhering to semantics can make querying easier. They wanted to model that a user had liked a specific comment by another user (Betts & Smith, 2019). Semantically, the comment (an activity) was the object of such a statement, and the other user belonged in the context. In terms of querying, they wanted to count how many times a user had made comments that were liked by other users. In this instance, they found querying easier if the other user was in the object structure and the comment was in the xAPI context, even if this was not semantically correct. They commented: "This is just one example of how adopting the specification has been somewhat harder than we first thought it would be."

Another aspect of ContextActivities is how contextual information is grouped. When asked why grouping was used in place of category or other, both P4 and P8 were uncertain. P4 stated he had read a lot about the topic, but he was unsure about how they differed; there seemed to be a lack of semantics setting them apart. P4 said any one of them could be used in the statements, or they could all be used: "We could have said that competence objective is in grouping, but the school owner, which school it originated from, is in [category] or [other]. But that would just make the statements even more complex." In the end, P4 said they just needed to make a rule for which one to use, to ensure all vendors would follow the same procedure (and to not confuse the vendors). P8 also pointed out the difficulty of choosing between grouping, category, and other. When reflecting on the differences, he stated: "Grouping… I don't really know what that means, really." He added that choosing between the three was made even more difficult since the activity types of the added activities were not really activity types.

(Semantic) differences between tools generating statements

Another challenge, expressed by P3, is that the openness of xAPI allows for the combination of data that may look similar but turn out not to be comparable. P4 mentioned duration as an example of a property that may be used differently in different EdTech systems and thus could be difficult to compare across sources. A property that was in fact problematic in the AVT project is the score of a student when solving a specific item, as mentioned by P3 and P5. Even when a score is set according to the same scale, it may not have the same meaning across systems. In AVT, the meaning of score was different in the tests administered by Oslo municipality and in the data delivered by the vendor, as their tool was a system for practicing rather than testing. P4 and P5 noted that identifying these types of discrepancies requires the sources to be well known by those using them (e.g., for analysis). Because of the openness of xAPI, different tools may also generate a different number of statements for the same type of event, as pointed out by P3. He stated that this was not a big challenge for the AVT project, because only one vendor delivered data. When more than one vendor delivers data, however, it could become a challenge. Another challenge, concerning tool differences, is that the same type of data may be recorded at different levels of granularity by different tools. For instance, P6 expressed that her company attempted to deliver data that was at a higher level compared to the xAPI examples they were first given.

Semantic vs. technical interoperability

The process of using xAPI for data integration was expert-driven, as mentioned earlier. This was particularly apparent when asking the different stakeholders about data descriptions. When asked whether xAPI could satisfactorily describe the data, those with a leader perspective in AVT (P1 and P3) and those with a developer perspective (P2 and P5) generally agreed that it could.
As P3 stated: "There is no data we have wanted to describe that we have not been able to describe with xAPI so far." When probing further about the representation of context in xAPI, P1, P3, and P5 generally agreed that xAPI could do it in a satisfactory way. The developers based their data descriptions on the examples created by P4 and P8; thus, they knew xAPI more on a technical level of interoperability than on a semantic level. As P2 expressed it: "I didn't examine other options for xAPI. I just thought that here we have an example, then I will fill out the data I have according to the example." When P3, representing the leadership perspective in AVT, was asked if he thought it would have been more difficult to describe the data without the example, he agreed: "Yes, you can express almost anything with xAPI, so you have to start with a need."

When asking the technical advisors, who were more knowledgeable about xAPI and standardization, about data descriptions and context, they both mentioned several problems related to the openness/flexibility of xAPI and semantics. The different ways to represent context data (ContextActivities -> grouping, category, or other) was one example. They also saw problems in constructing xAPI statements in terms of data integration because of the many ways data of the same type might be described. P4 said that in the context of AVT you need to try to find concepts appropriate for the vendors and to make them follow rules/templates. P8 said creating statements for data integration might be achievable within a small project such as AVT, based on common rules and documentation. If combining data with another community of practice, however, it would be challenging: "There would be many sources of error concerning syntax, semantics, etc." He suggested library development and schema validation as possible ways to alleviate this challenge.

Recommendations

The analysis of the stakeholder interviews establishes that there is a lack of clarity in how to describe xAPI context data with regard to interoperability and data integration, which negatively affects the expressibility of xAPI. The challenges identified through systematic analysis of the interview data and inspection of the xAPI and xAPI profile specifications are shown in Table 4.

Based on the challenges identified, we provide recommendations (i.e., recommended solutions), in summarized form, on how xAPI can be improved to support interoperability and data integration, with emphasis on descriptions of xAPI context (see Table 4). The recommendations were specified using an iterative process, combining a bottom-up approach of identifying challenges in the analyzed data and in the xAPI and xAPI profile specifications with a top-down approach of examining relevant research literature on context categorizations. Although the emphasis has been on descriptions of context, we additionally give some recommendations that pertain to the xAPI framework as a whole (e.g., data typing and validation, and improved documentation), as they are important with regard to the expressibility of xAPI context. It should be noted that some of the recommended changes will require a change to the xAPI standard (or its documentation), while others can be implemented using an xAPI profile. For this paper, the recommendations are at a conceptual rather than implementation level.
Regarding the recommendations, data typing and validation for specific use cases can currently be addressed through two xAPI profile constructs (Advanced Distributed Learning, 2018b). Statement templates describe a way to structure xAPI statements and can include rules, e.g., restrictions on the data type of a specific property value, while patterns define how a group of statements should be ordered. Both constructs can be checked by an xAPI profile validator.

Table 4 Challenges identified and recommended solutions

Category: Context
- Challenge: The distinction between ContextActivities and extensions appears artificial, and it is not always clear which to use. Extensions are flexible in how data can be registered and therefore could make it more difficult to integrate data. ContextActivities is not a good fit for all types of context data.
  Recommended solution: Use a unified structure for context in xAPI, i.e., context dimensions, with appropriate (low-level) properties for each context dimension. Use data typing and validation to restrict properties and value types for context dimensions. Depending on the property, the value type can be Activity, but other value types should also be supported (e.g., JSON object and string).
- Challenge: Grouping of related activities in ContextActivities can be done within three different structures (grouping, category, or other). It is not clear how the three structures differ. The grouping structures are all very high-level.
  Recommended solution: Remove the distinction between grouping, category, and other. All context data that do not have an explicit (parent) relationship to the statement should be placed in the same structure. For the suggested unified structure for context, i.e., context dimensions, the related information can be listed more explicitly as property values belonging to the appropriate dimension and property.
- Challenge: The query capabilities of LRSs are seen as limited. The example given is that it is only possible to filter statements on contextual data that are instances of an activity type. Thus, to allow filtering of resources in AVT (without extra development work), resources were registered as activities, even if they were not really activities on the semantic level.
  Recommended solution: The xAPI specification defines the query interface that all LRSs must implement (Advanced Distributed Learning, 2017d). In the case of filtering based on resources, the specification needs to be extended so that it is possible to filter contextual data by any resource type. Individual LRS providers have addressed this issue on an ad hoc basis (Learning Locker, 2020), but the problem needs to be further addressed in the xAPI specification to ensure LRS interoperability (e.g., so that xAPI users do not have to rewrite substantial amounts of code if moving data between LRSs).
- Challenge: Concepts from the xAPI vocabulary are not sufficient to describe all data.
  Recommended solution: Use xAPI profiles to add additional concepts.
- Challenge: The same vocabulary concept may be represented in different public xAPI profiles, which make up the xAPI vocabulary.
  Recommended solution: Stricter curation/approval process for the public profiles.

Category: Data typing and validation
- Challenge: Tools generating data at multiple levels of granularity is a challenge, which may make it more difficult to meaningfully integrate data.
  Recommended solution: To help tool developers identify and enforce the expected level of granularity, xAPI data typing and validation can be used. For instance, if a property takes a list of activities (more granular), validation can ensure that less granular values (e.g., integer) will not be accepted.
- Challenge: Difficulties in mapping real-world data to xAPI due to its assumptions.
  Recommended solution: Data typing and validation can help to ensure that the assumptions of xAPI (e.g., the expected value for an xAPI concept) are made more explicit and tested against the data, to avoid wrong use of the specification. It is also crucial that the xAPI specification can be extended as new use cases reveal new needs for data registration.
- Challenge: Different tools may generate different numbers of statements for the same type of event.
  Recommended solution: Validation could be tied to the number of statements generated for a given type of event, and could ensure that the statements generated follow an ordered pattern.

Category: Data typing and validation; Documentation
- Challenge: The openness and flexibility of xAPI allows data and relationships of the same type to be modelled in a myriad of different ways.
  Recommended solution: Add clearer modelling guidelines to the documentation. Add data typing and validation of properties and property values. Use profiles to specify vocabularies.

Category: Documentation
- Challenge: It may be challenging to correctly use concepts from the xAPI vocabulary.
  Recommended solution: Improve the xAPI documentation, e.g., document more solutions for specific use cases, and add more examples of how to use the xAPI vocabulary concepts in order to avoid misunderstandings and remove ambiguity.

Recently (August 2020), Advanced Distributed Learning (2020a) published information that there are plans to standardize xAPI 2.0, an upgrade from the current version 1.0.3. An IEEE LTSC working group, comprising stakeholders from the xAPI community and technical experts, has agreed on the new standard; however, a formal balloting process must also be conducted in order to standardize. While there is information that some new structures will be introduced in order to describe context, it is indicated that these structures will allow more structured descriptions of individuals (i.e., contextAgents) and teams (i.e., contextGroups); thus, we do not believe they will solve the issues/challenges related to context that our research has identified and for which we provide recommendations in this paper. Another addition to the proposed xAPI 2.0 standard is a best-practices guide, “which will be linked to the eventual standard as a living document that can grow and change with advances in learning science and technologies” (Advanced Distributed Learning, 2020a). Based on the published information, this guide could help address the identified need for improvements in the xAPI documentation. Other changes in the proposed version 2.0 include forbidding additional properties in statements and standardizing timestamps. While these changes may help in terms of interoperability and data integration, they do not specifically relate to the challenges we have identified and the recommendations we have provided.

Discussion

Two research questions were posed in this paper, regarding (1) gaps and needs of xAPI in terms of interoperability and data integration focusing on context descriptions, and (2) how identified gaps and needs can be addressed in order to provide improved interoperability and data integration.
We have addressed RQ1 through analysis of the data from the AVT stakeholder interviews and inspection of the xAPI and xAPI profiles specifications, and RQ2 through providing summarized recommendations on how xAPI can be improved to support interoperability and data integration, with emphasis on descriptions of xAPI context. In the following, we discuss patterns and trends related to xAPI data descriptions and interoperability/data integration, which we have identified based on a review of research papers that utilize xAPI to describe data.

Although papers on xAPI commonly mention the benefits of xAPI in terms of interoperability, most of the studies that use or explore the use of xAPI to describe data worked with only one data source (Hruska, Long, Amburn, Kilcullen, & Poeppelman, 2014; Megliola, De Vito, Sanguini, Wild, & Lefrere, 2014; Papadokostaki, Panagiotakis, Vassilakis, & Malamos, 2017; Wu, Guo, & Zhu, 2020; Zapata-Rivera & Petrie, 2018). In such cases, there is no practical experience with the challenges and limitations regarding data descriptions and interoperability. For instance, the challenges related to the flexibility of xAPI and data descriptions do not readily appear: the data can be described in different ways, all accepted according to xAPI, and it is only when trying to integrate xAPI data from different sources that the challenges of inconsistent descriptions surface. Thus, of the five referenced examples using only one data source, four do not touch on challenges related to xAPI and data descriptions/interoperability. The exception is Hruska et al. (2014), who examine challenges with, and give examples of, how to encode information about teams/groups in xAPI statements, including descriptions of group context. Interestingly, Megliola et al. (2014) offer many reflections on vocabulary (verbs and objects) for describing events in their modelling domain (aeronautics), but the reflections are based on theories in linguistics rather than the practical application of xAPI.

Having examined the knowledge base, we have found a very limited number of studies that utilize xAPI to integrate data originating from multiple data sources and share lessons learnt from such a project. The CLA toolkit case study (Bakharia et al., 2016) was one such study. Here, data from different social media were described in xAPI format and integrated in an LRS for use in a systemic analytics solution. The project included designing a common vocabulary through re-use of concepts from W3C ActivityStreams 1.0 (https://activitystrea.ms/) and providing mappings from individual tool concepts to the common concepts. They also examined how context could potentially be described and made decisions on how to describe context in the project. At the time of the study, xAPI profiles had not been added to xAPI; thus, vocabulary and data type/validation rules could not be described in a machine-readable manner. Rather, the vocabulary, together with prescriptions on how to describe social media data, was stored in a recipe. A recipe is a textual description of how xAPI statements for a certain experience can be constructed (Miller, 2018). At the time, recipes were the common means of sharing the implementation of an information model with a community of practice, serving as a potential aid in terms of interoperability. They were, however, not machine-readable.
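In contrast to a textual recipe, a concept in an xAPI profile is itself data that tools can read. The following is a minimal, hypothetical sketch of a verb concept as it could appear in a profile (only possible after the profiles specification, discussed next, was added); the identifiers and labels are invented for illustration.

# Sketch of a machine-readable verb concept from an xAPI profile
# (JSON-LD expressed as a Python dict); identifiers and labels are invented.
verb_concept = {
    "id": "https://example.org/profiles/avt-like/verbs/solved",
    "type": "Verb",
    "inScheme": "https://example.org/profiles/avt-like/v1.0",
    "prefLabel": {"en": "solved"},
    "definition": {"en": "Indicates that the actor completed a task item correctly."},
}
# Because the concept is data rather than prose, a tool can look it up,
# compare it with other profiles, and validate statements that use it.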
Through actual usage of xAPI, the researchers working on the CLA toolkit were able to identify challenges and complexities of the data integration approach. Among the lessons learnt was that providing xAPI context data, while optional according to the standard, was essential for their project. In addition, they recommended that xAPI be extended with the JSON-LD specification, as the lack of machine-readable vocabularies and rules was a weakness of the xAPI specification.

Following the research by Bakharia et al. (2016), xAPI has introduced capabilities for machine-readable vocabularies and rules, since the xAPI profiles specification has been added to xAPI. Thus, machine-readability is no longer a core problem of xAPI (although it is important that tools and libraries implement the methods needed to read metadata descriptions and apply data typing/validation rules). It is encouraging to see that xAPI has used the results from research when choosing to add JSON-LD capabilities.

xAPI, through xAPI profiles, which are defined using the JSON-LD format, leverages semantic technologies to allow for documents that are readable not only by humans but also by machines. In their article, Verborgh and Vander Sande (2020) discuss the importance of not conducting research related to semantic technologies in a vacuum (e.g., a controlled research experiment). While researchers are often reluctant to take their solutions out of the labs, deeming large-scale technology use a trivial engineering problem, the article highlights that practical use of the technologies outside of safe research environments, e.g., through integrating data from sources containing real-world rather than synthetic data, is likely to uncover new challenges that need to be solved in order to promote adoption among practitioners. The results from our interviews indicate that there is indeed a need for more practical research with real-world data description and integration.

Research has shown that data integration within LA, an important part of LA scalability, is a challenge in itself. Previous research in the domain of LA and higher education has found that if data are integrated, they typically originate only from a few data sources; when data are integrated, the integration is often of data of similar formats (technically, this type of integration is easier than combining data of different formats); and there seems to be little use of learning activity data specifications such as xAPI (Samuelsen et al., 2019). One reason that xAPI is not used more for data integration may be that many tools do not support providing data in the form of xAPI statements. In such cases, the use of xAPI can be challenging, since transforming data to xAPI format is likely to require considerable effort. Depending on the situation, it may be more convenient to store data of similar formats originating from different data sources in a NoSQL database (e.g., if the data integrated are provided as JSON through REST APIs), even though the problem of aligning concepts from different sources will still be a challenge. In some cases, the integration approach may also be a manual one, e.g., copying and pasting data from data sources into Excel sheets, an approach that may also require considerable human effort.
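The concept-alignment problem noted above can be illustrated with a small, hypothetical sketch. Suppose two tools both report a student score, one as a raw value out of a tool-specific maximum (a practice system) and one as an already scaled value (a test system); the field names, scales, and mapping below are invented for illustration and do not correspond to the AVT data.

# Hypothetical records from two tools; field names and scales are invented.
practice_record = {"student": "student-123", "item": "item-42", "raw": 3, "max": 4}
test_record = {"student": "student-123", "item": "item-42", "scaled": 0.80}

def normalise_score(record: dict) -> float:
    """Map tool-specific score fields onto one shared, scaled score in [0, 1]."""
    if "scaled" in record:
        return float(record["scaled"])
    return record["raw"] / record["max"]

# Alignment makes the numbers comparable, but it cannot resolve semantic
# differences: a practice score and a test score may still mean different
# things, which is the kind of discrepancy the interviewees pointed to.
print(normalise_score(practice_record), normalise_score(test_record))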
In the case of xAPI and other learning activity data specifications/standards, there seems to be a need for more tools supporting their formats. While previous research has found little evidence of xAPI use in the research literature (Samuelsen et al., 2019), the xAPI guide by Learning Pool provides information that there are tools that can export data in xAPI format (Betts & Smith, 2019). These include LMSs, content management systems, and authoring tools. The tools often provide the xAPI export functionality through plugins. Due to the flexibility of xAPI data descriptions, however, one cannot expect statements from these tools to integrate meaningfully (i.e., to scale), since different plugins, made by different developers, may use different syntactic and semantic structures for describing the same data. For the statements of an xAPI-enabled tool to be compatible with statements from a number of other xAPI-enabled tools, we can imagine that the tool will need a number of similar plugins, each making statements compatible with statements from a specific tool or set of tools. While there are architectures (Apereo, 2020; JISC, 2020) that provide connectors/plugins for several tools to store their data as xAPI in a common data store, the problem is still not solved, since different architectures can also model data of similar types in different ways. Thus, there is a need for tools that can provide xAPI data in a coherent format.

Looking at studies using xAPI, we find, consistent with its flexibility, a diverse set of examples of how researchers add, or intend to add, context data. Not all examples seem to be within the intended use of xAPI. Wu et al. (2020) demonstrate using ContextActivities—parent to store a verb as an activity, even though a verb is not an activity and should not be the parent of an activity. Sottilare, Long, and Goldberg (2017) propose to add course to the xAPI result structure, even though it is added in the published xAPI vocabulary as an activity type that can be used in the context structure with ContextActivities (Advanced Distributed Learning, 2020b). We also find examples related to the different ways ContextActivities can be used to describe data. Several works use ContextActivities—grouping and ContextActivities—other in the same statement (Claggett, 2018; Hruska et al., 2014), even though we have identified that it is not really clear that there is a difference between the two structures. These examples confirm our findings that there is a need to enhance the expressibility of context in xAPI. The recommended addition of context dimensions would be one way to enable more consistent description of the data. Documentation, data typing, and validation should provide clarity and guidance regarding how to achieve this goal.

In the xAPI specification, learning objects and their context are represented through activities. An activity has an id (an identifying IRI) and an optional definition stored in an Activity Definition Object (Advanced Distributed Learning, 2017c). The activity definition can be stored together with the activity id in the xAPI statement, or it can be stored at the activity id IRI and downloaded by the LRS (the hosted metadata solution), as sketched below.
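A minimal sketch of these two options, with illustrative identifiers not taken from the AVT project, may make the distinction concrete.

# Option 1: the activity definition travels inside the statement.
activity_inline = {
    "id": "https://example.org/xapi/activities/item/42",
    "definition": {
        "name": {"en": "Fractions exercise 42"},
        "type": "http://adlnet.gov/expapi/activities/question",
    },
}

# Option 2 (hosted metadata): the statement carries only the id, and the LRS
# may resolve the IRI to fetch the definition published at that address.
activity_by_reference = {"id": "https://example.org/xapi/activities/item/42"}

The inline form is self-contained, while the hosted-metadata form avoids repeating (and potentially contradicting) definitions across statements from different sources.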
While xAPI can represent learning objects and their context, the topic of representing learning objects has also been addressed by several other standards, including “ISO/IEC 19788 Information technology - Learning, education and training - Metadata for learning resources” (International Organization for Standardization, 2011) and “Encoded Archival Context – Corporate bodies, Persons and Families” (Mazzini & Ricci, 2011). Thus, the xAPI community could look to such standards when aiming to provide future improvements for learning object representation.

The challenges we have identified with xAPI may lead us to question whether IMS Caliper would currently be better suited for representing interaction data in the educational domain. Examining the specification (IMS Global, 2018), we find there are 14 available profiles (called metric profiles), which target experiences such as assessment, forum, and tool use. Similar to xAPI, Caliper also supports machine-readable data descriptions through JSON-LD. It seems that Caliper has the potential to avoid many of the xAPI problems caused by flexibility. For instance, it provides a pre-defined vocabulary of available verbs (called actions). In addition, a specific event (e.g., an assessment event) has a number of pre-defined properties for context that can or must be specified; the properties available vary based on the event type. Where there is a need to add a context property not available in the specification, there is a generic extension object where additional properties can be added (similar to extensions in xAPI). When describing experiences not included in the specification, however, the only option is to use the generic (basic) profile, which supports only generic events (which can use any number of properties) but can only use the verbs from the pre-defined vocabulary. It seems the use of the basic profile will also pose challenges related to interoperability, due to its flexibility in terms of using generic events. Furthermore, while waiting for the addition of new profiles, the pre-defined verbs may not cover the actual needs of users. Since it appears that Caliper is more geared toward the big EdTech companies (Griffiths & Hoel, 2016), this becomes a considerable barrier to adoption for smaller vendors and researchers. For projects that need to support experiences outside of the 14 experiences for which there are profiles, it seems that xAPI would be the better choice after all.

In this case study, we have focused our attention on one specific case, the AVT project. Although we have only examined one project, it is one that genuinely strives to integrate multiple data sources. This case study is one of few studies that has emphasized revealing and addressing challenges and limitations in a data specification, focusing on context descriptions. The research is conducted outside of the safe lab environment, meaning we can identify challenges that would otherwise remain unnoticed. While others have previously identified some of the challenges and limitations of xAPI, this seems to be the first paper that systematically examines challenges and limitations of xAPI context through involving stakeholders having experienced xAPI in a real-world case, and that gives recommendations on how the context descriptions can be enriched through changes to the context structure and other means in order to improve expressibility.
Conclusion and future work

This paper presents an exploratory case study, taking place in a real-world setting, using the AVT project as a case. The research has aimed to systematically identify challenges and limitations of using a current learning activity data standard (i.e., xAPI) for describing learning context with regard to interoperability and data integration. Subsequently, we have provided recommendations, in summarized form, for the identified challenges and limitations. Our research has identified a lack of clarity in how to describe context data in xAPI regarding interoperability and data integration. The recommendations relate not only to the description/modelling of context in xAPI, but also to data typing, validation, and documentation, as all of these are essential to enhance the expressibility of xAPI context.

Despite xAPI’s potential regarding interoperability, we see a tendency in studies using xAPI that most of them describe data from only one data source. Additionally, in the cases where multiple data sources are actually integrated, few reflect on limitations or challenges concerning data descriptions. In order to scale up LA, particularly when integrating data from multiple sources, it is essential to describe data in a coherent way. Therefore, we strongly encourage others in the LA research community who use xAPI for data integration to try out the recommended solutions in their own projects. Currently, it is not possible to make use of all recommendations, since some will require a change to the xAPI/xAPI profile specifications. Among the recommendations that can be implemented now, we especially highlight the use of xAPI profiles to provide vocabularies and to specify shared data typing and validation rules (through statement templates and patterns).

We acknowledge that there are some limitations to our research. Due to the qualitative approach, in which we thematically analyzed data from in-depth interviews with a limited number of participants representing different stakeholder perspectives, the findings are based on our study alone, although many of the challenges are supported by the literature. Thus, although our results may not be generalizable, they are based on a real-life case involving multiple stakeholders and multiple data sources. Furthermore, the recommendations have not yet been detailed in depth, implemented, and validated. In future work, we will proceed with the next steps in the methodology, including detailing the recommended solutions that are summarized in this paper, and stakeholder validation of the implementable recommendations through using the xAPI and xAPI profile specifications for data descriptions in two separate projects.

Abbreviations
AVT: Activity Data for Assessment and Adaptation; CAM: Contextual Attention Metadata; IRI: Internationalized Resource Identifier; JSON-LD: JSON for Linking Data; LA: Learning Analytics; LMS: Learning Management System; LOCO: Learning Object Context Ontologies; LRS: Learning Record Store; LCDM: Learning Context Data Model; RCM: Rich Context Model; xAPI: Experience API

Acknowledgements
The authors wish to thank the participants for their valuable contributions in identifying challenges and limitations of xAPI.

Authors’ contributions
This paper is a part of the PhD project conducted by JS. BW is her main supervisor, and WC is her co-supervisor.
WC has been working closely with JS in planning and has assisted in writing. BW has assisted in planning and writing. All authors read and approved the final manuscript.

Funding
This research is a part of Jeanette Samuelsen’s PhD funded by the Centre for the Science of Learning & Technology (SLATE), University of Bergen, Norway.

Availability of data and materials
To protect the privacy of the participants, the data cannot be shared.

Declaration

Competing interests
There is no conflict of interests related to this manuscript.

Author details
1 Centre for the Science of Learning & Technology, University of Bergen, P.O. Box 7807, 5020 Bergen, Norway. 2 Department of Information Science & Media Studies, University of Bergen, P.O. Box 7802, 5020 Bergen, Norway. 3 Oslo Metropolitan University, Oslo, Norway.

Received: 2 October 2020 Accepted: 23 February 2021

References
Advanced Distributed Learning. (2017a). xAPI specification. Retrieved from https://github.com/adlnet/xAPI-Spec
Advanced Distributed Learning. (2017b). xAPI specification - part one: About the experience API. Retrieved from https://github.com/adlnet/xAPI-Spec/blob/master/xAPI-About.md#partone
Advanced Distributed Learning. (2017c). xAPI specification - part two: Experience API data. Retrieved from https://github.com/adlnet/xAPI-Spec/blob/master/xAPI-Data.md#parttwo
Advanced Distributed Learning. (2017d). xAPI specification - part three: Data processing, validation, and security. Retrieved from https://github.com/adlnet/xAPI-Spec/blob/master/xAPI-Communication.md#partthree
Advanced Distributed Learning. (2018a). xAPI profiles specification. Retrieved from https://github.com/adlnet/xapi-profiles
Advanced Distributed Learning. (2018b). xAPI profile specification - part two: xAPI profiles document structure specification. Retrieved from https://github.com/adlnet/xapi-profiles/blob/master/xapi-profiles-structure.md#part-two
Advanced Distributed Learning. (2020a). Anticipating the xAPI Version 2.0 Standard. Retrieved from https://adlnet.gov/news/2020/08/06/Anticipating-the-xAPI-Version-2.0-Standard/
Advanced Distributed Learning. (2020b). xAPI authored profiles. Retrieved from https://github.com/adlnet/xapi-authored-profiles/
Apereo. (2020). Learning Analytics Initiative | Apereo. Retrieved from https://www.apereo.org/communities/learning-analytics-initiative
Bakharia, A., Kitto, K., Pardo, A., Gašević, D., & Dawson, S. (2016). Recipe for success: Lessons learnt from using xAPI within the connected learning analytics toolkit. In Proceedings of the sixth international conference on learning analytics & knowledge (pp. 378–382).
Betts, B., & Smith, R. (2019). The learning technology manager's guide to xAPI (Version 2.2). Retrieved from https://learningpool.com/guide-to-xapi/
Bryman, A. (2012). Social research methods (4th ed.). Oxford: Oxford University Press.
Claggett, S. (2018). xAPI Game Demo Example Part 1 [Blog post]. Retrieved from https://gblxapi.org/community-blog-xapi-gbl/10-xapi-demo-example-threedigits
CMI-5. (2020). The cmi5 Project. Retrieved from https://github.com/AICC/CMI-5_Spec_Current
Dey, A. K. (2001). Understanding and using context. Personal and Ubiquitous Computing, 5(1), 4–7.
European Commission. (2017). New European Interoperability Framework. Retrieved from https://ec.europa.eu/isa2/sites/isa/files/eif_brochure_final.pdf
Griffiths, D., & Hoel, T. (2016). Comparing xAPI and Caliper (Learning Analytics Review, No. 7). Bolton: LACE.
Hruska, M., Long, R., Amburn, C., Kilcullen, T., & Poeppelman, T. (2014). Experience API and team evaluation: Evolving interoperable performance assessment. In The Interservice/Industry Training, Simulation & Education Conference (I/ITSEC).
IMS Caliper Analytics. (2020). Caliper Analytics | IMS Global Learning Consortium. Retrieved from https://www.imsglobal.org/activity/caliper
IMS Global. (2018). IMS Caliper Specification v1.1. Retrieved from https://www.imsglobal.org/sites/default/files/caliper/v1p1/caliper-spec-v1p1/caliper-spec-v1p1.html
IMS Global. (2020). Members | IMS Global. Retrieved August 20, 2020, from https://site.imsglobal.org/membership/members
International Organization for Standardization. (2011). ISO/IEC 19788-1:2011 Information technology — Learning, education and training — Metadata for learning resources — Part 1: Framework. Retrieved from https://www.iso.org/standard/50772.html
JISC. (2020). Learning records warehouse: technical overview: Integration overview. Retrieved from https://docs.analytics.alpha.jisc.ac.uk/docs/learning-records-warehouse/Technical-Overview:%2D%2DIntegration-Overview
Jovanović, J., Gašević, D., Knight, C., & Richards, G. (2007). Ontologies for effective use of context in e-learning settings. Journal of Educational Technology & Society, 10(3), 47–59.
Keehn, S., & Claggett, S. (2019). Collecting standardized assessment data in games. Journal of Applied Testing Technology, 20(S1), 43–51.
Learning Locker. (2020). Aggregation HTTP interface. Retrieved from https://docs.learninglocker.net/http-aggregation/
Lincke, A. (2020). A computational approach for modelling context across different application domains (Doctoral dissertation, Linnaeus University Press). Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-93251
Mazzini, S., & Ricci, F. (2011). EAC-CPF ontology and linked archival data. In SDA (pp. 72–81).
Megliola, M., De Vito, G., Sanguini, R., Wild, F., & Lefrere, P. (2014). Creating awareness of kinaesthetic learning using the Experience API: current practices, emerging challenges, possible solutions. In CEUR Workshop Proceedings (vol. 1238, pp. 11–22).
Miller, B. (2018). Profile Recipes vs. xAPI Profiles [Blog post]. Retrieved from https://xapi.com/blog/profile-recipes-vs-xapi-profiles/
Morlandstø, N. I., Hansen, C. J. S., Wasson, B., & Bull, S. (2019). Aktivitetsdata for vurdering og tilpasning: Sluttrapport (SLATE Research Report 2019-1). Bergen: Centre for the Science of Learning & Technology (SLATE). ISBN: 978-82-994238-7-8.
Muslim, A., Chatti, M. A., Mahapatra, T., & Schroeder, U. (2016). A rule-based indicator definition tool for personalized learning analytics. In Proceedings of the sixth international conference on learning analytics & knowledge (pp. 264–273).
Norwegian Centre for Research Data. (2020). NSD - Norwegian Centre for Research Data. Retrieved from https://nsd.no/nsd/english/index.html
NVivo. (2020). Qualitative Data Analysis Software | NVivo. Retrieved from https://www.qsrinternational.com/nvivo-qualitative-data-analysis-software/home
Oates, B. J. (2006). Researching information systems and computing. London: SAGE Publications.
Papadokostaki, K., Panagiotakis, S., Vassilakis, K., & Malamos, A. (2017). Implementing an adaptive learning system with the use of experience API. In Interactivity, Game Creation, Design, Learning, and Innovation (pp. 393–402). Cham: Springer.
Samuelsen, J., Chen, W., & Wasson, B. (2019). Integrating multiple data sources for learning analytics—review of literature. Research and Practice in Technology Enhanced Learning, 14(1). https://doi.org/10.1186/s41039-019-0105-4
Schmitz, H. C., Wolpers, M., Kirschenmann, U., & Niemann, K. (2011). Contextualized attention metadata. In Human attention in digital environments (pp. 186–209).
Siemens, G. (2011). 1st international conference on learning analytics and knowledge. Technology Enhanced Knowledge Research Institute (TEKRI). Retrieved from https://tekri.athabascau.ca/analytics/
Sottilare, R. A., Long, R. A., & Goldberg, B. S. (2017). Enhancing the Experience Application Program Interface (xAPI) to improve domain competency modeling for adaptive instruction. In Proceedings of the Fourth (2017) ACM Conference on Learning@Scale (pp. 265–268).
Standards Norway. (2019). Standards Norway. Retrieved from https://www.standard.no/en/toppvalg/about-us/standards-norway/
Standards Norway. (2020). SN/K 186. Retrieved from https://www.standard.no/standardisering/komiteer/sn/snk-186/
Thüs, H., Chatti, M. A., Brandt, R., & Schroeder, U. (2015). Evolution of interests in the learning context data model. In Design for Teaching and Learning in a Networked World (pp. 479–484). Cham: Springer.
Thüs, H., Chatti, M. A., Yalcin, E., Pallasch, C., Kyryliuk, B., Mageramov, T., & Schroeder, U. (2012). Mobile learning in context. International Journal of Technology Enhanced Learning, 4(5-6), 332–344.
Verborgh, R., & Vander Sande, M. (2020). The Semantic Web identity crisis: in search of the trivialities that never were. Semantic Web Journal, 11(1), 19–27. IOS Press. Retrieved from https://ruben.verborgh.org/articles/the-semantic-web-identity-crisis/
Vidal, J. C., Rabelo, T., & Lama, M. (2015). Semantic description of the Experience API specification. In 2015 IEEE 15th International Conference on Advanced Learning Technologies (pp. 268–269).
Wasson, B., Morlandstø, N. I., & Hansen, C. J. S. (2019). Summary of SLATE Research Report 2019-1: Activity data for assessment and activity (AVT). Bergen: Centre for the Science of Learning & Technology (SLATE). Retrieved from https://bora.uib.no/handle/1956/20187
Wu, Y., Guo, S., & Zhu, L. (2020). Design and implementation of data collection mechanism for 3D design course based on xAPI standard. Interactive Learning Environments, 28(5), 602–619.
Zapata-Rivera, L. F., & Petrie, M. M. L. (2018). xAPI-based model for tracking on-line laboratory applications. In 2018 IEEE Frontiers in Education Conference (FIE) (pp. 1–9).

Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. Samuelsen et al. Research and Practice in Technology Enhanced Learning (2021) 16:6 Page 2 of 26 sources” (Bakharia, Kitto, Pardo, Gašević, & Dawson, 2016, p. 378). Data integration, the combination of data from different sources, also plays an important role for the scalability of LA (Samuelsen, Chen, & Wasson, 2019). Throughout their various activities, learners are situated in different contexts. They move within the physical space, at varying times of the day, using different tools on separate devices, leading to data being generated through different sensors. A context is defined as “any information that can be used to characterize the situation of an entity” (Dey, 2001, p. 5), and an entity can be a person, place, or an object (Dey, 2001). To in- tegrate data from different sources in LA, it is crucial to take into account the context of the data. Taking context into account can have benefits for interoperability and may also be used to personalize learning for the individual learner, as well as enable better querying and reporting of the data. Data integration is closely related to interoperability, which involves semantic, technical, legal, and organizational levels (European Commission, 2017). Concerning technical and semantic interoperability, two well-known data specifications (de facto industry standards) exist which target the educational domain and LA, namely the Experience API (xAPI; Advanced Distributed Learning, 2017a) and IMS Caliper Analyt- ics (2020). These specifications both enable the exchange and the integration of learn- ing activity data originating from different tools and data sources, where individual activity data describe a learner interacting with a learning object in a learning environ- ment (modelled with the most basic structure of “actor verb object”). The activity data can subsequently be stored in a Learning Record Store (LRS). Both specifications also provide mechanisms for adding vocabularies, through profiles, which can help in terms of structuring the activity data and adding semantics. Current profile specifications are specified in the JSON for Linking Data (JSON-LD) format, which builds on JSON and semantic technologies to enable machine-readable data definitions. For xAPI, any com- munity of practice can create a new profile, while for Caliper only organizations that are members of IMS may contribute to profiles (and other parts of the specification). As the latter may suggest, the usage of xAPI is generally more flexible than that of Caliper (Griffiths & Hoel, 2016). For a detailed comparison of xAPI and Caliper, please refer to Griffiths and Hoel (2016). Despite the availability of these learning activity data specifications, previous research (Samuelsen et al., 2019) found that they are not widely used for data integration of data coming from multiple data sources for LA in the context of higher education; in the case of xAPI, a few examples of use were found, while no examples of Caliper use were found. 
Thus, it should be of interest for researchers and practitioners in LA to know why there seems to be so little use of learning activity data specifications, and to under- stand the challenges and limitations when using the existing specifications. This paper reports on an exploratory case study where we look at the challenges and limitations of using a current learning activity data standard (i.e., xAPI) for describing the learning context. While previous research has identified some of the challenges and limitations of xAPI (Bakharia et al., 2016; Betts & Smith, 2019; Keehn & Claggett, 2019), to our knowledge no studies have systematically collected and analyzed data from stakeholders who have used xAPI in a real-world case and identified the gaps https://json-ld.org/ Samuelsen et al. Research and Practice in Technology Enhanced Learning (2021) 16:6 Page 3 of 26 between xAPI and the needs of stakeholders with regard to interoperability and integra- tion of learning activity data. Thus, we aim to contribute to the knowledge base through the systematic collection, analysis and identification of xAPI gaps and needs with regard to interoperability and data integration as they have been experienced by stakeholders in a real-world case. The case is the Activity data for Assessment and Adaptation (AVT) project (Morlandstø, Hansen, Wasson, & Bull, 2019), a Norwegian project exploring the use of activity data coming from multiple sources to adapt learning to individual learner needs and for use in assessment. AVT used the xAPI data specification for describing student activity data originating from different sources. Through in-depth interviews with AVT stakeholders with varying perspectives, and inspection of the xAPI specification (Advanced Distributed Learning, 2017a) and the xAPI profiles specification (Advanced Distributed Learning, 2018a), we identified some challenges and limitations of xAPI, focusing on learning context description. Based on the identified challenges and limita- tions, we have provided recommendations on how xAPI can be improved to enhance its expressibility—meaning it should be possible to describe data in a consistent way across data sources—in order to better support interoperability, data integration and (consequently) scalability. This paper answers the following research questions: RQ1: Focusing on descriptions of xAPI context, what are the gaps and needs regarding interoperability and data integration? RQ2: How should the identified gaps and needs be addressed in order to provide improved interoperability and data integration? Background In this section, we first look at several data models that attempt to formalize context. Next, we examine the constructs currently available in xAPI that enable the description of context. Then we conclude with a comparison of the context data models and xAPI. Context Data Models Jovanović,Gašević, Knight, and Richards (2007) developed an ontology-based frame- work, Learning Object Context Ontologies (LOCO), to formalize and record context related to learning objects (i.e., digital artifacts used for learning on digital platforms). Learning objects consist of learning content and are assigned to learning activities to achieve learning objectives. The LOCO framework integrates several ontologies, e.g., for learning object content structure and user modelling. 
The learning object context that can be recorded, i.e., metadata which originates from a learning process, includes information about the learning object domain, the learning situation, and the learner. Two tools were developed based on the LOCO framework. The first tool, LOCO- Analyst, can generate feedback for instructors based on analysis of context data col- lected from an online learning environment (e.g., LMS). The second tool, TANGRAM, targets the learners and is a “Web-based application for personalized learning in the area of Intelligent information systems” (Jovanović et al., 2007, p. 57). It personalizes the assembly of learning content. Samuelsen et al. Research and Practice in Technology Enhanced Learning (2021) 16:6 Page 4 of 26 Schmitz, Wolpers, Kirschenmann, and Niemann (2011) detail a framework for col- lecting and analyzing contextual attention metadata (CAM) from digital environments. CAM expresses data selection behaviors of users. The authors developed a schema to represent CAM that allows for the registration of aspects such as which data objects (e.g., file, video, email message) capture the attention of users, what actions are per- formed on the objects (e.g., a file was opened), and what was the context of use when a user interacted with an object (e.g., time, location). To enable the collection of CAM records, the approach is to add file system/application wrappers, thereby transforming the original data format to the format of CAM in XML. The CAM schema is semi- structured for some properties, such as context. The context property is a container for data of varying types, i.e., an arbitrary number of key-value pairs can be stored within this container. The authors note that while the semi-structured approach is flexible and allows for registering different types of data, it also creates challenges for exchanging data because data can be described in different ways. They state that an alternative would be to import different metadata schema, which could be used to structure the different types of data. Such an approach would rely on pre-defined schemas, e.g., from 2 3 FOAF and Dublin Core . To avoid redundancy of stored metadata, the authors describe a tentative approach where metadata are stored as triple representations (sub- ject, predicate, object), and where pointers are added to other metadata descriptions. Regarding CAM, one prototype was implemented to collect, analyze and visualize user communication. Different metadata were extracted and transformed into CAM format, providing the basis for visualizing the social network of the user, including the type of communication that took place and the user's communication behavior. Another proto- type using CAM was developed for an online learning environment. Here, data object metadata (e.g., number of object uses) were utilized for user recommendations. Usage and behavior data were also visualized for the individual user, adding the potential for providing metacognitive support. The learning context project (Thüs et al., 2012) recognizes that devices, such as mo- bile phone and tablets that contain a diverse set of sensors, have possibilities for record- ing context. The project has developed the Learning Context Data Model (LCDM), a suggested standard to represent context data and enable increased interoperability and reusability of context models. The data model considers learners and events, where an event is categorized at either a higher or lower level. 
Available high-level categories are activity (e.g., writing a paper), environmental (e.g., noise level, location), and biological (e.g., heart rate level, level of motivation) (Muslim, Chatti, Mahapatra, & Schroeder, 2016). There are a limited number of low-level categories, and for each, the data model specifies required and recommended inputs. Context can be broadly categorized as ex- trinsic or intrinsic. Extrinsic context is related to the user’s current environment, while intrinsic context is related to the inside of the user such as knowledge, concentration, or motivational level (Thüs et al., 2012). The LCDM allows for the registration of both extrinsic and intrinsic context events. It can also register user interests and platforms, which specifies where an event was captured, e.g., on a mobile phone. In addition to the data model, the learning context project provides an API that enables storage and http://xmlns.com/foaf/spec/ https://dublincore.org/specifications/dublin-core/ Samuelsen et al. Research and Practice in Technology Enhanced Learning (2021) 16:6 Page 5 of 26 retrieval of the context-related information and visualizations for the collected data. For instance, one visualization shows how learner interests evolve over time, something that may enable self-reflection (Thüs, Chatti, Brandt, & Schroeder, 2015). Lincke (2020) describes an approach for context modelling in her PhD dissertation, where she has developed a rich context model (RCM). The RCM approach models the user context according to specific context dimensions, each relating to a given applica- tion domain. The RCM was designed for generalizability in terms of application do- mains; thus, it can be utilized for different domains through providing separate configurations for the individual domains, removing the need to change the core of the model to add new domains. The configurations can specify aspects such as expected data/data types and database configuration. In the research, context was modelled for different application domains, such as mobile learning, LA, and recommender systems. For instance, in the mobile learning application domain, dimensions were modelled for environment, device, and personal context. The environment context could include contextual information such as location, weather conditions, and nearby places; the de- vice context could include information such as screen size, battery level, and Internet connectivity; and the personal context could include information such as demograph- ics, courses, interests, and preferences. In the dissertation by Lincke (2020), much emphasis is placed on data analysis, especially with regard to user recommendations of relevant items as they pertain to the current situation, thus offering personalization/ contextualization to the user. Analysis of context data, with resulting recommendations, has been implemented in tools within the mobile learning application domain. Data analysis results were visualized in mobile learning and several other application domains. Context in xAPI xAPI statements are made up of the most basic building blocks of “actor verb object” (see Fig. 1 for more information on available properties and structures). An xAPI activ- ity is a type of object that an actor has interacted with. Used together with a verb, the activity may represent a unit of instruction, performance, or experience. 
The interpret- ation of an activity is broad, meaning this concept can not only be used to represent virtual objects, but also physical/tangible objects (Advanced Distributed Learning, 2017b). The xAPI specification (currently in version 1.0.3), expressed in the JSON format, is quite flexible. For instance, users are free to define new verbs and activity types (an activity is an instance of an activity type) for use in statements, ideally publishing these vocabulary concepts in profiles shared with relevant communities of practice. Addition- ally, a number of the expected value formats have a flexible structure (e.g., JSON ob- jects may contain an arbitrary number of properties of varying levels of nesting). Finally, a number of structures/properties that can be used for data description are optional. In xAPI statements, the context structure is an optional structure that allows us to register context data. Since xAPI is a standard for learning activity data, context is related to the learner as they interact with a learning object in a (typically) digital en- vironment. The context structure is on the same level in a statement as the actor, verb, and object structures. Another structure on this level, which also allows for registration Samuelsen et al. Research and Practice in Technology Enhanced Learning (2021) 16:6 Page 6 of 26 Fig. 1 xAPI statement (Vidal, Rabelo, & Lama, 2015) of context data related to learning activity data, is the result structure that “represents a measured outcome related to the Statement in which it is included” (Advanced Distrib- uted Learning, 2017c); it may contain information on score, duration, response, success, completion, and other relevant (user-defined) attributes. There are nine properties that can be used within the xAPI context structure (see Table 1). Seven of them are defined with keys that require a single value or object that represents a single entity, including registration (value format is a UUID), instructor (value is an agent, stored in a JSON Table 1 Context structure properties (Advanced Distributed Learning, 2017c) Property Type Description Required registration UUID The registration that the Statement is associated with. Optional instructor Agent (MAY be a Instructor that the Statement relates to, if not included as the Optional Group) Actor of the Statement. team Group Team that this Statement relates to, if not included as the Optional Actor of the Statement. contextActivities contextActivities A map of the types of learning activity context that this Optional Object Statement is related to. Valid context types are: "parent", "grouping", "category" and "other". revision String Revision of the learning activity associated with this Optional Statement. Format is free. platform String Platform used in the experience of this learning activity. Optional language String (as defined Code representing the language in which the experience Optional in RFC 5646) being recorded in this Statement (mainly) occurred in, if applicable and known. statement Statement Another Statement to be considered as context for this Optional Reference Statement. extensions Object A map of any other domain-specific context relevant to this Optional Statement. For example, in a flight simulator altitude, airspeed, wind, attitude, GPS coordinates might all be relevant. Samuelsen et al. 
Research and Practice in Technology Enhanced Learning (2021) 16:6 Page 7 of 26 object), statement (value is another xAPI statement that is found to be relevant to the xAPI statement, stored in a JSON object), and team (value is an xAPI group, stored in a JSON object). The properties language, platform, and revision (of a learning activity) all require a string as their value (Advanced Distributed Learning, 2017c). Context in- formation not suitable for these seven properties that all take a single value or object that represents a single entity, and not related to a measured outcome (i.e., result), can be described with ContextActivities and extensions. ContextActivities let us specify “a map of the types of learning activity context that this Statement is related to” (Advanced Distributed Learning, 2017c). The available context types are parent, grouping, category, and other. The parent structure is used to specify the parent(s) of the object activity of a statement. For instance, a quiz would be the parent if the object of a statement was a quiz question. Grouping can be used to specify activities with an indirect relation to the object activity of a statement. For in- stance, a qualification has an indirect relation to a class and can therefore be specified using the grouping structure. Category is used to add activities that can categorize/tag a statement. The only example given in the xAPI specification is that the xAPI profile used when generating statements can be specified using category. The context type other can be used to specify activities that are not found to be appropriate in any of the parent, grouping,or category context types. The example given in the xAPI specification is that an actor studies a textbook for an exam, where the exam is stated to belong to the context type other. Extensions, like ContextActivities, are organized in maps. They should include domain-specific information that is not covered using the other context properties. The map keys for extensions must be represented in Internationalized Resource Identifier (IRI)s; the map values can be any valid JSON data structure such as string, array, and object. Thus, using extensions to express context information in an xAPI statement is more flexible than using ContextActivities. As such, the advice in the specification, re- garding interoperability, is that built-in xAPI elements should be preferred to exten- sions for storing information, if available (Advanced Distributed Learning, 2017c). The xAPI specification gives the example of an actor using a flight simulator, where altitude, wind, and GPS coordinates can be expressed using extensions. Since xAPI allows for registration of such a diversity of context-related information, through both the context structure and the result structure, data described in xAPI may for instance be used for personalization, visualization, assessment, and prediction. Comparing the different context data models to xAPI Having examined context in xAPI, we now look at its similarities and differences with regard to the previous research on context data models (see Table 2). xAPI collects data regarding the learners/agents and their activities. Of the context data models presented above, all except the LOCO model also have the learner (or user) as their unit of focus. LOCO, however, focuses on learning objects (in xAPI, learning object information would be represented in the object of the xAPI statement, rather than the context). 
Comparing the different context data models to xAPI
Having examined context in xAPI, we now look at its similarities and differences with regard to the previous research on context data models (see Table 2). xAPI collects data regarding the learners/agents and their activities. Of the context data models presented above, all except the LOCO model also have the learner (or user) as their unit of focus. LOCO, however, focuses on learning objects (in xAPI, learning object information would be represented in the object of the xAPI statement, rather than the context).
In terms of flexibility, xAPI is quite flexible regarding data registration, similar to CAM. The other data models appear generally to be more rigid, for example due to stricter specification of available properties and data types.
Table 2 Context data model and xAPI comparison
– LOCO: unit of focus: learning object; flexibility of data registration: more rigid; categorization of context: no categorization; interoperability: targets interoperability; usage: personalization, examine object use.
– CAM: unit of focus: learner; flexibility of data registration: flexible; categorization of context: no categorization; interoperability: targets interoperability; usage: personalization, visualization, examine object use.
– LCDM: unit of focus: learner; flexibility of data registration: more rigid; categorization of context: two-level categorization (high/low level); interoperability: targets interoperability; usage: visualization.
– RCM: unit of focus: learner; flexibility of data registration: more rigid; categorization of context: two-level categorization (high/low level); interoperability: does not target interoperability; usage: personalization/contextualization, visualization.
– xAPI: unit of focus: learner/agent; flexibility of data registration: flexible; categorization of context: no categorization; interoperability: targets interoperability; usage: personalization, visualization, examine object use, assessment, prediction, etc.
Concerning classification of context, the LCDM has capabilities for two-level categorization of events, making a distinction between high-level and low-level categories, for example the high-level categorization "environment" and the low-level categorization "noise level." The RCM approach, similar to LCDM, suggests both high-level categorization of context (in the form of context dimensions) and low-level categorization (information belonging to the separate context dimensions). In contrast, xAPI and the other data models do not provide this type of classification of context.
Interoperability can be enabled in varying degrees through usage of common data models/specifications, depending on how they are used/defined. As such, interoperability is a stated end for all the data models, except RCM. While the RCM approach is used for data analysis with regard to personalization and contextualization, this approach does not specifically target interoperability. Instead of using a standardized approach, such as requiring terms to be chosen from pre-established vocabularies when generalizing the RCM to a new application domain, the configuration is done on an ad hoc basis for each domain added (e.g., for each new domain, the data format must be specified).
Concerning the use of the context data models, there seems to be an emphasis on personalization (e.g., providing recommendations) and visualization. CAM and LOCO also examine the use of learning/data objects (in xAPI, data related to learning object use are typically stored in the object, i.e., the activity, of a statement). While data from xAPI may be used for personalization, visualization, and examining object use, xAPI also provides structures for describing data that cannot be described with the context data models (e.g., information about learner results). Thus, provided the xAPI data are sufficiently described, they may also be used for other purposes, such as assessment and prediction.
AVT project—the case
The case study examined the use of xAPI in the AVT project (Morlandstø et al., 2019; Wasson, Morlandstø, & Hansen, 2019), which ran from August 2017 to May 2019. We chose AVT as a case study subject due to it being a real-world project that used a learning activity data standard (i.e., xAPI) for describing data originating from multiple sources, thereby having the potential to uncover challenges and limitations of data description and integration as they unfold in practice.
AVT, owned and funded by the Norwegian Association of Local and Regional Authorities (KS), was initiated by the Educational Authority in the Municipality of Oslo (Utdanningsetaten), and the Centre for the Science of Learning & Technology (SLATE), University of Bergen, was responsible for research and for leading the project. In addition, the project group comprised 9 vendors from the Norwegian EdTech sector and 4 of the schools in Oslo. The project consulted representatives of the Learning Committee (Læringskomiteen SN/K 186) under Standards Norway, the organization responsible for the majority of standardization work in Norway (Standards Norway, 2019), and representatives from the Norwegian Directorate for Education and Training (Utdanningsdirektoratet), as well as taking feedback from the Norwegian Data Protection Authority (Datatilsynet), the Norwegian Competition Authority (Konkurransetilsynet), and representatives from the parent organization for schools (Foreldreutvalget for grunnskolen) and the student organization (Elevorganisasjonen).
The AVT project explored possibilities for using activity data to adapt learning to individual learner needs, and for formative and summative assessment at the K–12 level. Since learners generate activity data in a number of tools from different vendors, a challenge is how to integrate such data to provide richer information on the activities of each individual learner. Therefore, AVT looked at data sharing among different EdTech vendors, resulting in the implementation of a framework that helped to standardize data originating from different educational tools and systems, and which enables secure data flow among vendors. xAPI was the chosen format for data description, integration, and data exchange. To enable more consistent use of xAPI, the project used a number of concepts from a vocabulary that had been adapted and translated to Norwegian by Læringskomiteen SN/K 186, as they work with learning technology and e-learning (Standards Norway, 2020). SN/K 186 also participates in projects that develop artifacts based on standardization initiatives, such as AVT.
Method
This research adopted an exploratory case study methodology (Oates, 2006), see Fig. 2, using AVT as a real-world case. The subject of investigation was the challenges and limitations of using a current learning activity data standard (i.e., xAPI) for describing learning context with regard to interoperability and data integration, and how these might be addressed. Consequently, we involved stakeholders at the different stages of the research process. This paper addresses the first four steps of Fig. 2.
Initially, we studied the AVT project documents, prepared interview guides, did sampling and recruitment of participants, and prepared consent forms. Next, we interviewed stakeholders from the AVT project about the gaps and needs of xAPI, with emphasis on descriptions of context. To provide important background information, we also asked about the rationale for choosing xAPI and the process of using xAPI for data integration and data sharing. Using thematic data analysis, we then identified themes emerging from the interview data.
Subsequently, based on interview data and inspection of the xAPI and xAPI profile specifications, we formulated recommendations for how xAPI context can be improved regarding interoperability and data integration. In this paper, we provide a summary of the recommendations. In future work, we will provide a detailed account of the recommendations, implement a number of the recommendations in two separate projects, and validate the recommendations through stakeholder examination and interviews.
Fig. 2 Research methodology
Participants
The selection of participants was done through purposive sampling (Bryman, 2012, p. 418). Purposive sampling of participants is not done at random, but rather in a strategic way (Bryman, 2012). The point is to select participants that are appropriate for the research questions. Variety in the sample may be important, meaning that the sample members differ in characteristics relevant to the research questions.
Eight stakeholders in the AVT project were recruited for the interviews to identify gaps and needs of xAPI (see Table 3). The first seven interviews were conducted between October 22 and November 01, 2019. Through these seven interviews, it became clear that our sample was missing an important AVT member related to the questions we wished to answer in our research; thus, an additional interview was conducted on February 07, 2020.
The participants worked on a diverse set of tasks within the AVT project and their roles represented different perspectives. Two participants represented a developer perspective (i.e., they had experience with implementation of xAPI methods and preparation of datasets), three participants represented a leader perspective (two related to decision making for AVT; one was the leader of an external organization associated with AVT), two participants had a vendor perspective (one of these vendors had delivered data to AVT and the other had not), and there were also two technical advisors in the sample. The advisors were knowledgeable regarding standardization within the educational domain and gave advice to the rest of the project about how to use xAPI for describing activity data and context; they also made some examples of xAPI statements that describe activity data related to AVT, which the developers subsequently followed/used as a template.
Table 3 Participants, sorted by interview order
– P1 (female, leader): conducting meetings, delivery of documents, some technical work.
– P2 (male, developer): technical tasks within AVT (e.g., server, database, data sharing, security), contributed to xAPI example statements.
– P3 (male, leader): school owner representative, specifying how vendors should code xAPI activity data.
– P4 (male, technical advisor): vocabulary/profile work, detailed work on how to represent context for AVT activity data, created xAPI example statements.
– P5 (male, vendor that delivered data, developer): planning and implementation of vendor solution.
– P6 (female, vendor that did not deliver data): project leadership and coordination for vendor.
– P7 (female, leader of external organization associated with AVT): conducted meetings where a number of AVT members participated, which fed into the AVT project; work related to vocabularies and their use in Norway.
– P8 (male, technical advisor): vocabulary/profile work, detailed work on how to represent context for AVT activity data, participated in developing xAPI example statements, explored and informed vendors about tools and libraries for storage and exchange of activity data.
Data collection
Data collection was conducted using semi-structured interviews where the objective was to find answers to the following overarching questions:
– What was the rationale for choosing xAPI in the AVT project?
– What was the process of using xAPI for data integration and data sharing?
– What challenges and limitations of xAPI were identified when describing context for data integration?
Interview guides were developed based on study of several documents, including the final report for the AVT project (Morlandstø et al., 2019), the xAPI specification (Advanced Distributed Learning, 2017a), and the xAPI profile specification (Advanced Distributed Learning, 2018a). The interview guides contained a list of topics to be covered in the individual interviews, which can broadly be categorized as follows: Background (e.g., regarding the role/tasks of the participant in the AVT project, and their previous experience with xAPI), Leadership (e.g., related to reasons for choosing xAPI for the AVT project, and other decisions made within the project), Technical development (e.g., practical experiences of describing data in xAPI), Context (details about how context was represented using xAPI and reasons), and High-level topics (e.g., benefits and challenges of using learning activity data specifications for data integration).
The participants were interviewed according to their perspectives, roles, the tasks they had worked on, and their areas of competence. Each topic in the interview guide was covered by at least two participants. Before the interviews started, all participants were presented with a consent form explaining aspects such as the purpose of the project, that audio would be recorded for subsequent transcription, that participant information would be anonymized upon transcription and stored securely, and that their participation was voluntary and could be withdrawn at any time. Because the audio recordings could theoretically be used to identify the participants, the project was reported to the Norwegian Centre for Research Data (2020), which approved the project based on the measures taken concerning privacy and research ethics.
Participants were informed that the interviews could take up to 90 min, although most interviews finished in less than an hour. All interviews were conducted in Norwegian.
Data analysis
The study used a thematic analysis approach (Bryman, 2012) to analyze the transcribed interview data. The transcripts were collated and read through several times for familiarity with the content. The interview data were coded at two levels, using NVivo (2020). First, the data were coded according to our overarching questions; next, the data for each overarching question were coded at a more fine-grained level and themes were identified through an inductive process.
Findings
The analysis resulted in 32 codes during the second level of coding, which were further aggregated into seven themes, each pertaining to an overarching question (see Fig. 3).
Fig. 3 Questions and themes
The findings in each theme are summarized below. All quotations in this section have been translated from Norwegian to English.
Rationale for choosing xAPI
Open, flexible, and mature specification
A number of the responses given by the participants indicated that xAPI was chosen by AVT because it was open, flexible, and mature. P1 and P3 (representing a leadership perspective) explained that at the time of making the decision, xAPI had already been chosen as the preferred learning activity data specification for Norway by Standards Norway and the SN/K 186 committee, and this was the main reason that AVT chose to use xAPI. Another reason, mentioned by P1, was the openness and flexibility of xAPI.
Since the choice of xAPI by SN/K 186 was the main reason that AVT decided on xAPI, the rationale for choosing xAPI by SN/K 186 was also of interest during the interviews. As P3, P7, and P8 had actively taken part in or observed the choice of xAPI by Standards Norway and SN/K 186, they explained the reasons for the choice by the committee. Openness and flexibility were mentioned by all three participants. P3 and P7 specifically remarked on the possibility to customize xAPI for a particular use case/project (e.g., through profiles or extensions). P8 stated that while IMS Caliper was an alternative at the time of the SN/K 186 decision, it seemed less mature than xAPI. In particular, xAPI had more extensive documentation than Caliper. Two other aspects for choosing xAPI over Caliper, mentioned by P8, were that Caliper is more adjusted to the US educational system and that it is the vendors of the big EdTech systems that have the greatest influence on Caliper. Looking at the IMS members, it is clear that the majority are US companies and institutions (IMS Global, 2020). Griffiths and Hoel (2016) confirm that it is the member organizations of IMS that influence the use cases that Caliper can describe and that the specification seems to target the needs of larger vendors and institutions. Interestingly, P7 explained that SN/K 186 had not taken a definite stand that they would only use xAPI, but rather that they would try out the specification.
Familiarity
Another reason for choosing xAPI for the AVT project, as revealed by P1, was that several project members were already familiar with the specification. P4 and P5 confirmed that they had both used the specification in their work for EdTech vendors (P4 functioned as an advisor in the AVT project).
The AVT project members also had some knowledge about other Norwegian users of xAPI. P7 and P8 were aware of a smaller EdTech vendor that was using xAPI. P7 mentioned that a large higher education institution in Norway had experimented with the specification.
Process of using xAPI for data integration and data sharing
Practical experimentation and learning by trial and error
Using xAPI for data integration was very much a process of practical experimentation and trial and error, as P3 explained. As a starting point, the project used data in xAPI format from math tests in Oslo municipality (P2 converted the data to xAPI format). Having access to these data, a group including the technical advisors (P4 and P8) and P2 made one simple and one advanced example statement (more examples were later added), as explained by P3.
P1 and P3 stated that the vendors were asked to use the examples as templates or rules for how to construct xAPI statements. The examples used concepts (e.g., verbs) from the xAPI vocabulary, i.e., the collective vocabulary defined in the published xAPI profiles (Advanced Distributed Learning, 2020b), some of which had been translated to Norwegian by SN/K 186. Other concepts used in the examples, which were not available in the xAPI vocabulary, were defined in a separate xAPI profile. According to P4, following the examples was meant to ensure more uniform data descriptions, thus making data more easily integrable. For storage of xAPI statements, vendors were encouraged to implement their own LRS, which would accept queries in a specific format and return statements. Concerning data sharing, a prototype was developed that could be queried for student data. The prototype had an LRS component (storing data from the Oslo tests), access control (to allow secure data sharing from other LRSs, e.g., from vendors), and a limited user interface that could display some data about students, as explained by P2.
Initially, a number of the participating vendors in the AVT project indicated a willingness to share data. To make it easier for vendors to share data, and thus to get more data for the project, the requirements for how statements should be formulated were eventually eased, as explained by P1. While P6's company did not manage to deliver data in time, she did state that this decision would have allowed her company to deliver data eventually, as the data generated by their tool were at a higher level compared to the xAPI examples they were given. Also, P5's company, who did deliver data, ended up delivering data in a format that differed slightly from the examples. This was due to internals of their application, since some of the data generated did not fit into the built-in structures of xAPI. As P1 explained about the benefits and drawbacks of easing the requirements for the statements: "It gives advantages since we might get some additional data, but it gives drawbacks related to subsequent analysis. Because the data consistency, i.e., the quality of the entire data set, may not be as good. So, we have to weigh [drawbacks and benefits of] this the entire time." Still, at the end of the AVT project, only one vendor managed to share data in xAPI format. The data shared by the vendor was technically integrated with the Oslo municipality test data, but there was no individual student whose data appeared in both data sources (as stated by P1).
Thus, more could have been said about data integration if more data had been successfully delivered.
Expert-driven technical process
The AVT work on using xAPI for data integration and data sharing was very much driven by the technical advisors (P4 and P8), who were knowledgeable about standardization and xAPI. When asked in depth about how the project chose to describe context using xAPI, and why that decision was made, the participants who had a leadership perspective in AVT (P1 and P3), and P2 (developer), all said to rather ask the advisors P4 and P8. As P3 put it: "P4 and P8 are the two masterminds behind the very technical parts of this [project]." He stated that the project relied heavily on P4 and P8 in the area of modelling context and that the other AVT members trusted their recommendations. P4 and P8 contributed greatly to the xAPI profile and to the example xAPI statements. In addition, P8 worked on specifying vendor LRS requirements.
When asked if it was clear how to construct xAPI statements in terms of data integration, P4 stated that he had hoped more vendors would have shown an interest in how the statement examples were formulated, as this could have sparked interesting discussions. From the perspective of a vendor who delivered data, this view of vendors as showing a limited interest in statement formulation was confirmed by P5. He expressed that how statements were generated to allow for data integration from multiple sources was not a focus area for his company: "As a vendor this is not something we care about. Rather, it is those who consume our data that have to figure this out."
Challenges and limitations of xAPI
Data description constructs
In terms of describing context within the AVT project, concepts from the xAPI vocabulary were used. While this approach enabled the description of the data, P3 noted that the example xAPI statements did not use many of the concepts from the xAPI vocabulary, as correct use of the concepts was quite challenging. For instance, for the verb "answered," the many different types of items that could be solved or answered by a student could make it challenging to properly define the item. To add further concepts related to context that were not available in the xAPI vocabulary, an xAPI profile was developed for the AVT project, as explained by P3. The concepts were registered as metadata in the form of activity types. The activity types were used to represent concepts relevant for the AVT project, such as competence objective, school owner, and school. Thus, instances (i.e., activities) of these activity types could also be part of the statements. As P4 and P8 pointed out, using a profile allowed for restriction of the concepts that were used in xAPI statements. P8 also emphasized that profiles can explain the specific meaning of a concept.
Related to data description, two participants (P3 and P5) mentioned that mapping real-world data to the xAPI format could be difficult, due to the assumptions of xAPI. P5 mentioned a problem of registering the answer to an assignment in xAPI, where xAPI expected exactly one answer. The solution of the vendor that P5 represented, however, had multiple choice. Therefore, more than one answer could be correct (in combination).
To add contextual data to the context structure of the xAPI statements, beyond the data that could be added to the seven properties taking a single value or object that represents a single entity, P4 and P8 explained that the project had the choice between adding them to ContextActivities and extensions. In the end, it was decided to use ContextActivities, with the following structure: context -> ContextActivities -> grouping. Accordingly, the activities which collectively constituted the context data were grouped together. P4 said that part of the reason was that the example statements were inspired by CMI-5, a specification that has as its goal to enable a consistent information model for xAPI (CMI-5, 2020). He also stated that ContextActivities is a more standard way of using the xAPI specification, while extensions are a more custom approach. As P4 explained regarding extensions: "No one can know anything about how that format is." While P4 was initially in doubt about the use of ContextActivities, he did feel it was a better choice than using extensions.
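A hypothetical sketch of the context -> ContextActivities -> grouping structure just described is shown below: contextual information such as school owner, school, and competence objective is registered as activities whose activity types would come from a project profile. The IRIs and activity-type identifiers are placeholders invented for this sketch, not the actual identifiers used in the AVT profile.

```json
{
  "context": {
    "contextActivities": {
      "grouping": [
        {
          "objectType": "Activity",
          "id": "https://example.org/activities/school-owner/example-municipality",
          "definition": { "type": "https://example.org/activity-types/school-owner" }
        },
        {
          "objectType": "Activity",
          "id": "https://example.org/activities/school/example-school",
          "definition": { "type": "https://example.org/activity-types/school" }
        },
        {
          "objectType": "Activity",
          "id": "https://example.org/activities/competence-objective/example-objective",
          "definition": { "type": "https://example.org/activity-types/competence-objective" }
        }
      ]
    }
  }
}
```

Because these grouped entries are ordinary activities, they can be matched by the standard statement query filters of an LRS (when related activities are included in the match), which is the querying advantage P8 describes below, at the cost of the semantic mismatch that is also discussed.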
P8 pointed out that ContextActivities has a fixed structure, while extensions have a more flexible structure (e.g., an extension value can contain a JSON object or a list). He stated two reasons for choosing ContextActivities over extensions. The first was related to the libraries available for generating xAPI statements: it was much simpler to add ContextActivities than extensions. The second reason was that vendors were to set up their own LRS according to the xAPI specification. One requirement was to set up a query API, so that the LRS could be queried in a standardized way. As P2 stated, however, it was a challenge that LRSs do not have good query capabilities; the focus seems rather to be on data. Using ContextActivities made it easier to filter statements on contextual data (activity types). P8 explained that to filter based on the contextual data when using extensions, a new API would need to be built on top of the query API: "You would have to build another API on top of the existing, to enable those queries. So, in a way it [using ContextActivities] is a minor 'hack' (…). It was a way to enable more powerful queries or search." Thus, using ContextActivities enabled enhanced query capabilities without requiring extra development work. P8 pointed out that using ContextActivities to group contextual information had a significant drawback in terms of semantics. The activity types added to the xAPI profile, such as school, school owner, and competence objective, were not really activity types. Thus, the activities added to ContextActivities, which were instances of the activity types added in the AVT xAPI profile, were not really activities.
In a report on xAPI, Learning Pool, the developers of the Learning Locker LRS, tell a related story on how not adhering to semantics can make querying easier. They wanted to model that a user had liked a specific comment by another user (Betts & Smith, 2019). Semantically, the comment (an activity) was the object of such a statement, and the other user belonged in the context. In terms of querying, they wanted to count how many times a user had made comments that were liked by other users. In this instance, they found querying easier if the other user was in the object structure and the comment was in the xAPI context, even if this was not semantically correct. They commented: "This is just one example of how adopting the specification has been somewhat harder than we first thought it would be."
Another aspect of ContextActivities is how contextual information is grouped. When asked why grouping was used in place of category or other, both P4 and P8 were uncertain. P4 stated he had read a lot about the topic, but he was unsure about how they differed. There seemed to be a lack of semantics that set them apart. P4 said any one of them could be used in the statements, or they could all be used: "We could have said that competence objective is in grouping, but the school owner, which school it originated from, is in [category] or [other]. But that would just make the statements even more complex." In the end, P4 said they just needed to make a rule for which one to use, to ensure all vendors would follow the same procedure (and to not confuse the vendors). P8 also pointed out the difficulty of choosing between grouping, category, and other. When reflecting on the differences he stated: "Grouping… I don't really know what that means, really." He added that choosing between the three was made even more difficult since the activity types of the added activities were not really activity types.
(Semantic) differences between tools generating statements
Another challenge, expressed by P3, is that the openness of xAPI allows for the combination of data that may look similar, but that turn out not to be comparable. P4 mentioned duration as an example of a property that may be used differently in different EdTech systems and thus could be difficult to compare across sources. A property that was in fact problematic in the AVT project is the score of a student when solving a specific item, as mentioned by P3 and P5. Even when a score is set according to the same scale, it may not have the same meaning across systems. In AVT, the meaning of score was different in the tests administered by Oslo municipality and in the data delivered by the vendor, as their tool was a system for practicing, rather than testing. P4 and P5 noted that identifying these types of discrepancies requires the sources to be well known by those using them (e.g., for analysis). Because of the openness of xAPI, different tools may also generate a different number of statements for the same type of event, as pointed out by P3. He stated that this was not a big challenge for the AVT project, because only one vendor delivered data. When more than one vendor delivers data, however, it could become a challenge. Another challenge, concerning tool differences, is that the same type of data may be recorded at different levels of granularity by different tools. For instance, P6 expressed that her company attempted to deliver data that was at a higher level compared to the xAPI examples they were first given.
Semantic vs. technical interoperability
The process of using xAPI for data integration was expert-driven, as mentioned earlier. This was particularly apparent when asking the different stakeholders about data descriptions. When asking those with a leader perspective in AVT (P1 and P3) and those with a developer perspective (P2 and P5) if xAPI could satisfactorily describe the data, they generally agreed.
As P3 stated: "There is no data we have wanted to describe that we have not been able to describe with xAPI so far." When probing further about representation of context in xAPI, P1, P3, and P5 generally agreed that xAPI could do it in a satisfactory way. The developers based their data descriptions on the examples created by P4 and P8; thus, they knew xAPI more on a technical level of interoperability than on a semantic level. As P2 expressed it: "I didn't examine other options for xAPI. I just thought that here we have an example, then I will fill out the data I have according to the example." When P3, representing the leadership perspective in AVT, was asked if he thought it would have been more difficult to describe the data without the example, he agreed: "Yes, you can express almost anything with xAPI, so you have to start with a need."
When asking the technical advisors, who were more knowledgeable about xAPI and standardization, about data descriptions and context, they both mentioned several problems related to the openness/flexibility of xAPI and semantics. The different ways to represent context data (ContextActivities -> grouping, category, or other) was one example. They also saw problems in constructing xAPI statements in terms of data integration because of the many ways data of the same type might be described. P4 said that in the context of AVT you need to try to find concepts appropriate for the vendors and to make them follow rules/templates. P8 said creating statements for data integration might be achievable within a small project such as AVT, based on common rules and documentation. If combining data with another community of practice, however, it would be challenging: "There would be many sources of error concerning syntax, semantics, etc." He suggested library development and schema validation as possible ways to alleviate this challenge.
Recommendations
The analysis of the stakeholder interviews establishes that there is a lack of clarity in how to describe xAPI context data with regard to interoperability and data integration, which negatively affects the expressibility of xAPI. The challenges identified through systematic analysis of the interview data and inspection of the xAPI and xAPI profile specifications are shown in Table 4.
Based on the challenges identified, we provide recommendations (i.e., recommended solutions), in summarized form, on how xAPI can be improved to support interoperability and data integration, with emphasis on descriptions of xAPI context (see Table 4). The recommendations were specified using an iterative process, where we used a bottom-up approach of identifying challenges in the analyzed data and the xAPI and xAPI profile specifications, and a top-down approach of examining relevant research literature on context categorizations. Although the emphasis has been on descriptions of context, we additionally give some recommendations that pertain to the xAPI framework as a whole (e.g., data typing and validation, and improved documentation), as they are important with regard to the expressibility of xAPI context. Concerning our recommended improvements to xAPI, it should be noted that some recommended changes will require a change to the xAPI standard (or documentation), while others can be implemented using an xAPI profile. For this paper, the recommendations are at a conceptual rather than an implementation level.
Regarding the recommendations, data typing and validation for specific use cases can currently be addressed through two xAPI profile constructs (Advanced Distributed Learning, 2018b). Statement templates describe a way to structure xAPI statements, and can include rules, e.g., for restrictions on the data type of a specific property value, while patterns define how a group of statements should be ordered. Both constructs can be checked by an xAPI profile validator.
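As a minimal sketch of what such a construct can look like, the statement template below follows the structure defined in the xAPI profiles specification: the verb is a determining property, and the rules use JSONPath locations to require a scaled score and to require that at least one grouping context activity carries a particular activity type. All identifiers are illustrative and are not taken from the AVT profile or any published profile.

```json
{
  "id": "https://example.org/profiles/example/templates/answered-item",
  "type": "StatementTemplate",
  "inScheme": "https://example.org/profiles/example/v1.0",
  "prefLabel": { "en": "answered item" },
  "definition": { "en": "A learner answered an item; the statement must carry a scaled score and identify the school as a grouping context activity." },
  "verb": "http://adlnet.gov/expapi/verbs/answered",
  "rules": [
    {
      "location": "$.result.score.scaled",
      "presence": "included"
    },
    {
      "location": "$.context.contextActivities.grouping[*].definition.type",
      "presence": "included",
      "any": ["https://example.org/activity-types/school"]
    }
  ]
}
```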
Table 4 Challenges identified and recommended solutions
– Context. Challenge: The distinction between ContextActivities and extensions appears artificial, and it is not always clear which to use; extensions are flexible in how data can be registered and could therefore make it more difficult to integrate data; ContextActivities is not a good fit for all types of context data. Recommended solution: Use a unified structure for context in xAPI, i.e., context dimensions, with appropriate (low-level) properties for each context dimension. Use data typing and validation to restrict properties and value types for context dimensions. Depending on the property, the value type can be Activity, but other value types should also be supported (e.g., JSON object and string).
– Context. Challenge: Grouping of related activities in ContextActivities can be done within three different structures (grouping, category, or other); it is not clear how the three structures differ, and the grouping structures are all very high-level. Recommended solution: Remove the distinction between grouping, category, and other. All context data that do not have an explicit (parent) relationship to the statement should be placed in the same structure. For the suggested unified structure for context, i.e., context dimensions, the related information can be listed more explicitly as property values belonging to the appropriate dimension and property.
– Context. Challenge: The query capabilities of LRSs are seen as limited; for example, it is only possible to filter statements on contextual data that are instances of an activity type. Thus, to allow filtering of resources in AVT (without extra development work), resources were registered as activities, even if they were not really activities on the semantic level. Recommended solution: The xAPI specification defines the query interface that all LRSs must implement (Advanced Distributed Learning, 2017d). In the case of filtering based on resources, the specification needs to be extended so that it is possible to filter contextual data by any resource type. Individual LRS providers have addressed this issue on an ad hoc basis (Learning Locker, 2020), but the problem needs to be further addressed in the xAPI specification to ensure LRS interoperability (e.g., so that xAPI users do not have to rewrite substantial amounts of code when moving data between LRSs).
– Context. Challenge: Concepts from the xAPI vocabulary are not sufficient to describe all data. Recommended solution: Use xAPI profiles to add additional concepts.
– Context. Challenge: The same vocabulary concept may be represented in different public xAPI profiles, which make up the xAPI vocabulary. Recommended solution: A stricter curation/approval process for the public profiles.
– Data typing and validation. Challenge: Tools generating data at multiple levels of granularity is a challenge, which may make it more difficult to meaningfully integrate data. Recommended solution: To help tool developers identify and enforce the expected level of granularity, xAPI data typing and validation can be used. For instance, if a property takes a list of activities (more granular), validation can ensure that less granular values (e.g., integer) will not be accepted.
– Data typing and validation. Challenge: Difficulties in mapping real-world data to xAPI due to its assumptions. Recommended solution: Data typing and validation can help to ensure that the assumptions of xAPI (e.g., the expected value for an xAPI concept) are made more explicit and tested against the data, to avoid wrong use of the specification. It is also crucial that the xAPI specification can be extended as new use cases reveal new needs for data registration.
– Data typing and validation. Challenge: Different tools may generate different numbers of statements for the same type of event. Recommended solution: Validation could be tied to the number of statements generated for a given type of event, and could ensure that the statements generated follow an ordered pattern.
– Data typing and validation; documentation. Challenge: The openness and flexibility of xAPI allows data and relationships of the same type to be modelled in a myriad of different ways. Recommended solution: Add clearer modelling guidelines to the documentation. Add data typing and validation of properties and property values. Use profiles to specify vocabularies.
– Documentation. Challenge: It may be challenging to correctly use concepts from the xAPI vocabulary. Recommended solution: Improve the xAPI documentation, e.g., document more solutions for specific use cases, and add more examples of how to use the xAPI vocabulary concepts in order to avoid misunderstandings and remove ambiguity.
Recently (August 2020), Advanced Distributed Learning (2020a) published information that there are plans to standardize xAPI 2.0, which is an upgrade from the current version 1.0.3. An IEEE LTSC working group, comprising stakeholders from the xAPI community and technical experts, has agreed on the new standard; however, a formal balloting process must also be conducted in order to standardize. While there is information that some new structures will be introduced in order to describe context, it is indicated that these structures will allow more structured descriptions of individuals (i.e., contextAgents) and teams (i.e., contextGroups); thus, we do not believe they will solve the issues/challenges related to context that our research has identified and for which we provide recommendations in this paper. Another addition to the proposed xAPI 2.0 standard is a best-practices guide, "which will be linked to the eventual standard as a living document that can grow and change with advances in learning science and technologies" (Advanced Distributed Learning, 2020a). Based on the published information, this guide could help address the identified need for improvements in the xAPI documentation. Other changes in the proposed version 2.0 include forbidding additional properties in statements and standardizing timestamps. While these changes may help in terms of interoperability and data integration, they do not specifically relate to the challenges we have identified and the recommendations we have provided.
Discussion
Two research questions were posed in this paper, regarding (1) gaps and needs of xAPI in terms of interoperability and data integration, focusing on context descriptions, and (2) how the identified gaps and needs can be addressed in order to provide improved interoperability and data integration.
We have addressed RQ1 through analysis of the data from the AVT stakeholder interviews and inspection of the xAPI and xAPI profiles specifications, and RQ2 through providing summarized recommendations on how xAPI can be improved to support interoperability and data integration, with emphasis on descriptions of xAPI context. In the following, we discuss patterns and trends related to xAPI data descriptions and interoperability/data integration, which we have identified based on a review of research papers that utilize xAPI to describe data.
Although papers on xAPI commonly mention the benefits of xAPI in terms of interoperability, most of the studies that use or explore the use of xAPI to describe data worked with only one data source (Hruska, Long, Amburn, Kilcullen, & Poeppelman, 2014; Megliola, De Vito, Sanguini, Wild, & Lefrere, 2014; Papadokostaki, Panagiotakis, Vassilakis, & Malamos, 2017; Wu, Guo, & Zhu, 2020; Zapata-Rivera & Petrie, 2018). In such cases, there is no practical experience with the challenges and limitations regarding data descriptions and interoperability. For instance, the challenges related to the flexibility of xAPI and data descriptions do not readily appear. The data can be described in different ways, all accepted according to xAPI. It is when trying to integrate xAPI data from different sources that the challenges of inconsistent descriptions will surface. Thus, out of the five referenced examples using only one data source, four of them do not touch on challenges related to xAPI and data descriptions/interoperability. The exception is Hruska et al. (2014), who examine challenges with, and give examples of, how to encode information about teams/groups in xAPI statements, including descriptions of group context. Interestingly, Megliola et al. (2014) have many reflections on vocabulary (verbs and objects) for describing events in their modelling domain (aeronautics), but the reflections are based on multiple theories in linguistics rather than the practical application in xAPI.
Having examined the knowledge base, we have found a very limited number of studies that utilize xAPI to integrate data originating from multiple data sources and share lessons learnt from such a project. The CLA toolkit case study (Bakharia et al., 2016) was one such study. Here, data from different social media were described in xAPI format and integrated in an LRS for use in a systemic analytics solution. The project included designing a common vocabulary through re-use of concepts from W3C ActivityStreams 1.0 (https://activitystrea.ms/) and providing mappings from individual tool concepts to the common concepts. They also examined how context could potentially be described and made decisions on how to describe context in the project. At the time of the study, xAPI profiles had not been added to xAPI. Thus, vocabulary and data type/validation rules could not be described in a machine-readable manner. Rather, the vocabulary, together with prescriptions on how to describe social media data, was stored in a recipe. A recipe is a textual description of how xAPI statements for a certain experience can be constructed (Miller, 2018). At the time, recipes were the common means to share the implementation of an information model with a community of practice, serving as a potential aid in terms of interoperability. They were, however, not machine-readable.
Through actual usage of xAPI, the researchers working on the CLA toolkit were able to identify challenges and complexities of the data integration approach. Among the lessons learnt was that providing xAPI context data, while optional according to the standard, was essential for their project. In addition, they recommended that xAPI be extended with the JSON-LD specification, as the lack of machine-readable vocabularies and rules was a weakness of the xAPI specification.
Following the research by Bakharia et al. (2016), we see that xAPI has introduced capabilities for machine-readable vocabularies and rules, since the xAPI profiles specification has been added to xAPI. Thus, machine-readability is no longer a core problem of xAPI (although it is important that tools and libraries implement the methods needed to read metadata descriptions and apply data typing/validation rules). It is encouraging to see that xAPI has used the results from research when choosing to add JSON-LD capabilities.
xAPI, through xAPI profiles, which are defined using the JSON-LD format, leverages semantic technologies to allow for documents that are readable not only by humans, but also by machines. In their article, Verborgh and Vander Sande (2020) discuss the importance of not conducting research related to semantic technologies in a vacuum (e.g., a controlled research experiment). While researchers are often reluctant to take their solutions out of the lab, deeming large-scale technology use a trivial engineering problem, this article highlights that practical use of the technologies outside of safe research environments, e.g., through integrating data from sources containing real-world rather than synthetic data, is likely to uncover new challenges that need to be solved in order to promote adoption among practitioners. The results from our interviews indicate that there is indeed a need for more practical research on real-world data description and integration.
Research has shown that data integration within LA, an important part of LA scalability, is a challenge in itself. Previous research in the domain of LA and higher education has found that if data are integrated, they typically originate only from a few data sources; when data are integrated, the integration is often of data of similar formats (technically, this type of integration is easier than combining data of different formats), and there seems to be little use of learning activity data specifications such as xAPI (Samuelsen et al., 2019). One reason that xAPI is not used more for data integration may be that many tools do not support providing data in the form of xAPI statements. In such cases, the use of xAPI can be challenging, since transforming data to xAPI format is likely to require considerable effort. Depending on the situation, it may be more convenient to store data of similar formats originating from different data sources in a NoSQL database (e.g., if the data integrated are provided as JSON through REST APIs), even though the problem of aligning concepts from different sources will still be a challenge. In some cases, the integration approach may also be a manual one, e.g., through copying and pasting data from data sources into Excel sheets; an approach that may also require considerable human effort.
In the case of xAPI and other learning activity data specifications/standards, there seems to be a need for more tools supporting their formats.
While previous research has found little evidence of xAPI use in the research literature (Samuelsen et al., 2019), the xAPI guide by Learning Pool provides information that there are tools that can export data in xAPI format (Betts & Smith, 2019). These include LMSs, content management systems, and authoring tools. The tools often provide the xAPI export functionality through plugins. Due to the flexibility of xAPI data descriptions, however, one cannot expect statements from these tools to meaningfully integrate (i.e., to scale), since different plugins, made by different developers, may use different syntactic and semantic structures for describing the same data. For the statements of an xAPI-enabled tool to be compatible with statements from a number of other xAPI-enabled tools, we can imagine that the tool will need a number of similar plugins, each plugin making statements compatible with statements from a specific tool or set of tools. While there are architectures (Apereo, 2020; JISC, 2020) that provide connectors/plugins for several tools to store their data as xAPI in a common data store, the problem is still not solved, since different architectures can also model data of similar types in different ways. Thus, there is a need for tools that can provide xAPI data in a coherent format.
Looking at studies using xAPI, we find, consistent with its flexibility, a diverse set of examples of how researchers add, or intend to add, context data. Not all examples seem to be within the intended use of xAPI. Wu et al. (2020) demonstrate using ContextActivities—parent to store a verb as an activity, even though a verb is not an activity and should not be the parent of an activity. Sottilare, Long, and Goldberg (2017) propose to add course to the xAPI result structure, even though it is added in the published xAPI vocabulary as an activity type that can be used in the context structure with ContextActivities (Advanced Distributed Learning, 2020b). We also find examples related to the different ways ContextActivities can be used to describe data. Several works use ContextActivities—grouping and ContextActivities—other in the same statement (Claggett, 2018; Hruska et al., 2014), even though we have identified that it is not really clear that there is a difference between the two structures. These examples confirm our findings that there is a need to enhance the expressibility of context in xAPI. The recommended addition of context dimensions would be one way to enable more consistent description of the data. Documentation, data typing, and validation should provide clarity and guidance regarding how to achieve this goal.
In the xAPI specification, learning objects and their context are represented through activities. An activity has an id (an identifying IRI) and an optional definition stored in an Activity Definition Object (Advanced Distributed Learning, 2017c). The activity definition can be stored together with the activity id in the xAPI statement, or it can be stored at the activity id IRI and downloaded by the LRS (hosted metadata solution).
While xAPI can represent learning objects and their context, the topic of representing learning objects has also been addressed by several other standards, including "ISO/IEC 19788 Information technology - Learning, education and training - Metadata for learning resources" (International Organization for Standardization, 2011) and "Encoded Archival Context - Corporate bodies, Persons and Families" (Mazzini & Ricci, 2011). Thus, the xAPI community could look to such standards when aiming to provide future improvements for learning object representation.
The challenges we have identified with xAPI may lead us to question whether IMS Caliper would currently be better suited for representing interaction data in the educational domain. Examining the specification (IMS Global, 2018), we find there are 14 available profiles (called metric profiles), which target experiences such as assessment, forum, and tool use. Similar to xAPI, Caliper also supports machine-readable data descriptions through JSON-LD. It seems that Caliper has the potential to avoid many of the xAPI problems caused by flexibility. For instance, it provides a pre-defined vocabulary of available verbs (called actions). In addition, a specific event (e.g., assessment event) has a number of pre-defined properties for context that can or must be specified. The properties available vary based on the event type. In cases where there is a need to add a context property not available in the specification, there is a generic extension object where additional properties can be added (similar to extensions in xAPI). When describing experiences not included in the specification, however, the only option is to use the generic (basic) profile, which supports only generic events (which can use any number of properties), but which can only use the verbs from the pre-defined vocabulary. It seems the use of the basic profile will also pose challenges related to interoperability due to its flexibility in terms of using generic events. Furthermore, while waiting for the addition of new profiles, the pre-defined verbs may not cover the actual needs of users. Since it appears that Caliper is more geared toward the big EdTech companies (Griffiths & Hoel, 2016), this becomes a considerable barrier to adoption for smaller vendors and researchers. For projects that need to support experiences outside of the 14 experiences for which there are profiles, it seems that xAPI would be the better choice after all.
In this case study, we have focused our attention on one specific case, the AVT project. Although we have only examined one project, it is one that really strives to integrate multiple data sources. This case study is one of few studies that has emphasized revealing and addressing challenges and limitations in a data specification, focusing on context descriptions. The research is conducted outside of the safe lab environment, meaning we can identify challenges that would otherwise remain unnoticed. While others have previously identified some of the challenges and limitations of xAPI, this seems to be the first paper that systematically examines challenges and limitations of xAPI context through involving stakeholders having experienced xAPI in a real-world case, and that gives recommendations on how the context descriptions can be enriched through changes to the context structure and other means in order to improve expressibility.
Conclusion and future work
This paper presents an exploratory case study, taking place in a real-world setting, using the AVT project as a case. The research has aimed to systematically identify challenges and limitations of using a current learning activity data standard (i.e., xAPI) for describing learning context with regard to interoperability and data integration. Subsequently, we have provided recommendations, in summarized form, for the identified challenges and limitations. Our research has identified a lack of clarity in how to describe context data in xAPI regarding interoperability and data integration. The recommendations relate not only to the description/modelling of context in xAPI, but also to data typing, validation, and documentation, as all of these are essential to enhance the expressibility of xAPI context.
Despite xAPI's potential regarding interoperability, we see a tendency in studies using xAPI that most of them describe data from only one data source. Additionally, in the cases where multiple data sources are actually integrated, few reflect on limitations or challenges concerning data descriptions. In order to scale up LA, particularly when integrating data from multiple sources, it is essential to describe data in a coherent way. Therefore, we strongly encourage others in the LA research community using xAPI for data integration to try out the recommended solutions in their own projects. Currently, it is not possible to make use of all the recommendations, since some will require a change to the xAPI/xAPI profile specifications. Among the recommendations that can be implemented now, we especially highlight the use of xAPI profiles to provide vocabularies and to specify shared data typing and validation rules (through statement templates and patterns).
We acknowledge that there are some limitations to our research. Due to the qualitative approach, where we thematically analyzed data from in-depth interviews with a limited number of participants representing different stakeholder perspectives, the findings are based on our study alone, although many of the challenges are supported by the literature. Thus, although our results may not be generalizable, they are based on a real-life case involving multiple stakeholders and multiple data sources. Furthermore, the recommendations have not yet been detailed in depth, implemented, and validated. In future work, we will proceed with the next steps in the methodology, including detailing the recommended solutions that are summarized in this paper, and stakeholder validation of the implementable recommendations through using the xAPI and xAPI profile specifications for data descriptions in two separate projects.
Abbreviations
AVT: Activity Data for Assessment and Adaptation; CAM: Contextual Attention Metadata; IRI: Internationalized Resource Identifier; JSON-LD: JSON for Linking Data; LA: Learning Analytics; LMS: Learning Management System; LOCO: Learning Object Context Ontologies; LRS: Learning Record Store; LCDM: Learning Context Data Model; RCM: Rich Context Model; xAPI: Experience API
Acknowledgements
The authors wish to thank the participants for their valuable contributions in identifying challenges and limitations of xAPI.
Authors' contributions
This paper is a part of the PhD project conducted by JS. BW is her main supervisor, and WC is her co-supervisor.
WC has been working closely with JS in planning and has assisted in writing. BW has assisted in planning and writing. All authors read and approved the final manuscript.

Funding
This research is part of Jeanette Samuelsen's PhD, funded by the Centre for the Science of Learning & Technology (SLATE), University of Bergen, Norway.

Availability of data and materials
To protect the privacy of the participants, the data cannot be shared.

Declaration

Competing interests
There is no conflict of interest related to this manuscript.

Author details
1 Centre for the Science of Learning & Technology, University of Bergen, P.O. Box 7807, 5020 Bergen, Norway. 2 Department of Information Science & Media Studies, University of Bergen, P.O. Box 7802, 5020 Bergen, Norway. 3 Oslo Metropolitan University, Oslo, Norway.

Received: 2 October 2020 Accepted: 23 February 2021

References
Advanced Distributed Learning. (2017a). xAPI specification. Retrieved from https://github.com/adlnet/xAPI-Spec
Advanced Distributed Learning. (2017b). xAPI specification - part one: About the experience API. Retrieved from https://github.com/adlnet/xAPI-Spec/blob/master/xAPI-About.md#partone
Advanced Distributed Learning. (2017c). xAPI specification - part two: Experience API data. Retrieved from https://github.com/adlnet/xAPI-Spec/blob/master/xAPI-Data.md#parttwo
Advanced Distributed Learning. (2017d). xAPI specification - part three: Data processing, validation, and security. Retrieved from https://github.com/adlnet/xAPI-Spec/blob/master/xAPI-Communication.md#partthree
Advanced Distributed Learning. (2018a). xAPI profiles specification. Retrieved from https://github.com/adlnet/xapi-profiles
Advanced Distributed Learning. (2018b). xAPI profile specification - part two: xAPI profiles document structure specification. Retrieved from https://github.com/adlnet/xapi-profiles/blob/master/xapi-profiles-structure.md#part-two
Advanced Distributed Learning. (2020a). Anticipating the xAPI Version 2.0 Standard. Retrieved from https://adlnet.gov/news/2020/08/06/Anticipating-the-xAPI-Version-2.0-Standard/
Advanced Distributed Learning. (2020b). xAPI authored profiles. Retrieved from https://github.com/adlnet/xapi-authored-profiles/
Apereo. (2020). Learning Analytics Initiative | Apereo. Retrieved from https://www.apereo.org/communities/learning-analytics-initiative
Bakharia, A., Kitto, K., Pardo, A., Gašević, D., & Dawson, S. (2016). Recipe for success: Lessons learnt from using xAPI within the connected learning analytics toolkit. In Proceedings of the sixth international conference on learning analytics & knowledge (pp. 378–382).
Betts, B., & Smith, R. (2019). The learning technology manager's guide to xAPI (Version 2.2). Retrieved from https://learningpool.com/guide-to-xapi/
Bryman, A. (2012). Social research methods (4th ed.). Oxford: Oxford University Press.
Claggett, S. (2018). xAPI Game Demo Example Part 1 [Blog post]. Retrieved from https://gblxapi.org/community-blog-xapi-gbl/10-xapi-demo-example-threedigits
CMI-5. (2020). The cmi5 Project. Retrieved from https://github.com/AICC/CMI-5_Spec_Current
Dey, A. K. (2001). Understanding and using context. Personal and Ubiquitous Computing, 5(1), 4–7.
European Commission. (2017). New European Interoperability Framework. Retrieved from https://ec.europa.eu/isa2/sites/isa/files/eif_brochure_final.pdf
Griffiths, D., & Hoel, T. (2016). Comparing xAPI and Caliper (Learning Analytics Review, No. 7). Bolton: LACE.
Hruska, M., Long, R., Amburn, C., Kilcullen, T., & Poeppelman, T. (2014).
Experience API and team evaluation: Evolving interoperable performance assessment. In The Interservice/Industry Training, Simulation & Education Conference (I/ITSEC).
IMS Caliper Analytics. (2020). Caliper Analytics | IMS Global Learning Consortium. Retrieved from https://www.imsglobal.org/activity/caliper
IMS Global. (2018). IMS Caliper Specification v1.1. Retrieved from https://www.imsglobal.org/sites/default/files/caliper/v1p1/caliper-spec-v1p1/caliper-spec-v1p1.html
IMS Global. (2020). Members | IMS Global. Retrieved August 20, 2020, from https://site.imsglobal.org/membership/members
International Organization for Standardization. (2011). ISO/IEC 19788-1:2011 Information technology — Learning, education and training — Metadata for learning resources — Part 1: Framework. Retrieved from https://www.iso.org/standard/50772.html
JISC. (2020). Learning records warehouse: Technical overview: Integration overview. Retrieved from https://docs.analytics.alpha.jisc.ac.uk/docs/learning-records-warehouse/Technical-Overview:%2D%2DIntegration-Overview
Jovanović, J., Gašević, D., Knight, C., & Richards, G. (2007). Ontologies for effective use of context in e-learning settings. Journal of Educational Technology & Society, 10(3), 47–59.
Keehn, S., & Claggett, S. (2019). Collecting standardized assessment data in games. Journal of Applied Testing Technology, 20(S1), 43–51.
Learning Locker. (2020). Aggregation HTTP interface. Retrieved from https://docs.learninglocker.net/http-aggregation/
Lincke, A. (2020). A computational approach for modelling context across different application domains (Doctoral dissertation, Linnaeus University Press). Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-93251
Mazzini, S., & Ricci, F. (2011). EAC-CPF ontology and linked archival data. In SDA (pp. 72–81).
Megliola, M., De Vito, G., Sanguini, R., Wild, F., & Lefrere, P. (2014). Creating awareness of kinaesthetic learning using the Experience API: Current practices, emerging challenges, possible solutions. In CEUR Workshop Proceedings (vol. 1238, pp. 11–22).
Miller, B. (2018). Profile Recipes vs. xAPI Profiles [Blog post]. Retrieved from https://xapi.com/blog/profile-recipes-vs-xapi-profiles/
Morlandstø, N. I., Hansen, C. J. S., Wasson, B., & Bull, S. (2019). Aktivitetsdata for vurdering og tilpasning: Sluttrapport (SLATE Research Report 2019-1). Bergen: Centre for the Science of Learning & Technology (SLATE). ISBN 978-82-994238-7-8.
Muslim, A., Chatti, M. A., Mahapatra, T., & Schroeder, U. (2016). A rule-based indicator definition tool for personalized learning analytics. In Proceedings of the sixth international conference on learning analytics & knowledge (pp. 264–273).
Norwegian Centre for Research Data. (2020). NSD - Norwegian Centre for Research Data. Retrieved from https://nsd.no/nsd/english/index.html
NVivo. (2020). Qualitative Data Analysis Software | NVivo. Retrieved from https://www.qsrinternational.com/nvivo-qualitative-data-analysis-software/home
Oates, B. J. (2006). Researching Information Systems and Computing. London: SAGE Publications.
Papadokostaki, K., Panagiotakis, S., Vassilakis, K., & Malamos, A. (2017). Implementing an adaptive learning system with the use of Experience API. In Interactivity, Game Creation, Design, Learning, and Innovation (pp. 393–402). Cham: Springer.
Samuelsen, J., Chen, W., & Wasson, B. (2019).
Integrating multiple data sources for learning analytics—review of literature. Research and Practice in Technology Enhanced Learning, 14(1). https://doi.org/10.1186/s41039-019-0105-4
Schmitz, H. C., Wolpers, M., Kirschenmann, U., & Niemann, K. (2011). Contextualized attention metadata. In Human attention in digital environments (pp. 186–209).
Siemens, G. (2011). 1st international conference on learning analytics and knowledge. Technology Enhanced Knowledge Research Institute (TEKRI). Retrieved from https://tekri.athabascau.ca/analytics/
Sottilare, R. A., Long, R. A., & Goldberg, B. S. (2017). Enhancing the Experience Application Program Interface (xAPI) to improve domain competency modeling for adaptive instruction. In Proceedings of the Fourth (2017) ACM Conference on Learning@Scale (pp. 265–268).
Standards Norway. (2019). Standards Norway. Retrieved from https://www.standard.no/en/toppvalg/about-us/standards-norway/
Standards Norway. (2020). SN/K 186. Retrieved from https://www.standard.no/standardisering/komiteer/sn/snk-186/
Thüs, H., Chatti, M. A., Brandt, R., & Schroeder, U. (2015). Evolution of interests in the learning context data model. In Design for Teaching and Learning in a Networked World (pp. 479–484). Cham: Springer.
Thüs, H., Chatti, M. A., Yalcin, E., Pallasch, C., Kyryliuk, B., Mageramov, T., & Schroeder, U. (2012). Mobile learning in context. International Journal of Technology Enhanced Learning, 4(5–6), 332–344.
Verborgh, R., & Vander Sande, M. (2020). The Semantic Web identity crisis: In search of the trivialities that never were. Semantic Web Journal, 11(1), 19–27. IOS Press. Retrieved from https://ruben.verborgh.org/articles/the-semantic-web-identity-crisis/
Vidal, J. C., Rabelo, T., & Lama, M. (2015). Semantic description of the Experience API specification. In 2015 IEEE 15th International Conference on Advanced Learning Technologies (pp. 268–269).
Wasson, B., Morlandstø, N. I., & Hansen, C. J. S. (2019). Summary of SLATE Research Report 2019-1: Activity data for assessment and activity (AVT). Bergen: Centre for the Science of Learning & Technology (SLATE). Retrieved from https://bora.uib.no/handle/1956/20187
Wu, Y., Guo, S., & Zhu, L. (2020). Design and implementation of data collection mechanism for 3D design course based on xAPI standard. Interactive Learning Environments, 28(5), 602–619.
Zapata-Rivera, L. F., & Petrie, M. M. L. (2018). xAPI-based model for tracking on-line laboratory applications. In 2018 IEEE Frontiers in Education Conference (FIE) (pp. 1–9).

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
