Multi-device coverage testing of mobile applications
Vilkomir, Sergiy
2017 Software Quality Journal
doi: 10.1007/s11219-017-9357-7
This paper evaluates the effectiveness of coverage approaches for selecting mobile devices (i.e., smartphones and tablets) on which to test mobile software applications. Because of the large number of such devices on the market and the variations in their characteristics, it is hard to guarantee that an application will work as intended on all devices; for this reason, multi-device testing is necessary. The goal of this research was to determine how many devices must be tested and which device-selection methods are best at revealing device-specific faults. We experimentally investigated simple coverage of all values of each device characteristic separately and each-choice coverage (i.e., coverage of all device characteristics simultaneously). To collect the experimental data, 15 Android applications were tested on 30 mobile devices, and 24 device-specific faults were detected. Our research shows that a random selection of 13 devices achieved 100% effectiveness. However, covering device characteristics during the selection process yielded an acceptable 90% level of effectiveness with a set of only five devices. The most successful approaches were coverage of the different types of Android operating systems and each-choice coverage. Our results include recommendations for increasing the effectiveness of mobile testing while decreasing its costs.
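To make the each-choice criterion concrete, here is a minimal greedy sketch in Python. It is not the paper's selection procedure, and the device names and characteristics are invented; it only illustrates the idea that every value of every characteristic must appear in at least one selected device.

    def each_choice_selection(devices):
        """Greedily pick devices until every (characteristic, value) pair is covered."""
        required = {(k, v) for d in devices for k, v in d["specs"].items()}
        selected, covered = [], set()
        while covered < required:  # strict subset: some pairs remain uncovered
            # Pick the device that covers the most still-uncovered pairs.
            best = max(devices, key=lambda d: len(set(d["specs"].items()) - covered))
            selected.append(best["name"])
            covered |= set(best["specs"].items())
        return selected

    devices = [
        {"name": "A", "specs": {"os": "6.0", "screen": "small", "maker": "Samsung"}},
        {"name": "B", "specs": {"os": "7.0", "screen": "large", "maker": "LG"}},
        {"name": "C", "specs": {"os": "7.0", "screen": "small", "maker": "Motorola"}},
    ]
    print(each_choice_selection(devices))  # a small subset covering every value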
A model for estimating change propagation in software
M. Ferreira, Kecia; S. Bigonha, Mariza; S. Bigonha, Roberto; Lima, Bernardo; Gomes, Bárbara; O. Mendes, Luiz
2017 Software Quality Journal
doi: 10.1007/s11219-017-9358-6
A major issue in software maintenance is change propagation. A software engineer should be able to assess the impact of a change in a software system so that the effort to accomplish the maintenance can be properly estimated. We define a novel model, named K3B, for estimating change propagation impact. The model aims to predict how far a set of changes will propagate throughout the system. K3B is a stochastic model whose inputs are parameters describing the system and the number of modules that will be changed initially. K3B returns the estimated number of change steps, considering that a module may be changed more than once during a modification process. We provide an implementation of K3B for object-oriented programs. We compare our implementation with data from an artificial scenario, given by simulation, as well as with data from a real scenario, given by historical data. We found a strong correlation between the results given by K3B and the results observed in the simulation, as well as with historical data of change propagation. K3B may be used to compare software systems from the viewpoint of change impact, and it may aid software engineers in allocating proper resources to maintenance tasks.
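As a rough illustration of the quantity K3B estimates, the following Monte Carlo sketch propagates changes over an invented module dependency graph and averages the number of change steps, allowing a module to be changed more than once. It is a simplification for illustration, not the K3B model itself.

    import random

    def simulate_propagation(dependents, initially_changed, p=0.3,
                             runs=10_000, max_steps=1_000):
        """Average change steps when a change reaches each dependent with probability p."""
        total = 0
        for _ in range(runs):
            queue, steps = list(initially_changed), 0
            while queue and steps < max_steps:
                module = queue.pop()
                steps += 1  # each (re)change of a module counts as one change step
                for dep in dependents.get(module, ()):
                    if random.random() < p:
                        queue.append(dep)  # a module may be changed more than once
            total += steps
        return total / runs

    # dependents[m] lists the modules that depend on (use) module m
    dependents = {"core": ["ui", "db"], "db": ["ui"], "ui": []}
    print(simulate_propagation(dependents, ["core"]))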
Round-trip engineering with the Two-Tier Programming Toolkit
Eden, A.H.; Gasparis, E.; Nicholson, J.; Kazman, R.
2017 Software Quality Journal
doi: 10.1007/s11219-017-9363-9
A major impediment to the long-term quality of large and complex programs is inconsistency between design and implementation. Conflicts between intent and execution are common because detecting them is laborious, error-prone, and poorly supported, and because the costs of continuously maintaining design documents outweigh the immediate gains. A growing inconsistency between design and implementation results in software that is unpredictable and poorly understood. Round-trip engineering tools support an iterative process of detecting conflicts and resolving them by changing either the design or the implementation. We describe a Toolkit that supports round-trip engineering of native Java programs without interfering with any existing practices, tools, or development environments, thereby posing a minimal barrier to adoption. The Toolkit includes a user-guided software visualization and design recovery tool, which generates Codecharts from source code. A “round-trip” process is possible because Codecharts visualizing source code can be edited to reflect the intended design, and the Verifier can detect conflicts between the intended and as-implemented design. We demonstrate each stage in this process, showing how the Toolkit effectively helps to close the gap between design and implementation, recreate design documentation, and maintain consistency between intent and execution.
Recognising object-oriented software design quality: a practitioner-based questionnaire survey
Stevenson, Jamie; Wood, Murray
2017 Software Quality Journal
doi: 10.1007/s11219-017-9364-8
Design quality is vital if software is to be maintainable. What practices do developers actually use to achieve design quality in their day-to-day work, and which of these do they find most useful? To discover the extent to which practitioners concern themselves with object-oriented design quality and the approaches they use when determining quality in practice, we conducted a questionnaire survey of 102 software practitioners, approximately half from the UK and the remainder from elsewhere around the world. Individual and peer experience are major contributors to design quality. Classic design guidelines, well-known lower-level practices, tools, and metrics can also contribute positively to design quality. There is a potential relationship between testing practices and design quality. Inexperience, time pressures, novel problems, novel technology, and imprecise or changing requirements may have a negative impact on quality. The most experienced respondents are more confident in their design decisions, place more value on reviews by team leads, and are more likely to rate design quality as very important. For practitioners, these results identify the techniques and tools that other practitioners find effective. For researchers, the results highlight a need for more work investigating the role of experience in the design process and the contribution experience makes to quality. There is also potential for more in-depth studies of how practitioners actually use design guidance, including Clean Code. Lastly, the potential relationship between testing practices and design quality merits further investigation.
Evaluating perceived and estimated data quality for Web 2.0 applications: a gap analysis
Han, Wen-Ming
2017 Software Quality Journal
doi: 10.1007/s11219-017-9365-7
To increase user satisfaction and project a positive image, the quality of software needs to be continuously improved. This study empirically investigates the importance of 15 quality characteristics and evaluates how well Web 2.0 applications perform on those characteristics from a data quality perspective. Based on questionnaire responses from 279 participants and the results of an importance–performance analysis, the performance of all data quality characteristics was found to be below end-user expectations. Confidentiality showed the greatest discrepancy between importance and performance.
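The underlying gap computation is straightforward to illustrate. A minimal sketch with invented characteristic names and ratings (the paper's data differ), where a negative gap flags performance below expectation:

    ratings = {
        # characteristic: (mean importance, mean performance) on a 1-5 scale
        "confidentiality": (4.6, 3.2),
        "accuracy":        (4.4, 3.9),
        "currentness":     (4.1, 3.8),
    }

    # Sort by gap so the largest shortfall (most negative) is printed first.
    for name, (imp, perf) in sorted(ratings.items(), key=lambda kv: kv[1][1] - kv[1][0]):
        gap = perf - imp  # negative gap: performance below user expectation
        print(f"{name:16s} importance={imp:.1f} performance={perf:.1f} gap={gap:+.1f}")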
The effect of requests for user feedback on Quality of Experience
Fotrousi, Farnaz; Fricker, Samuel; Fiedler, Markus
2017 Software Quality Journal
doi: 10.1007/s11219-017-9373-7
Companies are interested in knowing how users experience and perceive their products. Quality of Experience (QoE) is a measurement used to assess the degree of delight or annoyance in experiencing a software product. To assess QoE, we used a feedback tool integrated into a software product to ask users about their QoE ratings and to obtain their rationales for good or bad QoE. It is known that requests for feedback may disturb users; however, little is known about the subjective reasoning behind this disturbance or about whether it negatively affects the QoE of the software product for which the feedback is sought. In this paper, we present a mixed qualitative-quantitative study with 35 subjects that explores the relationship between feedback requests and QoE. The subjects experienced a requirement-modeling mobile product integrated with a feedback tool. During and at the end of the experience, we collected the users’ perceptions of the product and of the feedback requests. Based on the users’ rationales for being disturbed by the feedback requests, such as “early feedback,” “interruptive requests,” “frequent requests,” and “apparently inappropriate content,” we modeled feedback requests as a five-tuple: the “task,” the “timing” of the feedback request within the task, the user’s “expertise-phase” with the product, the “frequency” of feedback requests about the task, and the “content” of the feedback request. The configuration of these parameters may drive participants’ perceived disturbance. We also found that the disturbance generated by triggering user feedback requests has a negligible impact on the QoE of software products. These results imply that software product vendors may trust users’ feedback even when the feedback requests disturb the users.
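The five-tuple model lends itself to a simple data structure. A minimal sketch, with field types and example values that are assumptions for illustration rather than the paper's definitions:

    from dataclasses import dataclass

    @dataclass
    class FeedbackRequest:
        task: str             # the task the feedback request concerns
        timing: str           # when during the task the request is issued
        expertise_phase: str  # the user's expertise phase with the product
        frequency: int        # how often feedback about this task is requested
        content: str          # what the request asks the user for

    request = FeedbackRequest(
        task="model a requirement",
        timing="after task completion",  # avoids "early" or "interruptive" requests
        expertise_phase="experienced",
        frequency=1,                     # avoids "frequent requests"
        content="rate your experience",
    )
    print(request)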
Semantic languages for developing correct language translations
Barroca, Bruno; Amaral, Vasco; Buchs, Didier
2017 Software Quality Journal
doi: 10.1007/s11219-016-9352-4
The development and validation of language translators (e.g., porting programs, language preprocessors, high-level software language compilers) are time-consuming and error-prone: language engineers need to master the syntactic constructs of both the source and target languages and, most importantly, their semantics. In this paper, we present an innovative approach for developing and validating such language translators based on two languages: with the first, we specify a language translation as a syntax-to-syntax mapping; with the second, we define the semantics of both the source and target languages. After showing how such specifications can be combined to validate and generate language translators automatically, we demonstrate the feasibility of the approach on a particular modelling language translation.
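As a toy illustration of a syntax-to-syntax mapping (invented node names, far simpler than a real grammar-to-grammar specification with semantic validation), consider rewriting source AST constructs into target constructs:

    def translate(node):
        """Recursively map source AST node kinds to target node kinds."""
        mapping = {"while": "loop", "assign": "store", "var": "load"}
        kind, *children = node
        return (mapping.get(kind, kind),
                *(translate(c) if isinstance(c, tuple) else c for c in children))

    source_ast = ("while", ("var", "i"), ("assign", "i", 0))
    print(translate(source_ast))  # ('loop', ('load', 'i'), ('store', 'i', 0))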
Estimating software robustness in relation to input validation vulnerabilities using Bayesian networks
Ufuktepe, Ekincan; Tuglular, Tugkan
2017 Software Quality Journal
doi: 10.1007/s11219-017-9359-5
Estimating the robustness of software in the presence of invalid inputs has long been a challenging task, because developers often fail to take the necessary actions to validate inputs during the design and implementation of software. We propose a method for estimating the robustness of software with respect to input validation vulnerabilities using Bayesian networks. The proposed method runs on all functions and/or methods of a program. It calculates a robustness value using information on the presence of input validation code in the functions together with the common weakness scores of known input validation vulnerabilities. In a case study, we evaluate ten well-known software libraries implemented in JavaScript, chosen for their growing popularity among software developers. Using our method, software development teams can track how the changes they make to software affect its ability to deal with invalid inputs.
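The paper's estimate comes from a Bayesian network; the following hand-rolled sketch only illustrates the inputs involved (presence of validation code and weakness scores), with invented function names, probabilities, and scores:

    functions = [
        # (name, has input validation code?, weakness score of the relevant CWE, 0-10)
        ("parseQuery", False, 7.5),
        ("renderHtml", False, 8.1),
        ("formatDate", True,  3.2),
    ]

    def robustness(funcs, p_fail_validated=0.05, p_fail_unvalidated=0.60):
        """Score in [0, 1]; higher means more robust to invalid inputs."""
        risk = sum((p_fail_validated if validated else p_fail_unvalidated) * weakness
                   for _, validated, weakness in funcs)
        worst = sum(weakness for _, _, weakness in funcs)  # every failure certain
        return 1.0 - risk / worst

    print(f"robustness = {robustness(functions):.2f}")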
Software defect prediction: do different classifiers find the same defects?
Bowes, David; Hall, Tracy; Petrić, Jean
2017 Software Quality Journal
doi: 10.1007/s11219-016-9353-3
During the last 10 years, hundreds of different defect prediction models have been published. The performance of the classifiers used in these models is reported to be similar, with models rarely performing above a predictive performance ceiling of about 80% recall. We investigate the individual defects that four classifiers predict and analyse the level of prediction uncertainty these classifiers produce. We perform a sensitivity analysis to compare the performance of Random Forest, Naïve Bayes, RPart, and SVM classifiers when predicting defects in NASA, open source, and commercial datasets. The defect predictions that each classifier makes are captured in a confusion matrix, and the prediction uncertainty of each classifier is compared. Despite similar predictive performance values, each of the four classifiers detects a different set of defects. Some classifiers are more consistent in predicting defects than others. Our results confirm that a unique subset of defects can be detected by specific classifiers; however, while some classifiers are consistent in the predictions they make, others vary. Given these results, we conclude that classifier ensembles with decision-making strategies not based on majority voting are likely to perform best in defect prediction.
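A minimal sketch of the comparison idea, using scikit-learn stand-ins on synthetic data (RPart approximated by a decision tree; the paper's datasets and experimental setup differ):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    # Synthetic "module metrics" with a minority defect class.
    X, y = make_classification(n_samples=600, n_features=10, weights=[0.8], random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    classifiers = {
        "RandomForest": RandomForestClassifier(random_state=0),
        "NaiveBayes":   GaussianNB(),
        "DecisionTree": DecisionTreeClassifier(random_state=0),  # RPart analogue
        "SVM":          SVC(),
    }

    # For each classifier, record which truly defective test modules it flags.
    detected = {}
    for name, clf in classifiers.items():
        pred = clf.fit(X_tr, y_tr).predict(X_te)
        detected[name] = {i for i, (p, t) in enumerate(zip(pred, y_te)) if p == t == 1}

    for name, found in detected.items():
        others = set().union(*(s for n, s in detected.items() if n != name))
        print(f"{name}: {len(found)} defects found, {len(found - others)} found only by it")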
An empirical study of crash-inducing commits in Mozilla Firefox
An, Le; Khomh, Foutse; Guéhéneuc, Yann-Gaël
2017 Software Quality Journal
doi: 10.1007/s11219-017-9361-y
Software crashes are dreaded by both software organisations and end-users. Many software organisations embed automatic crash reporting tools in their software systems to help quality-assurance teams track and fix crash-related bugs. Previous approaches, which focused on triaging crash-types and crash-related bugs, can help software organisations debug crashes more efficiently, but they can only be applied after the software has been crashing for a certain period of time. To help software organisations detect and fix crash-prone code earlier, we examine the characteristics of commits that lead to crashes, which we call crash-inducing commits, in Mozilla Firefox. We observe that crash-inducing commits are often submitted by developers with less experience and that developers add and delete more lines of code in crash-inducing commits, yet the bugs caused by these commits need less effort to fix. We also characterise commits that lead to frequent crashes impacting a large user base, which we call highly impactful crash-inducing commits. Compared to other crash-related bugs, bugs due to highly impactful crash-inducing commits were reopened less often by developers and tend to be fixed by a single commit. We build predictive models to help software organisations detect and fix crash-prone bugs early, when their developers commit code. Our models achieve a precision of 61.2% and a recall of 94.5% when predicting crash-inducing commits, and a precision of 60.9% and a recall of 91.1% when predicting highly impactful crash-inducing commits. Software organisations could use our models and approach to track and fix crash-prone commits early, before they negatively impact users, thus increasing bug-fixing efficiency and user-perceived quality.
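As a sketch of the prediction task (not the paper's model or feature set), a classifier over invented commit-level features such as developer experience and lines added/deleted might look like this:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # columns: author's prior commits, lines added, lines deleted (invented data)
    X = np.array([[250, 10, 4], [3, 400, 120], [80, 35, 10],
                  [1, 600, 300], [120, 50, 20], [2, 500, 200]])
    y = np.array([0, 1, 0, 1, 0, 1])  # 1 = crash-inducing

    model = make_pipeline(StandardScaler(), LogisticRegression()).fit(X, y)
    new_commit = np.array([[5, 350, 90]])  # inexperienced author, large change
    print(f"P(crash-inducing) = {model.predict_proba(new_commit)[0, 1]:.2f}")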
Stability prediction of the software requirements specification
Sagrado, José; Águila, Isabel
2017 Software Quality Journal
doi: 10.1007/s11219-017-9362-x
Complex decision-making is a prominent aspect of requirements engineering. This work presents Requisites, a Bayesian network that predicts whether a software requirements specification document has to be revised. We test the suitability of Requisites by means of metrics obtained from a large, complex software project. Furthermore, this Bayesian network has been integrated into a software tool by defining a communication interface inside a multilayered architecture. In this way, we add a new decision-making functionality that lets requirements engineers explore a software requirements specification by combining requirement metrics with the probability values estimated by the Bayesian network.
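A hand-rolled sketch of the kind of inference such a network performs, updating the probability that a specification needs revision given observed metric indicators; the structure, indicator names, and numbers are invented and differ from the paper's network:

    P_REVISE = 0.3  # assumed prior probability that the specification must be revised

    # P(indicator observed | revise) and P(indicator observed | no revise), assumed
    likelihoods = {
        "many_unstable_requirements": (0.8, 0.2),
        "high_change_rate":           (0.7, 0.3),
    }

    def posterior_revise(observed):
        """Naive-Bayes-style update over observed metric indicators."""
        p_rev, p_no = P_REVISE, 1 - P_REVISE
        for indicator in observed:
            l_rev, l_no = likelihoods[indicator]
            p_rev, p_no = p_rev * l_rev, p_no * l_no
        return p_rev / (p_rev + p_no)

    print(f"P(revise | evidence) = {posterior_revise(['many_unstable_requirements']):.2f}")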
Towards improving decision making and estimating the value of decisions in value-based software engineering: the VALUE framework
Mendes, Emilia; Rodriguez, Pilar; Freitas, Vitor; Baker, Simon; Atoui, Mohamed
2017 Software Quality Journal
doi: 10.1007/s11219-017-9360-z
To sustain growth, maintain competitive advantage, and innovate, companies must make a paradigm shift in which both short- and long-term value aspects guide their decision-making. This need is clearly pressing in innovative industries, such as ICT, and is also the core of value-based software engineering (VBSE). The goal of this paper is to detail a framework called VALUE (improving decision-making relating to software-intensive products and services development) and to show its application in practice in a large ICT company in Finland. The VALUE framework uses a mixed-methods approach: to elicit key stakeholders’ tacit knowledge of the factors they use during decision-making, either transcripts from interviews with key stakeholders are analysed and validated in focus group meetings, or focus group meetings are applied directly. These value factors are later used as input to a Web-based tool (the Value tool) employed to support decision-making. This tool was co-created with four industrial partners in this research via a design science approach that included several case studies and focus group meetings. Later, data on key stakeholders’ decisions gathered using the Value tool, plus additional input from those stakeholders, are used, in combination with the Expert-based Knowledge Engineering of Bayesian Networks (EKEBN) process coupled with the weighted sum algorithm (WSA) method, to build and validate a company-specific value estimation model. We present the application of our proposed framework to a real case, as part of an ongoing collaboration with a large software company (company A), and we also provide a detailed example, partially using real data on decisions, of a value estimation Bayesian network (BN) model for company A. The empirical results reported herein relate to eliciting key stakeholders’ tacit knowledge, which is later used as input to a pilot study in which these stakeholders employ the Value tool to select features for one of their company’s chief products; the data on decisions obtained from this pilot study are then applied in the detailed example of building a value estimation BN model for company A. In summary, we detail the VALUE framework, which helps companies improve their value-based decisions and, going a step further, estimate the overall value of each decision.
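A minimal sketch of the weighted sum aggregation step, with invented factor names, weights, and a 0-100 score scale (in the framework, the company-specific factors and weights come from the elicitation steps described above):

    def weighted_sum(scores, weights):
        """Aggregate normalised factor scores (0-100) with weights summing to 1."""
        assert abs(sum(weights.values()) - 1.0) < 1e-9
        return sum(weights[f] * scores[f] for f in weights)

    weights = {"customer_value": 0.4, "market_value": 0.35, "feasibility": 0.25}
    features = {
        "feature A": {"customer_value": 80, "market_value": 60, "feasibility": 30},
        "feature B": {"customer_value": 55, "market_value": 90, "feasibility": 70},
    }
    for name, scores in features.items():
        print(f"{name}: value = {weighted_sum(scores, weights):.1f}")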