Kenett, Ron S.; Gotwalt, Chris; Freeman, Laura; Deng, Xinwei
2022 Applied Stochastic Models in Business and Industry
doi: 10.1002/asmb.2701
Modern statistics and machine learning typically involve large amounts of data coupled with computationally intensive methods. In a predictive modeling context, one seeks models that achieve high predictive accuracy on new datasets. This is typically implemented by partitioning the data into training and hold-out datasets, with the allocation often conducted randomly at the row level of the data matrix. In this work, we discuss an overlooked gap in machine learning and predictive modeling: the role of the data structure and the data generation process in partitioning observational data into training and hold-out datasets. Ignoring such structures can lead to deficiencies in model generalizability and operationalization. We highlight that explicitly embracing the data generation structure when partitioning the data to validate a predictive model is essential to the success of data science projects. The proposed approach is called befitting cross validation (BCV). It relies on an information quality perspective of analytics and requires an assessment with inputs from domain experts, in contrast to automated approaches that are purely data driven. BCV is motivated by the objective of generating information quality with data and modeling. Two case studies illustrate the proposed approach. One is based on a 96-h burn-in process applied to electro-mechanical devices, implemented in order to reduce early failures at the customer site; the goal was to shorten the burn-in process with a predictive model applied at 20 h. The other combines tablet dissolution profiles with designed mixture experiments; the goal there was to match the dissolution profiles of the tablets under test with a brand-tablet reference profile. These case studies demonstrate the methodological points made with BCV, which are generic in nature. We suggest that BCV principles should always be considered in the development of data-driven predictive models.
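As a rough illustration of the partitioning issue, and not of the authors' BCV procedure itself, the sketch below contrasts a random row-level split with a structure-aware split in which all rows from the same (hypothetical) device stay in the same fold, using scikit-learn's GroupKFold; the device identifiers and data-generating model are assumptions made only for the example.

```python
# Minimal sketch: hold-out sets that respect the data-generation structure
# (all rows from one device stay together) vs. a purely random row-level split.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GroupKFold, KFold

rng = np.random.default_rng(0)
n_devices, n_per_device = 30, 20                        # hypothetical structure
device_id = np.repeat(np.arange(n_devices), n_per_device)
device_effect = rng.normal(scale=0.5, size=n_devices)   # shared within a device
X = rng.normal(size=(n_devices * n_per_device, 5))
y = X[:, 0] + device_effect[device_id] + rng.normal(scale=0.1, size=len(device_id))

def cv_rmse(splitter, groups=None):
    errs = []
    for train, test in splitter.split(X, y, groups):
        model = RandomForestRegressor(n_estimators=100, random_state=0)
        model.fit(X[train], y[train])
        errs.append(np.sqrt(mean_squared_error(y[test], model.predict(X[test]))))
    return float(np.mean(errs))

# Random row-level split lets rows from one device appear in both train and test;
# the group-based split keeps each device entirely in one fold.
print("row-level CV RMSE    :", cv_rmse(KFold(5, shuffle=True, random_state=0)))
print("device-level CV RMSE :", cv_rmse(GroupKFold(5), groups=device_id))
```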
Bobbia, Michel; Poggi, Jean‐Michel; Portier, Bruno
2022 Applied Stochastic Models in Business and Industry
doi: 10.1002/asmb.2713
The context for this article is the statistical fusion of several pollutant measurement networks: a reference network of fixed, high-quality sensors and other networks of fixed or mobile micro-sensors of heterogeneous quality. The challenge is to use the measurements of such different networks together to obtain a better air quality map. Since pollution maps are often obtained by correcting numerical model outputs with the measurements provided by the monitoring stations of air quality networks, the quality of the reconstructed map may be improved by increasing the density of sensors through the addition of low-cost micro-sensors. A geostatistical approach is very often used for the fusion of measurements, but the first step is to correct the micro-sensor measurements using those given by the reference sensors. Usually, this preprocessing is performed in an offline preliminary study in which reference sensors and micro-sensors are co-located, which does not allow quick adaptation to changes or to time-related nonstationarities. We propose in this article to complement these approaches with a simple online spatial correction of the micro-sensors. The principle is to use the reference measurements to correct the network of micro-sensors. More precisely, by kriging only the measurements from the micro-sensors, the reference measurements are estimated; the differences are then kriged to compute a correction, which is finally applied to the micro-sensors. This fundamental sequence of steps can then be iterated. Numerical experiments exploring the proposed algorithm by simulation and an application to a real-world dataset are provided.
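A minimal sketch of one pass of the correction sequence described above, with ordinary kriging replaced by a Gaussian-process regressor for brevity; the function names, kernel, and length scale are illustrative assumptions, not the authors' implementation.

```python
# One iteration of the online spatial correction of micro-sensors.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def correction_step(micro_xy, micro_z, ref_xy, ref_z, length_scale=2.0):
    # 1. Interpolate the micro-sensor field at the reference locations.
    gp_micro = GaussianProcessRegressor(kernel=RBF(length_scale)).fit(micro_xy, micro_z)
    micro_at_ref = gp_micro.predict(ref_xy)
    # 2. Differences between reference measurements and the interpolated field.
    diff = ref_z - micro_at_ref
    # 3. Interpolate the differences back at the micro-sensor locations.
    gp_diff = GaussianProcessRegressor(kernel=RBF(length_scale)).fit(ref_xy, diff)
    # 4. Apply the correction to the micro-sensor measurements.
    return micro_z + gp_diff.predict(micro_xy)

# The corrected values can be fed back into step 1 and the sequence iterated.
```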
Fleming, Liam; Emerson, Joseph; Stitt, Hugh; Zhang, Jie; Coleman, Shirley
2022 Applied Stochastic Models in Business and Industry
doi: 10.1002/asmb.2709
Accurate simulators are relied upon in the process industry for plant design and operation. Typical simulators, based on mechanistic models, require considerable resources: skilled engineers, computational time, and proprietary data. This article explores the complexities of developing a statistical modelling framework for chemical processes, focusing on the inherent non-linearity of the phenomena and the difficulty of obtaining data. A Bayesian approach to modelling is put forward, using Bayesian sequential design to maximise the information gain from each experiment. Gaussian process regression provides a highly flexible model class to capture non-linearities in the process data. A non-linear process simulator, modelled in Aspen Plus, is used as a surrogate for a real chemical process to test the capabilities of the framework.
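The sketch below shows one simple realisation of such a sequential-design loop, choosing the next run where the Gaussian-process predictive standard deviation is largest; the `run_simulator` function is a hypothetical stand-in for the Aspen Plus model, and the variance criterion is only one possible information-gain measure, not necessarily the one used in the paper.

```python
# Variance-based sequential design with a Gaussian-process surrogate.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def run_simulator(x):                       # hypothetical placeholder for the simulator
    return np.sin(3 * x[0]) + 0.5 * x[1] ** 2

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(5, 2))          # small initial design
y = np.array([run_simulator(x) for x in X])

for _ in range(10):                         # sequential experiments
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, y)
    candidates = rng.uniform(0, 1, size=(500, 2))
    _, sd = gp.predict(candidates, return_std=True)
    x_next = candidates[np.argmax(sd)]      # run where the model is most uncertain
    X = np.vstack([X, x_next])
    y = np.append(y, run_simulator(x_next))
```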
Lambardi di San Miniato, Michele; Bellio, Ruggero; Grassetti, Luca; Vidoni, Paolo
2022 Applied Stochastic Models in Business and Industry
doi: 10.1002/asmb.2697
Environmental monitoring is a task that requires system-wide information to be surrogated by limited sensor readings. Under the proximity principle, an environmental monitoring system can be based on the virtual sensing logic and thus rely on distance-based prediction methods, foremost among them spatio-temporal kriging. The latter is cumbersome with large datasets, but we show that a suitable separability assumption reduces its computational cost to a broader extent than is typically considered. Only spatial interpolation needs to be performed centrally, while forecasting can be delegated to each sensor. This simplification stems from the fact that two separate models are involved, one in the time domain and one in the space domain. Under a composite likelihood (CL) approach, either of the two models can be replaced without re-estimating the other. Moreover, the use of convenient spatial and temporal models eases computation, not only in the CL approach but also in maximum likelihood estimation. We show that this perspective on kriging allows virtual sensing to be performed even with tall datasets.
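A minimal sketch of the separability idea under these assumptions: each sensor produces its own one-step-ahead forecast (a simple least-squares AR(1) fit here), and only the spatial interpolation of those forecasts is performed centrally (a Gaussian-process regressor standing in for kriging). This is illustrative only and not the paper's composite-likelihood estimation procedure.

```python
# Separable virtual sensing: local temporal forecasting, central spatial interpolation.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def ar1_forecast(series):
    # Least-squares AR(1): y_t = a + b * y_{t-1}, one-step-ahead forecast.
    y_lag, y_now = series[:-1], series[1:]
    b, a = np.polyfit(y_lag, y_now, 1)
    return a + b * series[-1]

def virtual_sensing(sensor_xy, sensor_series, target_xy):
    # Each sensor could compute its own forecast locally (delegated step).
    forecasts = np.array([ar1_forecast(s) for s in sensor_series])
    # Only the spatial interpolation of the forecasts is done centrally.
    gp = GaussianProcessRegressor(kernel=RBF(1.0)).fit(sensor_xy, forecasts)
    return gp.predict(target_xy)
```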
Facchinetti, Silvia; Osmetti, Silvia Angela; Magagnoli, Umberto
2022 Applied Stochastic Models in Business and Industry
doi: 10.1002/asmb.2710
The availability of information that suppliers possess about the production process, as well as about the technical and economic consequences for customers, encourages the development and application of acceptance sampling plans that follow economic criteria, such as the Bayesian plans proposed in the literature. The combination of prior knowledge, described by the prior distribution, and empirical knowledge, based on the sample, leads to the decision to accept or reject the lot under inspection. The main purpose of this study was to derive acceptance sampling plans for attributes based on a generalized beta prior distribution, following the economic criterion of minimizing the expected total cost of quality. Specifically, a procedure is proposed to define the optimal sampling plan based on the technical characteristics of the production process and the costs inherent in the quality of the product. After the methodological aspects are described in detail, an extensive simulation study is reported that demonstrates how the optimal plan changes according to the main parameters, providing guidance for practitioners.
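As a hedged illustration of the economic criterion, the sketch below grid-searches a single-sampling plan (n, c) that minimises an expected total cost under a standard beta prior on the lot fraction defective; the cost terms, prior parameters, and lot size are assumptions made for the example, and the paper itself works with a generalized beta prior and its own cost structure.

```python
# Economically "optimal" attribute sampling plan by grid search over (n, c).
import numpy as np
from scipy.stats import beta, binom

N = 1000                                                    # assumed lot size
c_inspect, c_accept_def, c_reject_good = 1.0, 50.0, 5.0     # hypothetical unit costs
a_prior, b_prior = 2.0, 48.0                                # Beta prior on fraction defective p

p_grid = np.linspace(1e-4, 0.3, 300)
w = beta.pdf(p_grid, a_prior, b_prior)
w /= w.sum()                                                # discretised prior weights

def expected_cost(n, c):
    p_acc = binom.cdf(c, n, p_grid)                         # acceptance probability at each p
    cost = (n * c_inspect
            + p_acc * (N - n) * p_grid * c_accept_def            # defectives passed to the customer
            + (1 - p_acc) * (N - n) * (1 - p_grid) * c_reject_good)  # good units in rejected lots
    return np.sum(w * cost)

plans = [(n, c) for n in range(10, 201, 5) for c in range(0, 6)]
n_opt, c_opt = min(plans, key=lambda nc: expected_cost(*nc))
print("optimal plan: n =", n_opt, ", c =", c_opt)
```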
Wang, To‐Cheng; Hsu, Bi‐Min; Shu, Ming‐Hung
2022 Applied Stochastic Models in Business and Industry
doi: 10.1002/asmb.2667
Today, in Industry 4.0 and sensor-rich environments, web-based processing and cloud computing play significant roles in the quality systems designed to ensure product quality and reliability. Among quality validation technologies, variables acceptance sampling plans are essential tools in supply chain channels, and the variables quick-switch sampling (VQSS) system is among the most efficient schemes. The VQSS system comprises normal and tightened inspections, which provide a dynamic adjustment mechanism for lot disposition. Recently, VQSS systems regulated by process capability indices (PCIs) have been developed to integrate consumers' requirements into manufacturing processes. Unfortunately, the existing PCI-based VQSS systems consider only a single, critical quality characteristic of the product. With advanced sensor technologies, cost-efficient measurement of multiple quality characteristics (MQCs) has become a reality in today's manufacturing environments. Therefore, in this article, we develop a VQSS system with MQCs that uses the overall PCI for lot disposition. Compared with existing sampling plans, the proposed overall PCI-based VQSS system can significantly reduce the required sample size. The results also demonstrate that the proposed system combines the strengths of normal and tightened inspections to obtain superior lot-discrimination power. Furthermore, we have developed a web-based architecture and tools with which practitioners can execute the proposed system efficiently in the cloud; these can be used to build automated and semi-automated applications for lot disposition, which is appropriate in Industry 4.0 environments. Finally, a case study of ultra-mini chip resistors demonstrates the applicability of the designed VQSS system.
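One common way to aggregate per-characteristic capability indices into an overall index is through the conforming yield under an independence assumption, as sketched below; the paper's overall PCI and the VQSS switching rules may differ, so this only illustrates the aggregation step.

```python
# Yield-based aggregation of per-characteristic capability indices.
import numpy as np
from scipy.stats import norm

def overall_index(indices):
    yields = 2 * norm.cdf(3 * np.asarray(indices)) - 1   # per-characteristic conforming yield
    total_yield = np.prod(yields)                        # assumes independent characteristics
    return norm.ppf((total_yield + 1) / 2) / 3           # back-transform to an overall index

# The overall index is below the smallest per-characteristic index.
print(overall_index([1.33, 1.50, 1.25]))
```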
Lepore, Antonio; Palumbo, Biagio; Sposito, Gianluca
2022 Applied Stochastic Models in Business and Industry
doi: 10.1002/asmb.2702
A multiple stream process (MSP) is a process that, at any point in time, generates several streams of output for which the quality variable of interest and its specifications are identical across streams. In this article, a new control charting framework based on artificial neural networks (NNs), whose performance is measured in terms of the in-control and out-of-control average run lengths ARL0 and ARL1, is proposed to improve the monitoring of an MSP and the detection of changes in individual streams. To the best of our knowledge, this is the first time that a NN has been applied to the monitoring of an MSP. The performance of the proposed control charting framework is evaluated through a wide Monte Carlo simulation and compared with the traditional Mortell and Runger's MSP control charts based on the range statistic. The proposed method's potential is demonstrated by means of a real case study on the monitoring of heating, ventilation and air conditioning (HVAC) systems installed on board modern trains. The NN4MSP package, which implements the proposed monitoring scheme in Python, and the HVAC data set are made openly available online at https://github.com/unina-sfere/NN4MSP and on PyPI, together with a tutorial that shows how to apply the proposed methodology to the real case study.
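Independently of the NN4MSP package and its API, the in-control average run length of any MSP monitoring rule can be estimated by Monte Carlo simulation, as in the generic sketch below, which uses a simple range-based rule with an illustrative control limit rather than the proposed NN chart.

```python
# Monte Carlo estimate of ARL0 for a generic MSP monitoring rule.
import numpy as np

def run_length(n_streams=6, limit=3.0, rng=None, max_t=100_000):
    rng = rng or np.random.default_rng()
    for t in range(1, max_t + 1):
        obs = rng.normal(size=n_streams)        # in-control observations, one per stream
        if obs.max() - obs.min() > limit:       # range across streams exceeds the limit
            return t
    return max_t

rls = [run_length(rng=np.random.default_rng(i)) for i in range(500)]
print("estimated ARL0:", np.mean(rls))
```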
Borgoni, Riccardo; Farace, Vincenzo Emanuele; Zappa, Diego
2022 Applied Stochastic Models in Business and Industry
doi: 10.1002/asmb.2673
Process capability indices are routinely used to estimate the mean-variability performance of industrial products with respect to both targets and specification limits. However, when the target variable is defined over the planar surface of a manufactured item, it is relevant to assess the capability of the production process locally, that is, at any spatial location of the surface, particularly if the item has to be split into pieces to obtain single production items. In this article, focusing on the Cpk index introduced by Clements [Qual Prog., 22, 95–100], we suggest an approach based on additive quantile models to estimate the index locally within a Bayesian paradigm. We demonstrate its use in the context of the etching phase of the integrated circuit fabrication process. Since the capability of etching processes is typically assessed for batches of wafers, we also propose two resampling-based algorithms to perform local capability analysis at the lot level.
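For reference, Clements' percentile-based Cpk replaces the 3-sigma limits with the 0.135% and 99.865% quantiles and the mean with the median; the sketch below computes it from empirical quantiles, whereas in the paper the quantiles at each spatial location would come from the fitted additive quantile model. The specification limits and data are illustrative.

```python
# Clements' percentile-based Cpk from three quantile estimates.
import numpy as np

def clements_cpk(x, lsl, usl):
    lo, med, hi = np.quantile(x, [0.00135, 0.5, 0.99865])
    return min((usl - med) / (hi - med), (med - lsl) / (med - lo))

rng = np.random.default_rng(0)
print(clements_cpk(rng.gamma(5.0, 0.2, size=10_000), lsl=0.2, usl=2.5))
```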