Introduction

Imaging systems suffer from several types of image degradation. Motion blur is a common phenomenon, as are defocus blur and atmospheric turbulence blur, and all of them reduce image quality significantly. It is therefore essential to develop methods for recovering an approximation of the latent image from a blurry one in order to increase the performance of imaging systems; such methods find wide applicability in various fields. However, blur removal is a notoriously ill-posed inverse problem that has perplexed scholars for decades [1].

A point spread function (PSF) can be used to describe image blur. The PSF models how the imaging system captures a point source or object and describes how the point spreads across an image. Generally, the PSF can be described by a parametric model [2], with parameters such as the motion length, defocus radius, and turbulence degree [3]. For estimating the PSF, Fergus et al. [4] proposed a method that combines the gradient domain of natural images with the maximum a posteriori (MAP) method. Xu et al. [5] introduced a sparsity function with an ℓ0-norm constraint, while Pan et al. [6] estimated the blur kernel from the dark channel of blurry images. These estimation techniques perform well when dealing with arbitrary, hand-drawn PSFs. However, in real-life situations the blur model is often known. For example, when monitoring targets on a conveyor belt with a fixed camera, the PSF can be modeled from the motion length during the exposure time. Similarly, in space exploration, atmospheric turbulence can be modeled by a Gaussian function whose variance indicates the blur degree, and defocus blur can be modeled from the defocus radius. In such cases it is more convenient and practical to solve a parameter identification problem than to estimate an unconstrained PSF. Given this fact, Jalobeanu et al. [7] used the maximum likelihood estimator (MLE) on the entire available dataset to estimate the parameters of a Gaussian model. Yin and Hussain [8] combined non-Gaussianity measures with independent component analysis to estimate the parameters of blur models. Dash and Majhi [9] suggested a radial basis function neural network with image features based on the magnitudes of Fourier coefficients to estimate motion lengths. Yan and Shao [10] proposed a supervised deep neural network to classify the blur type and adopted the expected patch log likelihood method [11] to restore the latent image. Further, Kumar et al. [12] used the Tchebycheff moment to estimate the Gaussian variance of turbulence blurs.

With a known PSF, the latent image can be restored using inverse filters or another nonblind deconvolution method. Levin et al. [13] proposed a hyper-Laplacian prior and adopted the iterative reweighted least squares (IRLS) algorithm to solve the resulting optimization problem. Joshi et al. [14] applied the IRLS algorithm to local color statistics and hyper-Laplacian priors to perform joint deblurring and denoising. Wang et al. [15] introduced a deconvolution method based on total variation and employed the half-quadratic minimization (HQM) algorithm, originally proposed by Geman and Yang [16], to solve the nonconvex problem. Simultaneous estimation of both the PSF and the latent image is a nonconvex problem that tends to favor the trivial no-blur solution [17]; a feasible blind image deblurring framework is therefore to estimate the PSF and the latent image alternately [4, 6, 18].
However, this framework requires prior knowledge of both the image and the PSF. In addition, because the procedure is sensitive to noise, the image edges must be reconstructed in each iteration [5] before the next step, and a coarse-to-fine scheme must be used during the alternating optimization to avoid local minima. These disadvantages make the deblurring procedure time-consuming. Furthermore, most existing approaches are designed for random hand-drawn blurs, whereas in real-world situations the blur model is often known; in the case of target detection on a conveyor belt, the PSF depends only on the motion length. From this viewpoint, the PSF estimation procedure of existing methods can be simplified and the deblurring algorithm accelerated.

In this paper, a new deblurring framework consisting of two stages is proposed. First, a blur feature is used to estimate the model parameters via a general regression neural network (GRNN) in order to determine the PSF. Next, a deconvolution algorithm based on sparse representation is proposed for latent image restoration. The main contributions of this paper can be summarized as follows:

1. Three common types of blur model are discussed, and a blur feature based on the autocorrelation of the image gradient domain is proposed. Simulations show that its amplitude rises as the blur degree increases. More than ten thousand natural image samples were artificially blurred for feature extraction to generate the training dataset for the GRNNs.
2. A learning-based parameter estimation scheme is proposed. With the help of the GRNNs, the blur model parameters of input blurry images can be estimated from their blur features, and their PSFs can then be constructed.
3. A nonblind deblurring algorithm based on sparse representation is presented, in which the target image is constrained by an ℓp quasi-norm in the cost function. Solutions for different p values are discussed, and their performance for some typical p values is analyzed. The proposed deblurring method is also compared with several related blur removal approaches.

The remainder of this paper is organized as follows: Section 2 introduces the imaging system, the parametric models of the three most common blurs, and the proposed blur feature. Section 3 describes the proposed deblurring framework, including the properties of the GRNN and the restoration algorithm. The results of the simulations and the comparative analysis are described in Section 4. Section 5 summarizes the study and presents concluding remarks.

Blur models and features

Imaging system.
An imaging system can generally be regarded as a linear translation-invariant system [19], and the image blurring process, shown in Fig 1, can be described as

g(x, y) = f(x, y) * h(x, y) + η(x, y),    (1)

where f(x,y) and g(x,y) represent the original image and the observed blurry image, respectively; h(x,y) is the PSF; η(x,y) is the additive noise generated during image acquisition or transmission; x and y are the coordinates of the digital image; and the symbol "*" denotes the convolution operator. For simplicity, bold notation (such as f and h) is also used to represent the image and the PSF.

Fig 1. Image blurring in an imaging system. https://doi.org/10.1371/journal.pone.0230619.g001
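As a minimal illustration of Eq (1), the following MATLAB sketch synthesizes a blurry observation from a latent image (the paper's experiments were implemented in MATLAB, so base MATLAB is used for the sketches here). The test image, the uniform PSF, and the noise level are placeholders chosen for illustration, not values from the paper.

```matlab
% Minimal sketch of the degradation model in Eq (1): g = f * h + eta.
% The test image, PSF, and noise level below are illustrative placeholders.
f = zeros(128); f(33:96, 33:96) = 1;   % simple synthetic latent image
h = ones(9) / 81;                      % placeholder 9x9 uniform PSF
eta = 0.01 * randn(size(f));           % additive Gaussian noise eta(x,y)
g = conv2(f, h, 'same') + eta;         % observed blurry image, Eq (1)
```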
Parametric models of blur

a) Linear motion blur [20].
Relative movement between the camera and the target during an exposure that is not sufficiently short results in linear motion blur. Its PSF can be modeled as

h(x, y) = 1/L, if √(x² + y²) ≤ L/2 and y/x = −tan Φ; 0, otherwise,    (2)

where L represents the motion length and Φ its orientation. An example of linear motion blur is shown in Fig 2A.

Fig 2. Examples of the PSF (first row) and its frequency domain (second row). Left: motion blur with (L, Φ) = (30, 45°) in Eq (2). Middle: defocus blur with r = 9 in Eq (3). Right: Gaussian blur with σ = 3.5 in Eq (4). https://doi.org/10.1371/journal.pone.0230619.g002

b) Defocus blur [21].
In the case of defocusing, blurring arises from the optical system of the lens. The path followed by light through the lens depends on its wavelength, and natural light consists of components with several different wavelengths. The resulting physical limitations in lens construction prevent the camera from producing perfectly sharp images, leading to defocus blur. The extent of blur visible in an image is a function of the lens aperture and can be modeled as a uniform disk:

h(x, y) = 1/(πr²), if (x − x0)² + (y − y0)² ≤ r²; 0, otherwise,    (3)

where the radius r determines the blur extent and (x0, y0) is the center of the PSF. This is depicted in Fig 2B.

c) Atmospheric turbulence blur [7].
Variations in heat, pressure, and wind velocity in the atmosphere result in small-scale, irregular air motions characterized by winds that vary in speed and direction. These have a determining effect on light propagation and cause atmospheric turbulence blur, which is also known as Gaussian blur. This type of blur usually occurs in remote sensing. Generally, a low-pass Gaussian filter [3, 22, 23] is used to model it:

h(x, y) = (1/(2πσ²)) exp(−((x − x0)² + (y − y0)²) / (2σ²)),    (4)

where σ indicates the blur degree and (x0, y0) is the center of the PSF. Since Eq (4) is an infinite impulse response filter, proper truncation and normalization are necessary in practice. Normally, the support domain is set to [x0 − 3σ, x0 + 3σ] × [y0 − 3σ, y0 + 3σ], since 99.7% of the energy is contained in this region; here, × is the Cartesian product. Fig 2C shows an example of the Gaussian blur model.
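One possible toolbox-free construction of the three PSFs in Eqs (2)–(4) is sketched below. The nearest-pixel rasterization of the motion line and the parameter values are illustrative choices (with the Image Processing Toolbox, fspecial offers equivalent kernels).

```matlab
% Sketches of the three parametric PSFs in Eqs (2)-(4), each normalized to sum to 1.

% a) Linear motion blur, Eq (2): a line of length L at orientation phi (degrees).
L = 15; phi = 45;
n = 2*ceil(L/2) + 1;                  % odd support size
h_motion = zeros(n);
c = (n+1)/2;                          % center pixel
for t = linspace(-L/2, L/2, 4*n)      % rasterize the line segment
    h_motion(round(c - t*sind(phi)), round(c + t*cosd(phi))) = 1;
end
h_motion = h_motion / sum(h_motion(:));

% b) Defocus blur, Eq (3): uniform disk of radius r.
r = 9;
[x, y] = meshgrid(-r:r);
h_defocus = double(x.^2 + y.^2 <= r^2);
h_defocus = h_defocus / sum(h_defocus(:));

% c) Atmospheric turbulence (Gaussian) blur, Eq (4), truncated at 3*sigma.
sigma = 3.5;
s = ceil(3*sigma);
[x, y] = meshgrid(-s:s);
h_gauss = exp(-(x.^2 + y.^2) / (2*sigma^2));
h_gauss = h_gauss / sum(h_gauss(:));   % renormalize after truncation
```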
Blur features

Natural images are diverse, and their statistical characteristics vary significantly. Nevertheless, in recent years, an increasing number of studies have indicated that the gradient domains of natural images share common features and that their histograms tend to have sharp peaks and long tails [24]. Mathematically, a digital image can be treated as a discrete bivariate function, and its gradient domain can be represented by pixel differences. In practice, image edge detection operators [3, 25] such as the Roberts, Prewitt, and Sobel operators are commonly used for the first-order derivatives, while the Laplacian operator is used for the second-order derivatives.

In this study, inspired by the properties of image edges, a blur feature based on the autocorrelation of blurry images in the gradient domain is proposed. By definition, autocorrelation is the correlation of a signal with a delayed copy of itself, as a function of the delay [26]; it is a useful tool for measuring the similarity between a signal and its shifted versions. A digital image can be treated as a two-dimensional (2D) real signal and its gradient domain e(x,y) as a bivariate function. Its autocorrelation R(x,y) can thus be described as

R(x, y) = Σ_{m∈Zx} Σ_{n∈Zy} e(m, n) e(m + x, n + y),    (5)

where Zx and Zy are the support domains of e(x,y). In addition, based on the conjugate symmetry of real signals [27, 28], the computation of Eq (5) can be accelerated with the fast Fourier transform (FFT) and the inverse fast Fourier transform (IFFT). The autocorrelation of the image edge e(x,y) therefore simplifies to

R = F⁻¹(|F(e)|²),    (6)

where, for brevity, R and e represent the matrix forms of R(x,y) and e(x,y), respectively; F and F⁻¹ are the FFT and IFFT operators; and |⋅|² denotes the element-wise squared modulus. Since the value ranges of image edges vary widely, the blur features should be adjusted to a notionally common scale. Feature scaling [29], also known as unity-based normalization, brings all values into the range [0,1]. The normalized blur feature, R̄, can therefore be described as

R̄ = (R − Rmin) / (Rmax − Rmin),    (7)

where Rmax and Rmin are the maximum and minimum values of R.

In Figs 3–5, the left column shows examples of natural images that were artificially degraded by motion blur, defocus blur, and Gaussian blur with different parameters. Their gradient-domain versions were extracted using the Sobel operator in both the horizontal and the vertical directions; the image edges, obtained by adding the two, are shown in the middle column. As can be seen, as the image blurriness increases, the information contained in the edges decreases. The right column shows the amplitude of the blur feature described in Eqs (5)–(7). It can be seen from Fig 3 that the features of the motion-blurred images retain the orientation of the PSF, and that the length of the shape of the blur feature increases with the motion length. In the case of defocus blur, the amplitude range of the feature increases with the defocus radius, r, as shown in Fig 4. Finally, the feature of the Gaussian blur in Fig 5 expands as the parameter σ increases.

Fig 3. Examples of the blur feature for motion blur. Left: motion blur of the "bikes.bmp" image with L = 10 (first row) and L = 20 (second row) and orientation Φ = 120°. Middle: gradient domain. Right: amplitudes of the blur features. https://doi.org/10.1371/journal.pone.0230619.g003

Fig 4. Examples of the blur feature for defocus blur. Left: defocus blur of the "monarch.bmp" image with r = 3 (first row) and r = 3 (second row). Middle: gradient domain. Right: amplitudes of the blur features. https://doi.org/10.1371/journal.pone.0230619.g004

Fig 5. Examples of the blur feature for atmospheric turbulence blur. Left: atmospheric turbulence blur of the "house.bmp" image with σ = 2.0 (first row) and σ = 4.8 (second row). Middle: gradient domain. Right: amplitudes of the blur features. https://doi.org/10.1371/journal.pone.0230619.g005

Note that in Figs 3–5, because the size of the blur feature depends on the size of the image according to Eqs (5)–(7), and most of the amplitude energy is located near the center of the feature, the boundaries have been truncated so that all blur features are the same size.
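The feature of Eqs (5)–(7) can be computed in a few lines, as sketched below; the function name and the fixed center crop (matching the truncation note above) are illustrative assumptions.

```matlab
% Sketch of blur feature extraction, Eqs (5)-(7): Sobel edges, FFT-based
% autocorrelation, unity-based normalization, and a center crop so that
% features from images of different sizes are comparable.
function Rbar = blur_feature(g, crop)
    sx = [-1 0 1; -2 0 2; -1 0 1];                       % horizontal Sobel kernel
    e  = conv2(g, sx, 'same') + conv2(g, sx', 'same');   % image edges
    R  = real(ifft2(abs(fft2(e)).^2));                   % Eq (6): autocorrelation
    R  = fftshift(R);                                    % move the peak to the center
    Rbar = (R - min(R(:))) / (max(R(:)) - min(R(:)));    % Eq (7)
    c  = floor(size(Rbar)/2) + 1;                        % fixed window around the center
    Rbar = Rbar(c(1)-crop:c(1)+crop, c(2)-crop:c(2)+crop);
end
```

The scalar fed to the GRNN in the next section would then be the Frobenius norm of the cropped feature, e.g. norm(blur_feature(g, 32), 'fro'), where the crop half-width 32 is an arbitrary example value.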
The proposed learning-based blur removal method

General regression neural network.
A GRNN is a probabilistic neural network (PNN) first proposed by Specht [30]. It can converge to the underlying function of the data with only a few training samples available, and its spreading rate is the only additional parameter that requires adjustment by the user. This makes GRNNs a very useful tool in areas such as regression, prediction, and classification. A GRNN is composed of an input layer, a pattern layer, a summation layer, and an output layer [31]; its overall block diagram is shown in Fig 6. In the input layer, X = (X1, X2, ⋯, Xn) represents the input data of the GRNN G(X).

Fig 6. Basic structure of a GRNN. https://doi.org/10.1371/journal.pone.0230619.g006

In the pattern layer, the number of units, s, is equal to the length of the training data (x1, x2, ⋯, xs) and (y1, y2, ⋯, ys). The activation of each unit, i, is given by

ρi = exp(−(X − xi)ᵀ(X − xi) / (2δ²)),    (8)

where δ is the spreading rate and T denotes the transpose. In the summation layer, SN = Σi yi ρi and SD = Σi ρi need to be determined; their ratio yields the final output, G(X) = SN/SD.

From another perspective, a GRNN is a generalization of the radial basis function network (RBFN) and the PNN. In the pattern layer, the activation function in Eq (8) is similar to the radial basis function (Gaussian) kernel, with the training sample xi as the center of each unit. In the output layer, SN/SD is a probability-weighted average indicating how well the training samples represent the prediction point. In practice, the GRNN outperforms the RBFN as well as traditional back-propagation neural networks in prediction, as it involves only a one-pass learning scheme instead of repeated iterations during training [30]. The advantages of the GRNN are therefore fast learning and convergence, even when the number of inputs is very high.
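A one-dimensional GRNN prediction reduces to a few lines, as in the sketch below; the training pairs are dummy values standing in for (Frobenius-norm feature, blur parameter) samples, and δ = 0.7 is an arbitrary spreading rate within the range used later.

```matlab
% Minimal GRNN prediction, Eq (8) and G(X) = SN/SD (one-dimensional inputs).
% xs, ys: training features and parameters (dummy values for illustration).
xs = [0.2 0.4 0.6 0.8];                 % e.g., Frobenius norms of blur features
ys = [2   6   12  20 ];                 % e.g., corresponding motion lengths L
delta = 0.7;                            % spreading rate
X = 0.5;                                % query feature
rho = exp(-(X - xs).^2 / (2*delta^2));  % pattern-layer activations, Eq (8)
G = sum(ys .* rho) / sum(rho);          % summation layer: SN / SD
fprintf('Estimated parameter: %.2f\n', G);
```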
Framework of the deblurring method.
The principal framework of the proposed deblurring method is shown in Fig 7. It consists of two primary steps: a learning-based parameter estimation procedure, followed by a nonblind deblurring algorithm.

Fig 7. Framework of the proposed method. The "Estimation" and "Deblurring" processes are described in Algorithms 1 and 2, respectively. https://doi.org/10.1371/journal.pone.0230619.g007

In order to estimate the blur parameters efficiently, more than ten thousand natural images were artificially blurred using the three above-mentioned blur models with different parameters, and their blur features were extracted using Eq (7). The Frobenius norms [32] of these features, ‖R̄‖F, were taken as the inputs to train one GRNN per blur model. In this manner, for a given blurred image, the PSF can be estimated from its blur feature using the GRNN. The latent image is then restored by a nonblind blur removal algorithm, which is discussed in the following subsection. The main steps of parameter estimation are shown in Algorithm 1.

Algorithm 1: Parameter Estimation
Input: (1) Observed blurry image: g; (2) GRNN: G(X);
Output: Deblurring result: f;
1: Extract the edges, e, from the input blurry image, g;
2: Calculate the blur feature, R̄, using Eq (7);
3: Estimate the blur model parameter, Y, using the network, Y = G(‖R̄‖F);
4: Generate the PSF, h, from the estimated Y;
5: Restore the latent image, f, using Algorithm 2;
6: return f;

Latent image restoration.
Once the PSF has been acquired, a variety of nonblind deconvolution methods can be used to recover the latent image; Wiener filtering and the Lucy-Richardson method are frequently used. These deblurring techniques were proposed decades ago on the basis of the MLE. However, both are sensitive to errors in the PSF estimate, and the ringing effect [33, 34] is unavoidable. In recent years, the theory of sparse representation and machine learning has been introduced into image restoration [35–37]. Statistical analyses of natural images have shown that image gradients tend to have a heavy-tailed distribution [38, 39], whose most commonly used form is the hyper-Laplacian model [13, 38]. For each element ∇fi of the image gradient, the hyper-Laplacian model expresses the joint distribution of f as

P(f) ∝ ∏i ∏_{*∈{x,y}} exp(−α |∇*fi|^p),    (9)

where * ∈ {x, y} indexes the gradient in the two orientations and α signifies the slope of the exponential function. From the perspective of MAP estimation, the image prior becomes a regularization term in the logarithmic cost function [17]. Let ‖⋅‖p represent the ℓp quasi-norm, defined as

‖x‖p = (Σi |xi|^p)^(1/p).    (10)

In this study, the cost function was defined as

min_f ‖f * h − g‖₂² + α‖∇f‖p^p,    (11)

where ∇f is short for (∇xf, ∇yf). The first term on the right-hand side is a data fidelity term that ensures the best approximation of the original image, while the second term is a constraint on the image gradient. Because the optimization problem in Eq (11) is nonlinear, traditional descent methods are ineffective because of their slow convergence. Fortunately, inspired by the half-quadratic penalty method [15, 16, 40], the ℓp-based expression in Eq (11) with 0 < p < 1 can be optimized alternately. An auxiliary variable w ≈ ∇f is introduced, leading to the relaxed cost function

min_{f,w} ‖f * h − g‖₂² + α‖w‖p^p + β‖∇f − w‖₂²,    (12)

where β is a penalty weight. For a fixed f, each element of w is obtained from a one-dimensional problem of the form

min_w α|w|^p + β(w − v)², v = ∇fi,    (14)

which can be solved by the Newton-Raphson method in general and analytically for p = 1/2 and p = 2/3 (Eqs (17) and (18)). For a fixed w, the quadratic subproblem in f is solved efficiently in the frequency domain via the FFT. The two subproblems are alternated while β is multiplied by a factor k > 1 at each outer iteration until the condition β > ε is met, so that β→∞. Finally, the algorithm converges to a stable state, yielding the best approximation of the original image. The main steps of the restoration are summarized in Algorithm 2.

Algorithm 2: Latent Image Restoration
Input: (1) Observed blurry image: g; (2) Estimated PSF: h;
Output: Deblurring result: f;
1: Initialize f = g and β = β0;
2: while β < ε do
3:   Solve Eq (14) for w with f fixed (Newton-Raphson; Eqs (17) and (18) for p = 1/2, 2/3);
4:   Solve Eq (12) for f with w fixed in the frequency domain;
5:   β = kβ;
6: end while
7: return f;
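The sketch below illustrates one way the alternating scheme of Eq (12) can be implemented, assuming circular boundary conditions. For robustness, the per-pixel subproblem of Eq (14) is minimized here over a coarse candidate grid rather than by Newton-Raphson or the analytic formulas of Eqs (17) and (18), so it illustrates the iteration structure rather than reproducing the exact Algorithm 2.

```matlab
% Minimal sketch of the half-quadratic scheme for Eq (12), assuming circular
% boundary conditions. g: blurry image; h: estimated PSF; p, alpha, k, epsilon
% as in the text (e.g., hq_deblur(g, h, 2/3, 3e-3, 2, 1e4)).
function f = hq_deblur(g, h, p, alpha, k, epsilon)
    [M, N] = size(g);
    hp = zeros(M, N); hp(1:size(h,1), 1:size(h,2)) = h;
    H  = fft2(circshift(hp, -floor(size(h)/2)));    % PSF -> transfer function
    dx = zeros(M, N); dx(1,1) = -1; dx(1,end) = 1;  % circular forward differences
    Dx = fft2(dx); Dy = fft2(dx.');
    G  = fft2(g); f = g; beta = 1;
    while beta < epsilon                            % continuation on beta
        F  = fft2(f);
        vx = real(ifft2(Dx .* F)); vy = real(ifft2(Dy .* F));
        wx = solve_w(vx, p, alpha, beta);           % Eq (14), per pixel
        wy = solve_w(vy, p, alpha, beta);
        num = conj(H).*G + beta*(conj(Dx).*fft2(wx) + conj(Dy).*fft2(wy));
        den = abs(H).^2  + beta*(abs(Dx).^2 + abs(Dy).^2);
        f   = real(ifft2(num ./ den));              % quadratic f-subproblem
        beta = k * beta;
    end
end

function w = solve_w(v, p, alpha, beta)
    % min_w alpha*|w|^p + beta*(w - v)^2; the minimizer lies between 0 and v,
    % so a coarse grid of candidates is searched (illustrative simplification).
    w = zeros(size(v)); best = beta * v.^2;         % objective at w = 0
    for t = linspace(0.1, 1, 10)
        c = t * v;
        val = alpha*abs(c).^p + beta*(c - v).^2;
        m = val < best; best(m) = val(m); w(m) = c(m);
    end
end
```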
Experimental evaluation

Experimental setup.
The training dataset was drawn from the Pascal Visual Object Classes (VOC) dataset [44], which consists of 16,135 natural images of animals, humans, transportation, landscapes, and architecture, among other subjects; Fig 8 shows some examples. Different artificial blurs with increasing parameters were induced in these images to build a training set for each blur type, including a series of Gaussian blurs whose σ values increased from 1 to 5 in steps of 0.2 on a support of 25×25 pixels. For linear motion blur, estimating the motion length and the orientation simultaneously leads to overfitting. Thus, in the experiment, we fixed the orientation and trained a GRNN for each of five values of the angle Φ, namely 0°, 30°, 45°, 60°, and 90°, with the motion length, L, increasing from 2 pixels to 20 pixels.

Fig 8. Examples from the Pascal VOC dataset. https://doi.org/10.1371/journal.pone.0230619.g008

Next, the blur features, R̄, were extracted from each training set, and hypothesis testing [45, 46] was performed to counter the adverse effects of outliers. Finally, the features and their corresponding parameters were used as the training sets for the GRNNs, with the spreading rate, δ, set to 0.6–0.9. The training procedure is similar to that of an RBFN: 1) an unsupervised learning method determines a set of offsets for the activation functions; 2) a least squares method trains the weights of the summation before the output. The difference is that each unit of a GRNN is influenced by every sample of the training set: each training sample xq acts as the offset of the activation function of the corresponding pattern unit. Additionally, in the summation layer, the outputs of the pattern units, ρq, are weighted by the corresponding training values, yq, on the way to the numerator unit SN, but by one on the way to the denominator unit SD. Because of the one-pass algorithm, the training procedures converged rapidly and stably. Training was implemented in MATLAB on an Intel Xeon E5-2620 v4 CPU (2.1 GHz) and an NVIDIA 1080 GPU and took approximately 10–15 minutes per network.

As the flowchart in Fig 7 shows, for an image blurred by an unknown PSF, the blur feature is extracted first; then, using the GRNN together with the parametric blur models, the PSF is estimated. The latent image is then restored from the blurry image and the estimated PSF by the deconvolution method described in Algorithm 2. The value of the coefficient α in Eq (12) was determined by trial and error, while the convergence criteria of the algorithm were set to ε = 1e4 and k = 2, respectively.

Next, we evaluated the convergence speed of Algorithm 2 for different p values. A series of defocus blurs with increasing radii was simulated in order to test the performance of the solution methods for Eq (14). The simulations were performed on the same platform as the training procedure discussed above. Fig 9 shows the cost times for p values of 1/2, 2/3, 3/4, 4/5, 1, and 2. In general, the Newton-Raphson method is effective for solving Eq (14); however, as discussed in the previous section, for the special cases p = 1/2 and p = 2/3, the analytical solutions of Eqs (17) and (18) converge faster.
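For p = 1/2, the stationarity condition of Eq (14) reduces to a cubic after the substitution w = sign(v)·u², which is the route behind analytic solutions of this kind; the scalar sketch below reconstructs that route via MATLAB's roots and is not a verbatim restatement of Eq (17). In Algorithm 2, this solve would be vectorized over all gradient elements.

```matlab
% Sketch of the per-pixel subproblem of Eq (14) for p = 1/2, solved through
% its cubic stationarity condition (one scalar v for clarity).
function w = solve_w_half(v, alpha, beta)
    % minimize alpha*|w|^(1/2) + beta*(w - v)^2 over w; substituting
    % w = sign(v)*u^2 gives the cubic 4*beta*u^3 - 4*beta*|v|*u + alpha = 0.
    u = roots([4*beta, 0, -4*beta*abs(v), alpha]);
    u = real(u(abs(imag(u)) < 1e-9 & real(u) >= 0));   % feasible real roots
    cand = [0; sign(v)*u.^2];                          % include w = 0
    [~, i] = min(alpha*sqrt(abs(cand)) + beta*(cand - v).^2);
    w = cand(i);
end
```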