Abstract

Shadow detection is an important pre-processing step in scene interpretation and shadow removal applications. In this paper, we propose a single-image shadow detection method. Whereas many other methods require multiple images, we use a quaternion representation of colour images to extract shadows from a single input image. The final binary shadow mask is generated via automatic threshold selection. The method is evaluated qualitatively and quantitatively on three challenging datasets of indoor and outdoor natural images. The qualitative assessment is consistent with the statistical results: the proposed method improves shadow detection performance when compared with state-of-the-art methods.

1. INTRODUCTION

Shadow detection is a crucial pre-processing step for scene interpretation. It can be useful for collecting cues about object shapes and light source characteristics. In addition, some computer vision applications benefit from removing shadows because of their undesirable effects on segmentation and object recognition.

A shadow is created when an object occludes light from the light source. Light source characteristics, such as the number of light sources, their colour and their direction, affect the characteristics of the shadow. For example, shadows in outdoor scenes are more complex than those in indoor scenes. In outdoor scenes, objects are exposed to sunlight (yellow directional light) and skylight (blue ambient light), and differences in the chromaticity of these light sources may change shadow colours. In contrast, chromaticity is nearly invariant in indoor scenes, where the chromaticity of the ambient light is approximately the same as that of the light source. This usually yields more accurate results for indoor images.

Image-based shadow detection methods can be classified into interactive (semi-automatic) and non-interactive (fully automatic) methods.
The latter consist of two categories: single-image methods, where the shadow mask is extracted from the input image alone, and multi-image methods [8, 9], where the shadow mask is obtained from the input and near-infrared (NIR) images, or from multiple images used to compute illumination invariants. However, reliable computation of these invariants requires high-resolution, high-intensity images, and camera specification and configuration play an important role in the accuracy of the results.

This paper addresses shadow detection from a single image. It introduces a method based on quaternion theory that uses the interpretation of colours provided by the unit transform (UT) of a colour quaternion vector. To generate the final binary shadow mask, we propose an automatic threshold selection procedure.

The paper is organized as follows. Section 2 presents related work in the literature. Section 3 introduces background information on quaternions and the UT of RGB images. Section 4 describes the proposed method. Experiments and result analysis are given in Section 5. Finally, we conclude in Section 6.

2. RELATED WORK

Work in shadow detection can be classified into two categories. The first relies on learning classifiers based on intensity and colour cues, combined with an energy minimization step to ensure global consistency [11–16]. The second relies on image formation properties, modelling the illumination and scene reflectance properties in the image [8, 9, 17–21]. Other classifications divide the work into dynamic and static shadow detection techniques. Dynamic techniques detect the cast shadow of a moving object [22, 23]; they analyse sequences of images to track the changes in scene illumination over a specific period of time. Static techniques detect shadows cast by static objects. The latter are our concern here, and the works reviewed below share that concern.
Besheer and Abdelhafiz proposed an enhancement to the original C1C2C3 invariant colour model using NIR image information. The authors used bi-modal histogram splitting with Otsu's threshold to provide binary segmentation. Their method was applied only to satellite images and necessarily required NIR channel information.

Tian et al. proposed a multi-step shadow detection method that first generates an illumination-invariant grayscale image, segments it into sub-regions, and then detects shadows by tricolour attenuation, modelled by the spectral power of both skylight and sunlight. However, the reliability of this approach depends on the initial illumination-invariant image, which may not be accurate. Further, the tricolour attenuation model may be inaccurate when spectral differences grow (at sunrise and sunset).

Guo et al. used a region-based approach to detect shadows from a single image, using graph-cut inference to distinguish between shadow and non-shadow regions. Their method performs well if the shadowed regions appear on flat, lightly textured surfaces.

Rüfenacht et al. proposed a shadow detection approach using NIR, where a pixel-by-pixel ratio between the NIR and the visible RGB image is computed. The use of NIR images in the ratio computation allows discrimination between dark objects and shadow regions. Their approach obtained better results than other methods such as [11, 18], but it has limitations: an object with lower reflectance in the NIR than in the visible RGB image may be erroneously detected as shadow, and a shadow on an object with higher NIR reflectance may be missed.

Li et al. proposed a framework for shadow detection in which the decision about a shadow (or non-shadow) pixel depends on a set of cues: model (illumination-based) cues and observation cues collected from the YIQ representation of the image and the NIR image, where the YIQ representation holds the image luminance (Y) and chrominance (I and Q). Their method works well on different kinds of image datasets (indoor, outdoor and satellite). However, better results are obtained when NIR images are available in the dataset.

Yuan et al. proposed another single-image shadow detection and removal method using local colour constancy computation, where shadow and non-shadow regions are classified based on a surface descriptor. Despite its good performance, the method is computationally expensive because it examines image gradients and derivatives in two directions.

Panagopoulos et al. used cues from illumination invariants and the bright channel within a Markov random field (MRF) framework. This method shows impressive results on high-quality images, but its performance degrades on low-quality photographs or web images.

More recently, Tian et al. relied on illumination cues for shadow detection. They proposed four shadow properties derived from spectrum ratios taken under various sun angles, and their method achieved high performance on natural images.

3. COLOUR QUATERNIONS AND UNIT TRANSFORM

3.1. Colour vector representation using quaternions

A quaternion is a generalization of a complex number developed by Hamilton in 1843. It is a 4D number, composed of one real (scalar) part and a 3D vector part which contains three imaginary unit vectors:

$$ q = \underbrace{q_0}_{\text{real}} + \underbrace{q_1 i + q_2 j + q_3 k}_{\text{imaginary}} \tag{1} $$

where i, j and k are the imaginary units of the quaternion q, i.e. the standard basis of its vector part.
As an example, the quadruple (3, 2, 4, 1) can be represented as the quaternion q = 3 + 2i + 4j + k, where 3 is the scalar part and (2, 4, 1) denotes the vector in 3D space. The quaternion units obey the following rules:

$$ i^2 = j^2 = k^2 = ijk = -1 \tag{2} $$

$$ ij = k,\quad ji = -k,\quad jk = i,\quad kj = -i,\quad ki = j,\quad ik = -j \tag{3} $$

Quaternion multiplication is not commutative. A quaternion with a zero scalar part is called a pure quaternion. The conjugate of a quaternion, q*, is computed by negating the sign of the unit vector parts:

$$ q^{*} = q_0 - q_1 i - q_2 j - q_3 k \tag{4} $$

A unit quaternion is defined as having length (magnitude) 1:

$$ |q| = \sqrt{q_0^2 + q_1^2 + q_2^2 + q_3^2} = 1 \tag{5} $$

Thus, a vector in ℝ³ can be represented by the pure quaternion q = q1 i + q2 j + q3 k. In colour images, a pixel can be represented as a vector in 3D space using quaternions as

$$ q_{x,y} = r_{x,y}\, i + g_{x,y}\, j + b_{x,y}\, k \tag{6} $$

where r_{x,y}, g_{x,y} and b_{x,y} are the RGB colour components of the pixel q located at position (x, y) in the image.

3.2. Unit transform (UT)

The unit transform of an RGB quaternion vector q is the quaternion product of q with a unit quaternion n̂ on the left and its conjugate n̂* on the right:

$$ UT(q) = \hat{n}\, q\, \hat{n}^{*} \tag{7} $$

where

$$ \hat{n} = \cos\theta + \tfrac{1}{\sqrt{3}} \sin\theta\,(i + j + k), \qquad \hat{n}^{*} = \cos\theta - \tfrac{1}{\sqrt{3}} \sin\theta\,(i + j + k). $$

Substituting in Eq. (7) with q = Ri + Gj + Bk, the formula of UT(q) becomes

$$
\begin{aligned}
UT(q) &= \left[\cos\theta + \tfrac{1}{\sqrt{3}}\sin\theta\,(i+j+k)\right] [Ri + Gj + Bk] \left[\cos\theta - \tfrac{1}{\sqrt{3}}\sin\theta\,(i+j+k)\right] \\
&= \underbrace{\cos 2\theta\,(Ri + Gj + Bk)}_{QC_1} + \underbrace{\tfrac{2}{3}\sin^{2}\theta\,(R + G + B)\,(i + j + k)}_{QC_2} + \underbrace{\tfrac{1}{\sqrt{3}}\sin 2\theta\,[(B - G)i + (R - B)j + (G - R)k]}_{QC_3} \\
&= QC_1 + QC_2 + QC_3
\end{aligned}
\tag{8}
$$

In general, the UT of quaternion vectors was originally introduced for 3D rotations of objects. In our work, however, the purpose is to rotate image colour vectors in the colour space, so that a colour-transformed version of the image is obtained simply by varying θ in Eq. (8). For example, setting θ to 45° transfers the image from RGB to an HSI-like space. Figure 1 shows an example of different colour transformations of an image obtained by varying the parameter θ.
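As a quick numerical check of Eqs. (2)–(8), the following minimal Python sketch computes the unit transform of a single RGB triple in two ways: by direct quaternion multiplication, and through the closed-form cues QC1, QC2 and QC3. The function names (`qmul`, `unit_transform`, `quaternion_cues`) and the use of NumPy are illustrative assumptions and are not taken from the paper, whose experiments were run in Matlab.

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return np.array([
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    ])

def unit_transform(rgb, theta):
    """UT(q) = n q n* (Eq. (7)) by direct quaternion multiplication."""
    s = np.sin(theta) / np.sqrt(3.0)
    n = np.array([np.cos(theta), s, s, s])            # unit quaternion n
    n_conj = n * np.array([1.0, -1.0, -1.0, -1.0])    # conjugate, Eq. (4)
    q = np.array([0.0, *rgb])                         # pure quaternion, Eq. (6)
    return qmul(qmul(n, q), n_conj)

def quaternion_cues(rgb, theta):
    """Closed-form cues QC1, QC2, QC3 of Eq. (8), as (i, j, k) components."""
    r, g, b = rgb
    qc1 = np.cos(2 * theta) * np.array([r, g, b])
    qc2 = (2.0 / 3.0) * np.sin(theta) ** 2 * (r + g + b) * np.ones(3)
    qc3 = np.sin(2 * theta) / np.sqrt(3.0) * np.array([b - g, r - b, g - r])
    return qc1, qc2, qc3

# Hypothetical pixel and angle, chosen only for illustration.
rgb = (0.8, 0.5, 0.2)
theta = np.deg2rad(50.0)
ut = unit_transform(rgb, theta)
qc1, qc2, qc3 = quaternion_cues(rgb, theta)
print(ut[1:], qc1 + qc2 + qc3)
```

The two results should agree to floating-point precision, and the scalar part of the product should vanish, since n̂ q n̂* rotates the colour vector about the grey axis (i + j + k) by an angle of 2θ without changing its length.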
Equation (8) shows that the UT is a combination of three quaternion vectors (cues): QC1, the RGB component; QC2, the intensity of the colour image; and QC3, the chromaticity difference vector.

Figure 1. Quaternion colour transformations. (a) Original RGB image, (b) colour rotation by θ = 50° and (c) θ = 70°.

4. THE PROPOSED SHADOW DETECTION METHOD

Light source occlusion creates shadows in a scene. Thus, shadow regions in an image have a lower illumination component and lower pixel intensity in the RGB colour space. To exploit these characteristics, many researchers have investigated chromaticity cues for shadow detection using different colour transformations, including YIQ [9, 29], HSV [30, 31] and HSI. In this paper, we exploit the UT to generate a colour-transformed image. The quaternion cues composing the UT image are then analysed to create a version of the image that darkens the shadows and brightens the illuminated non-shadow regions.

After analysing the UT-transformed image, we observed that the rotation has an impact on both QC1 and QC3, while QC2 shows the lowest variation when the colour vectors are rotated; this is reasonable because, as seen in Eq. (8), the sine term in QC2 is squared, which reduces the impact of rotation. We therefore base our analysis on QC1 and QC3 of the colour-transformed image. Shadow extraction in our method starts by analysing QC1. In simple scenes, where low variation in colours is observed, QC1 is sufficient to extract shadows, under the simple assumption that an illuminated pixel tends to have high reflectance in at least one of its colour channels.
However, for more complex scenes, where a high variation in colours is perceived, both QC1 and QC3 are involved in the extraction process. In our method, the value of θ used in Eq. (8) is chosen so as to enlarge the discrimination between the shadow and non-shadow regions (discussed in Section 5.2).

In particular, our approach is composed of two main steps. First, compute the quaternion cues. Second, analyse the histogram of |QC1| (where |·| is the magnitude of the quaternion cue) to decide on the complexity of the scene, which determines the cues suitable for extracting shadows. For simple scenes, the histogram of |QC1| is bi-modal, which denotes the existence of two main features in the image space, and selecting a threshold δ between the two features is sufficient to separate shadow from non-shadow regions. The point δ is selected in the valley between the two peaks of the histogram, near the darkest peak (see Fig. 2).

Figure 2. Threshold selection in a bi-modal histogram. The dashed line denotes the location of the threshold δ.

In contrast, for more complex scenes, where the histogram of |QC1| is multi-modal, it is harder to decide on a suitable threshold location (see Fig. 3). Therefore, we suggest using both QC1 and QC3 to generate a bright image with a less complex histogram, which tends to attenuate the radiance of the shadow regions and increase the radiance of the non-shadow ones.

Figure 3. (a) Original image. (b) Histogram of |QC1|.

The bright image is created by computing the ratio between two colour-transformed versions of the image.
Formally, assume a colour-transformed image generated by rotating the colour vectors of the image by angle θ1, and assume that the histogram of |QC1|θ1| is multi-modal. To generate a bright image, create another colour-transformed image with angle θ2, where θ2 ⊥ θ1 (the two angles differ by 90°, as in Fig. 4, where θ1 = 70° and θ2 = 160°). Thereafter, create two colour images, I1 and I2, by adding the first and third quaternion cues of each transformed image: I1|θ1 = QC1|θ1 + QC3|θ1 and I2|θ2 = QC1|θ2 + QC3|θ2. Finally, compute the max-channel of both I1 and I2, i.e. the maximum colour component at each pixel of the image. The bright image is the ratio between the two max-channels.

The selection of two perpendicular colour transformations aims to produce two transformed images, one light and one dark. This is clearly shown in their histograms, which are mirror reflections of each other. Therefore, computing the ratio between the light and the dark images ensures the generation of a bright image (see Fig. 4). Note that in the ratio, the light image is the numerator and the dark image is the denominator.

Figure 4. Bright channel generation for Fig. 3a. (a) Dark image (I1|θ=70). (b) Light image (I2|θ=160). (c) The ratio bright image (max-channel(I2)/max-channel(I1)).

The bright image tends to produce an outlier peak in the right part of its histogram (see Fig. 6a). Suppressing this outlier peak yields a less complex histogram. In this case, δ is the minimum point next to the darkest peak. For example, the histogram in Fig. 6b has two main valleys, so δ is the minimum point in the leftmost (darkest) valley.
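The construction above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the text does not say how negative components of I1 and I2 are mapped to displayable intensities, so the sketch assumes an independent min-max rescaling of each image to [0, 1] (under which the two images are exact mirrors, light = 1 − dark). The names `transformed_image`, `rescale01`, `bright_ratio_image` and `leftmost_valley`, the `eps` guard, and the simple local-maximum peak test are all hypothetical; the default smoothing window of 15 is the moving-average length the paper reports for its histogram smoothing.

```python
import numpy as np

def transformed_image(img, theta_deg):
    """I = QC1 + QC3 of Eq. (8) for an H x W x 3 float RGB image."""
    t = np.deg2rad(theta_deg)
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    qc1 = np.cos(2 * t) * img
    qc3 = np.sin(2 * t) / np.sqrt(3.0) * np.stack([b - g, r - b, g - r], axis=-1)
    return qc1 + qc3

def rescale01(x, eps=1e-12):
    """Assumption: min-max rescaling to [0, 1] (not specified in the text)."""
    return (x - x.min()) / max(x.max() - x.min(), eps)

def bright_ratio_image(img, theta1_deg, eps=1e-6):
    """Ratio of max-channels: light image (theta1 + 90) over dark image (theta1)."""
    dark = rescale01(transformed_image(img, theta1_deg))
    light = rescale01(transformed_image(img, theta1_deg + 90.0))
    return light.max(axis=-1) / (dark.max(axis=-1) + eps)

def leftmost_valley(hist, window=15):
    """Index of the minimum between the two darkest peaks of a smoothed histogram."""
    smooth = np.convolve(hist, np.ones(window) / window, mode="same")
    peaks = [i for i in range(1, len(smooth) - 1)
             if smooth[i] > smooth[i - 1] and smooth[i] >= smooth[i + 1]]
    if len(peaks) < 2:
        return None  # not enough peaks to define a valley
    p0, p1 = peaks[0], peaks[1]
    return p0 + int(np.argmin(smooth[p0:p1 + 1]))

# Usage on a hypothetical random image with theta1 = 70 degrees (as in Fig. 4).
rng = np.random.default_rng(0)
img = rng.random((8, 8, 3))
ratio = bright_ratio_image(img, 70.0)
print(ratio.shape)
```

A real pipeline would also need the outlier-peak suppression described above before calling `leftmost_valley`; that step is omitted here because the text does not specify how the outlier peak is identified.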
Three points are worth mentioning. First, the type of the histogram (bi-modal or multi-modal) is determined by locating the inflection points in the histogram and then counting the number of peaks according to the characteristics of its first derivative. Second, the histogram is smoothed before applying the threshold selection procedure, to reduce the influence of noise and to avoid selecting false peaks or outliers. In this paper, a moving-average filter with a window length of 15 was chosen for this purpose. Third, additional information about the approximate amount of shadow in the image increases the reliability of the thresholding procedure.

Another useful property of |QC1| is its ability to extract the foreground from a simple scene, although this assumes that an NIR image is available. As shown in Fig. 5, the difference (NIR − |QC1|) excludes the background from the image (see Fig. 5d), and binarizing the result yields an object mask (see Fig. 5e). A limitation appears when an object has higher reflectance in the visible image than in the NIR image: that object is erroneously excluded (see the region inside the left mug in Fig. 5e).

Figure 5. (a) Original image. (b) |QC1|. (c) NIR. (d) Difference image. (e) The object mask.

Figure 6. Threshold selection when |QC1| has a multi-modal histogram. (a) Histogram of the ratio bright image of Fig. 3a. (b) Histogram after suppressing the rightmost peak. (c) The result after thresholding the ratio bright image with δ = 34. (d) The shadow ground truth.
5. RESULTS AND DISCUSSION

5.1. Datasets

Three datasets were used to evaluate the proposed shadow detection method: the Rüfenacht dataset, which contains 74 images (32 indoor and 42 natural outdoor) together with their NIR images and manually created ground truth; the University of Central Florida (UCF) dataset, which contains 355 images, a mixture of indoor and outdoor natural images under uncontrolled light, with manually labelled ground truth shadow masks; and the UIUC dataset, provided by Guo et al., which consists of 108 images of natural scenes. The shadow masks of the UIUC dataset are obtained automatically by finding the difference between shadow and non-shadow images. It is important to note that the Rüfenacht dataset is the only one offering NIR images. Its ground truth was created by manually labelling shadow and non-shadow regions and, as we observed, its annotations are very accurate in comparison with the other datasets.

5.2. Parameter setting

The proposed method has a single parameter, the angle of rotation θ, which must be selected dynamically to obtain accurate results. The selection was done by examining the histograms of the colour transformations created by varying θ in the range [0°, 180°] and then selecting the value that produces the colour-transformed image with the highest level of information, measured by maximum overall entropy. The entropy of an image is defined as

$$ \text{Entropy} = -\sum_{i=0}^{N-1} P(s_i) \log_2 P(s_i) \tag{9} $$

where N is the number of grey levels and P(s_i) is the normalized probability of the grey level s_i.

Experiments were performed in the Matlab environment on a machine with 8 GB of RAM and an Intel Core i7 CPU running at 2.60 GHz.

5.3. Evaluation metrics

Five statistical metrics were employed to evaluate the proposed method and to compare our results with a set of state-of-the-art methods: per-pixel Accuracy, the Matthews correlation coefficient (Mcc), Precision, Recall and F-measure. Accuracy takes values in the interval [0, 1], where a higher value indicates better quality. Mcc is a correlation coefficient with values in [−1, 1]. Its usefulness comes from the fact that it is a balanced measure, which can be used even if the classes differ greatly in size. Accuracy and Mcc are computed as

$$ \text{Accuracy} = \frac{T_{pos} + T_{neg}}{T_{pos} + T_{neg} + F_{pos} + F_{neg}} \tag{10} $$

$$ \text{Mcc} = \frac{T_{pos} \times T_{neg} - F_{pos} \times F_{neg}}{\sqrt{(T_{pos} + F_{pos})(T_{pos} + F_{neg})(T_{neg} + F_{pos})(T_{neg} + F_{neg})}} \tag{11} $$

where Tpos (true positive) is the number of correctly classified shadow pixels, Fneg (false negative) is the number of shadow pixels classified incorrectly, Fpos (false positive) is the number of non-shadow pixels classified incorrectly and Tneg (true negative) is the number of non-shadow pixels classified correctly. Besides these metrics, we adopted the commonly used Precision, Recall and F-measure to evaluate the effectiveness of our method:

$$ \text{Precision} = \frac{T_{pos}}{T_{pos} + F_{pos}} \tag{12} $$

$$ \text{Recall} = \frac{T_{pos}}{T_{pos} + F_{neg}} \tag{13} $$

$$ \text{F-measure} = \frac{2 \times \text{Recall} \times \text{Precision}}{\text{Recall} + \text{Precision}} \tag{14} $$

5.4. Qualitative results

The proposed method was assessed qualitatively and compared with the state-of-the-art methods of Tian et al., Rüfenacht et al., Li et al. and Guo et al. Figure 7 gives a visual comparison between our method and these methods. The images in the figure were selected to show the strength of the proposed method on indoor and outdoor images. Among the compared methods, Guo's method is segmentation-based, while the others (including ours) are pixel-based. Rüfenacht's method requires an NIR image and Li's method recommends the use of one, while Tian's and Guo's methods require a single image.

Figure 7. Sample results of Rüfenacht dataset.
(a–c) Original images. (d–f) Shadow ground truth. (g–i) Proposed. (j–l) Rüfenacht et al. (m–o) Guo et al. (p–r) Tian et al. (s–u) Li et al.

As shown in Fig. 7, the proposed single-image method obtained results that closely match the ground truth. Natural outdoor scenes and highly textured images are the most complex cases from the segmentation point of view. A failure in segmenting shadow and non-shadow regions therefore affects the performance of a segmentation-based method; this causes the poor results of Guo's method, which works better in simpler indoor scenes. Rüfenacht's method, on the other hand, fails when a material has much lower reflectance in the NIR than in the visible RGB image, in which case it can be erroneously classified as shadow. Li's method is less affected by these NIR problems than Rüfenacht's method, because it depends on several different cues to generate a shadow mask. Tian's method computes shadow masks from a model of outdoor light sources, which is why it performs well on outdoor scenes but less well under controlled indoor lighting.

5.5. Quantitative results

The proposed shadow detection method was assessed quantitatively on the aforementioned datasets. Tables 1 and 2 compare our results with those of other methods on the Rüfenacht dataset. As shown in Table 1, the quaternion cues recorded, on average, the best Precision, Accuracy and Mcc, and the second-best Recall (0.92, after the 0.97 of Tian et al.). In addition, the standard deviations (σ) of the Accuracy and Mcc are the lowest among the compared methods, indicating that our single-image method has fewer failures.

Table 1. Shadow detection statistical results on Rüfenacht dataset.
| Method | Precision | Recall | Mcc | σMcc | Accuracy | σAccuracy |
|---|---|---|---|---|---|---|
| Tian et al. | 0.79 | 0.97 | 0.76 | 0.19 | 0.81 | 0.18 |
| Rüfenacht et al. | 0.87 | 0.90 | 0.80 | 0.14 | 0.90 | 0.08 |
| Li et al. | 0.89 | 0.89 | 0.79 | 0.15 | 0.90 | 0.11 |
| Guo et al. | 0.73 | 0.22 | 0.19 | 0.27 | 0.60 | 0.22 |
| Proposed | 0.91 | 0.92 | 0.84 | 0.12 | 0.92 | 0.05 |

Table 2. Shadow detection confusion matrices on Rüfenacht dataset.

| Method | Actual class | Predicted shadow | Predicted non-shadow |
|---|---|---|---|
| Tian et al. | Actual shadow | 0.977410097 | 0.022589903 |
| Tian et al. | Actual non-shadow | 0.274022631 | 0.725977369 |
| Rüfenacht et al. | Actual shadow | 0.880844004 | 0.119155996 |
| Rüfenacht et al. | Actual non-shadow | 0.180794826 | 0.819205174 |
| Li et al. | Actual shadow | 0.894289150 | 0.105710850 |
| Li et al. | Actual non-shadow | 0.155147152 | 0.844852848 |
| Guo et al. | Actual shadow | 0.215790192 | 0.784209808 |
| Guo et al. | Actual non-shadow | 0.047573612 | 0.952426388 |
| Proposed | Actual shadow | 0.920905872 | 0.079094128 |
| Proposed | Actual non-shadow | 0.070441239 | 0.929558761 |

The experimental results show the improvement that quaternion cues bring to shadow detection. As shown in Table 2, the proposed method reported a balanced result, where labelling errors are small on average (0.079 for shadow regions and 0.070 for non-shadow regions).
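The quantities reported in Tables 1 and 2 all derive from per-pixel confusion counts. The following minimal sketch shows how Eqs. (10)–(14) can be computed from those counts; the function name `shadow_metrics`, the zero-denominator guards and the example counts are illustrative assumptions and do not come from the paper.

```python
import math

def shadow_metrics(tp, tn, fp, fn):
    """Accuracy, Mcc, Precision, Recall and F-measure from confusion counts.

    tp/fn count shadow pixels labelled correctly/incorrectly;
    tn/fp count non-shadow pixels labelled correctly/incorrectly.
    """
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Guard (assumption): report 0 when a class is empty and Mcc is undefined.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    return {
        "accuracy": accuracy,
        "mcc": mcc,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }

# Hypothetical counts for a 200-pixel image, for illustration only.
print(shadow_metrics(tp=90, tn=85, fp=15, fn=10))
```

Because Mcc uses all four counts, it stays informative when shadow pixels are a small fraction of the image, which is the balance property mentioned in Section 5.3.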
Guo et al. recorded the smallest labelling error in non-shadow regions (0.047); however, their method also recorded the highest misclassification of shadow regions (0.784). Among the compared methods, the results of [8, 9] are the closest to ours. However, on the Rüfenacht dataset, the quaternion approach reduces the false-negative and false-positive rates reported by Rüfenacht et al. by 4 and 11 percentage points, respectively. Further, it reduces the rates of Li et al. by 2.7 and 8.4 percentage points, respectively. Li et al. obtained good results because their framework recommends the use of NIR images when they are available. The results of Tian et al. were poor on the Rüfenacht dataset because it contains a set of indoor images that are most likely light controlled.

The proposed quaternion cues method was also compared with [13, 16, 21] in terms of Precision, Recall and F-measure. Our results were better than their reported results on both the UIUC and UCF datasets (see Table 3).

Table 3. Comparison in terms of Precision, Recall, and F-measure.

| Method | UCF Precision | UCF Recall | UCF F-measure | UIUC Precision | UIUC Recall | UIUC F-measure |
|---|---|---|---|---|---|---|
| Guo et al. | 0.35 | 0.54 | 0.36 | 0.53 | 0.57 | 0.51 |
| Lalonde et al. | 0.35 | 0.66 | 0.43 | 0.49 | 0.69 | 0.55 |
| Tian et al. | 0.47 | 0.67 | 0.50 | 0.69 | 0.73 | 0.66 |
| Proposed | 0.55 | 0.69 | 0.57 | 0.83 | 0.79 | 0.79 |

Further experiments were performed on the UCF and UIUC shadow datasets. In particular, we compared our results with the published results of a set of state-of-the-art methods in terms of per-pixel Accuracy (%). The proposed method achieved 90.1% Accuracy on the UCF dataset, which is the best among the compared methods. In addition, we obtained 92.1% Accuracy on the UIUC dataset, against 88.3% achieved by Guo et al. (see Table 4).
Although its Accuracy on UIUC is not the highest, the proposed method achieved an overall gain of 11.8% over both datasets when compared with Khan et al.

Table 4. Comparison in terms of Accuracy (%).

| Method | Approach | UCF dataset | UIUC dataset |
|---|---|---|---|
| Tian et al. | Tricolour attenuation model | 80.0 | 82.0 |
| Li et al. | Model and observation cues | 83.1 | 82.01 |
| Guo et al. | Unary SVM-pairwise | 90.0 | 88.3 |
| Zhu et al. | BDT-BCRF | 88.7 | – |
| Guo et al. | Unary SVM | 87.1 | 81.7 |
| Jiang et al. | Illumination Maps-BDT-CRF | 83.5 | – |
| Khan et al. | ConvNet (Boundary+Region) | 89.31 | 92.31 |
| Yuan et al. | Unary potential | 88.3 | 82.1 |
| Panagopoulos et al. | Bright Channel-MRF | 85.9 | – |
| Proposed | Quaternion Cues | 90.1 | 92.1 |

A significant observation from the last comparison is that not all mismatches correspond to a limitation of our approach; some are due to imperfections in the ground truth, especially the automatically generated ground truth of the UIUC dataset. In the UCF dataset, some self and internal shadows are ignored and the focus is on the main cast shadow in the scene. As can be seen in Fig. 8, the top of the car wheel should be part of the shadow mask, and the top of the box should not be part of it. In contrast, very accurate shadow annotations are observed in the Rüfenacht dataset.

Figure 8. Examples of bad shadow masks in ground truth. (a–b) UCF dataset. (c–d) UIUC dataset.
The proposed method was also evaluated with respect to computation time. As shown in Table 5, it is highly efficient: because it is a histogram-based method, image size only weakly affects its execution time, while scene complexity may slightly affect its performance. Although this result is favourable, real-time shadow detection in videos is still costly in terms of computation time.

Table 5. Average computation time (in seconds) for different methods.

| Image size | Proposed | Tian et al. | Rüfenacht et al. | Li et al. | Guo et al. |
|---|---|---|---|---|---|
| 425×650 | 0.2 | 1.7 | 0.13 | 0.27 | 40.23 |
| 900×1400 | 0.4 | 10.1 | 0.63 | 0.77 | 480 |

Despite its strong results, our method inherits the difficulty of discriminating dark regions from shadow regions (see Fig. 9). This distinction could be resolved by using NIR images. However, as noted, relying on NIR brings its own problems: in some cases, a large difference in reflectance between the RGB and the NIR images may cause misclassifications.

Figure 9. Example of misclassification of dark regions as shadows. (a) Original image. (b) The binary shadow mask generated by the proposed method.

6. CONCLUSIONS

A single-image shadow detection method based on a quaternion representation of colour images is proposed. Two quaternion cues (the RGB and chromaticity difference vectors) are used to automatically produce binary shadow masks. Thresholding the RGB quaternion cue was enough to discriminate shadows in simple scenes; for more complex scenes, we recommend thresholding the bright image generated from two perpendicular colour-transformed images, which leverages both vector quaternion cues.
We performed experiments on three challenging datasets and observed that our method outperforms the compared methods in detection quality. It achieves accuracies of 92%, 90.1% and 92.1% on the Rüfenacht, UCF and UIUC datasets, respectively. It is also efficient and simple to implement, owing to its non-iterative approach. Future plans include extending our shadow detection pipeline with an automatic shadow removal step and investigating the quaternion cues in other image processing applications.

REFERENCES

1 Okabe, T., Sato, I. and Sato, Y. (2009) Attached Shadow Coding: Estimating Surface Normals from Shadows Under Unknown Reflectance and Lighting Conditions. Proc. IEEE Int. Conf. Computer Vision (ICCV), Kyoto, Japan, 29 September–2 October, pp. 1693–1700. IEEE.
2 Panagopoulos, A., Samaras, D. and Paragios, N. (2009) Robust Shadow and Illumination Estimation using a Mixture Model. Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), Miami, FL, 20–25 June, pp. 651–658. IEEE.
3 Shen, L., Wee Chua, T. and Leman, K. (2015) Shadow Optimization from Structured Deep Edge Detection. Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), Boston, MA, 7–12 June, pp. 2067–2074. IEEE.
4 Nghiem, A.T., Bremond, F. and Thonnat, M. (2008) Shadow Removal in Indoor Scenes. Proc. IEEE Int. Conf. Advanced Video and Signal Based Surveillance, Santa Fe, NM, 1–3 September, pp. 291–298. IEEE.
5 Wu, T.P., Tang, C.K., Brown, M.S. and Shum, H.Y. (2007) Natural shadow matting. ACM Trans. Graph., 26, 8.
6 Levine, M.D. and Bhattacharyya, J. (2005) Removing shadows. Pattern Recognit. Lett., 26, 251–265.
7 Tian, J., Zhu, L. and Tang, Y. (2012) Outdoor shadow detection by combining tricolor attenuation and intensity. EURASIP J. Adv. Signal Process., 2012, 116.
8 Rüfenacht, D., Fredembach, C. and Süsstrunk, S. (2014) Automatic and accurate shadow detection using near-infrared information. IEEE Trans. Pattern Anal. Mach. Intell., 36, 1672–1678.
9 Li, J., Hu, Q. and Ai, M. (2016) Joint model and observation cues for single-image shadow detection. Remote Sens., 8, 484.
10 Lalonde, J.F., Efros, A.A. and Narasimhan, S.G. (2012) Estimating the natural illumination conditions from a single outdoor image. Int. J. Comput. Vis., 98, 123–145.
11 Guo, R., Dai, Q. and Hoiem, D. (2011) Single-Image Shadow Detection and Removal using Paired Regions. Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, 20–25 June, pp. 2033–40. IEEE.
12 Zhu, J., Samuel, K.G.G., Masood, S.Z. and Tappen, M.F. (2010) Learning to Recognize Shadows in Monochromatic Natural Images. Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, 13–18 June, pp. 223–230. IEEE.
13 Guo, R., Dai, Q. and Hoiem, D. (2013) Paired regions for shadow detection and removal. IEEE Trans. Pattern Anal. Mach. Intell., 35, 2956–2967.
14 Jiang, X., Schofield, A.J. and Wyatt, J.L. (2011) Shadow Detection based on Colour Segmentation and Estimated Illumination. Proc. Brit. Mach. Vis. Conf. (BMVC), University of Dundee, Dundee, Scotland, 29 August–2 September, pp. 87.1–87.11. BMVA Press.
15 Khan, S.H., Bennamoun, M., Sohel, F. and Togneri, R. (2016) Automatic shadow detection and removal from a single image. IEEE Trans. Pattern Anal. Mach. Intell., 38, 431–446.
16 Lalonde, J.F., Efros, A.A. and Narasimhan, S.G. (2010) Detecting Ground Shadows in Outdoor Consumer Photographs. Proc. Eur. Conf. Computer Vision (ECCV), Heraklion, Crete, Greece, 5–11 September, pp. 322–335. Springer, Berlin, Heidelberg.
17 Besheer, M. and Abdelhafiz, A. (2015) Modified invariant colour model for shadow detection. Int. J. Remote Sens., 36, 6214–6223.
18 Tian, J., Sun, J. and Tang, Y. (2009) Tricolor attenuation model for shadow detection. IEEE Trans. Image Process., 18(10), 2355–2363.
19 Yuan, X., Ebner, M. and Wang, Z. (2015) Single-image shadow detection and removal using local colour constancy computation. IET Image Process., 9(2), 118–126.
20 Panagopoulos, A., Wang, C., Samaras, D. and Paragios, N. (2010) Estimating Shadows with the Bright Channel Cue. Proc. Trends and Topics in Computer Vision: Eur. Conf. Computer Vision (ECCV) Workshops, Heraklion, Crete, Greece, 5–11 September, pp. 1–12. Springer, Berlin, Heidelberg.
21 Tian, J., Qi, X., Qu, L. and Tang, Y. (2016) New spectrum ratio properties and features for shadow detection. Pattern Recognit., 51, 85–96.
22 Moeslund, T.B., Hilton, A. and Krüger, V. (2006) A survey of advances in vision-based human motion capture and analysis. Comput. Vis. Image Underst., 104, 90–126.
23 Madsen, C.B., Moeslund, T.B., Pal, A. and Balasubramanian, S. (2009) Shadow Detection in Dynamic Scenes Using Dense Stereo Information and an Outdoor Illumination Model. Proc. Dynamic 3D Imaging: DAGM 2009 Workshop, Dyn3D 2009, Jena, Germany, 9 September, pp. 110–125. Springer, Berlin, Heidelberg.
24 Otsu, N. (1975) A threshold selection method from gray-level histograms. Automatica, 11, 23–27.
25 Hamilton, W.R. (1844) On quaternions; or on a new system of imaginaries in algebra. Philos. Mag. Ser. 3, 25, 489–495.
26 Geng, X., Hu, X. and Xiao, J. (2012) Quaternion switching filter for impulse noise reduction in color image. Signal Process., 92, 150–162.
27 Cai, C. and Mitra, S.K. (2000) A Normalized Color Difference Edge Detector based on Quaternion Representation. Proc. IEEE Int. Conf. Image Processing (ICIP), Vancouver, BC, Canada, 10–13 September, Vol. 2, pp. 816–819. IEEE.
28 Goldman, R. (2010) Rethinking Quaternions. Synthesis Lectures on Computer Graphics and Animation, 4(1), 1–157.
29 Khekade, A. and Bhoyar, K. (2015) Shadow Detection based on RGB and YIQ Color Models in Color Aerial Images. Proc. Int. Conf. Futuristic Trends on Computational Analysis and Knowledge Management, Noida, India, 25–27 February, pp. 144–147. IEEE.
30 Cucchiara, R., Grana, C., Piccardi, M., Prati, A. and Sirotti, S. (2001) Improving Shadow Suppression in Moving Object Detection with HSV Color Information. Proc. IEEE Conf. Intelligent Transportation Systems, Oakland, CA, 25–29 August, pp. 334–339. IEEE.
31 Wang, B., Zhu, W., Zhao, Y. and Zhang, Y. (2015) Moving Cast Shadow Detection using Joint Color and Texture Features with Neighboring Information. Proc. Image and Video Technology – PSIVT 2015 Workshops, Auckland, New Zealand, 23–27 November, Vol. 9555, pp. 15–25. Springer International Publishing.
32 Shi, W. and Li, J. (2012) Shadow detection in color aerial images based on HSI space and color attenuation relationship. EURASIP J. Adv. Signal Process., 2012, 141.
33 Arce, G.R. (2005) Nonlinear Signal Processing: A Statistical Approach. John Wiley & Sons, Hoboken, New Jersey.
34 Gray, R. (1990) Entropy and Information Theory. Springer-Verlag.
35 Vicente, T.F.Y., Hou, L., Yu, C.P., Hoai, M. and Samaras, D. (2016) Large-Scale Training of Shadow Detectors with Noisily-Annotated Shadow Examples. Proc. Eur. Conf. Computer Vision (ECCV), Amsterdam, The Netherlands, 11–14 October, Part IV, pp. 816–832. Springer International Publishing.
36 Matthews, B.W. (1975) Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim. Biophys. Acta (BBA) – Protein Struct., 405, 442–451.

Author notes
Handling editor: Fionn Murtagh
© The British Computer Society 2018. All rights reserved. For permissions, please e-mail: firstname.lastname@example.org
The Computer Journal – Oxford University Press
Published: Mar 1, 2018