An implementation of normal distribution based segmentation and entropy controlled features selection for skin lesion detection and classification

Background: Melanoma is the deadliest type of skin cancer, with the highest mortality rate. However, eradication in its early stage implies a high survival rate; therefore, it demands early diagnosis. The accustomed diagnosis methods are costly and cumbersome due to the involvement of experienced experts as well as the requirement for a highly equipped environment. The recent advancements in computerized solutions for this diagnosis are highly promising, with improved accuracy and efficiency. Methods: In this article, a method for the identification and classification of the lesion based on probabilistic distribution and best features selection is proposed. Probabilistic distributions such as the normal distribution and the uniform distribution are implemented for segmentation of the lesion in dermoscopic images. Then multi-level features are extracted, and a parallel strategy is performed for fusion. A novel entropy-based method combining the Bhattacharyya distance and variance is calculated for the selection of the best features. Only selected features are classified using a multi-class support vector machine, which is selected as the base classifier. Results: The proposed method is validated on three publicly available datasets, namely PH2, ISIC (i.e. ISIC MSK-2 and ISIC UDA), and Combined (ISBI 2016 and ISBI 2017), including multi-resolution RGB images, and achieved accuracies of 97.5%, 97.75%, and 93.2%, respectively. Conclusion: The base classifier performs significantly better with the proposed features fusion and selection method as compared to other methods in terms of sensitivity, specificity, and accuracy. Furthermore, the presented method achieved satisfactory segmentation results on the selected datasets.
Keywords: Image enhancement, Uniform distribution, Image fusion, Multi-level features extraction, Features fusion, Features selection

*Correspondence: tallha@ciitwah.edu.pk; aamirsardar@gmail.com
Department of Electrical Engineering, COMSATS Institute of Information Technology, Wah, Pakistan; Department of Electrical Engineering, COMSATS Institute of Information Technology, Abbottabad, Pakistan. Full list of author information is available at the end of the article.
© The Author(s). 2018 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
Khan et al. BMC Cancer (2018) 18:638

Background
Skin cancer is reported to be one of the most rapidly spreading cancers among the other types. It is broadly classified into two primary classes: melanoma and benign. Melanoma is the deadliest type of cancer, with the highest mortality rate worldwide [1]. In the US alone, an astonishing mortality rate of 75% is reported for melanoma compared to the other types of skin cancer [2]. The occurrence of melanoma is reported to have doubled (increasing 2 to 3% per year) in the last two decades, faster than for any other type of cancer. The American Cancer Society (ACS) has estimated that 87,110 new cases of melanoma will be diagnosed and 9,730 people will die in the US alone in 2017 [3]. Malignant melanoma can be cured if detected at
its early stages; e.g., if diagnosed at stage I, the possible survival rate is 96%, compared to 5% at stage IV [4, 5]. However, early detection is strenuous due to its high resemblance to benign cancer; even an expert dermatologist can diagnose it wrongly. A specialized technique, dermatoscopy, is mostly followed by dermatologists to diagnose melanoma. In a clinical examination, the most commonly adopted methods of visual feature inspection are the Menzies method [6], the ABCD rule [7], and the 7-point checklist [8]. The most commonly used methods are the ABCD (atypical, border, color, diameter) rules and pattern analysis. It is reported that this traditional dermoscopy method can increase the detection rate by 10 to 27% [9]. These methods distinctly increase the detection rate compared to conventional methods, but are still dependent on the dermatologist's skills and training [10]. To facilitate experts, numerous computerized analysis systems have been proposed recently [11, 12], which are referred to as pattern analysis or computerized dermoscopic analysis systems. These methods are non-invasive, image analysis based techniques to diagnose melanoma.

In the last decade, several non-invasive methods were introduced for the diagnosis of melanoma, including the optical imaging system (OIS) [13], optical coherence tomography (OCT) [14], light scattering (LS) [15], the spectropolarimetric imaging system (SIM) [16, 17], Fourier polarimetry (FP) [18], polarimetric imaging [19], reflectance confocal microscopy (RCM) [20, 21], photoacoustic microscopy [22], optical transfer diagnosis (OTD) [23], etc. All the above mentioned methods have enough potential to diagnose skin lesions and are also accurate enough to distinguish melanoma from benign lesions. The optical methods are mostly utilized during clinical tests to evaluate the presurgical boundaries of basal cell carcinoma; they can help in drawing boundaries around the region of interest (ROI) in the dermoscopic images. LS skin methods give information about the micro-architecture, which is represented with small pieces of pigskin and mineral elements, and help to determine the extent of various types of skin cancer. The SIM method evaluates the polarimetric contrast of the region of interest, or infectious region such as melanoma, compared to the background or healthy region. In the FP method, however, human skin is observed with laser scattering and the difference is identified using an optical method as the diagnostic test for differentiating melanoma and benign lesions.

Problem statement
It is proved that malignant melanoma is a lethal skin cancer that is especially dominant among people aged 15 and above [24]. Recent research shows a high rate of failure to detect and diagnose this type of cancer at the early stages [25]. Generally, detection consists of four major steps: preprocessing, which consists of hair removal and contrast enhancement; segmentation; feature extraction; and finally classification. The most challenging task in dermoscopy is an accurate detection of the lesion's boundary, because of different artifacts such as hairs, illumination effects, low lesion contrast, asymmetrical and irregular borders, nicked edges, etc. Therefore, for an early detection of melanoma, shape analysis is most important. In the feature extraction step, several types of features are extracted, such as shape, color, texture, local features, etc. But we have no clear knowledge about the salient features for classification.

Contribution
In this article, we propose a new method of lesion detection and classification by implementing a probabilistic distribution based segmentation method and conditional entropy controlled features selection. The proposed technique is an amalgamation of five major steps: a) contrast stretching; b) lesion extraction; c) multi-level features extraction; d) features selection; and e) classification of malignant and benign. The results are tested on three publicly available datasets, namely PH2, ISIC (i.e. ISIC MSK-2 and ISIC UDA), and Combined (ISBI 2016 and ISBI 2017), containing RGB images of different resolutions, which are later normalized in our proposed technique. Our main contributions are enumerated below:

1 Enhanced the contrast of the lesion area by implementing a novel contrast stretching technique, in which we first calculate the global minima and maxima of the input image and then utilize low and high threshold values to enhance the lesion.
2 Implemented a novel segmentation method based on the normal and uniform distributions. The mean of the uniform distribution is calculated from the enhanced image and the value is inserted into an activation function, which is introduced for segmentation. Similarly, the mean deviation of the normal distribution is calculated from the enhanced image and its value is likewise inserted into an activation function for correct segmentation.
3 A fusion of segmented images is implemented by utilizing the additive law of probability.
4 Implemented a novel feature selection method, which initially calculates the Euclidean distance between the fused feature vectors by implementing an entropy-variance method. Only the most discriminant features are later utilized by a multi-class support vector machine for classification.

Paper organization
The chronological order of this article is as follows. The related work on skin cancer detection and classification is described in the "Related work" section. The "Methods" section explains the proposed method, which consists of several sub-steps including contrast stretching, segmentation, features extraction, features fusion, and classification. The experimental results and the conclusion of this article are described in the "Results" and "Discussion" sections.

Related work
In the last few decades, advanced techniques in different domains of medical image processing, machine learning, etc. have introduced tremendous improvements in computer aided diagnostic systems. Similarly, improvements in dermatological examination tools have led to revolutions in prognostic and diagnostic practices. The computerized feature extraction from cutaneous lesion images and feature analysis by machine learning techniques have the potential to route the conventional surgical excision diagnostic methods towards CAD systems.

In the literature, several methods have been implemented for automated detection and classification of skin cancer from dermoscopic images. Omer et al. [26] introduced an automated system for early detection of skin lesions. They utilized color features prior to global thresholding for the lesion's segmentation. The enhanced image was later subjected to the 2D Discrete Cosine Transform (DCT) and 2D Fast Fourier Transform (FFT) for feature extraction prior to the classification step. The results were tested on the publicly available dataset PH2. Barata et al. [27] described the importance of color features for the detection of skin lesions. A color sampling method is utilized with a Harris detector and its performance compared with grayscale sampling. They also compared color-SIFT (scale invariant feature transform) and SIFT features and concluded that color-SIFT features perform well as compared to SIFT. Yanyang et al. [28] introduced a novel method for melanoma detection based on Mahalanobis distance learning and graph regularized non-negative matrix factorization. The introduced method is treated as a supervised learning method; it reduces the dimensionality of the extracted set of features and improves the classification rate. The method was evaluated on the PH2 dataset and achieved improved performance. Catarina et al. [29] described a strategy of combining global and local features. The local features (bag-of-features) and global features (shape and geometric) are extracted from the original image and fused based on early fusion and late fusion. The authors claim that late fusion had never been utilized in this context, and that it gives better results as compared to early fusion. Ebtihal et al. [30] introduced a hybrid method for lesion classification using color and texture features. Four moments, namely the mean, standard deviation, degree of asymmetry and variance, are calculated for each channel and treated as features. The local binary pattern (LBP) and gray level co-occurrence matrices (GLCM) were extracted as texture features. Finally, the combined features were classified using a support vector machine (SVM). Agn et al. [31] introduced a saliency detection technique for accurate lesion detection. The introduced method resolves the problems that arise when the lesion borders are vague and the contrast between the lesion and the surrounding skin is low. The saliency method is reproduced with a sparse representation method. Further, a Bayesian network is introduced that better explains the shape and boundary of the lesion. Euijoon et al. [38] introduced a saliency based segmentation technique where the background of the original image is detected by spatial layout, which includes boundary and color information. They implemented a Bayesian framework to minimize the detection errors. Similarly, Lei et al. [32] introduced a new method of lesion detection and classification based on multi-scale lesion-biased representation (MLR). This proposed method has the advantage of detecting the lesion using different rotations and scales, compared to conventional methods with a single rotation.

Fig. 1 Proposed architecture of skin lesion detection and classification
Fig. 2 Information of original image and their respective channels: a original image; b red channel; c green channel; d blue channel
Fig. 3 Proposed contrast stretching results

From the above recent studies, we notice that color information and contrast stretching are important factors for accurate detection of the lesion in dermoscopic images, since contrast stretching methods improve the visual quality of the lesion area and thereby the segmentation accuracy. Additionally, several features have been utilized in the literature for improved classification but, to the best of our knowledge, serial based features fusion has not yet been utilized. In our case, only salient features are utilized, which are later subjected to fusion for improved classification.

Fig. 4 Proposed uniform distribution based mean segmentation results. a original image; b enhanced image; c proposed uniform based mean segmentation; d 2D contour image; e contour plot; f 3D contour plot; g lesion area

Fig. 5 Proposed normal distribution based M.D segmentation results. a original image; b enhanced image; c proposed M.D based segmentation; d 2D contour image; e contour plot; f 3D contour plot; g lesion area

Methods
A new method is proposed for lesion detection and classification using a probabilistic distribution based segmentation method and conditional entropy controlled features selection. The proposed method consists of two major steps: a) lesion identification; b) lesion classification. For lesion identification, we first enhance the contrast of the input image and then segment the lesion by implementation of a novel probabilistic distribution (uniform distribution, normal distribution). The lesion classification is done based on multiple features extraction and entropy controlled selection of the most prominent features. The detailed flow diagram of the proposed method is shown in Fig. 1.

Contrast stretching
There are numerous contrast stretching or normalization techniques [34], which attempt to improve the image contrast by stretching a specific intensity range of the pixels to a different level. Most of the available options take a gray image as input and generate an improved output gray image. In our research work, the primary objective is to acquire a three-channel RGB image having dimensions m × n × 3. Since the proposed technique can only work on a single channel of size m × n, in the proposed algorithm we separately process the red, green and blue channels.

In RGB dermoscopic images, most of the available content is visually distinguishable into the foreground, which is the infected region, and the background. This distinctness is also evident in each gray channel, as shown in Fig. 2. Considering the fact [35] that details are always high in higher-gradient regions (the foreground) and low in the background due to low gradient values, we first divide the image into equal-sized blocks and then compute weights for all regions and for each channel. For a single channel, the details are given below.

1 The gray channel is preprocessed using a Sobel edge filter to compute gradients, where the kernel size is selected to be 3 × 3.
2 Gradients are calculated for each equal-sized block and rearranged in ascending order. For each block, weights are assigned according to the gradient magnitude:

$$\zeta_w(x,y)=\begin{cases}\varsigma^{b_1} & \text{if } \upsilon_c(x,y)\le T_1\\ \varsigma^{b_2} & \text{if } T_1<\upsilon_c(x,y)\le T_2\\ \varsigma^{b_3} & \text{if } T_2<\upsilon_c(x,y)\le T_3\\ \varsigma^{b_4} & \text{otherwise}\end{cases}\tag{1}$$

where the $\varsigma^{b_i}$ (i ≤ 4) are statistical weight coefficients and the $T_k$ are gradient interval thresholds.
3 The cumulative weighted gray value is calculated for each block using:

$$N_g(z)=\sum_{i=1}^{w}\varsigma^{b_i}\,n_i(z)\tag{2}$$

where $n_i(z)$ represents the cumulative number of gray-level pixels for block i.
4 The red, green and blue channels are concatenated to produce the enhanced RGB image.

For each channel, three basic conditions are considered for an optimized solution: I) extraction of regions with maximum information; II) selection of a block size; III) an improved weighting criterion. In most dermoscopic images, the maximally informative regions lie within the range of 25 − 75%. Therefore, considering the minimum value of 25%, the number of blocks is selected to be 12 as an optimal number, with an aspect ratio of 8.3%. These blocks are later selected according to the criterion of maximal information retained (the cumulative number of pixels for each block). The Laplacian of Gaussian (LOG) method [36] is used with a sigma value of two for edge detection. Weights are assigned according to the number of edge points $E_{pi}$ for each block:

$$B_{wi}=\frac{E_{pi}}{E_{max}}\tag{3}$$

where $E_{max}$ is the block with the maximum number of edges. Finally, the intensity levels of the enhanced image are adjusted and a log operation is performed to improve the lesion region as compared to the original:

$$\varphi(AI)=\zeta(B_{wi})\tag{4}$$

$$\varphi(t)=C\times\log(\beta+\varphi(AI))\tag{5}$$

where β is a constant value (β ≤ 10), selected to be 3 for producing the most optimal results, ζ denotes the intensity adjustment operation, φ(AI) is the enhanced image after the ζ operation, and φ(t) is the final enhanced image. The final contrast stretching results are shown in Fig. 3.

Lesion segmentation
Segmentation of the skin lesion is an important task in the analysis of skin lesions due to several problems such as color variation, presence of hairs, irregularity of the lesion in the image and nicked edges. Accurate segmentation provides important cues for accurate border detection.

Fig. 6 Proposed fusion results. a original image; b fused segmented image; c mapped on fused image; d ground truth image

Fig. 7 Proposed fusion results. a original image; b proposed segmented image; c mapped on proposed image; d ground truth image; e border on proposed segmented image

Table 1 Ground truth table for z
X_1 ∈ i | X_2 ∈ j | S
0 | 0 | 0
0 | 1 | 1
1 | 0 | 1
1 | 1 | 1
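The block-weighting and log-stretch steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the threshold values, the weight coefficients, and the horizontal-block layout are all assumptions made for the sketch.

```python
import numpy as np

def sobel_magnitude(channel):
    """Gradient magnitude from 3x3 Sobel kernels (step 1 above)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    p = np.pad(channel.astype(float), 1, mode="edge")
    gx = np.zeros(channel.shape, dtype=float)
    gy = np.zeros(channel.shape, dtype=float)
    for i in range(3):
        for j in range(3):
            win = p[i:i + channel.shape[0], j:j + channel.shape[1]]
            gx += kx[i, j] * win
            gy += ky[i, j] * win
    return np.hypot(gx, gy)

def stretch_channel(channel, thresholds=(20.0, 60.0, 120.0),
                    weights=(0.6, 0.8, 1.0, 1.2), blocks=12, beta=3.0, c=1.0):
    """Block-wise weighting by gradient magnitude (Eq. 1) followed by the
    log adjustment of Eq. 5; thresholds/weights are illustrative values."""
    grad = sobel_magnitude(channel)
    out = channel.astype(float).copy()
    step = max(1, channel.shape[0] // blocks)  # split into horizontal blocks
    t1, t2, t3 = thresholds
    for r0 in range(0, channel.shape[0], step):
        m = grad[r0:r0 + step].mean()  # mean gradient magnitude of the block
        if m <= t1:
            w = weights[0]
        elif m <= t2:
            w = weights[1]
        elif m <= t3:
            w = weights[2]
        else:
            w = weights[3]
        out[r0:r0 + step] *= w
    return c * np.log(beta + out)  # Eq. (5)-style log stretch
```

Because the log is applied after weighting, low-gradient (background) blocks are compressed more than high-gradient (lesion) blocks, which is the intended enhancement effect.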
In this article, a novel method is implemented based on probabilistic distributions. The probabilistic distribution step consists of two major parts: a) uniform distribution based mean segmentation; b) normal distribution based segmentation.

Mean segmentation
The mean of the uniform distribution is calculated from the enhanced image φ(t), and a threshold function is then applied for lesion extraction. The detailed description of mean segmentation is defined below. Let t denote the enhanced dermoscopic image and f(t) denote the density of the uniform distribution, determined as $f(t)=\frac{1}{y-x}$, where y and x denote the maximum and minimum pixel values of φ(t). Then the mean value is calculated as follows:

$$\mu=\int_{x}^{y} t\,f(t)\,dt\tag{6}$$

$$=\int_{x}^{y}\frac{t}{y-x}\,dt\tag{7}$$

$$=\frac{1}{y-x}\left[\frac{t^{2}}{2}\right]_{x}^{y}\tag{8}$$

$$=\frac{(y+x)(y-x)}{2(y-x)}\tag{9}$$

$$\mu=\frac{1}{2}(y+x)\tag{10}$$

Then an activation function is performed, which is defined as follows:

$$A(\mu)=\frac{1}{1+\frac{\mu}{\varphi(t)}}+\frac{1}{2\mu}+C\tag{11}$$

$$F(\mu)=\begin{cases}1 & \text{if } A(\mu)\ge\delta_{thresh}\\ 0 & \text{if } A(\mu)<\delta_{thresh}\end{cases}\tag{12}$$

where $\delta_{thresh}$ is Otsu's threshold, α is a scaling factor which controls the lesion area and whose value is selected on the basis of the simulations performed (α ≤ 10, with α = 7 found to be the most optimal number), and C is a constant value which is randomly initialized within the range 0 to 1. The segmentation results are shown in Fig. 4.

Mean deviation based segmentation
The mean deviation (M.D) of the normal distribution is calculated from φ(t) with parameters μ and σ. The value of M.D is utilized by an activation function for extraction of the lesion from the dermoscopic images. Let t denote the enhanced dermoscopic image and f(t) denote the normal density function, determined as $f(t)=\frac{1}{\sqrt{2\pi}\,\sigma}e^{-\frac{1}{2}\left(\frac{t-\mu}{\sigma}\right)^{2}}$. Then initialize the M.D as:

$$M.D=\int_{-\infty}^{+\infty}|t-\mu|\,f(t)\,dt\tag{13}$$

$$=\int_{-\infty}^{+\infty}|t-\mu|\,\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(t-\mu)^{2}}{2\sigma^{2}}}\,dt\tag{14}$$

Then put $g=\frac{t-\mu}{\sigma}$ in Eq. 14:

$$M.D=\frac{\sigma}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}|g|\,e^{-\frac{g^{2}}{2}}\,dg\tag{15}$$

$$=\frac{\sigma}{\sqrt{2\pi}}\left(\int_{0}^{\infty}g\,e^{-\frac{g^{2}}{2}}\,dg+\int_{0}^{\infty}g\,e^{-\frac{g^{2}}{2}}\,dg\right)\tag{16}$$

$$M.D=\frac{2\sigma}{\sqrt{2\pi}}\int_{0}^{\infty}g\,e^{-\frac{g^{2}}{2}}\,dg\tag{17}$$

Put $\frac{g^{2}}{2}=l$ in Eq. 17, so that $dg=\frac{dl}{\sqrt{2l}}$, and it becomes:

$$M.D=\frac{2\sigma}{\sqrt{2\pi}}\int_{0}^{\infty}\sqrt{2l}\,e^{-l}\,\frac{dl}{\sqrt{2l}}\tag{18}$$

$$=\frac{2\sigma}{\sqrt{2\pi}}\int_{0}^{\infty}e^{-l}\,dl\tag{19}$$

$$=\sqrt{\frac{2}{\pi}}\,\sigma\left[\frac{e^{-l}}{-1}\right]_{0}^{\infty}\tag{20}$$

$$=-\sqrt{\frac{2}{\pi}}\,\sigma\left(\frac{1}{e^{\infty}}-1\right)\tag{21}$$

$$=-\sqrt{\frac{2}{\pi}}\,\sigma\,(-1)\tag{22}$$

Hence

$$M.D=0.7979\,\sigma\tag{23}$$

Then an activation function is performed to utilize M.D:

$$AC(M.D)=\frac{1}{1+\frac{M.D}{\varphi(t)}}+\frac{1}{2\,M.D}+C\tag{24}$$

$$F(M.D)=\begin{cases}1 & \text{if } AC(M.D)\ge\delta_{thresh}\\ 0 & \text{if } AC(M.D)<\delta_{thresh}\end{cases}\tag{25}$$

The segmentation results of M.D are shown in Fig. 5.

Table 2 Lesion detection accuracy as compared to ground truth values
Image description | Similarity rate | Image description | Similarity rate
IMD038 | 95.69 | IMD199 | 94.70
IMD020 | 92.52 | IMD380 | 97.94
IMD039 | 91.35 | IMD385 | 94.37
IMD144 | 88.33 | IMD392 | 94.47
IMD203 | 86.44 | IMD394 | 96.96
IMD379 | 88.41 | IMD047 | 90.07
IMD429 | 94.87 | IMD075 | 95.85
IMD211 | 92.81 | IMD078 | 94.70
IMD285 | 95.59 | IMD140 | 96.94
IMD022 | 96.02 | IMD256 | 95.82
IMD025 | 96.35 | IMD312 | 96.04
IMD042 | 91.26 | IMD369 | 96.08
IMD173 | 96.04 | IMD376 | 93.07
IMD182 | 97.97 | IMD427 | 93.14
IMD430 | 98.10 | IMD168 | 92.88
Data in bold are significant

Fig. 8 A system architecture of multiple features fusion and selection
Fig. 9 Selected channels for color features extraction

Table 3 Proposed features fusion and selection results on PH2 dataset
Method | Execution time/sec | Sensitivity (%) | Precision (%) | Specificity (%) | FNR (%) | FPR | Accuracy (%)
DT | 7 | 88.33 | 88.73 | 92.50 | 10.0 | 0.04 | 90.0
QDA | 2 | 90.83 | 89.40 | 91.20 | 9.0 | 0.04 | 91.0
Q-SVM | 2 | 95.83 | 96.60 | 98.70 | 3.0 | 0.01 | 97.0
LR | 6 | 92.10 | 92.76 | 96.96 | 6.0 | 0.02 | 94.0
N-B | 3 | 89.60 | 91.73 | 96.90 | 7.5 | 0.03 | 92.5
W-KNN | 2 | 91.67 | 92.33 | 96.20 | 6.5 | 0.02 | 93.5
EBT | 5 | 95.43 | 96.67 | 98.12 | 3.5 | 0.02 | 96.5
ESD | 10 | 94.20 | 94.53 | 97.50 | 4.5 | 0.02 | 95.5
C-KNN | 2 | 91.26 | 91.56 | 95.61 | 7.0 | 0.03 | 93.0
Multi-class SVM | 1 | 96.67 | 97.06 | 98.74 | 2.5 | 0.01 | 97.5
Data in bold are significant

Table 4 Results of individual extracted sets of features using PH2 dataset
Classification method | Features | Sensitivity (%) | Precision (%) | Specificity (%) | FNR (%) | FPR | Accuracy (%)
Decision tree | Harlick | 67.53 | 67.50 | 70.05 | 31.50 | 0.16 | 68.5
Decision tree | HOG | 71.67 | 72.1 | 85.0 | 23.0 | 0.11 | 77.0
Decision tree | Color | 87.93 | 86.93 | 86.9 | 12.5 | 0.06 | 87.5
Quadratic discriminant analysis | Harlick | 70.0 | 68.43 | 70.0 | 30.0 | 0.14 | 70.0
Quadratic discriminant analysis | HOG | 74.60 | 75.83 | 88.15 | 20.0 | 0.09 | 80.0
Quadratic discriminant analysis | Color | 84.6 | 81.9 | 80.65 | 17.0 | 0.08 | 83.0
Quadratic SVM | Harlick | 68.33 | 70.27 | 76.25 | 28.5 | 0.14 | 71.5
Quadratic SVM | HOG | 82.5 | 83.37 | 92.7 | 13.5 | 0.06 | 86.5
Quadratic SVM | Color | 93.77 | 93.33 | 94.44 | 6.0 | 0.03 | 94.0
Logistic regression | Harlick | 63.36 | 64.06 | 70.05 | 34.0 | 0.17 | 66.0
Logistic regression | HOG | 86.27 | 85.83 | 91.9 | 11.5 | 0.09 | 88.5
Logistic regression | Color | 89.2 | 90.43 | 92.55 | 9.5 | 0.04 | 90.5
Naive bayes | Harlick | 62.9 | 62.9 | 66.85 | 35.5 | 0.18 | 64.5
Naive bayes | HOG | 81.25 | 81.93 | 90.65 | 15.0 | 0.07 | 85.0
Naive bayes | Color | 87.93 | 87.63 | 90.65 | 11.0 | 0.06 | 89.0
Weighted KNN | Harlick | 66.67 | 67.5 | 72.5 | 31.0 | 0.16 | 69.0
Weighted KNN | HOG | 81.67 | 83.27 | 92.5 | 14.0 | 0.06 | 86.0
Weighted KNN | Color | 90.87 | 90.83 | 92.55 | 8.5 | 0.03 | 91.5
Ensemble boosted tree | Harlick | 68.33 | 67.77 | 68.75 | 31.5 | 0.16 | 68.5
Ensemble boosted tree | HOG | 80.67 | 82.57 | 91.3 | 15.0 | 0.07 | 85.0
Ensemble boosted tree | Color | 88.37 | 89.47 | 91.3 | 10.5 | 0.04 | 89.5
Ensemble subspace discriminant | Harlick | 68.76 | 68.4 | 71.9 | 30.0 | 0.15 | 70.0
Ensemble subspace discriminant | HOG | 87.1 | 87.03 | 91.9 | 11.0 | 0.05 | 89.0
Ensemble subspace discriminant | Color | 92.9 | 94.7 | 96.9 | 5.5 | 0.03 | 94.1
Cubic KNN | Harlick | 65.43 | 66.4 | 71.9 | 32.0 | 0.16 | 68.0
Cubic KNN | HOG | 80.4 | 80.8 | 89.4 | 16.0 | 0.07 | 84.0
Cubic KNN | Color | 90.3 | 89.83 | 91.7 | 9.5 | 0.04 | 90.5
Proposed | Harlick | 69.6 | 72.23 | 75.65 | 28.0 | 0.14 | 72.0
Proposed | HOG | 86.27 | 87.37 | 94.4 | 10.5 | 0.02 | 89.5
Proposed | Color | 94.6 | 93.97 | 94.4 | 5.5 | 0.02 | 94.5

Image fusion
Image fusion means combining the information of two or more images into one resultant image, which contains better information than any individual source image. Image fusion reduces the redundancy between two or more images and increases the clinical applicability for diagnosis. In this work, we implemented a union-based fusion of two segmented images into one image. The resultant image is more accurate and carries more information than either individual image. Suppose N denotes the sample space, which contains 200 dermoscopic images. Let X_1 ∈ F(μ) be the mean segmented image and X_2 ∈ F(M.D) be the M.D based segmented image. Let i denote the pixel values of X_1, j denote the pixel values of X_2, and S denote those i and j pixels whose values are 1; i.e., all 1-valued pixels fall in S. Then X_1 ∪ X_2 is written as:

$$X_1\cup X_2=(X_1\cup X_2)\cap\phi\tag{26}$$

$$P(X_1\cup X_2)=P((X_1\cup X_2)\cap\phi)\tag{27}$$

$$=\begin{cases}\xi((X_1,X_2)==1) & \text{if } (i,j)\in z_1\\ \xi((X_1,X_2)==0) & \text{if } (i,j)\in z_2\end{cases}\tag{28}$$

where z is represented by the ground truth Table 1. Hence

$$\Phi(t)=\begin{cases}1 & \text{if } i,j\ge 1\\ 0 & \text{otherwise}\end{cases}\tag{29}$$

$$P(X_1\cup X_2)=P(X_1)+P(X_2)-P(\phi)\tag{30}$$

where P(φ) denotes the 0 values, which represent the background, and 1 denotes the lesion. The graphical results after fusion are shown in Fig. 6.

Analysis
In this section, we analyze our segmentation results in terms of accuracy, or similarity index, as compared to the given ground truth values. We randomly select images from the PH2 dataset and show their results in tabular and graphical form.

Table 5 Confusion matrix for PH2 dataset
Confusion matrix: Proposed features fusion and selection
Class | Tested images | Melanoma | Benign | Carcinoma
Melanoma | 20 | 92.5% | 7.5% | -
Benign | 40 | 2.5% | 97.5% | -
Carcinoma | 40 | - | - | 100%
Confusion matrix: Harlick features
Class | Total images | Melanoma | Benign | Carcinoma
Melanoma | 20 | 57.5% | 35% | 7.5%
Benign | 40 | 8.8% | 68.8% | 22.5%
Carcinoma | 40 | 3.8% | 13.8% | 82.5%
Confusion matrix: HOG features
Class | Total images | Melanoma | Benign | Carcinoma
Melanoma | 20 | 70% | 30% | -
Benign | 40 | 10% | 88.8% | 1.3%
Carcinoma | 40 | - | - | 100%
Confusion matrix: Color features
Class | Total images | Melanoma | Benign | Carcinoma
Melanoma | 20 | 95% | 5.0% | -
Benign | 40 | 3.8% | 95% | 1.3%
Carcinoma | 40 | 1.3% | 5.0% | 93.8%

The proposed segmentation results are directly
compared to the ground truth images, as shown in Fig. 7. The testing accuracy against each selected dermoscopic image is depicted in Table 2. From Table 2, the accuracy for most images is above 90%, and the maximum similarity rate is 98.10. From our analysis, the proposed segmentation results perform well as compared to existing methods [31, 37–39] in terms of border detection rate.

Table 6 PH2 dataset: Comparison of proposed algorithm with existing methods
Method | Year | Sensitivity % | Specificity % | Accuracy %
Abuzaghleh et al. [26] | 2014 | - | - | 91
Barata et al. [27] | 2013 | 85 | 87 | 87
Abuza et al. [43] | 2015 | - | - | 96.5
Kruck et al. [44] | 2015 | 95 | 88.1 | -
Rula et al. [45] | 2017 | 96 | 83 | -
Waheed et al. [46] | 2017 | 97 | 84 | 96
Sath et al. [47] | 2017 | 96 | 97 | -
GUU et al. [48] | 2017 | 94.43 | 81.01 | -
Lei et al. [49] | 2016 | 87.50 | 93.13 | 92.0
MRastagoo et al. [50] | 2015 | 94 | 92 | -
Proposed | 2017 | 96.67 | 98.7 | 97.5
Data in bold are significant

Image representation
Three types of features are extracted for the representation of an input image. The basic purpose of feature extraction is to find a combination of the most efficient features for classification. The performance on dermoscopic images mostly depends on the quality and consistency of the selected features. In this work, three types of features are extracted, namely color, texture and HOG, for classification of the skin lesion.

HOG features
The Histogram of Oriented Gradients (HOG) features were originally introduced by Dalal [40] in 2005 for human detection. The HOG features are also called shape based features because they work on the shape of the object. In our case, the HOG features are extracted from the segmented skin lesion and work efficiently because every segmented lesion has its own shape. As shown in Fig. 8, the HOG features are extracted from the segmented lesion and we obtain a feature vector of size 1 × 3780, because the size of the segmented image is 96 × 128 and the size of the bins is 8 × 8. The size of the extracted features is too high, and it affects the classification accuracy. For this reason, we implement a weighted conditional entropy with PCA (principal component analysis) on the extracted feature vector. The PCA returns a score against each feature, and the weighted entropy is then utilized to reduce the feature space and select the 200 maximum-score features. The weighted conditional entropy is defined as:

$$E_W=\sum_{i=1}^{K}\sum_{j=1}^{K}W_{i,j}\,P(i,j)\log\frac{P(i)}{P(i,j)}\tag{31}$$

where i and j denote the current and next feature respectively, $W_{i,j}$ denotes the weights of the selected features, selected between 0 and 1 (0 ≤ $W_{i,j}$ ≤ 1), and $P(i,j)=\frac{W_{ij}\,n_{ij}}{\sum_{ij}W_{ij}\,n_{ij}}$. Hence the new reduced vector size is 1 × 200.

Harlick features
Texture information of an input image is an important component, which is utilized to identify the region of interest such as a lesion. For texture information of the lesion, we extract the harlick features [41]. The harlick features are extracted from the segmented image as shown in Fig. 8. In total, 14 texture features are implemented (i.e. autocorrelation, contrast, cluster prominence, cluster shade, dissimilarity, energy, entropy, homogeneity 1, homogeneity 2, maximum probability, average, variances, inverse difference normalized and inverse difference moment normalized) and a feature vector of size 1 × 14 is created. After calculating the mean, range and variance of each feature, the final vector has size 1 × 42.

Color features
The color information of the region of interest has attained strong prevalence for the classification of lesions as malignant or benign. The color features provide quick processing and are deeply robust to geometric variations of lesion patterns. Three color spaces are utilized for color feature extraction, namely RGB, HSI, and LAB. As shown in Fig. 9, the mean, variance, skewness and kurtosis are calculated for each selected channel. From Fig. 8 it is clearly shown that 1 × 12 features are extracted from each color space, and the total features of the three color spaces have dimension 1 × 36.

Table 7 Proposed features fusion and selection results on ISIC-MSK dataset
Method | Sensitivity (%) | Precision (%) | Specificity (%) | FNR (%) | FPR | Accuracy (%)
Decision tree | 92.95 | 93.1 | 94.30 | 6.9 | 0.07 | 93.1
Quadratic discriminant analysis | 95.95 | 95.45 | 91.90 | 4.5 | 0.04 | 95.5
Quadratic SVM | 96.25 | 96.10 | 95.60 | 3.8 | 0.03 | 96.2
Logistic regression | 95.10 | 95.10 | 95.60 | 4.8 | 0.04 | 95.2
Naive bayes | 92.80 | 93.30 | 95.60 | 6.9 | 0.07 | 93.1
Weighted KNN | 95.10 | 95.10 | 95.60 | 4.8 | 0.04 | 95.2
Ensemble boosted tree | 95.10 | 95.10 | 95.60 | 4.80 | 0.04 | 95.2
Ensemble subspace discriminant | 95.10 | 95.10 | 95.60 | 4.8 | 0.04 | 95.2
Cubic KNN | 89.35 | 90.65 | 95.60 | 10.0 | 0.10 | 90.0
Proposed | 96.60 | 97.0 | 98.30 | 2.8 | 0.01 | 97.2
Data in bold are significant

Table 8 Results for individual extracted sets of features using ISIC-MSK dataset
Classifier | Features | Sensitivity % | Precision % | Specificity | FNR % | FPR | Accuracy %
DT | Color | 89.4 | 89.65 | 0.919 | 10.3 | 0.105 | 89.7
DT | HOG | 92.25 | 93.10 | 0.944 | 6.9 | 0.06 | 93.1
DT | Harlick | 80.95 | 82.15 | 0.888 | 18.3 | 0.18 | 81.7
QDA | Color | 86.05 | 86.05 | 0.875 | 13.8 | 0.13 | 86.2
QDA | HOG | 94.30 | 93.85 | 0.894 | 6.2 | 0.05 | 93.8
QDA | Harlick | 70.73 | 73.25 | 0.769 | 26.6 | 0.26 | 73.4
Q-SVM | Color | 95.6 | 95.75 | 0.956 | 4.1 | 0.03 | 95.9
Q-SVM | HOG | 95.5 | 95.46 | 0.956 | 4.5 | 0.04 | 95.5
Q-SVM | Harlick | 82.05 | 82.3 | 0.856 | 17.6 | 0.17 | 82.4
LR | Color | 92.05 | 92.7 | 0.956 | 7.6 | 0.07 | 92.4
LR | HOG | 95.1 | 95.1 | 0.956 | 4.8 | 0.04 | 95.2
LR | Harlick | 81.45 | 82.25 | 0.875 | 17.9 | 0.18 | 82.1
N-B | Color | 90.9 | 91.8 | 0.956 | 8.6 | 0.08 | 91.4
N-B | HOG | 93.95 | 94.2 | 0.956 | 5.9 | 0.05 | 94.1
N-B | Harlick | 82.2 | 83.95 | 0.913 | 16.9 | 0.03 | 83.1
W-KNN | Color | 90.9 | 91.9 | 0.956 | 8.6 | 0.08 | 91.4
W-KNN | HOG | 93.95 | 94.2 | 0.956 | 5.9 | 0.05 | 94.1
W-KNN | Harlick | 81.15 | 84.2 | 0.938 | 17.6 | 0.08 | 82.4
EBT | Color | 91.45 | 91.85 | 0.994 | 8.3 | 0.08 | 91.7
EBT | HOG | 93.35 | 93.4 | 0.944 | 6.6 | 0.06 | 93.4
EBT | Harlick | 81.45 | 82.25 | 0.875 | 17.9 | 0.18 | 82.1
ESD | Color | 86.95 | 88.05 | 0.931 | 12.4 | 0.125 | 87.6
ESD | HOG | 95.5 | 95.45 | 0.956 | 4.5 | 0.04 | 95.5
ESD | Harlick | 78.0 | 79.5 | 0.875 | 21.0 | 0.21 | 79.0
Cubic KNN | Color | 93.25 | 93.5 | 0.95 | 6.6 | 0.06 | 93.4
Cubic KNN | HOG | 93.15 | 92.7 | 0.973 | 7.2 | 0.07 | 92.8
Cubic KNN | Harlick | 76.6 | 76.6 | 0.788 | 23.1 | 0.23 | 76.9
Proposed | Color | 95.85 | 95.85 | 0.963 | 4.1 | 0.03 | 95.9
Proposed | HOG | 97.1 | 96.75 | 0.963 | 3.8 | 0.02 | 96.2
Proposed | Harlick | 82.55 | 84.7 | 0.913 | 16.6 | 0.13 | 83.4

Features fusion
The goal of feature fusion is to create a new feature vector which contains more information than any individual feature vector. Different types of features extracted from the same image always indicate the distinct
The combination of these features effectively discriminates the information of the extracted features and also eliminates the redundant information between them; this elimination of redundancy provides improved classification performance. In this work, we implemented a parallel features fusion technique. The implemented technique efficiently fuses all extracted features and also removes the redundant information between them. The fusion process is detailed as follows. Suppose C1, C2, and C3 are known lesion classes (i.e. melanoma, atypical nevi, and benign). Let Ψ = {ψ | ψ ∈ R} denote the testing images. Given the three extracted feature sets D = {α | α ∈ R^h}, E = {j | j ∈ R^t}, and F = {o | o ∈ R^c}, where α, j, and o are the three feature vectors (i.e. HOG, texture, and color), the parallel fusion is defined as:

F_P = {(α1, α2, ..., αd), (j1, j2, ..., jd), (o1, o2, ..., od)}    (32)

where d denotes the dimension of the extracted set of features. The dimensions of the extracted feature vectors are HOG (1 × 200), texture (1 × 42), and color (1 × 36). The fused vector is then defined as:

ϒF = {α + ιj, α + ιo | α ∈ D, j ∈ E, o ∈ F}    (33)

It is an n-dimensional complex vector, where n = max(d(D), d(E), d(F)). From the previous expression, HOG has the maximum dimension, 1 × 200; hence the E and F feature vectors are made equal in size to the D vector by appending zeros. For example, below are three given feature vectors:

D = (0.2 0.7 0.9 0.11 0.10 0.56 ... 0.90)
E = (0.1 0.3 0.5 0.17 0.15)    (34)
F = (0.3 0.17 0.93 0.15)

They are brought to the same size by zero padding:

D = (0.2 0.7 0.9 0.11 0.10 0.56 ... 0.90)
E = (0.1 0.3 0.5 0.17 0.15 0.0 ... 0.0)    (35)
F = (0.3 0.17 0.93 0.15 0.0 0.0 ... 0.0)

Finally, a novel feature selection technique is implemented on the fused feature vector to select the most prominent features for classification.

Table 9 Confusion matrices for all sets of extracted features using ISIC-MSK dataset

| Features | Class | Total images | Melanoma | Benign |
| Proposed fusion and selection | Melanoma | 130 | 99.2% | 1% |
| | Benign | 160 | 4.4% | 95.6% |
| Harlick | Melanoma | 130 | 73.8% | 26.2% |
| | Benign | 160 | 8.8% | 91.3% |
| HOG | Melanoma | 130 | 99.2% | 0.8% |
| | Benign | 160 | 5.0% | 95.0% |
| Color | Melanoma | 130 | 96.2% | 3.8% |
| | Benign | 160 | 3.8% | 96.3% |

Features selection

The motivation behind the implementation of a feature selection technique is to select the most prominent features, improving accuracy while also making the system fast in terms of execution time. The major reasons for feature selection are: a) utilizing only a selected group of prominent features increases classification accuracy through the elimination of irrelevant features; b) a miniature group of features is discovered that maximally increases the performance of the proposed method; c) a group of features is selected from the high-dimensional feature set for a dense and detailed data representation. In this work, a novel entropy-variance based feature selection method is implemented. The proposed method performs two steps. First, it calculates the Bhattacharyya distance on the fused feature vector. The Bhattacharyya distance finds the closeness between two features; it is utilized for classification of lesion classes and is also considered more reliable than the Euclidean distance. Second, it implements an entropy-variance method on the closeness features and selects the most prominent features based on their maximum values.

Table 10 Proposed features fusion and feature selection results on ISIC-UDA dataset

| Method | Sensitivity (%) | Precision (%) | Specificity (%) | FNR (%) | FPR | Accuracy (%) |
| DT | 87.25 | 90.65 | 97.1 | 10.7 | 0.12 | 89.3 |
| QDA | 79.75 | 88.60 | 99.3 | 16.3 | 0.19 | 83.7 |
| QSVM | 98.05 | 98.40 | 99.3 | 1.7 | 0.02 | 98.3 |
| LR | 94.8 | 96.35 | 99.3 | 4.3 | 0.04 | 95.7 |
| N-B | 88.5 | 91.00 | 96.4 | 9.9 | 0.10 | 90.1 |
| W-KNN | 83.85 | 91.20 | 100 | 12.9 | 0.16 | 87.1 |
| EBT | 95.2 | 95.85 | 97.9 | 4.3 | 0.4 | 95.7 |
| E-S-D | 89.6 | 89.75 | 92.1 | 9.9 | 0.09 | 90.1 |
| L-KNN | 81.7 | 90.25 | 100 | 14.6 | 0.18 | 85.4 |
| Proposed | 97.85 | 98.60 | 100 | 1.7 | 0.02 | 98.3 |

Data in bold are significant
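The zero-padding and parallel fusion of Eqs. (32)-(35) can be sketched in a few lines (an illustration under our reading of the equations, not the authors' code; the toy vectors mirror Eq. (34)):

```python
# Sketch of the parallel feature fusion of Eqs. (32)-(35): pad the
# shorter vectors with zeros up to n = max(d(D), d(E), d(F)), then pair
# each HOG value with a texture and a color value as complex entries
# (alpha + i*j, alpha + i*o). Toy values only, not the authors' code.

def zero_pad(vec, n):
    """Extend vec with zeros until it has n entries (Eq. 35)."""
    return vec + [0.0] * (n - len(vec))

def parallel_fuse(D, E, F):
    n = max(len(D), len(E), len(F))
    D, E, F = zero_pad(D, n), zero_pad(E, n), zero_pad(F, n)
    return [(complex(a, j), complex(a, o)) for a, j, o in zip(D, E, F)]

# Toy vectors shaped like Eq. (34); in the paper D, E, F are the
# 1 x 200 HOG, 1 x 42 texture, and 1 x 36 color vectors.
D = [0.2, 0.7, 0.9, 0.11, 0.10, 0.56]
E = [0.1, 0.3, 0.5, 0.17, 0.15]
F = [0.3, 0.17, 0.93, 0.15]
fused = parallel_fuse(D, E, F)
```

Each fused entry keeps the HOG value as the real part and a texture or color value as the imaginary part, so no information from the individual vectors is discarded.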
Entropy, in a nutshell, is the uncertainty measurement associated with the initialization of the closeness features. The base classifier is highly dependent on its initial conditions for fast convergence and accurate approximation, and the selected closeness features should have the maximum entropy value. To the best of our knowledge, entropy, especially in conjunction with the Bhattacharyya distance and variance, has never been adopted for the selection of the most prominent features. Let f_i and f_{i+1} be two features of the fused vector ϒF. The Bhattacharyya distance is calculated as:

B_d = −ln( Σ_{u ∈ ϒF} √( f_i(u) · f_{i+1}(u) ) )    (36)

Then entropy-variance is performed on the closeness vector to find the best features based on their maximum entropy value:

E_V B_d = − [ ln(f_{i+1} + σ²) / ( ln(f_i + σ²) + ln(f_i − σ²) ) ] Σ_{f=1} (H_f/δH) log(H_f/δH)    (37)

δH = Σ_{f=0}^{ϒ_i − 1} H_f    (38)

where H denotes the closeness set of features. Hence the size of the selected feature vector is 1 × 172. The selected vector is fed to a multi-class SVM for classification of the lesion (i.e. melanoma, benign); the one-against-all multi-class SVM [42] is utilized for classification.

Table 11 Results for individual extracted sets of features using ISIC-UDA dataset (rows labelled by feature set, following the Color/HOG/Harlick column order of the original)

| Method | Features | Sensitivity (%) | Precision (%) | Specificity (%) | FNR (%) | FPR | Accuracy (%) |
| Decision tree | Color | 72.75 | 77.4 | 90.7 | 23.6 | 0.62 | 76.4 |
| Decision tree | HOG | 70.15 | 69.4 | 69.3 | 30.0 | 0.30 | 70.0 |
| Decision tree | Harlick | 86.55 | 87.35 | 91.4 | 12.4 | 0.13 | 87.6 |
| QDA | Color | 74.04 | 74.04 | 79.3 | 24.9 | 0.21 | 75.1 |
| QDA | HOG | 77.4 | 88.45 | 100 | 18.0 | 0.22 | 82.0 |
| QDA | Harlick | 82.65 | 83.15 | 87.9 | 16.3 | 0.17 | 83.7 |
| QSVM | Color | 73.7 | 77.25 | 89.3 | 23.2 | 0.73 | 76.8 |
| QSVM | HOG | 81.35 | 89.3 | 99.3 | 15.0 | 0.18 | 85.0 |
| QSVM | Harlick | 94.45 | 95.8 | 98.6 | 4.7 | 0.05 | 95.3 |
| LR | Color | 68.5 | 68.35 | 73.6 | 30.5 | 0.31 | 69.5 |
| LR | HOG | 78.5 | 88.9 | 100 | 17.2 | 0.21 | 82.8 |
| LR | Harlick | 93.4 | 94.65 | 97.1 | 5.6 | 0.05 | 94.4 |
| N-B | Color | 69.4 | 69.95 | 78.6 | 28.8 | 0.30 | 71.2 |
| N-B | HOG | 76.7 | 76.7 | 81.4 | 22.3 | 0.22 | 77.7 |
| N-B | Harlick | 86.0 | 89.05 | 95.7 | 12.0 | 0.13 | 88.0 |
| W-KNN | Color | 74.04 | 77.9 | 90.0 | 22.7 | 0.21 | 77.3 |
| W-KNN | HOG | 80.8 | 87.15 | 97.1 | 15.9 | 0.17 | 84.1 |
| W-KNN | Harlick | 88.55 | 92.3 | 98.6 | 9.4 | 0.11 | 90.6 |
| EBT | Color | 71.35 | 71.8 | 79.3 | 27.0 | 0.23 | 73.0 |
| EBT | HOG | 80.8 | 83.8 | 92.9 | 17.2 | 0.17 | 82.8 |
| EBT | Harlick | 90.5 | 91.55 | 95.0 | 8.6 | 0.09 | 91.4 |
| ESD | Color | 69.95 | 71.6 | 82.9 | 27.5 | 0.30 | 72.5 |
| ESD | HOG | 60.2 | 74.5 | 85.0 | 24.9 | 0.27 | 75.1 |
| ESD | Harlick | 83.9 | 86.5 | 93.6 | 14.2 | 0.15 | 85.8 |
| Cubic KNN | Color | 71.7 | 74.4 | 86.4 | 25.3 | 0.23 | 74.7 |
| Cubic KNN | HOG | 80.15 | 87.4 | 97.9 | 16.3 | 0.19 | 83.7 |
| Cubic KNN | Harlick | 85.5 | 90.2 | 97.9 | 12.0 | 0.14 | 88.0 |
| Proposed | Color | 73.65 | 78.5 | 91.4 | 22.7 | 0.22 | 77.3 |
| Proposed | HOG | 82.6 | 87.55 | 96.4 | 14.6 | 0.15 | 85.4 |
| Proposed | Harlick | 95.2 | 95.85 | 97.9 | 4.3 | 0.04 | 95.7 |

Datasets & results

PH2 dataset

The PH2 dataset [51] consists of 200 RGB dermoscopic images of resolution 768 × 560. This dataset has three main divisions: a) melanoma; b) benign; c) common nevi, with 40 melanoma, 80 benign, and 80 common nevi images. For validation, a 50:50 strategy is adopted for training and testing of the proposed method. Four experiments are performed on the different feature sets (i.e. Harlick features, color features, HOG features, and the proposed features fusion and selection method) to compare the individual sets of features with the proposed feature set. The proposed features fusion and selection results with the entropy-variance method are depicted in Table 3. The proposed method obtains maximum accuracy of 97.06%, sensitivity of 96.67%, specificity of 98.74%, precision of 97.06%, and FPR of 0.01. The results for the individual feature sets, without the feature selection algorithm, are depicted in Table 4. The results of Tables 3 and 4 are confirmed by their confusion matrices in Table 5, which show that the proposed features fusion and selection method performs efficiently on the base classifier as compared to the other classification methods.
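For reference, the two quantities driving the selection step of Eqs. (36)-(38), the Bhattacharyya closeness between two normalized feature distributions and a Shannon entropy score, can be sketched as follows (illustrative only; the toy distributions are hypothetical and this is not the authors' exact entropy-variance criterion):

```python
# Sketch of the two quantities behind the selection step: the
# Bhattacharyya distance of Eq. (36) between two normalized feature
# distributions, and a Shannon entropy score. Toy distributions;
# not the authors' exact entropy-variance criterion.
import math

def normalize(v):
    s = sum(v)
    return [x / s for x in v]

def bhattacharyya(p, q):
    """B_d = -ln( sum_u sqrt(p(u) * q(u)) ); 0 for identical inputs."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return -math.log(bc)

def entropy(p):
    """Shannon entropy (bits); selection keeps high-entropy features."""
    return -sum(x * math.log2(x) for x in p if x > 0)

p = normalize([1, 1, 1, 1])   # uniform toy feature distribution
q = normalize([2, 1, 1, 0])   # skewed toy feature distribution
d = bhattacharyya(p, q)       # small positive closeness value
```

A small Bhattacharyya distance indicates two features carry nearly the same information, which is exactly the redundancy the selection step tries to prune.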
The comparison of the proposed method on the PH2 dataset is also given in Table 6, which shows the authenticity of the proposed method.

Results

Evaluation protocol

The proposed method is evaluated on three publicly available dataset collections: PH2, ISIC (i.e. ISIC MSK-2 and ISIC UDA), and the combined ISBI set (ISBI 2016 and ISBI 2017). The proposed method is a conjunction of two primary steps: a) lesion identification; b) lesion classification (i.e. melanoma, benign, atypical nevi). The lesion identification results are discussed in their own section; in this section, we discuss the proposed lesion classification results. For classification, three types of features are extracted (i.e. texture, HOG, and color). Experimental results are obtained on each feature set individually and then compared with the proposed (fused) feature vector. The multi-class SVM is selected as the base classifier, and its results are compared with nine classification methods: decision tree (DT), quadratic discriminant analysis (QDA), quadratic SVM (Q-SVM), logistic regression (LR), Naive Bayes, weighted K-Nearest Neighbor (W-KNN), ensemble boosted tree (EBT), ensemble subspace discriminant (ESDA), and cubic KNN (C-KNN). Seven measures are calculated to test the performance of the proposed method: sensitivity, specificity, precision, false negative rate (FNR), false positive rate (FPR), accuracy, and the execution time for one image. The proposed method is implemented in MATLAB 2017a on a personal computer with a Core i7 CPU and 16 GB of RAM.

ISIC dataset

The ISIC dataset [52] is an institutional database often used in skin cancer research. It is an open-source database of high-quality RGB dermoscopic images of resolution 1022 × 1022. ISIC incorporates many sub-datasets, of which we selected: a) ISIC MSK-2 and b) ISIC-UDA. From the ISIC MSK-2 dataset, we collected 290 images, having 130 melanoma and 160 benign. For validation of the proposed algorithm, we performed four experiments on different types of features (i.e. Harlick features, color features, HOG features, and the proposed features fusion and selection vector).

Table 12 Confusion matrices for all sets of extracted features using ISIC-UDA dataset

| Features | Class | Total images | Melanoma | Benign |
| Proposed fusion and selection | Melanoma | 93 | 95.7% | 4.3% |
| | Benign | 140 | – | 100% |
| Harlick | Melanoma | 93 | 55.9% | 44.1% |
| | Benign | 140 | 8.6% | 91.4% |
| HOG | Melanoma | 93 | 68.8% | 31.2% |
| | Benign | 140 | 3.6% | 96.4% |
| Color | Melanoma | 93 | 92.5% | 7.5% |
| | Benign | 140 | 2.1% | 97.9% |

Table 13 Classification results on ISBI 2016 dataset

| Method | Sensitivity (%) | Precision (%) | Specificity (%) | FNR (%) | FPR | Accuracy (%) | AUC |
| DT | 63.0 | 62.0 | 79.0 | 28.5 | 0.370 | 71.5 | 0.63 |
| QDA | 68.0 | 65.5 | 79.0 | 26.4 | 0.320 | 73.6 | 0.74 |
| Q-SVM | 68.5 | 78.5 | 95.0 | 17.7 | 0.315 | 82.3 | 0.81 |
| LR | 67.0 | 65.0 | 79.0 | 26.1 | 0.330 | 72.9 | 0.69 |
| NB | 74.5 | 77.0 | 91.5 | 17.1 | 0.255 | 82.9 | 0.84 |
| W-KNN | 70.5 | 75.0 | 91.0 | 18.7 | 0.295 | 81.3 | 0.83 |
| EBT | 66.0 | 80.0 | 97.0 | 18.3 | 0.034 | 81.7 | 0.79 |
| ESDA | 72.5 | 55.0 | 90.0 | 18.5 | 0.275 | 81.5 | 0.83 |
| Proposed | 75.5 | 78.0 | 93.0 | 16.8 | 0.270 | 83.2 | 0.85 |

Data in bold are significant

ISBI 2016 & ISBI 2017

These datasets, ISBI 2016 [52] and ISBI 2017 [53], are based on the ISIC archive, the largest publicly available collection of quality-controlled dermoscopic images for skin lesions.
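The performance measures listed in the evaluation protocol above follow the standard confusion-matrix definitions; as a quick reference (a sketch with made-up counts, not the authors' code):

```python
# Sketch of the performance measures reported in the result tables,
# computed from raw confusion-matrix counts (made-up counts below).

def measures(tp, fn, fp, tn):
    return {
        "sensitivity": tp / (tp + fn),            # true positive rate
        "specificity": tn / (tn + fp),            # true negative rate
        "precision":   tp / (tp + fp),
        "fnr":         fn / (fn + tp),            # false negative rate
        "fpr":         fp / (fp + tn),            # false positive rate
        "accuracy":    (tp + tn) / (tp + fn + fp + tn),
    }

# Hypothetical run: 90 of 100 melanomas and 152 of 160 benign correct
m = measures(tp=90, fn=10, fp=8, tn=152)
```

With melanoma as the positive class, sensitivity tracks missed melanomas (the clinically costly error) while specificity tracks benign lesions flagged unnecessarily.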
The two datasets contain separate training and testing RGB samples of different resolutions: ISBI 2016 contains 1279 images (273 melanoma and 1006 benign), with 900 images for training and 350 for testing the algorithm, while ISBI 2017 contains a total of 2750 images (517 melanoma and 2233 benign), including 2000 training and 750 testing images.

For the ISIC MSK-2 dataset, the other classification methods are compared with the base classifier (multi-class SVM). The proposed features fusion and selection results are shown in Table 7, with maximum accuracy of 97.2%, sensitivity of 96.60%, and specificity of 98.30% on the base classifier. The individual feature set results are depicted in Table 8, where the base classifier (multi-class SVM) again performs well compared to the other methods. The base classifier results are confirmed by the confusion matrices given in Table 9.

From the ISIC UDA dataset, we selected a total of 233 images, having 93 melanoma and 140 benign. The proposed method results are depicted in Table 10, with maximum accuracy of 98.3% and specificity of 100% on the base classifier. The results on the individual feature sets are depicted in Table 11; comparing Tables 10 and 11 shows that the proposed features fusion and selection method performs significantly better than the individual sets. The base classifier results are confirmed by the confusion matrices given in Table 12, which shows the authenticity of the proposed method.

For the ISBI experiments, experiments are first performed on each dataset separately, obtaining classification accuracies of 83.2% and 88.2% on ISBI 2016 and ISBI 2017, respectively. The classification results are given in Tables 13 and 14 and are supported by the confusion matrices given in Table 16. After that, both datasets are combined and 10-fold cross-validation is performed. The maximum classification accuracy of 93.2% is achieved with the multi-class SVM, as presented in Table 15, and is also confirmed by the confusion matrix given in Table 16.

Table 14 Classification results on ISBI 2017 dataset

| Method | Sensitivity (%) | Precision (%) | Specificity (%) | FNR (%) | FPR | Accuracy (%) | AUC |
| DT | 74.5 | 75.0 | 77 | 25.5 | 0.255 | 74.8 | 0.77 |
| QDA | 77.5 | 78.0 | 81 | 22.5 | 0.254 | 77.6 | 0.78 |
| Q-SVM | 86.5 | 86.5 | 87 | 13.8 | 0.135 | 86.2 | 0.92 |
| LR | 84.5 | 84.5 | 86 | 15.4 | 0.135 | 84.6 | 0.92 |
| NB | 79.5 | 80.0 | 83 | 21.5 | 0.212 | 79.5 | 0.80 |
| W-KNN | 87.5 | 88.0 | 88 | 12.2 | 0.125 | 87.8 | 0.92 |
| EBT | 86.0 | 83.5 | 92 | 14.2 | 0.140 | 85.8 | 0.91 |
| ESDA | 83.5 | 83.5 | 87.0 | 16.5 | 0.165 | 83.5 | 0.90 |
| Proposed | 88.5 | 88.0 | 91.0 | 11.8 | 0.120 | 88.2 | 0.93 |

Data in bold are significant

Table 15 Classification results on the combined ISBI 2016 & ISBI 2017 dataset

| Method | Sensitivity (%) | Precision (%) | Specificity (%) | FNR (%) | FPR | Accuracy (%) | AUC |
| DT | 87.5 | 88.0 | 86.0 | 12.4 | 0.125 | 87.6 | 0.86 |
| QDA | 80.0 | 80.0 | 79.0 | 20.0 | 0.200 | 80.0 | 0.86 |
| QSVM | 92.5 | 92.5 | 95.0 | 7.4 | 0.075 | 92.6 | 0.95 |
| LR | 92.0 | 91.5 | 95.0 | 8.2 | 0.08 | 91.8 | 0.95 |
| NB | 92.0 | 92.5 | 97.0 | 8.2 | 0.08 | 91.8 | 0.93 |
| W-KNN | 88.5 | 88.5 | 91.0 | 11.6 | 0.115 | 88.4 | 0.88 |
| EBT | 92.0 | 92.0 | 97.0 | 8.3 | 0.08 | 91.7 | 0.95 |
| ESDA | 89.5 | 89.5 | 91.5 | 10.4 | 0.105 | 89.6 | 0.94 |
| Proposed | 93.0 | 93.5 | 97.0 | 6.8 | 0.07 | 93.2 | 0.96 |

Data in bold are significant

Overall, comparing the individual feature results with the proposed features fusion and selection (Tables 3, 7, and 10) shows that the proposed method performs significantly better in terms of classification accuracy and execution time; the base classifier results are further confirmed by the confusion matrices given in Tables 5, 9, and 12.

The proposed method is also compared with [54], which achieved maximum classification accuracy of 85.5%, AUC of 0.826, sensitivity of 0.853, and specificity of 0.993 on the ISBI 2016 dataset. Our method achieves classification accuracy of 93.2%, AUC of 0.96, sensitivity of 0.930, and specificity of 0.970, which confirms the authenticity and efficiency of our algorithm on the combined dataset compared to [54].
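The 10-fold cross-validation used on the combined ISBI 2016 & 2017 set can be sketched as follows (a plain-Python illustration, not the authors' MATLAB code; shuffling and stratification are omitted):

```python
# Sketch of 10-fold cross-validation as used on the combined
# ISBI 2016 & 2017 set: every sample is tested exactly once while the
# other nine folds train the model (shuffling/stratification omitted).

def kfold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs covering all samples once."""
    idx = list(range(n_samples))
    folds = [idx[i::k] for i in range(k)]   # k roughly equal folds
    for i in range(k):
        test = set(folds[i])
        train = [j for j in idx if j not in test]
        yield train, sorted(test)

splits = list(kfold_indices(20, k=10))
```

Averaging the per-fold measures then gives the reported cross-validated accuracy.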
Moreover, [55] reported a maximum AUC of 0.94 for skin cancer classification on 130 melanoma images, whereas our method achieved an AUC of 0.96 on 315 melanoma images. In [56] and [57], the classification accuracies achieved are 85.0% and 81.33% on the ISBI 2016 dataset. Upon comparison with [54-57], the proposed method performs significantly better on both ISBI 2016 & 2017 datasets.

Discussion

In this section, we summarize our proposed method in terms of tabular and visual results. The proposed method consists of two major steps: a) lesion identification; b) lesion classification, as shown in Fig. 1. The lesion identification phase has two major parts, enhancement and segmentation. The lesion enhancement results are shown in Fig. 3, which demonstrates the efficiency of the introduced technique. The lesion segmentation results are then given quantitatively and in tabular form in Table 2 and Figs. 4, 5, 6 and 7. After this, multi-level features are extracted and fused based on the parallel strategy. Then the novel feature selection technique is performed on the fused feature vector to select the best features, as shown in Fig. 8. Finally, the selected features are utilized by the multi-class SVM, which is selected as the base classifier. The purpose of features fusion and selection is to improve the classification accuracy and also make the system more efficient. Three publicly available dataset collections are utilized for classification: PH2, ISIC, and the Combined dataset (ISBI 2016 and ISBI 2017). The individual feature results on the selected datasets are presented in Tables 4, 8, and 11 and compared with the proposed features fusion and selection results. Also, the comparison of the PH2 dataset with existing methods is presented in Table 6, which shows the efficiency of the proposed method. Moreover, the proposed method is evaluated on the combination of the ISBI 2016 and ISBI 2017 datasets and achieves classification accuracy of 93.2%, as presented in Table 15. The classification accuracy of the proposed method on the Combined dataset is confirmed by the confusion matrix given in Table 16, which shows the authenticity of the proposed method as compared to existing methods.

Table 16 Confusion matrices for ISBI 2016, ISBI 2017, and Combined datasets

ISBI 2016
| Class | Classified benign | Classified melanoma | TPR (%) | FNR (%) |
| Benign | 93% | 3% | 93% | 3% |
| Melanoma | 11% | 53% | 53% | 11% |

ISBI 2017
| Class | Classified benign | Classified melanoma | TPR (%) | FNR (%) |
| Benign | 91% | 9% | 91% | 9% |
| Melanoma | 14% | 86% | 86% | 14% |

Combined
| Class | Classified benign | Classified melanoma | TPR (%) | FNR (%) |
| Benign | 97% | 3% | 97% | 3% |
| Melanoma | 11% | 89% | 89% | 11% |

Data in bold are significant

Author details
1 Department of Computer Science, COMSATS Institute of Information Technology, Wah, Pakistan. 2 Department of Electrical Engineering, COMSATS Institute of Information Technology, Wah, Pakistan. 3 Department of Electrical Engineering, COMSATS Institute of Information Technology, Abbottabad, Pakistan. 4 College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia. 5 Department of Electrical Engineering, COMSATS Institute of Information Technology, Attock, Pakistan.

Received: 5 October 2017. Accepted: 30 April 2018.

Conclusion

In this work, we have implemented a novel method for the identification and classification of skin lesions. The proposed framework incorporates two primary phases: a) lesion identification; b) lesion classification. In the identification step, a novel probabilistic method is introduced prior to feature extraction. An entropy-controlled, variance-based feature selection method is also implemented in combination with the Bhattacharyya distance, with the aim of considering only discriminant features.
The selected features are later utilized for classification in the final step using a multi-class SVM. The proposed method is tested on three publicly available dataset collections (i.e. PH2, ISBI 2016 & 2017, and ISIC), and it is concluded that the base classifier performs significantly better with the proposed features fusion and selection method than other existing techniques in terms of sensitivity, specificity, and accuracy. Furthermore, the presented method achieved satisfactory segmentation results on the selected datasets.

Abbreviations
ABCD: Asymmetry, border, color, diameter; ACS: American Cancer Society; CAD: Computer-aided diagnosis; C-KNN: Cubic KNN; DCT: Discrete Cosine Transform; DT: Decision tree; EBT: Ensemble boosted tree; ESDA: Ensemble subspace discriminant analysis; FFT: Fast Fourier Transform; FNR: False negative rate; GLCM: Gray level co-occurrence matrices; HOG: Histogram of Oriented Gradients; LBP: Local binary pattern; LOG: Laplacian of Gaussian; LR: Logistic regression; M.D: Mean deviation; MLR: Multi-scale lesion-biased representation; PCA: Principal component analysis; QDA: Quadratic discriminant analysis; Q-SVM: Quadratic SVM; RGB: Red, Green, Blue; SIFT: Scale-invariant feature transform; SVM: Support vector machine; W-KNN: Weighted K-Nearest Neighbor

Funding
The authors extend their appreciation to the Deanship of Scientific Research at King Saud University for funding this work through research group grant RG-1438-034, and to the Higher Education Commission, Pakistan, Startup Research Grant 21-260/SRGP/R&O/HEC/2014.

Availability of data and materials
The datasets analysed during the current study are available in open access at the following links:
1. ADDI project repository: http://www.fc.up.pt/addi/ph2%20database.html
2. ISIC UDA archive: https://isic-archive.com/
3. ISBI 2016: https://challenge.kitware.com/#challenge/n/ISBI_2016%3A_Skin_Lesion_Analysis_Towards_Melanoma_Detection

Authors' contributions
MAK, TA, MS and AS conceived the study, participated in its design and coordination, and helped to draft the manuscript. KA, MA, SIA and AA provided guidance and support in every part of this work and assisted in the writing and editing of the manuscript. All authors read and approved the final manuscript.

Ethics approval and consent to participate
Not applicable.

Competing interests
The authors declare that they have no competing interests.

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References
1. Rigel DS, Friedman RJ, Kopf AW. The incidence of malignant melanoma in the United States: issues as we approach the 21st century. J Am Acad Dermatol. 1996;34(5):839–47.
2. Altekruse SF, Kosary CL, Krapcho M, Neyman N, Aminou R, Waldron W, Ruhl J, et al. SEER cancer statistics review, 1975–2007. Bethesda: National Cancer Institute; 2010.
3. Abuzaghleh O, Barkana BD, Faezipour M. Automated skin lesion analysis based on color and shape geometry feature set for melanoma early detection and prevention. In: Systems, Applications and Technology Conference (LISAT), 2014 IEEE Long Island. IEEE; 2014. p. 1–6.
4. Freedberg KA, Geller AC, Miller DR, Lew RA, Koh HK. Screening for malignant melanoma: a cost-effectiveness analysis. J Am Acad Dermatol. 1999;41(5):738–45.
5. Barata C, Ruela M, Francisco M, Mendonça T, Marques JS. Two systems for the detection of melanomas in dermoscopy images using texture and color features. IEEE Syst J. 2014;8(3):965–79.
6. Menzies SW, Ingvar C, Crotty KA, McCarthy WH. Frequency and morphologic characteristics of invasive melanomas lacking specific surface microscopic features. Arch Dermatol. 1996;132(10):1178–82.
7. Stolz W, Riemann A, Cognetta AB, Pillet L, Abmayr W, Holzel D, Bilek P, Nachbar F, Landthaler M. ABCD rule of dermatoscopy: a new practical method for early recognition of malignant melanoma. Eur J Dermatol. 1994;4(7):521–7.
8. Argenziano G, Fabbrocini G, Carli P, De Giorgi V, Sammarco E, Delfino M. Epiluminescence microscopy for the diagnosis of doubtful melanocytic skin lesions: comparison of the ABCD rule of dermatoscopy and a new 7-point checklist based on pattern analysis. Arch Dermatol. 1998;134(12):1563–70.
9. Mayer J. Systematic review of the diagnostic accuracy of dermatoscopy in detecting malignant melanoma. Med J Aust. 1997;167(4):206–10.
10. Braun RP, Rabinovitz H, Tzu JE, Marghoob AA. Dermoscopy research: an update. Semin Cutan Med Surg. 2009;28(3):165–71.
11. Katapadi AB, Celebi ME, Trotter SC, Gurcan MN. Evolving strategies for the development and evaluation of a computerised melanoma image analysis system. Comput Methods Biomech Biomed Eng Imaging Vis. 2017;1–8.
12. Jaworek-Korjakowska J. Computer-aided diagnosis of micro-malignant melanoma lesions applying support vector machines. BioMed Res Int. 2016;2016.
13. Safrani A, Aharon O, Mor S, Arnon O, Rosenberg L, Abdulhalim I. Skin biomedical optical imaging system using dual-wavelength polarimetric control with liquid crystals. J Biomed Opt. 2010;15(2):026024.
14. Patalay R, Craythorne E, Mallipeddi R, Coleman A. An integrated skin marking tool for use with optical coherence tomography (OCT). Proc SPIE. 2017;10037:100370Y.
15. Rajaram N, Nguyen TH, Tunnell JW. Lookup table-based inverse model for determining optical properties of turbid media. J Biomed Opt. 2008;13(5):050501.
16. Aharon O, Abdulhalim I, Arnon O, Rosenberg L, Dyomin V, Silberstein E. Differential optical spectropolarimetric imaging system assisted by liquid crystal devices for skin imaging. J Biomed Opt. 2011;16(8):086008.
17. Graham L, Yitzhaky Y, Abdulhalim I. Classification of skin moles from optical spectropolarimetric images: a pilot study. J Biomed Opt. 2013;18(11):111403.
18. Ushenko AG, Dubolazov OV, Ushenko VA, Novakovskaya OY, Olar OV. Fourier polarimetry of human skin in the tasks of differentiation of benign and malignant formations. Appl Opt. 2016;55(12):B56–B60.
19. Ávila FJ, Stanciu SG, Costache M, Bueno JM. Local enhancement of multiphoton images of skin cancer tissues using polarimetry. In: Lasers and Electro-Optics Europe & European Quantum Electronics Conference (CLEO/Europe-EQEC 2017). IEEE; 2017. p. 1.
20. Stamnes JJ, Ryzhikov G, Biryulina M, Hamre B, Zhao L, Stamnes K. Optical detection and monitoring of pigmented skin lesions. Biomed Opt Express. 2017;8(6):2946–64.
21. Pellacani G, Cesinaro AM, Seidenari S. Reflectance-mode confocal microscopy of pigmented skin lesions: improvement in melanoma diagnostic specificity. J Am Acad Dermatol. 2005;53(6):979–85.
22. Oh J-T, Li M-L, Zhang HF, Maslov K, Stoica G, Wang LV. Three-dimensional imaging of skin melanoma in vivo by dual-wavelength photoacoustic microscopy. J Biomed Opt. 2006;11(3):034032.
23. Swanson DL, Laman SD, Biryulina M, Ryzhikov G, Stamnes JJ, Hamre B, Zhao L, Sommersten E, Castellana FS, Stamnes K. Optical transfer diagnosis of pigmented lesions. Dermatol Surg. 2010;36(12):1979–86.
24. Rademaker M, Oakley A. Digital monitoring by whole body photography and sequential digital dermoscopy detects thinner melanomas. J Prim Health Care. 2010;2(4):268–72.
25. Moncrieff M, Cotton S, Hall P, Schiffner R, Lepski U, Claridge E. SIAscopy assists in the diagnosis of melanoma by utilizing computer vision techniques to visualise the internal structure of the skin. Med Image Underst Anal. 2001;53–6.
26. Abuzaghleh O, Barkana BD, Faezipour M. Automated skin lesion analysis based on color and shape geometry feature set for melanoma early detection and prevention. In: Systems, Applications and Technology Conference (LISAT), 2014 IEEE Long Island. IEEE; 2014. p. 1–6.
27. Barata C, Marques JS, Rozeira J. Evaluation of color based keypoints and features for the classification of melanomas using the bag-of-features model. In: International Symposium on Visual Computing. Berlin, Heidelberg: Springer; 2013. p. 40–9.
28. Gu Y, Zhou J, Qian B. Melanoma detection based on Mahalanobis distance learning and constrained graph regularized nonnegative matrix factorization. In: Applications of Computer Vision (WACV), 2017 IEEE Winter Conference on. IEEE; 2017. p. 797–805.
29. Barata C, Celebi ME, Marques JS. Melanoma detection algorithm based on feature fusion. In: Engineering in Medicine and Biology Society (EMBC), 2015 37th Annual International Conference of the IEEE. IEEE; 2015. p. 2653–6.
30. Almansour E, Jaffar MA. Classification of dermoscopic skin cancer images using color and hybrid texture features. IJCSNS Int J Comput Sci Netw Secur. 2016;16(4):135–9.
31. Ahn E, Kim J, Bi L, Kumar A, Li C, Fulham M, Feng DD. Saliency-based lesion segmentation via background detection in dermoscopic images. IEEE J Biomed Health Inform. 2017;21(6):1685–93.
32. Bi L, Kim J, Ahn E, Feng D, Fulham M. Automatic melanoma detection via multi-scale lesion-biased representation and joint reverse classification. In: Biomedical Imaging (ISBI), 2016 IEEE 13th International Symposium on. IEEE; 2016. p. 1055–8.
33. Wong A, Scharcanski J, Fieguth P. Automatic skin lesion segmentation via iterative stochastic region merging. IEEE Trans Inf Technol Biomed. 2011;15(6):929–36.
34. Mokhtar N, Harun N, Mashor M, Roseline H, Mustafa N, Adollah R, Adilah H, Nashrul MN. Image enhancement techniques using local, global, bright, dark and partial contrast stretching for acute leukemia images. Lect Notes Eng Comput Sci. 2009;2176.
35. Duan Q, Akram T, Duan P, Wang X. Visual saliency detection using information contents weighting. Optik. 2016;127(19):7418–30.
36. Akram T, Naqvi SR, Ali Haider S, Kamran M. Towards real-time crops surveillance for disease classification: exploiting parallelism in computer vision. Comput Electr Eng. 2017;59:15–26.
37. Barata C, Celebi ME, Marques JS. Melanoma detection algorithm based on feature fusion. In: Engineering in Medicine and Biology Society (EMBC), 2015 37th Annual International Conference of the IEEE. IEEE; 2015. p. 2653–6.
38. Ahn E, Bi L, Jung YH, Kim J, Li C, Fulham M, Feng DD. Automated saliency-based lesion segmentation in dermoscopic images. In: Engineering in Medicine and Biology Society (EMBC), 2015 37th Annual International Conference of the IEEE. IEEE; 2015. p. 3009–12.
39. Bozorgtabar B, Abedini M, Garnavi R. Sparse coding based skin lesion segmentation using dynamic rule-based refinement. In: MLMI@MICCAI. 2016. p. 254–61.
40. Dalal N, Triggs B. Histograms of oriented gradients for human detection. In: Computer Vision and Pattern Recognition (CVPR), 2005 IEEE Computer Society Conference on, vol. 1. IEEE; 2005. p. 886–93.
41. Haralick RM, Shanmugam K. Textural features for image classification. IEEE Trans Syst Man Cybern. 1973;6:610–21.
42. Liu Y, Zheng YF. One-against-all multi-class SVM classification using reliability measures. In: Neural Networks (IJCNN), 2005 IEEE International Joint Conference on, vol. 2. IEEE; 2005. p. 849–54.
43. Abuzaghleh O, Barkana BD, Faezipour M. Noninvasive real-time automated skin lesion analysis system for melanoma early detection and prevention. IEEE J Transl Eng Health Med. 2015;3:1–12.
44. Kruk M, Świderski B, Osowski S, Kurek J, Sowińska M, Walecka I. Melanoma recognition using extended set of descriptors and classifiers. EURASIP J Image Video Process. 2015;2015(1):43.
45. Ruela M, Barata C, Marques JS, Rozeira J. A system for the detection of melanomas in dermoscopy images using shape and symmetry features. Comput Methods Biomech Biomed Eng Imaging Vis. 2017;5(2):127–37.
46. Waheed Z, Waheed A, Zafar M, Riaz F. An efficient machine learning approach for the detection of melanoma using dermoscopic images. In: Communication, Computing and Digital Systems (C-CODE), International Conference on. IEEE; 2017. p. 316–9.
47. Satheesha TY, Satyanarayana D, Prasad MNG, Dhruve KD. Melanoma is skin deep: a 3D reconstruction technique for computerized dermoscopic skin lesion classification. IEEE J Transl Eng Health Med. 2017;5:1–17.
48. Gu Y, Zhou J, Qian B. Melanoma detection based on Mahalanobis distance learning and constrained graph regularized nonnegative matrix factorization. In: Applications of Computer Vision (WACV), 2017 IEEE Winter Conference on. IEEE; 2017. p. 797–805.
49. Bi L, Kim J, Ahn E, Feng D, Fulham M. Automatic melanoma detection via multi-scale lesion-biased representation and joint reverse classification. In: Biomedical Imaging (ISBI), 2016 IEEE 13th International Symposium on. IEEE; 2016. p. 1055–8.
50. Rastgoo M, Morel O, Marzani F, Garcia R. Ensemble approach for differentiation of malignant melanoma. In: The International Conference on Quality Control by Artificial Vision 2015. International Society for Optics and Photonics; 2015. p. 953415.
51. Mendonça T, Ferreira PM, Marques JS, Marcal ARS, Rozeira J. PH2: a dermoscopic image database for research and benchmarking. In: Engineering in Medicine and Biology Society (EMBC), 2013 35th Annual International Conference of the IEEE. IEEE; 2013. p. 5437–40.
52. Gutman D, Codella NCF, Celebi E, Helba B, Marchetti M, Mishra N, Halpern A. Skin lesion analysis toward melanoma detection: a challenge at the International Symposium on Biomedical Imaging (ISBI) 2016, hosted by the International Skin Imaging Collaboration (ISIC). arXiv preprint arXiv:1605.01397. 2016.
53. Codella NCF, Gutman D, Celebi ME, Helba B, Marchetti MA, Dusza SW, Kalloo A, et al. Skin lesion analysis toward melanoma detection: a challenge at the 2017 International Symposium on Biomedical Imaging (ISBI), hosted by the International Skin Imaging Collaboration (ISIC). arXiv preprint arXiv:1710.05006. 2017.
54. Yu L, Chen H, Dou Q, Qin J, Heng P-A. Automated melanoma recognition in dermoscopy images via very deep residual networks. IEEE Trans Med Imaging. 2017;36(4):994–1004.
55. Esteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, Blau HM, Thrun S. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115–8.
56. Ge Z, Demyanov S, Bozorgtabar B, Abedini M, Chakravorty R, Bowling A, Garnavi R. Exploiting local and generic features for accurate skin lesions classification using clinical and dermoscopy imaging. In: Biomedical Imaging (ISBI 2017), 2017 IEEE 14th International Symposium on. IEEE; 2017. p. 986–90.
57. Lopez AR, Giro-i-Nieto X, Burdick J, Marques O. Skin lesion classification from dermoscopic images using deep learning techniques. In: Biomedical Engineering (BioMed), 2017 13th IASTED International Conference on. IEEE; 2017. p. 49–54.

An implementation of normal distribution based segmentation and entropy controlled features selection for skin lesion detection and classification

Publisher
BioMed Central
Copyright
Copyright © 2018 by The Author(s)
Subject
Biomedicine; Cancer Research; Oncology; Surgical Oncology; Health Promotion and Disease Prevention; Biomedicine, general; Medicine/Public Health, general
eISSN
1471-2407
DOI
10.1186/s12885-018-4465-8

Abstract

Background: Melanoma is the deadliest type of skin cancer, with the highest mortality rate. However, eradicating it at an early stage implies a high survival rate, so it demands early diagnosis. The customary diagnostic methods are costly and cumbersome because they require experienced experts and a highly equipped environment. Recent advancements in computerized solutions for this diagnosis are highly promising, with improved accuracy and efficiency. Methods: In this article, a method for the identification and classification of the lesion based on probabilistic distributions and best-features selection is proposed. Probabilistic distributions such as the normal distribution and the uniform distribution are implemented for segmentation of the lesion in dermoscopic images. Then multi-level features are extracted, and a parallel strategy is performed for fusion. A novel entropy-based method combining the Bhattacharyya distance and variance is calculated for the selection of the best features. Only the selected features are classified using a multi-class support vector machine, which is selected as the base classifier. Results: The proposed method is validated on three publicly available datasets, PH2, ISIC (i.e. ISIC MSK-2 and ISIC UDA), and Combined (ISBI 2016 and ISBI 2017), including multi-resolution RGB images, and achieved accuracies of 97.5%, 97.75%, and 93.2%, respectively. Conclusion: The base classifier performs significantly better with the proposed features fusion and selection method than with other methods in terms of sensitivity, specificity, and accuracy. Furthermore, the presented method achieved satisfactory segmentation results on the selected datasets.
Keywords: Image enhancement, Uniform distribution, Image fusion, Multi-level features extraction, Features fusion, Features selection

*Correspondence: tallha@ciitwah.edu.pk; aamirsardar@gmail.com
Department of Electrical Engineering, COMSATS Institute of Information Technology, Wah, Pakistan; Department of Electrical Engineering, COMSATS Institute of Information Technology, Abbottabad, Pakistan. Full list of author information is available at the end of the article.
© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Khan et al. BMC Cancer (2018) 18:638

Background
Skin cancer is reported to be one of the most rapidly spreading cancers amongst the other types. It is broadly classified into two primary classes: melanoma and benign. Melanoma is the deadliest type of cancer, with the highest mortality rate worldwide [1]. In the US alone, an astonishing mortality rate of 75% is reported due to melanoma compared to other types of skin cancer [2]. The occurrence of melanoma is reported to have doubled (increasing 2 to 3% per year) over the last two decades, faster than any other type of cancer. The American Cancer Society (ACS) estimated that 87,110 new cases of melanoma would be diagnosed and 9,730 people would die in the US alone in 2017 [3]. Malignant melanoma can be cured if detected at
its early stages: for example, if diagnosed at stage I, the possible survival rate is 96%, compared to 5% at stage IV [4, 5]. However, early detection is strenuous due to its high resemblance to benign lesions; even an expert dermatologist can diagnose it wrongly. The specialized technique of dermatoscopy is mostly followed by dermatologists to diagnose melanoma. In a clinical examination, the most commonly adopted methods of visual feature inspection are the Menzies method [6], the ABCD rule [7], and the 7-point checklist [8]. The most commonly used methods are the ABCD (atypical, border, color, diameter) rules and pattern analysis. It is reported that this traditional dermoscopy method can increase the detection rate by 10 to 27% [9]. These methods distinctly increase the detection rate compared to conventional methods but still depend on the dermatologist's skills and training [10]. To facilitate experts, numerous computerized analysis systems have been proposed recently [11, 12], which are referred to as pattern analysis or computerized dermoscopic analysis systems. These methods are non-invasive, image analysis based techniques for diagnosing melanoma.

In the last decade, several non-invasive methods were introduced for the diagnosis of melanoma, including optical imaging systems (OIS) [13], optical coherence tomography (OCT) [14], light scattering (LS) [15], spectropolarimetric imaging systems (SIM) [16, 17], Fourier polarimetry (FP) [18], polarimetric imaging [19], reflectance confocal microscopy (RCM) [20, 21], photoacoustic microscopy [22], optical transfer diagnosis (OTD) [23], etc. All the above-mentioned methods have enough potential to diagnose skin lesions and are also accurate enough to distinguish melanoma from benign lesions. The optical methods are mostly utilized during clinical tests to evaluate the presurgical boundaries of basal cell carcinoma; they can help in drawing boundaries around the region of interest (ROI) in dermoscopic images. LS methods give information about the skin micro-architecture, which is represented with small pieces of pigskin and mineral elements, and help to determine the extent of various types of skin cancer. The SIM method evaluates the polarimetric contrast of the region of interest or infectious region, such as melanoma, compared to the background or healthy region. In the FP method, however, human skin is observed with laser scattering, and differences are identified using an optical method as a diagnostic test for differentiating melanoma from benign lesions.

Problem statement
It is proved that malignant melanoma is a lethal skin cancer that is extra dominant among people aged 15 and above [24]. Recent research shows a high rate of failure to detect and diagnose this type of cancer at the early stages [25]. Generally, diagnosis consists of four major steps: preprocessing (which consists of hair removal and contrast enhancement), segmentation, feature extraction, and finally classification. The most challenging task in dermoscopy is the accurate detection of the lesion's boundary, because of different artifacts such as hairs, illumination effects, low lesion contrast, asymmetrical and irregular borders, nicked edges, etc. Therefore, for the early detection of melanoma, shape analysis is especially important. In the feature extraction step, several types of features are extracted, such as shape, color, texture, and local features, but we have no clear knowledge about which features are salient for classification.

Contribution
In this article, we propose a new method of lesion detection and classification by implementing a probabilistic distribution based segmentation method and conditional entropy controlled features selection. The proposed technique is an amalgamation of five major steps: a) contrast stretching; b) lesion extraction; c) multi-level features extraction; d) features selection; and e) classification of malignant and benign. The results are tested on three publicly available datasets, PH2, ISIC (i.e. ISIC MSK-2 and ISIC UDA), and Combined (ISBI 2016 and ISBI 2017), containing RGB images of different resolutions, which are later normalized in our proposed technique. Our main contributions are enumerated below:

1. Enhanced the contrast of the lesion area by implementing a novel contrast stretching technique, in which we first calculated the global minima and maxima of the input image and then utilized low and high threshold values to enhance the lesion.
2. Implemented a novel segmentation method based on the normal and uniform distributions. The mean of the uniform distribution is calculated from the enhanced image, and its value is inserted into an activation function introduced for segmentation. Similarly, the mean deviation of the normal distribution is calculated from the enhanced image, and its value is also inserted into an activation function for correct segmentation.
3. A fusion of the segmented images is implemented by utilizing the additive law of probability.
4. Implemented a novel feature selection method, which initially calculates the Euclidean distance between fused feature vectors by implementing an entropy-variance method. Only the most discriminant features are later utilized by a multi-class support vector machine for classification.

Paper organization
The chronological order of this article is as follows: the related work on skin cancer detection and classification is described in the "Related work" section. The "Methods" section explains the proposed method, which consists of several sub-steps including contrast stretching, segmentation, features extraction, features fusion, and classification. The experimental results and conclusion of this article are described in the "Results" and "Discussion" sections.

Related work
In the last few decades, advanced techniques in different domains of medical image processing, machine learning, etc., have introduced tremendous improvements in computer-aided diagnostic systems. Similarly, improvements in dermatological examination tools have led to revolutions in prognostic and diagnostic practices. The computerized feature extraction of cutaneous lesion images and feature analysis by machine learning techniques have the potential to reroute the conventional surgical excision diagnostic methods towards CAD systems.

In the literature, several methods have been implemented for automated detection and classification of skin cancer from dermoscopic images. Omer et al. [26] introduced an automated system for the early detection of skin lesions. They utilized color features prior to global thresholding for the lesion's segmentation. The enhanced image was later subjected to the 2D Discrete Fourier Transform and 2D Fast Fourier Transform (FFT) for feature extraction prior to the classification step. The results were tested on the publicly available dataset PH2. Barata et al. [27] described the importance of color features for the detection of skin lesions. A color sampling method is utilized with the Harris detector, and its performance is compared with grayscale sampling. They also compared color-SIFT (scale invariant feature transform) and SIFT features and concluded that color-SIFT features perform better than SIFT. Yanyang et al. [28] introduced a novel method for melanoma detection based on Mahalanobis distance learning and graph regularized non-negative matrix factorization. The introduced method is treated as a supervised learning method; it reduces the dimensionality of the extracted set of features and improves the classification rate. The method was evaluated on the PH2 dataset and achieved improved performance. Catarina et al. [29] described a strategy for the combination of global and local features. The local features (bag-of-features) and global features (shape and geometric) are extracted from the original image, and these features are fused based on early fusion and late fusion. The authors claim that late fusion had never been utilized in this context and that it gives better results than early fusion.

Ebtihal et al. [30] introduced a hybrid method for lesion classification using color and texture features. Four moments, the mean, standard deviation, degree of asymmetry, and variance, were calculated against each channel and treated as features. Local binary patterns (LBP) and gray level co-occurrence matrices (GLCM) were extracted as texture features. Finally, the combined features were classified using a support vector machine (SVM). Agn et al. [31] introduced a saliency detection technique for accurate lesion detection. The introduced method resolves the problems that arise when the lesion borders are vague and the contrast between the lesion and the surrounding skin is low. The saliency method is reproduced with a sparse representation method. Further, a Bayesian network is introduced that better explains the shape and boundary of the lesion. Euijoon et al. [38] introduced a saliency based segmentation technique where the background of the original image was detected by spatial layout, which includes boundary and color information. They implemented a Bayesian framework to minimize the detection errors. Similarly, Lei et al. [32] introduced a new method of lesion detection and classification based on multi-scale lesion-biased representation (MLR). This proposed method has the advantage of detecting the lesion using different rotations and scales, compared to conventional methods of single rotation.

Fig. 1 Proposed architecture of skin lesion detection and classification
Fig. 2 Information of original image and their respective channels: a original image; b red channel; c green channel; d blue channel
Fig. 3 Proposed contrast stretching results
Fig. 4 Proposed uniform distribution based mean segmentation results: a original image; b enhanced image; c proposed uniform based mean segmentation; d 2D contour image; e contour plot; f 3D contour plot; g lesion area

From the recent studies above, we noticed that color information and contrast stretching are important factors for the accurate detection of lesions in dermoscopic images, since contrast stretching methods improve the visual quality of the lesion area and thereby the segmentation accuracy. Additionally, several features have been utilized in the literature for improved classification but, to the best of our knowledge, serial based features fusion has not yet been utilized. In our case, however, only salient features are utilized, which are later subjected to fusion for improved classification.

Fig. 5 Proposed normal distribution based M.D segmentation results: a original image; b enhanced image; c proposed M.D based segmentation; d 2D contour image; e contour plot; f 3D contour plot; g lesion area
sification using probabilistic distribution based segmenta- tion method and conditional entropy controlled features 1 Gray channel is preprocessed using Sobel edge filter selection. The proposed method is consists of two major steps: a) lesion identification; b) lesion classification. For to compute gradients where kernel size is selected to lesion identification, we first enhance the contrast of input be 3 × 3. image and then segment the lesion by implementation 2 Gradient calculation for each equal sized block and of novel probabilistic distribution (uniform distribution, rearranging in an ascending order. For each block the normal distribution). The lesion classification is done weights are assigned according to the gradient based of multiple features extraction and entropy con- magnitude. trolled most prominent features selection. The detailed b1 ⎪ ς if υ (x, y) ≤ T ; c 1 ⎪ w flow diagram of proposed method is shown in Fig. 1. ⎪ b2 ς T <υ (x, y) ≤ T ; 1 c 2 ζ (x, y) = (1) b3 ς T <υ (x, y) ≤ T ; 1 c 3 Contrast stretching w b4 ς otherwise There are numerous contrast stretching or normaliza- tion techniques [34], which attempt to improve the image bi where ς (i ≤ 4) are statistical weight coefficient contrast by stretching pixels’ specific intensity range to and T is gradient intervals threshold. a different level. Most of the available options take gray 3 Cumulative weighted gray value is calculated for each image as an input and generate an improved output gray block using: image. In our research work, the primary objective is to acquire a three channel RGB image having dimensions bi N (z) = ς n (z) (2) g i m × n × 3. Although, the proposed technique can only w i=1 work on a single channel of size m × n, therefore, in pro- posed algorithm we separately processed red, green and where n (z) represents cumulative number of gray blue channel. level pixels for each block i. Fig. 6 Proposed fusion results. 
a original image; b fused segmented image; c mapped on fused image; d ground truth image Khan et al. BMC Cancer (2018) 18:638 Page 7 of 20 4 Concatenate red, green and blue channel to produce where E is the block with maximum edges. Finally, max enhanced RGB image. adjust the intensity levels of enhance image and perform log operation to improved lesion region as compare to For each channel, three basic conditions are considered original. for optimized solution: I) extraction of regions with max- ϕ(AI) = ζ(B ) (4) wi imum information; II) selection of a block size; III) an improved weighting criteria. In most of the dermoscopic ϕ(t) = C × log(β + ϕ(AI)) (5) images, maximum informative regions are with in the range of 25 − 75%. Therefore, considering the minimum Where β is a constant value, (β ≤ 10), which is selected value of 25%, the number of blocks are selected to be 12 to be 3 for producing most optimal results. ζ denotes the as an optimal number, with an aspect ratio of 8.3%. These adjust intensity operation, ϕ(AI) is enhance image after blocks are later selected according to the criteria of maxi- ζ operation and ϕ(t) is final enhance image. The final mal information retained (cumulative number of pixels for contrast stretching results are shown in Fig. 3. each block). Laplacian of Gaussian method (LOG) [36]is used with sigma value of two for edge detection. Weights Lesion segmentation are assigned according to the number of edge points, E Segmentation of skin lesion is an important task in the pi for each block: analysis of skin lesions due to several problems such as color variation, presence of hairs, irregularity of lesion pi in the image and necked edges. Accurate segmentation B = (3) wi E provides important cues for accurate border detection. max Fig. 7 Proposed fusion results. a original image; b proposed segmented image; c mapped on proposed image; d ground truth image; e border on proposed segmented image Khan et al. 
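The block-weighted contrast stretching described above (Eqs. 1-5) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the four weight coefficients stand in for the unspecified ς^{bi}, the quartile cut points stand in for the thresholds T₁-T₃, and the LoG edge-weighting step is folded into a single Sobel-gradient pass.

```python
import numpy as np

def sobel_magnitude(channel):
    # Sobel gradient magnitude with 3x3 kernels (step 1), edge-padded.
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    ky = kx.T
    p = np.pad(channel.astype(float), 1, mode="edge")
    gx = np.zeros(channel.shape, float)
    gy = np.zeros(channel.shape, float)
    for i in range(3):
        for j in range(3):
            win = p[i:i + channel.shape[0], j:j + channel.shape[1]]
            gx += kx[i, j] * win
            gy += ky[i, j] * win
    return np.hypot(gx, gy)

def stretch_channel(channel, weights=(0.6, 0.8, 1.2, 1.4), beta=3.0):
    """Block-wise weighted stretch followed by a log mapping (assumed weights)."""
    grad = sobel_magnitude(channel)
    out = channel.astype(float).copy()
    h, w = channel.shape
    bh, bw = h // 3, w // 4                       # 12 equal-sized blocks (3 x 4 grid)
    thr = np.quantile(grad, [0.25, 0.5, 0.75])    # stand-ins for T1..T3 in Eq. 1
    for bi in range(3):
        for bj in range(4):
            blk = np.s_[bi * bh:(bi + 1) * bh, bj * bw:(bj + 1) * bw]
            k = np.searchsorted(thr, grad[blk].mean())  # weight class 0..3
            out[blk] *= weights[k]
    out = np.log(beta + out / max(out.max(), 1e-9))     # log mapping, Eq. 5
    return (out - out.min()) / (out.max() - out.min() + 1e-9)
```

Applying `stretch_channel` to each of the R, G, and B planes and restacking them mirrors step 4 (channel-wise processing followed by concatenation).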
In this article, a novel method is implemented based on probabilistic distributions. The probabilistic distribution approach consists of two major steps: a) uniform distribution based mean segmentation and b) normal distribution based segmentation.

Mean segmentation
The mean of the uniform distribution is calculated from the enhanced image ϕ(t), and then a threshold function is applied for lesion extraction. The detailed description of mean segmentation is given below. Let t denote the enhanced dermoscopic image and f(t) denote the uniform distribution density, determined as f(t) = 1/(y − x), where y and x denote the maximum and minimum pixel values of ϕ(t). Then the mean value is calculated as follows:

μ = ∫ t f(t) dt   (6)
  = ∫ t/(y − x) dt   (7)
  = (1/(y − x)) [t²/2]ₓʸ   (8)
  = (y + x)(y − x) / (2(y − x))   (9)
μ = (y + x)/2   (10)

Then an activation function is applied, which is defined as follows:

A(μ) = 1/(1 + μ/ϕ(t)) + 1/(2μ) + C   (11)

F(μ) = { 1 if A(μ) ≥ δ_thresh;  0 if A(μ) < δ_thresh }   (12)

where δ_thresh is Otsu's threshold, α is a scaling factor which controls the lesion area and whose value is selected on the basis of the simulations performed (α ≤ 10; α = 7 was finally found to be the most optimal number), and C is a constant value which is randomly initialized within the range of 0 to 1. The segmentation results are shown in Fig. 4.

Mean deviation based segmentation
The mean deviation (M.D) of the normal distribution is calculated from ϕ(t) with parameters μ and σ. The value of M.D is utilized by an activation function for extraction of the lesion from the dermoscopic images. Let t denote the enhanced dermoscopic image and f(t) denote the normal density function, determined as f(t) = (1/(√(2π) σ)) e^{−(1/2)((t−μ)/σ)²}. Then the M.D is initialized as:

M.D = ∫₋∞⁺∞ |t − μ| f(t) dt   (13)
    = ∫₋∞⁺∞ |t − μ| (1/(√(2π) σ)) e^{−(1/2)((t−μ)/σ)²} dt   (14)

Then put g = (t − μ)/σ in Eq. 14:

M.D = (σ/√(2π)) ∫₋∞⁺∞ |g| e^{−g²/2} dg   (15)
    = (σ/√(2π)) [ ∫₀^∞ g e^{−g²/2} dg + ∫₀^∞ g e^{−g²/2} dg ]   (16)
M.D = (2σ/√(2π)) ∫₀^∞ g e^{−g²/2} dg   (17)

Put g²/2 = l in Eq. 17, so that dg = dl/√(2l), and it becomes:

M.D = (2σ/√(2π)) ∫₀^∞ √(2l) e^{−l} dl/√(2l)   (18)
    = (2σ/√(2π)) ∫₀^∞ e^{−l} dl   (19)
    = √(2/π) σ [−e^{−l}]₀^∞   (20)
    = −√(2/π) σ (0 − 1)   (21)
    = √(2/π) σ   (22)

Hence

M.D = 0.7979 σ   (23)

Then an activation function is applied to utilize M.D:

AC(M.D) = 1/(1 + M.D/ϕ(t)) + 1/(2 M.D) + C   (24)

F(M.D) = { 1 if AC(M.D) ≥ δ_thresh;  0 if AC(M.D) < δ_thresh }   (25)

The segmentation results of M.D are shown in Fig. 5.

Table 2 Lesion detection accuracy as compared to ground truth values

| Image | Similarity rate | Image | Similarity rate |
| IMD038 | 95.69 | IMD199 | 94.70 |
| IMD020 | 92.52 | IMD380 | 97.94 |
| IMD039 | 91.35 | IMD385 | 94.37 |
| IMD144 | 88.33 | IMD392 | 94.47 |
| IMD203 | 86.44 | IMD394 | 96.96 |
| IMD379 | 88.41 | IMD047 | 90.07 |
| IMD429 | 94.87 | IMD075 | 95.85 |
| IMD211 | 92.81 | IMD078 | 94.70 |
| IMD285 | 95.59 | IMD140 | 96.94 |
| IMD022 | 96.02 | IMD256 | 95.82 |
| IMD025 | 96.35 | IMD312 | 96.04 |
| IMD042 | 91.26 | IMD369 | 96.08 |
| IMD173 | 96.04 | IMD376 | 93.07 |
| IMD182 | 97.97 | IMD427 | 93.14 |
| IMD430 | 98.10 | IMD168 | 92.88 |
Data in bold are significant

Fig. 8 A system architecture of multiple features fusion and selection
Fig. 9 Selected channels for color features extraction

Table 3 Proposed features fusion and selection results on PH2 dataset

| Method | Execution time (s) | Sensitivity (%) | Precision (%) | Specificity (%) | FNR (%) | FPR | Accuracy (%) |
| DT | 7 | 88.33 | 88.73 | 92.50 | 10.0 | 0.04 | 90.0 |
| QDA | 2 | 90.83 | 89.40 | 91.20 | 9.0 | 0.04 | 91.0 |
| Q-SVM | 2 | 95.83 | 96.60 | 98.70 | 3.0 | 0.01 | 97.0 |
| LR | 6 | 92.10 | 92.76 | 96.96 | 6.0 | 0.02 | 94.0 |
| N-B | 3 | 89.60 | 91.73 | 96.90 | 7.5 | 0.03 | 92.5 |
| W-KNN | 2 | 91.67 | 92.33 | 96.20 | 6.5 | 0.02 | 93.5 |
| EBT | 5 | 95.43 | 96.67 | 98.12 | 3.5 | 0.02 | 96.5 |
| ESD | 10 | 94.20 | 94.53 | 97.50 | 4.5 | 0.02 | 95.5 |
| C-KNN | 2 | 91.26 | 91.56 | 95.61 | 7.0 | 0.03 | 93.0 |
| Multi-class SVM | 1 | 96.67 | 97.06 | 98.74 | 2.5 | 0.01 | 97.5 |
Data in bold are significant

Table 4 Results of individual extracted sets of features using PH2 dataset

| Classification method | Features | Sensitivity (%) | Precision (%) | Specificity (%) | FNR (%) | FPR | Accuracy (%) |
| Decision tree | Harlick | 67.53 | 67.50 | 70.05 | 31.50 | 0.16 | 68.5 |
| | HOG | 71.67 | 72.1 | 85.0 | 23.0 | 0.11 | 77.0 |
| | Color | 87.93 | 86.93 | 86.9 | 12.5 | 0.06 | 87.5 |
| Quadratic discriminant analysis | Harlick | 70.0 | 68.43 | 70.0 | 30.0 | 0.14 | 70.0 |
| | HOG | 74.60 | 75.83 | 88.15 | 20.0 | 0.09 | 80.0 |
| | Color | 84.6 | 81.9 | 80.65 | 17.0 | 0.08 | 83.0 |
| Quadratic SVM | Harlick | 68.33 | 70.27 | 76.25 | 28.5 | 0.14 | 71.5 |
| | HOG | 82.5 | 83.37 | 92.7 | 13.5 | 0.06 | 86.5 |
| | Color | 93.77 | 93.33 | 94.44 | 6.0 | 0.03 | 94.0 |
| Logistic regression | Harlick | 63.36 | 64.06 | 70.05 | 34.0 | 0.17 | 66.0 |
| | HOG | 86.27 | 85.83 | 91.9 | 11.5 | 0.09 | 88.5 |
| | Color | 89.2 | 90.43 | 92.55 | 9.5 | 0.04 | 90.5 |
| Naive Bayes | Harlick | 62.9 | 62.9 | 66.85 | 35.5 | 0.18 | 64.5 |
| | HOG | 81.25 | 81.93 | 90.65 | 15.0 | 0.07 | 85.0 |
| | Color | 87.93 | 87.63 | 90.65 | 11.0 | 0.06 | 89.0 |
| Weighted KNN | Harlick | 66.67 | 67.5 | 72.5 | 31.0 | 0.16 | 69.0 |
| | HOG | 81.67 | 83.27 | 92.5 | 14.0 | 0.06 | 86.0 |
| | Color | 90.87 | 90.83 | 92.55 | 8.5 | 0.03 | 91.5 |
| Ensemble boosted tree | Harlick | 68.33 | 67.77 | 68.75 | 31.5 | 0.16 | 68.5 |
| | HOG | 80.67 | 82.57 | 91.3 | 15.0 | 0.07 | 85.0 |
| | Color | 88.37 | 89.47 | 91.3 | 10.5 | 0.04 | 89.5 |
| Ensemble subspace discriminant | Harlick | 68.76 | 68.4 | 71.9 | 30.0 | 0.15 | 70.0 |
| | HOG | 87.1 | 87.03 | 91.9 | 11.0 | 0.05 | 89.0 |
| | Color | 92.9 | 94.7 | 96.9 | 5.5 | 0.03 | 94.1 |
| Cubic KNN | Harlick | 65.43 | 66.4 | 71.9 | 32.0 | 0.16 | 68.0 |
| | HOG | 80.4 | 80.8 | 89.4 | 16.0 | 0.07 | 84.0 |
| | Color | 90.3 | 89.83 | 91.7 | 9.5 | 0.04 | 90.5 |
| Proposed | Harlick | 69.6 | 72.23 | 75.65 | 28.0 | 0.14 | 72.0 |
| | HOG | 86.27 | 87.37 | 94.4 | 10.5 | 0.02 | 89.5 |
| | Color | 94.6 | 93.97 | 94.4 | 5.5 | 0.02 | 94.5 |

Image fusion
The term image fusion means combining the information of two or more images into one resultant image, which contains better information than any individual source image. Image fusion reduces the redundancy between two or more images and increases the clinical applicability for diagnosis. In this work, we implemented a union based fusion of two segmented images into one image. The resultant image is more accurate and holds more information than either individual image.

Suppose N denotes the sample space, which contains 200 dermoscopic images. Let X₁ ∈ F(μ) be the mean segmented image and X₂ ∈ F(M.D) be the M.D based segmented image. Let i denote the X₁ pixel values, j denote the X₂ pixel values, and S denote those i and j pixels whose values are 1; that is, all pixels of value 1 fall in S. Then X₁ ∪ X₂ is written as:

X₁ ∪ X₂ = (X₁ ∪ X₂) ∩ φ   (26)
P(X₁ ∪ X₂) = P((X₁ ∪ X₂)) ∩ P(φ)   (27)
           = { ξ((X₁, X₂) == 1) if (i, j) ∈ z₁;  ξ((X₁, X₂) == 0) if (i, j) ∈ z₂ }   (28)

where z is represented by the ground truth in Table 1. Hence

F(t) = { 1 if i + j ≥ 1;  0 otherwise }   (29)

P(X₁ ∪ X₂) = P(X₁) + P(X₂) − P(φ)   (30)

where P(φ) denotes the 0 values, which represent the background, while 1 denotes the lesion. The graphical results after fusion are shown in Fig. 6.

Table 5 Confusion matrices for PH2 dataset

Confusion matrix: proposed features fusion and selection
| Class | Tested images | Melanoma | Benign | Carcinoma |
| Melanoma | 20 | 92.5% | 7.5% | - |
| Benign | 40 | 2.5% | 97.5% | - |
| Carcinoma | 40 | - | - | 100% |

Confusion matrix: Harlick features
| Class | Total images | Melanoma | Benign | Carcinoma |
| Melanoma | 20 | 57.5% | 35% | 7.5% |
| Benign | 40 | 8.8% | 68.8% | 22.5% |
| Carcinoma | 40 | 3.8% | 13.8% | 82.5% |

Confusion matrix: HOG features
| Class | Total images | Melanoma | Benign | Carcinoma |
| Melanoma | 20 | 70% | 30% | - |
| Benign | 40 | 10% | 88.8% | 1.3% |
| Carcinoma | 40 | - | - | 100% |

Confusion matrix: color features
| Class | Total images | Melanoma | Benign | Carcinoma |
| Melanoma | 20 | 95% | 5.0% | - |
| Benign | 40 | 3.8% | 95% | 1.3% |
| Carcinoma | 40 | 1.3% | 5.0% | 93.8% |

Analysis
In this section, we analyze our segmentation results in terms of accuracy or similarity index as compared to the given ground truth values. We randomly select images from the PH2 dataset and show their results in tabular and graphical form. The proposed segmentation results are directly compared to the ground truth images, as shown in Fig. 7.
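The two distribution-driven segmentations and their union fusion (Eqs. 10-12, 23-25, and 30) can be sketched as below. This is a simplified reading of the method under stated assumptions: the activation form follows Eqs. 11/24 as reconstructed here, the scaling factor α and random constant C are reduced to a single fixed C, and `otsu_threshold` is a minimal histogram-based Otsu stand-in for δ_thresh; all function names are illustrative.

```python
import numpy as np

def activation(stat, phi, C=0.5):
    # A(.) from Eqs. 11/24: 1/(1 + stat/phi(t)) + 1/(2*stat) + C
    return 1.0 / (1.0 + stat / np.maximum(phi, 1e-9)) + 1.0 / (2.0 * stat) + C

def otsu_threshold(values):
    # Minimal Otsu threshold over a 256-bin histogram of values scaled to [0, 1].
    v = (values - values.min()) / (values.max() - values.min() + 1e-9)
    hist, edges = np.histogram(v, bins=256, range=(0.0, 1.0))
    p = hist / hist.sum()
    omega = np.cumsum(p)                       # class-0 probability
    mu = np.cumsum(p * np.arange(256))         # class-0 cumulative mean
    sigma_b = (mu[-1] * omega - mu) ** 2 / (omega * (1.0 - omega) + 1e-12)
    k = int(np.argmax(sigma_b))
    return values.min() + edges[k + 1] * (values.max() - values.min())

def segment_and_fuse(phi):
    """phi: contrast-enhanced single-channel image in [0, 1]."""
    mu = (phi.max() + phi.min()) / 2.0         # uniform-distribution mean, Eq. 10
    md = 0.7979 * phi.std()                    # mean deviation of normal dist., Eq. 23
    a_mu, a_md = activation(mu, phi), activation(md, phi)
    f_mu = a_mu >= otsu_threshold(a_mu)        # Eq. 12
    f_md = a_md >= otsu_threshold(a_md)        # Eq. 25
    return np.logical_or(f_mu, f_md)           # union fusion, Eq. 30
```

On a synthetic image with a bright lesion-like blob, both activation maps threshold to the blob and the union preserves it.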
In this work, we imple- as background and 1 denotes the lesion. The graphical mented a union based fusion of two segmented images results after fusion are shown in Fig. 6. into one image. The resultant image is more accurate and having much information as compare to individual. Analysis Suppose N denotes the sample space, which contains In this section, we analyze our segmentation results in 200 dermoscopic images. Let X ∈ F(μ) which is mean terms of accuracy or similarity index as compared to segmented image. Let X ∈ F(M.D) which M.D based given ground truth values. We select randomly images segmented image. Let i denotes the X pixels values and j from PH2 dataset and shows their results in tabular and denotes the X pixels values and S denotes the both i and graphical. The proposed segmentation results are directly Table 5 Confusion matrix for PH2 dataset Confusion Matrix: Proposed features fusion and selection Class Tested images Melanoma Benign Caricinoma Melanoma 20 92.5% 7.5% Benign 40 2.5% 97.5% Caricinoma 40 100% Confusion matrix: Harlick features Class Total Images Melanoma Benign Caricinoma Melanoma 20 57.5% 35% 7.5% Benign 40 8.8% 68.8% 22.5% Caricinoma 40 3.8% 13.8% 82.5% Confusion matrix: HOG features Class Total Images Melanoma Benign Caricinoma Melanoma 20 70% 30% - Benign 40 10% 88.8% 1.3% Caricinoma 40 - - 100% Confusion matrix: Color features Class Total Images Melanoma Benign Caricinoma Melanoma 20 95% 5.0% - Benign 40 3.8% 95% 1.3% Caricinoma 40 1.3% 5.0% 93.8% Khan et al. BMC Cancer (2018) 18:638 Page 12 of 20 Table 6 PH2 dataset: Comparison of proposed algorithm with HOG features existing methods The Histogram Oriented Gradients (HOG) features are originally introduced by Dalal [40] in 2005 for human Method Year Sensitivity % Specificity % Accuracy % detection. The HOG features are also called shape based Abuzaghleh et al. 2014 - - 91 [26] features because they work on the shape of the object. 
In our case, the HOG features are extracted from seg- Barata et al. [27] 2013 85 87 87 mented skin lesion and work efficiently because every Abuza et al. [43] 2015 - - 96.5 segmented lesion have their own shape. As shown in Kruck et al. [44] 2015 95 88.1 - Fig. 8, the HOG features are extracted from segmented Rula et al. [45] 2017 96 83 - lesion and obtain a feature vector of size 1 × 3780 because Waheed et al. [46] 2017 97 84 96 we have the size of segmented image is 96 × 128 and size of bins is 8 × 8. Thesizeofextracted features are Sath et al. [47] 2017 96 97 - too high and they effect on the classification accuracy. GUU et al. [48] 2017 94.43 81.01 - For this reason, we implement a weighted conditional Lei et al. [49] 2016 87.50 93.13 92.0 entropy with PCA (principle component analysis) on MRastagoo et al. 2015 94 92 - extracted feature vector. The PCA return the score [50] against each feature and then weighted entropy is utilized Proposed 2017 96.67 98.7 97.5 to reduced the feature space and select the maximum Data in bold are significant 200 score features. The weighted conditional entropy is define as: compare to ground truth images as shown in Fig. 7.The testing accuracy against each selected dermoscopic image K K P(i) are depicted in Table 2.FromTable 2 the accuracy of E = W . P(i, j)log (31) W i,j P(i, j) each image is above 90% and the maximum similarity i=1 j=1 rate is 98.10. From our analysis, the proposed segmenta- tion results perform well as compare to existing methods Where i, j denotes the current and next feature respec- [31, 37–39] in terms of border detection rate. tively. W denotes the weights of selected features, which i,j is selected between 0 and 1 0 ≤ W ≤ 1 and P(i, j) = ij Image representation W . n ij ij . Hence the new reduce vector size is 1 × 200. In this three types of features are extracted for the repre- W . n ij ij ij=1 sentation of an input image. 
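A reduction of this kind — PCA scores re-weighted by an entropy term to keep the 200 best HOG columns — can be sketched as follows. This is an interpretation, not the authors' code: the pairwise form of Eq. 31 is approximated here by a per-feature entropy weight, and `select_features` is a hypothetical name.

```python
import numpy as np

def select_features(X, keep=200):
    """Rank feature columns of X (n_samples x n_features); keep the top 'keep'.

    PCA scores are taken as the squared loadings of the leading principal
    component; an entropy weight (in the spirit of Eq. 31) rescales them.
    """
    Xc = X - X.mean(axis=0)
    # Leading right-singular vector = loadings of the first principal component.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    pca_score = vt[0] ** 2
    # Per-feature mass P(i) over the dataset, then an entropy-style weight.
    p = np.abs(X).sum(axis=0)
    p = p / p.sum()
    entropy_w = -p * np.log(p + 1e-12)
    score = pca_score * entropy_w
    idx = np.argsort(score)[::-1][:keep]
    return np.sort(idx)
```

Applied to a 1 × 3780 HOG matrix stacked over training images, this returns the indices of the 200 retained columns.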
Harlick features
Texture information of an input image is an important component, which is utilized to identify the region of interest, such as a lesion. For the texture information of the lesion, we extract the Harlick features [41]. The Harlick features are extracted from the segmented image, as shown in Fig. 8. In total, 14 texture features are implemented (i.e. autocorrelation, contrast, cluster prominence, cluster shade, dissimilarity, energy, entropy, homogeneity 1, homogeneity 2, maximum probability, average, variance, inverse difference normalized, and inverse difference moment normalized), and a feature vector of size 1 × 14 is created. After calculating the mean, range, and variance of each feature, the final vector is calculated with size 1 × 42.

Table 7 Proposed features fusion and selection results on ISIC-MSK dataset

| Method | Sensitivity (%) | Precision (%) | Specificity (%) | FNR (%) | FPR | Accuracy (%) |
| Decision tree | 92.95 | 93.1 | 94.30 | 6.9 | 0.07 | 93.1 |
| Quadratic discriminant analysis | 95.95 | 95.45 | 91.90 | 4.5 | 0.04 | 95.5 |
| Quadratic SVM | 96.25 | 96.10 | 95.60 | 3.8 | 0.03 | 96.2 |
| Logistic regression | 95.10 | 95.10 | 95.60 | 4.8 | 0.04 | 95.2 |
| Naive Bayes | 92.80 | 93.30 | 95.60 | 6.9 | 0.07 | 93.1 |
| Weighted KNN | 95.10 | 95.10 | 95.60 | 4.8 | 0.04 | 95.2 |
| Ensemble boosted tree | 95.10 | 95.10 | 95.60 | 4.80 | 0.04 | 95.2 |
| Ensemble subspace discriminant | 95.10 | 95.10 | 95.60 | 4.8 | 0.04 | 95.2 |
| Cubic KNN | 89.35 | 90.65 | 95.60 | 10.0 | 0.10 | 90.0 |
| Proposed | 96.60 | 97.0 | 98.30 | 2.8 | 0.01 | 97.2 |
Data in bold are significant

Color features
The color information of the region of interest has attained strong prevalence for the classification of lesions as malignant or benign. Color features provide quick processing and are deeply robust to geometric variations of lesion patterns. Three types of color spaces are utilized for color feature extraction: RGB, HSI, and LAB. As shown in Fig. 9, the mean, variance, skewness, and kurtosis are calculated for each selected channel. From Fig. 8, it is clearly shown that 1 × 12 features are extracted from each color space, and the total features of the three color spaces have a dimension of 1 × 36.

Table 8 Results for individual extracted sets of features using ISIC-MSK dataset

| Classifier | Features | Sensitivity % | Precision % | Specificity | FNR % | FPR | Accuracy % |
| DT | Color | 89.4 | 89.65 | 0.919 | 10.3 | 0.105 | 89.7 |
| | HOG | 92.25 | 93.10 | 0.944 | 6.9 | 0.06 | 93.1 |
| | Harlick | 80.95 | 82.15 | 0.888 | 18.3 | 0.18 | 81.7 |
| QDA | Color | 86.05 | 86.05 | 0.875 | 13.8 | 0.13 | 86.2 |
| | HOG | 94.30 | 93.85 | 0.894 | 6.2 | 0.05 | 93.8 |
| | Harlick | 70.73 | 73.25 | 0.769 | 26.6 | 0.26 | 73.4 |
| Q-SVM | Color | 95.6 | 95.75 | 0.956 | 4.1 | 0.03 | 95.9 |
| | HOG | 95.5 | 95.46 | 0.956 | 4.5 | 0.04 | 95.5 |
| | Harlick | 82.05 | 82.3 | 0.856 | 17.6 | 0.17 | 82.4 |
| LR | Color | 92.05 | 92.7 | 0.956 | 7.6 | 0.07 | 92.4 |
| | HOG | 95.1 | 95.1 | 0.956 | 4.8 | 0.04 | 95.2 |
| | Harlick | 81.45 | 82.25 | 0.875 | 17.9 | 0.18 | 82.1 |
| N-B | Color | 90.9 | 91.8 | 0.956 | 8.6 | 0.08 | 91.4 |
| | HOG | 93.95 | 94.2 | 0.956 | 5.9 | 0.05 | 94.1 |
| | Harlick | 82.2 | 83.95 | 0.913 | 16.9 | 0.03 | 83.1 |
| W-KNN | Color | 90.9 | 91.9 | 0.956 | 8.6 | 0.08 | 91.4 |
| | HOG | 93.95 | 94.2 | 0.956 | 5.9 | 0.05 | 94.1 |
| | Harlick | 81.15 | 84.2 | 0.938 | 17.6 | 0.08 | 82.4 |
| EBT | Color | 91.45 | 91.85 | 0.994 | 8.3 | 0.08 | 91.7 |
| | HOG | 93.35 | 93.4 | 0.944 | 6.6 | 0.06 | 93.4 |
| | Harlick | 81.45 | 82.25 | 0.875 | 17.9 | 0.18 | 82.1 |
| ESD | Color | 86.95 | 88.05 | 0.931 | 12.4 | 0.125 | 87.6 |
| | HOG | 95.5 | 95.45 | 0.956 | 4.5 | 0.04 | 95.5 |
| | Harlick | 78.0 | 79.5 | 0.875 | 21.0 | 0.21 | 79.0 |
| Cubic KNN | Color | 93.25 | 93.5 | 0.95 | 6.6 | 0.06 | 93.4 |
| | HOG | 93.15 | 92.7 | 0.973 | 7.2 | 0.07 | 92.8 |
| | Harlick | 76.6 | 76.6 | 0.788 | 23.1 | 0.23 | 76.9 |
| Proposed | Color | 95.85 | 95.85 | 0.963 | 4.1 | 0.03 | 95.9 |
| | HOG | 97.1 | 96.75 | 0.963 | 3.8 | 0.02 | 96.2 |
| | Harlick | 82.55 | 84.7 | 0.913 | 16.6 | 0.13 | 83.4 |

Features fusion
The goal of feature fusion is to create a new feature vector which contains more information than any individual feature vector. Different types of features extracted from the same image always indicate the distinct characteristics of an image.
BMC Cancer (2018) 18:638 Page 14 of 20 // Table 9 Confusion matrix for all set of extracted features using F P = (α , α , ...α )(j , j , ... j )(o , o , ... o ) 1 2 d 1 2 d 1 2 d ISIC-MSK dataset (32) Class Total images Melanoma Benign Where d denotes the dimension of extracted set of fea- Confusion matrix: Proposed features fusion and selection tures. As we know the dimension of each extracted feature Melanoma 130 99.2% 1% vector (i.e. HOG (1 × 200), Texture (1 × 42) and Color Benign 160 4.4% 95.6% (1 × 36). Then the fused vector is define as: Confusion matrix: Harlick features // ϒ F = α + ι j, α + ι o | α ∈ D, j ∈ E, o ∈ F (33) Melanoma 130 73.8% 26.2% It in an n dimensional complex vector, where n = Benign 160 8.8% 91.3% max(d(D), d(E), d(F)). From previous expression, the Confusion matrix: HOG features HOG has maximum dimension 1 × 200. Hence, make the size of E and F feature vector equally to D vector. For Melanoma 130 99.2% 0.8% this purpose adding zeros. For example below is a given Benign 160 5.0% 95.0% matrix, which consists of three feature vectors. Confusion matrix: Color features Melanoma 130 96.2% 3.8% ⎨ D = (0.20.7 0.90.110.100.56 ... 0.90) Benign 160 3.8% 96.3% E = (0.10.30.50.170.15) (34) F = (0.30.170.930.15) Then make the same size of feature vector, by adding characteristics of an image. The combination of these fea- zeros. tures effectively discriminate the information of extracted features and also eliminates the redundant information between them. The elimination of redundant informa- D = (0.2 0.7 0.9 0.11 0.10 0.56 ... 0.90) tion between extracted set of features provides improved E = (0.1 0.3 0.5 0.17 0.15 0.0 ... 0.0) (35) classification performance. In this work, we implemented F = (0.3 0.17 0.93 0.15 0.0 0.0 ... 0.0) a parallel features fusion technique. 
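As a compact sketch of the pipeline around Eqs. (32)-(36): zero-pad the shorter vectors, fuse them into one complex-valued vector, and measure the closeness of adjacent features with the Bhattacharyya distance used by the subsequent selection stage. This is our illustrative code, not the authors' implementation; `parallel_fuse` and `bhattacharyya_distance` are hypothetical names, and concatenating the two complex combinations of Eq. (33) is our reading of the notation.

```python
import numpy as np

def parallel_fuse(D, E, F):
    """Parallel fusion (Eqs. 33-35): zero-pad E and F to the length of
    the largest vector, then combine alpha + i*j and alpha + i*o."""
    n = max(len(D), len(E), len(F))
    pad = lambda v: np.pad(np.asarray(v, dtype=float), (0, n - len(v)))
    d, e, f = pad(D), pad(E), pad(F)
    return np.concatenate([d + 1j * e, d + 1j * f])

def bhattacharyya_distance(p, q):
    """Eq. (36): B_d = -ln( sum_u sqrt(f_i(u) * f_{i+1}(u)) ) for two
    features treated as normalised histograms; 0 means identical."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p, q = p / p.sum(), q / q.sum()          # normalise to distributions
    return -np.log(np.sum(np.sqrt(p * q)))

# The truncated example vectors of Eq. (34):
fused = parallel_fuse([0.2, 0.7, 0.9, 0.11, 0.10, 0.56],
                      [0.1, 0.3, 0.5, 0.17, 0.15],
                      [0.3, 0.17, 0.93, 0.15])
```

With the full-length vectors of the paper (HOG 1 × 200, texture 1 × 42, color 1 × 36), the same code pads the texture and color vectors to length 200 before fusing.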
The implemented technique efficiently fuses all the extracted features and also removes the redundant information between them. The fusion process is detailed as follows. Suppose $C_1$, $C_2$ and $C_3$ are the known lesion classes (i.e. melanoma, atypical nevi and benign), and let $\Psi = \{\psi \mid \psi \in R\}$ denote the testing images. Given the three extracted feature sets $D = \{\alpha \mid \alpha \in R^h\}$, $E = \{j \mid j \in R^t\}$ and $F = \{o \mid o \in R^c\}$, where α, j and o are the HOG, texture and color feature vectors respectively, the parallel fusion is defined by Eqs. (32) and (33). Finally, a novel feature selection technique is implemented on the fused feature vector to select the most prominent features for classification.

Features selection
The motivation behind the implementation of a feature selection technique is to select the most prominent features, which improves the accuracy and also makes the system faster in terms of execution time. The major reasons behind feature selection are: a) utilizing only a selected group of prominent features increases the classification accuracy by eliminating irrelevant features; b) a compact group of features is discovered that maximally increases the performance of the proposed method; c) a group of features is selected from the high-dimensional feature set for a dense and detailed data representation. In this work, a novel entropy-variance based feature selection method is implemented, which operates in two steps. First, it calculates the Bhattacharyya distance on the fused feature vector. The Bhattacharyya distance finds the closeness between two features; it is utilized for the classification of lesion classes and is considered more reliable than the Euclidean distance. Second, it applies an entropy-variance method to the closeness features and selects the most prominent features based on their maximum values. Entropy, in a nutshell, is the uncertainty measurement associated with the initialization of the closeness features; since the base classifier is highly dependent on its initial conditions for fast convergence and accurate approximation, the selected closeness features should have maximum entropy value. To the best of our knowledge, entropy, especially in conjunction with the Bhattacharyya distance and variance, has never been adopted for the selection of the most prominent features. Let $f_i$ and $f_{i+1}$ be two features of the fused vector $\Upsilon_F$. The Bhattacharyya distance is calculated as:

$$B_d = -\ln\left(\sum_{u \in \Upsilon_F} \sqrt{f_i(u)\, f_{i+1}(u)}\right) \qquad (36)$$

Entropy-variance is then performed on the closeness vector to find the best features based on their maximum entropy value:

$$E_V B_d = -\frac{\ln\left(f_{i+1} + \sigma^2\right)}{\ln\left(f_i + \sigma^2\right) + \ln\left(f_i - \sigma^2\right)} \sum_{f=1}^{\Upsilon_i} \frac{H_f^{0}}{\delta H} \log\frac{H_f^{0}}{\delta H} \qquad (37)$$

$$\delta H = \sum_{f=0}^{\Upsilon_i - 1} H_f \qquad (38)$$

where H denotes the closeness set of features. The size of the selected feature vector is 1 × 172. The selected vector is fed to a multi-class SVM for classification of the lesion (i.e. melanoma, benign); the one-against-all multi-class SVM [42] is utilized.

Results
Evaluation protocol
The proposed method is evaluated on four publicly available datasets, including PH2, ISIC (ISIC MSK-2 and ISIC UDA), and the collective ISBI set (ISBI 2016 and ISBI 2017). The proposed method is a conjunction of two primary steps: a) lesion identification; b) lesion classification (i.e. melanoma, benign, atypical nevi). The lesion identification results are discussed in their own section; here we discuss the lesion classification results. For classification, three types of features are extracted (i.e. texture, HOG, and color). The experimental results are obtained on each feature set individually and then compared with the proposed (fused and selected) feature vector. The multi-class SVM is selected as the base classifier, and its results are compared with nine classification methods: decision tree (DT), quadratic discriminant analysis (QDA), quadratic SVM (Q-SVM), logistic regression (LR), Naive Bayes (N-B), weighted K-Nearest Neighbor (W-KNN), ensemble boosted tree (EBT), ensemble subspace discriminant (ESDA), and cubic KNN (C-KNN). Seven measures are calculated to assess the performance of the proposed method: sensitivity, specificity, precision, false negative rate (FNR), false positive rate (FPR), accuracy, and the execution time for one image. The proposed method is implemented in MATLAB 2017a on a personal computer with a Core i7 processor and 16 GB of RAM.

Datasets & results
PH2 dataset
The PH2 dataset [51] consists of 200 RGB dermoscopic images of resolution 768 × 560. This dataset has three main divisions: a) melanoma; b) benign; c) common nevi, with 40 melanoma, 80 benign and 80 common nevus images. For validation, a 50:50 strategy is used for training and testing of the proposed method. Four experiments are performed on different feature sets (i.e. Haralick features, color features, HOG features, and the proposed features fusion and selection method) to compare the individual sets of features against the proposed feature set. The results of the proposed features fusion and selection with the entropy-variance method are given in Table 3: a maximum accuracy of 97.06%, sensitivity of 96.67%, specificity of 98.74%, precision of 97.06% and FPR of 0.01. The results for the individual feature sets, without the feature selection algorithm, are given in Table 4. The results of Tables 3 and 4 are confirmed by their confusion matrices in Table 5, which show that the proposed features fusion and selection method performs efficiently on the base classifier compared to the other classification methods. A comparison of the proposed method with existing methods on the PH2 dataset is also given in Table 6, which shows the authenticity of the proposed method.

ISIC dataset
The ISIC dataset [52] is an institutional database often used in skin cancer research. It is an open-source database of high-quality RGB dermoscopic images of resolution 1022 × 1022. ISIC incorporates many sub-datasets; we selected a) ISIC MSK-2 and b) ISIC UDA. From the ISIC MSK-2 dataset, we collected 290 images (130 melanoma and 160 benign). For validation of the proposed algorithm, we performed four experiments on the different feature types (i.e. Haralick features, color features, HOG features, and the proposed features fusion and selection vector), and the classification methods listed above are compared with the base classifier (multi-class SVM). The proposed features fusion and selection results are shown in Table 7, with a maximum accuracy of 97.2%, sensitivity of 96.60% and specificity of 98.30% on the base classifier. The individual feature set results are given in Table 8; the base classifier (multi-class SVM) performs well compared to the other methods. The base classifier results are confirmed by the confusion matrices given in Table 9.

From the ISIC UDA dataset, we selected a total of 233 images (93 melanoma and 140 benign). The proposed method results are given in Table 10, with a maximum accuracy of 98.3% and specificity of 100% on the base classifier. The results on the individual feature sets are given in Table 11, which shows that the proposed features fusion and selection results (Table 10) are significantly better than those of any individual feature set. The base classifier results are confirmed by the confusion matrices given in Table 12, which show the authenticity of the proposed method.

Table 10 Proposed features fusion and feature selection results on ISIC-UDA dataset
Method    Sensitivity (%)  Precision (%)  Specificity (%)  FNR (%)  FPR   Accuracy (%)
DT        87.25            90.65          97.1             10.7     0.12  89.3
QDA       79.75            88.60          99.3             16.3     0.19  83.7
Q-SVM     98.05            98.40          99.3             1.7      0.02  98.3
LR        94.8             96.35          99.3             4.3      0.04  95.7
N-B       88.5             91.00          96.4             9.9      0.10  90.1
W-KNN     83.85            91.20          100              12.9     0.16  87.1
EBT       95.2             95.85          97.9             4.3      0.4   95.7
ESD       89.6             89.75          92.1             9.9      0.09  90.1
L-KNN     81.7             90.25          100              14.6     0.18  85.4
Proposed  97.85            98.60          100              1.7      0.02  98.3
Data in bold are significant

Table 11 Results for individual extracted sets of features using ISIC-UDA dataset
Classifier  Features  Sensitivity (%)  Precision (%)  Specificity (%)  FNR (%)  FPR   Accuracy (%)
DT          Haralick  72.75            77.4           90.7             23.6     0.62  76.4
DT          HOG       70.15            69.4           69.3             30.0     0.30  70.0
DT          Color     86.55            87.35          91.4             12.4     0.13  87.6
QDA         Haralick  74.04            74.04          79.3             24.9     0.21  75.1
QDA         HOG       77.4             88.45          100              18.0     0.22  82.0
QDA         Color     82.65            83.15          87.9             16.3     0.17  83.7
Q-SVM       Haralick  73.7             77.25          89.3             23.2     0.73  76.8
Q-SVM       HOG       81.35            89.3           99.3             15.0     0.18  85.0
Q-SVM       Color     94.45            95.8           98.6             4.7      0.05  95.3
LR          Haralick  68.5             68.35          73.6             30.5     0.31  69.5
LR          HOG       78.5             88.9           100              17.2     0.21  82.8
LR          Color     93.4             94.65          97.1             5.6      0.05  94.4
N-B         Haralick  69.4             69.95          78.6             28.8     0.30  71.2
N-B         HOG       76.7             76.7           81.4             22.3     0.22  77.7
N-B         Color     86.0             89.05          95.7             12.0     0.13  88.0
W-KNN       Haralick  74.04            77.9           90.0             22.7     0.21  77.3
W-KNN       HOG       80.8             87.15          97.1             15.9     0.17  84.1
W-KNN       Color     88.55            92.3           98.6             9.4      0.11  90.6
EBT         Haralick  71.35            71.8           79.3             27.0     0.23  73.0
EBT         HOG       80.8             83.8           92.9             17.2     0.17  82.8
EBT         Color     90.5             91.55          95.0             8.6      0.09  91.4
ESD         Haralick  69.95            71.6           82.9             27.5     0.30  72.5
ESD         HOG       60.2             74.5           85.0             24.9     0.27  75.1
ESD         Color     83.9             86.5           93.6             14.2     0.15  85.8
Cubic KNN   Haralick  71.7             74.4           86.4             25.3     0.23  74.7
Cubic KNN   HOG       80.15            87.4           97.9             16.3     0.19  83.7
Cubic KNN   Color     85.5             90.2           97.9             12.0     0.14  88.0
Proposed    Haralick  73.65            78.5           91.4             22.7     0.22  77.3
Proposed    HOG       82.6             87.55          96.4             14.6     0.15  85.4
Proposed    Color     95.2             95.85          97.9             4.3      0.04  95.7

Table 12 Confusion matrix for all sets of extracted features using ISIC-UDA dataset
Class     Total images  Melanoma  Benign
Proposed features fusion and selection
Melanoma  93            95.7%     4.3%
Benign    140           -         100%
Haralick features
Melanoma  93            55.9%     44.1%
Benign    140           8.6%      91.4%
HOG features
Melanoma  93            68.8%     31.2%
Benign    140           3.6%      96.4%
Color features
Melanoma  93            92.5%     7.5%
Benign    140           2.1%      97.9%

ISBI 2016 & 2017
These datasets, ISBI 2016 [52] and ISBI 2017 [53], are based on the ISIC archive, the largest publicly available collection of quality-controlled dermoscopic images of skin lesions. They contain separate training and testing RGB samples of different resolutions: ISBI 2016 contains 1279 images (273 melanoma and 1006 benign), with 900 images for training and 350 for testing the algorithm; ISBI 2017 contains a total of 2750 images (517 melanoma and 2233 benign), including 2000 training and 750 testing images. Experiments are first performed on each dataset separately, obtaining classification accuracies of 83.2% on ISBI 2016 and 88.2% on ISBI 2017. These classification results are given in Tables 13 and 14 and supported by their confusion matrices in Table 16. After that, both datasets are combined and 10-fold cross-validation is performed; a maximum classification accuracy of 93.2% is achieved with the multi-class SVM, as presented in Table 15 and likewise confirmed by its confusion matrix in Table 16. The proposed method is also compared with [54], which achieved a maximum classification accuracy of 85.5%, AUC 0.826, sensitivity 0.853 and specificity 0.993 on the ISBI 2016 dataset; our method achieves a classification accuracy of 93.2%, AUC 0.96, sensitivity 0.930 and specificity 0.970 on the combined dataset, which confirms the authenticity and efficiency of our algorithm compared to [54]. Moreover, [55] reported a maximum AUC of 0.94 for skin cancer classification on 130 melanoma images, whereas our method achieved an AUC of 0.96 on 315 melanoma images. In [56] and [57], the classification accuracies achieved are 85.0% and 81.33% on the ISBI 2016 dataset. Upon comparison with [54–57], the proposed method performs significantly better on both the ISBI 2016 and ISBI 2017 datasets.

Table 13 Classification results on ISBI 2016 dataset
Method    Sensitivity (%)  Precision (%)  Specificity (%)  FNR (%)  FPR    Accuracy (%)  AUC
DT        63.0             62.0           79.0             28.5     0.370  71.5          0.63
QDA       68.0             65.5           79.0             26.4     0.320  73.6          0.74
Q-SVM     68.5             78.5           95.0             17.7     0.315  82.3          0.81
LR        67.0             65.0           79.0             26.1     0.330  72.9          0.69
NB        74.5             77.0           91.5             17.1     0.255  82.9          0.84
W-KNN     70.5             75.0           91.0             18.7     0.295  81.3          0.83
EBT       66.0             80.0           97.0             18.3     0.034  81.7          0.79
ESDA      72.5             55.0           90.0             18.5     0.275  81.5          0.83
Proposed  75.5             78.0           93.0             16.8     0.270  83.2          0.85
Data in bold are significant

Table 14 Classification results on ISBI 2017 dataset
Method    Sensitivity (%)  Precision (%)  Specificity (%)  FNR (%)  FPR    Accuracy (%)  AUC
DT        74.5             75.0           77               25.5     0.255  74.8          0.77
QDA       77.5             78.0           81               22.5     0.254  77.6          0.78
Q-SVM     86.5             86.5           87               13.8     0.135  86.2          0.92
LR        84.5             84.5           86               15.4     0.135  84.6          0.92
NB        79.5             80.0           83               21.5     0.212  79.5          0.80
W-KNN     87.5             88.0           88               12.2     0.125  87.8          0.92
EBT       86.0             83.5           92               14.2     0.140  85.8          0.91
ESDA      83.5             83.5           87.0             16.5     0.165  83.5          0.90
Proposed  88.5             88.0           91.0             11.8     0.120  88.2          0.93
Data in bold are significant

Table 15 Classification results on the combined ISBI 2016 & ISBI 2017 dataset
Method    Sensitivity (%)  Precision (%)  Specificity (%)  FNR (%)  FPR    Accuracy (%)  AUC
DT        87.5             88.0           86.0             12.4     0.125  87.6          0.86
QDA       80.0             80.0           79.0             20.0     0.200  80.0          0.86
Q-SVM     92.5             92.5           95.0             7.4      0.075  92.6          0.95
LR        92.0             91.5           95.0             8.2      0.08   91.8          0.95
NB        92.0             92.5           97.0             8.2      0.08   91.8          0.93
W-KNN     88.5             88.5           91.0             11.6     0.115  88.4          0.88
EBT       92.0             92.0           97.0             8.3      0.08   91.7          0.95
ESDA      89.5             89.5           91.5             10.4     0.105  89.6          0.94
Proposed  93.0             93.5           97.0             6.8      0.07   93.2          0.96
Data in bold are significant

Discussion
In this section, we summarize the proposed method in terms of tabular and visual results. The proposed method consists of two major steps: a) lesion identification; b) lesion classification, as shown in Fig. 1. The lesion identification phase has two major parts, enhancement and segmentation. The lesion enhancement results are shown in Fig. 3, demonstrating the efficiency of the introduced technique. The lesion segmentation method is then performed; its quantitative and visual results are given in Table 2 and Figs. 4, 5, 6 and 7. After this, multi-level features are extracted and fused based on the parallel strategy. A novel feature selection technique is then introduced and applied to the fused feature vector to select the best features, as shown in Fig. 8. Finally, the selected features are utilized by a multi-class SVM, which is selected as the base classifier. The purpose of features fusion and selection is to improve the classification accuracy and also make the system more efficient. Three publicly available datasets are utilized for classification: PH2, ISIC, and the Combined dataset (ISBI 2016 and ISBI 2017). The individual feature results on the selected datasets are presented in Tables 4, 8 and 11 and compared with the proposed features fusion and selection results in Tables 3, 7 and 10, which show that the proposed method performs significantly better in terms of classification accuracy and execution time. The base classifier results are also confirmed by the confusion matrices given in Tables 5, 9 and 12. The comparison with existing methods on the PH2 dataset is presented in Table 6, which shows the efficiency of the proposed method. Moreover, the proposed method is also evaluated on the combination of the ISBI 2016 and ISBI 2017 datasets, achieving a classification accuracy of 93.2% as presented in Table 15 and confirmed by the confusion matrix given in Table 16, which shows the authenticity of the proposed method compared to existing methods.

Table 16 Confusion matrices for the ISBI 2016, ISBI 2017, and combined datasets
ISBI 2016
Class     Benign  Melanoma  TPR (%)  FNR (%)
Benign    93%     3%        93       3
Melanoma  11%     53%       53       11
ISBI 2017
Class     Benign  Melanoma  TPR (%)  FNR (%)
Benign    91%     9%        91       9
Melanoma  14%     86%       86       14
Combined
Class     Benign  Melanoma  TPR (%)  FNR (%)
Benign    97%     3%        97       3
Melanoma  11%     89%       89       11
Data in bold are significant

Author details
1 Department of Computer Science, COMSATS Institute of Information Technology, Wah, Pakistan. 2 Department of Electrical Engineering, COMSATS Institute of Information Technology, Wah, Pakistan. 3 Department of Electrical Engineering, COMSATS Institute of Information Technology, Abbottabad, Pakistan. 4 College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia. 5 Department of Electrical Engineering, COMSATS Institute of Information Technology, Attock, Pakistan.

Received: 5 October 2017 Accepted: 30 April 2018

Conclusion
In this work, we have implemented a novel method for the identification and classification of skin lesions. The proposed framework incorporates two primary phases: a) lesion identification; b) lesion classification. In the identification step, a novel probabilistic method is introduced prior to feature extraction. An entropy-controlled variance based feature selection method is also implemented in combination with the Bhattacharyya distance, with the aim of considering only discriminant features.
The selected features are later utilized for classification in the final step using a multi-class SVM. The proposed method is tested on three publicly available datasets (i.e. PH2, ISBI 2016 & 17, and ISIC), and it is concluded that the base classifier performs significantly better with the proposed features fusion and selection method, compared to other existing techniques, in terms of sensitivity, specificity, and accuracy. Furthermore, the presented method achieved satisfactory segmentation results on the selected datasets.

Abbreviations
ABCD: Atypical, border, color, diameter; ACS: American Cancer Society; CAD: Computer Aided Design; C-KNN: Cubic KNN; DCT: Discrete Cosine Transform; DT: Decision tree; EBT: Ensemble boosted tree; ESDA: Ensemble subspace discriminant analysis; FFT: Fast Fourier Transform; FNR: False negative rate; GLCM: Gray level co-occurrence matrices; HOG: Histogram of Oriented Gradients; LBP: Local binary pattern; LOG: Laplacian of Gaussian; LR: Logistic regression; M.D: Mean Deviation; MLR: Multi-scale lesion-biased representation; PCA: Principal component analysis; QDA: Quadratic discriminant analysis; Q-SVM: Quadratic SVM; RGB: Red, Green, Blue; SIFT: Scale-invariant feature transform; SVM: Support vector machine; W-KNN: Weighted K-Nearest Neighbor

Funding
The authors extend their appreciation to the Deanship of Scientific Research at King Saud University for funding this work through research group grant # (RG-1438-034) and the Higher Education Commission, Pakistan, Startup Research Grant #: 21-260/SRGP/R&O/HEC/2014.

Availability of data and materials
The datasets analysed during the current study are in open access at the following links:
1. ADDI project repository: http://www.fc.up.pt/addi/ph2%20database.html
2. ISIC UDA archive: https://isic-archive.com/
3. ISBI 2016: https://challenge.kitware.com/#challenge/n/ISBI_2016%3A_Skin_Lesion_Analysis_Towards_Melanoma_Detection

Authors' contributions
MAK, TA, MS and AS conceived the study, participated in its design and coordination, and helped to draft the manuscript. KA, MA, SIA and AA provided guidance and support in every part of this work and assisted in the writing and editing of the manuscript. All authors read and approved the final manuscript.

Ethics approval and consent to participate
Not applicable.

Competing interests
The authors declare that they have no competing interests.

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References
1. Rigel DS, Friedman RJ, Kopf AW. The incidence of malignant melanoma in the United States: issues as we approach the 21st century. J Am Acad Dermatol. 1996;34(5):839–47.
2. Altekruse SF, Kosary CL, Krapcho M, Neyman N, Aminou R, Waldron W, Ruhl J, et al. SEER cancer statistics review, 1975–2007. Bethesda: National Cancer Institute; 2010.
3. Abuzaghleh O, Barkana BD, Faezipour M. Automated skin lesion analysis based on color and shape geometry feature set for melanoma early detection and prevention. In: Systems, Applications and Technology Conference (LISAT), 2014 IEEE Long Island. IEEE; 2014. p. 1–6.
4. Freedberg KA, Geller AC, Miller DR, Lew RA, Koh HK. Screening for malignant melanoma: a cost-effectiveness analysis. J Am Acad Dermatol. 1999;41(5):738–45.
5. Barata C, Ruela M, Francisco M, Mendonça T, Marques JS. Two systems for the detection of melanomas in dermoscopy images using texture and color features. IEEE Syst J. 2014;8(3):965–79.
6. Menzies SW, Ingvar C, Crotty KA, McCarthy WH. Frequency and morphologic characteristics of invasive melanomas lacking specific surface microscopic features. Arch Dermatol. 1996;132(10):1178–82.
7. Stolz W, Riemann A, Cognetta AB, Pillet L, Abmayr W, Holzel D, Bilek P, Nachbar F, Landthaler M. ABCD rule of dermatoscopy: a new practical method for early recognition of malignant melanoma. Eur J Dermatol. 1994;4(7):521–7.
8. Argenziano G, Fabbrocini G, Carli P, De Giorgi V, Sammarco E, Delfino M. Epiluminescence microscopy for the diagnosis of doubtful melanocytic skin lesions: comparison of the ABCD rule of dermatoscopy and a new 7-point checklist based on pattern analysis. Arch Dermatol. 1998;134(12):1563–70.
9. Mayer J. Systematic review of the diagnostic accuracy of dermatoscopy in detecting malignant melanoma. Med J Aust. 1997;167(4):206–10.
10. Braun RP, Rabinovitz H, Tzu JE, Marghoob AA. Dermoscopy research: an update. Semin Cutan Med Surg. 2009;28(3):165–71.
11. Katapadi AB, Celebi ME, Trotter SC, Gurcan MN. Evolving strategies for the development and evaluation of a computerised melanoma image analysis system. Comput Methods Biomech Biomed Eng Imaging Vis. 2017;1–8.
12. Jaworek-Korjakowska J. Computer-aided diagnosis of micro-malignant melanoma lesions applying support vector machines. BioMed Res Int. 2016;2016.
13. Safrani A, Aharon O, Mor S, Arnon O, Rosenberg L, Abdulhalim I. Skin biomedical optical imaging system using dual-wavelength polarimetric control with liquid crystals. J Biomed Opt. 2010;15(2):026024.
14. Patalay R, Craythorne E, Mallipeddi R, Coleman A. An integrated skin marking tool for use with optical coherence tomography (OCT). Proc SPIE. 2017;10037:100370Y.
15. Rajaram N, Nguyen TH, Tunnell JW. Lookup table-based inverse model for determining optical properties of turbid media. J Biomed Opt. 2008;13(5):050501.
16. Aharon O, Abdulhalim I, Arnon O, Rosenberg L, Dyomin V, Silberstein E. Differential optical spectropolarimetric imaging system assisted by liquid crystal devices for skin imaging. J Biomed Opt. 2011;16(8):086008.
17. Graham L, Yitzhaky Y, Abdulhalim I. Classification of skin moles from optical spectropolarimetric images: a pilot study. J Biomed Opt. 2013;18(11):111403.
18. Ushenko AG, Dubolazov OV, Ushenko VA, Novakovskaya OY, Olar OV. Fourier polarimetry of human skin in the tasks of differentiation of benign and malignant formations. Appl Opt. 2016;55(12):B56–B60.
19. Ávila FJ, Stanciu SG, Costache M, Bueno JM. Local enhancement of multiphoton images of skin cancer tissues using polarimetry. In: Lasers and Electro-Optics Europe & European Quantum Electronics Conference (CLEO/Europe-EQEC), 2017 Conference on. IEEE; 2017. p. 1–1.
20. Stamnes JJ, Ryzhikov G, Biryulina M, Hamre B, Zhao L, Stamnes K. Optical detection and monitoring of pigmented skin lesions. Biomed Opt Express. 2017;8(6):2946–64.
21. Pellacani G, Cesinaro AM, Seidenari S. Reflectance-mode confocal microscopy of pigmented skin lesions: improvement in melanoma diagnostic specificity. J Am Acad Dermatol. 2005;53(6):979–85.
22. Oh J-T, Li M-L, Zhang HF, Maslov K, Stoica G, Wang LV. Three-dimensional imaging of skin melanoma in vivo by dual-wavelength photoacoustic microscopy. J Biomed Opt. 2006;11(3):034032.
23. Swanson DL, Laman SD, Biryulina M, Ryzhikov G, Stamnes JJ, Hamre B, Zhao L, Sommersten E, Castellana FS, Stamnes K. Optical transfer diagnosis of pigmented lesions. Dermatol Surg. 2010;36(12):1979–86.
24. Rademaker M, Oakley A. Digital monitoring by whole body photography and sequential digital dermoscopy detects thinner melanomas. J Prim Health Care. 2010;2(4):268–72.
25. Moncrieff M, Cotton S, Hall P, Schiffner R, Lepski U, Claridge E. SIAscopy assists in the diagnosis of melanoma by utilizing computer vision techniques to visualise the internal structure of the skin. Med Image Underst Anal. 2001;53–6.
26. Abuzaghleh O, Barkana BD, Faezipour M. Automated skin lesion analysis based on color and shape geometry feature set for melanoma early detection and prevention. In: Systems, Applications and Technology Conference (LISAT), 2014 IEEE Long Island. IEEE; 2014. p. 1–6.
27. Barata C, Marques JS, Rozeira J. Evaluation of color based keypoints and features for the classification of melanomas using the bag-of-features model. In: International Symposium on Visual Computing. Berlin, Heidelberg: Springer; 2013. p. 40–49.
28. Gu Y, Zhou J, Qian B. Melanoma detection based on Mahalanobis distance learning and constrained graph regularized nonnegative matrix factorization. In: Applications of Computer Vision (WACV), 2017 IEEE Winter Conference on. IEEE; 2017. p. 797–805.
29. Barata C, Celebi ME, Marques JS. Melanoma detection algorithm based on feature fusion. In: Engineering in Medicine and Biology Society (EMBC), 2015 37th Annual International Conference of the IEEE. IEEE; 2015. p. 2653–6.
30. Almansour E, Jaffar MA. Classification of dermoscopic skin cancer images using color and hybrid texture features. IJCSNS Int J Comput Sci Netw Secur. 2016;16(4):135–9.
31. Ahn E, Kim J, Bi L, Kumar A, Li C, Fulham M, Feng DD. Saliency-based lesion segmentation via background detection in dermoscopic images. IEEE J Biomed Health Inform. 2017;21(6):1685–93.
32. Bi L, Kim J, Ahn E, Feng D, Fulham M. Automatic melanoma detection via multi-scale lesion-biased representation and joint reverse classification. In: Biomedical Imaging (ISBI), 2016 IEEE 13th International Symposium on. IEEE; 2016. p. 1055–8.
33. Wong A, Scharcanski J, Fieguth P. Automatic skin lesion segmentation via iterative stochastic region merging. IEEE Trans Inf Technol Biomed. 2011;15(6):929–36.
34. Mokhtar N, Harun N, Mashor M, Roseline H, Mustafa N, Adollah R, Adilah H, Nashrul MN. Image enhancement techniques using local, global, bright, dark and partial contrast stretching for acute leukemia images. Lect Notes Eng Comput Sci. 2009;2176.
35. Duan Q, Akram T, Duan P, Wang X. Visual saliency detection using information contents weighting. Optik. 2016;127(19):7418–30.
36. Akram T, Naqvi SR, Ali Haider S, Kamran M. Towards real-time crops surveillance for disease classification: exploiting parallelism in computer vision. Comput Electr Eng. 2017;59:15–26.
37. Barata C, Celebi ME, Marques JS. Melanoma detection algorithm based on feature fusion. In: Engineering in Medicine and Biology Society (EMBC), 2015 37th Annual International Conference of the IEEE. IEEE; 2015. p. 2653–6.
38. Ahn E, Bi L, Jung YH, Kim J, Li C, Fulham M, Feng DD. Automated saliency-based lesion segmentation in dermoscopic images. In: Engineering in Medicine and Biology Society (EMBC), 2015 37th Annual International Conference of the IEEE. IEEE; 2015. p. 3009–12.
39. Bozorgtabar B, Abedini M, Garnavi R. Sparse coding based skin lesion segmentation using dynamic rule-based refinement. In: MLMI@MICCAI. 2016. p. 254–61.
40. Dalal N, Triggs B. Histograms of oriented gradients for human detection. In: Computer Vision and Pattern Recognition (CVPR), 2005 IEEE Computer Society Conference on, vol. 1. IEEE; 2005. p. 886–93.
41. Haralick RM, Shanmugam K. Textural features for image classification. IEEE Trans Syst Man Cybern. 1973;6:610–21.
42. Liu Y, Zheng YF. One-against-all multi-class SVM classification using reliability measures. In: Neural Networks (IJCNN), 2005 IEEE International Joint Conference on, vol. 2. IEEE; 2005. p. 849–54.
43. Abuzaghleh O, Barkana BD, Faezipour M. Noninvasive real-time automated skin lesion analysis system for melanoma early detection and prevention. IEEE J Transl Eng Health Med. 2015;3:1–12.
44. Kruk M, Świderski B, Osowski S, Kurek J, Sowińska M, Walecka I. Melanoma recognition using extended set of descriptors and classifiers. Eurasip J Image Video Process. 2015;2015(1):43.
45. Ruela M, Barata C, Marques JS, Rozeira J. A system for the detection of melanomas in dermoscopy images using shape and symmetry features. Comput Methods Biomech Biomed Eng Imaging Vis. 2017;5(2):127–37.
46. Waheed Z, Waheed A, Zafar M, Riaz F. An efficient machine learning approach for the detection of melanoma using dermoscopic images. In: Communication, Computing and Digital Systems (C-CODE), International Conference on. IEEE; 2017. p. 316–9.
47. Satheesha TY, Satyanarayana D, Prasad MNG, Dhruve KD. Melanoma is skin deep: a 3D reconstruction technique for computerized dermoscopic skin lesion classification. IEEE J Transl Eng Health Med. 2017;5:1–17.
48. Gu Y, Zhou J, Qian B. Melanoma detection based on Mahalanobis distance learning and constrained graph regularized nonnegative matrix factorization. In: Applications of Computer Vision (WACV), 2017 IEEE Winter Conference on. IEEE; 2017. p. 797–805.
49. Bi L, Kim J, Ahn E, Feng D, Fulham M. Automatic melanoma detection via multi-scale lesion-biased representation and joint reverse classification. In: Biomedical Imaging (ISBI), 2016 IEEE 13th International Symposium on. IEEE; 2016. p. 1055–8.
50. Rastgoo M, Morel O, Marzani F, Garcia R. Ensemble approach for differentiation of malignant melanoma. In: The International Conference on Quality Control by Artificial Vision 2015. International Society for Optics and Photonics; 2015. p. 953415.
51. Mendonça T, Ferreira PM, Marques JS, Marcal ARS, Rozeira J. PH2: a dermoscopic image database for research and benchmarking. In: Engineering in Medicine and Biology Society (EMBC), 2013 35th Annual International Conference of the IEEE. IEEE; 2013. p. 5437–40.
52. Gutman D, Codella NCF, Celebi E, Helba B, Marchetti M, Mishra N, Halpern A. Skin lesion analysis toward melanoma detection: a challenge at the International Symposium on Biomedical Imaging (ISBI) 2016, hosted by the International Skin Imaging Collaboration (ISIC). arXiv preprint arXiv:1605.01397. 2016.
53. Codella NCF, Gutman D, Celebi ME, Helba B, Marchetti MA, Dusza SW, Kalloo A, et al. Skin lesion analysis toward melanoma detection: a challenge at the 2017 International Symposium on Biomedical Imaging (ISBI), hosted by the International Skin Imaging Collaboration (ISIC). arXiv preprint arXiv:1710.05006. 2017.
54. Yu L, Chen H, Dou Q, Qin J, Heng P-A. Automated melanoma recognition in dermoscopy images via very deep residual networks. IEEE Trans Med Imaging. 2017;36(4):994–1004.
55. Esteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, Blau HM, Thrun S. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115–8.
56. Ge Z, Demyanov S, Bozorgtabar B, Abedini M, Chakravorty R, Bowling A, Garnavi R. Exploiting local and generic features for accurate skin lesions classification using clinical and dermoscopy imaging. In: Biomedical Imaging (ISBI 2017), 2017 IEEE 14th International Symposium on. IEEE; 2017. p. 986–90.
57. Lopez AR, Giro-i-Nieto X, Burdick J, Marques O. Skin lesion classification from dermoscopic images using deep learning techniques. In: Biomedical Engineering (BioMed), 2017 13th IASTED International Conference on. IASTED; 2017. p. 49–54.

Journal: BMC Cancer (Springer Journals)

Published: Jun 5, 2018


