TY - JOUR
AU1 - Nancy, V
AU2 - Balakrishnan, G
AB - Abstract

Thermal sensors are an emerging technology in image processing applications such as face recognition, fault detection, object detection and classification, and navigation. Owing to this versatility, they have recently drawn considerable attention from researchers. Thermal sensors can sense an object regardless of the lighting conditions. Exploiting this advantage, we propose a novel scheme for spotting an object targeted by a specific thermal camera. Accomplishing this task opens the opportunity to guide visually impaired (VI) people adequately within the indoor environment. Identifying the obstacles in the user's path is a prerequisite for VI navigation. The image of the object is captured using the thermal camera and pre-processed to enhance its quality by suppressing the background, tuning the colour channels, etc. Noise in the thermal image is eradicated to a certain extent using a Gaussian smoothing process, followed by a Markov random field for constructing a Gaussian mixture model. The pattern is then deduced and classified with a least-squares support-vector machine. The experiment is tested at disparate times and distances, and the optimum solution is obtained. Delivering an accurate outcome with a short estimation period at an affordable size and cost is the main logic behind this fused concept.

1. INTRODUCTION

The International Classification of Diseases states that nearly 442 million people are affected by mild vision impairment, whereas 826 million people are impaired in near vision. According to the World Health Organization (11 October 2018), 1.3 billion people across the world suffer from vision impairment. Although 80% of vision impairment is unfortunately non-recoverable, it can be addressed with certain assistive aids such as wearing glasses, cataract surgery, etc. The most pressing challenge for visually impaired (VI) people is navigation, which makes them dependent on another person even for their simple daily activities. Outdoor navigation is more complex, but it can be managed to a certain level using aids based on the global positioning system (GPS) [1]. Indoor navigation is tedious as well, since the VI person has to memorize the positions of the obstacles in their path; such restricted navigation works only as long as the obstacle locations are not rearranged by a third party. VI people also face various social challenges because they lag in participating in activities, which hinders their job careers and social development. Vision impairment shrinks their participation in sports and other recreational activities [2, 3] besides the workplace, and it can push them into isolation from the crowd and stress them mentally through various emotions. Hence their personal, social and even professional lives are aggrieved. Fortunately, advanced technology has solved many hitches for VI people and provides them the good companionship they may have been longing for. Combining existing assistive aids such as the white cane and the Kinect device with the latest emerging sensors and smartphones enhances the lifestyle of VI people. Assistive technology is classified by the requirements of VI people into electronic travel aids, electronic orientation aids and position locator devices.
These modern assistive technologies potentially enhance the lifestyle of VI people in performing their regular activities. Their self-esteem is especially affected when they must depend on others for necessary tasks, which may burden them mentally. Owing to advances in commercially available aids that provide auditory output, such as talking thermometers, calculators, blood pressure monitors and weighing machines, they can perform some activities on their own. A wide range of technologies and solutions exists, which has paved the way for significant change and improvement in the lives of the VI; they have therefore extended their outreach to a certain extent, which builds up their confidence. Even when they do navigate, they still suffer the burden of carrying external devices [5, 6, 8, 9] along with them. On that note, a fused logic merging thermal imaging and VI people's navigation is proposed.

In recent trends, advances in image processing, along with implementations in frameworks of portable size and affordable price, have opened various research areas, leading to the invention of many ideas that support society. Thermal sensors were initially utilized in military applications to detect the intrusion of terrorists. Later, they were tested in several fields such as medicine, engineering, remote sensing, computer vision and machine learning. Thermal sensors have the advantage of capturing the targeted object or obstacle even in a dark environment, irrespective of the climatic conditions. The principle behind thermal imaging is black-body radiation: any object above absolute zero (−273.15 degrees centigrade) emits radiation according to its temperature. The object's apparent temperature is decided by the emissivity and intensity of that particular object. Since detection depends on temperature alone, thermal imaging is very effective for spotting objects during the night. Variations in atmospheric weather conditions such as rain, fog and smoke do not influence the reading of a particular object's temperature. Sunlight may alter the object's temperature within a predictable range, which is acceptable for indoor navigation. Thermal sensors combined with a variety of assistive technologies are a boon for the visually challenged sections of society.

Detecting and identifying an obstacle-free path for the easier, flawless navigation of VI people has been a topic of interest for researchers in recent years. Ensuring safe navigation in familiar and unfamiliar environments is a challenging task for the blind and visually impaired (BVI) [5, 7, 8]. Spotlighting those people, several technologies have emerged and been fabricated for their assistance, benefiting them fairly. By analysing the pros and cons of various research articles and the life experiences of VI people, a fused novel idea emerged. To the best of our knowledge, this logic of applying a thermal imaging prototype to classify the type of object and guide VI people in indoor navigation is innovative: obstacles and objects are determined without considering the lighting conditions. The major contributions of this research proposal are as follows: (i) detecting the object in the indoor environment via a thermal camera and identifying that thermal image with combined pre-processing and classification techniques to infer the class of that object;
(ii) guiding the VI people with a voice message referring to that class of object.

This research proposal thus presents a fresh logic for enhancing and classifying thermal images to guide VI people in the indoor environment. It is sectioned as follows: Section 2 reviews thermal image-based object detection techniques and the need for thermal image-based guidance in VI navigation. Section 3 illustrates the entire flow of the thermal imaging prototype with corresponding figures and tables. Section 4 demonstrates the merits of the novel logic via experimental results and discusses the statistics. Section 5 concludes the proposed work, and Section 6 portrays ideas that could enhance the current system.

2. RELATED WORK

Currently, visually impaired people (VIP) are very curious about navigational aids and technology. Unlike in earlier days, they are willing to incorporate and adapt recent trends to enhance their lifestyle. The smartphone is one mode of communication and navigation, as it has become mandatory in everybody's life; due to its compatibility and reliability, it is easily accessible, understandable and, moreover, user-friendly. Obstacle detection and avoidance are the main scope of research work where VIP are concerned.

2.1. Smartphone-Based Navigation

Li et al. [1] proposed a framework for dynamic obstacle detection that also provides safe navigation through a path-planning algorithm. In this work, the layout of the building is framed with a minimum spanning tree (MST) algorithm. Each compartment in the building is tracked using Prim's algorithm, which identifies the subset of edges for every vertex of the graph in the MST; as a result, the probability of available rooms in that building is determined. Once the picture is framed, it is tuned with a de-skewing process for clarity. The framed obstacle is then identified using a connected-component labelling approach. Dynamic obstacles are predicted with a Kalman filter, which estimates the internal state of the system from a series of measurements. This enhanced recursive filter reduces the mean squared error, as it is robust to the noisy environment. The user is then localized by constructing a semantic map and guided through speech recognition; however, the speech recognition should be improved further for noisy environments. The introduction of points of interest into the semantic map provides easy accessibility to the user.

Jafri et al. [2] track the user's movement and orientation in dynamic 3D space with the support of the Google Tango Tablet, a project developed by Google. This development kit is an Android device embedded with a powerful graphics processor (NVIDIA Tegra K1 with 192 CUDA cores) and sensors such as a motion-tracking camera, a 3D depth sensor, an accelerometer, an ambient light sensor, a barometer, a compass, GPS and a gyroscope. Using visual-inertial odometry techniques, the user's location is tracked with the inbuilt camera and sensors. The distance between the tablet and the ground is determined through Unity's raycast feature, a technique from video game development that tracks a projectile's direction in 2D or 3D space. A wireless headphone is the medium through which feedback is given instantly.
The user must hold the device at an angle of 60 degrees to obtain a feasible result. Obstacles are perceived from 3 cm above the surface plane. Various test results were obtained and tabulated using varying sizes of the same object. However, the device's depth sensor cannot detect very dark, shiny or transparent materials, and distortion is caused by non-reflective and uneven floors. Obstacles within 13 cm cannot be recognized by this system.

Dong et al. [3] focus on two factors, indoor mapping and localization, based on visual and inertial sensor data in a proposed system called ViNav. Two partitioning methods, density based and fingerprint based, enhance the speed of the assessment process. The shortest path from the user's current location to the destination is identified by Dijkstra's algorithm, which finds the optimum route across multiple floors at minimum cost. A k-nearest-neighbour (K-NN) algorithm selects the particular partition of the indoor map for matching patterns of the user's current location. The system localizes the user with an error below 1 m and achieves an orientation accuracy within 6 degrees.

An accurate and reliable data fusion framework [4] has been proposed for managing VIP navigation, further enhancing collision avoidance accuracy. The oriented FAST and rotated BRIEF (ORB) algorithm identifies the obstacle based on the depth of the image. The obtained features and patterns are matched by the K-NN algorithm. Outliers in the frame are eliminated with the random sample consensus (RANSAC) method, and collisions are further avoided by measuring proximity. RANSAC is used to solve the location determination problem, identifying the location of an object in an image by fitting a model to a series of points. The main drawback of this system is that it fails to detect large doors and walls.

2.2. Navigation based on assistive aids

Katzschmann et al. [5] provide both indoor and outdoor navigation by identifying only the free space in the framed environment. High and low obstacles are spotted with the guidance of an attitude and heading reference system. Infrared time-of-flight (TOF) distance sensors are used to frame the scene accurately. Feedback is given to the user by vibratory motors fixed above the waist. Prior, proper training is required for the user to overcome initial faults. Transparent objects such as glass and windows, as well as side obstacles, are not spotted.

An electronic NAVguide [6] is specifically designed for VI people to enhance the navigation process. This aid distinguishes obstacles as floor level, knee level, wet floors and ascending staircases without information overload. The NAVguide comprises six wide-beam ultrasonic sensors to estimate the distance between the object and the device. Vibration motors are fixed at the side and front of the user's shoe, and a wet-floor sensor at the bottom of the shoe detects slippery floors. Voice feedback is given within 150 cm via mono wireless headphones. Battery support is limited to 600 min. The flow of the process is interrupted if the user shifts the angle of the NAVguide. Descending staircases are not covered, and the system remains accident-prone on wet floors, since a wet floor is detected only after the user steps on it.
Mancini et al. [7] formulated a mechatronic system dedicated to assisting VI people in walking and running by identifying the lines and lanes in their path. The environment is pictured using a BlueFox MLC200wC colour red-green-blue (RGB) camera with a global shutter. The haptic device, represented by a set of two gloves, is a Bluegiga BLE112 module equipped with a Parallax C1026B002F vibration motor; communication with it is via Bluetooth Low Energy. The Shi-Tomasi corner detector finds the corners, or points of interest, in a framed environment and extracts features from them. The battery capacity is about 950 mAh, which lasts for approximately 8 to 10 hours. A shortcoming of this system is that the path painting must be clear for it to operate well.

A novel idea [8] is proposed exclusively for the social interaction of VI children. An audio bracelet for blind interaction is worn for experimental purposes to invoke sound based on its movement. The level of participant interaction is measured with Granger causality analysis of the motion recorded by the bracelet. Extracted data are deployed using Vicon Nexus software and processed further in Matlab. The main hitch is the difficulty of coordinating and controlling the group.

Tao et al. [9] formulated a predefined layout of the path and guide the user to the destination via navigation instructions. An iPhone 7 simulator is used as the virtual simulator for experimental evaluation. The Unity game engine is employed to create a virtual simulation in which BVI users interact with the physical environment. The motion of the target in the virtual environment is captured by a natural language parser, information access is provided to the user via the Sikuli script library, and the Stanford NLP package is exploited for parsing the navigation instructions. The key snags are that this structure does not define the type of obstacle targeted, and that verification and validation were not performed in a strident setup to ensure its competence.

Indoor navigation assistance with an RGB-D sensor [10] detects objects in a frame and classifies that information based on visual and range data. An Asus Xtion Pro Live camera hangs on the user's chest and senses the motion of objects in the environment with its compatible middleware software development kit, which is designed for motion tracking and game development. Plane detection is done by the RANSAC method [4], and using the colour information the respective path is planned with the help of polygonal and watershed floor segmentation. Hindrances to this aid are the influence of sunlight and an accuracy achieved only below 2 m.

2.3. Thermal Sensor-Based Object Detection

Thermal sensors [11] have been incorporated for tracking and detecting hand movement based on thermal images extracted from a FLIR S65 thermal camera. Hand movements are also captured by a Microsoft LifeCam visual camera under three conditions: light (i) on, (ii) off and (iii) backlit. The results are then compared to determine the optimum solution. Environmental influence plays a vital role, so the output may vary with the situation. When the hand moves fast it cannot be detected accurately, so deep learning concepts are used to overcome this problem.

Wearable Kinect sensors [12] recognize dynamic faces in a 3D environment and announce the corresponding face to the user through voice.
The Kinect's infrared camera uses depth-only information to detect objects in the dark. A histogram of oriented gradients (HOG) descriptor, principal component analysis and the K-NN algorithm [4] are utilized for face recognition. Accuracy is 50% when faces are detected in a dark environment. The backpack is a heavy weight to carry for a long time, and object detection is sensitive to sunlight.

Choi et al. [13] framed a system named KAIST to aid autonomous driving for BVI [9] people during both day and night. This system performs numerous tasks: (i) detecting obstacles in the driving path, (ii) identifying the free driving region and (iii) enhancing the process by estimating the depth and colour of the image. An RGB stereo camera (Sony ICX445 CCD) is used in the daytime, whereas a thermal camera (FLIR A655Sc) is used at night for capturing the dynamic environment. The captured information is processed on a Samsung Pro 850 solid-state drive, and the streams are aligned with a beam splitter without geometric distortion. The Levenberg-Marquardt method is utilized to register the three planes extracted from the sensors. Enhancement could be achieved further with deep learning concepts such as the shell concept.

Pedestrian detection [14] at both day and night is done with both visible and FIR cameras, and a comparative analysis is made of the outputs. A FLIR Tau 2 camera serves as the far-infrared camera for detection at night, and an IDS UI-3240CP visible camera captures images in day mode. HOG and local binary patterns (LBP) are used for feature extraction, and support-vector machine (SVM) classification is performed with patch- and part-based model inputs. A drawback is that the system is too expensive to be feasible, and the false rate is 5%.

Hermosilla et al. [15] recognize faces in spite of temporal variation using thermal image databases (the UCH Thermal Temporal Face Database and the PUCV Thermal Temporal Face Database). Face recognition is done by a set of algorithms, LBP [14], the Weber linear descriptor, the Gabor jet descriptor, the scale-invariant feature transform and speeded-up robust features, which are then compared. The major issues are that (i) no own database was created, so the system is dependent on external data, and (ii) noise in the eye position affects the process in all algorithms.

2.4. Navigation based on the Indoor Environment

Zhang et al. [16] proposed an indoor wayfinding system that formulates a 3D layout of the environment using a SwissRanger SR4000 (3D TOF camera). The constructed framework is fed to a 6-DOF SLAM graph to reduce pitch, roll and yaw error. The system runs on a client (HP Stream 7 32 GB Windows 8.1 tablet computer) and a server (Lenovo ThinkPad T430 laptop running 64-bit Ubuntu 12.04). The camera's egomotion, i.e. the change in pose with respect to the scene, is calculated with a visual range odometry algorithm. Limitations: (i) objects are detected only below 3 m; (ii) the system is inefficient when the user walks at a speed below 0.6 m/s.

Zheng et al. [17] proposed an aid, Travi-Navi, to address several issues during walking, such as shortcut identification, robust tracking and capturing quality images, and finally to help the user follow a navigation trace without deviating. A Samsung Galaxy S2, HTC Desire and HTC Droid capture the environment and store each image at 20 KB. The user is tracked using magnetic field distortion and Wi-Fi fingerprint sequences.
ORB [4] is used for feature extraction, a keypoint histogram for feature matching, and classification is then based on SVM [14]. The navigation trace covers about 2.8 km, and the user's walking speed is compensated against the trace using dynamic time warping. The curbs of this system are that (i) accuracy is achieved only up to 4 m, (ii) only single-floor navigation is possible and (iii) the user is tracked without deviation only within 9 m.

A wearable tactile device [18] assists VI people in navigation using the UltraCane and Tom Pouce (intelligent canes) and the Tactipad (a cube-shaped wearable device for gist display). It focuses on obstacle avoidance, orientation and creating awareness of the environment. Drawbacks of this system are that (i) overhanging obstacles are not covered and (ii) navigation instructions based on the user's movement are not provided.

Kang et al. [19] locate the object using a deformable grid based on its shape variation. A glass-type wearable device captures the environment and sends the images to a laptop for pre-processing. A block matching algorithm and optical flow formulate the transformation of the image in the grid to identify the object. The displacement of each vertex is computed by perspective projection geometry, and a vertex deformation function is utilized to picture the object. A spatio-temporal filter is used as the noise filter for dynamic objects. Limitations: (i) obstacle detection fails when the user deviates from the path; (ii) the system fails to detect non-textured objects such as doors and walls.

3. PROPOSED METHODOLOGY

We formulated a novel framework for identifying and classifying an object from its thermal image, considering the pros and cons of the various methodologies above. With the help of our framework, VI people can independently identify the obstacles in their path within the indoor environment. Since thermal images allow objects to be detected even at night, this is an added advantage for the user. First, the image of the target (I) is captured; it then undergoes pre-processing techniques such as background subtraction, edge detection and feature detection to enhance the target image (I). Once pre-processing is done, the image (I) is subjected to classification to identify the class of the respective target. In our proposed system, we chose four classes of objects for the experiments: chair, phone, key and specs (spectacles). The hardware and software specifications of this proposal are (i) a Fluke VT02 infrared camera (Fig. 1), (ii) an Intel® Core™ i5-3210M 64-bit processor with 8 GB RAM and (iii) MATLAB version R2018a.

Figure 1. Fluke VT02 thermal camera, shown in three views for clarity: (a) side, (b) rear and (c) front.

This thermal camera visualizes temperatures ranging from −10 to 250 degrees centigrade, and its minimum detection range is 50 cm. The main flow of the process is pictured in Fig. 2: the image acquisition module reads the image (I) and feeds it as input to the image enhancement module, where the mandatory pre-processing techniques improve the quality of the image (I); the corresponding points and edges of the image (I) are then identified, leading to contour segmentation in the identification module; finally, the classification module decides the respective class of the image (I). A high-level sketch of this flow is given below.
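Before the individual stages are detailed, the flow of Fig. 2 can be summarized as a short MATLAB outline. Every function name below is a hypothetical placeholder for one of the stages elaborated in Sections 3.1-3.7, and the file name is assumed; this is a sketch of the pipeline structure, not the authors' released code.

```matlab
% Hypothetical end-to-end pipeline mirroring Fig. 2; each helper stands in
% for one of the stages described in Sections 3.1-3.7.
I   = imread('target_thermal.png');    % image acquisition (file name assumed)
J   = enhanceLuminosity(I);            % 3.1 histogram equalization on the L* channel
J   = balanceColour(J);                % 3.2 colour correction / white balance
pts = detectCornerPoints(J);           % 3.3 minimum-eigenvalue corner points
pln = groupPointsByPlane(pts);         % 3.4 homography: group points lying on one plane
seg = smoothAndSegment(J);             % 3.5 Gaussian smoothing + MRF/GMM segmentation
cnt = extractContour(seg);             % 3.6 contouring of the binarized segmentation
cls = classifyWithLSSVM(cnt);          % 3.7 LS-SVM (RBF kernel) classification
```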
Figure 2. Main module.

Figure 3. Architectural flow.

The overall architectural flow of this process is briefed in Fig. 3. The images are acquired via the thermal camera and set as the seed for the subsequent image processing methods. A database is created with four classes of images, and the detection rate is estimated for each image. Once image acquisition is completed, image enhancement is done with the collection of pre-processing techniques described in detail below.

3.1. Histogram Equalization

The intensity of the acquired image is normalized to attain clarity in each of the n pixels of the image. The indexed image I(x), with a resolution of 640 × 480, is transformed to an RGB image for easier processing. The RGB colour space of the image is then converted to the L*a*b* colour space (CIE LAB), defined by the International Commission on Illumination (CIE). It expresses the colour differences in the image through three parameters: L*, the brightness layer; a*, the colour along the red-green axis; and b*, the colour along the blue-yellow axis. Only the luminosity parameter is altered, keeping the a* and b* channels constant; that is, contrast adjustment acts on the luminosity alone. By default, the maximum luminosity is set to 100 for the experiments. The image is then converted back to the RGB colour space. Contrast adjustment is contrived with the functions histeq() and adapthisteq(): in histogram equalization (Fig. 4) the intensity values of the image's pixels are altered as a whole, whereas in adaptive histogram equalization each pixel's value is corrected separately, so that minute details can be highlighted in the thermal image.

Figure 4. Output for the sample object phone after the histogram equalization process.

The flow of the histogram equalization process for the thermal image of the object phone is exposed in Fig. 4, where the sample thermal image is given as input to the histogram function. A minimal sketch of this stage follows.
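The luminosity-only adjustment described above can be sketched in a few lines of MATLAB (Image Processing Toolbox). This is a minimal illustration under stated assumptions: the input is a 640 × 480 indexed thermal image on disk, and the file name is hypothetical.

```matlab
% Equalize contrast on the L* channel only, leaving a* and b* untouched.
[X, map] = imread('phone_thermal.png');   % hypothetical file name
I   = ind2rgb(X, map);                    % indexed image -> RGB
lab = rgb2lab(I);                         % RGB -> CIE L*a*b*
L   = lab(:,:,1) / 100;                   % L* lies in [0, 100]; adapthisteq expects [0, 1]
lab(:,:,1) = adapthisteq(L) * 100;        % adaptive equalization of luminosity
% lab(:,:,1) = histeq(L) * 100;           % alternative: global histogram equalization
J   = lab2rgb(lab);                       % back to RGB for the next stage
```

The commented-out histeq() line stretches the histogram of the whole image at once, whereas adapthisteq() equalizes local tiles, which is why the adaptive variant highlights the minute details of the thermal image.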
3.2. Colour Correction

The normalized image with equalized intensity values is fed as input to the colour correction sub-module. The RGB image is remoulded to a gray-scale image to ease the background subtraction process. Working on the pixel values of the image, the colours are equalized to suppress the background; as a result an intensified image is obtained, which furnishes the way for detecting the captured object. The colour balancing pattern for the object phone is exposed in Fig. 5. Colour balancing is a process that changes the overall light of the image to enhance its quality.

Figure 5. Colour balancing output for the object phone.

The colour-balanced pattern is obtained by neutralizing the gray light in the image; this technique is also referred to as white balance or neutral balance. The chromadapt function [20] is utilized to balance the RGB colour space of the input image by formulating a colour correction matrix comprising x, y and z values. The gray value is computed from the matrix (A(y, x, 1), A(y, x, 2), A(y, x, 3)), where A is the input thermal image, the x value is 280, the y value is 100 and the z value is increased iteratively as 1, 2 and 3, respectively, to obtain the balanced image shown in Fig. 5.

3.3. Corner Correction

Corner correction is widely used in object tracking and identification because it is not influenced by factors such as lighting conditions. The corner correction matrix is formulated using the concept of minimum eigenvalues, which treats the sample input image as a random process. The covariance between the image pixels is estimated to identify the major variations, or corner peaks, in the image, as shown in Fig. 6.

Figure 6. Distribution of the corner points for the sample object phone.

The corner peak value of each pixel is located with the help of the regional maxima function. This function returns a binary image that locates the peak values of the gray-scale image to bound a corner of the indexed image: the corner peak values are represented as one, and the remaining values are assigned zero. That matrix is later adjusted to plot the corner points of the indexed image as a whole. A sketch of this corner detection step follows.
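A minimal sketch of the minimum-eigenvalue corner step, assuming the Image Processing and Computer Vision Toolboxes are available; the number of plotted points is illustrative, not the paper's exact setting.

```matlab
% Locate corner peaks of the colour-balanced image J from the previous stage.
gray  = rgb2gray(J);                      % gray-scale copy for the corner metric
pts   = detectMinEigenFeatures(gray);     % Shi-Tomasi / minimum-eigenvalue corners
bw    = imregionalmax(im2uint8(gray));    % regional maxima: 1 at peaks, 0 elsewhere
[r,c] = find(bw);                         % coordinates of candidate corner peaks
imshow(gray); hold on;
plot(pts.selectStrongest(50));            % overlay the 50 strongest corner points
```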
3.4. Homography

The identified corner points are segregated into various plots based on their occurrence in their respective planes. This approach is very convenient for coordinating and mapping the points scattered over various planes: the points lying on the current plane are grouped together for later classification. Here the seed is set for reproducibility to form a 3 × N matrix for locating the points on the same plane. The pixel values of the original image and the indexed image are compared, and the corresponding precision values are gathered into that 3 × N matrix. Matrix multiplication over these values is performed through a permutation and combination method; as a result, the fused pixel values are plotted as the output based on the calibration process.

3.5. Gaussian Noise Removal

Gaussian noise is common in digital images and in images dealing with radiation. It provides a decent model for a variety of imaging systems and accommodates a positive outcome: it captures sensor noise arising from poor illumination, high temperature or transmission error. A Gaussian function is applied to smooth the blurred image, a technique referred to as Gaussian smoothing or blur. In our system, image segmentation is performed with the Markov random field (MRF) technique, in which a joint probability distribution is defined over an undirected graph. In this undirected graph, K represents the collection of nodes and C represents a clique; a clique C is a subset of K whose nodes are pairwise connected. The Markov network for the object phone at distances of 1, 3 and 6 m is illustrated in Fig. 7, where K = {P1, P3, P6} and C = {(P1, P3), (P3, P6), (P1, P6)}.

Figure 7. Sample Markov network for the object phone.

Each clique C in the MRF has a deciding parameter called the potential function $\psi_C$, which is mathematically denoted as

$$\psi_C : C \to \mathbb{R}_{>0},$$

so the potential function never becomes zero. The joint distribution of the MRF is the normalized product of the potential functions of all cliques C:

$$P(K_1, \dots, K_n) = \frac{1}{Z} \prod_{C \in \mathcal{C}} \psi_C, \qquad Z = \sum_{k \in K} \prod_{C \in \mathcal{C}} \psi_C, \qquad \psi_C = e^{-E_C},$$

where $\mathcal{C}$ denotes the set of cliques. Z is the partition function that normalizes the product into the probability distribution $P(K)$. The potential function is given in terms of Gibbs or Boltzmann distributions [22] with energy function $E_C$; the energy function helps to arrange identical plane shapes without overlapping in an area. The energy-of-label-field function segments the image with parameters such as its width, height and class. Finally, the Gaussian mixture model (GMM) is formulated from the image, the segmentation and its class number. The default Neix function [21] organizes the clique by assigning the potential value depending on the neighbouring pixels. In our system, the potential value $\psi_C$ is assigned as 0.5, the class number is set to 4 and 20 iterations are run.

3.6. Contouring

Contouring is the crucial step in object identification and detection. A contour represents the outline of the object as a curve of points or line segments, and it defines the shape of the object approximately. Extracting the contours resolves the quantity of objects in the image and measures their size; in addition, it segregates the shapes of the objects in the image. The binary image is fed as the input image, obtained via threshold values: the GMM image is converted to a binary image by suppressing the red and blue channels, with the threshold set above 200. The contour image of the object is determined from this threshold value. Each object varies with the distance parameter within a certain range, which assists in classifying its concerned class. A simplified sketch of the smoothing, mixture-model and contouring steps follows.
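The smoothing, mixture-model and contouring steps can be approximated as below. The paper uses a File Exchange MRF package [21] for the segmentation; this sketch substitutes MATLAB's fitgmdist (Statistics and Machine Learning Toolbox) for the MRF/GMM stage, so it is a simplified stand-in rather than the exact method, and the Gaussian sigma is an assumption.

```matlab
% Gaussian smoothing, 4-class Gaussian mixture segmentation, then contouring.
G   = im2uint8(rgb2gray(J));                         % gray-scale, 0-255 range
S   = imgaussfilt(G, 2);                             % Gaussian smoothing (sigma = 2, assumed)
gmm = fitgmdist(double(S(:)), 4, ...                 % 4 classes, as in the paper
                'RegularizationValue', 1e-4);
seg = reshape(cluster(gmm, double(S(:))), size(S));  % per-pixel class labels
bw  = S > 200;                                       % binarize with the stated threshold (> 200)
B   = bwboundaries(bw);                              % contours: outlines of the object(s)
imshow(bw); hold on;
for k = 1:numel(B)
    plot(B{k}(:,2), B{k}(:,1), 'r', 'LineWidth', 1); % draw each detected outline
end
```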
3.7. SVM Classification

The least-squares support-vector machine (LS-SVM) is a machine learning algorithm used for segregating supervised data by extracting their features. Each object is denoted as a point in n-dimensional space, where n indicates the number of features of the object. The objects are then distinguished by a hyperplane that separates the classes of objects. The algorithm takes an object's data as input and outputs a separating hyperplane that classifies the data. The SVM model can classify both linear and non-linear data in n-dimensional space: non-linear data that cannot be separated can be made separable by increasing the dimensionality. Manually predicting the dimensional space is not practical, so the kernel trick comes into the picture. The kernel function identifies the appropriate plane in the higher-dimensional space into which the proposed dataset can fit in order to classify it. In this work, the radial basis function (RBF) kernel is chosen, as it distinguishes the data points in an infinite-dimensional space.

From the cluster of data points in this space, the specific points nearest to the hyperplane from both classes ($X_i$, $X_j$) are chosen as support vectors (SVs). SVs that are not separable by a perfect hyperplane depend on the LS-SVM, which normalizes and trains the respective data for classification. The distance between the SVs and the hyperplane is computed as the margin ($X_i - X_j$); this value depends only on the distance from the hyperplane. The optimal hyperplane is the one with the maximum margin. The radial basis kernel is defined as

$$K(X_i, X_j) = \exp\left(-\gamma \lVert X_i - X_j \rVert^2\right), \qquad \gamma = \frac{1}{\sigma^2}.$$

Here $\lVert X_i - X_j \rVert^2$ is the squared distance between the SVs of the two object classes, and the boundary is decided from the maximum distance between them. The perfect hyperplane is chosen based on the decision boundary of the classes: a large decision boundary reduces the occurrence of errors, whereas smaller decision boundaries result in overfitting. The RBF kernel output for the object chair at a distance of 1 m is shown in Fig. 8. This graph describes how well the parameters of the trained model coincide with the input image: the red curve represents the trained model, whereas the blue dots represent the real-time input image.

Figure 8. LS-SVM RBF model for trained and real-time input image.

Gamma (γ) is a regularization parameter that is tuned to 1 to obtain an optimum fit of the input image against the training model. From Fig. 8 we can infer that our training model fits the input data well. Similar graphs are obtained for the remaining dataset, the key, phone and spectacles, using this LS-SVM classification. Hence the likelihood of determining the object precisely is exceptional; the detection range is fairly positive, and a detailed discussion follows in the succeeding section. A minimal classification sketch is given below.
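MATLAB does not ship an LS-SVM implementation (the usual route is a third-party toolbox such as LS-SVMlab), so the sketch below uses the built-in fitcecoc/templateSVM with a Gaussian (RBF) kernel as a standard-SVM stand-in to illustrate the same kernel trick. The feature matrix, labels and query sample are hypothetical.

```matlab
% RBF kernel of the paper: K(Xi, Xj) = exp(-gamma * ||Xi - Xj||^2), gamma = 1/sigma^2.
rbf = @(xi, xj, gamma) exp(-gamma * norm(xi - xj)^2);

% Hypothetical data: 40 samples, 5 descriptors each (e.g. contour and
% temperature features), labels 1..4 for chair/key/phone/specs.
X    = rand(40, 5);
y    = randi(4, 40, 1);
Xnew = rand(1, 5);

t   = templateSVM('KernelFunction', 'gaussian', 'KernelScale', 1); % gamma tuned to 1
mdl = fitcecoc(X, y, 'Learners', t);      % multi-class SVM over the four classes
cls = predict(mdl, Xnew);                 % predicted class of a new feature vector
```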
4. EXPERIMENTAL RESULTS

A compilation of four object classes, chair, phone, key and spectacles (specs), was tested for accuracy and reliability. The choice of objects was targeted at the daily usage of VI people. All objects were framed within the indoor environment under two distinct parameters, distance (d) and time (t). The specimens of the objects stored in the database are shown in Fig. 9.

Figure 9. Visible image for the sample objects.

TABLE 1. Temperature (°F) with respective distance (m) and time interval.

                                   Without fan            With fan
Time                   Object      1 m    3 m    6 m      1 m    3 m    6 m
7:20 am to 8:00 am     Chair       91.9   91.8   91.8     94.5   94.6   94.4
                       Key         95.0   94.7   94.6     94.7   94.9   94.1
                       Phone       95.9   95.6   95.8     95.3   95.3   95.2
                       Specs       94.5   94.8   94.8     94.3   94.5   94.2
11:30 am to 12:10 pm   Chair       94.7   94.5   94.4     94.8   94.8   94.5
                       Key         93.3   93.3   93.2     94.1   94.1   93.7
                       Phone       101.6  99.6   99.8     99.6   97.5   97.0
                       Specs       94.4   93.9   94.0     94.4   94.3   94.0
4:45 pm to 5:15 pm     Chair       91.9   91.8   91.8     94.5   94.4   94.6
                       Key         95.0   94.7   94.6     94.7   94.9   94.1
                       Phone       95.9   95.6   95.8     95.3   95.2   95.3
                       Specs       94.4   93.9   94.0     94.4   94.3   94.0
8:00 pm to 8:30 pm     Chair       92.7   92.7   94.4     92.9   92.8   92.6
                       Key         92.6   92.5   92.2     93.0   93.0   92.9
                       Phone       96.1   95.5   95.1     94.8   94.7   94.4
                       Specs       92.7   92.6   92.5     92.9   92.9   93.0

These objects were subjected to two conditions: (i) with fan and (ii) without fan. Since the thermal image depends on the emissivity and the temperature range, the above conditions were tested for reliability in temperature. The structure of the dataset is elaborated in Table 1. The object was captured at distances of 1 m, 3 m and 6 m under the two conditions; in addition, the temperature range was determined by capturing the object at different time intervals. The inference from Table 1 is that there is only minor fluctuation in the temperature range across the conditions. The temperature of the objects under the with-fan condition is slightly greater than without the fan, except for the object phone: the phone's temperature is high when exposed to the fan compared with the other objects. The temperature range fluctuates in correspondence with the environmental conditions, so the dataset was created by varying these parameters to find the optimum temperature range. Based on the analysis of the table, the temperature ranges at the four contrasting time intervals are elucidated in Fig. 10.

Figure 10. Variation in temperature (°F) at time interval 8:00 pm-8:30 pm.

The figure shows that the temperature range of the chair lies between 90 and 92.7 °F, the key within 94-95 °F, the phone between 95 and 97 °F and, finally, the spectacles in the range 92.6-93 °F. The SVM classifier resolves the identification issue by considering this temperature range as one of its parameters; an illustrative sketch follows below. A true positive is plotted only if the detected image (I) matches the indexed image I(x); otherwise the fluctuation in the values returns it as a false negative (Fig. 11).

Figure 11. False negative pattern for the objects.
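As a hedged illustration of how the ranges read off Fig. 10 could feed the classifier as one parameter, the snippet below maps an observed temperature to candidate classes. The ranges are those quoted above for the 8:00 pm-8:30 pm interval, and the observed value is hypothetical; note that the key and phone ranges meet at 95 °F, which is precisely why temperature is only one of several SVM features.

```matlab
% Map an observed temperature (degF) to candidate classes via the Fig. 10 ranges.
ranges = struct('chair', [90.0 92.7], 'specs', [92.6 93.0], ...
                'key',   [94.0 95.0], 'phone', [95.0 97.0]);
t = 94.6;                                  % hypothetical observed temperature
names = fieldnames(ranges);
for k = 1:numel(names)
    r = ranges.(names{k});
    if t >= r(1) && t <= r(2)
        fprintf('candidate class: %s\n', names{k});  % prints: candidate class: key
    end
end
```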
From Fig. 12 we can witness that the indoor room temperature is 89.5 °F, which is low compared with the range of the individual objects' temperatures. The ambient temperature was analysed by subjecting the objects chair, key, phone, spectacles, sofa, table and cupboard to both indoor and outdoor environments; the values are shown in Table 2.

Figure 12. Environmental temperature for the entire room at time interval 11:30 pm-12:30 pm.

TABLE 2. Comparison of sample objects' temperature (°F) in indoor and outdoor environments.

                                   Indoor environment     Outdoor environment
Time                   Object      1 m    3 m    6 m      1 m    3 m    6 m
8:30 am to 3:30 am     Chair       91.5   91.6   91.8     92.4   92.7   92.8
                       Key         94.7   94.6   94.5     95.8   95.6   95.3
                       Phone       95.9   95.3   93.1     96.8   96.5   96.2
                       Spectacles  93.8   93.5   93.3     94.8   94.5   94.5
                       Sofa        96.7   96.5   96.3     97.6   97.5   97.3
                       Cupboard    97.8   97.4   97.2     98.8   98.5   98.3
                       Table       96.3   96.2   96.1     96.8   98.4   96.2
2:30 am to 3:30 am     Chair       92.5   92.3   92.6     93.5   93.4   93.9
                       Key         94.4   94.6   94.6     95.6   95.5   95.4
                       Phone       95.9   95.6   95.3     95.9   96.9   96.8
                       Spectacles  93.5   93.4   93.2     94.3   94.2   94.5
                       Sofa        96.7   96.8   96.9     97.5   97.3   97.2
                       Cupboard    98.8   98.4   98.6     100.8  100.2  99.9
                       Table       101.4  101.5  101.7    102.6  102.4  102.7
7:30 pm to 9:00 pm     Chair       90.9   90.4   90.5     91.7   91.4   91.4
                       Key         93.7   93.5   93.4     94.5   94.3   94.1
                       Phone       95.4   95.3   95.2     94.3   94.2   94.3
                       Spectacles  92.4   92.3   92.5     93.5   93.7   93.6
                       Sofa        95.7   95.6   95.2     96.7   96.5   96.3
                       Cupboard    97.3   97.5   97.4     98.5   98.7   98.6
                       Table       98.5   98.3   98.3     99.5   99.5   99.8

From the table we can notice that the temperature range of the objects in the outdoor environment is greater than indoors. Here the new sample objects sofa, cupboard and table are included for identifying the optimum solutions. The accuracy of our proposed system is validated by estimating the mean $\mu(x)$ and standard deviation $\sigma(x)$ of our observed targets. The $\mu(x)$ and $\sigma(x)$ values were calculated for each distinct object at each distance and tabulated, and from those values a column chart was plotted, as shown in Fig. 13. Encapsulating the $\sigma(x)$ values, a trend line was created that denotes the behavioural pattern between the input data and the tested data; the resulting R² value is 0.8998. The R² = 0.8998 in Fig. 13 depicts how well the input data correlate with the trained data (a sketch of this computation appears at the end of this section). From the chart it is clear that all the targeted objects fall on the fitting line. Hence the accuracy of detecting and identifying the class of the object reaches 90%, with an error rate approximately equal to 10%.

The user-end application was designed considering two parameters: portability and accessibility. The proposed system is user friendly to VI people owing to its methodology and system design. The flow of the application is shown in Fig. 14.

Figure 13. Statistical linear variation of target objects using error bars.

Figure 14. Framework for the user-end application.

The VI user starts the application on the iOS device with a specific voice message, which switches the Fluke thermal camera to active mode. The captured images are transferred to the Matlab cloud over Wi-Fi. Once the images are retrieved from the iOS device, our proposed algorithm in the Matlab cloud is activated: the images are pre-processed and classified with the LS-SVM algorithm, and the output is finally given to the user through a voice message. With the support of this work, VI users will be able to perform certain tasks, such as identifying the key, chair, phone and spectacles, in their daily routines.
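A short sketch of the statistics referenced above: per-object mean and standard deviation, plus the coefficient of determination R² of a linear trend line. The vectors below reuse the phone readings from Table 1 as a stand-in for one object's data; with only three points the fit is purely illustrative.

```matlab
% Mean, standard deviation and R^2 of a linear trend, as reported in Fig. 13.
temps = [95.9 95.6 95.8];                 % phone at 1 m, 3 m, 6 m (Table 1, morning)
d     = [1 3 6];                          % distances in metres
mu    = mean(temps);                      % mu(x)
sigma = std(temps);                       % sigma(x)
p     = polyfit(d, temps, 1);             % linear trend line
f     = polyval(p, d);                    % fitted values
R2    = 1 - sum((temps - f).^2) / sum((temps - mu).^2);  % coefficient of determination
```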
5. CONCLUSION

Thermal imaging is an intriguing research topic with a wide range of applications to explore, owing to its capability of identifying an object based on its temperature. Thermal sensors are resistant to luminance, weather and lighting conditions, and night-vision tasks will require this technology to intensify their systems. The sample objects were captured with the Fluke thermal camera to train the dataset. Once an object is captured in real time, pre-processing techniques such as histogram equalization, colour balancing, background subtraction and the detection of corners and edges are carried out. The contour is then detected, which promotes the outline of the targeted object. The class, or type, of the object in the dataset is cross-validated by diagnosing the respective patterns. The patterns are formulated by the LS-SVM, which trains the model, and the implications are gathered for the verification and validation process. The experiments were performed for various classes of objects at varying times and distances. The estimation period is considerably short compared with the existing systems, and an accuracy of 90% is achieved; the false alarm rate is 10%, which is negotiable. The novel attempt of capturing objects via the Fluke camera and utilizing them for object detection in the indoor environment is an added advantage of this research work. Hence, object detection and identification in the indoor environment is possible with the help of thermal cameras. Thus, this research article has analysed the broad merits of thermal imaging and justified its leverage over the disadvantages of the existing technologies. VI people's society will undoubtedly benefit, since the system enhances their lifestyle.

6. FUTURE SCOPE

Object detection and classification are currently confined to a small dataset, which motivates us to investigate various classes of objects and environments in future. To improve the efficiency of the proposed system, we plan to modify the hardware by integrating smartphones and a Raspberry Pi kit for object detection and classification. The Raspberry Pi is a low-cost portable device that operates like a computer; this modification will allow the proposed system to function independently of internet connectivity. The smartphone will be initiated by a voice command from the VI user to detect the obstacles in the path. The images are then transferred to the Raspberry Pi via Bluetooth, pre-processed there and classified with a suitable machine learning or deep learning algorithm. The identified object is announced to the user via a loudspeaker or wireless headphones. Since the hardware is portable and affordable, it will be user friendly to VI people. Once the system is designed, it will be validated by VI users, and it will be modified based on their suggestions.

Acknowledgement

I thank the management and employees of Chase Technologies for providing the thermal images.

REFERENCES

[1] Li, B. et al. (2019) Vision-based mobile indoor assistive navigation aid for blind people. IEEE Trans. Mobile Comput., 18, 702-714.
[2] Jafri, R., Campos, R.L., Ali, S.A. and Arabnia, H.R. (2016) Utilizing the Google Project Tango tablet development kit and the Unity engine for image and infrared data-based obstacle detection for the visually impaired. 2016 Int. Conf. Health Informatics Med. Syst. (HIMS'15), Las Vegas, Nevada, 6, 163-164.
[3] Dong, J., Noreikis, M., Xiao, Y. and Yla-Jaaski, A. (2018) ViNav: a vision-based indoor navigation system for smartphones. IEEE Trans. Mob. Comput., 1233, 1-1.
[4] Elmannai, W.M. and Elleithy, K.M. (2018) A highly accurate and reliable data fusion framework for guiding the visually impaired. IEEE Access, 6, 33029-33054.
[5] Katzschmann, R.K., Araki, B. and Rus, D. (2018) Safe local navigation for visually impaired users with a time-of-flight and haptic feedback device. IEEE Trans. Neural Syst. Rehabil. Eng., 26, 583-593.
[6] Patil, K., Jawadwala, Q. and Shu, F. (2018) Design and construction of electronic aid for visually impaired people. IEEE Trans. Hum.-Mach. Syst., 48, 172-182. doi:10.1109/THMS.2018.2799588.
[7] Mancini, A., Frontoni, E. and Zingaretti, P. (2018) Mechatronic system to help visually impaired users during walking and running. IEEE Trans. Intell. Transp. Syst., 19, 649-660. doi:10.1109/TITS.2017.2780621.
[8] Cappagli, G. et al. (2018) Assessing social competence in visually impaired people and proposing an interventional program in visually impaired children. IEEE Trans. Cogn. Dev. Syst., 10, 929-935.
[9] Tao, Y., Ding, L. and Ganz, A. (2017) Indoor navigation validation framework for visually impaired users. IEEE Access, 5, 21763-21773.
[10] Aladren, A., Lopez-Nicolas, G., Puig, L. and Guerrero, J. (2016) Navigation assistance for the visually impaired using RGB-D sensor with range expansion. IEEE Syst. J., 10, 922-932. doi:10.1109/JSYST.2014.2320639.
[11] Song, E., Lee, H., Choi, J. and Lee, S. (2018) AHD: thermal-image based adaptive hand detection for enhanced tracking system. IEEE Access, 6, 12156-12166. doi:10.1109/ACCESS.2018.2810951.
[12] Britto Neto, L., Grijalva, F., Lima Maike, V.R.M., Martini, L.C., Florencio, D., Baranauskas, M.C.C., Rocha, A. and Goldenstein, S. (2017) A Kinect-based wearable face recognition system to aid visually impaired users. IEEE Trans. Hum.-Mach. Syst., 47, 52-64. doi:10.1109/THMS.2016.2604367.
[13] Choi, Y., Kim, N., Hwang, S., Park, K., Yoon, J.S., An, K. and Kweon, I. (2018) KAIST multi-spectral day/night data set for autonomous and assisted driving. IEEE Trans. Intell. Transp. Syst., 19, 1-15. doi:10.1109/TITS.2018.2791533.
[14] Alejandro, G.A., Fang, Z., Socarras, Y., Serrat, J., Vázquez, D., Xu, J. and López, A. (2016) Pedestrian detection at day/night time with visible and FIR cameras: a comparison. Sensors, 16, 820. doi:10.3390/s16060820.
[15] Hermosilla, G., Verdugo, J., Farias, G., Pizarro Torres, F. and Vera, E. (2017) Thermal face recognition under temporal variation conditions. IEEE Access, 5, 9663-9672. doi:10.1109/ACCESS.2017.2704296.
[16] Zhang, H. and Ye, C. (2017) An indoor wayfinding system based on geometric features aided graph SLAM for the visually impaired. IEEE Trans. Neural Syst. Rehabil. Eng., 25, 1592-1604. doi:10.1109/TNSRE.2017.2682265.
[17] Zheng, Y., Shen, G., Li, L., Zhao, C. and Zhao, F. (2014) Travi-Navi: self-deployable indoor navigation system. IEEE/ACM Trans. Netw., 25, 2655-2669. doi:10.1109/TNET.2017.2707101.
[18] Pissaloux, E., Velázquez, R. and Maingreaud, F. (2017) A new framework for cognitive mobility of visually impaired users in using tactile device. IEEE Trans. Hum.-Mach. Syst., 47, 1040-1051. doi:10.1109/THMS.2017.2736888.
[19] Kang, M.-C., Chae, S.-H., Sun, J.-Y., Lee, S.-H. and Ko, S.-J. (2017) An enhanced obstacle avoidance method for the visually impaired using deformable grid. IEEE Trans. Consum. Electron., 63, 169-177. doi:10.1109/TCE.2017.014832.
[20] Afifi, M. (2020) chromadapt: adjust color balance of RGB image with chromatic adaptation. MATLAB Central File Exchange. https://www.mathworks.com/matlabcentral/fileexchange/66682-chromadapt-adjust-color-balance-of-rgb-image-with-chromatic (accessed March 18, 2020).
[21] lin (2020) Image segmentation based on Markov random fields. MATLAB Central File Exchange. https://www.mathworks.com/matlabcentral/fileexchange/33592-image-segmentation-based-on-markov-random-fields (accessed March 18, 2020).
[22] Yang, M.Y. and Förstner, W. (2011) A hierarchical conditional random field model for labeling and classifying images of man-made scenes. In IEEE Int. Conf. on Computer Vision Workshops (ICCV Workshops), 2011, pp. 196-203. doi:10.1109/ICCVW.2011.6130243.

© The British Computer Society 2020. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)

TI - Thermal Image-Based Object Classification for Guiding the Visually Impaired
JF - The Computer Journal
DO - 10.1093/comjnl/bxaa097
DA - 2020-08-04
UR - https://www.deepdyve.com/lp/oxford-university-press/thermal-image-based-object-classification-for-guiding-the-visually-5dM4tYlTNh
SP - 1
EP - 1
VL - Advance Article
IS -
DP - DeepDyve
ER -