TY - JOUR AU - Tran, Dat AB - Abstract Electroencephalogram (EEG) plays an essential role in analysing and recognizing brain-related diseases. EEG has been increasingly used as a new type of biometrics in person identification and verification systems. These EEG-based systems are important components in applications for both police and civilian works, and both areas process a huge amount of EEG data. Storing and transmitting these huge amounts of data are significant challenges for data compression techniques. Lossy compression is used for EEG data as it provides a higher compression ratio (CR) than lossless compression techniques. However, lossy compression can negatively influence the performance of EEG-based person identification and verification systems via the loss of information in the reconstructed data. To address this, we propose introducing performance measures as additional features in evaluating lossy compression techniques for EEG data. Our research explores if a common value of CR exists for different systems using datasets with lossy compression that could provide almost the same system performance with those using datasets without lossy compression. We performed experiments on EEG-based person identification and verification systems using two large EEG datasets, CHB MIT Scalp and Alcoholism, to investigate the relationship between standard lossy compression measures and our proposed system performance measures with the two lossy compression techniques, discrete wavelet transform—adaptive arithmetic coding and discrete wavelet transform—set partitioning in hierarchical trees. Our experimental results showed a common value of CR exists for different systems, specifically, 70 for person identification systems and 50 for person verification systems. 1. Introduction Electroencephalogram (EEG) has been commonly used in diagnosing and detecting brain-related diseases such as dementia, seizure, psychiatric disorders and sleep apnea [14]. 
Recently, EEG has also been used as a new type of biometric for person recognition including person identification and verification (authentication) systems [3, 6, 16, 21, 22, 27, 29]. Using EEG-based biometrics overcomes drawbacks and has advantages compared to traditional measures such as fingerprint, face, voice, iris and hand. For example, fingerprint, face and hand measures can be subject to physical damage, so they are not universal. In contrast, every living and functional person has a recordable EEG signal, and as brain damage occurs infrequently overall, the EEG biometric is universal. Moreover, while face, fingerprint and iris measures can be photographed, voice measures can be recorded, potentially introducing fake biometric samples. It is very hard to fake an EEG signature or to attack an EEG biometric system [23]. In addition, using EEG-based biometrics reduces the risks associated with password-based systems, for instance, forgotten password, password attacks and stolen token [22]. EEG-based person recognition can be used in criminal identification and police work and may have applications in many civilian purposes, e.g. access control and financial security systems in which the high level of accuracy and security is essential [29]. Therefore, using EEG in person recognition systems is a promising research direction. One of the major challenges in using EEG signals is storing and circulating vast amounts of EEG data. The size of EEG data is affected by four factors, namely the number of channels (electrodes), the amount of time, the sampling frequency and bit resolution. According to Marsan and Zivin [17], the number of channels can exceed 256 while the time can be several hours or even several months. Even though the sampling frequencies are typically between 256 Hz and 512 Hz, they can be up to 2000 Hz in some specific applications where a higher resolution is required [25]. 
Furthermore, with EEG signals, a large amount of data can be created within a short time of recording [13]. In addition, the more EEG signals that are captured and used, the higher the accuracy and reliability of the EEG-based applications obtained. Hence, reducing the size of EEG data using compression techniques while maintaining the maximum fidelity of the reconstructed signals is a key imperative in advancing the application of EEG based applications. Lossless and lossy compressions are two kinds of EEG data compression. Lossless compression reduces the size of the compressed EEG data without any loss of information in the reconstructed data; as the name suggests, lossy introduces a loss in the reconstructed data. However, the compression ratio (CR) of lossy compression techniques is much higher than for lossless compressions. Consequently, EEG data compression studies mostly focus on lossy compression techniques. According to Dao et al. [8], the lossy compression techniques can be grouped into four classes including wavelet-based, filter-based, predictor and others. Dao et al. also stated that most lossy compression techniques are transformation-based, e.g. wavelet-based and discrete wavelet transform—set partitioning in hierarchical trees (DWT-SPIHT) and its extended versions are among the best lossy compression techniques. Moreover, discrete wavelet transform—adaptive arithmetic coding (DWT-AAC), as reported in [19], achieves good results compared to some recent compression techniques. EEG compression techniques are necessary to reduce the size of the data that needs to be stored and transmitted; it is therefore important to examine the impact of these techniques on EEG-based applications. Higgins et al. [13] and Daou and Labeau [10] investigated the effect of EEG compression on the accuracy of EEG-based seizure detection systems. Higgins et al. 
[13] reported the seizure detection system performed well (above or equal 90%) if the loss in the recovered EEG signals was less than 35%. Additionally, Daou and Labeau [10] pointed out that the rate of true seizure detection was around 90% at bit rate of 2 bps (corresponding to CR of 8). Nguyen et al. [18] investigated the possibility of applying lossy compression on the EEG-based user authentication system. In this research, two common EEG features, autoregressive (AR) and power spectral density (PSD), were used for authentication. The experimental results showed that applying lossy compression on the authentication system was feasible and the system accuracy was unchanged if the information loss was not greater than 11% [18]. Although the effect of lossy compression on the person authentication (verification) system has been investigated [18], further examination of lossy compression on large EEG datasets is warranted to validate the existing results. The impact of lossy compression on EEG-based person identification systems has yet to be investigated. In this paper, we investigate the impact of different lossy compression techniques on the performance of EEG-based person recognition systems, including person identification and verification. We propose to use performance measures of these EEG-based person identification and verification systems as additional measures in evaluating lossy compression techniques for EEG data to find solutions to the following research questions: 1) What is the impact of lossy compression on EEG-based person recognition system? 2) Is it feasible for EEG-based person recognition systems to use reconstructed signals processed by lossy compression? And if it is, how much information loss can be tolerated? We aim to find whether a common value of CR exists for different systems using datasets with lossy compression that could provide almost the same system performance using the same datasets without lossy compression. 
We performed experiments on EEG-based person identification and verification systems using two large EEG datasets, CHB MIT Scalp and Alcoholism, to investigate the relationship between standard lossy compression measures and our proposed system performance measures with the two lossy compression techniques DWT-AAC and DWT-SPIHT. Our experimental results showed a common value of CR exists for different systems, specifically, 70 for person identification systems and 50 for person verification systems.

The rest of this paper is organized as follows. Sections 2 and 3 present EEG lossy compression techniques and the EEG-based person recognition system, respectively. Section 4 presents the experimental conditions, followed by the results in Section 5. The discussion and conclusion are presented in Sections 6 and 7, respectively.

2. EEG lossy compression techniques

This section briefly explains the two lossy compression techniques used to compress and decompress EEG data in this research, DWT-SPIHT and DWT-AAC. The performance measures are also presented in this section.

2.1. Discrete wavelet transform

Wavelet transform (WT) is used to analyse signals in both the time and frequency domains, so it is suitable for analysing non-stationary signals such as EEG. WT decomposes a signal into a set of basis functions. These basis functions are obtained by dilating (scaling) and translating a common function, known as the mother wavelet [4, 15, 30].
A mother wavelet ψ(t) is defined as follows:

$$\begin{equation}{\int}_{-\infty}^{\infty}\psi \left(\mathrm{t}\right)d\mathrm{t}=0\end{equation}$$(1)

Another wavelet |${\psi}_{\mathrm{s},\mathrm{u}}\Big(\mathrm{t}\Big)$| is obtained when the mother wavelet is dilated by a scale of s and shifted by u:

$$\begin{equation}{\psi}_{\mathrm{s},\mathrm{u}}\left(\mathrm{t}\right)=\frac{1}{\sqrt{\mathrm{s}}}\psi \left(\frac{\mathrm{t}-\mathrm{u}}{\mathrm{s}}\right)\end{equation}$$(2)

The WT is defined as follows:

$$\begin{equation}{\mathrm{X}}_{\mathrm{W}}\left(\mathrm{s},\mathrm{u}\right)=\frac{1}{\sqrt{\mathrm{s}}}{\int}_{-\infty}^{\infty }{\psi}^{\ast}\left(\frac{\mathrm{t}-\mathrm{u}}{\mathrm{s}}\right)\mathrm{x}\left(\mathrm{t}\right)d\mathrm{t}\end{equation}$$(3)

DWT is a fast implementation of the WT for discrete signals that uses digital filters for the analysis. An input signal is filtered by lowpass and highpass filters into two components: approximation coefficients (corresponding to low frequencies) and detail coefficients (corresponding to high frequencies). The highpass filter uses wavelet functions, while the lowpass filter uses scaling functions. The filtering is repeated over several levels, and after each filter the signal is downsampled by 2. At each level, the signal has a different resolution, so DWT is a multi-resolution analysis. The signal can be recovered using the inverse discrete WT (IDWT). To ensure that the recovered signal matches the original, the signal is upsampled by 2 at each reconstruction filter. A 2D DWT is performed by repeated application of the 1D DWT. The 1D DWT is first applied along the rows and then along the columns of the first-step result, which generates four sub-bands.

2.2. DWT-SPIHT

SPIHT was proposed by Said and Pearlman [24] for image compression. Its core principles are based on the embedded zerotree wavelet introduced by Shapiro [26].
This algorithm exploits the relationship between wavelet coefficients at different scales to improve performance. The principles used by the SPIHT algorithm include the following: partial ordering by magnitude, partitioning by significance of magnitudes with regard to a sequence of octavely decreasing thresholds, ordered bit plane transmission, and self-similarity across different scales in the wavelet transform. After EEG signals are decomposed by a DWT, SPIHT is used to encode the wavelet coefficients. The wavelet coefficients are placed into spatial orientation trees and processed by the sorting and refinement procedures, which efficiently exploit the relationships between wavelet coefficients at different scales, such as spatial-frequency localization and cross-subband similarity, to improve the performance of SPIHT.

2.3. DWT-AAC

DWT-AAC is an EEG lossy compression technique proposed by Nguyen et al. [19]. In the compression process, EEG signals are decomposed by a DWT into a set of coefficients. These coefficients are then quantized and thresholded, which creates the binary significance map and the indices of significant coefficients. Finally, both the binary significance map and the indices of significant coefficients are coded by an AAC to generate the compressed signals. In the decompression process, the inverse process is applied to recover the signals. In DWT-AAC, the thresholding component is placed after the quantization, which helps to control the impact of static threshold values on the fidelity of the signals. Moreover, using AAC instead of an arithmetic coder or a Huffman coder, and coding both the binary significance map and the indices of significant coefficients, improves the compression performance of DWT-AAC. As reported in [19], DWT-AAC achieves good results compared to some recent compression techniques.

2.4. Performance measures

The metrics widely used to measure the performance of compression algorithms are the compression ratio (CR) and the percentage root-mean-square difference (PRD). The standard deviations (STD) of CRs and PRDs are also used. CR is defined as the ratio between the number of bits of the original signals (Lorg) and of the compressed signals (Lcomp):

$$\begin{equation}\mathrm{CR}=\frac{{\mathrm{L}}_{\mathrm{org}}}{{\mathrm{L}}_{\mathrm{comp}}}\end{equation}$$(4)

PRD is used to evaluate the distortion between the original and recovered signals. It is defined as follows:

$$\begin{equation}\mathrm{PRD}=\sqrt{\frac{\sum_{i=1}^N{\left({x}_{org}\left[i\right]-{x}_{rec}\left[i\right]\right)}^2}{\sum_{i=1}^N{\left({x}_{org}\left[i\right]\right)}^2}},\end{equation}$$(5)

where xorg[i] and xrec[i] represent the original and reconstructed EEG signals, respectively, and N is the number of samples.

Figure 1. A typical EEG-based person identification system.

Figure 2. A typical EEG-based person verification system.

3. EEG-based person recognition system

This section provides an overview of EEG-based person recognition systems, namely EEG-based person identification and EEG-based person verification.

3.1. EEG-based person identification system

The EEG-based person identification task is the process of recognizing a person from his/her provided signal without the person providing his/her identity. Training and testing are the two phases in a typical EEG-based person identification system, as shown in Figure 1. In the training phase, the EEG signal is captured and pre-processed, and then features are extracted to train person models.
In the testing phase, the person is asked to provide an EEG signal, which is pre-processed; features are then extracted and passed to a scoring component that calculates matching scores against all person models in the database. The system then outputs the person with the highest matching score as the identified person. The identification rate is used to measure the performance of the EEG-based person identification system.

Recently, feature extraction and modelling techniques have been proposed for EEG-based person identification systems. In particular, AR, PSD, paralinguistic features and the discrete Fourier transform have been used for feature extraction. Support vector machine (SVM), k-nearest neighbours, Naïve Bayes and neural networks are some common classifiers that have been used in [3, 6, 21, 27]. In the present work, we employ the EEG-based person identification system proposed in [21], in which paralinguistic features and SVM were used for extracting features and building person models.

3.2. EEG-based person verification system

The verification task is the process of accepting or rejecting the identity claimed by a user. There are two phases in an EEG-based person verification system, namely enrolment and verification, as depicted in Figure 2. In the enrolment phase, an EEG signal from a claimed user is captured and pre-processed, and then features are extracted to build a person model. This model is stored in the database with the identity claimed by this user. In the verification phase, a user is required to provide both an EEG signal and his/her identity. As in the enrolment phase, the EEG signal is processed to extract features, and these features are then fed into the classifier to verify this user. The user is accepted if the similarity between his/her recorded signal and the claimed model stored in the database is higher than a pre-set threshold. If it is lower, the user is rejected.
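The two decision rules described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the system used in this paper: the function names, the score dictionaries and the numeric values are hypothetical, and treating a score exactly equal to the threshold as a rejection is an assumption, since the text only specifies the "higher" and "lower" cases. The last helper counts the two kinds of error a threshold produces: the fraction of impostor attempts that are accepted and the fraction of genuine attempts that are rejected.

```python
from typing import Dict, Sequence, Tuple


def identify(scores: Dict[str, float]) -> str:
    """Identification: return the enrolled person with the highest matching score."""
    return max(scores, key=scores.get)


def verify(score: float, threshold: float) -> bool:
    """Verification: accept the claimed identity only if the score is above the threshold.

    Assumption: a score equal to the threshold is rejected.
    """
    return score > threshold


def far_frr(genuine: Sequence[float], impostor: Sequence[float],
            threshold: float) -> Tuple[float, float]:
    """Return (false acceptance rate, false rejection rate) at a given threshold."""
    far = sum(verify(s, threshold) for s in impostor) / len(impostor)
    frr = sum(not verify(s, threshold) for s in genuine) / len(genuine)
    return far, frr


# Toy scores for illustration only (hypothetical, not taken from the experiments).
print(identify({"subject_a": 0.2, "subject_b": 0.9, "subject_c": 0.5}))
print(far_frr(genuine=[0.9, 0.8, 0.4], impostor=[0.1, 0.6, 0.2, 0.3], threshold=0.5))
```

Raising the threshold lowers the false acceptance rate and raises the false rejection rate, which is why verification performance is usually reported at the operating point where the two are equal.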
False acceptance rate (FAR), false rejection rate (FRR) and equal error rate (EER) are used to measure the performance of the EEG-based person verification system. Although several techniques have been proposed for the EEG-based person verification system, AR and PSD for feature extraction and SVM for classification are the most commonly used. Marcel et al. used PSD in [16], while AR and PSD were used in [20], and AR, PSD and SVM were used in [2, 22]. Recently, Nguyen et al. [21] stated that paralinguistic features could be applied to extract brain wave features from EEG signals for a higher identification performance. Therefore, paralinguistic features and SVM were used for the EEG-based person verification system in this paper.

We examine the impact of lossy compression on the performance of person identification and verification systems by studying the identification rate and the FAR, FRR and EER, respectively, with and without compression at different levels of CR.

4. Experiment conditions

4.1. EEG datasets

Two public datasets were used for this research: the CHB MIT Scalp EEG Dataset (DS1) [12] and the Alcoholism Large EEG dataset (DS2) [5]. DS1 contains 10 records of EEG signals from 10 patients with epilepsy. The EEG signals from this dataset were sampled at 256 Hz and digitized at 16-bit resolution. DS2 includes signals of 20 subjects (10 alcoholics and 10 controls) sampled at 256 Hz with 16-bit resolution.

4.2. Compression parameters

DWT-SPIHT was set up to run with 5 levels of biorthogonal 4.4 2D DWT because this mode has already achieved wide acceptance for use in compression algorithms [9]. DWT-AAC was also set up to run using 5 levels of biorthogonal 4.4 2D DWT and 6-bit uniform quantization due to its high performance compared to recent techniques [19].

Table 1. The best parameters for both datasets processed by DWT-SPIHT

          DS1             DS2
CR        c    γ          c    γ
1         8    0.5        8    0.25
4         8    0.5        8    1
5         8    0.5        8    1
6.4       8    0.5        8    1
8         8    0.5        8    1
10        8    0.5        8    1
16        8    0.0625     8    1
20        8    0.5        8    1
32        8    1          8    1
40        8    1          8    1
50        8    1          8    1
64        8    0.25       8    1

4.3. Feature extraction and classifier parameters

The openSMILE feature extraction toolkit [11] was used to extract EEG features including Mel-frequency cepstral coefficients, log filter-bank powers and line spectral pairs. The range of frequencies was set between 1 and 300 Hz. The features were extracted individually from each EEG channel, and then the features from all channels were merged together. The 8 channels chosen were C3, C4, Cz, P3, P4, Pz, O1 and O2 for both datasets. For each of the datasets, we selected 2/3 for cross-validation training and 1/3 for testing. This selection was random and repeated 10 times, with the final experimental result being the average of these 10 results. Linear SVM classifiers [7] were trained in a 3-fold cross-validation scheme with parameter C ranging from 1 to 1000 in 5 steps.

In the EEG-based person verification system, we calculated the best parameters for building person models using 5-fold cross-validation on the training data; the best parameters found were used to train the person models. The RBF kernel function |$K({x}_i,{x}_j)={e}^{-\gamma{|{x}_i-{x}_j|}^2}$| was used.
The parameter γ was searched in |$\{2^k:k=-4,-3,\dots, 1\}$| while the parameter c was searched in |$\{2^k:k=-2,-1,\dots, 3\}$|. The parameters γ and c were used for training the SVMs. Tables 1 and 2 show the best parameters c and γ for each dataset and each compression technique. EEG data from both datasets were compressed and recovered at different levels of CR, so the best parameters were calculated for each case. CR = 1 denotes the original (uncompressed) EEG signal, while CR = 5 means that the EEG signal was compressed and reconstructed at a CR of 5.

Table 2. The best parameters for both datasets processed by DWT-AAC

DS1                      DS2
CR      c    γ           CR       c     γ
1       8    0.5         1        8     0.25
5.4     8    0.5         8        8     1
5.8     8    0.5         8.1      8     1
12.7    8    0.5         16.8     8     1
15.1    8    0.5         19       8     1
17.7    8    1           25       8     1
22.2    8    0.5         38.2     8     1
24.2    8    0.5         46.6     8     0.5
27.6    8    0.5         62.5     8     0.5
40      8    2           80.8     8     1
51.2    8    2           108.3    8     1
75.9    8    1           N/A      N/A   N/A

5. Results

5.1. Compression performance

Figures 3a and 4a show the average PRDs versus average CRs obtained by DWT-SPIHT and DWT-AAC on datasets DS1 and DS2, respectively.
Generally, DWT-AAC achieved better compression performance than DWT-SPIHT. In particular, on DS1, with CRs smaller than 8, DWT-SPIHT had slightly better results than DWT-AAC, while the trend reversed when CRs were greater than 8. For example, at a PRD of 7%, the CRs of DWT-SPIHT and DWT-AAC were 7 and 5.5, respectively. Conversely, at a PRD of 30%, the latter obtained a CR of 30 while the CR of the former was 25. For DS2, the compression performance of DWT-AAC was much better than that of DWT-SPIHT. For instance, at a CR of 20, the PRD of the former was 19%, compared to 38% for the latter.

Figure 3. Average PRDs versus average CRs for DWT-SPIHT and DWT-AAC on DS1.

Figure 4. Average PRDs versus average CRs for DWT-SPIHT and DWT-AAC on DS2.

Figures 3b and 4b illustrate the STDs calculated from the CRs and PRDs of the subjects' EEG data processed by both compression techniques on DS1 and DS2, respectively. The STDs of the CRs of DWT-SPIHT are zero, while those of the PRDs are quite large. For DWT-SPIHT, the size of the compressed data is determined exactly by the bit rate before compression, so the STDs of its CRs are always zero. In contrast, for DWT-AAC the STDs of the CRs are much higher than those of the PRDs. Additionally, the STDs obtained on DS2 are significantly higher than those obtained on DS1. This could be because the differences in EEG signals between the alcoholic and normal subjects in DS2 are bigger than those between the epilepsy subjects in DS1. These differences introduce larger gaps in CRs and PRDs between subjects in DS2, creating higher STDs.

5.2. Person identification performance with increasing compression

The identification rates versus corresponding CRs for both DWT-SPIHT and DWT-AAC on DS1 and DS2 are illustrated in Figures 5 and 6, respectively. As CR increases, the identification rates for both compression algorithms decline gradually. On DS1, for example, the identification rate is 99.52% at a CR of 5.8 with DWT-AAC, while it drops to 95.92% at a CR of 27.58. On DS2, the identification rate is 98.16% at a CR of 6.4 with DWT-SPIHT, compared to 90.11% at a CR of 25.

Figure 5. Identification rates versus CRs for DWT-SPIHT and DWT-AAC on DS1.

Figure 6. Identification rates versus CRs for DWT-SPIHT and DWT-AAC on DS2.

The performance of the person identification system using uncompressed (original) EEG data on DS1 and DS2 is given in Table 3. From Table 3 and Figures 5 and 6, it can be seen that for both compression algorithms, the system using the reconstructed signals at low CRs obtains higher performance than the system using the original ones. For instance, on DS1, the identification rates are 99.76% for DWT-AAC at a CR of 5.45 with a PRD of 6.8% and 99.72% for DWT-SPIHT at a CR of 6.4 with a PRD of 5.6%, both higher than the 99.28% obtained with the original signals. Similarly, on DS2, the identification rates are 99.35% for DWT-AAC at a CR of 8.14 with a PRD of 16.48% and 96.45% for DWT-SPIHT at a CR of 8 with a PRD of 31.95%, compared to 92.61% when using the uncompressed data.

Table 3. Performance of person identification using original EEG signals

Dataset    Identification rate (%)
DS1        99.28
DS2        92.61

5.3. Person verification performance with increasing compression

Figures 7 and 8 show the relationship between the average EERs and CRs on datasets DS1 and DS2. The higher the compression (CR), the higher the EER of the person verification system. In other words, the performance of the person verification system degrades with increasing compression. On DS1, for instance, the EER is 0.31% at a CR of 10 with DWT-SPIHT, while it rises to 1.4% at a CR of 32, as seen in Figure 7. Similarly, the system has an EER of 0.1% on DS2 at a CR of 19 with DWT-AAC, compared to an EER of 0.87% at a CR of 46, as shown in Figure 8.

Figure 7. The average EERs versus CRs for DWT-SPIHT and DWT-AAC on DS1.

Figure 8. The average EERs versus CRs for DWT-SPIHT and DWT-AAC on DS2.

However, as with person identification, the person verification system obtains higher performance when using the signals reconstructed at low CRs by DWT-SPIHT and DWT-AAC than when using the original ones. For example, the EERs using recovered signals at CRs around 5 for both lossy compression techniques on DS1 are much smaller than those using the original signals, as shown in Figure 7. Similarly, on DS2, at CRs around 10 for both techniques, the EERs are much better than those using the original signals, as seen in Figure 8.
Figures 9 and 10 show the DET curves of the person verification system when using original and recovered signals on DS1 and DS2, respectively. On DS1, the DET curves using the reconstructed signals processed by both DWT-AAC and DWT-SPIHT at low CRs are quite similar to the original ones. In contrast, there are few errors in the DET curves using recovered signals at low CRs for both compression techniques on DS2. For example, the DET curves using reconstructed signals with DWT-AAC at a CR of 38.2 and with DWT-SPIHT at a CR of 20 are much better than those using the original signals, as seen in Figures 10a and 10b, respectively.

Figure 9. DET curves of person verification on DS1.

Figure 10. DET curves of person verification on DS2.

6. Discussion

The results in Section 5 show that lossy compression techniques do have an impact on EEG-based person identification and verification systems, as the person recognition performance decreases when compression increases. This is because there is a trade-off between CR and PRD: PRD rises as CR increases, which results in more and more information loss in the recovered EEG signals, including the biometric information. This reduces the performance of person identification and verification systems.

In addition, both the person identification and verification systems using the signals reconstructed at low CRs by DWT-AAC and DWT-SPIHT obtain higher performance than those using the original ones. To explain this, the core principles of both compression algorithms should be examined. For DWT-AAC, after quantization, the quantized wavelet coefficients are thresholded, so that insignificant coefficients, those with values below the threshold, are discarded.
Most of the insignificant coefficients are highpass 'detail' coefficients that lie in the high frequencies. DWT-SPIHT is based on bit rate, so the size of the compressed data is determined exactly before compression. After quantization, the quantized wavelet coefficients are ordered by magnitude, and the most significant magnitudes are coded first while the budget for the compressed data is not yet exhausted. As in DWT-AAC, unimportant coefficients are removed in DWT-SPIHT. Therefore, the most probable explanation is that applying both compression algorithms at low CRs may actually filter out some irrelevant noise, e.g. the signal artifacts present in the EEG signals. This makes the recovered signals smoother while still preserving the biometric information well, improving the performance of the person identification and verification systems. Figure 11 shows the original signal and two reconstructed signals processed by DWT-SPIHT and DWT-AAC at CRs around 8 on DS2. Although the three signals are very similar, the two recovered signals are smoother than the original one, as some irrelevant noise has been removed.

Figure 11. EEG signals from subject co2a0000365, channel C3, DS2. (i) Original signals, reconstructed signals with (ii) DWT-SPIHT and (iii) DWT-AAC.

One purpose of this paper was to explore how far the compression rate, and hence the information loss, can be pushed without affecting the ability of EEG-based person recognition systems. For seizure detection, the study in [13] indicated that a classifier scoring above 90% is considered to perform very well. Similarly, a rate above 90% is also considered very good for person identification, so an identification rate of 90% is used as the threshold limit for compression.
With regard to the person verification system, the system can be considered unaffected by lossy compression at a given CR and PRD if its EER when using the reconstructed signals at that CR and PRD is equal to or less than its EER when using the original signals. Therefore, the EER of this system on the original EEG data was used as the cut-off limit for maximizing compression. Referring back to Figure 5, at an identification rate of 90% on DS1, the CRs of DWT-SPIHT and DWT-AAC are 50 and 70, respectively. From Figure 3a, the corresponding PRDs are 49% for DWT-SPIHT and 50% for DWT-AAC. From Figures 6 and 4a, on DS2 at a 90% identification rate, DWT-SPIHT achieves a CR of 25 with a PRD of 41.3%, while DWT-AAC obtains a higher CR of 52 with a PRD of 27%. Figures 12 and 13 illustrate the relationship between the average EERs and PRDs when the person verification system used the recovered signals from both datasets. On DS1, from Figure 12 and referring back to Figure 7, the EERs obtained with the reconstructed signals at PRDs below 10% (CRs below 10) for DWT-SPIHT and at PRDs below 15% (CRs below 15) for DWT-AAC are smaller than those obtained with the original signals. Similarly, on DS2, from Figure 13 and referring back to Figure 8, the EERs at PRDs below 45% (CRs below 30) for DWT-SPIHT and at PRDs below 28% (CRs below 50) for DWT-AAC are equal to or less than those obtained with the original signals.
Figure 12. The average EERs versus PRDs for DWT-SPIHT and DWT-AAC on DS1.
Figure 13. The average EERs versus PRDs for DWT-SPIHT and DWT-AAC on DS2.
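The two cut-off criteria described above (an identification rate of at least 90%, and an EER no higher than that obtained with the original signals) can be sketched as a simple search over measured operating points. The function name is hypothetical and the measurement values are invented placeholders chosen only to exercise the code; they are not results from this paper. The sketch also assumes that the tolerable range is contiguous, i.e. it stops at the first CR that fails the criterion.

```python
def max_tolerable_cr(points, passes):
    """Return the largest CR up to which every measured point passes.

    points: iterable of (cr, value) pairs, where value is the system
            performance measured on signals reconstructed at that CR.
    passes: function mapping a value to True if the criterion is met.
    Returns None if even the lowest measured CR fails.
    """
    best = None
    for cr, value in sorted(points):
        if not passes(value):
            break  # stop at the first failure: range must be contiguous
        best = cr
    return best


# Placeholder measurements (NOT from the paper): (CR, identification rate %).
identification_points = [(10, 98.0), (30, 95.5), (50, 91.0), (70, 86.0)]
cr_ident = max_tolerable_cr(identification_points, lambda rate: rate >= 90.0)

# Placeholder measurements (NOT from the paper): (CR, EER %), with an
# assumed baseline EER of 3.0% on the original, uncompressed signals.
baseline_eer = 3.0
verification_points = [(5, 2.8), (10, 3.0), (20, 3.6), (40, 5.1)]
cr_verif = max_tolerable_cr(verification_points,
                            lambda eer: eer <= baseline_eer)

print("max CR for identification:", cr_ident)
print("max CR for verification:", cr_verif)
```

Stopping at the first failure is a conservative design choice: if performance dipped below the cut-off and later recovered at a higher CR, that higher CR would not be reported as tolerable.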
According to Antoniol and Tonella [1] and Srinivasan and Reddy [28], lossless compression of EEG signals achieves a CR of between 2 and 3. Meanwhile, in the worst cases, at a CR of 50 (PRD of 49%) on DS1 and a CR of 25 (PRD of 41.3%) on DS2, the performance is still good enough (90% or above) for person identification. Similarly, in the worst cases, at a CR of 10 (PRD of 10%) on DS1 and a CR of 30 (PRD of 45%) on DS2, there is no effect on the performance of the person verification system. Therefore, it can be said that lossy compression has an advantage over lossless compression for EEG-based person recognition systems. As mentioned above, besides the conventional measures CR and PRD, the performance measures of person identification and verification systems, namely identification rate and EER, are proposed as additional metrics for evaluating the quality of lossy compression techniques. In particular, the thresholds of a 90% identification rate and the EER obtained with the original signals are used together with CR and PRD. By these measures, DWT-AAC compresses better than DWT-SPIHT: at an identification rate of 90%, the CR of the former is 70 on DS1, compared with 50 for the latter. Similarly, the CR of DWT-AAC is 52 on DS2, while that of DWT-SPIHT is only 25. Moreover, at the EER obtained with the original signals, the CR of DWT-AAC is 15 on DS1, compared with 10 for DWT-SPIHT. Likewise, DWT-AAC obtains a CR of 50 on DS2, while DWT-SPIHT gives a CR of only 30.
7. Conclusion
In this paper, we have investigated the impact of lossy compression on the performance of EEG-based person identification and verification systems. Our experimental results answer the two research questions asked earlier as follows: (i) lossy compression affects EEG-based person recognition systems by reducing performance as compression increases.
However, at low CRs, lossy compression improves recognition performance by removing noise and making the reconstructed signals smoother. (ii) When the aim is to maximize the tolerable information loss, in the best cases compression can be tolerated at a CR of up to 70 (PRD of 50%) for person identification and at a CR of up to 50 (PRD of 28%) for person verification. In the worst cases, compression can be tolerated at a CR of up to 25 (PRD of 41.3%) for the former and at a CR of up to 10 (PRD of 10%) for the latter. This means that applying lossy techniques to EEG-based person recognition systems is feasible and beneficial compared with using lossless approaches. Furthermore, the identification rate of 90% and the EER obtained with the original signals were proposed as supplemental metrics for evaluating the quality of lossy compression algorithms. Using these additional measures together with CR and PRD makes it easier to show that DWT-AAC gives better results than DWT-SPIHT. Common values of CR, 70 for person identification systems and 50 for person verification systems, have been found at which person recognition systems using lossy compression provide almost the same performance as systems using the same datasets without lossy compression. In future work, other lossy compression techniques, as well as EEG-based person recognition systems based on other methods, will be evaluated on larger EEG datasets to verify these findings.
References
[1] G. Antoniol and P. Tonella. EEG data compression techniques. IEEE Transactions on Biomedical Engineering, 44, 105–114, 1997.
[2] C. Ashby, A. Bhatia, F. Tenore and J. Vogelstein. Low-cost electroencephalogram (EEG) based authentication. In 2011 5th International IEEE/EMBS Conference on Neural Engineering, pp. 442–445. IEEE, 2011.
[3] X. Bao, J. Wang and J.
Hu. Method of individual identification based on electroencephalogram analysis. In International Conference on New Trends in Information and Service Science, 2009. NISS'09, pp. 390–393. IEEE, 2009.
[4] L. A. Barford, R. S. Fazzio and D. R. Smith. An Introduction to Wavelets. Citeseer, 1992.
[5] H. Begleiter. EEG database. Available: http://kdd.ics.uci.edu/databases/eeg/eeg.data.html
[6] K. Brigham and B. V. Kumar. Subject identification from electroencephalogram (EEG) signals during imagined speech. In 2010 Fourth IEEE International Conference on Biometrics: Theory Applications and Systems (BTAS), pp. 1–8. IEEE, 2010.
[7] C.-C. Chang and C.-J. Lin. Libsvm: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST), 2, 27, 2011.
[8] P. T. Dao, X. J. Li and H. N. Do. Lossy compression techniques for EEG signals. In 2015 International Conference on Advanced Technologies for Communications (ATC), pp. 154–159. IEEE, 2015.
[9] H. Daou and F. Labeau. Performance analysis of a 2-D EEG compression algorithm using an automatic seizure detection system. In 2012 Conference Record of the Forty Sixth Asilomar Conference on Signals, Systems and Computers (ASILOMAR), pp. 1632–1636. IEEE, 2012.
[10] H. Daou and F. Labeau. Dynamic dictionary for combined EEG compression and seizure detection. IEEE Journal of Biomedical and Health Informatics, 18, 247–256, 2014.
[11] F. Eyben, M. Wollmer and B. Schuller. Opensmile: the Munich versatile and fast open-source audio feature extractor.
In Proceedings of the 18th ACM International Conference on Multimedia, pp. 1459–1462. ACM, 2010.
[12] A. L. Goldberger, L. A. Amaral, L. Glass, J. M. Hausdorff, P. C. Ivanov, R. G. Mark, J. E. Mietus, G. B. Moody, C.-K. Peng and H. E. Stanley. Physiobank, physiotoolkit, and physionet components of a new research resource for complex physiologic signals. Circulation, 101, e215–e220, 2000.
[13] G. Higgins, B. McGinley, S. Faul, R. P. McEvoy, M. Glavin, W. P. Marnane and E. Jones. The effects of lossy compression on diagnostically relevant seizure information in EEG signals. IEEE Journal of Biomedical and Health Informatics, 17, 121–127, 2013.
[14] D. Hill. Value of the EEG in diagnosis of epilepsy. British Medical Journal, 1, 663, 1958.
[15] S. Mallat. A Wavelet Tour of Signal Processing. Elsevier, 1999.
[16] S. Marcel and J. D. R. Millan. Person authentication using brainwaves (EEG) and maximum a posteriori model adaptation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29, 743–752, 2007.
[17] C. A. Marsan and L. Zivin. Factors related to the occurrence of typical paroxysmal abnormalities in the EEG records of epileptic patients. Epilepsia, 11, 361–381, 1970.
[18] B. Nguyen, D. Nguyen, W. Ma and D. Tran. Investigating the possibility of applying EEG lossy compression to EEG-based user authentication. In 2017 International Joint Conference on Neural Networks (IJCNN), pp. 79–85. IEEE, 2017.
[19] B. Nguyen, D. Nguyen, W. Ma and D.
Tran. Wavelet transform and adaptive arithmetic coding techniques for EEG lossy compression. In 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3153–3160. IEEE, 2017.
[20] P. Nguyen, D. Tran, X. Huang and W. Ma. Motor imagery EEG-based person verification. In International Work-Conference on Artificial Neural Networks, pp. 430–438. Springer, 2013.
[21] P. Nguyen, D. Tran, X. Huang and D. Sharma. A proposed feature extraction method for EEG-based person identification. In International Conference on Artificial Intelligence. The Steering Committee of The World Congress in Computer Science, Computer Engineering and Applied Computing (WorldComp), 2012.
[22] T. Pham, W. Ma, D. Tran, P. Nguyen and D. Phung. EEG-based user authentication in multilevel security systems. In International Conference on Advanced Data Mining and Applications, pp. 513–523. Springer, 2013.
[23] A. Riera, A. Soria-Frisch, M. Caparrini, C. Grau and G. Ruffini. Unobtrusive biometric system based on electroencephalogram analysis. EURASIP Journal on Advances in Signal Processing, 2008, 18, 2008.
[24] A. Said and W. A. Pearlman. A new, fast, and efficient image codec based on set partitioning in hierarchical trees. IEEE Transactions on Circuits and Systems for Video Technology, 6, 243–250, 1996.
[25] S. Sanei and J. A. Chambers. EEG Signal Processing. Wiley Interscience, 2007.
[26] J. M. Shapiro. Embedded image coding using zerotrees of wavelet coefficients. IEEE Transactions on Signal Processing, 41, 3445–3462, 1993.
[27] H. A.
Shedeed. A new method for person identification in a biometric security system based on brain EEG signal processing. In 2011 World Congress on Information and Communication Technologies (WICT), pp. 1205–1210. IEEE, 2011.
[28] K. Srinivasan and M. R. Reddy. Efficient preprocessing technique for real-time lossless EEG compression. Electronics Letters, 46, 26–27, 2010.
[29] S. Sun. Multitask learning for EEG-based biometrics. In 19th International Conference on Pattern Recognition, 2008. ICPR 2008, pp. 1–4. IEEE, 2008.
[30] M. Vetterli and C. Herley. Wavelets and filter banks: theory and design. IEEE Transactions on Signal Processing, 40, 2207–2232, 1992.
© The Author(s) 2020. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permission@oup.com. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
TI - Biometric recognition system performance measures for lossy compression on EEG signals JF - Logic Journal of the IGPL DO - 10.1093/jigpal/jzaa033 DA - 2020-09-09 UR - https://www.deepdyve.com/lp/oxford-university-press/biometric-recognition-system-performance-measures-for-lossy-WXgqu7efw0 SP - 1 EP - 1 VL - Advance Article IS - DP - DeepDyve ER -