The Use of the Visualisation of Multidimensional Data Using PCA to Evaluate Possibilities of the Division of Coal Samples Space Due to their Suitability for Fluidised Gasification

Arch. Min. Sci., Vol. 61 (2016), No 3, p. 523-535

Electronic version (in color) of this paper is available: http://mining.archives.pl

DOI 10.1515/amsc-2016-0038

DARIUSZ JAMRÓZ*, TOMASZ NIEDOBA**, AGNIESZKA SUROWIAK**, TADEUSZ TUMIDAJSKI**

ZASTOSOWANIE WIZUALIZACJI WIELOWYMIAROWYCH DANYCH ZA POMOCĄ PCA DO OCENY MOŻLIWOŚCI PODZIAŁU PRÓBEK WĘGLA ZE WZGLĘDU NA ICH PRZYDATNOŚĆ DO ZGAZOWANIA

Methods that visualise multidimensional data by transforming multidimensional space into two-dimensional space make it possible to present such data on a computer screen. Qualitative analysis of the data can then be performed in the way most natural for humans, through the sense of sight. One such method of multidimensional data visualisation is PCA (principal component analysis). In this work it was used to present and analyse a set of seven-dimensional data (seven selected properties) describing coal samples obtained from the Janina and Wieczorek coal mines. Coal from these mines was first separated in a laboratory ring jig consisting of ten rings. In this way, 5 layers (of 2 rings each) were obtained for each of the two coals. It was decided to check whether multidimensional data visualisation makes it possible to divide the space of the samples obtained in this way into areas with different suitability for the fluidised gasification process. To that end, the card of technological suitability of coal was used (Sobolewski et al., 2012; 2013), which describes the key, relevant and additional parameters affecting the gasification process. The analyses showed that the suitability of coal samples for the on-surface gasification process in a fluidised reactor can be determined effectively.
The PCA method makes it possible to visualise the optimal subspace that satisfies the set requirements concerning the properties of coals intended for this process.

Keywords: Principal Component Analysis, multidimensional visualisation, coal gasification, jigging

* AGH UNIVERSITY OF SCIENCE AND TECHNOLOGY, FACULTY OF ELECTRICAL ENGINEERING, AUTOMATICS, COMPUTER SCIENCE AND BIOMEDICAL ENGINEERING, DEPARTMENT OF APPLIED COMPUTER SCIENCE, AL. MICKIEWICZA 30, 30-059 KRAKÓW, POLAND, E-MAIL: jamroz@agh.edu.pl

** AGH UNIVERSITY OF SCIENCE AND TECHNOLOGY, FACULTY OF MINING AND GEOENGINEERING, DEPARTMENT OF ENVIRONMENTAL ENGINEERING AND MINERAL PROCESSING, AL. MICKIEWICZA 30, 30-059 KRAKÓW, POLAND, E-MAIL: tniedoba@agh.edu.pl, asur@agh.edu.pl, tadeusz.tumidajski@agh.edu.pl

Coal gasification is one of the technologies attracting growing attention among technologists concerned with coal processing and utilisation. By type, two main approaches are distinguished: on-surface and underground gasification. Each of them can, however, be carried out using different technologies. For on-surface gasification, one such technology is gasification in a fluidised reactor. Guidelines for this type of gasification were developed within NCBiR project no. 23.23.100.8498/R34, "Opracowanie technologii zgazowania węgla dla wysokoefektywnej produkcji paliw i energii" (Development of coal gasification technology for highly efficient production of fuels and energy), part of the strategic research and development programme "Zaawansowane technologie pozyskiwania energii" (Advanced technologies for energy generation) (Marciniak-Kowalska, 2011-12; Sobolewski et al., 2012; 2013; Strugala et al., 2011; 2012). The authors selected the main guidelines, those concerning the recommended levels of specific coal properties. To examine the suitability of coal for gasification, samples of two coals were taken: one from Zakład Górniczy (ZG) Janina and one from Kopalnia Węgla Kamiennego (KWK) Wieczorek. Each coal was beneficiated in a laboratory ring jig (10 rings, coal in classes separated from the 0-18 mm range).

After separation, the material was divided into 5 layers (2 rings each), and each layer was sieved into 10 grain classes, establishing the yields of layers and classes. The products obtained in this way (grain classes), after analytical samples had been separated, were subjected to elemental and technical chemical analysis of coal in order to characterise the properties affecting gasification. In total, 99 samples were obtained from both mines (50 from Janina and 49 from Wieczorek; in one of the layers the 16-18 mm class was not obtained), characterised by the following parameters: total sulphur content, hydrogen content, nitrogen content, chlorine content, total carbon content, heat of combustion and ash content. Example data for one of the layers are presented in Table 1. In addition, the card of technological suitability of coal was used (Sobolewski et al., 2012; 2013), which describes the key, relevant and additional parameters affecting the gasification process. On its basis, the coal samples that undergo gasification effectively were marked. To visualise the data, one of the modern methods of multidimensional statistical factor analysis was applied: PCA (Principal Component Analysis). In this method, the multidimensional data are orthogonally projected onto a plane represented by specially selected vectors V1, V2. These are the eigenvectors corresponding to the two largest (in absolute value) eigenvalues of the covariance matrix of the set of observations. This choice of V1, V2 yields the image on the plane that presents the most variability in the data. The algorithm and principles of the method are presented in detail in Section 3 of the article. Three types of analysis were carried out with PCA. The first image was intended to determine whether the origin of the coal can be identified, that is, whether coal from ZG Janina can be separated from coal from KWK Wieczorek. The answer was affirmative.

The conditions resulting from the requirements of the card of technological suitability of coal were then imposed on the data prepared in this way. When all conditions are taken into account, only 17 samples from ZG Janina and just one from KWK Wieczorek meet all criteria, as shown in Figure 2. This was found to be due mainly to the chlorine content, which exceeds the imposed limits. This property, however, does not affect the gasification process itself in a key way; it matters for environmental protection. A similar analysis was therefore carried out with the condition on this property dropped. Without the chlorine requirement, 37 samples from ZG Janina and 41 samples from KWK Wieczorek meet the remaining recommendations for on-surface gasification in a fluidised reactor. This confirms the authors' earlier observations in this area. In both cases, multidimensional visualisation using PCA showed that the images of points representing samples of coal more suitable and less suitable for gasification occupy separate subareas of the space and gather in clusters that can easily be separated from each other. It was therefore concluded that PCA makes it possible to divide the sample space into areas of different suitability for fluidised gasification, both when the chlorine limit is applied and when it is omitted. The use of PCA to identify the suitability of coal samples for gasification is novel and has not been applied before. Other methods may also be applied for this purpose. It should be stressed, however, that an undoubted advantage of PCA is that no parameters need to be selected during visualisation, in contrast to many other methods of multidimensional data visualisation.

Keywords (Polish summary): PCA analysis, multidimensional visualisation, coal gasification, beneficiation in jigs

1.
Introduction

In this article, coal samples from two coal mines, Janina Mining Plant and Wieczorek Coal Mine, were analysed. The samples were taken in order to evaluate their suitability for on-surface gasification in a fluidised bed. The properties of coals directed to gasification must comply with set limits (Blaschke, 2009; Borowiecki et al., 2008; Chmielniak & Tomaszewicz, 2012; Kosminski et al., 2006; Lee et al., 2007; Marciniak-Kowalska, 2012-13; Strugala et al., 2011; Strugala & Czerski, 2012; Surowiak, 2013a,b; 2014a,b), and it must be noted that these properties are linked with each other. The evaluation of coal suitability for gasification should therefore be conducted multidimensionally, using multidimensional distributions of properties and their statistics (Ahmed & Drzymala, 2005; Brożek & Surowiak, 2005, 2007; 2010; Drzymala, 2009; Jamróz, 2009; 2014a-c; Jamróz & Niedoba, 2014; Niedoba & Jamróz, 2013; Niedoba, 2009, 2011, 2013a,b, 2014; Niedoba & Surowiak, 2012; Surowiak & Brożek, 2014a,b; Tumidajski, 1997; Tumidajski & Saramak, 2009). The analysis of multidimensional properties and statistics naturally begins with the particle density distribution and particle size distribution of the coals and is then extended to further coal properties, especially the contents of components and the response of the coal to processing. The analysis of coal in terms of the distribution of so-called class-fractions provides initial information on the coal's capability to develop surface area and on the concentration of flammable and volatile parts and ash. Multidimensional visualisation methods, in turn, allow all measured properties to be interpreted jointly.

2.
Data

To investigate the beneficiation capability of coals intended for gasification in a fluidised bed, two bituminous coals, from Janina Mining Plant (coal of type 31.2) and Wieczorek Coal Mine (coal of type 32), were each subjected to beneficiation in a laboratory ring jig (10 rings, coal in the 0-18 mm class). After separation, the material was divided into 5 layers (of 2 rings each), and each layer was sieved into 10 grain classes, establishing the yields of layers and classes. The products obtained in this way (grain classes), after analytical samples had been separated, were subjected to elemental and technical chemical analysis of coal in order to characterise the features influencing gasification. In total, 99 samples were obtained from both mines (50 from Janina coal mine and 49 from Wieczorek coal mine; in one of the layers the 16-18 mm class was not obtained). Each sample is described by the following parameters: total sulphur content, hydrogen content, nitrogen content, chlorine content, total carbon content, heat of combustion and ash content.

The card of technological suitability of coal was additionally used (Sobolewski et al., 2012, 2013), which describes the key, relevant and additional parameters affecting the gasification process. On the basis of the card, the coal samples that can be gasified effectively were identified. The conditions used are: calorific value > 18000 kJ/kg, ash content < 25%, chlorine content < 0.1%, total sulphur content < 2%, carbon content > 60%, hydrogen content between 3.5% and 5.5%, nitrogen content < 2%. Under these conditions, only 18 of the 99 analysed samples were identified as suitable for effective gasification. Of those 18 samples, 17 came from Janina Mining Plant and only one from Wieczorek Coal Mine. Table 1 presents an example of the obtained data.
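The conditions of the card listed above amount to a conjunction of simple inequalities, which can be sketched as a small predicate. This is only an illustration: the class and field names are hypothetical, the sample values are made up rather than taken from the measurements, and the hydrogen bounds are assumed to be inclusive because the text does not state whether they are strict.

```python
from dataclasses import dataclass


@dataclass
class CoalSample:
    """One analysed sample; field names are illustrative, not from the paper."""
    calorific_value: float  # kJ/kg
    ash: float              # %
    chlorine: float         # %
    sulphur: float          # total sulphur, %
    carbon: float           # total carbon, %
    hydrogen: float         # %
    nitrogen: float         # %


def is_suitable(s: CoalSample, check_chlorine: bool = True) -> bool:
    """Return True when the sample meets every condition of the card.

    With check_chlorine=False the chlorine limit is skipped and only the
    remaining six conditions are tested.
    """
    conditions = [
        s.calorific_value > 18000,
        s.ash < 25,
        s.sulphur < 2,
        s.carbon > 60,
        3.5 <= s.hydrogen <= 5.5,  # assumption: bounds treated as inclusive
        s.nitrogen < 2,
    ]
    if check_chlorine:
        conditions.append(s.chlorine < 0.1)
    return all(conditions)


# Made-up sample: every property is within limits except chlorine.
sample = CoalSample(calorific_value=24000, ash=12.0, chlorine=0.15,
                    sulphur=1.1, carbon=65.0, hydrogen=4.2, nitrogen=1.0)
print(is_suitable(sample), is_suitable(sample, check_chlorine=False))
```

Because every condition bounds a single property, the set of suitable samples is an axis-aligned box in the seven-dimensional property space, and dropping one condition only enlarges that box.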
TABLE 1

Elemental analysis of coal in layer I after beneficiation in the jig - Janina Mining Plant

Columns: Class d [mm]; Total sulphur content $S_t^a$ [%]; Hydrogen content $H^a$ [%]; Nitrogen content $N^a$ [%]; Chlorine content $Cl^a$ [%]; Total carbon content $C^a$ [%]; Heat of combustion $Q_s^a$ [kJ/kg]; Ash content $A^a$ [%]

Rows (grain classes, mm): < 1.00; 1.00-2.00; 2.00-3.15; 3.15-5.00; 5.00-6.30; 6.30-8.00; 8.00-10.00; 10.00-12.50; 12.50-16.00; 16.00-18.00

3. Principal component analysis

Methods of multidimensional visualisation are increasingly used as instruments of statistical analysis. A number of such methods have been described in the literature (Aldrich, 1998; Asimov, 1985; Assa et al., 1999; Chatterjee et al., 1993; Cleveland, 1984; Cook et al., 1995; Chou et al., 1999; Inselberg, 1985; Jain & Mao, 1992; Kim et al., 2000; Kraaijveld et al., 1995; Gennings et al., 1999; Sobol & Klein, 1989). The authors have also used methods of this type for the analysis and classification of coal types (Jamróz, 2011, 2009; Jamróz & Niedoba, 2013, 2014; Niedoba, 2013, 2014; Niedoba & Jamróz, 2013). One of these methods is PCA (Principal Component Analysis).

3.1. The description of the method

PCA is one of the statistical methods of factor analysis. In this method, the multidimensional data are orthogonally projected onto a plane represented by specially selected vectors $V_1, V_2$. These are the eigenvectors corresponding to the two largest (in absolute value) eigenvalues of the covariance matrix of the set of observations. This selection of $V_1, V_2$ yields the image in the plane that represents the most variability in the data.

3.2. The algorithm

The input data set consists of elements described by $n$ properties. It can therefore be treated as a set of $n$-dimensional vectors. Let us denote the $k$-th input data vector as $x_k = (x_{k,1}, x_{k,2}, \ldots, x_{k,n})$. The algorithm realising the visualisation using PCA consists of several steps:

a) Input data scaling.
Individual properties, represented by individual data dimensions, are scaled so that they all fall into the same preset interval. It was decided to scale the individual coordinates (properties) of the data vectors to the interval (0, 1).

b) Covariance matrix determination. We use the general formula for the covariance:

$$\operatorname{cov}(X, Y) = E(XY) - E(X)\,E(Y) \tag{1}$$

where $E$ denotes the expected value. We thus first calculate the expected values:

$$E_i = \frac{1}{m}\sum_{k=1}^{m} x_{k,i} \tag{2}$$

and

$$E_{i,j} = \frac{1}{m}\sum_{k=1}^{m} x_{k,i}\,x_{k,j} \tag{3}$$

where $E_i$ is the expected value of the $i$-th coordinate of the input data, $E_{i,j}$ is the expected value of the product of the $i$-th and $j$-th coordinates of the input data, $m$ is the number of input data vectors, and $x_{k,i}$ is the $i$-th coordinate of the $k$-th input data vector. If we denote the covariance matrix as $A$, then each element $a_{ij}$ of the matrix is obtained as:

$$a_{ij} = E_{i,j} - E_i\,E_j \tag{4}$$

In this way, a symmetric covariance matrix of the input data set is obtained.

c) Determination of eigenvalues and eigenvectors of the covariance matrix. For the numerical calculations, the Jacobi method was selected. This method uses the fact that an orthogonal similarity transformation does not change the eigenvalues of a matrix, while the eigenvectors can be recovered by accumulating the transformations applied. We can thus perform a sequence of such orthogonal transformations on matrix $A$ in order to bring it into the diagonal form $D$:

$$A = W\,D\,W^{T} \tag{5}$$

The diagonal matrix $D$ holds the eigenvalues on its main diagonal, and the corresponding eigenvectors are the columns of matrix $W$. Matrices $D$ and $W$ fulfilling equation (5) can be obtained with the Jacobi method in the following steps:

1) We take the identity matrix of size $n \times n$ as matrix $W$.

2) We take the covariance matrix of size $n \times n$ calculated in point b) as matrix $A$.

3) We select the leading element of matrix $A$, that is, the element with the largest absolute value among those that do not lie on the main diagonal.
We look for its position in the matrix, that is, for indices $p$ and $q$ with $p \neq q$ such that:

$$\forall\, i, j \in \{1, \ldots, n\},\ i \neq j:\quad |a_{pq}| \geq |a_{ij}| \tag{6}$$

4) We calculate the values $c$ and $s$. We first determine:

$$r = \frac{a_{qq} - a_{pp}}{2\,a_{pq}} \tag{7}$$

$$t = \frac{\operatorname{sgn}(r)}{|r| + \sqrt{r^{2} + 1}} \tag{8}$$

where $a_{ij}$ denotes the element of the matrix in the $i$-th row and $j$-th column, $\operatorname{sgn}(r) = 1$ for $r \geq 0$ and $\operatorname{sgn}(r) = -1$ for $r < 0$. Then we determine:

$$c = \frac{1}{\sqrt{t^{2} + 1}} \tag{9}$$

and

$$s = t\,c \tag{10}$$

5) Using the calculated values $c$ and $s$, we create matrix $B$ as the identity matrix of size $n \times n$ in which four elements are changed: $b_{pp} = c$, $b_{qq} = c$, $b_{pq} = s$, $b_{qp} = -s$.

6) We assign a new value to matrix $A$, using the current value of $A$, the matrix $B$ created in the previous step and its transpose:

$$A = B^{T} \cdot A \cdot B \tag{11}$$

7) We assign a new value to matrix $W$, using the current value of $W$ and the matrix $B$ created in step 5:

$$W = W \cdot B \tag{12}$$

8) We check whether the calculations have reached the accuracy $\varepsilon$ assumed at the beginning, that is:

$$\frac{\max\limits_{i, j = 1, \ldots, n;\ i \neq j} |a_{ij}|}{\max\limits_{i = 1, \ldots, n} |a_{ii}|} < \varepsilon \tag{13}$$

If inequality (13) is not fulfilled, we return to step 3 and continue the calculations. Otherwise, the obtained matrix $A$ is diagonal to within the assumed accuracy. As a result, the main diagonal of the obtained matrix $A$ holds the eigenvalues of the original covariance matrix, and the columns of the obtained matrix $W$ hold the corresponding eigenvectors.

d) Two coordinate axes determination. Among the vectors calculated in step c), we select the two eigenvectors corresponding to the two largest (in absolute value) eigenvalues of the covariance matrix. We denote them as $V_1 = (v_{1,1}, v_{1,2}, \ldots, v_{1,n})$ and $V_2 = (v_{2,1}, v_{2,2}, \ldots, v_{2,n})$. In this way we obtain the two coordinate axes onto which all data will be projected.

e) Drawing a set of points on the screen.
For each point $x_k$ we determine its two coordinates $(\tilde{x}_{k,1}, \tilde{x}_{k,2})$ obtained by projection onto the axes $V_1$ and $V_2$, that is:

$$\tilde{x}_{k,1} = \sum_{i=1}^{n} v_{1,i}\,x_{k,i} \tag{14}$$

$$\tilde{x}_{k,2} = \sum_{i=1}^{n} v_{2,i}\,x_{k,i} \tag{15}$$

Thanks to this, the image of each vector can be presented on the computer screen. This is done by drawing a symbol representing the class to which the data vector $x_k$ belongs at the point with coordinates $(\tilde{x}_{k,1}, \tilde{x}_{k,2})$ on the screen. In this way, an image of the multidimensional points representing different classes of coal appears on the computer screen.

4. The results of experiments

Within the study, a computer system based on the assumptions outlined in the previous section was developed in order to visualise the seven-dimensional data describing coal samples. It was written in the C++ programming language using Microsoft Visual Studio. The obtained results are presented in Figs. 1-3. These views show how the 7-dimensional data are transformed by PCA into two dimensions. Despite the considerable reduction in the number of dimensions, the PCA visualisation algorithm yields the view that presents the largest variability in the data. In this way, important properties of the 7-dimensional data can be seen on the 2-dimensional screen. It was decided to check whether this method of multidimensional data visualisation makes it possible to divide the space of samples into areas with different suitability for the fluidised gasification process. Figures 1-3 present views of points representing the seven-dimensional data vectors describing coal samples obtained from the Janina and Wieczorek coal mines.
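Steps a) to e) of Section 3.2 can be sketched in a few lines of Python with NumPy. This is a minimal illustration of the described procedure, not the authors' C++ system: the function names, the default tolerance `eps` and the rotation cap `max_rotations` are assumptions introduced here, and min-max scaling is assumed as the way of mapping each property to the unit interval.

```python
import numpy as np


def scale_unit(X):
    """Step a: min-max scale every column (property) to the unit interval."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # constant columns: avoid 0/0
    return (X - lo) / span


def covariance(X):
    """Step b: a_ij = E_ij - E_i * E_j, following eqs. (2)-(4)."""
    m = X.shape[0]
    E = X.sum(axis=0) / m      # eq. (2)
    E2 = X.T @ X / m           # eq. (3)
    return E2 - np.outer(E, E)  # eq. (4)


def jacobi_eigen(A, eps=1e-12, max_rotations=10_000):
    """Step c: Jacobi rotations; returns (eigenvalues, W with eigenvector columns)."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    W = np.eye(n)
    for _ in range(max_rotations):
        off = np.abs(A - np.diag(np.diag(A)))
        p, q = np.unravel_index(np.argmax(off), off.shape)  # eq. (6)
        if off[p, q] <= eps * np.max(np.abs(np.diag(A))):   # eq. (13)
            break
        r = (A[q, q] - A[p, p]) / (2.0 * A[p, q])            # eq. (7)
        sgn = 1.0 if r >= 0 else -1.0
        t = sgn / (abs(r) + np.sqrt(r * r + 1.0))            # eq. (8)
        c = 1.0 / np.sqrt(t * t + 1.0)                       # eq. (9)
        s = t * c                                            # eq. (10)
        B = np.eye(n)
        B[p, p] = B[q, q] = c
        B[p, q], B[q, p] = s, -s
        A = B.T @ A @ B                                      # eq. (11)
        W = W @ B                                            # eq. (12)
    return np.diag(A).copy(), W


def pca_project(X):
    """Steps a-e: return the 2-D coordinates and the two axes V1, V2 (as columns)."""
    Xs = scale_unit(X)
    values, W = jacobi_eigen(covariance(Xs))
    order = np.argsort(-np.abs(values))[:2]  # step d: two largest |eigenvalues|
    V = W[:, order]
    return Xs @ V, V                         # step e: eqs. (14)-(15)
```

As in equations (14) and (15), the scaled data are projected directly onto the two axes without subtracting the mean; centring would only shift every point by the same offset and would not change the relative positions in the view.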
To obtain these views, the developed visualisation system calculated the covariance matrix:

$$\operatorname{cov} = \begin{pmatrix}
0.0350 & -0.0202 & -0.0294 & -0.0081 & -0.0231 & -0.0224 & 0.0173 \\
-0.0202 & 0.0583 & 0.0640 & 0.0173 & 0.0621 & 0.0615 & -0.0595 \\
-0.0294 & 0.0640 & 0.0816 & 0.0254 & 0.0686 & 0.0686 & -0.0587 \\
-0.0081 & 0.0173 & 0.0254 & 0.0295 & 0.0183 & 0.0187 & -0.0130 \\
-0.0231 & 0.0621 & 0.0686 & 0.0183 & 0.0672 & 0.0664 & -0.0648 \\
-0.0224 & 0.0615 & 0.0686 & 0.0187 & 0.0664 & 0.0658 & -0.0635 \\
0.0173 & -0.0595 & -0.0587 & -0.0130 & -0.0648 & -0.0635 & 0.0680
\end{pmatrix}$$

and the eigenvectors corresponding to the two largest (in absolute value) eigenvalues:

$$V_1 = (-0.1666,\ 0.4115,\ 0.4673,\ 0.1364,\ 0.4439,\ 0.4395,\ -0.4192)$$

$$V_2 = (-0.6610,\ -0.1188,\ 0.3043,\ 0.5051,\ -0.1160,\ -0.1005,\ 0.4215)$$

Figure 1 shows the discussed data divided into coal samples from the Janina and Wieczorek coal mines. The figure clearly shows that the images of points representing samples of coal from the different mines occupy separate subareas and gather in clusters, and that over the whole area of the figure these clusters can easily be separated from each other. On the basis of this figure, it can be stated that the PCA method of multidimensional data visualisation makes it possible to divide the space of samples into areas belonging to different coal mines. As a result, further, unknown samples can be assigned by their origin to Janina or Wieczorek coal mine by visualising them.

Fig. 1. The view of 7-dimensional data divided according to the place of extraction. The images of points representing coal samples obtained in Janina coal mine are marked with squares, and those representing coal samples obtained in Wieczorek coal mine are marked with circles.

Fig. 2. The view of 7-dimensional data representing samples of coal with different suitability for the fluidised gasification process.
The images of points representing coal samples less suitable for gasification are marked with squares, and those representing coal samples more suitable for gasification are marked with circles.

Figure 2 presents the discussed data according to a completely different division: into samples of coal more suitable and less suitable for gasification. The figure shows that the images of points representing these two groups occupy separate subareas and gather in clusters, and that over the whole area of the figure these clusters can easily be separated from each other. On the basis of this figure, it can be stated that the PCA method of multidimensional data visualisation makes it possible to divide the space of samples into areas with different suitability for the fluidised gasification process. As a result, further, unknown samples can be assigned to the group more suitable or less suitable for gasification by visualising them. This is especially important because, in the analysed situation, the coal samples more suitable for gasification occupy the interior of a seven-dimensional cuboid, which is a considerable simplification. It results directly from the fact that the assumed conditions defining membership of this group (the card of technological suitability of coal) are simple inequalities, with which membership is easy to check. In reality, however, the region of membership may turn out to have a considerably more complicated shape. In that case, given a larger number of samples whose membership of the class of coal more suitable for gasification has been established empirically, PCA could be used to try to divide the space into areas representing samples of coal more and less suitable for gasification.
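The covariance matrix and the two eigenvectors reported earlier in this section can be cross-checked numerically, and the same eigenvectors show how a further sample would be placed in the view using equations (14) and (15). The sketch below is only an illustration: the helper names are introduced here, and `new_sample` is a made-up vector assumed to be already scaled with the same per-property minima and maxima as the original data, which the text does not list.

```python
import numpy as np

# Covariance matrix and eigenvectors as printed in Section 4 (4 decimals).
A = np.array([
    [0.0350, -0.0202, -0.0294, -0.0081, -0.0231, -0.0224, 0.0173],
    [-0.0202, 0.0583, 0.0640, 0.0173, 0.0621, 0.0615, -0.0595],
    [-0.0294, 0.0640, 0.0816, 0.0254, 0.0686, 0.0686, -0.0587],
    [-0.0081, 0.0173, 0.0254, 0.0295, 0.0183, 0.0187, -0.0130],
    [-0.0231, 0.0621, 0.0686, 0.0183, 0.0672, 0.0664, -0.0648],
    [-0.0224, 0.0615, 0.0686, 0.0187, 0.0664, 0.0658, -0.0635],
    [0.0173, -0.0595, -0.0587, -0.0130, -0.0648, -0.0635, 0.0680],
])
V1 = np.array([-0.1666, 0.4115, 0.4673, 0.1364, 0.4439, 0.4395, -0.4192])
V2 = np.array([-0.6610, -0.1188, 0.3043, 0.5051, -0.1160, -0.1005, 0.4215])


def eigen_residual(A, v):
    """Return the Rayleigh quotient of v and the norm of A v - lambda v."""
    lam = v @ A @ v / (v @ v)
    return lam, np.linalg.norm(A @ v - lam * v)


def project(x_scaled):
    """Eqs. (14)-(15): coordinates of one scaled sample in the (V1, V2) view."""
    x = np.asarray(x_scaled, dtype=float)
    return float(V1 @ x), float(V2 @ x)


lam1, res1 = eigen_residual(A, V1)
lam2, res2 = eigen_residual(A, V2)
print("eigenvalue estimates:", lam1, lam2)
print("residual norms:", res1, res2)
print("share of total variance:", (lam1 + lam2) / np.trace(A))

# Hypothetical, already scaled sample (values invented for illustration).
new_sample = [0.3, 0.6, 0.5, 0.2, 0.7, 0.7, 0.2]
print("position in the view:", project(new_sample))
```

Small residual norms indicate that the printed vectors are consistent with the printed matrix up to rounding, and the ratio of the two eigenvalue estimates to the trace gives the share of total variance retained by the two-dimensional view.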
A mapping obtained in this way may reflect reality more accurately. The earlier statement that the PCA method of multidimensional data visualisation makes it possible to divide the space of samples into areas with different suitability for the fluidised gasification process therefore takes on particular significance.

Fig. 3. The view of 7-dimensional data representing samples of coal with different suitability for the fluidised gasification process, with the condition on the chlorine content omitted. The images of points representing coal samples less suitable for gasification are marked with squares, and those representing coal samples more suitable for gasification are marked with circles.

Additionally, as it turns out, the same division of the space contains considerably more information. This is shown in Figure 3, which presents the discussed data divided into samples of coal more suitable and less suitable for gasification with the condition on the chlorine content omitted. Here too, despite the omission of this condition, the images of points representing samples of coal more and less suitable for gasification occupy separate subareas and gather in clusters, and over the whole area of the figure these clusters can easily be separated from each other. On the basis of this figure, it can be stated that the PCA method of multidimensional data visualisation makes it possible to divide the space of samples into areas with different suitability for the fluidised gasification process even when the conditions determining this suitability change. In this specific case this is particularly important, because the chlorine content influences only the degree of contamination resulting from gasification and not the effectiveness of the gasification itself. The assignment of samples, however, changes completely.
For comparison, in Figure 2 only 18 samples were identified as suitable for effective gasification. Of those 18 samples, 17 came from Janina coal mine and only one from Wieczorek coal mine. With the condition on the chlorine content omitted (Figure 3), as many as 78 of the 99 analysed samples were identified as suitable for effective gasification. Of those 78 samples, 37 came from Janina coal mine and 41 from Wieczorek coal mine. It can be concluded that if the chlorine contamination can be disregarded, coal from Wieczorek coal mine will be the more effective choice for gasification; otherwise, coal from Janina coal mine will be.

It should be noted that the PCA visualisation algorithm does not use any information on which class the points representing the data belong to. The way in which the images of points representing a given class are grouped therefore depends only on certain properties of the data identified by the algorithm. All three figures were created by projecting the individual data vectors onto the two eigenvectors corresponding to the two largest (in absolute value) eigenvalues of the same covariance matrix, because for all figures the covariance matrix is calculated from exactly the same seven-dimensional data, without any information on class membership. The location of the points is therefore identical in Figures 1-3; only their assignment to the respective classes differs.

5.
Conclusions

The experiments, consisting of the visualisation of seven-dimensional data using PCA, led to the following conclusions:

1) The multidimensional visualisation using PCA shows that the images of points representing samples of coal more suitable and less suitable for gasification occupy separate subareas of the space and gather in clusters which can easily be separated from each other. The PCA method makes it possible to divide the space of samples into areas with different suitability for the fluidised gasification process. As a result, further, unknown samples can be assigned to the group more suitable or less suitable for gasification by visualising them.

2) The multidimensional visualisation using PCA shows that the images of points representing samples of coal from the Janina and Wieczorek coal mines occupy separate subareas and gather in clusters which can easily be separated from each other. The space of samples can thus be divided into areas belonging to different coal mines. As a result, further, unknown samples can be assigned by their origin to Janina or Wieczorek coal mine by visualising them.

3) The PCA visualisation algorithm does not use any information on which class the points representing the data belong to. The way in which the images of points representing a given class are grouped therefore depends only on certain properties of the data identified by the algorithm, irrespective of their allocation to different classes.

4) The same division of the space of samples obtained using PCA groups the points representing the analysed data both by place of extraction (Janina and Wieczorek coal mines) and by suitability for the fluidised gasification process.
5) On the basis of the card of technological suitability of coal, only 18 of the 99 analysed samples were identified as suitable for effective gasification. Of those 18 samples, 17 came from Janina coal mine and only one from Wieczorek coal mine.

6) The situation changes dramatically when the condition on the chlorine content is omitted. On the basis of the same card of technological suitability of coal, as many as 78 of the 99 analysed samples were then identified as suitable for effective gasification. Of those 78 samples, 37 came from Janina coal mine and 41 from Wieczorek coal mine.

7) An undoubted advantage of the PCA method is that no parameters need to be selected during the visualisation, in contrast to many other methods of multidimensional data visualisation.

Archives of Mining Sciences, de Gruyter

Publisher: de Gruyter
Copyright: Copyright © 2016 by the
ISSN: 1689-0469
eISSN: 1689-0469
DOI: 10.1515/amsc-2016-0038

Abstract

Arch. Min. Sci., Vol. 61 (2016), No 3, p. 523­535 Electronic version (in color) of this paper is available: http://mining.archives.pl DOI 10.1515/amsc-2016-0038 DARIUSZ JAMRÓZ*, TOMASZ NIEDOBA**, AGNIESZKA SUROWIAK**, TADEUSZ TUMIDAJSKI** ZASTOSOWANIE WIZUALIZACJI WIELOWYMIAROWYCH DANYCH ZA POMOC PCA DO OCENY MOLIWOCI PODZIALU PRÓBEK WGLA ZE WZGLDU NA ICH PRZYDATNO DO ZGAZOWANIA Methods serving to visualise multidimensional data through the transformation of multidimensional space into two-dimensional space, enable to present the multidimensional data on the computer screen. Thanks to this, qualitative analysis of this data can be performed in the most natural way for humans, through the sense of sight. An example of such a method of multidimensional data visualisation is PCA (principal component analysis) method. This method was used in this work to present and analyse a set of seven-dimensional data (selected seven properties) describing coal samples obtained from Janina and Wieczorek coal mines. Coal from these mines was previously subjected to separation by means of a laboratory ring jig, consisting of ten rings. With 5 layers of both types of coal (with 2 rings each) were obtained in this way. It was decided to check if the method of multidimensional data visualisation enables to divide the space of such divided samples into areas with different suitability for the fluidised gasification process. To that end, the card of technological suitability of coal was used (Sobolewski et al., 2012; 2013), in which key, relevant and additional parameters, having effect on the gasification process, were described. As a result of analyses, it was stated that effective determination of coal samples suitability for the on-surface gasification process in a fluidised reactor is possible. The PCA method enables the visualisation of the optimal subspace containing the set requirements concerning the properties of coals intended for this process. 
Keywords: Principal Component Analysis, multidimensional visualisation, coal gasification, jigging * ** AGH UNIVERSITY OF SCIENCE AND TECHNOLOGY, FACULTY OF ELECTRICAL ENGINEERING, AUTOMATICS, COMPUTER SCIENCE AND BIOMEDICAL ENGINEERING, DEPARTMENT OF APPLIED COMPUTER SCIENCE, AL. MICKIEWICZA 30, 30-059 KRAKÓW, POLAND, E-MAIL: jamroz@agh.edu.pl AGH UNIVERSITY OF SCIENCE AND TECHNOLOGY, FACULTY OF MINING AND GEOENGINEERING, DEPARTMENT OF ENVIRONMENTAL ENGINEERING AND MINERAL PROCESSING, AL. MICKIEWICZA 30, 30-059 KRAKÓW, POLAND, E-MAIL: tniedoba@agh.edu.pl, asur@agh.edu.pl, tadeusz.tumidajski@agh.edu.pl Proces zgazowania wgla jest jedn z technologii, które zyskuj coraz szersz uwag wród technologów zajmujcych si jego przeróbk i utylizacj. Ze wzgldu na typ zgazowania wyrónia si dwa glówne sposoby: zgazowanie naziemne i podziemne. Kady z tych typów mona jednak przeprowadzi za pomoc rónych technologii. W przypadku zgazowania naziemnego, jedn z takich technologii jest zgazowanie w reaktorze fluidalnym. Do tego typu zgazowania zostaly opracowane wytyczne w ramach projektu NCBiR nr 23.23.100.8498/R34 pt. ,,Opracowanie technologii zgazowania wgla dla wysokoefektywnej produkcji paliw i energii" w ramach strategicznego programu bada naukowych i prac rozwojowych pt. ,,Zaawansowane technologie pozyskiwania energii" (Marciniak-Kowalska, 2011-12; Sobolewski et al., 2012; 2013; Strugala et al., 2011; 2012). Autorzy wybrali glówne z tych wytycznych, dotyczcych zalecanych poziomów okrelonych cech wgla. W celu zbadania wgla pod ktem ich przydatnoci do zgazowania pobrano próbki dwóch wgli: pochodzcych z Zakladu Górniczego Janina oraz z Kopalni Wgla Kamiennego Wieczorek. Kady z tych wgli zostal poddany procesowi wzbogacania w laboratoryjnej osadzarce piercieniowej (10 piercieni, wgiel w klasach wydzielonych z przedzialu 0-18 mm). 
After the separation process was completed, the material was divided into 5 layers (2 rings each) and each layer was sieved into 10 grain classes, establishing the yields of layers and classes. The products obtained in this way (grain classes), after separation of analytical samples, were subjected to elemental and technical chemical analysis of coal in order to characterise the properties affecting gasification processes. In total, 99 samples were obtained from both mines (50 from Janina and 49 from Wieczorek; in one of the layers the 16-18 mm class was not obtained), characterised by the following parameters: total sulphur content, hydrogen content, nitrogen content, chlorine content, total carbon content, heat of combustion and ash content. Example data for one of the obtained layers are presented in Table 1. In addition, the card of technological suitability of coal was used (Sobolewski et al., 2012; 2013), which describes key, relevant and additional parameters affecting the gasification process. On its basis, the coal samples that undergo gasification effectively were marked. To visualise the data, one of the modern methods of multidimensional statistical factor analysis was applied: PCA (Principal Component Analysis). In this method, multidimensional data are projected orthogonally onto a plane represented by specially selected vectors V1, V2. These are the eigenvectors corresponding to the two largest (in absolute value) eigenvalues of the covariance matrix of the set of observations. This choice of V1, V2 yields the planar image that presents the most variability in the data. The algorithm and principles of the method are presented in detail in Section 3 of the article. Three types of analysis were performed using PCA. The first image was intended to establish whether the origin of the coal can be identified, that is, whether coal from ZG Janina can be separated from coal from KWK Wieczorek. The answer was affirmative.
The conditions resulting from the requirements specified in the card of technological suitability of coal were then imposed on the data prepared in this way. When all conditions were taken into account, only 17 samples from ZG Janina and just one from KWK Wieczorek met all criteria, as shown in Figure 2. This was found to be mainly due to the chlorine content, which exceeds the imposed limits. This property does not, however, crucially affect the gasification process itself; it matters for environmental protection. A similar analysis was therefore carried out with the condition on this property dropped. Without the chlorine requirement, 37 samples from ZG Janina and 41 samples from KWK Wieczorek met the remaining recommendations for on-surface gasification in a fluidised bed reactor. This confirms the authors' earlier observations in this area. In both cases, multidimensional visualisation using PCA showed that the images of points representing coal samples more and less suitable for gasification occupy separate subareas of the space and gather in clusters that can easily be separated from each other. It was therefore found that PCA makes it possible to divide the sample space into areas of different suitability for fluidised gasification, both with the chlorine restriction and without it. The application of PCA to identify the suitability of coal samples for gasification is novel and has not been used before. Other methods could also be applied for this purpose. It should be stressed, however, that an undoubted advantage of PCA is that no parameters need to be selected during visualisation, in contrast to many other methods of multidimensional data visualisation.

Keywords (translated from Polish): PCA analysis, multidimensional visualisation, coal gasification, beneficiation in jigs

1.
Introduction

This article analyses coal samples from two coal mines, Janina Mining Plant and Wieczorek Coal Mine. The samples were taken in order to evaluate their suitability for on-surface gasification in a fluidised bed. The properties of coals directed to gasification must comply with set limits (Blaschke, 2009; Borowiecki et al., 2008; Chmielniak & Tomaszewicz, 2012; Kosminski et al., 2006; Lee et al., 2007; Marciniak-Kowalska, 2012-13; Strugala et al., 2011; Strugala & Czerski, 2012; Surowiak, 2013a,b; 2014a,b), and it must be noted that these properties are linked with each other. The evaluation of coal suitability for gasification should therefore be conducted multidimensionally, using multidimensional distributions of properties and their statistics (Ahmed & Drzymala, 2005; Brożek & Surowiak, 2005, 2007; 2010; Drzymala, 2009; Jamróz, 2009; 2014a-c; Jamróz & Niedoba, 2014; Niedoba & Jamróz, 2013; Niedoba, 2009, 2011, 2013a,b, 2014; Niedoba & Surowiak, 2012; Surowiak & Brożek, 2014a,b; Tumidajski, 1997; Tumidajski & Saramak, 2009). The analysis of multidimensional properties and statistics naturally begins with the particle density distribution and particle size distribution of coals and is then extended to further coal properties, especially the contents of components and the response of coal to processing. The analysis of coal in terms of the distribution of so-called class-fractions gives initial information on the capability of coal to develop surface area and on the concentration of flammable and volatile parts and ash. Multidimensional visualisation methods, in turn, allow for the combined interpretation of all measured properties.

2.
Data

To investigate the beneficiation capability of coal intended for gasification in a fluidised bed, two bituminous coals were used: coal from Janina Mining Plant (type 31.2) and coal from Wieczorek Coal Mine (type 32). Each was subjected to beneficiation in a laboratory ring jig (10 rings, coal in the 0-18 mm class). After the separation process was completed, the material was divided into 5 layers (2 rings each) and each layer was sieved into 10 grain classes, establishing the yields of layers and classes. The products obtained in this way (grain classes), after separation of analytical samples, were subjected to elemental and technical chemical analysis of coal in order to characterise the features influencing gasification processes. In total, 99 samples were obtained from both mines (50 from Janina and 49 from Wieczorek; in one of the layers the 16-18 mm class was not obtained), each described by the following parameters: total sulphur content, hydrogen content, nitrogen content, chlorine content, total carbon content, heat of combustion and ash content. In addition, the card of technological suitability of coal was used (Sobolewski et al., 2012, 2013), which describes key, relevant and additional parameters affecting the gasification process. On the basis of this card, the coal samples that undergo gasification effectively were identified. The conditions used are: calorific value > 18000 kJ/kg, ash content < 25%, chlorine content < 0.1%, total sulphur content < 2%, carbon content > 60%, hydrogen content between 3.5% and 5.5%, nitrogen content < 2%. Under these conditions, only 18 of the 99 analysed samples were identified as suitable for effective gasification. Of those 18 samples, 17 came from Janina Mining Plant and only one from Wieczorek Coal Mine. Table 1 presents an example of the obtained data.
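The conditions listed above amount to a set of simple inequalities, so the suitability check can be sketched in a few lines. The following is only an illustrative sketch: the function name, the dictionary keys and the sample values are hypothetical, and treating the hydrogen bounds as inclusive is an assumption, since the text only gives the range 3.5% to 5.5%.

```python
def is_suitable(sample, use_chlorine=True):
    """Check one coal sample against the suitability conditions of Section 2.

    `sample` is a dict with hypothetical keys; percentages are in %,
    calorific value in kJ/kg.
    """
    checks = [
        sample["calorific_value"] > 18000,
        sample["ash"] < 25.0,
        sample["total_sulphur"] < 2.0,
        sample["carbon"] > 60.0,
        3.5 <= sample["hydrogen"] <= 5.5,  # inclusive bounds are an assumption
        sample["nitrogen"] < 2.0,
    ]
    if use_chlorine:
        # Chlorine limit; the paper also analyses the case without it.
        checks.append(sample["chlorine"] < 0.1)
    return all(checks)


# Hypothetical sample (not a measurement from the paper): it fails only
# the chlorine condition.
example = {
    "calorific_value": 25000, "ash": 10.0, "total_sulphur": 1.2,
    "carbon": 65.0, "hydrogen": 4.2, "nitrogen": 1.1, "chlorine": 0.15,
}
print(is_suitable(example))                      # all conditions
print(is_suitable(example, use_chlorine=False))  # chlorine condition omitted
```

The `use_chlorine` switch mirrors the two variants of the analysis discussed later in the paper, with and without the chlorine condition.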
TABLE 1
Elemental analysis of coal in layer I after beneficiation in the jig, Janina Mining Plant
Columns: Class d [mm]; Total sulphur content Sta [%]; Hydrogen content Ha [%]; Nitrogen content Na [%]; Chlorine content Cla [%]; Total carbon content Ca [%]; Heat of combustion Qsa [kJ/kg]; Ash content Aa [%]
Rows (grain classes): < 1.00; 1.00-2.00; 2.00-3.15; 3.15-5.00; 5.00-6.30; 6.30-8.00; 8.00-10.00; 10.00-12.50; 12.50-16.00; 16.00-18.00

3. Principal component analysis

Methods of multidimensional visualisation are increasingly used as instruments for statistical analysis. A number of such methods have been described in many publications (Aldrich, 1998; Asimov, 1985; Assa et al., 1999; Chatterjee et al., 1993; Cleveland, 1984; Cook et al., 1995; Chou et al., 1999; Inselberg, 1985; Jain & Mao, 1992; Kim et al., 2000; Kraaijveld et al., 1995; Gennings et al., 1999; Sobol & Klein, 1989). The authors have also used methods of this type for the analysis and classification of coal types (Jamróz, 2011, 2009; Jamróz & Niedoba, 2013, 2014; Niedoba, 2013, 2014; Niedoba & Jamróz, 2013). One of these methods is PCA (Principal Component Analysis).

3.1. The description of the method

PCA is one of the statistical methods of factor analysis. In this method, multidimensional data are projected orthogonally onto a plane represented by specially selected vectors V1, V2. These are the eigenvectors corresponding to the two largest (in absolute value) eigenvalues of the covariance matrix of the set of observations. This selection of vectors V1, V2 yields the image in a plane that represents the most variability in the data.

3.2. The algorithm

The input data set consists of elements described by n properties. It can therefore be treated as a set of n-dimensional vectors. Let us denote the k-th input data vector as $x_k = (x_{k,1}, x_{k,2}, \ldots, x_{k,n})$. The algorithm that realises the visualisation using PCA consists of several steps:

a) Input data scaling.
Individual properties, represented by individual data dimensions, are scaled so that they all fall into the same preset interval. It was decided to scale the individual coordinates (properties) of the vectors of the data set to the interval (0, 1).

b) Covariance matrix determination. We use the general formula for the covariance:

$$\operatorname{cov}(X, Y) = E(XY) - E(X)\,E(Y) \tag{1}$$

where $E$ denotes the expected value. We thus first calculate the expected values:

$$E_i = \frac{1}{m} \sum_{k=1}^{m} x_{k,i} \tag{2}$$

and

$$E_{i,j} = \frac{1}{m} \sum_{k=1}^{m} x_{k,i}\, x_{k,j} \tag{3}$$

where $E_i$ is the expected value of the i-th coordinate of the input data, $E_{i,j}$ is the expected value of the product of the i-th and j-th coordinates of the input data, $m$ is the number of input data vectors, and $x_{k,i}$ is the i-th coordinate of the k-th input data vector. If we denote the covariance matrix as $A$, then each element $a_{ij}$ of the matrix is obtained as:

$$a_{ij} = E_{i,j} - E_i E_j \tag{4}$$

In this way, a symmetric covariance matrix of the input data set is obtained.

c) Determination of eigenvalues and eigenvectors of the covariance matrix. For the numerical calculations, the Jacobi method was selected. This method uses the fact that an orthogonal similarity transformation does not change the eigenvalues of a matrix, while the eigenvectors are recovered from the accumulated transformations. Thus we can perform a sequence of such orthogonal transformations on matrix $A$ in order to bring it into the diagonal form $D$:

$$A = W \cdot D \cdot W^T \tag{5}$$

The diagonal matrix $D$ has the eigenvalues on its main diagonal, while the corresponding eigenvectors are stored in the columns of matrix $W$. Matrices $D$ and $W$ fulfilling equation (5) can be obtained with the Jacobi method in the following steps:

1) We take the identity matrix of size n x n as matrix $W$,
2) We take the covariance matrix of size n x n calculated in point b) as matrix $A$,
3) We select the leading element of matrix $A$, that is, the element with the largest absolute value among those that do not lie on the main diagonal.
We look for its position in the matrix, that is, for coordinates $p$ and $q$ ($p \neq q$) such that:

$$\forall\, i, j \in \{1, \ldots, n\},\ i \neq j:\quad |a_{pq}| \ge |a_{ij}| \tag{6}$$

4) We calculate values $c$ and $s$. First we determine:

$$r = \frac{a_{qq} - a_{pp}}{2\, a_{pq}} \tag{7}$$

$$t = \frac{\operatorname{sgn}(r)}{|r| + \sqrt{r^2 + 1}} \tag{8}$$

where $a_{ij}$ denotes the element of the matrix in the i-th row and j-th column, $\operatorname{sgn}(r) = 1$ for $r \ge 0$ and $\operatorname{sgn}(r) = -1$ for $r < 0$. Then we determine:

$$c = \frac{1}{\sqrt{t^2 + 1}} \tag{9}$$

and

$$s = t\,c \tag{10}$$

5) Using the calculated values $c$ and $s$, we create matrix $B$ as the identity matrix of size n x n in which four elements are changed: $b_{pp} = c$, $b_{qq} = c$, $b_{pq} = s$, $b_{qp} = -s$,
6) We assign a new value to matrix $A$, using the current value of matrix $A$, matrix $B$ created in the previous step and its transpose:

$$A = B^T \cdot A \cdot B \tag{11}$$

7) We assign a new value to matrix $W$, using the current value of matrix $W$ and matrix $B$ created in step 5:

$$W = W \cdot B \tag{12}$$

8) We check whether the calculations have reached the accuracy $\varepsilon$ assumed at the beginning, that is:

$$\frac{\max\limits_{i,j = 1,\ldots,n;\ i \neq j} |a_{ij}|}{\max\limits_{i = 1,\ldots,n} |a_{ii}|} < \varepsilon \tag{13}$$

If inequality (13) is not fulfilled, we return to step 3 and continue the calculations. Otherwise, the obtained matrix $A$ is diagonal to within the assumed accuracy. As a result, the main diagonal of the obtained matrix $A$ holds the eigenvalues of the original matrix and the columns of the obtained matrix $W$ hold the corresponding eigenvectors.

d) Two coordinate axes determination. Among the vectors calculated in step c), we select the two eigenvectors corresponding to the two largest (in absolute value) eigenvalues of the covariance matrix. We denote them as $V_1 = (v_{1,1}, v_{1,2}, \ldots, v_{1,n})$ and $V_2 = (v_{2,1}, v_{2,2}, \ldots, v_{2,n})$. In this way we obtain the two coordinate axes onto which all data will be projected.

e) Drawing a set of points on the screen.
For each point $x_k$ we determine its two coordinates $(\tilde{x}_{k,1}, \tilde{x}_{k,2})$ obtained after projection onto axes $V_1$ and $V_2$, that is:

$$\tilde{x}_{k,1} = \sum_{i=1}^{n} v_{1,i}\, x_{k,i} \tag{14}$$

$$\tilde{x}_{k,2} = \sum_{i=1}^{n} v_{2,i}\, x_{k,i} \tag{15}$$

Thanks to this, the image of each vector can be presented on the computer screen. This is done by drawing a symbol representing the class to which the data vector $x_k$ belongs at the point with coordinates $(\tilde{x}_{k,1}, \tilde{x}_{k,2})$ on the screen. In this way, an image of multidimensional points representing different classes of coal appears on the computer screen.

4. The results of experiments

To visualise the seven-dimensional data describing the coal samples, a computer system based on the assumptions outlined in the previous section was developed. It was written in the C++ programming language using Microsoft Visual Studio. The obtained results are presented in Figs. 1-3. These views show how 7-dimensional data are transformed into two dimensions by means of PCA. Despite the considerable reduction in the number of dimensions, the PCA visualisation algorithm produces the view that presents the largest variability in the data. In this way, important properties of the 7-dimensional data can be seen on the 2-dimensional screen. It was decided to check whether this method of multidimensional data visualisation makes it possible to divide the space of samples into areas with different suitability for the fluidised gasification process. Figures 1-3 present views of points representing seven-dimensional vectors of data describing coal samples obtained from Janina and Wieczorek coal mines.
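Steps a) to e) of Section 3.2 can be sketched compactly in code. The sketch below is not the authors' C++ system; it is a minimal pure-Python illustration, and several details are assumptions: min-max scaling is used for step a) (the paper does not state the scaling rule), the stopping test of Eq. (13) is written without the division to avoid dividing by zero, and all function names and the `max_rotations` safeguard are hypothetical. It assumes at least two properties per sample.

```python
import math

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def scale_columns(data):
    """Step a): min-max scaling of every property (assumed scaling rule)."""
    cols = transpose(data)
    lo, hi = [min(c) for c in cols], [max(c) for c in cols]
    return [[(x - l) / (h - l) if h > l else 0.0
             for x, l, h in zip(row, lo, hi)] for row in data]

def covariance_matrix(data):
    """Step b): a_ij = E_ij - E_i * E_j, Eqs. (2)-(4)."""
    m, n = len(data), len(data[0])
    e = [sum(row[i] for row in data) / m for i in range(n)]
    return [[sum(row[i] * row[j] for row in data) / m - e[i] * e[j]
             for j in range(n)] for i in range(n)]

def jacobi_eigen(a, eps=1e-12, max_rotations=10000):
    """Step c): Jacobi rotations. Returns (eigenvalues, W), eigenvectors in columns of W."""
    n = len(a)
    a = [row[:] for row in a]
    w = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(max_rotations):
        # Leading off-diagonal element, Eq. (6); symmetry lets us scan i < j only.
        p, q = max(((i, j) for i in range(n) for j in range(i + 1, n)),
                   key=lambda ij: abs(a[ij[0]][ij[1]]))
        # Stopping rule of Eq. (13), rearranged to avoid a division.
        if abs(a[p][q]) <= eps * max(abs(a[i][i]) for i in range(n)):
            break
        r = (a[q][q] - a[p][p]) / (2.0 * a[p][q])                        # Eq. (7)
        t = (1.0 if r >= 0 else -1.0) / (abs(r) + math.sqrt(r * r + 1))  # Eq. (8)
        c = 1.0 / math.sqrt(t * t + 1.0)                                 # Eq. (9)
        s = t * c                                                        # Eq. (10)
        b = [[float(i == j) for j in range(n)] for i in range(n)]
        b[p][p] = b[q][q] = c
        b[p][q], b[q][p] = s, -s
        a = matmul(transpose(b), matmul(a, b))                           # Eq. (11)
        w = matmul(w, b)                                                 # Eq. (12)
    return [a[i][i] for i in range(n)], w

def pca_project(data):
    """Steps d) and e): project scaled data onto the two leading eigenvectors."""
    scaled = scale_columns(data)
    values, w = jacobi_eigen(covariance_matrix(scaled))
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    v1 = [row[order[0]] for row in w]
    v2 = [row[order[1]] for row in w]
    return [(sum(a * b for a, b in zip(v1, x)),      # Eq. (14)
             sum(a * b for a, b in zip(v2, x)))      # Eq. (15)
            for x in scaled]
```

Building the full matrix B for every rotation mirrors the description in the text; a production implementation would normally update only rows and columns p and q. Calling `pca_project(rows)` on a list of equal-length numeric rows returns one (x, y) pair per row, ready to be drawn with a class-specific symbol.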
To obtain these views, the developed visualisation system calculated the covariance matrix:

$$\operatorname{cov} = \begin{pmatrix}
 0.0350 & -0.0202 & -0.0294 & -0.0081 & -0.0231 & -0.0224 &  0.0173 \\
-0.0202 &  0.0583 &  0.0640 &  0.0173 &  0.0621 &  0.0615 & -0.0595 \\
-0.0294 &  0.0640 &  0.0816 &  0.0254 &  0.0686 &  0.0686 & -0.0587 \\
-0.0081 &  0.0173 &  0.0254 &  0.0295 &  0.0183 &  0.0187 & -0.0130 \\
-0.0231 &  0.0621 &  0.0686 &  0.0183 &  0.0672 &  0.0664 & -0.0648 \\
-0.0224 &  0.0615 &  0.0686 &  0.0187 &  0.0664 &  0.0658 & -0.0635 \\
 0.0173 & -0.0595 & -0.0587 & -0.0130 & -0.0648 & -0.0635 &  0.0680
\end{pmatrix}$$

and the eigenvectors corresponding to the two largest (in absolute value) eigenvalues:

$$V_1 = (-0.1666,\ 0.4115,\ 0.4673,\ 0.1364,\ 0.4439,\ 0.4395,\ -0.4192)$$

$$V_2 = (-0.6610,\ -0.1188,\ 0.3043,\ 0.5051,\ -0.1160,\ -0.1005,\ 0.4215)$$

Figure 1 illustrates the discussed data divided into coal samples from Janina and Wieczorek coal mines. The figure clearly shows that the images of points representing samples of coal from different mines occupy separate subareas and accumulate in clusters. Across the whole area of the figure, these clusters can be easily separated from each other. On the basis of this figure, it can be stated that the PCA method of multidimensional data visualisation makes it possible to divide the space of samples into areas belonging to different coal mines. Consequently, further unknown samples can be assigned, through their visualisation, to the group coming from Janina or from Wieczorek coal mine.

Fig. 1. The view of 7-dimensional data divided according to the place of extraction. The images of points representing coal samples obtained in Janina coal mine are marked with squares, those obtained in Wieczorek coal mine with circles.

Fig. 2. The view of 7-dimensional data representing samples of coal with different suitability for the fluidised gasification process.
The images of points representing coal samples less suitable for gasification are marked with squares, those representing coal samples more suitable for gasification with circles.

Figure 2 presents the discussed data according to a completely different division: into samples of coal more suitable and less suitable for gasification. The figure shows that the images of points representing these two groups occupy separate subareas and accumulate in clusters. Across the whole area of the figure, these clusters can be easily separated from each other. On the basis of this figure, it can be stated that the PCA method of multidimensional data visualisation makes it possible to divide the space of samples into areas with different suitability for the fluidised gasification process. Consequently, further unknown samples can be assigned, through their visualisation, to the group more suitable or less suitable for gasification. This is especially important because, in the analysed situation, the coal samples more suitable for gasification occupy the interior of a seven-dimensional cuboid, which is a considerable simplification. It results directly from the fact that the conditions defining membership of this group (the card of technological suitability of coal) are simple inequalities with which membership can be checked easily. In reality, however, the region of suitable samples may have a considerably more complicated shape. In that case, given a larger number of samples whose membership of the class more suitable for gasification has been established empirically, PCA could be used to try to divide the space into areas representing samples of coal more and less suitable for gasification.
The mapping obtained in this way may then reflect reality more accurately. The earlier statement that the PCA method of multidimensional data visualisation makes it possible to divide the space of samples into areas with different suitability for the fluidised gasification process is therefore of particular significance.

Fig. 3. The view of 7-dimensional data representing samples of coal with different suitability for the fluidised gasification process, with the condition for the chlorine content omitted. The images of points representing coal samples less suitable for gasification are marked with squares, those representing coal samples more suitable for gasification with circles.

Additionally, as it turns out, the same space division contains considerably more information. This is shown in Figure 3, which presents the discussed data divided into samples of coal more and less suitable for gasification with the condition for the chlorine content omitted. Here too, despite the omission of the chlorine condition, the images of points representing samples of coal more and less suitable for gasification occupy separate subareas and accumulate in clusters. Across the whole area of the figure, these clusters can be easily separated from each other. On the basis of this figure, it can be stated that the PCA method of multidimensional data visualisation makes it possible to divide the space of samples into areas with different suitability for the fluidised gasification process even when the conditions determining this suitability change. In this specific case this is particularly important, because the chlorine content influences only the degree of contamination resulting from gasification and not its effectiveness. The assignment of samples, however, changes completely.
For comparison, in Figure 2 only 18 samples were identified as suitable for effective gasification. Of those 18 samples, 17 came from Janina coal mine and only one from Wieczorek coal mine. With the condition for the chlorine content omitted (Figure 3), as many as 78 of the 99 analysed samples were identified as suitable for effective gasification. Of these 78 samples, 37 came from Janina coal mine and 41 from Wieczorek coal mine. It can be concluded that if chlorine contamination can be disregarded, coal from Wieczorek coal mine will be more effective for gasification; otherwise, coal from Janina coal mine will be.

It should be noted that the PCA visualisation algorithm does not use any information on which class the points representing the data belong to. The way in which the images of points of a given class are grouped therefore depends only on properties of the data identified by the algorithm. Figures 1-3 thus differ only in the assignment of individual points to classes. All three figures were created by projecting the individual data vectors onto the two eigenvectors corresponding to the two largest (in absolute value) eigenvalues of the same covariance matrix, because for all figures the covariance matrix is calculated for exactly the same seven-dimensional data, without any information on the class membership of the points. The location of points in all three figures is therefore identical; only their assignment to the respective classes differs.

5.
Conclusions

The conducted experiments, consisting of the visualisation of seven-dimensional data using PCA, lead to the following conclusions:

1) Multidimensional visualisation using PCA shows that the images of points representing samples of coal more suitable and less suitable for gasification occupy separate subareas of space and accumulate in clusters which can be easily separated from each other. The PCA method makes it possible to divide the space of samples into areas with different suitability for the fluidised gasification process. Consequently, further unknown samples can be assigned, through their visualisation, to the group more suitable or less suitable for gasification.
2) Multidimensional visualisation using PCA also shows that the images of points representing samples of coal from Janina and Wieczorek coal mines occupy separate subareas and accumulate in clusters which can be easily separated from each other. The space of samples can therefore be divided into areas belonging to different coal mines, and further unknown samples can be assigned, through their visualisation, to the group coming from Janina or from Wieczorek coal mine.
3) The PCA visualisation algorithm does not use any information on which class the points representing the data belong to. The way in which the images of points of a given class are grouped therefore depends only on properties of the data identified by the algorithm, irrespective of their allocation to different classes.
4) The same division of the space of samples obtained using PCA simultaneously groups the points representing the analysed data both by place of extraction (Janina and Wieczorek coal mines) and by suitability for the fluidised gasification process.
5) On the basis of the card of technological suitability of coal, only 18 of the 99 analysed samples were identified as suitable for effective gasification. Of those 18 samples, 17 came from Janina coal mine and only one from Wieczorek coal mine.
6) The situation changes dramatically when the condition for the chlorine content is omitted. On the basis of the same card of technological suitability of coal, as many as 78 of the 99 analysed samples were then identified as suitable for effective gasification. Of those 78 samples, 37 came from Janina coal mine and 41 from Wieczorek coal mine.
7) An undoubted advantage of the PCA method is that no parameters need to be selected during the visualisation, in contrast to many other methods of multidimensional data visualisation.
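Conclusions 1) and 2) rely on placing a further, unknown sample in the same plane as the analysed data. A minimal sketch of that projection, using the eigenvectors V1 and V2 reported in Section 4 and Eqs. (14)-(15), is given below. It assumes that the sample has already been scaled with the same rule as the original data (the scaling constants are not given in the paper) and that the coordinate order follows the order of properties in Section 2; the function names and the sample values are hypothetical.

```python
# Eigenvectors as reported in Section 4 of the paper.
V1 = (-0.1666, 0.4115, 0.4673, 0.1364, 0.4439, 0.4395, -0.4192)
V2 = (-0.6610, -0.1188, 0.3043, 0.5051, -0.1160, -0.1005, 0.4215)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project(scaled_sample):
    """Eqs. (14)-(15): plane coordinates of an already scaled 7-vector."""
    if len(scaled_sample) != len(V1):
        raise ValueError("expected a 7-dimensional sample")
    return dot(V1, scaled_sample), dot(V2, scaled_sample)

# Sanity check: the reported vectors are orthonormal up to rounding.
print(round(dot(V1, V1), 3), round(dot(V2, V2), 3), round(dot(V1, V2), 3))

# Hypothetical scaled sample (assumed order: sulphur, hydrogen, nitrogen,
# chlorine, carbon, heat of combustion, ash); not a measurement from the paper.
sample = (0.20, 0.70, 0.60, 0.10, 0.80, 0.80, 0.15)
x1, x2 = project(sample)
print(x1, x2)
```

The resulting point can then be compared by eye with the clusters in Figures 1-3, which is the qualitative classification the conclusions describe.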

Journal

Archives of Mining Sciences, de Gruyter

Published: Sep 1, 2016

References