Visualization of Categorical Data Using Extracat Package in R
Brzezińska, Justyna
2018 Econometrics
doi: 10.15611/eada.2018.2.01
Abstract: Visualization plays a crucial role in the research process. There are several advanced plots for visualizing categorical data, such as the mosaic, association, double-decker, sieve and fourfold plots, which are based on the graphical presentation of residuals in a contingency table. In this paper we present new methods for visualizing categorical data, namely the rmb, fluctile and scpcp plots available in the extracat package in R. This package provides a well-structured representation of categorical data and allows for a detailed presentation of the relationship between categories in terms of proportions. The rmb, fluctile and scpcp plots are based on the concepts of multiple bar charts, a fluctuation diagram of a multidimensional table, and parallel coordinates, respectively. Such plots are mostly used to visualize a contingency table or a data frame; they can also be used for exploratory analysis and allow for a graphical presentation even with a large number of variables [Pilhöfer, Unwin 2013]. All the calculations and plots are obtained using R software.
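The residuals that mosaic-type plots display are commonly Pearson residuals, (observed - expected) / sqrt(expected), with expected counts computed under independence. The following is a minimal Python sketch of that calculation, not code from the paper (which uses R); the function name `pearson_residuals` and the toy counts are illustrative assumptions.

```python
from math import sqrt


def pearson_residuals(table):
    """Return Pearson residuals (O - E) / sqrt(E) for a two-way table of counts.

    Expected counts E are computed under independence of rows and columns.
    Assumes every row and column total is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    residuals = []
    for i, row in enumerate(table):
        out = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            out.append((observed - expected) / sqrt(expected))
        residuals.append(out)
    return residuals


# Illustrative counts only (hypothetical, not data from the paper).
toy_table = [[30, 10], [10, 30]]
toy_residuals = pearson_residuals(toy_table)
```

Cells with large positive residuals are over-represented relative to independence, and cells with large negative residuals are under-represented.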
Evaluation of the Effectiveness of Actions for the Benefit of Persons of Non-Mobile Age in European Union Countries
Załuska, Urszula; Kwiatkowska-Ciotucha, Dorota
2018 Econometrics
doi: 10.15611/eada.2018.2.02
Abstract: The purpose of the paper is to assess the effectiveness of the solutions applied in individual EU countries to support people of non-mobile age. Demographic aging affects most EU countries. It results in an increase in the number and proportion of older people and an increasing share of the post-working-age population. In such conditions, it is very important to keep people of the so-called non-mobile age active on the labor market. Maintaining professional activity depends on a number of factors, including, first and foremost, the assurance of openness in the labor market, adequate financial standing and medical care, and the ability to develop competencies and keep pace with technological change. This depends, on the one hand, on applying appropriate policies to this group and, on the other, on the capacities of the country. The study compared the situation of people in the indicated age group with the situation of the general working-age population.
Selected Econometric Methods of Modelling the World’s Population
Rzymowski, Witold; Surowiec, Agnieszka
2018 Econometrics
doi: 10.15611/eada.2018.2.03
Abstract: Selected econometric methods of modelling the world’s population size based on historical data are presented in the paper. Periodic variables were used in the proposed models. Moreover, a logistic-type function was used in modelling. The purpose of the paper was to obtain a model describing the world’s population with the lowest possible maximal relative error and the longest possible period of durability. In this work, 13,244 models from three families of models were analyzed. Only a small fraction of these models satisfies the conditions of stability. The method of modelling the world’s population size makes it possible to obtain models with maximal relative errors not exceeding 0.5%. Selected models were used to predict the world’s population up to 2050. The obtained results were compared with data published by the Organisation for Economic Co-operation and Development.
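The selection criterion described above, the maximal relative error of a logistic-type model, can be sketched in Python as follows. This is only an illustration under stated assumptions: the three-parameter logistic form, the parameter values and the "observed" series are all hypothetical and synthetic, and the periodic variables used in the paper are not modelled here.

```python
from math import exp


def logistic(t, K, r, t0):
    """Three-parameter logistic curve: K / (1 + exp(-r * (t - t0)))."""
    return K / (1.0 + exp(-r * (t - t0)))


def max_relative_error(actual, predicted):
    """Largest value of |predicted - actual| / |actual| over all observations."""
    return max(abs(p - a) / abs(a) for a, p in zip(actual, predicted))


# Synthetic "observations" generated from a known curve (NOT real population data).
years = list(range(1950, 2020, 10))
observed = [logistic(t, K=11.0e9, r=0.03, t0=2000.0) for t in years]

# A hypothetical candidate model whose ceiling K is 0.4% too high.
candidate = [logistic(t, K=11.0e9 * 1.004, r=0.03, t0=2000.0) for t in years]

error = max_relative_error(observed, candidate)
accepted = error <= 0.005  # the 0.5% bound mentioned in the abstract
```

Because the candidate differs from the generating curve only by a scale factor, its relative error is the same 0.4% in every year, so it passes the 0.5% bound.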
Statistical Analysis of Economic Poverty in Poland Using R
Brzezińska, Justyna
2018 Econometrics
doi: 10.15611/eada.2018.2.04
Abstract: Economic poverty is one of the more common and complex problems in the modern world, as well as in Poland. It is a complex and multidimensional phenomenon, and therefore there is no single universally valid definition of poverty. This article presents a statistical analysis of economic poverty in Poland based on real data from the Central Statistical Office of Poland. An in-depth statistical analysis of the social situation of Poles is presented, as well as an attempt to examine interdependencies in the occurrence of various forms of poverty and social exclusion in Poland. Several multivariate statistical methods are presented together with a graphical presentation of the results. We present a correspondence analysis with a perceptual map, as well as an advanced modern visualization tool for categorical data. All the calculations were conducted using R software.
Predicting the Default Risk of Companies. Comparison of Credit Scoring Models: Logit Vs Support Vector Machines
Nehrebecka, Natalia
2018 Econometrics
doi: 10.15611/eada.2018.2.05
Abstract: The aim of the article is to compare, on training and validation samples, models built using logistic regression and a Support Vector Machine (SVM) for assessing the credit risk of non-financial enterprises. When creating the models, the variables are subjected to the Weight of Evidence (WoE) transformation, and the number of potential predictors is reduced based on the Information Value (IV) statistic. The quality of the models is assessed according to the most popular criteria, such as the Gini statistic, the Kolmogorov-Smirnov (K-S) statistic and the Area Under the Receiver Operating Characteristic curve (AUROC). Based on the results, it was found that there are significant differences in discriminatory power between the logistic regression model and the SVM for the training sample. In the case of the validation sample, logistic regression has the best prognostic capability. These analyses can be used to reduce the risk of negative effects on the financial sector.
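The quantities named in this abstract have standard textbook definitions, sketched below in Python. This is a minimal illustration, not the author's implementation: the function names, the sign convention WoE = ln(share of goods / share of bads), and the toy inputs are assumptions, and the pairwise AUROC is written for clarity rather than speed.

```python
from math import log


def woe_iv(good_counts, bad_counts):
    """Per-bin Weight of Evidence and total Information Value.

    Convention assumed here: WoE = ln(share of goods / share of bads).
    Assumes no bin has a zero count in either class.
    """
    total_good = sum(good_counts)
    total_bad = sum(bad_counts)
    woe, iv = [], 0.0
    for g, b in zip(good_counts, bad_counts):
        share_good = g / total_good
        share_bad = b / total_bad
        w = log(share_good / share_bad)
        woe.append(w)
        iv += (share_good - share_bad) * w
    return woe, iv


def split_by_label(scores, labels):
    """Split scores into those of defaulters (label 1) and non-defaulters (label 0)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return pos, neg


def auroc(scores, labels):
    """Chance that a random defaulter scores above a random non-defaulter (ties count 0.5)."""
    pos, neg = split_by_label(scores, labels)
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def gini(scores, labels):
    """Gini statistic as the usual rescaling of AUROC: 2 * AUROC - 1."""
    return 2.0 * auroc(scores, labels) - 1.0


def ks_statistic(scores, labels):
    """Largest gap between the empirical score CDFs of the two classes."""
    pos, neg = split_by_label(scores, labels)
    return max(
        abs(sum(p <= t for p in pos) / len(pos) - sum(q <= t for q in neg) / len(neg))
        for t in sorted(set(scores))
    )


# Toy inputs (hypothetical): two bins of a predictor, and four scored companies.
toy_woe, toy_iv = woe_iv(good_counts=[80, 20], bad_counts=[20, 80])
toy_scores, toy_labels = [0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1]
```

In the toy scoring example the two classes are perfectly separated, so AUROC, Gini and K-S all reach their maximum of 1.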
Clustering Macroeconomic Time Series
Augustyński, Iwo; Laskoś-Grabowski, Paweł
2018 Econometrics
doi: 10.15611/eada.2018.2.06
Abstract: The data mining technique of time series clustering is well established. However, even when recognized as an unsupervised learning method, it does require making several design decisions that are nontrivially influenced by the nature of the data involved. By extensively testing various possibilities, we arrive at a choice of a dissimilarity measure (compression-based dissimilarity measure, or CDM) which is particularly suitable for clustering macroeconomic variables. We check that the results are stable in time and reflect large-scale phenomena, such as crises. We also successfully apply our findings to the analysis of national economies, specifically to identifying their structural relations.
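The compression-based dissimilarity measure is usually defined as CDM(x, y) = C(xy) / (C(x) + C(y)), where C is the size of the compressed input. The Python sketch below illustrates it under assumptions that are not taken from the paper: zlib as the compressor, a simple up/down encoding of each series (`symbolize`), and synthetic random walks in place of macroeconomic data.

```python
import random
import zlib


def symbolize(series):
    """Encode a series as bytes: 'u' where it rose versus the previous value, else 'd'.

    This up/down encoding is an illustrative assumption, not the paper's preprocessing.
    """
    return bytes(ord("u") if b > a else ord("d") for a, b in zip(series, series[1:]))


def compressed_size(data):
    """Length in bytes of the zlib-compressed input (maximum compression level)."""
    return len(zlib.compress(data, 9))


def cdm(x, y):
    """Compression-based dissimilarity: C(xy) / (C(x) + C(y)).

    Values near 0.5 suggest x and y share most of their structure;
    values near 1 suggest they share little.
    """
    return compressed_size(x + y) / (compressed_size(x) + compressed_size(y))


def random_walk(n, seed):
    """Synthetic Gaussian random walk with a fixed seed (stand-in for real data)."""
    rng = random.Random(seed)
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


series_a = random_walk(4000, seed=1)
series_b = [2.0 * v for v in series_a]   # same dynamics expressed in different units
series_c = random_walk(4000, seed=2)     # unrelated dynamics

d_ab = cdm(symbolize(series_a), symbolize(series_b))
d_ac = cdm(symbolize(series_a), symbolize(series_c))
```

Here `d_ab` comes out smaller than `d_ac`, since the rescaled series compresses almost for free once the original has been seen. A matrix of such pairwise values could then be passed to any standard clustering routine.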