A comparison of classification strategies in rule-based classifiers

Abstract

This article discusses classification strategies in rule-based classifiers, reveals how often induced rules do not lead to an unambiguous classification and emphasizes the major role that classification strategies play in the classification of unknown examples. Five popular classification strategies, proposed by Michalski et al., Grzymała-Busse and Zou, An, Stefanowski, and Sulzmann and Fürnkranz, are reviewed and compared experimentally. Additionally, a new strategy that exploits the $$\chi^{2}$$ statistic to measure the association between the rule coverage and the indicated class is proposed. The experiment was conducted on 30 UCI datasets using the MODLEM and modified RIPPER classifiers.

1 Introduction

In general, classification of an example with decision rules comes down to identifying those rules whose conditional parts are satisfied by the attribute values of the classified example. This activity is called matching [12]. Three situations can arise. In the unambiguous case, the example matches one or more rules pointing at the same class; following [1, 12], this case is named a single-match. Two problematic situations can also appear: the classified example satisfies the conditional parts of rules that point at different decision classes (named a multiple-match), or it matches no rule at all (named a no-match). To address the multiple- and no-match situations, one applies algorithms called classification strategies [1, 12]. Once rules have been induced, the classification strategy is the only remaining factor that determines the classifier's accuracy. A poor classification strategy cannot lead to high overall accuracy, especially when multiple- and no-match cases are prevalent, regardless of the chosen learning algorithm and the quality of the rules. Thus, the classification strategy should be as accurate as possible.
Classification strategies described in the literature were usually proposed together with specific rule induction algorithms, and therefore they have very rarely been analysed independently or compared. In [2], An and Cercone examined several rule quality formulas based on one learning algorithm and checked their influence on the overall accuracy. The formulas were used only during the rule induction process, while as a classification strategy the authors used only An's proposal from [1]. Then, in [8] Grzymała-Busse and Sudre compared a strategy from [9] with their new proposal for the no-match situation, concluding that neither is better. In turn, in [3] Błaszczyński et al. reported the accuracy of MODLEM (MODified Learning from Examples Module) combined with two different classification strategies. However, they did not compare the strategies directly; instead, they compared the accuracy of different ensemble architectures containing these strategies. In recent years, several classification strategies have been proposed. In this article, the importance of classification strategies is verified and the proportions of single-, multiple- and no-match cases are presented. The main goal is to compare five different strategies, proposed by Michalski et al. [12], Grzymała-Busse and Zou [9], An [1], Stefanowski [14], and Sulzmann and Fürnkranz [15], and to examine whether there are any differences in their performance. Additionally, a new proposal (introduced in Section 2.3) relying on the $$\chi^{2}$$ statistic is included. For more general and sound conclusions, two rule induction algorithms are used. The first one is MODLEM [13], a rule induction algorithm which generates an unordered set of rules and which was invented to cope with numeric data without discretization. The second one is RIPPER (Repeated Incremental Pruning to Produce Error Reduction), proposed in [4].
Although it creates an ordered set of rules, the idea from [15] was adopted to obtain an unordered set: the rules are induced in one-against-all mode for each class. This modification, called UNRIPPER, allows one to apply classification strategies more sophisticated than 'first fires, first classifies'. The article is organized as follows. Section 2 contains a brief description of rule classifiers; it also describes the selected classification strategies and introduces a new strategy. Section 3 describes the design of the experiment. Section 4 presents and discusses the results for all selected classification strategies. Finally, Section 5 concludes with a discussion.

2 Methods

2.1 Basic concepts and definitions

This section introduces basic concepts and definitions. A dataset $$X$$ consists of examples $$x \in X$$ characterized by a defined set of $$m$$ attributes $$A=\{a_{1},a_{2},\ldots ,a_{m}\}$$. The domain of attribute $$a \in A$$ is denoted by $$V_{a}$$ and the value of example $$x$$ for attribute $$a$$ is denoted by $$a(x)$$. For numeric attributes, the minimum and the maximum values are also defined and denoted by $$\min (a)$$ and $$\max (a)$$, respectively. Supervised classification also demands that the dataset used during the learning process (called the learning dataset) has one additional nominal attribute $$d \notin A$$, which contains the true classification decision for each example $$x$$ belonging to the learning dataset. For each decision class $$v_{d} \in V_{d}$$ one can split the learning dataset into a set of examples belonging to this class ($$X_{v_{d}}^{+}= \{ x \in X :d(x)= v_{d} \}$$) and examples not belonging to it ($$X_{v_{d}}^{-}= \{ x \in X :d(x)\ne v_{d} \}$$). A rule $$r$$ consists of two parts: $$P_{r}$$ is a conditional part or a premise [15] and $$Q_{r} \in V_{d}$$ is a conclusion indicating a classification decision.
The conditional part $$P_{r}$$ is a conjunction of single conditions $$w$$: $$P_{r}= w_{1} \land w_{2} \land \ldots \land w_{k}$$, where the number of single conditions $$k$$ is named the length of rule $$r$$ and denoted by $$length(r)$$. A single condition [4] (sometimes named a selector [12]) $$w$$ is a logical expression which defines a relationship between the value of attribute $$a$$ for a classified example $$x$$ and the domain of the attribute. The operator $$\in$$ is used either with a subset of values [12] from the attribute domain $$V_{a}$$ for nominal attributes or with an interval built on the attribute domain for numeric ones. A selector $$w$$ covers $$x$$, denoted $$w \supseteq x$$, if the relationship contained in $$w$$ is satisfied by $$x$$. A rule $$r$$ covers example $$x$$ (denoted by $$r \supseteq x$$) [15] if each of its conditions $$w \in P_{r}$$ covers $$x$$. The conclusion $$Q_{r}$$ determines the assignment of any example that satisfies $$P_{r}$$ to the class indicated by $$Q_{r}$$. The coverage $$[P_{r}]$$ of rule $$r$$ is the subset of examples $$x$$ from dataset $$X$$ which are covered by $$r$$. The coverage can also be divided into positive and negative parts. The positive coverage $$[P_{r}]^{+}$$ of rule $$r$$ is the intersection between its coverage $$[P_{r}]$$ and the set $$X_{Q_{r}}^{+}$$ of examples belonging to the class indicated by $$Q_{r}$$, whereas the intersection between $$[P_{r}]$$ and $$X_{Q_{r}}^{-}$$ is named the negative coverage. Classification of a new example $$x$$ by a set of decision rules $$R$$ consists in finding a rule $$r$$ which covers the example and assigning the example to the class indicated by the rule's conclusion. One could express this as a function:   \begin{equation} z : x,R \mapsto y :y \in V_{d}. \end{equation} (1) However, because of the multiple- and no-match situations, a concrete definition of this function depends on the chosen classification strategy.
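To make the three matching situations concrete, the following is a minimal sketch (not the authors' code) of rule matching. The rule representation is hypothetical: each rule is a dict with a list of selectors, stored as (attribute index, predicate) pairs, and a conclusion class.

```python
def covers(rule, x):
    """A rule covers x if every selector w in its premise is satisfied (w ⊇ x)."""
    return all(pred(x[attr]) for attr, pred in rule["selectors"])

def match_case(rules, x):
    """Return the matching situation: single-, multiple- or no-match."""
    matched = [r for r in rules if covers(r, x)]
    classes = {r["conclusion"] for r in matched}
    if not matched:
        return "no-match", matched
    if len(classes) == 1:
        return "single-match", matched   # one or more rules, all same class
    return "multiple-match", matched     # matched rules point at different classes

# Example with two rules over a nominal attribute a1 and a numeric attribute a2:
rules = [
    {"selectors": [(0, lambda v: v in {"red", "blue"})], "conclusion": "pos"},
    {"selectors": [(1, lambda v: v > 2.5)], "conclusion": "neg"},
]
print(match_case(rules, ["red", 3.0])[0])    # multiple-match
print(match_case(rules, ["green", 1.0])[0])  # no-match
```

Only the multiple- and no-match branches require a classification strategy; the single-match branch classifies directly by the matched rules' common conclusion.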
Let us introduce a partial rule $$r_{x}^{'}$$ corresponding to a rule $$r$$ and an example $$x$$. Such a partial rule indicates the same class as $$r$$ does, and its conditional part consists only of those selectors from $$r$$ which cover $$x$$. For example, if the rule $$r$$ has the form $$w_{1} \land {w_{2}} \land {w_{3}}\to Q$$ and $$x$$ satisfies only $$w_{1}$$ and $$w_{3}$$, then the partial rule $$r_{x}^{'}$$ is defined as $$w_{1} \land {w_{3}}\to Q$$. Finally, $$R^{\supseteq x}$$ denotes the set of rules covering the example $$x$$, and $$R_{P}^{x}$$ denotes the subset of $$R$$ containing those rules which have at least one selector covering $$x$$. Such rules are named partially-matched to $$x$$ [1]. Before providing a brief description of the selected classification strategies, let us introduce a few formulas useful for estimating rule quality. The first one is the m-estimate proposed by Džeroski et al. [6]:   \begin{equation} mEstimate(r) = \frac{\vert [P_{r}] ^{+}\vert +m \frac{\vert X_{Q_{r}}^{+}\vert }{\vert X\vert }}{\vert [P_{r}]\vert + m}. \end{equation} (2) If $$m$$ equals 0, the whole expression is simply the Maximum Likelihood Estimate (MLE) of the proportion of positive examples inside the rule's coverage. Because of the MLE's properties, rules with small coverage containing only positive examples may obtain overestimated values compared with rules with greater coverage containing some negative examples. The parameter $$m$$ introduces into the calculation $$m$$ additional examples distributed according to the a priori class probability. This allows the expression to control the trade-off between the frequency of positive examples in the coverage and the a priori probability of the class indicated by $$r$$ [6], which might be tuned according to the noise in the dataset. Another proposal comes from [1].
It is called the measure of discrimination, and the underlying motivation was to apply a formula similar to one used in information retrieval to discriminate between relevant and irrelevant documents. To measure the rule's ability to discriminate positives from negatives, the authors proposed the following expression:   \begin{equation} {\rm measureOfDiscrimination}(r) = \log \frac{(\vert [P_{r}] ^{+}\vert )( \vert X\vert -\vert [P_{r}]\vert -\vert X_{Q_{r}}^{+}\vert +\vert [P_{r}] ^{+}\vert )}{(\vert [P_{r}]\vert -\vert [P_{r}] ^{+}\vert )(\vert X_{Q_{r}}^{+}\vert -\vert [P_{r}] ^{+}\vert )}. \end{equation} (3) In [14], Stefanowski proposed a measure which relies on calculating the distance between the rule and the classified example:   \begin{equation} {\rm distance}(r,x)= \frac{1}{{\rm length}(r)}\sqrt{\sum_{w \in P_{r} }{{\rm distance}^{2}(x,w)}}. \end{equation} (4) In (4), $${\rm distance}(x,w)$$ denotes the distance between the example $$x$$ and a selector $$w$$. For selectors built on nominal or ordinal attributes, it equals 0 when $$w$$ covers $$x$$ and 1 otherwise. For selectors built on numeric domains, the distance can be expressed as:   \begin{equation} {\rm distance}(x,w)=\begin{cases} 0 & \text{if } w\supseteq x \\ \frac{\min _{v \in V_{a }: w \supseteq v}\vert a(x)-v\vert }{\vert {\rm max}(a)-{\rm min}(a)\vert } & \text{otherwise}, \\ \end{cases} \end{equation} (5) where $$w\supseteq v$$ denotes that the relationship contained in $$w$$ is satisfied by $$v \in V_{a}$$. Yet another formula was proposed in [12]; it measures the fit between the example being classified and the rule:   \begin{equation} {\rm measureOfFit}(r,x) =\prod_{w \in P_{r}}{{\rm measureOfFit}(w,x)}. 
\end{equation} (6) The measure of fit between a selector $$w$$ based on a nominal attribute and the example is defined as follows:   \begin{equation} {\rm measureOfFit}(w,x)= \begin{cases} 1 & \text{if } w \supseteq x \\ \frac{\text{number of selector values}}{\vert V_{a}\vert } & \text{otherwise}. \\ \end{cases} \end{equation} (7) Finally, in [2] a proposal was made to exploit measures of association between variables as measures of rule quality. An example of such a measure is the $$\chi ^{2}$$ statistic, which determines, based on the distributions of nominal variables, whether they are independent of each other. The value of the $$\chi ^{2}$$ statistic [7] for a pair of variables with $$n$$ and $$m$$ values, respectively, can be calculated as:   \begin{equation} \chi ^{2}({\rm Observed, Expected}) = \sum_{i=1}^{n}{\sum_{j=1}^{m}{\frac{({\rm Observed}_{ij}-{\rm Expected}_{ij})^{2}}{{\rm Expected}_{ij}}}}. \end{equation} (8) The symbols $${\rm Observed}_{ij}$$ and $${\rm Expected}_{ij}$$ represent, respectively, the number of observed and expected co-occurrences of the $$i$$-th value of the first variable with the $$j$$-th value of the second one.

2.2 Existing strategies

Now, let us briefly describe the particular classification strategies used in the experiment. Each one is presented as a pair of functions $$z^{mm}$$ and $$z^{nm}$$, which support expression (1) in the case of multiple- and no-match, respectively. Grzymała-Busse and Zou proposed in [9] to exploit measures of rule strength and rule specificity. They defined the rule strength as the size of the positive coverage and the specificity as the rule length $${\rm length}(r)$$. Concretely, their proposal for multiple-match is as follows:   \begin{equation} z_{S}^{mm}(x) = {\rm argmax}_{v_{d} \in V_{d}}\sum_{r \in R^{\supseteq x} :Q_{r}= v_{d} }{\vert [P_{r}] ^{+}\vert }{\rm length}(r). 
\end{equation} (9) For the no-match case, Grzymała-Busse and Zou proposed a formula similar to $$z_{S}^{mm}$$, but instead of $${\rm length}(r)$$ it contains the number of selectors covering an example $$x$$, denoted by $$k_{r}^{x}$$:   \begin{equation} z_{S}^{nm}(x) = {\rm argmax}_{v_{d} \in V_{d}}\sum_{r \in R :Q_{r}= v_{d} }{\vert [P_{r}] ^{+}\vert } k_{r}^{x}. \end{equation} (10) The next idea comes from An [1] and was applied in the ELEM2 system. It makes use of the measure of discrimination and for multiple-match has the following form:   \begin{equation} z_{D}^{mm}(x) = {\rm argmax}_{v_{d} \in V_{d}}\sum_{r \in R^{\supseteq x} :Q_{r}= v_{d} }{{\rm measureOfDiscrimination}(r)}. \end{equation} (11) Again, the expression for the no-match case is similar to the one applied to multiple-match; however, it is enriched by the ratio of the number of selectors covering an example $$x$$ to the length of $$r$$:   \begin{equation} z_{D}^{nm}(x) = {\rm argmax}_{v_{d} \in V_{d}}\sum_{r \in R :Q_{r}= v_{d} }{{\rm measureOfDiscrimination}(r)} \frac{k_{r}^{x}}{{\rm length}(r)}. \end{equation} (12) In the case of multiple-match, Stefanowski [14] advised applying a formula similar to $$z_{S}^{mm}$$, but without the rule specificity. Moreover, he did not restrict the vote to the positive coverage; in this proposal, all examples from the coverage participate in classification:   \begin{equation} z_{N}^{mm}(x) = {\rm argmax}_{v_{d} \in V_{d}}\sum_{r \in R^{\supseteq x} }{\vert [P_{r}]\cap X_{v_{d}}^{+}\vert }. \end{equation} (13) For no-match cases, Stefanowski presented a completely different idea, inspired by the k-Nearest Neighbours (k-NN) classifier, which involves calculating distances between rules and examples. Unlike in k-NN, where the influence of distant neighbours is excluded from voting, in this strategy all partially-matched rules participate in classification, with vote weights equal to one minus their distances.
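This distance-weighted voting can be sketched as follows. The rule representation is hypothetical (numeric interval selectors only, with per-class coverage counts precomputed in a `class_counts` field); the distance follows eqs (4) and (5).

```python
import math
from collections import defaultdict

def selector_distance(x, attr, lo, hi, a_min, a_max):
    """Eq. (5): 0 if x[attr] lies in [lo, hi], else the normalized gap
    to the nearest endpoint of the interval."""
    v = x[attr]
    if lo <= v <= hi:
        return 0.0
    gap = lo - v if v < lo else v - hi
    return gap / (a_max - a_min)

def rule_distance(rule, x, domains):
    """Eq. (4): square root of the summed squared selector distances,
    divided by the rule length."""
    sq = sum(selector_distance(x, a, lo, hi, *domains[a]) ** 2
             for a, lo, hi in rule["selectors"])
    return math.sqrt(sq) / len(rule["selectors"])

def z_n_no_match(partially_matched, x, domains):
    """Sketch of eq. (14): each partially-matched rule votes for each class
    with weight (1 - distance(r, x)) * |[Pr] ∩ Xv+|; the largest total wins."""
    votes = defaultdict(float)
    for r in partially_matched:
        w = 1.0 - rule_distance(r, x, domains)
        for v, count in r["class_counts"].items():
            votes[v] += w * count
    return max(votes, key=votes.get)

# Example: x = [5.0, 5.0] partially matches both rules (first selector of each
# covers it); the rule whose second selector is closer to x wins the vote.
domains = {0: (0.0, 10.0), 1: (0.0, 10.0)}
ra = {"selectors": [(0, 0.0, 6.0), (1, 0.0, 4.0)], "class_counts": {"a": 10}}
rb = {"selectors": [(0, 0.0, 6.0), (1, 8.0, 10.0)], "class_counts": {"b": 10}}
print(z_n_no_match([ra, rb], [5.0, 5.0], domains))  # "a" (weight 0.95 vs 0.85)
```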
The final class is established as follows:   \begin{equation} z_{N}^{nm}(x) = {\rm argmax}_{v_{d} \in V_{d}}\sum_{r \in R_{P}^{x} }{(1-{\rm distance}(r,x))\vert [P_{r}]\cap X_{v_{d}}^{+}\vert}. \end{equation} (14) Sulzmann and Fürnkranz proposed in [15] to exploit the $$m$$-estimate to solve the multiple-match case. They suggested finding the rule with the highest value of the $$m$$-estimate and classifying the example according to this rule:   \begin{equation} z_{m}^{mm}(x) = Q_{r}:{\rm argmax}_{r \in R^{\supseteq x}} mEstimate(r). \end{equation} (15) Unfortunately, Sulzmann and Fürnkranz did not provide any solution based on the $$m$$-estimate for the no-match case; they advised assigning each example with no matched rules to the majority class [15]. We claim such an approach is biased towards one class and propose a new application of Sulzmann and Fürnkranz's strategy, carried over from the multiple-match to the no-match case. Namely, we consider only partially-matched rules and calculate $$mEstimate$$ using the partial rules derived from them. Then, instead of the majority class, an example $$x$$ is assigned to the class indicated by the following formula:   \begin{equation} z_{m}^{nm}(x) = Q_{r}:{\rm argmax}_{r \in R_{P}^{x}} mEstimate(r_{x}^{'}). \end{equation} (16) Another strategy comes from Michalski et al. [12] and was implemented as part of the AQ15 system. It also exploits the positive coverage of rules, yet to avoid overcounting examples covered by many rules, the union operator is applied. The proposal for the multiple-match case is as follows:   \begin{equation} \label{eq:zA_mm} z_{A}^{mm}(x) = {\rm argmax}_{v_{d} \in V_{d}}\vert \bigcup_{r \in R^{\supseteq x} :Q_{r}= v_{d}}{[P_{r}] ^{+}}\vert. \end{equation} (17) The authors recommended enriching $$z_{A}^{mm}$$ with $${\rm measureOfFit}(r,x)$$ for the no-match case:   \begin{equation} z_{A}^{nm}(x) = {\rm argmax}_{v_{d} \in V_{d}}\vert \bigcup_{r \in R :Q_{r}= v_{d}}{[P_{r}] ^{+}{\rm measureOfFit}(r,x)}\vert. 
\end{equation} (18) Unfortunately, $${\rm measureOfFit}$$ was not defined for selectors based on numeric attributes. Therefore, to be able to handle numeric attributes, we propose a new formula similar to the one proposed for nominal attributes, which also relies on calculating how large a part of the attribute's domain is covered by selector $$w$$. The definition of $${\rm measureOfFit}$$ is as follows:   \begin{equation} {\rm measureOfFit}(w,x)= \begin{cases} 1 & \text{if } w \supseteq x \\ \frac{{\rm width}(w\cap \lbrack {\rm min}(a), {\rm max}(a)\rbrack )}{\vert {\rm max}(a)-{\rm min}(a)\vert } & \text{otherwise}. \\ \end{cases} \end{equation} (19) The $${\rm width}(w)$$ is the absolute difference between the endpoints of its argument. For illustration, the value of $${\rm measureOfFit}$$ for a selector $$w$$: $$a>-1$$, given the domain $$V_{a} = [-3;5]$$, equals $$\frac{\vert 5-(-1)\vert }{\vert 5-(-3)\vert }=0.75$$.

2.3 The new proposal

In this article a new approach is presented: let us check and measure whether there exists a relationship between membership in a rule's coverage and membership in the class indicated by the rule. To do this, the $$\chi ^{2}$$ value is calculated just as in the $$\chi ^{2}$$ test of independence between two binary variables. The first variable takes the values 'example covered' ($$x \in [P_{r}]$$) or 'example uncovered' ($$x \notin \lbrack P_{r}\rbrack $$) by rule $$r$$, whereas 'example belongs' ($$x \in X_{Q_{r}}^{+}$$) or 'example does not belong' ($$x \in X_{Q_{r}}^{-}$$) to the class indicated by rule $$r$$'s conclusion are the values taken by the second variable.
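Under this setup, the statistic can be computed directly from four counts. The following is a minimal sketch with hypothetical argument names (`pos_cov` = $$\vert [P_{r}]^{+}\vert$$, `neg_cov` = $$\vert [P_{r}]^{-}\vert$$, `class_pos` = $$\vert X_{Q_{r}}^{+}\vert$$, `class_neg` = $$\vert X_{Q_{r}}^{-}\vert$$), assuming a non-degenerate contingency table (no empty row or column).

```python
def chi2_rule(pos_cov, neg_cov, class_pos, class_neg):
    """Chi-square association between rule coverage and the indicated class.
    Builds the 2x2 observed table (covered/uncovered vs. belongs/does not
    belong) and compares it against the expected table under independence."""
    n = class_pos + class_neg
    observed = [
        [pos_cov, neg_cov],                          # covered examples
        [class_pos - pos_cov, class_neg - neg_cov],  # uncovered examples
    ]
    cov = pos_cov + neg_cov
    row_tot = [cov, n - cov]
    col_tot = [class_pos, class_neg]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_tot[i] * col_tot[j] / n   # product of marginals / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2

# A rule covering 30 of 40 positives but only 5 of 60 negatives is strongly
# associated with its class; a rule covering the class at its base rate is not.
print(round(chi2_rule(30, 5, 40, 60), 2))  # 46.89
print(chi2_rule(20, 30, 40, 60))           # 0.0
```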
For these variables, the following contingency table is created, where $$[P_{r}]^{'}$$ denotes the complement of $$[P_{r}]$$:   \begin{equation} {\rm Observed}(r) =\left[\begin{matrix} \vert [P_{r}]\cap X_{Q_{r}}^{+}\vert & \vert [P_{r}]\cap X_{Q_{r}}^{-} \vert & \\ \vert [P_{r}]^{'}\cap X_{Q_{r}}^{+}\vert & \vert [P_{r}]^{'}\cap X_{Q_{r}}^{-}\vert & \\ \end{matrix} \right]=\left[\begin{matrix} \vert [P_{r}] ^{+}\vert & \vert [P_{r}] ^{-}\vert & \\ \vert [P_{r}]^{'}\cap X_{Q_{r}}^{+}\vert & \vert [P_{r}]^{'}\cap X_{Q_{r}}^{-}\vert & \\ \end{matrix} \right]. \end{equation} (20) As in the $$\chi ^{2}$$ test of independence, the matrix of expected values is calculated under the assumption that the two variables are independent. In consequence, it is also assumed that each pair of possible values is independent, which means that the joint probability of the variables can be calculated as a product of their marginals:   \begin{equation} {\rm Expected}(r) =\left[\begin{matrix} \vert [P_{r}]\vert \vert X_{Q_{r}}^{+}\vert & \vert [P_{r}]\vert \vert X_{Q_{r}}^{-}\vert & \\ \vert [P_{r}]^{'}\vert \vert X_{Q_{r}}^{+}\vert & \vert [P_{r}]^{'}\vert \vert X_{Q_{r}}^{-}\vert & \\ \end{matrix} \right]\frac{1}{\vert X\vert }. \end{equation} (21) In terms of classification, we propose to classify an example according to the rule which obtains the largest value of $$\chi ^{2}$$, indicating the strongest relationship between the variables. The proposal for solving multiple-match comes down to the following formula:   \begin{equation} z_{C}^{mm}(x) = Q_{r}:{\rm argmax}_{r \in R^{\supseteq x}}\chi ^{2}(r), \end{equation} (22) where $$\chi ^{2}$$ is calculated according to the following expression:   \begin{equation} \chi^{2}(r)=\sum_{i=1}^{2}{\sum_{j=1}^{2}{({\rm Observed}(r)_{ij}-{\rm Expected}(r)_{ij})^{2}/{\rm Expected}(r)_{ij}}}. 
\end{equation} (23) To solve the no-match case, a similar expression is applied, but using partial rules derived from the set of partially-matched ones:   \begin{equation} z_{C}^{nm}(x) =Q_{r}:{\rm argmax}_{r \in R_{P}^{x}}\chi ^{2}(r_{x}^{'}). \end{equation} (24)

3 Experimental design

The main goal of the experiment was to examine the role that classification strategies play in the classification of unknown examples and to compare the classification strategies described in the previous section. The experiment was conducted on 30 datasets downloaded from the machine learning repository of the University of California, Irvine [10]. They are presented in Table 1. The selected datasets were diversified in terms of their size, numbers of nominal and numeric attributes, as well as the number of classes and their distribution.

Table 1. A summary of the datasets used in the experiment (abbr. = dataset abbreviation, #c = number of classes, #o = number of nominal attributes, #n = number of numeric attributes)

Dataset  abbr.  #c  #o  #n  Class distribution
Abalone data  aba  2  1  7  3842:335
Breast cancer  brc  2  9  0  201:85
Wisconsin breast cancer  brw  2  0  9  458:241
Liver Disorders  bup  2  0  6  200:145
Car evaluation  car  4  6  0  1210:384:69:65
King+Rook versus King+Pawn  che  2  36  0  1669:1527
Contraceptive Method Choice  cmc  2  7  2  1140:333
German credit  cre  2  13  7  700:300
Credit Approval  crx  2  9  6  383:307
Ecoli  eco  8  0  7  143:77:52:35:20:5:2:2
Glass identification  gla  6  0  9  76:70:29:17:13:9
Haberman's Survival Data  hab  2  0  3  225:81
Cleveland heart disease  hea  2  7  6  165:138
Hepatitis  hep  2  13  6  123:32
Ionosphere  ion  2  0  34  225:126
Iris Plants Database  iri  3  0  4  50:50:50
Lymphography  lym  4  15  3  81:61:4:2
Monk's 3 problem  mon  2  6  0  62:60
Mushroom  mus  2  22  0  4208:3916
Pima Indians Diabetes  pim  2  0  8  500:268
Primary Tumor  pri  21  17  0  84:39:29:28:24:24:20:16:14:14:10:9:7:6:6:2:2:2:1:1:1
satimage_train  sat  6  0  36  1072:1038:961:479:470:415
Sonar  son  2  0  60  111:97
Soybean disease  soy  19  35  0  92:91:91:88:44:44:20:20:20:20:20:20:20:20:20:16:15:14:8
Tic-tac-toe endgame  tic  2  9  0  626:332
Blood transfusion service center  tra  2  0  4  570:178
Vehicle silhouettes  veh  4  0  18  218:217:212:199
Congressional Voting Records  vot  2  16  0  267:168
Deterding vowel recognition  vow  11  3  10  90:90:90:90:90:90:90:90:90:90:90
Cellular Localization Sites of Proteins  yea  10  0  8  463:429:244:163:51:44:35:30:20:5

At the beginning, the proportions of examples classified in single-, multiple- and no-match situations were analysed. The sum of the percentages of multiple- and no-match cases can be viewed as the percentage of examples which the classifier itself would not be able to classify without a classification strategy, and as the gain in accuracy which would be obtained if the strategy classified all those examples correctly.
Thus, this number is a good indicator of strategy importance. Next, the strategies' accuracies were compared to check whether there are any differences between them. To explain the findings, the overall accuracies were decomposed into multiple- and no-match accuracies separately. Finally, the two different recommendations for multiple- and no-match were combined into a new strategy. Each classifier was evaluated using stratified 10-fold cross-validation repeated 100 times to estimate the mean appropriately. To test the significance of differences between the strategies, the two-tailed Friedman test was used. Besides that, the Nemenyi test was employed as a post-hoc test for their pairwise comparison. These tests are described in detail in [5]. The considered significance level $$\alpha$$ was equal to 0.05. Finally, both MODLEM and UNRIPPER were used with pruning enabled, and the value of $$m$$ in $$z_{m}$$ was set to 5, following the suggestions of Sulzmann and Fürnkranz [15]. In many real-world applications, datasets manifest an imbalanced distribution of examples; namely, (at least) one class is much more underrepresented in comparison with the others. In view of the imbalanced class distribution of some of the selected datasets, classifiers were not evaluated using plain accuracy but rather the $$F_{\beta}$$-score, because otherwise even a classifier with very high accuracy could be biased towards majority classes and useless in practice. To avoid biasing the classifier towards any class and to reflect the interest in accuracy across all classes, performance was measured as the macro-averaged $$F$$-score [16], which treats performance on each class equally:   \begin{equation} F\textrm{-}score(X) = \frac{1}{\vert V_{d}\vert}\sum_{v_{d} \in V_{d}}{\frac{2\vert \{x \in X : d(x) = v_{d} \land z(x) = v_{d}\} \vert} {\vert \{x \in X : z(x)= v_{d}\}\vert + \vert \{x \in X : d(x)= v_{d}\}\vert}}. 
\end{equation} (25)

4 Results

4.1 Proportions of single-, multiple- and no-match

First, the proportions of examples classified through single-match, multiple-match and no-match using rules induced by both MODLEM and UNRIPPER are presented. These numbers are expected to depend on the size of the rule coverage and the extent to which the rules overlap, which, in turn, results from the number and the length of the induced rules. The more rules, the more overlaps, resulting in an increased percentage of multiple-match; inversely, the longer a rule, the narrower its coverage, resulting in an increased percentage of no-match. Table 2 shows that MODLEM created many more rules, which were also longer on average, than those generated by UNRIPPER. These very big differences in the number of rules may result in a greater number of multiple-matches in the case of the former and a greater number of no-matches in the case of the latter.

Table 2. The profile of rules generated by MODLEM and UNRIPPER

Dataset  Rules (MODLEM)  Rules (UNRIPPER)  Avg. length (MODLEM)  Avg. length (UNRIPPER)
aba  129  8.88  3.78  2.75
brc  32  3.83  3.97  1.60
brw  20  8.70  3.55  2.37
bup  52  7.45  3.29  2.34
car  68  50.93  5.00  3.75
che  38  24.68  4.53  3.81
cmc  96  5.65  4.50  1.76
cre  139  7.58  5.31  2.12
crx  50  5.83  4.36  2.50
eco  35  12.25  3.23  2.45
gla  28  11.25  3.00  2.49
hab  37  3.00  2.70  1.32
hea  33  7.85  3.85  2.21
hep  8  4.88  2.25  1.63
ion  15  6.90  2.93  2.07
iri  7  4.33  2.14  1.55
lym  19  9.03  3.21  1.82
mon  14  7.25  3.21  1.65
mus  15  54.45  2.33  2.00
pim  91  6.80  4.36  2.40
pri  83  11.75  5.88  3.16
sat  166  60.88  5.92  4.81
son  13  7.98  3.31  2.13
soy  51  33.50  4.41  2.46
tic  17  18.08  3.29  3.16
tra  39  5.08  2.54  1.87
veh  85  19.75  4.40  3.15
vot  17  5.23  3.94  2.00
vow  71  58.63  3.97  3.71
yea  200  21.58  4.27  3.87

A summary of the proportions of examples classified through single-, multiple- and no-match is presented in Table 3. The median percentage of single-match is 71.75% for MODLEM-generated rules and 82.2% for UNRIPPER-generated rules. This indicates that, for these two classifiers, in the median case the classification of 28.25% or 17.8% of examples involved the classification strategy. We think these proportions are high enough to change significantly the number of correctly classified examples (provided the classification strategies work well) and to affect the evaluation of a specific classifier. It is worth noticing that in the extreme cases barely 13.3% (pri using MODLEM) and 41.7% (mus using UNRIPPER) of examples were handled through single-match. Finally, the results confirm the previous expectations regarding the proportions of multiple- and no-match: for 24 datasets MODLEM obtained a higher percentage of multiple-match than UNRIPPER, whereas the latter noted a greater percentage of no-match for 19 datasets.

Table 3.
The percentages of examples classified through single-, multiple- and no-match

       MODLEM                               UNRIPPER
       Single       Multiple     No         Single       Multiple     No
aba    51.4±6.0     47.7±6.0     0.9±0.2    86.3±8.6     1.1±0.3      12.6±8.7
brc    49.9±5.0     44.5±4.8     5.6±1.5    84.1±5.2     7.0±2.4      8.9±5.3
brw    94.1±0.8     3.7±0.8      2.2±0.5    96.0±0.7     2.7±0.7      1.3±0.4
bup    58.5±4.1     31.8±6.0     9.7±2.7    75.3±3.0     11.1±2.8     13.6±3.2
car    89.0±2.1     9.5±2.3      1.6±0.3    83.8±1.1     11.2±1.0     5.1±0.9
che    98.2±0.3     1.5±0.2      0.3±0.1    98.9±0.2     0.8±0.2      0.3±0.1
cmc    34.7±2.8     64.2±2.9     1.1±0.3    76.6±10.1    0.7±0.5      22.7±10.2
cre    66.7±1.7     22.0±1.9     11.4±1.3   76.0±4.2     6.8±1.1      17.2±4.7
crx    78.0±3.1     14.4±3.2     7.6±1.0    93.0±1.1     4.2±1.1      2.8±1.0
eco    80.5±1.9     10.2±1.8     9.3±1.2    84.7±1.7     7.1±1.2      8.2±1.4
gla    69.4±2.3     14.3±2.2     16.3±2.4   65.0±3.5     11.9±2.6     23.2±3.1
hab    41.4±5.8     56.2±5.8     2.4±1.0    80.6±9.0     2.2±1.3      17.3±9.0
hea    77.6±2.0     12.6±1.9     9.8±1.6    80.6±2.3     10.0±2.0     9.4±2.0
hep    58.5±4.4     37.2±4.0     4.3±1.4    69.3±6.3     20.6±5.6     10.1±6.0
ion    90.2±1.8     5.2±1.2      4.7±1.3    92.2±1.6     4.4±1.0      3.5±1.2
iri    97.3±1.1     1.3±1.0      1.5±0.8    95.9±1.5     1.4±1.0      2.7±1.3
lym    78.6±2.5     11.0±2.2     10.4±2.5   80.5±2.9     10.3±2.4     9.2±2.4
mon    86.8±3.4     9.0±3.4      4.2±1.6    89.1±2.8     9.4±2.4      1.5±1.4
mus    80.5±0.7     19.5±0.7     0.0±0.0    41.7±0.9     58.4±0.9     0.0±0.0
pim    64.3±3.8     27.5±4.0     8.2±1.2    86.4±1.8     6.2±1.2      7.4±1.4
pri    13.3±1.5     85.1±1.4     1.5±0.7    51.8±2.3     16.8±2.3     31.4±2.3
sat    71.9±3.4     24.7±3.6     3.4±0.4    87.6±0.4     6.5±0.4      6.0±0.3
son    71.6±3.5     14.3±2.4     14.1±2.0   71.1±3.4     15.4±2.6     13.5±3.0
soy    70.0±1.1     25.7±1.1     4.3±0.7    72.5±0.8     21.3±0.6     6.2±0.6
tic    92.5±2.3     6.6±2.4      0.9±0.3    96.1±0.8     2.4±0.9      1.5±0.1
tra    19.0±2.3     80.3±2.4     0.6±0.3    92.6±4.3     1.1±0.5      6.2±4.5
veh    67.0±2.9     20.6±3.5     12.4±1.2   73.3±1.6     8.7±1.3      18.0±1.7
vot    80.7±2.1     18.2±2.2     1.1±0.4    94.6±0.6     5.4±0.6      0.0±0.1
vow    73.5±1.4     10.8±1.1     15.7±1.1   66.2±1.5     12.8±1.0     21.0±1.4
yea    23.2±3.1     73.8±3.6     3.0±0.7    64.7±1.6     7.1±0.9      28.3±1.5

4.2 Overall accuracy of strategies
To determine whether there is any difference between the six aforementioned classification strategies, they were compared experimentally. Table 4 shows their accuracy when combined with MODLEM- and UNRIPPER-generated rules. The highest scores for each dataset are marked bold. In case of MODLEM, the situation is clear: $$z_{D}$$ obtained the highest score for 21 datasets and the runner-up was $$z_{m}$$, noting 7 wins. In turn, in case of UNRIPPER the best one cannot be determined so easily: these two strategies noted 8 wins, whereas $$z_{C}$$ obtained 9. The Friedman test was conducted to verify the significance of the differences between the strategies. The null hypothesis was that all the strategies perform equally, i.e. their average ranks are equal. The Friedman test returned p-values smaller than 5e-7 and equal to 3.77e-4 for MODLEM and UNRIPPER, respectively, allowing us to reject the null hypothesis. Table 4.
$$F$$-score obtained by the strategies

       MODLEM                                     UNRIPPER
       z_m    z_S    z_D    z_C    z_N    z_A     z_m    z_S    z_D    z_C    z_N    z_A
aba    0.482  0.538  0.629  0.431  0.503  0.539   0.255  0.255  0.219  0.305  0.257  0.250
brc    0.461  0.538  0.559  0.546  0.490  0.518   0.466  0.467  0.323  0.397  0.466  0.418
brw    0.616  0.555  0.738  0.624  0.557  0.664   0.393  0.433  0.609  0.496  0.444  0.503
bup    0.571  0.498  0.621  0.585  0.536  0.486   0.505  0.500  0.480  0.494  0.458  0.391
car    0.696  0.541  0.814  0.726  0.499  0.408   0.284  0.328  0.595  0.432  0.304  0.249
che    0.812  0.671  0.802  0.769  0.776  0.734   0.563  0.495  0.577  0.471  0.513  0.494
cmc    0.565  0.558  0.593  0.535  0.569  0.551   0.385  0.343  0.307  0.285  0.386  0.369
cre    0.434  0.542  0.566  0.533  0.538  0.539   0.392  0.426  0.387  0.397  0.407  0.379
crx    0.728  0.642  0.728  0.702  0.675  0.658   0.563  0.588  0.466  0.482  0.500  0.498
eco    0.335  0.279  0.377  0.400  0.278  0.283   0.285  0.277  0.348  0.365  0.233  0.221
gla    0.341  0.303  0.399  0.428  0.299  0.289   0.298  0.212  0.266  0.387  0.215  0.186
hab    0.480  0.548  0.576  0.497  0.505  0.540   0.365  0.355  0.274  0.290  0.365  0.328
hea    0.682  0.651  0.692  0.657  0.653  0.672   0.583  0.589  0.600  0.601  0.573  0.554
hep    0.431  0.573  0.620  0.589  0.565  0.561   0.464  0.485  0.505  0.419  0.479  0.431
ion    0.435  0.451  0.642  0.621  0.549  0.530   0.544  0.399  0.636  0.597  0.490  0.466
iri    0.898  0.825  0.859  0.888  0.841  0.749   0.746  0.460  0.634  0.654  0.637  0.588
lym    0.383  0.389  0.502  0.448  0.394  0.400   0.464  0.484  0.509  0.512  0.474  0.486
mon    0.780  0.789  0.681  0.647  0.718  0.646   0.914  0.540  0.884  0.927  0.913  0.350
mus    0.712  0.718  0.751  0.701  0.736  0.732   0.802  0.932  0.954  0.778  0.934  0.862
pim    0.621  0.533  0.640  0.603  0.591  0.537   0.405  0.496  0.501  0.507  0.431  0.382
pri    0.196  0.167  0.210  0.205  0.170  0.150   0.114  0.095  0.114  0.124  0.104  0.106
sat    0.668  0.473  0.658  0.530  0.513  0.460   0.489  0.425  0.487  0.430  0.395  0.414
son    0.588  0.589  0.614  0.594  0.566  0.533   0.572  0.537  0.560  0.546  0.534  0.473
soy    0.506  0.388  0.406  0.424  0.473  0.464   0.426  0.330  0.371  0.354  0.377  0.343
tic    0.647  0.582  0.915  0.802  0.650  0.562   0.511  0.224  0.694  0.532  0.229  0.223
tra    0.552  0.542  0.610  0.536  0.498  0.554   0.377  0.451  0.318  0.328  0.378  0.333
veh    0.565  0.477  0.600  0.509  0.456  0.405   0.394  0.257  0.287  0.329  0.252  0.270
vot    0.799  0.717  0.749  0.768  0.764  0.631   0.484  0.563  0.615  0.570  0.598  0.578
vow    0.468  0.339  0.403  0.428  0.342  0.419   0.446  0.314  0.395  0.370  0.286  0.375
yea    0.467  0.188  0.492  0.491  0.207  0.185   0.270  0.183  0.292  0.310  0.180  0.191

Figure 1 shows a ranking of the strategies according to their average ranks and the critical difference derived from the Nemenyi test. One can notice a few statistically significant differences (such strategies are not connected by a horizontal line).
First, in case of MODLEM-generated rules $$z_{D}$$ is better than all the others, while $$z_{C}$$ outperforms $$z_{A}$$. Thus, the conclusion is that in this case $$z_{D}$$ is the best among the evaluated strategies. Then, in case of UNRIPPER-generated rules $$z_{m}$$, $$z_{D}$$ and $$z_{C}$$ are better than $$z_{A}$$. An interesting phenomenon is that the previous supremacy of $$z_{D}$$ is diminished. It is also interesting that for both classifiers the strategies formed a similar order: the top three are $$z_{D}$$, $$z_{m}$$ and $$z_{C}$$, whereas the bottom three are $$z_{N}$$, $$z_{S}$$ and $$z_{A}$$, which lag behind. These two phenomena are analysed and described in the following subsections.
Fig. 1. The average ranks of the strategies and a pairwise comparison using the Nemenyi test. (a) MODLEM; (b) UNRIPPER.
4.3 An accuracy decomposition into multiple- and no-match
In the previous subsection, $$z_{D}$$ outperformed all the other strategies in case of MODLEM, but its supremacy did not hold in case of UNRIPPER. Because these two algorithms led to different proportions of examples classified using multiple- and no-match, we hypothesized that the strategies may perform differently in these two cases. Hence the overall accuracy was decomposed into accuracies obtained in multiple- and no-match, which were analysed separately. Table 5 shows the accuracy of the classification strategies considering only the multiple-match situation. For both learning algorithms, the best strategy is most often $$z_{D}$$. It took the lead for 25 and 14 datasets in case of MODLEM and UNRIPPER, respectively.
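The statistical procedure used throughout this section — a Friedman test on per-dataset scores, followed by average ranks for a Nemenyi-style comparison — can be sketched in a few lines. The sketch below is illustrative only: the scores are toy values, not figures taken from Table 4, and the Friedman statistic is computed without the tie correction.

```python
def average_ranks(scores):
    """Rank the strategies on each dataset (rank 1 = highest score, ties
    share the mean rank) and average the ranks over all datasets."""
    n_strategies = len(scores[0])
    rank_sums = [0.0] * n_strategies
    for row in scores:
        for j, s in enumerate(row):
            better = sum(1 for v in row if v > s)
            equal = sum(1 for v in row if v == s)
            # mean rank for a tie group: 1 + (#better) + (#equal - 1) / 2
            rank_sums[j] += 1 + better + (equal - 1) / 2
    return [rs / len(scores) for rs in rank_sums]

def friedman_statistic(scores):
    """Friedman chi-square statistic (no tie correction); under the null
    hypothesis it is approximately chi-square with k - 1 degrees of
    freedom, where k is the number of strategies."""
    n, k = len(scores), len(scores[0])
    return 12 * n / (k * (k + 1)) * sum(
        (r - (k + 1) / 2) ** 2 for r in average_ranks(scores)
    )

# Illustrative F-scores: rows = datasets, columns = strategies (toy values).
scores = [
    [0.48, 0.54, 0.63, 0.43],
    [0.46, 0.54, 0.56, 0.55],
    [0.62, 0.56, 0.74, 0.62],
    [0.57, 0.50, 0.62, 0.59],
    [0.70, 0.54, 0.81, 0.73],
]
print(average_ranks(scores))  # the third strategy gets the best (lowest) rank
print(friedman_statistic(scores))
```

The statistic is then compared against the chi-square distribution (or the improved F-distribution variant) to obtain the p-value, and the average ranks feed the critical-difference diagrams of Figures 1–4.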
Again, the results were compared using the Friedman test, which returned p-values smaller than 5e-7 and equal to 9.07e-4 in case of MODLEM and UNRIPPER, respectively, rejecting the null hypothesis for both groups. Figure 2 depicts a ranking of the strategies based on the average ranks. It turns out that $$z_{D}$$ is the top strategy in both groups: it is significantly better than all the others in case of MODLEM, and significantly better than $$z_{A}$$ in case of UNRIPPER. Therefore, for the multiple-match situation, the conclusion is that $$z_{D}$$ is the best strategy. At the other extreme is $$z_{A}$$, which performed the worst in both groups.
Fig. 2. The average ranks of the strategies and a pairwise comparison using the Nemenyi test in multiple-match. (a) MODLEM; (b) UNRIPPER.
Table 5.
$$F$$-score obtained by the strategies in multiple-match

       MODLEM                                     UNRIPPER
       z_m    z_S    z_D    z_C    z_N    z_A     z_m    z_S    z_D    z_C    z_N    z_A
aba    0.484  0.542  0.630  0.429  0.505  0.543   0.367  0.370  0.416  0.505  0.371  0.372
brc    0.467  0.547  0.547  0.532  0.506  0.530   0.392  0.395  0.343  0.491  0.388  0.396
brw    0.643  0.661  0.741  0.678  0.665  0.665   0.385  0.454  0.618  0.463  0.460  0.387
bup    0.574  0.475  0.618  0.599  0.573  0.463   0.530  0.492  0.502  0.485  0.414  0.413
car    0.780  0.516  0.847  0.793  0.465  0.460   0.427  0.362  0.657  0.458  0.409  0.362
che    0.821  0.701  0.833  0.809  0.796  0.775   0.613  0.487  0.647  0.536  0.525  0.524
cmc    0.565  0.557  0.593  0.536  0.569  0.550   0.388  0.528  0.370  0.433  0.417  0.458
cre    0.450  0.588  0.593  0.555  0.590  0.587   0.353  0.424  0.437  0.437  0.344  0.382
crx    0.710  0.685  0.721  0.696  0.719  0.635   0.530  0.562  0.462  0.483  0.501  0.501
eco    0.454  0.451  0.493  0.486  0.463  0.460   0.443  0.458  0.486  0.402  0.444  0.440
gla    0.363  0.410  0.515  0.507  0.414  0.407   0.432  0.302  0.402  0.362  0.349  0.308
hab    0.483  0.552  0.576  0.497  0.509  0.543   0.461  0.476  0.262  0.284  0.461  0.460
hea    0.664  0.676  0.677  0.685  0.684  0.683   0.567  0.598  0.572  0.558  0.535  0.493
hep    0.433  0.588  0.622  0.585  0.579  0.576   0.410  0.461  0.577  0.448  0.444  0.448
ion    0.455  0.479  0.704  0.686  0.617  0.596   0.542  0.416  0.671  0.659  0.543  0.514
iri    0.882  0.886  0.919  0.882  0.877  0.877   0.832  0.455  0.626  0.638  0.637  0.528
lym    0.602  0.604  0.694  0.643  0.623  0.623   0.613  0.622  0.597  0.585  0.598  0.571
mon    0.701  0.773  0.681  0.688  0.707  0.595   0.801  0.362  0.724  0.822  0.797  0.386
mus    0.712  0.719  0.751  0.702  0.737  0.732   0.802  0.932  0.954  0.778  0.934  0.862
pim    0.633  0.562  0.663  0.610  0.642  0.556   0.362  0.476  0.502  0.538  0.397  0.414
pri    0.200  0.171  0.214  0.210  0.174  0.154   0.429  0.413  0.390  0.389  0.428  0.425
sat    0.701  0.528  0.732  0.598  0.596  0.531   0.561  0.564  0.634  0.565  0.563  0.543
son    0.547  0.585  0.606  0.584  0.589  0.587   0.561  0.515  0.571  0.502  0.505  0.484
soy    0.585  0.466  0.489  0.545  0.564  0.570   0.496  0.362  0.429  0.495  0.413  0.408
tic    0.801  0.704  0.860  0.818  0.843  0.693   0.832  0.362  0.963  0.534  0.372  0.361
tra    0.553  0.543  0.610  0.536  0.500  0.555   0.356  0.386  0.351  0.357  0.355  0.358
veh    0.658  0.605  0.670  0.662  0.647  0.594   0.534  0.498  0.566  0.504  0.500  0.476
vot    0.800  0.742  0.788  0.772  0.776  0.609   0.478  0.561  0.615  0.569  0.598  0.579
vow    0.551  0.529  0.574  0.551  0.566  0.564   0.536  0.468  0.594  0.469  0.477  0.458
yea    0.475  0.195  0.503  0.506  0.214  0.190   0.372  0.329  0.518  0.524  0.328  0.313

Next, Table 6 contains the $$F$$-score values in no-match. This time the situation is not as clear as before: in case of UNRIPPER, $$z_{m}$$ most often obtained the highest score, noting 12 wins, while the runner-up, $$z_{C}$$, scored the best for 10 datasets. In case of MODLEM the lead is shared by $$z_{m}$$ and $$z_{D}$$, which scored the best for 10 datasets. The Friedman test rejected the null hypothesis, returning p-values equal to 2e-6 and 1.013e-3 in case of MODLEM and UNRIPPER, respectively. Figure 3 presents a ranking based on the average ranks. In both cases the best strategy is $$z_{m}$$, but in case of MODLEM the difference with respect to $$z_{D}$$ is very slight. Also $$z_{C}$$ shows stable and fairly good results. Interestingly, although $$z_{D}$$ is second in case of MODLEM, it is barely second to last in case of UNRIPPER. Table 6.
$$F$$-score obtained by the strategies in no-match

       MODLEM                                     UNRIPPER
       z_m    z_S    z_D    z_C    z_N    z_A     z_m    z_S    z_D    z_C    z_N    z_A
aba    0.422  0.421  0.257  0.501  0.427  0.426   0.244  0.244  0.198  0.278  0.245  0.238
brc    0.397  0.442  0.507  0.603  0.367  0.382   0.390  0.380  0.290  0.306  0.384  0.413
brw    0.551  0.321  0.700  0.504  0.317  0.628   0.381  0.372  0.549  0.511  0.390  0.466
bup    0.557  0.488  0.610  0.515  0.380  0.475   0.470  0.488  0.452  0.490  0.466  0.354
car    0.203  0.373  0.374  0.294  0.337  0.135   0.023  0.141  0.267  0.250  0.078  0.023
che    0.653  0.502  0.406  0.523  0.371  0.443   0.370  0.402  0.308  0.276  0.429  0.380
cmc    0.478  0.478  0.503  0.450  0.478  0.478   0.376  0.327  0.303  0.277  0.376  0.363
cre    0.399  0.384  0.494  0.481  0.384  0.385   0.390  0.411  0.369  0.382  0.408  0.361
crx    0.727  0.552  0.720  0.693  0.588  0.702   0.542  0.462  0.451  0.448  0.449  0.480
eco    0.504  0.378  0.517  0.521  0.358  0.380   0.364  0.310  0.389  0.462  0.265  0.209
gla    0.351  0.202  0.219  0.354  0.184  0.180   0.243  0.174  0.144  0.358  0.165  0.169
hab    0.447  0.462  0.432  0.444  0.446  0.473   0.311  0.307  0.271  0.289  0.311  0.283
hea    0.695  0.600  0.703  0.612  0.598  0.640   0.570  0.568  0.609  0.632  0.601  0.580
hep    0.482  0.489  0.557  0.580  0.489  0.482   0.374  0.371  0.333  0.347  0.373  0.362
ion    0.381  0.379  0.437  0.422  0.379  0.393   0.513  0.363  0.551  0.447  0.368  0.359
iri    0.936  0.821  0.854  0.925  0.846  0.699   0.694  0.668  0.674  0.682  0.680  0.646
lym    0.500  0.507  0.528  0.483  0.476  0.499   0.495  0.518  0.492  0.523  0.505  0.547
mon    0.884  0.750  0.589  0.509  0.676  0.719   0.958  0.958  0.958  0.958  0.958  0.179
mus    0.774  0.774  0.774  0.774  0.774  0.826   1.00   1.00   1.00   1.00   1.00   1.00
pim    0.571  0.369  0.533  0.577  0.355  0.441   0.428  0.502  0.485  0.471  0.450  0.346
pri    0.838  0.848  0.840  0.789  0.849  0.851   0.153  0.132  0.143  0.171  0.126  0.124
sat    0.500  0.233  0.209  0.240  0.091  0.161   0.389  0.195  0.234  0.256  0.140  0.208
son    0.616  0.578  0.607  0.582  0.515  0.394   0.567  0.556  0.537  0.579  0.560  0.393
soy    0.619  0.628  0.621  0.534  0.626  0.540   0.643  0.634  0.654  0.525  0.624  0.575
tic    0.256  0.210  0.852  0.669  0.195  0.209   0.002  0.001  0.266  0.351  0.004  0.001
tra    0.395  0.408  0.504  0.474  0.395  0.417   0.377  0.401  0.311  0.314  0.379  0.327
veh    0.412  0.253  0.221  0.282  0.167  0.134   0.322  0.129  0.134  0.243  0.116  0.160
vot    0.614  0.332  0.285  0.561  0.480  0.536   0.950  0.927  0.927  0.927  0.927  0.896
vow    0.382  0.119  0.180  0.323  0.137  0.288   0.361  0.140  0.202  0.290  0.145  0.267
yea    0.474  0.285  0.350  0.333  0.275  0.307   0.217  0.096  0.134  0.171  0.089  0.114

Fig. 3. The average ranks of the strategies and a pairwise comparison using the Nemenyi test in no-match. (a) MODLEM; (b) UNRIPPER.
Because the obtained recommendations for multiple- and no-match are different, we combined these two into a new strategy denoted by $$z_{H}$$, where $$z_{H}^{mm}(x) = z_{D}^{mm}(x)$$ and $$z_{H}^{nm}(x) = z_{m}^{nm}(x)$$. Once again the Friedman test was used to compare the results, returning p-values smaller than 5e-7 for both algorithms. Figure 4 shows that in both cases $$z_{H}$$ is the top strategy, supported by many significant differences. In case of MODLEM, $$z_{H}$$ obtained an even better average rank than $$z_{D}$$, and both are significantly better than all the others. Also in case of UNRIPPER $$z_{H}$$ is the best one: it performed much better than both $$z_{D}$$ and $$z_{m}$$ and significantly outperformed the others. It is especially noteworthy how large an improvement was brought by combining the assets of $$z_{D}$$ and $$z_{m}$$.
Fig. 4. The average ranks of the strategies and a pairwise comparison using the Nemenyi test after the addition of a combined strategy $$z_{H}$$ based on $$z_{D}$$ and $$z_{m}$$. (a) MODLEM; (b) UNRIPPER.
4.4 An analysis of the rank order of the strategies
It has already been observed that the strategies formed a similar rank order in which $$z_{D}$$, $$z_{m}$$ and $$z_{C}$$ were the top three, while $$z_{N}$$, $$z_{S}$$ and $$z_{A}$$ seemed to lag behind them. To understand this, the datasets for which the differences between the best and the worst strategies are the biggest were investigated further. Such an example in case of MODLEM is the car dataset, for which $$z_{D}$$ obtained a score 0.406 higher than $$z_{A}$$. The $$F$$-score values obtained for each class were compared separately.
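The combined strategy $$z_{H}$$ amounts to a simple dispatch on the match type. A minimal sketch follows; the `Rule` representation and the two resolver callables (standing in for the $$z_{D}$$ and $$z_{m}$$ behaviours, whose internals are described elsewhere in the article) are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    condition: Callable[[dict], bool]  # conditional part as a predicate
    decision: str                      # class indicated by the rule

    def matches(self, example):
        return self.condition(example)

def classify_zh(rules, example, resolve_multiple, resolve_no_match):
    """Combined strategy z_H: single match is returned directly,
    multiple-match is resolved by the z_D-style callable and no-match
    by the z_m-style callable."""
    matched = [r for r in rules if r.matches(example)]
    classes = {r.decision for r in matched}
    if len(classes) == 1:                        # single match: unambiguous
        return classes.pop()
    if len(classes) > 1:                         # multiple-match
        return resolve_multiple(matched, example)
    return resolve_no_match(rules, example)      # no-match

# Toy demo with two rules and placeholder resolvers.
rules = [
    Rule(lambda e: e["a"] > 0, "pos"),
    Rule(lambda e: e["b"] > 0, "neg"),
]

def first_match(matched, example):   # placeholder for the z_D behaviour
    return matched[0].decision

def default_class(all_rules, example):  # placeholder for the z_m behaviour
    return "neg"

print(classify_zh(rules, {"a": 1, "b": -1}, first_match, default_class))  # prints "pos"
```

Note that a single match here also covers the case of several matched rules that all point at the same class, which is unambiguous by the definition given in the introduction.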
Figure 5 shows that $$z_{N}$$, $$z_{S}$$ and $$z_{A}$$ obtained much smaller values for 3 out of 4 classes and were comparable to the others only on the last one, which is the largest class, representing 0.7 of the dataset. This may indicate that $$z_{N}$$, $$z_{S}$$ and $$z_{A}$$ are biased towards the majority class. The analysis was repeated for the UNRIPPER classifier and the tic dataset. Figure 6 shows that these three strategies recognized the negative class very poorly. On this class $$z_{m}$$, $$z_{C}$$ and especially $$z_{D}$$ are much better. This class is the minority one and constitutes 0.347 of the dataset.
Fig. 5. The $$F$$-score obtained for each class by all the strategies considering the MODLEM classifier and the car dataset.
Fig. 6. The $$F$$-score obtained for each class by all the strategies considering the UNRIPPER classifier and the tic dataset.
Once again the Friedman test was used to verify whether all the strategies perform equally on the largest classes. For each dataset, only the $$F$$-score value related to its largest class was chosen and input to the test. The obtained p-values were smaller than 5e-7 for MODLEM and equal to 2.2e-5 for UNRIPPER. The full results are shown in Figure 7. Apart from $$z_{D}$$, all the strategies formed the same order for both classifiers, but – surprisingly – $$z_{N}$$, $$z_{S}$$ and $$z_{A}$$ are located in the middle of the rankings.
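The per-class breakdown used in Figures 5 and 6 boils down to a one-vs-rest $$F$$-score computed for each class separately. A minimal sketch, with toy labels rather than the car or tic data, shows how a class-blind accuracy can hide a poorly recognized minority class:

```python
def f_score_per_class(y_true, y_pred, classes):
    """One-vs-rest F-score (F1) per class; 0.0 when the denominator
    vanishes (no true or predicted examples of the class)."""
    result = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        result[c] = 2 * tp / denom if denom else 0.0
    return result

# Toy labels: the majority class "u" is predicted well, the minority
# class "g" poorly -- the per-class view exposes the bias that the
# overall accuracy (6/8 correct) does not.
y_true = ["u", "u", "u", "u", "u", "u", "g", "g"]
y_pred = ["u", "u", "u", "u", "u", "g", "u", "g"]
print(f_score_per_class(y_true, y_pred, ["u", "g"]))
```

Here the majority class scores 5/6 ≈ 0.833 while the minority class scores only 0.5, which mirrors the kind of asymmetry visible in the car and tic plots.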
It is interesting that, despite their similar overall performance, this time $$z_{m}$$, $$z_{C}$$ and $$z_{D}$$ performed quite differently: $$z_{m}$$ took the lead, which is also confirmed by statistical significance, while $$z_{C}$$ is located in the last position. Also $$z_{D}$$ is significantly worse than $$z_{m}$$, ranking barely third and fifth for MODLEM and UNRIPPER, respectively.
Fig. 7. The average ranks of the strategies and a pairwise comparison using the Nemenyi test considering only the largest class. (a) MODLEM; (b) UNRIPPER.
To complement the analysis, a comparison of the strategies considering only the smallest classes was conducted (Figure 8). As before, the test allowed us to reject the null hypothesis, returning p-values smaller than 5e-7 for both algorithms. This time $$z_{D}$$ and $$z_{C}$$ are clearly the top two, and each performed significantly better than almost all the others. Therefore, the conclusion is that these two are much more sensitive to small classes and can be recommended especially for datasets with an imbalanced distribution of examples.
Fig. 8. The average ranks of the strategies and a pairwise comparison using the Nemenyi test considering only the smallest class. (a) MODLEM; (b) UNRIPPER.
5 Conclusions
This article discussed classification strategies in rule-based classifiers. To attain general conclusions, two rule induction algorithms were used. An analysis of the rules they generated revealed a few differences.
MODLEM created a larger number of rules than UNRIPPER, and these rules were on average longer than those generated by the latter algorithm. The next observation was that in the median case 28.25% or 17.8% of examples could not be classified without using a classification strategy when rules were induced by MODLEM and UNRIPPER, respectively, but in the extreme case these values were much higher, equal to 86.7% and 58.3%, respectively. Five popular classification strategies proposed by Michalski $$et~al.$$ [12], Grzymała-Busse and Zou [9], An [1], Stefanowski [14], and Sulzmann and Fürnkranz [15] were considered to find out whether any of them is significantly better or worse than the others. Also, a new strategy based on the $$\chi ^{2}$$ statistic was taken into consideration. The first finding was that the results of the comparison depend on the chosen learning algorithm: $$z_{m}$$ – the modified strategy by Sulzmann and Fürnkranz, $$z_{D}$$ proposed by An and the new proposal denoted by $$z_{C}$$ performed similarly and were the best in case of UNRIPPER, whereas $$z_{D}$$ was significantly better than all the others in case of MODLEM. This dependency was suspected to result from the different proportions of examples classified using multiple- and no-match, and therefore the overall results were decomposed into multiple- and no-match results separately. A deeper insight into the mechanisms used by the strategies revealed that for both algorithms multiple- and no-match have their own best strategies, which are $$z_{D}$$ and $$z_{m}$$ (our proposal based on the m-estimate), respectively. A combination of these two recommendations into a new strategy led to the best results regardless of the chosen learning algorithm. Next, the strategies formed a similar order for both learning algorithms: the top three were $$z_{D}$$, $$z_{m}$$ and $$z_{C}$$, whereas Stefanowski’s $$z_{N}$$, $$z_{S}$$ proposed by Grzymała-Busse and Zou, and $$z_{A}$$ from Michalski $$et~al.$$
were situated in the last three positions. Finally, interesting properties of $$z_{D}$$ and $$z_{C}$$ were found – they are significantly better than all the others when considering only the smallest class from each dataset. We think they are well suited to problems with an imbalanced distribution of examples. Although the classification strategies have been examined and their properties and accuracies discussed thoroughly, there are still open directions in which to continue the research. The most interesting one is to propose a new strategy that utilizes not only global information (like positive or negative coverage in the entire dataset), but also incorporates information about the examples occurring in the local neighbourhood of the classified one. This approach may improve the accuracy significantly, especially in case of highly noisy datasets. Summing up, this article provided a deeper insight into classification strategies and their usage during classification. A few interesting conclusions related to this topic were drawn, which can be applied easily to new or existing problems. The presented results may serve as a benchmark for new proposals of strategies. Acknowledgement The author thanks Szymon Wilk for careful reading, helpful discussions and valuable comments on previous drafts. He also thanks Jerzy Stefanowski for helpful discussions and valuable suggestions on the very first experiment. The breast cancer, lymphography and primary tumor datasets were obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia by M. Zwitter and M. Soklic. The vehicle dataset comes from the Turing Institute, Glasgow, Scotland. The Wisconsin breast cancer dataset was obtained from the University of Wisconsin Hospitals, Madison, from Dr. William H. Wolberg [11]. The blood transfusion service center dataset comes from [17]. The Cleveland heart disease dataset comes from V.A.
Medical Center, Long Beach and Cleveland Clinic Foundation, and the principal investigator Robert Detrano. The author cordially thanks the donors of these datasets. References [1] An, A. Learning classification rules from data. Computers & Mathematics with Applications, 45, 737–748, 2003. [2] An, A. and Cercone, N. Rule quality measures for rule induction systems: description and evaluation. Computational Intelligence, 17, 409–424, 2001. [3] Błaszczyński, J., Stefanowski, J. and Zając, M. Ensembles of abstaining classifiers based on rule sets. In Foundations of Intelligent Systems, 18th International Symposium, ISMIS 2009, Prague, Czech Republic, September 14–17, 2009, Proceedings, Vol. 5722 of Lecture Notes in Computer Science, J. Rauch, Z. W. Ras, P. Berka and T. Elomaa, eds, pp. 382–391. Springer, 2009. [4] Cohen, W. W. Fast effective rule induction. In ICML, A. Prieditis and S. J. Russell, eds, pp. 115–123. Morgan Kaufmann, 1995. [5] Demšar, J. Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research, 7, 1–30, 2006. [6] Džeroski, S., Cestnik, B. and Petrovski, I. Using the m-estimate in rule induction. Journal of Computing and Information Technology, 1, 37–46, 1993. [7] Fisher, R. A. On the interpretation of $$\chi^{2}$$ from contingency tables, and the calculation of P. Journal of the Royal Statistical Society, 85, 87–94, 1922. [8] Grzymala-Busse, J. W. and Sudre, G. P. A comparison of two partial matching strategies for classification of unseen cases. In 2006 IEEE International Conference on Granular Computing, GrC 2006, Atlanta, Georgia, USA, May 10–12, 2006, pp. 800–805. IEEE, 2006. [9] Grzymala-Busse, J. W. and Zou, X. Classification strategies using certain and possible rules.
In Rough Sets and Current Trends in Computing, First International Conference, RSCTC’98, Warsaw, Poland, June 22–26, 1998, Proceedings, Vol. 1424 of Lecture Notes in Computer Science, L. Polkowski and A. Skowron, eds, pp. 37–44. Springer, 1998. [10] Lichman, M. UCI Machine Learning Repository, University of California, Irvine, School of Information and Computer Sciences, 2013. http://archive.ics.uci.edu/ml. [11] Mangasarian, O. L. and Wolberg, W. H. Cancer diagnosis via linear programming. SIAM News, 23, 1–18, 1990. [12] Michalski, R. S., Mozetic, I., Hong, J. and Lavrac, N. The multi-purpose incremental learning system AQ15 and its testing application to three medical domains. In Proceedings of the 5th National Conference on Artificial Intelligence, Philadelphia, PA, August 11–15, 1986, Vol. 2: Engineering, T. Kehler and S. J. Rosenschein, eds, pp. 1041–1047. Morgan Kaufmann, 1986. [13] Stefanowski, J. The rough set based rule induction technique for classification problems. In Proceedings of the 6th European Congress on Intelligent Techniques and Soft Computing, Vol. 1, pp. 109–113, 1998. [14] Stefanowski, J. Algorytmy indukcji reguł decyzyjnych w odkrywaniu wiedzy [Algorithms of decision rule induction in knowledge discovery]. Wydawnictwo Politechniki Poznańskiej, 2001. [15] Sulzmann, J.-N. and Fürnkranz, J. An empirical comparison of probability estimation techniques for probabilistic rules. In Proceedings of the 12th International Conference on Discovery Science (DS-09), J. Gama, V. Santos Costa, A. Jorge and P. B. Brazdil, eds, pp. 317–331. Springer-Verlag, 2009. [16] Yang, Y. and Liu, X. A re-examination of text categorization methods. In Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’99, pp. 42–49. ACM, 1999. [17] Yeh, I.-C., Yang, K.-J. and Ting, T.-M. Knowledge discovery on RFM model using Bernoulli sequence. Expert Systems with Applications, 36, 5866–5871, 2009.
© The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com. Logic Journal of the IGPL, Oxford University Press.

ISSN: 1367-0751
eISSN: 1368-9894
DOI: 10.1093/jigpal/jzx053
Classification strategies described in the literature were usually proposed together with specific rule induction algorithms, and therefore so far they have very rarely been analysed independently or compared. In [2], An and Cercone examined several rule quality formulas based on one learning algorithm and checked their influence on the overall accuracy. The formulas were used only during the rule induction process, while as a classification strategy the authors used only An’s proposal from [1]. Then, in [8] Grzymała-Busse and Sudre compared a strategy from [9] with their new proposal for the no-match situation, concluding that neither of them is better. In turn, in [3] Błaszczyński $$et~al.$$ reported the accuracy of MODLEM (MODified Learning from Examples Module) combined with two different classification strategies. However, they did not compare the strategies with each other; instead, they compared the accuracy of different ensemble architectures containing these strategies. In recent years, several classification strategies have been proposed. In this article, the importance of classification strategies is verified and the proportions of single-, multiple- and no-match are presented. The main goals are to compare five different strategies proposed by the following authors: Michalski $$et~al.$$ [12], Grzymała-Busse and Zou [9], An [1], Stefanowski [14], Sulzmann and Fürnkranz [15], and to examine whether there are any differences in their performance. Additionally, a new proposal (introduced in Section 2.3) relying on the $$\chi^{2}$$ statistic is included. For more general and sound conclusions, two rule induction algorithms are used. The first one is MODLEM [13] – a rule induction algorithm which generates an unordered set of rules and which was invented to cope with numeric data without discretization. The second one is RIPPER (Repeated Incremental Pruning to Produce Error Reduction) proposed in [4].
Although it creates an ordered set of rules, the idea from [15] was adopted to obtain an unordered set of rules: the rules are induced in one-against-all mode for each class. This modification, called UNRIPPER, allows one to apply classification strategies more sophisticated than ‘first fires, first classifies’. The article is organized as follows. Section 2 contains a brief description of rule classifiers. It also describes the selected classification strategies and introduces a new strategy. Section 3 describes the design of the experiment. Section 4 presents and discusses the results for all selected classification strategies. Finally, Section 5 concludes with a discussion. 2 Methods 2.1 Basic concepts and definitions This section introduces basic concepts and definitions. A dataset $$X$$ consists of examples $$x \in X$$ characterized by a defined set of $$m$$ attributes $$A=\{a_{1},a_{2},\ldots ,a_{m}\}$$. The domain of attribute $$a \in A$$ is denoted by $$V_{a}$$ and the value of example $$x$$ for attribute $$a$$ is denoted by $$a(x)$$. For numeric attributes, the minimum and the maximum values are also defined and denoted by $$\min (a)$$ and $$\max (a)$$, respectively. Supervised classification also demands that a dataset used during the learning process (called the learning dataset) have one additional nominal attribute $$d \notin A$$, which contains the true classification decision for each example $$x$$ belonging to the learning dataset. For each decision class $$v_{d} \in V_{d}$$ one can split the learning dataset into a set of examples belonging to this class ($$X_{v_{d}}^{+}= \{ x \in X :d(x)= v_{d} \}$$) and examples not belonging to it ($$X_{v_{d}}^{-}= \{ x \in X :d(x)\ne v_{d} \}$$). A rule $$r$$ consists of two parts, where $$P_{r}$$ is a conditional part or a premise [15] and $$Q_{r} \in V_{d}$$ is a conclusion indicating a classification decision.
The conditional part $$P_{r}$$ is a conjunction of single conditions $$w$$: $$P_{r}= w_{1} \land w_{2} \land \ldots \land w_{k}$$, where the number of single conditions $$k$$ is named the length of rule $$r$$ and denoted by $${\rm length}(r)$$. A single condition [4] (sometimes named a selector [12]) $$w$$ is a logical expression which defines a relationship between the value of the attribute $$a$$ for a classified example $$x$$ and the domain of the attribute. The operator $$\in$$ is used either with a subset of values [12] from the attribute domain $$V_{a}$$ for nominal attributes or with an interval built on the attribute domain for numeric ones. A selector $$w$$ covers $$x$$ (denoted by $$w \supseteq x$$) if the relationship contained in $$w$$ is satisfied by $$x$$. A rule $$r$$ covers example $$x$$ (denoted by $$r \supseteq x$$) [15] if each of its conditions $$w \in P_{r}$$ covers $$x$$. The conclusion $$Q_{r}$$ determines the assignment of any object that satisfies $$P_{r}$$ to the class indicated by $$Q_{r}$$. The coverage $$[P_{r}]$$ of rule $$r$$ is the subset of examples $$x$$ from dataset $$X$$ which are covered by $$r$$. The coverage can also be divided into positive and negative parts. The positive coverage $$[P_{r}] ^{+}$$ of rule $$r$$ is the intersection between its coverage $$[P_{r}]$$ and the set $$X_{Q_{r}}^{+}$$ of examples belonging to the class indicated by $$Q_{r}$$, whereas the intersection between $$[P_{r}]$$ and $$X_{Q_{r}}^{-}$$ is named the negative coverage. A classification of a new example $$x$$ by a set of decision rules $$R$$ consists in finding a rule $$r$$ which covers the example and assigning the example to the class indicated by the rule’s conclusion. One can express this as a function:   \begin{equation} z : x,R \mapsto y :y \in V_{d}. \end{equation} (1) However, because of the multiple- and the no-match situations, a concrete definition of this function depends on the chosen classification strategy.
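To make these definitions concrete, the notions of selector, rule, coverage and positive coverage can be sketched in Python (a hypothetical representation of our own, restricted to nominal attributes for brevity; numeric selectors would store intervals instead of value sets):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Selector:
    """A single condition w: the value of `attr` must lie in `allowed`, a subset of V_a."""
    attr: str
    allowed: frozenset

    def covers(self, x):
        return x[self.attr] in self.allowed

@dataclass(frozen=True)
class Rule:
    premise: tuple   # P_r, a conjunction of Selectors
    conclusion: str  # Q_r, the indicated class

    def covers(self, x):
        # r covers x iff every selector in P_r covers x
        return all(w.covers(x) for w in self.premise)

def coverage(rule, X):
    """[P_r]: the examples from dataset X covered by the rule."""
    return [x for x in X if rule.covers(x)]

def positive_coverage(rule, X, d):
    """[P_r]^+: covered examples whose true class d(x) equals Q_r."""
    return [x for x in coverage(rule, X) if d(x) == rule.conclusion]
```

With such a representation, matching an example amounts to filtering the rule set by `covers(x)`.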
Let us introduce a partial rule $$r_{x}^{'}$$ corresponding to a rule $$r$$ and an example $$x$$. Such a partial rule indicates the same class as $$r$$ does, and its conditional part consists only of those selectors from $$r$$ which cover $$x$$. For example, if the rule $$r$$ has the form $$w_{1} \land {w_{2}} \land {w_{3}}\to Q$$ and $$x$$ satisfies only $$w_{1}$$ and $$w_{3}$$, then the partial rule $$r_{x}^{'}$$ is defined as $$w_{1} \land {w_{3}}\to Q$$. Finally, $$R^{\supseteq x}$$ denotes the set of rules covering the example $$x$$ and $$R_{P}^{x}$$ denotes the subset of $$R$$ containing those rules which have at least one selector covering $$x$$. Such rules are named partially-matched to $$x$$ [1]. Before providing a brief description of the selected classification strategies, let us introduce a few important formulas useful for estimating rule quality. The first one is the m-estimate proposed by Džeroski $$et~al.$$ [6]:   \begin{equation} mEstimate(r) = \frac{\vert [P_{r}] ^{+}\vert +m \frac{\vert X_{Q_{r}}^{+}\vert }{\vert X\vert }}{\vert [P_{r}]\vert + m}. \end{equation} (2) If $$m$$ equals 0, the whole expression is simply the Maximum Likelihood Estimate (MLE) of the proportion of positive coverage within the rule’s coverage. Because of the MLE’s properties, rules with small coverage containing only positive examples may obtain overestimated values compared with rules with greater coverage containing some negative examples. The parameter $$m$$ introduces into the calculation $$m$$ additional examples distributed according to the a priori class probability. This allows the expression to control the trade-off between the frequency of positive examples in the coverage and the a priori probability of the class indicated by $$r$$ [6], which might be tuned according to the noise in the dataset. Another proposal comes from [1].
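For illustration, the m-estimate (2) reduces to a few lines of Python (a sketch of our own operating on the cardinalities defined above):

```python
def m_estimate(pos_cov, cov, class_size, n, m=5.0):
    """m-estimate (2).

    pos_cov    -- |[P_r]^+|, positive coverage of the rule
    cov        -- |[P_r]|, total coverage of the rule
    class_size -- |X^+_{Q_r}|, size of the class indicated by the rule
    n          -- |X|, size of the dataset
    With m = 0 this reduces to the MLE pos_cov / cov.
    """
    prior = class_size / n  # a priori probability of the indicated class
    return (pos_cov + m * prior) / (cov + m)
```

For a rule covering three examples, all positive, in a balanced two-class dataset of 100 examples, the MLE equals 1.0, while the m-estimate with $$m=5$$ shrinks it to $$(3+5\cdot 0.5)/(3+5)=0.6875$$, penalizing the small coverage.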
This proposal is called the measure of discrimination; the underlying motivation was to apply a formula similar to one used in information retrieval to discriminate between relevant and irrelevant documents. To measure a rule’s ability to discriminate positives from negatives, the author proposed the following expression:   \begin{equation} {\rm measureOfDiscrimination}(r) = \log \frac{(\vert [P_{r}] ^{+}\vert )( \vert X\vert -\vert [P_{r}]\vert -\vert X_{Q_{r}}^{+}\vert +\vert [P_{r}] ^{+}\vert )}{(\vert [P_{r}]\vert -\vert [P_{r}] ^{+}\vert )(\vert X_{Q_{r}}^{+}\vert -\vert [P_{r}] ^{+}\vert )}. \end{equation} (3) Stefanowski [14] proposed a measure which relies on calculating a distance between the rule and the classified example:   \begin{equation} {\rm distance}(r,x)= \frac{1}{{\rm length}(r)}\sqrt{\sum_{w \in P_{r} }{{\rm distance}^{2}(x,w)}}. \end{equation} (4) In (4), $${\rm distance}(x,w)$$ indicates the distance between the example $$x$$ and a selector $$w$$. For selectors built on nominal or ordinal attributes it equals 0 when $$w$$ covers $$x$$ and 1 otherwise. For selectors built on numeric domains the distance can be expressed as:   \begin{equation} {\rm distance}(x,w)=\begin{cases} 0 & \text{if } w\supseteq x \\ \frac{\min _{v \in V_{a }: w \supseteq v}\vert a(x)-v\vert }{\vert {\rm max}(a)-{\rm min}(a)\vert } & \text{otherwise}, \\ \end{cases} \end{equation} (5) where $$w\supseteq v$$ denotes that the relationship contained in $$w$$ is satisfied by $$v \in V_{a}$$. Yet another formula was proposed in [12]; it measures the fit between the example being classified and the rule:   \begin{equation} {\rm measureOfFit}(r,x) =\prod_{w \in P_{r}}{{\rm measureOfFit}(w,x)}.
\end{equation} (6) The measure of fit between a selector $$w$$ based on a nominal attribute and the example is defined as follows:   \begin{equation} {\rm measureOfFit}(w,x)= \begin{cases} 1 & \text{if } w \supseteq x \\ \frac{\text{number of selector values}}{\vert V_{a}\vert } & \text{otherwise}. \\ \end{cases} \end{equation} (7) Finally, in [2] a proposal was made to exploit measures of association between variables as measures of rule quality. An example of such a measure is the $$\chi ^{2}$$ statistic, which determines, based on the distributions of two nominal variables, whether they are independent of each other. The value of the $$\chi ^{2}$$ statistic [7] for a pair of variables with $$n$$ and $$m$$ values, respectively, can be calculated as:   \begin{equation} \chi ^{2}({\rm Observed, Expected}) = \sum_{i=1}^{n}{\sum_{j=1}^{m}{\frac{({\rm Observed}_{ij}-{\rm Expected}_{ij})^{2}}{{\rm Expected}_{ij}}}}. \end{equation} (8) The symbols $${\rm Observed}_{ij}$$ and $${\rm Expected}_{ij}$$ represent, respectively, the number of observed and expected co-occurrences of the $$i$$-th value from the first variable with the $$j$$-th value from the second one. 2.2 Existing strategies Now, let us briefly describe the particular classification strategies used in the experiment. Each one is presented as a pair of functions $$z^{mm}$$ and $$z^{nm}$$, which support expression (1) in case of multiple- and no-match, respectively. Grzymała-Busse and Zou proposed in [9] to exploit measures of rule strength and rule specificity. They defined the rule strength as the size of the positive coverage and the specificity as the rule length $${\rm length}(r)$$. Concretely, their proposal for multiple-match is as follows:   \begin{equation} z_{S}^{mm}(x) = {\rm argmax}_{v_{d} \in V_{d}}\sum_{r \in R^{\supseteq x} :Q_{r}= v_{d} }{\vert [P_{r}] ^{+}\vert }{\rm length}(r).
\end{equation} (9) For the no-match case Grzymała-Busse and Zou proposed a formula similar to $$z_{S}^{mm}$$, but instead of $${\rm length}(r)$$ it contains the number of selectors covering an example $$x$$, denoted by $$k_{r}^{x}$$:   \begin{equation} z_{S}^{nm}(x) = {\rm argmax}_{v_{d} \in V_{d}}\sum_{r \in R :Q_{r}= v_{d} }{\vert [P_{r}] ^{+}\vert } k_{r}^{x}. \end{equation} (10) The next idea comes from An [1] and is applied in the ELEM2 system. It makes use of the measure of discrimination and for multiple-match has the following form:   \begin{equation} z_{D}^{mm}(x) = {\rm argmax}_{v_{d} \in V_{d}}\sum_{r \in R^{\supseteq x} :Q_{r}= v_{d} }{{\rm measureOfDiscrimination}(r)}. \end{equation} (11) Again, the expression for the no-match case is similar to the one applied to multiple-match. However, it is enriched by the ratio of the number of selectors covering an example $$x$$ to the length of $$r$$:   \begin{equation} z_{D}^{nm}(x) = {\rm argmax}_{v_{d} \in V_{d}}\sum_{r \in R :Q_{r}= v_{d} }{{\rm measureOfDiscrimination}(r)} \frac{k_{r}^{x}}{{\rm length}(r)}. \end{equation} (12) In case of multiple-match, Stefanowski [14] advised applying a formula similar to $$z_{S}^{mm}$$, but without the rule specificity. Moreover, he did not restrict it to the positive coverage: in this proposal, all examples from the coverage participate in classification:   \begin{equation} z_{N}^{mm}(x) = {\rm argmax}_{v_{d} \in V_{d}}\sum_{r \in R^{\supseteq x} }{\vert [P_{r}]\cap X_{v_{d}}^{+}\vert }. \end{equation} (13) For no-match cases, Stefanowski presented a completely different idea inspired by the k-Nearest Neighbours (k-NN) classifier. It relies on calculating distances between rules and examples. Unlike in k-NN, where distant neighbours are excluded from voting, in this strategy all partially-matched rules participate in classification, with vote weights equal to one minus their distances.
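As an illustration, the rule–example distance from (4) and (5) can be sketched in Python for numeric selectors (a sketch of our own; interval selectors are assumed to be stored as lower and upper bounds):

```python
import math

def selector_distance(x_val, low, high, a_min, a_max):
    """distance(x, w) from (5) for a numeric selector w: a in [low, high];
    a_min and a_max are min(a) and max(a) of the attribute domain."""
    if low <= x_val <= high:
        return 0.0  # w covers x
    nearest = low if x_val < low else high  # closest value v satisfying w
    return abs(x_val - nearest) / abs(a_max - a_min)

def rule_distance(selector_distances):
    """distance(r, x) from (4), given the per-selector distances of rule r."""
    k = len(selector_distances)  # length(r)
    return math.sqrt(sum(d * d for d in selector_distances)) / k
```

A rule at distance 0 then votes with full weight, and the weight decays linearly as the distance grows.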
The final class is established as follows:   \begin{equation} z_{N}^{nm}(x) = {\rm argmax}_{v_{d} \in V_{d}}\sum_{r \in R_{P}^{x} }{(1-{\rm distance}(r,x))\vert [P_{r}]\cap X_{v_{d}}^{+}\vert}. \end{equation} (14) Sulzmann and Fürnkranz proposed in [15] to exploit the $$m$$-estimate to solve the multiple-match case. They suggested finding the rule with the highest value of the $$m$$-estimate and classifying the example according to this rule:   \begin{equation} z_{m}^{mm}(x) = Q_{r}:{\rm argmax}_{r \in R^{\supseteq x}} mEstimate(r). \end{equation} (15) Unfortunately, Sulzmann and Fürnkranz did not provide any solution based on the $$m$$-estimate for the no-match case. They advised assigning each example with no matched rules to the majority class [15]. We claim such an idea is biased towards one class and propose extending Sulzmann and Fürnkranz’s multiple-match strategy to the no-match case. Namely, we consider only partially-matched rules and calculate $$mEstimate$$ using the partial rules derived from them. Then, instead of the majority class, an example $$x$$ is assigned to the class indicated by the following formula:   \begin{equation} z_{m}^{nm}(x) = Q_{r}:{\rm argmax}_{r \in R_{P}^{x}} mEstimate(r_{x}^{'}). \end{equation} (16) Another strategy comes from Michalski $$et~al.$$ [12] and was implemented as a part of the AQ15 system. It also exploits the positive coverage of rules, yet to avoid counting examples covered by many rules more than once, the union operator is applied. The proposal for the multiple-match case is as follows:   \begin{equation} \label{eq:zA_mm} z_{A}^{mm}(x) = {\rm argmax}_{v_{d} \in V_{d}}\vert \bigcup_{r \in R^{\supseteq x} :Q_{r}= v_{d}}{[P_{r}] ^{+}}\vert. \end{equation} (17) The authors recommended enriching $$z_{A}^{mm}$$ with $${\rm measureOfFit}(r,x)$$ for the no-match case:   \begin{equation} z_{A}^{nm}(x) = {\rm argmax}_{v_{d} \in V_{d}}\vert \bigcup_{r \in R :Q_{r}= v_{d}}{[P_{r}] ^{+}{\rm measureOfFit}(r,x)}\vert.
\end{equation} (18) Unfortunately, $${\rm measureOfFit}$$ was not defined for selectors based on numeric attributes. Therefore, to handle numeric attributes we propose a new formula similar to the one for nominal attributes – it also relies on calculating how large a part of the attribute’s domain is covered by the selector $$w$$. The definition of $${\rm measureOfFit}$$ is as follows:   \begin{equation} {\rm measureOfFit}(w,x)= \begin{cases} 1 & \text{if } w \supseteq x \\ \frac{{\rm width}(w\cap \lbrack {\rm min}(a), {\rm max}(a)\rbrack )}{\vert {\rm max}(a)-{\rm min}(a)\vert } & \text{otherwise}, \\ \end{cases} \end{equation} (19) where $${\rm width}(\cdot)$$ is the absolute difference between the endpoints of the argument interval. An illustrative value of $${\rm measureOfFit}$$ for a selector $$w$$: $$a>-1$$ given the domain $$V_{a} = [-3;5]$$ equals $$\frac{\vert 5-(-1)\vert }{\vert 5-(-3)\vert }=0.75$$. 2.3 The new proposal In this article a new approach is presented: let us measure whether a relationship exists between membership in a rule’s coverage and membership in the class indicated by the rule. To do this, the $$\chi ^{2}$$ value is calculated just as in the $$\chi ^{2}$$ test of independence between two binary variables. The first variable takes the values ‘example covered’ ($$x \in [P_{r}]$$) or ‘example uncovered’ ($$x \notin \lbrack P_{r}\rbrack $$) by rule $$r$$, whereas ‘example belongs’ ($$x \in X_{Q_{r}}^{+}$$) or ‘example does not belong’ ($$x \in X_{Q_{r}}^{-}$$) to the class indicated by rule $$r$$’s conclusion are the values taken by the second variable.
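In code, the proposed score can be computed directly from four cardinalities (a sketch of our own; a guard against degenerate marginals, e.g. a rule covering the whole dataset, is omitted):

```python
def chi2_rule(pos_cov, neg_cov, class_pos, n):
    """chi^2 for the 2x2 table `covered by r` vs `belongs to class Q_r`.

    pos_cov   -- |[P_r]^+|, neg_cov -- |[P_r]^-|
    class_pos -- |X^+_{Q_r}|, n -- |X|
    """
    observed = [
        [pos_cov, neg_cov],                                # covered by r
        [class_pos - pos_cov, (n - class_pos) - neg_cov],  # not covered by r
    ]
    rows = [sum(row) for row in observed]                  # |[P_r]|, |[P_r]'|
    cols = [observed[0][j] + observed[1][j] for j in range(2)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n  # product of marginals
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2
```

A rule whose coverage is independent of the indicated class scores 0, while a rule covering exactly the positive examples of a balanced two-class dataset attains the maximum $$\vert X\vert$$.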
For these variables, the following contingency table is created, where $$[P_{r}]^{'}$$ denotes the complement of $$[P_{r}]$$:   \begin{equation} {\rm Observed}(r) =\left[\begin{matrix} \vert [P_{r}]\cap X_{Q_{r}}^{+}\vert & \vert [P_{r}]\cap X_{Q_{r}}^{-} \vert & \\ \vert [P_{r}]^{'}\cap X_{Q_{r}}^{+}\vert & \vert [P_{r}]^{'}\cap X_{Q_{r}}^{-}\vert & \\ \end{matrix} \right]=\left[\begin{matrix} \vert [P_{r}] ^{+}\vert & \vert [P_{r}] ^{-}\vert & \\ \vert [P_{r}]^{'}\cap X_{Q_{r}}^{+}\vert & \vert [P_{r}]^{'}\cap X_{Q_{r}}^{-}\vert & \\ \end{matrix} \right]. \end{equation} (20) As in the $$\chi ^{2}$$ test of independence, the matrix of expected values is calculated under the assumption that these two variables are independent. In consequence, it is also assumed that each pair of possible values is independent, which means that the joint probability of the variables can be calculated as a product of their marginals:   \begin{equation} {\rm Expected}(r) =\left[\begin{matrix} \vert [P_{r}]\vert \vert X_{Q_{r}}^{+}\vert & \vert [P_{r}]\vert \vert X_{Q_{r}}^{-}\vert & \\ \vert [P_{r}]^{'}\vert \vert X_{Q_{r}}^{+}\vert & \vert [P_{r}]^{'}\vert \vert X_{Q_{r}}^{-}\vert & \\ \end{matrix} \right]\frac{1}{\vert X\vert }. \end{equation} (21) In terms of classification, we propose to classify an example according to the rule which obtains the largest value of $$\chi ^{2}$$, indicating the strongest relationship between the variables. The proposal for solving multiple-match comes down to the following formula:   \begin{equation} z_{C}^{mm}(x) = Q_{r}:{\rm argmax}_{r \in R^{\supseteq x}}\chi ^{2}(r), \end{equation} (22) where $$\chi ^{2}$$ is calculated according to the following expression:   \begin{equation} \chi^{2}(r)=\sum_{i=1}^{2}{\sum_{j=1}^{2}{({\rm Observed}(r)_{ij}-{\rm Expected}(r)_{ij})^{2}/{\rm Expected}(r)_{ij}}}.
\end{equation} (23) To solve the no-match case a similar expression is applied, but using partial rules derived from the set of partially-matched ones:   \begin{equation} z_{C}^{nm}(x) =Q_{r}:{\rm argmax}_{r \in R_{P}^{x}}\chi ^{2}(r_{x}^{'}). \end{equation} (24) 3 Experimental design The main goal of the experiment was to examine the role that classification strategies play in the classification of unknown examples and to compare the classification strategies described in the previous section. The experiment was conducted on 30 datasets downloaded from the machine learning repository of the University of California, Irvine [10]. They are presented in Table 1. The selected datasets are diversified in terms of their size, the numbers of nominal and numeric attributes, as well as the number of classes and their distribution. Table 1. A summary of the datasets used in the experiment (abbr. = dataset abbreviation, #c = number of classes, #o = number of nominal attributes, #n = number of numeric attributes) Dataset  abbr.
#c  #o  #n  Class distribution
Abalone data  aba  2  1  7  3842:335
Breast cancer  brc  2  9  0  201:85
Wisconsin breast cancer  brw  2  0  9  458:241
Liver Disorders  bup  2  0  6  200:145
Car evaluation  car  4  6  0  1210:384:69:65
King+Rook versus King+Pawn  che  2  36  0  1669:1527
Contraceptive Method Choice  cmc  2  7  2  1140:333
German credit  cre  2  13  7  700:300
Credit Approval  crx  2  9  6  383:307
Ecoli  eco  8  0  7  143:77:52:35:20:5:2:2
Glass identification  gla  6  0  9  76:70:29:17:13:9
Haberman’s Survival Data  hab  2  0  3  225:81
Cleveland heart disease  hea  2  7  6  165:138
Hepatitis  hep  2  13  6  123:32
Ionosphere  ion  2  0  34  225:126
Iris Plants Database  iri  3  0  4  50:50:50
Lymphography  lym  4  15  3  81:61:4:2
Monk’s 3 problem  mon  2  6  0  62:60
Mushroom  mus  2  22  0  4208:3916
Pima Indians Diabetes  pim  2  0  8  500:268
Primary Tumor  pri  21  17  0  84:39:29:28:24:24:20:16:14:14:10:9:7:6:6:2:2:2:1:1:1
satimage_train  sat  6  0  36  1072:1038:961:479:470:415
Sonar  son  2  0  60  111:97
Soybean disease  soy  19  35  0  92:91:91:88:44:44:20:20:20:20:20:20:20:20:20:16:15:14:8
Tic-tac-toe endgame  tic  2  9  0  626:332
Blood transfusion service center  tra  2  0  4  570:178
Vehicle silhouettes  veh  4  0  18  218:217:212:199
Congressional Voting Records  vot  2  16  0  267:168
Deterding vowel recognition  vow  11  3  10  90:90:90:90:90:90:90:90:90:90:90
Cellular Localization Sites of Proteins  yea  10  0  8  463:429:244:163:51:44:35:30:20:5
At the beginning, the proportions of examples classified in single-, multiple- and no-match situations were analysed. The sum of the percentages of multiple- and no-match cases can be viewed as the percentage of examples which the classifier itself would not be able to classify without a classification strategy, and as the gain in accuracy that would be obtained if the strategy classified all these examples correctly.
Thus, the combined percentage of multiple- and no-match cases is a good indicator of how important the classification strategy is. Next, the accuracies of the strategies were compared to check whether there are any differences between them. To explain the findings, the overall accuracies were decomposed into separate multiple- and no-match accuracies. Finally, the two different recommendations for multiple- and no-match were combined into a new strategy.

Each classifier was evaluated using stratified 10-fold cross-validation repeated 100 times to estimate the mean appropriately. To test the significance of differences between the strategies the two-tailed Friedman test was used. Besides that, the Nemenyi test was employed as a post-hoc test for their pairwise comparison. These tests are described in detail in [5]. The considered significance level $$\alpha$$ was equal to 0.05. Finally, both MODLEM and UNRIPPER were used with pruning enabled, and the value of $$m$$ in $$z_{m}$$ was set to 5, following the suggestion of Sulzmann and Fürnkranz [15].

In many real-world applications datasets manifest an imbalanced distribution of examples, namely, (at least) one class is heavily underrepresented in comparison with the others. In view of the imbalanced class distribution of some of the selected datasets, classifiers were not evaluated using plain accuracy but rather the $$F-score$$, because otherwise even a classifier with very high accuracy could be biased towards the majority classes and useless in practice. To avoid biasing the classifier towards any class and to reflect the interest in accuracy across all classes, performance was measured as the macro-averaged $$F-score$$ [16], which treats the performance on each class equally:
\begin{equation}
F-score(X) = \frac{1}{\vert V_{d}\vert}\sum_{v_{d} \in V_{d}}{\frac{2\,\vert \{x \in X : d(x) = v_{d} \land z(x) = v_{d}\} \vert}{\vert \{x \in X : z(x)= v_{d}\}\vert + \vert \{x \in X : d(x)= v_{d}\}\vert}}.
\end{equation}
(25)

4 Results

4.1 Proportions of single-, multiple- and no-match

At the beginning, the proportions of examples classified through single-match, multiple-match and no-match using rules induced by both MODLEM and UNRIPPER are presented. These numbers are expected to depend on the size of rule coverage and the extent to which the rules overlap, which, in turn, results from the number and the length of the induced rules. The more rules, the more overlaps, resulting in an increased percentage of multiple-match; conversely, the longer a rule, the narrower its coverage, resulting in an increased percentage of no-match. Table 2 shows that MODLEM created many more rules, which were also longer on average, than those generated by UNRIPPER. These very big differences in the number of rules may result in a greater number of multiple-matches in case of the former and a greater number of no-matches in case of the latter.

Table 2. The profile of rules generated by MODLEM and UNRIPPER

       Number of rules        Average rule length
       MODLEM  UNRIPPER       MODLEM  UNRIPPER
aba    129      8.88          3.78    2.75
brc     32      3.83          3.97    1.60
brw     20      8.70          3.55    2.37
bup     52      7.45          3.29    2.34
car     68     50.93          5.00    3.75
che     38     24.68          4.53    3.81
cmc     96      5.65          4.50    1.76
cre    139      7.58          5.31    2.12
crx     50      5.83          4.36    2.50
eco     35     12.25          3.23    2.45
gla     28     11.25          3.00    2.49
hab     37      3.00          2.70    1.32
hea     33      7.85          3.85    2.21
hep      8      4.88          2.25    1.63
ion     15      6.90          2.93    2.07
iri      7      4.33          2.14    1.55
lym     19      9.03          3.21    1.82
mon     14      7.25          3.21    1.65
mus     15     54.45          2.33    2.00
pim     91      6.80          4.36    2.40
pri     83     11.75          5.88    3.16
sat    166     60.88          5.92    4.81
son     13      7.98          3.31    2.13
soy     51     33.50          4.41    2.46
tic     17     18.08          3.29    3.16
tra     39      5.08          2.54    1.87
veh     85     19.75          4.40    3.15
vot     17      5.23          3.94    2.00
vow     71     58.63          3.97    3.71
yea    200     21.58          4.27    3.87
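The macro-averaged $$F-score$$ of Equation (25) can be sketched in a few lines, assuming parallel lists holding the true classes $$d(x)$$ and the predictions $$z(x)$$ (the example labels below are illustrative only):

```python
from collections import Counter

def macro_f_score(true_labels, predicted_labels):
    """Macro-averaged F-score: per-class F1 values weighted equally (Eq. 25)."""
    classes = set(true_labels)                 # V_d: the set of decision values
    true_counts = Counter(true_labels)         # |{x : d(x) = v}|
    pred_counts = Counter(predicted_labels)    # |{x : z(x) = v}|
    # |{x : d(x) = v and z(x) = v}|: correctly classified examples per class
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    total = 0.0
    for v in classes:
        denom = pred_counts[v] + true_counts[v]
        total += 2 * correct[v] / denom if denom else 0.0
    return total / len(classes)

y_true = ["a", "a", "b", "b", "c"]
y_pred = ["a", "b", "b", "b", "c"]
print(round(macro_f_score(y_true, y_pred), 3))  # -> 0.822
```

Because every class contributes with weight $$1/\vert V_{d}\vert$$, a classifier that ignores a minority class is penalised even if its plain accuracy stays high.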
A summary of the proportions of examples classified through single-, multiple- and no-match is presented in Table 3. The median for single-match is 71.75% and 82.2% for MODLEM- and UNRIPPER-generated rules, respectively. This indicates that, in the median case, classification of 28.25% or 17.8% of examples involved the classification strategy. We think that these proportions are high enough to significantly change the number of correctly classified examples (provided the classification strategies work well) and to affect the evaluation of a specific classifier. It is worth noticing that in the extreme cases barely 13.3% (pri using MODLEM) and 41.7% (mus using UNRIPPER) of examples were handled through single-match. Finally, the results confirm the earlier expectations regarding the proportions of multiple- and no-match: for 24 datasets MODLEM obtained a higher percentage of multiple-match than UNRIPPER, whereas the latter noted a greater percentage of no-match for 19 datasets. Table 3.
The percentages of examples classified through single-, multiple- and no-match

       MODLEM                                   |  UNRIPPER
       Single       Multiple     No             |  Single       Multiple     No
aba    51.4 ± 6.0   47.7 ± 6.0    0.9 ± 0.2    |  86.3 ± 8.6    1.1 ± 0.3   12.6 ± 8.7
brc    49.9 ± 5.0   44.5 ± 4.8    5.6 ± 1.5    |  84.1 ± 5.2    7.0 ± 2.4    8.9 ± 5.3
brw    94.1 ± 0.8    3.7 ± 0.8    2.2 ± 0.5    |  96.0 ± 0.7    2.7 ± 0.7    1.3 ± 0.4
bup    58.5 ± 4.1   31.8 ± 6.0    9.7 ± 2.7    |  75.3 ± 3.0   11.1 ± 2.8   13.6 ± 3.2
car    89.0 ± 2.1    9.5 ± 2.3    1.6 ± 0.3    |  83.8 ± 1.1   11.2 ± 1.0    5.1 ± 0.9
che    98.2 ± 0.3    1.5 ± 0.2    0.3 ± 0.1    |  98.9 ± 0.2    0.8 ± 0.2    0.3 ± 0.1
cmc    34.7 ± 2.8   64.2 ± 2.9    1.1 ± 0.3    |  76.6 ± 10.1   0.7 ± 0.5   22.7 ± 10.2
cre    66.7 ± 1.7   22.0 ± 1.9   11.4 ± 1.3    |  76.0 ± 4.2    6.8 ± 1.1   17.2 ± 4.7
crx    78.0 ± 3.1   14.4 ± 3.2    7.6 ± 1.0    |  93.0 ± 1.1    4.2 ± 1.1    2.8 ± 1.0
eco    80.5 ± 1.9   10.2 ± 1.8    9.3 ± 1.2    |  84.7 ± 1.7    7.1 ± 1.2    8.2 ± 1.4
gla    69.4 ± 2.3   14.3 ± 2.2   16.3 ± 2.4    |  65.0 ± 3.5   11.9 ± 2.6   23.2 ± 3.1
hab    41.4 ± 5.8   56.2 ± 5.8    2.4 ± 1.0    |  80.6 ± 9.0    2.2 ± 1.3   17.3 ± 9.0
hea    77.6 ± 2.0   12.6 ± 1.9    9.8 ± 1.6    |  80.6 ± 2.3   10.0 ± 2.0    9.4 ± 2.0
hep    58.5 ± 4.4   37.2 ± 4.0    4.3 ± 1.4    |  69.3 ± 6.3   20.6 ± 5.6   10.1 ± 6.0
ion    90.2 ± 1.8    5.2 ± 1.2    4.7 ± 1.3    |  92.2 ± 1.6    4.4 ± 1.0    3.5 ± 1.2
iri    97.3 ± 1.1    1.3 ± 1.0    1.5 ± 0.8    |  95.9 ± 1.5    1.4 ± 1.0    2.7 ± 1.3
lym    78.6 ± 2.5   11.0 ± 2.2   10.4 ± 2.5    |  80.5 ± 2.9   10.3 ± 2.4    9.2 ± 2.4
mon    86.8 ± 3.4    9.0 ± 3.4    4.2 ± 1.6    |  89.1 ± 2.8    9.4 ± 2.4    1.5 ± 1.4
mus    80.5 ± 0.7   19.5 ± 0.7    0.0 ± 0.0    |  41.7 ± 0.9   58.4 ± 0.9    0.0 ± 0.0
pim    64.3 ± 3.8   27.5 ± 4.0    8.2 ± 1.2    |  86.4 ± 1.8    6.2 ± 1.2    7.4 ± 1.4
pri    13.3 ± 1.5   85.1 ± 1.4    1.5 ± 0.7    |  51.8 ± 2.3   16.8 ± 2.3   31.4 ± 2.3
sat    71.9 ± 3.4   24.7 ± 3.6    3.4 ± 0.4    |  87.6 ± 0.4    6.5 ± 0.4    6.0 ± 0.3
son    71.6 ± 3.5   14.3 ± 2.4   14.1 ± 2.0    |  71.1 ± 3.4   15.4 ± 2.6   13.5 ± 3.0
soy    70.0 ± 1.1   25.7 ± 1.1    4.3 ± 0.7    |  72.5 ± 0.8   21.3 ± 0.6    6.2 ± 0.6
tic    92.5 ± 2.3    6.6 ± 2.4    0.9 ± 0.3    |  96.1 ± 0.8    2.4 ± 0.9    1.5 ± 0.1
tra    19.0 ± 2.3   80.3 ± 2.4    0.6 ± 0.3    |  92.6 ± 4.3    1.1 ± 0.5    6.2 ± 4.5
veh    67.0 ± 2.9   20.6 ± 3.5   12.4 ± 1.2    |  73.3 ± 1.6    8.7 ± 1.3   18.0 ± 1.7
vot    80.7 ± 2.1   18.2 ± 2.2    1.1 ± 0.4    |  94.6 ± 0.6    5.4 ± 0.6    0.0 ± 0.1
vow    73.5 ± 1.4   10.8 ± 1.1   15.7 ± 1.1    |  66.2 ± 1.5   12.8 ± 1.0   21.0 ± 1.4
yea    23.2 ± 3.1   73.8 ± 3.6    3.0 ± 0.7    |  64.7 ± 1.6    7.1 ± 0.9   28.3 ± 1.5

4.2 Overall accuracy of strategies

To learn whether there is any difference between the six aforementioned classification strategies, their comparison was conducted. Table 4 shows their accuracy when combined with MODLEM- and UNRIPPER-generated rules. The highest scores for each dataset are marked in bold. In case of MODLEM the situation is clear: $$z_{D}$$ obtained the highest score for 21 datasets, and the second one was $$z_{m}$$, noting 7 wins. In turn, in case of UNRIPPER the best one cannot be determined so easily – these two strategies noted 8 wins each, whereas $$z_{C}$$ obtained 9. The Friedman test was conducted to verify the significance of the differences between the strategies. The null hypothesis was that all the strategies perform equally, i.e. their average ranks are equal. The Friedman test returned p-values smaller than 5e-7 and equal to 3.77e-4 for MODLEM and UNRIPPER, respectively, allowing the null hypothesis to be rejected. Table 4.
$$F-score$$ obtained by the strategies

       MODLEM                                          |  UNRIPPER
       z_m    z_S    z_D    z_C    z_N    z_A          |  z_m    z_S    z_D    z_C    z_N    z_A
aba    0.482  0.538  0.629  0.431  0.503  0.539        |  0.255  0.255  0.219  0.305  0.257  0.250
brc    0.461  0.538  0.559  0.546  0.490  0.518        |  0.466  0.467  0.323  0.397  0.466  0.418
brw    0.616  0.555  0.738  0.624  0.557  0.664        |  0.393  0.433  0.609  0.496  0.444  0.503
bup    0.571  0.498  0.621  0.585  0.536  0.486        |  0.505  0.500  0.480  0.494  0.458  0.391
car    0.696  0.541  0.814  0.726  0.499  0.408        |  0.284  0.328  0.595  0.432  0.304  0.249
che    0.812  0.671  0.802  0.769  0.776  0.734        |  0.563  0.495  0.577  0.471  0.513  0.494
cmc    0.565  0.558  0.593  0.535  0.569  0.551        |  0.385  0.343  0.307  0.285  0.386  0.369
cre    0.434  0.542  0.566  0.533  0.538  0.539        |  0.392  0.426  0.387  0.397  0.407  0.379
crx    0.728  0.642  0.728  0.702  0.675  0.658        |  0.563  0.588  0.466  0.482  0.500  0.498
eco    0.335  0.279  0.377  0.400  0.278  0.283        |  0.285  0.277  0.348  0.365  0.233  0.221
gla    0.341  0.303  0.399  0.428  0.299  0.289        |  0.298  0.212  0.266  0.387  0.215  0.186
hab    0.480  0.548  0.576  0.497  0.505  0.540        |  0.365  0.355  0.274  0.290  0.365  0.328
hea    0.682  0.651  0.692  0.657  0.653  0.672        |  0.583  0.589  0.600  0.601  0.573  0.554
hep    0.431  0.573  0.620  0.589  0.565  0.561        |  0.464  0.485  0.505  0.419  0.479  0.431
ion    0.435  0.451  0.642  0.621  0.549  0.530        |  0.544  0.399  0.636  0.597  0.490  0.466
iri    0.898  0.825  0.859  0.888  0.841  0.749        |  0.746  0.460  0.634  0.654  0.637  0.588
lym    0.383  0.389  0.502  0.448  0.394  0.400        |  0.464  0.484  0.509  0.512  0.474  0.486
mon    0.780  0.789  0.681  0.647  0.718  0.646        |  0.914  0.540  0.884  0.927  0.913  0.350
mus    0.712  0.718  0.751  0.701  0.736  0.732        |  0.802  0.932  0.954  0.778  0.934  0.862
pim    0.621  0.533  0.640  0.603  0.591  0.537        |  0.405  0.496  0.501  0.507  0.431  0.382
pri    0.196  0.167  0.210  0.205  0.170  0.150        |  0.114  0.095  0.114  0.124  0.104  0.106
sat    0.668  0.473  0.658  0.530  0.513  0.460        |  0.489  0.425  0.487  0.430  0.395  0.414
son    0.588  0.589  0.614  0.594  0.566  0.533        |  0.572  0.537  0.560  0.546  0.534  0.473
soy    0.506  0.388  0.406  0.424  0.473  0.464        |  0.426  0.330  0.371  0.354  0.377  0.343
tic    0.647  0.582  0.915  0.802  0.650  0.562        |  0.511  0.224  0.694  0.532  0.229  0.223
tra    0.552  0.542  0.610  0.536  0.498  0.554        |  0.377  0.451  0.318  0.328  0.378  0.333
veh    0.565  0.477  0.600  0.509  0.456  0.405        |  0.394  0.257  0.287  0.329  0.252  0.270
vot    0.799  0.717  0.749  0.768  0.764  0.631        |  0.484  0.563  0.615  0.570  0.598  0.578
vow    0.468  0.339  0.403  0.428  0.342  0.419        |  0.446  0.314  0.395  0.370  0.286  0.375
yea    0.467  0.188  0.492  0.491  0.207  0.185        |  0.270  0.183  0.292  0.310  0.180  0.191

Figure 1 shows a ranking of the strategies according to their average ranks and the critical difference derived from the Nemenyi test. One can notice a few statistically significant differences (such strategies are not connected by a horizontal line).
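The critical difference (CD) drawn in such diagrams follows the standard Nemenyi formula $$CD = q_{\alpha}\sqrt{k(k+1)/(6N)}$$. The sketch below assumes $$k = 6$$ strategies, $$N = 30$$ datasets and the tabulated two-tailed value $$q_{0.05} = 2.850$$ for six classifiers; it is an illustration of the formula, not a value quoted from the article:

```python
import math

def nemenyi_cd(k: int, n: int, q_alpha: float) -> float:
    """Critical difference for average ranks in the Nemenyi post-hoc test."""
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n))

# q_{0.05} = 2.850 for k = 6 classifiers (Demsar's table); N = 30 datasets.
cd = nemenyi_cd(k=6, n=30, q_alpha=2.850)
print(round(cd, 3))  # -> 1.377
```

Two strategies are declared significantly different when their average ranks differ by more than this value.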
First, in case of MODLEM-generated rules $$z_{D}$$ is better than all the others, while $$z_{C}$$ outperforms $$z_{A}$$. Thus, the conclusion is that in this case $$z_{D}$$ is the best among the evaluated strategies. Then, in case of UNRIPPER-generated rules $$z_{m}$$, $$z_{D}$$ and $$z_{C}$$ are better than $$z_{A}$$. An interesting phenomenon is that the previous supremacy of $$z_{D}$$ is diminished. It is also interesting that for both classifiers the strategies formed a similar order: the top three are $$z_{D}$$, $$z_{m}$$ and $$z_{C}$$, whereas $$z_{N}$$, $$z_{S}$$ and $$z_{A}$$ lag behind. These two phenomena are analysed and described in the following subsections.

Fig. 1. The average ranks of the strategies and a pairwise comparison using the Nemenyi test. (a) MODLEM; (b) UNRIPPER.

4.3 An accuracy decomposition into multiple- and no-match

In the previous subsection, $$z_{D}$$ outperformed all the other strategies in case of MODLEM, but its supremacy did not occur in case of UNRIPPER. Because these two algorithms led to different proportions of examples classified using multiple- and no-match, we hypothesized that the strategies may perform differently in these two cases. Hence, the overall accuracy was decomposed into the accuracies obtained in multiple- and no-match, which were analysed separately. Table 5 shows the accuracy of the classification strategies considering only the multiple-match situation. For both learning algorithms, the best strategy is most often $$z_{D}$$. It took the lead for 25 and 14 datasets in case of MODLEM and UNRIPPER, respectively.
Again, the results were compared using the Friedman test, which returned p-values smaller than 5e-7 and equal to 9.07e-4 in case of MODLEM and UNRIPPER, respectively, rejecting the null hypothesis for both groups. Figure 2 depicts a ranking of the strategies based on the average ranks. It turns out that $$z_{D}$$ is the top strategy in both groups: it is significantly better than all the others in case of MODLEM, and significantly better than $$z_{A}$$ in case of UNRIPPER. Therefore, for the multiple-match situation it is concluded that $$z_{D}$$ is the best strategy. On the other side there is $$z_{A}$$, which performed the worst in both groups.

Fig. 2. The average ranks of the strategies and a pairwise comparison using the Nemenyi test in multiple-match. (a) MODLEM; (b) UNRIPPER.

Table 5. $$F-score$$ obtained by the strategies in multiple-match

       MODLEM                                          |  UNRIPPER
       z_m    z_S    z_D    z_C    z_N    z_A          |  z_m    z_S    z_D    z_C    z_N    z_A
aba    0.484  0.542  0.630  0.429  0.505  0.543        |  0.367  0.370  0.416  0.505  0.371  0.372
brc    0.467  0.547  0.547  0.532  0.506  0.530        |  0.392  0.395  0.343  0.491  0.388  0.396
brw    0.643  0.661  0.741  0.678  0.665  0.665        |  0.385  0.454  0.618  0.463  0.460  0.387
bup    0.574  0.475  0.618  0.599  0.573  0.463        |  0.530  0.492  0.502  0.485  0.414  0.413
car    0.780  0.516  0.847  0.793  0.465  0.460        |  0.427  0.362  0.657  0.458  0.409  0.362
che    0.821  0.701  0.833  0.809  0.796  0.775        |  0.613  0.487  0.647  0.536  0.525  0.524
cmc    0.565  0.557  0.593  0.536  0.569  0.550        |  0.388  0.528  0.370  0.433  0.417  0.458
cre    0.450  0.588  0.593  0.555  0.590  0.587        |  0.353  0.424  0.437  0.437  0.344  0.382
crx    0.710  0.685  0.721  0.696  0.719  0.635        |  0.530  0.562  0.462  0.483  0.501  0.501
eco    0.454  0.451  0.493  0.486  0.463  0.460        |  0.443  0.458  0.486  0.402  0.444  0.440
gla    0.363  0.410  0.515  0.507  0.414  0.407        |  0.432  0.302  0.402  0.362  0.349  0.308
hab    0.483  0.552  0.576  0.497  0.509  0.543        |  0.461  0.476  0.262  0.284  0.461  0.460
hea    0.664  0.676  0.677  0.685  0.684  0.683        |  0.567  0.598  0.572  0.558  0.535  0.493
hep    0.433  0.588  0.622  0.585  0.579  0.576        |  0.410  0.461  0.577  0.448  0.444  0.448
ion    0.455  0.479  0.704  0.686  0.617  0.596        |  0.542  0.416  0.671  0.659  0.543  0.514
iri    0.882  0.886  0.919  0.882  0.877  0.877        |  0.832  0.455  0.626  0.638  0.637  0.528
lym    0.602  0.604  0.694  0.643  0.623  0.623        |  0.613  0.622  0.597  0.585  0.598  0.571
mon    0.701  0.773  0.681  0.688  0.707  0.595        |  0.801  0.362  0.724  0.822  0.797  0.386
mus    0.712  0.719  0.751  0.702  0.737  0.732        |  0.802  0.932  0.954  0.778  0.934  0.862
pim    0.633  0.562  0.663  0.610  0.642  0.556        |  0.362  0.476  0.502  0.538  0.397  0.414
pri    0.200  0.171  0.214  0.210  0.174  0.154        |  0.429  0.413  0.390  0.389  0.428  0.425
sat    0.701  0.528  0.732  0.598  0.596  0.531        |  0.561  0.564  0.634  0.565  0.563  0.543
son    0.547  0.585  0.606  0.584  0.589  0.587        |  0.561  0.515  0.571  0.502  0.505  0.484
soy    0.585  0.466  0.489  0.545  0.564  0.570        |  0.496  0.362  0.429  0.495  0.413  0.408
tic    0.801  0.704  0.860  0.818  0.843  0.693        |  0.832  0.362  0.963  0.534  0.372  0.361
tra    0.553  0.543  0.610  0.536  0.500  0.555        |  0.356  0.386  0.351  0.357  0.355  0.358
veh    0.658  0.605  0.670  0.662  0.647  0.594        |  0.534  0.498  0.566  0.504  0.500  0.476
vot    0.800  0.742  0.788  0.772  0.776  0.609        |  0.478  0.561  0.615  0.569  0.598  0.579
vow    0.551  0.529  0.574  0.551  0.566  0.564        |  0.536  0.468  0.594  0.469  0.477  0.458
yea    0.475  0.195  0.503  0.506  0.214  0.190        |  0.372  0.329  0.518  0.524  0.328  0.313

Next, Table 6 contains the $$F-score$$ values in no-match. This time the situation is not as clear as before: $$z_{m}$$ most often obtained the highest score in case of UNRIPPER, noting 12 wins, while the second one, $$z_{C}$$, scored the best for 10 datasets. In case of MODLEM the lead is shared by $$z_{m}$$ and $$z_{D}$$, which both scored the best for 10 datasets. The Friedman test rejected the null hypothesis, returning p-values equal to 2e-6 and 1.013e-3 in case of MODLEM and UNRIPPER, respectively. Figure 3 presents a ranking based on the average ranks. In both cases the best strategy is $$z_{m}$$, but in case of MODLEM the difference compared with $$z_{D}$$ is very slight. $$z_{C}$$ also reveals stable and fairly good results. Interestingly, although $$z_{D}$$ is second in case of MODLEM, it is second to last in case of UNRIPPER. Table 6.
$$F-score$$ obtained by the strategies in no-match

       MODLEM                                          |  UNRIPPER
       z_m    z_S    z_D    z_C    z_N    z_A          |  z_m    z_S    z_D    z_C    z_N    z_A
aba    0.422  0.421  0.257  0.501  0.427  0.426        |  0.244  0.244  0.198  0.278  0.245  0.238
brc    0.397  0.442  0.507  0.603  0.367  0.382        |  0.390  0.380  0.290  0.306  0.384  0.413
brw    0.551  0.321  0.700  0.504  0.317  0.628        |  0.381  0.372  0.549  0.511  0.390  0.466
bup    0.557  0.488  0.610  0.515  0.380  0.475        |  0.470  0.488  0.452  0.490  0.466  0.354
car    0.203  0.373  0.374  0.294  0.337  0.135        |  0.023  0.141  0.267  0.250  0.078  0.023
che    0.653  0.502  0.406  0.523  0.371  0.443        |  0.370  0.402  0.308  0.276  0.429  0.380
cmc    0.478  0.478  0.503  0.450  0.478  0.478        |  0.376  0.327  0.303  0.277  0.376  0.363
cre    0.399  0.384  0.494  0.481  0.384  0.385        |  0.390  0.411  0.369  0.382  0.408  0.361
crx    0.727  0.552  0.720  0.693  0.588  0.702        |  0.542  0.462  0.451  0.448  0.449  0.480
eco    0.504  0.378  0.517  0.521  0.358  0.380        |  0.364  0.310  0.389  0.462  0.265  0.209
gla    0.351  0.202  0.219  0.354  0.184  0.180        |  0.243  0.174  0.144  0.358  0.165  0.169
hab    0.447  0.462  0.432  0.444  0.446  0.473        |  0.311  0.307  0.271  0.289  0.311  0.283
hea    0.695  0.600  0.703  0.612  0.598  0.640        |  0.570  0.568  0.609  0.632  0.601  0.580
hep    0.482  0.489  0.557  0.580  0.489  0.482        |  0.374  0.371  0.333  0.347  0.373  0.362
ion    0.381  0.379  0.437  0.422  0.379  0.393        |  0.513  0.363  0.551  0.447  0.368  0.359
iri    0.936  0.821  0.854  0.925  0.846  0.699        |  0.694  0.668  0.674  0.682  0.680  0.646
lym    0.500  0.507  0.528  0.483  0.476  0.499        |  0.495  0.518  0.492  0.523  0.505  0.547
mon    0.884  0.750  0.589  0.509  0.676  0.719        |  0.958  0.958  0.958  0.958  0.958  0.179
mus    0.774  0.774  0.774  0.774  0.774  0.826        |  1.00   1.00   1.00   1.00   1.00   1.00
pim    0.571  0.369  0.533  0.577  0.355  0.441        |  0.428  0.502  0.485  0.471  0.450  0.346
pri    0.838  0.848  0.840  0.789  0.849  0.851        |  0.153  0.132  0.143  0.171  0.126  0.124
sat    0.500  0.233  0.209  0.240  0.091  0.161        |  0.389  0.195  0.234  0.256  0.140  0.208
son    0.616  0.578  0.607  0.582  0.515  0.394        |  0.567  0.556  0.537  0.579  0.560  0.393
soy    0.619  0.628  0.621  0.534  0.626  0.540        |  0.643  0.634  0.654  0.525  0.624  0.575
tic    0.256  0.210  0.852  0.669  0.195  0.209        |  0.002  0.001  0.266  0.351  0.004  0.001
tra    0.395  0.408  0.504  0.474  0.395  0.417        |  0.377  0.401  0.311  0.314  0.379  0.327
veh    0.412  0.253  0.221  0.282  0.167  0.134        |  0.322  0.129  0.134  0.243  0.116  0.160
vot    0.614  0.332  0.285  0.561  0.480  0.536        |  0.950  0.927  0.927  0.927  0.927  0.896
vow    0.382  0.119  0.180  0.323  0.137  0.288        |  0.361  0.140  0.202  0.290  0.145  0.267
yea    0.474  0.285  0.350  0.333  0.275  0.307        |  0.217  0.096  0.134  0.171  0.089  0.114

Fig. 3. The average ranks of the strategies and a pairwise comparison using the Nemenyi test in no-match. (a) MODLEM; (b) UNRIPPER.
(a) MODLEM; (b) UNRIPPER.

Because the recommendations obtained for multiple- and no-match differ, we combined the two into a new strategy denoted by $$z_{H}$$, where $$z_{H}^{mm}(x) = z_{D}^{mm}(x)$$ and $$z_{H}^{nm}(x) = z_{m}^{nm}(x)$$. Once again, the Friedman test was used to compare the results, returning p-values smaller than 5e-7 for both algorithms. Figure 4 shows that in both cases $$z_{H}$$ is the top strategy, supported by many significant differences. For MODLEM, $$z_{H}$$ obtained an even better average rank than $$z_{D}$$, and both are significantly better than all the others. For UNRIPPER, $$z_{H}$$ is again the best strategy: it performed much better than both $$z_{D}$$ and $$z_{m}$$ and significantly outperformed the others. It is especially noteworthy how large an improvement the combination of the assets of $$z_{D}$$ and $$z_{m}$$ brought.

Fig. 4. The average ranks of the strategies and a pairwise comparison using the Nemenyi test after the addition of a combined strategy $$z_{H}$$ based on $$z_{D}$$ and $$z_{m}$$. (a) MODLEM; (b) UNRIPPER.

4.4 An analysis of the rank order of the strategies

It has already been observed that the strategies formed a similar rank order, in which $$z_{D}$$, $$z_{m}$$ and $$z_{C}$$ were the top three while $$z_{N}$$, $$z_{S}$$ and $$z_{A}$$ seemed to lag behind them. To understand this, the datasets with the largest differences between the best and the worst strategies were investigated further. For MODLEM, one such example is the car dataset, on which $$z_{D}$$ obtained a score 0.406 higher than $$z_{A}$$. The $$F$$-score values obtained for each class were compared separately.
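As an illustration, the combination underlying $$z_{H}$$ amounts to a simple dispatcher: accept the matched rules' class when they agree, fall back to the multiple-match strategy when they conflict, and to the no-match strategy when no rule fires. The sketch below is hypothetical; the rule representation, `matches`, and the two resolver callbacks are illustrative assumptions, not the article's actual implementations of $$z_{D}$$ and $$z_{m}$$.

```python
# Hypothetical sketch of the combined strategy z_H: resolve a multiple-match
# with one strategy (z_D) and a no-match with another (z_m).

def matches(rule, example):
    """A rule fires when all of its (attribute, value) conditions hold."""
    return all(example.get(attr) == val for attr, val in rule["conditions"])

def z_h(example, rules, resolve_multiple, resolve_none):
    """Classify `example`: an unambiguous match wins; otherwise dispatch."""
    fired = [r for r in rules if matches(r, example)]
    classes = {r["decision"] for r in fired}
    if len(classes) == 1:                  # single match: unambiguous
        return classes.pop()
    if len(classes) > 1:                   # multiple-match: delegate to z_D
        return resolve_multiple(example, fired)
    return resolve_none(example, rules)    # no-match: delegate to z_m
```

In use, `resolve_multiple` and `resolve_none` would be the concrete $$z_{D}$$ and $$z_{m}$$ procedures; any callables with these signatures fit.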
Figure 5 shows that $$z_{N}$$, $$z_{S}$$ and $$z_{A}$$ obtained much smaller values for three out of four classes and were comparable to the others only on the last one, the largest class, which represents 70% of the dataset. This may indicate that $$z_{N}$$, $$z_{S}$$ and $$z_{A}$$ are biased towards the majority class. The analysis was repeated for the UNRIPPER classifier and the tic dataset. Figure 6 shows that these three strategies recognized the negative class very poorly; on this class $$z_{m}$$, $$z_{C}$$ and especially $$z_{D}$$ are much better. This class is the minority one and constitutes 34.7% of the dataset.

Fig. 5. The $$F$$-score obtained for each class by all the strategies for the MODLEM classifier and the car dataset.

Fig. 6. The $$F$$-score obtained for each class by all the strategies for the UNRIPPER classifier and the tic dataset.

Once again the Friedman test was used to verify whether all the strategies perform equally on the largest classes. For each dataset, only the $$F$$-score value of its largest class was selected and input to the test. The obtained p-values were smaller than 5e-7 for MODLEM and equal to 2.2e-5 for UNRIPPER. The full results are shown in Figure 7. Apart from $$z_{D}$$, all the strategies formed the same order for both classifiers, but – surprisingly – $$z_{N}$$, $$z_{S}$$ and $$z_{A}$$ are located in the middle of the rankings.
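The average ranks reported in the figures and the Friedman statistic behind the quoted p-values can be computed as follows. This is a minimal pure-Python sketch (higher score is better; ties share the mean of the tied ranks); the sample values in the test are invented for illustration and are not the article's results.

```python
# Average ranks of k strategies over n datasets, and the Friedman chi-square
# statistic computed from those ranks.

def average_ranks(scores):
    """scores: list of per-dataset rows, one score per strategy.
    Returns the average rank of each strategy (rank 1 = best)."""
    n, k = len(scores), len(scores[0])
    totals = [0.0] * k
    for row in scores:
        order = sorted(range(k), key=lambda j: -row[j])  # best first
        ranks = [0.0] * k
        i = 0
        while i < k:
            j = i
            while j + 1 < k and row[order[j + 1]] == row[order[i]]:
                j += 1                    # extend the run of tied scores
            mean_rank = (i + j) / 2 + 1   # mean of the tied rank positions
            for t in range(i, j + 1):
                ranks[order[t]] = mean_rank
            i = j + 1
        for j in range(k):
            totals[j] += ranks[j]
    return [t / n for t in totals]

def friedman_statistic(scores):
    """Friedman chi-square over n datasets and k strategies."""
    n, k = len(scores), len(scores[0])
    R = average_ranks(scores)
    return 12 * n / (k * (k + 1)) * (sum(r * r for r in R) - k * (k + 1) ** 2 / 4)
```

When the resulting statistic exceeds the chi-square critical value (k − 1 degrees of freedom), the null hypothesis of equal performance is rejected and a post-hoc test such as Nemenyi's compares the strategies pairwise.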
Interestingly, despite their similar overall performance, this time $$z_{m}$$, $$z_{C}$$ and $$z_{D}$$ performed completely differently: $$z_{m}$$ took the lead, which is also confirmed by statistical significance, while $$z_{C}$$ dropped to the last position. $$z_{D}$$ is also significantly worse than $$z_{m}$$, ranking only third and fifth for MODLEM and UNRIPPER, respectively.

Fig. 7. The average ranks of the strategies and a pairwise comparison using the Nemenyi test considering only the largest class. (a) MODLEM; (b) UNRIPPER.

To complement the analysis, a comparison of the strategies considering only the smallest classes was conducted (Figure 8). As before, the test allowed the null hypothesis to be rejected, returning p-values smaller than 5e-7 for both algorithms. This time $$z_{D}$$ and $$z_{C}$$ are clearly the top two, and each performed significantly better than almost all the others. We therefore conclude that these two are much more sensitive to small classes and can be recommended especially for datasets with an imbalanced distribution of examples.

Fig. 8. The average ranks of the strategies and a pairwise comparison using the Nemenyi test considering only the smallest class. (a) MODLEM; (b) UNRIPPER.

5 Conclusions

This article discussed classification strategies in rule-based classifiers. To attain general conclusions, two rule induction algorithms were used. An analysis of the rules they generated revealed a few differences.
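One of the recommended strategies, $$z_{C}$$, scores rules by the $$\chi^{2}$$ association between a rule's coverage and the class it indicates. The sketch below illustrates that scoring idea from an assumed 2×2 contingency table (covered vs. not covered × in class vs. not in class); the exact formulation used in the article may differ in details.

```python
# Chi-square association between a rule's coverage and its indicated class,
# computed from the 2x2 table [[covered_pos, covered_neg],
#                              [uncovered_pos, uncovered_neg]].
# All marginal totals are assumed to be positive.

def chi_square_rule_score(covered_pos, covered_neg, uncovered_pos, uncovered_neg):
    table = [[covered_pos, covered_neg], [uncovered_pos, uncovered_neg]]
    total = covered_pos + covered_neg + uncovered_pos + uncovered_neg
    row = [sum(r) for r in table]
    col = [table[0][j] + table[1][j] for j in range(2)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / total   # count under independence
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2
```

A larger value indicates a stronger association between coverage and class; in a $$\chi^{2}$$-based strategy, the class whose supporting rules achieve the highest aggregate score would win.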
MODLEM created a larger number of rules than UNRIPPER, and its rules were on average longer than those generated by the latter algorithm. The next observation was that, in the median case, 28.25% or 17.8% of examples could not be classified without a classification strategy when rules were induced by MODLEM and UNRIPPER, respectively; in the extreme case these values were much higher, reaching 86.7% and 58.3%, respectively. Five popular classification strategies proposed by Michalski $$et~al.$$ [12], Grzymała-Busse and Zou [9], An [1], Stefanowski [14], and Sulzmann and Fürnkranz [15] were considered to find out whether any of them is significantly better or worse than the others. A new strategy based on the $$\chi^{2}$$ statistic was also taken into consideration. The first finding was that the results of the comparison depend on the chosen learning algorithm: $$z_{m}$$ (the modified strategy by Sulzmann and Fürnkranz), $$z_{D}$$ proposed by An, and the new proposal denoted by $$z_{C}$$ performed similarly and were the best for UNRIPPER, whereas $$z_{D}$$ was significantly better than all the others for MODLEM. This dependency was suspected to result from the different proportions of examples classified using multiple- and no-match, so the overall results were decomposed into separate multiple-match and no-match results. A deeper insight into the mechanisms used by the strategies revealed that, for both algorithms, multiple- and no-match each have their own best strategy: $$z_{D}$$ and $$z_{m}$$ (our proposal based on the m-estimate), respectively. Combining these two recommendations into a new strategy led to the best results regardless of the chosen learning algorithm. Next, the strategies formed a similar order for both learning algorithms: the top three were $$z_{D}$$, $$z_{m}$$ and $$z_{C}$$, whereas Stefanowski's $$z_{N}$$, $$z_{S}$$ proposed by Grzymała-Busse and Zou, and $$z_{A}$$ from Michalski $$et~al.$$
were situated in the last three positions. Finally, interesting properties of $$z_{D}$$ and $$z_{C}$$ were found: they are significantly better than all the others when only the smallest class from each dataset is considered. We think they are well suited to problems with an imbalanced distribution of examples.

Although the classification strategies have been examined and their properties and accuracies discussed thoroughly, there are still open directions in which to continue the research. The most interesting one is to propose a new strategy that utilizes not only global information (such as positive or negative coverage in the entire dataset) but also incorporates information about the examples occurring in the local neighbourhood of the classified one. This approach may improve accuracy significantly, especially on highly noisy datasets. Summing up, this article provided a deeper insight into classification strategies and their usage during classification. A few interesting conclusions related to this topic were drawn, which can easily be applied to new or existing problems. The presented results may serve as a benchmark for new proposals of strategies.

Acknowledgement

The author thanks Szymon Wilk for careful reading, helpful discussions and valuable comments on previous drafts. He also thanks Jerzy Stefanowski for helpful discussions and valuable suggestions on the very first experiment. The breast cancer, lymphography and primary tumor datasets were obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia, by M. Zwitter and M. Soklic. The vehicle dataset comes from the Turing Institute, Glasgow, Scotland. The Wisconsin breast cancer dataset was obtained from the University of Wisconsin Hospitals, Madison, from Dr. William H. Wolberg [11]. The blood transfusion service center dataset comes from [17]. The Cleveland heart disease dataset comes from the V.A.
Medical Center, Long Beach, and the Cleveland Clinic Foundation, with principal investigator Robert Detrano. The author cordially thanks the donors of these datasets.

References

[1] An, A. Learning classification rules from data. Computers & Mathematics with Applications, 45, 737–748, 2003.
[2] An, A. and Cercone, N. Rule quality measures for rule induction systems: description and evaluation. Computational Intelligence, 17, 409–424, 2001.
[3] Błaszczyński, J., Stefanowski, J. and Zając, M. Ensembles of abstaining classifiers based on rule sets. In Foundations of Intelligent Systems, 18th International Symposium, ISMIS 2009, Prague, Czech Republic, September 14–17, 2009, Proceedings, Vol. 5722 of Lecture Notes in Computer Science, J. Rauch, Z. W. Raś, P. Berka and T. Elomaa, eds, pp. 382–391. Springer, 2009.
[4] Cohen, W. W. Fast effective rule induction. In ICML, A. Prieditis and S. J. Russell, eds, pp. 115–123. Morgan Kaufmann, 1995.
[5] Demšar, J. Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research, 7, 1–30, 2006.
[6] Džeroski, S., Cestnik, B. and Petrovski, I. Using the m-estimate in rule induction. Journal of Computing and Information Technology, 1, 37–46, 1993.
[7] Fisher, R. A. On the interpretation of $$\chi^{2}$$ from contingency tables, and the calculation of P. Journal of the Royal Statistical Society, 85, 87–94, 1922.
[8] Grzymala-Busse, J. W. and Sudre, G. P. A comparison of two partial matching strategies for classification of unseen cases. In 2006 IEEE International Conference on Granular Computing, GrC 2006, Atlanta, Georgia, USA, May 10–12, 2006, pp. 800–805. IEEE, 2006.
[9] Grzymala-Busse, J. W. and Zou, X. Classification strategies using certain and possible rules. In Rough Sets and Current Trends in Computing, First International Conference, RSCTC'98, Warsaw, Poland, June 22–26, 1998, Proceedings, Vol. 1424 of Lecture Notes in Computer Science, L. Polkowski and A. Skowron, eds, pp. 37–44. Springer, 1998.
[10] Lichman, M. UCI machine learning repository. University of California, Irvine, School of Information and Computer Sciences, 2013. http://archive.ics.uci.edu/ml.
[11] Mangasarian, O. L. and Wolberg, W. H. Cancer diagnosis via linear programming. SIAM News, 23, 1–18, 1990.
[12] Michalski, R. S., Mozetic, I., Hong, J. and Lavrac, N. The multi-purpose incremental learning system AQ15 and its testing application to three medical domains. In Proceedings of the 5th National Conference on Artificial Intelligence, Philadelphia, PA, August 11–15, 1986, Vol. 2: Engineering, T. Kehler and S. J. Rosenschein, eds, pp. 1041–1047. Morgan Kaufmann, 1986.
[13] Stefanowski, J. The rough set based rule induction technique for classification problems. In Proceedings of the 6th European Congress on Intelligent Techniques and Soft Computing, Vol. 1, pp. 109–113, 1998.
[14] Stefanowski, J. Algorytmy indukcji reguł decyzyjnych w odkrywaniu wiedzy [Algorithms of decision rule induction in knowledge discovery]. Wydawnictwo Politechniki Poznańskiej, 2001.
[15] Sulzmann, J.-N. and Fürnkranz, J. An empirical comparison of probability estimation techniques for probabilistic rules. In Proceedings of the 12th International Conference on Discovery Science (DS-09), J. Gama, V. Santos Costa, A. Jorge and P. B. Brazdil, eds, pp. 317–331. Springer-Verlag, 2009.
[16] Yang, Y. and Liu, X. A re-examination of text categorization methods. In Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '99, pp. 42–49. ACM, 1999.
[17] Yeh, I.-C., Yang, K.-J. and Ting, T.-M. Knowledge discovery on RFM model using Bernoulli sequence. Expert Systems with Applications, 36, 5866–5871, 2009.
© The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

Logic Journal of the IGPL, Oxford University Press. Published: February 1, 2018.
