A Methodology for Direct and Indirect Discrimination Prevention in Data Mining

Data mining is an increasingly important technology for extracting useful knowledge hidden in large collections of data. There are, however, negative social perceptions about data mining, among them potential privacy invasion and potential discrimination. The latter consists of unfairly treating people on the basis of their belonging to a specific group. Automated data collection and data mining techniques such as classification rule mining have paved the way to making automated decisions, like loan granting/denial, insurance premium computation, etc. If the training data sets are biased with respect to discriminatory (sensitive) attributes like gender, race, religion, etc., discriminatory decisions may ensue. For this reason, antidiscrimination techniques including discrimination discovery and prevention have been introduced in data mining. Discrimination can be either direct or indirect. Direct discrimination occurs when decisions are made based on sensitive attributes. Indirect discrimination occurs when decisions are made based on nonsensitive attributes which are strongly correlated with biased sensitive ones. In this paper, we tackle discrimination prevention in data mining and propose new techniques applicable to direct or indirect discrimination prevention, individually or both at the same time. We discuss how to clean training data sets and outsourced data sets in such a way that direct and/or indirect discriminatory decision rules are converted to legitimate (nondiscriminatory) classification rules. We also propose new metrics to evaluate the utility of the proposed approaches, and we compare these approaches. The experimental evaluations demonstrate that the proposed techniques are effective at removing direct and/or indirect discrimination biases in the original data set while preserving data quality.
IEEE Transactions on Knowledge and Data Engineering, Institute of Electrical and Electronics Engineers
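The distinction the abstract draws between direct and indirect discrimination can be illustrated with a small sketch. The abstract does not say which measures the paper uses, so the code below is only an assumption-laden illustration: it compares the confidence of a classification rule with and without an extra item in its premise, a lift-style ratio of the kind found in the discrimination discovery literature. The toy records, the attribute names (`gender`, `zip`, `city`, `decision`), and the helper names `confidence` and `lift_ratio` are all hypothetical.

```python
from typing import Dict, List

Record = Dict[str, str]

# Toy, made-up data: every record shares the same context (city = A).
records: List[Record] = [
    {"gender": "F", "zip": "1", "city": "A", "decision": "deny"},
    {"gender": "F", "zip": "1", "city": "A", "decision": "deny"},
    {"gender": "F", "zip": "1", "city": "A", "decision": "deny"},
    {"gender": "F", "zip": "2", "city": "A", "decision": "grant"},
    {"gender": "M", "zip": "2", "city": "A", "decision": "grant"},
    {"gender": "M", "zip": "2", "city": "A", "decision": "grant"},
    {"gender": "M", "zip": "2", "city": "A", "decision": "deny"},
    {"gender": "M", "zip": "1", "city": "A", "decision": "grant"},
]


def matches(record: Record, items: Dict[str, str]) -> bool:
    """True if the record carries every attribute=value pair in `items`."""
    return all(record.get(key) == value for key, value in items.items())


def confidence(data: List[Record], premise: Dict[str, str],
               outcome: Dict[str, str]) -> float:
    """Fraction of records matching `premise` that also match `outcome`."""
    covered = [r for r in data if matches(r, premise)]
    if not covered:
        return 0.0
    return sum(matches(r, outcome) for r in covered) / len(covered)


def lift_ratio(data: List[Record], context: Dict[str, str],
               extra: Dict[str, str], outcome: Dict[str, str]) -> float:
    """Confidence of (context + extra -> outcome) over (context -> outcome)."""
    base = confidence(data, context, outcome)
    if base == 0.0:
        raise ValueError("baseline rule never fires; ratio is undefined")
    return confidence(data, {**context, **extra}, outcome) / base


context = {"city": "A"}
deny = {"decision": "deny"}

# Direct: the sensitive attribute itself appears in the rule premise.
direct = lift_ratio(records, context, {"gender": "F"}, deny)

# Indirect: a nonsensitive attribute (zip) that is correlated with the
# sensitive one stands in for it in the rule premise.
indirect = lift_ratio(records, context, {"zip": "1"}, deny)
proxy_strength = confidence(records, {"zip": "1"}, {"gender": "F"})

print(direct, indirect, proxy_strength)
```

In this toy data both ratios exceed 1, meaning that adding either the sensitive item or its correlated proxy to the premise raises the denial rate relative to the context alone. What threshold would count as discriminatory is a policy choice that the sketch does not address.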