OPPORTUNITIES FOR ETHICAL DECISION MAKING: A CASE STUDY IN K-NN
University of Maryland, Baltimore County (UNITED STATES)
About this paper:
Conference name: 17th International Technology, Education and Development Conference
Dates: 6-8 March, 2023
Location: Valencia, Spain
Abstract:
Machine Learning (ML) algorithms create a seamless process of modeling results from training sets. Across the ML algorithm and data life cycle, many decisions made during model design can affect the outcomes. One such decision point is the design of the training sets. Algorithms rely on training sets, and class imbalance in those sets, if not addressed, can lead to ethical impacts on the minority class.
This paper demonstrates how decisions in algorithmic model design can affect the outcomes and suggests some mitigating steps. We detail an approach to exploring the K-Nearest Neighbors (K-NN) algorithm, a fundamental algorithm accessible to all AI practitioners, with various parameters that can affect outcomes, illustrating ethical considerations in algorithmic decision making. While our illustrations are based on pseudo-randomly divided minority and majority classes in an example data set, we demonstrate that there is a need for such data sets so that practitioners can understand the pitfalls of failing to probe the design of an ML model. Our results demonstrate how acceptable overall accuracy rates can hide unacceptable rates for minority class labels when the minority class has different characteristics from those of the majority class. This presents an ethical issue for underrepresented minority groups when models are applied to large populations without an understanding of the class imbalance in the training sets.
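The effect described above, where overall accuracy hides a poor minority-class rate, can be sketched with a small, self-contained example. This is not the notebook's code: the 1-D toy data, the labels "majority" and "minority", and the helper names `knn_predict` and `accuracy_report` are all hypothetical, chosen only to make the effect visible in a few lines.

```python
from collections import Counter


def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x (1-D features)."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy_report(train, test, k):
    """Return overall accuracy and the accuracy for each true class label."""
    correct = Counter()
    total = Counter()
    for x, label in test:
        total[label] += 1
        if knn_predict(train, x, k) == label:
            correct[label] += 1
    overall = sum(correct.values()) / sum(total.values())
    per_class = {label: correct[label] / total[label] for label in total}
    return overall, per_class


# Hypothetical toy data (an assumption, not the paper's data set):
# 20 majority training points, but only 2 minority training points.
train = [(float(i), "majority") for i in range(20)]
train += [(20.5, "minority"), (21.5, "minority")]

# 18 majority test points and 2 minority test points.
test = [(i + 0.5, "majority") for i in range(18)]
test += [(20.6, "minority"), (21.4, "minority")]

# Compare several values of k, reporting overall and per-class accuracy.
for k in (1, 3, 5):
    overall, per_class = accuracy_report(train, test, k)
    print(f"k={k} overall={overall:.2f} per_class={per_class}")
```

In this toy setup, k = 5 exceeds twice the number of minority training points, so every minority test point is outvoted by majority neighbors: overall accuracy is 0.90 while minority accuracy is 0.00. With k = 1 or k = 3, both classes are classified correctly. Reporting per-class rates alongside the overall rate is what makes the difference visible.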
We created a hands-on Jupyter Notebook that practitioners and students can explore to understand and illustrate the impact of ethical decision making throughout the algorithmic process. It is important to note that in this paper we are not making any claims about the data sets used or about any particular race or gender, but rather making a case for ethical decision making in algorithmic design.
Keywords:
Data science, ethics, algorithms.