NEO Share — Sharing The Latest Tech News
AI Inclusion / Selection Bias Challenge

January 27, 2021 by systems

Frank Kienle
Photo by Michael Dziedzic on Unsplash

Intelligence is the ability to learn from experience, solve problems, and use knowledge to adapt to new situations (David G. Myers, Psychology, 12th Edition). I refer to artificial intelligence (AI) systems as a collection of advanced technologies that allow machines to sense, comprehend, act, and learn. Machine learning and statistical models often sit at the heart of these AI or data-driven systems.

I repeatedly encounter three major pitfalls when designing statistical / machine learning models for data-driven systems:

  • not knowing the business value and the definition of "good"
  • selecting wrong or biased information
  • designing models that are too complex and costly to maintain

In the following, I will only focus on the selection bias problem.

Machine learning systems, like humans, learn through iteration over data and information. The quality, amount, preparation, and selection of data are critical to a machine learning solution's success.

One famous statement in machine learning is "garbage in, garbage out." Of course, I fully agree with this statement. However, it is often unclear what counts as garbage.

There are obvious data quality issues, such as missing values or outliers, that must be handled even before we can judge the suitability of the data. Beyond that, we have to select a data set that is representative of our business application.
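As a minimal sketch of such basic quality checks, here is what screening a single column for missing values and outliers might look like in pandas. The dataset, column name, and IQR fence factor are all invented for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical toy column with one missing value and one obvious outlier.
df = pd.DataFrame({"age": [34, 41, np.nan, 29, 38, 240]})

# Missing data: count NaNs before any modeling.
missing = int(df["age"].isna().sum())

# Outliers: a robust interquartile-range (IQR) screen.
vals = df["age"].dropna()
q1, q3 = vals.quantile(0.25), vals.quantile(0.75)
iqr = q3 - q1
fence_low, fence_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = vals[(vals < fence_low) | (vals > fence_high)]

print(f"missing values: {missing}")               # 1
print(f"outlier candidates: {outliers.tolist()}")  # [240.0]
```

Checks like these only address obvious garbage; they say nothing yet about whether the data is representative.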

Any AI engine, like any statistic, is computed from the data it has seen. Every underlying dataset, however, is the product of human decisions. Human biases enter during the selection and curation of data, and they show up in an AI system's outputs.

The video summarizes an AI inclusion problem: it shows the biased output for the search term 'family'.

From a societal and ethical perspective, this is a significant problem, often summarized as the AI inclusion problem. Note that under the term AI inclusion, many different aspects are typically discussed: development, social impact, policy implications, and legal issues concerning AI systems; see, e.g., aiandinclusion.

Technically speaking, what happens in this video is a selection bias problem.

Selection bias occurs when the samples used to produce the model are not fully representative of the cases the model will encounter in the future. Selection bias arises not only in AI systems; it lies at the heart of any statistical evaluation.
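Selection bias can be made concrete with a small simulation. The population, subgroups, and numbers below are invented for illustration: when the sample is drawn from only one subgroup, even a simple statistic such as the mean systematically misses the true population value, and a model trained on that sample would inherit the same distortion:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy population made of two subgroups with different distributions.
group_a = rng.normal(loc=170, scale=6, size=50_000)
group_b = rng.normal(loc=158, scale=6, size=50_000)
population = np.concatenate([group_a, group_b])
true_mean = population.mean()  # about 164

# Representative sample: drawn uniformly from the whole population.
representative = rng.choice(population, size=1_000, replace=False)

# Biased sample: drawn only from subgroup A (the selection-bias scenario).
biased = rng.choice(group_a, size=1_000, replace=False)

print(f"true population mean:  {true_mean:.1f}")
print(f"representative sample: {representative.mean():.1f}")  # close to true mean
print(f"biased sample:         {biased.mean():.1f}")          # overestimates
```

No amount of modeling sophistication downstream repairs this: the biased sample simply contains no information about subgroup B.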

In a data-driven world, automated processes and decisions are based on statistical models. Thus, a basic understanding of statistics is mandatory to critically judge results and outputs.

In my current data science lecture class, I notice a strong desire for content featuring deep learning systems.

However, we have to teach the basics first before rushing into modeling. In our data science education programs, we should focus more on statistics, and thus on the critical judgment of AI systems, instead of teaching fancy algorithms.

What do you think: would explicitly teaching the topic of selection bias help tackle the bigger AI inclusion problem?

Filed Under: Artificial Intelligence
