    AI’s Achilles’ Heel: The Data Quality Dilemma

    By Team_AIBS News | July 20, 2025

    As AI has gained prominence, all the data quality issues we’ve faced historically are still relevant. However, the nontraditional data that AI often uses brings additional complexities.

    AI Data Has Different Quality Needs

    When AI uses traditional structured data, all the same data cleansing processes and protocols that have been developed over the years can be used as-is. To the extent an organization already has confidence in its traditional data sources, the use of AI shouldn’t require any special data quality work.

    The catch, however, is that AI often uses nontraditional data that can’t be cleansed in the same way as traditional structured data. Think of images, text, video, and audio. When using AI models with this type of data, quality is as important as ever. Unfortunately, the traditional methods for cleansing structured data simply don’t apply. New approaches are required.

    AI’s Different Needs: Input And Training

    First, let’s use an example of image data quality from the input and model training perspective. Typically, each image has been given tags summarizing what it contains. For example, “hot dog” or “sports car” or “cat.” This tagging, usually done by humans, can contain outright errors as well as cases where different people interpret the image differently. How can we identify and handle such situations?

    It isn’t easy! With numerical data, it’s possible to identify bad data via mathematical formulas or business rules. For example, if the price of a candy bar is $125, we can be confident it can’t be right because it’s so far above expectation. Similarly, a person shown as age 200 clearly doesn’t make any sense. There really isn’t an effective way today to mathematically check whether the tags on an image are accurate. The best way to validate a tag is to have a second person assess the image.
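    To make the contrast concrete, here is a minimal sketch of the kind of business-rule check that works for numerical data. The field names and the bounds (a candy bar under $10, an age of at most 120) are illustrative assumptions, not values from the article.

```python
from typing import Callable, Dict, List

# Hypothetical business rules. The field names and bounds are illustrative
# assumptions chosen to mirror the candy bar and age examples.
RULES: Dict[str, Callable[[float], bool]] = {
    "candy_bar_price_usd": lambda v: 0 < v <= 10,
    "person_age_years": lambda v: 0 <= v <= 120,
}


def find_violations(record: Dict[str, float]) -> List[str]:
    """Return the names of the fields whose values break a rule."""
    return [
        field
        for field, is_valid in RULES.items()
        if field in record and not is_valid(record[field])
    ]


records = [
    {"candy_bar_price_usd": 1.50, "person_age_years": 34},
    {"candy_bar_price_usd": 125.00, "person_age_years": 34},
    {"candy_bar_price_usd": 2.00, "person_age_years": 200},
]
violations = [find_violations(r) for r in records]
```

    Nothing comparable exists for an image tag: there is no range of plausible values to test “cat” against without looking at the image itself.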

    An alternative is to develop a process that uses other AI models to scan the image and see if the tags applied appear to be correct. In other words, we can use existing image models to help validate the data being fed into future models. While there is potential for some circular logic in doing this, models are becoming strong enough that it shouldn’t be a problem pragmatically.
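    One way such a cross-check could look is sketched below. It assumes a predictor that returns a confidence per label for an image; `fake_predict`, the image identifiers, the scores, and the 0.5 confidence cutoff are all hypothetical stand-ins for a real image model and its tuning.

```python
from typing import Callable, Dict, List, Tuple

# A predictor maps an image identifier to {label: confidence}.
Predictor = Callable[[str], Dict[str, float]]


def flag_suspect_tags(
    labeled: List[Tuple[str, str]],
    predict: Predictor,
    min_confidence: float = 0.5,  # assumed cutoff, to be tuned in practice
) -> List[Tuple[str, str, str]]:
    """Return (image_id, human_tag, model_tag) where the model confidently disagrees."""
    suspects = []
    for image_id, human_tag in labeled:
        scores = predict(image_id)
        if not scores:
            continue  # the model has no opinion on this image
        model_tag = max(scores, key=scores.get)
        if model_tag != human_tag and scores[model_tag] >= min_confidence:
            suspects.append((image_id, human_tag, model_tag))
    return suspects


# Hypothetical scores standing in for the output of a real image model.
FAKE_SCORES = {
    "img_001": {"hot dog": 0.93, "cat": 0.04},
    "img_002": {"cat": 0.88, "sports car": 0.07},
    "img_003": {"sports car": 0.41, "cat": 0.39},
}


def fake_predict(image_id: str) -> Dict[str, float]:
    return FAKE_SCORES.get(image_id, {})


labeled = [("img_001", "hot dog"), ("img_002", "sports car"), ("img_003", "cat")]
suspects = flag_suspect_tags(labeled, fake_predict)
```

    Flagged images would go to a person for review rather than being relabeled automatically, which keeps the circularity risk in check.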

    AI’s Different Needs: Output And Scoring

    Next, let’s use an example of image data quality from the model output and scoring perspective. Once we have an image model that we have confidence in, we feed the model new images so that it can assess them. For instance, does the image contain a hot dog, or a sports car, or a cat? How can we assess if an image provided for assessment is “clean enough” for the model? What if the image is blurry or pixelated or otherwise not clear? Is there a way to “clean” the image?

    The confidence we can have in what an AI model tells us is in an image depends directly on how clean the image is. Take a heavily blurred photo: how do we know if it is a blurred view of trees or something else entirely? Even for humans, there is subjectivity in this assessment, and there is no clear path to an automated, algorithmic approach for declaring an image “clean enough” or not. Here, manual review might be best. In the absence of that, we can again have an algorithm that scores the clarity of the input image, along with processes to rate the confidence in the descriptions generated by the model’s assessment. Many AI applications do this today, but there is certainly room for improvement.
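    As one illustration of a clarity score, the sketch below uses the variance of a Laplacian filter, a common sharpness heuristic. This is a technique swapped in for illustration, not a method named in the article. The code is pure Python on a small synthetic grayscale grid; the `is_clear_enough` helper and its threshold are assumptions, and a real pipeline would work on decoded images with an image library.

```python
from statistics import pvariance
from typing import List

Image = List[List[float]]  # grayscale rows, pixel values in 0..255


def laplacian_variance(img: Image) -> float:
    """Variance of the 4-neighbour Laplacian over interior pixels.

    Sharp edges produce large responses, so a higher value suggests a
    sharper image. Images smaller than 3x3 have no interior and score 0.0.
    """
    responses = [
        img[r - 1][c] + img[r + 1][c] + img[r][c - 1] + img[r][c + 1]
        - 4 * img[r][c]
        for r in range(1, len(img) - 1)
        for c in range(1, len(img[0]) - 1)
    ]
    return pvariance(responses) if responses else 0.0


def box_blur(img: Image) -> Image:
    """3x3 mean filter with clamped edges; used only to fake a blurry input."""
    h, w = len(img), len(img[0])
    return [
        [
            sum(
                img[min(max(r + dr, 0), h - 1)][min(max(c + dc, 0), w - 1)]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
            ) / 9
            for c in range(w)
        ]
        for r in range(h)
    ]


def is_clear_enough(img: Image, threshold: float) -> bool:
    """Hypothetical gate; the threshold must be calibrated per data source."""
    return laplacian_variance(img) >= threshold


# Synthetic test pattern: a checkerboard of 2x2 blocks, then a blurred copy.
sharp = [
    [255.0 if (r // 2 + c // 2) % 2 == 0 else 0.0 for c in range(16)]
    for r in range(16)
]
blurry = box_blur(box_blur(sharp))
```

    A score like this only ranks images by sharpness. Where to draw the “clean enough” line remains the subjective call described above, which is why the threshold is left as a parameter.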

    Rising To The Challenge

    The examples provided illustrate that classic data quality approaches like missing value imputation and outlier detection can’t be applied directly to data such as images or audio. These new data types, which AI is heavily dependent on, will require novel methodologies for assessing quality at both the input and the output end of the models. Given that it took us many years to develop our approaches for traditional data, it should come as no surprise that we have not yet achieved similar standards for the unstructured data that AI uses.
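    For reference, the two classic techniques named above fit in a few lines when the data is a numeric column. The sketch below assumes median imputation and a 1.5 × IQR outlier fence, which are common defaults rather than choices made in the article; the price values are invented.

```python
from statistics import median, quantiles
from typing import List, Optional


def impute_median(values: List[Optional[float]]) -> List[float]:
    """Replace missing entries (None) with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def iqr_outliers(values: List[float], k: float = 1.5) -> List[float]:
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


# Invented candy bar prices with one missing value and one implausible one.
prices = [1.25, 1.50, None, 1.75, 2.00, 1.40, 125.00, 1.60]
filled = impute_median(prices)
outliers = iqr_outliers(filled)
```

    The median is used instead of the mean so that the $125 entry does not distort the filled-in value. Neither function has an obvious counterpart for a grid of pixels or an audio waveform, which is the gap the article describes.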

    Until these standards arise, it’s important to:

    1. Constantly scan industry blogs, papers, and code repositories to keep tabs on newly developed approaches
    2. Make your data quality processes modular so that it’s easy to alter or add procedures that make use of the latest advances
    3. Be diligent in studying known errors in order to identify whether patterns exist in where your cleansing processes and models perform better and worse
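    The second and third points can be sketched as a small registry of interchangeable checks. The `QualityPipeline` class, the two checks, and the 64-pixel minimum are hypothetical names and values chosen for illustration.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Check = Callable[[Record], List[str]]  # returns a list of issue messages


class QualityPipeline:
    """Holds named checks so each one can be added or swapped independently."""

    def __init__(self) -> None:
        self._checks: Dict[str, Check] = {}

    def register(self, name: str, check: Check) -> None:
        """Add a check, or replace the one already registered under this name."""
        self._checks[name] = check

    def run(self, record: Record) -> Dict[str, List[str]]:
        """Run every check and return only those that reported issues."""
        report: Dict[str, List[str]] = {}
        for name, check in self._checks.items():
            issues = check(record)
            if issues:
                report[name] = issues
        return report


def has_tag(record: Record) -> List[str]:
    return [] if record.get("tag") else ["missing tag"]


def min_resolution(record: Record) -> List[str]:
    # The 64-pixel minimum is an assumed value for illustration.
    w, h = record.get("width", 0), record.get("height", 0)
    return [] if w >= 64 and h >= 64 else [f"resolution {w}x{h} below 64x64"]


pipeline = QualityPipeline()
pipeline.register("has_tag", has_tag)
pipeline.register("min_resolution", min_resolution)
report = pipeline.run({"tag": "", "width": 32, "height": 128})
```

    Because the report is keyed by check name, logging it over time shows which checks fire most often and on what kind of data, which supports the pattern analysis in the third point.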

    Data quality has always been a thorn in the side of data and analytics practitioners. Not only do the traditional issues remain as AI is deployed, but the different data that AI uses introduces all sorts of novel and difficult data quality challenges to address. Those working in the data quality realm should have job security for some time to come!

    Originally posted in the Analytics Matters newsletter on LinkedIn

    The post AI’s Achilles’ Heel: The Data Quality Dilemma appeared first on Datafloq.


