    Building an Android Smart Gallery App for Image Organization & Search: Transitioning from Classification to Embedding Models | by D41 | Mar, 2025

    By Team_AIBS News | March 25, 2025 | 4 Mins Read


    SmartScan app banner

    In my previous approach for SmartScan, I designed it primarily for image organization using a classification model. While the app functioned as intended, the rigid nature of a classification model eventually became a limitation. That approach required users to train the model before it could categorize their images accurately, which made it less flexible when dealing with the diversity of user images. In contrast, embedding models generate feature vectors, allowing us to compute cosine similarity between images and category representations, an approach that adapts far more gracefully to diverse inputs. Although frameworks like ONNX and LiteRT support on-device training, implementing it added a layer of complexity that could negatively affect the usability of the app.

    By transitioning to an embedding-based approach, the app not only improved its image organization capabilities but also gained a powerful text-to-image search feature, enabling users to find images using natural language queries.

    Initially, I used CLIP embedding models (image and text) for a zero-shot classification approach, computing the cosine similarity between the text embeddings of folder names and the embeddings of new images. While this allowed for dynamic categorization, the method was not always accurate because a folder name alone didn't capture the full variability of its contents. For example, during testing with Twitter and Reddit screenshots, the model frequently misclassified Twitter screenshots as Reddit, with the cosine similarity scores for the two folders often being extremely close.
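The zero-shot step described above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the function names are hypothetical, and the 3-dimensional vectors are toy stand-ins for real CLIP embeddings, chosen to show how two folder names can score almost identically.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def zero_shot_classify(image_embedding, folder_text_embeddings):
    """Score an image against each folder-name embedding.

    Returns the best-scoring folder name and the full score table.
    """
    scores = {
        name: cosine_similarity(image_embedding, text_embedding)
        for name, text_embedding in folder_text_embeddings.items()
    }
    best = max(scores, key=scores.get)
    return best, scores

# Toy vectors (assumption): real embeddings would come from the CLIP encoders.
folder_text_embeddings = {
    "Twitter": [0.9, 0.4, 0.1],
    "Reddit": [0.8, 0.5, 0.1],
}
screenshot = [0.85, 0.45, 0.1]

best, scores = zero_shot_classify(screenshot, folder_text_embeddings)
margin = abs(scores["Twitter"] - scores["Reddit"])
print(best, scores, margin)
```

When the margin between the top two scores is this small, the winning folder is decided by very little, which matches the Twitter versus Reddit confusion described above.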

    To improve accuracy, I introduced a few-shot learning strategy using prototype embeddings. With this new approach, each destination folder is represented by a prototype embedding: the average of all image embeddings within that folder. New images are then compared to these prototype embeddings, leading to much better matching accuracy.
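A minimal sketch of the prototype idea, assuming embeddings are plain lists of floats. The helper names and the optional `threshold` parameter are hypothetical additions for illustration and are not taken from the app.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mean_embedding(embeddings):
    """Element-wise average of a non-empty list of equal-length vectors."""
    count = len(embeddings)
    return [sum(column) / count for column in zip(*embeddings)]

def build_prototypes(folders):
    """Map each folder name to the average of its image embeddings."""
    return {name: mean_embedding(embs) for name, embs in folders.items()}

def classify(image_embedding, prototypes, threshold=0.0):
    """Return the folder with the most similar prototype.

    Returns None when the best score is below `threshold`
    (a hypothetical cut-off for leaving an image unsorted).
    """
    scores = {
        name: cosine_similarity(image_embedding, proto)
        for name, proto in prototypes.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None

# Toy 2-dimensional embeddings (assumption) for images already in each folder.
folders = {
    "Twitter": [[1.0, 0.0], [0.8, 0.2]],
    "Reddit": [[0.0, 1.0], [0.2, 0.8]],
}
prototypes = build_prototypes(folders)
print(classify([0.7, 0.3], prototypes))
```

Because a prototype is just an average, adding images to a folder only requires recomputing that folder's mean; no model is retrained.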

    Another key reason for switching to an embedding model was its suitability for implementing a text-to-image search feature, which has now been added. Multimodal embedding models such as CLIP map images and text into a shared feature space, so a text query can be compared directly with image embeddings, making search seamless and intuitive for users looking through their gallery with textual queries.
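Text-to-image search over a shared embedding space can be sketched as a simple ranking by cosine similarity. This is an illustrative sketch under assumptions: the `search` function, the in-memory `image_index` dictionary, and the toy vectors are hypothetical, and the query vector stands in for the output of the CLIP text encoder.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_embedding, image_index, top_k=3):
    """Return up to `top_k` image paths, most similar to the query first."""
    ranked = sorted(
        image_index.items(),
        key=lambda item: cosine_similarity(query_embedding, item[1]),
        reverse=True,
    )
    return [path for path, _ in ranked[:top_k]]

# Toy index (assumption): image path -> precomputed image embedding.
image_index = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "receipt.png": [0.0, 0.2, 0.9],
    "dog.jpg": [0.1, 0.9, 0.1],
}
# Stand-in for the text embedding of a query such as "sunny beach".
query_embedding = [0.8, 0.2, 0.1]

results = search(query_embedding, image_index, top_k=2)
print(results)
```

A linear scan like this is fine for a sketch; how the app stores and scans its embeddings is not described in the article.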

    Overall, using the embedding approach for classification offers several significant advantages:

    • Flexibility: Embedding models easily adapt to various user-defined categories without the constraints of discrete labels.
    • Dynamic User Control: Users can seamlessly add or modify categories without retraining a complex classification model.
    • Enhanced Scalability: The prototype embedding strategy scales gracefully as new image types and categories are added.

    To enable efficient use of the CLIP models on user devices, I carried out the following steps:

    1. Conversion to ONNX: I converted the CLIP visual and text encoder models to the ONNX format using ONNX Runtime, ensuring compatibility with mobile environments.
    2. Model Quantization: Both models were quantized, reducing their sizes by roughly 4x, with the image encoder model reduced from 351.6MB to 95.6MB and the text encoder model reduced from 254.2MB to 64.4MB. This quantization not only minimizes the APK size but also optimizes inference performance on mobile devices.
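As a quick check of the "roughly 4x" figure, the reduction factors can be computed from the sizes reported above. The variable and function names here are only for illustration.

```python
def compression_ratio(original_mb, quantized_mb):
    """How many times smaller the quantized model is than the original."""
    return original_mb / quantized_mb

# Sizes in MB as reported in the article: (before, after) quantization.
sizes_mb = {
    "image encoder": (351.6, 95.6),
    "text encoder": (254.2, 64.4),
}

ratios = {
    name: compression_ratio(before, after)
    for name, (before, after) in sizes_mb.items()
}
total_saved_mb = sum(before - after for before, after in sizes_mb.values())

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x smaller")
print(f"total saved: {total_saved_mb:.1f} MB")
```

The image encoder comes out at about 3.7x smaller and the text encoder at about 3.9x, for a combined saving of about 445.8MB.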

    Switching to an embedding model has significantly enhanced the app's flexibility and overall user experience. The initial zero-shot classification approach using embeddings served as a useful stepping stone. However, by transitioning to the few-shot learning approach using prototype embeddings, the categorization process became far more accurate and robust. Moreover, the embedding approach now fully supports the built-in text-to-image search feature, providing a natural and powerful way to explore image collections through simple text queries.

    Reflecting on my shift from zero-shot classification to few-shot learning, I couldn't help but consider the debate on AI replacing software developers. Not to toot my own horn, but this exemplifies why that won't happen entirely. While AI will dominate code generation, the complexities of system design, debugging, and problem-solving will always need human insight.

    For those interested in exploring the app further, it's open source. You can download it and check out the GitHub repository. If you find it useful or interesting, support it by giving it a star! It should also be released on F-Droid sometime this week.


