
Model Quantization — What is it? | by Sujeeth Kumaravel | Jun 2025

By Team_AIBS News · June 16, 2025 · 2 min read


Model quantization is a technique that reduces the precision of model parameters (such as weights and activations) from high-precision floating-point numbers (FP32, FP16) to lower-precision formats such as 8-bit integers (INT8). This conversion significantly reduces model size, memory footprint, and computational cost, allowing faster inference and deployment on resource-constrained devices.
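As a concrete illustration, here is a minimal NumPy sketch of the idea; the random array is a stand-in for a real weight tensor, and the affine INT8 scheme shown is one simple choice among many:

```python
import numpy as np

# Stand-in for an FP32 weight tensor from a trained model.
weights = np.random.randn(1024, 1024).astype(np.float32)

# Affine quantization to INT8: map [min, max] onto [-128, 127].
w_min, w_max = float(weights.min()), float(weights.max())
scale = (w_max - w_min) / 255.0
zero_point = round(-128 - w_min / scale)

q = np.clip(np.round(weights / scale + zero_point), -128, 127).astype(np.int8)

# Dequantize to approximate the original values at inference time.
w_hat = (q.astype(np.float32) - zero_point) * scale

print(f"FP32 size: {weights.nbytes / 1e6:.1f} MB")   # ~4.2 MB
print(f"INT8 size: {q.nbytes / 1e6:.1f} MB")         # ~1.0 MB
print(f"Max rounding error: {np.abs(weights - w_hat).max():.5f}")
```

The storage drops 4x, at the cost of a small per-value rounding error bounded by roughly half the scale.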

    Right here’s a extra detailed rationalization:

    Why Quantization?

Reduced Model Size:

• Using lower-precision data types lets the model be stored more compactly, requiring less memory.

Faster Inference:

• Operations on lower-precision numbers, particularly integers, are often faster on hardware, leading to quicker inference times.

Reduced Memory Requirements:

• Lower-precision numbers require less memory bandwidth, which is crucial for memory-bound workloads such as large language models (LLMs).

Energy Efficiency:

• Lower-precision computations can also be more energy-efficient.

Deployment on Edge Devices:

• Quantization enables deploying models on resource-constrained devices such as mobile phones and IoT devices.

How Quantization Works

1. Parameter Mapping:

Model parameters (weights and activations) are mapped from their original high-precision floating-point values to a smaller range of lower-precision values, typically integers. The mapping is defined per tensor (or per channel) by a scale and, in asymmetric schemes, a zero-point; a sketch follows.
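In symbols, with scale s and zero-point z, quantization computes q = round(x / s + z) and dequantization recovers the approximation x̂ = s · (q − z). A minimal sketch of that mapping (function names are illustrative):

```python
import numpy as np

def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    # q = round(x / scale + zero_point), clipped to the integer range.
    return np.clip(np.round(x / scale + zero_point), qmin, qmax).astype(np.int8)

def dequantize(q, scale, zero_point):
    # x_hat = (q - zero_point) * scale, an approximation of the original x.
    return (q.astype(np.float32) - zero_point) * scale
```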

2. Post-Training Quantization (PTQ):

In this approach, the model is first trained with high-precision floating-point values and then converted to lower precision after training, without any retraining.
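One widely used PTQ flavor is dynamic quantization, which PyTorch exposes directly. A minimal sketch, where the toy model stands in for a network that has already been trained:

```python
import torch
import torch.nn as nn
from torch.ao.quantization import quantize_dynamic

# Toy FP32 model; assume it has already been trained.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# Post-training dynamic quantization: weights of the listed module types
# are stored as INT8; activations are quantized on the fly at inference.
quantized = quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

print(quantized(torch.randn(1, 128)).shape)  # torch.Size([1, 10])
```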

3. Quantization-Aware Training (QAT):

This method incorporates quantization into the training process using "fake-quantization" modules, which simulate quantization during both the forward and backward passes so the model learns weights that tolerate the reduced precision.
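Frameworks wrap this in dedicated modules (e.g., PyTorch's prepare_qat workflow, which also learns scales from observed statistics), but the core trick can be sketched by hand as a straight-through estimator: the forward pass rounds, while the backward pass pretends rounding is the identity so gradients still flow:

```python
import torch

class FakeQuantSTE(torch.autograd.Function):
    """Hand-rolled fake-quantization with a straight-through estimator."""

    @staticmethod
    def forward(ctx, x, scale, zero_point):
        # Quantize, clamp to the INT8 range, dequantize -- all in FP32,
        # so the rest of the training pipeline is unchanged.
        q = torch.clamp(torch.round(x / scale + zero_point), -128, 127)
        return (q - zero_point) * scale

    @staticmethod
    def backward(ctx, grad_output):
        # Treat rounding as identity so gradients pass straight through.
        return grad_output, None, None

# During QAT, weights/activations pass through this in the forward pass.
w = torch.randn(64, 64, requires_grad=True)
w_q = FakeQuantSTE.apply(w, 0.02, 0.0)
w_q.sum().backward()  # gradients reach w despite the rounding
```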

Types of Quantization

Symmetric vs. Asymmetric:

• Symmetric quantization maps values to a range centered on zero (the zero-point is fixed at 0), while asymmetric quantization maps an arbitrary [min, max] range using a nonzero zero-point, which suits skewed value distributions; both are sketched below.
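A small sketch of the difference, using toy activation values that skew positive:

```python
import numpy as np

x = np.array([-0.4, 0.1, 2.0], dtype=np.float32)  # skewed toward positive

# Symmetric: range centered on zero, zero-point fixed at 0.
scale_sym = float(np.abs(x).max()) / 127.0
q_sym = np.round(x / scale_sym).astype(np.int8)

# Asymmetric: the full [min, max] range is mapped onto [-128, 127]
# via a nonzero zero-point, so no codes are wasted on unused range.
scale_asym = float(x.max() - x.min()) / 255.0
zero_point = round(-128 - float(x.min()) / scale_asym)
q_asym = np.clip(np.round(x / scale_asym + zero_point), -128, 127).astype(np.int8)
```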

Uniform vs. Non-Uniform:

• Uniform quantization spaces its levels evenly across the range, while non-uniform quantization places levels where values cluster, spending more precision where it matters (see the codebook sketch below).
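A toy non-uniform scheme can be sketched as a lookup table (codebook) whose levels are denser near zero, where normally distributed weights concentrate; the codebook values here are purely illustrative:

```python
import numpy as np

x = np.random.randn(1000).astype(np.float32)

# 8-entry (3-bit) codebook, denser near zero where most weights live.
codebook = np.array([-2.0, -1.0, -0.45, -0.15, 0.15, 0.45, 1.0, 2.0],
                    dtype=np.float32)

# Quantize: each value is replaced by the index of its nearest level.
codes = np.abs(x[:, None] - codebook[None, :]).argmin(axis=1).astype(np.uint8)

# Dequantize: look the level back up.
x_hat = codebook[codes]
print(f"Mean absolute error: {np.abs(x - x_hat).mean():.4f}")
```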

FP16, FP8, INT8, INT4:

• These are some of the common lower-precision data types used in quantization.

Benefits of Quantization

• Reduced memory footprint: makes it possible to deploy models on resource-limited devices (see the back-of-envelope sketch after this list).
• Faster inference speed: enables quicker processing of data.
• Improved energy efficiency: reduces power consumption, which is especially important for mobile devices.
• Lower computational cost: can reduce the need for expensive hardware and specialized accelerators.
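A back-of-envelope sketch of the memory footprint for a hypothetical 7-billion-parameter model (weights only, ignoring scales and other overhead):

```python
params = 7e9  # hypothetical 7B-parameter model

for name, bits in [("FP32", 32), ("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{name}: {params * bits / 8 / 1e9:.1f} GB of weights")
# FP32: 28.0 GB, FP16: 14.0 GB, INT8: 7.0 GB, INT4: 3.5 GB
```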

Trade-offs

• Potential accuracy loss: quantization can introduce some accuracy degradation, but it is often manageable and can be mitigated with quantization-aware training.
• Complexity: implementing quantization can require specialized tools and expertise.

Applications

Large Language Models (LLMs):

• Quantization is especially effective for LLMs because of their large size and high computational requirements; a loading sketch follows.
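For example, 8-bit loading of a causal LM is close to a one-liner in the Hugging Face ecosystem. A sketch assuming the transformers, accelerate, and bitsandbytes packages and a CUDA GPU; the checkpoint name is just an example:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Weights are quantized to INT8 as they are loaded onto the GPU.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-1.3b",  # example checkpoint
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
```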

Image Recognition and Object Detection:

• Quantization can be used to improve the performance of these models on edge devices.

Speech Recognition:

• Quantization can reduce the memory and computational cost of speech recognition models.


