
    Beyond Causal Language Modeling: A Deep Dive into “Not All Tokens Are… | by Masatake Hirono | Jan, 2025

    By Team_AIBS News | January 27, 2025


    Contributions of This Work

    This paper offers both an illuminating analysis of token-level training dynamics and a new technique called Selective Language Modeling (SLM):

    Token Loss Analysis:
    They show that a majority of tokens contribute little beyond the initial training phase, while a small subset remains persistently high-loss.
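This kind of analysis starts from per-token losses: scoring each next-token prediction separately instead of averaging them into one number. A minimal NumPy sketch (function name and shapes are illustrative, not from the paper):

```python
import numpy as np

def per_token_loss(logits, labels):
    """Per-token cross-entropy for next-token prediction.

    logits: (seq_len, vocab) scores where row t predicts token t+1.
    labels: (seq_len,) token ids actually observed at position t+1.
    Returns one loss per token rather than a single average, so each
    token's loss trajectory can be tracked across checkpoints.
    """
    # Numerically stable log-softmax over the vocabulary axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels]
```

Recording these vectors at successive checkpoints is enough to see the pattern the authors describe: most entries drop quickly, while a few stay high throughout training.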

    SLM for Focused Learning:
    By leveraging a reference model to gauge how “useful” each token is, they manage to reduce training tokens drastically without sacrificing quality, in many cases even boosting downstream performance.
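The selection step can be sketched as: score each token by its excess loss (training-model loss minus reference-model loss) and keep only the top fraction in the objective. A simplified NumPy sketch, where the keep ratio and tie handling are illustrative choices rather than the paper's exact recipe:

```python
import numpy as np

def slm_token_mask(train_loss, ref_loss, keep_ratio=0.5):
    """Mask selecting the tokens with the largest excess loss.

    train_loss, ref_loss: per-token losses of equal shape.
    Tokens the reference model already finds easy (low excess loss)
    are dropped from the objective; ties at the threshold are kept.
    """
    excess = train_loss - ref_loss
    k = max(1, int(keep_ratio * excess.size))
    threshold = np.sort(excess.ravel())[-k]   # k-th largest excess loss
    return (excess >= threshold).astype(np.float32)

def selective_loss(train_loss, mask):
    """Mean training loss over the selected tokens only."""
    return float((train_loss * mask).sum() / mask.sum())
```

Because the mask multiplies the loss rather than the data, the forward pass is unchanged; only the tokens that still look hard relative to the reference contribute gradients.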

    Broad Demonstration of Effectiveness:
    SLM works not only on math-specific tasks but also in more general domains, with either a meticulously curated reference dataset or a reference model drawn from the same large corpus.

    Where Might This Go Next?

    SLM opens up a number of potential directions for future research. For example:

    Scaling Up Further:
    Although the paper primarily focuses on models around 1B to 7B parameters, it remains an open question how SLM performs at the 30B, 70B, or 100B+ scale. If the token-level approach generalizes well, the cost savings could be enormous for truly massive LLMs.

    Reference Models via API:
    If you can’t gather curated data, perhaps you could use an API-based language model as your reference. That would make SLM more practical for smaller research teams that lack the resources for selective reference training.

    Reinforcement Learning Extensions:
    Imagine coupling SLM with reinforcement learning. The reference model could act as a “reward model,” and token selection could then be optimized through something akin to policy gradients.
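As a toy illustration of that idea (entirely hypothetical, not from the paper), per-token keep probabilities could be treated as a Bernoulli policy and nudged by REINFORCE, with reward paid only for tokens actually kept:

```python
import numpy as np

def reinforce_step(select_logits, utility, rng, lr=0.1):
    """One REINFORCE update on per-token selection logits.

    select_logits: logits of a Bernoulli 'keep this token' policy.
    utility: per-token reward from the reference model, paid only
             when the token is actually kept in the objective.
    """
    p = 1.0 / (1.0 + np.exp(-select_logits))        # keep probabilities
    keep = (rng.random(p.shape) < p).astype(float)  # sampled keep/drop actions
    reward = keep * utility
    # Gradient of log-prob for a sigmoid-parameterized Bernoulli policy.
    grad = (keep - p) * reward
    return select_logits + lr * grad
```

Run over many steps, this drifts toward keeping tokens the reference model rewards and dropping the rest; a practical version would need variance reduction (e.g. a baseline) and batching.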

    Multiple Reference Models:
    Instead of a single reference model, you could train or gather several, each specializing in a different domain or style. Then, combine their token scores to produce a more robust multi-domain filtering system.
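One minimal way to combine several reference models (the weighted-average rule here is an illustrative assumption) is to average their excess-loss scores before thresholding:

```python
import numpy as np

def combined_excess(train_loss, ref_losses, weights=None):
    """Weighted average of excess-loss scores across reference models.

    ref_losses: list of per-token loss arrays, one per reference model.
    Other aggregations (max, min, learned weights) would slot in here.
    """
    if weights is None:
        weights = [1.0 / len(ref_losses)] * len(ref_losses)
    score = np.zeros_like(train_loss)
    for w, ref in zip(weights, ref_losses):
        score += w * (train_loss - ref)
    return score
```

The resulting score can feed the same top-k masking step as a single reference model, so the rest of the pipeline is unchanged.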

    Alignment and Safety:
    There’s a growing trend toward factoring in alignment or truthfulness. One could train a reference model to give higher scores to well-supported statements and zero out tokens that look factually incorrect or harmful.


