Close Menu
    Trending
    • Before You Start Day Trading, Know These Stages
    • How generative AI could help make construction sites safer
    • PCA and SVD: The Dynamic Duo of Dimensionality Reduction | by Arushi Gupta | Jul, 2025
    • 5 Ways Artificial Intelligence Can Support SMB Growth at a Time of Economic Uncertainty in Industries
    • Microsoft Says Its AI Diagnoses Patients Better Than Doctors
    • From Reporting to Reasoning: How AI Is Rewriting the Rules of Data App Development
    • Can AI Replace Doctors? How Technology Is Shaping Healthcare – Healthcare Info
    • Singapore police can now seize bank accounts to stop scams
    AIBS News
    • Home
    • Artificial Intelligence
    • Machine Learning
    • AI Technology
    • Data Science
    • More
      • Technology
      • Business
    AIBS News
    Home»Machine Learning»Day 2 — Can Tiny Language Models Power Real-World Apps? | by Shourabhpandey | Apr, 2025
    Machine Learning

    Day 2 — Can Tiny Language Models Power Real-World Apps? | by Shourabhpandey | Apr, 2025

    Team_AIBS NewsBy Team_AIBS NewsApril 8, 2025No Comments3 Mins Read
    Share Facebook Twitter Pinterest LinkedIn Tumblr Reddit Telegram Email
    Share
    Facebook Twitter LinkedIn Pinterest Email


    Within the age of GPT-4 and Gemini 1.5, operating an LLM on a smartphone feels virtually… outdated. And but, at the moment I ran TinyLlama-1.1B on my cellphone — and it labored. No cloud. No GPU. Simply an on-device neural community producing considerate responses in actual time.

    This submit isn’t nearly what I did — it’s about why that issues.

    Most individuals work together with massive language fashions by way of APIs — OpenAI, Google, Anthropic — which cover the heavy lifting behind paywalls and server farms.

    However counting on cloud APIs creates a couple of key limitations:

    • Privateness: Each immediate is shipped to a distant server.
    • Latency: Responses rely on community circumstances.
    • Price: API calls add up quick in manufacturing.
    • Dependence: Your app turns into tethered to exterior suppliers.

    That’s the place native LLMs enter the scene — tiny, quantized fashions you may run immediately in your cellphone or laptop computer utilizing frameworks like GGUF, llama.cpp, and MLC.

    I downloaded an app referred to as PocketPal AI from the Play Retailer. It helps GGUF-format fashions and makes use of GGML below the hood to run them on-device.

    • Parameters: ~1.1 billion
    • Quantized dimension: ~500MB (q4_k_m)
    • Context size: 2048 tokens
    • Tokenizer: ChatML-compatible
    • {Hardware}: Mid-range Android cellphone (Snapdragon 778G, 8GB RAM)

    I gave it a easy check immediate:

    “Summarize this concept: an Android app that helps customers plan their day and observe life occasions like a second mind.”

    It responded with:

    “A private assistant app that helps customers set up duties, document reminiscences, and enhance self-awareness.”

    Not groundbreaking, however coherent, on-topic, and quick — round 1.2 tokens/sec on-device. That’s sufficient for journaling, word summarization, and even immediate rephrasing — all with out hitting an API.

    FeatureTinyLlama 1.1BPhi-2Gemma 2BGemini NanoOn-device readyYes (GGUF)YesYesYes (Android solely)Quant dimension (this fall)~500MB~1.2GB~1.5GBOEM-onlyContext length204820488192UnknownLicenseApache 2.0MITApache 2.0Proprietary

    TinyLlama shines in minimal reminiscence footprint, open weights, and pace on lower-end telephones. Nevertheless, it lacks reasoning depth and typically repeats or stalls on complicated prompts — not ideally suited for chatbot use, however nice for light-weight duties.

    This one check gave me three insights:

    1. Native-first is viable for actual apps.
      For journal apps, planners, or immediate engines — you may ship on-device AI with no exterior value.
    2. Mannequin dimension isn’t every part.
      TinyLlama carried out higher than anticipated. It proves a well-trained small mannequin > an enormous mannequin used poorly.
    3. That is the start.
      If fashions like TinyLlama are usable now, think about what we’ll get in 6 months — with MLC, Steel backend, or Google’s AICore pushing additional.

    Tomorrow I’ll begin constructing the app shell in Kotlin — no ML but, simply organising the construction. Ultimately, TinyLlama (or the same mannequin) will energy options like:

    • Journaling assistant
    • Objective-based suggestions
    • Reminiscence recall and semantic search
    • Summarization and perception era

    However at the moment proved that even a solo dev, on a funds, can construct clever instruments that don’t rely on the cloud.



    Source link

    Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
    Previous ArticleHow AI is Transforming DevOps in Software Development
    Next Article Why You’re Using Marketing Agencies and Freelancers Wrong
    Team_AIBS News
    • Website

    Related Posts

    Machine Learning

    PCA and SVD: The Dynamic Duo of Dimensionality Reduction | by Arushi Gupta | Jul, 2025

    July 2, 2025
    Machine Learning

    Can AI Replace Doctors? How Technology Is Shaping Healthcare – Healthcare Info

    July 2, 2025
    Machine Learning

    Is Your AI Whispering Secrets? How Scientists Are Teaching Chatbots to Forget Dangerous Tricks | by Andreas Maier | Jul, 2025

    July 2, 2025
    Add A Comment
    Leave A Reply Cancel Reply

    Top Posts

    Before You Start Day Trading, Know These Stages

    July 2, 2025

    I Tried Buying a Car Through Amazon: Here Are the Pros, Cons

    December 10, 2024

    Amazon and eBay to pay ‘fair share’ for e-waste recycling

    December 10, 2024

    Artificial Intelligence Concerns & Predictions For 2025

    December 10, 2024

    Barbara Corcoran: Entrepreneurs Must ‘Embrace Change’

    December 10, 2024
    Categories
    • AI Technology
    • Artificial Intelligence
    • Business
    • Data Science
    • Machine Learning
    • Technology
    Most Popular

    5 free time-saving Windows apps every PC should have

    February 17, 2025

    The Concepts Data Professionals Should Know in 2025: Part 1 | by Sarah Lea | Jan, 2025

    January 19, 2025

    Is Fortnite Apple Blocked From the Apple App Store?

    May 17, 2025
    Our Picks

    Before You Start Day Trading, Know These Stages

    July 2, 2025

    How generative AI could help make construction sites safer

    July 2, 2025

    PCA and SVD: The Dynamic Duo of Dimensionality Reduction | by Arushi Gupta | Jul, 2025

    July 2, 2025
    Categories
    • AI Technology
    • Artificial Intelligence
    • Business
    • Data Science
    • Machine Learning
    • Technology
    • Privacy Policy
    • Disclaimer
    • Terms and Conditions
    • About us
    • Contact us
    Copyright © 2024 Aibsnews.comAll Rights Reserved.

    Type above and press Enter to search. Press Esc to cancel.