    Data Science

    Rafay Launches Serverless Inference Offering

    By Team_AIBS News | May 13, 2025 | 4 Mins Read


    Sunnyvale, CA – May 8, 2025 – Rafay Systems, a cloud-native and AI infrastructure orchestration and management company, announced general availability of the company’s Serverless Inference offering, a token-metered API for running open-source and privately trained or tuned LLMs.

    The company said many NVIDIA Cloud Providers (NCPs) and GPU Clouds are already leveraging the Rafay Platform to deliver a multi-tenant, Platform-as-a-Service experience to their customers, complete with self-service consumption of compute and AI applications. These NCPs and GPU Clouds can now deliver Serverless Inference as a turnkey service at no additional cost, enabling their customers to build and scale AI applications quickly without having to deal with the cost and complexity of building automation, governance, and controls for GPU-based infrastructure.

    The global AI inference market is expected to grow to $106 billion in 2025 and to $254 billion by 2030. Rafay’s Serverless Inference empowers GPU Cloud Providers (GPU Clouds) and NCPs to tap into the booming GenAI market by eliminating key adoption barriers: automated provisioning and segmentation of complex infrastructure, developer self-service, rapid launch of new GenAI models as a service, billing data for on-demand usage, and more.

    “Having spent the past year experimenting with GenAI, many enterprises are now focused on building agentic AI applications that augment and enhance their business offerings. The ability to rapidly consume GenAI models through inference endpoints is key to faster development of GenAI capabilities. That is where Rafay’s NCP and GPU Cloud partners have a material advantage,” said Haseeb Budhani, CEO and co-founder of Rafay Systems.

    “With our new Serverless Inference offering, available at no cost to NCPs and GPU Clouds, our customers and partners can now deliver an Amazon Bedrock-like service to their customers, enabling access to the latest GenAI models in a scalable, secure, and cost-effective manner. Developers and enterprises can now integrate GenAI workflows into their applications in minutes, not months, without the pain of infrastructure management. This offering advances our company’s vision to help NCPs and GPU Clouds evolve from operating GPU-as-a-Service businesses to AI-as-a-Service businesses.”
    By offering Serverless Inference as an on-demand capability to downstream customers, Rafay helps NCPs and GPU Clouds address a key gap in the market. Rafay’s Serverless Inference offering provides the following key capabilities to NCPs and GPU Clouds:

    • Seamless developer integration: OpenAI-compatible APIs require zero code migration for existing applications, with secure RESTful and streaming-ready endpoints that dramatically accelerate time-to-value for end customers (a minimal sketch follows this list).

    • Intelligent infrastructure management: Auto-scaling GPU nodes with right-sized model allocation dynamically optimize resources across multi-tenant and dedicated isolation options, eliminating over-provisioning while maintaining strict performance SLAs.

    • Built-in metering and billing: Token-based and time-based usage tracking for both input and output provides granular consumption analytics, while integration with existing billing platforms through comprehensive metering APIs enables transparent, consumption-based pricing models.

    • Enterprise-grade security and governance: Comprehensive security through HTTPS-only API endpoints, rotating bearer token authentication, detailed access logging, and configurable token quotas per team, business unit, or application fulfills enterprise compliance requirements.

    • Observability, storage, and performance monitoring: End-to-end visibility with logs and metrics archived in the provider’s own storage namespace, support for backends such as MinIO (a high-performance, AWS S3-compatible object store) and Weka (a high-performance, AI-native data platform), and centralized credential management ensure full infrastructure and model performance transparency.
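
    To ground the developer-integration and metering points above, here is a minimal sketch of what calling an OpenAI-compatible inference endpoint looks like from Python. The base URL, model name, and token are hypothetical placeholders rather than documented Rafay values; the point is that the standard openai client works unchanged once it is pointed at any compatible endpoint.

        # Minimal sketch: calling a hypothetical OpenAI-compatible serverless
        # inference endpoint with the standard openai Python client (openai >= 1.0).
        # The base_url, api_key, and model name are illustrative placeholders.
        from openai import OpenAI

        client = OpenAI(
            base_url="https://inference.example-gpu-cloud.com/v1",  # provider endpoint (hypothetical)
            api_key="YOUR_BEARER_TOKEN",                            # rotating bearer token issued by the provider
        )

        # Standard (non-streaming) chat completion against an open-source model.
        response = client.chat.completions.create(
            model="llama-3-8b-instruct",  # example open-source model name
            messages=[{"role": "user", "content": "Summarize serverless inference in one sentence."}],
        )
        print(response.choices[0].message.content)

        # The response's usage field reports input and output token counts,
        # the same quantities a token-metered billing system would record.
        print(response.usage.prompt_tokens, response.usage.completion_tokens)

        # Streaming variant: tokens arrive incrementally over the same endpoint,
        # which is what "streaming-ready endpoints" refers to.
        stream = client.chat.completions.create(
            model="llama-3-8b-instruct",
            messages=[{"role": "user", "content": "List three benefits of token-metered APIs."}],
            stream=True,
        )
        for chunk in stream:
            delta = chunk.choices[0].delta.content
            if delta:
                print(delta, end="", flush=True)

    Because the wire format matches OpenAI’s, an existing application typically needs only a new base URL and token, which is the “zero code migration” claim above.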

    Rafay’s Serverless Inference offering is available today to all customers and partners using the Rafay Platform to deliver multi-tenant, GPU- and CPU-based infrastructure. The company is also set to roll out fine-tuning capabilities shortly. These new additions are designed to help NCPs and GPU Clouds rapidly deliver high-margin, production-ready AI services while removing complexity.





