    Highly Efficient Multi-Node Communication Patterns Using oneCCL

    By Team_AIBS News | August 4, 2025


    Neural networks are the keystone of deep learning algorithms. Training neural networks is compute intensive and therefore needs an efficient distribution of the computation to achieve reasonable training times. This requires processors to communicate during the computation process. Since communication is generally a collective operation (multiple processors participate simultaneously), optimizing the communication primitives and patterns used in collective communications (Allreduce, Reduce-Scatter, and so on) is key to achieving optimal results in deep learning and machine learning workloads.

    This blog introduces the Intel oneAPI Collective Communications Library (oneCCL), which enables optimized implementations of communication patterns, resulting in scalable and efficient training of deep neural networks in a distributed setting.

    What’s oneCCL?

    oneCCL is a flexible, performant library for machine learning and deep learning workloads and is one of several performance libraries that are part of the oneAPI programming model. oneCCL provides a set of optimized implementations of collective communication primitives, exposed through an API that is easy for deep learning framework developers to consume.
    It enables an efficient implementation of several communication operations across multiple nodes, resulting in deep learning optimizations. It allows asynchronous and out-of-order execution of operations and uses multiple cores as required for optimal utilization of the network. It also supports collectives on low-precision data types.

    oneCCL Feature Highlights

    Following are the core features of oneCCL:

    • Built on top of the Intel MPI and libfabrics communication interfaces.
    • Supports various interconnect standards such as Ethernet, Cornelis Networks, and InfiniBand.
    • Provides a collective API which, interoperating with SYCL, efficiently implements collective operations essential to deep learning, such as all-reduce, all-gather, and reduce-scatter.
    • Binds easily with distributed training and deep learning frameworks such as Horovod and PyTorch, respectively.
    • Supports the bfloat16 floating-point format and allows creating custom datatypes of desired sizes. Find all the supported datatypes here.

    oneCCL API Concepts and Parameters

    In this section, you will familiarize yourself with some of the abstract concepts used by oneCCL for optimized collective operations.

    • Rank: An addressable entity in a communication operation.
    • Device: An abstraction of a participating device, such as a CPU or a GPU, involved in the communication operation.
    • Communicator: Defines a group of communicating ranks. Communication operations between homogeneous oneCCL devices are defined by the communicator class.
    • Key-Value Store: An interface that sets up communication between the constituent ranks while grouping them to form a communicator.
    • Stream: An abstraction that captures the execution context for communication operations.
    • Event: An abstraction that captures the synchronization context for communication operations.

    Each oneCCL device corresponds to a rank. Multiple ranks communicate with one another through an interface called a key-value store, and a group of ranks known as a communicator is formed. While carrying out a communication operation among a communicator's ranks, you pass a stream object containing the execution context to the communicator. The communicator returns an event object that tracks the progress of the operation. You can also pass a vector of event objects to the communicator to provide the input dependencies required for the operation.
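
    To make these abstractions concrete, below is a minimal setup sketch modeled on the oneCCL C++ API. It assumes the oneapi/ccl.hpp header and an MPI launcher, which is used here only to start the ranks and exchange the key-value store address (the pattern used in the library's samples); treat it as an illustration rather than a complete application.

        #include "oneapi/ccl.hpp"
        #include <mpi.h>

        int main(int argc, char* argv[]) {
            ccl::init();               // initialize the oneCCL library
            MPI_Init(&argc, &argv);    // MPI is used to exchange the KVS address between ranks

            int size = 0, rank = 0;
            MPI_Comm_size(MPI_COMM_WORLD, &size);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);

            // Rank 0 creates the main key-value store and broadcasts its address
            // so that the remaining ranks can attach to the same store.
            ccl::shared_ptr_class<ccl::kvs> kvs;
            ccl::kvs::address_type main_addr;
            if (rank == 0) {
                kvs = ccl::create_main_kvs();
                main_addr = kvs->get_address();
                MPI_Bcast(main_addr.data(), static_cast<int>(main_addr.size()), MPI_BYTE, 0, MPI_COMM_WORLD);
            } else {
                MPI_Bcast(main_addr.data(), static_cast<int>(main_addr.size()), MPI_BYTE, 0, MPI_COMM_WORLD);
                kvs = ccl::create_kvs(main_addr);
            }

            // The communicator groups the participating ranks for collective calls.
            auto comm = ccl::create_communicator(size, rank, kvs);

            // ... collective operations on comm go here ...

            MPI_Finalize();
            return 0;
        }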

    oneCCL Collective Communication Operations

    oneCCL enables performing several collective communication operations among the devices of a communicator. These operations are 'collective' in the sense that all of the participating ranks must make calls to perform them.

    • Allgatherv: This operation collects data from all the ranks within a communicator into a common buffer. Regardless of the proportion of data contributed by each rank, all of them have identical data in their output buffers at the end of the operation.
    • Allreduce: This operation globally performs a reduction operation (max, sum, multiply, and so on) across all the ranks of a communicator. The resulting value is returned to every rank at the end of the operation (see the sketch after this list).
    • Reduce: This operation also globally performs a reduction across all the ranks of a communicator, like Allreduce, but returns the resulting value only to the root rank.
    • ReduceScatter: This operation also performs a reduction across all the ranks of a communicator, but only a sub-portion of the result is returned to each rank.
    • Alltoallv: With this operation, every rank sends a separate data block to every other rank within the communicator. If the ith rank sends the jth data block of its send buffer, it is received into the ith data block of the jth rank.
    • Broadcast: With this operation, one of the ranks, known as the root of a communicator, distributes data to all other ranks within that communicator.
    • Barrier synchronization: This operation consists of a reduction followed by a broadcast, executed across all the ranks of a communicator. Any rank calling the barrier() method is blocked, and only after all the ranks within the communicator have called it does the operation complete.
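
    As a sketch of what one of these calls looks like in code, the fragment below performs a blocking Allreduce that sums a host buffer of floats across ranks. It assumes the comm and rank variables from the setup sketch above, plus #include <vector>; it illustrates the call shape rather than a complete program.

        // Each rank contributes `count` floats equal to its rank id; after the
        // call, every rank's recv_buf holds the element-wise sum across ranks.
        const size_t count = 1024;
        std::vector<float> send_buf(count, static_cast<float>(rank));
        std::vector<float> recv_buf(count, 0.0f);

        ccl::allreduce(send_buf.data(),       // send buffer
                       recv_buf.data(),       // receive buffer
                       count,                 // number of elements
                       ccl::reduction::sum,   // reduction operation
                       comm)                  // communicator defining the ranks
            .wait();                          // block until the collective completes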

    For each type of supported communication operation, oneCCL specifies operation attribute classes such as allgather_attr, allreduce_attr, reduce_attr, reduce_scatter_attr, alltoallv_attr, broadcast_attr, and barrier_attr. Using these classes, you can customize the behavior of the corresponding communication operation.
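
    As a brief, hedged illustration of that mechanism: an attribute object is created through a factory template and passed as an extra argument to the matching collective call. The to_cache field used here is assumed from the operation attribute identifiers documented by the library.

        // Create an allreduce attribute object and adjust one of its fields.
        auto attr = ccl::create_operation_attr<ccl::allreduce_attr>();
        attr.set<ccl::operation_attr_id::to_cache>(true);   // request caching of this operation

        // The attribute is then passed after the communicator argument:
        // ccl::allreduce(send_buf.data(), recv_buf.data(), count,
        //                ccl::reduction::sum, comm, attr).wait();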

    The event class of oneCCL allows tracking the completion of each communication operation. For instance, while performing the all-reduce operation, the event::wait() method pauses all other operations until the all-reduce finishes. If the event::test() method is used instead, it returns a Boolean value, where True means the all-reduce operation has completed and False means it is still in progress. Thus, you can wait for the completion of an operation in either a blocking or a non-blocking way.
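
    A short sketch of the non-blocking path, under the same assumptions as the Allreduce fragment above: the collective returns a ccl::event, which can be polled with test() instead of blocked on with wait().

        // Launch the collective and keep the returned event instead of waiting on it.
        ccl::event ev = ccl::allreduce(send_buf.data(), recv_buf.data(), count,
                                       ccl::reduction::sum, comm);

        while (!ev.test()) {
            // The all-reduce is still in progress; other work could overlap here.
        }
        // The operation has now completed. Calling ev.wait() instead of polling
        // would have blocked until this point.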

    Error Handling in oneCCL

    When an error occurs, oneCCL passes it up to a point where a function call catches it using the standard C++ error handling technique. The code snippet below shows how a oneCCL error is ultimately caught via the std::exception class of C++.
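
    The snippet is a minimal sketch, assuming the oneapi/ccl.hpp header; the body of the try block stands in for application code issuing oneCCL calls.

        #include "oneapi/ccl.hpp"
        #include <iostream>

        int main() {
            try {
                ccl::init();
                // ... key-value store, communicator, and collective calls go here.
                // Any failure inside the library is reported as an exception
                // derived from ccl::exception.
            } catch (const ccl::exception& e) {
                // oneCCL-specific handler
                std::cerr << "oneCCL error: " << e.what() << '\n';
                return 1;
            } catch (const std::exception& e) {
                // Because ccl::exception inherits from std::exception, a oneCCL
                // error not caught above would still land here.
                std::cerr << "error: " << e.what() << '\n';
                return 1;
            }
            return 0;
        }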

    oneCCL has a hierarchy of exception classes in which ccl::exception is the base class, inherited from the standard C++ std::exception class. It reports an unspecified error. The other exception classes derived from the base class report errors encountered in various scenarios, as follows:

    • ccl::invalid_argument: when arguments given to an operation are unacceptable
    • ccl::host_bad_alloc: when an error occurs while allocating memory on the host
    • ccl::unimplemented: when the requested operation is not implemented
    • ccl::unsupported: when the requested operation is not supported

    Wrap-up

    In this blog, we discussed the oneCCL library as a means of performing scalable and efficient communication operations. Get started with oneCCL today and leverage it to enhance your machine learning workloads in multi-node environments. We also encourage you to explore the full suite of Intel developer tools for building your AI, HPC, and rendering applications, and to learn about the unified, open, standards-based oneAPI programming model that forms their foundation.

    First published location: https://community.intel.com/t5/Blogs/Tech-Innovation/Tools/oneTBB-Concurrent-Container-Class-An-Efficient-Way-To-Scale-Your/post/1467254

    First published date: 04/17/23

    Contributors:

    Nikita Shiledarbaxi, AI Software Marketing Engineer

    Rob Mueller-Albrecht, Product Marketing Manager

    Chandan Damannagari, Director, AI Software program

    Intel Corporation


