The Ultra Ethernet Consortium, a project under the stewardship of The Linux Foundation, has pledged to fine-tune the existing Ethernet stack to meet the growing network demands of artificial intelligence (AI) and high-performance computing (HPC). The founding members of this alliance read like a who’s who of the technology world: AMD, Arista, Broadcom, Cisco, Eviden, HPE, Intel, Meta, and Microsoft.
The consortium’s core mission is to enhance existing Ethernet technology so that it can seamlessly accommodate the expanding workloads of high-performance computing and AI applications. It aspires to leverage the widespread adoption and versatility of Ethernet to handle a diverse range of workloads while ensuring scalability and cost-effectiveness.
Drilling deeper into the technical objectives, the consortium aims to formalise specifications, APIs, and source code defining protocols and signalling mechanisms. These blueprints will address everything from electrical and optical aspects to the software, storage, security and management constructs needed to support different workloads. The work therefore spans the full spectrum, from link-level mechanisms to end-to-end congestion control and telemetry. The focus, however, is specifically on the requirements of AI, machine learning and high-performance computing environments.
Putting it in layman’s terms, and without wading too deep into networking architecture, the Chair of the Ultra Ethernet Consortium, Dr J Metz, likens the effort to a comprehensive tune-up of Ethernet. Expounding on the essence of the consortium, he explains: “This isn’t about overhauling Ethernet. It’s about tuning Ethernet to improve efficiency for workloads with specific performance requirements. We’re looking at every layer – from the physical all the way through the software layers – to find the best way to improve efficiency and performance at scale.”
Reflecting on the need to get involved in such a project, Justin Hotard, executive vice president and general manager, HPC and AI, at Hewlett Packard Enterprise, notes, “Generative AI workloads will require us to architect our networks for supercomputing scale and performance.” He further underscores the importance of the Ultra Ethernet Consortium in developing an open, scalable and cost-effective Ethernet-based communication stack to support high-performance workloads efficiently.
Supporting this perspective, Jeff McVeigh, corporate vice president and general manager of the Super Compute Group at Intel, points out that the demand for computational and network performance from AI, machine learning and high-performance workloads at scale shows no sign of slowing. He highlights the need for open solutions to meet these requirements rather than relying on proprietary ones.
It’s worth mentioning that the challenges the Ultra Ethernet Consortium seeks to tackle bear a strong resemblance to what Nvidia’s InfiniBand already promises to offer. The consortium, however, seems ready to embrace the competition and plans to start welcoming new members in Q4 of this year.