Startup Profile: Enfabrica's Got a Hot AI Switch
Everyone has their eye on AI – especially the training and inferencing tasks associated with creating the large language models (LLMs) that form the basis of applications. These workloads require massive amounts of powerful compute capacity in accelerated systems containing multiple graphics processing units (GPUs).
What if these systems could be made even faster and more efficient?
That’s the mission of Enfabrica, a startup founded in 2020 that emerged from stealth early in 2023 with a compelling proposition: namely, to recreate the networking components of AI systems.
Specifically, Enfabrica has created an accelerated compute fabric switch (ACF-S) that replaces the smart NICs and PCIe switches that govern the links between Ethernet-linked servers and the GPUs and CPUs that today comprise AI processing systems. And the switch component doesn't just replace those elements. It offers faster connections from the network to the AI system; it reduces latency associated with traffic flows between NICs and the GPUs; and it ultimately supports a lower total cost of ownership (TCO) for AI systems.
Here's how Enfabrica co-founder and CEO Rochan Sankar put it in a press release earlier this year:
"[C]urrent server I/O and networking solutions have serious bottlenecks that will cause them to either buckle under the scale of demand, or vastly underutilize the costly compute resources deployed to meet it. We believe we have cracked the code on a I/O fabric design that will scale high-performance, low-latency AI workloads at far superior TCO than anything out there, and make growing AI infrastructure composable, sustainable and democratized."
Enfabrica co-founder and CEO Rochan Sankar speaks at an industry event earlier this year. Source: Enfabrica
Solving Basic Problems
The details of the ACF-S are complex and technical, but basically the switch solves three problems, according to CEO Sankar:
Scalability. The ACF-S improves the bandwidth between the network and the GPUs, expanding the number of systems that can be linked to the GPUs.
Network latency. With fewer device hops, traffic can move faster through the AI system. How much faster? By 50% to 66%, claims Enfabrica.
Cost. The ACF-S allows for more efficient use of GPUs, meaning that GPUs can run “hotter,” or at higher utilization, eliminating the situation in which costly GPU high bandwidth memory (HBM) lies idle. Overall, Enfabrica claims to realize 50% lower cost of compute with ACF-S.
Digging into the technical details of the ACF-S reveals a few other nuggets. The ACF-S supports four native 800-Gb/s Ethernet ports along with Compute Express Link (CXL), enabling high-speed switching among 800-Gb/s Ethernet, PCIe, and CXL ports. The ACF-S comprises an 8-Tb/s switching system, management says.
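To put those port figures in perspective, here is a rough back-of-the-envelope sketch of the arithmetic. It uses only the numbers quoted above (four 800-Gb/s Ethernet ports and an 8-Tb/s switching system). The function and variable names are illustrative, and the article does not say how the capacity beyond the Ethernet ports is divided between PCIe and CXL, so the sketch leaves that remainder unsplit.

```python
# Figures quoted in the article; all names here are illustrative.
ETHERNET_PORTS = 4            # native Ethernet ports on the ACF-S
ETHERNET_PORT_GBPS = 800      # speed of each Ethernet port, in Gb/s
TOTAL_SWITCHING_TBPS = 8.0    # total switching capacity, in Tb/s


def ethernet_aggregate_tbps(ports: int, gbps_per_port: int) -> float:
    """Return the combined Ethernet bandwidth in Tb/s."""
    return ports * gbps_per_port / 1000


ethernet_tbps = ethernet_aggregate_tbps(ETHERNET_PORTS, ETHERNET_PORT_GBPS)

# Capacity not accounted for by the Ethernet ports. The article does not
# state how this is allocated between PCIe and CXL, so no split is assumed.
remainder_tbps = TOTAL_SWITCHING_TBPS - ethernet_tbps

ethernet_share = ethernet_tbps / TOTAL_SWITCHING_TBPS

print(f"Ethernet aggregate: {ethernet_tbps:.1f} Tb/s")
print(f"Remaining capacity: {remainder_tbps:.1f} Tb/s")
print(f"Ethernet share of total: {ethernet_share:.0%}")
```

On these numbers, the Ethernet ports account for 3.2 Tb/s, or 40% of the quoted switching capacity, with the rest available to the PCIe and CXL side of the switch.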
Looking Ahead, Taking Orders
Enfabrica recently joined the Ultra Ethernet Consortium (UEC), a group founded by Arista and Cisco (among others), which aims to deliver a new version of Ethernet for AI workloads in HPC environments. Note, however, that while the consortium seeks an alternative to InfiniBand, Enfabrica APIs support InfiniBand.
Also note that Enfabrica enjoys a solid relationship with Arista, according to management – a relationship that could prove very interesting in the future.
The ACF-S is set to ship in the first half of 2024, and Enfabrica is already taking orders for it and has been showing prototypes at trade conferences (see below).
Source: Enfabrica
Of course, there's fierce competition from other component providers, notably Broadcom, but according to Enfabrica's Sankar, some perceived competitors, such as AMD and NVIDIA, could also prove to be partners. "We are not system makers," Sankar told Futuriom. "We rely on partners with server-side components."
Startup Profile: Enfabrica
Headquarters location: Mountain View, Calif., with offices throughout the U.S. as well as in Europe and India
Employees: 110
CEO: Rochan Sankar (ex-Broadcom)
Target market: “Anybody building AI infrastructure” per CEO Sankar
Prominent investors: Atreides Management, Sutter Hill Ventures, IAG Capital, Liberty Global, NVIDIA, Valor Equity Partners, Infinitum, Alumni Ventures
Funding raised to date: $148 million