Arista hints at in-the-works telemetry tools to manage AI fabrics

Network World [Unofficial] February 18, 2026
Arista Networks is extending its telemetry capabilities in response to AI-driven demand for more comprehensive network management and greater visibility across complex environments.

The networking company shared early details about advanced telemetry technology that’s in the works to help its AI and cloud customers improve their monitoring and diagnostic capabilities.

Telemetry is already at the core of Arista’s EOS software stack and its CloudVision network management and analytics platform for enterprise customers. Real-time network state telemetry and metrics are stored in one common database, SysDB, which is easily accessible through APIs and interfaces such as gNMI/OpenConfig for analytics, according to Arista.

“We have a real-time streaming telemetry that has been with us since the beginning of time,” said Jayshree Ullal, CEO and chairperson of Arista, during the vendor’s fourth-quarter earnings call with financial analysts. “Our cloud customers and AI customers are seeking some of that visibility, too, and so we have developed some deeper AI capabilities for telemetry as well.”

Currently, Arista captures and streams network telemetry data to CloudVision and other customer systems, added Ken Duda, president, chief technology officer and founder of Arista.

“We’re extending that capability for AI with a combination of in-network data sources related to flow control, RDMA counters, buffering and congestion counters, and also host-level information, including what’s going on in the RDMA stack on the host, what’s going on with collectives, latencies, any flow control problems or buffering problems in the host NIC,” Duda said. “Then we pull that information all together in CloudVision and give the operator a unified view of what’s happening in the network and what’s happening in the host.”

“This greatly aids our customers in building an overall working solution, because the interactions between the network and the host can be complicated and difficult to debug when it’s different systems collecting them,” Duda said.

Analysts react to telemetry preview

Arista declined to share more details about its forthcoming AI telemetry extensions, but experts say additional control features would be a benefit to high-end customers such as hyperscalers that are operating AI networks.

“Modern switches already know detailed internal conditions (congestion, drops, buffers, RDMA counters, latency), but that information is invisible unless it’s exported. Streaming it to a central system makes the network observable in real time, not just via logs but via live operational state. This is especially critical for AI clusters, where tiny network issues can stall synchronized GPU jobs and waste massive compute resources,” said Sameh Boujelbene, vice president of Dell’Oro Group.

“Operators therefore need visibility across both the network and the hosts (congestion, NIC buffering, RDMA behavior, and collective performance), all at once. The key idea is to unify host and network telemetry into one correlated view. Many failures happen between layers, and siloed monitoring hides the root cause. A single timeline that combines both perspectives lets operators see the full pipeline and diagnose complex performance problems much faster,” Boujelbene said.
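
To make the "single timeline" idea concrete, here is a minimal Python sketch of merging host and network telemetry into one time-ordered view and pairing up events that occur close together. It is an illustration only, not Arista's implementation or the CloudVision data model: the `Event` fields, the device names, the metric names and the 10 ms correlation window are all hypothetical.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One telemetry sample. All field names here are hypothetical."""
    ts_ms: int     # timestamp in milliseconds
    source: str    # "network" (switch) or "host" (server/NIC)
    device: str    # hypothetical device name
    metric: str    # hypothetical metric name
    value: float


def unified_timeline(network_events, host_events):
    """Merge two streams, each already sorted by timestamp, into one timeline."""
    return list(heapq.merge(network_events, host_events, key=lambda e: e.ts_ms))


def correlate(timeline, window_ms):
    """Pair every host event with each network event within window_ms of it."""
    network = [e for e in timeline if e.source == "network"]
    pairs = []
    for host_event in (e for e in timeline if e.source == "host"):
        for net_event in network:
            if abs(net_event.ts_ms - host_event.ts_ms) <= window_ms:
                pairs.append((net_event, host_event))
    return pairs


# Made-up sample data: a congestion signal on a switch shortly before
# a slow collective operation on a GPU host.
network_events = [
    Event(1000, "network", "leaf1", "ecn_marked_packets", 120.0),
    Event(5000, "network", "leaf2", "pfc_pause_frames", 3.0),
]
host_events = [
    Event(1002, "host", "gpu-node-7", "allreduce_latency_ms", 48.0),
    Event(9000, "host", "gpu-node-9", "nic_rx_drops", 1.0),
]

timeline = unified_timeline(network_events, host_events)
for net_event, host_event in correlate(timeline, window_ms=10):
    print(f"{net_event.device}:{net_event.metric} near "
          f"{host_event.device}:{host_event.metric}")
```

In this toy data, the switch-side congestion sample and the slow host-side collective land within the window and are reported together, which is the kind of cross-layer link that siloed monitoring would miss. A production system would also need clock synchronization between hosts and switches and a topology map so that only devices on the same path are compared; both are omitted here.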

According to Alan Weckel, co-founder and analyst with the 650 Group, telemetry is key to understanding what is actually occurring in AI fabrics, and Arista has a lot of these features already on the switch side.

Arista acquired Big Switch Networks and its Big Cloud Fabric in 2020, and that technology lets customers manage physical switches as a single fabric, including security, automation, orchestration and analytics. Importantly, the software can run on a variety of certified switches from Dell EMC, HPE and others.

“The BigSwitch piece helps them with additional probes, and I think we will see more as the standards such as [Ultra Ethernet Consortium] progress,” Weckel said.

Duda’s comments on Arista’s Q4 call reveal where the industry is headed, Weckel added. “Operators really need a unified view that spans beyond just a single vendor’s view of the world (NIC, scale out, scale up, scale across) in order to fully monetize those GPU assets, so the tools need to evolve just as quickly as the hardware infrastructure is,” Weckel said.

Ryan Koontz, a senior analyst with Needham & Company, noted that extending AI visibility would significantly bolster Arista’s already strong EOS and CloudVision capabilities.

“My research work into hyperscalers and more recently the AI backend indicates that Arista’s current streaming telemetry capability is a massive differentiator for which competition is years behind,” Koontz said.

“And AI training is hyper-sensitive to packet loss during which this telemetry capability really shines. It’s a big reason why Arista is quickly emerging as a back-end powerhouse as the hyperscalers look to reduce their dependence on Nvidia. I assume this telemetry fits neatly into the containerization of EOS which broadly is far ahead of the pack,” Koontz said.
