Rapid cloud migration was a boon for innovation in high-speed Ethernet (HSE) connectivity.
Now, generative AI applications are poised to drive unprecedented network demand, and the resulting requirements are speeding hyperscale data centers' adoption of higher-speed Ethernet connectivity solutions.
This trend is upending traditional timetables for connectivity development and availability: field deployments of 400G Ethernet are under way, 800G chipsets are already in manufacturing, and standard specifications are in progress for 1.6 Terabit Ethernet (TbE).
Hyperscalers have a strong business need to implement the latest HSE solutions, which double available bandwidth while tightly optimized data centers run at high fill rates. Communications service providers will eventually join them as AI/ML solutions generate heavy traffic loads on 5G networks.
Higher speed = higher complexity
High-speed Ethernet requires adoption and deployment of multiple technology innovations and extensive data center overhauls, greatly increasing complexity.
The unique needs of AI/ML are spurring a new way of thinking about data center networking. New use cases are resource hungry and require higher speeds, rapid response times, and low latency. There are also new hyperscale elements, such as graphics processing unit (GPU) clusters.
AI workloads have a significant impact on data center performance. For example:
AI models are growing in complexity by 1,000 times every three years.
New models have billions, and soon trillions, of dense parameters.
Apps will require thousands of GPU accelerators.
The number of data centers is expected to increase from 700 in 2022 to 1,000 by 2025.
Hyperscalers are turning to new architectures and infrastructure upgrades to support these expanding AI/ML requirements—and see them as business differentiators.
New testing approaches necessary as AI/ML changes hyperscaler data center requirements
The diverse, emerging technologies needed to support AI require rigorous testing. Traditional test methods are no longer practical and must be transformed as faster connectivity and increased bandwidth add complexity, including:
A doubling of PAM4 signal modulation speeds to support 200G lanes.
Per-lane I/O data rates growing from 112 Gbps to 224 Gbps.
Doubling the spectrum doubles the sampling speed and symbol rate.
Excess forward error correction (FEC) errors, which trigger further errors and packet loss.
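The lane-rate arithmetic behind the first three points can be checked in a few lines. This is a back-of-the-envelope sketch using the nominal rates from the list above, ignoring FEC and line-coding overhead:

```python
def symbol_rate_gbd(bit_rate_gbps: float, bits_per_symbol: int) -> float:
    """Symbol (baud) rate needed to carry a given bit rate on one lane."""
    return bit_rate_gbps / bits_per_symbol

# PAM4 uses 4 amplitude levels, so each symbol carries 2 bits.
PAM4_BITS_PER_SYMBOL = 2

# Doubling the per-lane rate from 112 Gbps to 224 Gbps doubles the
# PAM4 symbol rate, and with it the required sampling speed and spectrum.
print(symbol_rate_gbd(112, PAM4_BITS_PER_SYMBOL))  # 56.0 GBd
print(symbol_rate_gbd(224, PAM4_BITS_PER_SYMBOL))  # 112.0 GBd
```

This is why 200G lanes are harder to test than a simple "2x" suggests: every analog quantity in the signal chain scales along with the bit rate.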
Interoperability testing is essential for successful high-speed Ethernet deployments and AI/ML use cases. Unlike in the past, most deployed solutions now involve multiple vendors. In the rush to market, silicon is being built before standards are finalized, leading vendors down diverse paths, and open-source solutions add many components that must work together.
Further, it is not economical to test and qualify GPU cluster behavior in the lab. Testing a cluster would require many GPUs, which are expensive and in limited supply. Instead, GPU cluster behavior is emulated at a fraction of the cost. By emulating real-world conditions, teams can test individual devices, their behavior alongside adjacent devices, and the whole system, all without buying GPUs.
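To see why emulation is attractive, consider the traffic an AI training cluster generates. A toy model (not any vendor's implementation) is the standard ring all-reduce formula, under which each node sends 2*(N-1)/N of the gradient payload per synchronization step; an emulator only needs to reproduce that traffic pattern on the network, not run the GPUs themselves:

```python
def ring_allreduce_bytes_per_gpu(num_gpus: int, gradient_bytes: int) -> float:
    """Bytes each emulated GPU transmits in one ring all-reduce step.

    Toy model: ring all-reduce sends 2*(N-1)/N of the payload per node.
    """
    return 2 * (num_gpus - 1) / num_gpus * gradient_bytes

# e.g. synchronizing 1 GB of gradients across 8 emulated GPUs:
print(ring_allreduce_bytes_per_gpu(8, 10**9))  # 1.75e9 bytes per GPU
```

Generating this load from an emulator costs a fraction of an eight-GPU testbed, yet exercises the switches and links under the same traffic pattern.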
Exciting times, higher speeds ahead
ChatGPT and other AI/ML applications are creating exciting times for hyperscale data centers and high-speed Ethernet. Ethernet speeds will continue to increase, and new use cases will be found for each speed generation.
Innovative test and assurance companies like Spirent are helping this dynamic industry evolve and be successful.
Spirent participated in the recent Frost & Sullivan Thought Leader Think Tank discussion on high-speed Ethernet testing, drivers and challenges impacting the adoption of next-gen Ethernet technologies, and how the test and measurement industry is supporting the evolution to 800G and beyond.
Learn more about how to ensure new high-speed Ethernet technologies will meet the data center networking needs of AI workloads.