Accelerating AI Training with Precision:
How Netitest Optimized High-Speed Networking with Napatech’s 200G FPGA SmartNIC
Case Study
Challenge
Netitest, a leader in intelligent computing network testing, needed a SmartNIC with ultra-high bandwidth and low latency for the AI training test solution it was developing. The SmartNIC had to let the tester evaluate the performance of network resources in AI training environments.
Solution
To meet these requirements, Netitest selected the Napatech NT400D11 SmartNIC with 2 x 100G ports, running Link-Capture™ Software. This solution delivered 200G transmission throughput and latency measurements accurate to within 10 nanoseconds.
Benefits
With this solution, Netitest was able to simulate high-performance traffic flows, accurately testing bandwidth and latency for both RoCE v2 switches and RDMA network cards. This facilitated thorough evaluation and optimization of the AI training network.
Netitest optimized AI training environments by integrating Napatech’s NT400D11 2 x 100G SmartNIC with Link-Capture™ Software, achieving zero packet loss, ultra-low latency and high throughput. This solution enabled precise 200G RoCE v2 traffic handling and 10-nanosecond latency measurements, crucial for testing and improving network efficiency in AI workloads.
Challenges in AI Training
Optimizing model performance, evaluating distributed computing frameworks and ensuring smooth deployment are all critical for success in AI training. Network performance plays a key role in these processes. To maximize efficiency and avoid bottlenecks in AI workloads, Netitest needed a SmartNIC capable of handling 200G RoCE v2 traffic while maintaining a latency measurement accuracy of 10 nanoseconds. The specific requirements included:
- Simulating the transmission and reception of RoCE v2 (RDMA over Converged Ethernet version 2) traffic to assess the bandwidth and latency of high-performance network switches.
- Simulating the transmitting end of an RDMA (Remote Direct Memory Access) network and establishing a connection with the receiving end, so that the throughput and latency of RDMA network cards can be tested via read/write operations.
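To give a sense of scale for these requirements, the sketch below estimates the theoretical frame rate a tester has to sustain on one 100G port and across both ports. It is an illustrative calculation only, not part of the Netitest or Napatech software. The function names and the chosen frame sizes are assumptions made for this example; the 20 bytes of per-frame overhead are the standard Ethernet preamble, start-of-frame delimiter, and inter-frame gap.

```python
# Illustrative arithmetic only; names and frame sizes are assumptions for this sketch.

# Standard Ethernet per-frame overhead on the wire:
# preamble (7) + start-of-frame delimiter (1) + inter-frame gap (12).
ETH_OVERHEAD_BYTES = 20


def max_frames_per_second(link_bps: float, frame_bytes: int) -> float:
    """Theoretical maximum frame rate on one Ethernet port.

    frame_bytes is the full frame length including the 4-byte FCS.
    """
    bits_on_wire = (frame_bytes + ETH_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


def wire_time_ns(link_bps: float, frame_bytes: int) -> float:
    """Time one frame (plus overhead) occupies the link, in nanoseconds."""
    bits_on_wire = (frame_bytes + ETH_OVERHEAD_BYTES) * 8
    return bits_on_wire / link_bps * 1e9


if __name__ == "__main__":
    port_bps = 100e9  # one 100G port
    ports = 2         # 2 x 100G, 200G aggregate
    for size in (64, 512, 1518):
        per_port = max_frames_per_second(port_bps, size)
        print(
            f"{size:>5} B frames: {per_port / 1e6:8.2f} Mpps per port, "
            f"{ports * per_port / 1e6:8.2f} Mpps aggregate, "
            f"{wire_time_ns(port_bps, size):7.2f} ns per frame"
        )
```

At 100G a minimum-size 64-byte frame occupies the link for under 7 nanoseconds, which is why timestamp accuracy on the order of 10 nanoseconds matters when measuring latency at these rates.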
Solution
To address these challenges, Netitest integrated the Napatech NT400D11 2 x 100G SmartNIC with Link-Capture™ Software into their Supernova Physical Tester. This solution met the strict international standards required for network environment testing in AI training, ensuring:
- Zero packet loss
- Ultra-low latency
- High throughput
Benefits
- 2-port 100G solution: The SmartNIC ensured seamless handling of network traffic, providing reliable testing results at extremely high data rates.
- Uniform packet transmission: With timestamps accurate to within 10 nanoseconds, the solution allowed precise measurement of transmission and reception times, which is critical in evaluating the efficiency of AI training networks.
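As a rough illustration of how transmit and receive timestamps turn into the latency and packet-loss figures mentioned above, the following sketch summarizes a set of test packets. It is a hypothetical example, not the Link-Capture™ Software API: the `Probe` record, the `summarize` function, and the sample values are all assumptions made for this sketch.

```python
# Hypothetical sketch; Probe, summarize, and the sample data are assumptions,
# not part of any Napatech or Netitest API.
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class Probe:
    seq: int                # sequence number carried in the test packet
    tx_ns: int              # hardware timestamp at transmission, in nanoseconds
    rx_ns: Optional[int]    # hardware timestamp at reception, None if never received


def summarize(probes: List[Probe]) -> dict:
    """Compute one-way latency statistics and packet loss from timestamp pairs."""
    latencies = [p.rx_ns - p.tx_ns for p in probes if p.rx_ns is not None]
    lost = sum(1 for p in probes if p.rx_ns is None)
    return {
        "sent": len(probes),
        "lost": lost,
        "loss_ratio": lost / len(probes) if probes else 0.0,
        "min_ns": min(latencies) if latencies else None,
        "mean_ns": mean(latencies) if latencies else None,
        "max_ns": max(latencies) if latencies else None,
    }


if __name__ == "__main__":
    # Made-up sample values for demonstration only.
    sample = [
        Probe(seq=0, tx_ns=1000, rx_ns=1850),
        Probe(seq=1, tx_ns=2000, rx_ns=2840),
        Probe(seq=2, tx_ns=3000, rx_ns=None),  # treated as lost
        Probe(seq=3, tx_ns=4000, rx_ns=4870),
    ]
    print(summarize(sample))
```

A zero-packet-loss result corresponds to `lost` being 0 for every test run; the latency figures are only as trustworthy as the timestamps they are computed from.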
Through this partnership, Netitest developed a cutting-edge solution for testing AI training environments, enabling the optimization of network resources and ensuring top-tier performance in AI workloads.
About Netitest
Netitest primarily develops and sells high-performance, intelligent, convenient and efficient network test products and services for the communication network testing field. Relying on innovative core technologies and strong product development capabilities, Netitest has achieved several technological breakthroughs in domestic network testers and has broken the international monopoly on hardware tester technology.