Overview

NFSoRDMA — VAST via ConnectX-7 (400 GbE)

vastnfs-dkms 4.5.5 · RoCE v2 · PFC priority 3 · nconnect=32/64 · 5 VIPs over 2× SN5600 · GPU Direct Storage confirmed via XferType: GPUD. Two runs: a thread sweep to find the single-GPU ceiling and a multi-GPU scaling test to find the NIC ceiling.

Single-GPU ceiling (PCIe-bound)
26.4 GiB/s

Run 5, 128 workers, 1 MiB block, GPU 0. 94% of the practical PCIe 4.0 x16 ceiling for a single L40S (~28 GiB/s).

Doubling to 256 workers gives only ~0.5% more — the GPU uplink is already saturated.
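The headroom claim is easy to sanity-check. A minimal sketch, using only the two figures quoted above (26.4 GiB/s measured, ~28 GiB/s practical PCIe 4.0 x16 ceiling):

```python
# Sanity-check the single-GPU PCIe headroom quoted above.
measured_gib_s = 26.4            # Run 5 peak, GPU 0, 128 workers
practical_ceiling_gib_s = 28.0   # quoted practical PCIe 4.0 x16 ceiling

utilization = measured_gib_s / practical_ceiling_gib_s
print(f"PCIe utilization: {utilization:.0%}")  # → 94%
```

At 94% of the uplink, the ~0.5% gain from 256 workers is exactly what saturation looks like: more concurrency, no more bandwidth.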

4-GPU ceiling (NIC-bound)
43.4 GiB/s

Run 6, 4 GPUs × 64 workers, 1 MiB block. 347 Gbps on the 400 GbE NIC: 87% of raw line rate, and ~96% of the ~45 GiB/s practical ceiling after Ethernet + RoCE + NFS framing overhead.

Adding more GPUs beyond this would not help — the NIC is the ceiling. To scale further, bond a second ConnectX-7.
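The percentages above can be reproduced from the 43.4 GiB/s aggregate. One caveat: the report's 347 Gbps figure corresponds to a ×8 conversion (a strict GiB/s → Gbps conversion would be ×8.59); the sketch below follows the report's arithmetic:

```python
# Reproduce the NIC utilization figures quoted above.
aggregate = 43.4           # Run 6 aggregate read, GiB/s
line_rate_gbps = 400.0     # ConnectX-7 port speed
practical_ceiling = 45.0   # practical ceiling after framing, GiB/s

gbps = aggregate * 8       # the report's GiB/s → Gbps shorthand
print(f"{gbps:.0f} Gbps")                                 # → 347 Gbps
print(f"{gbps / line_rate_gbps:.0%} of raw line rate")    # → 87%
print(f"{aggregate / practical_ceiling:.0%} of ceiling")  # → 96%
```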

Run 5
Single-GPU thread sweep — 1 MiB
GPU 0 · /mnt/vast-rdma · nconnect=32/64 · 5 VIPs multipath · XferType: GPUD
Run 6
Multi-GPU scaling — NIC saturates
Aggregate read across GPUs · 1 MiB block · 64 workers each · 400 GbE practical ceiling ≈ 45 GiB/s
Y-axis capped at 45 GiB/s — the 400 GbE practical ceiling after RoCE + NFS framing.
Xprt distribution across VAST VIPs
xprts per VIP at nconnect=32 and 64. vastnfs hashes connections evenly across the 5 VIPs — confirms multipath is healthy.
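The "healthy multipath" condition is just that no VIP carries more than its fair share of xprts. A minimal sketch of that evenness check, assuming the xprt → VIP assignments have already been scraped (e.g. from /proc/self/mountstats; the parsing is omitted and the list below is stand-in data, not measured output):

```python
from collections import Counter

# Stand-in for real xprt→VIP data: with nconnect=32 hashed across
# 5 VIPs, an even spread yields 6-7 xprts per VIP.
xprt_vips = [f"vip{i % 5}" for i in range(32)]

per_vip = Counter(xprt_vips)
spread = max(per_vip.values()) - min(per_vip.values())
print(per_vip)
assert spread <= 1, "multipath imbalance: xprts not evenly hashed across VIPs"
```

With 32 connections over 5 VIPs an exact split is impossible, so a spread of one xprt between the busiest and quietest VIP is the best achievable and counts as healthy.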
PCIe → NIC bottleneck transition
At each scale the constraint moves to the next layer.
| GPUs | Per-GPU (GiB/s) | Aggregate (GiB/s) | % of 400 GbE | Binding constraint |
|------|-----------------|-------------------|--------------|--------------------|
| 1    | 26.41           | 26.41             | 53%          | GPU PCIe uplink |
| 2    | 20.08           | 40.15             | 80%          | NIC approaching saturation |
| 4    | 10.86           | 43.43             | 87%          | NIC (single port) |
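The transition in the table can be summarized as a two-ceiling model: each GPU is bound by its PCIe uplink until the shared NIC saturates, so the aggregate is the smaller of the two limits. A sketch using the ceilings quoted in this report (the min() model is an idealization, not the benchmark's methodology):

```python
# Idealized two-ceiling model of the PCIe → NIC bottleneck transition.
PCIE_CEILING = 26.4  # GiB/s per GPU, measured single-GPU ceiling (Run 5)
NIC_CEILING = 45.0   # GiB/s, practical 400 GbE ceiling after framing

def predicted_aggregate(n_gpus: int) -> float:
    """Aggregate read bandwidth: PCIe-bound per GPU, NIC-bound overall."""
    return min(n_gpus * PCIE_CEILING, NIC_CEILING)

for n in (1, 2, 4):
    bound = "PCIe" if n * PCIE_CEILING < NIC_CEILING else "NIC"
    print(f"{n} GPU(s): {predicted_aggregate(n):.1f} GiB/s predicted, {bound}-bound")
```

The model matches the 1- and 4-GPU rows; at 2 GPUs it predicts the full 45 GiB/s while the measured 40.15 GiB/s falls slightly short, which is consistent with the table's "approaching saturation" note: contention costs appear before the hard ceiling is reached.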