Overview

System configuration — chili10-101d

Single-socket AMD Genoa + 4× L40S + 4× PCIe 5.0 Solidigm D7-PS1010 + 2× ConnectX-7. No PCIe switches, no NVLink — a clean topology purpose-fit for Tier-2 GDS validation.

Hardware summary
Hostname     : chili10-101d
Chassis      : Gigabyte G293-Z23-AAM1-000 (2U, 4-GPU front-load)
CPU          : AMD EPYC 9554 — 64 C / 64 T (SMT off), Zen 4 Genoa
DRAM         : 377 GiB DDR5-4800 · 12-channel · 1 NUMA node
GPUs         : 4× NVIDIA L40S · 46 GB GDDR6 · BAR1 64 GiB · PCIe Gen4 x16
Data NVMe    : 4× Solidigm D7-PS1010 (7.68 TB · PCIe 5.0 x4 · 14.5 GB/s read)
Boot NVMe    : KIOXIA KXD5YLN13T84 (3.5 TB · quadrant 0x80)
Data NICs    : 2× ConnectX-7 (mlx5_0 @ c1:00.0, mlx5_1 @ 01:00.0) · 400 GbE
Mgmt NIC     : Broadcom BCM57416 10 GbE (enp3s0f1np1 · 10.100.200.56)
OS / kernel  : Ubuntu 24.04.3 LTS · kernel 6.17.0-22-generic (HWE)
GPU driver   : NVIDIA 580.126.20-open · CUDA CC 8.9
PCIe topology — per IOD quadrant
The EPYC 9554 I/O die (IOD) exposes four Gen5 root complexes at buses 0x00, 0x40, 0x80, and 0xc0, with no external PCIe switches. Each GPU, NIC, and NVMe drive hangs directly off a root port — the best possible case for P2P DMA.
0x00 : GPU0 + NIC1 · PHB
  • GPU 0 (02:00.0) · L40S
  • NIC 1 (01:00.0) · ConnectX-7 · mlx5_1 · 10.100.240.56
  • Broadcom mgmt NIC + SATA + BMC
0x40 : GPU1 + 2× NVMe
  • GPU 1 (41:00.0) · L40S
  • nvme3 (42:00.0) · Solidigm D7-PS1010 7.68 TB
  • nvme4 (43:00.0) · Solidigm D7-PS1010 7.68 TB
0x80 : GPU2 + 3× NVMe (data + boot)
  • GPU 2 (81:00.0) · L40S
  • nvme0 (82:00.0) · Solidigm D7-PS1010 7.68 TB
  • nvme1 (83:00.0) · Solidigm D7-PS1010 7.68 TB
  • nvme2 (84:00.0) · KIOXIA boot drive
0xc0 : GPU3 + NIC0 · PHB
  • GPU 3 (c2:00.0) · L40S
  • NIC 0 (c1:00.0) · ConnectX-7 · mlx5_0 · 10.100.241.56
  • ASPEED VGA
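Because the four root complexes enumerate at fixed bus bases, the quadrant for any device above can be read straight off the top two bits of its PCIe bus number. A small helper makes the mapping explicit — a convenience for this box's enumeration, not a general rule for every EPYC platform:

```python
# Map a PCIe BDF to its Genoa IOD quadrant on this host. The four root
# complexes sit at bus 0x00, 0x40, 0x80 and 0xc0, so the two high bits
# of the bus number identify the quadrant (box-specific assumption).
def iod_quadrant(bdf: str) -> int:
    bus = int(bdf.split(":")[0], 16)   # "c2:00.0" -> 0xc2
    return bus & 0xC0                  # keep the two high bits

# GPU1 (41:00.0) lands in quadrant 0x40, next to nvme3/nvme4.
print(hex(iod_quadrant("41:00.0")))   # -> 0x40
```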
nvidia-smi topo -m
NIC0↔GPU3 and NIC1↔GPU0 are PHB (same host bridge) — the intended pairing for minimum-hop GPUDirect RDMA.
       GPU0   GPU1   GPU2   GPU3   NIC0   NIC1
GPU0    X     NODE   NODE   NODE   NODE   PHB
GPU1   NODE    X     NODE   NODE   NODE   NODE
GPU2   NODE   NODE    X     NODE   NODE   NODE
GPU3   NODE   NODE   NODE    X     PHB    NODE

NIC0 = mlx5_0 (c1:00.0)   NIC1 = mlx5_1 (01:00.0)
PHB  = same PCIe host bridge (best case)
NODE = across IOD Infinity Fabric (next best)
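When steering GPUDirect RDMA traffic, the PHB pairing can be encoded directly from the matrix. The sketch below just transcribes the table above and picks the closest NIC per GPU — the mapping is derived from the topology, not measured:

```python
# GPU->NIC PCIe relations transcribed from the `nvidia-smi topo -m`
# matrix above; rank PHB (same host bridge) ahead of NODE (across the
# IOD Infinity Fabric).
TOPO = {
    "GPU0": {"NIC0": "NODE", "NIC1": "PHB"},
    "GPU1": {"NIC0": "NODE", "NIC1": "NODE"},
    "GPU2": {"NIC0": "NODE", "NIC1": "NODE"},
    "GPU3": {"NIC0": "PHB",  "NIC1": "NODE"},
}
HOPS = {"PHB": 0, "NODE": 1}  # lower = closer

def nearest_nic(gpu: str) -> str:
    """NIC with the shortest PCIe path to `gpu` (ties resolve to NIC0)."""
    return min(TOPO[gpu], key=lambda nic: HOPS[TOPO[gpu][nic]])

print(nearest_nic("GPU0"), nearest_nic("GPU3"))   # -> NIC1 NIC0
```

GPU1 and GPU2 have no PHB-local NIC, so either choice costs one Infinity Fabric hop.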
Storage layout
4× D7-PS1010 in a single md-raid0 stripe, XFS mounted at /gds. Boot drive lives on a separate KIOXIA — not benchmarked.
/dev/md0   : raid0 across 4× D7-PS1010 (nvme3 + nvme4 + nvme0 + nvme1)
             └─ XFS  mounted at /gds

/dev/nvme2n1 : KIOXIA boot drive
             ├─ /boot/efi  (vfat)
             └─ /          (ext4)

Per-drive spec     : Gen5 x4 (32 GT/s) · 14.5 GB/s read
Array aggregate    : ~58 GB/s theoretical, 53 GB/s measured (Run 2) — 91% of spec
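A sketch of how such an array would be assembled — the text above only states raid0 + XFS at /gds, so the mdadm chunk size and the mkfs/mount flags here are assumptions, not the settings used on this box:

```shell
# raid0 stripe across the four D7-PS1010 data drives (chunk size assumed)
mdadm --create /dev/md0 --level=0 --raid-devices=4 --chunk=512K \
      /dev/nvme3n1 /dev/nvme4n1 /dev/nvme0n1 /dev/nvme1n1

# XFS on top, mounted at /gds (noatime is an assumed convenience flag)
mkfs.xfs -f /dev/md0
mkdir -p /gds
mount -o noatime /dev/md0 /gds
```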
VAST storage fabric
NFSoRDMA target — 2× Mellanox SN5600 leaf switches, 5-CNode VAST cluster, 5-VIP multipath.
Topology : 2× Mellanox SN5600 (leaf) · 5 CNodes (VAST)
VLANs    : 240/241 (local fabric) · 233 (storage)
RoCE     : RoCEv2 · DSCP 26 · PFC priority 3 · ECN on
VIPs     : 10.100.233.10-14 (5 VIPs)
Mount    : /mnt/vast-rdma · nconnect=32/64 · proto=rdma · vers=4.1
Drivers  : vastnfs-dkms 4.5.5 · MOFED 24.10 · nvidia-fs 2.28.4
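For reference, one of these mounts would look roughly like the following. The export path is an assumption (only the mountpoint and options appear above), nconnect=32 is the lower of the two tested values, and spreading across the five VIPs is left to the vastnfs multipath driver:

```shell
# NFSoRDMA mount against one VAST VIP (export path is hypothetical;
# port 20049 is the conventional NFS/RDMA port)
mount -t nfs -o vers=4.1,proto=rdma,port=20049,nconnect=32 \
      10.100.233.10:/export /mnt/vast-rdma
```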