Optimize GPU resource utilization across underutilized RAN sites by scheduling AI workloads through a zero-touch provisioning (ZTP) mechanism

Manage RAN and AI workloads on a common GPU infrastructure

Customer Profile

Leading US telecom operator with a market capitalization of roughly $185B, running an AI-RAN Proof of Concept to optimize RAN operations using GPUs.

Business Challenge

Need for unified orchestration of diverse workloads (RAN & AI) with GPU isolation and scaling on a common platform.

Solution

  • Deployed Aarna GPU CMS to manage AI/ML and RAN workloads on GPU infrastructure.
  • Enabled multi-tenant GPU slicing and MIG-based partitioning to allocate resources to AI and 5G functions on shared infrastructure (a MIG scheduling sketch follows this list).
  • Leveraged cloud-native network function (CNF) orchestration to provision and manage RAN components such as the DU and CU with lifecycle automation (see the provisioning sketch after this list).
  • Integrated AI workloads (e.g., computer-vision-based scene description) using NVIDIA Cloud Functions (NVCF) with asset management and storage orchestration.
  • Implemented observability and telemetry collection with real-time dashboards to monitor GPU utilization and service health (see the telemetry query sketch below).
  • Supported dynamic scaling and migration of workloads based on performance and policy-driven triggers.
  • Provided closed-loop automation for resource reclamation and workload placement using policies and intent-based workflows (a simplified policy loop is sketched below).
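
MIG-based partitioning of this kind is typically exposed to workloads as Kubernetes extended resources. The Python sketch below shows how an AI inference pod might request a single MIG slice; the image name, namespace, and the `nvidia.com/mig-1g.10gb` resource string are illustrative assumptions that depend on the GPU model and the NVIDIA device plugin configuration, not details of this specific deployment.

```python
# Hypothetical sketch: requesting a MIG slice for an AI inference pod via the
# Kubernetes Python client. The resource name "nvidia.com/mig-1g.10gb" and the
# image/namespace are placeholders that vary with GPU model and plugin config.
from kubernetes import client, config

def launch_ai_pod_on_mig_slice(namespace: str = "ai-workloads") -> None:
    config.load_kube_config()  # or load_incluster_config() when running in-cluster
    pod = client.V1Pod(
        api_version="v1",
        kind="Pod",
        metadata=client.V1ObjectMeta(name="scene-description-inference"),
        spec=client.V1PodSpec(
            restart_policy="Never",
            containers=[
                client.V1Container(
                    name="inference",
                    image="example.com/scene-description:latest",  # placeholder image
                    resources=client.V1ResourceRequirements(
                        # One MIG slice isolates the AI workload from the RAN
                        # functions sharing the same physical GPU.
                        limits={"nvidia.com/mig-1g.10gb": "1"},
                    ),
                )
            ],
        ),
    )
    client.CoreV1Api().create_namespaced_pod(namespace=namespace, body=pod)

if __name__ == "__main__":
    launch_ai_pod_on_mig_slice()
```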
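CNF lifecycle automation for the DU/CU is commonly driven through Helm releases. The following sketch wraps an idempotent `helm upgrade --install` call from Python; the chart path, release name, and values file are placeholders, not the operator's actual artifacts.

```python
# Hypothetical sketch: installing or upgrading a RAN CNF (e.g. a DU) as a Helm
# release. Chart path, release name, and values file are placeholders.
import subprocess

def deploy_ran_cnf(release: str = "du-site-01",
                   chart: str = "./charts/ran-du",        # placeholder chart
                   values_file: str = "du-site-01.yaml",  # placeholder values
                   namespace: str = "ran") -> None:
    """Idempotently install or upgrade a RAN CNF and wait for it to be ready."""
    subprocess.run(
        [
            "helm", "upgrade", "--install", release, chart,
            "--namespace", namespace, "--create-namespace",
            "--values", values_file,
            "--wait",
        ],
        check=True,
    )

if __name__ == "__main__":
    deploy_ran_cnf()
```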
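GPU telemetry of the kind shown on the dashboards is typically scraped from the NVIDIA DCGM exporter by Prometheus. The sketch below queries a Prometheus server for per-node GPU utilization; the server address and the `Hostname` label are assumptions about a typical dcgm-exporter setup rather than confirmed details of this deployment.

```python
# Hypothetical sketch: pulling per-node GPU utilization from a Prometheus server
# that scrapes the NVIDIA DCGM exporter. The URL and label names are assumptions.
import requests

PROMETHEUS_URL = "http://prometheus.monitoring.svc:9090"  # placeholder address

def gpu_utilization_by_node() -> dict[str, float]:
    """Return average GPU utilization (%) per node over the last 5 minutes."""
    query = "avg by (Hostname) (avg_over_time(DCGM_FI_DEV_GPU_UTIL[5m]))"
    resp = requests.get(f"{PROMETHEUS_URL}/api/v1/query",
                        params={"query": query}, timeout=10)
    resp.raise_for_status()
    results = resp.json()["data"]["result"]
    return {r["metric"].get("Hostname", "unknown"): float(r["value"][1])
            for r in results}

if __name__ == "__main__":
    for node, util in gpu_utilization_by_node().items():
        print(f"{node}: {util:.1f}% GPU utilization")
```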
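At its simplest, the closed-loop placement policy can be reduced to one rule: scale the AI workload up while RAN-side GPU utilization is low, and reclaim the GPU when it rises. The sketch below hard-codes that rule for illustration only; the thresholds, deployment name, and namespace are assumptions, and the actual system expresses such rules through policies and intent-based workflows rather than fixed thresholds.

```python
# Hypothetical sketch of a closed-loop placement rule. Thresholds, names, and
# namespaces are illustrative; the real system uses intent-based workflows.
from kubernetes import client, config

LOW_UTIL_PCT = 30.0   # below this, the site's GPUs are treated as idle
HIGH_UTIL_PCT = 70.0  # above this, capacity is reclaimed for RAN functions

def apply_gpu_policy(ran_gpu_util_pct: float,
                     deployment: str = "scene-description-inference",
                     namespace: str = "ai-workloads") -> None:
    """Scale the AI workload based on current RAN GPU utilization, e.g. the
    value returned by the telemetry query in the previous sketch."""
    config.load_kube_config()
    apps = client.AppsV1Api()
    if ran_gpu_util_pct < LOW_UTIL_PCT:
        replicas = 2      # idle RAN site: place AI workload on the shared GPU
    elif ran_gpu_util_pct > HIGH_UTIL_PCT:
        replicas = 0      # busy RAN site: reclaim the GPU for DU/CU processing
    else:
        return            # within the comfort band: leave placement unchanged
    apps.patch_namespaced_deployment_scale(
        name=deployment,
        namespace=namespace,
        body={"spec": {"replicas": replicas}},
    )
```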

Outcome

Demonstrated GPU workload consolidation, automated scaling, and readiness for production-grade AI-RAN deployments.