Optimize GPU resource utilization across underutilized RAN sites by scheduling AI workloads through a zero-touch provisioning (ZTP) mechanism

Manage RAN and AI workloads on a common GPU infrastructure

Customer Profile

Leading US telecom operator with a market capitalization of roughly $185B, running an AI-RAN Proof of Concept to optimize RAN operations using GPUs.

Business Challenge

Need for unified orchestration of diverse workloads (RAN & AI) with GPU isolation and scaling on a common platform.

Solution

  • Deployed Aarna GPU CMS to manage AI/ML and RAN workloads on GPU infrastructure.
  • Enabled multi-tenant GPU slicing and MIG-based partitioning to allocate resources to AI and 5G functions on shared infrastructure (a MIG scheduling sketch follows this list).
  • Leveraged cloud-native network function (CNF) orchestration to provision and manage RAN components such as the DU and CU with lifecycle automation (see the provisioning sketch after this list).
  • Integrated AI workloads (e.g., computer-vision-based scene description) using NVIDIA Cloud Functions (NVCF) with asset management and storage orchestration.
  • Implemented observability and telemetry collection with real-time dashboards to monitor GPU utilization and service health (see the telemetry query sketch below).
  • Supported dynamic scaling and migration of workloads based on performance and policy-driven triggers.
  • Provided closed-loop automation for resource reclamation and workload placement using policies and intent-based workflows (a simplified policy loop is sketched below).
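
MIG-based partitioning of this kind is typically exposed to workloads as Kubernetes extended resources. The Python sketch below shows how an AI inference pod might request a single MIG slice; the image name, namespace, and the `nvidia.com/mig-1g.10gb` resource string are illustrative assumptions that depend on the GPU model and the NVIDIA device plugin configuration, not details of this specific deployment.

```python
# Hypothetical sketch: requesting a MIG slice for an AI inference pod via the
# Kubernetes Python client. The resource name "nvidia.com/mig-1g.10gb" and the
# image/namespace are placeholders that vary with GPU model and plugin config.
from kubernetes import client, config

def launch_ai_pod_on_mig_slice(namespace: str = "ai-workloads") -> None:
    config.load_kube_config()  # or load_incluster_config() when running in-cluster
    pod = client.V1Pod(
        api_version="v1",
        kind="Pod",
        metadata=client.V1ObjectMeta(name="scene-description-inference"),
        spec=client.V1PodSpec(
            restart_policy="Never",
            containers=[
                client.V1Container(
                    name="inference",
                    image="example.com/scene-description:latest",  # placeholder image
                    resources=client.V1ResourceRequirements(
                        # One MIG slice isolates the AI workload from the RAN
                        # functions sharing the same physical GPU.
                        limits={"nvidia.com/mig-1g.10gb": "1"},
                    ),
                )
            ],
        ),
    )
    client.CoreV1Api().create_namespaced_pod(namespace=namespace, body=pod)

if __name__ == "__main__":
    launch_ai_pod_on_mig_slice()
```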
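CNF lifecycle automation for the DU/CU is commonly driven through Helm releases. The following sketch wraps an idempotent `helm upgrade --install` call from Python; the chart path, release name, and values file are placeholders, not the operator's actual artifacts.

```python
# Hypothetical sketch: installing or upgrading a RAN CNF (e.g. a DU) as a Helm
# release. Chart path, release name, and values file are placeholders.
import subprocess

def deploy_ran_cnf(release: str = "du-site-01",
                   chart: str = "./charts/ran-du",        # placeholder chart
                   values_file: str = "du-site-01.yaml",  # placeholder values
                   namespace: str = "ran") -> None:
    """Idempotently install or upgrade a RAN CNF and wait for it to be ready."""
    subprocess.run(
        [
            "helm", "upgrade", "--install", release, chart,
            "--namespace", namespace, "--create-namespace",
            "--values", values_file,
            "--wait",
        ],
        check=True,
    )

if __name__ == "__main__":
    deploy_ran_cnf()
```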
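GPU telemetry of the kind shown on the dashboards is typically scraped from the NVIDIA DCGM exporter by Prometheus. The sketch below queries a Prometheus server for per-node GPU utilization; the server address and the `Hostname` label are assumptions about a typical dcgm-exporter setup rather than confirmed details of this deployment.

```python
# Hypothetical sketch: pulling per-node GPU utilization from a Prometheus server
# that scrapes the NVIDIA DCGM exporter. The URL and label names are assumptions.
import requests

PROMETHEUS_URL = "http://prometheus.monitoring.svc:9090"  # placeholder address

def gpu_utilization_by_node() -> dict[str, float]:
    """Return average GPU utilization (%) per node over the last 5 minutes."""
    query = "avg by (Hostname) (avg_over_time(DCGM_FI_DEV_GPU_UTIL[5m]))"
    resp = requests.get(f"{PROMETHEUS_URL}/api/v1/query",
                        params={"query": query}, timeout=10)
    resp.raise_for_status()
    results = resp.json()["data"]["result"]
    return {r["metric"].get("Hostname", "unknown"): float(r["value"][1])
            for r in results}

if __name__ == "__main__":
    for node, util in gpu_utilization_by_node().items():
        print(f"{node}: {util:.1f}% GPU utilization")
```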
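At its simplest, the closed-loop placement policy can be reduced to one rule: scale the AI workload up while RAN-side GPU utilization is low, and reclaim the GPU when it rises. The sketch below hard-codes that rule for illustration only; the thresholds, deployment name, and namespace are assumptions, and the actual system expresses such rules through policies and intent-based workflows rather than fixed thresholds.

```python
# Hypothetical sketch of a closed-loop placement rule. Thresholds, names, and
# namespaces are illustrative; the real system uses intent-based workflows.
from kubernetes import client, config

LOW_UTIL_PCT = 30.0   # below this, the site's GPUs are treated as idle
HIGH_UTIL_PCT = 70.0  # above this, capacity is reclaimed for RAN functions

def apply_gpu_policy(ran_gpu_util_pct: float,
                     deployment: str = "scene-description-inference",
                     namespace: str = "ai-workloads") -> None:
    """Scale the AI workload based on current RAN GPU utilization, e.g. the
    value returned by the telemetry query in the previous sketch."""
    config.load_kube_config()
    apps = client.AppsV1Api()
    if ran_gpu_util_pct < LOW_UTIL_PCT:
        replicas = 2      # idle RAN site: place AI workload on the shared GPU
    elif ran_gpu_util_pct > HIGH_UTIL_PCT:
        replicas = 0      # busy RAN site: reclaim the GPU for DU/CU processing
    else:
        return            # within the comfort band: leave placement unchanged
    apps.patch_namespaced_deployment_scale(
        name=deployment,
        namespace=namespace,
        body={"spec": {"replicas": replicas}},
    )
```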

Outcome

Demonstrated GPU workload consolidation, automated scaling, and readiness for production-grade AI-RAN deployments.