Creating Spot Instances to Monetize Unused GPU Cycles

Aarna Networks Solutions

AI Cloud: GPU-as-a-Service

Data centers, GPU-as-a-Service cloud or edge providers, and private cloud or edge providers: Build your own multi-tenant AI Cloud with our GPU-as-a-service software stack for Hopper and Blackwell architectures. Solve for network, storage and GPU isolation, Day 2 management, user APIs, and spot instance creation.


The critical component for building a true hyperscale-grade AI Cloud

A Vital Component to Improve the ROI on Your GPU Investments

GPU costs are 66% of the total on-prem cost, making it critical to monetize unused GPU cycles. GPU capacity aggregators enable monetization of unused cycles by allowing GPU owners to register spot instances. Aarna offers spot instance creation software to owners of GPUs.
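As a rough illustration of the monetization argument above, the sketch below estimates how many GPU-hours sit idle in a month and what they might earn as spot capacity. Every number and name in it (cluster size, utilization, spot price, fill rate) is a hypothetical assumption for the arithmetic, not Aarna pricing or an AMCOP API.

```python
def idle_gpu_hours(total_gpus: int, utilization: float, hours: float) -> float:
    """GPU-hours left unused over a period, given average utilization (0..1)."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return total_gpus * (1.0 - utilization) * hours


def spot_revenue(idle_hours: float, price_per_gpu_hour: float, fill_rate: float) -> float:
    """Revenue if a fraction (fill_rate) of the idle hours is sold as spot capacity."""
    if not 0.0 <= fill_rate <= 1.0:
        raise ValueError("fill_rate must be between 0 and 1")
    return idle_hours * price_per_gpu_hour * fill_rate


# Hypothetical inputs: 64 GPUs, 75% utilized, over a 30-day month (720 hours).
idle = idle_gpu_hours(total_gpus=64, utilization=0.75, hours=720)

# Hypothetical spot price of 1.50 per GPU-hour, with half of the idle hours sold.
revenue = spot_revenue(idle, price_per_gpu_hour=1.50, fill_rate=0.5)

print(f"Idle GPU-hours per month: {idle:.0f}")
print(f"Estimated spot revenue:   {revenue:.2f}")
```

Substituting your own utilization and price figures gives a first-order view of how much stranded capacity spot instances could recover.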

If you are an NVIDIA Cloud Partner (NCP), a GPU-as-a-Service cloud provider, or an IT/Ops practitioner building a private AI cloud or edge, Aarna Multi Cluster Orchestration Platform (AMCOP) can deliver true multi-tenancy with network isolation for InfiniBand and Ethernet, plus storage and GPU isolation, while leveraging the existing features of NVIDIA Base Command Manager.

Multi-tenancy and isolation of network, storage and GPU components are not available out of the box from NVIDIA’s DGX or HGX solutions. These capabilities are critical to ensure that sensitive workloads and traffic are appropriately separated across tenants. AMCOP lets AI Cloud providers leave these critical but “boring” components to Aarna while they build their key differentiators, such as LLMOps, RAG, and model fine-tuning. AMCOP interfaces with NVIDIA components such as Base Command Manager (BCM), Run:ai, DOCA on BlueField-3, ConnectX, Spectrum switches, MOFED drivers for InfiniBand, Quantum switches, GPU Controller, Network Controller, DGX OS and more.

Enjoy the convenience of the cloud while maintaining data proximity

Explore the convergence of AI/ML, cloud, and edge computing, and the benefits of running machine learning workloads at the cloud edge with Aarna Edge Services (AES) — the number one zero-touch orchestrator delivered as a service.

AI/ML at the Cloud Edge

AI/ML applications today, such as for large language models (LLM), are mostly run on-prem or in the public cloud. Both approaches have pros and cons. But edge, cloud, and AI/ML have converged to a point where now there is a third way – applying machine learning at the cloud edge. Benefits of this approach include:

  • Ability to process data close to where it gets produced
  • Ease-of-use features on par with the public cloud
  • OPEX savings
  • On-demand usage
Distributed AI is moving workloads to where they make the most business sense, including the cloud edge.

Computer Vision

Computer vision can generate large amounts of data. With hundreds or thousands of cameras being deployed, the traffic can easily add up to multiple gigabits. Moving this amount of data to the public cloud for computer vision ML processing can be quite expensive. An alternative is to run ML processing at the cloud edge, i.e., the colocation or datacenter location where the last mile access network terminates.
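To make the "multiple gigabits" claim concrete, here is a back-of-the-envelope sketch of aggregate camera traffic and the monthly data volume it implies. The camera count and per-camera bitrate are illustrative assumptions, not measurements from any deployment.

```python
def aggregate_traffic_gbps(num_cameras: int, mbps_per_camera: float) -> float:
    """Total sustained traffic in gigabits per second (1 Gbps = 1000 Mbps)."""
    return num_cameras * mbps_per_camera / 1000.0


def monthly_volume_tb(traffic_gbps: float, days: int = 30) -> float:
    """Data volume in terabytes if that traffic is sustained for `days` days."""
    gigabytes_per_second = traffic_gbps / 8.0   # 8 bits per byte
    seconds = days * 24 * 60 * 60
    return gigabytes_per_second * seconds / 1000.0  # GB -> TB


# Assumed: 1,000 cameras, each streaming at 4 Mbps.
gbps = aggregate_traffic_gbps(num_cameras=1000, mbps_per_camera=4.0)
tb = monthly_volume_tb(gbps)

print(f"Aggregate traffic: {gbps:.1f} Gbps")
print(f"Monthly volume:    {tb:.0f} TB")
```

Under these assumptions the cameras produce 4 Gbps continuously, which is roughly 1,296 TB per month that would otherwise have to be hauled to the public cloud.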

Generative AI

Powered by large language models (LLMs), generative AI programs like ChatGPT are revolutionizing the way we live and work. The cloud edge in a private cloud is an ideal place to collect data and run AI/ML algorithms for business intelligence. With open source models such as Llama or Dolly, the user retains full control over the LLM, so sensitive data does not have to leave the private environment or leak into the public domain.

Given that the cloud edge can be easily connected to a company’s private data with a dedicated link to their datacenter cage or through SD-WAN breakout (see figure below), a cloud edge LLM can be given less restricted access to sensitive data for training purposes than an LLM running in a public cloud.

The above figure shows a Cloud Edge ML implementation with connectivity to a company’s on-prem locations over SD-WAN. The ML workloads could be LLMs like Llama or Dolly, or computer vision workloads such as NVIDIA Metropolis.

RAN-in-the-Cloud

One such edge location for AI/ML processing is the Radio Access Network (RAN). Ideally, a 5G radio access network would be hosted as a service in multi-tenant cloud infrastructure running as a containerized solution alongside other applications. This concept of RAN-in-the-Cloud allows RAN components (CU/DU) to be dynamically allocated, increasing utilization for better sustainability, and using spare capacity in off-peak hours to run AI/ML applications.

Aarna Edge Services (AES)

Aarna Edge Services (AES) is the number one zero-touch edge multicloud orchestrator delivered as a service. It features an easy-to-use GUI that can cut weeks of orchestration work to less than an hour. AES also includes fault isolation and roll-back capabilities in case of a failure. Support includes:

  • Equinix Metal Servers with GPUs
  • Equinix Fabric & Network Edge with Azure Express Route/AWS Direct Connect
  • Pure Storage
  • ML workloads, one of:
      ◦ NVIDIA Fleet Command + Metropolis
      ◦ Open source Llama LLM
      ◦ Open source Dolly LLM

Customers considering AMCOP for Spot Instance Creation

Pipeline (In Progress)

Why Aarna?

The industry lacks an off-the-shelf solution for offering GPU-as-a-Service. As a GPUaaS provider, your current alternative is to do it yourself. Creating an advanced multi-tenancy and Day N software layer requires deep technical expertise and close coordination with hardware vendors. Aarna has the network, storage and GPU expertise to get you there much faster, enabling you to focus on differentiating your service rather than dealing with infrastructure-level problems, all using 100% open source software.

Set Up a Cloud Edge LLM

Aarna Networks, Predera, and NetFoundry have partnered to offer a Private, Zero-Trust, Fully Managed LLM to help you explore the world of generative AI. Choose from a variety of foundation models that you can fine-tune with your corporate data to discover new insights and revenue-generating opportunities. See this Solution Document to learn more.

Or request a free consultation to learn how to apply these approaches to your business requirements and cloud/edge machine learning strategies, or request a Free Trial of AES today.