Nephio is a new open source project seeded by Google and hosted at the Linux Foundation that is getting substantial attention in the industry. I attended the first Nephio Developer Summit last week in Sunnyvale (June 22-23) and wanted to share my key takeaways. Aarna is a member of the Nephio project, and Sandeep Sharma from our team holds a Technical Steering Committee (TSC) seat, so it is no surprise that we are big fans of the project. Here are my observations of Nephio along with the pros and cons as I see them.
Scope
The stated goal of Nephio is to “simplify the deployment and management of multi-vendor cloud infrastructure and network functions across large scale edge deployments.” This is a clear and self-explanatory definition. At the meeting, Google stressed that Domain Orchestration, as opposed to Service Orchestration, is the focus of the project. Of course, the lines blur. Is a 5G service consisting of UPF+AMF+SMF a domain or a service? I think from Nephio’s point of view, this would be considered a domain. In other words, Nephio can deploy and manage a 5G service with a variety of network functions (NFs), which leaves very little (if anything) for the “Service Orchestration” layer to do.
What I found fascinating about Nephio is that it considers cloud infrastructure within its scope as well. Other projects, such as the Linux Foundation Networking ONAP project, have only worked on the service/NFVO/VNFM layers. I think considering both infra+NFs together is a huge plus for the 5G + MEC (multi-access edge computing) era. We at Aarna are seeing evidence of this trend from groups such as the O-RAN Alliance, where FOCOM (Federated O-Cloud Orchestration and Management), NFO (Network Function Orchestration), and NF (Network Function) Configuration Management, Performance Management, and Fault Management are all within the scope of the O-RAN Service Management and Orchestration (SMO) entity.
Nephio Technical Overview
Very simply put, Nephio uses Kubernetes (K8s) automation for cloud infrastructure and NFs. I had not appreciated this point before, but Kubernetes is general purpose. It happened to be used for container orchestration first, but it is not limited to that use case. With that understanding, we can see that Nephio is applying Kubernetes to a new use case.
Needless to say, Kubernetes comes with tremendous benefits. It is mature. It is declarative and intent driven (an intent-driven system monitors the actual state and continuously reconciles it with the intended state). Kubernetes can be extended through mechanisms such as CRDs (Custom Resource Definitions) and Operators. Custom Resources are extensions of the Kubernetes API that can declaratively express user intent for a particular domain. Operators or Custom Controllers (apologies if they are not exact synonyms, I am using them as such) watch those APIs and perform actions to fulfill the declarative intent. Ultimately, declarative intent has to be converted into imperative actions. That is the job of the Custom Controller.
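To make the reconciliation idea concrete, here is a minimal sketch in plain Python. It is not Kubernetes or Nephio code: the `Cluster` class, the `reconcile` function, and the action strings are all hypothetical names I made up for illustration. It only shows how a controller might compare declared intent with observed state and turn the difference into imperative steps.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for observed state: NF name -> running replica count."""
    replicas: dict = field(default_factory=dict)


def reconcile(desired: dict, cluster: Cluster) -> list:
    """Compare declared intent with observed state and return the
    imperative actions needed to close the gap (hypothetical action names)."""
    actions = []
    for name, want in desired.items():
        have = cluster.replicas.get(name, 0)
        if have < want:
            actions.append(f"scale-up {name} {have}->{want}")
        elif have > want:
            actions.append(f"scale-down {name} {have}->{want}")
        cluster.replicas[name] = want
    # Anything running that is no longer declared gets removed.
    for name in list(cluster.replicas):
        if name not in desired:
            actions.append(f"delete {name}")
            del cluster.replicas[name]
    return actions


# Declared intent: two UPF replicas and one SMF replica.
intent = {"upf": 2, "smf": 1}
cluster = Cluster(replicas={"upf": 1, "amf": 1})

first_pass = reconcile(intent, cluster)   # produces actions
second_pass = reconcile(intent, cluster)  # nothing left to do
```

The second pass returns an empty list because the observed state already matches the intent. That idempotence is what lets a real controller run the same loop continuously without side effects.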
So is that it? Then why do we need Nephio? Clearly there’s more…
Distributed State
Nephio creates the concept of a centralized Nephio K8s cluster with platform controllers that reconcile the high-level user intents expressed in KRM (Kubernetes Resource Model) files. The Nephio cluster runs the user intent through a series of Custom Controllers to produce the state that can be consumed by the edge cluster(s).
The state is transmitted to the edge cluster through a “pull” mechanism implemented by an open source project called ConfigSync, though ConfigSync may be replaced by alternatives such as ArgoCD or Flux v2. A pull mechanism is significantly more scalable than a push approach. It also moves the burden of maintaining the state to the edge cluster as opposed to the Nephio cluster, which again is much more scalable.
The edge clusters in turn use the input provided to them by the Nephio cluster for their own Operators and K8s cluster configuration, which may include edge cluster infrastructure and NF automation.
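The pull model can be sketched in a few lines of Python. This is a toy illustration and not how ConfigSync, ArgoCD, or Flux are implemented: `GitRepo` is an in-memory stand-in for the repository, and `EdgeAgent` is a hypothetical name for whatever agent runs on each edge cluster.

```python
class GitRepo:
    """Hypothetical in-memory stand-in for the repo holding rendered state."""

    def __init__(self):
        self.revision = 0
        self.manifests = {}

    def commit(self, manifests: dict) -> None:
        self.manifests = dict(manifests)
        self.revision += 1


class EdgeAgent:
    """Hypothetical agent on an edge cluster that pulls state from the repo."""

    def __init__(self, name: str, repo: GitRepo):
        self.name = name
        self.repo = repo
        self.synced_revision = 0
        self.applied = {}

    def poll(self) -> bool:
        """Pull and apply new state; return True only if something changed."""
        if self.repo.revision == self.synced_revision:
            return False
        self.applied = dict(self.repo.manifests)
        self.synced_revision = self.repo.revision
        return True


repo = GitRepo()
edges = [EdgeAgent(f"edge-{i}", repo) for i in range(3)]

# The central side only commits; it keeps no list of edges to contact.
repo.commit({"upf": {"replicas": 2}})

changed = [edge.poll() for edge in edges]    # every edge picks up the commit
unchanged = [edge.poll() for edge in edges]  # no new revision, nothing to do
```

Note that the central side never iterates over the edges. Each edge tracks its own sync position, which is why adding edge clusters does not add bookkeeping on the Nephio cluster.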
GitOps
That’s not all. Nephio also bakes the concept of GitOps into the project. The user provides KRM files in a kpt package that is checked into a Git repo. kpt uses the principle of configuration as data (APIs) rather than configuration as code (templates or Domain Specific Languages). The Custom Controllers on the Nephio cluster successively refine the kpt package in the Git repo. Finally, the edge cluster pulls the state from the Git repo and applies it to the local K8s cluster. This architecture is both pragmatic and clever. It’s like infusing fluoride into water. The user gets the benefit of GitOps without explicitly knowing or worrying about it.
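Here is a rough sketch of what “configuration as data” with successive refinement could look like, again in plain Python rather than actual kpt or Nephio code. The resource kind `UPFDeployment`, the `example.org/v1` API group, the field names, and both refinement functions are assumptions made up for this example. The point is only that each step takes resources as data and returns a refined copy, so every intermediate revision can be stored and reviewed.

```python
import copy


def set_namespace(resources: list, namespace: str) -> list:
    """Refinement step: return a copy with the namespace filled in."""
    out = copy.deepcopy(resources)
    for resource in out:
        resource.setdefault("metadata", {})["namespace"] = namespace
    return out


def allocate_addresses(resources: list, pool: list) -> list:
    """Refinement step: assign an address from the pool to each UPF."""
    out = copy.deepcopy(resources)
    addresses = iter(pool)
    for resource in out:
        if resource.get("kind") == "UPFDeployment":  # hypothetical kind
            resource.setdefault("spec", {})["address"] = next(addresses)
    return out


# The user's initial package: high-level intent only.
package = [
    {
        "apiVersion": "example.org/v1",  # hypothetical API group
        "kind": "UPFDeployment",
        "metadata": {"name": "upf-edge1"},
        "spec": {"capacity": "10G"},
    }
]

# Each entry plays the role of one commit in the Git history.
revisions = [package]
revisions.append(set_namespace(revisions[-1], "edge-nfs"))
revisions.append(allocate_addresses(revisions[-1], ["10.0.0.1"]))

final = revisions[-1]
```

Because the steps operate on structured data instead of rendering templates, the original package stays untouched and the difference between any two revisions is an ordinary diff.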
Pros
Nephio has a number of key benefits.
- Simplicity: Like Kubernetes, I think Nephio will disrupt open source networking vis-à-vis cloud infrastructure and greatly simplify network service delivery.
- Google backing: Google is not only behind the open source project, they also seem to be committed to Nephio based cloud service(s). This is ideal backing for an ambitious open source project.
- Common across Infra, Platform, Workloads: The same descriptors and project can be applied to setting up the infrastructure (on-prem or cloud), the CaaS platform (K8s with different plugins and components such as Multus, SR-IOV, DPU, Istio, Prometheus, etc.), and workloads (NFs and MEC applications).
- GitOps Built-in: Users don’t have to bolt a DevOps framework on top of Nephio. It’s built in. I love this feature.
- Distributed: Nephio is inherently built for a world with a large number of edge clouds. This again distinguishes it from prior projects where a centralized entity can struggle to scale.
- Data-model first: By defining CRDs first, the data model is essentially agreed upon even before the first line of code is written. This is the right way to do things. Current projects either develop the data model in parallel with the code or, often, treat it as an afterthought.
- Community excitement: If the attendance at the event is any indication, the community is truly energized by Nephio. It also includes several active end users. This is a positive sign.
Cons
In my opinion, Nephio comes with some architectural assumptions that might slow down its adoption.
- Developer effort: Nephio is just a framework. Without CRDs and Custom Controllers, Nephio doesn’t actually do anything. This means that the developer burden, as compared to prior solutions or open source projects, is definitely higher. In addition, the ops personnel at telcos will need to be comfortable with KRM files and kpt packages, which requires sophistication. Of course, there could be a GUI to front-end and simplify this mechanism.
- Moving the burden to NF vendors: Philosophically, Nephio moves the control to NF vendors (aka the sVNFM, or specific VNFM, model). In the past, systems such as ONAP SO+CDS+SDN-C tried to wrest control away from NF vendors and seek common approaches via a gVNFM (generic VNFM). I don’t think the Nephio approach is either good or bad. After all, the NF vendor is the expert on how to manipulate their NF. Why not give the control to them? But this does mean waiting for NF vendors to create Operators.
- KRM vs. Helm: With one exception, every NF vendor I have talked with creates CNFs (cloud native network functions) via Helm Charts. It seems Nephio doesn’t hold Helm Charts in a positive light at this time since they mix declarative and imperative approaches. However, this position might slow Nephio adoption.
Conclusion
I am a Nephio believer. After having seen prior approaches, I believe that simplicity is the number one factor that determines whether software succeeds. And Nephio fully embodies simplicity. I think Nephio will have a big impact on 5G in general and O-RAN and MEC specifically (Nephio has the O2 interface as one of its stated use cases). We at Aarna are on board. We will announce our Nephio strategy later in Q3’22 and will publish blogs and videos on the Nephio architecture. Want to learn more? Check out Aarna's Nephio Executive Brief. Feel free to reach out to us if you have any Nephio needs or questions.