Chapter 3: Autoscaled Deployment Stacks

Introduction

Scalability is a cornerstone of the 5G technology stack. By combining Kubernetes constructs such as ReplicaSets and metrics-driven pod autoscaling with multi-cluster management, operators can build a 5G Core infrastructure that adapts to fluctuating demand. Autoscaling helps 5G services remain performant, cost-efficient, and available as traffic load and geographical distribution change. This chapter explores how Kubernetes autoscaling mechanisms enable 5G Core functions to scale and presents real-world examples of implementations in the telecom industry.

Understanding Autoscaling in Kubernetes

Kubernetes offers robust, flexible tools for managing the scalability of containerized applications. These are vital for the dynamic demands of 5G networks, where workloads can vary significantly based on user activity, network slices, and traffic patterns. Autoscaling in Kubernetes can be categorized into three primary types:

Horizontal Pod Autoscaler (HPA): Dynamically scales the number of pod replicas based on observed metrics such as CPU utilization, memory usage, or custom-defined metrics, so that enough pods are running to handle an increasing or decreasing workload.
Vertical Pod Autoscaler (VPA): Automatically adjusts the CPU and memory requests (and, proportionally, any limits) of the containers in a pod based on observed usage, so that pods are right-sized for their workload.
Cluster Autoscaler (CA): Adds or removes nodes in a cluster based on pending pod resource requests that cannot be met by the current cluster size. CA ensures that infrastructure can meet the demand generated by HPA and VPA scaling activities.

Each of these autoscaling mechanisms plays a critical role in maintaining the performance, reliability, and cost-effectiveness of 5G Core functions as network demand fluctuates.

Solution Architecture

Autoscaling a 5G Core requires an architecture designed for distributed workloads, multi-cluster management, and traffic steering across multiple locations. A well-architected autoscaled 5G Core includes several key components:

Source Repository: Houses the 5G software stack, including cloud-native network functions (CNFs) and supplementary services, stored as Helm charts, container images, and configuration files.
Management/Hub Cluster: Manages the lifecycle of other clusters, including provisioning, updates, scaling, and policy enforcement. This cluster typically runs Kubernetes management tools like Open Cluster Management.
Spoke/Managed Clusters: These clusters deploy 5G CNFs, handling network traffic for specific geographical regions or network slices. Each managed cluster can scale independently based on local demands.
Service Mesh: Provides observability, traffic steering, and security across the distributed microservices architecture, ensuring secure and optimized communication between services.

Our testbed, validated on hyperscaler infrastructure, shows that this architecture can dynamically scale in response to real-time network demand, leveraging Kubernetes autoscaling capabilities.

Real-World Example: Autoscaling at Rakuten Mobile

Background: Rakuten Mobile, a Japanese telecom operator, was the first to build a fully virtualized mobile network using a cloud-native architecture. One of the key challenges they faced was ensuring that their 5G Core could scale efficiently to meet fluctuating user demands while maintaining high performance and reliability.

Implementation:

Horizontal Scaling: Rakuten Mobile used Kubernetes' HPA to dynamically scale User Plane Functions (UPF) during high-traffic periods. UPF instances scaled up during peak hours (such as commuting times) and scaled down during off-peak hours to optimize resource utilization.
Vertical Scaling: VPA was implemented to automatically adjust CPU and memory limits for CNFs, ensuring efficient use of resources without overprovisioning. For instance, during traffic surges, VPA increased resource limits for the AMF (Access and Mobility Management Function) to accommodate increased signaling traffic.
Cluster Autoscaling: To optimize infrastructure costs, Rakuten Mobile implemented Cluster Autoscaler to dynamically add or remove nodes based on demand. During large gatherings such as sporting events or concerts, where sudden traffic spikes were expected, Cluster Autoscaler ensured sufficient capacity was available without manual intervention.

Outcome: Rakuten Mobile successfully maintained high network performance and availability while optimizing resource usage. Autoscaling reduced operational costs by eliminating the need for manual intervention, while allowing the network to dynamically adapt to changing traffic conditions.

Detailed Design Paradigms

Kubernetes Metrics Collection & Usage:
- Metrics Server, a cluster add-on, collects near-real-time CPU and memory usage from the kubelet on each node and exposes it through the Kubernetes Metrics API. This data forms the backbone for HPA, allowing pods to scale based on current resource consumption.
- For 5G networks, custom metrics such as packet processing rates, session establishment times, and end-to-end latency are critical. These metrics can be collected via Prometheus and exposed to HPA through a custom metrics adapter, so that scaling decisions are driven by network-specific demand rather than generic resource metrics.
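As an illustration of scaling on a custom metric, the following minimal sketch builds an `autoscaling/v2` HorizontalPodAutoscaler manifest as a Python dictionary. The names `upf-hpa`, `upf`, and `upf_active_sessions` are hypothetical, and the sketch assumes that a metrics adapter (for example, Prometheus Adapter) already exposes the metric through the Kubernetes custom metrics API.

```python
import json


def build_hpa_manifest(name, deployment, metric_name, target_average,
                       min_replicas, max_replicas):
    """Return an autoscaling/v2 HPA manifest that targets a per-pod custom metric."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            # The workload whose replica count HPA controls.
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [
                {
                    # "Pods" metrics are averaged across all pods of the target.
                    "type": "Pods",
                    "pods": {
                        "metric": {"name": metric_name},
                        "target": {
                            "type": "AverageValue",
                            "averageValue": str(target_average),
                        },
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical example: keep roughly 1000 active sessions per UPF pod.
    manifest = build_hpa_manifest(
        name="upf-hpa",
        deployment="upf",
        metric_name="upf_active_sessions",  # assumed metric name
        target_average=1000,
        min_replicas=2,
        max_replicas=10,
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and applied with `kubectl apply -f`, which accepts JSON as well as YAML manifests.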
Horizontal Pod Autoscaler (HPA):
- HPA automatically scales the number of pods based on resource utilization. In the 5G context, CNFs like the UPF or SMF (Session Management Function) can be scaled based on CPU usage or custom metrics, such as the number of active data sessions.
- For example, if average CPU utilization across UPF pods stays above a 70% target, HPA adds UPF pods to handle the increased load, so that user data continues to be processed efficiently.
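The scaling decision described above follows the replica calculation documented for HPA: the desired replica count is the current count multiplied by the ratio of the observed metric to its target, rounded up. The sketch below reproduces that arithmetic for a 70% CPU target. It is a simplified model: the function name and the replica bounds are illustrative, and real HPA behavior also includes stabilization windows and handling of pods that are not yet ready.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the HPA replica calculation for a single metric."""
    ratio = current_value / target_value
    # HPA skips scaling when the ratio is close enough to 1.0
    # (the controller's default tolerance is 0.1).
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Four UPF pods averaging 90% CPU against a 70% target.
    print(desired_replicas(4, 90, 70))
    # Four UPF pods averaging 35% CPU against a 70% target.
    print(desired_replicas(4, 35, 70))
```

With four pods at 90% average utilization, the ratio is about 1.29, so the calculation rounds 5.14 up to six replicas. At 72% the ratio falls within the tolerance and the replica count stays at four.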
Vertical Pod Autoscaler (VPA):
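- VPA optimizes resource allocation by adjusting pod CPU and memory requests based on observed usage. For instance, under sustained heavy signaling load, VPA can raise the memory allocation for the AMF to prevent performance degradation. In its automatic update modes, VPA typically applies a new recommendation by evicting the pod so that it is recreated with the updated requests, so changes take effect at pod restart rather than instantaneously.
- This ongoing adjustment gives CNFs the resources they need while preventing over-provisioning during low-traffic periods, leading to more efficient resource use across the network.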
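To make the idea of right-sizing concrete, the sketch below derives a memory or CPU request from a window of usage samples by taking a high percentile and adding a safety margin. This is a deliberately simplified stand-in, not the VPA recommender itself, which maintains decaying histograms of usage. The function name, the 90th percentile, and the 15% margin are assumptions chosen for illustration.

```python
import math


def recommend_request(samples, percentile=90, margin=0.15):
    """Suggest a resource request from usage samples (simplified, not the real VPA algorithm).

    samples    -- observed usage values, e.g. MiB of memory or CPU millicores
    percentile -- integer percentile (1-100) of usage to cover
    margin     -- fractional headroom added on top of the percentile value
    """
    if not samples:
        raise ValueError("at least one usage sample is required")
    ordered = sorted(samples)
    # Nearest-rank percentile: smallest value covering `percentile` percent of samples.
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    base = ordered[rank - 1]
    return base * (1 + margin)


if __name__ == "__main__":
    # Hypothetical AMF memory usage samples in MiB during a signaling surge.
    amf_memory_mib = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
    print(round(recommend_request(amf_memory_mib)))
```

Covering a high percentile rather than the peak keeps a single outlier from inflating the request, while the margin leaves headroom for growth between recommendations.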
Cluster Autoscaler (CA):
- CA works alongside HPA and VPA to scale the underlying infrastructure by adding or removing nodes as needed. For example, during peak traffic events, CA can automatically add additional nodes to ensure that new pods have the resources they need to operate effectively.
- Conversely, when traffic decreases, CA reduces the number of nodes in the cluster, optimizing operational costs by minimizing idle resources.
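The scale-up side of this behavior can be approximated as a bin-packing problem: given the CPU requests of pods that cannot be scheduled, how many additional nodes of a given size are needed? The sketch below uses a first-fit-decreasing heuristic. It is an assumption-laden simplification: the actual Cluster Autoscaler simulates the scheduler against node group templates and also considers memory, affinity rules, and taints. The function name and figures are illustrative.

```python
def nodes_needed(pending_cpu_millicores, node_allocatable_millicores):
    """Estimate how many new nodes are needed to fit pending pods (first-fit decreasing)."""
    free = []  # remaining CPU on each newly added node
    for request in sorted(pending_cpu_millicores, reverse=True):
        if request > node_allocatable_millicores:
            raise ValueError(f"pod requesting {request}m cannot fit on any node")
        for i, remaining in enumerate(free):
            if request <= remaining:
                free[i] -= request  # place the pod on the first node with room
                break
        else:
            # No existing new node has room, so add another node.
            free.append(node_allocatable_millicores - request)
    return len(free)


if __name__ == "__main__":
    # Hypothetical pending UPF and SMF pods after an HPA scale-out, in millicores.
    pending = [2000, 1500, 1500, 1000, 500]
    print(nodes_needed(pending, node_allocatable_millicores=4000))
```

In this example the five pending pods request 6500 millicores in total and fit onto two nodes with 4000 allocatable millicores each.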

Challenges & Solutions

HPA vs. VPA for 5G Stack:
- While HPA scales the number of pods, VPA adjusts resource allocation for existing pods. In a 5G Core, HPA is preferred for CNFs that can run as largely stateless replicas, such as the UPF, where sessions can be distributed across multiple pods. VPA is better suited to stateful functions like the AMF or SMF, which are harder to scale out. Because VPA typically applies new requests by recreating pods, and a restart can disrupt active sessions, its updates to these functions should be constrained, for example by limiting them to maintenance windows or protecting the pods with disruption budgets.
- A further challenge for HPA is scaling stateful 5G Core components such as the SMF, where state and session awareness are critical. When replicas are added or removed, service mesh traffic policies must be configured to maintain session persistence so that each session continues to reach a pod that holds its state.
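One general technique for keeping sessions pinned to pods while the replica count changes is hash-based routing. The sketch below uses rendezvous (highest-random-weight) hashing, under which adding a pod moves only the sessions that the new pod wins; every other session stays where it was. This illustrates the idea only and does not describe how any particular service mesh implements session persistence. The pod names and session identifiers are hypothetical.

```python
import hashlib


def pick_pod(session_id, pods):
    """Choose a pod for a session using rendezvous (highest-random-weight) hashing."""
    def weight(pod):
        digest = hashlib.sha256(f"{pod}|{session_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    # The pod with the highest hash weight for this session wins.
    return max(pods, key=weight)


if __name__ == "__main__":
    sessions = [f"session-{i}" for i in range(1000)]
    before = {s: pick_pod(s, ["smf-0", "smf-1", "smf-2"]) for s in sessions}
    # HPA scales out by one replica.
    after = {s: pick_pod(s, ["smf-0", "smf-1", "smf-2", "smf-3"]) for s in sessions}
    moved = [s for s in sessions if before[s] != after[s]]
    print(f"{len(moved)} of {len(sessions)} sessions moved, all to the new pod:",
          all(after[s] == "smf-3" for s in moved))
```

Scale-in is the harder case: sessions on a removed pod must be drained or have their state transferred before the pod terminates, which hashing alone does not solve.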
Cluster Scaling for 5G Workloads:
- Cluster Autoscaler is crucial for scaling the underlying node infrastructure in public cloud environments. In private (on-premises) deployments, scaling is constrained by the hardware that is physically available, so bare-metal capacity must be provisioned in advance to cover the expected scaling needs.
- For public cloud deployments, autoscaling policies can ensure that clusters in regions with high user activity (e.g., during major sporting events) are scaled preemptively, ensuring network performance does not degrade due to resource constraints.
Smart Workload Scheduling:
- By leveraging GitOps for consistent workload deployment across clusters, operators can ensure that changes are automatically applied across all clusters without manual intervention, reducing the risk of configuration drift.
- Smart workload scheduling is particularly critical while scaling is in progress. Policies for critical CNFs can reroute traffic during a scaling operation to prevent service degradation. For example, when scaling an SMF pod, traffic can temporarily be routed to other pods or clusters to ensure session continuity.

Implementation Case Study: Autoscaling in Action

Background: A major European telecom provider needed to ensure their 5G Core could dynamically adjust to the demands of millions of users across multiple countries. Their challenge was to implement an autoscaling solution that could efficiently scale in response to fluctuating demands while minimizing operational costs.

Implementation:

The telecom provider used HPA to scale AMF and SMF network functions based on CPU utilization and custom metrics such as active sessions.
VPA was implemented to dynamically adjust resource allocation for UPF and SMF pods, ensuring that critical network functions were appropriately resourced during peak times.
Cluster Autoscaler was employed to manage infrastructure costs, adding and removing nodes as demand fluctuated across their geographically distributed clusters.

Outcome: The telecom operator achieved seamless scalability, reducing operational costs while maintaining high availability and performance across their 5G network. Autoscaling mechanisms allowed them to respond dynamically to user demand without manual intervention, improving service reliability and reducing downtime.

Conclusion

The autoscaling capabilities of Kubernetes, combined with multi-cluster management and advanced observability tools, provide a robust framework for deploying scalable 5G Core networks. Real-world implementations, such as those at Rakuten Mobile and other leading telecom operators, demonstrate the effectiveness of these technologies in managing the dynamic demands of 5G networks. By leveraging HPA, VPA, and Cluster Autoscaler, operators can keep their 5G networks both cost-efficient and high-performing. In the next chapter, we will explore how service mesh technologies enhance connectivity and manage microservice communication within telco networks.
