Scaling
The gNMIc Operator supports horizontal scaling of collector clusters. This page explains how scaling works and outlines best practices for production deployments.
Scaling a Cluster
To scale a cluster, update the replicas field:
# Scale to 5 replicas
kubectl patch cluster my-cluster --type merge -p '{"spec":{"replicas":5}}'
Or edit the Cluster resource:
spec:
  replicas: 5 # Changed from 3
What Happens When You Scale
Scale Up (3 → 5 pods)
- Kubernetes creates new pods (gnmic-3, gnmic-4)
- Operator waits for pods to be ready
- Operator redistributes targets using bounded load rendezvous hashing
- Some targets move from existing pods to new pods
- Configuration is applied to all pods
Scale Down (5 → 3 pods)
- Operator redistributes targets away from pods being removed
- Configuration is applied to remaining pods
- Kubernetes terminates pods (gnmic-4, gnmic-3)
- Targets from terminated pods are handled by remaining pods
Target Redistribution
The operator uses bounded load rendezvous hashing to distribute targets:
- Stable: Same target tends to stay on same pod
- Even: Targets are distributed evenly (within 1-2 of each other)
Example Distribution
# 10 targets, 3 pods
Pod 0: [target1, target5, target8] (3 targets)
Pod 1: [target2, target4, target9] (3 targets)
Pod 2: [target3, target6, target7, target10] (4 targets)
# After scaling to 4 pods
Pod 0: [target1, target5, target8] (3 targets) - unchanged
Pod 1: [target2, target4] (2 targets) - lost target9
Pod 2: [target3, target7, target10] (3 targets) - lost target6
Pod 3: [target6, target9] (2 targets) - new pod
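The stability and evenness described above can be illustrated with a minimal Python sketch of rendezvous hashing with a load bound. This is not the operator's implementation: the hash function (SHA-256), the capacity rule (ceiling of targets divided by pods, plus an optional `slack`), and the processing order are all assumptions made for illustration.

```python
import hashlib
import math


def score(target: str, pod: str) -> int:
    """Deterministic rendezvous score for a (target, pod) pair (assumed hash: SHA-256)."""
    digest = hashlib.sha256(f"{target}|{pod}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def assign(targets: list[str], pods: list[str], slack: int = 0) -> dict[str, list[str]]:
    """Give each target its highest-scoring pod that still has spare capacity."""
    # Assumed load bound: no pod may hold more than ceil(targets / pods) + slack.
    capacity = math.ceil(len(targets) / len(pods)) + slack
    placement: dict[str, list[str]] = {pod: [] for pod in pods}
    for target in sorted(targets):
        # Rank pods by score, best first, and skip any pod that is already full.
        ranked = sorted(pods, key=lambda pod: score(target, pod), reverse=True)
        chosen = next(pod for pod in ranked if len(placement[pod]) < capacity)
        placement[chosen].append(target)
    return placement


targets = [f"target{i}" for i in range(1, 11)]
before = assign(targets, ["gnmic-0", "gnmic-1", "gnmic-2"])
after = assign(targets, ["gnmic-0", "gnmic-1", "gnmic-2", "gnmic-3"])

for pod, owned in after.items():
    print(pod, owned)
```

Because each target ranks pods independently, adding a pod only changes a target's first choice when the new pod outscores the others, which is why most targets tend to stay where they are. The capacity check is what keeps the per-pod counts close to each other.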
Best Practices
Start with Appropriate Size
Estimate based on:
- Number of targets
- Subscription frequency
- Data volume per target
- Number of outputs
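As a rough starting point, the factors above can be combined into a back-of-the-envelope estimate. The Python sketch below is only an illustration: the function name, its parameters, and especially `writes_per_pod_per_s` (the sustained write rate one pod can handle) are hypothetical and must be replaced with values measured in your own environment.

```python
import math


def estimate_replicas(
    targets: int,
    paths_per_target: int,
    sample_interval_s: float,
    outputs: int,
    writes_per_pod_per_s: float,
) -> int:
    """Hypothetical sizing helper: all inputs are assumptions, not operator settings."""
    # Updates received per second across all targets.
    updates_per_s = targets * paths_per_target / sample_interval_s
    # Every update is written to every configured output.
    writes_per_s = updates_per_s * outputs
    return max(1, math.ceil(writes_per_s / writes_per_pod_per_s))


# Illustrative numbers only: 300 targets, 50 paths each, sampled every 10 s,
# 2 outputs, and an assumed capacity of 1000 writes/s per pod.
print(estimate_replicas(300, 50, 10, 2, 1000))  # 3
```

Treat the result as an initial replica count, then adjust it based on the CPU and memory metrics of the running pods.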
Use Resource Limits
Ensure the cluster's pods have appropriate resource requests and limits:
spec:
  resources:
    requests:
      memory: "256Mi"
      cpu: "200m"
    limits:
      memory: "1Gi"
      cpu: "2"
Monitor Before Scaling
Check metrics before scaling:
# CPU usage per pod
rate(container_cpu_usage_seconds_total{pod=~"gnmic-.*"}[5m])
# Memory usage per pod
container_memory_usage_bytes{pod=~"gnmic-.*"}
# Targets per pod (from gNMIc metrics)
gnmic_target_status{cluster="my-cluster"}
Horizontal Pod Autoscaler
The operator’s Cluster resource supports the scale subresource, allowing you to enable automatic scaling using the Horizontal Pod Autoscaler (HPA).
To set up autoscaling, create an HPA resource that targets the Cluster resource. Specify the desired minimum and maximum number of replicas, as well as the metrics that will determine when scaling occurs:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: gnmic-c1-hpa
spec:
  scaleTargetRef:
    apiVersion: operator.gnmic.dev/v1alpha1
    kind: Cluster
    name: c1
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
Note: You must install the Kubernetes metrics server to enable HPA based on CPU or Memory:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Autoscaling based on custom metrics
gNMIc pods provide various Prometheus metrics that can be leveraged by an HPA resource for autoscaling.
One common use case is to scale based on the number of targets assigned to each Pod. The gNMIc pods export metrics like:
gnmic_target_up{name="default/leaf1"} 0
gnmic_target_up{name="default/leaf2"} 0
gnmic_target_up{name="default/spine1"} 1
Here, a value of 1 indicates that the target is present, while 0 denotes it is absent.
You can use the following PromQL to aggregate these into a “targets per pod” metric: sum(gnmic_target_up == 1) by (namespace, pod).
With Prometheus Adapter, the aggregated value can then be made available to the HPA as a per-pod metric, for example targets_per_pod{cluster="c1", pod="gnmic-c1-0"} = 1.
You can assign namespace and pod labels to metrics using scrape configurations or relabeling.
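To make the aggregation concrete, the following Python sketch mimics what `sum(gnmic_target_up == 1) by (namespace, pod)` computes over the three sample series shown earlier. The `namespace` and `pod` label values are assumptions added for illustration; in practice they come from your scrape configuration.

```python
from collections import defaultdict

# The three sample series from above, with assumed namespace/pod labels attached.
samples = [
    {"namespace": "default", "pod": "gnmic-c1-0", "name": "default/leaf1", "value": 0},
    {"namespace": "default", "pod": "gnmic-c1-0", "name": "default/leaf2", "value": 0},
    {"namespace": "default", "pod": "gnmic-c1-0", "name": "default/spine1", "value": 1},
]


def targets_per_pod(series):
    """Mimic `sum(gnmic_target_up == 1) by (namespace, pod)`."""
    totals = defaultdict(int)
    for sample in series:
        # The `== 1` comparison acts as a filter: series with any other value are dropped.
        if sample["value"] == 1:
            totals[(sample["namespace"], sample["pod"])] += sample["value"]
    return dict(totals)


print(targets_per_pod(samples))  # {('default', 'gnmic-c1-0'): 1}
```

Note that a pod whose targets all report 0 produces no entry at all rather than a value of 0, because the filter removes every one of its series before the sum is taken.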
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-adapter-rules
  namespace: monitoring
data:
  config.yaml: |
    rules:
      default: false
      custom:
        - seriesQuery: 'gnmic_target_up{namespace!="",pod!=""}'
          resources:
            overrides:
              namespace:
                resource: namespace
              pod:
                resource: pod
          name:
            matches: "^gnmic_target_up$"
            as: "gnmic_targets_present"
          metricsQuery: |
            sum(gnmic_target_up{<<.LabelMatchers>>} == 1) by (namespace, pod)
The corresponding HPA resource is shown below. It scales Cluster c1 up to a maximum of 10 replicas when the average number of targets present per pod rises above 30:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: gnmic-c1-hpa
spec:
  scaleTargetRef:
    apiVersion: operator.gnmic.dev/v1alpha1
    kind: Cluster
    name: c1
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: gnmic_targets_present
        target:
          type: AverageValue
          averageValue: "30"
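To get a feel for how this HPA reacts, the sketch below applies the replica calculation documented for the Kubernetes HPA, `ceil(currentReplicas * currentValue / targetValue)`, with the default 10% tolerance and the `minReplicas`/`maxReplicas` bounds from the manifest above. It is a simplification: the real controller also accounts for pods that are not ready, missing metrics, and stabilization windows. The function name and the input values are illustrative.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_avg: float,
    target_avg: float = 30,
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Simplified HPA calculation for an AverageValue target."""
    ratio = current_avg / target_avg
    # Within the tolerance band the HPA leaves the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # The result is always clamped to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


print(desired_replicas(3, 45))  # 5: 3 pods averaging 45 targets each
print(desired_replicas(3, 31))  # 3: within the 10% tolerance, no change
print(desired_replicas(8, 60))  # 10: capped at maxReplicas
```

Because the total number of targets is fixed, each scale-up lowers the per-pod average, so the replica count settles once the average falls back to around 30.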
Considerations
Output Connections
All pods connect to all outputs. For outputs like Kafka or Prometheus:
- Each pod writes to the same destination
- Data is naturally partitioned by target
- No deduplication needed
Stateless Operation
gNMIc pods are stateless by design:
- No persistent volumes required
- Configuration comes from the operator via the REST API
- Targets can move between pods without data loss