How to Tune Custom HPA and KEDA Scalers


Sam Farid

CTO/Founder

2025-07-31
[Image: the lifecycle of how KEDA interacts with RabbitMQ]

Kubernetes autoscalers can react to custom metrics such as queue depth or request rate, but when you step off the beaten path of CPU / memory, you need to understand exactly how the scaler translates those numbers into replicas and how to tune the targets for your workload.

How HPA turns a metric into replicas

Take this config snippet from the KEDA RabbitMQ scaler docs (QueueLength mode):

triggers:
- type: rabbitmq
  metadata:
    mode: QueueLength
    host: amqp://localhost:5672/vhost
    queueName: testqueue
    protocol: auto
    value: "100.5"
    activationValue: "10.5"

1. The metric adapter (e.g. KEDA scaler or Prometheus adapter) polls the source at some interval (default 30s) and publishes the current raw value to 'external.metrics.k8s.io'.
2. Then the HPA controller wakes up every 15s to retrieve that value, divides it by the target value per pod (100.5 in this case), and rounds up:

    desiredReplicas = ceil(currentMetric / target)

The docs aren't very explicit about this behavior, so just to be very clear:

The configured target value is how many messages each pod is expected to handle.

Or rephrased, one pod should keep the metric at or below this number. So in this example, a queue length of 1000 pending messages means the HPA will scale up to ceil(1000 / 100.5) = 10 pods. If this contract is broken (i.e. the value is incorrectly tuned), performance will suffer.
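
For context, that trigger lives inside a ScaledObject, which also sets the replica bounds that the formula's result gets clamped to. Here's a minimal sketch; the workload name, replica bounds, and polling interval are illustrative assumptions, not recommendations:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: queue-consumer-scaler   # hypothetical name
spec:
  scaleTargetRef:
    name: queue-consumer        # hypothetical Deployment to scale
  pollingInterval: 30           # seconds between polls of RabbitMQ (KEDA's default)
  minReplicaCount: 1
  maxReplicaCount: 50           # desiredReplicas is capped here no matter how deep the queue gets
  triggers:
  - type: rabbitmq
    metadata:
      mode: QueueLength
      host: amqp://localhost:5672/vhost
      queueName: testqueue
      protocol: auto
      value: "100.5"            # target messages per pod
      activationValue: "10.5"   # below this raw value, the trigger is considered inactive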

How to tune a custom metric value

Before touching any numbers, create a dashboard with these live signals:

| Category | Metric | Why it matters |
| --- | --- | --- |
| Scaler metric | queue length, lag seconds, RPS, etc. | Direct input to HPA |
| Work in flight | backlog age or growth rate | Shows whether users are waiting |
| User latency | p95 / p99 end to end | Ultimate SLO (non-negotiable) |
| Resource headroom | pod CPU %, memory %, throttling | Ensures a pod can actually do more work if you raise the target |
| Churn | replica count and scaling events/min | High volatility means wasted nodes and cold-start latency |

Then look for patterns. Some simple examples:

  • Backlog is increasing but CPU utilization is low -> target value is too low
  • CPU utilization is too high -> target value is too high or CPU request is too small
  • Pod counts are flapping with small traffic changes -> look at your 'stabilizationWindow' or 'activationThreshold' behavior fields (see the sketch after this list)
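
Both of those knobs live on the ScaledObject: 'activationValue' sits on the trigger itself, while stabilization windows pass through KEDA's 'advanced' section to the underlying HPA behavior fields. A minimal sketch that nests under the same spec as above; the 300s window is illustrative, not a recommendation:

spec:
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 300  # scale down only to the highest replica count computed over the last 5 min
  triggers:
  - type: rabbitmq
    metadata:
      mode: QueueLength
      host: amqp://localhost:5672/vhost
      queueName: testqueue
      value: "100.5"
      activationValue: "10.5"              # flapping near zero? raise this so tiny blips don't activate the scaler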

This is an iterative process. It's a best practice to make incremental (<=30%) changes and update only one field (value, CPU request, stabilization) at a time, so you can tell which change caused which behavior. A single iteration might look like the sketch below.
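
For instance, one round against the earlier config might lower only the per-pod target and leave CPU requests and behavior fields untouched (the new number is illustrative):

  triggers:
  - type: rabbitmq
    metadata:
      mode: QueueLength
      queueName: testqueue
      value: "70"   # was "100.5": a single ~30% step down; change nothing else this round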

Some specific examples

| Scaler (metric) | Starter target | Symptom when wrong | Possible fixes |
| --- | --- | --- | --- |
| QueueLength on RabbitMQ/Redis | 10 msgs | Backlog > 1000 while CPU <= 25% | Lower to 5, or keep 10 and raise 'maxReplicas'; verify 'channel.prefetch' isn't the bottleneck |
| LagSeconds for Kafka | 60s | Consumer lag > 300s; CPU > 80% | Halve the target, shorten the polling interval to 10s, tune 'fetch.max.bytes' |
| RequestsPerSecond from an API | 200 rps | p95 latency > 700 ms while CPU ~= 70% | Drop to 150 rps, add a joint CPU scaler at 75% |
| DB connection count | 50 conns | "Too many connections" errors despite low CPU | Reduce to 30 conns, bump the DB pool size, use 'cooldownPeriod' to stagger ramps |
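
As one concrete sketch, the "joint CPU scaler" fix from the RPS row is just a second trigger on the same ScaledObject; the HPA computes a replica count for each metric and takes the highest. The Prometheus address and query below are placeholder assumptions:

triggers:
- type: prometheus
  metadata:
    serverAddress: http://prometheus.monitoring:9090      # placeholder address
    query: sum(rate(http_requests_total{app="api"}[2m]))  # placeholder RPS query
    threshold: "150"          # target RPS per pod
- type: cpu
  metricType: Utilization
  metadata:
    value: "75"               # also scale when average CPU utilization crosses 75%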

Ongoing efforts

Traffic patterns, features, feature flags, and the underlying infrastructure all shift often. If you do find the right balance of configs, document the tuning process in a playbook so that when your initial assumptions no longer hold, you can revisit and iterate quickly.

Dialed-in scalers are the quiet guardians of a healthy platform: invisible when traffic is calm, quick to scale during spikes, and money-savers the rest of the time.

If this sounds a bit tedious, consider having Flightcrew manage your autoscaling configs for you, automatically sending PRs that keep scalers and custom values timely and optimal. Shoot us a note at hello@flightcrew.io if you'd like to learn more.


Sam Farid

CTO/Founder

Before founding Flightcrew, Sam was a tech lead at Google, ensuring the integrity of YouTube viewcount and then advancing network throughput and isolation at Google Cloud Serverless. A Dartmouth College graduate, he began his career at Index (acquired by Stripe), where he wrote foundational infrastructure code that still powers Stripe servers. Find him on Bluesky or at holosam.dev.
