Scaling Kubernetes with KEDA and Prometheus
Tim Nichols
CEO/Founder

TL;DR - Setting KEDA to scale on Prometheus metrics lets you apply powerful, flexible autoscaling rules to any metric you're already collecting. Here's a quick guide!
KEDA 101
Kubernetes Event-Driven Autoscaling (KEDA) extends Horizontal Pod Autoscaling to act on custom events and metrics. Specifically, KEDA opens up:
- Event-Driven Scaling: react to reactive or deterministic triggers (ex: batch jobs or push notifications)
- Custom-Metric Scaling: scale on metrics (ex: queue size) that are a more accurate and responsive measure of load
- Scale-to-Zero Behavior: Pods can scale down completely when no load is present.
Installing KEDA can improve the accuracy and responsiveness of Kubernetes autoscaling, and it lets more types of workloads enjoy the benefits of horizontal autoscaling. Read more here
Scaling KEDA with Prometheus
Prometheus is the de facto open-source toolkit for monitoring Kubernetes-based workloads. By pairing Prometheus with KEDA’s Prometheus scaler, you can seamlessly leverage any observed metric—even custom business metrics—as the trigger for scaling.
For example, you can:
- Scale on HTTP Traffic: when http_requests_total rises above a threshold, spin up more Pods.
- Scale on Custom SLIs: e.g., an orders_unprocessed metric you track in Prometheus (sketched after this list).
- Scale on Resource Metrics: drive CPU- or memory-based scaling through the same standardized PromQL interface.
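As a sketch of the custom-SLI case, here is what a Prometheus trigger might look like. The orders_unprocessed metric and the server address are placeholder assumptions; the full ScaledObject wrapper is covered step-by-step below.
triggers:
- type: prometheus
  metadata:
    serverAddress: http://prometheus-server.default.svc.cluster.local
    query: sum(orders_unprocessed)  # any PromQL expression can drive scaling
    threshold: "50"                 # target value per replica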
Pros:
- Unified Observability: Piggyback on the metrics you already collect in Prometheus.
- Extremely Flexible: Virtually any PromQL expression can drive autoscaling.
- Business-Centric Metrics: Scale on real business events (ex: num_orders) as the true measure of load, instead of simple utilization metrics.
Cons:
- Query Performance: Overly complex or frequent queries can stress Prometheus.
- Metric Hygiene Needed: This applies to Prometheus as a whole, but inconsistent labeling or low-quality metrics can cause both direct and downstream issues.
Step-by-Step Guide to Scaling KEDA on Prometheus
1. Install Prometheus
Deploy Prometheus using Helm:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/prometheus
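To confirm the install, check that the Prometheus pods are up; with the release name above you should see a prometheus-server pod in the Running state:
kubectl get pods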
2. Install KEDA
Deploy KEDA to your Kubernetes cluster using Helm:
helm repo add kedacore https://kedacore.github.io/charts
helm install keda kedacore/keda --namespace keda --create-namespace
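Verify that KEDA's operator and metrics API server pods are running in their namespace:
kubectl get pods -n keda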
3. Create a ScaledObject Trigger
Define a ScaledObject to scale based on a Prometheus query. In the example below, the payments Deployment scales on its HTTP request rate; and because KEDA supports scale-to-zero, if the query reports no requests over the past 5 minutes, KEDA can scale the application down to zero pods until traffic resumes.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: prometheus-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    name: payments  # the Deployment to scale
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus-server.default.svc.cluster.local
      metricName: http_requests_total  # display name for the metric
      query: sum(rate(http_requests_total[5m]))  # PromQL that returns the scaling signal
      threshold: "100"  # target value per replica
4. Apply the Configuration
Save the ScaledObject definition to a file, e.g., prometheus-scaledobject.yaml, and apply it:
kubectl apply -f prometheus-scaledobject.yaml
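Behind the scenes, KEDA creates and manages an HPA for each ScaledObject (named keda-hpa-<scaledobject-name>). You can inspect both:
kubectl get scaledobject prometheus-scaledobject
kubectl get hpa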
5. Test the Autoscaling
Generate traffic to your app and observe the scaling behavior:
kubectl get hpa -w
You should see the Horizontal Pod Autoscaler (HPA) that KEDA manages dynamically adjust the number of pods based on the HTTP request rate.
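One quick way to generate traffic is a throwaway busybox Pod that hammers the service in a loop. This assumes a payments Service listening on port 80 in the default namespace:
kubectl run load-gen --rm -it --image=busybox --restart=Never -- \
  /bin/sh -c "while true; do wget -q -O- http://payments.default.svc.cluster.local; done"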
Scaling to Zero on Prometheus
Another way to scale to zero is to use relevant business metrics stored in Prometheus. For example, if Prometheus indicates there are no pending orders for this workload (e.g., orders_pending is below the threshold), KEDA can scale your application down to zero pods until orders resume.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: orders-scaledobject  # distinct name so it can coexist with the example above
  namespace: default
spec:
  scaleTargetRef:
    name: payments
  minReplicaCount: 0  # explicitly allow scale-to-zero (0 is also KEDA's default)
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus-server.default.svc.cluster.local
      metricName: orders_pending
      query: sum(orders_pending)
      threshold: "10"
Best Practices & Edge Cases
These apply to Prometheus as a whole, but they matter doubly when queries drive autoscaling:
- Subqueries & Summaries: Be selective with your PromQL to avoid “noisy” signals or performance overhead.
- Tune Query Frequency: Frequent polling makes scaling more responsive but adds load on Prometheus; balance the two with KEDA's pollingInterval.
- Verify Metric Consistency: Use consistent labels to ensure you’re querying the right timeseries.
- Prevent Churn: Avoid flapping by introducing a cooldown or smoothing your query metrics (a sketch follows this list).
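Here's a sketch combining both ideas: avg_over_time smooths a spiky request rate via a PromQL subquery, and cooldownPeriod delays scale-to-zero. Metric names and values are illustrative assumptions:
spec:
  cooldownPeriod: 300   # wait 5 minutes of no activity before scaling to zero
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus-server.default.svc.cluster.local
      # smooth the rate over 10 minutes so short spikes don't trigger churn
      query: avg_over_time(sum(rate(http_requests_total[5m]))[10m:1m])
      threshold: "100"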
Integrating KEDA with Prometheus opens up a full universe of scaling options - this can be overwhelming! Check out our tutorial on matching the right scaling strategies to your workloads.
Prometheus, KEDA and Flightcrew
Combining Prometheus with KEDA is a natural integration that extends horizontal autoscaling across your observability stack.
Once you've set up KEDA to scale on Prometheus, you'll need to tune your KEDA config so that it aligns with your pod resources and your underlying node lifecycle.
Flightcrew is an AI tool that can help with this, and other production engineering tasks. Let us know if we can help.
Tim Nichols
CEO/Founder
Tim was a Product Manager on Google Kubernetes Engine and led Machine Learning teams at Spotify before starting Flightcrew. He graduated from Stanford University and lives in New York City. Follow on Bluesky