Optimize Kubernetes Resource Allocation with Robusta KRR

Robusta KRR is a CLI tool designed to optimize resource allocation in Kubernetes clusters. By analyzing historical pod usage data, it provides precise CPU and memory recommendations, reducing cloud costs and enhancing performance. Features include auto-apply mode, diverse data integrations, detailed explainability, and integration with a free SaaS UI.

The Robusta Kubernetes Resource Recommender (KRR) is a CLI tool for optimizing resource allocation in your Kubernetes clusters. It gathers pod usage data from monitoring systems such as Prometheus, Coralogix, Thanos, and Mimir, and provides precise recommendations for CPU and memory requests and limits. Right-sizing your containers this way reduces cloud costs and improves application performance.

Key Capabilities

  • Resource Optimization: Get recommendations for CPU and memory requests and limits based on historical data.
  • Cost Reduction & Performance Improvement: Efficiently manage resources to cut down on cloud expenditures and boost workload performance.
  • Auto-Apply Mode: Automate resource right-sizing by applying recommendations automatically. Contact us for beta access to this feature.

Data Integrations

KRR supports a wide array of data sources for gathering metrics:

  • Prometheus
  • Thanos
  • Victoria Metrics
  • Google Managed Prometheus
  • Amazon Managed Prometheus
  • Azure Managed Prometheus
  • Coralogix
  • Grafana Cloud
  • Grafana Mimir

Reporting and Operational Integrations

KRR seamlessly integrates with various platforms for viewing and acting on recommendations:

  • Web UI for visual recommendations
  • Slack notifications
  • k9s plugin
  • Azure Blob Storage export with Microsoft Teams notifications

Features

  • Agent-less Operation: Run KRR as a CLI tool on your local machine for immediate insights, or in-cluster for scheduled reports (e.g., weekly Slack reports).
  • Prometheus Integration: Leverage your existing Prometheus data for accurate recommendations.
  • Explainability: Understand the rationale behind each recommendation with detailed explanation graphs.
  • Extensible Strategies: Easily develop and implement custom strategies for calculating resource recommendations.
  • Free SaaS Platform: Access the free Robusta SaaS platform to visualize KRR recommendations and usage history.
  • Future Support: Upcoming versions will introduce support for custom resources (e.g., GPUs) and custom metrics.

Quantifying Cost Savings with KRR

A recent Sysdig study indicates that Kubernetes clusters typically exhibit:

  • 69% unused CPU
  • 18% unused memory

By using KRR to right-size your containers, you can reclaim much of this unused capacity and significantly reduce your cloud costs.

Robusta KRR vs. Kubernetes VPA

Here’s a comparison between Robusta KRR and Kubernetes Vertical Pod Autoscaler (VPA):

| Feature | Robusta KRR | Kubernetes VPA |
|---|---|---|
| Resource Recommendations | CPU/Memory requests and limits | CPU/Memory requests and limits |
| Installation Location | Not required inside the cluster; can run from your own device connected to a cluster | Must be installed inside the cluster |
| Workload Configuration | No need to configure a VPA object for each workload | Requires a VPA object per workload |
| Immediate Results | Immediate results (given Prometheus is running) | Needs time to gather data before recommending |
| Reporting | JSON, CSV, Markdown, Web UI, and more | Not supported |
| Extensibility | Add your own strategies with a few lines of Python | Limited extensibility |
| Explainability | Graphs explaining the recommendations | Not supported |
| Custom Metrics | Planned support in future versions | Not supported |
| Custom Resources | Planned support in future versions (e.g., GPU) | Not supported |
| Autoscaling | Planned support in future versions | Automatic application of recommendations |
| Default History | 14 days | 8 days |
| Supports HPA | Enable using the --allow-hpa flag | Not supported |

How KRR Works

KRR’s effectiveness stems from its intelligent metric gathering and algorithmic approach.

Metrics Gathering: Robusta KRR utilizes the following Prometheus queries to collect critical usage data:

  • CPU Usage: sum(irate(container_cpu_usage_seconds_total{namespace="{object.namespace}", pod="{pod}", container="{object.container}"}[{step}]))
  • Memory Usage: sum(container_memory_working_set_bytes{job="kubelet", metrics_path="/metrics/cadvisor", image!="", namespace="{object.namespace}", pod="{pod}", container="{object.container}"})
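As an illustration, the CPU query template above can be rendered for a concrete workload like this (the namespace, pod, and container names are examples, not part of KRR):

```python
def build_cpu_query(namespace: str, pod: str, container: str, step: str = "5m") -> str:
    """Render the CPU-usage query template for one concrete container."""
    return (
        f'sum(irate(container_cpu_usage_seconds_total'
        f'{{namespace="{namespace}", pod="{pod}", container="{container}"}}[{step}]))'
    )

print(build_cpu_query("default", "my-pod", "app"))
```

The rendered string can then be sent to Prometheus's standard /api/v1/query_range endpoint over the chosen history window.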

If you require custom metric support, please reach out to us. For a free breakdown of KRR recommendations, explore the Robusta SaaS platform.

Recommendation Algorithm (Default Strategy): By default, KRR employs a "simple" strategy, which can be customized via CLI arguments:

  • CPU: A request is set at the 95th percentile, with no limit. This ensures CPU requests are sufficient 95% of the time, while allowing pods to burst and utilize available node CPU in the remaining 5%.
  • Memory: The maximum memory value observed over the history window (14 days by default) is taken, with an additional 15% buffer.
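The arithmetic of the simple strategy can be sketched as follows; this is an illustrative reimplementation for clarity, not KRR's actual code:

```python
from typing import Sequence

def recommend_cpu(samples: Sequence[float], percentile: float = 95.0) -> float:
    """CPU request: the given percentile of observed usage (no limit is set)."""
    ordered = sorted(samples)
    # nearest-rank percentile
    idx = max(0, int(round(percentile / 100 * len(ordered))) - 1)
    return ordered[idx]

def recommend_memory(samples: Sequence[float], buffer_pct: float = 15.0) -> float:
    """Memory request/limit: the maximum observed usage plus a safety buffer."""
    return max(samples) * (1 + buffer_pct / 100)

cpu_usage = [0.05, 0.07, 0.06, 0.30, 0.08]   # cores
mem_usage = [200e6, 230e6, 210e6, 400e6]     # bytes

print(recommend_cpu(cpu_usage))      # 95th-percentile CPU request
print(recommend_memory(mem_usage))   # max memory + 15% buffer
```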

Prometheus Connection: KRR attempts to auto-discover the default Prometheus instance. Refer to the official documentation for details on this auto-discovery process.

Installation

Requirements: KRR requires:

  • Prometheus 2.26+
  • kube-state-metrics
  • cAdvisor

If you are using kube-prometheus-stack or Robusta's Embedded Prometheus, no additional setup is needed for metrics. For other setups, ensure the following metrics are available:

  • container_cpu_usage_seconds_total
  • container_memory_working_set_bytes
  • kube_replicaset_owner
  • kube_pod_owner
  • kube_pod_status_phase

Note: If any of the last three metrics are absent, KRR will still function but will only consider currently-running pods for recommendations, excluding historical pods that no longer exist.
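A quick preflight check for these metrics can be run against the standard Prometheus HTTP API; this sketch is illustrative, and the URL in the commented usage is an example:

```python
# Illustrative preflight check for the metrics KRR needs.
# /api/v1/label/__name__/values is the standard Prometheus endpoint that
# lists all metric names known to the server.
import json
from urllib.request import urlopen

REQUIRED = {
    "container_cpu_usage_seconds_total",
    "container_memory_working_set_bytes",
    "kube_replicaset_owner",
    "kube_pod_owner",
    "kube_pod_status_phase",
}

def missing_metrics(available_names: set) -> set:
    """Return the required metrics absent from the server's metric-name list."""
    return REQUIRED - available_names

# Live usage (requires a reachable Prometheus, e.g. via kubectl port-forward):
# names = set(json.load(urlopen("http://127.0.0.1:9090/api/v1/label/__name__/values"))["data"])
# print(missing_metrics(names))  # an empty set means KRR can use full history
```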

Installation Methods:

  • Brew (Mac/Linux):

    brew tap robusta-dev/homebrew-krr
    brew install krr
    krr --help
    krr simple # First launch might take longer
    
  • Windows: Install via Brew on WSL2, or install from source (see below).

  • Docker Image, Binaries, and Airgapped Installation (Offline Environments): Pre-built binaries are available from the Releases page, or you can use the pre-built Docker container. For instance, the container for version 1.8.3 is us-central1-docker.pkg.dev/genuine-flight-317411/devel/krr:v1.8.3. Note: Installing KRR from source in airgapped environments is not recommended due to Python dependency management complexities. Please use pre-built options or contact support for assistance.

  • In-Cluster Installation: Beyond CLI usage, KRR can run within your cluster. We recommend installing KRR via the Robusta Platform. This provides a free UI with features such as:

    • Visualization of application usage history graphs underlying recommendations.
    • Application, namespace, and cluster-level recommendations.
    • YAML configuration snippets for applying suggested recommendations.

    Alternatively, you can run KRR in-cluster as a Kubernetes Job if a UI is not required:

    kubectl apply -f https://raw.githubusercontent.com/robusta-dev/krr/refs/heads/main/docs/krr-in-cluster/krr-in-cluster-job.yaml
    
  • From Source: Requires Python 3.9 or greater.

    git clone https://github.com/robusta-dev/krr
    cd ./krr
    pip install -r requirements.txt
    python krr.py --help
    

    Note: When installed from source, commands like krr ... should be replaced with python krr.py ....

Custom Certificate Authority (CA) Trust: If your metrics provider's URL uses a certificate from a custom CA, base64-encode the certificate and store it in an environment variable named CERTIFICATE for KRR to trust it.
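For example, preparing the CERTIFICATE variable from a PEM file might look like this (the file path and helper name are illustrative, not part of KRR):

```python
import base64
import os

def load_custom_ca(pem_bytes: bytes) -> None:
    """Base64-encode a CA certificate and expose it via the CERTIFICATE env var."""
    os.environ["CERTIFICATE"] = base64.b64encode(pem_bytes).decode("ascii")

# Example (the path is illustrative):
# with open("/etc/ssl/my-ca.pem", "rb") as f:
#     load_custom_ca(f.read())
# ...then launch `krr simple` from this environment so it inherits the variable.
```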

Usage

Basic Usage:

krr simple

Tuning the Recommendation Algorithm (Strategy Settings): Key flags for strategy adjustments:

  • --cpu-min: Sets the minimum recommended CPU value in millicores.
  • --mem-min: Sets the minimum recommended memory value in MB.
  • --history_duration: Defines the duration of Prometheus historical data to use (in hours).

For more specific information on strategy settings, use krr simple --help.

Explicit Prometheus URL: If Prometheus auto-connection fails, use kubectl port-forward and specify the URL:

kubectl port-forward pod/kube-prometheus-st-prometheus-0 9090
# In a new terminal:
krr simple -p http://127.0.0.1:9090

Running on Specific Namespaces: Specify multiple namespaces or use regex patterns:

krr simple -n default -n ingress-nginx
krr simple -n default -n 'ingress-.*'

Note: Regex usage requires permissions to list namespaces in the target cluster.

Running on Workloads Filtered by Label: Apply a label selector to target specific workloads:

python krr.py simple --selector 'app.kubernetes.io/instance in (robusta, ingress-nginx)'

Grouping Jobs by Specific Labels: Consolidate resource recommendations for related batch jobs or data pipelines.

krr simple --job-grouping-labels app,team

This command will:

  • Group jobs that have either app or team labels (or both).
  • Create GroupedJob objects (e.g., app=frontend, team=backend).
  • Provide group-level recommendations, excluding individual jobs from regular listings.

You can specify multiple labels: krr simple --job-grouping-labels app,team,environment. A job with app=api,team=backend will appear in both the app=api and team=backend groups.

Limiting Jobs per Group: Use --job-grouping-limit <N> to cap the number of jobs included per group (default is 500).

krr simple --job-grouping-labels app,team --job-grouping-limit 3

This caps each label group at the given number of jobs (here, the first 3 returned by the API); other matching jobs are ignored for that group.
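The grouping semantics described above can be sketched as follows (an illustrative reimplementation, not KRR's code):

```python
# Each job joins one group per matching label, so a single job can appear
# in several groups; the limit caps how many jobs each group accepts.
from collections import defaultdict

def group_jobs(jobs: list, grouping_labels: list, limit: int = 500) -> dict:
    groups = defaultdict(list)
    for job in jobs:
        for label in grouping_labels:
            value = job["labels"].get(label)
            if value is not None and len(groups[f"{label}={value}"]) < limit:
                groups[f"{label}={value}"].append(job["name"])
    return dict(groups)

jobs = [
    {"name": "job-1", "labels": {"app": "api", "team": "backend"}},
    {"name": "job-2", "labels": {"app": "api"}},
    {"name": "job-3", "labels": {"team": "backend"}},
]
print(group_jobs(jobs, ["app", "team"]))  # job-1 lands in both groups
```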

Overriding the kubectl Context: Run KRR against different Kubernetes contexts:

krr simple -c my-cluster-1 -c my-cluster-2

Output Formats for Reporting: KRR supports various output formats:

  • table (default CLI table, powered by Rich library)
  • json
  • yaml
  • pprint (Python's pprint library representation)
  • csv (exports data to a CSV file)
  • csv-raw (CSV with raw calculation data)
  • html

To use a specific formatter, add the -f flag. Combine with --fileoutput <filename> for clean output without logs:

krr simple -f json --fileoutput krr-report.json

Alternatively, use --logtostderr to separate formatted output from logs:

krr simple --logtostderr -f json > result.json 2> logs-and-errors.log

Prometheus Authentication: KRR supports all standard authentication schemes for Prometheus, VictoriaMetrics, Coralogix, and other compatible metric stores. Refer to krr simple --help for flags like --prometheus-url, --prometheus-auth-header, --prometheus-headers, --prometheus-ssl-enabled, --coralogix-token, and various --eks-* flags. Contact support for assistance.

Debug Mode: Enable additional debug logs with:

krr simple -v

Data Source Configuration

Prometheus, Victoria Metrics, and Thanos Auto-Discovery: KRR automatically attempts to discover running Prometheus, Victoria Metrics, and Thanos instances by scanning services for specific labels:

  • Prometheus: "app=kube-prometheus-stack-prometheus", "app=prometheus,component=server", "app=prometheus-server", "app=prometheus-operator-prometheus", "app=rancher-monitoring-prometheus", "app=prometheus-prometheus"
  • Thanos: "app.kubernetes.io/component=query,app.kubernetes.io/name=thanos", "app.kubernetes.io/name=thanos-query", "app=thanos-query", "app=thanos-querier"
  • Victoria Metrics: "app.kubernetes.io/name=vmsingle", "app.kubernetes.io/name=victoria-metrics-single", "app.kubernetes.io/name=vmselect", "app=vmselect"

If auto-discovery fails, provide the Prometheus URL explicitly using the -p flag.

Scanning with a Centralized Prometheus (Multi-Cluster): For Prometheus instances monitoring multiple clusters, specify the cluster label defined in Prometheus. Example: If your cluster has the Prometheus label cluster: "my-cluster-name":

python krr.py simple --prometheus-label cluster -l my-cluster-name

You may also need to use the -p flag for the Prometheus URL.

Azure Managed Prometheus: Generate an access token to use Azure Managed Prometheus:

# az login  # uncomment if not logged in
AZURE_BEARER=$(az account get-access-token --resource=https://prometheus.monitor.azure.com --query accessToken --output tsv)
echo $AZURE_BEARER
python krr.py simple --namespace default -p PROMETHEUS_URL --prometheus-auth-header "Bearer $AZURE_BEARER"

Refer to documentation for configuring labels with centralized Prometheus.

Google Managed Prometheus (GMP): Detailed GMP usage instructions are available here.

Amazon Managed Prometheus: To use Amazon Managed Prometheus, provide your Prometheus link and the --eks-managed-prom flag. KRR will automatically use your AWS credentials.

python krr.py simple -p "https://aps-workspaces.REGION.amazonaws.com/workspaces/..." --eks-managed-prom

Optional parameters include:

  • --eks-profile-name PROFILE_NAME_HERE (to specify an AWS profile)
  • --eks-access-key ACCESS_KEY (to specify your access key)
  • --eks-secret-key SECRET_KEY (to specify your secret key)
  • --eks-service-name SERVICE_NAME (to use a specific service name in the signature)
  • --eks-managed-prom-region REGION_NAME (to specify the Prometheus region)

Refer to documentation for configuring labels with centralized Prometheus.

Coralogix Managed Prometheus: Specify your Coralogix Prometheus link and the --coralogix_token flag with your Logs Query Key.

python krr.py simple -p "https://prom-api.coralogix..." --coralogix_token YOUR_LOGS_QUERY_KEY

Refer to documentation for configuring labels with centralized Prometheus.

Grafana Cloud Managed Prometheus: Provide your Prometheus link, Prometheus user, and an access token (with metrics:read scope) from your Grafana Cloud stack. These details are found in the Grafana Cloud Portal.

PROM_URL="YOUR_PROMETHEUS_URL"
PROM_USER="YOUR_PROMETHEUS_USER"
PROM_TOKEN="YOUR_ACCESS_TOKEN"
python krr.py simple -p $PROM_URL --prometheus-auth-header "Bearer ${PROM_USER}:${PROM_TOKEN}" --prometheus-ssl-enabled

Refer to documentation for configuring labels with centralized Prometheus.

Grafana Mimir Auto-Discovery: KRR attempts to auto-discover Grafana Mimir by scanning services for the label "app.kubernetes.io/name=mimir,app.kubernetes.io/component=query-frontend".

Integrations

Free UI for KRR Recommendations (Robusta SaaS Platform): We highly recommend utilizing the free Robusta SaaS platform for an enhanced KRR experience. This platform allows you to:

  • Understand individual application recommendations with historical usage data.
  • Sort and filter recommendations by namespace, priority, and other criteria.
  • Generate YAML snippets to facilitate the application of KRR's suggested fixes.
  • Analyze impact through a comprehensive KRR scan history.

Slack Notification: Automate cost savings by receiving Slack notifications for recommendations exceeding a specified threshold. You can opt for a weekly global report or team-specific reports.

  • Prerequisites: A Slack workspace.
  • Setup:
    1. Install Robusta with Helm into your cluster and configure Slack.
    2. Create your KRR Slack playbook by adding the following to generated_values.yaml:
      customPlaybooks:
      - triggers:
        - on_schedule:
            fixed_delay_repeat:
              repeat: -1 # Run forever
              seconds_delay: 604800 # 1 week
        actions:
        - krr_scan:
            args: "--namespace devs-namespace" ## KRR arguments here
        sinks:
            - "main_slack_sink" # Slack sink for the report
      
    3. Apply the new values with a Helm upgrade:
      helm upgrade robusta robusta/robusta --values=generated_values.yaml --set clusterName=<YOUR_CLUSTER_NAME>
      

k9s Plugin: Install the KRR k9s Plugin to view recommendations directly within your deployments, daemonsets, and statefulsets views. The plugin is named resource recommender. Refer to the k9s documentation for installation instructions.

Azure Blob Storage Export with Microsoft Teams Notifications: Export KRR reports directly to Azure Blob Storage and receive notifications in Microsoft Teams when reports are generated.

  • Prerequisites:
    • An Azure Storage Account with a container for reports.
    • A Microsoft Teams channel with an incoming webhook configured.
    • An Azure SAS URL with write permissions to your storage container.
  • Setup:
    1. Create Azure Storage Container: Set up a container (e.g., fileuploads) in your Azure Storage Account.
    2. Generate SAS URL: Create a SAS URL for your container with write permissions. Example SAS URL format (replace with your actual values): https://yourstorageaccount.blob.core.windows.net/fileuploads?sv=2024-11-04&ss=bf&srt=o&sp=wactfx&se=2026-07-21T21:12:48Z&st=2025-07-21T12:57:48Z&spr=https&sig=...
    3. Configure Teams Webhook: Set up an incoming webhook in your Microsoft Teams channel (found in the Workflows tab).
    4. Run KRR with Azure Integration:
      krr simple -f html \
        --azurebloboutput "https://yourstorageaccount.blob.core.windows.net/fileuploads?sv=..." \
        --teams-webhook "https://your-teams-webhook-url" \
        --azure-subscription-id "your-subscription-id" \
        --azure-resource-group "your-resource-group"
      
  • Features:
    • Automatic File Upload: Reports are automatically uploaded to Azure Blob Storage with timestamped filenames.
    • Teams Notifications: Rich adaptive cards are sent to Teams upon report generation.
    • Direct Links: Teams notifications include direct links to view files in the Azure Portal.
    • Multiple Formats: Supports all KRR output formats (JSON, CSV, HTML, YAML, etc.).
    • Secure: Utilizes SAS URLs for secure, time-limited access to your storage.
  • Command Options:
    • --azurebloboutput: Azure Blob Storage SAS URL base path (include container name; filename is auto-appended).
    • --teams-webhook: Microsoft Teams webhook URL for notifications.
    • --azure-subscription-id: Azure Subscription ID (for Azure Portal links in Teams).
    • --azure-resource-group: Azure Resource Group name (for Azure Portal links in Teams).
  • Example Usage:
    • Basic Azure Blob export:
      krr simple -f json --azurebloboutput "https://mystorageaccount.blob.core.windows.net/reports?sv=..."
      
    • With Teams notifications:
      krr simple -f html \
        --azurebloboutput "https://mystorageaccount.blob.core.windows.net/reports?sv=..." \
        --teams-webhook "https://outlook.office.com/webhook/..." \
        --azure-subscription-id "12345678-1234-1234-1234-123456789012" \
        --azure-resource-group "my-resource-group"
      
  • Teams Notification Features: The Teams adaptive card provides:
    • Report generation announcement.
    • Namespace and format details.
    • Generation timestamp.
    • Storage account and container information.
    • A direct "View in Azure Storage" button linking to the Azure Portal.

Advanced Customization

Creating a Custom Strategy/Formatter: Refer to the examples directory in the KRR repository for guidance on creating custom strategies and formatters.
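As a rough, hypothetical illustration of what a custom strategy computes (the real base classes live in robusta_krr and are shown in the examples directory; all names here are invented for the sketch):

```python
from dataclasses import dataclass

@dataclass
class ResourceRecommendation:
    cpu_request: float     # cores
    memory_request: float  # bytes

class PercentileStrategy:
    """Recommend CPU at a configurable percentile and memory at max + buffer."""

    def __init__(self, cpu_percentile: float = 90.0, memory_buffer: float = 0.10):
        self.cpu_percentile = cpu_percentile
        self.memory_buffer = memory_buffer

    def run(self, cpu_samples: list, memory_samples: list) -> ResourceRecommendation:
        ordered = sorted(cpu_samples)
        # nearest-rank percentile for CPU, max-plus-buffer for memory
        idx = max(0, int(round(self.cpu_percentile / 100 * len(ordered))) - 1)
        return ResourceRecommendation(
            cpu_request=ordered[idx],
            memory_request=max(memory_samples) * (1 + self.memory_buffer),
        )
```

A real strategy plugs the same kind of computation into KRR's strategy interface so it can be selected from the CLI.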

Testing: KRR uses pytest for testing.

  1. Install the project manually.
  2. Navigate to the project root directory.
  3. Install poetry.
  4. Install dev dependencies: poetry install --group dev.
  5. Install robusta_krr as an editable dependency: pip install -e . (run from the project root).
  6. Run tests: poetry run pytest.

Contributing: Contributions are highly valued. If you have suggestions, fork the repository and create a pull request, or open an issue with the "enhancement" tag. Don't forget to star the project!

  1. Fork the Project.
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature).
  3. Commit your Changes (git commit -m 'Add some AmazingFeature').
  4. Push to the Branch (git push origin feature/AmazingFeature).
  5. Open a Pull Request.

License: Distributed under the MIT License. See LICENSE.txt for more details.

Support: For questions, contact support@robusta.dev or reach out on robustacommunity.slack.com.