Stratos

Stratos is a Kubernetes operator that maintains pools of pre-warmed, reusable Kubernetes nodes. Nodes are launched once, fully initialized, then stopped and restarted on demand — giving you instant capacity with warm caches, pre-pulled images, and zero cold-start overhead.

The Problem

When Kubernetes needs more capacity, every existing autoscaler gives you a brand new machine. That means:

  1. Provisioning — Wait for the cloud provider to allocate and launch an instance
  2. Booting — OS initialization, kubelet startup, cluster join
  3. Networking — CNI plugin initialization, IP allocation
  4. Image pulls — Every DaemonSet image downloaded from scratch
  5. Application startup — Your workload's images pulled, caches empty, no local state

Cluster Autoscaler takes 3-8 minutes. Karpenter brought this down to ~40-50 seconds. But even at Karpenter speed, you still get a cold node every time — empty caches, no pre-pulled images, no local state. For workloads like CI/CD pipelines, LLM inference, or bursty applications, the cold environment is just as painful as the wait.

The Solution

Stratos takes a fundamentally different approach: nodes are initialized once and reused.

  1. Warmup — Stratos launches instances that join the cluster, initialize CNI, pull all DaemonSet images, run any custom setup, then self-stop
  2. Standby — Stopped instances sit in a pool, costing only EBS storage. The disk retains everything: images, caches, local state
  3. Scale-up — When pods are pending, Stratos starts a standby instance. Since the node is already initialized, it's ready in ~20 seconds
  4. Scale-down — Empty nodes are drained and stopped (not terminated), returning to standby with all their state intact

The key insight: Stratos stops and starts nodes instead of terminating and recreating them. This means every scale-up benefits from everything the node has accumulated — Docker layer caches, package manager caches, pre-pulled images, downloaded models, and any other local state.

Not Just Faster Boot — Faster Everything

Traditional autoscalers measure success by "time to node ready." Stratos is faster there too (~20 seconds vs ~40-50 seconds). But the real advantage is what happens after the node is ready:

| | Traditional Autoscaler | Stratos |
|---|---|---|
| Node provisioning | Launch new instance every time | Start existing instance (~20s) |
| DaemonSet images | Pull from registry every time | Already on disk |
| Application images | Pull from registry every time | Already on disk (if previously run) |
| Docker build cache | Empty | Warm from previous runs |
| Package manager cache | Empty (npm install from scratch) | Warm (node_modules cached) |
| Model weights | Download every time (10+ min for LLMs) | Already on disk |
| OS/system caches | Cold | Warm |

A Karpenter node is ready in ~40 seconds, then your CI pipeline spends another 5 minutes pulling images and rebuilding dependencies. A Stratos node is ready in ~20 seconds, and your pipeline starts with warm caches from the last run.

Use Cases

CI/CD Pipelines

CI agents on Kubernetes typically get a fresh node with empty caches. Every docker build, npm install, or go mod download starts from scratch. Stratos nodes retain build caches across runs — your second pipeline is dramatically faster than the first, and every run after that benefits from the warm cache.
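
As an illustrative sketch, a CI agent pod can pin itself to the warm pool and mount a hostPath directory that survives node stop/start. The `stratos.sh/pool: workers` label comes from the Quick Start below; the runner image and cache path are placeholders:

```yaml
# Hypothetical CI runner pod. Only the stratos.sh/pool label is from the
# Quick Start example; the image and cache path are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: ci-runner
spec:
  nodeSelector:
    stratos.sh/pool: workers              # schedule onto the pre-warmed pool
  containers:
    - name: runner
      image: example.com/ci-runner:latest # placeholder image
      volumeMounts:
        - name: build-cache
          mountPath: /cache               # e.g. npm / Go module caches
  volumes:
    - name: build-cache
      hostPath:
        path: /var/cache/ci               # lives on the node's EBS volume
        type: DirectoryOrCreate
```

Because the node is stopped rather than terminated, `/var/cache/ci` is still populated the next time the pool scales up.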

LLM / AI Model Serving

Model images are often 10-50GB+. Downloading them on every scale-up makes autoscaling impractical. With Stratos, the model image is pre-pulled during warmup and persists on the node's EBS volume. Scaling out goes from 15+ minutes to under 2 minutes.
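
Building on the `preWarm.imagesToPull` field mentioned under Key Features, a sketch of pre-pulling a large model image during warmup might look like the following. The exact placement of `preWarm` in the NodePool spec, and the image name, are assumptions for illustration:

```yaml
# Sketch: pre-pull a large serving image during warmup so scale-ups
# skip the download. The preWarm placement and image are assumptions.
apiVersion: stratos.sh/v1alpha1
kind: NodePool
metadata:
  name: llm-serving
spec:
  poolSize: 5
  minStandby: 2
  preWarm:
    imagesToPull:
      - ghcr.io/example/llama-serving:latest  # placeholder multi-GB image
  template:
    nodeClassRef:
      kind: AWSNodeClass
      name: workers
```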

Scale-to-Zero

Stratos's ~20-second startup makes true scale-to-zero viable. Pair it with an ingress doorman that holds requests for up to 30 seconds — when traffic hits a scaled-down service, a standby node starts and begins serving before the timeout. No idle compute, no cold-start frustration.
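
The "doorman" itself is not part of Stratos. As one partial illustration using standard ingress-nginx annotations, you can stretch upstream timeouts so an in-flight request can wait out the ~20-second node start; note these only help once a backend endpoint exists, so truly holding requests at zero endpoints still needs a dedicated doorman component:

```yaml
# Illustrative only: standard ingress-nginx timeout annotations, sized to
# outlast a standby-node start. Hostname and service are placeholders.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: bursty-service
  annotations:
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "30"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "60"
spec:
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: bursty-service
                port:
                  number: 80
```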

See the Use Cases guide for detailed configurations.

Key Features

  • Instant capacity: Start pre-warmed nodes in ~20 seconds
  • Warm caches: Nodes retain Docker layers, build caches, downloaded models, and local state across restarts
  • Pre-pulled images: DaemonSet images pulled automatically during warmup; configure additional images via preWarm.imagesToPull
  • Cost-efficient: Stopped instances only incur EBS storage costs
  • CNI-aware: Handles startup taints for VPC CNI, Cilium, and Calico
  • Kubernetes-native: Declarative NodePool and AWSNodeClass CRDs
  • Cloud-agnostic design: Built with a provider abstraction layer (AWS supported)

Quick Start

1. Install with Helm

helm install stratos oci://ghcr.io/stratos-sh/charts/stratos \
  --namespace stratos-system --create-namespace \
  --set clusterName=my-cluster

2. Create an AWSNodeClass and NodePool

awsnodeclass.yaml
apiVersion: stratos.sh/v1alpha1
kind: AWSNodeClass
metadata:
  name: workers
spec:
  instanceType: m5.large
  ami: ami-0123456789abcdef0
  subnetIds: ["subnet-12345678"]
  securityGroupIds: ["sg-12345678"]
  iamInstanceProfile: arn:aws:iam::123456789012:instance-profile/node-role
  userData: |
    #!/bin/bash
    /etc/eks/bootstrap.sh my-cluster \
      --kubelet-extra-args '--register-with-taints=node.eks.amazonaws.com/not-ready=true:NoSchedule'
    until curl -sf http://localhost:10248/healthz; do sleep 5; done
    sleep 30
    poweroff
nodepool.yaml
apiVersion: stratos.sh/v1alpha1
kind: NodePool
metadata:
  name: workers
spec:
  poolSize: 10
  minStandby: 3
  template:
    nodeClassRef:
      kind: AWSNodeClass
      name: workers
    labels:
      stratos.sh/pool: workers
    startupTaints:
      - key: node.eks.amazonaws.com/not-ready
        value: "true"
        effect: NoSchedule

kubectl apply -f awsnodeclass.yaml
kubectl apply -f nodepool.yaml

3. Watch Nodes Scale

kubectl get nodes -l stratos.sh/pool=workers -w
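
To exercise a scale-up, create pods that target the pool and can't fit on existing capacity. The deployment below is a placeholder workload (name, image, and resource requests are arbitrary); once its pods are Pending, Stratos should start a standby instance:

```yaml
# Placeholder workload to trigger a scale-up. Once these pods are
# Pending, Stratos starts a standby instance from the pool.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: scale-test
spec:
  replicas: 5
  selector:
    matchLabels:
      app: scale-test
  template:
    metadata:
      labels:
        app: scale-test
    spec:
      nodeSelector:
        stratos.sh/pool: workers
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9
          resources:
            requests:
              cpu: "500m"
```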

How It Works

+---------+
| warmup  |  Launch, join cluster, pull images, run setup
+----+----+
     |
     |  self-stop
     v
+---------+
| standby |  Stopped — disk retains all state
+----+----+
     |
     |  scale-up
     |  (start instance)
     v
+---------+
| running |  Serving pods, accumulating caches
+----+----+
     |
     |  scale-down
     |  (drain & stop)
     v
+---------+
| standby |  Back to pool — caches preserved
+---------+

Next Steps