Introduction
Knative Serving is the serverless sub-project of Knative, a CNCF incubating project started at Google. It layers an opinionated autoscaling + traffic-splitting model on top of Kubernetes so teams can focus on container images instead of wiring Deployments, HPAs, Services, Ingresses and PodDisruptionBudgets together every time they ship a workload.
What Knative Serving Does
- Introduces a simple Service CRD that owns Configurations, Revisions and Routes
- Scales workloads from zero to thousands of pods based on concurrency or RPS
- Performs traffic splitting across named revisions for canary and blue/green rollouts
- Issues automatic HTTPS via the Knative + cert-manager integration
- Hooks into the Knative Eventing broker for event-driven workloads
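The model above reduces to a single resource. A minimal Knative Service looks roughly like this (name and image are placeholders; any HTTP container image works):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello                # hypothetical service name
spec:
  template:
    spec:
      containers:
        - image: ghcr.io/example/hello:latest   # placeholder image
          ports:
            - containerPort: 8080               # port the container listens on
```

Applying this one manifest gives you a Deployment-equivalent, a Route with a URL, autoscaling, and revision tracking without wiring the underlying Kubernetes objects by hand.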
Architecture Overview
Serving ships a controller, autoscaler, activator, queue-proxy sidecar and a pluggable ingress layer (Kourier, Contour, Istio). On deploy, the controller expands the Service into a Configuration and a Route; each change to the Configuration's revision template stamps out a new immutable Revision. The autoscaler collects per-revision concurrency metrics through the queue-proxy and adjusts replicas. Scale-to-zero routes traffic through the activator, which buffers the first request while a cold revision starts. Because Routes and Configurations are separate CRDs, traffic can be pinned to any revision independently of the latest image.
Self-Hosting & Configuration
- Install Serving core manifests plus a networking layer (Kourier is the default)
- Configure the config-autoscaler, config-deployment and config-network ConfigMaps for cluster-wide defaults
- Use the kn CLI or raw YAML to manage Services, Revisions and Routes
- Enable feature flags (e.g. multi-container, init-containers) per cluster policy
- Scrape Prometheus metrics from the autoscaler and activator for observability
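Cluster-wide autoscaling defaults live in the config-autoscaler ConfigMap in the knative-serving namespace. A sketch of commonly tuned keys (the values shown are illustrative, not recommendations):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  enable-scale-to-zero: "true"              # allow revisions to scale to 0
  container-concurrency-target-default: "100"  # default in-flight requests per pod
  stable-window: "60s"                      # averaging window for scaling decisions
  scale-to-zero-grace-period: "30s"         # wait before removing the last pod
```

Per-revision annotations override these defaults, so the ConfigMap only needs to encode sensible cluster-wide policy.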
Key Features
- Scale-to-zero with activator-buffered cold starts (cold-start latency depends on image size and runtime)
- Rich autoscaling model: concurrency and RPS via the Knative autoscaler, CPU via the HPA autoscaler class
- First-class revision history with immutable revisions and traffic weights
- Pluggable networking: Kourier, Istio, Contour, Gateway API, Envoy-based meshes
- Works alongside Eventing for Kafka, Pub/Sub, broker and channel abstractions
Comparison with Similar Tools
- Kubernetes HPA + Deployment — manual wiring; Knative bundles the boilerplate
- OpenFaaS / Fission — function-first platforms; Knative focuses on services
- AWS Lambda / Cloud Run — managed alternatives; Knative is self-hosted
- KEDA + Deployments — event-driven autoscaling; Knative complements with HTTP scaling
- Argo Rollouts — advanced progressive delivery, pairs nicely with Knative traffic splitting
FAQ
Q: Does Knative require Istio? A: No. Kourier is the default lightweight ingress; Contour and Istio are also officially supported.
Q: How does scale-to-zero work? A: The activator buffers requests for zero-scaled revisions; once a Pod is ready the request is proxied and the autoscaler maintains active replicas.
Q: Can I use Knative with any container? A: Any image that listens on a configurable port works. Knative adds no runtime shims beyond the queue-proxy sidecar.
Q: Is it production ready? A: Yes. Knative Serving 1.x is GA and underpins Google Cloud Run, IBM Cloud Code Engine, Red Hat OpenShift Serverless and many self-hosted platforms.