# CNCF Cloud Native Landscape: The Practical Guide
The CNCF Cloud Native Landscape catalogs 1,000+ projects. We ranked every major category using a compound metric: stars × 0.5 + forks × 0.3 + maturity bonus (graduated = 1.0, incubating = 0.7, sandbox = 0.4). Here are the categories that matter for production infrastructure.
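The metric above can be sketched in a few lines of Python. The article does not say how raw star and fork counts are scaled, so this sketch assumes each is divided by the largest value in the set being compared. The `Project` type, the function names, and the sample numbers are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

# Maturity bonus values as stated in the article.
MATURITY_BONUS = {"graduated": 1.0, "incubating": 0.7, "sandbox": 0.4}


@dataclass
class Project:
    name: str
    stars: int
    forks: int
    status: str  # "graduated", "incubating", "sandbox", or anything else


def compound_score(project: Project, max_stars: int, max_forks: int) -> float:
    """stars x 0.5 + forks x 0.3 + maturity bonus.

    Assumption: stars and forks are scaled to 0..1 by dividing by the
    largest value in the comparison set. The article does not specify this.
    """
    stars_norm = project.stars / max_stars if max_stars else 0.0
    forks_norm = project.forks / max_forks if max_forks else 0.0
    bonus = MATURITY_BONUS.get(project.status.lower(), 0.0)
    return stars_norm * 0.5 + forks_norm * 0.3 + bonus


def rank(projects: list[Project]) -> list[tuple[str, float]]:
    """Return (name, score) pairs, highest score first."""
    max_stars = max((p.stars for p in projects), default=0)
    max_forks = max((p.forks for p in projects), default=0)
    scored = [
        (p.name, round(compound_score(p, max_stars, max_forks), 3))
        for p in projects
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy numbers, not real project statistics.
sample = [
    Project("alpha", stars=100_000, forks=40_000, status="graduated"),
    Project("beta", stars=50_000, forks=10_000, status="incubating"),
    Project("gamma", stars=20_000, forks=4_000, status="none"),
]
ranking = rank(sample)
```

Under this scaling the maturity bonus dominates: a graduated project with modest popularity can outrank a far more popular project with no CNCF status.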
## 1. Orchestration & Management — Score: 2.509
The backbone of cloud native. 254K combined stars across the top 5 projects.
| Project | Stars | Status | Role |
|---|---|---|---|
| Kubernetes | 121K | Graduated | Container orchestration |
| gRPC | 44K | Incubating | Polyglot RPC framework |
| Istio | 38K | Graduated | Service mesh |
| Envoy | 27K | Graduated | L4/L7 proxy |
| Argo CD | 22K | Graduated | GitOps deployment |
| Linkerd | 11K | Graduated | Lightweight service mesh |
**Kubernetes** is the de facto OS for the cloud: pod scheduling, horizontal scaling, RBAC, and a vast operator ecosystem. **gRPC** with Protocol Buffers powers polyglot microservice communication. **Istio** provides traffic management, mTLS, and canary deployments via sidecar proxies; **Linkerd** offers most of the same benefits at a fraction of the operational cost using a Rust micro-proxy. **Envoy** is the data plane behind Istio and many API gateways; Linkerd ships its own proxy instead. **Argo CD** watches Git repos and continuously reconciles cluster state to match them, which has made it the GitOps standard.
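The GitOps idea of reconciling cluster state against Git can be illustrated with a small diff-and-apply loop. This is a conceptual sketch only, not Argo CD's implementation or API; the function names, the `prune` flag, and the resource dictionaries are hypothetical.

```python
def diff_state(desired: dict, live: dict) -> dict:
    """Compare desired (Git) and live (cluster) resources by name."""
    return {
        "create": sorted(k for k in desired if k not in live),
        "update": sorted(k for k in desired if k in live and desired[k] != live[k]),
        "delete": sorted(k for k in live if k not in desired),
    }


def reconcile(desired: dict, live: dict, prune: bool = True) -> dict:
    """Return a new live state that matches the desired state.

    `prune` controls whether resources missing from Git are removed.
    """
    plan = diff_state(desired, live)
    result = dict(live)
    for name in plan["create"] + plan["update"]:
        result[name] = desired[name]
    if prune:
        for name in plan["delete"]:
            del result[name]
    return result


# Hypothetical resources: name -> spec
git_state = {"web": {"replicas": 3}, "api": {"replicas": 2}}
cluster_state = {"web": {"replicas": 1}, "old-job": {"replicas": 1}}

plan = diff_state(git_state, cluster_state)
synced = reconcile(git_state, cluster_state)
```

Running the same reconcile again on `synced` produces an empty plan, which is the property that lets a GitOps controller loop continuously without side effects.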
```mermaid
flowchart LR
A[Argo CD] -->|deploys| K[Kubernetes]
K -->|schedules| P[Pods]
I[Istio / Linkerd] -->|mesh| P
E[Envoy] -->|proxy| P
G[gRPC] -->|RPC| P
```
**When to use what:**
- **New to cloud native?** Kubernetes + Argo CD. Skip service mesh until you need it.
- **Multi-cluster / multi-team?** Istio for the most complete mesh features.
- **Resource-constrained?** Linkerd — minimal footprint, automatic mTLS.
- **Polyglot microservices?** gRPC with Protocol Buffers.
---
## 2. Observability & Analysis — Score: 2.176
Growing 1.2x thanks to AI-driven observability demands. The eyes and ears of production systems.
| Project | Stars | Status | Role |
| --------- | ------- | -------- | ------ |
| [Grafana](https://github.com/grafana/grafana) | 72K | — | Dashboards & visualization |
| [Prometheus](https://github.com/prometheus/prometheus) | 63K | Graduated | Metrics collection |
| [Jaeger](https://github.com/jaegertracing/jaeger) | 22K | Graduated | Distributed tracing |
| [cert-manager](https://github.com/cert-manager/cert-manager) | 13K | Graduated | Automated TLS |
| [OPA](https://github.com/open-policy-agent/opa) | 11K | Graduated | Policy as code |
| [Chaos Mesh](https://github.com/chaos-mesh/chaos-mesh) | 7K | Incubating | Chaos engineering |
**Prometheus** is the metrics backbone of virtually every K8s deployment: pull-based collection, PromQL querying, and Alertmanager for alert routing. **Grafana** is the universal dashboard layer connecting to 300+ data sources. **Jaeger** stores and visualizes distributed traces, ingesting OpenTelemetry data to follow requests across service boundaries. **cert-manager** automates TLS certificate issuance and renewal from issuers such as Let's Encrypt, so no more expired certs. **OPA** enforces fine-grained policies (who can deploy what, which pods can communicate) using the Rego language. **Chaos Mesh** injects failures to verify your system degrades gracefully.
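Pull-based collection means Prometheus periodically fetches a plain-text page of metrics from each target over HTTP (conventionally at `/metrics`). The sketch below renders one metric family in that text exposition format using only the standard library. Real services would normally use an official client library; the function names and sample values here are hypothetical.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote, and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metric(name: str, help_text: str, metric_type: str, samples: list) -> str:
    """Render one metric family in the Prometheus text exposition format.

    `samples` is a list of (labels_dict, value) pairs.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            pairs = ",".join(
                f'{key}="{escape_label_value(val)}"'
                for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{pairs}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Made-up sample values for illustration.
body = render_metric(
    "http_requests_total",
    "Total HTTP requests handled.",
    "counter",
    [
        ({"method": "GET", "code": "200"}, 1027),
        ({"method": "POST", "code": "500"}, 3),
    ],
)
```

Serving `body` from an HTTP handler is all a target needs to do; scheduling, storage, and querying stay on the Prometheus side.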
```mermaid
flowchart TD
subgraph "Visualization"
GR[Grafana]
end
subgraph "Collection"
PR[Prometheus] & LO[Loki] & JA[Jaeger]
end
subgraph "Security"
CM[cert-manager] & OP[OPA] & CH[Chaos Mesh]
end
GR --> PR & LO & JA
OT[OpenTelemetry] --> PR & LO & JA
OT --> CM & OP & CH
```
**When to use what:**
- **Setting up monitoring?** Prometheus + Grafana — the default starting point.
- **Debugging latency?** Jaeger for distributed tracing.
- **Automating HTTPS?** cert-manager — zero manual cert management.
- **Access control?** OPA + Gatekeeper for admission policies.
- **Testing resilience?** Chaos Mesh — inject failures before users find them.
---
## 3. Provisioning — Score: 1.311
Infrastructure as code and security. Ansible alone commands 68K stars.
| Project | Stars | Status | Role |
| --------- | ------- | -------- | ------ |
| [Ansible](https://github.com/ansible/ansible) | 68K | — | Agentless automation |
| [OpenTofu](https://github.com/opentofu/opentofu) | 28K | Sandbox | Open source Terraform |
| [OpenEBS](https://github.com/openebs/openebs) | 9K | Sandbox | Cloud native storage |
| [Atlantis](https://github.com/runatlantis/atlantis) | 8K | Sandbox | PR-based IaC review |
| [Falco](https://github.com/falcosecurity/falco) | 8K | Graduated | Runtime security (eBPF) |
| [Kyverno](https://github.com/kyverno/kyverno) | 7K | Incubating | K8s-native policy engine |
**Ansible** is agentless IT automation — 3,000+ modules, connects via SSH, idempotent operations. **OpenTofu** is the community fork of Terraform after HashiCorp's BSL license change — same HCL syntax, MPL license. **Falco** monitors system calls via eBPF to detect shell invocations in containers, unexpected network connections, and privilege escalations. **Kyverno** enforces policies using native Kubernetes YAML (no sidecars, no Rego). **OpenEBS** provides container-attached storage with multiple engines (LocalPV, ZFS, NVMe-oF). **Atlantis** automates Terraform plan reviews through pull requests.
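Idempotence is the property that makes this style of automation safe to re-run: a task describes a desired end state and changes something only if that state does not already hold. The following sketch shows the idea with a single file edit. It is not Ansible code; the `ensure_line` helper and the file name are hypothetical.

```python
import os
import tempfile


def ensure_line(path: str, line: str) -> bool:
    """Make sure `line` is present in the file at `path`.

    Returns True if the file was changed, False if it was already correct.
    """
    existing = []
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            existing = handle.read().splitlines()
    if line in existing:
        return False  # desired state already holds, so do nothing
    existing.append(line)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(existing) + "\n")
    return True


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "sshd_config")
    first_run = ensure_line(target, "PermitRootLogin no")   # changes the file
    second_run = ensure_line(target, "PermitRootLogin no")  # no-op
    with open(target, encoding="utf-8") as handle:
        final_content = handle.read()
```

Reporting whether anything changed, instead of just succeeding, is what lets a run summary distinguish drift that was corrected from hosts that were already compliant.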
| Consideration | Terraform | OpenTofu |
| --- | --- | --- |
| License | BSL 1.1 (source-available) | MPL 2.0 (open source) |
| Provider ecosystem | Mature, largest | Growing, compatible |
| Enterprise support | HashiCorp TFC | Community + vendors |
---
## 4. Runtime — Score: 1.262
Container runtimes, storage, and networking. Cilium's eBPF revolution is driving growth.
| Project | Stars | Status | Role |
| --------- | ------- | -------- | ------ |
| [Cilium](https://github.com/cilium/cilium) | 24K | Graduated | eBPF networking & security |
| [containerd](https://github.com/containerd/containerd) | 20K | Graduated | Container runtime |
| [Rook](https://github.com/rook/rook) | 13K | Graduated | Storage orchestration (Ceph) |
| [Longhorn](https://github.com/longhorn/longhorn) | 7K | Incubating | Block storage |
**Cilium** replaces iptables with kernel-level eBPF packet processing — L3/L4 policy enforcement, transparent mTLS, bandwidth management, and deep observability without sidecars. **containerd** is the invisible workhorse: every pod goes through it for image management (OCI-compliant, CRI implementation). **Rook** turns Ceph into Kubernetes CRDs for declarative storage provisioning. **Longhorn** provides lightweight block storage with built-in replication, snapshots, and disaster recovery.
| Traditional Networking | Cilium eBPF |
| --- | --- |
| iptables-based rules | Kernel-level programmable filtering |
| Sidecar proxy for L7 | Native L7 policy enforcement |
| Best-effort observability | Deep packet-level observability |
| Complex rule chains | Declarative policy models |
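The declarative policy model in the comparison above can be illustrated without any kernel code: a policy selects pods by label and lists who may reach them on which ports. This sketch covers only the allow/deny decision, not eBPF or Cilium's actual API. The `Rule` type, the function names, and the default of allowing traffic to pods that no rule selects are assumptions modelled loosely on Kubernetes NetworkPolicy semantics.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Allow traffic to pods matching `to_labels` from pods matching `from_labels`."""
    to_labels: dict
    from_labels: dict
    ports: set


def matches(selector: dict, labels: dict) -> bool:
    """A selector matches when every key/value pair appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def is_allowed(rules: list, src_labels: dict, dst_labels: dict, port: int) -> bool:
    """Decide whether src may connect to dst on `port`.

    Assumption: a pod selected by at least one rule is default-deny;
    a pod selected by no rule accepts all traffic.
    """
    selecting = [rule for rule in rules if matches(rule.to_labels, dst_labels)]
    if not selecting:
        return True
    return any(
        matches(rule.from_labels, src_labels) and port in rule.ports
        for rule in selecting
    )


# Hypothetical policy: only the API tier may reach the database on 5432.
rules = [Rule(to_labels={"app": "db"}, from_labels={"app": "api"}, ports={5432})]

api_to_db = is_allowed(rules, {"app": "api"}, {"app": "db"}, 5432)
web_to_db = is_allowed(rules, {"app": "web"}, {"app": "db"}, 5432)
api_to_db_ssh = is_allowed(rules, {"app": "api"}, {"app": "db"}, 22)
web_to_cache = is_allowed(rules, {"app": "web"}, {"app": "cache"}, 6379)
```

Because rules refer to labels instead of IP addresses, they keep working as pods are rescheduled, which is the main contrast with hand-maintained iptables chains.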
---
## 5. App Definition & Development — Score: 0.929
CI/CD pipelines, streaming, and modern dev tools.
| Project | Stars | Status | Role |
| --------- | ------- | -------- | ------ |
| [Pulumi](https://github.com/pulumi/pulumi) | 24K | — | IaC with real programming languages |
| [NATS](https://github.com/nats-io/nats-server) | 19K | Incubating | High-performance messaging |
| [Tekton](https://github.com/tektoncd/pipeline) | 8K | Graduated | K8s-native CI/CD |
| [Strimzi](https://github.com/strimzi/strimzi-kafka-operator) | 5K | Incubating | Kafka operator for K8s |
**Pulumi** brings real programming languages (TypeScript, Python, Go) to infrastructure as code — loops, functions, classes, and type safety instead of YAML templates. **NATS** is a lightweight, high-performance messaging system (sub-ms latency, single Go binary) with JetStream for persistent, exactly-once delivery. **Tekton** runs CI/CD pipelines directly on Kubernetes — your cluster IS your CI system. **Strimzi** manages Apache Kafka through Kubernetes CRDs (topics, users, connectors as declarative resources).
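One ingredient of exactly-once publishing is deduplication: the publisher attaches a unique message ID, and the server drops any repeat of that ID seen within a time window, so a retried publish is not stored twice. The sketch below shows that idea in isolation. It is not the NATS client API; the `DedupStream` class and the 120-second default window are assumptions chosen for illustration.

```python
class DedupStream:
    """Append-only stream that ignores repeated message IDs within a window."""

    def __init__(self, window_seconds: float = 120.0):
        self.window_seconds = window_seconds
        self.messages = []
        self._seen = {}  # message id -> time it was first accepted

    def publish(self, msg_id: str, payload: str, now: float) -> bool:
        """Store the payload unless `msg_id` was accepted inside the window.

        Returns True if stored, False if dropped as a duplicate.
        """
        # Forget IDs that have aged out of the deduplication window.
        self._seen = {
            seen_id: seen_at
            for seen_id, seen_at in self._seen.items()
            if now - seen_at < self.window_seconds
        }
        if msg_id in self._seen:
            return False
        self._seen[msg_id] = now
        self.messages.append(payload)
        return True


stream = DedupStream()
first = stream.publish("order-1", "charge $10", now=0.0)
retry = stream.publish("order-1", "charge $10", now=5.0)     # network retry
other = stream.publish("order-2", "charge $25", now=6.0)
expired = stream.publish("order-1", "charge $10", now=200.0)  # window has passed
```

The last call shows the limit of the approach: deduplication holds only inside the window, so publishers need retry timeouts shorter than it.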
```mermaid
flowchart LR
T[Tekton CI/CD] --> P[Pulumi IaC]
P --> N[NATS Messaging]
N --> S[Strimzi Kafka]
```
---
## Emerging Categories
These are growing fast but not yet as established as the core five above.
### Wasm (WebAssembly) — 1.5x trend multiplier
Breaking out of the browser into server-side and edge computing. **[Wasmer](https://github.com/wasmerio/wasmer)** (20K stars), **[Wasmtime](https://github.com/bytecodealliance/wasmtime)** (17K), and **[WasmEdge](https://github.com/WasmEdge/WasmEdge)** (10K, sandbox) provide sandboxing, portability, and near-native performance. WasmEdge is pushing into AI/ML inference at the edge.
### Serverless
**[OpenFaaS](https://github.com/openfaas/faas)** (26K) turns any function into a serverless workload. **[Knative](https://github.com/knative/serving)** (6K, incubating) extends Kubernetes with scale-to-zero primitives for container-based serverless.
### Cloud Native AI
The fastest-growing category overall (2.0x multiplier). **[vLLM](https://github.com/vllm-project/vllm)** (74K stars) is the standard for LLM serving. Vector databases — **[Milvus](https://github.com/milvus-io/milvus)** (43K), **[Qdrant](https://github.com/qdrant/qdrant)** (29K), **[Chroma](https://github.com/chroma-core/chroma)** (27K) — power RAG architectures. [Read the full CNAI guide →](/talks_and_thoughts/cncf-cloud-native-ai/)
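At its core, the retrieval step that vector databases provide for RAG is a nearest-neighbor search over embeddings. The brute-force sketch below shows that operation with cosine similarity. Production systems use approximate indexes to scale, and the three-dimensional vectors and document IDs here are toy values, not real embeddings.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list, documents: dict, k: int = 2) -> list:
    """Return the k (doc_id, similarity) pairs most similar to the query."""
    scored = [
        (doc_id, cosine_similarity(query, vector))
        for doc_id, vector in documents.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy embeddings for illustration only.
docs = {
    "k8s-intro": [0.9, 0.1, 0.0],
    "grafana-setup": [0.1, 0.9, 0.1],
    "k8s-rbac": [0.8, 0.2, 0.1],
}
hits = top_k([1.0, 0.0, 0.0], docs, k=2)
```

In a RAG pipeline the text behind the returned IDs would be placed into the LLM prompt as context.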
---
For the complete interactive landscape with filtering and sorting, visit [landscape.cncf.io](https://landscape.cncf.io/).