OpenSpec vs. the Specification Framework Landscape: Why DevOps Needs AI-Native Specs

Every DevOps team maintains specifications: API contracts, architecture decision records, design documents, runbooks, change proposals. The question is not whether you spec; it is whether your specs survive contact with reality.

Most specification frameworks were designed for human-to-human communication. They assume a reader who can infer intent from prose, fill in gaps from context, and navigate cross-references intuitively. That assumption breaks down when the primary consumer of your specs is no longer human.

We are entering an era where AI agents write, review, and deploy code alongside humans. The specification frameworks we choose determine whether those agents work with us or against us.

This article compares six approaches to specification and change management — including OpenSpec, the AI-native framework used in this project — across practical DevOps dimensions: agent-friendliness, CI/CD integration, documentation drift, and operational overhead.

The Specification Spectrum

Before comparing individual frameworks, it helps to understand what a specification framework actually provides:

┌─────────────────────────────────────────────────────────────────┐
│                    WHAT SPECS DO                                │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  INTENT CAPTURE    ──►  DESIGN DECISIONS  ──►  IMPLEMENTATION   │
│  (what & why)           (how)                  (ticked boxes)   │
│                                                                 │
│  Documentation                                                   │
│  Drift grows here ←───────────────────────────────────────────► │
│  (spec frozen)                                   (code evolves) │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Every framework addresses a subset of this pipeline. The difference is where they place the automation boundary — what they encode in machine-readable formats versus what they leave to human interpretation.

The Contenders

Framework	Primary Format	Consumer	Scope	AI-Native?
OpenAPI/Swagger	YAML/JSON	Tools + Humans	API contracts	Partial
AsyncAPI	YAML/JSON	Tools + Humans	Event contracts	Partial
ADRs	Markdown	Humans	Architecture decisions	No
RFC Process	Markdown	Humans (team)	Design proposals	No
BDD/Gherkin	Plain text (Gherkin)	Tests + Humans	Behavior specs	Yes (structured)
OpenSpec	Markdown + CLI	AI Agents + Humans	Full change lifecycle	Yes (native)
GitHub Issues/Projects	UI + Markdown	Humans	Task tracking	No
JIRA + Confluence	UI + Rich text	Humans (org)	Project management	No

OpenAPI / Swagger: The Gold Standard That Solved One Problem

OpenAPI is the most successful specification framework in DevOps. It defined a machine-readable contract format for REST APIs that generates documentation, client SDKs, server stubs, and test harnesses from a single source of truth.

Where it shines:

Contract-first development with code generation
Extensive tooling ecosystem (Swagger UI, Editor, Codegen, Validator)
Clear, versioned interface boundaries between services
CI pipeline validation (request/response conformance)

Where it falls short for modern DevOps:

Narrow scope — API surface only, not deployment, infrastructure, or operational specs
No change lifecycle management — versioning the spec is manual, and there is no artifact trail from proposal to implementation
Static contract format: OpenAPI 3.1 aligned its schema dialect with JSON Schema (2020-12), but the spec still describes an endpoint, not the engineering process behind it
Agents cannot follow it — an AI agent can read an OpenAPI spec to understand an API surface, but it cannot use it to plan, propose, design, implement, and verify a change

# OpenAPI tells you WHAT the API looks like
# It does NOT tell you HOW to change it safely
paths:
  /deployments:
    get:
      summary: List deployments
      responses:
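        '200':
          # A response "description" is required by the OpenAPI specification
          description: A list of deployments
          content: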
            application/json:
              schema:
                type: array
                items:
                  $ref: '#/components/schemas/Deployment'

OpenAPI is essential infrastructure. But it is a contract format, not a change management framework. It solves the interface problem and leaves the engineering process untouched.
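To make "CI pipeline validation (request/response conformance)" concrete, here is a minimal sketch of the kind of check such a pipeline might run against the `GET /deployments` response. It is a hand-rolled stand-in, not a real OpenAPI validator: the `Deployment` fields (`id`, `status`) are assumptions, since the article does not show that schema, and only the `array`, `object`, and `string` types are handled.

```python
from typing import Any

# Hypothetical stand-in for '#/components/schemas/Deployment'.
# The article does not show this schema, so the fields are assumptions.
DEPLOYMENT_SCHEMA = {
    "type": "object",
    "required": ["id", "status"],
    "properties": {"id": {"type": "string"}, "status": {"type": "string"}},
}

# Mirrors the 200 response of GET /deployments: an array of Deployment objects.
LIST_DEPLOYMENTS_SCHEMA = {"type": "array", "items": DEPLOYMENT_SCHEMA}

_PY_TYPES = {"object": dict, "array": list, "string": str}


def conformance_errors(value: Any, schema: dict, path: str = "$") -> list[str]:
    """Return human-readable violations; an empty list means conformant."""
    expected = _PY_TYPES[schema["type"]]
    if not isinstance(value, expected):
        return [f"{path}: expected {schema['type']}, got {type(value).__name__}"]
    errors: list[str] = []
    if schema["type"] == "array":
        # Validate every element against the "items" sub-schema.
        for i, item in enumerate(value):
            errors += conformance_errors(item, schema["items"], f"{path}[{i}]")
    elif schema["type"] == "object":
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: missing required property")
        for key, sub in schema.get("properties", {}).items():
            if key in value:
                errors += conformance_errors(value[key], sub, f"{path}.{key}")
    return errors
```

A check like this tells you whether a response matches the contract. It says nothing about why the contract changed or whether the change was planned, which is the gap this section describes.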

AsyncAPI: OpenAPI for Events

AsyncAPI extends the contract-first model to event-driven architectures — Kafka topics, RabbitMQ queues, WebSocket channels, MQTT brokers.

What it does well:

Machine-readable channel definitions with publish/subscribe semantics
Message schema validation across event boundaries
Code generation for producers and consumers
Growing tooling ecosystem (Generator, Modelina, CLI)

Same fundamental limitation as OpenAPI: AsyncAPI describes the event surface — not the process that produced it. You know what Kafka topics exist, but you have no artifact trail of why they were added, what alternatives were considered, or whether the implementation matches the intended design.

Both OpenAPI and AsyncAPI suffer from the same DevOps gap: the spec lives separately from the change process that created it, and the two inevitably diverge.
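The divergence is easy to detect, even though the contract format does nothing to prevent it. The sketch below compares the channels declared in a parsed AsyncAPI document with the topics actually observed on a broker. It assumes an AsyncAPI 2.x-style document whose `channels` keys are the topic names; how you obtain the observed topics is left open, and the topic names used with it are hypothetical.

```python
def channel_drift(asyncapi_doc: dict, observed_topics: set[str]) -> dict[str, list[str]]:
    """Compare declared channels with topics seen in the running system.

    asyncapi_doc is an already-parsed AsyncAPI document (assumed 2.x style,
    where the keys of "channels" are the topic names).
    """
    documented = set(asyncapi_doc.get("channels", {}))
    return {
        # Topics in production that the spec never mentions.
        "undocumented": sorted(observed_topics - documented),
        # Channels the spec still declares but nothing uses.
        "stale": sorted(documented - observed_topics),
    }
```

Either list being non-empty means the spec and the system have drifted, but the report cannot say which side is wrong or what the intended design was.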

ADRs: Lightweight, Human-First, Agent-Hostile

Architecture Decision Records (ADRs) are a beautifully simple pattern: a short Markdown file per architecture decision, stored in the repository alongside the code. Michael Nygard proposed the format in 2011.

adr/
├── 001-use-postgresql-for-primary-storage.md
├── 002-adopt-kubernetes-for-orchestration.md
├── 003-use-redis-for-session-caching.md
└── 004-migrate-to-opensearch-for-logging.md

Each ADR follows a template: Context → Decision → Consequences. The format is intentionally minimal — prose-based, designed for humans to write and read.

Where ADRs excel:

Low friction to create (one file, no tooling required)
Lives in the repo alongside code (no external system drift)
Perfect for recording why decisions were made
Immutable, numbered record of architectural evolution

Where they break down:

No structure an agent can follow — ADRs are free-form prose. An AI agent would need to parse natural language to extract decision criteria, alternatives, and scope. This is possible (LLMs are good at this) but not reliable for automated enforcement.
No connection to implementation — an ADR says "we decided to use PostgreSQL." It does not track whether the implementation matches that decision, what tasks remain, or whether any code contradicts it.
No lifecycle management: ADRs are append-only. A status field can mark a record as superseded, but nothing enforces a workflow for revisiting, superseding, or archiving decisions.
No task generation — "We decided X" does not produce a checklist of implementation steps. A human (or agent) must interpret the decision and figure out what to build.

ADRs are valuable documentation. They are not a framework for driving change.
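A short sketch shows how far mechanical parsing gets with an ADR. The sample text is hypothetical and assumes second-level Markdown headings for each section, which is a common convention but not something every ADR follows.

```python
import re

# Hypothetical ADR text; the heading layout is an assumed convention.
ADR_TEXT = """\
# 1. Use PostgreSQL for primary storage

## Status
Accepted

## Context
We need a relational store with strong consistency.

## Decision
We will use PostgreSQL.

## Consequences
The team must operate and back up a PostgreSQL cluster.
"""


def adr_sections(markdown: str) -> dict[str, str]:
    """Split an ADR into {lowercased heading: body text} by '## ' headings."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in markdown.splitlines():
        match = re.match(r"^##\s+(.*)$", line)
        if match:
            current = match.group(1).strip().lower()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body).strip() for name, body in sections.items()}
```

The parser recovers the section boundaries, but each value is still a block of prose. Nothing in `adr_sections(ADR_TEXT)["decision"]` tells an agent which files to touch or how to verify the result.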

BDD / Gherkin: Executable Specs With a Ceiling

Behavior-Driven Development with Gherkin (Given/When/Then) is the closest predecessor to what OpenSpec attempts. It defines executable specifications that double as documentation and acceptance tests.

Feature: Deployment Rollback
  Scenario: Rollback on health check failure
    Given a deployment with 3 replicas
    When the health check fails for 2 replicas
    Then the orchestrator initiates a rollback
    And the previous revision is restored within 60 seconds

BDD's genuine strengths:

Executable specs — the spec IS the test. No documentation drift because the spec runs in CI.
Shared language — Gherkin's structured natural language bridges business stakeholders and engineers.
Living documentation — passing specs = accurate documentation.
Agent-friendly format: the Given/When/Then keywords give AI agents a predictable structure to parse, even though the step text itself is still natural language.

Why BDD isn't enough for DevOps change management:

Feature-level scope only — a Gherkin scenario describes one behavior. It cannot represent a multi-step engineering change (provision infrastructure → deploy service → configure monitoring → verify).
No design artifact — BDD captures acceptance criteria but not the design decisions, trade-offs, or alternatives considered.
No change lifecycle — BDD specs are written and automated, but there is no concept of proposal, approval, implementation tracking, or archive.
High maintenance overhead: scenarios that leak UI details into their steps are notoriously brittle. A single UI change can break dozens of them and force expensive rewrites.

BDD occupies a useful niche: spec-as-test for specific behaviors. It does not replace a change management framework.
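The structure that makes Gherkin agent-friendly can be shown with a small parser. This is a simplified sketch, not a replacement for a real Gherkin parser: it handles only step lines and resolves `And`/`But` to the keyword of the preceding step.

```python
STEP_KEYWORDS = ("Given", "When", "Then", "And", "But")


def parse_steps(scenario: str) -> list[tuple[str, str]]:
    """Return (keyword, text) pairs; 'And'/'But' inherit the previous keyword."""
    steps: list[tuple[str, str]] = []
    last_keyword = None
    for raw in scenario.splitlines():
        line = raw.strip()
        for keyword in STEP_KEYWORDS:
            if not line.startswith(keyword + " "):
                continue
            text = line[len(keyword) + 1:]
            if keyword in ("And", "But"):
                if last_keyword is None:
                    raise ValueError(f"'{keyword}' step has no preceding step")
                keyword = last_keyword
            else:
                last_keyword = keyword
            steps.append((keyword, text))
            break
    return steps


# The rollback scenario from this section.
ROLLBACK_SCENARIO = """\
Feature: Deployment Rollback
  Scenario: Rollback on health check failure
    Given a deployment with 3 replicas
    When the health check fails for 2 replicas
    Then the orchestrator initiates a rollback
    And the previous revision is restored within 60 seconds
"""
```

Calling `parse_steps(ROLLBACK_SCENARIO)` yields one precondition, one trigger, and two expected outcomes. What it cannot yield is the provisioning, deployment, and monitoring work needed to make those outcomes true.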

The RFC Process: Collaborative Design, Manual Everything

The RFC (Request for Comments) process, popularized by the IETF and adopted in various forms by React, Rust, Python (as PEPs), and Kubernetes (as KEPs), is the gold standard for collaborative design. A proposed change is documented in a structured template, discussed by the community, refined through review cycles, and either accepted or rejected.

rfcs/
├── text/
│   ├── 0000-template.md
│   ├── 0001-new-rfc-process.md
│   ├── 0002-adopt-openspec.md
│   └── 0003-spec-first-change-management.md

What makes RFCs powerful:

Structured, repeatable proposal format
Community review baked into the process
Decision record (accepted/rejected with rationale)
Historical archive of design evolution

The DevOps gap with RFCs:

No automation boundary — an RFC is a Markdown document with prose sections. There is no machine-enforceable contract between the proposal and the implementation.
No task decomposition — acceptance means "we agreed this is the right approach." Someone still needs to break it into implementation tasks.
No CI integration — CI cannot verify that implementation matches the RFC's design decisions.
Agent-hostile structure — like ADRs, RFCs rely on human readers to interpret intent. An AI agent parsing a 50-section RFC would struggle to extract deterministic action items.

The RFC process produces excellent design artifacts. But the bridge from "RFC accepted" to "code deployed" is entirely manual.

GitHub Issues + PRs: The Unstructured Default

Most DevOps teams default to GitHub Issues for change tracking and Pull Requests for code review. This is the path of least resistance — it ships quickly and requires no additional tooling.

What it gets right:

Zero setup cost
Everyone knows how to use it
PR reviews provide human quality gates
CI integration is built in

What it gets wrong for systematic change management:

Issue = unstructured blob — an issue can be a bug report, a feature request, a question, a design discussion, or a task. There is no structural distinction.
No design artifact — the PR is the implementation. Design decisions are buried in comment threads or left implicit.
No spec → implementation traceability — did the PR implement what was intended? The only way to know is to read the diff and compare it to your memory of the discussion.
Agent-hostile — an AI agent can read issues and PRs, but it cannot follow a structured workflow from proposal to implementation to verification because the structure simply does not exist.
Documentation drift is the default — once merged, the issue is closed and the discussion is archived. The implementation evolves, but the issue never updates.

GitHub Issues + PRs is a communication platform that teams repurpose into a workflow. It works for small teams with good discipline. It scales poorly.

OpenSpec: AI-Native Change Management

OpenSpec enters this landscape as a framework designed explicitly for the AI agent era. It combines the structured artifact approach of RFCs, the traceability of ADRs, the executability of BDD, and adds something none of these have: a machine-enforceable change lifecycle that AI agents can follow autonomously.

The OpenSpec Change Lifecycle

Every change in OpenSpec follows a defined artifact dependency chain:

┌─────────────────────────────────────────────────────────────────┐
│                   OPENSPEC CHANGE LIFECYCLE                     │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  1. PROPOSAL                                                    │
│     ┌──────────────────┐                                        │
│     │ What & Why       │  Problem statement, scope, success     │
│     │ proposal.md      │  criteria, risks                      │
│     └────────┬─────────┘                                        │
│              │                                                   │
│              ▼                                                   │
│  2. DESIGN                                                       │
│     ┌──────────────────┐                                        │
│     │ How              │  Architecture, trade-offs, component   │
│     │ design.md        │  interactions, data flow               │
│     └────────┬─────────┘                                        │
│              │                                                   │
│              ▼                                                   │
│  3. SPECS (capability specs)                                     │
│     ┌──────────────────┐                                        │
│     │ Requirements     │  Machine-readable requirements with    │
│     │ specs/*/spec.md  │  scenarios (Given/When/Then-like)      │
│     └────────┬─────────┘                                        │
│              │                                                   │
│              ▼                                                   │
│  4. TASKS                                                        │
│     ┌──────────────────┐                                        │
│     │ Implementation   │  Atomic, ordered implementation steps  │
│     │ tasks.md         │  with checkboxes                       │
│     └────────┬─────────┘                                        │
│              │                                                   │
│              ▼                                                   │
│  5. IMPLEMENTATION (via /opsx-apply)                             │
│     ┌──────────────────┐                                        │
│     │ Code + Tests     │  AI agent executes tasks, marks        │
│     │                  │  checkboxes as completed               │
│     └────────┬─────────┘                                        │
│              │                                                   │
│              ▼                                                   │
│  6. DELTA SPECS                                                  │
│     ┌──────────────────┐                                        │
│     │ What changed     │  Diff: added/modified/removed specs    │
│     │ specs/*/spec.md  │  for syncing back to main specs        │
│     └────────┬─────────┘                                        │
│              │                                                   │
│              ▼                                                   │
│  7. ARCHIVE                                                     │
│     ┌──────────────────┐                                        │
│     │ Done             │  Change moved to archive with date     │
│     │ archive/         │  Main specs updated with delta         │
│     └──────────────────┘                                        │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
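The first four stages of the diagram can be read as a dependency graph. The sketch below encodes that reading and answers the question an agent asks at every step: which artifact can be written next? The strictly linear edges are an assumption taken from the diagram, not from OpenSpec's actual schema.

```python
# Dependencies read off the lifecycle diagram above.
# Assumption: the chain is strictly linear; a real schema may differ.
ARTIFACT_DEPENDENCIES = {
    "proposal": [],
    "design": ["proposal"],
    "specs": ["design"],
    "tasks": ["specs"],
}


def ready_artifacts(done: set[str]) -> list[str]:
    """Artifacts not yet written whose dependencies are all complete."""
    return [
        artifact
        for artifact, deps in ARTIFACT_DEPENDENCIES.items()
        if artifact not in done and all(dep in done for dep in deps)
    ]
```

With nothing written, only `proposal` is ready; once it exists, `design` unlocks, and so on down the chain. The agent never has to guess the order.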

What Makes OpenSpec Different

1. AI agent as first-class consumer

Every artifact in an OpenSpec change has a defined schema, known output paths, and explicit dependencies. The CLI exposes machine-readable instructions (openspec instructions <artifact-id> --json) that tell an AI agent exactly what to create, what template to follow, and what context to read.

This is fundamentally different from a Markdown RFC or an ADR. The agent does not need to infer what a "good" proposal looks like — the schema defines it. The agent does not need to figure out which files to create — the CLI resolves the paths. The agent does not need to guess what artifacts are needed next — the dependency graph says so.

# Agent asks: "What do I need to create next?"
openspec instructions design --change "add-metrics-pipeline" --json

# Response: structured, actionable, unambiguous
{
  "artifact": "design",
  "template": "...",
  "dependencies": ["proposal"],
  "resolvedOutputPath": ".openspec/changes/add-metrics-pipeline/design.md",
  "context": "..."
}
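On the agent side, consuming that response takes only a few lines. The sketch below parses a JSON string shaped like the sample above and checks dependencies before deciding where to write. The field names are copied from the sample; treat them as illustrative rather than as a documented contract, and note that the sketch does not invoke the CLI itself.

```python
import json
from pathlib import Path

# Same shape as the sample response above; elided values are kept as-is.
SAMPLE_RESPONSE = """
{
  "artifact": "design",
  "template": "...",
  "dependencies": ["proposal"],
  "resolvedOutputPath": ".openspec/changes/add-metrics-pipeline/design.md",
  "context": "..."
}
"""


def missing_dependencies(response_json: str, existing: set[str]) -> list[str]:
    """Dependencies named in the response that have not been written yet."""
    data = json.loads(response_json)
    return [dep for dep in data["dependencies"] if dep not in existing]


def output_path(response_json: str) -> Path:
    """Where the agent should write the artifact, as resolved by the CLI."""
    return Path(json.loads(response_json)["resolvedOutputPath"])
```

An agent would refuse to write `design.md` while `missing_dependencies` returns a non-empty list, and would otherwise write to `output_path` without having to work out the directory layout itself.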

2. Full change lifecycle, not a fragment

OpenAPI gives you the API contract. ADRs give you the decision log. BDD gives you the acceptance tests. OpenSpec gives you the entire chain from proposal to archive, with each artifact's outputs feeding into the next.

This means an AI agent can:

Read the proposal → understand scope
Read the design → understand architecture
Read the specs → understand requirements
Read the tasks → know what to implement
Mark tasks complete → update the artifact
Generate delta specs → document what changed
Archive → clean up

None of the other frameworks compared here provides this full lifecycle in a machine-enforceable format.
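The "mark tasks complete" step is the simplest to sketch. The code below assumes `tasks.md` uses GitHub-style Markdown checkboxes (`- [ ]` and `- [x]`); the task names in the sample are hypothetical.

```python
import re

# Matches "- [ ] text" or "* [x] text", with optional leading indentation.
CHECKBOX = re.compile(r"^(\s*[-*] \[)([ xX])(\] )(.*)$")

# Hypothetical tasks.md content for the metrics pipeline change.
SAMPLE_TASKS = (
    "- [x] Provision Prometheus\n"
    "- [ ] Expose /metrics endpoint\n"
    "- [ ] Configure scrape job\n"
)


def pending_tasks(markdown: str) -> list[str]:
    """Text of every unchecked checkbox, in document order."""
    pending = []
    for line in markdown.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group(2) == " ":
            pending.append(match.group(4).strip())
    return pending


def mark_done(markdown: str, task: str) -> str:
    """Return the document with the named unchecked task ticked."""
    out = []
    for line in markdown.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group(2) == " " and match.group(4).strip() == task:
            line = f"{match.group(1)}x{match.group(3)}{match.group(4)}"
        out.append(line)
    # Preserve the trailing newline if the input had one.
    return "\n".join(out) + ("\n" if markdown.endswith("\n") else "")
```

Because progress lives in the artifact itself, any agent or human who opens `tasks.md` sees the same state.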

3. Delta specs as a primitive

When implementation reveals that the spec was wrong or incomplete, OpenSpec captures the delta — what actually changed versus what was planned. These delta specs can be synced back to the main specification, keeping the spec alive rather than letting it fossilize.

This addresses the documentation drift that most of the other approaches leave to team discipline. The spec does not sit on a shelf: it evolves with the implementation, and the deltas provide an audit trail of every divergence.
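Syncing a delta back into the main spec can be pictured as a three-way update: added, modified, removed. The sketch below models specs as a mapping from requirement name to text. That data model is an assumption made for illustration and is not OpenSpec's on-disk format.

```python
def apply_delta(main: dict[str, str], delta: dict) -> dict[str, str]:
    """Return a new main spec with the delta applied.

    delta may contain "added" and "modified" (name -> text) and
    "removed" (list of names). Inconsistent deltas raise instead of
    being applied silently.
    """
    updated = dict(main)
    for name, text in delta.get("added", {}).items():
        if name in updated:
            raise ValueError(f"cannot add existing requirement: {name}")
        updated[name] = text
    for name, text in delta.get("modified", {}).items():
        if name not in updated:
            raise KeyError(f"cannot modify unknown requirement: {name}")
        updated[name] = text
    for name in delta.get("removed", []):
        if name not in updated:
            raise KeyError(f"cannot remove unknown requirement: {name}")
        del updated[name]
    return updated
```

Failing loudly on an inconsistent delta is a deliberate choice: a delta that tries to modify a requirement the main spec does not have is itself a sign of drift.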

4. Archive is part of the workflow, not an afterthought

Every completed change is moved to an archive directory with a date prefix. The change becomes a historical record, not a forgotten directory. This matters for compliance, post-mortems, and training AI agents on past patterns.
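The archive step itself is a directory move with a date prefix. The sketch below assumes an `archive/` folder inside the changes directory and an ISO date prefix; both are illustrative guesses at the layout, not confirmed OpenSpec behaviour.

```python
import shutil
from datetime import date
from pathlib import Path


def archive_change(changes_dir: Path, name: str, today: date) -> Path:
    """Move changes_dir/name to changes_dir/archive/<YYYY-MM-DD>-name."""
    source = changes_dir / name
    # Assumed layout: an "archive" folder next to the active changes.
    target = changes_dir / "archive" / f"{today.isoformat()}-{name}"
    if not source.is_dir():
        raise FileNotFoundError(source)
    if target.exists():
        raise FileExistsError(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(source), str(target))
    return target
```

Passing the date in, rather than reading the clock inside the function, keeps the step deterministic and easy to test.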

The Trade-offs

OpenSpec is not a replacement for every specification tool. It has real trade-offs:

Dimension	OpenSpec	Traditional Approaches
Setup overhead	Requires CLI, schema initialization	ADR: one file. Issues: zero setup
Learning curve	Artifact lifecycle must be learned	Everyone knows how to write Markdown
Tooling maturity	Emerging ecosystem	OpenAPI: decade+ of tooling
Human readability	Structured artifacts, less narrative	RFCs: natural prose, easy to read
Scope	Engineering change management	OpenAPI: API contracts only
Team size suitability	Best with AI agents or structured teams	ADRs: works for 2-person teams too

OpenSpec adds structure overhead. For a solo developer fixing a typo, a GitHub issue is more appropriate. For a multi-step infrastructure change involving provisioning, deployment, configuration, and verification — especially when AI agents are executing the work — the structure is not overhead, it's leverage.

Head-to-Head: A Deploy Scenario

To make the comparison concrete, consider a realistic DevOps scenario: adding a Prometheus metrics pipeline with custom application metrics to a production service.

Phase	OpenAPI	ADR	BDD	RFC	Issues/PRs	OpenSpec
Proposal	N/A	N/A	N/A	RFC #0032	Issue "add metrics"	`proposal.md`
Design	N/A	ADR-005 "use Prometheus"	N/A	Included in RFC	PR description	`design.md`
Specs	`/metrics` endpoint def	N/A	Given/When/Then scenarios	N/A	N/A	`specs/metrics/spec.md`
Tasks	N/A	N/A	N/A	N/A	Issue checklist	`tasks.md`
Implement	Manual	Manual	Manual	Manual	PR	`/opsx-apply`
Verify	Schema valid?	N/A	Cucumber pass?	N/A	CI checks	Agent checks + CI
Trace	N/A (no link)	N/A (no link)	N/A (separate)	N/A (separate)	Issue ↔ PR link	Full artifact chain
Archive	N/A	N/A	N/A	Closed RFC	Closed issue	Timestamped archive

In the OpenSpec workflow, a single agent can traverse the entire lifecycle. In every other approach, there are manual handoffs, information loss between phases, and no automated verification that the implementation matches the intent.

When to Use What

The right tool depends on who consumes the specification and what you need it to enforce.

CONSUMER
  │
  ▼                    ┌─────────────────────────────────┐
  AI Agent             │  OpenSpec                       │
  (autonomous)         │  Full lifecycle, agent-native   │
                       └─────────────────────────────────┘
                                    │
  ┌─────────────────────────────────────────────────────────┐
  │                    BDD / Gherkin                        │
  │                    Executable specs for behavior        │
  └─────────────────────────────────────────────────────────┘
                                    │
  ┌────────────────────┐   ┌───────────────────┐   ┌──────────────┐
  │  OpenAPI / AsyncAPI │   │     RFC Process   │   │     ADRs     │
  │  Contract formats   │   │  Collaborative    │   │  Decision    │
  │  with codegen       │   │  design reviews   │   │  log         │
  └────────────────────┘   └───────────────────┘   └──────────────┘
                                    │
  ┌─────────────────────────────────────────────────────────┐
  │          GitHub Issues / PRs / JIRA                      │
  │          General-purpose tracking, unstructured         │
  └─────────────────────────────────────────────────────────┘
                                    │
  Human                      ▲
  (ad hoc)                   │
                             ENFORCEMENT (machine-readable → verifiable)

Use OpenSpec when:

AI agents are executing implementation work alongside humans
Changes span multiple steps across infrastructure and application code
You need traceability from proposal through archive
Documentation drift is costing you debugging time
Compliance requires an audit trail of what was changed and why

Use OpenAPI/AsyncAPI when:

You need API contract validation and code generation
Your primary concern is interface compatibility between services
You have a service mesh or API gateway that consumes the spec directly

Use ADRs when:

You want a lightweight architecture decision log
No AI agents are involved
You trust humans to keep the decision records accurate

Use BDD/Gherkin when:

You need executable specifications that run in CI
Business stakeholders need to read acceptance criteria
The spec boundary is a single feature or behavior

Use RFCs when:

You need broad community input on a design
The change has long-term architectural impact
Discussion quality matters more than automation

Use Issues/PRs when:

The change is trivial (typo, single-file fix)
You have no AI agents and a small team
Structure overhead would slow you down more than it helps

The DevOps Verdict

The specification frameworks most DevOps teams use today were designed for a world where humans write code, humans review code, and humans deploy code. That world is ending.

When AI agents participate in the engineering lifecycle — proposing changes, writing implementation code, verifying requirements, generating tests — the specification framework becomes the control plane for agent behavior. A free-form RFC or an unstructured GitHub issue gives an agent ambiguous instructions. A structured OpenSpec artifact with a defined schema, resolved output paths, and explicit dependencies gives an agent deterministic guidance.

This does not mean OpenSpec replaces every other tool. This project uses OpenAPI for its Stripe integration contract, ADRs for architecture decisions, and OpenSpec for managing changes. They serve different layers of the specification stack.

But for the change management layer, the part of the workflow that connects "we should do this" to "it is deployed and verified", the existing tools leave a gap that AI agents cannot cross without human hand-holding. OpenSpec fills that gap by making the entire lifecycle machine-enforceable.

The frameworks that win in the DevOps era will not be the ones with the most features or the prettiest documentation generators. They will be the ones that AI agents can follow without human interpretation.

This article is part of the DevOps Infrastructure series on tobias-weiss.org. The OpenSpec framework is developed as part of the OpenCode project and is used to manage all changes on this site.

│     │ What & Why       │  Problem statement, scope, success     │
│     │ proposal.md      │  criteria, risks                      │
│     └────────┬─────────┘                                        │
│              │                                                   │
│              ▼                                                   │
│  2. DESIGN                                                       │
│     ┌──────────────────┐                                        │
│     │ How              │  Architecture, trade-offs, component   │
│     │ design.md        │  interactions, data flow               │
│     └────────┬─────────┘                                        │
│              │                                                   │
│              ▼                                                   │
│  3. SPECS (capability specs)                                     │
│     ┌──────────────────┐                                        │
│     │ Requirements     │  Machine-readable requirements with    │
│     │ specs/*/spec.md  │  scenarios (Given/When/Then-like)      │
│     └────────┬─────────┘                                        │
│              │                                                   │
│              ▼                                                   │
│  4. TASKS                                                        │
│     ┌──────────────────┐                                        │
│     │ Implementation   │  Atomic, ordered implementation steps  │
│     │ tasks.md         │  with checkboxes                       │
│     └────────┬─────────┘                                        │
│              │                                                   │
│              ▼                                                   │
│  5. IMPLEMENTATION (via /opsx-apply)                             │
│     ┌──────────────────┐                                        │
│     │ Code + Tests     │  AI agent executes tasks, marks        │
│     │                  │  checkboxes as completed               │
│     └────────┬─────────┘                                        │
│              │                                                   │
│              ▼                                                   │
│  6. DELTA SPECS                                                  │
│     ┌──────────────────┐                                        │
│     │ What changed     │  Diff: added/modified/removed specs    │
│     │ specs/*/spec.md  │  for syncing back to main specs        │
│     └────────┬─────────┘                                        │
│              │                                                   │
│              ▼                                                   │
│  7. ARCHIVE                                                     │
│     ┌──────────────────┐                                        │
│     │ Done             │  Change moved to archive with date     │
│     │ archive/         │  Main specs updated with delta         │
│     └──────────────────┘                                        │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
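
The planning half of this chain (stages 1 to 4) can be sketched as a small resolver that looks at a change directory and reports which artifact is still missing. This is an illustrative Python sketch, not OpenSpec's implementation: the file patterns are copied from the diagram, while the `next_stage` helper and the idea of deriving state from the file system are assumptions.

```python
from pathlib import Path

# Stage name and file pattern, in the order shown in the lifecycle diagram.
STAGES = [
    ("proposal", "proposal.md"),
    ("design", "design.md"),
    ("specs", "specs/*/spec.md"),
    ("tasks", "tasks.md"),
]


def next_stage(change_dir: Path):
    """Return the first stage whose artifact is missing, or None if all exist.

    Hypothetical helper: each stage counts as done as soon as at least one
    file matches its pattern inside the change directory.
    """
    for name, pattern in STAGES:
        if not any(change_dir.glob(pattern)):
            return name
    return None
```

Called on a change directory that contains only `proposal.md`, `next_stage` returns `"design"`. Because the order is fixed, an agent never has to guess what to write next.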

What Makes OpenSpec Different

1. AI agent as first-class consumer

# Agent asks: "What do I need to create next?"
openspec instructions design --change "add-metrics-pipeline" --json

# Response: structured, actionable, unambiguous
{
  "artifact": "design",
  "template": "...",
  "dependencies": ["proposal"],
  "resolvedOutputPath": ".openspec/changes/add-metrics-pipeline/design.md",
  "context": "..."
}
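
As a sketch of how an agent might consume that response, the Python below parses the JSON shown above and checks whether the artifact's dependencies are already satisfied. The response body is copied from the example, including its elided `"..."` values. The `plan_from_instructions` helper and the interpretation of `dependencies` as "artifacts that must exist first" are assumptions, not documented CLI behavior.

```python
import json
from pathlib import Path

# Response body copied from the example above; "..." values are elided there.
RESPONSE = """
{
  "artifact": "design",
  "template": "...",
  "dependencies": ["proposal"],
  "resolvedOutputPath": ".openspec/changes/add-metrics-pipeline/design.md",
  "context": "..."
}
"""


def plan_from_instructions(raw, existing):
    """Decide whether the requested artifact can be written yet.

    raw:      JSON text in the shape shown above.
    existing: set of artifact names that are already present.
    """
    data = json.loads(raw)
    missing = [dep for dep in data["dependencies"] if dep not in existing]
    return {
        "artifact": data["artifact"],
        "output": Path(data["resolvedOutputPath"]),
        "ready": not missing,
        "blocked_by": missing,
    }


plan = plan_from_instructions(RESPONSE, existing={"proposal"})
print(plan["artifact"], plan["output"], plan["ready"])
```

The point of the sketch is that every decision the agent makes comes from a named field, not from interpreting prose.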

2. Full change lifecycle, not a fragment

Because every stage of the lifecycle produces a defined artifact, an AI agent can:

Read the proposal → understand scope
Read the design → understand architecture
Read the specs → understand requirements
Read the tasks → know what to implement
Mark tasks complete → update the artifact
Generate delta specs → document what changed
Archive → clean up

None of the other frameworks compared here provides this full lifecycle in a machine-enforceable format.
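
The "mark tasks complete" step can be sketched as plain text manipulation on a checkbox list. The Python below is a minimal illustration under stated assumptions: the task list content is hypothetical, and the Markdown checkbox syntax (`- [ ]` and `- [x]`) is assumed, since the lifecycle diagram only says that `tasks.md` holds ordered steps with checkboxes.

```python
import re

# Hypothetical tasks.md content, invented for this illustration.
TASKS = """\
- [x] 1.1 Add Prometheus client dependency
- [ ] 1.2 Expose /metrics endpoint
- [ ] 2.1 Add scrape config
"""

# Groups: list prefix, checkbox state, closing bracket, task text.
CHECKBOX = re.compile(r"^(\s*[-*] \[)( |x|X)(\] )(.*)$")


def next_open_task(text):
    """Return the text of the first unchecked task, or None if all are done."""
    for line in text.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group(2) == " ":
            return match.group(4)
    return None


def mark_done(text, task):
    """Return the task list with the first open occurrence of `task` checked."""
    lines = []
    done = False
    for line in text.splitlines():
        match = CHECKBOX.match(line)
        if match and not done and match.group(2) == " " and match.group(4) == task:
            line = f"{match.group(1)}x{match.group(3)}{match.group(4)}"
            done = True
        lines.append(line)
    if not done:
        raise ValueError(f"no open task named {task!r}")
    return "\n".join(lines) + "\n"


task = next_open_task(TASKS)
print(mark_done(TASKS, task))
```

Since the artifact itself records progress, a human or a second agent can resume the change from the file alone.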

3. Delta specs as a primitive
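
The lifecycle diagram describes a delta spec as a diff of added, modified, and removed specs that is synced back to the main specs. A minimal Python sketch of that merge follows. The data model is an assumption made for illustration: requirements are held in a dictionary keyed by name, and the delta is a dictionary with `added`, `modified`, and `removed` entries. OpenSpec's actual on-disk format may differ.

```python
def apply_delta(main, delta):
    """Return a new requirements dict with the delta applied.

    main:  {requirement name: requirement text}
    delta: {"added": {...}, "modified": {...}, "removed": [...]}
    Inconsistent deltas are rejected instead of being applied silently.
    """
    result = dict(main)  # Never mutate the main specs in place.
    for name, text in delta.get("added", {}).items():
        if name in result:
            raise ValueError(f"cannot add existing requirement: {name}")
        result[name] = text
    for name, text in delta.get("modified", {}).items():
        if name not in result:
            raise ValueError(f"cannot modify unknown requirement: {name}")
        result[name] = text
    for name in delta.get("removed", []):
        if name not in result:
            raise ValueError(f"cannot remove unknown requirement: {name}")
        del result[name]
    return result


# Hypothetical requirements, invented for this illustration.
main_specs = {
    "metrics-endpoint": "The service SHALL expose /metrics.",
    "legacy-statsd": "The service SHALL emit StatsD packets.",
}
delta = {
    "added": {"request-counter": "The service SHALL count HTTP requests."},
    "modified": {"metrics-endpoint": "The service SHALL expose /metrics on port 9090."},
    "removed": ["legacy-statsd"],
}
print(sorted(apply_delta(main_specs, delta)))
```

Rejecting a delta that modifies or removes an unknown requirement is a design choice of this sketch: it surfaces drift between the change and the main specs at sync time.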

4. Archive is part of the workflow, not an afterthought
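
The diagram states that a finished change is moved to the archive with a date. As a rough Python sketch of that step, the helper below moves a change directory into an `archive/` folder and prefixes it with an ISO date. The `archive_change` function and the `YYYY-MM-DD-<name>` naming scheme are assumptions for illustration, not OpenSpec's documented layout.

```python
import shutil
from datetime import date
from pathlib import Path


def archive_change(changes_dir: Path, name: str, today: date) -> Path:
    """Move changes_dir/<name> to changes_dir/archive/<date>-<name>.

    Hypothetical helper: the date prefix makes archived changes sort
    chronologically and keeps repeated change names from colliding.
    """
    source = changes_dir / name
    if not source.is_dir():
        raise FileNotFoundError(f"no such change: {source}")
    target = changes_dir / "archive" / f"{today.isoformat()}-{name}"
    if target.exists():
        raise FileExistsError(f"already archived: {target}")
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(source), str(target))
    return target
```

Passing the date in as a parameter, rather than reading the clock inside the function, keeps the step reproducible in tests and in CI.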

The Trade-offs

OpenSpec is not a replacement for every specification tool. It has real trade-offs:

| Dimension | OpenSpec | Traditional Approaches |
| --- | --- | --- |
| Setup overhead | Requires CLI, schema initialization | ADR: one file. Issues: zero setup |
| Learning curve | Artifact lifecycle must be learned | Everyone knows how to write Markdown |
| Tooling maturity | Emerging ecosystem | OpenAPI: decade+ of tooling |
| Human readability | Structured artifacts, less narrative | RFCs: natural prose, easy to read |
| Scope | Engineering change management | OpenAPI: API contracts only |
| Team size suitability | Best with AI agents or structured teams | ADRs: works for 2-person teams too |

Head-to-Head: A Deploy Scenario

To make the comparison concrete, consider a realistic DevOps scenario: adding a Prometheus metrics pipeline with custom application metrics to a production service.

| Phase | OpenAPI | ADR | BDD | RFC | Issues/PRs | OpenSpec |
| --- | --- | --- | --- | --- | --- | --- |
| Proposal | N/A | N/A | N/A | RFC #0032 | Issue "add metrics" | `proposal.md` |
| Design | N/A | ADR-005 "use Prometheus" | N/A | Included in RFC | PR description | `design.md` |
| Specs | `/metrics` endpoint def | N/A | Given/When/Then scenarios | N/A | N/A | `specs/metrics/spec.md` |
| Tasks | N/A | N/A | N/A | N/A | Issue checklist | `tasks.md` |
| Implement | Manual | Manual | Manual | Manual | PR | `/opsx-apply` |
| Verify | Schema valid? | N/A | Cucumber pass? | N/A | CI checks | Agent checks + CI |
| Trace | N/A (no link) | N/A (no link) | N/A (separate) | N/A (separate) | Issue ↔ PR link | Full artifact chain |
| Archive | N/A | N/A | N/A | Closed RFC | Closed issue | Timestamped archive |

When to Use What

The right tool depends on who consumes the specification and what you need it to enforce.

CONSUMER
  │
  ▼                    ┌─────────────────────────────────┐
  AI Agent             │  OpenSpec                       │
  (autonomous)         │  Full lifecycle, agent-native   │
                       └─────────────────────────────────┘
                                    │
  ┌─────────────────────────────────────────────────────────┐
  │                    BDD / Gherkin                        │
  │                    Executable specs for behavior        │
  └─────────────────────────────────────────────────────────┘
                                    │
  ┌────────────────────┐   ┌───────────────────┐   ┌──────────────┐
  │  OpenAPI / AsyncAPI │   │     RFC Process   │   │     ADRs     │
  │  Contract formats   │   │  Collaborative    │   │  Decision    │
  │  with codegen       │   │  design reviews   │   │  log         │
  └────────────────────┘   └───────────────────┘   └──────────────┘
                                    │
  ┌─────────────────────────────────────────────────────────┐
  │          GitHub Issues / PRs / JIRA                      │
  │          General-purpose tracking, unstructured         │
  └─────────────────────────────────────────────────────────┘
                                    │
  Human                      ▲
  (ad hoc)                   │
                             ENFORCEMENT (machine-readable → verifiable)

Use OpenSpec when:

AI agents are executing implementation work alongside humans
Changes span multiple steps across infrastructure and application code
You need traceability from proposal through archive
Documentation drift is costing you debugging time
Compliance requires an audit trail of what was changed and why

Use OpenAPI/AsyncAPI when:

You need API contract validation and code generation
Your primary concern is interface compatibility between services
You have a service mesh or API gateway that consumes the spec directly

Use ADRs when:

You want a lightweight architecture decision log
No AI agents are involved
You trust humans to keep the decision records accurate

Use BDD/Gherkin when:

You need executable specifications that run in CI
Business stakeholders need to read acceptance criteria
The spec boundary is a single feature or behavior

Use RFCs when:

You need broad community input on a design
The change has long-term architectural impact
Discussion quality matters more than automation

Use Issues/PRs when:

The change is trivial (typo, single-file fix)
You have no AI agents and a small team
Structure overhead would slow you down more than it helps

The DevOps Verdict

The specification frameworks most DevOps teams use today were designed for a world where humans write code, humans review code, and humans deploy code. That world is ending.

This article is part of the DevOps Infrastructure series on tobias-weiss.org. The OpenSpec framework is developed as part of the OpenCode project and is used to manage all changes on this site.
