Kiro vs OpenSpec: Integrated AWS IDE vs Lightweight Spec Tool (2026)

By John Davenport · Published on June 23, 2026

Kiro and OpenSpec both put a specification before the code, but they sit at opposite ends of the spec-driven spectrum. Kiro is AWS’s integrated, spec-first IDE where the spec, the model, and the billing all live inside the AWS perimeter. OpenSpec is a free, repo-resident CLI layer that adds a lightweight spec to whatever agent you already run. If you are comparing “kiro vs openspec,” the real choice is not which one has specs. It is an opinionated all-in-one environment versus a thin, portable change tracker.

I have read deeply on both and run spec-first workflows on real Phoenix apps. Here is the honest comparison.

The core difference in one paragraph

Kiro turns one prompt into three structured artifacts (a requirements.md in EARS notation, a design.md, and a dependency-sequenced tasks.md) inside a VS Code-derived editor, then implements them with Bedrock-backed agents on metered credits. OpenSpec does much less on purpose: openspec init drops a propose, apply, sync, archive loop into your repo, generates lightweight delta specs that record only what each change adds, modifies, or removes, and never calls an LLM itself (no API key, no MCP). Kiro optimizes for integrated, gated, enterprise-grade structure. OpenSpec optimizes for low overhead and changing code that already exists.

Side-by-side comparison

| Dimension | Kiro (AWS) | OpenSpec (Fission AI) |
| --- | --- | --- |
| Maker | Amazon Web Services | Fission AI |
| Category | Integrated spec-first IDE + CLI | Lightweight repo-resident SDD CLI |
| Install | Kiro IDE (Code OSS fork) or CLI | npm (@fission-ai/openspec), Node 20.19+ |
| Spec format | EARS requirements + design + tasks under .kiro/specs/ | Structured Markdown, optional Given/When/Then, RFC 2119, delta specs |
| Signature feature | Gated requirements, design, tasks with traceability | Delta tracking (ADDED / MODIFIED / REMOVED) |
| Center of gravity | Greenfield to production, AWS-native | Brownfield, change-led iteration |
| Models / keys | Bedrock-routed; metered credits with markup | None; rides your existing agent, no API key, no MCP |
| Pricing | $0 / $20 / $40 / $200 credit tiers, $0.04/credit overage, no rollover | Free, MIT |
| Verification | Agent Hooks can run tests; no live-app proof | None (validate = structure; verify won’t block archive) |
| Lock-in | Kiro IDE/CLI plus Bedrock | None (30 agents, repo-resident) |
| Best for | AWS / enterprise shops wanting one integrated tool | Lightweight brownfield work on any stack |

GitHub stars

(Pricing and adoption move; re-check before you commit. Kiro is a closed product, so there is no public repo to badge.)

How each one works

Kiro’s loop is heavy and guided. One prompt (“Add a review system for products”) becomes three version-controlled documents under .kiro/specs/<feature>/: a requirements.md written in EARS notation, a design.md with data-flow diagrams and interfaces, and a tasks.md sequenced by dependency where each task links back to a specific requirement. The intended flow is a three-phase gate (requirements, then design, then tasks and execution) with a human approving each phase. Steering files hold standing project context, and Agent Hooks run agents in the background on events like file save. Models route through Amazon Bedrock, and every action consumes credits.
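To make the requirement-to-task traceability concrete, here is a minimal sketch of the kind of check it enables. The task layout below (numbered checkboxes with a `_Requirements: ..._` line) and the requirement IDs are assumptions made for illustration, not Kiro's documented file format, and `trace_tasks` is a hypothetical helper rather than anything Kiro ships.

```python
import re

# Hypothetical tasks.md content; the real layout may differ.
TASKS_MD = """\
- [ ] 1. Create review data model
  _Requirements: 1.1, 1.2_
- [ ] 2. Add review submission endpoint
  _Requirements: 2.1_
- [ ] 3. Polish styling
"""

# Hypothetical requirement IDs that requirements.md would define.
KNOWN_REQUIREMENTS = {"1.1", "1.2", "2.1", "2.2"}


def trace_tasks(tasks_md, known):
    """Return (untraced_tasks, unknown_refs, uncovered_requirements)."""
    tasks = {}
    current = None
    for line in tasks_md.splitlines():
        task = re.match(r"- \[[ x]\] (\d+)\. (.+)", line)
        if task:
            current = task.group(1)
            tasks[current] = set()
            continue
        refs = re.match(r"\s+_Requirements: (.+)_", line)
        if refs and current is not None:
            tasks[current] = {r.strip() for r in refs.group(1).split(",")}

    # Tasks that link to no requirement at all.
    untraced = sorted(t for t, refs in tasks.items() if not refs)
    referenced = set().union(*tasks.values()) if tasks else set()
    # References to requirements that do not exist.
    unknown = sorted(referenced - known)
    # Requirements that no task implements.
    uncovered = sorted(known - referenced)
    return untraced, unknown, uncovered


untraced, unknown, uncovered = trace_tasks(TASKS_MD, KNOWN_REQUIREMENTS)
```

With the sample data, task 3 comes back as untraced and requirement 2.2 as uncovered, which is the sort of gap a human reviewer would want to see before approving the tasks phase.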

OpenSpec is a thinner layer that calls no model of its own. You run openspec init, then drive the /opsx: commands. /opsx:propose generates a full change proposal (intent, design, delta specs, task checklist), /opsx:apply executes the tasks, /opsx:sync merges the change’s deltas into the main specs, and /opsx:archive files the finished change with a date prefix. Specs live under openspec/specs/[domain]/spec.md as requirements plus optional Given/When/Then scenarios, framed in the docs as a “behavior contract, not an implementation plan.” The signature mechanic is delta tracking: a change folder contains only ADDED, MODIFIED, and REMOVED requirements, so documentation compounds on archive and two in-flight changes can edit the same spec without conflicting as long as they touch different requirements.
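The delta mechanic is easier to see as data. The sketch below models a spec as a dictionary of requirement names to text and a change as a dictionary of ADDED, MODIFIED, and REMOVED entries. This is an illustrative in-memory model under my own assumptions, not OpenSpec's on-disk Markdown format or its actual merge code; `apply_delta`, `touched`, and `conflicts` are hypothetical names.

```python
def apply_delta(main_spec, delta):
    """Merge one change's delta into a copy of the main spec.

    main_spec: dict mapping requirement name -> requirement text
    delta: dict with optional "ADDED" (dict), "MODIFIED" (dict),
           and "REMOVED" (list of names) keys
    """
    merged = dict(main_spec)
    for name, text in delta.get("ADDED", {}).items():
        if name in merged:
            raise ValueError(f"ADDED requirement already exists: {name}")
        merged[name] = text
    for name, text in delta.get("MODIFIED", {}).items():
        if name not in merged:
            raise ValueError(f"MODIFIED requirement not found: {name}")
        merged[name] = text
    for name in delta.get("REMOVED", []):
        if name not in merged:
            raise ValueError(f"REMOVED requirement not found: {name}")
        del merged[name]
    return merged


def touched(delta):
    """Names of every requirement a delta adds, modifies, or removes."""
    return (
        set(delta.get("ADDED", {}))
        | set(delta.get("MODIFIED", {}))
        | set(delta.get("REMOVED", []))
    )


def conflicts(delta_a, delta_b):
    """Requirements that two in-flight changes both touch."""
    return sorted(touched(delta_a) & touched(delta_b))


# Illustrative data only.
main = {
    "Login": "The system SHALL authenticate users by email and password.",
    "Logout": "The system SHALL end the session on logout.",
}
change_a = {
    "ADDED": {"Two-Factor": "The system SHALL require a second factor."},
    "MODIFIED": {"Login": "The system SHALL authenticate users by email, password, and second factor."},
}
change_b = {"REMOVED": ["Logout"]}

overlap = conflicts(change_a, change_b)
merged = apply_delta(apply_delta(main, change_a), change_b)
```

Because the two changes touch different requirements, `overlap` is empty and both merge cleanly into the same spec. If both modified "Login", `conflicts` would name it and a human would have to reconcile the two versions.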

Kiro: strengths and weaknesses

Strengths. Kiro is one of the most genuinely spec-first tools shipping. The spec persists as editable repo artifacts rather than a throwaway chat plan, and tasks.md keeps requirement-to-task traceability. EARS notation (the Easy Approach to Requirements Syntax, from Alistair Mavin’s team at Rolls-Royce around 2009) forces testable, unambiguous phrasing like “WHEN a user submits a form with invalid data THE SYSTEM SHALL display validation errors next to the relevant fields.” The gated workflow catches design mistakes before code is written, and the AWS-native scaffolding (GovCloud, SSO, steering files, Agent Hooks, first-class MCP) is real enterprise muscle. AWS has named Kiro the successor to Amazon Q Developer, so it is the company’s consolidated dev-AI bet.
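As a rough illustration of why EARS phrasing is easy to check mechanically, the sketch below classifies a sentence against the five commonly cited EARS templates (ubiquitous, event-driven, state-driven, unwanted behavior, optional feature). The regular expressions are my own loose approximations and `classify_ears` is a hypothetical helper; nothing here is Kiro's validator.

```python
import re

# Loose approximations of the EARS templates, checked in order.
EARS_PATTERNS = [
    ("event-driven", re.compile(r"^when\b.+\bthe\b.+\bshall\b.+", re.I)),
    ("state-driven", re.compile(r"^while\b.+\bthe\b.+\bshall\b.+", re.I)),
    ("unwanted-behavior", re.compile(r"^if\b.+\bthen\b.+\bshall\b.+", re.I)),
    ("optional-feature", re.compile(r"^where\b.+\bthe\b.+\bshall\b.+", re.I)),
    ("ubiquitous", re.compile(r"^the\b.+\bshall\b.+", re.I)),
]


def classify_ears(sentence):
    """Return the name of the first matching EARS template, or None."""
    text = " ".join(sentence.split())  # collapse whitespace
    for name, pattern in EARS_PATTERNS:
        if pattern.match(text):
            return name
    return None


example = (
    "WHEN a user submits a form with invalid data THE SYSTEM SHALL "
    "display validation errors next to the relevant fields."
)
kind = classify_ears(example)
vague = classify_ears("Make the form nicer")
```

The quoted requirement classifies as event-driven, while a vague request like "Make the form nicer" matches no template. A check like this only confirms the sentence shape; it says nothing about whether the behavior is actually implemented.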

Weaknesses. Pricing distrust dominates. The August 2025 split into “vibe requests” and “spec requests,” plus a metering bug that drained limits, drew sustained backlash. On Hacker News, kermatt wrote: “Clear pricing makes it easy for you to control costs. Vibe pricing makes it easy for the vendor to maximize revenue.” ranie93 added: “Just give me dollar amounts, I feel like I’m paying these companies with vbucks at this point.” The current single-credit model is the post-backlash simplification, but per-prompt metering with non-rolling credits remains the structural complaint. The other knock is overhead for small work: writing three documents before touching code is friction for routine edits. And EARS is requirements-syntax discipline, not executable tests; the acceptance criteria standardize phrasing but do not run.
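The structural complaint is simple arithmetic. The sketch below uses the $0.04 overage rate from the comparison table; the included-credit allowance of 1,000 is a made-up number for illustration, since this article does not list per-tier allowances, and both function names are hypothetical.

```python
OVERAGE_RATE = 0.04  # dollars per extra credit, from the comparison table


def monthly_cost(base_price, included_credits, used_credits,
                 overage_rate=OVERAGE_RATE):
    """Plan price plus overage for credits used beyond the allowance."""
    extra = max(0, used_credits - included_credits)
    return round(base_price + extra * overage_rate, 2)


def unused_credits(included_credits, used_credits):
    """Credits that expire at month end because nothing rolls over."""
    return max(0, included_credits - used_credits)


# Hypothetical allowance: 1,000 credits on the $20 tier.
heavy_month = monthly_cost(20, 1000, 1250)   # 250 credits over the allowance
light_month = monthly_cost(20, 1000, 600)    # under the allowance
expired = unused_credits(1000, 600)
```

Under these assumed numbers, a heavy month costs $30 and a light month still costs $20 while 400 credits expire. That asymmetry, paying for spikes without banking the slack, is what non-rolling metered credits mean in practice.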

OpenSpec: strengths and weaknesses

Strengths. Delta tracking is genuinely good for brownfield work. Scoped ADDED/MODIFIED/REMOVED changes stop an agent from hallucinating requirements onto existing behavior and let multiple changes proceed in parallel. The output is light (around 250 lines per change in the Augment Code roundup, against Spec Kit’s roughly 800), which cuts review overhead, and the delta enables intent-based review where a reviewer reads what changed and why instead of reverse-engineering the diff. Portability is total: repo-resident files, 30 agents, no API key, no MCP, no proprietary runtime. It is frequently cited as the easiest SDD framework to start with.

Weaknesses. The dominant complaint is manual spec drift. Augment Code and intent-driven.dev both note that specs “don’t self-update during implementation. If the agent drifts (and it will), you resync manually.” One Hacker News practitioner abandoned sync entirely: “it just keeps drifting and drifting until you have duplication and contradictions across specs… maintaining the main specs is not worth it.” The same practitioner flagged a codebase-alignment gap on large repos, where the agent can miss the relevant code to change. The full proposal, design, tasks cycle is also overkill for bug fixes or copy tweaks, there is no multi-agent orchestration, and there are no enterprise features (no multi-repo, no SSO/SCIM). To be precise: OpenSpec does have Given/When/Then scenarios, but they are optional, validate checks structure only, and verify is advisory and “won’t block archive.”

Which should you choose?

Choose Kiro if you are an AWS-native or enterprise shop that wants one integrated tool, values a guided, gated workflow over assembling its own, and is comfortable inside Bedrock. The EARS rigor and requirement-to-task traceability are genuinely good, the enterprise controls are there, and if you are migrating off Amazon Q Developer it is the sanctioned path. The trade you accept is the credit-metered economics and adopting Kiro’s editor.

Choose OpenSpec if you are iterating on an existing codebase, want the lightest possible repo-resident layer with no API key, and value change tracking that records exactly what each change added, modified, or removed. It is the pragmatic low-friction default and the easiest to adopt if you are unsure where to start. Just budget for the manual sync discipline its drift problem demands.

The dividing line is clean: Kiro is the integrated, opinionated environment that owns your editor and your model; OpenSpec is the lightweight tracker that owns nothing and rides whatever you already use. Pick the constraint you can live with.

A third option: CodeMySpec

Here is what both tools share, and where both leave a gap. Both are spec-first, and neither verifies the generated code against the live, running app. Kiro’s Agent Hooks can run tests, but unit tests passing is not the same as the button actually working. OpenSpec’s verify “won’t block archive,” and its scenarios are optional. In both, the spec governs the code by convention, not by enforcement.

CodeMySpec targets exactly that gap. It is a full-lifecycle, specification-driven harness for Phoenix and Elixir, distributed as a Claude Code plugin plus a local MCP server. Three differences matter against these two tools:

  • A mandatory BDD gate. BDD scenarios (Given/When/Then) are required, not optional documents the agent should follow, and they are behavioral contracts rather than requirement phrasing. Module specs, reviews, and tests are configurable knobs; the BDD gate is not.
  • Built-in live verification. A QA subagent boots the real app, drives a real browser, screenshots the result, and files issues with severity. Unit tests pass, BDD specs pass, then the QA agent clicks the button and finds the bug anyway. Neither Kiro nor OpenSpec runs the live app to confirm the spec held.
  • Phoenix-native, and BYO everything. Phoenix contexts, LiveView, Ecto, and OTP are first-class. Specs are portable as MCP context or context files (CLAUDE.md, .cursorrules, GEMINI.md), so you bring your own agent, your own model, and your own keys with no token markup, the opposite of Kiro’s credit metering.

To be fair: repo-resident specs and bring-your-own-agent portability are not unique to CodeMySpec. OpenSpec meets that bar too, and both tools keep specs in your repo. The defensible difference is the combination: a mandatory behavioral gate plus live-app verification plus a full lifecycle on one requirement graph, built deep for one stack. It is early access and Elixir-specific, so it is not the answer if you are on a different stack. But if you are building Phoenix and the drift-and-verification gap is what worries you, it is worth a look.
