Kiro vs OpenSpec: Integrated AWS IDE vs Lightweight Spec Tool (2026)
Kiro and OpenSpec both put a specification before the code, but they sit at opposite ends of the spec-driven spectrum. Kiro is AWS’s integrated, spec-first IDE where the spec, the model, and the billing all live inside the AWS perimeter. OpenSpec is a free, repo-resident CLI layer that adds a lightweight spec to whatever agent you already run. If you are comparing “kiro vs openspec,” the real choice is not which one has specs. It is an opinionated all-in-one environment versus a thin, portable change tracker.
I have read deeply on both and run spec-first workflows on real Phoenix apps. Here is the honest comparison.
The core difference in one paragraph
Kiro turns one prompt into three structured artifacts (a requirements.md in EARS notation, a design.md, and a dependency-sequenced tasks.md) inside a VS Code-derived editor, then implements them with Bedrock-backed agents on metered credits. OpenSpec does much less on purpose: openspec init drops a propose, apply, sync, archive loop into your repo, generates lightweight delta specs that record only what each change adds, modifies, or removes, and never calls an LLM itself (no API key, no MCP). Kiro optimizes for integrated, gated, enterprise-grade structure. OpenSpec optimizes for low overhead and changing code that already exists.
Side-by-side comparison
| Dimension | Kiro (AWS) | OpenSpec (Fission AI) |
|---|---|---|
| Maker | Amazon Web Services | Fission AI |
| Category | Integrated spec-first IDE + CLI | Lightweight repo-resident SDD CLI |
| Install | Kiro IDE (Code OSS fork) or CLI | npm (@fission-ai/openspec), Node 20.19+ |
| Spec format | EARS requirements + design + tasks under .kiro/specs/ | Structured Markdown, optional Given/When/Then, RFC 2119, delta specs |
| Signature feature | Gated requirements, design, tasks with traceability | Delta tracking (ADDED / MODIFIED / REMOVED) |
| Center of gravity | Greenfield to production, AWS-native | Brownfield, change-led iteration |
| Models / keys | Bedrock-routed; metered credits with markup | None; rides your existing agent, no API key, no MCP |
| Pricing | $0 / $20 / $40 / $200 credit tiers, $0.04/credit overage, no rollover | Free, MIT |
| Verification | Agent Hooks can run tests; no live-app proof | None (validate = structure; verify won’t block archive) |
| Lock-in | Kiro IDE/CLI plus Bedrock | None (30 agents, repo-resident) |
| Best for | AWS / enterprise shops wanting one integrated tool | Lightweight brownfield work on any stack |
(Pricing and adoption figures change; re-check before you commit. Kiro is a closed product, so there is no public repo to badge.)
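To make the pricing row concrete, here is a minimal sketch of how a no-rollover credit plan with metered overage adds up. It is an illustration, not Kiro's billing logic: the $0.04 overage rate and the no-rollover rule come from the table above, while the function name monthly_cost and the 1,000 included credits in the example calls are assumptions (check the pricing page for real allotments).

```python
def monthly_cost(base_price: float, included_credits: int, used_credits: float,
                 overage_rate: float = 0.04) -> float:
    """Estimate one month's bill under a no-rollover credit plan.

    included_credits is a hypothetical allotment; the article does not
    state how many credits each tier includes.
    """
    # Only usage beyond the allotment is billed as overage.
    overage = max(0.0, used_credits - included_credits)
    # Unused credits earn nothing back: with no rollover they simply expire.
    return round(base_price + overage * overage_rate, 2)


# Hypothetical $20 tier with an assumed 1,000 credits.
print(monthly_cost(20, 1000, 1250))  # 250 credits over the allotment
print(monthly_cost(20, 1000, 400))   # under the allotment, still the base price
```

The second call is the structural complaint in miniature: a light month costs the same as a full one, because the unused credits do not carry forward.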
How each one works
Kiro’s loop is heavy and guided. One prompt (“Add a review system for products”) becomes three version-controlled documents under .kiro/specs/<feature>/: a requirements.md written in EARS notation, a design.md with data-flow diagrams and interfaces, and a tasks.md sequenced by dependency where each task links back to a specific requirement. The intended flow is a three-phase gate (requirements, then design, then tasks and execution) with a human approving each phase. Steering files hold standing project context, and Agent Hooks run agents in the background on events like file save. Models route through Amazon Bedrock, and every action consumes credits.
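The two properties that paragraph attributes to tasks.md, dependency sequencing and requirement-to-task traceability, can be sketched as follows. This is an assumption-laden illustration, not Kiro's file format or implementation: the real tasks.md is Markdown, and the dictionary shape, the IDs (T1, R1), and both function names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical in-memory model of a tasks.md; Kiro's real file is Markdown.
tasks = {
    "T1": {"requirement": "R1", "depends_on": []},
    "T2": {"requirement": "R1", "depends_on": ["T1"]},
    "T3": {"requirement": "R2", "depends_on": ["T1", "T2"]},
}
requirements = {"R1", "R2", "R3"}


def execution_order(tasks):
    """Return task IDs so that every task follows its dependencies."""
    graph = {task_id: set(task["depends_on"]) for task_id, task in tasks.items()}
    return list(TopologicalSorter(graph).static_order())


def untraced_requirements(tasks, requirements):
    """Return requirements that no task links back to."""
    covered = {task["requirement"] for task in tasks.values()}
    return sorted(requirements - covered)


print(execution_order(tasks))                      # ['T1', 'T2', 'T3']
print(untraced_requirements(tasks, requirements))  # ['R3']
```

A requirement with no task, like R3 here, is the kind of gap the traceability links are meant to surface before implementation starts.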
OpenSpec is a thinner layer that calls no model of its own. You run openspec init, then drive the /opsx: commands from your agent: /opsx:propose generates a full change proposal (intent, design, delta specs, task checklist), /opsx:apply executes the tasks, /opsx:sync merges the change’s deltas into the main specs, and /opsx:archive files the finished change with a date prefix. Specs live under openspec/specs/[domain]/spec.md as requirements plus optional Given/When/Then scenarios, framed in the docs as a “behavior contract, not an implementation plan.” The signature mechanic is delta tracking: a change folder contains only ADDED, MODIFIED, and REMOVED requirements. As a result, documentation compounds on archive, and two in-flight changes can edit the same spec without conflicting as long as they touch different requirements.
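The delta mechanic is easier to see in code. The sketch below models a spec as a mapping from requirement name to text and a change as ADDED / MODIFIED / REMOVED sections. It is not OpenSpec's implementation or file format (OpenSpec stores these as Markdown); the data shapes and the function names touched, conflicts, and apply_delta are assumptions made for illustration.

```python
def touched(delta):
    """Names of every requirement a delta adds, modifies, or removes."""
    return (set(delta.get("ADDED", {}))
            | set(delta.get("MODIFIED", {}))
            | set(delta.get("REMOVED", [])))


def conflicts(delta_a, delta_b):
    """Two in-flight changes conflict only if they touch the same requirement."""
    return sorted(touched(delta_a) & touched(delta_b))


def apply_delta(spec, delta):
    """Return a new spec with the delta merged in; the input is not mutated."""
    merged = dict(spec)
    for name, text in delta.get("ADDED", {}).items():
        if name in merged:
            raise ValueError(f"cannot add existing requirement: {name}")
        merged[name] = text
    for name, text in delta.get("MODIFIED", {}).items():
        if name not in merged:
            raise ValueError(f"cannot modify missing requirement: {name}")
        merged[name] = text
    for name in delta.get("REMOVED", []):
        if name not in merged:
            raise ValueError(f"cannot remove missing requirement: {name}")
        del merged[name]
    return merged


main_spec = {"Login": "Users SHALL log in with email.", "Logout": "Users SHALL log out."}
change_a = {"MODIFIED": {"Login": "Users SHALL log in with email or SSO."}}
change_b = {"ADDED": {"Password reset": "Users SHALL reset passwords by email."},
            "REMOVED": ["Logout"]}

print(conflicts(change_a, change_b))  # [] because they touch different requirements
print(apply_delta(apply_delta(main_spec, change_a), change_b))
```

Because each change names only the requirements it touches, the conflict check is a set intersection rather than a text diff, which is why two changes to the same spec file can proceed in parallel.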
Kiro: strengths and weaknesses
Strengths. Kiro is one of the most genuinely spec-first tools shipping. The spec persists as editable repo artifacts rather than a throwaway chat plan, and tasks.md keeps requirement-to-task traceability. EARS notation (the Easy Approach to Requirements Syntax, from Alistair Mavin’s team at Rolls-Royce around 2009) forces testable, unambiguous phrasing like “WHEN a user submits a form with invalid data THE SYSTEM SHALL display validation errors next to the relevant fields.” The gated workflow catches design mistakes before code is written, and the AWS-native scaffolding (GovCloud, SSO, steering files, Agent Hooks, first-class MCP) is real enterprise muscle. AWS has named Kiro the successor to Amazon Q Developer, so it is the company’s consolidated dev-AI bet.
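To show why EARS phrasing is mechanically checkable, here is a small sketch that recognizes the event-driven pattern used in the quoted example. It covers only that one pattern, not the full EARS syntax, and it is not how Kiro validates requirements; the regular expression and the function name parse_event_driven are assumptions for illustration.

```python
import re

# Event-driven EARS shape as quoted above: WHEN <trigger> THE SYSTEM SHALL <response>
EVENT_DRIVEN = re.compile(r"WHEN (?P<trigger>.+?) THE SYSTEM SHALL (?P<response>.+)")


def parse_event_driven(line):
    """Split an event-driven requirement into trigger and response, or return None."""
    match = EVENT_DRIVEN.fullmatch(line.strip())
    if match is None:
        return None
    return {
        "trigger": match["trigger"].strip(),
        "response": match["response"].strip().rstrip("."),
    }


requirement = ("WHEN a user submits a form with invalid data THE SYSTEM SHALL "
               "display validation errors next to the relevant fields.")
print(parse_event_driven(requirement))
print(parse_event_driven("Users should see errors when something goes wrong."))  # None
```

The loosely worded second sentence fails to parse, which is the point: the fixed keywords force every requirement to name a trigger and an observable response. Parsing the phrasing still says nothing about whether the behavior works, so this is a syntax check, not a test.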
Weaknesses. Pricing distrust dominates. The August 2025 split into “vibe requests” and “spec requests,” plus a metering bug that drained limits, drew sustained backlash. On Hacker News, kermatt wrote: “Clear pricing makes it easy for you to control costs. Vibe pricing makes it easy for the vendor to maximize revenue.” ranie93 added: “Just give me dollar amounts, I feel like I’m paying these companies with vbucks at this point.” The current single-credit model is the post-backlash simplification, but per-prompt metering with non-rolling credits remains the structural complaint. The other knock is overhead for small work: writing three documents before touching code is friction for routine edits. And EARS is requirements-syntax discipline, not executable tests; the acceptance criteria standardize phrasing but do not run.
OpenSpec: strengths and weaknesses
Strengths. Delta tracking is genuinely good for brownfield work. Scoped ADDED/MODIFIED/REMOVED changes stop an agent from hallucinating requirements onto existing behavior and let multiple changes proceed in parallel. The output is light (around 250 lines per change in the Augment Code roundup, against Spec Kit’s roughly 800), which cuts review overhead, and the delta enables intent-based review where a reviewer reads what changed and why instead of reverse-engineering the diff. Portability is total: repo-resident files, 30 agents, no API key, no MCP, no proprietary runtime. It is frequently cited as the easiest SDD framework to start with.
Weaknesses. The dominant complaint is manual spec drift. Augment Code and intent-driven.dev both note that specs “don’t self-update during implementation. If the agent drifts (and it will), you resync manually.” One Hacker News practitioner abandoned sync entirely: “it just keeps drifting and drifting until you have duplication and contradictions across specs… maintaining the main specs is not worth it.” The same practitioner flagged a codebase-alignment gap on large repos, where the agent can miss the relevant code to change. The full proposal, design, tasks cycle is also overkill for bug fixes or copy tweaks, there is no multi-agent orchestration, and there are no enterprise features (no multi-repo, no SSO/SCIM). To be precise: OpenSpec does have Given/When/Then scenarios, but they are optional, validate checks structure only, and verify is advisory and “won’t block archive.”
Which should you choose?
Choose Kiro if you are an AWS-native or enterprise shop that wants one integrated tool, value a guided gated workflow over assembling your own, and are comfortable inside Bedrock. The EARS rigor and requirement-to-task traceability are genuinely good, the enterprise controls are there, and if you are migrating off Amazon Q Developer it is the sanctioned path. The trade you accept is the credit-metered economics and adopting Kiro’s editor.
Choose OpenSpec if you are iterating on an existing codebase, want the lightest possible repo-resident layer with no API key, and value change tracking that records exactly what each change added, modified, or removed. It is the pragmatic low-friction default and the easiest to adopt if you are unsure where to start. Just budget for the manual sync discipline its drift problem demands.
The dividing line is clean: Kiro is the integrated, opinionated environment that owns your editor and your model; OpenSpec is the lightweight tracker that owns nothing and rides whatever you already use. Pick the constraint you can live with.
A third option: CodeMySpec
Here is what both tools share, and where both leave a gap. Both are spec-first, and neither verifies the generated code against the live, running app. Kiro’s Agent Hooks can run tests, but unit tests passing is not the same as the button actually working. OpenSpec’s verify “won’t block archive,” and its scenarios are optional. In both, the spec governs the code by convention, not by enforcement.
CodeMySpec targets exactly that gap. It is a full-lifecycle, specification-driven harness for Phoenix and Elixir, distributed as a Claude Code plugin plus a local MCP server. Three differences matter against these two tools:
- A mandatory BDD gate. BDD scenarios (Given/When/Then) are required, not optional documents the agent should follow, and they are behavioral contracts rather than requirement phrasing. Module specs, reviews, and tests are configurable knobs; the BDD gate is not.
- Built-in live verification. A QA subagent boots the real app, drives a real browser, screenshots the result, and files issues with severity. Unit tests pass, BDD specs pass, then the QA agent clicks the button and finds the bug anyway. Neither Kiro nor OpenSpec runs the live app to confirm the spec held.
- Phoenix-native, and BYO everything. Phoenix contexts, LiveView, Ecto, and OTP are first-class. Specs are portable as MCP context or context files (CLAUDE.md, .cursorrules, GEMINI.md), so you bring your own agent, your own model, and your own keys with no token markup, the opposite of Kiro’s credit metering.
To be fair: repo-resident specs and bring-your-own-agent portability are not unique to CodeMySpec. OpenSpec meets that bar too, and both tools keep specs in your repo. The defensible difference is the combination: a mandatory behavioral gate plus live-app verification plus a full lifecycle on one requirement graph, built deep for one stack. It is early access and Elixir-specific, so it is not the answer if you are on a different stack. But if you are building Phoenix and the drift-and-verification gap is what worries you, it is worth a look.
Related Articles
- Spec-Driven Development in 2026: The Complete Guide and Tool Comparison
- Kiro Specs Explained: EARS, Spec Mode, and the Trade-offs
- OpenSpec Explained: Repo-Native Spec-Driven Development
- OpenSpec vs Spec Kit: Lightweight vs Full Toolkit
- Spec Kit vs Kiro: Which Spec-Driven Tool in 2026?
- CodeMySpec vs OpenSpec: Enforced Specs vs Living Docs
- CodeMySpec
Sources
- https://kiro.dev/blog/introducing-kiro/ (launch, 2025-07-14): spec workflow, vibe/spec mode, hooks, MCP.
- https://kiro.dev/docs/specs/ (docs): three-file spec model, three-phase gated workflow, task waves.
- https://kiro.dev/docs/specs/feature-specs/ (docs): EARS template and example, requirements.md structure.
- https://kiro.dev/pricing/ (pricing): 2026 credit tiers, overage $0.04/credit, models, GovCloud, what consumes credits.
- https://aws.amazon.com/blogs/devops/amazon-q-developer-end-of-support-announcement/ (2026-04-30): Q Developer EOL, Kiro named successor.
- https://www.infoworld.com/article/4042912/aws-blames-bug-for-kiro-pricing-glitch-that-drained-developer-limits.html (2025-09): vibe/spec pricing, metering bug, AWS response.
- https://news.ycombinator.com/item?id=44942600 (“wallet-wrecking tragedy” thread): quoted sentiment (kermatt, ranie93).
- https://github.com/Fission-AI/OpenSpec (official repo): MIT, npm install, Node 20.19+, opsx workflow, delta specs, brownfield framing.
- https://github.com/Fission-AI/OpenSpec/blob/main/docs/workflows.md (docs): propose/apply/sync/archive, /opsx:verify “won’t block archive,” scenarios optional.
- https://github.com/Fission-AI/OpenSpec/blob/main/docs/cli.md (docs): validate structural only, archive --no-validate, no API key / no LLM auth.
- https://www.augmentcode.com/tools/best-spec-driven-development-tools (2026-03-07 roundup): ~250 vs ~800 lines, drift, no multi-agent, no enterprise/SSO.
- https://news.ycombinator.com/item?id=47994433 (HN practitioner): drift complaint, large/legacy alignment gap.
- https://reqassist.com/blog/ears-requirements-syntax (EARS background): Mavin / Rolls-Royce 2009 origin, five requirement types.