01// the_harness

Lovable for engineers who care about the code.

Specs, code, tests, verification — end to end, Elixir-first. CodeMySpec is the harness that sits around your AI coding agent so the code you ship still makes sense in two years.

install · 2 commands

$ /plugin marketplace add Code-My-Spec/plugins

$ /plugin install codemyspec@codemyspec

~/metricflow — mix spex

$ mix spex --continuous

→ loading requirement graph · 142 nodes · 0 cycles

→ next: DashboardLive.render/1 (pending spec)

↪ spec-writer drafting component spec…

↪ test-writer generated 7 ExUnit cases

↪ code-writer implementing · 1 file changed

✓ mix test · 7 passed · 0 failed

✓ mix spex.verify · live browser · 2 screenshots

! qa-agent filed #CMS-412 · dashboard tile empty

02// the_frame

AI writes code fast.
That's the easy part.

The hard part is what happens next. Codebases built with AI assistance are showing 60% less refactoring, 48% more duplication, and a three-month wall where velocity drops off a cliff.

Efficient technical debt machines. The problem isn't the model. It's the missing harness — no specs, no architecture, no verification, no lifecycle. Prompting is praying. That's the gap CodeMySpec fills.

03// how_it_works

One requirement graph.
Six subagents. Full lifecycle.

CodeMySpec tracks every artifact your Phoenix app needs (specs, tests, implementations, BDD scenarios, QA results) on a single dependency graph. Call get_next_requirement. Do one thing. The graph moves forward.
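
To make the "do one thing, the graph moves forward" idea concrete, here is a minimal sketch of picking the next unit of work from a dependency graph. It is an illustration only: the module name, the map-of-dependencies shape, and the next_requirement/2 function are assumptions made for this example, not CodeMySpec's actual data model or the real get_next_requirement call.

```elixir
defmodule RequirementGraphSketch do
  @moduledoc """
  Hypothetical sketch, not CodeMySpec's implementation.
  A graph is a map from a requirement name to the names it depends on.
  """

  # Returns {:ok, name} for the first requirement (in sorted order) that is
  # not yet satisfied and whose dependencies are all satisfied.
  # Returns :done when everything is satisfied, or :blocked when work
  # remains but nothing is ready (for example, because of a cycle).
  def next_requirement(graph, satisfied) do
    ready =
      graph
      |> Enum.sort()
      |> Enum.find(fn {name, deps} ->
        not MapSet.member?(satisfied, name) and
          Enum.all?(deps, &MapSet.member?(satisfied, &1))
      end)

    case ready do
      {name, _deps} -> {:ok, name}
      nil -> if all_satisfied?(graph, satisfied), do: :done, else: :blocked
    end
  end

  # True when every requirement in the graph has been satisfied.
  defp all_satisfied?(graph, satisfied) do
    Enum.all?(Map.keys(graph), &MapSet.member?(satisfied, &1))
  end
end
```

With a graph such as %{story: [], spec: [:story], test: [:spec], code: [:test]} and nothing satisfied yet, the sketch returns {:ok, :story}. Once :story is marked satisfied it returns {:ok, :spec}, and so on until it returns :done.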

01 pm

Product

Story interview. The Product Manager agent asks what you want and writes it as markdown with acceptance criteria.

02 arch

Architecture

Map stories to Phoenix contexts. Validate the graph. No cycles, no cross-context leaks.
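
The "no cycles" rule can be sketched as a depth-first search over the dependency graph. This is a generic cycle check under assumed names (GraphValidationSketch, acyclic?/1) and an assumed map-of-dependencies shape. It does not show how CodeMySpec validates its own graph, and it does not cover the cross-context leak check.

```elixir
defmodule GraphValidationSketch do
  @moduledoc """
  Hypothetical sketch of a cycle check, not CodeMySpec's validator.
  A graph is a map from a node name to the names it depends on.
  """

  # Returns true when no node can reach itself by following dependencies.
  def acyclic?(graph) do
    result =
      Enum.reduce_while(Map.keys(graph), MapSet.new(), fn name, done ->
        case visit(graph, name, MapSet.new(), done) do
          {:ok, done} -> {:cont, done}
          :cycle -> {:halt, :cycle}
        end
      end)

    result != :cycle
  end

  # Depth-first walk. `path` holds the nodes on the current call stack;
  # `done` holds nodes already proven to be cycle-free.
  defp visit(graph, name, path, done) do
    cond do
      MapSet.member?(path, name) ->
        :cycle

      MapSet.member?(done, name) ->
        {:ok, done}

      true ->
        path = MapSet.put(path, name)
        deps = Map.get(graph, name, [])

        result =
          Enum.reduce_while(deps, {:ok, done}, fn dep, {:ok, acc} ->
            case visit(graph, dep, path, acc) do
              {:ok, acc} -> {:cont, {:ok, acc}}
              :cycle -> {:halt, :cycle}
            end
          end)

        case result do
          {:ok, acc} -> {:ok, MapSet.put(acc, name)}
          :cycle -> :cycle
        end
    end
  end
end
```

Tracking the current path separately from the set of finished nodes is what lets the walk distinguish a true cycle from a node that is merely shared by two dependents.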

03 spec

Spec & Test

spec-writer drafts component specs. test-writer generates ExUnit from assertions. TDD, the way you know it.

04 impl

Implement

code-writer writes to pass the tests. Continuous mode walks the graph until every requirement is satisfied.

05 bdd

BDD specs

User-facing behavior captured as BDD scenarios in the Spex DSL, generated from acceptance criteria.

06 qa

Verification

QA agent opens a real browser via Vibium, drives the app, takes screenshots, files issues when reality diverges.

It's the harness OpenAI spent six months building for Codex — productized, for teams that don't have OpenAI's infrastructure team.

Read the full methodology →

04// differentiators

What makes CodeMySpec different.

01

lifecycle

Full-lifecycle platform, not a point tool.

Requirements → specs → architecture → code → tests → verification. One graph, one system. Every other tool owns a single phase and assumes you'll glue the rest together yourself.

02

agent_agnostic

Bring your own agent. Bring your own model. Bring your own keys.

Specs are plain markdown. Tests are standard ExUnit. The plugin works with Claude Code today and whatever agent wins next year. You pay Anthropic or OpenAI directly — CodeMySpec doesn't arbitrage your token spend.

03

elixir_native

Elixir-native by design.

Phoenix contexts, LiveView components, Ecto schemas, OTP supervision — first-class primitives the platform understands. Cursor doesn't know what a context is. CodeMySpec does.

04

verification

Verification is part of the product, not homework.

Every spec produces acceptance criteria and generated tests. The QA agent drives the live app with a real browser, not a mock. You know the feature works before you ship it, not after the alert fires.

05

the_thesis

Harness engineering, productized.

"The agent is commodity. The harness is the differentiator." CodeMySpec is the harness — ready to use, purpose-built for Phoenix.

05// the_money_shot

Prompting is praying.
Verification is a guarantee.

Unit tests pass. BDD specs pass. Then the QA agent opens a real browser, clicks through the flow a user would take, and finds the bug anyway. That's the loop no other AI coding tool has.

terminal · metricflow ✓ all green

$ mix test

Compiling 2 files (.ex)

Running ExUnit with seed: 0xCAFE

...............................................

Finished in 4.2 seconds

47 tests, 0 failures

$ mix spex

Loading BDD scenarios from spex/

Feature: Dashboard renders revenue tile

✓ Given a signed-in user

✓ When they visit /dashboard

✓ Then revenue tile is present

3 scenarios, 0 failures

qa-agent · browser verification ! issue filed

localhost:4000/dashboard

Total MRR: $284,419
Active Users: 12,847
Revenue · 30d: empty
Conversion: 4.2%

SEVERITY · HIGH #CMS-412

Revenue tile renders empty on /dashboard

filed by qa-agent · 2026-04-22 14:08 · spec: DashboardLive

Expected chart svg in [data-testid="rev-30d"]; got empty div. Screenshot + LiveView trace attached.

Tests passed. · Feature broken. · QA caught it.
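
One way a scenario like "revenue tile is present" can pass while the chart is missing is the difference between a presence check and a content check. The sketch below illustrates that difference on raw HTML strings. The module and function names are hypothetical, the regex is a deliberately crude stand-in for a real HTML parser, and nothing here is taken from MetricFlow's actual tests or the QA agent's implementation.

```elixir
defmodule TileCheckSketch do
  @moduledoc """
  Hypothetical illustration of presence vs. content assertions.
  """

  # Presence check: passes as long as the tile element exists at all.
  def tile_present?(html) do
    String.contains?(html, ~s(data-testid="rev-30d"))
  end

  # Content check: passes only when an <svg> is the first thing inside
  # the tile. Assumes the tile is a <div>; crude on purpose.
  def chart_rendered?(html) do
    Regex.match?(~r/<div[^>]*data-testid="rev-30d"[^>]*>\s*<svg\b/, html)
  end
end
```

For an empty tile such as <div data-testid="rev-30d"></div>, tile_present?/1 returns true while chart_rendered?/1 returns false, which mirrors the gap described in the issue above.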

06// proof

Real apps. Real Elixir.
Built with CodeMySpec.

featured · open source

MetricFlow

Multi-context Phoenix app with Google Ads, Google Analytics, Facebook, and QuickBooks integrations. Built end to end with CodeMySpec. The codebase itself is the case study.

40 commits · 13d end-to-end · 5 integrations

github.com/Code-My-Spec/metricflow →

production · uat

Fuellytics

Production client app built with CodeMySpec methodology. In UAT March 2026. Real customers, real uptime.

Read the case study →

self-hosted · recursion

CodeMySpec is built with CodeMySpec.

The requirement graph you use is the requirement graph the product tracks itself against. The recursion is the proof.

See the repo →

Built by John Davenport — a working Phoenix engineer, not a tourist. Six years on r/elixir. Real code, in production, right now.

07// objections

Questions engineers actually ask.

"I already use Claude Code / Cursor. Why do I need this?"

You keep using your agent. CodeMySpec runs inside Claude Code as a plugin. It adds the lifecycle layer those tools don't have: specs that survive code changes, a requirement graph that keeps the agent on the rails, and browser-based verification after the tests pass.

"Isn't this just writing more docs?"

Specs aren't docs. They generate tests, constrain the agent, and survive when the code changes. The doc-writing version of this is why nobody did it before. Agents changed the economics — twenty minutes on a spec now saves two hours of prompt iteration.

"Am I paying twice — once for Claude and once for you?"

No. CodeMySpec doesn't mark up model tokens. You keep your own provider, your own keys, your own bill. We charge for the platform — server, specs, verification, team features. Not for agent use.

"What happens if CodeMySpec shuts down?"

Specs are markdown in your repo. Tests are standard Elixir. You own the artifacts. No lock-in.

"Looks over-engineered for my project."

Use the lightweight path — one spec per feature, generated tests, no ceremony. The graph scales up if the project does.

"I'm not on Elixir. Should I care?"

The core lifecycle — plugin, MCP servers, requirement graph — works on any stack today. The Elixir-native features are the deepest integration. Broader stack support is on the roadmap once the Elixir beachhead is won.

08// pricing

Free during early access.

● Early access · 2026

$0 / month

CodeMySpec is free during early access. Server-side platform features will be priced per-seat once we hit 1.0.

You bring your own model provider and keys — we never charge a markup on your tokens.

✓ Unlimited specs, tests, BDD scenarios
✓ Continuous mode · requirement graph
✓ QA agent with browser verification (Vibium)
✓ Works with Claude Code today
✓ No token markup · ever