▸ the_harness_layer v0.9 · early_access

Ship with Claude Code. Hold the codebase together.

The harness layer for engineers shipping with Claude Code. Specs as the source of truth, BDD scenarios that compile to standard ExUnit, browser QA against the live app. Phoenix-native today, polyglot core via MCP. One /plugin install away.

$ /plugin marketplace add Code-My-Spec/plugins
$ /plugin install codemyspec@codemyspec
~/your-app · /codemyspec · session
// 09:14 · next requirement
get_next_requirement
context_spec · accounts · ready
// 10:32 · spec → BDD
start_task three_amigos
4 rules · 12 scenarios · linked
// 13:08 · BDD → ExUnit
mix spex --continuous
spec · test · code · all green
// 16:45 · live-app QA
start_task qa_journey
17/17 healthy · merge
 

01// the_frame

AI writes code fast.
The codebase is slipping anyway.

GitClear's 211M-line study: refactoring down 60%, duplication up 48% post-AI. In Phoenix codebases specifically: Copilot suggesting Ruby in Elixir files, ChatGPT regressing Phoenix 1.7 templates to 1.6 syntax, defensive code fighting let-it-crash, GenServer deadlocks in generated supervision trees. The agent is fast. The architecture is decaying at the same rate.

The fix isn't a better prompt or another IDE. It's the harness — the architectural discipline a real engineering team uses to keep work compounding instead of accumulating as tech debt. Specs as the source of truth. Contexts as AI boundaries. Decisions captured where the agent can read them. Verification the agent can't shortcut.

02// the_harness

Specs in. Tests out.
Codebase intact.

Three layers on one requirement graph. Specs are markdown in your repo. Tests are standard ExUnit. QA is a browser driving the running app. No DSL to learn, no test runner to adopt, no lock-in.

/spec
// the spec layer

Specs the agent has to read.

Component specs, context specs, ADRs — markdown in .code_my_spec/. A requirement graph tracks dependencies and names the next task. Three Amigos sessions turn each acceptance criterion into rules and Given/When/Then scenarios — written before any code is.
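One way to picture "names the next task" is a dependency lookup: the next requirement is the first unfinished one whose dependencies are all done. The sketch below is illustrative only. `RequirementGraph`, its function, and the requirement names are hypothetical and do not describe how CodeMySpec stores or walks its graph.

```elixir
# Hypothetical sketch, not CodeMySpec's implementation.
defmodule RequirementGraph do
  @moduledoc "Picks the next unblocked requirement from a dependency map."

  # graph: %{requirement => [dependencies]}
  # done:  MapSet of requirements already satisfied
  def next_requirement(graph, done) do
    graph
    # Sort by name so the choice is deterministic.
    |> Enum.sort_by(fn {name, _deps} -> name end)
    |> Enum.find_value(fn {name, deps} ->
      unfinished = not MapSet.member?(done, name)
      unblocked = Enum.all?(deps, &MapSet.member?(done, &1))
      # find_value returns the first truthy result, so return the name or nil.
      if unfinished and unblocked, do: name
    end)
  end
end

# Made-up requirement names for illustration.
graph = %{
  accounts_context_spec: [],
  accounts_schema: [:accounts_context_spec],
  accounts_bdd: [:accounts_context_spec],
  accounts_code: [:accounts_schema, :accounts_bdd]
}

done = MapSet.new([:accounts_context_spec])
IO.inspect(RequirementGraph.next_requirement(graph, done))
```

With only the context spec done, the code requirement stays blocked until both the schema and the BDD scenarios exist, which is the ordering the spec layer describes.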

→ See the methodology
/code
// the code layer

Polyglot core. Phoenix-deep.

Six subagents (context architect, schema writer, BDD writer, code writer, fixer, reviewer), each scoped to one slice. The core lifecycle works on any stack via MCP. The Phoenix-deep path adds contexts as AI boundaries, respects OTP supervision, and preserves let-it-crash, for the engineers who want their agent to know what a context is.
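For readers newer to Phoenix, a context boundary looks roughly like the sketch below: one module is the public API, and everything behind it is internal. `Billing` and `Billing.Invoice` are hypothetical names chosen for illustration, and the sketch uses a plain struct instead of Ecto so it runs on its own.

```elixir
# Hypothetical sketch of a context boundary; names are illustrative only.
defmodule Billing.Invoice do
  # Internal to the Billing context; callers should not build this directly.
  @moduledoc false
  defstruct [:customer_id, :amount_cents, status: :draft]
end

defmodule Billing do
  @moduledoc """
  Public API of the (hypothetical) Billing context.
  Code outside the context calls these functions and nothing else.
  """
  alias Billing.Invoice

  @spec create_invoice(map()) :: {:ok, %Invoice{}} | {:error, :invalid_attrs}
  def create_invoice(%{customer_id: id, amount_cents: cents})
      when is_integer(cents) and cents > 0 do
    {:ok, %Invoice{customer_id: id, amount_cents: cents}}
  end

  # Anything else is rejected at the boundary rather than patched up inside it.
  def create_invoice(_attrs), do: {:error, :invalid_attrs}
end

IO.inspect(Billing.create_invoice(%{customer_id: 1, amount_cents: 500}))
```

Scoping an agent to one context then amounts to letting it edit `Billing` and its internals while treating every other context's public functions as fixed.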

→ See the engineering brief
/verify
// the verify layer

Tests the agent can't fake.

BDD scenarios compile to standard ExUnit — no proprietary runner, no DSL. A QA subagent drives the real Phoenix app in a browser and reports what actually broke. Anthropic's research is explicit: the verification standard you set is the verification standard you get. This is the standard.
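As a rough idea of what "standard ExUnit" means here, a rule with two Given/When/Then scenarios can be written with nothing but `describe`, `setup`, and `test`. This is a hand-written sketch, not actual CodeMySpec output: the `Accounts` module and its `register_user/1` function are hypothetical stand-ins for a real context.

```elixir
# Hand-written sketch; Accounts is a hypothetical in-memory stand-in.
ExUnit.start()

defmodule Accounts do
  # Registers a user when the email contains "@", normalizing it to lowercase.
  def register_user(%{email: email}) when is_binary(email) do
    if String.contains?(email, "@") do
      {:ok, %{email: String.downcase(email)}}
    else
      {:error, :invalid_email}
    end
  end
end

defmodule AccountsScenarioTest do
  use ExUnit.Case, async: true

  describe "Rule: registration requires a valid email" do
    setup do
      # Given a visitor with no account
      %{params: %{email: "Ada@Example.com"}}
    end

    test "Scenario: visitor registers with a valid email", %{params: params} do
      # When they register
      result = Accounts.register_user(params)

      # Then the account exists with a normalized email
      assert {:ok, %{email: "ada@example.com"}} = result
    end

    test "Scenario: visitor registers with a malformed email" do
      # When they register with an address that has no "@"
      result = Accounts.register_user(%{email: "not-an-email"})

      # Then registration is refused
      assert {:error, :invalid_email} = result
    end
  end
end
```

Saved as a `.exs` file, this should run with plain `elixir`. Inside a Mix project the `ExUnit.start()` call normally lives in `test/test_helper.exs`, and `mix test` picks the file up like any other test.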

→ How verification works

03// who_its_for

Built for engineers who care about the codebase.

If you're a solo technical founder shipping production code with nobody else in the building, you're the primary buyer. If you run an Elixir agency delivering across multiple clients, the same harness becomes a shared artifact — specs in the repo, MCP across the team, contexts as boundaries every engineer respects.

▸ Primary

Solo technical founder

1 person · production code · already inside Claude Code

Reads code. Ships product. Watches the codebase start to slip under AI-generated volume. Wants the discipline a real engineering team would have without hiring one. Has tried Cursor, Aider, raw Claude Code — past the hype cycle, wants the architecture to compound.

"I want to run a software business by myself. The LLM does the labor. I do the thinking."

Install the plugin →
▸ Secondary

Elixir agency

5-30 people · 3-10 client projects · CTO sets the AI policy

Phoenix contexts as shared boundaries across client codebases. Specs as artifacts the team hands the client. One MCP server, every engineer's Claude Code session reading the same graph. The defensible answer when a client asks how you ship AI-assisted code safely. CTO-led adoption, one project at a time — paid team tier lands at 1.0; early-access deployments now.

"I need a defensible answer when a client asks how we prevent AI from ruining the codebase."

Talk to John →

04// proof

Real founder. Real apps.
Real recursion.

featured · open-sourcing soon

MetricFlow

Multi-context Phoenix app. Phoenix contexts, Ecto schemas, OTP supervision, ExUnit. Google Ads, GA, Facebook, QuickBooks integrations. Built end to end with /codemyspec. The codebase is the case study.

40 commits · 13d end-to-end · 5 integrations

Multi-context Phoenix app shipped end-to-end in 13 days. 5 SaaS integrations (Google Ads, GA, Facebook, QuickBooks, more). Every line written through /codemyspec.

▸ github.com/Code-My-Spec/metricflow
production · uat

Fuellytics

Production client app built with the CodeMySpec methodology. In UAT March 2026. Real customers, real uptime.

▸ fuellytics.app
recursive · self-hosted

codemyspec.com itself

The harness built the harness. This site, the LiveView pages, the blog, the embedding pipeline: all written by /codemyspec. The same harness also built marketmyspec.com.

▸ See the repo
Built by John Davenport — six years on r/elixir, working Phoenix engineer. Building the harness he needs to ship his own apps. Open. In real time.

05// pricing

Free during early access.

early access · 2026

$0 / month

Plugin stays free forever. Team-tier platform features (multi-engineer MCP, persistence, audit log) will be priced per-seat post-1.0. BYOK throughout — we never mark up your tokens.

  • Unlimited specs, requirements, BDD scenarios
  • All six subagents · full lifecycle · live-app QA
  • BYO model · BYO keys · no token markup
  • Specs and tests live in your repo · no lock-in
  • Works with Claude Code today

06// objections

Questions engineers actually ask.

Q.01 Is this the harness from /blog/the-harness-layer?
Yes. This is the harness — specs, requirement graph, BDD scenarios in standard ExUnit, browser QA. Phoenix-native today, polyglot core via Claude Code and MCP.
Q.02 I'm a solo technical founder, not on an engineering team. Does this still work?
That's exactly who it's built for. The harness gives one person the architectural discipline a real team would have, without hiring one. The engineer-founder is the primary audience: the person who reads the specs, ships the product, and is on call when it breaks at 2am.
Q.03 I run an Elixir agency. Does this fit our delivery model?
Yes — specs as deliverables, MCP servers exposing one source of truth to every engineer's Claude Code, Phoenix contexts as shared boundaries across client projects. CTO-led adoption, one project at a time. Onboarding is high-touch today by design — book time with John.
Q.04 How is this different from Kiro / OpenSpec / GitHub Spec Kit?
Those tools generate specs and hand off to other agents to write the code. CodeMySpec owns the whole lifecycle on one requirement graph — specs through architecture through code through BDD scenarios through live-app verification. The spec, the test, and the QA evidence are all linked to the same node. No hand-off, no drift.
Q.05 Is this open source?
Plugin is free, open, and inspectable. Specs and tests are markdown in your repo — no lock-in. Multi-engineer team features (persistence, audit log, shared MCP) will be priced per-seat post-1.0. BYOK throughout.
Q.06 What if you build something better tomorrow?
You own the artifacts. Specs, tests, ADRs, requirements — all live in your repo as markdown. No lock-in. The platform brings continuity across them; you can always walk.
Q.07 What stack do I need to use this?
The Phoenix-deep path is the wedge — by design. Phoenix contexts are already the right AI boundary; generic AI tools blow through them, generate Ruby syntax in Elixir files, regress 1.7 templates to 1.6, and add defensive code that fights let-it-crash. CodeMySpec respects how Phoenix is actually built. The polyglot core (specs, requirement graph, BDD, browser QA) works on any stack via MCP; other stack-deep paths are on the roadmap.
Q.08 How much time per week?
5-10 hours/week to keep the harness running well. Heavier the first month — Three Amigos sessions, the spec round-trip dialed in, QA briefs written. Lighter after that. The graph fights you if you try to ship faster than you can verify.
Q.09 Does my code or my client's code leave my machine?
The plugin runs locally inside Claude Code. Your code, your specs, and your model API calls go directly from your terminal to your model provider — never through CodeMySpec servers. We collect plugin usage telemetry (no code content) to improve the product. Specs and tests live in your repo, in markdown.

07// install

Install it.
See if it holds.

Two commands. No signup gate. No credit card. The plugin is open and inspectable; specs and tests live in your repo from minute one.

/codemyspec
/plugin marketplace add Code-My-Spec/plugins
/plugin install codemyspec@codemyspec

Running an agency or evaluating for a team? ▸ Talk to John