Case Study

Market My Spec

Zero prompts. Zero lines of code. Ten working days. Same harness.

A Phoenix MCP server built by CodeMySpec across ten working days of build activity between April 19 and May 17, 2026. About two hundred commits. Twenty-four shipped stories with criterion-level BDD spex coverage. Zero prompts written by me, zero lines of code. The same harness that shipped MetricFlow shipped this one, this time without me typing into the chat. Three named experiments made the change of mode possible. The engagement-finder loop, the agency white-labeling, and the analytics-admin MCP server are what the harness produced on top.

10 Working Days
24 Stories Shipped
315+ BDD Spex
0 Prompts by John

What is Market My Spec?

Market My Spec is a Phoenix MCP server that exposes marketing operations to Claude Code. It ships an eight-step strategy interview; an engagement-finder loop that searches subreddits and ElixirForum categories from per-account venues (ElixirForum live in prod, Reddit search pending API approval); staged touchpoint drafts with UTM-tracked link rewriting; a per-account file workspace; agency white-labeling on a custom subdomain; and a Google Analytics admin surface for custom dimensions, metrics, and key events. Magic-link auth and OAuth (GitHub, Google) live on the apex. The strategy intelligence runs in the user's own Claude Code subscription: no token markup, no inference resale.

Claude Code

Primary client. Connects with one `claude mcp add` install command

Reddit

Per-account subreddit venues; live search pending Reddit API approval

ElixirForum

Discourse-backed category venues with auto-discovered category ids (live)

Google Analytics Admin

Custom dimensions, custom metrics, key events over MCP

OAuth (GitHub, Google)

One-click sign-up resolved by provider identity, not just email

Magic-link Email

Passwordless sign-up and sign-in via Resend

Key Features

  • Marketing-strategy skill delivered to Claude Code over MCP
  • Eight-step strategy walkthrough: ICP, positioning, channels, content
  • Per-account venues for Reddit and ElixirForum, with weight and enabled flags
  • Saved searches: named multi-venue query recipes, runnable from LiveView or MCP
  • Parallel search orchestrator: weighted ranking, per-source failure isolation, pagination cursor
  • Touchpoints: staged comment drafts with UTM-rewritten link target, state machine through staged / posted / abandoned
  • Per-account S3 file workspace with read-before-overwrite gate, surfaced over MCP and a side-by-side LiveView browser
  • Agency white-labeling: globally unique subdomain claim, logo plus brand colors, per-host routing
  • Google Analytics Admin tools over a separate, dedicated MCP server
  • BYO-Claude: the user pays the inference bill, MMS keeps no token margin

User Stories

A representative slice of the stories that drove this build. Stories that ran the Three Amigos protocol expose their actual persona, rules, scenarios, and resolved questions inline. The records shown are excerpted from the harness; the full set lives in the project.

24 Stories
281 Acceptance Criteria
24 With Three Amigos

Open Source Repository

Market My Spec is open source under Apache 2.0. The harness's own source remains in the CodeMySpec repository, but the product the harness shipped is fully open for inspection, contribution, and self-hosting. Issues and PRs welcome.

View on GitHub
Language: Elixir / Phoenix
License: Apache 2.0
MCP Servers: 3 mounted
Auth: OAuth2 + magic link

The Dev Story

The Good, The Bad, and The Ugly

An honest assessment of the second product the harness shipped. Ten working days of build activity between the prior r/elixir post on April 19 and this writing on May 17. About two hundred commits. Three named harness experiments landed and held: configurable per-component workflow, Three Amigos as an agent task, BDD-spec boundary protection. On top of those landed an engagement-finder loop (search, thread fetch, touchpoint staging, UTM rewriting), agency white-labeling on a per-host subdomain, a per-account file workspace surfaced over MCP, and an analytics-admin MCP server ported from CodeMySpec. None of it shipped on faith. Every shipped story carried criterion-level BDD spex against the surfaces the agent built.

The Configurable Workflow Shipped Working

Story 671 covers the requirement-graph projector with seventeen criterion-level specs. A ProjectConfiguration row carries require_specs, require_reviews, require_tests, spec_validation, and qa_validation per project. The knobs gate module specs, spec reviews, and unit tests. BDD specs are not configurable; they are the spine of the default loop. Toggle require_specs off and the module-spec nodes drop out of the graph cleanly. Toggle require_reviews off and implementation splices onto the spec-valid edge directly. Toggle them on and off and the graph round-trips. The configurable per-component workflow stopped being a slogan the moment seventeen tests landed against it.
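A minimal sketch of how flags like these could project a per-component graph. Only the flag names (require_specs, require_reviews, require_tests) come from the story above; the module name, the step names, and the linear chain are illustrative assumptions, not the harness's actual projector.

```elixir
defmodule WorkflowSketch do
  # Hypothetical step chain. Steps tagged :always are never configurable,
  # mirroring the statement that BDD specs are the spine of the default loop.
  @chain [
    {:bdd_spex, :always},
    {:module_spec, :require_specs},
    {:spec_review, :require_reviews},
    {:implementation, :always},
    {:unit_tests, :require_tests}
  ]

  # Ordered steps that survive the given flags (missing flags count as off).
  def steps(config) do
    for {step, flag} <- @chain,
        flag == :always or Map.get(config, flag, false),
        do: step
  end

  # Edges are consecutive pairs, so a dropped step splices its neighbours together.
  def edges(config) do
    config
    |> steps()
    |> Enum.chunk_every(2, 1, :discard)
    |> Enum.map(&List.to_tuple/1)
  end
end
```

Because the projection in this sketch is a pure function of the flags, switching a flag off and back on yields the same edges, which is one simple way to get the round-trip behaviour described above.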

Three Amigos Caught Real Ambiguity

Story 678 (multi-tenancy) is the cleanest example. The agent ran the Three Amigos protocol, surfaced four open questions about agency-vs-individual UX instead of guessing, and parked them on the story. I answered in plain English. The agent then added a new rule capturing the product decision I had just made (agency accounts are admin-provisioned only, no self-service) plus a matching scenario, and only then closed out the readiness gate. Story 679 (the agency client dashboard) ran the same protocol roughly seven minutes after 678 wrapped. The pattern repeats.

Boundary Protection Installed Clean On First Run

On May 1 the spex_boundary_ready task ran for the first time on Market My Spec. It installed two framework Credo checks (deny stdlib calls and direct send/2 from inside specs), generated a project-local Credo deny list by reading architecture/proposal.md and naming the internal contexts (Repo, Mailer, Vault, Users, Integrations, McpAuth, Skills), scaffolded the curated fixtures bridge, and wrote the project BDD plan. Four artifacts in one task, zero hand-edits. The deny list grew naturally as Engagements, Agencies, and the analytics-admin server landed; every new context was named explicitly, every fixture passed through the bridge.
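To make the deny list concrete, here is a rough sketch of the kind of scan such a check performs. It is written against plain Elixir AST rather than the Credo check API, so it is not the generated check itself. The module name, the choice to match on the last alias segment, and the output shape are assumptions; the denied context names, send/2, and System.cmd come from this write-up.

```elixir
defmodule DenySketch do
  # Internal contexts named in the case study. Matching on the last alias
  # segment (so MyApp.Repo and Repo both hit) is an assumption of this sketch.
  @denied_contexts [:Repo, :Mailer, :Vault, :Users, :Integrations, :McpAuth, :Skills]

  # Returns [{call, line}] for every denied call found in a source string.
  def violations(source) do
    {_ast, found} =
      source
      |> Code.string_to_quoted!()
      |> Macro.prewalk([], &collect/2)

    Enum.reverse(found)
  end

  # Remote calls such as MyApp.Repo.insert!(x) or System.cmd("ls", []).
  defp collect({{:., meta, [{:__aliases__, _, parts}, fun]}, _, args} = node, acc)
       when is_atom(fun) and is_list(args) do
    if List.last(parts) in @denied_contexts or {parts, fun} == {[:System], :cmd} do
      call = "#{Enum.join(parts, ".")}.#{fun}/#{length(args)}"
      {node, [{call, meta[:line]} | acc]}
    else
      {node, acc}
    end
  end

  # Direct send/2 from inside a spec.
  defp collect({:send, meta, [_, _]} = node, acc) do
    {node, [{"send/2", meta[:line]} | acc]}
  end

  defp collect(node, acc), do: {node, acc}
end
```

Running `DenySketch.violations/1` over a spec file's contents lists each offending call with its line number, which is what keeps the fixtures bridge the only door.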

The Engagement Loop Shipped End-To-End Against Cassettes

Stories 705 through 716 wired the full agent loop: per-account venues (subreddits and ElixirForum categories) under VenueLive.Index and an add_venue / list_venues / update_venue / remove_venue MCP surface; saved searches as named multi-venue recipes runnable from either LiveView or MCP; a parallel Search orchestrator that fans out across enabled venues with weighted ranking and per-source failure isolation; ingested Thread rows with a normalized comment tree; staged Touchpoint drafts with UTM rewriting and a staged / posted / abandoned state machine; and ElixirForum joining Reddit behind a single Source behaviour with auto-discovered category ids. Three MCP servers now mount: marketing strategy, engagements, analytics admin. The MCP surface served seventeen tools by May 15. ElixirForum runs live against the public Discourse APIs. Reddit search is fully implemented and proven against ReqCassette recordings of real Reddit responses; it's gated on API access approval in prod (see Bad).
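The fan-out with weighted ranking and per-source failure isolation can be sketched roughly as follows. This is an illustration under assumptions, not the project's Search module: the venue map shape (name, weight, enabled, search), the score field on each hit, and the rank formula are all hypothetical, and pagination is left out.

```elixir
defmodule SearchSketch do
  # venues: list of maps with :name, :weight, :enabled and a :search function
  # (hypothetical shape). Each search returns a list of maps carrying a :score.
  def run(venues, query, timeout \\ 5_000) do
    enabled = Enum.filter(venues, & &1.enabled)

    # One task per enabled venue; a slow venue is killed instead of blocking the rest.
    outcomes =
      enabled
      |> Task.async_stream(&safe_search(&1, query), timeout: timeout, on_timeout: :kill_task)
      |> Enum.zip(enabled)

    # Keep only hits from venues that answered, weighted by the venue's weight.
    results =
      for {{:ok, {:ok, hits}}, venue} <- outcomes, hit <- hits do
        Map.put(hit, :rank, hit.score * venue.weight)
      end

    # Everything else (raised or timed out) is reported, not propagated.
    failures =
      for {outcome, venue} <- outcomes,
          not match?({:ok, {:ok, _}}, outcome),
          do: venue.name

    %{results: Enum.sort_by(results, & &1.rank, :desc), failures: failures}
  end

  # Rescue inside the task so one raising source cannot crash the caller.
  defp safe_search(venue, query) do
    {:ok, venue.search.(query)}
  rescue
    error -> {:error, error}
  end
end
```

The rescue sits inside the task on purpose: tasks started by `Task.async_stream/3` are linked to the caller, so an unrescued exception in one source would otherwise take the whole search down with it.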

Agency White-Labeling Shipped Without Touching Apex

Stories 691 and 695 added globally unique subdomain claims, a logo URL plus primary and secondary colors, and a per-host AgencyHost plug between Plug.Session and the router. The apex stays clean; unknown subdomains 302 to apex; API endpoints (/oauth/*, /mcp, /.well-known/*) are apex-only. Agency owners configure branding from a single LiveView form. The harness wrote the per-host plug, the migration with a partial unique index on subdomain (WHERE subdomain IS NOT NULL), and the spec coverage in one story pass.
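The routing decision behind a per-host plug can be sketched as a pure function, kept free of Plug so it runs on its own. The module name, return tags, and argument shape are assumptions; in particular, the write-up does not say how an apex-only path is rejected on an agency host, so this sketch only tags it and leaves the response to the caller.

```elixir
defmodule HostSketch do
  # Path prefixes described as apex-only in the case study.
  @apex_only ["/oauth/", "/mcp", "/.well-known/"]

  # Decide what to do with a request for `host` and `path`, given the apex
  # host and the list of claimed agency subdomains (hypothetical inputs).
  def resolve(host, path, apex, known_subdomains) do
    suffix = "." <> apex

    cond do
      host == apex ->
        :apex

      String.ends_with?(host, suffix) ->
        subdomain = String.replace_suffix(host, suffix, "")

        cond do
          # Unknown subdomains go back to the apex (a 302 in the real plug).
          subdomain not in known_subdomains -> {:redirect, apex}
          # API surfaces are not served on agency hosts.
          String.starts_with?(path, @apex_only) -> {:apex_only, path}
          true -> {:agency, subdomain}
        end

      # Hosts outside the apex domain are treated like unknown subdomains here.
      true ->
        {:redirect, apex}
    end
  end
end
```

Keeping the decision separate from the connection handling makes each branch (apex, agency, redirect, apex-only) straightforward to cover with one assertion.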

Zero Prompts, Orchestration Only

The headline number is honest with one footnote: I orchestrated. I picked which subagent ran when, called retries, and occasionally said "do it yourself, no subagent on this one." The writing was the agent's, top to bottom. The 0/0 stat is the result. The orchestration framing is the truth that prevents the headline from reading as marketing.

The Harness Started Specifying Itself

Story 553 landed within nine hours of the prior r/elixir post going live. Seven criterion specs against the harness's own ProjectConfiguration pipeline. By May 5 the harness was running roughly two hundred criterion-level Spex against its own internal modules. By May 17 the count is larger; the framework that builds Phoenix applications now has its own BDD coverage as its backstop, with cassette-backed pipeline fixtures for the validation paths and a release CI that fails on warnings.

What We Changed After Market My Spec

The Layers Are Knobs, Not Doctrine

MetricFlow taught us that the six-phase pipeline ran most of its layers for ceremony rather than catching real problems. Market My Spec made the layers configurable instead of removing them. Module specs, spec reviews, and unit tests are still available; they're one config flip away. The default path is the shortest one that ships working software, and you opt into ceremony as the surface stabilizes. The harness rewires its own requirement graph when the toggles change.

Example Mapping Belongs Inside The Harness

The MetricFlow conclusion called Example Mapping the missing step. Market My Spec made it a real agent task with twelve criterion-level specs against the readiness rules, hosted in the harness, runnable from Claude Code through MCP. Persona, Rules, Scenarios, Questions, all persisted as records, not flat files. The records make the gate machine-checkable. The agent calls evaluate_task when it thinks the conditions are met. The PM holds product intent.
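What "machine-checkable" can mean for records like these is easiest to see in a small sketch. The readiness rules below are guesses for illustration only (the actual twelve criteria are not listed here), and the module name and story shape are hypothetical.

```elixir
defmodule ReadinessSketch do
  # story: %{persona: string | nil, rules: [%{scenarios: list}], questions: [%{answer: term}]}
  # (hypothetical shape). Returns :ready or {:not_ready, reasons}.
  def evaluate(story) do
    problems =
      [
        {story.persona in [nil, ""], :missing_persona},
        {story.rules == [], :no_rules},
        {Enum.any?(story.rules, &(&1.scenarios == [])), :rule_without_scenario},
        {Enum.any?(story.questions, &is_nil(&1.answer)), :open_questions}
      ]
      |> Enum.filter(&elem(&1, 0))
      |> Enum.map(&elem(&1, 1))

    if problems == [], do: :ready, else: {:not_ready, problems}
  end
end
```

With records instead of flat files, a parked question is just a row with no answer, so the gate stays closed until a human supplies one.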

Specs Should Drive The User's Surfaces, Not The Internals

The MetricFlow Potemkin Village failure mode was the coding agent and the QA agent collaborating to ship broken functionality. The fix is structural: a sealed boundary that forces specs to drive the same LiveView, controller, and MCP surfaces a real user drives. Repo is denied. Direct send/2 is denied. System.cmd is denied. The fixtures bridge is the only door, and it is grep-able. The harness refuses to walk into boundary-shaped work until the gate is green.

An MCP Tool Wired To A Stub Adapter Will Green Through Tautological Specs

Story 705 shipped to prod with the search adapter returning an empty list and eighteen spex iterating over the empty list to claim coverage. The spex bar has to enforce a real reply envelope, not a pass-on-NotImplemented escape hatch. New review rule: any spex whose assertions live inside `Enum.each(empty_collection, ...)` is rejected at review; any MCP tool whose adapter returns the empty case unconditionally needs a fixture-backed real-world call before merge. The cleanup on May 16 rebuilt the adapter behind ReqCassette recordings against real Reddit responses and split the cross-source criteria into story 714.
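The failure shape is small enough to show directly. In this sketch (module and function names are hypothetical, and a bare pattern match stands in for a spex assertion), the first check passes on an empty list because its body never runs, while the second refuses the empty case before checking any item.

```elixir
defmodule EnvelopeSketch do
  # Vacuous: with an empty list the function body never runs, so nothing
  # is actually asserted and the check "passes".
  def vacuous_check(results) do
    Enum.each(results, fn result -> true = is_binary(result.title) end)
    :ok
  end

  # Stricter: require at least one result before checking each item.
  def strict_check([_ | _] = results) do
    Enum.each(results, fn result -> true = is_binary(result.title) end)
    :ok
  end

  def strict_check([]), do: {:error, :empty_results}
end
```

A stub adapter that always returns `[]` sails through `vacuous_check/1` and is stopped by `strict_check/1`, which is the distinction the new review rule is meant to enforce.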

Operator-Supplied Config Has To Be Read At Runtime

HostResolver's compile-time read of :apex_host took UAT down on its first deploy. Migrations that drop or rename columns have to be sequenced around deploys, because the code that is still running may read the old column. Both are categories the harness can flag in review: any `Application.compile_env/2` call against a key set by runtime.exs is suspect, and any migration that touches a column the running code reads warrants a two-deploy split. Operationally these are one-line fixes; the value is in spotting the shape before it ships.
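The compile-time versus runtime distinction can be sketched in a few lines. The application name :market_my_spec and the module below are assumptions for illustration; only the :apex_host key and the `Application.compile_env/2` call come from the incident described above.

```elixir
defmodule HostConfigSketch do
  # Compile-time read: the value is frozen when this module is compiled,
  # which in a release happens before runtime.exs is evaluated.
  @compiled_apex_host Application.compile_env(:market_my_spec, :apex_host)

  def compiled_host, do: @compiled_apex_host

  # Runtime read: looked up on every call, so it sees operator-supplied config.
  def runtime_host, do: Application.fetch_env!(:market_my_spec, :apex_host)
end

# Simulate an operator-supplied value arriving after compilation.
Application.put_env(:market_my_spec, :apex_host, "example.test")
```

After the `put_env` call, `HostConfigSketch.runtime_host/0` sees the new value while `HostConfigSketch.compiled_host/0` still returns whatever was present at compile time, which is the shape worth flagging in review.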

Orchestration Is The Next Bottleneck

I didn't write prompts. I didn't write code. I did pick which subagent ran when, call retries, and occasionally route around a bad path. That is the work that's left, and it is the work the next experiment has to absorb. Auto-orchestration through the requirement graph is the obvious next step. We are not there yet.

Ten working days. About two hundred commits. Twenty-four shipped stories. Zero prompts. Zero lines of code from me. The harness shipped Market My Spec the same way it shipped MetricFlow, but this time without me typing into the chat. The fix wasn't more automation. It was knobs on the layers, a real gate before the specs, and a sealed boundary around them. Configurable workflow, Three Amigos, and boundary protection. Three harness changes, one Phoenix application with three MCP servers mounted on top, an engagement-finder loop wired against cassettes and live against ElixirForum (Reddit live search is one credential approval away), and one open question about orchestration. That's the loop that ships.