Market My Spec Case Study: Zero Prompts, Zero Code, One Honest Teardown

What is Market My Spec?

Market My Spec is a Claude Code plugin that walks you through building a real marketing strategy. It interviews you about your product, dispatches research agents for customer personas and competitive alternatives, and synthesizes positioning, messaging, channels, and a 90-day plan. After the strategy lands, it picks one to three activities a day from the plan and points you at the skills that execute them across Reddit, Twitter, blog content, and SEO.

Reddit

Scan venues, mine threads for opportunities, draft replies

Twitter / X

Search relevant conversations and draft replies in voice

Google Analytics

Pull traffic and conversion signal into the daily plan

Search Console

Track keyword and indexation signal for SEO activities

Claude Code

Distributed as a plugin; runs alongside the user's existing tooling

Key Features

Eight-step guided marketing strategy flow with research agent dispatch
Customer persona research and competitive alternatives synthesis
Positioning, messaging, channel selection, and 90-day plan generation
Daily plan picker that selects one to three activities and routes to the right skill
Reddit and Twitter scanning skills for venue discovery and reply drafting
Touchpoint drafting workflow with voice-matched comment and reply templates
MCP-backed strategy artifacts persisted alongside the build, not in flat files

User Stories

A representative slice of the stories that drove this build. Stories that ran the Three Amigos protocol expose their actual persona, rules, scenarios, and resolved questions inline. The records shown are excerpted from the harness; the full set lives in the project.

5

Stories

14

Acceptance Criteria

2

With Three Amigos

Source

A snapshot of the Market My Spec build is available for inspection on GitHub. The harness's own source remains in the CodeMySpec repository. The MMS snapshot is offered for verification of the case study's claims; it will not stay open source forever.

View on GitHub

Elixir / Phoenix

Language

Claude Code plugin

Distribution

MCP-backed records

Strategy Storage

OAuth2 + magic link

Auth

The Dev Story

The Good, The Bad, and The Ugly

An honest assessment of the second product the harness shipped. The window covered is April 19 through May 5, 2026, the seventeen days between the prior r/elixir post and this one. About 200 commits across the harness, the marketing surface, and the product itself. Three named experiments landed inside that window, every one of them with criterion-level Spex covering its evaluator rules.

The Configurable Workflow Shipped Working

Story 671 covers the requirement-graph projector with seventeen criterion-level specs. A ProjectConfiguration row carries require_specs, require_reviews, require_tests, spec_validation, and qa_validation per project. Toggle require_specs off and the spec nodes drop out cleanly. Toggle require_reviews off and implementation splices onto the spec-valid edge directly. Toggle them on and off and the graph round-trips. The configurable per-component workflow stopped being a slogan the moment seventeen tests landed against it.

Three Amigos Caught Real Ambiguity

Story 678 (multi-tenancy) is the cleanest example. The agent ran the Three Amigos protocol, surfaced four open questions about agency-vs-individual UX instead of guessing, and parked them on the story. I answered in plain English. The agent then added a new rule capturing the product decision I had just made ("agency accounts are admin-provisioned only, no self-service") plus a matching scenario, and only then closed out the readiness gate. Without the gate, that ambiguity would have flowed straight into BDD specs.

Boundary Protection Installed Clean On First Run

On May 1 the spex_boundary_ready task ran for the first time on Market My Spec. It installed two framework Credo checks (deny stdlib calls and direct send/2 from inside specs), generated a project-local Credo deny list by reading architecture/proposal.md and naming seven internal contexts (Repo, Mailer, Vault, Users, Integrations, McpAuth, Skills), scaffolded the curated fixtures bridge, and wrote the project BDD plan. Four artifacts in one task, zero hand-edits.

Zero Prompts, Orchestration Only

The headline number is honest with one footnote: I orchestrated. I picked which subagent ran when, called retries, and occasionally said "do it yourself, no subagent on this one." The writing was the agent's, top to bottom. The 0/0 stat is the result. The orchestration framing is the truth that prevents the headline from reading as marketing.

The Harness Started Specifying Itself

Story 553 landed within nine hours of the prior r/elixir post going live. Seven criterion specs against the harness's own ProjectConfiguration pipeline. By May 5 the harness was running roughly two hundred criterion-level Spex against its own internal modules, with cassette-backed pipeline fixtures for the validation paths and a release CI that fails on warnings. The framework that builds Phoenix applications now has the framework's own BDD coverage as its backstop.

Spec Review Came Back On Earlier Than Hoped

Story 609 (magic-link sign-up) shipped BDD-first on May 1 with no spec review and no unit tests. Two days later I flipped spec review back on for the project because a less-careful model had written sloppy specs and the surface was now important enough to read the contract first. The knob worked exactly as designed, but the "how do you know when to flip it" answer is still operator judgment, not a machine signal.

Orchestration Was Still Required

Interventions during the build clustered at orchestration boundaries. "Use this subagent here, don't use one there. Retry that. Pause and let me check." The dominant intervention shape was switching the agent's execution mode rather than fixing its code. The 0/0 number is real, but the post is honest: I orchestrated, the agent built, and orchestration mostly meant routing.

Story 561's Orphan Filter Took Three Passes

May 3 was the longest day of the window. The orphan-context filter on next_actionable went in, was extended to the full subtree, was reverted when fixture damage outweighed the win, and then landed in narrower form. The harness can absorb a wrong-shaped change and a clean revert; the case for narrower scope on graph-projector changes was made out loud, in commits.

The Configurable Workflow Has No Automated Signal

Right now the operator decides when to flip a layer on. Module specs go on when you want to read the contract before code is written. Spec review goes on when agent drift is showing across components. Unit tests go on when you're about to refactor non-trivial internal logic. None of those triggers are machine-detectable today. That is the next experiment, not a closed problem.

The Specs Got Sloppy When The Knob Was Off

Story 609 (magic-link sign-up) shipped on May 1 via the BDD-only path: Three Amigos, six BDD criterion specs, twelve implementation files, no spec_review and no module specs in the requirement graph. Two days later I spun up a fresh session with the brief "You are reviewing every BDD spec file in this project. The specs were written by a less-careful model and need a thorough audit." The six spec files needed the full sweep. That's the configurable-workflow argument with its consequences in the open: turn the layer off, ship faster, occasionally pay for the layer being off in cleanup. The fix is the same knob in the other direction.

Boundary Protection Has No Live-Catch Transcript Yet

The four artifacts are in place. The framework Credo checks are active. The project-local check denies the right modules. The fixtures bridge is the only sanctioned door. The BDD plan is written. What's missing is a transcript moment where Credo fires live on a violation and the agent has to back up and try again. The artifacts work as designed. The live catch will get staged in the demo video; until then, the boundary-protection claim stands on installation evidence and design intent, not on a real-time catch.

The Mixed-Session Corpus

Forty-seven Claude Code sessions covered the build window. Twenty had build-side signal. All twenty were also marketing-operations sessions, because Market My Spec is the marketing tool we use to do CodeMySpec's marketing. There are no "build-only" sessions in the corpus. Transcript mining for this case study had to happen at the cwd-and-tool-call level inside each session. Privacy review on every excerpt. No customer data leaked, but the operational cost was real.

The Three Amigos Gate Is Structural, Not Semantic

The readiness check confirms records exist: at least one persona linked, at least one rule, every rule has at least one scenario, scenarios outnumber open questions. It does not confirm the rules are right. Quality lives in the agent's prompt and the human PM's review. Whether a one-PM Three Amigos session can substitute for a real workshop with a human dev and a human QA is an open question, not a closed one.

What We Changed After Market My Spec

The Layers Are Knobs, Not Doctrine

MetricFlow taught us that the six-phase pipeline ran most of its layers for ceremony rather than catching real problems. Market My Spec made the layers configurable instead of removing them. Module specs, spec reviews, and unit tests are still available; they're one config flip away. The default path is the shortest one that ships working software, and you opt into ceremony as the surface stabilizes. The harness rewires its own requirement graph when the toggles change.

Example Mapping Belongs Inside The Harness

The MetricFlow conclusion called Example Mapping the missing step. Market My Spec made it a real agent task with twelve criterion-level specs against the readiness rules, hosted in the harness, runnable from Claude Code through MCP. Persona, Rules, Scenarios, Questions, all persisted as records, not flat files. The records make the gate machine-checkable. The agent calls evaluate_task when it thinks the conditions are met. The PM holds product intent.

Specs Should Drive The User's Surfaces, Not The Internals

The MetricFlow Potemkin Village failure mode was the coding agent and the QA agent collaborating to ship broken functionality. The fix is structural: a sealed boundary that forces specs to drive the same LiveView, controller, and MCP surfaces a real user drives. Repo is denied. Direct send/2 is denied. System.cmd is denied. The fixtures bridge is the only door, and it is grep-able. The harness refuses to walk into boundary-shaped work until the gate is green.

Orchestration Is The Next Bottleneck

I didn't write prompts. I didn't write code. I did pick which subagent ran when, call retries, and occasionally route around a bad path. That is the work that's left, and it is the work the next experiment has to absorb. Auto-orchestration through the requirement graph is the obvious next step. We are not there yet.

Roughly ten working days. Zero prompts. Zero lines of code. The harness shipped Market My Spec the same way it shipped MetricFlow, but this time without me typing into the chat. The fix wasn't more automation. It was knobs on the layers, a real gate before the specs, and a sealed boundary around them. Configurable workflow, Three Amigos, and boundary protection. Three changes, one product, one open question about orchestration. That's the loop that ships.

Read the Full Methodology