Exhibit · Coding Agents

Relay — Coding Agent Workspace

How Claude Code, Codex, and the coding agents you add next share one project context — plans, handoffs, review trails, and artifacts — so a scoped task keeps its full context when the work moves between sessions and tools.

The narrow claim is the strong one: a shared workspace and a consistent handoff pattern make coding agents more usable across sessions, tools, and reviews. Not autonomous delivery, not swarms, not zero-review coding.

Distribution Open source

Install: pip install relay-workflow
Repository: open source ↗
Status: cross-agent by design

The frame

Read on the same terms.

Every exhibit in the lab is judged against the same six questions. Here is how this one answers them.

Task

Carry a scoped coding task — align, plan, project, ship — across more than one agent session, and across hosts, without losing context.

Baseline

One-off prompting in a single coding-agent session, with no durable state, handoff discipline, or review trail.

Agent decision

The agent captures a plan into the work unit and projects it onto GitHub; implementation happens in your foreground Claude or Codex session against the shared workspace — Relay tracks the work, it does not drive the agent.

Trace

A shared workspace plus GitHub projection: AGENTS.md conventions, work-unit state, plan files, and the epic / milestones / issues / PRs that close them — all reviewable.

Score

Whether another session — or the other host — can resume from the recorded state, and a human can review what shipped and why. Two-host receipts (Claude and Codex) on a public demo repo.

Boundary

A human reviews and merges; Relay never drives an agent headlessly to write code, and plan-capture is asymmetric across hosts.

The workflow

A scoped task, staged

Each stage leaves durable state in a shared workspace, so another session — or another agent — can pick the task up where the last one left it.

01 Align

Agree the spec and success criteria first — spec.md in your editor, the task as a verifiable goal, not an imperative.
02 Plan

Capture a plan into the work unit — interactively in Claude's plan mode via the capture-plan hook, or headless with `relay plan --host codex`.
03 Project

Project the plan onto GitHub — an epic, milestones, and issues — so the task is decomposed and trackable across sessions and hosts.
04 Ship

Auto-discover the PRs that close those issues, merge them, and close the issues and milestones. (0.4.0 removed the old `implement` verb — code is written in your foreground session, not by Relay.)

Plus `relay status` (active pointer, artifact presence, drift, GitHub counts) and `relay cleanup` (detect stalled units + stale pointer, repair with --execute) — work-unit hygiene tools, not pipeline steps.

Watch it run

The same pipeline, two hosts

Recorded terminal sessions — real runs against the public demo repo. The flows are not symmetric, and that is the point: Claude captures the plan interactively; Codex runs it headless.

Claude host — interactive plan capture Claude Code ≈53s

ALIGN → PLAN in Claude's plan mode; ExitPlanMode and the hook feed the plan into the work unit, then PROJECT and SHIP run against GitHub. Topic: shout.

Codex host — headless plan Codex ≈119s

The same pipeline with no plan hook: the plan is captured headlessly via `relay plan --host codex`, then PROJECT and SHIP — the host asymmetry, shown honestly. Topic: bracket.

The receipts

Shipped on a public demo repo

Each cast above ends in a real merged pull request on relay-demo — a feature shipped through the workflow, on each host.

Topic	Host	Merged PR	Issue	Milestone
shout	Claude Code	#12	#9	#2
bracket	Codex	#35	#25	#5

PyPI · 0.4.0 ↗ GitHub release ↗ demo repo ↗ CHANGELOG ↗

How it fits together

One workspace, many agents

One workspace convention sits under every agent: AGENTS.md as the canonical project instructions, a .workspace/ tree for shared state (plans, work units, a persistent ACTIVE_WORK pointer that survives /clear), and GitHub for the projected epic / milestones / issues. The substrate is host-neutral; each agent binds to it differently — and the bindings are not symmetric.

Claude Code — recommended for plan capture

ExitPlanMode plus a PostToolUse hook feed the plan straight into the work unit, so PLAN is interactive and reliable on this host.

Codex — headless plan

Codex 0.139.0 has no plan tool or plan-event hook, so the same pipeline runs the plan headlessly via `relay plan --host codex`. The flows are not symmetric — and both casts above show it.

Future agents

Anything that can read a markdown convention and write to the workspace joins without bespoke glue — the substrate is the integration point.

The artifact bundle

What the workspace leaves behind

Illustrative of the convention — the durable files a disciplined session writes so the work is reviewable and resumable. Shown as the shape of the pattern, not a recording of one run.

Canonical project instructions AGENTS.md shared by every host

# Project Instructions

## Workspace convention
| Path | Audience | Purpose |
|---|---|---|
| `AGENTS.md` | Codex (native), Claude (via include) | Canonical instructions |
| `.workspace/memory/*.md` | Both | Persistent project state |
| `.workspace/work/` | Both | Active work units + ACTIVE_WORK pointer |

Review trail $ ruff check . && pytest -q what a human reviews

ruff check . .................... All checks passed!
pytest -q
........................                              [100%]
24 passed in 3.41s

Honest scope

The boundary

Cross-agent by design, with two-host receipts on a public demo repo — but the hosts are not symmetric: Claude captures plans interactively (ExitPlanMode + a hook), while Codex has no plan hook and runs the plan headlessly. Relay is a work-tracking discipline, not autonomous delivery: it does not drive Claude or Codex headlessly to write code — implementation is your foreground session — and there are no swarms, no zero-review coding, and no auto-deploy. A human reviews and merges. The shipped surface is 0.4.0; future verbs are small-step releases, not a v2.0 rewrite.

Why cheap, objective checks (a human reviews and merges) are the point: When do AI agents actually pay off?

Make your coding agents usable across sessions

A shared workspace and handoff discipline turns one-off prompting into reviewable, resumable work — across Claude Code, Codex, and the agents you add next.

Open the repository Install from PyPI