In Depth · Xiaohu Explains

The fourth step every team skips is the key to AI engineering

Every runs 5 products with essentially one-person teams. The crux is one extra step taken after every feature ships: store the solution back into the system, so the AI dodges the same pitfall on its own next time.

TL;DR

Every uses a methodology called Compound Engineering to maintain its 5 products with essentially one-person engineering teams. The core is a four-step loop: Plan → Work → Review → Compound.
Traditional engineering stops at Review. The fourth step, Compound, turns every problem you solve into system knowledge, so the AI automatically avoids the same class of mistake next time. That's where the efficiency gap comes from.
The method argues engineers spend 80% of their time on Plan and Review, and only 20% on Work and Compound, the part that includes actually writing code.
The companion plugin is open source. It supports Claude Code, with experimental support for OpenCode and Codex, and ships 26 specialized agents, 23 workflow commands, and 13 skills, all working with zero config.
/workflows:review fires 14 specialized agents in parallel to review code in a single call; /workflows:plan in ultrathink mode can run 40+ research agents concurrently.

⚑ A note on framing: this piece is Every's own account of its Compound Engineering methodology and its in-house open-source plugin. The concurrency numbers, time splits, and product counts are all the company's own figures. What follows just explains how it works and what each number means.

1 Background

How one person holds up five products

Every recently published a methodology called Compound Engineering, plus a companion open-source plugin, explaining how it maintains five products at once with engineering teams that are essentially one person each.

The five are Cora, Monologue, Sparkle, Spiral, and the Every.to site, each with basically a single-engineer team. What holds this scale up isn't longer hours, but the last step of a four-step loop that most teams leave out.

◆

Why it's worth a look: Every open-sourced what it normally runs only internally, including 14 AIs reviewing one piece of code at once, 40+ research agents running concurrently during planning, plus 26 specialized agents. It's one of the most concrete open-source references out there for multi-agent parallel engineering.

2 The Problem

Why code gets harder to touch over time

Most codebases get harder to maintain over time, and the reason isn't complicated: every feature you add injects fresh complexity into the system, and the new feature has to "negotiate" with all the old ones. Ten years in, a team spends more time wrestling with legacy code than building anything new, and the code becomes harder to understand, harder to change, harder to trust.

Compound Engineering flips that curve. A feature is no longer a burden added to the system but a new skill taught to it; fix one bug and you wipe out a whole future class of the same bug; a solution, once locked in, becomes a tool you can reuse directly next time. The more you iterate, the better the system gets.

Same x-axis (more features / iterations), two trajectories: a traditional codebase gets harder to change with every edit, while Compound Engineering makes each iteration leave the system smoother.

3 The Main Loop

The four-step loop: 80% of the time isn't writing code at all

What supports this scale is a four-step loop: Plan, Work, Review, Compound — then repeat. Whether you spend five minutes fixing a bug or several days building a feature, you walk through the same four steps; only the time each one takes differs.

Any developer knows the first three. The fourth step, Compound, is the dividing line between Compound Engineering and ordinary engineering. Skip it and all you're doing is "traditional engineering with an AI assistant."

The four steps flow along the dark-green track in sequence. Traditional engineering stops at Review; Compound Engineering takes one more step, Compound, handing what this round learned to the next.

The counterintuitive part: writing code is at most a fifth of the time

Plan and Review together take up 80% of an engineer's time; actually writing it (Work) plus locking it in (Compound) is only 20%. Most of the thinking happens before the code is written and after.

Plan + Review: 80%

Work + Compound: 20%

What each step does / a few old beliefs Every says to drop

Plan: turn an idea into a blueprint. Nail down requirements and constraints, study how similar features are built in the codebase, check framework docs and best practices, design the approach, then validate it holds up.

Work: first spin up an isolated environment with a git worktree (an isolated sandbox copy of your repo — each task can open its own and run in parallel without stepping on the others); the agent implements the plan step by step, running tests, linting (automated code checks), and type checks after every change.
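As a rough illustration of the worktree step, here is a minimal Python sketch that builds the `git worktree add` call for one task. The `task/` branch prefix, the sibling-directory naming, and both function names are assumptions made for this example; the article does not say how the plugin names its worktrees.

```python
import subprocess
from pathlib import Path


def worktree_command(repo: Path, task: str) -> list[str]:
    """Build the `git worktree add` call for one task.

    The branch prefix and directory naming are illustrative choices.
    """
    branch = f"task/{task}"
    path = repo.parent / f"{repo.name}-{task}"
    # `git worktree add -b <branch> <path>` creates a new branch and checks
    # it out into its own directory, sharing history with `repo`.
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path)]


def create_worktree(repo: Path, task: str) -> None:
    """Run the command; raises CalledProcessError if git reports a failure."""
    subprocess.run(worktree_command(repo, task), check=True)


if __name__ == "__main__":
    # Only prints the command; call create_worktree() inside a real repo.
    print(" ".join(worktree_command(Path("/tmp/app"), "fix-login")))
```

Because each task gets its own directory and branch, two agents can edit and run tests at the same time without touching each other's working files.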

Review: several specialized agents review in parallel, tagging issues as P1 (must fix) / P2 (should fix) / P3 (could fix), revalidating after the fixes, and recording what went wrong this time.

Compound: distill the solution into reusable knowledge and write it back into the system — the next section is all about this.

Every also lists a few old beliefs worth dropping:

"Code must be handwritten": your job is to ship maintainable code that solves the right problem; who types it doesn't matter.

"The first version should be good": in their experience the first draft is 95% garbage and the second still 50%. That's the process, and the goal is to iterate fast enough that landing the third version costs less than the first.

"You can't learn without typing it yourself": today understanding beats muscle memory, and you learn more patterns reviewing 10 AI implementations than handwriting 2.

"Code is self-expression": code was never yours personally. It belongs to the team, the product, and the users.

4 Step Four · Core

How step four actually works: turning a solution into the system's memory

The first three steps (Plan, Work, Review) produce "a feature." The fourth step, Compound, produces "a system that builds every feature better than the last." On the ground, it comes down to four actions.

① Record the solution: what worked, what didn't, what's reusable

② Add metadata: tag it with YAML frontmatter for later retrieval

③ Update CLAUDE.md: write the new pattern into the file the agent reads on every startup

④ Verify it learned: will it catch the same issue on its own next time?

The YAML frontmatter in step two is a block of metadata tags at the top of a document (title, category, keywords) that lets a solution be searched by criteria and retrieved precisely later — instead of getting buried in a pile of docs the moment it's written.
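To make the frontmatter idea concrete, here is a small Python sketch that renders a solution note with a metadata header. The field names (`title`, `category`, `tags`, `date`) and the `render_solution` helper are assumptions for illustration; the plugin's actual schema is not given in this piece. The sketch also assumes simple values with no quotes or commas inside them.

```python
from datetime import date


def render_solution(title: str, category: str, tags: list[str],
                    body: str, created: date) -> str:
    """Return a markdown document that starts with a YAML frontmatter block."""
    lines = [
        "---",                                # opening fence of the frontmatter
        f'title: "{title}"',
        f"category: {category}",
        "tags: [" + ", ".join(tags) + "]",    # YAML flow-style list
        f"date: {created.isoformat()}",
        "---",                                # closing fence
        "",
        body.strip(),
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_solution(
        title="N+1 query in comment loading",
        category="performance",
        tags=["n-plus-one", "database"],
        body="Load comments with one joined query instead of one per post.",
        created=date(2025, 1, 1),
    ))
```

The header is what makes the note retrievable by category or tag later, while the body below it stays free-form markdown.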

Where the compounding comes from

Traditional development stops at step three, Review. Compound Engineering takes one more step: writing the problem you just solved into the system. This step produces no code — it produces the system's ability to dodge the same class of problem next time. That's where the efficiency gap comes from.

An analogy

CLAUDE.md is the "AI operations manual" that sits in the project root, and the agent reads it first on every startup. It's like the onboarding SOP handbook every new hire has to read: whenever someone solves a problem no one had hit before, you add a rule, and the next person gets it automatically — no need to step on the same rake twice.
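Adding a rule to the handbook can be sketched in a few lines of Python. This is an illustrative helper, not the plugin's implementation: the `add_rule` name and the "## Lessons learned" heading are assumptions, and the sketch assumes that section sits at the end of the file.

```python
import tempfile
from pathlib import Path


def add_rule(claude_md: Path, rule: str,
             heading: str = "## Lessons learned") -> bool:
    """Append `rule` as a bullet; return False if it is already recorded."""
    text = claude_md.read_text(encoding="utf-8") if claude_md.exists() else ""
    lines = text.splitlines()
    bullet = f"- {rule.strip()}"
    if bullet in lines:
        return False                  # keep the file free of duplicate rules
    if heading not in lines:
        if lines:
            lines.append("")          # blank line before the new section
        lines.extend([heading, ""])
    lines.append(bullet)
    claude_md.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return True


if __name__ == "__main__":
    path = Path(tempfile.mkdtemp()) / "CLAUDE.md"
    add_rule(path, "Load comments with a join, never one query per post.")
    add_rule(path, "Wrap user creation in a transaction.")
    print(path.read_text(encoding="utf-8"))
```

The duplicate check matters more than it looks: a file the agent reads on every startup should grow by one line per lesson, not one line per occurrence.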

The difference once that rule has been banked:

1st time you hit this pitfall: no memory yet, debug it live

The agent doesn't know this pitfall, so you and it debug, locate, and fix it together. Once fixed, the Compound step writes "why it happened, how to avoid it" into CLAUDE.md and saves a YAML-tagged doc into docs/solutions/. This time you spent a little extra on recording it.

5th time you hit something similar: the system already remembers

On startup the agent reads that rule, and the earlier solution is searchable in docs/solutions/. So during the Plan phase it routes around the same class of problem and never even reaches the bug. The time you spent recording earlier pays itself back here, with interest.

Every time you finish a Compound, CLAUDE.md gains another piece of knowledge, and the system gets smarter the more you use it:

CLAUDE.md entries by iteration: 1 entry after iteration 1, 3 entries after iteration 3, 8 entries after iteration 5.

How docs/solutions/ grows into an institutional knowledge base

Every solved problem is saved as a markdown file with YAML frontmatter, auto-categorized and tagged. Every runs this step with the /workflows:compound command, which fans out six sub-agents in parallel: one to understand the problem, one to extract the solution, one to find related older docs and cross-link them, one to write "how to avoid a recurrence," one to do the categorization and tagging, and one to format the final doc. From then on, any session can automatically dig a past solution out of that pile — instead of relying on some senior engineer to keep it in their head.
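Retrieval from such a folder can be approximated with the standard library alone. The sketch below is an assumption-laden stand-in for whatever search the plugin really performs: it parses only flat `key: value` frontmatter, expects tags in `[a, b]` form, and the names `read_frontmatter` and `find_by_tag` are hypothetical.

```python
import tempfile
from pathlib import Path


def read_frontmatter(text: str) -> dict[str, str]:
    """Parse flat `key: value` pairs between the leading `---` fences."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}                       # no frontmatter block at the top
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":       # closing fence ends the header
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def find_by_tag(root: Path, tag: str) -> list[Path]:
    """Return every markdown file under `root` whose tags include `tag`."""
    hits = []
    for path in sorted(root.rglob("*.md")):
        meta = read_frontmatter(path.read_text(encoding="utf-8"))
        tags = [t.strip() for t in meta.get("tags", "").strip("[]").split(",")]
        if tag in tags:
            hits.append(path)
    return hits


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())     # stand-in for docs/solutions/
    (root / "n-plus-one.md").write_text(
        "---\ntitle: N+1 in comments\ntags: [performance, database]\n---\nFix notes.\n",
        encoding="utf-8")
    print([p.name for p in find_by_tag(root, "database")])
```

A real setup would more likely use a full YAML parser; the point here is only that tagged headers turn a pile of notes into something a session can query.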

5 Parallel Review

14 AIs reviewing your code at once

When a PR (a submitted code change) comes in, the /workflows:review command dispatches 14 specialized agents all at once, each watching a single dimension: security, performance, architecture, data, code quality, framework conventions, and so on. They each check independently, then their results are merged into one list prioritized as P1 (must fix) / P2 (should fix) / P3 (could fix).

One PR fans out to 14 agents at once, grouped by the dimensions listed below. They run in parallel, not in a queue.

01 security-sentinel · Security

Scans OWASP Top 10 (the ten most commonly exploited classes of web vulnerability), injection attacks, auth and privilege escalation.

02 performance-oracle · Performance

Hunts N+1 queries, missing indexes, cacheable spots, algorithmic bottlenecks.

03 architecture-strategist · Architecture

Assesses system design, component boundaries, dependency direction.

04 pattern-recognition-specialist · Architecture

Spots design patterns, anti-patterns, code smells.

05 data-integrity-guardian · Data

Validates database migrations, transaction boundaries, referential integrity.

06 data-migration-expert · Data

Checks ID mapping, rollback safety, production data validation.

07 code-simplicity-reviewer · Quality

Enforces YAGNI (don't write code for features you might need), hunts needless complexity.

08 kieran-rails-reviewer · Quality

Rails conventions, Turbo Streams, model vs. controller responsibility.

09 kieran-python-reviewer · Quality

PEP 8 conventions, type annotations, Pythonic style.

10 kieran-typescript-reviewer · Quality

Type safety, modern ES style, clean architecture.

11 dhh-rails-reviewer · Quality

37signals style: simplicity over abstraction.

12 deployment-verification-agent · Deployment

Generates pre-deploy checklists, post-deploy verification steps, rollback plans.

13 julik-frontend-races-reviewer · Frontend

Hunts race conditions in JavaScript and Stimulus controllers.

14 agent-native-reviewer · Agent-native

Makes sure features work not just for humans but for agents too.

What an N+1 query is

The N+1 query that agent #02 watches for is a common database-performance trap: fetch a 100-row list and, written the wrong way, it queries each row separately — 101 requests in total. It's like making 11 trips to the store for 10 items: one trip to see what's there (1 trip), then a separate trip to grab each item (10 trips).
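The 101-versus-1 arithmetic can be reproduced with an in-memory SQLite database. This is a generic Python illustration of the pattern, not code from the plugin; the `posts` and `comments` tables and all function names are made up for the example.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create 100 posts with one comment each in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT);
    """)
    conn.executemany("INSERT INTO posts (id, title) VALUES (?, ?)",
                     [(i, f"post {i}") for i in range(1, 101)])
    conn.executemany("INSERT INTO comments (post_id, body) VALUES (?, ?)",
                     [(i, f"comment on {i}") for i in range(1, 101)])
    return conn


def n_plus_one(conn: sqlite3.Connection) -> tuple[dict[int, list[str]], int]:
    """One query for the list, then one more query per row."""
    queries = 1
    posts = conn.execute("SELECT id FROM posts").fetchall()
    result: dict[int, list[str]] = {}
    for (post_id,) in posts:
        rows = conn.execute(
            "SELECT body FROM comments WHERE post_id = ?", (post_id,)).fetchall()
        queries += 1
        result[post_id] = [body for (body,) in rows]
    return result, queries


def single_join(conn: sqlite3.Connection) -> tuple[dict[int, list[str]], int]:
    """Fetch posts and their comments together in a single query."""
    rows = conn.execute(
        "SELECT p.id, c.body FROM posts p "
        "LEFT JOIN comments c ON c.post_id = p.id").fetchall()
    result: dict[int, list[str]] = {}
    for post_id, body in rows:
        result.setdefault(post_id, [])
        if body is not None:            # posts without comments keep an empty list
            result[post_id].append(body)
    return result, 1


if __name__ == "__main__":
    conn = make_db()
    slow, slow_queries = n_plus_one(conn)
    fast, fast_queries = single_join(conn)
    print(slow_queries, fast_queries, slow == fast)  # prints: 101 1 True
```

Both functions return the same data; only the number of round trips to the database differs, which is why the problem tends to stay invisible until the list gets long.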

The 14 parallel results are merged and deduplicated into a single prioritized list, roughly like this:

P1 · Must fix

☐ Search query has a SQL injection hole (security-sentinel)

☐ User creation isn't wrapped in a transaction (data-integrity-guardian)

P2 · Should fix

☐ Comment loading has an N+1 query (performance-oracle)

☐ Business logic stuffed into the controller (kieran-rails-reviewer)

P3 · Could fix

☐ One unused variable (code-simplicity-reviewer)
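The fan-out, merge, and deduplicate flow behind a list like the one above can be sketched in Python. Everything here is a simplified assumption: the real reviewers are agents rather than plain functions, and the `Finding` fields, the stand-in reviewers, and the rule "keep the most severe copy of a duplicate" are illustrative choices, not documented plugin behavior.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Finding:
    priority: int      # 1 = must fix, 2 = should fix, 3 = could fix
    location: str
    message: str
    agent: str


Reviewer = Callable[[str], list[Finding]]


def security(diff: str) -> list[Finding]:
    return [Finding(1, "search.py:12", "SQL built by string concatenation",
                    "security-sentinel")]


def performance(diff: str) -> list[Finding]:
    return [Finding(2, "comments.py:40", "N+1 query when loading comments",
                    "performance-oracle")]


def simplicity(diff: str) -> list[Finding]:
    return [
        # Same issue as the performance reviewer, reported at lower severity.
        Finding(3, "comments.py:40", "N+1 query when loading comments",
                "code-simplicity-reviewer"),
        Finding(3, "utils.py:7", "Unused variable", "code-simplicity-reviewer"),
    ]


def review(diff: str, reviewers: list[Reviewer]) -> list[Finding]:
    """Run every reviewer concurrently, then merge, dedupe, and sort."""
    with ThreadPoolExecutor(max_workers=max(1, len(reviewers))) as pool:
        batches = list(pool.map(lambda reviewer: reviewer(diff), reviewers))
    best: dict[tuple[str, str], Finding] = {}
    for batch in batches:
        for finding in batch:
            key = (finding.location, finding.message)
            # Keep the most severe report when two reviewers flag the same thing.
            if key not in best or finding.priority < best[key].priority:
                best[key] = finding
    return sorted(best.values(), key=lambda f: (f.priority, f.location))


if __name__ == "__main__":
    for f in review("<diff text>", [security, performance, simplicity]):
        print(f"P{f.priority} {f.location} {f.message} ({f.agent})")
```

Each reviewer only sees its own dimension, so overlap is expected; the merge step is what turns several independent reports into one list a human can work through.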

What to do once the list is out

/resolve_pr_parallel handles every issue automatically — P1 first, then P2 — each fix running in isolation so they don't step on each other, with you doing a final human pass over the generated changes. If you'd rather triage before fixing, use /triage: go item by item to decide approve (into the to-do list), skip (delete), or re-prioritize; approved items are marked status: ready and handed off to /resolve_todo_parallel.
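The triage decisions described above amount to a small state change per item. The Python sketch below models that under stated assumptions: the `Todo` fields, the initial `pending` status, and the choice that re-prioritizing leaves an item unapproved are all guesses made for illustration; only `status: ready` for approved items comes from the article.

```python
from dataclasses import dataclass, replace
from typing import Union

Decision = Union[str, int]   # "approve", "skip", or a new priority number


@dataclass(frozen=True)
class Todo:
    title: str
    priority: int            # 1 = must fix, 2 = should fix, 3 = could fix
    status: str = "pending"  # assumed initial state


def triage(todos: list[Todo], decisions: dict[str, Decision]) -> list[Todo]:
    """Apply one decision per title; items without a decision pass through."""
    kept = []
    for todo in todos:
        decision = decisions.get(todo.title)
        if decision == "skip":
            continue                                   # deleted from the list
        if decision == "approve":
            todo = replace(todo, status="ready")       # handed to the resolver
        elif isinstance(decision, int):
            todo = replace(todo, priority=decision)    # re-prioritized only
        kept.append(todo)
    return kept


def ready_queue(todos: list[Todo]) -> list[Todo]:
    """Approved items, most urgent first."""
    return sorted((t for t in todos if t.status == "ready"),
                  key=lambda t: t.priority)


if __name__ == "__main__":
    items = [
        Todo("SQL injection in search", 1),
        Todo("N+1 in comment loading", 2),
        Todo("Unused variable", 3),
    ]
    result = triage(items, {
        "SQL injection in search": "approve",
        "N+1 in comment loading": 1,
        "Unused variable": "skip",
    })
    print([(t.title, t.priority, t.status) for t in result])
    print([t.title for t in ready_queue(result)])
```

Keeping triage separate from fixing means the automated resolver only ever sees items a human has explicitly approved.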

6 The Plugin

What's in the plugin, and how to install it

The whole workflow is packaged into one plugin — install it and go, zero config. It supports Claude Code, with experimental support for OpenCode and Codex.

Specialized agents

Each does one thing well: 14 review experts, plus research, design, automation, and documentation agents.

Workflow commands

The main-loop commands (plan / work / review / compound) plus a batch of utility commands.

Skills

Ready-to-use domain knowledge, like the agent-native architecture skill and the style-guide skill.

What each file is for

CLAUDE.md: The operations manual the agent reads on every startup: preferences, conventions, pitfalls hit.

docs/solutions/: Every solved problem saved as a searchable doc, auto-found in the next session.

docs/plans/ · brainstorms/: The output of /plan and /brainstorm.

todos/: Issues found by review, with priority and status.

Install in two lines (Claude Code)

claude /plugin marketplace add https://github.com/EveryInc/every-marketplace
claude /plugin install compound-engineering

OpenCode / Codex install commands, plus one command to run the whole pipeline

bunx @every-env/compound-plugin install compound-engineering --to opencode
bunx @every-env/compound-plugin install compound-engineering --to codex

/lfg: you just describe the feature and it chains the whole pipeline together automatically — plan → deepen the plan → work → review → fix issues → browser-test → record a feature demo → compound — dispatching 50+ agents along the way and handing you a PR you can merge directly. It only pauses once, at plan approval.

7 The Numbers

Key numbers: just how big is the agent concurrency

The scale of this system, pinned to a few concrete numbers.

5

Products Every maintains with this method, with engineering teams that are essentially one person each.

80% / 20%

Plan + Review take 80% of an engineer's time; Work + Compound only 20%.

14

Specialized review agents running at once in a single /workflows:review call.

40+

Research agents dispatched by /workflows:plan in ultrathink mode.

26 / 23 / 13

Specialized agents / workflow commands / skills bundled in the plugin.

"Every piece of engineering work should make the work that follows easier, not harder."

Every, Compound Engineering

Source: Every, Compound Engineering, every.to/guides/compound-engineering. Plugin source: github.com/EveryInc/compound-engineering-plugin.