The intelligence
underneath.

autoarchitect-code is the open-source substrate the platform runs on. When a specialist agent hits a wall — no matching archetype, a build failure that won't resolve, a domain it's never seen — the runtime delegates down to the substrate. It researches, authors a fresh skill or archetype, debugs, and hands back something the specialists can resume from.

Validated end-to-end across 7 agent runs. Final converging run (a personal flashcard study app) shipped at $3.71 with the substrate firing twice — once to author the archetype, once to write a working React app when the Builder agent got stuck. v48 specs.

MIT Pluggable LLM Skills-native Self-hostable

Pipelines stall.
Specialists hit walls.

A specialist agent — Researcher, Builder, QA, Product Manager — is built for one job. When it hits friction (an ambiguous spec, a missing skill, a novel domain) it has no good move except to wait, halt, or guess.

That's the shape of every stalled run we've watched: a specialist ran out of context-of-its-job, and nothing else in the pipeline could pick the work back up.

One reasoner.
Always available.

The substrate is a generalist agent that any specialist can delegate to. It runs an iterate-until-solved loop with the platform's tools, the platform's permissions, the platform's skills catalog. It doesn't replace the specialists; it picks up the bounded sub-problems that don't fit any one of them.

Same loop is also a standalone CLI for developers. aac "build me X" in your terminal — choose a model, choose a trust level, get streamed events back.

Two layers, one catalog.

Specialists know what to build. Substrate knows how to dig in when it gets hard. Both share the same skills catalog and the same safety invariants.

┌────────────────────────────────────────────────────────┐ AutoArchitect Platform Strategist Researcher Designer Builder PM QA delegate when stuck ┌─────────────────────────────────┐ code_run({ kind, ... }) │ ← single seam └────────────┬────────────────────┘ └────────────────┼───────────────────────────────────────┘┌────────────────────────────────────────────────────────┐ @autoarchitect/code (substrate) QueryEngine · Session · Tools · PermissionGate Gap-fill subsystem · LLM adapters · Telemetry 12 invariants enforced; spec + tests in the open. └────────────────────────────────────────────────────────┘ ↓ writes to shared catalog ┌────────────────────────────────────────────────────────┐ Skills + Archetypes Catalog baked-in · operator-reviewed · runtime-generated └────────────────────────────────────────────────────────┘

One library, three integration surfaces: library (default — same process as the platform), CLI for humans, HTTP service for multi-tenant deployments.

One loop. Three callers.

Specialist stalls

Researcher needs a procedure that doesn't exist. Strategist hits a domain it doesn't know. Builder can't find the right pattern. They delegate via delegate_task to the substrate agent.

→ Returns: completed work + audit trail

Catalog is missing

No skill or archetype fits the user's ask. Substrate runs an 8-step loop: search catalog → research the web → draft → validate → publish to the runtime catalog with full provenance.

→ Returns: published runtime asset

Human types aac

You're in your terminal. You want a CLI that does "refactor src/ for readability" or "scaffold a Go HTTP server with health checks". Pick a model. Pick a mode. Stream events to your TUI.

→ Returns: streamed work + files on disk

12 invariants.
Code-enforced. Tests cover.

The substrate runs inside the platform's guts. That demands a small, auditable safety surface. Every invariant is in the spec, enforced in the module, and covered by a CI test.

Read the safety spec
S1 No engine-code authoring at runtime
S2 Bounded I/O per turn (fetch + bytes)
S3 Bounded session lifetime
S4 Audit trail on every write
S5 Plan-mode is dry-run, always
S6 Sandboxed write paths (no traversal)
S7 Secret scan before commit-to-disk
S8 Shell allowlist + blocklist
S9 Network egress controllable
S10 Skills are content, not code
S11 Plugin loading allowlist-only
S12 Subprocess isolation (opt-in)

Spec-driven. Spec-complete.

17 top-level specs and 6 subsystem specs document the whole tool — contracts, safety, integration, every tool. Implementation tracks the spec; the spec is the authority for community contributions.

Substrate today

  • QueryEngine — streaming async-generator loop
  • Session + budget + compaction
  • Permission ladder (5 levels)
  • 13 core tools — files, shell, web, memory, skills, meta
  • 3 LLM adapters — Anthropic, OpenAI, mock
  • Gap-fill subsystem (8-step loop)
  • Safety module — 12 invariants enforced
  • Spec-complete docs (23 specs)
  • Platform integration (single adapter tool)

Where it's going

  • Ink-based TUI (interactive REPL)
  • HTTP service mode (aac serve)
  • SQLite session persistence + resume
  • Live LLM adapters (Qwen / Ollama / local)
  • Plugin loader (allowlisted)
  • Slash commands (/plan, /compact, /model …)
  • Editor extensions (VS Code, JetBrains)
  • Sub-agent + swarm modes
Roadmap

Read the specs. Open an RFC.

autoarchitect-code is community-governed. The architecture is spec-driven and auditable; contributions land via the RFC process documented in the repo.

MIT · OPEN SOURCE · COMMUNITY GOVERNANCE