Anthony Cardinale ebccdda936 Initial public release

A chezmoi-based fleet-dotfiles template for macOS workstations:

- Two-way auto-sync via launchd watcher + 5-min puller
- Mesh SSH via modify_authorized_keys driven by .chezmoidata/fleet.yaml
- age-encrypted secrets file
- Bundled Claude Code agentic team (11 agents) + /lite + /lite-sub commands
- Verify-before-claiming Stop hook
- Generic statusline + project-boundary validate-path hook
- Reference launchd plist for cross-fleet task-durations aggregation
  (companion repo: gitea.tojo.team/cardinale/task-durations)
- AGENTS.md walks an agent through the entire setup Q&A interactively
- docs/ covers architecture, security model, fleet onboarding

2026-05-02 17:26:32 -04:00

16 KiB

description	argument-hint	allowed-tools
Run the lite agentic team on a task — 4 named teammates (scout, explorer, critic, tester) work in parallel via direct peer messaging while the orchestrator keeps each in role, synthesizes, implements, and verifies. Uses EXPERIMENTAL_AGENT_TEAMS. Slower than a one-shot but produces tested, critiqued, research-grounded code. Do NOT use for trivial tasks, pure research, or multi-module architectural work.	<task description in quotes>	Agent, TeamCreate, TeamDelete, SendMessage, TaskCreate, TaskUpdate, TaskList, TaskGet, Bash, Read, Edit, Write, Glob, Grep, WebSearch, WebFetch, TodoWrite

/lite — Lite Agentic Team (Agent-Teams edition)

Spawn a 4-teammate Agent Team on a single task. Teammates coordinate via direct peer DMs and a shared task list. The orchestrator keeps each one in their role, synthesizes their outputs, implements code, and verifies tests pass.

Requires CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 (already set in this fleet's settings.json).

Tunable Limits

TEST_RETRY_MAX = 2 # how many times to refine impl on failing tests before escalating
MAX_DEBATE_ROUNDS = 3 # how many message round-trips between teammates before orchestrator forces synthesis
WALL_CLOCK_TARGET = 3-7 min # advisory only; teams are slower than subagents but produce richer output
TEAM_NAME_PREFIX = "lite-" # team_name = TEAM_NAME_PREFIX + unix timestamp

Flow

1. Guard

If $ARGUMENTS is empty, print the error `/lite requires a task description in quotes` and stop.

2. Pre-dispatch context gather

Per CLAUDE.md's pre-dispatch checklist, build a single context block reused for all four spawn prompts:

pwd and git rev-parse --show-toplevel 2>/dev/null → project root.
Detect language + test framework (in order):
- package.json → Node: check deps for vitest | jest | mocha
- pyproject.toml or setup.py → Python: pytest (the default, whether or not it is listed in deps)
- Package.swift → Swift: XCTest
- go.mod → Go: go test
- Cargo.toml → Rust: cargo test
- Gemfile → Ruby: rspec if in deps, else minitest
- None of the above → "no project detected" branch (see Error handling)
SSH hosts: grep -A5 "Host " ~/.ssh/config matching task domain.
Skills: ls ~/.claude/skills/ matching task domain.
Env var NAMES: grep -i <task-keyword> ~/.zshrc (NAMES only, never values).
CLAUDE.md keywords: grep -B1 -A10 <keyword> ~/.claude/CLAUDE.md for any sections relevant to the task.
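
The detection order above can be sketched in Python as follows. This is a minimal illustration, not part of the command itself: the function names `detect_project` and `_node_framework` are hypothetical, the Gemfile check is a plain substring match, and returning `None` for a Node project that lists none of the three frameworks is an assumption the checklist does not spell out.

```python
import json
from pathlib import Path

# Marker files in the order the checklist lists them; the first match wins.
MARKERS = [
    ("package.json", "Node"),
    ("pyproject.toml", "Python"),
    ("setup.py", "Python"),
    ("Package.swift", "Swift"),
    ("go.mod", "Go"),
    ("Cargo.toml", "Rust"),
    ("Gemfile", "Ruby"),
]

# Languages whose test framework does not depend on the dependency list.
FIXED_FRAMEWORKS = {
    "Python": "pytest",
    "Swift": "XCTest",
    "Go": "go test",
    "Rust": "cargo test",
}


def _node_framework(package_json):
    """Pick vitest, jest, or mocha from package.json deps (None if absent)."""
    try:
        data = json.loads(package_json.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None
    if not isinstance(data, dict):
        return None
    deps = {}
    deps.update(data.get("dependencies") or {})
    deps.update(data.get("devDependencies") or {})
    for name in ("vitest", "jest", "mocha"):
        if name in deps:
            return name
    return None


def detect_project(root):
    """Return (language, test_framework), or (None, None) if nothing matches."""
    root = Path(root)
    for marker, language in MARKERS:
        path = root / marker
        if not path.is_file():
            continue
        if language == "Node":
            return language, _node_framework(path)
        if language == "Ruby":
            text = path.read_text(encoding="utf-8", errors="replace")
            framework = "rspec" if "rspec" in text else "minitest"
            return language, framework
        return language, FIXED_FRAMEWORKS[language]
    return None, None
```

A `(None, None)` result corresponds to the "no project detected" branch.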

Standard context block (built once):

## Available System Access
- SSH: <relevant hosts, or "n/a">
- Skills: <relevant skill summaries, or "n/a">
- Env vars: <relevant env var NAMES only>
- Tools: WebSearch, WebFetch, Bash, Read, Glob, Grep, SendMessage
- Project root: <path or "no git project">
- Test framework: <auto-detected, or "none detected">

## Task (verbatim from user)
$ARGUMENTS

## Relevant CLAUDE.md sections
<paste matching grep output, or "n/a">
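
One way to assemble this block programmatically is sketched below. The function name `build_context_block` and its parameters are hypothetical, and the "n/a" fallback for env vars is an assumption (the template gives no fallback for that line). The sketch also drops anything after `=` in an env var entry, so only NAMES can reach a spawn prompt.

```python
def _name_only(entry):
    """Reduce 'export NAME=value' or 'NAME=value' to just 'NAME'."""
    name = entry.split("=", 1)[0].strip()
    if name.startswith("export "):
        name = name[len("export "):].strip()
    return name


def _joined(items):
    """Comma-join non-empty items, falling back to 'n/a'."""
    items = [item for item in items if item]
    return ", ".join(items) if items else "n/a"


def build_context_block(task, ssh_hosts=(), skills=(), env_var_names=(),
                        project_root=None, test_framework=None,
                        claude_md_sections=""):
    """Render the standard context block once, for reuse in all four prompts."""
    names = [_name_only(entry) for entry in env_var_names]
    lines = [
        "## Available System Access",
        f"- SSH: {_joined(ssh_hosts)}",
        f"- Skills: {_joined(skills)}",
        f"- Env vars: {_joined(names)}",
        "- Tools: WebSearch, WebFetch, Bash, Read, Glob, Grep, SendMessage",
        f"- Project root: {project_root or 'no git project'}",
        f"- Test framework: {test_framework or 'none detected'}",
        "",
        "## Task (verbatim from user)",
        task,
        "",
        "## Relevant CLAUDE.md sections",
        claude_md_sections.strip() or "n/a",
    ]
    return "\n".join(lines)
```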

3. Create the team

Run TeamCreate with:

team_name: lite- + current unix timestamp (e.g., lite-1714450000). Run date +%s to get the value.
description: short, task-specific (e.g., lite-team for: <first 60 chars of $ARGUMENTS>)

This creates ~/.claude/teams/<team>/config.json and ~/.claude/tasks/<team>/.
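
The two TeamCreate arguments can be derived as in this small sketch. The helper names `team_name` and `team_description` are hypothetical; `time.time()` stands in for `date +%s`.

```python
import time

TEAM_NAME_PREFIX = "lite-"


def team_name(now=None):
    """TEAM_NAME_PREFIX + unix timestamp in whole seconds."""
    timestamp = int(time.time() if now is None else now)
    return f"{TEAM_NAME_PREFIX}{timestamp}"


def team_description(arguments, limit=60):
    """Short, task-specific description: the first `limit` chars of the task."""
    return f"lite-team for: {arguments[:limit]}"
```

Passing `now` explicitly keeps the name reproducible, which helps when the same value is needed again for TeamDelete.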

4. Seed the shared task list

Create four tasks via TaskCreate (the team's task list is auto-active). Each will be claimed by the corresponding teammate in step 5:

Task subject	Owner (set in step 5)	Description
Prior-art research	scout	Find ≥3 references via WebSearch — libraries, blog posts, similar implementations. Check GitHub stars/last-commit. DM critic + explorer with findings as you go. Output final brief in your last turn.
Tech verification	explorer	Verify versions/maintenance status of every library you'd recommend. DM tester with the test framework + version once decided. DM critic if you find a recommended library is unmaintained.
Concerns + ambiguities	critic	Read the task as a plan. Produce a concerns list (severity-tagged) + ambiguity register. Consult explorer if a concern depends on a tech choice. DM tester with the riskiest assumptions to test. Flag `FUNDAMENTALLY WRONG PLAN` only if contradictory or impossible.
Failing tests	tester	Wait for critic's "riskiest assumptions" DM (or after `MAX_DEBATE_ROUNDS` of silence, derive them from the task yourself). Write failing tests in the project's detected test framework. DM critic to verify coverage. Output final test file paths in your last turn.

Add a fifth task, "Implementation + verify", owned by the orchestrator (not a teammate), to record that the orchestrator is responsible for steps 7-8.

5. Spawn the team — single message, four parallel `Agent` calls

All four Agent calls go in one assistant message so they spawn in parallel. Each call uses:

team_name: the team created in step 3
name: short lowercase (scout, explorer, critic, or tester)
subagent_type: the existing agentic-team agent ("Research Scout", "Explorer", "Critic", "Tester")
prompt: the standard prompt template below, with role-specific addenda

Standard spawn prompt template (each teammate gets this with <role> and <phase> filled in):

You are <role> in a 4-teammate Agent Team. Your full role definition is at:
  ~/.claude/agents/agentic-team/<phase>/<role>.md

Read that file ONCE at the start of your first turn — it defines your identity, what you produce, your rules, and your constraints. Stay in role.

<paste standard context block from step 2>

## Team roster (DM these by name via SendMessage)
- scout (Research Scout) — prior art, libraries, similar implementations
- explorer (Explorer) — version verification, maintenance status, dependency vetting
- critic (Critic) — edge cases, ambiguities, concerns
- tester (Tester) — failing tests for the riskiest assumptions

## How to coordinate
- Use SendMessage to share findings with relevant peers AS YOU DISCOVER THEM, not at the end. Short DMs (1-3 sentences) with concrete content beat long summaries.
- Reply to peer DMs that name you. Ignore DMs not directed at you.
- Use the shared task list (TaskList / TaskUpdate) — claim your task, mark completed when done.
- Do NOT touch other teammates' tasks. Stay in your role.
- Do NOT write implementation code (except tester writing test code). The orchestrator implements.
- Do NOT ask the user clarifying questions — the orchestrator handles that. If something is ambiguous, register it as an ambiguity (critic) or test both interpretations (tester).
- After producing your final output, mark your task completed and go idle. Stay alive — the orchestrator may DM you with follow-ups.

## Your task in this team
<role-specific addendum below>

Role-specific addenda:

scout: "Find ≥3 references via WebSearch (libraries, blog posts, similar impls). Check GitHub stars/last-commit. DM critic with the standard approach you'd recommend. DM explorer with candidate libraries to verify. Final output: prior-art brief in your last turn."
explorer: "Verify every library/framework you'd recommend (npm view <pkg>, pip show, brew info, gh repo view). DM tester once you've decided the test framework + version. DM critic if a recommended library is unmaintained. Final output: version manifest with <lib>@<ver> — last release <date> lines."
critic: "Read the task as if it were already a plan. Produce concerns list (severity-tagged: critical/important/minor) + ambiguity register. Consult explorer via DM if a concern hinges on a tech choice. DM tester the riskiest assumptions. Flag FUNDAMENTALLY WRONG PLAN ONLY if contradictory or impossible. Identify problems — do not propose solutions."
tester: "Wait up to MAX_DEBATE_ROUNDS for critic's 'riskiest assumptions' DM. If it doesn't arrive, derive them yourself from the task. Write FAILING tests in <detected test framework> at the project's standard test path. DM critic to verify coverage of their concerns. Final output: list of test file paths created."
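
The four Agent-call argument sets can be built in one pass, as sketched below. The keys `team_name`, `name`, `subagent_type`, and `prompt` come from the list in this step; the function `build_spawn_calls` is hypothetical, and the prompt body is abbreviated to three parts rather than the full template.

```python
# Teammate name -> existing agent type, in roster order.
SUBAGENT_TYPES = {
    "scout": "Research Scout",
    "explorer": "Explorer",
    "critic": "Critic",
    "tester": "Tester",
}


def build_spawn_calls(team_name, context_block, addenda):
    """Return one argument dict per teammate, ready to send in a single message."""
    missing = [name for name in SUBAGENT_TYPES if name not in addenda]
    if missing:
        raise ValueError(f"missing addenda for: {', '.join(missing)}")
    calls = []
    for name, subagent_type in SUBAGENT_TYPES.items():
        # Abbreviated prompt: role line, shared context, role-specific addendum.
        prompt = "\n\n".join([
            f"You are {name} in a 4-teammate Agent Team.",
            context_block,
            "## Your task in this team\n" + addenda[name],
        ])
        calls.append({
            "team_name": team_name,
            "name": name,
            "subagent_type": subagent_type,
            "prompt": prompt,
        })
    return calls
```

Building all four payloads before dispatch makes it harder to spawn teammates one at a time by accident.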

6. Monitor and steer (orchestrator)

While the team works, you (the orchestrator) receive:

Auto-delivered messages from teammates DMing you directly
Idle notifications when each teammate's turn ends
Brief peer-DM summaries inside idle notifications (so you see who is talking to whom)
Updates to the shared task list

Your job is role enforcement and conflict arbitration:

Drift correction: if a teammate is doing work outside their role (e.g., tester writing impl code, critic proposing solutions), DM them: `Stay in role: <role>. Do not <off-role activity>. Continue with <correct activity>.`
Conflict arbitration: if two teammates disagree (e.g., scout recommends library X, explorer reports that X is unmaintained), DM both with the resolution, or ask the user if you genuinely don't know.
FUNDAMENTALLY WRONG PLAN: if critic flags this, STOP. Send shutdown_request to all teammates. Print critic's reasoning. Ask the user to confirm or revise before proceeding. Do NOT proceed to step 7.
Stuck team: if all four teammates are idle and the task list shows no recent activity for 30+ seconds AND tester has not produced tests, DM tester directly: `Critic's concerns are <summarize>. Write failing tests now. Do not wait for further input.`
Debate limit: if any pair of teammates exceeds MAX_DEBATE_ROUNDS (= 3) message round-trips without convergence, DM both: `Stop debating. Each of you write your final position into your task and complete it. Orchestrator will arbitrate.`

Do NOT re-read teammates' produced files yourself if their final messages are already in your context — duplicate work wastes context.
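
The debate limit can be checked mechanically from the peer-DM summaries. The sketch below assumes a round-trip means one DM in each direction between the same two teammates, and that the summaries can be reduced to `(sender, recipient)` pairs. Both are assumptions, as is the name `pairs_over_limit`.

```python
from collections import Counter

MAX_DEBATE_ROUNDS = 3


def pairs_over_limit(messages, limit=MAX_DEBATE_ROUNDS):
    """Return sorted teammate pairs whose round-trips exceed `limit`.

    `messages` is an iterable of (sender, recipient) tuples.
    """
    sent = Counter((s, r) for s, r in messages if s != r)
    over = set()
    for sender, recipient in list(sent):
        # A round-trip needs a DM each way, so take the smaller direction count.
        rounds = min(sent[(sender, recipient)], sent[(recipient, sender)])
        if rounds > limit:
            over.add(tuple(sorted((sender, recipient))))
    return sorted(over)
```

Each pair returned is a candidate for the "Stop debating" DM.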

7. Synthesis + implementation (orchestrator)

When tester has marked their task completed AND the test files exist on disk:

Read the test files (this is your spec).
Implement code in the project to make the failing tests pass. Use critic's concerns + scout's prior art + explorer's version manifest as guardrails.
Do NOT refactor unrelated code.
Do NOT commit (the user controls commits).

UX pass — required when the artifact has a user-facing surface. Before considering the implementation done, identify whether the artifact will produce output a human reads:

Web (HTML/CSS/React/Tailwind/marketing) → invoke the hig-think skill via the Skill tool.
SwiftUI / iOS / macOS native → invoke the swift-front skill.
Other frontend (React Native, generic JS UI, design systems) → invoke the frontend-design skill.
CLI / terminal → no dedicated skill; apply: bold for primary identifier, dim ANSI for metadata, color for category, respect NO_COLOR and isatty(), expose --json (or equivalent) as the machine-readable escape hatch, blank lines between logical sections, errors prefixed with program name.
Backend / library / protocol → skip; no human surface.

The invoked skill provides surface-specific guidance. Apply its recommendations to the implementation alongside the test-driven structure. Default to the human-readable form; gate machine output behind a flag. Do NOT skip this step on the grounds that "tests pass" — tests measure the data contract, not what the user sees.
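
For the CLI / terminal case, the guidance above (respect NO_COLOR and isatty(), keep a machine-readable escape hatch) might look like the following sketch. The function names and the `name`/`meta` fields are hypothetical; the escape codes are the standard ANSI bold (1), dim (2), and reset (0) sequences.

```python
import json
import os
import sys


def use_color(stream=None, environ=None):
    """Color only when NO_COLOR is unset or empty and the stream is a terminal."""
    stream = sys.stdout if stream is None else stream
    environ = os.environ if environ is None else environ
    if environ.get("NO_COLOR"):  # any non-empty value disables color
        return False
    return hasattr(stream, "isatty") and stream.isatty()


def render(name, meta, as_json=False, color=False):
    """Human-readable by default; JSON only when explicitly requested."""
    if as_json:
        return json.dumps({"name": name, "meta": meta})
    if color:
        # Bold for the primary identifier, dim for metadata.
        return f"\033[1m{name}\033[0m \033[2m{meta}\033[0m"
    return f"{name} {meta}"
```

A caller would typically wire `as_json` to a `--json` flag and pass `color=use_color()`.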

If scout/explorer/critic are still working when tester finishes, DM them `tester finished. Wrap up your task and complete it.`, then proceed.

8. Verification

Run the project's test command (pytest, npm test, swift test, go test ./..., etc., per detected framework).

Tests pass: continue to step 9.
Tests fail (iteration n of TEST_RETRY_MAX = 2):
- If the failure is in your impl: refine the impl, re-run.
- If the failure looks like a bad test (test asserts something the spec doesn't require): DM tester with the failure output and ask them to refine. Wait for tester to update the file. Re-run.
- After TEST_RETRY_MAX failed iterations: STOP. Skip to step 9 with a failure report.
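
The retry policy can be sketched as a bounded loop. This assumes one initial run followed by at most TEST_RETRY_MAX refinements, which is one reading of the rule above. `run_tests` and `refine` are hypothetical callables: `refine` stands for either fixing the impl or asking tester to fix a bad test.

```python
TEST_RETRY_MAX = 2


def verify(run_tests, refine, retry_max=TEST_RETRY_MAX):
    """Run tests, refining between runs, and return (status, last_output).

    run_tests() -> (passed: bool, output: str)
    refine(output) -> None, applies one fix based on the failure output
    """
    passed, output = run_tests()
    retries = 0
    while not passed and retries < retry_max:
        refine(output)
        retries += 1
        passed, output = run_tests()
    status = "pass" if passed else f"fail-after-{retries}-retries"
    return status, output
```

The returned status matches the "Test result" line of the final report, and `last_output` is what gets printed on failure.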

9. Shutdown + report

Send {type: "shutdown_request", reason: "/lite complete"} to each teammate via SendMessage. Wait for shutdown_response (auto-approved by teammates per protocol).
Run TeamDelete to clean up ~/.claude/teams/<team>/ and ~/.claude/tasks/<team>/.
Mark the orchestrator's Implementation + verify task completed.
Final assistant message — concise:
- Research: 1-3 most relevant references scout found.
- Critiqued: 1-3 most important concerns critic raised.
- Tested: what assumptions are now under test.
- Built: what was implemented.
- Files changed: absolute paths.
- Test result: pass / fail-after-N-retries / escalated.

Error handling

Failure	Behavior
`$ARGUMENTS` empty	Error: `/lite requires a task description in quotes`
`CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS` not enabled	Print: `set CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 in settings.json` and stop
Not in a git project	Continue, but tester writes to `~/lite-scratch/<timestamp>/`; orchestrator implements there too
No test framework detected	Continue; tester writes plain assertion script (e.g., a runnable Python file that exits non-zero on failure); report fallback used
TeamCreate fails	Surface the error and stop. Do NOT fall back to subagent dispatch silently.
Any teammate fails to spawn	Surface which one. If 1 of 4: continue without them, note in report. If 2+: shutdown remaining, TeamDelete, surface error.
Critic flags `FUNDAMENTALLY WRONG PLAN`	Send `shutdown_request` to all teammates. TeamDelete. Surface critic's reasoning to user.
Tests fail after `TEST_RETRY_MAX`	Continue to step 9 (shutdown + report) with failure status. Print last test output.
Orchestrator errors mid-flow	Send `shutdown_request` to all alive teammates, run TeamDelete, propagate the error to user. Do NOT leave orphan teams.
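
The spawn-failure row reduces to a small decision function. The sketch below is illustrative only; the name `spawn_failure_action` and the action labels are hypothetical.

```python
ROLES = ("scout", "explorer", "critic", "tester")


def spawn_failure_action(failed_roles):
    """Map the teammates that failed to spawn to the action in the table.

    Returns (action, failed), with failed in roster order:
      "proceed"          nobody failed
      "proceed-without"  exactly one failed; continue and note it in the report
      "abort"            two or more failed; shut down the rest, TeamDelete
    """
    failed_set = set(failed_roles)
    failed = [role for role in ROLES if role in failed_set]
    if not failed:
        return "proceed", failed
    if len(failed) == 1:
        return "proceed-without", failed
    return "abort", failed
```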

Anti-patterns — when NOT to invoke `/lite`

Trivial tasks (rename, fix typo, format) — overhead > value, even more so with teams.
Pure-research, no implementation — direct WebSearch is faster.
Cross-file architectural decisions — use the full team flow with Architect/Builder/Reviewer/Integrator.
Spec is unclear — clarify with the user first; /lite will commit to one interpretation.
Time-sensitive emergencies — direct intervention is faster than 4-teammate orchestration.
Token budget tight — teams cost ~4-7x subagents. (Today's /lite opts into the cost on purpose.)

Constraints (negative — what NOT to do)

Do NOT introduce new agents. Use only the four existing ones (Research Scout, Explorer, Critic, Tester).
Do NOT spawn teammates sequentially — all four go in a single message with four parallel Agent calls.
Do NOT let teammates write implementation code (except tester writing test code). The orchestrator implements.
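Do NOT ask the user clarifying questions during the flow, apart from the escalations defined in step 6 (an unresolvable conflict, or FUNDAMENTALLY WRONG PLAN). /lite is fire-and-forget: if the task is merely ambiguous, proceed with the most reasonable interpretation and note it in the report.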
Do NOT skip the test-pass verification (step 8) — claiming "done" without running tests violates superpowers:verification-before-completion.
Do NOT proceed past step 6 if critic flags FUNDAMENTALLY WRONG PLAN.
Do NOT leave orphan teams — every flow path (success, error, escalation) must call TeamDelete before returning to the user.
Do NOT skip role enforcement — if a teammate drifts off-role, DM them to redirect. The team's value is in role specialization.
Do NOT exceed MAX_DEBATE_ROUNDS of teammate-to-teammate debate without forcing synthesis.
Do NOT use background subagents alongside the team — all coordination flows through the team's mailbox + task list.
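
The "no orphan teams" constraint is the classic cleanup-on-every-path pattern. The sketch below shows it with nested `try`/`finally`; `flow`, `shutdown_teammates`, and `delete_team` are hypothetical callables standing in for the steps of the flow, the shutdown_request broadcast, and TeamDelete.

```python
def run_with_cleanup(flow, shutdown_teammates, delete_team):
    """Run the flow and always shut down and delete the team afterwards.

    Cleanup runs on success, on error, and on escalation. Any exception
    from the flow still propagates to the caller once cleanup is done.
    """
    try:
        return flow()
    finally:
        try:
            shutdown_teammates()
        finally:
            # Runs even if the shutdown step itself raised.
            delete_team()
```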
