v3.0.0
v3.0.0 — M1: Minimal Local Source-Code Workflow
Goal: A minimally usable local-mode flow where a user can register an action from YAML, link a git repository resource to a project, and run a plan end-to-end (plan use → plan execute → plan diff → plan apply) with a sandboxed workspace and tool-based change capture.
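The plan lifecycle named in the goal can be pictured as a small state machine. This is a minimal sketch, not the project's implementation: the state names (CREATED, EXECUTED, APPLIED), the ALLOWED table, and the transition helper are all hypothetical, since the source names only the CLI steps. It also assumes that plan diff is read-only and does not change state.

```python
from enum import Enum


class PlanState(Enum):
    # Hypothetical state names; the source only names the CLI steps.
    CREATED = "created"    # assumed state after `plan use`
    EXECUTED = "executed"  # assumed state after `plan execute`
    APPLIED = "applied"    # assumed state after `plan apply`


# Allowed transitions: each state maps to the states it may move to.
ALLOWED = {
    PlanState.CREATED: {PlanState.EXECUTED},
    PlanState.EXECUTED: {PlanState.APPLIED},
    PlanState.APPLIED: set(),
}


def transition(current: PlanState, target: PlanState) -> PlanState:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Keeping the allowed moves in one table makes it easy to test that a plan cannot be applied before it has been executed.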
Acceptance Criteria
- agents action create --config action.yaml successfully creates and persists an action record to SQLite
- agents resource add git-checkout registers a git-checkout resource
- agents project create and agents project link-resource link resources to projects
- agents plan use creates a plan record with correct state machine transitions
- agents plan execute <plan_id> invokes the actor-based LLM path, producing tool invocations and a ChangeSet
- agents plan diff <plan_id> shows pending changes in the sandbox
- agents plan apply <plan_id> merges sandbox changes into the target repository with a git commit
- Git worktree sandbox creates isolated working directory
- Changes in sandbox do not affect original until Apply
- Post-apply commit exists in target repository
- Test coverage >= 97%
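The sandbox criteria above (isolated working directory, no effect on the original until apply, a commit afterwards) can be illustrated with plain git worktree commands. This is a hedged sketch of one possible approach, not the project's code: the helper names git, create_sandbox, and apply_sandbox, the branch name, and the fast-forward merge strategy are assumptions made for illustration. It requires git on the PATH and a repository with at least one commit.

```python
import subprocess
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run a git command inside `repo` and return its stripped stdout."""
    result = subprocess.run(
        # A throwaway identity so commits work without global git config.
        ["git", "-c", "user.name=sketch", "-c", "user.email=sketch@example.invalid", *args],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def create_sandbox(repo: Path, sandbox: Path, branch: str) -> None:
    """Check out a new branch into an isolated worktree directory."""
    git(repo, "worktree", "add", "-b", branch, str(sandbox))


def apply_sandbox(repo: Path, sandbox: Path, branch: str, message: str) -> None:
    """Commit the sandbox changes, fast-forward the target, drop the worktree."""
    git(sandbox, "add", "-A")
    git(sandbox, "commit", "-m", message)
    git(repo, "merge", "--ff-only", branch)
    git(repo, "worktree", "remove", str(sandbox))
```

Edits made under the sandbox path stay on the sandbox branch, so the original checkout is unchanged until apply_sandbox merges them.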
Technical Criteria
- Plan and Action records persist to SQLite database with Alembic migrations.
- Execute/apply runs via actor-based LLM path (not hardcoded).
- ChangeSet built from tool invocations (not parsed from LLM output).
- Git worktree sandbox creates isolated working directory.
- Changes in sandbox do not affect original until Apply.
- Domain models use Pydantic v2 with frozen=True.
- Test coverage remains >= 97%.
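To illustrate the criterion that a ChangeSet is built from tool invocations rather than parsed from LLM output, here is a small sketch under stated assumptions. The field names, the tool names write_file, delete_file, and read_file, and the last-call-wins rule are hypothetical. Frozen dataclasses stand in for the Pydantic v2 frozen=True models named above, purely to keep the sketch dependency-free.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Optional


@dataclass(frozen=True)
class ToolInvocation:
    # Hypothetical record of one tool call made during `plan execute`.
    tool: str  # illustrative names: "write_file", "delete_file", "read_file"
    path: str
    content: Optional[str] = None


@dataclass(frozen=True)
class Change:
    path: str
    kind: str  # "write" or "delete"
    content: Optional[str] = None


@dataclass(frozen=True)
class ChangeSet:
    changes: tuple


def build_changeset(invocations: list) -> ChangeSet:
    """Fold tool invocations into one change per path; the last call wins."""
    latest = {}
    for inv in invocations:
        if inv.tool == "write_file":
            latest[inv.path] = Change(inv.path, "write", inv.content)
        elif inv.tool == "delete_file":
            latest[inv.path] = Change(inv.path, "delete")
        # Any other tool (for example "read_file") records no change.
    return ChangeSet(tuple(latest.values()))
```

Because the records are immutable, a ChangeSet cannot be altered between plan diff and plan apply, which is one plausible reason for the frozen-model requirement.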
100% Completed
test(e2e): E2E acceptance criteria for M1 (v3.0.0) — minimal plan execution flow
Some checks failed
CI / e2e_tests (pull_request) Failing after 25s
State: In Review
Type: Testing
feat(provider): add cost controls and fallback
All checks were successful
State: Completed
Type: Feature
fix(provider): remove FakeListLLM defaults
Some checks failed
CI / lint (pull_request) Failing after 22s
CI / integration_tests (pull_request) Failing after 2m32s
CI / unit_tests (pull_request) Failing after 9m58s
State: Wont Do
Type: Bug
test(e2e): verify M1 success criteria — minimal plan execution flow
All checks were successful
State: Wont Do
Type: Testing
test(e2e): add M1 source-code plan lifecycle suite
All checks were successful
State: Wont Do
Type: Testing
docs: reorganize implementation plan and expand contributing guidelines
All checks were successful
State: Completed
Type: Task
feat(resource): add handler runtime for git-checkout and fs-directory
All checks were successful
State: Completed