v3.1.0 — M2: Actor Compiler + Full LLM Integration
Goal: Actor YAML files compile into live LangGraph graphs. Custom actors (strategy, execution, estimation) are fully operational. The tool router normalizes calls across providers, the validation runner enforces resource-attached validations, and the MCP adapter connects to external tool servers.
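To make "normalizes calls across providers" concrete, here is a minimal sketch of the idea. It is not the project's router: the function name `normalize_tool_call` and the common output shape are assumptions made for illustration. The two input shapes follow the publicly documented OpenAI chat-completions tool call (arguments as a JSON string under `function`) and the Anthropic `tool_use` content block (arguments as an object under `input`).

```python
import json


def normalize_tool_call(provider, raw):
    """Map a provider-specific tool call onto one common shape.

    Hypothetical helper: the name and the returned keys are illustrative,
    not taken from the project.
    """
    if provider == "openai":
        # OpenAI nests the call under "function" and encodes arguments as a JSON string.
        fn = raw["function"]
        return {
            "id": raw["id"],
            "name": fn["name"],
            "arguments": json.loads(fn["arguments"]),
        }
    if provider == "anthropic":
        # Anthropic tool_use blocks carry arguments as an already-parsed object.
        return {"id": raw["id"], "name": raw["name"], "arguments": raw["input"]}
    raise ValueError(f"unsupported provider: {provider}")
```

Downstream code can then dispatch on `name` and `arguments` without caring which provider produced the call.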
Acceptance Criteria
- Actor YAML files with `version: "3"`, `type: llm|tool|graph`, and all schema fields parse and validate correctly via Pydantic models
- GRAPH-type actor YAML definitions compile into LangGraph `StateGraph` structures with subgraph resolution, cycle detection, and entry/exit validation
- `agents actor add --config actor.yaml` loads and registers actors
- Actions referencing custom actors correctly resolve and use those actors during plan execution
- Skill registry and tool lifecycle functional via CLI
- MCP adapter discovers and connects to external tool servers
- Validation runner executes required and informational validations
- Multi-file generation produces correct ChangeSet
- Test coverage >= 97%
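The cycle detection and entry/exit validation named in the criteria above can be sketched in plain Python, independent of LangGraph. This is an illustrative sketch only: `validate_graph` and its parameters are hypothetical, and whether the real compiler rejects cycles or merely flags them is not stated on this page, so here a cycle is simply reported as a problem.

```python
from collections import defaultdict


def validate_graph(nodes, edges, entry, exits):
    """Return a list of problems found in a graph-type actor definition.

    Hypothetical helper: parameter names do not come from the real schema.
    """
    problems = []
    node_set = set(nodes)

    # Entry/exit validation: both must refer to declared nodes.
    if entry not in node_set:
        problems.append(f"entry node {entry!r} is not declared")
    for name in exits:
        if name not in node_set:
            problems.append(f"exit node {name!r} is not declared")

    # Build adjacency lists, reporting edges that touch undeclared nodes.
    adjacency = defaultdict(list)
    for src, dst in edges:
        if src not in node_set or dst not in node_set:
            problems.append(f"edge {src!r} -> {dst!r} references an undeclared node")
            continue
        adjacency[src].append(dst)

    # Cycle detection with a three-colour depth-first search:
    # reaching a GREY node means we looped back onto the current path.
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {name: WHITE for name in nodes}

    def visit(name):
        colour[name] = GREY
        for nxt in adjacency[name]:
            if colour[nxt] == GREY:
                return True
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[name] = BLACK
        return False

    if any(colour[name] == WHITE and visit(name) for name in nodes):
        problems.append("graph contains a cycle")

    return problems
```

An empty list means the definition passed these checks; a compiler step could refuse to build the `StateGraph` when the list is non-empty.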
Technical Criteria
- Actor YAML files parse and validate correctly.
- Actors compile to LangGraph StateGraphs.
- Tool router and MCP adapter can resolve external tools.
- Validation runner executes required/informational validations.
- Multi-file generation produces correct ChangeSet.
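One plausible reading of the required/informational split is that a failed required validation blocks the run while a failed informational one is only reported. The sketch below assumes exactly that; the `Validation`, `ValidationReport`, and `run_validations` names are hypothetical and not taken from the project.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Validation:
    name: str
    check: Callable[[], bool]  # returns True when the validation passes
    required: bool = True      # False marks the validation as informational


@dataclass
class ValidationReport:
    passed: bool          # False only if a required validation failed
    failures: List[str]   # names of failed required validations
    warnings: List[str]   # names of failed informational validations


def run_validations(validations):
    """Run every validation and sort failures by severity."""
    failures, warnings = [], []
    for v in validations:
        try:
            ok = v.check()
        except Exception:
            # Assumption: a check that raises is treated as a failed check.
            ok = False
        if ok:
            continue
        (failures if v.required else warnings).append(v.name)
    return ValidationReport(passed=not failures, failures=failures, warnings=warnings)
```

Every validation runs even after a required one fails, so the report lists all problems in a single pass.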
100% Completed
test(e2e): E2E acceptance criteria for M2 (v3.1.0) — actor compiler and LLM integration
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 15s
CI / quality (pull_request) Successful in 20s
CI / build (pull_request) Successful in 19s
CI / security (pull_request) Successful in 35s
CI / typecheck (pull_request) Successful in 36s
CI / e2e_tests (pull_request) Failing after 52s
CI / integration_tests (pull_request) Successful in 2m58s
CI / unit_tests (pull_request) Successful in 3m36s
CI / docker (pull_request) Successful in 35s
CI / coverage (pull_request) Successful in 4m41s
CI / benchmark-regression (pull_request) Failing after 40m8s
Priority
Medium
State
In Review
Type
Testing
feat(skill): add MCP refresh hooks
All checks were successful
CI / lint (pull_request) Successful in 21s
CI / typecheck (pull_request) Successful in 32s
CI / quality (pull_request) Successful in 14s
CI / security (pull_request) Successful in 50s
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 24s
CI / integration_tests (pull_request) Successful in 2m44s
CI / unit_tests (pull_request) Successful in 21m48s
CI / docker (pull_request) Successful in 1m2s
CI / benchmark-regression (pull_request) Successful in 26m41s
CI / coverage (pull_request) Successful in 43m59s
State
Completed
Type
Feature
feat(cli): add skill tools and refresh commands
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 15s
CI / build (pull_request) Successful in 16s
CI / quality (pull_request) Successful in 29s
CI / typecheck (pull_request) Successful in 32s
CI / security (pull_request) Successful in 54s
CI / integration_tests (pull_request) Successful in 3m33s
CI / unit_tests (pull_request) Successful in 10m4s
CI / docker (pull_request) Successful in 1m0s
CI / benchmark-regression (pull_request) Successful in 27m22s
CI / coverage (pull_request) Successful in 39m27s
State
Completed
Type
Feature
feat(skill): add agent skills loader
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 12s
CI / build (pull_request) Successful in 15s
CI / quality (pull_request) Successful in 17s
CI / security (pull_request) Successful in 29s
CI / typecheck (pull_request) Successful in 31s
CI / integration_tests (pull_request) Successful in 3m31s
CI / unit_tests (pull_request) Successful in 10m8s
CI / docker (pull_request) Successful in 38s
CI / benchmark-regression (pull_request) Successful in 19m52s
CI / coverage (pull_request) Successful in 38m39s
State
Completed
Type
Feature
chore(ci): add concurrency groups and job timeouts to CI workflows
Some checks failed
CI / lint (pull_request) Successful in 22s
CI / typecheck (pull_request) Successful in 59s
CI / quality (pull_request) Successful in 30s
CI / security (pull_request) Successful in 48s
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 23s
CI / integration_tests (pull_request) Successful in 5m7s
CI / unit_tests (pull_request) Failing after 20m0s
CI / docker (pull_request) Has been skipped
CI / coverage (pull_request) Failing after 30m0s
CI / benchmark-regression (pull_request) Successful in 23m49s
State
Wont Do
Type
Task
test(coverage): add Behave scenarios for remaining under-tested modules
All checks were successful
CI / lint (pull_request) Successful in 14s
CI / benchmark-publish (pull_request) Has been skipped
CI / quality (pull_request) Successful in 27s
CI / typecheck (pull_request) Successful in 37s
CI / security (pull_request) Successful in 39s
CI / build (pull_request) Successful in 29s
CI / integration_tests (pull_request) Successful in 5m51s
CI / benchmark-regression (pull_request) Successful in 22m39s
CI / unit_tests (pull_request) Successful in 29m35s
CI / docker (pull_request) Successful in 13s
CI / coverage (pull_request) Successful in 2h3m1s
CI / lint (push) Successful in 21s
CI / typecheck (push) Successful in 57s
CI / security (push) Successful in 51s
CI / quality (push) Successful in 27s
CI / integration_tests (push) Successful in 4m59s
CI / build (push) Successful in 23s
CI / benchmark-publish (push) Successful in 14m39s
CI / benchmark-regression (push) Has been skipped
CI / unit_tests (push) Successful in 29m48s
CI / docker (push) Successful in 15s
CI / coverage (push) Successful in 1h45m42s
State
Completed
Type
Task
feat(skill): add MCP adapter for external tools
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 15s
CI / build (pull_request) Successful in 16s
CI / quality (pull_request) Successful in 18s
CI / security (pull_request) Successful in 29s
CI / typecheck (pull_request) Successful in 33s
CI / integration_tests (pull_request) Successful in 3m10s
CI / unit_tests (pull_request) Successful in 9m49s
CI / docker (pull_request) Successful in 40s
CI / benchmark-regression (pull_request) Successful in 27m20s
CI / coverage (pull_request) Successful in 38m36s
State
Completed
Type
Feature
feature/m2-actor-yaml
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 15s
CI / build (pull_request) Successful in 16s
CI / quality (pull_request) Successful in 44s
CI / typecheck (pull_request) Successful in 44s
CI / security (pull_request) Successful in 47s
CI / integration_tests (pull_request) Successful in 3m17s
CI / unit_tests (pull_request) Successful in 20m41s
CI / docker (pull_request) Successful in 1m2s
CI / benchmark-regression (pull_request) Successful in 23m40s
CI / coverage (pull_request) Successful in 35m33s
State
Completed
Type
Feature
test: consolidated Brent QA batch — issues #156, #169, #326, #402, #403
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 31s
CI / quality (pull_request) Successful in 30s
CI / build (pull_request) Successful in 16s
CI / security (pull_request) Successful in 1m5s
CI / typecheck (pull_request) Successful in 1m12s
CI / integration_tests (pull_request) Successful in 4m0s
CI / unit_tests (pull_request) Successful in 17m22s
CI / docker (pull_request) Successful in 1m0s
CI / benchmark-regression (pull_request) Successful in 18m43s
CI / coverage (pull_request) Successful in 39m43s
State
Completed
Type
Testing
Integration: Provider fixes, cost controls, changeset persistence, concurrency locks, and plan resume
Some checks failed
CI / lint (pull_request) Failing after 23s
CI / typecheck (pull_request) Successful in 59s
CI / coverage (pull_request) Has been skipped
CI / security (pull_request) Successful in 58s
CI / quality (pull_request) Successful in 43s
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 25s
CI / benchmark-regression (pull_request) Has been skipped
CI / integration_tests (pull_request) Failing after 3m51s
CI / unit_tests (pull_request) Failing after 23m31s
CI / docker (pull_request) Has been skipped
State
Wont Do
Type
Feature
feat(changeset): persist changesets and diff artifacts
All checks were successful
CI / lint (pull_request) Successful in 24s
CI / typecheck (pull_request) Successful in 59s
CI / security (pull_request) Successful in 53s
CI / quality (pull_request) Successful in 36s
CI / benchmark-publish (pull_request) Has been skipped
CI / build (pull_request) Successful in 27s
CI / integration_tests (pull_request) Successful in 5m30s
CI / benchmark-regression (pull_request) Successful in 25m38s
CI / unit_tests (pull_request) Successful in 36m51s
CI / docker (pull_request) Successful in 1m3s
CI / coverage (pull_request) Successful in 1h48m49s
CI / lint (push) Successful in 22s
CI / security (push) Successful in 58s
CI / typecheck (push) Successful in 1m3s
CI / quality (push) Successful in 46s
CI / build (push) Successful in 23s
CI / integration_tests (push) Successful in 5m23s
CI / benchmark-regression (push) Has been skipped
CI / benchmark-publish (push) Successful in 15m5s
CI / unit_tests (push) Successful in 20m1s
CI / docker (push) Successful in 1m0s
CI / coverage (push) Successful in 1h45m36s
State
Completed
Type
Feature
test(e2e): verify M2 success criteria — actor compiler and tool routing
All checks were successful
CI / lint (pull_request) Successful in 32s
CI / benchmark-publish (pull_request) Has been skipped
CI / typecheck (pull_request) Successful in 46s
CI / quality (pull_request) Successful in 34s
CI / security (pull_request) Successful in 50s
CI / build (pull_request) Successful in 27s
CI / integration_tests (pull_request) Successful in 4m5s
CI / unit_tests (pull_request) Successful in 15m35s
CI / docker (pull_request) Successful in 1m1s
CI / benchmark-regression (pull_request) Successful in 22m17s
CI / coverage (pull_request) Successful in 34m4s
State
Wont Do
Type
Testing
feat(concurrency): add plan resume
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 21s
CI / quality (pull_request) Successful in 21s
CI / build (pull_request) Successful in 23s
CI / typecheck (pull_request) Successful in 39s
CI / security (pull_request) Successful in 50s
CI / integration_tests (pull_request) Successful in 4m4s
CI / unit_tests (pull_request) Successful in 9m15s
CI / docker (pull_request) Successful in 1m0s
CI / benchmark-regression (pull_request) Successful in 23m41s
CI / coverage (pull_request) Successful in 43m24s
CI / lint (push) Successful in 22s
CI / build (push) Successful in 24s
CI / quality (push) Successful in 27s
CI / typecheck (push) Successful in 44s
CI / benchmark-regression (push) Has been skipped
CI / security (push) Successful in 55s
CI / integration_tests (push) Successful in 4m16s
CI / benchmark-publish (push) Successful in 14m39s
CI / unit_tests (push) Successful in 24m24s
CI / docker (push) Successful in 1m1s
CI / coverage (push) Successful in 45m36s
State
Completed
Type
Feature
feat(concurrency): add plan and project locks
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 16s
CI / quality (pull_request) Successful in 30s
CI / build (pull_request) Successful in 24s
CI / security (pull_request) Successful in 40s
CI / typecheck (pull_request) Successful in 1m0s
CI / integration_tests (pull_request) Successful in 4m14s
CI / unit_tests (pull_request) Successful in 16m25s
CI / docker (pull_request) Successful in 55s
CI / benchmark-regression (pull_request) Successful in 22m55s
CI / coverage (pull_request) Successful in 38m4s
CI / lint (push) Successful in 13s
CI / build (push) Successful in 15s
CI / quality (push) Successful in 28s
CI / typecheck (push) Successful in 31s
CI / benchmark-regression (push) Has been skipped
CI / security (push) Successful in 31s
CI / integration_tests (push) Successful in 3m19s
CI / benchmark-publish (push) Successful in 14m23s
CI / unit_tests (push) Successful in 14m44s
CI / docker (push) Successful in 38s
CI / coverage (push) Successful in 32m22s
State
Completed
Type
Feature
test(e2e): add M2 actor + tool source smoke suite
All checks were successful
CI / lint (pull_request) Successful in 23s
CI / benchmark-publish (pull_request) Has been skipped
CI / quality (pull_request) Successful in 30s
CI / build (pull_request) Successful in 23s
CI / security (pull_request) Successful in 52s
CI / typecheck (pull_request) Successful in 58s
CI / integration_tests (pull_request) Successful in 4m58s
CI / unit_tests (pull_request) Successful in 12m52s
CI / docker (pull_request) Successful in 1m0s
CI / benchmark-regression (pull_request) Successful in 21m33s
CI / coverage (pull_request) Successful in 1h12m45s
Priority
Medium
State
In Progress
Type
Testing