feat(guardrails): implement Plan Generation Pre-flight Guardrails (7 checks before execution) #582

Closed
opened 2026-03-04 23:45:09 +00:00 by freemo · 1 comment
Owner

Metadata

Field | Value
Commit Message | feat(guardrails): implement Plan Generation Pre-flight Guardrails (7 checks before execution)
Branch | feature/m5-plan-preflight-guardrails

Summary

Implement the 7 pre-flight guardrail checks that must pass before a plan begins execution. These system-level checks validate that the plan is well-formed, its dependencies are satisfiable, and it can safely proceed. If any check fails, the plan is rejected before entering the Strategize phase.

Spec Reference

Section: Core Concepts > Guardrails > Plan Generation Guardrails
Lines: ~28228-28240

Current State

  • AutonomyGuardrailService exists and handles runtime guardrails (step limits, wall-clock enforcement during execution).
  • No pre-flight guardrails exist: There are no checks before a plan enters the Strategize phase.
  • Plan creation in plan_service.py / plan_lifecycle_service.py does not validate action schemas, actor availability, tool existence, or resource accessibility before starting.

Description

The spec mandates 7 pre-flight checks:

  1. Action schema validation: Verify the action referenced by the plan exists, is well-formed, and its configuration conforms to the expected schema.

  2. Actor availability: Confirm that all actors required by the plan (strategy actor, execution actor, estimation actor, invariant reconciliation actor) are registered and reachable. For remote actors, verify network connectivity.

  3. Skill and tool existence: Verify that all skills and tools referenced by the action's actor configuration exist in the registry. This includes tools referenced via skills (transitively resolved).

  4. Automation policy: Verify that the automation profile allows the requested level of autonomy for the target project, resources, and actions.

  5. Rollback feasibility: If require_checkpoints is enabled on the plan, verify that all tools in the actor's skill set have checkpointable: true. Reject plans that would use non-checkpointable tools under a checkpoint-required policy.

  6. Resource accessibility: Verify that the project's linked resources are accessible: git repos are cloneable, databases are connectable, and file system paths exist. This is a shallow connectivity check.

  7. Validation attachment resolution: Pre-resolve the validations that will apply to this plan (from resource-direct, project, and plan attachment scopes) and verify they are all registered and their tool definitions are valid.
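
Checks 3 and 5 both depend on the full set of tools reachable from an actor's skills. The following is a minimal sketch of that resolution, not the project's implementation: the `Tool` and `Skill` shapes, the idea that a skill may reference other skills, and the names `resolve_tools` and `non_checkpointable` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Tool:
    name: str
    checkpointable: bool = False


@dataclass
class Skill:
    # Hypothetical shape: a skill lists tools and, optionally, other skills.
    name: str
    tools: list[str] = field(default_factory=list)
    skills: list[str] = field(default_factory=list)


def resolve_tools(skill_names, skills, tools):
    """Walk skill references and return (resolved tool names, missing refs)."""
    resolved, missing, seen = set(), set(), set()
    stack = list(skill_names)
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # guards against cyclic skill references
        seen.add(name)
        skill = skills.get(name)
        if skill is None:
            missing.add(f"skill:{name}")
            continue
        for tool_name in skill.tools:
            if tool_name in tools:
                resolved.add(tool_name)
            else:
                missing.add(f"tool:{tool_name}")
        stack.extend(skill.skills)
    return resolved, sorted(missing)


def non_checkpointable(tool_names, tools):
    """Return the tools that would violate a checkpoint-required policy."""
    return sorted(t for t in tool_names if not tools[t].checkpointable)


# Example registries (illustrative data only).
TOOLS = {"git_commit": Tool("git_commit", True), "shell": Tool("shell", False)}
SKILLS = {
    "release": Skill("release", tools=["git_commit"], skills=["ops"]),
    "ops": Skill("ops", tools=["shell", "deploy"], skills=["release"]),
}

resolved, missing = resolve_tools(["release"], SKILLS, TOOLS)
```

In this example `missing` reports the unregistered `deploy` tool (check 3), and `non_checkpointable(resolved, TOOLS)` reports `shell`, which would fail check 5 when `require_checkpoints` is enabled.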

Behavior on failure: Plan is rejected before entering the Strategize phase with a clear error message identifying the failing check.
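
The rejection behavior could be sketched as a runner that executes named checks in order and raises on the first failure. Everything here is hypothetical: `PreflightError`, `run_preflight`, the dict-shaped plan, and the two stand-in checks are illustrative names, not the API of `AutonomyGuardrailService` or the plan lifecycle code.

```python
from typing import Callable, Optional

# A check returns a failure reason, or None when it passes.
Check = Callable[[dict], Optional[str]]


class PreflightError(Exception):
    """Raised when a pre-flight check rejects a plan."""

    def __init__(self, check_name: str, reason: str):
        super().__init__(f"pre-flight check '{check_name}' failed: {reason}")
        self.check_name = check_name
        self.reason = reason


KNOWN_TOOLS = {"shell", "git_commit"}  # stand-in for the tool registry


def check_action_schema(plan: dict) -> Optional[str]:
    if not isinstance(plan.get("action"), str) or not plan["action"]:
        return "plan does not reference an action"
    return None


def check_skill_and_tool_existence(plan: dict) -> Optional[str]:
    missing = sorted(set(plan.get("tools", [])) - KNOWN_TOOLS)
    if missing:
        return f"tools not found in registry: {', '.join(missing)}"
    return None


# Only two of the seven checks are stubbed here; the rest would follow suit.
CHECKS: list[tuple[str, Check]] = [
    ("action_schema_validation", check_action_schema),
    ("skill_and_tool_existence", check_skill_and_tool_existence),
]


def run_preflight(plan: dict, checks: list[tuple[str, Check]] = CHECKS) -> None:
    """Run checks in order; raise PreflightError naming the first failure."""
    for name, check in checks:
        reason = check(plan)
        if reason is not None:
            raise PreflightError(name, reason)
```

This sketch fails fast on the first failing check. Collecting every failure before rejecting would be an equally plausible design; the issue only requires that the message identify the failing check.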

Acceptance Criteria

  • PlanPreflightGuardrail service (or method on existing guardrail service) with all 7 checks
  • Action schema validation: reject plans referencing non-existent or malformed actions
  • Actor availability: verify all 4 actor roles (strategy, execution, estimation, invariant reconciliation) are registered
  • Skill/tool transitive resolution: verify all tools (including via skill references) exist in registry
  • Automation policy check: verify the automation profile permits the requested autonomy level
  • Rollback feasibility: if require_checkpoints: true, verify all tools are checkpointable: true
  • Resource accessibility: shallow connectivity check on all linked resources
  • Validation attachment resolution: pre-resolve and verify all applicable validations
  • Clear error messages for each failing check with the specific check name
  • Pre-flight runs before Strategize phase — plan is rejected if any check fails
  • Unit tests for each of the 7 checks (positive and negative cases)
  • Integration test: create a plan with a missing tool, verify it's rejected with clear message

Related Issues

  • Extends: existing AutonomyGuardrailService (runtime guardrails)
  • Related: Plan lifecycle in plan_lifecycle_service.py
  • Referenced by: ADR-018 (Semantic Error Prevention)

Suggested Milestone

v3.4.0

Priority

High

Suggested Assignee

@freemo — Architecture/complex system design

Subtasks

  • Code: Implement PlanPreflightGuardrail service with all 7 checks: action schema validation, actor availability, skill/tool transitive resolution, automation policy, rollback feasibility, resource accessibility, validation attachment resolution
  • Code: Wire pre-flight into plan lifecycle — run before Strategize phase, reject plan if any check fails
  • Code: Implement clear error messages for each failing check with specific check name
  • Docs: Document the 7 pre-flight guardrail checks and configuration options
  • Behave tests: Add BDD feature file features/guardrails/plan_preflight_guardrails.feature covering all 7 checks (positive and negative)
  • Robot tests: Add Robot Framework integration test: create plan with missing tool, verify rejection with clear message
  • ASV benchmarks: Add ASV benchmark for pre-flight check execution time (benchmarks/bench_preflight_guardrails.py)
  • Quality: coverage ≥97%: Verify via nox -s coverage_report
  • Quality: nox full suite: Run nox (all default sessions), fix any errors
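
For the ASV subtask, a benchmark module might look roughly like the sketch below. ASV collects methods whose names start with `time_` and calls `setup` before timing them. The `missing_tools` function is a stand-in for one pre-flight check, since the real `PlanPreflightGuardrail` service does not exist yet, and the registry size is an arbitrary assumption.

```python
def missing_tools(plan_tools, registry):
    """Stand-in for the tool-existence check: return unregistered tool names."""
    return [name for name in plan_tools if name not in registry]


class PreflightSuite:
    """Benchmark class in the shape ASV expects (setup plus time_* methods)."""

    def setup(self):
        # Hypothetical sizes: 1000 registered tools, 100 referenced by the plan.
        self.registry = {f"tool_{i}" for i in range(1000)}
        self.plan_tools = [f"tool_{i}" for i in range(0, 1000, 10)]

    def time_tool_existence_check(self):
        missing_tools(self.plan_tools, self.registry)
```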

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
freemo self-assigned this 2026-03-05 00:30:15 +00:00
freemo added this to the v3.4.0 milestone 2026-03-05 00:30:15 +00:00
Author
Owner

CTO verification: Issue verified. The 7 pre-flight guardrail checks are specified (lines ~28228-28240) and have no implementation. Queued for implementation after #572.

Blocks
#396 Epic: ACMS Context Pipeline
cleveragents/cleveragents-core
Reference
cleveragents/cleveragents-core#582