feat(validation): implement Fix-then-Revalidate orchestration loop for required validations #583

Closed
opened 2026-03-04 23:45:32 +00:00 by freemo · 0 comments
Owner

Metadata

Field Value
Commit Message feat(validation): implement Fix-then-Revalidate orchestration loop for required validations
Branch feature/m6-fix-then-revalidate-loop

Summary

Implement the full fix-then-revalidate orchestration loop that runs when a required validation fails during the Execute phase. This includes: failure diagnosis, self-fix attempt by the execution actor, re-validation, retry limiting, strategy revision escalation, and terminal failure handling.

Spec Reference

Section: Core Concepts > Validation > Validation Failure Handling > Required Validation Failure
Lines: ~22350-22378
Also: Lines ~22195-22202 (Required mode), ~28250 (Validation retry limits)

Current State

  • Validation infrastructure exists: validation_apply.py can run validations and check pass/fail results.
  • Validation modes (required, informational) are defined in the data model.
  • No fix-then-revalidate loop exists: When a validation fails, there is no automated cycle of diagnosis → fix → re-run.
  • The execution actor does not currently examine validation message and data fields to attempt targeted fixes.
  • No retry counting or retry limit enforcement exists for validation failures.
  • No strategy revision escalation path exists from validation failures.

Description

The spec defines this flow for required validation failures:

  1. Diagnosis: Execution actor examines the validation's message and data fields to understand the failure nature. Well-structured data (file paths, line numbers, error messages) enables targeted fixes.

  2. Self-fix attempt: Actor attempts to fix within current strategy bounds:

    • Failing test → read test output, identify broken assertion, fix code
    • Lint error → read lint report, correct formatting/style violation
    • Type error → read type checker output, fix type mismatch

  3. Re-validation: After each fix attempt, re-invoke the failing validation. If it passes, loop ends.

  4. Retry limit: Loop runs up to the configured limit (default: 3 attempts, configurable per plan or automation profile). Once the limit is reached, self-fix stops.

  5. Strategy revision: If failure can't be resolved within current strategy constraints (e.g., strategy says "modify only handler.py" but fix requires changes to model.py), request strategy revision — re-run Strategize phase for affected subtree. Controlled by auto_strategy_revision flag.

  6. Escalation: If strategy revision fails or requires approval, plan pauses for user guidance via agents plan prompt.

  7. Terminal failure: If no intervention, plan fails with state: failed, sandbox preserved.
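
The flow above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `Outcome`, `ValidationResult`, `fix_then_revalidate`, and the two callables are hypothetical names. Only the `message` and `data` fields, the `auto_strategy_revision` flag, and the default limit of 3 come from this issue.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Callable


class Outcome(Enum):
    PASSED = "passed"            # validation passed (possibly after fixes)
    REVISE_STRATEGY = "revise"   # hand off to the Strategize phase
    ESCALATE = "escalate"        # pause for user guidance


@dataclass
class ValidationResult:
    # Hypothetical container; the issue only names `message` and `data`.
    passed: bool
    message: str = ""
    data: dict[str, Any] = field(default_factory=dict)


def fix_then_revalidate(
    run_validation: Callable[[], ValidationResult],
    attempt_fix: Callable[[str, dict[str, Any]], None],
    retry_limit: int = 3,
    auto_strategy_revision: bool = False,
) -> tuple[Outcome, int]:
    """Return the outcome and the number of fix attempts made."""
    result = run_validation()
    attempts = 0
    while not result.passed and attempts < retry_limit:
        # Diagnosis and self-fix: the actor sees message + data.
        attempt_fix(result.message, result.data)
        attempts += 1
        # Re-validation after every fix attempt.
        result = run_validation()
    if result.passed:
        return Outcome.PASSED, attempts
    # Retry limit exhausted: revise the strategy if allowed, else escalate.
    if auto_strategy_revision:
        return Outcome.REVISE_STRATEGY, attempts
    return Outcome.ESCALATE, attempts


# Usage: a validation that fails twice, then passes on the third run.
results = iter([
    ValidationResult(False, "lint error", {"file": "handler.py", "line": 12}),
    ValidationResult(False, "lint error", {"file": "handler.py", "line": 40}),
    ValidationResult(True),
])
seen = []
outcome, attempts = fix_then_revalidate(
    run_validation=lambda: next(results),
    attempt_fix=lambda message, data: seen.append((message, data)),
)
```

The sketch stops at the escalation decision. Pausing the plan, waiting for user guidance, and the transition to `state: failed` when no intervention arrives would belong to the plan executor and are not modeled here.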

Acceptance Criteria

  • FixThenRevalidateOrchestrator (or method on plan_executor) that implements the full loop
  • On required validation failure: execution actor receives message + data for diagnosis
  • Self-fix: actor attempts to fix the issue within strategy bounds, then re-runs validation
  • Retry counting: tracks number of fix attempts per validation per plan
  • Retry limit enforcement: default 3, configurable via automation_profile.validation_retry_limit or per-plan config
  • After retry exhaustion: attempt strategy revision if auto_strategy_revision: true
  • After strategy revision failure: pause for user escalation via agents plan prompt
  • Terminal failure: plan transitions to state: failed with sandbox preserved
  • validation_attempts and validation_fix_history recorded in plan execution metadata
  • Informational validation failures do NOT trigger the fix loop (only record results)
  • Unit tests: fix-revalidate cycle with mock actor and validation
  • Integration test: required validation fails, actor fixes, re-validation passes

Related Issues

  • Extends: existing validation_apply.py infrastructure
  • Related: Plan executor (plan_executor.py)
  • Related: Automation profile auto_strategy_revision flag
  • Referenced by: Spec Validation section (lines 22190-22690)
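
The retry-limit and metadata acceptance criteria could look something like the sketch below. It is illustrative only: `resolve_retry_limit`, `ExecutionMetadata`, and `record_fix` are hypothetical names, and letting a per-plan value take precedence over the automation profile is an assumption, since the issue does not state an order. The field names `validation_attempts` and `validation_fix_history` and the default of 3 are taken from the issue.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

DEFAULT_VALIDATION_RETRY_LIMIT = 3  # default stated in the issue


def resolve_retry_limit(
    plan_limit: Optional[int],
    profile_limit: Optional[int],
) -> int:
    """Pick the retry limit; plan-over-profile precedence is an assumption."""
    for candidate in (plan_limit, profile_limit):
        if candidate is not None:
            if candidate < 0:
                raise ValueError("validation retry limit must be non-negative")
            return candidate
    return DEFAULT_VALIDATION_RETRY_LIMIT


@dataclass
class ExecutionMetadata:
    # Keyed by validation name: attempts are tracked per validation per plan.
    validation_attempts: dict[str, int] = field(default_factory=dict)
    validation_fix_history: list[dict[str, Any]] = field(default_factory=list)

    def record_fix(self, validation: str, message: str, passed: bool) -> int:
        """Record one fix attempt and return the running attempt count."""
        count = self.validation_attempts.get(validation, 0) + 1
        self.validation_attempts[validation] = count
        self.validation_fix_history.append(
            {"validation": validation, "attempt": count,
             "message": message, "passed": passed}
        )
        return count


# Usage: two fix attempts for one validation, profile limit of 5.
meta = ExecutionMetadata()
meta.record_fix("unit-tests", "assertion failed in test_handler", passed=False)
meta.record_fix("unit-tests", "assertion failed in test_handler", passed=True)
limit = resolve_retry_limit(plan_limit=None, profile_limit=5)
```

Keeping the attempt counter and the history in one structure means the retry check and the audit trail cannot drift apart, though the real plan execution metadata may be shaped differently.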

Suggested Milestone

v3.5.0

Priority

Medium

Suggested Assignee

@CoreRasurae — Async/validation/subplans

Subtasks

  • Code: Implement FixThenRevalidateOrchestrator with full loop: diagnosis → self-fix → re-validation → retry limit → escalation
  • Code: Implement retry counting and limit enforcement (default 3, configurable via automation_profile.validation_retry_limit)
  • Code: Implement strategy revision escalation when auto_strategy_revision: true and user escalation via agents plan prompt
  • Code: Record validation_attempts and validation_fix_history in plan execution metadata
  • Docs: Document the fix-then-revalidate loop, retry configuration, and escalation paths
  • Behave tests: Add BDD feature file features/validation/fix_then_revalidate.feature covering full loop with retry and escalation
  • Robot tests: Add Robot Framework integration test: required validation fails, actor fixes, re-validation passes
  • ASV benchmarks: Add ASV benchmark for validation retry loop overhead (benchmarks/bench_fix_revalidate.py)
  • Quality: coverage >=97%: Verify via nox -s coverage_report
  • Quality: nox full suite: Run nox (all default sessions), fix any errors

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
freemo added this to the v3.5.0 milestone 2026-03-05 00:30:26 +00:00
Reference
cleveragents/cleveragents-core#583