UAT: ApplyValidationGate uses DefaultValidationRunner stub — apply-phase validation does not execute registered validation tools #3490

Open
opened 2026-04-05 18:36:35 +00:00 by freemo · 3 comments
Owner

Metadata

  • Branch: fix/m3-apply-validation-gate-real-runner
  • Commit Message: fix(validation): replace DefaultValidationRunner stub with ToolRegistry-backed runner in ApplyValidationGate
  • Milestone: v3.2.0
  • Parent Epic: #357

Bug Description

The ApplyValidationGate class (src/cleveragents/application/services/validation_apply.py) is designed to gate the Apply phase by running validation attachments. However, its default runner implementation (DefaultValidationRunner) is a stub that uses text-matching against context strings to determine pass/fail — it does not actually invoke the registered validation tools from the Tool Registry.

The DefaultValidationRunner.run_validation() method contains this comment in the code:

"This is a placeholder — real implementations invoke the actual validation tool."

And its logic is:

context_text = " ".join(f"{k}={v}" for k, v in context.items()).lower()
val_name = attachment.validation_name.lower()
passed = val_name in context_text

This means a validation named local/lint-check will "pass" if the string "local/lint-check" appears anywhere in the context dict values, and "fail" otherwise — completely unrelated to whether the actual lint check passes.

Spec References (docs/specification.md):

  • Line 22322: "a plain Tool cannot be used where a Validation is specifically required" — implies real tool execution
  • Line 22560: "If the validation is informational, the error is still recorded but does not block." — implies real execution results
  • Line 22808: "final_validation_results | plan.apply | Snapshot of the final validation state at the time execution completed (all required validations passing)."

Expected Behavior:
ApplyValidationGate should use a real ValidationRunner implementation that:

  1. Looks up the validation tool from the Tool Registry by name
  2. Executes the tool via ToolRunner.execute() with the appropriate inputs
  3. Interprets the tool's return value (checking for passed: bool in the output)
  4. Returns an ApplyValidationResult based on the actual tool execution result

Actual Behavior:
ApplyValidationGate uses DefaultValidationRunner which performs text-matching against context strings. Registered validation tools are never actually executed. The Apply phase validation gate produces meaningless results based on string matching rather than real validation outcomes.

Code Location:

  • src/cleveragents/application/services/validation_apply.py, class DefaultValidationRunner (lines ~270–320)
  • The DefaultValidationRunner is the default runner passed to ApplyValidationGate.__init__() in all usages
  • A real ToolRegistryValidationRunner needs to be implemented that uses ToolRunner.execute() and checks the passed boolean in the output

Impact: Critical — the Apply phase validation gate is non-functional. Required validations that should block apply never actually run their registered tool logic. Plans can be applied even when their resources fail real validation checks.

Subtasks

  • Implement ToolRegistryValidationRunner in src/cleveragents/application/services/validation_apply.py that looks up validation tools from the Tool Registry by name
  • Wire ToolRegistryValidationRunner to use ToolRunner.execute() to invoke the validation tool with appropriate inputs
  • Parse the tool's return value and check for passed: bool in the output to determine pass/fail
  • Replace DefaultValidationRunner as the default runner in ApplyValidationGate.__init__() with ToolRegistryValidationRunner
  • Retain DefaultValidationRunner (or remove it) only as a test double in features/mocks/ if needed
  • Add static type annotations to all new/modified methods; verify with nox -e typecheck
  • Tests (Behave): Add/update scenarios in features/ covering real tool invocation, passed: True, passed: False, and missing tool cases
  • Tests (Robot): Add/update integration test in robot/ verifying end-to-end apply-phase validation gate with a real registered validation tool
  • Verify coverage ≥ 97% via nox -s coverage_report
  • Run nox (all default sessions), fix any errors

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • ApplyValidationGate uses a ValidationRunner implementation that actually invokes registered validation tools via ToolRunner.execute() and interprets the passed: bool result — no text-matching stub logic remains in the production code path.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
  • All nox stages pass.
  • Coverage ≥ 97%.

Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: ca-uat-tester

## Metadata - **Branch**: `fix/m3-apply-validation-gate-real-runner` - **Commit Message**: `fix(validation): replace DefaultValidationRunner stub with ToolRegistry-backed runner in ApplyValidationGate` - **Milestone**: v3.2.0 - **Parent Epic**: #357 ## Bug Description The `ApplyValidationGate` class (`src/cleveragents/application/services/validation_apply.py`) is designed to gate the Apply phase by running validation attachments. However, its default runner implementation (`DefaultValidationRunner`) is a stub that uses text-matching against context strings to determine pass/fail — it does **not** actually invoke the registered validation tools from the Tool Registry. The `DefaultValidationRunner.run_validation()` method contains this comment in the code: > "This is a placeholder — real implementations invoke the actual validation tool." And its logic is: ```python context_text = " ".join(f"{k}={v}" for k, v in context.items()).lower() val_name = attachment.validation_name.lower() passed = val_name in context_text ``` This means a validation named `local/lint-check` will "pass" if the string `"local/lint-check"` appears anywhere in the context dict values, and "fail" otherwise — completely unrelated to whether the actual lint check passes. **Spec References** (`docs/specification.md`): - Line 22322: "a plain Tool cannot be used where a Validation is specifically required" — implies real tool execution - Line 22560: "If the validation is informational, the error is still recorded but does not block." — implies real execution results - Line 22808: "`final_validation_results` | `plan.apply` | Snapshot of the final validation state at the time execution completed (all required validations passing)." **Expected Behavior**: `ApplyValidationGate` should use a real `ValidationRunner` implementation that: 1. Looks up the validation tool from the Tool Registry by name 2. Executes the tool via `ToolRunner.execute()` with the appropriate inputs 3. Interprets the tool's return value (checking for `passed: bool` in the output) 4. Returns an `ApplyValidationResult` based on the actual tool execution result **Actual Behavior**: `ApplyValidationGate` uses `DefaultValidationRunner` which performs text-matching against context strings. Registered validation tools are never actually executed. The Apply phase validation gate produces meaningless results based on string matching rather than real validation outcomes. **Code Location**: - `src/cleveragents/application/services/validation_apply.py`, class `DefaultValidationRunner` (lines ~270–320) - The `DefaultValidationRunner` is the default runner passed to `ApplyValidationGate.__init__()` in all usages - A real `ToolRegistryValidationRunner` needs to be implemented that uses `ToolRunner.execute()` and checks the `passed` boolean in the output **Impact**: Critical — the Apply phase validation gate is non-functional. Required validations that should block apply never actually run their registered tool logic. Plans can be applied even when their resources fail real validation checks. ## Subtasks - [ ] Implement `ToolRegistryValidationRunner` in `src/cleveragents/application/services/validation_apply.py` that looks up validation tools from the Tool Registry by name - [ ] Wire `ToolRegistryValidationRunner` to use `ToolRunner.execute()` to invoke the validation tool with appropriate inputs - [ ] Parse the tool's return value and check for `passed: bool` in the output to determine pass/fail - [ ] Replace `DefaultValidationRunner` as the default runner in `ApplyValidationGate.__init__()` with `ToolRegistryValidationRunner` - [ ] Retain `DefaultValidationRunner` (or remove it) only as a test double in `features/mocks/` if needed - [ ] Add static type annotations to all new/modified methods; verify with `nox -e typecheck` - [ ] Tests (Behave): Add/update scenarios in `features/` covering real tool invocation, `passed: True`, `passed: False`, and missing tool cases - [ ] Tests (Robot): Add/update integration test in `robot/` verifying end-to-end apply-phase validation gate with a real registered validation tool - [ ] Verify coverage ≥ 97% via `nox -s coverage_report` - [ ] Run `nox` (all default sessions), fix any errors ## Definition of Done This issue is complete when: - All subtasks above are completed and checked off. - `ApplyValidationGate` uses a `ValidationRunner` implementation that actually invokes registered validation tools via `ToolRunner.execute()` and interprets the `passed: bool` result — no text-matching stub logic remains in the production code path. - A Git commit is created where the **first line** of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation. - The commit is pushed to the remote on the branch matching the **Branch** in Metadata exactly. - The commit is submitted as a **pull request** to `master`, reviewed, and **merged** before this issue is marked done. - All nox stages pass. - Coverage ≥ 97%. --- **Automated by CleverAgents Bot** Supervisor: UAT Testing | Agent: ca-uat-tester
freemo added this to the v3.2.0 milestone 2026-04-05 18:36:40 +00:00
Author
Owner

Issue verified and triaged:

  • Priority: Critical — ApplyValidationGate uses text-matching stub instead of real tool execution, making apply-phase validation meaningless.
  • Milestone: v3.2.0 (already assigned)
  • Story Points: 5 (L) — requires implementing ToolRegistryValidationRunner with real tool invocation and result interpretation.
  • Parent Epic: #357 (already linked)
  • Dependency: Part of the validation subsystem fix cluster with #3487 and #3482.
  • Next step: This issue is now ready for implementation.

Automated by CleverAgents Bot
Supervisor: Human Liaison | Agent: ca-human-liaison

Issue verified and triaged: - **Priority**: Critical — ApplyValidationGate uses text-matching stub instead of real tool execution, making apply-phase validation meaningless. - **Milestone**: v3.2.0 (already assigned) - **Story Points**: 5 (L) — requires implementing ToolRegistryValidationRunner with real tool invocation and result interpretation. - **Parent Epic**: #357 (already linked) - **Dependency**: Part of the validation subsystem fix cluster with #3487 and #3482. - **Next step**: This issue is now ready for implementation. --- **Automated by CleverAgents Bot** Supervisor: Human Liaison | Agent: ca-human-liaison
Author
Owner

Transition request: State/Verified -> State/In Progress for issue #3490 (cleveragents/cleveragents-core).

Current state label: State/Verified.
Other labels present: Priority/Critical, Type/Bug.

Preconditions:

  • Blocked label is not present, so there is no blocking issue to unlock.

Planned label changes (to be executed by maintainers or via API):

  • Remove old state label(s): State/Verified
  • Add new state label: State/In Progress

Note: This action requires label operations via the Forgejo API. The current toolset does not expose direct label editing endpoints. If you approve, I can either (a) attempt to apply these label changes through an extended API wrapper, or (b) leave a formal transition record here and proceed once label endpoints are available. For now, I am documenting the intended transition and will re-run when label-edit capability is accessible.

Blockers/Linked issues: None (Blocker not set).

Reason for transition: Verifies that work can commence on the issue and aligns with the Ticket Lifecycle: Unverified → Verified → In Progress.


Automated by CleverAgents Bot
Supervisor: Implementation | Agent: ca-issue-state-updater

Transition request: State/Verified -> State/In Progress for issue #3490 (cleveragents/cleveragents-core). Current state label: State/Verified. Other labels present: Priority/Critical, Type/Bug. Preconditions: - Blocked label is not present, so there is no blocking issue to unlock. Planned label changes (to be executed by maintainers or via API): - Remove old state label(s): State/Verified - Add new state label: State/In Progress Note: This action requires label operations via the Forgejo API. The current toolset does not expose direct label editing endpoints. If you approve, I can either (a) attempt to apply these label changes through an extended API wrapper, or (b) leave a formal transition record here and proceed once label endpoints are available. For now, I am documenting the intended transition and will re-run when label-edit capability is accessible. Blockers/Linked issues: None (Blocker not set). Reason for transition: Verifies that work can commence on the issue and aligns with the Ticket Lifecycle: Unverified → Verified → In Progress. --- **Automated by CleverAgents Bot** Supervisor: Implementation | Agent: ca-issue-state-updater
Author
Owner

Milestone Triage Decision: Moved to Backlog

This issue has been moved out of v3.3.0 during aggressive milestone triage. While important for completeness, it does not directly relate to the core focus of Corrections + Subplans + Checkpoints.

Reasoning:

  • v3.3.0 focus: Essential corrections, subplan management, and checkpoint functionality
  • This issue: Infrastructure improvement not directly blocking core milestone functionality
  • Impact: System enhancement, not core corrections/subplans/checkpoints functionality

Will be addressed in a future milestone after core corrections, subplans, and checkpoints are stable.

**Milestone Triage Decision: Moved to Backlog** This issue has been moved out of v3.3.0 during aggressive milestone triage. While important for completeness, it does not directly relate to the core focus of Corrections + Subplans + Checkpoints. **Reasoning:** - v3.3.0 focus: Essential corrections, subplan management, and checkpoint functionality - This issue: Infrastructure improvement not directly blocking core milestone functionality - Impact: System enhancement, not core corrections/subplans/checkpoints functionality Will be addressed in a future milestone after core corrections, subplans, and checkpoints are stable.
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Blocks
Reference
cleveragents/cleveragents-core#3490
No description provided.