UAT: DefaultValidationRunner uses text-matching heuristic instead of actual tool execution pipeline — apply-phase validation is non-functional #5549

Open
opened 2026-04-09 07:22:05 +00:00 by HAL9000 · 0 comments
Owner

Bug Report

Feature Area: Validation Runner — Apply-Phase Validation Execution

Severity: Medium — the apply-phase validation gate (ApplyValidationGate) uses a stub runner that doesn't actually invoke validation tools, making apply-phase validation non-functional


What Was Tested

Code-level analysis of DefaultValidationRunner in src/cleveragents/application/services/validation_apply.py (lines 260–314) against the validation execution requirements in docs/specification.md.

Expected Behavior (from spec)

The spec (Core Concepts > Validation > Validation Lifecycle in Plan Execution) states that validations are executed as standard tool calls. The ApplyValidationGate is used to gate the apply phase based on validation results from the Execute phase. The validation runner should invoke the actual validation tool implementation.

Actual Behavior (from code)

DefaultValidationRunner in validation_apply.py lines 260–314 uses a text-matching heuristic instead of actual tool execution:

class DefaultValidationRunner(ValidationRunner):
    """Default validation runner using text matching.

    For each attachment, checks if the validation name appears in the
    context values. This is a simple default for development and testing.
    Production implementations should use the full tool execution pipeline.
    """

    def run_validation(self, attachment, context):
        context_text = " ".join(f"{k}={v}" for k, v in context.items()).lower()
        val_name = attachment.validation_name.lower()
        # Simple heuristic: check if validation name is referenced in context
        passed = val_name in context_text
        ...

This implementation:

  1. Does NOT invoke the actual validation tool
  2. Determines pass/fail by a case-insensitive check of whether the validation name appears as a substring of the joined key=value text of the context dict (keys as well as values)
  3. Is documented as "a placeholder — real implementations invoke the actual validation tool"

The ApplyValidationGate is wired with this stub runner in the container, and container.py line 815 passes validation_pipeline=None in the FixThenRevalidateOrchestrator factory, meaning the apply-phase validation gate is effectively non-functional.

Evidence

  • src/cleveragents/application/services/validation_apply.py:260-314: DefaultValidationRunner stub implementation
  • src/cleveragents/application/container.py:815: validation_pipeline=None in FixThenRevalidateOrchestrator factory
  • The docstring explicitly says: "Production implementations should use the full tool execution pipeline"

Impact

The apply-phase validation gate (ApplyValidationGate) does not actually run validation tools. It uses a text-matching heuristic whose outcome is unrelated to what the validation tool would report, so it can produce incorrect results in production. This means:

  1. Required validations may incorrectly pass (if the validation name happens to appear in context)
  2. Required validations may incorrectly fail (if the validation name doesn't appear in context)
  3. The apply gate is not a reliable safety mechanism
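
The first two failure modes can be reproduced with a standalone copy of the substring check quoted under Actual Behavior. This is a minimal sketch: the function name `heuristic_passes` and the sample context dicts are made up for illustration and are not taken from the codebase.

```python
def heuristic_passes(validation_name: str, context: dict) -> bool:
    """Standalone copy of the substring check from DefaultValidationRunner."""
    context_text = " ".join(f"{k}={v}" for k, v in context.items()).lower()
    return validation_name.lower() in context_text


# False pass: the name appears only in a message that reports a failure.
false_pass = heuristic_passes("lint", {"log": "lint failed with 12 errors"})

# False fail: nothing in the context mentions the validation by name,
# regardless of what the tool itself would have reported.
false_fail = heuristic_passes("unit-tests", {"changed_files": "src/app.py"})

# Keys count too, because the joined text has the form "key=value".
key_match = heuristic_passes("lint", {"lint_enabled": False})

print(false_pass, false_fail, key_match)  # True False True
```

In none of the three cases does the result depend on running a tool, which is why the gate cannot be relied on as a safety mechanism.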

Note: The execute-phase validation pipeline (ValidationPipeline) uses a proper executor callable that is injected — this is the correct pattern. The apply-phase DefaultValidationRunner needs to be replaced with a real implementation that delegates to the tool execution pipeline.

Fix Required

  1. Implement a ToolExecutionValidationRunner that delegates to the ToolRunner or ValidationPipeline executor to actually invoke validation tools
  2. Wire this real runner into ApplyValidationGate in the DI container
  3. Keep DefaultValidationRunner only as a test double, moved to features/mocks/
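
One possible shape for step 1 is sketched below, assuming the runner receives an injected executor callable in the same spirit as the execute-phase ValidationPipeline. Apart from `run_validation(attachment, context)` and `validation_name`, everything here is hypothetical: the `Attachment` and `ValidationOutcome` types, the executor signature, and the `"passed"`/`"detail"` result keys are illustrative stand-ins, since the issue does not show the real interfaces.

```python
from dataclasses import dataclass
from typing import Any, Callable, Mapping


@dataclass(frozen=True)
class Attachment:
    """Hypothetical stand-in; only validation_name appears in the issue."""
    validation_name: str


@dataclass(frozen=True)
class ValidationOutcome:
    """Hypothetical result type; the real return type is not shown."""
    validation_name: str
    passed: bool
    detail: str = ""


# Assumed executor shape: (tool name, context) -> mapping with a "passed" key.
Executor = Callable[[str, Mapping[str, Any]], Mapping[str, Any]]


class ToolExecutionValidationRunner:
    """Runs each validation by calling an injected executor."""

    def __init__(self, executor: Executor) -> None:
        self._executor = executor

    def run_validation(
        self, attachment: Attachment, context: Mapping[str, Any]
    ) -> ValidationOutcome:
        name = attachment.validation_name
        try:
            result = self._executor(name, context)
        except Exception as exc:
            # Fail closed: an executor error must never pass the gate.
            return ValidationOutcome(name, False, f"executor error: {exc}")
        # Only an explicit True counts as a pass.
        passed = result.get("passed") is True
        return ValidationOutcome(name, passed, str(result.get("detail", "")))


def fake_executor(name: str, context: Mapping[str, Any]) -> Mapping[str, Any]:
    """Test executor standing in for the real tool execution pipeline."""
    if name == "lint":
        return {"passed": False, "detail": "12 errors"}
    if name == "broken":
        raise RuntimeError("tool not found")
    return {"passed": True}


runner = ToolExecutionValidationRunner(fake_executor)
lint = runner.run_validation(Attachment("lint"), {"log": "lint passed"})
broken = runner.run_validation(Attachment("broken"), {})
ok = runner.run_validation(Attachment("unit-tests"), {})
```

Unlike the text-matching runner, the `lint` result here is a failure even though the context contains "lint passed", because the outcome comes from the executor rather than from the context text. Failing closed on executor errors is a design choice of this sketch, not something the issue prescribes.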

Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: uat-tester

Reference
cleveragents/cleveragents-core#5549