UAT: ValidationPipeline._build_summary does not treat validation errors (timeouts, exceptions) as required failures — violates spec "Validation Error vs. Validation Failure" semantics #1940

Open
opened 2026-04-03 00:19:13 +00:00 by freemo · 2 comments
Owner

Metadata

  • Branch: fix/validation-pipeline-error-as-required-failure
  • Commit Message: fix(validation): treat validation errors as required failures in _build_summary regardless of mode
  • Milestone: v3.2.0
  • Parent Epic: #357

Bug Report

Feature Area: Decision and Validation System (v3.4.0)

What Was Tested

The ValidationPipeline._build_summary() method's handling of validation errors (runtime exceptions, timeouts, malformed returns) for informational-mode validations.

Expected Behavior (from spec)

Per docs/specification.md lines 22549-22558 ("Validation Error vs. Validation Failure"):

| Condition | Meaning | Handling |
| :-------- | :------ | :------- |
| Validation failure (passed: false) | Validation ran successfully, condition not met | Mode-specific (fix loop for required, record for informational) |
| Validation error | Validation itself failed to execute (runtime exception, timeout, malformed return) | Always treated as required failure regardless of mode |

When a validation error occurs (timeout, runtime exception, malformed return), it should be counted as a required_failed regardless of the validation's mode field, causing all_required_passed to be False.
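
The distinction in the spec table can be sketched as a small classification function. This is only an illustration: `Result` and `handling_for` are hypothetical stand-ins, not the project's types, and the field names (`mode`, `passed`, `error`, `timed_out`) are taken from the attributes referenced in this issue.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ValidationMode(Enum):
    REQUIRED = "required"
    INFORMATIONAL = "informational"


@dataclass
class Result:
    # Hypothetical stand-in for the pipeline's per-validation result.
    mode: ValidationMode
    passed: bool
    error: Optional[str] = None
    timed_out: bool = False


def handling_for(r: Result) -> str:
    """Return how a result is handled under the quoted spec table."""
    if r.error is not None or r.timed_out:
        # Validation error: the validation itself did not execute cleanly,
        # so the mode is ignored.
        return "required failure"
    if r.passed:
        return "passed"
    # Validation failure: the validation ran, the condition was not met.
    if r.mode is ValidationMode.REQUIRED:
        return "required failure"
    return "informational failure"
```

Under this reading, an informational validation that times out maps to "required failure", while an informational validation that runs and reports `passed: false` maps to "informational failure".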

Actual Behavior

The ValidationPipeline._build_summary() method only checks r.mode and r.passed, ignoring whether the result is a validation error (r.error is not None or r.timed_out is True):

# Current implementation (INCORRECT):
for r in results:
    if r.mode == ValidationMode.REQUIRED:
        if r.passed:
            required_passed += 1
        else:
            required_failed += 1
    else:
        if r.passed:
            informational_passed += 1
        else:
            informational_failed += 1  # BUG: validation errors counted here, not as required_failed

Reproduction:

from cleveragents.application.services.validation_pipeline import ValidationPipeline, ValidationCommand, ValidationMode
import time

cmd_info = ValidationCommand(
    validation_name='local/info-check',
    resource_id='01RESOURCE00000000000000001',
    resource_name='my-repo',
    mode=ValidationMode.INFORMATIONAL,
    arguments={},
    timeout_seconds=0.1,  # Very short timeout
)

def slow_executor(name, args):
    time.sleep(1.0)  # Longer than timeout
    return {'passed': True, 'message': 'ok'}

pipeline = ValidationPipeline(commands=[cmd_info], executor=slow_executor)
summary = pipeline.run()

print(f'timed_out={summary.results[0].timed_out}')  # True
print(f'all_required_passed={summary.all_required_passed}')  # True (WRONG - should be False)

The same issue applies to runtime exceptions in informational validations.
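
For reference, the three kinds of validation error named in the spec (timeout, runtime exception, malformed return) can be reproduced in isolation. The sketch below does not use the real `ValidationPipeline`. `Outcome` and `run_with_timeout` are hypothetical names, and the thread-based timeout is an assumption about one possible mechanism, not a description of how the pipeline enforces `timeout_seconds`.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Outcome:
    # Hypothetical stand-in; field names mirror those referenced in this issue.
    passed: bool
    error: Optional[str] = None
    timed_out: bool = False


def run_with_timeout(executor: Callable[[], Any], timeout_seconds: float) -> Outcome:
    """Run an executor and map each kind of validation error to an Outcome."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(executor)
    try:
        raw = future.result(timeout=timeout_seconds)
    except FutureTimeout:
        return Outcome(passed=False, error="timeout", timed_out=True)
    except Exception as exc:  # runtime exception raised by the executor
        return Outcome(passed=False, error=f"{type(exc).__name__}: {exc}")
    finally:
        # Do not block on a still-running executor after a timeout.
        pool.shutdown(wait=False)
    if not isinstance(raw, dict) or not isinstance(raw.get("passed"), bool):
        return Outcome(passed=False, error="malformed return")
    return Outcome(passed=raw["passed"])


def slow_executor() -> dict:
    time.sleep(0.3)  # longer than the timeout used below
    return {"passed": True, "message": "ok"}


def raising_executor() -> dict:
    raise RuntimeError("executor crashed")


def malformed_executor() -> str:
    return "ok"  # not a dict with a boolean 'passed' key


timeout_case = run_with_timeout(slow_executor, timeout_seconds=0.05)
exception_case = run_with_timeout(raising_executor, timeout_seconds=1.0)
malformed_case = run_with_timeout(malformed_executor, timeout_seconds=1.0)
```

In all three cases `error` is set and `passed` is `False`, which is the condition the summary needs to detect independently of the validation's mode.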

Code Location

  • src/cleveragents/application/services/validation_pipeline.py: _build_summary() method (lines ~440-460)

Expected Fix

The _build_summary() method should treat validation errors (where r.error is not None or r.timed_out is True) as required failures regardless of mode:

# Corrected implementation:
for r in results:
    is_validation_error = r.error is not None or r.timed_out
    if r.mode == ValidationMode.REQUIRED or is_validation_error:
        if r.passed:
            required_passed += 1
        else:
            required_failed += 1
    else:
        if r.passed:
            informational_passed += 1
        else:
            informational_failed += 1
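
A self-contained variant of the corrected loop above, runnable without the project installed, might look like the following. `Result`, `Summary`, and `build_summary` are hypothetical stand-ins, and deriving `all_required_passed` from `required_failed == 0` is an assumption about the real summary type. One deliberate difference from the snippet above: an errored result is counted as `required_failed` even if its `passed` flag happens to be `True`. Whether that combination can occur in the real pipeline is not stated in this issue.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class ValidationMode(Enum):
    REQUIRED = "required"
    INFORMATIONAL = "informational"


@dataclass
class Result:
    # Hypothetical stand-in for the pipeline's per-validation result.
    mode: ValidationMode
    passed: bool
    error: Optional[str] = None
    timed_out: bool = False


@dataclass
class Summary:
    required_passed: int = 0
    required_failed: int = 0
    informational_passed: int = 0
    informational_failed: int = 0

    @property
    def all_required_passed(self) -> bool:
        # Assumption: the flag is derived from the required_failed counter.
        return self.required_failed == 0


def build_summary(results: Iterable[Result]) -> Summary:
    s = Summary()
    for r in results:
        if r.error is not None or r.timed_out:
            # Validation error: always a required failure, regardless of mode.
            s.required_failed += 1
        elif r.mode is ValidationMode.REQUIRED:
            if r.passed:
                s.required_passed += 1
            else:
                s.required_failed += 1
        elif r.passed:
            s.informational_passed += 1
        else:
            s.informational_failed += 1
    return s


summary = build_summary([
    Result(ValidationMode.REQUIRED, passed=True),
    Result(ValidationMode.INFORMATIONAL, passed=False),  # ordinary failure
    Result(ValidationMode.INFORMATIONAL, passed=False, timed_out=True),  # error
])
```

With these three inputs the ordinary informational failure stays in `informational_failed`, while the timed-out informational validation lands in `required_failed` and makes `all_required_passed` false.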

Note

The FixThenRevalidateOrchestrator correctly handles this case in its fix loop (treating informational validation errors as required failures), but the ValidationPipeline itself does not enforce this invariant.

Subtasks

  • Write a TDD issue-capture Behave scenario (tagged @tdd_expected_fail) that captures the bug: an informational validation that times out must result in all_required_passed=False
  • Write a TDD issue-capture Behave scenario (tagged @tdd_expected_fail) that captures the bug: an informational validation that raises a runtime exception must result in all_required_passed=False
  • Write a TDD issue-capture Behave scenario (tagged @tdd_expected_fail) that captures the bug: an informational validation with a malformed return must result in all_required_passed=False
  • Fix ValidationPipeline._build_summary() to check r.error is not None or r.timed_out and count such results as required_failed regardless of r.mode
  • Remove @tdd_expected_fail tags and confirm all new scenarios pass
  • Verify existing Behave scenarios for ValidationPipeline still pass (no regressions)
  • Run nox -e typecheck — confirm no Pyright errors introduced
  • Run nox -e coverage_report — confirm coverage remains >= 97%
  • Run full nox suite — confirm all stages pass

Definition of Done

  • All subtasks above are completed
  • ValidationPipeline._build_summary() correctly counts validation errors (timeouts, exceptions, malformed returns) as required_failed regardless of the validation's mode
  • all_required_passed is False whenever any validation error occurs, regardless of mode
  • New Behave scenarios cover timeout, runtime exception, and malformed-return error cases for informational-mode validations
  • No regressions in existing ValidationPipeline or FixThenRevalidateOrchestrator tests
  • Commit fix(validation): treat validation errors as required failures in _build_summary regardless of mode pushed to branch fix/validation-pipeline-error-as-required-failure
  • PR merged into main with at least 2 approving reviews
  • All nox stages pass
  • Coverage >= 97%

Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: ca-new-issue-creator

freemo added this to the v3.2.0 milestone 2026-04-03 00:19:38 +00:00
Author
Owner

Issue triaged by project owner:

  • State: Verified
  • MoSCoW: MoSCoW/Should Have — bug or error handling improvement.

Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: ca-project-owner

freemo removed this from the v3.2.0 milestone 2026-04-06 22:31:20 +00:00
Author
Owner

This issue has been moved to the backlog as part of an aggressive grooming of the v3.2.0 milestone. It has been deemed non-critical for the minimal viability of the milestone and will be addressed in a future release.

Reference
cleveragents/cleveragents-core#1940