BUG-HUNT: [consistency] `PlanGenerationGraph._validate` always passes — `len(all_code) > 10` makes `is_valid` permanently True, rendering retry logic dead code #6554


Open

opened 2026-04-09 21:18:33 +00:00 by HAL9000 · 1 comment

HAL9000 commented

2026-04-09 21:18:33 +00:00

Owner

Bug Report: [consistency] — `_validate` short-circuit condition `len(all_code) > 10` makes validation always succeed

Severity Assessment

Impact: The validation step — and by extension the entire retry loop — is effectively disabled for any non-trivial generated code. The LLM's "FAIL" verdict is always overridden. This means the max_retries parameter has no practical effect, bad code is never retried, and the validation_result.status returned to callers is always "PASS", masking generation failures.
Likelihood: High — any generated code over 10 characters (virtually all real output) triggers this
Priority: High

Location

File: src/cleveragents/agents/graphs/plan_generation.py
Class: PlanGenerationGraph
Method: _validate
Lines: 358–363

Description

The validation logic has a short-circuit OR condition:

# plan_generation.py lines 358-363
result = chain.invoke({"generated_code": all_code})
validation = str(result)

# Simple validation check (in real implementation, parse the LLM response)
is_valid = "PASS" in validation.upper() or len(all_code) > 10  # BUG

The comment "in real implementation, parse the LLM response" acknowledges this is incomplete — but the placeholder condition len(all_code) > 10 always evaluates to True for any real code, so:

If LLM says "PASS" → True or True = True (correct but vacuous)
If LLM says "FAIL" → False or True = True (BUG: FAIL overridden)
If LLM call fails → exception caught, returns {"status": "FAIL", ...} (correct)

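The retry conditional _should_retry checks for "FAIL" status, but whenever the LLM call returns, _validate sets status to "PASS", so the retry path (retry -> analyze_requirements) is unreachable in normal operation. Only the exception path still reports "FAIL". For every validation call that completes, the max_retries mechanism is dead code.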
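The cases above can be reproduced in isolation. The following is a minimal sketch, not the project's code: `current_check` copies only the boolean expression quoted from `_validate`, and `run_with_retries` is a hypothetical stand-in for the retry loop, written here purely to show that a second attempt is never made.

```python
def current_check(validation: str, all_code: str) -> bool:
    """Mirror of the expression quoted from _validate (the buggy version)."""
    return "PASS" in validation.upper() or len(all_code) > 10


def run_with_retries(verdicts, all_code, max_retries=3):
    """Hypothetical retry loop: one verdict per attempt, stop once valid."""
    attempts = 0
    for verdict in verdicts:
        attempts += 1
        if current_check(verdict, all_code) or attempts > max_retries:
            break
    return attempts


# Deliberately wrong code, but longer than 10 characters.
code = "def add(a, b):\n    return a - b\n"

print(current_check("PASS", code))                       # True
print(current_check("FAIL: add() subtracts", code))      # True, verdict overridden
print(current_check("FAIL", "x = 1"))                    # False, code is only 5 chars
print(run_with_retries(["FAIL", "FAIL", "PASS"], code))  # 1, never retried
```

The only way to get `False` out of the expression is to pair a non-PASS verdict with generated code of 10 characters or fewer, which real output essentially never is.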

Expected Behavior

is_valid should be determined solely by the LLM's response:

is_valid = "PASS" in validation.upper() and "FAIL" not in validation.upper()

Or better, parse structured JSON output from the LLM if the validation prompt is updated to produce it.
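As a rough illustration of the structured-output option, the sketch below assumes the validation prompt were changed to return a JSON object with a `status` field. That schema, the `parse_verdict` helper, and the `substring_check` name are all hypothetical and do not exist in the codebase. It also shows one way plain substring matching can misfire even after the `len(all_code) > 10` clause is removed.

```python
import json


def substring_check(validation: str) -> bool:
    """The string-based check proposed above."""
    upper = validation.upper()
    return "PASS" in upper and "FAIL" not in upper


def parse_verdict(raw: str) -> bool:
    """Return True only for an explicit, well-formed PASS verdict.

    Assumes (hypothetically) that the LLM replies with JSON such as
    {"status": "PASS"} or {"status": "FAIL", "reason": "..."}.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return False  # unparseable output counts as a failed validation
    if not isinstance(payload, dict):
        return False
    return str(payload.get("status", "")).strip().upper() == "PASS"


# "BYPASSED" contains the substring "PASS", so free-text matching is fooled.
print(substring_check("Input checks are bypassed"))       # True
print(parse_verdict("Input checks are bypassed"))         # False
print(parse_verdict('{"status": "PASS"}'))                # True
print(parse_verdict('{"status": "FAIL", "reason": "x"}')) # False
```

Treating malformed output as a failure is a design choice in this sketch: it sends ambiguous responses down the retry path instead of silently accepting them. The helper does not strip Markdown fences or surrounding prose that a model might wrap around the JSON.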

Actual Behavior

is_valid is always True for any generated code longer than 10 characters, making validation a no-op and the retry loop permanently bypassed.

Category

consistency / spec-alignment

TDD Note

After this bug is verified, a Type/Testing issue will be created with a TDD test tagged @tdd_issue, @tdd_issue_<this-issue-number>, and @tdd_expected_fail to prove the bug exists before fixing it.

Automated by CleverAgents Bot
Supervisor: Bug Hunting | Agent: bug-hunter

HAL9000 added labels

2026-04-09 21:26:24 +00:00

HAL9000 added this to the v3.2.0 milestone

2026-04-09 21:28:12 +00:00

HAL9000 commented

2026-04-17 08:07:24 +00:00

Author

Owner

✅ Verified — Critical consistency bug. Validation always passes regardless of actual code quality, making retry logic dead code. This means invalid plans are never retried. MoSCoW: Must Have — plan validation is a core acceptance criterion for v3.2.0.

Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor


HAL9000 added and removed labels

2026-04-18 08:26:51 +00:00