fix(agents/graphs/plan_generation): BDD tests and docs for validation bypass #10480 #11144

Closed
freemo wants to merge 0 commits from fix/10480-validation-bypass-fix into master
Owner

Summary

This PR completes all mandatory PR compliance items for the plan generation validation bypass bug fix (issue #10480). The code fix itself was already merged to master in commit d1328e562.

What changed and why

The original implementation of PlanGenerationGraph._validate() contained a logic error that bypassed LLM-based validation for any generated code longer than 10 characters:

is_valid = "PASS" in validation.upper() or len(all_code) > 10

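Because virtually all real-world generated code exceeds 10 characters, the length clause forced is_valid to True regardless of the LLM's verdict, which made the entire validation step meaningless.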
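To make the effect of the bug concrete, here is a minimal sketch that isolates the expression above into two standalone functions. The function names `is_valid_buggy` and `is_valid_fixed`, the sample verdict, and the sample code are assumptions made for illustration; the real logic lives inside PlanGenerationGraph._validate(), whose surrounding code is not shown in this PR.

```python
def is_valid_buggy(validation: str, all_code: str) -> bool:
    # Original expression: the length clause is true for almost any real code,
    # so the LLM verdict is effectively ignored.
    return "PASS" in validation.upper() or len(all_code) > 10


def is_valid_fixed(validation: str, all_code: str) -> bool:
    # Fixed expression as described in this PR: only the LLM verdict decides.
    # all_code is kept in the signature purely for a side-by-side comparison.
    return "PASS" in validation.upper()


# Hypothetical inputs: an LLM verdict that rejects the code, and code > 10 chars.
verdict = "FAIL: missing error handling"
code = "def add(a, b):\n    return a + b\n"

buggy_result = is_valid_buggy(verdict, code)
fixed_result = is_valid_fixed(verdict, code)

print(buggy_result)  # True  (rejected code is accepted anyway)
print(fixed_result)  # False (the FAIL verdict is respected)
```

With the length clause present, a FAIL verdict is only honored when the generated code is 10 characters or shorter, which is why the scenarios listed later distinguish short and long code.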

The code fix (removing the or len(all_code) > 10 clause) is already on master. This PR adds the mandatory compliance artifacts:

Compliance items completed

  • BDD/Behave tests - 5 scenarios in features/validation_bypass_issue_10480.feature:
    • LLM returns FAIL for short code → validation rejects
    • LLM returns FAIL for long code → validation rejects
    • LLM returns PASS → validation accepts
    • LLM returns FAIL with details → validation rejects with full feedback preserved
    • Code length check does not auto-pass any LLM verdict (regression guard)
  • CHANGELOG.md - Entry added under [Unreleased] section
  • CONTRIBUTORS.md - Contribution credit added for Jeffrey Phillips Freeman
  • Commit footer - ISSUES CLOSED: #10480
  • Epic reference - m3/epic-v3.2.0 (Decisions + Validations + Invariants)
  • Labels - State/In Review
  • Milestone - v3.2.0 (milestone id 105)
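The five BDD scenarios listed above can be approximated as a table-driven check in plain Python. This is only a sketch under stated assumptions: the actual feature file and its Behave step definitions are not shown here, and `ValidationResult`, `validate`, the sample verdict strings, and the sample code are hypothetical stand-ins for PlanGenerationGraph._validate() and a stubbed LLM response.

```python
from dataclasses import dataclass


@dataclass
class ValidationResult:
    is_valid: bool
    feedback: str


def validate(llm_verdict: str, all_code: str) -> ValidationResult:
    # Hypothetical stand-in for the fixed _validate(): the verdict alone decides,
    # and the LLM's text is passed through unchanged as feedback.
    return ValidationResult("PASS" in llm_verdict.upper(), llm_verdict)


SHORT_CODE = "x = 1"                             # 10 characters or fewer
LONG_CODE = "def f():\n    return 42\n" * 20     # well over 10 characters

# (scenario name, stubbed LLM verdict, generated code, expected is_valid)
SCENARIOS = [
    ("FAIL for short code", "FAIL", SHORT_CODE, False),
    ("FAIL for long code", "FAIL", LONG_CODE, False),
    ("PASS", "PASS", LONG_CODE, True),
    ("FAIL with details", "FAIL: undefined name 'y' on line 3", LONG_CODE, False),
    ("length does not auto-pass", "FAIL", "a" * 10_000, False),
]


def run_scenarios():
    results = []
    for name, verdict, code, expected in SCENARIOS:
        result = validate(verdict, code)
        assert result.is_valid is expected, name
        # Feedback must be preserved in full, including failure details.
        assert result.feedback == verdict, name
        results.append((name, result.is_valid))
    return results


scenario_results = run_scenarios()
```

Under the original buggy expression, every row except the first would evaluate to accepted, so rows two, four, and five act as the regression guard.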

Closes #10480
This PR blocks issue #10480

freemo added this to the v3.2.0 milestone 2026-05-12 06:21:37 +00:00
hurui200320 closed this pull request 2026-05-12 07:22:04 +00:00
Some checks failed
CI / push-validation (push) Successful in 49s
CI / benchmark-regression (push) Failing after 1m36s
CI / benchmark-publish (push) Successful in 1h20m14s
CI / tdd_quality_gate (push) Has been skipped
CI / e2e_tests (push) Successful in 4m55s
CI / unit_tests (push) Successful in 8m57s (Required)
CI / lint (push) Failing after 26m11s (Required)
CI / integration_tests (push) Failing after 4m56s (Required)
CI / push-validation (pull_request) Successful in 48s
CI / tdd_quality_gate (pull_request) Failing after 1m19s
CI / integration_tests (pull_request) Failing after 4m36s (Required)
CI / e2e_tests (pull_request) Successful in 5m16s
CI / unit_tests (pull_request) Successful in 6m41s (Required)
CI / helm (push) Failing after 16m22s
CI / security (push) Failing after 16m52s (Required)
CI / quality (push) Failing after 16m52s (Required)
CI / build (push) Failing after 16m51s (Required)
CI / typecheck (push) Failing after 17m2s (Required)
CI / docker (push) Has been skipped (Required)
CI / coverage (push) Has been skipped (Required)
CI / status-check (push) Failing after 3s
CI / helm (pull_request) Failing after 16m0s
CI / build (pull_request) Failing after 16m31s (Required)
CI / security (pull_request) Failing after 16m38s (Required)
CI / lint (pull_request) Failing after 16m38s (Required)
CI / typecheck (pull_request) Failing after 16m41s (Required)
CI / quality (pull_request) Failing after 16m43s (Required)
CI / coverage (pull_request) Has been skipped (Required)
CI / docker (pull_request) Has been skipped (Required)
CI / status-check (pull_request) Failing after 3s

Pull request closed


Reference
cleveragents/cleveragents-core!11144