Proposal: improve implementation-worker — add flaky integration_tests detection and master CI health check before merging #5323

Open
opened 2026-04-09 05:50:39 +00:00 by HAL9000 · 0 comments
Owner

Agent Improvement Proposal

Pattern Detected

Type: workflow_fix
Affected Agent: implementation-worker
Evidence: The integration_tests check on master CI failed 3 times between 2026-04-08 and 2026-04-09 (commits a33b6caa, 7a37f02a, b72b8275), each time blocking ALL PR merges for 60-120+ minutes. The pattern suggests that merged PRs are introducing flaky or broken integration tests, and the implementation-worker has no guidance on how to detect or handle this.

Detailed Evidence

From watchdog reports and tracking issues:

| Commit | Time | Status | Duration Blocked |
|--------|------|--------|------------------|
| a33b6caa | 2026-04-08T23:36 | integration_tests FAILING (4m19s) | ~120 min |
| 7a37f02a | 2026-04-09T02:44 | integration_tests FAILING (6m54s) | ~60 min |
| 0bd8fbb2 | 2026-04-09T04:41 | integration_tests PASSING | Fixed by PR #5264 |
| b72b8275 | 2026-04-09T05:07 | integration_tests FAILING (6m28s) | Ongoing |

Pattern: The integration_tests failure is recurring. Each failure blocks ALL open PRs from merging, because branch protection requires CI / status-check to pass. Recovery requires a human to create a dedicated CI fix PR.

Root cause: The implementation-worker agent does not:

  1. Check master CI status before merging a PR
  2. Detect when its own PR might introduce flaky integration tests
  3. Have guidance on what to do when master CI is already failing

Proposed Change

Add two improvements to implementation-worker.md:

1. Pre-merge master CI health check: Before attempting to merge any PR, the worker should verify that master CI is currently passing. If master CI is failing, the worker should NOT attempt to merge (it will fail anyway due to branch protection) and should instead:

  • Log that master CI is failing
  • Skip the merge attempt
  • Continue to the next task
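
The pre-merge check above could be sketched roughly as follows. This is a minimal illustration, not the worker's actual implementation: the function names and `REQUIRED_CHECKS` are hypothetical, and the code assumes each commit-status entry is a mapping with `"context"` and `"status"` keys, listed newest first. Those payload details should be verified against the Forgejo instance in use. Checks that are pending or missing are treated as "not green", which matches the rule of merging only when all required checks pass.

```python
from __future__ import annotations

from typing import Iterable, Mapping

# Hypothetical: the real list comes from the branch-protection settings.
# "CI / status-check" is the required check named in this proposal.
REQUIRED_CHECKS = frozenset({"CI / status-check"})


def latest_state_by_context(statuses: Iterable[Mapping[str, str]]) -> dict[str, str]:
    """Keep only the newest state per check context.

    Assumes `statuses` is ordered newest first and that each entry carries
    "context" and "status" keys (an assumption about the payload shape).
    """
    latest: dict[str, str] = {}
    for entry in statuses:
        # setdefault keeps the first (newest) state seen for each context.
        latest.setdefault(entry["context"], entry["status"])
    return latest


def should_attempt_merge(
    statuses: Iterable[Mapping[str, str]],
    required: Iterable[str] = REQUIRED_CHECKS,
) -> tuple[bool, str]:
    """Return (ok, log_message) for the pre-merge master CI gate."""
    latest = latest_state_by_context(statuses)
    # Anything other than "success" (failure, pending, missing) blocks the merge.
    not_green = sorted(c for c in required if latest.get(c) != "success")
    if not_green:
        return False, (
            "Master CI not green (" + ", ".join(not_green) + "); "
            "skipping merge, will retry when CI is green"
        )
    return True, "Master CI green; proceeding with merge"
```

A worker would call `should_attempt_merge` with the statuses fetched for the current master SHA, log the returned message, and move on to the next task when the first element is `False`.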

2. Post-merge CI monitoring: After merging a PR, the worker should monitor master CI for up to 10 minutes to verify the merge didn't break integration_tests. If integration_tests fails after the merge, the worker should:

  • Create a Priority/CI-Blocker issue documenting the failure
  • Include the commit SHA and failing test details
  • Tag it for immediate attention
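
As a rough sketch of what the CI-Blocker issue could contain, the helper below assembles a title and body from the merge details. The function name and the `label_names` key are hypothetical. The issue-creation endpoint may expect numeric label IDs rather than names, so a name-to-ID lookup would probably be needed before sending the request; that step is omitted here.

```python
from __future__ import annotations


def build_ci_blocker_issue(
    pr_number: int,
    commit_sha: str,
    failing_tests: list[str],
) -> dict:
    """Assemble the fields for a Priority/CI-Blocker issue (illustrative only)."""
    title = (
        "fix(ci): integration_tests failing on master "
        f"after merge of PR #{pr_number}"
    )
    # List each failing test on its own line, or note that none were extracted.
    if failing_tests:
        test_lines = "\n".join(f"- {name}" for name in failing_tests)
    else:
        test_lines = "- (no failing test names could be extracted from the CI logs)"
    body = (
        f"integration_tests failed on master after PR #{pr_number} was merged.\n\n"
        f"Commit: {commit_sha}\n\n"
        f"Failing tests:\n{test_lines}\n"
    )
    # "label_names" is a placeholder key, not an API field name.
    return {"title": title, "body": body, "label_names": ["Priority/CI-Blocker"]}
```

Keeping the payload construction separate from the HTTP call makes the wording easy to review and test without touching the tracker.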

Add this guidance to the "Merge PR" section of implementation-worker.md:

### Pre-Merge Master CI Check (CRITICAL)

Before attempting to merge, verify master CI is healthy:

1. Check master branch CI status via Forgejo API:
   GET /repos/{owner}/{repo}/commits/{master_sha}/statuses

2. If ANY required check is failing on master:
   - DO NOT attempt to merge (will fail due to branch protection)
   - Log: "Master CI failing — skipping merge, will retry when CI is green"
   - Move to next task
   - Check again in next cycle

3. Only proceed with merge if ALL required checks pass on master

### Post-Merge Integration Test Monitoring

After a successful merge, monitor for 10 minutes:

1. Get the new master SHA after merge
2. Poll CI status every 2 minutes for up to 10 minutes
3. If integration_tests fail:
   - Create a Priority/CI-Blocker issue:
     Title: "fix(ci): integration_tests failing on master after merge of PR #N"
     Body: Include commit SHA, failing test names from CI logs
   - This alerts the system to the regression immediately
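
The post-merge polling schedule described above (every 2 minutes for up to 10 minutes) could look something like the sketch below. It assumes a caller-supplied `fetch_state` callable that returns the current integration_tests state for the new master SHA as a string such as `"success"`, `"failure"`, `"error"`, or `"pending"`, or `None` when no status has been reported yet. The function name and those state strings are assumptions, not part of the proposal.

```python
from __future__ import annotations

import time
from typing import Callable, Optional

POLL_INTERVAL_S = 120  # poll every 2 minutes
MAX_WAIT_S = 600       # give up after 10 minutes


def monitor_integration_tests(
    fetch_state: Callable[[], Optional[str]],
    sleep: Callable[[float], None] = time.sleep,
    interval_s: int = POLL_INTERVAL_S,
    max_wait_s: int = MAX_WAIT_S,
) -> str:
    """Poll until integration_tests settles; return "success", "failure", or "timeout"."""
    waited = 0
    while True:
        state = fetch_state()
        if state == "success":
            return "success"
        if state in ("failure", "error"):
            # Caller should now create the Priority/CI-Blocker issue.
            return "failure"
        if waited >= max_wait_s:
            # Still pending or unreported after the full window.
            return "timeout"
        sleep(interval_s)
        waited += interval_s
```

Passing `sleep` as a parameter keeps the loop testable without real delays. A `"timeout"` result is deliberately distinct from `"failure"`, so a slow CI run does not open a spurious CI-Blocker issue.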

Expected Impact

  • Fewer master CI failures: Workers won't merge PRs when master is already broken
  • Faster detection: Post-merge monitoring catches regressions within 10 minutes instead of waiting for the next watchdog cycle
  • Reduced blocked time: Instead of going unnoticed during 60-120 minute outages, regressions are reported within about 10 minutes of the merge, so a fix can start sooner
  • Self-healing: The CI-Blocker issue creation triggers the implementation pool to prioritize fixing the regression

Risk Assessment

  • Low risk: The pre-merge check is purely defensive — it prevents merges that would fail anyway due to branch protection. No code changes, only merge timing changes.
  • Medium risk: Post-merge monitoring adds up to 10 minutes of wait time per merge, which could slow down the implementation pipeline. Mitigation: monitor only integration_tests (the flakiest check), not all checks.
  • Potential concern: If master CI is persistently failing, workers will never merge. However, this is the correct behavior — merging into a broken master makes things worse.

This is a proposal from the agent evolver. A human must approve this issue before the change will be implemented. To approve: remove the needs feedback label, add State/Verified, or comment with approval.


Automated by CleverAgents Bot
Supervisor: Agent Evolver | Agent: agent-evolver

Reference
cleveragents/cleveragents-core#5323