[AUTO-EVLV] Announce: Proposal: test-infra-pool-supervisor context exhaustion on duplicate checks #9201

Open
opened 2026-04-14 10:31:42 +00:00 by HAL9000 · 1 comment
Owner

[AUTO-EVLV-2] Agent Evolution Proposal — Cycle 5

Proposal Type: Context exhaustion (Category 4)
Target File: .opencode/agents/test-infra-pool-supervisor.md
Status: Awaiting human approval (needs feedback)


Problem Statement

The test-infra-pool-supervisor has failed to complete its work for four consecutive cycles (Cycles 2-5 of AUTO-INF-POOL), as recorded in tracking issue #9194:

Evidence: AUTO-INF-POOL Tracking Issue #9194 (Cycle 6)

  • Cycle 2: No new workers dispatched due to tool limitations preventing a full analysis of existing issues.
  • Cycle 3: Cycle 3 initiated; progress ongoing with attempts to retrieve open testing-related issues for analysis.
  • Cycle 4: Cycle 4 initiated; progress ongoing with attempts to retrieve open testing-related issues for analysis.
  • Cycle 5: Cycle 5 initiated; progress ongoing with attempts to retrieve open testing-related issues for analysis.

Root Cause Analysis

The test-infra-pool-supervisor.md requires workers to perform five mandatory duplicate checks before filing any issue:

  1. Keyword search — search open issues for keywords matching the proposed improvement
  2. Cross-area search — check if another analysis area already filed a similar issue
  3. Closed issues search — check if a similar issue was already filed and closed
  4. Dedup proof — include a ### Duplicate Check section in every issue body showing search results
  5. Uncertainty avoidance — if unsure whether something is a duplicate, do not file
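For illustration, checks 4 and 5 can be sketched as a small helper that renders the required section and decides whether to file. This is a minimal sketch: the tuple layout, the bullet format, and the function names are assumptions, since the agent file only requires that a ### Duplicate Check section exists and shows search results.

```python
def duplicate_check_section(searches):
    """Render a '### Duplicate Check' section for an issue body.

    searches: list of (description, query, duplicates) tuples, where
    duplicates is a list of issue numbers judged to be duplicates.
    The bullet layout below is an assumed format, not one the agent file prescribes.
    """
    lines = ["### Duplicate Check", ""]
    for description, query, duplicates in searches:
        found = ", ".join(f"#{n}" for n in duplicates) if duplicates else "no matches"
        lines.append(f"- {description} (query: {query!r}): {found}")
    return "\n".join(lines)


def should_file(searches, unsure):
    """File only if no search found a duplicate and no doubt remains (check 5)."""
    return not unsure and all(not duplicates for _, _, duplicates in searches)
```

A worker would call should_file first and only build the issue body, including the rendered section, when it returns True.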

The repository currently has 5000+ open issues. Paginating through all of them for duplicate checks requires many API calls, and the responses consume most of the context window. The supervisor runs out of context before it can complete the analysis.

The supervisor's exhaustive pagination rule says: "Always set limit to its maximum available value (use limit=50 for Forgejo MCP tools)... After each list response, check whether the number of returned items equals the page size — if so, there are likely more results; fetch the next page."

With 5000+ issues at 50 per page, that is 100+ pages, which is infeasible within a single context window.
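The page arithmetic and the stop condition of the exhaustive rule can be sketched as follows. This is a rough sketch only: fetch_page is a hypothetical callable standing in for whichever list tool the supervisor uses, not a real Forgejo MCP signature.

```python
import math


def pages_needed(total_issues, page_size=50):
    """Minimum number of list calls needed to see every issue once."""
    return math.ceil(total_issues / page_size)


def exhaustive_fetch(fetch_page, page_size=50):
    """Apply the 'a full page means more results' rule until a short page arrives.

    fetch_page is a hypothetical callable (page, limit) -> list of issues.
    Returns the collected issues and the number of calls made.
    """
    issues = []
    page = 1
    while True:
        batch = fetch_page(page, page_size)
        issues.extend(batch)
        if len(batch) < page_size:
            return issues, page
        page += 1
```

Note that when the total is an exact multiple of the page size, the rule costs one extra call, because the last full page still looks like "more results".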


Proposed Change

Add guidance to test-infra-pool-supervisor.md to limit duplicate check scope:

  1. Scope duplicate checks to relevant labels — instead of paginating all 5000+ issues, filter by Type/Task or Type/Documentation labels and relevant keywords. This reduces the search space dramatically.

  2. Add a "recent issues" heuristic — only check issues created in the last 90 days for duplicates. Older issues are unlikely to be exact duplicates of new proposals.

  3. Add a "keyword-first" strategy — perform keyword searches first (using forgejo_list_repo_issues with title search if available), and only paginate if the keyword search returns results. This avoids full pagination when no duplicates are likely.

  4. Explicitly cap pagination — add a rule: "Limit duplicate check pagination to a maximum of 5 pages (250 issues). If no duplicate is found in the first 250 issues, proceed with filing."
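Taken together, the four changes could look roughly like the sketch below. It rests on several assumptions: search_issues is a hypothetical callable (the real tool may not support keyword or label filtering in this form), issues are represented as dicts with a timezone-aware created_at, and the 5-page cap is applied per keyword rather than across the whole check.

```python
from datetime import datetime, timedelta, timezone

MAX_PAGES = 5        # proposed cap: 5 pages of 50 = 250 issues
PAGE_SIZE = 50
RECENT_DAYS = 90     # proposed recency window
LABELS = ("Type/Task", "Type/Documentation")


def find_duplicate_candidates(search_issues, keywords, now=None):
    """Bounded duplicate search: label-scoped, keyword-first, recent-only, capped.

    search_issues is a hypothetical callable
    (keyword, labels, page, limit) -> list of dicts with
    'number', 'title' and 'created_at' (timezone-aware datetime).
    Returns {issue_number: title} for candidates inside the recency window.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=RECENT_DAYS)
    candidates = {}
    for keyword in keywords:
        for page in range(1, MAX_PAGES + 1):
            batch = search_issues(keyword, LABELS, page, PAGE_SIZE)
            for issue in batch:
                if issue["created_at"] >= cutoff:
                    candidates[issue["number"]] = issue["title"]
            if len(batch) < PAGE_SIZE:
                break  # short or empty page: nothing more for this keyword
    return candidates
```

A keyword with no hits costs a single call, and a keyword with many hits costs at most five, so the total number of calls is bounded by five times the number of keywords.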


Expected Outcome

The test-infra-pool-supervisor will be able to complete its duplicate checks within its context window, allowing it to dispatch workers and file improvement proposals consistently.


Human Approval Required

This proposal requires human review before implementation. To approve:

  • Remove the needs feedback label, OR
  • Add the State/Verified label, OR
  • Leave a comment with explicit approval

To reject: close this issue with a comment explaining why.
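For agents that poll this issue, the approval rules above might be evaluated along these lines. This is a sketch under assumptions: how an "explicit approval" comment is detected is left to the caller as a boolean, treating any closed issue as rejected is a simplification, and label names are compared case-insensitively.

```python
def review_state(labels, is_open, has_approval_comment):
    """Classify the proposal as 'approved', 'rejected' or 'pending'.

    labels: iterable of label names currently on the issue.
    has_approval_comment: caller-supplied flag; detecting explicit approval
    in comment text is out of scope for this sketch.
    """
    names = {label.lower() for label in labels}
    if not is_open:
        return "rejected"  # assumption: a closed proposal counts as rejected
    if "needs feedback" not in names:
        return "approved"
    if "state/verified" in names or has_approval_comment:
        return "approved"
    return "pending"
```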


Automated by CleverAgents Bot
Supervisor: Agent Evolution | Agent: agent-evolution-pool-supervisor




Author
Owner

@freemo This is an agent evolution proposal requiring your review and approval.

Issue #9201 — Agent Evolution Proposal: test-infra-pool-supervisor context exhaustion

The Agent Evolution supervisor has identified that the test-infra-pool-supervisor is failing consistently (4 consecutive cycles) because its duplicate-check logic requires paginating through 5000+ open issues (100+ pages), which exhausts the agent's context window before it can complete any work.

Proposed fix (in .opencode/agents/test-infra-pool-supervisor.md):

  1. Scope duplicate checks to relevant labels (Type/Task, Type/Documentation) rather than all issues
  2. Add a 90-day recency heuristic — only check issues created in the last 90 days
  3. Use keyword-first search strategy before full pagination
  4. Cap pagination to a maximum of 5 pages (250 issues) for duplicate checks

To approve: Remove the needs feedback label, add the State/Verified label, or leave a comment with explicit approval.
To reject: Close this issue with a comment explaining why.

This proposal is related to the ongoing AUTO-INF-SUP failures reported in #9019 and #9091.


Automated by CleverAgents Bot
Supervisor: Human Liaison | Agent: human-liaison-pool-supervisor
Worker: [AUTO-HUMAN-8]

Reference
cleveragents/cleveragents-core#9201