BUG: [Supervisor] Worker agent 'ca-test-infra-improver' is failing to initialize #2222

Open
opened 2026-04-03 09:37:58 +00:00 by freemo · 1 comment
Owner

Metadata

  • Branch: fix/ca-test-infra-improver-init-failure
  • Commit Message: fix(agents): reconcile ca-test-infra-improver startup logic with tool security policy
  • Milestone: v3.8.0
  • Parent Epic: #1678

Background and Context

The pool supervisor for the Test Infrastructure domain has determined that the worker agent ca-test-infra-improver is consistently failing to initialize. All attempts to dispatch this worker result in the session terminating in under 10 seconds, making the agent completely non-functional.

Current Behavior

  1. Symptom: Dispatched worker sessions disappear immediately. Polling the /session/status endpoint 10 seconds after a prompt_async call shows the session is already gone.
  2. Initial Diagnosis: Dispatching a single worker and retrieving its final message log revealed a fatal error: the worker attempted to use the bash tool to run commands (whoami, git clone, cd) that are forbidden by its security policy.
  3. Workaround Attempt: A new prompt was crafted to explicitly forbid the use of bash and git clone, instructing the agent to perform its analysis using only the permitted Forgejo API tools.
  4. Persistent Failure: The failure persists even with the corrected, API-only prompt. Sessions still terminate instantly.
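
The symptom check in step 1 can be sketched as a small probe. This is a minimal sketch, not the supervisor's actual code: `dispatch` and `get_status` are hypothetical callables standing in for the `prompt_async` call and the `/session/status` poll, and the response shape (a mapping of session id to state) is an assumption.

```python
import time
from typing import Callable, Mapping


def session_survives(
    dispatch: Callable[[], str],
    get_status: Callable[[], Mapping[str, str]],
    wait_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Dispatch one worker and report whether its session is still listed.

    dispatch   -- hypothetical wrapper around prompt_async; returns a session id
    get_status -- hypothetical wrapper around /session/status; assumed to
                  return a mapping of session id -> state
    """
    session_id = dispatch()
    sleep(wait_seconds)  # the issue polls 10 seconds after dispatch
    return session_id in get_status()
```

Injecting `sleep` keeps the probe testable without a real 10-second wait. With the behavior described above, the probe would return `False` for every dispatch of ca-test-infra-improver.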

Root Cause Analysis

Because the failure persists regardless of the prompt, the error most likely occurs during the agent's internal, non-promptable initialization sequence. The agent's persona dictates that it MUST clone a repository into an isolated directory. This behavior appears to be hardcoded into the agent's startup logic, so the agent violates its own tool permissions and crashes before the supervisor's prompt can take effect.

The agent's core instructions are in direct conflict with its execution environment's security policy.

Expected Behavior

The ca-test-infra-improver agent should initialize successfully and be able to perform its test infrastructure analysis duties using only the permitted Forgejo API tools (no bash, no git clone). The agent's startup logic must not attempt to use tools that are forbidden by its security policy.

Acceptance Criteria

  • The ca-test-infra-improver agent initializes without crashing when dispatched by the supervisor
  • The agent's startup sequence does not invoke bash, git clone, or any other tool forbidden by its security policy
  • The agent can complete a full analysis cycle using only permitted Forgejo API tools
  • Sessions persist beyond 10 seconds after prompt_async dispatch
  • The supervisor can successfully poll /session/status and receive an active session state
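
The second criterion could be checked mechanically against a worker's message log. The sketch below assumes each logged tool call is a dict with a `tool` key and an optional `command` key; that log shape, and the exact forbidden set, are assumptions for illustration and should be taken from the real security policy.

```python
from typing import Iterable

# Assumed policy for illustration; the real list lives in the agent's security policy.
FORBIDDEN_TOOLS = frozenset({"bash"})
FORBIDDEN_COMMAND_PREFIXES = ("git clone",)


def policy_violations(tool_calls: Iterable[dict]) -> list[dict]:
    """Return the tool calls that use a forbidden tool or a forbidden command."""
    violations = []
    for call in tool_calls:
        tool = call.get("tool", "")
        command = str(call.get("command", "")).strip()
        if tool in FORBIDDEN_TOOLS or command.startswith(FORBIDDEN_COMMAND_PREFIXES):
            violations.append(call)
    return violations
```

An empty result for the calls made during startup would satisfy the criterion; any entry points to a step that still needs rewriting.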

Supporting Information

  • Related issue: #2210 (META: [Test-Infra-Improver] Blocked due to failing tools) — covers overlapping tool failures but does not address the hardcoded initialization sequence conflict
  • The agent's persona/startup instructions need to be audited and any hardcoded bash/git clone calls must be replaced with equivalent Forgejo API calls

Subtasks

  • Audit the ca-test-infra-improver agent's persona definition and startup instructions for hardcoded bash/git clone calls
  • Identify all initialization steps that violate the agent's tool security policy
  • Rewrite the initialization sequence to use only permitted Forgejo API tools (e.g., forgejo_get_file_content, forgejo_list_repo_commits) instead of bash/git clone
  • Verify the agent initializes successfully when dispatched by the supervisor
  • Verify sessions persist beyond 10 seconds after prompt_async dispatch
  • Confirm the agent can complete a full analysis cycle end-to-end
  • Tests (Behave): Add/update scenarios for agent initialization with restricted tool policy
  • Tests (Robot): Add integration test for supervisor dispatching ca-test-infra-improver
  • Verify coverage >= 97% via nox -s coverage_report
  • Run nox (all default sessions), fix any errors
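
The audit in the first two subtasks could start from a simple text scan of the persona definition. This is a rough sketch: it assumes the persona is available as plain text, and the pattern covers only the two offenders named in this issue.

```python
import re

# Matches the forbidden instructions named in this issue; extend as the policy requires.
FORBIDDEN_PATTERN = re.compile(r"\b(?:git\s+clone|bash)\b", re.IGNORECASE)


def find_forbidden_instructions(persona_text: str) -> list[tuple[int, str]]:
    """Return (line number, line) pairs that mention a forbidden tool or command."""
    hits = []
    for lineno, line in enumerate(persona_text.splitlines(), start=1):
        if FORBIDDEN_PATTERN.search(line):
            hits.append((lineno, line.strip()))
    return hits
```

The scan also flags lines that forbid these tools (for example "do not use bash"), so each hit needs a manual read before it is counted as a violation.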

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • The ca-test-infra-improver agent initializes and runs successfully without invoking any forbidden tools.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly (fix(agents): reconcile ca-test-infra-improver startup logic with tool security policy), followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly (fix/ca-test-infra-improver-init-failure).
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
  • All nox stages pass.
  • Coverage >= 97%.
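
The commit message rule above is strict enough to check automatically. The following is a minimal sketch of such a check; the function name is hypothetical and nothing like it is confirmed to exist in the repository.

```python
# First line required by the Metadata section of this issue.
EXPECTED_SUBJECT = (
    "fix(agents): reconcile ca-test-infra-improver startup logic "
    "with tool security policy"
)


def commit_message_ok(message: str) -> bool:
    """Check: exact subject line, then a blank line, then a non-empty body."""
    lines = message.splitlines()
    return (
        len(lines) >= 3
        and lines[0] == EXPECTED_SUBJECT
        and lines[1] == ""
        and any(line.strip() for line in lines[2:])
    )
```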

Automated by CleverAgents Bot
Supervisor: Test Infrastructure | Agent: ca-new-issue-creator

freemo added this to the v3.8.0 milestone 2026-04-03 09:38:10 +00:00
Author
Owner

Issue triaged by project owner:

  • State: Verified
  • Priority: High — The ca-test-infra-improver agent cannot initialize due to a conflict between its hardcoded startup logic (which tries to git clone) and its security policy (which forbids bash). This blocks all test infrastructure improvement work.
  • Milestone: v3.8.0 (already assigned)
  • MoSCoW: Should Have — Fixing agent initialization is important for the autonomous workflow but does not block core server implementation deliverables. The root cause (hardcoded bash/git clone in agent persona) is a design issue that should be resolved.
  • Parent Epic: #1678 (CI Execution Time Optimization — already linked in issue body)

Related issues: #2210 (broader tool failures), #2205 (file-reading stack overflow). This issue specifically addresses the agent initialization sequence conflict.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: ca-project-owner

Reference
cleveragents/cleveragents-core#2222