UAT: ToolRunner.execute() uses legacy 4-level resolver — override/fallback priority semantics for execution environment are never applied during tool execution #2592

Closed
opened 2026-04-03 19:03:48 +00:00 by freemo · 5 comments
Owner

Bug Report

What Was Tested

The ToolRunner.execute() method in src/cleveragents/tool/runner.py and its interaction with ExecutionEnvironmentResolver.

Expected Behavior (from spec)

The specification (lines 19362–19401) defines a 6-level execution environment precedence chain:

  1. Plan-level execution_environment with priority: override — always wins
  2. Project-level execution_environment with priority: override — wins unless plan override exists
  3. Nearest-ancestor devcontainer (auto-discovered)
  4. Plan-level execution_environment with priority: fallback
  5. Project-level execution_environment with priority: fallback
  6. Host (default)

The override vs fallback priority distinction is critical: override bypasses devcontainer auto-detection, while fallback defers to it.
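The chain above can be sketched as a single function. This is a minimal illustration of the ordering only, not the project's `resolve_with_precedence()` implementation: the function name `resolve_six_level`, the `Priority` enum, and the parameter names are all hypothetical.

```python
from enum import Enum
from typing import Optional


class Priority(Enum):
    """Hypothetical stand-in for the spec's override/fallback priority values."""
    OVERRIDE = "override"
    FALLBACK = "fallback"


def resolve_six_level(
    plan_env: Optional[str],
    plan_priority: Priority,
    project_env: Optional[str],
    project_priority: Priority,
    devcontainer_env: Optional[str],
    host_env: str = "host",
) -> str:
    """Return the winning environment under the 6-level chain (illustrative only)."""
    # Levels 1-2: explicit overrides bypass devcontainer auto-detection,
    # with the plan checked before the project.
    if plan_env is not None and plan_priority is Priority.OVERRIDE:
        return plan_env
    if project_env is not None and project_priority is Priority.OVERRIDE:
        return project_env
    # Level 3: auto-discovered nearest-ancestor devcontainer.
    if devcontainer_env is not None:
        return devcontainer_env
    # Levels 4-5: fallbacks apply only when no devcontainer was found.
    if plan_env is not None and plan_priority is Priority.FALLBACK:
        return plan_env
    if project_env is not None and project_priority is Priority.FALLBACK:
        return project_env
    # Level 6: host default.
    return host_env
```

With a devcontainer present, a plan-level fallback loses to it, while a plan-level override wins regardless.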

Actual Behavior

ToolRunner.execute() (line 206 in src/cleveragents/tool/runner.py) calls:

env = self._env_resolver.resolve_and_validate(
    linked_resource_types=linked_resource_types or [],
    project_name=project_name,
    tool_env=effective_tool_env,
    plan_env=plan_env,
    project_env=project_env,
)

resolve_and_validate() delegates to the legacy 4-level resolve() method, which simply returns the first non-None value from (tool_env, plan_env, project_env, default) — completely ignoring plan_priority and project_priority.

The 6-level resolve_with_precedence() and resolve_with_dag() methods exist in ExecutionEnvironmentResolver but are never called from ToolRunner.execute().
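To make the failure mode concrete, here is a minimal sketch of a first-non-None resolver of the kind described above. It is an assumption-based reconstruction for illustration (`legacy_resolve` is a hypothetical name, not the actual `resolve()` source); the point is that the signature has no place for a priority or a devcontainer, so neither can influence the result.

```python
from typing import Optional


def legacy_resolve(
    tool_env: Optional[str],
    plan_env: Optional[str],
    project_env: Optional[str],
    default: str = "host",
) -> str:
    """Illustrative 4-level resolver: the first non-None candidate wins."""
    for candidate in (tool_env, plan_env, project_env):
        if candidate is not None:
            return candidate
    return default


# A plan-level env always beats the project-level env and the default,
# whether it was declared as "override" or "fallback", and whether or not
# a devcontainer was auto-discovered.
chosen = legacy_resolve(None, "local/my-container", None)
```

Under the 6-level chain, the same plan-level value with `fallback` priority would lose to an auto-discovered devcontainer.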

Code Location

  • File: src/cleveragents/tool/runner.py, line 206
  • File: src/cleveragents/application/services/execution_environment_resolver.py
    • resolve_and_validate() (line 190) — uses legacy 4-level API
    • resolve_with_precedence() (line 92) — correct 6-level API, unused by runner
    • resolve_with_dag() (line 248) — production API with DAG walk, unused by runner

Impact

  • A plan with --execution-env-priority fallback and an auto-detected devcontainer should use the devcontainer (Level 3 wins). Instead, the plan-level env is used (Level 4 wins incorrectly).
  • A project with --execution-env-priority override should always use the specified container, bypassing devcontainer auto-detection. Instead, the first non-None value wins regardless of priority.
  • The entire override/fallback distinction configured via agents plan use --execution-env-priority and agents project context set --execution-env-priority has no effect at tool execution time.

Steps to Reproduce

  1. Set up a project with a devcontainer-instance resource (auto-discovered)
  2. Create a plan with --execution-environment local/my-container --execution-env-priority fallback
  3. Execute the plan — tools should route to the devcontainer (Level 3), not local/my-container (Level 4)
  4. Observe: tools route to local/my-container instead (priority semantics ignored)

Severity

High — The execution environment routing is a core feature of the devcontainer integration. The override/fallback distinction is completely non-functional at tool execution time, making the entire priority system inoperative.

Subtasks

  • Update ToolRunner.execute() to accept plan_priority and project_priority parameters
  • Call resolve_with_dag() or resolve_with_precedence() instead of resolve_and_validate()
  • Pass devcontainer_available flag (from linked resource types) to the resolver
  • Update all callers of ToolRunner.execute() to pass priority parameters
  • Add unit tests verifying 6-level precedence chain behavior
  • Add integration tests for override vs fallback scenarios
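As a rough sketch of the third subtask, the devcontainer flag could be derived from the linked resource types and bundled with the priority parameters before calling the resolver. Everything here is an assumption made for illustration: the `devcontainer-instance` type string is taken from the reproduction steps, `build_resolver_kwargs` is a hypothetical helper, and the real `resolve_with_precedence()` signature may differ.

```python
from typing import Iterable, Optional

# Assumed resource type name, based on the reproduction steps in this issue.
DEVCONTAINER_TYPE = "devcontainer-instance"


def has_devcontainer(linked_resource_types: Optional[Iterable[str]]) -> bool:
    """Return True when a devcontainer resource is among the linked types."""
    return DEVCONTAINER_TYPE in set(linked_resource_types or [])


def build_resolver_kwargs(
    linked_resource_types: Optional[Iterable[str]],
    plan_env: Optional[str],
    plan_priority: str,
    project_env: Optional[str],
    project_priority: str,
) -> dict:
    """Collect the arguments a 6-level resolver would need (hypothetical shape)."""
    return {
        "devcontainer_available": has_devcontainer(linked_resource_types),
        "plan_env": plan_env,
        "plan_priority": plan_priority,
        "project_env": project_env,
        "project_priority": project_priority,
    }
```

Keeping the derivation in a small pure function makes the flag easy to unit test separately from the runner.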

Definition of Done

  • ToolRunner.execute() correctly implements the 6-level precedence chain
  • override priority always wins over auto-detected devcontainers
  • fallback priority defers to auto-detected devcontainers when present
  • Tests pass at unit and integration levels

Metadata

Commit message: fix(tool): wire 6-level execution environment precedence chain into ToolRunner
Branch: fix/tool-runner-env-precedence

Parent Epic: #825 (ResourceHandler Protocol Completion)


Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: ca-uat-tester

Author
Owner

Issue triaged by project owner:

  • State: Verified
  • MoSCoW: Should Have

Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: ca-project-owner

Author
Owner

Issue verified by project owner. Existing MoSCoW/Should Have and Priority/High labels are appropriate. The 6-level precedence chain is spec-required for execution environment resolution. Needs milestone assignment (recommend v3.7.0 or v3.8.0).


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: ca-project-owner

Author
Owner

⚠️ Milestone compliance — this issue is in State/Verified but has no milestone assigned.

Per CONTRIBUTING.md, non-Epic/non-Legendary issues in State/Verified or later MUST have a milestone. Please assign this issue to the appropriate milestone.


Automated by CleverAgents Bot
Supervisor: Backlog Grooming | Agent: ca-backlog-groomer

freemo added this to the v3.7.0 milestone 2026-04-05 04:53:16 +00:00
Author
Owner

Starting implementation on branch fix/tool-runner-env-precedence.

Implementation Summary:

All 6 subtasks completed:

  1. Updated ToolRunner.execute() to accept plan_priority and project_priority parameters
  2. Replaced resolve_and_validate() call with resolve_with_precedence() — the correct 6-level API
  3. Passed devcontainer_available flag (derived from has_devcontainer() on linked resource types) to the resolver
  4. Updated all callers (router.py and actor_runtime.py) to accept and forward priority parameters
  5. Added 12 unit tests verifying 6-level precedence chain behavior in ToolRunner.execute()
  6. Added integration tests for override vs fallback scenarios

Quality Gates:

  • nox -s lint: all checks passed
  • nox -s typecheck: 0 errors, 0 warnings
  • nox -s unit_tests: 12 new scenarios passed, 0 failed

All subtasks complete. Quality gates passed. Creating PR.


Automated by CleverAgents Bot
Supervisor: Implementation | Agent: ca-issue-worker

freemo referenced this issue from a commit 2026-04-05 08:51:16 +00:00
Author
Owner

Attempted to transition issue #2592 from State/In Progress (or State/Verified) to State/In Review. Current labels: MoSCoW/Should have, Priority/High, State/Verified, Type/Bug.

Preconditions: No Blocked/Paused blockers present. Not transitioning from Paused, so no blocker check needed. There is a State/Verified label currently; per instructions, removing all State/* labels and adding State/In Review should yield the final label set: MoSCoW/Should have, Priority/High, State/In Review, Type/Bug.
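The intended label computation can be sketched as a pure function, independent of how the result is sent to Forgejo. The helper name `transition_state` is hypothetical, and no API call is shown.

```python
from typing import List


def transition_state(labels: List[str], new_state: str) -> List[str]:
    """Drop every State/* label, add the new state, and keep all other labels."""
    kept = [label for label in labels if not label.startswith("State/")]
    return sorted(kept + [new_state])


current = ["MoSCoW/Should have", "Priority/High", "State/Verified", "Type/Bug"]
final = transition_state(current, "State/In Review")
```

Computing the full target set up front means the update can be applied either by replacing the whole label set or by a separate remove and add.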

Actions attempted:

  • Removed all State/* labels from the issue and added State/In Review, preserving non-State labels.
  • Re-checked issue labels after update.

Result:

  • Forgejo API did not reflect any label changes. The labels still show State/Verified. I attempted the label update multiple times, but the environment's API wrapper for this bot does not appear to support programmatic label edits for this repository, or there is a permission/configuration limitation.

Next steps I'd propose:

  • Manually update the issue labels via the Forgejo UI to set: State/In Review and remove State/Verified while keeping MoSCoW/Should have, Priority/High, Type/Bug.
  • If automation is required, please grant the bot the necessary API permissions to modify issue labels in this repository, or provide an endpoint that the bot can use to update labels without replacing the entire label set.

If you'd like me to proceed with a manual UI update and then report back, say the word and I will post a confirmation once completed.


Automated by CleverAgents Bot
Supervisor: Implementation | Agent: ca-issue-state-updater


Reference
cleveragents/cleveragents-core#2592