UAT: Semantic Escalation controller is never invoked to gate automation #6993

Open
opened 2026-04-10 06:12:29 +00:00 by HAL9000 · 0 comments
Owner

What was tested

  • Reviewed the Semantic Escalation and automation profile sections (docs/specification.md lines 28483-28690) on commit 51aab184112728471a44d5a91c334663cf8cd016.

Expected behavior

  • The runtime should feed plan decisions through the Semantic Escalation engine so that, for each automatable flag, the computed confidence score is compared against the profile threshold.
  • Actions governed by thresholds such as edit_code, execute_command, create_file, install_dependency, modify_config, and approve_plan should pause for human input when confidence is below the configured value.
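The expected gating can be sketched roughly as follows. This is an illustration only: AutomationProfile, should_proceed, and the default of requiring a human for unconfigured flags are hypothetical and are not taken from the cleveragents codebase, and the real AutonomyController.should_proceed_automatically(...) signature is not shown in this issue.

```python
from dataclasses import dataclass, field


@dataclass
class AutomationProfile:
    """Hypothetical profile: maps each automatable flag to a minimum confidence."""

    thresholds: dict[str, float] = field(default_factory=dict)


def should_proceed(profile: AutomationProfile, flag: str, confidence: float) -> bool:
    """Return True when the action may run without pausing for human input.

    Assumption: a flag with no configured threshold always pauses for a human.
    """
    threshold = profile.thresholds.get(flag)
    if threshold is None:
        return False
    # Pause (return False) when confidence is below the configured value.
    return confidence >= threshold


# Example thresholds are invented for illustration.
profile = AutomationProfile({"edit_code": 0.8, "execute_command": 0.95})
print(should_proceed(profile, "edit_code", 0.9))        # confidence meets the threshold
print(should_proceed(profile, "execute_command", 0.9))  # confidence below the threshold
```

Under these assumptions, the first call proceeds automatically and the second pauses for human input.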

Actual behavior

  • AutonomyController.should_proceed_automatically(...) is never called outside of tests/benchmarks. No runtime module invokes it, so confidence-based gating never runs.
  • The plan lifecycle only checks raw threshold values for phase transitions (create_tool, select_tool, access_network, delete_content). All other flags are ignored, so plans proceed automatically regardless of confidence.

Steps to reproduce

  1. From the repo root, run:
    python3 - <<'PY'
    import pathlib

    # Scan every Python file under src/cleveragents.
    root = pathlib.Path('src/cleveragents')
    # Collect files, other than the controller itself, that mention the method call.
    uses = [str(p.relative_to(root)) for p in root.rglob('*.py')
            if p.name != 'autonomy_controller.py' and 'should_proceed_automatically(' in p.read_text(encoding='utf-8')]
    print('Semantic escalation uses:', uses)
    PY
    
  2. The script prints ['domain/models/core/escalation.py'], meaning only the domain model references the method; runtime code never does.
  3. Inspect src/cleveragents/application/services/plan_lifecycle_service.py — automation thresholds are hard-coded checks against 1.0 and do not call the controller.
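The scan from step 1 could also be kept as a regression check once the controller is wired in. The sketch below is one possible shape, not existing project code: find_references and runtime_call_sites are hypothetical helper names, and treating everything under application/ as runtime code is an assumption based on the paths listed in this issue.

```python
import pathlib


def find_references(root: pathlib.Path, needle: str,
                    exclude: frozenset[str] = frozenset()) -> list[str]:
    """Return sorted paths (relative to root) of .py files whose text contains needle."""
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*.py")
        if p.name not in exclude and needle in p.read_text(encoding="utf-8")
    )


def runtime_call_sites(root: pathlib.Path) -> list[str]:
    """Keep only references under application/ (assumed to hold the runtime services)."""
    refs = find_references(
        root,
        "should_proceed_automatically(",
        exclude=frozenset({"autonomy_controller.py"}),
    )
    return [r for r in refs if r.startswith("application/")]


if __name__ == "__main__":
    sites = runtime_call_sites(pathlib.Path("src/cleveragents"))
    print("Runtime call sites:", sites)
```

A test could then assert that runtime_call_sites(...) is non-empty. Against the state described in this issue, that assertion would be expected to fail.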

Code location

  • src/cleveragents/application/services/autonomy_controller.py (unused controller)
  • src/cleveragents/application/services/plan_lifecycle_service.py
  • src/cleveragents/application/services/plan_executor.py
HAL9000 self-assigned this 2026-04-10 06:21:43 +00:00
HAL9000 added this to the v3.7.0 milestone 2026-04-10 06:21:44 +00:00
Reference
cleveragents/cleveragents-core#6993