test(integration): workflow example 13 — custom automation profile with semantic escalation #777

Open
opened 2026-03-12 19:40:46 +00:00 by freemo · 1 comment
Owner

Metadata

  • Commit Message: test(integration): workflow example 13 — custom automation profile with semantic escalation
  • Branch: test/int-wf13-custom-profile

Background

Integration test for Specification Workflow Example 13: Custom Automation Profile with Semantic Escalation. Exercises custom automation profile creation, invariant-driven escalation that overrides confidence-based auto-proceed, plan explain for decision investigation, and plan prompt for human guidance using mocked LLM providers.

The suite runs within the standard nox -s integration_tests session.

Expected Behavior

The integration test validates semantic escalation with mocked LLM responses. A custom profile is created with specific thresholds, invariants force escalation even when confidence exceeds the threshold, and the user provides guidance to resume.

Acceptance Criteria

  • Robot Framework test suite in robot/ directory (standard integration tests)
  • Test creates custom automation profile with specific thresholds
  • Test uses integration-appropriate mocking (mocked LLM providers)
  • Test verifies invariant-driven escalation overrides confidence threshold
  • Test exercises plan explain and plan prompt
  • Test passes via nox -s integration_tests
  • Coverage >=97% maintained
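
As a rough illustration of the mocking criterion above, the sketch below stubs an LLM provider with canned responses using the standard library's unittest.mock. The provider interface (a `complete(prompt)` method returning JSON text) and the response fields are assumptions made for this example; they are not taken from the project's actual provider API.

```python
import json
from unittest.mock import Mock

# Hypothetical provider interface: `complete(prompt)` returns JSON text.
# The real provider API in the project may look quite different.
provider = Mock(name="mock_llm_provider")

# Canned responses are returned in order, one per call, so the test run is
# deterministic and never reaches a real LLM backend.
provider.complete.side_effect = [
    json.dumps({"confidence": 0.95, "invariant_complexity": 0.0}),
    json.dumps({"confidence": 0.95, "invariant_complexity": 1.0}),
]

first = json.loads(provider.complete("assess step 1"))
second = json.loads(provider.complete("assess step 2"))

# Same confidence in both responses; only the invariant signal differs.
print(first["confidence"] == second["confidence"])
print(second["invariant_complexity"])
```

Keeping the confidence identical across the two canned responses isolates the invariant signal as the only variable under test.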

Subtasks

  • Write Robot Framework integration test suite for workflow example 13
  • Configure mocked LLM responses for semantic escalation
  • Create custom profile YAML fixture
  • Implement semantic escalation workflow
  • Verify via nox -s integration_tests
  • Verify coverage >=97% via nox -s coverage_report
  • Run nox (all default sessions), fix any errors

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
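
The commit-message rule above (exact first line, then a blank line, then details) can be checked mechanically. The following is a minimal sketch; the function name and the sample subject line are hypothetical and exist only for illustration.

```python
def commit_message_ok(message: str, expected_subject: str) -> bool:
    """Return True if the message has the exact subject line, a blank
    second line, and at least one non-empty detail line after that."""
    lines = message.splitlines()
    if len(lines) < 3:
        return False
    subject, separator, body = lines[0], lines[1], lines[2:]
    return (
        subject == expected_subject
        and separator == ""
        and any(line.strip() for line in body)
    )


# Illustrative subject line, not the one from this issue's Metadata.
SUBJECT = "test(integration): sample subject line"

good = SUBJECT + "\n\nAdds the Robot Framework suite and its helper."
no_blank_line = SUBJECT + "\nAdds the Robot Framework suite and its helper."
subject_only = SUBJECT

print(commit_message_ok(good, SUBJECT))
print(commit_message_ok(no_blank_line, SUBJECT))
print(commit_message_ok(subject_only, SUBJECT))
```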
freemo added this to the v3.2.0 milestone 2026-03-12 19:40:46 +00:00
Member

Implementation Notes

Files Created/Modified

| File | Lines | Description |
|---|---|---|
| robot/wf13_custom_profile.robot | 79 | 5 Robot Framework test cases |
| robot/helper_wf13_custom_profile.py | 478 | Python helper with 5 subcommands |
| robot/cli_core.robot | 4 lines changed | Pre-existing test failure fix |
| robot/scientific_paper_basic.robot | 4 lines removed | Pre-existing skip guard removal |
| robot/scientific_paper_e2e_test.robot | 37 lines changed | Pre-existing soft assertion fix |

Design Decisions

Invariant-driven escalation verification: The test calls AutonomyController.should_proceed_automatically() directly in two scenarios: (1) high confidence factors with no invariant complexity, where the controller proceeds automatically; (2) the same confidence factors with invariant_complexity=1.0 and an elevated risk_assessment=0.8, where escalation is forced even though the base confidence would normally exceed the threshold. Together, the two scenarios demonstrate the core WF13 specification behavior.
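
The two scenarios can be illustrated with a toy model. This is not the real AutonomyController: the class names, field names other than invariant_complexity and risk_assessment, and all threshold values below are assumptions chosen for the example. It only shows the shape of the rule, namely that the semantic checks run before the confidence check and therefore override it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionFactors:
    """Hypothetical inputs to an auto-proceed decision, each in [0.0, 1.0]."""
    base_confidence: float
    invariant_complexity: float = 0.0
    risk_assessment: float = 0.0


@dataclass(frozen=True)
class ToyProfile:
    """Hypothetical custom automation profile; thresholds are illustrative."""
    confidence_threshold: float = 0.75
    invariant_escalation_threshold: float = 0.9
    risk_escalation_threshold: float = 0.7


def should_proceed_automatically(factors: DecisionFactors, profile: ToyProfile) -> bool:
    # Semantic escalation is evaluated first, so it wins over confidence.
    if factors.invariant_complexity >= profile.invariant_escalation_threshold:
        return False
    if factors.risk_assessment >= profile.risk_escalation_threshold:
        return False
    # Only when no semantic trigger fires does confidence decide.
    return factors.base_confidence >= profile.confidence_threshold


profile = ToyProfile()
routine = DecisionFactors(base_confidence=0.95)
invariant_heavy = DecisionFactors(
    base_confidence=0.95, invariant_complexity=1.0, risk_assessment=0.8
)

print(should_proceed_automatically(routine, profile))          # True
print(should_proceed_automatically(invariant_heavy, profile))  # False
```

Both inputs carry the same base confidence, so the differing outcome can only come from the invariant and risk signals.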

Rich JSON line-wrapping: Rich wraps console output at the terminal width, which can split JSON output across lines. The helper works around this with a _rejoin() utility and a COLUMNS=500 environment override, a pattern already used in other helpers.
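
A minimal sketch of the two workarounds follows. The function bodies are guesses at the general idea rather than the helper's actual code: `rejoin` assumes wrapping only inserts newlines and never drops characters, and `hard_wrap` exists solely to simulate wrapped output for the demonstration.

```python
import json
import os


def rejoin(wrapped: str) -> str:
    """Undo hard line-wrapping by concatenating the wrapped lines.

    Safe for JSON because valid JSON never contains a raw newline inside
    a string, so every newline is either wrapping or inter-token whitespace.
    """
    return "".join(wrapped.splitlines())


def wide_console_env(columns: int = 500) -> dict:
    """Copy the current environment with COLUMNS set, so a child process
    that sizes its console from COLUMNS sees a very wide terminal."""
    env = dict(os.environ)
    env["COLUMNS"] = str(columns)
    return env


def hard_wrap(text: str, width: int) -> str:
    """Simulate a console that breaks output every `width` characters."""
    return "\n".join(text[i:i + width] for i in range(0, len(text), width))


payload = {"decision": "escalate", "reason": "invariant complexity exceeded"}
wrapped = hard_wrap(json.dumps(payload), width=20)

print(json.loads(rejoin(wrapped)) == payload)
print(wide_console_env()["COLUMNS"])
```

The environment returned by `wide_console_env()` would typically be passed as `env=` to `subprocess.run` when invoking the CLI under test; using both measures together covers output that still exceeds the widened console.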

Quality Gates

| Session | Result |
|---|---|
| nox -s lint | PASS |
| nox -s typecheck | PASS (0 errors) |
| nox -s integration_tests | PASS (all 5 WF13 tests) |
| nox -s coverage_report | 97.89% (>= 97%) |

Commit

e8720490becb7cceaf8c0b8454cdbebbf3e734db on branch test/int-wf13-custom-profile (single squashed commit)

PR

PR #949: https://git.cleverthis.com/cleveragents/cleveragents-core/pulls/949
freemo self-assigned this 2026-04-02 06:13:49 +00:00
Reference
cleveragents/cleveragents-core#777