cleveragents/cleveragents-core

test(integration): workflow example 13 — custom automation profile with semantic escalation #777

Open

opened 2026-03-12 19:40:46 +00:00 by freemo · 1 comment

freemo commented

2026-03-12 19:40:46 +00:00

Owner

Metadata

- **Commit Message**: `test(integration): workflow example 13 — custom automation profile with semantic escalation`
- **Branch**: `test/int-wf13-custom-profile`

Background

Integration test for Specification Workflow Example 13: Custom Automation Profile with Semantic Escalation. Exercises custom automation profile creation, invariant-driven escalation that overrides confidence-based auto-proceed, plan explain for decision investigation, and plan prompt for human guidance using mocked LLM providers.

Runs within the standard nox -s integration_tests session using mocked LLM providers.
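As a rough illustration of what "mocked LLM providers" can look like in a Python helper, the sketch below scripts canned responses with `unittest.mock`. The provider interface (a single `complete(prompt)` method) and the JSON payloads are assumptions made for this example; they are not taken from the cleveragents codebase.

```python
import json
from unittest.mock import Mock

# Hypothetical provider interface: one `complete(prompt) -> str` method.
provider = Mock(name="llm_provider")

# Canned responses, returned in order on successive calls (illustrative payloads).
provider.complete.side_effect = [
    json.dumps({"step": "apply-change", "confidence": 0.92}),
    json.dumps({"step": "resume-after-guidance", "confidence": 0.95}),
]

first = json.loads(provider.complete("plan the next step"))
second = json.loads(provider.complete("continue with the user's guidance"))
```

Because the responses are scripted, the test can assert on the escalation logic deterministically without any network access.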

Expected Behavior

The integration test validates semantic escalation with mocked LLM responses. A custom profile is created with specific thresholds, invariants force escalation even when confidence exceeds the threshold, and the user provides guidance to resume.

Acceptance Criteria

- [ ] Robot Framework test suite in `robot/` directory (standard integration tests)
- [ ] Test creates custom automation profile with specific thresholds
- [ ] Test uses integration-appropriate mocking (mocked LLM providers)
- [ ] Test verifies invariant-driven escalation overrides confidence threshold
- [ ] Test exercises `plan explain` and `plan prompt`
- [ ] Test passes via `nox -s integration_tests`
- [ ] Coverage >=97% maintained

Subtasks

- [ ] Write Robot Framework integration test suite for workflow example 13
- [ ] Configure mocked LLM responses for semantic escalation
- [ ] Create custom profile YAML fixture
- [ ] Implement semantic escalation workflow
- [ ] Verify via `nox -s integration_tests`
- [ ] Verify coverage >=97% via `nox -s coverage_report`
- [ ] Run `nox` (all default sessions), fix any errors

Definition of Done

This issue is complete when:

- All subtasks above are completed and checked off.
- A Git commit is created where the **first line** of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details.
- The commit is pushed to the remote on the branch matching the **Branch** in Metadata exactly.
- The commit is submitted as a **pull request** to `master`, reviewed, and **merged** before this issue is marked done.


freemo added labels

2026-03-12 19:40:46 +00:00

brent.edwards was assigned by freemo

2026-03-12 19:40:46 +00:00

freemo added this to the v3.2.0 milestone

2026-03-12 19:40:46 +00:00

freemo added a new dependency

2026-03-12 19:40:46 +00:00

#401 Epic: Milestone Acceptance and Workflow Example Integration Tests

freemo added the Points label

2026-03-12 20:32:29 +00:00

brent.edwards added and removed labels

2026-03-14 01:05:52 +00:00

brent.edwards referenced this issue from a commit

2026-03-14 03:51:47 +00:00

test(integration): workflow example 13 — custom automation profile with semantic escalation

brent.edwards referenced this issue from a commit

2026-03-14 04:01:02 +00:00

test(integration): workflow example 13 — custom automation profile with semantic escalation

~~brent.edwards referenced this issue 2026-03-14 04:01:37 +00:00~~

test(integration): workflow example 13 — custom automation profile with semantic escalation #949

brent.edwards added and removed labels

2026-03-14 04:01:47 +00:00

brent.edwards commented

2026-03-14 04:02:15 +00:00

Member

Implementation Notes

Files Created/Modified

| File | Lines | Description |
|---|---|---|
| `robot/wf13_custom_profile.robot` | 79 | 5 Robot Framework test cases |
| `robot/helper_wf13_custom_profile.py` | 478 | Python helper with 5 subcommands |
| `robot/cli_core.robot` | 4 lines changed | Pre-existing test failure fix |
| `robot/scientific_paper_basic.robot` | 4 lines removed | Pre-existing skip guard removal |
| `robot/scientific_paper_e2e_test.robot` | 37 lines changed | Pre-existing soft assertion fix |

Design Decisions

Invariant-driven escalation verification: The test calls `AutonomyController.should_proceed_automatically()` directly in two scenarios: (1) high confidence factors with no invariant complexity, where the plan proceeds automatically; (2) the same confidence factors with `invariant_complexity=1.0` and an elevated `risk_assessment=0.8`, where escalation is forced even though the base confidence would normally exceed the threshold. This demonstrates the core WF13 specification behavior.
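The two scenarios can be sketched with a small stand-in. This is not the real `AutonomyController`: the class names, the threshold values, and the veto-first ordering below are assumptions chosen only to show how an invariant check can override a confidence threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Factors:
    """Hypothetical decision inputs; names mirror the comment above."""
    confidence: float                 # base confidence in [0, 1]
    invariant_complexity: float = 0.0
    risk_assessment: float = 0.0


@dataclass(frozen=True)
class ProfileSketch:
    """Stand-in for a custom automation profile (illustrative thresholds)."""
    confidence_threshold: float = 0.7
    invariant_limit: float = 0.5
    risk_limit: float = 0.5


def should_proceed_automatically(profile: ProfileSketch, f: Factors) -> bool:
    # Invariant and risk checks run first, so they can veto a step
    # whose confidence alone would clear the threshold.
    if f.invariant_complexity > profile.invariant_limit:
        return False
    if f.risk_assessment > profile.risk_limit:
        return False
    return f.confidence >= profile.confidence_threshold


profile = ProfileSketch()
plain = Factors(confidence=0.9)
flagged = Factors(confidence=0.9, invariant_complexity=1.0, risk_assessment=0.8)

proceeds = should_proceed_automatically(profile, plain)     # scenario 1
escalates = not should_proceed_automatically(profile, flagged)  # scenario 2
```

Both scenarios use the same confidence value, so the only thing that changes the outcome is the invariant and risk input.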

Rich JSON line-wrapping: Added a `_rejoin()` utility and a `COLUMNS=500` environment override so that JSON output wrapped by the Rich console at terminal width can still be parsed. Other helpers use the same pattern.
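A minimal sketch of this pattern follows. The function names and bodies are illustrative guesses, not the helper's actual `_rejoin()` implementation; the simple join also assumes the wrap only inserted line breaks and did not drop characters.

```python
import json
import os
import subprocess
import sys


def rejoin_wrapped_json(text: str) -> str:
    """Remove line breaks a console inserted, so the text parses as JSON again."""
    return "".join(text.splitlines())


def run_wide(cmd: list[str]) -> str:
    """Run a command with COLUMNS=500 so width-aware output is not wrapped early."""
    env = {**os.environ, "COLUMNS": "500"}
    result = subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)
    return result.stdout


# A string value split across two lines is invalid JSON until rejoined.
wrapped = '{"status": "escalated", "reason": "invariant_comp\nlexity"}'
data = json.loads(rejoin_wrapped_json(wrapped))

# The child process sees the widened terminal via the environment variable.
width = run_wide(
    [sys.executable, "-c", "import shutil; print(shutil.get_terminal_size().columns)"]
).strip()
```

Setting `COLUMNS` avoids most wrapping up front, and the rejoin step is a fallback for output that still exceeds the width.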

Quality Gates

| Session | Result |
|---|---|
| `nox -s lint` | PASS |
| `nox -s typecheck` | PASS (0 errors) |
| `nox -s integration_tests` | PASS (all 5 WF13 tests) |
| `nox -s coverage_report` | 97.89% (>= 97%) |

Commit

e8720490becb7cceaf8c0b8454cdbebbf3e734db on branch test/int-wf13-custom-profile (single squashed commit)

PR

PR #949


freemo referenced this issue

2026-03-14 04:40:16 +00:00

test(integration): workflow example 13 — custom automation profile with semantic escalation #949

brent.edwards referenced this issue from a pull request that will close it,

2026-03-14 19:49:22 +00:00

test(integration): workflow example 13 — custom automation profile with semantic escalation #949

brent.edwards referenced this issue from a commit

2026-03-15 04:31:37 +00:00

test(integration): workflow example 13 — custom automation profile with semantic escalation

brent.edwards referenced this issue from a commit

2026-03-15 19:40:45 +00:00

test(integration): workflow example 13 — custom automation profile with semantic escalation