test(integration): workflow example 9 — session-driven interactive exploration (review profile) #773

Open
opened 2026-03-12 19:39:51 +00:00 by freemo · 1 comment
Owner

Metadata

  • Commit Message: test(integration): workflow example 9 — session-driven interactive exploration (review profile)
  • Branch: test/int-wf09-session

Background

Integration test for Specification Workflow Example 9: Session-Driven Interactive Exploration. Exercises session-based conversational interaction with session create, session tell, session show, and session export using mocked LLM providers.

Runs within the standard nox -s integration_tests session using mocked LLM providers.

Expected Behavior

The integration test validates session management with mocked LLM responses. Session history accumulates, export produces valid JSON, and the AI creates an action from conversation.
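As a rough illustration of the history-accumulation and export checks described above, the sketch below compares two exports of the same session. It assumes the export is a JSON object with a top-level `messages` list; that key name and both function names are hypothetical and are not taken from the actual export schema.

```python
import json


def message_count(export_text: str) -> int:
    """Parse exported session JSON and return how many messages it holds.

    Assumption: the export is a JSON object with a top-level "messages" list.
    """
    data = json.loads(export_text)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    messages = data.get("messages")
    if not isinstance(messages, list):
        raise ValueError('expected a "messages" list')
    return len(messages)


def history_accumulated(export_before: str, export_after: str) -> bool:
    """Return True when the later export holds strictly more messages."""
    return message_count(export_after) > message_count(export_before)
```

A test could call `history_accumulated()` with one export taken right after `session create` and another taken after a `session tell`.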

Acceptance Criteria

  • [x] Robot Framework test suite in robot/ directory (standard integration tests)
  • [x] Test exercises session create, tell, show, and export commands
  • [x] Test uses integration-appropriate mocking (mocked LLM providers)
  • [x] Test verifies session history accumulation
  • [x] Test verifies session export JSON structure
  • [x] Test verifies action creation from conversation
  • [x] Test passes via nox -s integration_tests
  • [x] Coverage >=97% maintained

Subtasks

  • [x] Write Robot Framework integration test suite for workflow example 9
  • [x] Configure mocked LLM responses for session exploration
  • [x] Create temp project fixture for exploration
  • [x] Implement session-based workflow
  • [x] Verify via nox -s integration_tests
  • [x] Verify coverage >=97% via nox -s coverage_report
  • [x] Run nox (all default sessions), fix any errors

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
freemo added this to the v3.0.0 milestone 2026-03-12 19:39:52 +00:00
Member

Implementation Notes

Design Decisions

  1. Self-contained subcommands: Each of the 7 test cases creates its own isolated workspace via setup_workspace(), runs agents init --yes, executes the session workflow under test, and tears down via cleanup_workspace(). No state persists between Robot test cases. This matches the established pattern from helper_m1_e2e_verification.py.

  2. Real CLI subprocess execution: Tests use helper_e2e_common.run_cli() which invokes python -m cleveragents <args> in a subprocess with CLEVERAGENTS_TESTING_USE_MOCK_AI=true. This exercises the full DI container boot, database migrations, and session service wiring — not just mock-based CliRunner tests (which exist in session_cli.robot).

  3. Session ID extraction: The _extract_session_id() helper parses session create --format plain output looking for session_id: lines, handling debug log line prefixes that can appear before the structured output.

  4. Mock AI session tell: The session tell command uses the M3 stub that echoes an acknowledgement. Tests verify: no Traceback, no INTERNAL errors, non-empty response. Exact response content is not asserted since mock AI responses are non-deterministic.

  5. Action from conversation: With mock AI, the assistant can't actually generate action YAML. The test demonstrates the exploration-to-action workflow path by having the user create an action manually after a session conversation, using action create --config with a YAML file containing all required fields (name, strategy_actor, execution_actor, definition_of_done, automation_profile).
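Design decisions 2 to 4 can be sketched roughly as follows. This is an illustrative re-implementation, not the code in `helper_e2e_common.py` or `helper_wf09_session_exploration.py`: the function bodies, the regular expression, and the `tell_output_problems` name are assumptions. Only the environment variable, the `python -m cleveragents` invocation, and the `session_id:` field name come from the notes above.

```python
import os
import re
import subprocess
import sys
from typing import List, Optional

# Matches "session_id: <value>" anywhere in a line, so a log prefix in front
# of the field does not hide it (assumed behaviour of the real helper).
_SESSION_ID_RE = re.compile(r"\bsession_id:\s*(\S+)")


def run_cli(*args: str, cwd: Optional[str] = None) -> subprocess.CompletedProcess:
    """Run the CLI in a subprocess with the mock AI switch enabled."""
    env = dict(os.environ, CLEVERAGENTS_TESTING_USE_MOCK_AI="true")
    return subprocess.run(
        [sys.executable, "-m", "cleveragents", *args],
        capture_output=True,
        text=True,
        env=env,
        cwd=cwd,
        check=False,  # callers inspect returncode and output themselves
    )


def extract_session_id(output: str) -> Optional[str]:
    """Return the first session_id value found in plain-format output."""
    for line in output.splitlines():
        match = _SESSION_ID_RE.search(line)
        if match:
            return match.group(1)
    return None


def tell_output_problems(output: str) -> List[str]:
    """List violations of the three `session tell` checks (empty list = pass)."""
    problems = []
    if "Traceback" in output:
        problems.append("contains Traceback")
    if "INTERNAL" in output:
        problems.append("contains INTERNAL error")
    if not output.strip():
        problems.append("empty response")
    return problems
```

A test case might call `run_cli("session", "create", "--format", "plain")`, pass its `stdout` to `extract_session_id()`, and then require `tell_output_problems()` to return an empty list for the later `session tell` output.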

Key Code Locations

  • robot/wf09_session_exploration.robot — 7 Robot test cases
  • robot/helper_wf09_session_exploration.py — 463-line Python helper with 7 subcommands
  • robot/helper_e2e_common.py — Shared utilities
  • src/cleveragents/cli/commands/session.py:115 — session create
  • src/cleveragents/cli/commands/session.py:496 — session tell
  • src/cleveragents/cli/commands/session.py:235 — session show
  • src/cleveragents/cli/commands/session.py:386 — session export
  • src/cleveragents/cli/commands/session.py:166 — session list

Discoveries

  • Session create plain format: Output uses session_id: (not id:), followed by actor:, namespace:, messages:, created:, updated:.
  • Action YAML required fields: name (must be namespace/name format), strategy_actor, execution_actor, definition_of_done, automation_profile.
  • No interference with existing session tests: WF09 tests are fully isolated from session_cli.robot (mock-based), session_persistence.robot (DB), and session_create_error.robot (regression).
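The action YAML requirements listed above could be checked before calling `action create --config` with something like the sketch below. The validator name, the regular expression, and the reading of `namespace/name` as exactly one slash with non-empty parts are assumptions made for illustration; only the five field names come from the notes.

```python
import re
from typing import Any, Dict, List

# Field names taken from the discovery above.
REQUIRED_ACTION_FIELDS = (
    "name",
    "strategy_actor",
    "execution_actor",
    "definition_of_done",
    "automation_profile",
)

# Assumption: "namespace/name" means exactly one slash with non-empty parts.
_NAMESPACED_RE = re.compile(r"^[^/\s]+/[^/\s]+$")


def action_config_problems(config: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in an action config mapping."""
    problems = [
        f"missing field: {field}"
        for field in REQUIRED_ACTION_FIELDS
        if not config.get(field)
    ]
    name = config.get("name")
    if name and not _NAMESPACED_RE.match(str(name)):
        problems.append("name must use namespace/name format")
    return problems
```

The mapping passed in would be whatever the test is about to serialize to the YAML file; any concrete field values are placeholders chosen by the test author.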

Test Results

  • Robot Framework: 7/7 integration tests passing
  • Typecheck: 0 Pyright errors (strict mode)
  • Unit tests: 10,700 scenarios / 0 failures (no unit test changes)
  • Lint: Clean

PR Reference

PR #808 — test/int-wf09-session branch, commit 4851ee7d

freemo modified the milestone from v3.0.0 to v3.6.0 2026-03-16 00:32:08 +00:00
freemo self-assigned this 2026-04-02 06:14:00 +00:00
Reference
cleveragents/cleveragents-core#773