Epic: E2E Testing Suite for Acceptance Criteria and Workflow Examples #739

Open
opened 2026-03-12 19:28:09 +00:00 by freemo · 2 comments

Background

This epic covers all true end-to-end (E2E) tests for milestone acceptance criteria and specification workflow examples. Unlike integration tests, E2E tests use zero mocking, stubbing, or test doubles of any kind: they invoke the real CleverAgents CLI as a real subprocess with real LLM API keys (Anthropic/OpenAI), exactly as a user would experience it. E2E tests run in their own dedicated nox session (nox -s e2e_tests) and their own CI job, completely separate from the standard integration test suite.
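A minimal sketch of what "real subprocess, real keys, no mocks" could look like as a helper. The environment variable names `ANTHROPIC_API_KEY` and `OPENAI_API_KEY`, the helper names, and the timeout are assumptions for illustration; the issue does not specify them.

```python
import os
import subprocess
import sys

# Assumed variable names; the issue only says "real LLM API keys (Anthropic/OpenAI)".
API_KEY_VARS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY")


def available_api_keys(env):
    """Return the assumed API key variable names that are set and non-empty."""
    return [name for name in API_KEY_VARS if env.get(name)]


def run_cli(argv, env=None, timeout=300):
    """Run a command as a real child process and capture its output.

    Nothing is patched or stubbed: the child process receives the
    environment (including any API keys) as it would in a user's shell.
    """
    env = dict(os.environ if env is None else env)
    if not available_api_keys(env):
        # Fail loudly rather than silently falling back to a fake backend.
        raise RuntimeError(
            "no LLM API key found in environment: " + ", ".join(API_KEY_VARS)
        )
    return subprocess.run(
        argv,
        env=env,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,  # let the test inspect the return code itself
    )
```

A test would pass the actual CLI command line as `argv` and then inspect `returncode`, `stdout`, and `stderr` on the returned `CompletedProcess`.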

See Forgejo dependency links for child issues.

Expected Behavior

Each milestone has a dedicated E2E acceptance criteria test suite, and each of the 18 specification workflow examples has a dedicated E2E test. Tests are Robot Framework suites tagged with @E2E. They validate real command sequences with real LLM responses, and output validation is flexible (checking major structural components without strict character-by-character comparison). All E2E tests are excluded from nox -s integration_tests and run only via nox -s e2e_tests.
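The separation between the two sessions can be sketched with Robot Framework's tag selection options (`--include` and `--exclude`). The helper name `robot_args`, the suite directory `robot`, and the choice of `E2E` as the literal tag name behind the `@E2E` notation are assumptions, not details taken from the repository.

```python
def robot_args(session_name, tag="E2E", suite_dir="robot"):
    """Build a `robot` command line for one of the two nox sessions.

    `tag` and `suite_dir` are hypothetical defaults used for illustration.
    """
    if session_name == "e2e_tests":
        # Run only the tests carrying the E2E tag.
        selector = ["--include", tag]
    elif session_name == "integration_tests":
        # Run everything except the tests carrying the E2E tag.
        selector = ["--exclude", tag]
    else:
        raise ValueError(f"unknown session: {session_name!r}")
    return ["robot", *selector, suite_dir]
```

Inside a noxfile, each session function could then call something like `session.run(*robot_args("e2e_tests"))`, so that the include/exclude rule lives in one place and the two sessions cannot drift apart.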

Acceptance Criteria

  • A dedicated nox -s e2e_tests session exists and runs only @E2E-tagged Robot Framework tests
  • E2E tests are excluded from the standard nox -s integration_tests session
  • A dedicated CI job runs E2E tests separately from integration tests
  • All 6 milestone acceptance criteria E2E suites pass with real LLM keys
  • All 18 workflow example E2E suites pass with real LLM keys
  • No mocking, stubbing, or test doubles of any kind in any E2E test
  • Output validation is flexible — checks major components without strict character-by-character comparison
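One way to read the flexible-validation criterion is as a check that each expected component appears somewhere in the output, ignoring case and whitespace differences. The sketch below assumes that reading; the function name and the sample component strings are hypothetical.

```python
import re


def _normalize(text):
    """Collapse runs of whitespace and ignore case."""
    return re.sub(r"\s+", " ", text).strip().casefold()


def missing_components(output, required):
    """Return the required components that do not appear in the output.

    Matching is substring-based on normalized text, so exact wording
    around each component and exact spacing do not matter.
    """
    haystack = _normalize(output)
    return [item for item in required if _normalize(item) not in haystack]
```

A test would assert that `missing_components(stdout, [...])` returns an empty list, which tolerates the run-to-run variation of real LLM responses while still failing when a major section is absent.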

Definition of Done

This Epic is complete when all child issues (tracked via Forgejo dependency links) are closed and merged. All E2E milestone acceptance suites and workflow example E2E tests pass with real LLM API keys in the dedicated nox -s e2e_tests session.

freemo added this to the v3.2.0 milestone 2026-03-12 19:28:16 +00:00
freemo self-assigned this 2026-03-12 19:28:20 +00:00

🤖 Backlog Groomer (groomer-1): ⚠️ Stale In Progress — This issue has been in State/In Progress for 383 hours (~16 days) with no updates. Current state: State/In Progress.

Is this blocked? Please update the status or add a comment explaining the current situation. Consider:

  • Moving to State/Verified if work is paused
  • Adding a blocking comment if waiting on dependencies
  • Closing if the work has been superseded

Label compliance fix applied:

  • Replaced orphaned label State/In Progress with valid label State/In progress
  • Reason: State/In Progress (capital P) is an orphaned label. The correct label is State/In progress (lowercase p).

Automated by CleverAgents Bot
Supervisor: Backlog Grooming | Agent: ca-backlog-groomer

Reference
cleveragents/cleveragents-core#739