Tests (ASV): Add missing ASV benchmark suite for the langgraph module #2797

Open
opened 2026-04-04 19:55:36 +00:00 by freemo · 1 comment
Owner

Metadata

  • Branch: test/missing-asv-benchmarks-langgraph
  • Commit Message: test(langgraph): add ASV performance benchmark suite for the langgraph module
  • Milestone: v3.8.0
  • Parent Epic: #1678

Background and Context

The langgraph module is missing ASV (airspeed velocity) performance benchmarks. Per the project's Multi-Level Testing Mandate in CONTRIBUTING.md, every module must have tests at all required levels: Behave BDD unit tests, Robot Framework integration tests, and ASV performance benchmarks. The absence of ASV benchmarks for the langgraph module means there is no automated baseline for detecting performance regressions in the system's LangGraph integration layer.

The langgraph module (src/cleveragents/langgraph/) is a critical component of the CleverAgents architecture, providing the actor-first LangGraph primitives that underpin the agent execution pipeline (see docs/specification.md and docs/adr/ADR-022-langchain-langgraph-integration.md). It exposes key abstractions including LangGraph, GraphConfig, GraphState, NodeConfig, NodeType, and Edge, along with supporting components such as bridge, dynamic_router, message_router, nodes, pure_graph, routing_adapter, and state. The module lazy-loads its heavyweight symbols (LangGraph, GraphConfig) to avoid multi-second import overhead, which makes import-time and initialisation-time benchmarks especially valuable. Performance regressions here, such as slow graph construction, expensive routing decisions, or inefficient state transitions, would silently degrade throughput across the entire agent orchestration pipeline.
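
For readers unfamiliar with the lazy-loading pattern mentioned above, here is a minimal sketch of one common way to do it in Python: a module-level `__getattr__` hook (PEP 562) that resolves a symbol on first access and caches it. This is an assumption about the general technique, not a description of how cleveragents.langgraph is actually implemented; `make_lazy_module` and the loader names are hypothetical.

```python
import types


def make_lazy_module(name, loaders):
    """Build a module whose attributes are created on first access.

    `loaders` maps attribute names to zero-argument callables. Both the
    helper and its arguments are hypothetical, for illustration only.
    """
    module = types.ModuleType(name)

    def _lazy_getattr(attr):
        # Only called when normal attribute lookup on the module fails.
        try:
            loader = loaders[attr]
        except KeyError:
            raise AttributeError(
                f"module {name!r} has no attribute {attr!r}"
            ) from None
        value = loader()
        # Cache the result so later lookups bypass this hook entirely.
        setattr(module, attr, value)
        return value

    # PEP 562: a `__getattr__` entry in the module namespace is used as
    # the fallback for missing attributes.
    module.__getattr__ = _lazy_getattr
    return module
```

A benchmark that targets this pattern would typically time the first access (which pays the load cost) separately from later accesses (which hit the cache).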

Current Behaviour

The langgraph module has no ASV benchmark suite. Running nox (or the benchmark session directly) produces no benchmark results for this module, leaving a gap in the project's performance regression detection coverage.

Expected Behaviour

  • An ASV benchmark suite exists under the benchmarks/ directory for the langgraph module.
  • The suite covers at least the following performance-critical paths:
    • Lazy-load import time for heavyweight symbols (LangGraph, GraphConfig)
    • GraphState construction and field access throughput
    • NodeConfig and NodeType instantiation cost
    • Edge creation and resolution latency
    • LangGraph graph construction and teardown
    • Dynamic router and message router dispatch latency
    • PureGraph execution throughput
    • RoutingAdapter initialisation and routing decision cost
    • Bridge lifecycle (setup and teardown)
  • All benchmarks run without error via nox.
  • No existing nox sessions are broken by the addition of the benchmarks.
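
As a rough illustration of the first two paths listed above, a benchmark file might look something like the sketch below. It relies on ASV's naming conventions (`timeraw_*` for statements timed in a fresh process, `time_*` for in-process timing, and `setup` / `teardown` hooks). `StandInState` and its fields are hypothetical placeholders so the sketch runs on its own; a real suite would import GraphState from the langgraph module instead, and the import statement string assumes the symbols are importable from `cleveragents.langgraph`.

```python
from dataclasses import dataclass, field


@dataclass
class StandInState:
    """Hypothetical stand-in for GraphState; not the real class."""

    messages: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)


def timeraw_lazy_import():
    # ASV executes the returned statement in a separate process, so the
    # measurement is not skewed by modules that are already imported.
    # The import path is an assumption based on the module layout.
    return "from cleveragents.langgraph import LangGraph, GraphConfig"


class StateSuite:
    """Construction and field-access timings for the stand-in state."""

    def setup(self):
        # Runs before each benchmark; its cost is excluded from timings.
        self.state = StandInState(messages=["hello"], metadata={"step": 1})

    def teardown(self):
        # Drop the fixture so nothing leaks into the next benchmark.
        del self.state

    def time_construct_state(self):
        StandInState(messages=["hello"], metadata={"step": 1})

    def time_field_access(self):
        self.state.messages
        self.state.metadata
```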

Subtasks

  • Audit the langgraph module (src/cleveragents/langgraph/) and identify all performance-sensitive code paths suitable for ASV benchmarking (graph construction, routing, state transitions, bridge lifecycle, lazy-load overhead)
  • Create the ASV benchmark file under benchmarks/ following the existing directory and naming conventions (e.g., langgraph_bench.py)
  • Implement setup / teardown fixtures as needed to isolate benchmark measurements and avoid cross-benchmark contamination
  • Implement at least one benchmark per identified performance-critical path (import time, graph construction, routing dispatch, state operations, bridge lifecycle)
  • Run nox benchmark session locally and confirm all new benchmarks execute without error
  • Verify no regressions are introduced in other nox sessions (nox -e lint, nox -e typecheck, nox -e unit_tests, nox -e integration_tests)
  • Verify coverage remains ≥ 97% via nox -e coverage_report
  • Run full nox (all default sessions) and confirm clean pass
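
The setup / teardown and routing-dispatch subtasks could be approached along the lines of the sketch below, which uses ASV's `params` and `param_names` class attributes to run the same benchmark at several sizes. `StandInRouter` is a hypothetical dictionary-backed placeholder, not the project's dynamic_router or message_router; the route counts are arbitrary illustration values.

```python
class StandInRouter:
    """Hypothetical router used only to make the sketch runnable."""

    def __init__(self, route_count):
        self.routes = {f"topic-{i}": f"node-{i}" for i in range(route_count)}

    def dispatch(self, topic):
        return self.routes[topic]


class RouterDispatchSuite:
    # ASV runs every benchmark once per parameter value and passes the
    # value to setup, teardown, and the benchmark method.
    params = [10, 1000]
    param_names = ["route_count"]

    def setup(self, route_count):
        # Build the router outside the timed region.
        self.router = StandInRouter(route_count)
        self.topic = f"topic-{route_count - 1}"

    def teardown(self, route_count):
        # Release the fixture to avoid cross-benchmark contamination.
        self.router = None

    def time_dispatch(self, route_count):
        self.router.dispatch(self.topic)
```

Keeping fixture construction in `setup` means the timed method measures only the dispatch call, which is the quantity the issue asks to track.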

Definition of Done

  • All subtasks above are completed and checked off
  • benchmarks/langgraph_bench.py (or equivalent) exists and covers all performance-sensitive public behaviours of the langgraph module
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly (test(langgraph): add ASV performance benchmark suite for the langgraph module), followed by a blank line, then additional lines providing relevant details about the implementation
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly (test/missing-asv-benchmarks-langgraph)
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done
  • All nox stages pass
  • Coverage ≥ 97%

Automated by CleverAgents Bot
Supervisor: Unknown | Agent: ca-new-issue-creator

freemo added this to the v3.8.0 milestone 2026-04-04 19:55:44 +00:00
Author
Owner

Issue triaged by project owner:

  • State: Verified | MoSCoW: Could Have — ASV benchmark suite for the langgraph module.

Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: ca-project-owner

Reference
cleveragents/cleveragents-core#2797