UAT: LangGraph.execute() does not traverse graph nodes — returns immediately after sending to start stream without awaiting node execution #3821

Open
opened 2026-04-06 06:44:33 +00:00 by freemo · 0 comments
Owner

Metadata

  • Branch: fix/langgraph-execute-node-traversal
  • Commit Message: fix(langgraph): implement node traversal and async execution in LangGraph.execute()
  • Milestone: Backlog
  • Parent Epic: #392

Background

The LangGraph.execute() method in src/cleveragents/langgraph/graph.py (lines 79–91) does not traverse graph nodes. It sets the state manager's state to the input, emits a message to the start stream via stream_router.send_message(), and immediately returns the current state without waiting for any node execution.

The send_message() call invokes self.streams[stream_name].on_next(msg) — a synchronous RxPy Subject emission. The node executors registered via _register_node_executor() are async functions that are never awaited. The graph has no mechanism to:

  • Traverse edges from start → intermediate nodes → end
  • Execute each node in sequence or parallel
  • Wait for all nodes to complete before returning

This means any graph with more than just a start node will silently skip all intermediate nodes and return the unmodified input state.
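The failure mode described above can be illustrated without RxPy. The sketch below is a minimal demonstration, assuming the node executor is subscribed as a plain callback. `TinySubject` and `node_executor` are hypothetical stand-ins, not the project's classes. It shows that a synchronous `on_next()` call on an async function only creates a coroutine object, and the node body does not run until something awaits it.

```python
import asyncio


class TinySubject:
    """Hypothetical stand-in for an Rx-style Subject: on_next() calls subscribers synchronously."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def on_next(self, value):
        # Return values are collected only so this demo can inspect them.
        return [callback(value) for callback in self._subscribers]


executed = []


async def node_executor(state):
    # This body runs only if the coroutine is awaited or scheduled on the loop.
    executed.append("agent_node")
    return {**state, "visited": True}


async def demo():
    subject = TinySubject()
    subject.subscribe(node_executor)
    pending = subject.on_next({"messages": []})
    # on_next() has returned coroutine objects; no node body has executed yet.
    ran_before_await = list(executed)
    # Awaiting the coroutines is what actually runs the node.
    results = [await coro for coro in pending]
    return ran_before_await, results


ran_before_await, results = asyncio.run(demo())
```

If the real executors are scheduled some other way (for example with `asyncio.create_task`), the symptom would be the same as long as `execute()` returns before those tasks finish.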

What was tested

The LangGraph.execute() method in src/cleveragents/langgraph/graph.py (lines 79–91).

Expected behavior (from spec)

According to the specification, an Actor is a conversational unit that can be a single LLM or a complex, composed LangGraph of other actors and tool nodes. The Strategize and Execute phases of the plan lifecycle are implemented as LangGraph graphs. The nodes and edges of the graphs map to the decision-making and tool-execution steps within the plan. When execute() is called, the graph should traverse all nodes in topological order, executing each node and updating state accordingly.
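The expected traversal can be sketched with the standard library's `graphlib.TopologicalSorter`. This is an illustration under stated assumptions, not the project's implementation: `run_in_topological_order`, `make_node`, and the idea of keeping `execution_history` inside the state dict are all hypothetical, and conditional edges are ignored here.

```python
import asyncio
from graphlib import TopologicalSorter
from typing import Any, Awaitable, Callable

State = dict[str, Any]
NodeFn = Callable[[State], Awaitable[State]]


async def run_in_topological_order(
    nodes: dict[str, NodeFn],
    edges: list[tuple[str, str]],
    state: State,
) -> State:
    """Await every node once, predecessors first, threading state through."""
    sorter = TopologicalSorter()
    for name in nodes:
        sorter.add(name)
    for source, target in edges:
        sorter.add(target, source)  # target depends on source
    # static_order() raises graphlib.CycleError if the graph has a cycle.
    for name in sorter.static_order():
        state = await nodes[name](state)
    return state


def make_node(name: str) -> NodeFn:
    """Hypothetical node that only records its own execution."""

    async def node(state: State) -> State:
        history = state.get("execution_history", []) + [name]
        return {**state, "execution_history": history}

    return node


# Nodes are registered out of order on purpose; the edges decide the order.
nodes = {n: make_node(n) for n in ("end", "agent_node", "start")}
edges = [("start", "agent_node"), ("agent_node", "end")]
final_state = asyncio.run(run_in_topological_order(nodes, edges, {"messages": []}))
```

For the linear graph above, `final_state["execution_history"]` is `["start", "agent_node", "end"]`, in contrast to the empty history reported in this issue.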

Actual behavior

The LangGraph.execute() method (lines 79–91 in src/cleveragents/langgraph/graph.py) does NOT traverse graph nodes. It:

  1. Sets the state manager's state to the input
  2. Sends a message to the start stream via stream_router.send_message()
  3. Immediately returns the current state without waiting for any node execution

async def execute(self, input_data: GraphState | dict[str, Any]) -> GraphState:
    state = (...)
    # Replace state manager state for a fresh execution context
    self.state_manager.state = state
    self.state_manager.state_stream.on_next(state)
    start_stream = f"__{self.name}_node_start__"
    if start_stream in self.stream_router.streams:
        self.stream_router.send_message(start_stream, state)
    return self.state_manager.get_state()  # Returns IMMEDIATELY without awaiting nodes

Code locations

  • src/cleveragents/langgraph/graph.py lines 79–91 — execute() method
  • src/cleveragents/langgraph/graph.py lines 155–180 — _register_node_executor() method
  • src/cleveragents/langgraph/bridge.py lines 195–240 — _create_graph_executor() method

Steps to reproduce

  1. Create a LangGraph with multiple nodes (e.g., start → agent_node → end)
  2. Call await graph.execute({"messages": [{"role": "user", "content": "test"}]})
  3. Observe that execution_history is empty and the agent node was never called
  4. Observe that the returned state is identical to the input state

Impact

This is a critical architectural bug. The LangGraph class is the core execution engine for actor graphs. Any actor defined as type: graph with multiple nodes will silently fail to execute those nodes. The PlanGenerationGraph and AutoDebugAgent use LangGraph's StateGraph directly (which works correctly), but the custom LangGraph class used by the actor compiler (src/cleveragents/actor/compiler.py) produces CompiledActor objects that reference this broken LangGraph class.

Backlog note: This issue was discovered during autonomous operation
on milestone v3.2.0. It does not block milestone completion and has been
placed in the backlog for human review and future milestone assignment.

Subtasks

  • Audit LangGraph.execute() (lines 79–91) and document the full intended execution contract against the specification
  • Audit _register_node_executor() (lines 155–180) to understand how node executors are registered and what async execution model is expected
  • Audit _create_graph_executor() in bridge.py (lines 195–240) to understand the bridge's role in graph traversal
  • Design and implement topological node traversal in LangGraph.execute() — traverse edges from start node through all intermediate nodes to end node(s)
  • Implement async awaiting of each node executor in traversal order, updating GraphState after each node completes
  • Handle conditional edges (branching) correctly — evaluate edge conditions against current state to determine next node(s)
  • Handle parallel node execution where the graph topology allows it (nodes with no data dependency on each other)
  • Ensure execution_history is populated with each node's execution record
  • Write Behave unit tests (in features/) covering: single-node graph, multi-node linear graph, branching graph, and parallel-node graph
  • Write Robot Framework integration tests (in robot/) exercising a compiled actor graph end-to-end
  • Verify CompiledActor objects produced by src/cleveragents/actor/compiler.py correctly execute all nodes
  • Run nox to confirm all quality gates pass
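The conditional-edge and parallel-execution subtasks above could be approached along the following lines. This is a rough sketch under assumptions, not a proposed design: `traverse`, the `routers` mapping, and the wave-by-wave scheduling are hypothetical, and a real implementation would need proper join handling for branches of different lengths.

```python
import asyncio
from typing import Any, Awaitable, Callable

State = dict[str, Any]
NodeFn = Callable[[State], Awaitable[State]]
Router = Callable[[State], list[str]]


async def traverse(
    nodes: dict[str, NodeFn],
    routers: dict[str, Router],
    start: str,
    state: State,
    max_waves: int = 100,
) -> tuple[State, list[str]]:
    """Run nodes wave by wave; routers pick successors from the current state."""
    history: list[str] = []
    frontier = [start]
    for _ in range(max_waves):  # guard against cyclic routing
        if not frontier:
            break
        # Nodes in one wave are treated as independent and awaited concurrently.
        updates = await asyncio.gather(*(nodes[name](state) for name in frontier))
        for name, update in zip(frontier, updates):
            state = {**state, **update}
            history.append(name)
        # Evaluate each finished node's conditional edge against the merged state.
        next_frontier: list[str] = []
        for name in frontier:
            router = routers.get(name)
            for target in (router(state) if router else []):
                if target not in next_frontier:
                    next_frontier.append(target)
        frontier = next_frontier
    return state, history


async def start_node(state: State) -> State:
    return {"needs_tool": True}


async def search(state: State) -> State:
    return {"search": "done"}


async def lookup(state: State) -> State:
    return {"lookup": "done"}


async def answer(state: State) -> State:
    return {"answer": "direct"}


async def end_node(state: State) -> State:
    return {"finished": True}


nodes = {"start": start_node, "search": search, "lookup": lookup,
         "answer": answer, "end": end_node}
routers = {
    "start": lambda s: ["search", "lookup"] if s.get("needs_tool") else ["answer"],
    "search": lambda s: ["end"],
    "lookup": lambda s: ["end"],
    "answer": lambda s: ["end"],
}
final_state, history = asyncio.run(
    traverse(nodes, routers, "start", {"messages": []})
)
```

Here the `start` router selects the two tool branches, which run in the same wave, and the `answer` branch is never visited. The returned `history` list plays the role that `execution_history` is expected to play in the real class.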

Definition of Done

  • LangGraph.execute() traverses all graph nodes in correct topological order
  • Each node executor is properly awaited before proceeding to the next node
  • Conditional edges are evaluated against current state to determine traversal path
  • execution_history is populated with a record for each executed node
  • CompiledActor objects produced by the actor compiler execute all nodes correctly
  • All Behave unit tests pass (nox -e unit_tests)
  • All Robot Framework integration tests pass (nox -e integration_tests)
  • Pyright type checking passes with no errors (nox -e typecheck)
  • Linting passes (nox -e lint)
  • All nox stages pass
  • Coverage >= 97%

Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: ca-new-issue-creator

No milestone
No project
No assignees
1 participant
Blocks
#392 Epic: Actor YAML & Compiler
cleveragents/cleveragents-core
Reference
cleveragents/cleveragents-core#3821