UAT: LangGraph interrupt/resume for human-in-the-loop not implemented — LangGraph class has no interrupt mechanism, is_running flag is never checked #3672

Open
opened 2026-04-05 21:20:15 +00:00 by freemo · 0 comments
Owner

Metadata

  • Branch: fix/langgraph-interrupt-resume-human-in-the-loop
  • Commit Message: fix(langgraph): implement interrupt/resume mechanism for human-in-the-loop support
  • Milestone: (none — backlog)
  • Parent Epic: #366

Backlog note: This issue was discovered during autonomous operation
on milestone v3.6.0. It does not block milestone completion and has been
placed in the backlog for human review and future milestone assignment.

Background

Per docs/specification.md (lines 28837–28851), the system requires human-in-the-loop support with interruptibility as a core design principle:

| Interruptibility | Pause / cancel / retry at any point |

The spec also references LangGraph's interrupt/resume capability for human-in-the-loop approval gates (line 23444):

| Task input-required state | Human-in-the-loop approval — maps to automation profile gates |

And for actor state rollback (line 28674):

  1. Restore actor state: Load the target decision's context_snapshot.actor_state_ref (LangGraph checkpoint). This restores the strategy actor to the exact reasoning state it was in when the decision was made.

Current Behavior

The LangGraph class in src/cleveragents/langgraph/graph.py has:

  1. An is_running: bool flag (line 75) that is set to True in start() and False in stop()
  2. A stop() method (lines 264–265) that sets is_running = False

However:

  • The is_running flag is never checked during node execution — nodes continue executing even after stop() is called
  • There is no interrupt() method to pause execution mid-graph
  • There is no resume() method to continue from a paused state
  • There is no mechanism to inject human input into a running graph
  • The StateManager has time_travel() for going back to previous states, but no forward-resume capability

The RxPyLangGraphBridge has cancel_task_with_reason() (lines 98–122) for cancelling async tasks, but this is a hard cancel, not a pause/resume.
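
The unchecked-flag problem described above can be reproduced in isolation. The following is a minimal sketch, not the code in graph.py: `TinyGraph`, `execute_unchecked`, `execute_checked`, and `make_graph` are hypothetical names, and the graph is assumed to be a simple linear sequence of nodes. It shows why a flag that is only written by stop() cannot affect a loop that never reads it.

```python
from typing import Callable

State = dict
Node = Callable[[State], State]


class TinyGraph:
    """Hypothetical stand-in for the LangGraph class; not the real implementation."""

    def __init__(self, nodes: list[Node]) -> None:
        self.nodes = nodes
        self.is_running = False

    def stop(self) -> None:
        # Mirrors the reported behaviour: only the flag changes.
        self.is_running = False

    def execute_unchecked(self, state: State) -> State:
        # The loop never reads is_running, so stop() cannot affect it.
        self.is_running = True
        for node in self.nodes:
            state = node(state)
        return state

    def execute_checked(self, state: State) -> State:
        # Reading the flag between nodes lets stop() take effect
        # once the current node has returned.
        self.is_running = True
        for node in self.nodes:
            if not self.is_running:
                break
            state = node(state)
        return state


def make_graph() -> TinyGraph:
    graph = TinyGraph([])

    def first(state: State) -> State:
        graph.stop()  # simulate a stop request arriving mid-run
        return {**state, "visited": state["visited"] + ["first"]}

    def second(state: State) -> State:
        return {**state, "visited": state["visited"] + ["second"]}

    graph.nodes = [first, second]
    return graph


unchecked = make_graph().execute_unchecked({"visited": []})
checked = make_graph().execute_checked({"visited": []})
print(unchecked["visited"])  # ['first', 'second']
print(checked["visited"])    # ['first']
```

In the unchecked variant the second node still runs after stop() is called, which matches the behaviour reported here. The checked variant only stops; it keeps no position or state, so it is still not a pause/resume mechanism.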

Expected Behavior

For human-in-the-loop support, the LangGraph class should support:

  1. Interrupt: Pause execution after the current node completes, saving state
  2. Resume: Continue execution from the paused state, optionally with injected human input
  3. Input injection: Allow human input to be added to the graph state before resuming

This is consistent with LangGraph's native interrupt() and Command(resume=...) patterns.

Code Location

  • src/cleveragents/langgraph/graph.py lines 264–265: stop() — sets flag but doesn't interrupt running nodes
  • src/cleveragents/langgraph/graph.py lines 74–75: is_running flag — never checked during execution
  • src/cleveragents/langgraph/graph.py lines 79–91: execute() — no interrupt check between nodes

Impact

The human-in-the-loop feature is not functional. Actors cannot be paused mid-execution for human approval. The stop() method has no effect on in-progress graph execution. This blocks the automation profile gates feature (which requires input-required state transitions).

Subtasks

  • Add interrupt_requested: bool flag to LangGraph
  • Check interrupt_requested between node executions in execute() (once execute() is fixed per #3604)
  • Add interrupt() method that sets interrupt_requested = True and saves current state
  • Add resume(input_data: dict | None = None) method that clears interrupt_requested and continues execution
  • Write Behave unit tests verifying interrupt/resume behavior
  • Verify all nox stages pass; coverage ≥ 97%
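
One possible shape for the subtasks above, as a hedged sketch only. `PausableGraph`, `_run`, `_next_index`, `_saved_state`, and `is_paused` are hypothetical names introduced for illustration, and the graph is assumed to be linear. Unlike the subtask wording, the state is saved inside the loop rather than inside interrupt(), because in this sketch that is the only place where the current state is known. `Optional[dict]` is used in place of `dict | None` purely so the sketch also runs on Python 3.9.

```python
from typing import Callable, Optional

State = dict
Node = Callable[[State], State]


class PausableGraph:
    """Hypothetical linear graph; illustrates the subtasks, not the real class."""

    def __init__(self, nodes: list[Node]) -> None:
        self.nodes = nodes
        self.interrupt_requested = False
        self._next_index = 0  # position of the next node to run
        self._saved_state: Optional[State] = None

    @property
    def is_paused(self) -> bool:
        return self._saved_state is not None

    def interrupt(self) -> None:
        # Request a pause; it takes effect after the current node returns.
        self.interrupt_requested = True

    def execute(self, state: State) -> State:
        self._next_index = 0
        self._saved_state = None
        return self._run(state)

    def resume(self, input_data: Optional[dict] = None) -> State:
        if self._saved_state is None:
            raise RuntimeError("graph is not paused")
        self.interrupt_requested = False
        # Merge injected human input into the saved state before continuing.
        state = {**self._saved_state, **(input_data or {})}
        self._saved_state = None
        return self._run(state)

    def _run(self, state: State) -> State:
        while self._next_index < len(self.nodes):
            if self.interrupt_requested:
                # Keep a copy so callers cannot mutate the saved state.
                self._saved_state = dict(state)
                return state
            state = self.nodes[self._next_index](state)
            self._next_index += 1
        return state


graph = PausableGraph([])


def draft(state: State) -> State:
    graph.interrupt()  # simulate an approval gate raised during this node
    return {**state, "draft": "plan v1"}


def apply_plan(state: State) -> State:
    return {**state, "applied": state.get("approved", False)}


graph.nodes = [draft, apply_plan]

paused = graph.execute({})
print(graph.is_paused, paused)  # True {'draft': 'plan v1'}
final = graph.resume({"approved": True})
print(final)  # {'draft': 'plan v1', 'approved': True, 'applied': True}
```

The sketch leaves several questions open that the real implementation would need to settle: whether a pause requested during the last node should be honoured, how interrupt() is signalled safely from another thread or task, and how the saved state relates to the existing StateManager checkpoints.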

Definition of Done

  • interrupt() pauses execution after current node
  • resume() continues execution from paused state
  • Human input can be injected via resume(input_data=...)
  • Unit tests pass for interrupt/resume behavior
  • All nox stages pass
  • Coverage ≥ 97%

Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: ca-new-issue-creator

Blocks
#366 Epic: Post-MVP Deferred Work