feat(acms): In-actor conversation history compaction for long Execute sessions #6761

Open
opened 2026-04-10 02:02:22 +00:00 by drew · 4 comments
Member

Overview

Add a lightweight compaction step to the LangGraph actor execution loop that summarises accumulated within-session message history when it exceeds a token threshold. This is distinct from — and complementary to — the ACMS, which manages context assembly before an actor turn begins.

Gap Being Filled

The ACMS assembles context at the start of each actor invocation from the indexed knowledge graph. It does not manage what happens to the LangGraph message history during a long Execute actor session involving many tool call turns.

If an Execute actor makes 40–60 tool calls, the accumulated LangGraph message history (tool calls + results + model responses) grows continuously and will eventually hit the model's context window limit. The ACMS does not help here — it only runs at invocation start, not mid-session.

The gap: Nothing in the current architecture manages conversation overflow within a running actor session. The ACMS solves retrieval; this solves accumulation.

Proposed Design

Add a compaction hook to the LangGraph actor runner (src/cleveragents/langgraph/) that:

  1. Monitors accumulated message history token count after each tool-call turn
  2. Triggers when the count exceeds a configurable threshold (e.g., 75% of the model's context window minus the ACMS hot context budget)
  3. Summarises the tool call + result history using context.summarize.model (the existing cheap model config), replacing the oldest N turns with a compact summary
  4. Preserves the most recent K turns intact (configurable, e.g. keep_recent_turns: 8) so the actor retains full fidelity for its current work
  5. Circuit-breaker: Disable compaction for the rest of the session after 3 consecutive failures (same pattern as ACMS ParallelStrategyExecutor)

The ACMS continues to manage external context retrieval. This hook handles only internal message growth. They are complementary, not competing.

Inspired by Claude Code's 9-section auto-compact template (revealed in the March 2026 leak), the summarisation prompt should produce:

  1. What was being worked on (primary task context)
  2. Key files and code sections touched
  3. Decisions made and why
  4. Errors encountered and how they were resolved
  5. Tool calls made (compact list with outcomes)
  6. Pending items (what still needs to be done)

This structure preserves what an actor needs to resume effectively without the full message history.

Inspiration from Claude Code

Claude Code's 5-layer context compaction cascade was documented in the March 2026 source map leak. CC faces this problem acutely because it runs a while(true) agent loop where conversation history accumulates indefinitely within a session.

CleverAgents' ACMS solves context management at the architecture level for most cases. However, the within-session accumulation problem is the same regardless of how the initial context was assembled. The specific patterns adapted here:

  • Cheapest-first: Free lossless operations (drop old tool results) before any LLM summarisation call
  • Circuit breaker: After N consecutive compaction failures, disable and surface the issue rather than looping
  • Structured summarisation template: Specific 9-section structure that preserves exactly what's needed for resumption

Recommendation

If approved, update:

  • Specification: Add a new subsection under §Architecture > Actor and Agent Architecture documenting within-session compaction semantics
  • ADR-022 (LangChain/LangGraph Integration): Document the compaction hook as a lifecycle concern
  • Config reference: Add actor.compaction.threshold, actor.compaction.keep-recent-turns, and actor.compaction.model (defaulting to context.summarize.model)
  • Implementation: Add compaction logic to src/cleveragents/langgraph/nodes.py or a new src/cleveragents/langgraph/compaction.py
## Overview Add a lightweight compaction step to the LangGraph actor execution loop that summarises accumulated within-session message history when it exceeds a token threshold. This is distinct from — and complementary to — the ACMS, which manages context assembly *before* an actor turn begins. ## Gap Being Filled The ACMS assembles context at the **start** of each actor invocation from the indexed knowledge graph. It does not manage what happens to the LangGraph message history **during** a long Execute actor session involving many tool call turns. If an Execute actor makes 40–60 tool calls, the accumulated LangGraph message history (tool calls + results + model responses) grows continuously and will eventually hit the model's context window limit. The ACMS does not help here — it only runs at invocation start, not mid-session. **The gap:** Nothing in the current architecture manages conversation overflow *within* a running actor session. The ACMS solves retrieval; this solves accumulation. ## Proposed Design Add a compaction hook to the LangGraph actor runner (`src/cleveragents/langgraph/`) that: 1. **Monitors** accumulated message history token count after each tool-call turn 2. **Triggers** when the count exceeds a configurable threshold (e.g., 75% of the model's context window minus the ACMS hot context budget) 3. **Summarises** the tool call + result history using `context.summarize.model` (the existing cheap model config), replacing the oldest N turns with a compact summary 4. **Preserves** the most recent K turns intact (configurable, e.g. `keep_recent_turns: 8`) so the actor retains full fidelity for its current work 5. **Circuit-breaker:** Disable compaction for the rest of the session after 3 consecutive failures (same pattern as ACMS `ParallelStrategyExecutor`) The ACMS continues to manage external context retrieval. This hook handles only internal message growth. They are complementary, not competing. ## Recommended Summary Structure Inspired by Claude Code's 9-section auto-compact template (revealed in the March 2026 leak), the summarisation prompt should produce: 1. What was being worked on (primary task context) 2. Key files and code sections touched 3. Decisions made and why 4. Errors encountered and how they were resolved 5. Tool calls made (compact list with outcomes) 6. Pending items (what still needs to be done) This structure preserves what an actor needs to resume effectively without the full message history. ## Inspiration from Claude Code Claude Code's **5-layer context compaction cascade** was documented in the March 2026 source map leak. CC faces this problem acutely because it runs a `while(true)` agent loop where conversation history accumulates indefinitely within a session. CleverAgents' ACMS solves context management at the architecture level for most cases. However, the within-session accumulation problem is the same regardless of how the initial context was assembled. The specific patterns adapted here: - **Cheapest-first**: Free lossless operations (drop old tool results) before any LLM summarisation call - **Circuit breaker**: After N consecutive compaction failures, disable and surface the issue rather than looping - **Structured summarisation template**: Specific 9-section structure that preserves exactly what's needed for resumption ## Recommendation If approved, update: - **Specification**: Add a new subsection under §Architecture > Actor and Agent Architecture documenting within-session compaction semantics - **ADR-022** (LangChain/LangGraph Integration): Document the compaction hook as a lifecycle concern - **Config reference**: Add `actor.compaction.threshold`, `actor.compaction.keep-recent-turns`, and `actor.compaction.model` (defaulting to `context.summarize.model`) - **Implementation**: Add compaction logic to `src/cleveragents/langgraph/nodes.py` or a new `src/cleveragents/langgraph/compaction.py`
Owner

Thank you for filing this proposal, @drew. I have reviewed it.

This issue carries the Needs Feedback label, indicating it is a proposal awaiting project owner review. I will not modify its state, priority, or milestone — those decisions belong to the project owner (@freemo).

Summary of the proposal: Add a lightweight compaction step to the LangGraph actor execution loop that summarises accumulated within-session message history when it exceeds a token threshold. This is complementary to (not competing with) the ACMS, which manages context assembly before an actor turn begins. The ACMS solves retrieval; this solves accumulation during long Execute sessions.

Completeness assessment: This proposal is well-structured and detailed. It includes:

  • Clear gap analysis distinguishing this from the existing ACMS
  • Concrete proposed design with 5-step algorithm
  • Recommended 6-section summary structure
  • Circuit-breaker pattern for resilience
  • Inspiration from Claude Code's 5-layer context compaction cascade
  • Specific files and spec sections to update if approved

What happens next: The project owner (@freemo) needs to review this proposal and either:

  1. Approve it — at which point the Needs Feedback label is removed and implementation issues are created
  2. Request changes — provide feedback in comments
  3. Decline it — close with a State/Wont Do label and explanation

I will monitor this issue and respond to any questions or comments.


Automated by CleverAgents Bot
Supervisor: Human Liaison | Agent: human-liaison

Thank you for filing this proposal, @drew. I have reviewed it. This issue carries the `Needs Feedback` label, indicating it is a proposal awaiting project owner review. I will not modify its state, priority, or milestone — those decisions belong to the project owner (@freemo). **Summary of the proposal:** Add a lightweight compaction step to the LangGraph actor execution loop that summarises accumulated within-session message history when it exceeds a token threshold. This is complementary to (not competing with) the ACMS, which manages context assembly before an actor turn begins. The ACMS solves retrieval; this solves accumulation during long Execute sessions. **Completeness assessment:** This proposal is well-structured and detailed. It includes: - Clear gap analysis distinguishing this from the existing ACMS - Concrete proposed design with 5-step algorithm - Recommended 6-section summary structure - Circuit-breaker pattern for resilience - Inspiration from Claude Code's 5-layer context compaction cascade - Specific files and spec sections to update if approved **What happens next:** The project owner (@freemo) needs to review this proposal and either: 1. Approve it — at which point the `Needs Feedback` label is removed and implementation issues are created 2. Request changes — provide feedback in comments 3. Decline it — close with a `State/Wont Do` label and explanation I will monitor this issue and respond to any questions or comments. --- **Automated by CleverAgents Bot** Supervisor: Human Liaison | Agent: human-liaison
Owner

Update: The architecture agent has reviewed this proposal and created PR #6884 with the corresponding specification update.

Status: PR #6884 is open and awaiting human review from @freemo. Once approved and merged, this issue will be closed and implementation issues will be created.

The spec PR covers all four proposals from @drew (#6760, #6761, #6763, #6765) in a single architectural cycle.


Automated by CleverAgents Bot
Supervisor: Human Liaison | Agent: human-liaison

Update: The architecture agent has reviewed this proposal and created PR #6884 with the corresponding specification update. **Status:** PR #6884 is open and awaiting human review from @freemo. Once approved and merged, this issue will be closed and implementation issues will be created. The spec PR covers all four proposals from @drew (#6760, #6761, #6763, #6765) in a single architectural cycle. --- **Automated by CleverAgents Bot** Supervisor: Human Liaison | Agent: human-liaison
Owner

Label compliance fix applied: Added missing Priority/Backlog label. Feature proposals without a milestone default to backlog priority per CONTRIBUTING.md.


Automated by CleverAgents Bot
Supervisor: Backlog Grooming | Agent: backlog-groomer

Label compliance fix applied: Added missing `Priority/Backlog` label. Feature proposals without a milestone default to backlog priority per CONTRIBUTING.md. --- **Automated by CleverAgents Bot** Supervisor: Backlog Grooming | Agent: backlog-groomer
Owner

Verified — Feature discussion: in-actor conversation history compaction. MoSCoW: Could-have. Priority: Low — future enhancement.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

✅ **Verified** — Feature discussion: in-actor conversation history compaction. MoSCoW: Could-have. Priority: Low — future enhancement. --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: project-owner-pool-supervisor
Sign in to join this conversation.
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Reference
cleveragents/cleveragents-core#6761
No description provided.