bug(context_tiers): ContextTierService._summarize_for_cold() is a stub that truncates content — LLM-based context summarization not implemented #3603

Open
opened 2026-04-05 20:18:48 +00:00 by freemo · 0 comments
Owner

Metadata

  • Branch: fix/context-tier-llm-summarization
  • Commit Message: fix(context_tiers): implement LLM-based summarization in _summarize_for_cold()
  • Milestone: None
  • Parent Epic: #396

Background

The ContextTierService._summarize_for_cold() method in context_tiers.py is a documented stub: it truncates content to 200 characters instead of calling an LLM summarizer. As a result, context summarization, a core ACMS feature for managing context within token limits, is not implemented.

Per docs/specification.md §44662 and configuration keys context.summarize.enabled, context.summarize.max-tokens, context.summarize.model:

Context summarization should produce a meaningful summary of fragment content when demoting to cold storage, using the configured summarization model (defaulting to the actor's own model).

The spec defines context.summarize.max-tokens (default: 1000) and context.summarize.model as configuration keys, implying a real LLM call should be made.

Current Behavior

In src/cleveragents/application/services/context_tiers.py, the _summarize_for_cold static method (line ~330):

@staticmethod
def _summarize_for_cold(fragment: TieredFragment) -> TieredFragment:
    """Summarisation hook: truncate content for cold storage.

    This is a stub implementation that truncates the content.
    A production implementation would call an LLM summariser.
    """
    content = fragment.content
    if len(content) > _COLD_SUMMARY_MAX_CHARS:
        content = content[:_COLD_SUMMARY_MAX_CHARS] + "..."
    return fragment.model_copy(update={"content": content})

Here _COLD_SUMMARY_MAX_CHARS = 200. As a result:

  1. Any fragment longer than 200 characters is cut to its first 200 characters plus a trailing "..." when demoted to cold storage; shorter fragments are stored unchanged
  2. The context.summarize.enabled, context.summarize.max-tokens, and context.summarize.model configuration keys have no effect on cold-tier demotion
  3. Historical context in cold storage carries little usable information (200 characters is rarely enough to preserve meaningful context)
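
To make the effect of the stub concrete, the snippet below reproduces its truncation logic on a plain string. It is a standalone sketch: truncate_for_cold is a hypothetical helper that mirrors the body of _summarize_for_cold() without the TieredFragment wrapper, and the sample content is made up.

```python
_COLD_SUMMARY_MAX_CHARS = 200  # value of the constant quoted in this issue


def truncate_for_cold(content: str) -> str:
    """Mirror the stub: keep the first 200 characters and append '...'."""
    if len(content) > _COLD_SUMMARY_MAX_CHARS:
        return content[:_COLD_SUMMARY_MAX_CHARS] + "..."
    return content


if __name__ == "__main__":
    original = "decision: " + "x" * 1990  # made-up 2000-character fragment
    stored = truncate_for_cold(original)
    print(len(original), len(stored))  # 2000 203
```

Everything past the first 200 characters is discarded regardless of what it contained, which is why the configuration keys cannot influence the result.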

Code Location:

  • src/cleveragents/application/services/context_tiers.py lines ~326–337 (_summarize_for_cold)
  • _COLD_SUMMARY_MAX_CHARS = 200 constant at line ~28

Expected Behavior

_summarize_for_cold() should:

  1. Check context.summarize.enabled — if disabled, fall back to truncation (current behavior)
  2. When enabled, invoke the configured LLM (via context.summarize.model, defaulting to the actor's own model) to produce a meaningful summary of the fragment content
  3. Respect context.summarize.max-tokens (default: 1000) as the token budget for the summary
  4. Return a TieredFragment with the LLM-generated summary as its content
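
The four steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: SummarizeConfig, the Summarizer protocol, the actor_model parameter, and CannedSummarizer are hypothetical names, and TieredFragment is replaced by a minimal stand-in dataclass so the snippet runs on its own. Because the current method is a @staticmethod that receives only the fragment, the sketch passes the configuration and the summarizer in explicitly; how the real service obtains them is left open.

```python
from dataclasses import dataclass, replace
from typing import Optional, Protocol, Tuple

_COLD_SUMMARY_MAX_CHARS = 200  # existing truncation limit, kept for the fallback


@dataclass(frozen=True)
class TieredFragment:
    """Stand-in for the real model; only `content` matters here."""
    content: str


@dataclass(frozen=True)
class SummarizeConfig:
    """Hypothetical holder for the context.summarize.* keys."""
    enabled: bool
    max_tokens: int = 1000       # context.summarize.max-tokens default
    model: Optional[str] = None  # None means "use the actor's own model"


class Summarizer(Protocol):
    """Hypothetical LLM port; the real client interface may differ."""
    def summarize(self, text: str, *, model: str, max_tokens: int) -> str: ...


class CannedSummarizer:
    """Fake summarizer for the usage example; returns a fixed string."""
    def __init__(self, reply: str) -> None:
        self.reply = reply
        self.last_call: Optional[Tuple[str, int]] = None

    def summarize(self, text: str, *, model: str, max_tokens: int) -> str:
        self.last_call = (model, max_tokens)
        return self.reply


def summarize_for_cold(
    fragment: TieredFragment,
    config: SummarizeConfig,
    summarizer: Summarizer,
    actor_model: str,
) -> TieredFragment:
    """Summarize via the LLM when enabled, otherwise truncate as today."""
    if not config.enabled:
        # Step 1: summarization disabled, keep the current truncation.
        content = fragment.content
        if len(content) > _COLD_SUMMARY_MAX_CHARS:
            content = content[:_COLD_SUMMARY_MAX_CHARS] + "..."
        return replace(fragment, content=content)
    # Steps 2-3: pick the configured model (or the actor's) and pass the budget.
    summary = summarizer.summarize(
        fragment.content,
        model=config.model or actor_model,
        max_tokens=config.max_tokens,
    )
    # Step 4: return a fragment whose content is the summary.
    return replace(fragment, content=summary)


if __name__ == "__main__":
    fragment = TieredFragment(content="x" * 500)
    fake = CannedSummarizer("short summary")
    result = summarize_for_cold(
        fragment, SummarizeConfig(enabled=True), fake, actor_model="actor-model"
    )
    print(result.content, fake.last_call)
```

Keeping the truncation path inside the same function means the disabled case behaves exactly as it does today, which is what the existing scenarios presumably rely on.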

Impact

Context summarization is a key ACMS feature for handling large context windows efficiently. Without it, cold-tier storage contains only truncated fragments that cannot provide meaningful historical context. The TemporalArchaeologyStrategy and PlanDecisionContextStrategy that rely on cold-tier content will return low-quality results.

Backlog note: This issue was discovered during autonomous operation
on milestone v3.4.0. It does not block milestone completion and has been
placed in the backlog for human review and future milestone assignment.

Subtasks

  • Write a failing Behave scenario in features/ that reproduces the stub behavior (TDD)
  • Implement LLM invocation in _summarize_for_cold() using the configured summarization model
  • Respect context.summarize.enabled config key (fall back to truncation when disabled)
  • Respect context.summarize.max-tokens config key (default: 1000) as the token budget
  • Respect context.summarize.model config key (default: actor's own model)
  • Add mock LLM summarizer in features/mocks/ for unit test isolation
  • Ensure all existing ContextTierService Behave scenarios still pass
  • Add type annotations to all new/modified functions and verify with nox -e typecheck
  • Update docstrings to reflect the real implementation
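
For the mock summarizer subtask, a test double along these lines may be enough. It is a sketch only: the MockSummarizer name and the summarize(text, model=..., max_tokens=...) signature are assumptions made for illustration, and the real mock in features/mocks/ would need to match whatever LLM interface the service ends up using.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class MockSummarizer:
    """Hypothetical test double: returns a canned summary, records each call."""
    canned_summary: str = "mock summary"
    # Each entry is (text, model, max_tokens) in call order.
    calls: List[Tuple[str, str, int]] = field(default_factory=list)

    def summarize(self, text: str, *, model: str, max_tokens: int) -> str:
        self.calls.append((text, model, max_tokens))
        return self.canned_summary


if __name__ == "__main__":
    mock = MockSummarizer()
    summary = mock.summarize("long fragment text", model="actor-model", max_tokens=1000)
    print(summary, mock.calls)
```

Recording the arguments lets a scenario assert both that summarization was invoked on cold-tier demotion and that the configured model and token budget were passed through, without any network access.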

Definition of Done

  • All subtasks above are complete
  • A Behave scenario exists that verifies LLM summarization is called on cold-tier demotion
  • A Behave scenario exists that verifies truncation fallback when context.summarize.enabled=false
  • TemporalArchaeologyStrategy and PlanDecisionContextStrategy integration tests pass with meaningful cold-tier content
  • Commit created with message: fix(context_tiers): implement LLM-based summarization in _summarize_for_cold()
  • Branch: fix/context-tier-llm-summarization
  • PR merged with Closes #<this issue>
  • All nox stages pass
  • Coverage >= 97%

Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: ca-new-issue-creator

freemo added this to the v3.4.0 milestone 2026-04-05 20:23:29 +00:00
freemo removed this from the v3.4.0 milestone 2026-04-06 21:04:25 +00:00
Blocks
#396 Epic: ACMS Context Pipeline
cleveragents/cleveragents-core
Reference
cleveragents/cleveragents-core#3603