UAT: ContextAssemblyPipeline filters candidates to single strategy — ConfidenceWeightedSelector multi-strategy fusion bypassed #6342

Open
opened 2026-04-09 20:12:19 +00:00 by HAL9000 · 0 comments

Bug Report

Spec Reference

docs/specification.md — ACMS > Context Assembly Pipeline > Phase 1: Strategy Orchestration (~lines 42618, 42918)

The spec defines Phase 1 as:

  1. ConfidenceWeightedSelector evaluates ALL registered strategies and returns confidence-weighted (strategy, confidence) pairs.
  2. ProportionalBudgetAllocator distributes token budget proportionally across the selected strategies.
  3. ParallelStrategyExecutor runs multiple strategies concurrently and collects combined results.

This is the foundation of "multi-strategy fusion" — different strategies (semantic, recency, tiered, graph) each contribute fragments within their allocated budget.
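The three steps above can be sketched as a small, self-contained flow. This is only an illustration of the intended shape, not the project's code: `Strategy`, `select`, `allocate`, and `execute` are hypothetical stand-ins for `ConfidenceWeightedSelector`, `ProportionalBudgetAllocator`, and `ParallelStrategyExecutor`, and the confidence values are made up.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


# Hypothetical stand-in: a strategy has a name, a confidence score, and a
# function that returns fragments for a given token budget.
@dataclass(frozen=True)
class Strategy:
    name: str
    confidence: float
    run: Callable[[int], list[str]]


def select(strategies):
    """Step 1: return (strategy, confidence) pairs, highest confidence first."""
    pairs = [(s, s.confidence) for s in strategies]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def allocate(candidates, max_tokens):
    """Step 2: split the token budget in proportion to confidence."""
    total = sum(conf for _, conf in candidates)
    return {s.name: int(max_tokens * conf / total) for s, conf in candidates}


def execute(candidates, budgets):
    """Step 3: run every selected strategy concurrently and collect results."""
    with ThreadPoolExecutor() as pool:
        futures = {s.name: pool.submit(s.run, budgets[s.name]) for s, _ in candidates}
        return {name: future.result() for name, future in futures.items()}


strategies = [
    Strategy("semantic", 0.5, lambda budget: [f"semantic:{budget}"]),
    Strategy("recency", 0.25, lambda budget: [f"recency:{budget}"]),
    Strategy("tiered", 0.25, lambda budget: [f"tiered:{budget}"]),
]
candidates = select(strategies)
budgets = allocate(candidates, max_tokens=2000)
results = execute(candidates, budgets)
```

In this sketch every strategy receives a share of the budget and contributes fragments, which is the behavior the spec calls multi-strategy fusion.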

Code Locations

File 1: /app/src/cleveragents/application/services/acms_pipeline.py
Lines: ~480–487 (ContextAssemblyPipeline.assemble())

File 2: /app/src/cleveragents/application/services/acms_service.py
Lines: ~630–637 (ACMSPipeline.assemble())

Finding

Both ACMSPipeline.assemble() and ContextAssemblyPipeline.assemble() contain identical code that filters the ConfidenceWeightedSelector's output down to a single strategy immediately after selection:

# In ContextAssemblyPipeline.assemble() — acms_pipeline.py ~line 480
candidates = self._strategy_selector.select(
    all_strategies,
    {"strategy": strategy_name},
)
# IMMEDIATELY filter to only the explicitly requested strategy:
candidates = [(s, c) for s, c in candidates if s.name == strategy_name] or [
    (resolved, 1.0)
]

The same filter appears in ACMSPipeline.assemble() in acms_service.py (~lines 630–637). A code comment at this location acknowledges the limitation:

"future multi-strategy support will remove this filter"

This means:

  • ConfidenceWeightedSelector returns multiple (strategy, confidence) pairs ranked by confidence/preference boost — but all except one are immediately discarded.
  • ProportionalBudgetAllocator.allocate() always receives exactly one candidate, so it always gives 100% of the budget to a single strategy.
  • ParallelStrategyExecutor.execute() always runs exactly one strategy (falling back to the synchronous _execute_single() path).
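The collapse can be reproduced in isolation. The sketch below copies the filter expression quoted in the Finding and feeds it a made-up selector result. `Strategy`, `filter_to_requested`, and `proportional_shares` are hypothetical helpers written for this illustration, and the proportional split is an assumed reading of what the allocator does.

```python
from collections import namedtuple

# Hypothetical stand-in for a strategy object; only .name matters here.
Strategy = namedtuple("Strategy", "name")


def filter_to_requested(candidates, strategy_name, resolved):
    """Same expression as the quoted filter: keep the named strategy only,
    or fall back to (resolved, 1.0) if the selector did not return it."""
    return [(s, c) for s, c in candidates if s.name == strategy_name] or [
        (resolved, 1.0)
    ]


def proportional_shares(candidates, max_tokens):
    """Assumed allocator behavior: split the budget by relative confidence."""
    total = sum(c for _, c in candidates)
    return {s.name: int(max_tokens * c / total) for s, c in candidates}


semantic = Strategy("semantic")
recency = Strategy("recency")
relevance = Strategy("relevance")

# Made-up selector output: three ranked (strategy, confidence) pairs.
selected = [(semantic, 0.5), (relevance, 0.25), (recency, 0.25)]

kept = filter_to_requested(selected, "relevance", resolved=relevance)
shares = proportional_shares(kept, max_tokens=2048)

# If the requested strategy is missing from the selector output,
# the fallback still yields exactly one candidate.
graph = Strategy("graph")
fallback = filter_to_requested(selected, "graph", resolved=graph)
```

Whatever the selector returns, `kept` holds a single pair, so a proportional split has nothing to divide and the one surviving strategy receives the full 2048 tokens.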

Impact

The entire Phase 1 architecture (ConfidenceWeightedSelector, ProportionalBudgetAllocator, ParallelStrategyExecutor with circuit-breaking) is effectively inert for its primary purpose:

  • The confidence-weighting logic in ConfidenceWeightedSelector (including preferred_strategies boost) is computed and then thrown away.
  • The min_useful_budget filtering in ProportionalBudgetAllocator is never exercised across multiple strategies.
  • The ParallelStrategyExecutor's ThreadPoolExecutor and circuit-breaker are never utilized with more than one strategy.
  • Context assembly never benefits from combining fragments from e.g. semantic + recency + tiered strategies simultaneously.
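To make the `min_useful_budget` point concrete, here is one plausible reading of such a filter, written as a standalone sketch. The function name, the drop-and-redistribute rule, and the numbers are all assumptions for illustration. The real `ProportionalBudgetAllocator` may behave differently.

```python
def allocate_with_floor(confidences, max_tokens, min_useful_budget):
    """Split max_tokens by confidence, dropping shares below the floor.

    Assumed rule: while any share is below min_useful_budget, drop the
    lowest-confidence offender and recompute shares over the survivors.
    """
    active = dict(confidences)
    while active:
        total = sum(active.values())
        shares = {name: int(max_tokens * conf / total) for name, conf in active.items()}
        too_small = [name for name, share in shares.items() if share < min_useful_budget]
        if not too_small:
            return shares
        weakest = min(too_small, key=lambda name: active[name])
        del active[weakest]
    return {}


# With several candidates the floor changes the outcome: "graph" would get
# only 128 tokens, so it is dropped and its share is redistributed.
multi = allocate_with_floor(
    {"semantic": 0.5, "recency": 0.375, "graph": 0.125},
    max_tokens=1024,
    min_useful_budget=200,
)

# With the single candidate the pipeline currently passes, the floor is a no-op.
single = allocate_with_floor({"relevance": 1.0}, max_tokens=1024, min_useful_budget=200)
```

Under this reading, the floor only matters when two or more strategies compete for the budget, which never happens while the filter is in place.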

Steps to Reproduce

# Assumes ContextAssemblyPipeline, ContextBudget, SemanticStrategy and
# RecencyStrategy are already imported, and that `frags` is a prepared
# list of context fragments.
pipeline = ContextAssemblyPipeline()

# Register multiple strategies
pipeline.register_strategy("semantic", SemanticStrategy())
pipeline.register_strategy("recency", RecencyStrategy())

# Assemble with the default "relevance" strategy
payload = pipeline.assemble(
    plan_id="01ARZ3NDEKTSV4RRFFQ69G5FAV",
    fragments=frags,
    budget=ContextBudget(max_tokens=2048),
    strategy="relevance",
)

# This assertion passes today: "semantic" and "recency" are registered
# but never run, so strategies_used is always ("relevance",).
assert payload.strategies_used == ("relevance",)

Expected

ContextAssemblyPipeline.assemble() passes all candidates from ConfidenceWeightedSelector to ProportionalBudgetAllocator, which distributes budget proportionally. ParallelStrategyExecutor runs the top N strategies concurrently. The resulting context payload reflects multi-strategy fusion.

Actual

Only the single explicitly-named strategy is ever invoked. The strategy parameter to assemble() acts as a hard selector rather than a hint. All other strategies are always ignored regardless of their confidence scores.
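For contrast, treating the `strategy` argument as a hint could look roughly like the following. This is a sketch of one possible approach, not a proposed patch: `apply_strategy_hint` and the `boost` factor are hypothetical, candidates are simplified to `(name, confidence)` tuples, and the confidence values are made up.

```python
def apply_strategy_hint(candidates, hinted_name, boost=2.0):
    """Raise the hinted strategy's confidence, keep the others, and re-rank.

    Unlike a hard filter, no candidate is removed. The hint only changes
    the ordering and the relative weights seen by later stages.
    """
    boosted = [
        (name, conf * boost if name == hinted_name else conf)
        for name, conf in candidates
    ]
    return sorted(boosted, key=lambda pair: pair[1], reverse=True)


candidates = [("semantic", 0.5), ("relevance", 0.375), ("recency", 0.125)]
ranked = apply_strategy_hint(candidates, "relevance")
```

Here "relevance" moves to the top with a confidence of 0.75, while "semantic" and "recency" remain available for budget allocation and concurrent execution.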


Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: uat-tester

HAL9000 added this to the v3.2.0 milestone 2026-04-09 21:09:36 +00:00
Reference
cleveragents/cleveragents-core#6342