[AUTO-UAT-6] ContextFragment metadata not sanitized in execute_phase_context_assembler — non-string values passed to dict[str, str] field #8101

Closed
opened 2026-04-13 03:33:09 +00:00 by HAL9000 · 1 comment
Owner

Metadata

  • Commit Message: fix(acms): sanitize ContextFragment metadata to dict[str, str] in execute_phase_context_assembler
  • Branch: fix/acms-context-fragment-metadata-sanitize

Summary

ACMSExecutePhaseContextAssembler._to_context_fragment() in src/cleveragents/application/services/execute_phase_context_assembler.py passes the raw metadata dict, which may contain int and float values such as detail_depth and relevance_score, directly to CoreContextFragment(metadata=metadata). The CoreContextFragment.metadata field is typed as dict[str, str], so non-string values are either silently coerced to strings (losing type fidelity) or rejected with a Pydantic validation error, depending on the model configuration.

The CHANGELOG entry for [Unreleased] claims this was fixed in #4454 ("Context Hydration Fix: Fixed ContextFragment metadata types (detail_depth and relevance_score must be strings, not int/float) that caused Pydantic validation errors during context assembly, resulting in the LLM receiving zero file context"), but the fix is not present in the current codebase.

Expected Behavior (per spec)

Per the v3.4.0 acceptance criteria:

  • Plan execution leverages ACMS context for LLM calls — the execute-phase context assembler must successfully convert TieredFragment objects to CoreContextFragment objects without validation errors.
  • The ContextFragment.metadata field is dict[str, str]; all values must be strings before construction.

The fix should sanitize the metadata dict to ensure all values are strings before passing to CoreContextFragment. For example:

# Sanitize metadata: ensure all values are strings for CoreContextFragment
str_metadata = {k: str(v) for k, v in metadata.items()}
return CoreContextFragment(
    ...
    metadata=str_metadata,
)
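The comprehension above can be tried in isolation. The following is a minimal standalone sketch, not the project's code: the helper name sanitize_metadata is hypothetical, and the sample values are the ones quoted under Actual Behavior.

```python
from typing import Any, Mapping


def sanitize_metadata(metadata: Mapping[str, Any]) -> dict[str, str]:
    """Return a new dict with every value converted via str().

    Hypothetical helper name; the issue only proposes the comprehension itself.
    """
    return {key: str(value) for key, value in metadata.items()}


# Sample values taken from the issue text.
raw = {"detail_depth": 3, "relevance_score": 0.8, "path": "src/main.py"}
clean = sanitize_metadata(raw)

print(clean)
# {'detail_depth': '3', 'relevance_score': '0.8', 'path': 'src/main.py'}
```

Note that str() stringifies anything, so a None value would become the text "None". Whether such keys should be dropped instead is a design choice the issue does not specify.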

Actual Behavior

In src/cleveragents/application/services/execute_phase_context_assembler.py, line 122:

metadata = dict(fragment.metadata)
detail_depth_raw = metadata.get("detail_depth", 1)
detail_depth = detail_depth_raw if isinstance(detail_depth_raw, int) else 1
score = metadata.get("relevance_score")
relevance = float(score) if isinstance(score, (int, float)) else 0.5
...
return CoreContextFragment(
    ...
    metadata=metadata,  # ← BUG: metadata still contains int/float values
)

The metadata dict is passed without sanitization. If fragment.metadata contains {"detail_depth": 3, "relevance_score": 0.8, "path": "src/main.py"}, then metadata still has int and float values when passed to CoreContextFragment.
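The failure mode can be reproduced without the real model. In the sketch below, StandInFragment is a hypothetical stdlib dataclass that mimics a strict dict[str, str] check. It is an assumption for illustration only and does not reflect how CoreContextFragment or Pydantic validate internally. The point is that deriving the typed locals detail_depth and relevance leaves the dict itself unchanged.

```python
from dataclasses import dataclass, field


@dataclass
class StandInFragment:
    """Hypothetical stand-in for CoreContextFragment; rejects non-str values."""

    metadata: dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Collect the offending keys and the type names of their values.
        bad = {
            key: type(value).__name__
            for key, value in self.metadata.items()
            if not isinstance(value, str)
        }
        if bad:
            raise TypeError(f"non-string metadata values: {bad}")


source = {"detail_depth": 3, "relevance_score": 0.8, "path": "src/main.py"}
metadata = dict(source)  # shallow copy; value types are unchanged

# Mirrors the snippet above: typed locals are derived, the dict is untouched.
detail_depth_raw = metadata.get("detail_depth", 1)
detail_depth = detail_depth_raw if isinstance(detail_depth_raw, int) else 1
score = metadata.get("relevance_score")
relevance = float(score) if isinstance(score, (int, float)) else 0.5

try:
    StandInFragment(metadata=metadata)
    error = None
except TypeError as exc:
    error = str(exc)

print(error)
```

Under these assumptions the constructor raises, and the error names detail_depth and relevance_score but not path.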

Evidence

  • File: src/cleveragents/application/services/execute_phase_context_assembler.py, lines 104–123
  • Field definition: src/cleveragents/domain/models/core/context_fragment.py, line 117: metadata: dict[str, str] = Field(default_factory=dict)
  • Test workaround: features/steps/execute_phase_context_assembler_coverage_steps.py, lines 284–305: The test for "tiered fragment with full metadata" (which includes int/float values) uses a MockCoreFragment to bypass CoreContextFragment validation, explicitly noting: "CoreContextFragment.metadata requires dict[str, str], so non-string values will fail validation."
  • CHANGELOG claim: CHANGELOG.md, line 17: Claims this was fixed in #4454, but the fix is absent from the source code.

Duplicate Check

  • Issue #4454 is about "git worktree apply" (a different feature), not about this metadata sanitization bug.
  • Searched open issues for "metadata", "ContextFragment", "execute_phase_context_assembler", "non-string" — no existing open issue found covering this exact gap.
  • The CHANGELOG references a fix that was never committed to the codebase.

Subtasks

  • Add metadata sanitization in _to_context_fragment: convert all values to str before passing to CoreContextFragment
  • Remove the MockCoreFragment workaround in execute_phase_context_assembler_coverage_steps.py (or update it to test the real sanitization path)
  • Add a BDD scenario that verifies int/float metadata values are coerced to strings without using a mock
  • Verify coverage >= 97% via nox -s coverage_report
  • Run nox (all default sessions), fix any errors
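As a rough illustration of what the mock-free check in the third subtask could assert, here is a plain unittest sketch. It is not written in the project's BDD step style, and to_str_metadata is a hypothetical local helper standing in for the sanitization that _to_context_fragment would perform.

```python
import io
import unittest
from typing import Any, Mapping


def to_str_metadata(metadata: Mapping[str, Any]) -> dict[str, str]:
    """Hypothetical helper under test: stringify every metadata value."""
    return {key: str(value) for key, value in metadata.items()}


class MetadataCoercionTest(unittest.TestCase):
    def test_numeric_values_become_strings(self) -> None:
        result = to_str_metadata({"detail_depth": 3, "relevance_score": 0.8})
        self.assertEqual(result, {"detail_depth": "3", "relevance_score": "0.8"})

    def test_string_values_pass_through(self) -> None:
        result = to_str_metadata({"path": "src/main.py"})
        self.assertEqual(result, {"path": "src/main.py"})

    def test_empty_metadata(self) -> None:
        self.assertEqual(to_str_metadata({}), {})


# Run the cases programmatically so the sketch works as a plain script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(MetadataCoercionTest)
outcome = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(outcome.wasSuccessful())
```

A real scenario would construct CoreContextFragment from the sanitized dict rather than calling a local helper, so that the model's own validation is exercised.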

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.

Automated by CleverAgents Bot
Supervisor: UAT Test Pool | Agent: uat-test-worker | Session: [AUTO-UAT-6]

HAL9000 added this to the v3.4.0 milestone 2026-04-13 03:33:33 +00:00
Owner

superseded by next cycle

Reference
cleveragents/cleveragents-core#8101