[Bug Hunt][Cycle 2][Config] YAML injection vulnerability in environment variable substitution #7124

Open
opened 2026-04-10 07:58:09 +00:00 by HAL9000 · 1 comment
Owner

Background

The reactive configuration parser (src/cleveragents/reactive/config_parser.py) performs direct string substitution of environment variables into raw YAML content without any sanitisation or escaping. This allows an attacker who controls environment variable values — a realistic threat in containerised or CI/CD environments — to inject arbitrary YAML structures, potentially leading to privilege escalation, configuration override, or remote code execution via deserialization gadgets.

This is a security-critical correctness issue affecting the core reactive configuration subsystem. Any code path that calls ReactiveConfigParser._interpolate_env() is affected.

Severity Assessment

  • Impact: Malicious environment variable values can inject arbitrary YAML content, overriding configuration values and potentially enabling RCE
  • Likelihood: Medium — requires control over environment variables, but possible in containerised/CI environments
  • Priority: Critical

Location

  • File: src/cleveragents/reactive/config_parser.py
  • Function/Class: ReactiveConfigParser._interpolate_env()
  • Lines: 93–101

Current Behavior

The _interpolate_env() method substitutes environment variable values directly into the raw YAML string before parsing:

def repl(match: re.Match[str]) -> str:
    var = match.group(1)
    default = match.group(2)
    val = os.environ.get(var, default)
    if val is None:
        raise ConfigurationError(f"Environment variable '{var}' is not set")
    return str(val)  # Direct string substitution — no validation

value = self.env_pattern.sub(repl, value)

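An environment variable whose value contains real newline characters, such as (bash ANSI-C quoting):

MALICIOUS=$'foo\npassword: hacked\nadmin: true'

injected into a config like:

database: ${MALICIOUS}

produces YAML that is well-formed but no longer matches the config author's intent:

database: foo
password: hacked
admin: true

A YAML parser reads password and admin as additional top-level keys, which can silently override security-sensitive configuration values. The injected lines need to be unindented for this to happen; if they were indented, they would continue the plain scalar foo, and a strict parser would typically reject the document rather than accept new keys.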
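The substitution step can be reproduced in isolation with only the standard library. This is a minimal sketch, not the project's code: `ENV_PATTERN` and `interpolate_raw` are hypothetical stand-ins (the real `env_pattern` may use a different placeholder syntax), the environment is passed in as a plain dict, and the payload uses unindented injected lines so that they land at the top level.

```python
import re

# Hypothetical placeholder syntax: ${VAR} or ${VAR:default}.
# The real ReactiveConfigParser.env_pattern may differ.
ENV_PATTERN = re.compile(r"\$\{(\w+)(?::([^}]*))?\}")


def interpolate_raw(text: str, env: dict[str, str]) -> str:
    """Substitute placeholders into raw text with no escaping (the unsafe pattern)."""

    def repl(match: re.Match[str]) -> str:
        var, default = match.group(1), match.group(2)
        val = env.get(var, default)
        if val is None:
            raise KeyError(f"Environment variable '{var}' is not set")
        return val  # inserted verbatim, newlines included

    return ENV_PATTERN.sub(repl, text)


# The attacker-controlled value contains real newline characters.
env = {"MALICIOUS": "foo\npassword: hacked\nadmin: true"}
rendered = interpolate_raw("database: ${MALICIOUS}\n", env)
print(rendered)
```

The rendered text now has three lines that each look like a top-level mapping entry, before any YAML parser has seen it.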

Expected Behavior

Environment variable values must be safely escaped or validated before YAML substitution to prevent injection of malicious YAML structures. The substituted value should always be treated as a YAML scalar, never as raw YAML markup.
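One way to guarantee that a substituted value is only ever a scalar is to interpolate after parsing, on the string leaves of the parsed structure, so the value never passes through the YAML parser at all. The sketch below assumes this approach is acceptable for the project; `ENV_PATTERN`, `substitute`, and `interpolate_tree` are hypothetical names, and the `parsed` dict stands in for whatever the YAML loader would return for the template.

```python
import re
from typing import Any

# Hypothetical placeholder syntax: ${VAR} or ${VAR:default}.
ENV_PATTERN = re.compile(r"\$\{(\w+)(?::([^}]*))?\}")


def substitute(text: str, env: dict[str, str]) -> str:
    """Replace placeholders inside a single, already-parsed string value."""

    def repl(match: re.Match[str]) -> str:
        var, default = match.group(1), match.group(2)
        val = env.get(var, default)
        if val is None:
            raise KeyError(f"Environment variable '{var}' is not set")
        return val

    return ENV_PATTERN.sub(repl, text)


def interpolate_tree(node: Any, env: dict[str, str]) -> Any:
    """Walk a parsed config and substitute only within string leaves."""
    if isinstance(node, dict):
        return {key: interpolate_tree(value, env) for key, value in node.items()}
    if isinstance(node, list):
        return [interpolate_tree(item, env) for item in node]
    if isinstance(node, str):
        return substitute(node, env)
    return node  # numbers, booleans, None pass through unchanged


# Stand-in for the parsed form of the template "database: ${MALICIOUS}".
parsed = {"database": "${MALICIOUS}"}
env = {"MALICIOUS": "foo\npassword: hacked\nadmin: true"}
result = interpolate_tree(parsed, env)
print(result)
```

The payload ends up as a multi-line string under `database` and no new keys appear. The trade-off is that substituted values are always strings, so any type coercion the current parser gets for free (for example `port: ${PORT}` becoming an integer) would need explicit handling.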

Suggested Fix

  1. Parse environment variable values as YAML scalars first to validate they do not contain YAML structures.
  2. Use proper YAML quoting/escaping for string values before substitution (e.g., wrap in double-quotes and escape internal quotes/newlines).
  3. Consider using a templating engine with sandboxed variable substitution that operates on the parsed YAML AST rather than the raw string.
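Option 2 above can be sketched with the standard library by emitting each value as a double-quoted string via `json.dumps`. This relies on the assumption that JSON string syntax is, for typical values, also valid YAML double-quoted scalar syntax; edge cases (for example characters outside the Basic Multilingual Plane) should be checked against the YAML library actually in use. `ENV_PATTERN`, `quote_scalar`, and `interpolate_quoted` are hypothetical names, not the project's API.

```python
import json
import re

# Hypothetical placeholder syntax: ${VAR} or ${VAR:default}.
ENV_PATTERN = re.compile(r"\$\{(\w+)(?::([^}]*))?\}")


def quote_scalar(value: str) -> str:
    """Render a value as a double-quoted scalar with quotes and newlines escaped."""
    return json.dumps(value)


def interpolate_quoted(text: str, env: dict[str, str]) -> str:
    """Substitute placeholders, quoting every value before insertion."""

    def repl(match: re.Match[str]) -> str:
        var, default = match.group(1), match.group(2)
        val = env.get(var, default)
        if val is None:
            raise KeyError(f"Environment variable '{var}' is not set")
        return quote_scalar(val)

    return ENV_PATTERN.sub(repl, text)


env = {"MALICIOUS": "foo\npassword: hacked\nadmin: true"}
rendered = interpolate_quoted("database: ${MALICIOUS}\n", env)
print(rendered)
```

The newlines are escaped, so the payload stays on one line as a single scalar. This sketch is only safe when the placeholder occupies an entire plain-scalar position; a placeholder embedded in an already-quoted string or in a longer value (such as `url: http://${HOST}/path`) would need different handling, and every substituted value becomes a string.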

TDD Note

After this bug issue is verified, a corresponding Type/Testing issue will be created for TDD. The test will use tags: @tdd_issue, @tdd_issue_<this-issue-number>, and @tdd_expected_fail to prove the bug exists before fixing it.


Metadata

  • Branch: bugfix/m3-yaml-injection-env-var-substitution
  • Commit Message: fix(config): sanitise env var values before YAML substitution to prevent injection
  • Milestone: v3.2.0
  • Parent Epic: #5502

Subtasks

  • Reproduce the injection with a targeted BDD scenario tagged @tdd_issue, @tdd_issue_<N>, @tdd_expected_fail
  • Audit all call sites of _interpolate_env() for additional exposure
  • Implement YAML-scalar escaping / quoting for substituted values in ReactiveConfigParser._interpolate_env()
  • Verify fix prevents injection while preserving legitimate multi-line scalar values
  • Remove @tdd_expected_fail tag from the TDD scenario once the fix is in place
  • Update docstring and inline comments to document the security contract
  • Confirm no regression in existing config-parser BDD scenarios

Definition of Done

  • A BDD scenario (tagged @tdd_issue and @tdd_issue_<N>) demonstrates the injection vector and passes after the fix
  • ReactiveConfigParser._interpolate_env() no longer allows YAML structure injection via environment variable values
  • All existing ReactiveConfigParser BDD scenarios continue to pass
  • Pyright strict-mode type checking passes with zero errors (nox -s typecheck)
  • Ruff linting passes with zero violations (nox -s lint)
  • Security scan passes with no new high/critical findings (nox -s security_scan)
  • All nox stages pass
  • Coverage >= 97%

Automated by CleverAgents Bot
Supervisor: Bug Hunting | Agent: new-issue-creator

HAL9000 added this to the v3.2.0 milestone 2026-04-10 07:58:13 +00:00
Author
Owner

Verified — Critical security bug: YAML injection in environment variable substitution. MoSCoW: Must-have. Priority: Critical.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Reference
cleveragents/cleveragents-core#7124