Bug: apply_tdd_inversion does not reset Status.untested steps, causing compute_status() inconsistency after inversion #8291

Open
opened 2026-04-13 08:06:46 +00:00 by hurui200320 · 1 comment
Member

Metadata

  • Commit Message: fix(testing): reset Status.untested steps in apply_tdd_inversion to prevent compute_status inconsistency
  • Branch: fix/tdd-inversion-untested-steps

Background and Context

apply_tdd_inversion() in features/environment.py is the core function that inverts the result of a @tdd_expected_fail scenario from FAIL to PASS (when the bug is still present). It does this by resetting all failed and skipped steps to Status.passed, then calling scenario.set_status(Status.passed).

However, the reset loop only handles two statuses:

for step in all_steps:
    if step.status in (Status.failed, Status.skipped):  # ← Status.untested is missing
        step.status = Status.passed
        ...

Status.untested steps — steps that never ran at all (e.g., because a before_step hook failed on that step, setting step.hook_failed = True and step.status = Status.failed but leaving step.exception = None) — are silently left in their Status.untested state.

Current Behavior

After apply_tdd_inversion runs:

  • scenario._cached_status is correctly set to Status.passed via scenario.set_status(Status.passed).
  • Any Status.untested steps remain Status.untested (not reset to Status.passed).

If scenario.clear_status() is subsequently called (e.g., by future refactoring, a formatter, or any code that re-evaluates the scenario), compute_status() is triggered. It walks all_steps, finds the Status.untested step, and returns Status.untested — not Status.passed. This leaves _cached_status = Status.untested, which is not in final_status, causing an infinite recomputation loop on every subsequent scenario.status access.

Proof:

from behave.model import Scenario, Step, Status
from features.environment import apply_tdd_inversion

s = Scenario('<test>', 1, 'Scenario', 'test',
             tags=['tdd_expected_fail', 'tdd_issue', 'tdd_issue_999'])
s.hook_failed = False
s.was_dry_run = False

failed = Step('<test>', 1, 'When', 'when', 'a failed step')
failed.status = Status.failed
failed.exception = AssertionError('bug present')

untested = Step('<test>', 2, 'Then', 'then', 'an untested step')
untested.status = Status.untested  # never reached

s.steps.extend([failed, untested])
s.set_status(Status.failed)

apply_tdd_inversion(s, True)

# After inversion:
assert s._cached_status == Status.passed   # ✓ correct via set_status
assert failed.status == Status.passed      # ✓ correctly reset
assert untested.status == Status.untested  # ✗ NOT reset — still untested!

# If clear_status() is ever called after inversion:
s.clear_status()
assert s.compute_status() == Status.untested  # ✗ WRONG — should be passed

Expected Behavior

After apply_tdd_inversion inverts a scenario to PASS, all steps — including those with Status.untested — should be reset to Status.passed. This ensures that compute_status() returns Status.passed regardless of when it is called, making the inversion robust against any future code path that calls clear_status() on the scenario.

Acceptance Criteria

  • apply_tdd_inversion resets steps with Status.untested to Status.passed in the same loop that handles Status.failed and Status.skipped.
  • After inversion, scenario.compute_status() returns Status.passed even if scenario.clear_status() is called first.
  • Existing Behave scenarios in features/tdd_expected_fail_infrastructure.feature and features/testing/tdd_tag_validation.feature continue to pass.
  • A new Behave scenario is added to features/tdd_expected_fail_infrastructure.feature covering the Status.untested step case.
  • Coverage remains ≥ 97%.

Supporting Information

  • File: features/environment.py, function apply_tdd_inversion, lines 207–217.
  • Related function: Scenario.compute_status() in Behave 1.2.6 — returns Status.untested when any step has Status.untested and was_dry_run is False.
  • Companion issue: The redundant scenario.clear_status() call on line 218 (immediately before scenario.set_status(Status.passed) on line 219) amplifies this bug by creating a window where _cached_status = Status.untested. That call is unnecessary and should be removed as part of this fix (see Issue #2 in this audit series).
  • Audit commit: 51aab184 introduced the guard logic; the untested-step gap was present from the original implementation in c8cd7eab.

Subtasks

  • Extend the reset loop in apply_tdd_inversion to include Status.untested: if step.status in (Status.failed, Status.skipped, Status.untested):
  • Remove the redundant scenario.clear_status() call on line 218 (the set_status(Status.passed) on line 219 is sufficient on its own)
  • Add a Behave scenario to features/tdd_expected_fail_infrastructure.feature: "apply_tdd_inversion resets untested steps to passed during inversion"
  • Add a corresponding scenario to features/testing/tdd_tag_validation.feature for the apply_tdd_inversion untested-step path
  • Verify nox -e unit_tests passes
  • Verify coverage ≥ 97% via nox -s coverage_report
  • Run nox (all default sessions), fix any errors

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
## Metadata - **Commit Message**: `fix(testing): reset Status.untested steps in apply_tdd_inversion to prevent compute_status inconsistency` - **Branch**: `fix/tdd-inversion-untested-steps` ## Background and Context `apply_tdd_inversion()` in `features/environment.py` is the core function that inverts the result of a `@tdd_expected_fail` scenario from FAIL to PASS (when the bug is still present). It does this by resetting all failed and skipped steps to `Status.passed`, then calling `scenario.set_status(Status.passed)`. However, the reset loop only handles two statuses: ```python for step in all_steps: if step.status in (Status.failed, Status.skipped): # ← Status.untested is missing step.status = Status.passed ... ``` `Status.untested` steps — steps that never ran at all (e.g., because a `before_step` hook failed on that step, setting `step.hook_failed = True` and `step.status = Status.failed` but leaving `step.exception = None`) — are silently left in their `Status.untested` state. ## Current Behavior After `apply_tdd_inversion` runs: - `scenario._cached_status` is correctly set to `Status.passed` via `scenario.set_status(Status.passed)`. - Any `Status.untested` steps remain `Status.untested` (not reset to `Status.passed`). If `scenario.clear_status()` is subsequently called (e.g., by future refactoring, a formatter, or any code that re-evaluates the scenario), `compute_status()` is triggered. It walks `all_steps`, finds the `Status.untested` step, and returns `Status.untested` — not `Status.passed`. This leaves `_cached_status = Status.untested`, which is not in `final_status`, causing an infinite recomputation loop on every subsequent `scenario.status` access. **Proof:** ```python from behave.model import Scenario, Step, Status from features.environment import apply_tdd_inversion s = Scenario('<test>', 1, 'Scenario', 'test', tags=['tdd_expected_fail', 'tdd_issue', 'tdd_issue_999']) s.hook_failed = False s.was_dry_run = False failed = Step('<test>', 1, 'When', 'when', 'a failed step') failed.status = Status.failed failed.exception = AssertionError('bug present') untested = Step('<test>', 2, 'Then', 'then', 'an untested step') untested.status = Status.untested # never reached s.steps.extend([failed, untested]) s.set_status(Status.failed) apply_tdd_inversion(s, True) # After inversion: assert s._cached_status == Status.passed # ✓ correct via set_status assert failed.status == Status.passed # ✓ correctly reset assert untested.status == Status.untested # ✗ NOT reset — still untested! # If clear_status() is ever called after inversion: s.clear_status() assert s.compute_status() == Status.untested # ✗ WRONG — should be passed ``` ## Expected Behavior After `apply_tdd_inversion` inverts a scenario to PASS, **all** steps — including those with `Status.untested` — should be reset to `Status.passed`. This ensures that `compute_status()` returns `Status.passed` regardless of when it is called, making the inversion robust against any future code path that calls `clear_status()` on the scenario. ## Acceptance Criteria - [ ] `apply_tdd_inversion` resets steps with `Status.untested` to `Status.passed` in the same loop that handles `Status.failed` and `Status.skipped`. - [ ] After inversion, `scenario.compute_status()` returns `Status.passed` even if `scenario.clear_status()` is called first. - [ ] Existing Behave scenarios in `features/tdd_expected_fail_infrastructure.feature` and `features/testing/tdd_tag_validation.feature` continue to pass. - [ ] A new Behave scenario is added to `features/tdd_expected_fail_infrastructure.feature` covering the `Status.untested` step case. - [ ] Coverage remains ≥ 97%. ## Supporting Information - **File**: `features/environment.py`, function `apply_tdd_inversion`, lines 207–217. - **Related function**: `Scenario.compute_status()` in Behave 1.2.6 — returns `Status.untested` when any step has `Status.untested` and `was_dry_run` is `False`. - **Companion issue**: The redundant `scenario.clear_status()` call on line 218 (immediately before `scenario.set_status(Status.passed)` on line 219) amplifies this bug by creating a window where `_cached_status = Status.untested`. That call is unnecessary and should be removed as part of this fix (see Issue #2 in this audit series). - **Audit commit**: `51aab184` introduced the guard logic; the untested-step gap was present from the original implementation in `c8cd7eab`. ## Subtasks - [ ] Extend the reset loop in `apply_tdd_inversion` to include `Status.untested`: `if step.status in (Status.failed, Status.skipped, Status.untested):` - [ ] Remove the redundant `scenario.clear_status()` call on line 218 (the `set_status(Status.passed)` on line 219 is sufficient on its own) - [ ] Add a Behave scenario to `features/tdd_expected_fail_infrastructure.feature`: "apply_tdd_inversion resets untested steps to passed during inversion" - [ ] Add a corresponding scenario to `features/testing/tdd_tag_validation.feature` for the `apply_tdd_inversion` untested-step path - [ ] Verify `nox -e unit_tests` passes - [ ] Verify coverage ≥ 97% via `nox -s coverage_report` - [ ] Run `nox` (all default sessions), fix any errors ## Definition of Done This issue is complete when: - All subtasks above are completed and checked off. - A Git commit is created where the **first line** of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation. - The commit is pushed to the remote on the branch matching the **Branch** in Metadata exactly. - The commit is submitted as a **pull request** to `master`, reviewed, and **merged** before this issue is marked done.
HAL9000 added this to the v3.2.0 milestone 2026-04-13 08:30:55 +00:00
Owner

Verified — The apply_tdd_inversion bug causes compute_status() inconsistency after inversion, which could affect test reliability. Should Have fix for v3.2.0 — important for test framework correctness but not blocking core functionality. Verified.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

✅ **Verified** — The apply_tdd_inversion bug causes compute_status() inconsistency after inversion, which could affect test reliability. **Should Have** fix for v3.2.0 — important for test framework correctness but not blocking core functionality. Verified. --- **Automated by CleverAgents Bot** Supervisor: Project Owner | Agent: project-owner-pool-supervisor
Sign in to join this conversation.
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
cleveragents/cleveragents-core#8291
No description provided.