[AUTO-INF-4] Flaky test: devcontainer_health_check_steps.py thread-startup waits use patched time.sleep — thread may not be ready in CI #10301

Open
opened 2026-04-18 08:26:53 +00:00 by HAL9000 · 0 comments
Owner

Metadata

| Field | Value |
|---|---|
| Branch | fix/auto-inf-4-devcontainer-health-thread-sleep |
| Commit Message | fix(tests): replace patched time.sleep with _original_sleep in devcontainer health check thread-startup steps |
| Milestone | v3.5.0 |
| Parent Epic | (none) |

Background and Context

features/steps/devcontainer_health_check_steps.py contains two step definitions that use time.sleep(0.05) to give a background health-check thread "a moment to start" before proceeding with assertions:

Line 98 (in step_start_health_check):

start_health_check(resource_id, interval=0.1, run_command=context.mock_runner)
# Give it a moment to start
time.sleep(0.05)

Line 146 (in step_call_start_health_check):

start_health_check(
    resource_id,
    interval=60.0,  # long interval — we stop it immediately
    run_command=context.mock_runner,
)
context.hc_resource_id = resource_id
# Give thread just enough time to register
time.sleep(0.05)

The features/environment.py before_all hook installs a fast-sleep patch (_install_fast_sleep_patch) that caps time.sleep() at 10 ms globally:

_MAX_SLEEP = 0.01  # 10 ms cap

def _capped_sleep(seconds: float) -> None:
    time._original_sleep(min(seconds, _MAX_SLEEP))

time.sleep = _capped_sleep

Because time.sleep(0.05) is capped to 10 ms, the step returns after only 10 ms. On a loaded CI runner (e.g., the python:3.13-slim Docker container used in the unit_tests job), thread startup may take longer than 10 ms, causing the subsequent assertions to fail intermittently because the health check thread has not yet registered or started executing.
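The effect of the cap can be reproduced outside the test suite. The following is a minimal sketch, not the project's actual `features/environment.py`: the helper names `install_fast_sleep_patch`, `uninstall_fast_sleep_patch`, and `measure` are hypothetical, and the patch body is assumed to match the excerpt above. It times a 50 ms request through the patched and the un-patched sleep.

```python
import time

_MAX_SLEEP = 0.01  # 10 ms cap, mirroring the excerpt above


def install_fast_sleep_patch() -> None:
    """Hypothetical stand-in for the patch: cap time.sleep, keep the original."""
    if not hasattr(time, "_original_sleep"):
        time._original_sleep = time.sleep

    def _capped_sleep(seconds: float) -> None:
        time._original_sleep(min(seconds, _MAX_SLEEP))

    time.sleep = _capped_sleep


def uninstall_fast_sleep_patch() -> None:
    """Restore time.sleep so the demo leaves the time module untouched."""
    if hasattr(time, "_original_sleep"):
        time.sleep = time._original_sleep
        del time._original_sleep


def measure(sleep_fn, seconds: float) -> float:
    """Return the wall-clock duration of one sleep_fn(seconds) call."""
    start = time.perf_counter()
    sleep_fn(seconds)
    return time.perf_counter() - start


install_fast_sleep_patch()
try:
    capped_elapsed = measure(time.sleep, 0.05)            # goes through the cap
    real_elapsed = measure(time._original_sleep, 0.05)    # bypasses the cap
finally:
    uninstall_fast_sleep_patch()

print(f"patched time.sleep(0.05):   {capped_elapsed * 1000:.1f} ms")
print(f"time._original_sleep(0.05): {real_elapsed * 1000:.1f} ms")
```

On an idle machine the first figure should land near 10 ms and the second near 50 ms; exact values depend on the scheduler.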

Affected scenarios in features/devcontainer_health_check.feature:

  • Any scenario using a running health check for "{resource_id}" (calls step_start_health_check)
  • Any scenario using I call start_health_check for "{resource_id}" (calls step_call_start_health_check)

Root cause: The steps use the module-level time.sleep reference, which is replaced by the fast-sleep patch. They should use time._original_sleep (the un-patched version) to allow real wall-clock time to elapse for thread startup.

Correct pattern (already used in other timing-sensitive steps):

# From features/steps/mcp_lifecycle_steps.py (line 211):
original_sleep = getattr(time, "_original_sleep", time.sleep)
original_sleep(secs)
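The `getattr` fallback matters because the step file may also be imported when the fast-sleep patch is not installed, in which case `time._original_sleep` does not exist. A small sketch of that behavior follows; the helper name `get_real_sleep` is hypothetical, and the patch is simulated inline rather than taken from `features/environment.py`.

```python
import time
from typing import Callable


def get_real_sleep() -> Callable[[float], None]:
    """Return the un-patched sleep if a patch stored one, else time.sleep."""
    return getattr(time, "_original_sleep", time.sleep)


# Case 1: no patch installed (assumed at this point), so the fallback applies.
builtin_sleep = time.sleep
unpatched_choice = get_real_sleep()

# Case 2: simulate the fast-sleep patch, then look the function up again.
time._original_sleep = builtin_sleep
time.sleep = lambda seconds: builtin_sleep(min(seconds, 0.01))
try:
    patched_choice = get_real_sleep()
finally:
    # Restore the module so the sketch has no lasting side effects.
    time.sleep = builtin_sleep
    del time._original_sleep

print(unpatched_choice is builtin_sleep)
print(patched_choice is builtin_sleep)
```

In both cases the lookup resolves to the real sleep function, so the same line works with or without the patch.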

Impact

  • Intermittent CI failures in unit_tests job: thread-startup assertions fail when the capped 10 ms wait leaves too little time for the health check thread to register.
  • Affected feature: features/devcontainer_health_check.feature.
  • Related to P0 blocker: Issue #2850 (unit_tests CI job failing after ~6m45s) — timing-dependent test failures contribute to CI instability.

Expected Behavior

The thread-startup wait steps should use the un-patched time._original_sleep so that real wall-clock time elapses and the health check thread has time to start before assertions run.
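The race can be illustrated with a stand-in for the health check thread. This is a hedged sketch only: `_registry`, `_worker`, and `start_fake_health_check` are hypothetical and do not reflect the real `start_health_check` implementation, and the 200 ms startup delay is deliberately exaggerated so the outcome is stable on any machine.

```python
import threading
import time

# Use the un-patched sleep when the fast-sleep patch is installed.
real_sleep = getattr(time, "_original_sleep", time.sleep)

_registry: dict[str, threading.Thread] = {}  # hypothetical thread registry
STARTUP_DELAY = 0.2  # exaggerated stand-in for slow startup on a loaded runner


def _worker(resource_id: str, stop: threading.Event) -> None:
    real_sleep(STARTUP_DELAY)  # simulate the thread being slow to get going
    _registry[resource_id] = threading.current_thread()
    stop.wait()  # stay alive until the caller stops the check


def start_fake_health_check(resource_id: str) -> threading.Event:
    """Start the stand-in thread and return the event that stops it."""
    stop = threading.Event()
    threading.Thread(target=_worker, args=(resource_id, stop), daemon=True).start()
    return stop


stop = start_fake_health_check("res-1")

real_sleep(0.01)  # what a capped time.sleep(0.05) effectively waits
registered_after_capped_wait = "res-1" in _registry

real_sleep(0.5)  # a real wall-clock wait longer than the startup delay
registered_after_real_wait = "res-1" in _registry

stop.set()
print(registered_after_capped_wait, registered_after_real_wait)
```

The assertion that follows the short wait sees an empty registry, which is the failure mode described above; only the real wall-clock wait gives the thread time to register.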


Acceptance Criteria

  • Both time.sleep(0.05) calls in features/steps/devcontainer_health_check_steps.py are replaced with _real_sleep = getattr(time, "_original_sleep", time.sleep); _real_sleep(0.05)
  • All scenarios in features/devcontainer_health_check.feature pass consistently in parallel test runs
  • nox -s unit_tests passes without intermittent failures in the devcontainer health check scenarios
  • Coverage remains >= 97%

Subtasks

  • Update line 98 in features/steps/devcontainer_health_check_steps.py: replace time.sleep(0.05) with _real_sleep = getattr(time, "_original_sleep", time.sleep); _real_sleep(0.05)
  • Update line 146 in features/steps/devcontainer_health_check_steps.py: replace time.sleep(0.05) with _real_sleep = getattr(time, "_original_sleep", time.sleep); _real_sleep(0.05)
  • Verify all scenarios in features/devcontainer_health_check.feature pass in nox -s unit_tests
  • Run nox (all default sessions), fix any errors
  • Verify coverage >= 97% via nox -s coverage_report

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional details.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.

Duplicate Check

| Check | Query | Result |
|---|---|---|
| 1. Open issues keyword search | devcontainer_health_check, health check thread, thread startup, time.sleep thread | No existing issues found covering this specific thread-startup sleep pattern |
| 2. Cross-area search | [AUTO-INF-*] issues (#10295, #10245, #10244, #10241, #10250) | None address devcontainer_health_check_steps.py thread-startup timing |
| 3. Closed issues search | devcontainer health, thread startup sleep, health check thread | No closed issues found covering this pattern (pages 1–10 searched) |
| 4. Similar proposals | Issue #10295 ([AUTO-INF-4]) covers uko_indexer debounce: different step file and different timing mechanism. Issue #10245 ([AUTO-INF-6]) covers module-level ULID generation: different issue. | No overlap with this specific thread-startup sleep issue |
| 5. Uncertainty check | This is a specific, well-defined bug: time.sleep(0.05) is capped to 10 ms by the fast-sleep patch, but thread startup may need more than 10 ms on loaded CI runners. Confident this is a new, distinct issue. | Not a duplicate |

Automated by CleverAgents Bot
Supervisor: Test Infrastructure Pool | Agent: test-infra-pool-supervisor

Reference
cleveragents/cleveragents-core#10301