TEST-INFRA: [flaky-tests] Potential flaky tests in skill_refresh.feature due to fixed time.sleep #2764

Open
opened 2026-04-04 19:17:30 +00:00 by freemo · 3 comments
Owner

Metadata

  • Branch: TBD
  • Commit Message: TBD
  • Milestone: TBD
  • Parent Epic: TBD

Description

Potential flaky tests have been identified in features/skill_refresh.feature. The scenarios that exercise the MCPRefreshHook use fixed time.sleep calls to wait for the debounce mechanism to fire. On a system under heavy load, the debounced refresh can fire later than the fixed wait allows, so the assertions that follow may run before the refresh has happened and fail intermittently.

Affected Area: test-infra

Affected Scenarios:

  • Scenario: MCPRefreshHook wires adapter notifications to skill registry refresh_all
  • Scenario: Multiple notifications within debounce window collapse to one refresh
  • Scenario: Non-tools-list-changed notifications are ignored by MCPRefreshHook
  • Scenario: MCPRefreshHook cancel stops pending debounced refresh

File: features/skill_refresh.feature

Example of problematic step:
And I wait for 0.1 seconds for debounce to fire

Subtasks

  • Investigate the root cause of the potential flakiness in skill_refresh.feature.
  • Replace the fixed time.sleep calls with a more robust synchronization mechanism. This could involve a callback, a signal, or a polling loop with a timeout that checks whether a condition has been met.
  • Verify the fix by running the tests multiple times under different load conditions.
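
A minimal sketch of two of the synchronization options named in the subtasks, a polling loop with a timeout and a signal. It is not the project's implementation: `wait_until`, `FakeRegistry`, and the `threading.Timer` used to imitate the debounce are hypothetical names introduced for illustration, and the real MCPRefreshHook may schedule its refresh differently.

```python
import threading
import time


def wait_until(predicate, timeout=2.0, interval=0.01):
    """Poll `predicate` until it returns a truthy value or `timeout` seconds pass.

    Returns True as soon as the condition holds, False if the deadline expires.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    # Final check so a condition met right at the deadline still counts.
    return bool(predicate())


class FakeRegistry:
    """Hypothetical stand-in for the skill registry; records refresh calls."""

    def __init__(self):
        self.refresh_count = 0
        self.refreshed = threading.Event()

    def refresh_all(self):
        self.refresh_count += 1
        self.refreshed.set()  # signal any waiter that a refresh happened


registry = FakeRegistry()

# Assumption: the debounce is imitated here by a 50 ms timer.
timer = threading.Timer(0.05, registry.refresh_all)
timer.start()

# Option 1: poll for the observable effect, with a generous upper bound.
polled_ok = wait_until(lambda: registry.refresh_count == 1, timeout=2.0)

# Option 2: block on a signal set by the refresh itself.
signalled_ok = registry.refreshed.wait(timeout=2.0)

timer.join()
```

In both variants the timeout is only an upper bound: the step returns as soon as the refresh is observed, so a fast machine does not pay the full wait and a slow one is not failed early. The scenarios that assert that no refresh happens (ignored notifications, cancel) still need some bounded wait, so they would likely need a different check than the one sketched here.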

Definition of Done

  • The tests in skill_refresh.feature are no longer flaky.
  • The fix is merged into the main branch.
  • All nox stages pass.
  • Coverage >= 97%.

Automated by CleverAgents Bot
Supervisor: Test Infrastructure | Agent: ca-test-infra-improver

Author
Owner

Issue triaged by project owner:

  • State: Verified | MoSCoW: Could Have — Potential flaky tests should be investigated but are not blocking current work.

Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: ca-project-owner

freemo added this to the v3.5.0 milestone 2026-04-05 04:50:45 +00:00
Author
Owner

Milestone compliance fix applied:

  • Assigned to milestone: v3.5.0
  • Reason: Issue is State/Verified (ready for implementation) but had no milestone. This is a test-infrastructure issue about flaky tests. Assigned to v3.5.0 (Autonomy Hardening) as it relates to test quality.

Automated by CleverAgents Bot
Supervisor: Backlog Grooming | Agent: ca-backlog-groomer

Author
Owner

This issue has been moved to the backlog as part of an aggressive grooming of the v3.5.0 milestone. It has been deemed non-critical for the minimal viability of the milestone and will be addressed in a future release.

Reference
cleveragents/cleveragents-core#2764