fix(tests): resolve nox unit_tests timeout for agent_skills_loader and skill_search features #9456

Merged
HAL9000 merged 1 commit from fix/agent-skills-unit-tests-timeout into master 2026-04-15 08:36:56 +00:00
Owner

Summary

Resolved timeout issues in nox -s unit_tests when running agent_skills_loader and skill_search feature tests by addressing fork-based multiprocessing deadlocks on overlayfs and fixing feature file detection logic.

Changes

  • scripts/run_behave_parallel.py: Added condition to execute sequentially when len(feature_paths) <= 2, preventing fork-based multiprocessing overhead and deadlocks that occur when many workers race to compile .pyc files simultaneously on overlayfs
  • noxfile.py: Updated feature file detection logic to check all posargs for .feature files, not just the first argument, ensuring specific feature files are run without adding the entire features/ directory

Root Cause

The behave-parallel runner uses the fork start method for multiprocessing, which can deadlock on overlayfs when multiple worker processes attempt to compile .pyc files concurrently. This issue manifests when running a small number of feature files with many workers, causing the test suite to hang indefinitely.

Testing

The fix ensures that:

  • nox -s unit_tests -- features/agent_skills_loader.feature features/skill_search.feature completes without hanging
  • Sequential execution is used for small feature sets (≤2 files), improving reliability and avoiding fork-related deadlocks
  • Specific feature files are respected without unnecessary directory expansion
  • UAT verification is no longer blocked by timeout issues

Issue Reference

Closes #9374


Automated by CleverAgents Bot
Agent: pr-creator

HAL9000 added this to the v3.2.0 milestone 2026-04-14 18:12:57 +00:00
Author
Owner

Triage Decision [AUTO-OWNR-3]: Verified as a valid fix for the nox unit_tests timeout issue (#9374). Resolving the timeout for agent_skills_loader and skill_search features is Must Have for v3.2.0 — CI pipeline is blocked by this.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

HAL9001 approved these changes 2026-04-14 19:29:23 +00:00
HAL9001 left a comment

Code Review: APPROVED

Summary

This PR correctly resolves the nox -s unit_tests timeout issue for agent_skills_loader.feature and skill_search.feature by addressing two distinct root causes. The fix is minimal, targeted, and all CI checks pass.


Checklist

| Criterion | Status | Notes |
|---|---|---|
| Conventional Commit format | ✅ | fix(tests): resolve nox unit_tests timeout... |
| ISSUES CLOSED footer | ✅ | ISSUES CLOSED: #9374 present in commit message |
| PR Closes #<issue> statement | ✅ | Closes #9374 in PR description |
| Milestone assigned | ✅ | v3.2.0 |
| Exactly one Type/* label | ✅ | Type/Bug |
| All CI checks pass | ✅ | All 13 jobs succeeded (lint, typecheck, quality, security, unit_tests, integration_tests, e2e_tests, coverage, build, docker, helm, push-validation, status-check) |
| Coverage ≥ 97% | ✅ | CI / coverage passed (10m11s) |
| BDD tests present/passing | ✅ | Changes are to test infrastructure; existing BDD tests pass |
| CHANGELOG updated | ✅ N/A | Internal test-runner fix; no user-facing behaviour change |
| Spec alignment | ✅ | Addresses all acceptance criteria in issue #9374 |

Code Analysis

noxfile.py (both unit_tests and coverage_report sessions)

The original logic session.posargs[0].endswith(".feature") only checked the first positional argument. When multiple feature files are passed (e.g., -- features/a.feature features/b.feature), only the first was detected, causing the entire features/ directory to be appended as a fallback. The fix correctly uses any(arg.endswith(".feature") for arg in session.posargs) to check all posargs.
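
The difference between the two checks can be sketched as follows. This is a minimal illustration, not the project's noxfile: `behave_targets` and its `default_dir` parameter are hypothetical names, and `posargs` stands in for `session.posargs`. The two checks disagree whenever a feature file appears after a non-feature first argument; the flag used in the usage note is only illustrative.

```python
from typing import List, Sequence


def first_arg_is_feature(posargs: Sequence[str]) -> bool:
    """Original check: inspects only the first positional argument."""
    return bool(posargs) and posargs[0].endswith(".feature")


def any_arg_is_feature(posargs: Sequence[str]) -> bool:
    """Fixed check: inspects every positional argument."""
    return any(arg.endswith(".feature") for arg in posargs)


def behave_targets(posargs: Sequence[str], default_dir: str = "features/") -> List[str]:
    """Hypothetical helper: add the default directory only if no feature file was named."""
    targets = list(posargs)
    if not any_arg_is_feature(posargs):
        # No explicit feature file was given, so fall back to the whole directory.
        targets.append(default_dir)
    return targets
```

With `["--no-capture", "features/a.feature"]`, the first-argument check reports no feature file while the `any(...)` check finds one, so under this sketch only the latter keeps the fallback directory out of the target list.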

scripts/run_behave_parallel.py

Added or len(feature_paths) <= 2 to the sequential-execution guard. This prevents fork-based multiprocessing deadlocks on overlayfs when running a small number of feature files with many workers. The fix is well-commented and the rationale is sound.

Minor observation (non-blocking): With len(feature_paths) <= 2 added, the prior len(feature_paths) == 1 condition is now redundant (it is subsumed by <= 2). Consider removing it in a follow-up for clarity, but this does not affect correctness.
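
The redundancy noted above can be checked with a small sketch. The function names and the `workers` parameter are assumptions made for illustration; the actual guard in scripts/run_behave_parallel.py may include other conditions that are not shown in this thread.

```python
from typing import Sequence


def run_sequentially_merged(feature_paths: Sequence[str], workers: int) -> bool:
    """Guard as described in the review: the prior `== 1` test plus the new `<= 2` test.

    The `workers <= 1` term is an assumed extra condition, not taken from the PR.
    """
    return workers <= 1 or len(feature_paths) == 1 or len(feature_paths) <= 2


def run_sequentially_simplified(feature_paths: Sequence[str], workers: int) -> bool:
    """Same guard with the subsumed `== 1` test removed."""
    return workers <= 1 or len(feature_paths) <= 2


def guards_agree(max_features: int = 10, max_workers: int = 16) -> bool:
    """Compare both guards over a small grid of feature counts and worker counts."""
    for n in range(max_features + 1):
        paths = [f"features/f{i}.feature" for i in range(n)]
        for workers in range(1, max_workers + 1):
            if run_sequentially_merged(paths, workers) != run_sequentially_simplified(paths, workers):
                return False
    return True
```

Under these assumptions, `guards_agree()` holds for every combination tried, which matches the reviewer's point that dropping the `== 1` test would not change behaviour.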


CI Results

All 13 CI jobs completed successfully:

  • lint (44s)
  • typecheck (53s)
  • quality (34s)
  • security (4m7s)
  • unit_tests (8m43s)
  • integration_tests (7m3s)
  • e2e_tests (6m4s)
  • coverage (10m11s)
  • build (3m19s)
  • docker (1m56s)
  • helm (23s)
  • push-validation (20s)
  • status-check (1s)

All quality criteria satisfied. Ready to merge.


Automated by CleverAgents Bot
Supervisor: PR Review Pool | Agent: pr-reviewer
Worker: [AUTO-REV-9456]

Owner

Code Review Decision: APPROVED

All quality criteria met. The fix correctly resolves the nox -s unit_tests timeout for agent_skills_loader.feature and skill_search.feature by:

  1. Fixing multi-feature-file detection in noxfile.py (both unit_tests and coverage_report sessions)
  2. Adding a sequential-execution guard for ≤2 feature files in run_behave_parallel.py to prevent fork deadlocks on overlayfs

All 13 CI jobs passed. Coverage ≥ 97%. Commit format, ISSUES CLOSED footer, and PR closing keyword all verified.


Automated by CleverAgents Bot
Supervisor: PR Review Pool | Agent: pr-reviewer
Worker: [AUTO-REV-9456]

HAL9000 force-pushed fix/agent-skills-unit-tests-timeout from d741d00831
All checks were successful
CI / push-validation (pull_request) Successful in 20s
CI / helm (pull_request) Successful in 23s
CI / quality (pull_request) Successful in 34s
CI / lint (pull_request) Successful in 44s
CI / typecheck (pull_request) Successful in 53s
CI / build (pull_request) Successful in 3m19s
CI / security (pull_request) Successful in 4m7s
CI / e2e_tests (pull_request) Successful in 6m4s
CI / integration_tests (pull_request) Successful in 7m3s
CI / unit_tests (pull_request) Successful in 8m43s
CI / docker (pull_request) Successful in 1m56s
CI / coverage (pull_request) Successful in 10m11s
CI / status-check (pull_request) Successful in 1s
to 4b95daeebb
All checks were successful
CI / push-validation (pull_request) Successful in 16s
CI / helm (pull_request) Successful in 19s
CI / security (pull_request) Successful in 33s
CI / lint (pull_request) Successful in 41s
CI / quality (pull_request) Successful in 42s
CI / typecheck (pull_request) Successful in 53s
CI / build (pull_request) Successful in 3m20s
CI / integration_tests (pull_request) Successful in 4m16s
CI / e2e_tests (pull_request) Successful in 4m38s
CI / unit_tests (pull_request) Successful in 5m37s
CI / docker (pull_request) Successful in 1m18s
CI / coverage (pull_request) Successful in 12m22s
CI / status-check (pull_request) Successful in 1s
2026-04-14 21:33:25 +00:00
freemo scheduled this pull request to auto merge when all checks succeed 2026-04-14 21:54:23 +00:00
freemo canceled auto merging this pull request when all checks succeed 2026-04-15 01:08:10 +00:00
HAL9000 force-pushed fix/agent-skills-unit-tests-timeout from 4b95daeebb
to b8732dfc6f
All checks were successful
CI / build (pull_request) Successful in 17s
CI / helm (pull_request) Successful in 17s
CI / push-validation (pull_request) Successful in 10s
CI / lint (pull_request) Successful in 39s
CI / quality (pull_request) Successful in 50s
CI / typecheck (pull_request) Successful in 52s
CI / security (pull_request) Successful in 53s
CI / e2e_tests (pull_request) Successful in 2m13s
CI / coverage (pull_request) Successful in 5m35s
CI / integration_tests (pull_request) Successful in 6m40s
CI / unit_tests (pull_request) Successful in 7m38s
CI / docker (pull_request) Successful in 1m45s
CI / status-check (pull_request) Successful in 1s
CI / lint (push) Successful in 16s
CI / quality (push) Successful in 17s
CI / build (push) Successful in 23s
CI / helm (push) Successful in 24s
CI / typecheck (push) Successful in 53s
CI / security (push) Successful in 53s
CI / push-validation (push) Successful in 38s
CI / e2e_tests (push) Successful in 3m14s
CI / unit_tests (push) Successful in 6m37s
CI / integration_tests (push) Successful in 6m39s
CI / docker (push) Successful in 12s
CI / coverage (push) Successful in 10m53s
CI / status-check (push) Successful in 1s
2026-04-15 08:20:36 +00:00
HAL9000 scheduled this pull request to auto merge when all checks succeed 2026-04-15 08:31:52 +00:00
HAL9000 merged commit b8732dfc6f into master 2026-04-15 08:36:56 +00:00
Author
Owner

Re-Review Attempt — Data Integrity Issues Found

Review was attempted but could not be formally submitted. Key findings:

⚠️ PR Data Mismatch

PR number 9456 refers to a different PR than the one described in the re-review request:

| Field | Prompt Claims | Actual PR #9456 on Forgejo |
|---|---|---|
| Title | fix(tui): fix thread-safety race in reference_parser catalog cache | fix(tests): resolve nox unit_tests timeout |
| Branch | fix/concurrency-catalog-cache-lock-7590 | fix/agent-skills-unit-tests-timeout |
| Linked Issue | #7590 | #9374 |
| Merge Status | Not indicated | Already merged |

Review Outcome

The PR is already merged with:

  • Previous review: APPROVED by HAL9001 (stale)
  • All required CI gates passing (unit_tests, integration_tests, e2e_tests, coverage, security)

No REQUEST_CHANGES feedback was found to verify compliance against.

Recommendation

The re-review request appears to be based on incorrect data. This may indicate data corruption or a race condition in the review trigger pipeline. No further action is needed for this already-merged PR.


Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker

Reference
cleveragents/cleveragents-core!9456