[AUTO-INF-6] Shore up coverage for autonomy controller and automation tooling #9790

Open
opened 2026-04-15 15:50:21 +00:00 by HAL9000 · 0 comments
Owner

Summary

  • scripts/validate_automation_tracking.py is uncovered by tests even though it enforces automation issue naming conventions and label policy.
  • scripts/create_template_db.py has no automated verification for Alembic stamping or file permissions despite being invoked by every nox session.
  • src/cleveragents/application/services/autonomy_controller.py lacks direct tests for its weighting, history, and explanation logic that gate automation decisions.

Findings

scripts/validate_automation_tracking.py

  • The coverage config (see pyproject.toml tool.coverage.run.source) includes the scripts package, but the features directory listing shows no scenarios referencing automation_tracking (GET /api/v1/repos/cleveragents/cleveragents-core/contents/features?ref=master filtered via jq returned zero matches for automation_tracking).
  • CLI pathways (--title, --repo, --validate-all) and helpers (validate_tracking_title, validate_automation_tracking_issue, _run_validate_all) guard issue formatting but are untested, so regressions in the title regex or label enforcement pass CI undetected.
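
A title validation matrix for this gap could look roughly like the sketch below. The regex and the boolean return value are assumptions: the issue does not show the real validate_tracking_title, so the stand-in here only infers a `[AUTO-<TYPE>-<cycle>]` shape from this issue's own title.

```python
import re

# Hypothetical stand-in for validate_tracking_title. The real pattern and
# return type are not shown in this issue; this one is inferred from the
# "[AUTO-INF-6]" prefix only.
_TITLE_RE = re.compile(r"^\[AUTO-(?P<type>[A-Z]+)-(?P<cycle>\d+)\] \S.*$")


def validate_tracking_title(title: str) -> bool:
    """Return True when the title carries a well-formed tracking prefix."""
    return _TITLE_RE.match(title) is not None


# Validation matrix: (title, expected verdict).
TITLE_CASES = [
    ("[AUTO-INF-6] Shore up coverage", True),
    ("[AUTO-INF-] Missing cycle", False),
    ("[AUTO-inf-6] Lowercase type", False),
    ("[MANUAL-INF-6] Wrong prefix", False),
    ("[AUTO-INF-6]", False),  # prefix without a summary
]


def run_title_matrix() -> list[str]:
    """Return every title whose verdict differs from the expectation."""
    return [
        title
        for title, expected in TITLE_CASES
        if validate_tracking_title(title) != expected
    ]
```

In a real suite the same cases would more naturally feed pytest.mark.parametrize against the imported helper, so each row reports as its own test.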

scripts/create_template_db.py

  • The same coverage config includes scripts, yet a search for template_db within the feature catalog returned nothing; there is no fast unit test covering create_template.
  • The helper deletes and recreates SQLite files, stamps Alembic head via ScriptDirectory.get_current_head, and chmods 0o664; if head detection or permissions break, the pre-migrated template used by behave/robot suites fails during CI without early warning.
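
The behaviour described above can be checked with nothing more than sqlite3 and stat. The sketch below uses a hypothetical stand-in for create_template (the real signature is not shown in this issue) that takes the head revision as a parameter instead of calling Alembic, so it runs without the project installed. The alembic_version table and its version_num column follow Alembic's default naming.

```python
import os
import sqlite3
import stat
import tempfile
from pathlib import Path
from typing import Optional


def create_template(db_path: Path, head: Optional[str]) -> None:
    """Hypothetical stand-in: recreate the file, stamp the head, chmod 0o664."""
    if head is None:
        # Mirrors the RuntimeError the issue expects when no head is found.
        raise RuntimeError("no Alembic head revision found")
    db_path.unlink(missing_ok=True)  # delete any stale template first
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE alembic_version (version_num VARCHAR(32) NOT NULL)"
        )
        conn.execute(
            "INSERT INTO alembic_version (version_num) VALUES (?)", (head,)
        )
        conn.commit()
    finally:
        conn.close()
    os.chmod(db_path, 0o664)


def check_template(db_path: Path, expected_head: str) -> None:
    """Assert the stamped revision and the permission bits of the template."""
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute("SELECT version_num FROM alembic_version").fetchone()
    finally:
        conn.close()
    assert row == (expected_head,)
    # S_IMODE strips the file-type bits so only the permission bits remain.
    assert stat.S_IMODE(db_path.stat().st_mode) == 0o664
```

With pytest, tmp_path would replace the manual temporary directory, and the real test would compare against ScriptDirectory.get_current_head() rather than a literal. The mode assertion assumes a POSIX filesystem.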

src/cleveragents/application/services/autonomy_controller.py

  • Feature search for autonomy_controller returned no scenarios; current guardrail suites exercise autonomy_guardrail_service but not the controller weighting engine itself.
  • AutonomyController.should_proceed_automatically combines per-factor weights, per-operation thresholds (_get_threshold fallback to 1.0), history tracking with _MAX_HISTORY_PER_TYPE, and explanation text – none of which have deterministic unit coverage today.
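
To make the interplay of weights, threshold fallback, and bounded history concrete, here is a minimal sketch. MiniController is a hypothetical model, not the real AutonomyController: the factor names, the `<operation>_threshold` attribute convention, and the history limit of 3 are all assumptions chosen for illustration.

```python
from collections import defaultdict, deque
from types import SimpleNamespace

_MAX_HISTORY_PER_TYPE = 3  # hypothetical value; the real constant is not shown


class MiniController:
    """Hypothetical, simplified model of the controller's decision path."""

    def __init__(self, weights, profile):
        # Reject weight sets that do not sum to one.
        if abs(sum(weights.values()) - 1.0) > 1e-9:
            raise ValueError("weights must sum to 1.0")
        self._weights = weights
        self._profile = profile
        # One bounded history per operation type; old entries fall off.
        self._history = defaultdict(
            lambda: deque(maxlen=_MAX_HISTORY_PER_TYPE)
        )

    def compute_confidence(self, factors):
        """Weighted sum of per-factor scores in [0, 1]."""
        return sum(self._weights[name] * factors[name] for name in self._weights)

    def _get_threshold(self, operation):
        # Assumed naming scheme; falls back to 1.0 when the profile
        # lacks the attribute, as the issue describes.
        return getattr(self._profile, f"{operation}_threshold", 1.0)

    def should_proceed_automatically(self, operation, factors):
        return self.compute_confidence(factors) >= self._get_threshold(operation)

    def record_outcome(self, operation, success):
        self._history[operation].append(success)

    def history(self, operation):
        return list(self._history[operation])
```

The fallback of 1.0 means an operation with no configured threshold proceeds automatically only at full confidence, which is the case a unit test should pin down first.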

Proposed Tests

  • Add a pytest module under tests/scripts/test_validate_automation_tracking.py exercising: title validation matrix (valid and invalid prefix/type/cycle), announcement format, and error messaging; label/body validation via validate_automation_tracking_issue; CLI behaviour for --title, --validate-all, and the repo path (patch get_tracking_issues_from_repo, assert exit codes and output).
  • Add a pytest module for scripts/create_template_db.py that runs create_template against a temporary path, opens the resulting database, asserts that alembic_version equals ScriptDirectory.get_current_head() and that the required tables exist, asserts that the file mode is 0o664, and simulates ScriptDirectory.get_current_head() returning None to confirm that RuntimeError propagates.
  • Add targeted unit tests for AutonomyController covering _validate_weights (reject missing or extra keys and non-unit sums), compute_confidence weighting and inversion rules, _get_threshold fallback to 1.0 when the profile lacks an attribute, record_outcome history bounding at _MAX_HISTORY_PER_TYPE (exercise with concurrent writers), _build_explanation text composition, and end-to-end should_proceed_automatically decisions.
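
The concurrent-writers check on history bounding can stay deterministic if the assertion targets the invariant (the final size) rather than the interleaving. The sketch below is an assumption-laden illustration: BoundedHistory and hammer are hypothetical names, and the limit of 50 merely stands in for _MAX_HISTORY_PER_TYPE.

```python
import threading
from collections.abc import Hashable
from concurrent.futures import ThreadPoolExecutor

MAX_HISTORY = 50  # hypothetical stand-in for _MAX_HISTORY_PER_TYPE


class BoundedHistory:
    """Hypothetical lock-protected history that keeps the newest entries."""

    def __init__(self, limit: int) -> None:
        if limit <= 0:
            raise ValueError("limit must be positive")
        self._limit = limit
        self._items: list[Hashable] = []
        self._lock = threading.Lock()

    def record(self, outcome: Hashable) -> None:
        with self._lock:
            self._items.append(outcome)
            # Drop everything except the newest `limit` entries.
            del self._items[:-self._limit]

    def snapshot(self) -> list[Hashable]:
        with self._lock:
            return list(self._items)


def hammer(history: BoundedHistory, writers: int, per_writer: int) -> None:
    """Run `writers` threads that each record `per_writer` unique outcomes."""

    def write(worker_id: int) -> None:
        for i in range(per_writer):
            history.record((worker_id, i))

    with ThreadPoolExecutor(max_workers=writers) as pool:
        # Consume the iterator so any worker exception surfaces here.
        list(pool.map(write, range(writers)))
```

Whatever order the threads run in, the snapshot must hold exactly the limit once more than that many outcomes were written, so the test does not depend on scheduling.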

CI Reliability Impact

  • Unit coverage on these paths will catch regressions in automation issue hygiene and database bootstrap scripts before they cascade into high CI failure rates.
  • Exercising AutonomyController logic under deterministic tests reduces reliance on long-running behaviour suites when threshold math changes, preventing flaky escalations.

Next Steps

  • Implement the pytest suites described above and wire them into nox -s unit_tests.
  • Ensure the new tests are included in coverage reporting (no changes needed to tool.coverage.run.omit).
  • Follow up with documentation updates once tests land.

Duplicate Check

  • GET /api/v1/repos/cleveragents/cleveragents-core/issues?q=AUTO-INF-6&state=open&limit=50&page=1 – no existing coverage-gap issue overlaps these topics.
  • GET /api/v1/repos/cleveragents/cleveragents-core/issues?q=automation%20tracking&state=open&limit=50&page=1 – no open issue mentions validate_automation_tracking.py.
  • GET /api/v1/repos/cleveragents/cleveragents-core/issues?q=autonomy%20controller&state=open&limit=50&page=1 – no open issue covers the autonomy controller weighting engine.
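
For anyone repeating the duplicate check, the three query paths above can be rebuilt with the standard library. This is only a sketch: duplicate_check_path is a hypothetical helper, and the host is left out because the issue lists paths only.

```python
from urllib.parse import quote, urlencode

REPO_ISSUES_PATH = "/api/v1/repos/cleveragents/cleveragents-core/issues"


def duplicate_check_path(
    query: str, state: str = "open", limit: int = 50, page: int = 1
) -> str:
    """Build the issue-search path used for the duplicate check."""
    params = {"q": query, "state": state, "limit": limit, "page": page}
    # quote_via=quote encodes spaces as %20 (the default quote_plus uses "+").
    return f"{REPO_ISSUES_PATH}?{urlencode(params, quote_via=quote)}"
```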

Reference
cleveragents/cleveragents-core#9790