test(validation): add semantic validation suites #510

Merged
CoreRasurae merged 1 commit from feature/m6-validation-semantic into master 2026-03-02 22:35:50 +00:00

Summary

Add comprehensive semantic validation test suites covering five new fixture categories, expanding BDD, Robot Framework, and ASV benchmark coverage for the semantic validation subsystem.

Closes #316

Changes

Fixture Files (5 new JSON files)

  • features/fixtures/validation/language_porting_mismatches.json — 8 fixtures for porting mismatch patterns
  • features/fixtures/validation/dependency_graph_violations.json — 6 fixtures for dependency violation patterns
  • features/fixtures/validation/api_surface_changes.json — 8 fixtures for API surface change patterns
  • features/fixtures/validation/cross_file_symbols.json — 6 fixtures for cross-file symbol resolution
  • features/fixtures/validation/circular_import_detection.json — 5 fixtures for circular import patterns
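The per-file fixture counts listed above can be checked mechanically. The following is a minimal sketch, not the project's loader: it assumes each fixture file is either a JSON array of fixture objects or a JSON object with a top-level "fixtures" array, and the names load_fixtures and count_fixtures are hypothetical.

```python
import json
from pathlib import Path
from typing import Any


def load_fixtures(path: Path) -> list[dict[str, Any]]:
    """Load one fixture file and return its fixture entries.

    Assumption: the file is a JSON array of objects, or a JSON object
    with a top-level "fixtures" array. The real schema may differ.
    """
    data = json.loads(path.read_text(encoding="utf-8"))
    entries = data.get("fixtures", []) if isinstance(data, dict) else data
    if not isinstance(entries, list):
        raise ValueError(f"{path}: expected a list of fixtures")
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            raise ValueError(f"{path}: fixture {index} is not an object")
    return entries


def count_fixtures(directory: Path) -> dict[str, int]:
    """Map each *.json file name in `directory` to its fixture count."""
    return {
        path.name: len(load_fixtures(path))
        for path in sorted(directory.glob("*.json"))
    }
```

Calling count_fixtures(Path("features/fixtures/validation")) from the repository root would give one count per fixture file, which can then be compared with the numbers in this description.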

BDD Tests (38 new scenarios)

  • features/semantic_validation_suite.feature — 33 per-fixture scenarios testing all six built-in rules against fixture-driven inputs, plus 5 schema-validation scenarios for fixture loading verification
  • features/steps/semantic_validation_suite_steps.py — Step definitions with fixture loading, rule dispatch, and assertion logic
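The "fixture loading, rule dispatch, and assertion logic" pattern can be illustrated without Behave. This is a sketch under stated assumptions, not the code in semantic_validation_suite_steps.py: the rule name "circular_import", the fixture keys ("rule", "input", "expect_violation"), and the "imports" payload shape are all hypothetical, and the rule is a plain depth-first search for back edges rather than one of the project's six built-in rules.

```python
from typing import Any, Callable

Rule = Callable[[dict[str, Any]], list[str]]


def circular_import_rule(payload: dict[str, Any]) -> list[str]:
    """Hypothetical rule: report each import that closes a cycle.

    Assumes payload["imports"] maps a module to the modules it imports.
    """
    graph: dict[str, list[str]] = payload.get("imports", {})
    state: dict[str, int] = {}  # 1 = on the current DFS path, 2 = finished
    findings: list[str] = []

    def visit(module: str) -> None:
        state[module] = 1
        for target in graph.get(module, []):
            if state.get(target) == 1:
                # Back edge: target is an ancestor on the current path.
                findings.append(f"cycle: {module} -> {target}")
            elif target not in state:
                visit(target)
        state[module] = 2

    for module in graph:
        if module not in state:
            visit(module)
    return findings


# Registry used for dispatch; a real suite would register every rule here.
RULES: dict[str, Rule] = {"circular_import": circular_import_rule}


def run_fixture(fixture: dict[str, Any]) -> bool:
    """Dispatch a fixture to its rule and compare with the expectation."""
    findings = RULES[fixture["rule"]](fixture["input"])
    return bool(findings) == fixture["expect_violation"]
```

A step definition would then only need to load a fixture by name and assert that run_fixture returns True, which keeps the per-fixture scenarios uniform.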

Robot Framework Integration Tests (11 new test cases)

  • robot/semantic_validation_suite.robot — End-to-end rule execution tests via SemanticValidationService
  • robot/helper_semantic_validation_suite.py — Python helper for Robot test keyword implementations

ASV Benchmarks (6 suites)

  • benchmarks/semantic_validation_suite_bench.py — Performance baselines for batch validation throughput and per-rule latency
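For readers unfamiliar with ASV, a benchmark suite is a plain class whose time_* methods are timed, with setup() run beforehand and excluded from the measurement. The sketch below shows that shape only: validate_batch is a hypothetical stand-in for the real service call, and the class name, parameters, and fixture keys are assumptions rather than the contents of semantic_validation_suite_bench.py.

```python
from typing import Any


def validate_batch(fixtures: list[dict[str, Any]]) -> int:
    """Hypothetical stand-in for the service; counts flagged fixtures."""
    return sum(1 for fixture in fixtures if fixture.get("expect_violation"))


class BatchValidationSuite:
    """ASV discovers this class and times every method named time_*."""

    # Each value in params produces a separate benchmark result.
    params = [10, 100, 1000]
    param_names = ["batch_size"]

    def setup(self, batch_size: int) -> None:
        # Runs before the timed method and is not part of the measurement.
        self.fixtures = [
            {"id": i, "expect_violation": i % 2 == 0}
            for i in range(batch_size)
        ]

    def time_validate_batch(self, batch_size: int) -> None:
        # Only the body of this method is timed.
        validate_batch(self.fixtures)
```

Parameterizing on batch size is one way to obtain a throughput baseline; per-rule latency would typically use a second suite parameterized on the rule name.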

Documentation

  • docs/reference/semantic_validation_coverage.md — Coverage expectations and fixture category documentation

Verification

All 11 nox sessions pass:

  • lint: clean (ruff)
  • format: clean
  • typecheck: clean (Pyright)
  • security_scan: clean
  • dead_code: clean
  • unit_tests: 7594 scenarios, 0 failures
  • integration_tests: 1020 tests, 0 failures
  • docs: builds successfully
  • build: wheel builds successfully
  • benchmark: all ASV benchmarks complete
  • coverage_report: 97% (meets ≥97% threshold)

Dependencies

  • Depends on existing semantic validation infrastructure (#282)
  • No dependency on #332 (safety profile model)
freemo added this to the v3.5.0 milestone 2026-03-02 16:33:56 +00:00
CoreRasurae force-pushed feature/m6-validation-semantic from 2e5c28b922
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 15s
CI / build (pull_request) Successful in 18s
CI / quality (pull_request) Successful in 18s
CI / security (pull_request) Successful in 32s
CI / typecheck (pull_request) Successful in 32s
CI / unit_tests (pull_request) Successful in 1m58s
CI / docker (pull_request) Successful in 39s
CI / integration_tests (pull_request) Successful in 2m50s
CI / coverage (pull_request) Successful in 3m35s
CI / benchmark-regression (pull_request) Successful in 22m38s
to b0ce0972df
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 16s
CI / build (pull_request) Successful in 17s
CI / quality (pull_request) Successful in 17s
CI / security (pull_request) Successful in 30s
CI / typecheck (pull_request) Successful in 33s
CI / unit_tests (pull_request) Successful in 1m54s
CI / docker (pull_request) Successful in 38s
CI / integration_tests (pull_request) Successful in 2m45s
CI / coverage (pull_request) Successful in 3m30s
CI / benchmark-regression (pull_request) Successful in 23m4s
2026-03-02 19:23:44 +00:00
brent.edwards approved these changes 2026-03-02 20:58:34 +00:00
Dismissed
brent.edwards left a comment

Review — PR #510 test(validation): add semantic validation suites

Verdict: Approve with comments

Well-structured testing PR. The centralized suite_helpers.py shared across Behave, Robot, and ASV is clean and DRY. Fixtures use a proper JSON schema with validation, known-limitation scenarios are explicitly documented with dedicated fixtures, type annotations are present throughout, and the commit message correctly references issue #316. All 11 nox sessions reported green at ≥97% coverage.


P2:should-fix

1. PR label mismatch
The PR carries Type/Feature but the linked issue #316 is labelled Type/Testing. This is a test-only change with no production code — please update the PR label to Type/Testing.

2. Fixture counts in PR description are incorrect
Three of the five suite counts listed in the PR body do not match the actual fixture files:

| Suite | PR body says | Actual |
|---|---|---|
| Language porting mismatches | 6 | 8 |
| API surface changes | 6 | 8 |
| Cross-file symbols | 5 | 6 |

Dependency graph (6) and circular import (5) are correct.

3. Documentation scenario count off by one
docs/reference/semantic_validation_coverage.md states "39 scenarios" but the feature file (features/semantic_validation_suite.feature) contains 38 scenarios (33 per-fixture + 5 schema-validation).
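A scenario count can be taken directly from the feature file so the documentation figure does not drift. This is a rough sketch: count_scenarios is a hypothetical helper, it counts Scenario, Scenario Outline, and Scenario Template headers once each, and it does not expand outlines by their Examples rows, so it only matches the runner's total when the file contains no outlines.

```python
import re

# Matches a scenario header at the start of a line, ignoring indentation.
# Commented-out headers ("# Scenario: ...") do not match.
SCENARIO_RE = re.compile(
    r"^[ \t]*Scenario(?: Outline| Template)?:", re.MULTILINE
)


def count_scenarios(feature_text: str) -> int:
    """Count scenario headers in the text of a Gherkin feature file."""
    return sum(1 for _ in SCENARIO_RE.finditer(feature_text))
```

Reading features/semantic_validation_suite.feature and passing its text to count_scenarios gives a number to compare against the figure stated in the coverage document.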


P3:nit

4. "33 new scenarios" in PR body understates the total
The PR body says "33 new scenarios" — this only counts the per-fixture scenarios. The actual total is 38 when including the 5 schema-validation scenarios defined in the same feature file.


Per the review playbook, P2 items may be addressed in a follow-up PR within 3 days. No P0 or P1 findings.

CoreRasurae force-pushed feature/m6-validation-semantic from b0ce0972df
to 17d931c5d2
Some checks failed
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 15s
CI / build (pull_request) Successful in 16s
CI / quality (pull_request) Successful in 18s
CI / security (pull_request) Successful in 31s
CI / typecheck (pull_request) Successful in 33s
CI / unit_tests (pull_request) Successful in 1m49s
CI / docker (pull_request) Successful in 38s
CI / integration_tests (pull_request) Successful in 2m50s
CI / coverage (pull_request) Successful in 3m29s
CI / benchmark-regression (pull_request) Has been cancelled
2026-03-02 21:44:06 +00:00
CoreRasurae dismissed brent.edwards's review 2026-03-02 21:44:06 +00:00
Reason:

New commits pushed, approval review dismissed automatically according to repository settings

brent.edwards left a comment

Approved.

CoreRasurae force-pushed feature/m6-validation-semantic from 17d931c5d2
to 7e6f6fae37
All checks were successful
CI / benchmark-publish (pull_request) Has been skipped
CI / lint (pull_request) Successful in 12s
CI / build (pull_request) Successful in 15s
CI / quality (pull_request) Successful in 17s
CI / typecheck (pull_request) Successful in 31s
CI / security (pull_request) Successful in 31s
CI / unit_tests (pull_request) Successful in 1m46s
CI / docker (pull_request) Successful in 38s
CI / integration_tests (pull_request) Successful in 2m52s
CI / coverage (pull_request) Successful in 3m41s
CI / benchmark-regression (pull_request) Successful in 23m24s
CI / lint (push) Successful in 14s
CI / build (push) Successful in 15s
CI / quality (push) Successful in 17s
CI / typecheck (push) Successful in 33s
CI / benchmark-regression (push) Has been skipped
CI / security (push) Successful in 35s
CI / unit_tests (push) Successful in 2m13s
CI / docker (push) Successful in 38s
CI / integration_tests (push) Successful in 2m58s
CI / coverage (push) Successful in 3m44s
CI / benchmark-publish (push) Successful in 13m19s
2026-03-02 22:06:02 +00:00
CoreRasurae deleted branch feature/m6-validation-semantic 2026-03-02 22:35:51 +00:00
Reference
cleveragents/cleveragents-core!510