feat(validation): add semantic validation service #449
No reviewers
Labels
No labels
auto/needs-reevaluation
controller-managed
auto/blocked-by-deps
auto/ci-timeout
auto/claimed-implementer
auto/claimed-merge
auto/claimed-reviewer
auto/driver-down
auto/invariant-violation
auto/last-attempt-tier-0
auto/last-attempt-tier-1
auto/last-attempt-tier-2
auto/last-attempt-tier-min
Automation Tracking
auto/needs-conflict-resolution
auto/needs-implementer
auto/postmortem
auto/ready-to-merge
auto/restart-throttled
auto/revert
auto/sentinel
auto/stale-inactivity
auto/unstable
Blocked
Bounty
$100
Bounty
$1000
Bounty
$10000
Bounty
$20
Bounty
$2000
Bounty
$250
Bounty
$50
Bounty
$500
Bounty
$5000
Bounty
$750
MoSCoW
Could have
MoSCoW
Must have
MoSCoW
Should have
Needs Feedback
Points
1
Points
13
Points
2
Points
21
Points
3
Points
34
Points
5
Points
55
Points
8
Points
88
Priority
Backlog
Priority
CI Blocker
Priority
Critical
Priority
High
Priority
Low
Priority
Medium
Signed-off: Owner
Signed-off: Scrum Master
Signed-off: Tech Lead
Spike
State
Completed
State
Duplicate
State
In Progress
State
In Review
State
Paused
State
Unverified
State
Verified
State
Wont Do
Type
Automation
Type
Bug
Type
Discussion
Type
Documentation
Type
Epic
Type
Feature
Type
Legendary
Type
Refactor
Type
Support
Type
Task
Type
Testing
No project
No assignees
2 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
cleveragents/cleveragents-core!449
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "feature/m6-semantic-validation"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Description
Adds the
SemanticValidationService— a new application-layer service that providesAST-based semantic analysis checks for Python projects during the strategize/execute plan
phases. Semantic checks are exposed as Validation tools (the read-only Tool subtype
defined in the spec and ADR-013) attachable per resource via the Tool Registry, integrating
into the existing
ValidationPipelineasinformationalby default.Closes #207
Core Service (
semantic_validation_service.py)SemanticValidationService— orchestrator that runs all registered rules against sourcefiles, maps results through the severity/mode pipeline, and returns normalised output
compatible with
ValidationPipeline.SemanticRuleRegistry— register/lookup/list pluggable rule implementations by name;ships pre-loaded with all built-in rules via
create_default_registry().SemanticValidationCache— file-hash-keyed LRU cache (thread-safe, bounded with LRUeviction) to skip re-analysis of unchanged files.
SemanticValidationRule—Protocolfor pluggable rules: each rule exposes anameproperty and a
check(source, filename) -> SemanticCheckResultmethod.resolve_severity()andmap_severity_to_mode()for mapping rule severityto
requiredvsinformationalvalidation mode.Built-in Rules (
semantic_validation_rules.py)Six AST-based, lightweight heuristic rules that avoid executing user code:
SyntaxCheckRuleast.parseMissingImportRuleBrokenReferenceRuleDuplicateImportRuleAPIMisuseRuleeval,exec,os.system,pickle.load, etc.) via AST analysisMissingSymbolRuleDependencyCycleRuleis retained as a backward-compatible alias forDuplicateImportRule.Configuration
validation.semantic.enabled— global toggle (default:true)validation.semantic.python.enabled— Python-specific toggle (default:true)validation.semantic.severity_mapping— per-rule severity overrides (info/warn/error)Severity and Pipeline Integration
Each rule's severity determines how it integrates with the
ValidationPipeline:error→requiredmode (failure blocks the operation)warn/info→informationalmode (reported but does not block)Public API
All new types are re-exported from
cleveragents.application.services.__init__for cleandownstream imports.
Spec Alignment
This implementation fulfils the Validation abstraction defined in
docs/specification.md:passedboolean, optionaldata, and optionalmessagefields.writes=false,checkpointable=false).informationalby default; severity mapping allows promoting individual rulesto
requiredmode.Type of Change
Quality Checklist
Anyunless justified)nox -s typecheckpasses with no errorsnox -s lintpasses with no errorsfeatures/)robot/) if applicablenox -s coverage_report)nox -s security_scan)nox -s dead_code)Testing
76 Behave BDD scenarios covering all rules, registry, cache, severity mapping, config keys,
pipeline integration, safe initialisation/cleanup, and edge-case code patterns. Robot Framework
integration tests (10 tests) and ASV performance benchmarks (5 suites) are also included.
Coverage for the two new source modules:
semantic_validation_service.py— 100% (163 stmts, 28 branches)semantic_validation_rules.py— 99% (237 stmts, 134 branches; 3 partial branch exitsremain on fully-covered lines)
Test Commands Run
Related Issues
Closes #207
Implementation Notes
check_file()returns[]immediately).SemanticValidationCacheuses anOrderedDict-based LRU with a configurablemax_size(default 4096). All operations are thread-safe via a
threading.Lock.APIMisuseRuleinspectsast.Callnodes rather than using regex, so it does notfalse-positive on string literals containing function names like
"eval".BrokenReferenceRuleperforms scope-aware name collection across all nesting levels(functions, classes, loops, comprehensions, exception handlers) to minimise false positives.
resolve_severity()gracefully falls back toinfofor unknown rules or invalid severityvalues in the configuration mapping.
fce31611ff81adaf0cac81adaf0cac84fa79515fReview:
feat(validation): add semantic validation servicePre-PR Checklist (PROTOCOL.md §21)
ISSUES CLOSED: #207Type/, noMoSCoW/,Points/,Priority/,enhancementType/FeatureonlyAnytypes in signaturesP1 — Must Fix
P1-1: Missing
CHANGELOG.mdupdatePre-PR checklist item 3 requires CHANGELOG.md to be updated. It is not touched in this PR.
P1-2:
Anyin public function signaturessemantic_validation_service.py:288—config: dict[str, Any] | None = NoneinSemanticValidationService.__init__semantic_validation_service.py:385—as_pipeline_resultsreturnslist[dict[str, Any]]semantic_validation_service.py:406—normalise_outputreturnsdict[str, Any]semantic_validation_rules.py:438—data: dict[str, Any] | NoneinSemanticCheckResultmodelPer DANGER_ZONE.md: "No
Anytypes in signatures." Consider introducing typed dicts (PipelineResultDict,NormalisedOutputDict) or at minimumMapping[str, object]for the return types. Thedatafield on the Pydantic model is arguably acceptable since it is genuinely unstructured, but the config and return types should be tightened.P1-3:
__init__.pyexports are incomplete — missing rule classesSyntaxCheckRule,MissingImportRule,BrokenReferenceRule,APIMisuseRule,MissingSymbolRule,DependencyCycleRuleare not exported from__init__.py, yet they are listed invulture_whitelist.pyand used by the Robot helper and Behave steps via direct imports. The__init__.pyonly exportsDuplicateImportRulefrom the rules module. All six rule classes should be exported and included in__all__.P1-4:
loggingused instead ofstructlogsemantic_validation_service.py:32usesimport loggingandlogging.getLogger(__name__). The rest of the codebase usesstructlog.get_logger(__name__)(seedecision_service.py,plan_lifecycle_service.py, etc.). This should usestructlogfor consistency and to benefit from the secrets-masking structlog processor.P1-5: Docs claim
subprocess.call/run/Popenare detected — code does notdocs/reference/semantic_validation.mdlistssubprocess.call(),subprocess.run(),subprocess.Popen()as detected dangerous calls.APIMisuseRule._DANGEROUS_ATTRSonly has entries foros,pickle, andmarshal—subprocessis not covered. Either addsubprocessdetection to the rule or remove the claim from the docs.P2 — Should Fix
P2-1:
DuplicateImportRuleregistered under misleading namedependency_cycleThe rule's
nameproperty returns"dependency_cycle"but it does not detect dependency cycles — it detects duplicate relative imports within a single file. The docstring even says: "This rule does not perform cross-file cycle detection." Consider renaming to"duplicate_import"withDependencyCycleRuleas backward-compat alias, or at minimum document the mismatch more prominently.P2-2:
SemanticValidationRuleProtocol not exported from__init__.pyThe protocol class
SemanticValidationRuleis defined in the service module but not re-exported through__init__.py. Custom rule authors need it.P2-3:
_is_python_fileusesstr.endswithwith a loop instead of tuplesemantic_validation_service.py:317:any(filename.endswith(ext) for ext in _PYTHON_EXTENSIONS)— more idiomatic asfilename.endswith((".py", ".pyi")).P2-4: Cache default
max_sizeinconsistency between code and docsCode:
_DEFAULT_CACHE_MAX_SIZE = 4096(line 162). Docs: "default: 512 entries" (semantic_validation.md). Pick one and align.P2-5: Behave step type annotations use bare
Anyfor contextAll step functions in
features/steps/semantic_validation_steps.pyannotatecontextasAny. Should bebehave.runner.Context(withTYPE_CHECKINGguard if needed).P2-6:
BrokenReferenceRulehardcodes dunder exclusionsLines 692-698 exclude
__name__,__file__,__doc__,__all__,__annotations__. This is incomplete — misses__spec__,__loader__,__package__,__builtins__,__cached__,__path__which are commonly used at module level.P2-7:
MissingSymbolRulecomprehension variable handlingComprehension iteration variables (
[x for x in items]) definexinside comprehension scope in Python 3. The_collect_function_local_namespicks upast.NameinStorecontext viaast.walkwhich should cover this, but worth adding an explicit test scenario to confirm comprehension-scoped vars are not falsely flagged.P3 — Nit / Optional
P3-1: PR body is empty — should have a summary with bullet points describing the change.
P3-2:
_collect_defined_namesand_collect_all_scope_nameshave overlapping logic with subtle differences. Consider refactoring to share a common walker.P3-3:
SemanticCheckResultPydantic model usesuse_enum_values=Falsewhich is the default — the explicit setting is unnecessary.Summary
The code is well-structured with good separation (rules module vs. service module), proper caching, thread safety, and thorough test coverage (48 Behave scenarios, 16 Robot tests, 7 benchmark suites). The architecture follows existing service patterns.
5 P1s and 7 P2s need attention before merge.
84fa79515fe013190ac1minimal change is to add PR summary and reference the ticket
e013190ac12f54e01270New commits pushed, approval review dismissed automatically according to repository settings
2f54e01270f2b85bf587f2b85bf587a2be3e67b0