feat(acms): implement ACMS Domain-Specific Analyzer Implementations (Python, Markdown, PostgreSQL, Docker, etc.) #588

Closed
opened 2026-03-04 23:47:10 +00:00 by freemo · 0 comments
Owner

Metadata

| Field | Value |
|-------|-------|
| Commit Message | `feat(acms): implement ACMS Domain-Specific Analyzer Implementations (Python, Markdown, PostgreSQL, Docker, etc.)` |
| Branch | `feature/m6-acms-domain-specific-analyzers` |

Summary

Implement the domain-specific analyzer classes that parse resources and produce UKO triples. The spec defines analyzers across 4 domains: source code (Python, TypeScript, Rust, custom DSL), documents (Markdown, reStructuredText, HTML), data schemas (PostgreSQL, MySQL, SQLite), and infrastructure (Docker Compose, Kubernetes).

Spec Reference

Section: Architecture > Extensibility > ACMS Extensions > Domain and Language-Specific Analyzers
Lines: ~44210-44261

Current State

  • No analyzer classes exist in the codebase.
  • No Analyzer Protocol or base class is defined.
  • The UKOIndexer (which uses analyzers) is also not yet implemented.
  • BAL backend stubs exist, but there is no actual analysis/indexing pipeline.

Description

The spec defines analyzers organized by domain:

Source Code Analyzers (uko-code + Layers 2/3)

| Analyzer | Features |
|----------|----------|
| PythonAnalyzer | AST parsing, type inference via mypy, import resolution, docstring extraction |
| TypeScriptAnalyzer | TSC-based parsing, type extraction, module resolution, JSDoc parsing |
| RustAnalyzer | rust-analyzer integration, lifetime analysis, trait resolution, macro expansion |
| CustomDSLAnalyzer | Configurable via grammar file (PEG) + semantic rules YAML |
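
To make the PythonAnalyzer row concrete, here is a minimal sketch of AST-based extraction using only the standard-library `ast` module. The `Triple` tuple, the function name `analyze_python_source`, and every `uko-py:` predicate name are assumptions made for illustration; the real vocabulary comes from the UKO ontologies, and mypy-based type inference is not attempted here.

```python
import ast
from typing import NamedTuple


class Triple(NamedTuple):
    # Hypothetical stand-in for the project's Triple type.
    subject: str
    predicate: str
    obj: str


def analyze_python_source(source: str, path: str) -> list[Triple]:
    """Walk the AST and emit triples for classes, functions, and imports."""
    triples: list[Triple] = []
    tree = ast.parse(source, filename=path)
    for node in ast.walk(tree):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "uko-py:Class" if isinstance(node, ast.ClassDef) else "uko-py:Function"
            # Subjects are keyed by bare name only; a real analyzer would
            # need qualified names to avoid collisions between methods.
            subject = f"{path}#{node.name}"
            triples.append(Triple(subject, "rdf:type", kind))
            triples.append(
                Triple(subject, "uko-py:lineRange", f"{node.lineno}-{node.end_lineno}")
            )
            doc = ast.get_docstring(node)
            if doc:
                triples.append(Triple(subject, "uko-py:docstring", doc))
            for dec in node.decorator_list:
                # ast.unparse requires Python 3.9 or newer.
                triples.append(Triple(subject, "uko-py:decoratedBy", ast.unparse(dec)))
        elif isinstance(node, ast.Import):
            for alias in node.names:
                triples.append(Triple(path, "uko-py:imports", alias.name))
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                target = f"{node.module}.{alias.name}" if node.module else alias.name
                triples.append(Triple(path, "uko-py:imports", target))
    return triples
```

Calling `analyze_python_source(open("module.py").read(), "module.py")` would return a flat list that a later stage could enrich with provenance.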

Document Analyzers (uko-doc)

| Analyzer | Features |
|----------|----------|
| MarkdownAnalyzer | Heading hierarchy, link extraction, code block detection, topic inference, cross-reference resolution |
| ReStructuredTextAnalyzer | Directive parsing, role resolution, cross-reference resolution, index extraction |
| HTMLDocumentAnalyzer | Semantic HTML parsing, heading extraction, link graph, readability scoring |
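
As a rough illustration of the MarkdownAnalyzer features (heading hierarchy, link extraction, code block detection), the sketch below scans a document line by line. It handles only ATX headings, inline links, and backtick fences; the `Triple` tuple, the function name, and the `uko-doc:` predicate names are hypothetical, and a production analyzer would more likely build on a real Markdown parser.

```python
import re
from typing import NamedTuple


class Triple(NamedTuple):
    # Hypothetical stand-in for the project's Triple type.
    subject: str
    predicate: str
    obj: str


HEADING_RE = re.compile(r"^(#{1,6})\s+(.*?)\s*$")
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")
FENCE = "`" * 3  # opening/closing marker of a fenced code block


def analyze_markdown(text: str, path: str) -> list[Triple]:
    """Emit triples for headings, their nesting, links, and code blocks."""
    triples: list[Triple] = []
    stack: list[tuple[int, str]] = []  # (level, subject) of enclosing headings
    in_fence = False
    for lineno, line in enumerate(text.splitlines(), start=1):
        if line.startswith(FENCE):
            if not in_fence:
                triples.append(Triple(f"{path}#L{lineno}", "rdf:type", "uko-doc:CodeBlock"))
            in_fence = not in_fence
            continue
        if in_fence:
            continue  # never treat code block content as headings or links
        match = HEADING_RE.match(line)
        if match:
            level = len(match.group(1))
            subject = f"{path}#L{lineno}"
            triples.append(Triple(subject, "uko-doc:headingText", match.group(2)))
            # Pop headings at the same or deeper level to find the parent.
            while stack and stack[-1][0] >= level:
                stack.pop()
            if stack:
                triples.append(Triple(stack[-1][1], "uko-doc:hasSubsection", subject))
            stack.append((level, subject))
            continue
        for _label, target in LINK_RE.findall(line):
            source = stack[-1][1] if stack else path
            triples.append(Triple(source, "uko-doc:linksTo", target))
    return triples
```

Links are attributed to the nearest enclosing heading, which is one possible design; attributing them to the document as a whole would be equally defensible.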

Data Schema Analyzers (uko-data)

| Analyzer | Features |
|----------|----------|
| PostgreSQLAnalyzer | Schema introspection, DDL extraction, foreign key graph, view dependency analysis, stored procedure parsing, statistics collection |
| MySQLAnalyzer | Schema introspection, DDL extraction, foreign key graph, trigger parsing |
| SQLiteAnalyzer | Schema introspection, DDL extraction, virtual table detection |
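
Schema introspection and the foreign key graph can be sketched without external dependencies by using SQLite, since `sqlite3` ships with Python; a PostgreSQLAnalyzer would presumably query the system catalogs instead. The function name, the `Triple` tuple, and the `uko-data:` predicate names below are assumptions for illustration only.

```python
import sqlite3
from typing import NamedTuple


class Triple(NamedTuple):
    # Hypothetical stand-in for the project's Triple type.
    subject: str
    predicate: str
    obj: str


def analyze_sqlite(conn: sqlite3.Connection, db_name: str) -> list[Triple]:
    """Introspect tables, columns, and foreign keys of an open SQLite database."""
    triples: list[Triple] = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        t_subject = f"{db_name}/{table}"
        triples.append(Triple(t_subject, "rdf:type", "uko-data:Table"))
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        for row in conn.execute(f'PRAGMA table_info("{table}")'):
            c_subject = f"{t_subject}.{row[1]}"
            triples.append(Triple(t_subject, "uko-data:hasColumn", c_subject))
            triples.append(Triple(c_subject, "uko-data:columnType", row[2]))
        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
        for row in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            ref_table, from_col, to_col = row[2], row[3], row[4]
            triples.append(
                Triple(
                    f"{t_subject}.{from_col}",
                    "uko-data:references",
                    f"{db_name}/{ref_table}.{to_col}",
                )
            )
    return triples
```

The sketch assumes foreign keys name their target column explicitly; SQLite reports the target as NULL when a constraint references the parent's primary key implicitly, which a fuller implementation would need to resolve.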

Infrastructure Analyzers (uko-infra)

| Analyzer | Features |
|----------|----------|
| DockerComposeAnalyzer | Service graph, port mapping, volume resolution, environment variable extraction |
| KubernetesAnalyzer | Resource parsing, service dependency graph, ConfigMap/Secret references, ingress routing |
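
For the DockerComposeAnalyzer row, the sketch below derives a service graph, port mappings, and environment variable names from an already-parsed Compose document (for example the dict a YAML loader would return). YAML loading itself is left out to keep the example dependency-free. The function name, the `Triple` tuple, and the `uko-infra:` predicate names are hypothetical, and only the short port syntax is handled.

```python
from typing import Any, NamedTuple


class Triple(NamedTuple):
    # Hypothetical stand-in for the project's Triple type.
    subject: str
    predicate: str
    obj: str


def analyze_compose(compose: dict[str, Any], path: str) -> list[Triple]:
    """Emit service, dependency, port, and env-var triples from a parsed Compose file."""
    triples: list[Triple] = []
    services = compose.get("services") or {}
    for name, spec in services.items():
        spec = spec or {}
        subject = f"{path}#{name}"
        triples.append(Triple(subject, "rdf:type", "uko-infra:Service"))
        # depends_on is either a list of names or a mapping keyed by name;
        # iterating yields the service names in both cases.
        for dep in spec.get("depends_on") or []:
            triples.append(Triple(subject, "uko-infra:dependsOn", f"{path}#{dep}"))
        for port in spec.get("ports") or []:
            # Short syntax only ("8080:80" or 80); long-form mappings are skipped.
            if isinstance(port, (str, int)):
                triples.append(Triple(subject, "uko-infra:publishesPort", str(port)))
        env = spec.get("environment") or {}
        # environment is either a mapping or a list of KEY=VALUE strings.
        if isinstance(env, list):
            keys = [str(item).split("=", 1)[0] for item in env]
        else:
            keys = list(env)
        for key in keys:
            # Only names are recorded; values may contain secrets.
            triples.append(Triple(subject, "uko-infra:usesEnvVar", key))
    return triples
```

Recording variable names but not values is a deliberate choice in this sketch, since Compose files often embed credentials.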

Analyzer Protocol

Each analyzer must:

  1. Accept a Resource and produce a list of UKO triples
  2. Use the most specific UKO vocabulary available (Layer 3 > 2 > 1 > 0)
  3. Include provenance metadata (source file, line range, timestamp)
  4. Be registered in the analyzer registry for lookup by resource type
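
The first three requirements above could be expressed roughly as follows. The method names `analyze` and `can_handle` and the provenance fields come from this issue's acceptance criteria; the shapes of `Resource`, `Provenance`, and `Triple`, the toy `LineCountAnalyzer`, and the `uko:lineCount` predicate are assumptions, not the spec's definitions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Protocol, runtime_checkable


@dataclass(frozen=True)
class Resource:
    # Hypothetical minimal resource: identifier, type tag, and text content.
    uri: str
    resource_type: str
    content: str


@dataclass(frozen=True)
class Provenance:
    source_resource: str
    source_path: str
    source_range: tuple[int, int]  # first and last line, inclusive
    valid_from: datetime


@dataclass(frozen=True)
class Triple:
    subject: str
    predicate: str
    obj: str
    provenance: Provenance


@runtime_checkable
class Analyzer(Protocol):
    def can_handle(self, resource: Resource) -> bool: ...

    def analyze(self, resource: Resource) -> list[Triple]: ...


class LineCountAnalyzer:
    """Toy analyzer used only to show the protocol being satisfied."""

    def can_handle(self, resource: Resource) -> bool:
        return resource.resource_type == "text"

    def analyze(self, resource: Resource) -> list[Triple]:
        lines = resource.content.splitlines()
        provenance = Provenance(
            source_resource=resource.uri,
            source_path=resource.uri,
            source_range=(1, max(len(lines), 1)),
            valid_from=datetime.now(timezone.utc),
        )
        return [Triple(resource.uri, "uko:lineCount", str(len(lines)), provenance)]
```

Because `Analyzer` is a structural `Protocol`, concrete analyzers do not need to inherit from it; `isinstance(LineCountAnalyzer(), Analyzer)` only checks that both methods are present.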

Suggested phased approach:

  • Phase 1 (v3.5.0): PythonAnalyzer + MarkdownAnalyzer (most common use cases)
  • Phase 2 (v3.5.0): PostgreSQLAnalyzer + DockerComposeAnalyzer
  • Phase 3 (v3.6.0): TypeScript, Rust, remaining analyzers

Acceptance Criteria

  • Analyzer Protocol defined: analyze(resource: Resource) -> list[Triple], can_handle(resource: Resource) -> bool
  • AnalyzerRegistry with resource-type-based lookup
  • PythonAnalyzer: AST parsing producing uko-py: triples (classes, functions, imports, decorators, type annotations)
  • MarkdownAnalyzer: Heading hierarchy, link extraction, code block detection producing uko-doc: triples
  • PostgreSQLAnalyzer: Schema introspection producing uko-data: triples (tables, columns, foreign keys, constraints)
  • DockerComposeAnalyzer: Service graph, port mapping producing uko-infra: triples
  • All analyzers include provenance (sourceResource, sourcePath, sourceRange, validFrom)
  • All analyzers use the most specific UKO vocabulary layer available
  • Unit tests for each analyzer with sample input files
  • Integration test: analyze a Python file, verify correct triples produced

Related Issues

  • Depends on: UKO Layer 1 Domain Ontologies (for triple vocabularies)
  • Used by: UKOIndexer / Real-time Index Sync (#578)
  • Parent epic: #396 (ACMS Context Pipeline)
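
The `AnalyzerRegistry` with resource-type-based lookup named in the acceptance criteria might look something like this sketch. The `register` and `lookup` method names, the `Resource` shape, and the `SuffixAnalyzer` helper are hypothetical; the issue specifies only that lookup is keyed by resource type.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class Resource:
    # Hypothetical minimal resource: identifier plus a type tag.
    uri: str
    resource_type: str


class Analyzer(Protocol):
    def can_handle(self, resource: Resource) -> bool: ...

    def analyze(self, resource: Resource) -> list: ...


class AnalyzerRegistry:
    """Maps a resource type to candidate analyzers, in registration order."""

    def __init__(self) -> None:
        self._by_type: dict[str, list[Analyzer]] = {}

    def register(self, resource_type: str, analyzer: Analyzer) -> None:
        self._by_type.setdefault(resource_type, []).append(analyzer)

    def lookup(self, resource: Resource) -> Optional[Analyzer]:
        # Narrow by type first, then let each candidate confirm via can_handle.
        for analyzer in self._by_type.get(resource.resource_type, []):
            if analyzer.can_handle(resource):
                return analyzer
        return None


class SuffixAnalyzer:
    """Hypothetical analyzer that claims resources by URI suffix."""

    def __init__(self, suffix: str) -> None:
        self.suffix = suffix

    def can_handle(self, resource: Resource) -> bool:
        return resource.uri.endswith(self.suffix)

    def analyze(self, resource: Resource) -> list:
        return []
```

Keeping several analyzers per type and deferring the final choice to `can_handle` lets, for example, Python and Rust analyzers share one source-code type.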

Suggested Milestone

v3.5.0

Priority

Medium

Suggested Assignee

@hamza.khyari — ACMS pipeline/RDF specialist

Definition of Done

This issue is complete when:

  • All subtasks below are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
freemo added this to the v3.5.0 milestone 2026-03-05 00:30:29 +00:00
Blocks
#369 Epic: Large Project Autonomy & Context
cleveragents/cleveragents-core
Reference
cleveragents/cleveragents-core#588