[Milestone Review] v3.7.0 — Holistic Integration Review (2026-04-03) #2113

Open
opened 2026-04-03 04:10:19 +00:00 by freemo · 3 comments
Owner

Milestone Review: v3.7.0 — M8: TUI Implementation

Overall Assessment: FAIL

The v3.7.0 milestone is not production-ready. Of the seven quality gates listed under Test Results, five fail on master, the integration tests time out, and only lint passes. The milestone is only ~31% complete (66 closed / 210 total issues), and critical integration gaps exist between independently merged PRs.


Test Results

| Gate | Status | Details |
|------|--------|---------|
| Lint (`nox -e lint`) | ✅ PASS | All checks passed |
| Typecheck (`nox -e typecheck`) | ❌ FAIL | 5 Pyright errors (see #2109) |
| Unit Tests (`nox -e unit_tests`) | ❌ FAIL | `AmbiguousStep`: duplicate step definition blocks all tests (see #1609) |
| Security Scan (`nox -e security_scan`) | ❌ FAIL | Vulture false positives in extension_protocols.py (see #2110) |
| Dead Code (`nox -e dead_code`) | ❌ FAIL | Same vulture findings as security_scan |
| Coverage (`nox -e coverage_report`) | ❌ FAIL | Cannot compute: unit tests blocked by `AmbiguousStep`; last partial report shows 37% (well below the 97% threshold) |
| Integration Tests (`nox -e integration_tests`) | ⏱️ TIMEOUT | Exceeded 10-minute timeout |

Milestone Completion

  • Closed issues: 66
  • Open issues: ~144 (~80 substantive issues once UAT-SUPERVISOR progress reports are excluded)
  • Completion rate: ~31%
  • Open PRs: 30+ (many stale or in review)
  • Merged PRs: ~60

Critical Integration Gaps Found

1. Ambiguous Behave Step Definition Blocks ALL Unit Tests (existing #1609)

Two independently-implemented TUI features collide at the step definition level:

  • features/steps/tui_thought_block_steps.py:126 defines @then('the rendered text should contain "{text}"')
  • features/steps/tui_first_run_steps.py:236 defines the same step

This prevents all unit tests from running. The issue was introduced by the combination of PR #1298 (thought block rendering, issue #1238) and PR #1391 (first-run experience, issue #1248). Neither PR alone would cause this — it's a classic integration gap.

Impact: Blocks nox -e unit_tests and nox -e coverage_report entirely.
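
A collision of this kind can be caught before merge with a simple text scan of the step files. The following is a minimal sketch, not the project's tooling: `find_duplicate_steps` and `load_step_sources` are hypothetical helpers, and the check only finds decorators whose pattern text is identical. Behave's own ambiguity detection works on pattern matching and is broader than this.

```python
import re
from collections import defaultdict
from pathlib import Path

# Matches behave-style decorators such as @then('...') or @given("...").
# Assumes each decorator sits on one line with a plain string literal.
STEP_DECORATOR = re.compile(
    r"""^\s*@(given|when|then|step)\(\s*(?P<quote>['"])(?P<pattern>.*)(?P=quote)\s*\)""",
    re.IGNORECASE,
)


def find_duplicate_steps(sources):
    """Return {(keyword, pattern): [(file, line), ...]} for repeated definitions.

    `sources` maps a file name to that file's text.
    """
    seen = defaultdict(list)
    for name, text in sources.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            match = STEP_DECORATOR.match(line)
            if match:
                key = (match.group(1).lower(), match.group("pattern"))
                seen[key].append((name, lineno))
    # Keep only patterns that were registered more than once.
    return {key: locs for key, locs in seen.items() if len(locs) > 1}


def load_step_sources(steps_dir):
    """Read every *.py file in a steps directory into a {path: text} dict."""
    return {
        str(path): path.read_text(encoding="utf-8")
        for path in sorted(Path(steps_dir).glob("*.py"))
    }
```

Calling `find_duplicate_steps(load_step_sources("features/steps"))` from a lint session would report both locations of a repeated pattern, which is the information needed to rename or consolidate the step.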

2. Session CLI References Non-Existent Domain Model Fields (new #2109)

Multiple session CLI PRs (#1566, #1567, #1569, #1481) were merged independently, each referencing domain model attributes that don't exist:

  • Session.automation_profile — field never added to Session model
  • SessionService.list_messages() — method never implemented
  • session_service.py checksum logic — concatenates "sha256:" with None and dict

Impact: Blocks nox -e typecheck; would crash at runtime.
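
The checksum problem is the easiest of the three to illustrate: adding `None` or a `dict` to the string `"sha256:"` raises `TypeError` at runtime, which is what the type checker is reporting. A minimal sketch of one way to build the value, assuming the payload is JSON-serialisable; `compute_checksum` is a hypothetical name, and treating `None` as an empty payload is an assumption rather than the project's documented behaviour:

```python
from __future__ import annotations

import hashlib
import json
from typing import Any, Optional


def compute_checksum(payload: Optional[dict[str, Any]]) -> str:
    """Return a "sha256:<hex digest>" string for a JSON-serialisable payload."""
    # Assumption: a missing payload hashes the same as an empty one.
    data = payload if payload is not None else {}
    # Sorted keys and fixed separators make the serialisation deterministic,
    # so equal payloads always produce equal checksums.
    canonical = json.dumps(data, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    # Both operands are now str, so the concatenation type-checks.
    return "sha256:" + digest
```

Hashing a canonical serialisation instead of the raw object keeps the prefix concatenation between two strings and makes the result independent of dictionary key order.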

3. Vulture False Positives in Extension Protocols (new #2110)

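Vulture flags Protocol method parameters in extension_protocols.py as dead code. These are false positives: the parameters are part of the interface signature even though the stub bodies never read them.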

Impact: Blocks nox -e security_scan and nox -e dead_code.
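
To see why such parameters look unused, consider a small Protocol. This is an illustrative sketch only: `ExtensionHook`, `LoggingHook`, and `dispatch` are hypothetical names and do not come from extension_protocols.py.

```python
from __future__ import annotations

from typing import Protocol, runtime_checkable


@runtime_checkable
class ExtensionHook(Protocol):
    """Hypothetical extension interface."""

    def on_event(self, event_name: str, payload: dict) -> None:
        # The body is a stub, so event_name and payload are never read here.
        # A static dead-code scan sees two unused arguments.
        ...


class LoggingHook:
    """Concrete implementation that does use both parameters."""

    def __init__(self) -> None:
        self.seen: list[str] = []

    def on_event(self, event_name: str, payload: dict) -> None:
        self.seen.append(f"{event_name}:{sorted(payload)}")


def dispatch(hook: ExtensionHook, event_name: str, payload: dict) -> None:
    # Callers may pass arguments by keyword, so the parameter names in the
    # Protocol are part of the contract and cannot simply be renamed.
    hook.on_event(event_name=event_name, payload=payload)
```

Because the names are part of the contract, suppressing the findings is usually safer than renaming the parameters. Vulture supports whitelist files and an `--ignore-names` option for this; which of the two fits the nox sessions here is left open.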

Open Must-Have and High-Priority Bugs

The following critical bugs remain open in v3.7.0:

| # | Priority | Title |
|---|----------|-------|
| #1389 | Must Have | TUI input bypasses A2A; violates spec requirement for exclusive A2A communication |
| #1737 | Must Have | `Plan.can_revert_to()` allows forward phase transitions |
| #1419 | Must Have | Review for hardcoded secrets in services |
| #1843 | High | `SkillRegistry.validate_skill()` crashes with `AttributeError` |
| #1848 | High | `NamespacedProject` missing `invariants` fields |
| #1844 | High | `NamespacedProjectModel.to_domain()` drops invariants |
| #1900 | High | `ToolRegistry.list_tools()` filter is a no-op |
| #1914 | High | `UnitOfWorkContext` exposes legacy `ProjectRepository` |
| #1892 | High | `ProviderRegistry.create_llm()` fails for OpenRouter |
| #1762 | High | Domain layer missing repository Protocol interfaces |
| #1779 | High | `PlanLifecycleService.revert_plan()` allows forward transitions |
| #2048 | High | `plan tree` JSON/YAML output missing envelope |
| #2047 | High | `resource type add --update` not working |
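
Issues #1737 and #1779 describe the same class of defect: a revert operation that accepts a target later than the current phase. A minimal sketch of the guard such a check needs, assuming phases are totally ordered. The `Phase` members below are hypothetical placeholders, not the project's actual plan phases, and rejecting a revert to the current phase is also an assumption.

```python
from enum import IntEnum


class Phase(IntEnum):
    """Hypothetical plan phases, ordered from earliest to latest."""

    DRAFT = 1
    REVIEW = 2
    APPROVED = 3
    EXECUTING = 4


def can_revert_to(current: Phase, target: Phase) -> bool:
    """Allow a revert only when the target is strictly earlier than current."""
    # A forward move (target > current) is a normal transition, not a revert.
    # target == current is treated as a no-op and rejected as well.
    return target < current
```

For example, `can_revert_to(Phase.APPROVED, Phase.DRAFT)` is `True`, while `can_revert_to(Phase.DRAFT, Phase.APPROVED)` is `False`.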

Open Features Still In Review

Several core TUI features remain unmerged:

  • #694 — MainScreen with sidebar states and Dracula theme (21 points)
  • #998 — SessionsScreen (8 points, Critical priority)
  • #995 — SettingsScreen (8 points, Critical priority)
  • #696 — TuiMaterializer A2A integration layer (13 points)
  • #1006 — Full conversation stream block type catalog (5 points)
  • #927 — Server auth/authorization/namespace service (13 points)
  • #872 — ACMS graph backend (13 points)

API Consistency Assessment

The merged work shows moderate consistency issues:

  • Session CLI commands assume domain model fields that don't exist (automation_profile)
  • A2A wire format was corrected (PRs #1990, #1841) — good integration work
  • CLI rich output panels follow a consistent pattern across merged PRs
  • 15 TODO comments remain in production code, indicating incomplete implementations

Specification Coverage

The specification defines the TUI as the primary scope for v3.7.0. Current coverage:

  • ✅ Shell danger detection patterns
  • ✅ Slash command catalog (60+ commands)
  • ✅ Actor thought block rendering
  • ✅ Help panel (F1)
  • ✅ Session export/import
  • ✅ First-run experience
  • ✅ PermissionsScreen with diff view
  • ❌ MainScreen with sidebar states (in review)
  • ❌ SessionsScreen (in review)
  • ❌ SettingsScreen (in review)
  • ❌ TuiMaterializer A2A integration (in review)
  • ❌ Block context menu (in review)
  • ❌ Tool call expand states (in review)
  • ❌ Persona export/import (in review)
  • ❌ Session-driven interactive editing (open)
  • ❌ Permission Question Widget (in review)

Issues Created by This Review

| # | Type | Title |
|---|------|-------|
| #2109 | Bug (Critical) | Pyright type errors in session CLI: Session.automation_profile, SessionService.list_messages(), checksum concatenation |
| #2110 | Refactoring (High) | Vulture false positives in extension_protocols.py fail security_scan gate |

Recommendations

  1. Immediate priority: Fix #2109 (typecheck), #2110 (vulture), and #1609 (ambiguous steps) to restore all quality gates to passing. These three issues block the entire CI pipeline.

  2. Enforce pre-merge typecheck: The session CLI type errors (#2109) indicate PRs were merged without running nox -e typecheck. Consider making typecheck a required CI check before merge.

  3. Step definition naming convention: Issue #1609 proposes a domain-prefixed step naming convention. This should be prioritized to prevent future collisions as more TUI features are merged.

  4. Focus on Must-Have bugs: Three Must-Have bugs (#1389, #1737, #1419) should be resolved before any new feature work.

  5. Stale PR cleanup: The milestone has 30+ open PRs, many of them weeks old. Consider closing stale PRs and re-creating them from fresh branches.

  6. Milestone is not ready for release: With only ~31% completion and five of the seven quality gates failing, v3.7.0 needs significant additional work. Consider splitting the remaining work into a v3.7.1 or extending the milestone timeline.


Automated by CleverAgents Bot
Supervisor: Implementation | Agent: ca-milestone-reviewer

freemo added this to the v3.7.0 milestone 2026-04-03 04:10:23 +00:00
freemo self-assigned this 2026-04-03 16:58:05 +00:00
Author
Owner

MoSCoW classification: Could Have

Rationale: This is a low-priority or backlog item. Desirable but not necessary for the milestone. Include only if time permits.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: ca-project-owner

Author
Owner

Label compliance fix applied:

  • Removed conflicting label: State/Unverified
  • Kept: State/Verified
  • Reason: Issue had two conflicting State/* labels. Per CONTRIBUTING.md, only one State/* label is permitted. Kept the more advanced state (State/Verified).

Automated by CleverAgents Bot
Supervisor: Backlog Grooming | Agent: ca-backlog-groomer

Author
Owner

Label compliance fix applied:

  • Removed conflicting label: State/Unverified
  • Retained: State/Verified
  • Reason: Issue had two conflicting State/* labels simultaneously. Per CONTRIBUTING.md, only one State/* label is permitted. State/Verified is the more advanced state and has been retained.

Automated by CleverAgents Bot
Supervisor: Backlog Grooming | Agent: ca-backlog-groomer

Reference
cleveragents/cleveragents-core#2113