feat(tui): wire normal text input to LLM via A2A facade #11231
No reviewers
Labels
No labels
auto/needs-reevaluation
controller-managed
auto/blocked-by-deps
auto/ci-timeout
auto/claimed-implementer
auto/claimed-merge
auto/claimed-reviewer
auto/driver-down
auto/invariant-violation
auto/last-attempt-tier-0
auto/last-attempt-tier-1
auto/last-attempt-tier-2
auto/last-attempt-tier-min
Automation Tracking
auto/needs-conflict-resolution
auto/needs-implementer
auto/postmortem
auto/ready-to-merge
auto/restart-throttled
auto/revert
auto/sentinel
auto/stale-inactivity
auto/unstable
Blocked
Bounty
$100
Bounty
$1000
Bounty
$10000
Bounty
$20
Bounty
$2000
Bounty
$250
Bounty
$50
Bounty
$500
Bounty
$5000
Bounty
$750
MoSCoW
Could have
MoSCoW
Must have
MoSCoW
Should have
Needs Feedback
Points
1
Points
13
Points
2
Points
21
Points
3
Points
34
Points
5
Points
55
Points
8
Points
88
Priority
Backlog
Priority
CI Blocker
Priority
Critical
Priority
High
Priority
Low
Priority
Medium
Signed-off: Owner
Signed-off: Scrum Master
Signed-off: Tech Lead
Spike
State
Completed
State
Duplicate
State
In Progress
State
In Review
State
Paused
State
Unverified
State
Verified
State
Wont Do
Type
Automation
Type
Bug
Type
Discussion
Type
Documentation
Type
Epic
Type
Feature
Type
Legendary
Type
Refactor
Type
Support
Type
Task
Type
Testing
No project
No assignees
3 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
cleveragents/cleveragents-core!11231
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "feature/m8-tui-llm-dispatch"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
Wires the TUI normal text input path to the LLM via the existing
A2aLocalFacade. Previously, typing a message in the TUI simply echoed the text back — no LLM call was made. This PR closes that gap without introducing any new infrastructure.Changes
Production code
tui/commands.py—_create_tui_session(): creates a real database-backed session at TUI startup viaSessionService.create(), replacing the hardcodedsession_id="default"tui/commands.py—_build_tui_facade(): builds a fully-wiredA2aLocalFacadewithSessionWorkflow+ProviderRegistryregistered, following the same pattern ascli/commands/session.pytui/commands.py—run_tui(): passesfacadeandsession_idintoCleverAgentsTuiApptui/app.py—__init__: acceptsfacade: A2aLocalFacade | Noneandsession_id: strparamstui/app.py—on_input_submitted: replaces the dead-endconversation.update(preview)with:facade.dispatch(A2aRequest(method="message/send", ...))in arun_worker(thread=True)worker to keep the Textual event loop responsiveSessionActorNotConfiguredErrorcaught and shown as a user-friendly messagefacade=NoneTests
features/tui_llm_dispatch.feature+features/steps/tui_llm_dispatch_steps.py— 6 Behave BDD scenarios covering: normal dispatch, no-facade fallback,SessionActorNotConfiguredError, A2A error response, real session creation, session service fallbackrobot/tui_llm_dispatch.robot+robot/helper_tui_llm_dispatch.py— 4 Robot Framework integration tests usingFakeListLLM(no real API keys required): session creation, facade wiring, end-to-end message dispatch, no-actor error handlingHow it works
Closes #11230
PR Review Summary
Scope Verification
This PR wires the TUI normal text input path to the LLM via
A2aLocalFacade.dispatch(), closing issue #11230. All five concrete implementation steps from the issue were completed as subtasks.Specification Alignment
The approach follows the spec correctly (docs/specification.md lines 55, 166): every client — including TUI — communicates exclusively through A2A.
message/sendmaps toSessionWorkflow.tell()in local mode viaA2aLocalFacade. This PR is spec-compliant.CI Status
FAILING — Two jobs fail:
CI / integration_tests (pull_request)— failing after 4m16sCI / status-check (pull_request)— failing after 8s (likely due to integration_tests)The 5 required-for-merge gates all pass: lint, typecheck, security, unit_tests, coverage.
Detailed Review by Category
1. CORRECTNESS — PASS
All acceptance criteria from issue #11230 are addressed:
A2aLocalFacade.dispatch(A2aRequest(...))with method="message/send" ✓_on_llm_done) ✓run_worker(thread=True)✓SessionActorNotConfiguredErrorcaught and shown as friendly message ✓_create_tui_session()replacing hardcoded "default" ✓facade=None✓on_input_submitted()is untouched ✓2. SPECIFICATION ALIGNMENT — PASS
The PR follows the spec exactly: TUI routes through A2A facade using standard
message/sendoperation. This matches the architecture described in specification.md lines 55 and 166.3. TEST QUALITY — STRONG (with minor gap)
Behave BDD: 6 scenarios covering normal dispatch, no-facade fallback, SessionActorNotConfiguredError, generic error response, session creation, and session service fallback step coverage is excellent.
Robot Framework integration: 4 scenarios using FakeListLLM with real SQLite DB (no API keys) cover session creation, facade wiring, end-to-end dispatch, and no-actor error handling. The helper script uses sentinel-based output verification which is clean.
Minor gap: No dedicated test explicitly verifies the
thread=Truenon-blocking behaviour of the worker. A simple assertion checking thatrun_worker(..., thread=True)was called would close this (e.g., via mock patching ofTextualApp.run_worker).4. TYPE SAFETY — BLOCKER
The return type annotation on
_build_tui_facade()is-> Any(commands.py line 226). This should be typed as:The function clearly returns either an
A2aLocalFacadeorNone, soAnydiscards useful type information. Since the return flows directly intoCleverAgentsTuiApp.__init__(facade: A2aLocalFacade | None), this is a meaningful loss of type safety.5. READABILITY — GOOD
Names are descriptive (
_build_tui_facade,_create_tui_session,_dispatch_llm,_on_llm_done). Logic flow is easy to follow. Comments explain the "why" (e.g., non-blocking worker rationale).6. PERFORMANCE — PASS
No unnecessary inefficiencies detected. Worker thread delegation keeps the event loop responsive.
7. SECURITY — PASS
No hardcoded secrets. Input flows through A2A request objects which are validated by the facade. The
except Exceptionin_dispatch_llm()catches and displays error messages (not raw stack traces), which is safe for UI display.8. CODE STYLE — GOOD
Follows SOLID principles. Files are well within 500 lines. The fallback pattern with
contextlib.suppress(Exception)in commands.py mirrors the graceful degradation approach elsewhere in the codebase.Minor note:
_build_tui_facade()catches bareExceptionat the outer level. While this is intentional for graceful degradation, consider adding a logging call before returning None so the developer can diagnose facade build failures:9. DOCUMENTATION — GOOD
All new functions have docstrings. The PR description includes a clear "How it works" diagram.
10. COMMIT AND PR QUALITY — PASS
feat(tui): wire normal text input to LLM via A2A facade✓feature/m8-tui-llm-dispatch✓Recommendation
REQUEST_CHANGES for the following items:
_build_tui_facade()return type from-> Anyto-> "A2aLocalFacade | None"(line ~226 in commands.py).The overall code quality is strong and the implementation correctly addresses all acceptance criteria. The CI integration_tests failure appears related to the new Robot Framework test files and may require investigation into FakeListLLM availability or helper script execution in CI.
@ -211,0 +255,4 @@"Use /persona set <name> or select an actor first.")except Exception as exc: # pragma: no coverreturn f"[Error] {exc}"Suggestion: The nested
_dispatch_llm()and_on_llm_done()closures insideon_input_submittedcould be extracted as private methods on the class. This would improve testability (each closure can be tested independently) and reduce nesting depth. Currently, these functions captureconversation,session_id, andfacadefrom enclosing scope — moving them to instance methods makes this more explicit.@ -223,6 +223,67 @@ class TuiCommandRouter:return f"Import failed: {exc}"def _build_tui_facade() -> Any:Type safety concern: The return type annotation
-> Anydiscards meaningful type information. This function clearly returns either anA2aLocalFacadeorNone. Consider typing it as:This would also make the downstream assignment in
run_tui()more type-safe.@ -226,0 +257,4 @@passif session_svc is not None:facade.register_service("session_workflow",Suggestion: The outer
except Exceptionat ~line 260 is intentional for graceful degradation, but consider adding a warning log before returning None so facade build failures are diagnosable:This helps developers debug why the TUI is falling back to preview-only mode.
Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
Review of PR #11231 — feat(tui): wire normal text input to LLM via A2A facade
Previous Feedback
This is a first review — no previous feedback to address.
Overall Assessment
The PR correctly implements all 6 acceptance criteria from issue #11230:
_create_tui_session()_build_tui_facade()followingcli_bootstrap.pypatternson_input_submitted()with properA2aRequest(method="message/send")SessionActorNotConfiguredErrorProduction code quality is solid: 316-line
commands.pyand 270-lineapp.py, both under the 500-line threshold. Type annotations are complete, SOLID principles followed, documentation thorough.Blocking Issues
BLOCKER #1: CI failing — integration_tests (4m16s) and status-check (8s)
Per CONTRIBUTING.md review guidelines: "If CI is failing → ask the author to fix it BEFORE you review."
Required checks that are green: push-validation, helm, build, quality, lint, typecheck, security, unit_tests, coverage, docker.
Failed:
integration_testsandstatus-check. The new integration tests (robot helper at 219 lines with 4 subcommand functions) appear to be causing CI failures. Please investigate and fix the integration test failures before this PR can be approved for merge.BLOCKER #2:
# type: ignore[method-assign]presentFile:
robot/helper_tui_llm_dispatch.py, line 60:The project policy states ZERO tolerance for
# type: ignore. This is in a test helper file (not production code), and the use case (method reassignment via lambda) is legitimate but should be resolved differently. Suggestions:typing.cast()if the type narrowing is justifiedNon-blocking Observations (SUGGESTIONS)
_build_tui_facade()interaction with cached facade: The function callsget_facade()which returns the global cached instance fromcli_bootstrap.py. This cached instance already wiressession_servicefrom_build_facade(). The PR then optionally re-registers it and addsSessionWorkflow. This dual-path behavior (overwrite existing vs fresh build) could be confusing. Consider documenting why the override pattern is chosen over a clean rebuild._build_tui_facade()return type annotation: ReturnsAnyto accommodateNonein failure cases. The function signature would be more precise as-> A2aLocalFacade | None, but this requires importingA2aLocalFacadeat module level. If this import causes circular dependency issues, consider keepingAnyor deferring the type to a TYPE_CHECKING block.Test helper uses direct subprocess execution: The Robot test helper (
helper_tui_llm_dispatch.py) usessys.argvdispatching and separate process invocation. This works but adds startup overhead. Consider whetherpytest --tb=shortwith mock fixtures could replace the Robot tests for faster feedback.Test Coverage Assessment
Behave unit tests (
features/tui_llm_dispatch.feature+ steps):Robot integration tests (
robot/tui_llm_dispatch.robot+ helper):Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
PR Review Summary — #11231 (feat(tui): wire normal text input to LLM via A2A facade)
Review Outcome: REQUEST_CHANGES
This PR implements all acceptance criteria from issue #11230. The code quality is solid, test coverage comprehensive, and the approach follows existing architekural patterns. However, two blocking issues prevent approval:
integration_tests(4m16s) andstatus-check(8s) are red. All other checks (lint, typecheck, security, unit_tests, coverage, etc.) pass.# type: ignore[method-assign]present in test helper file — violates zero-tolerance policy even in non-production code.Please fix these issues and push a new commit to proceed with review.
Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
PR Review Summary
PR #11231: feat(tui): wire normal text input to LLM via A2A facade
CI Status: integration_tests check is FAILING (all other checks pass: lint, typecheck, security, unit_tests, coverage). The
status-checkwrapper also fails as a result.1. CORRECTNESS ✅ PASS
The implementation correctly wires TUI normal text input to the LLM via A2aLocalFacade as specified in issue #11230. The flow is sound:
_build_tui_facade()returns a fully-wiredA2aLocalFacadewith SessionWorkflow registered, orNoneon any failure._create_tui_session()creates a real DB-backed session viaSessionService.create(), with graceful fallback to"default".on_input_submitted(NORMAL mode)dispatches through the A2A facade in a worker thread and renders the assistant response.2. SPECIFICATION ALIGNMENT ⚠️ NEEDS ATTENTION
This PR aligns with the M8 TUI scope (ADR-044, TuiMaterializer A2A integration). The architecture follows the pattern established in
cli/commands/session.pyfor facade wiring. No spec departures detected.3. TEST QUALITY ✅ PASS (6 Behave scenarios + 4 Robot integration tests)
tui_llm_dispatch.feature+steps/tui_llm_dispatch_steps.py):MockApp,MockStatic,_Worker) that accurately mirrors the productionrun_workerAPI.tui_llm_dispatch.robot+helper_tui_llm_dispatch.py):4. TYPE SAFETY ✅ PASS
All function signatures are fully annotated. Key annotations:
facade: A2aLocalFacade | None = None— correct Unions for optional dependency injectionsession_id: str = "default"— clear string typing with reasonable default_build_tui_facade() -> Anyand_create_tui_session() -> str— appropriate for functions that may failTYPE_CHECKINGguard (line 26)# type: ignore5. READABILITY ✅ PASS
_build_tui_facade(),_create_tui_session(),_dispatch_llm(),_on_llm_done().session_id/facadecaptures is clear and well-commented.previewtoexpandedis consistent within the changed block.6. PERFORMANCE ✅ PASS
thread=True) correctly offloads blocking LLM call preventing event-loop starvation.exclusive=Falseparameter appears intentional — multiple workers can run concurrently.7. SECURITY ✅ PASS
actor_name=Nonefor unauthenticated setup (acceptable)._dispatch_llm()closure.exception excis caught and rendered with[Error]tag — prevents information leakage in production?8. CODE STYLE ✅ PASS
contextlib.suppress(Exception)consistently for graceful degradation.9. DOCUMENTATION ✅ PASS
_build_tui_facade(),_create_tui_session()).10. COMMIT AND PR QUALITY ✅ PASS
feat(tui): wire normal text input to LLM via A2A facadeISSUES CLOSED: #11230.Closes #11230).🚩 BLOCKING: CI Failing - integration_tests
Status:
CI / integration_tests (pull_request)— FAILING.The PR description mentions 4 Robot Framework integration tests that use FakeListLLM, but the CI integration_tests job is failing. This may be related to:
Per company policy, all required CI gates must pass before merge. The integration_tests failure is blocking — it cannot be dismissed without understanding the root cause.
The PR author should verify that
nox -s integration_testspasses locally before this can be approved and merged.Recommendation: REQUEST_CHANGES (CI blocking only). All 10 review categories pass for code quality. The sole blocker is the failing integration_tests CI check, which must be resolved before merge.
Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
Review Summary
PR: feat(tui): wire normal text input to LLM via A2A facade (#11231)
Issue Closed: #11230
1. CORRECTNESS - BLOCKING BUG IN TESTS
The implementation correctly wires TUI normal text input through the A2aLocalFacade dispatch mechanism. The flow (user input -> InputModeRouter -> facade.dispatch(A2aRequest(method="message/send")) -> SessionWorkflow.tell()) is sound.
However, a blocking bug was found in robot/helper_tui_llm_dispatch.py line 51:
PersistentSessionService(session_factory=session_factory)PersistentSessionService(session_repo: SessionRepository, message_repo: SessionMessageRepository, event_bus=None)All other test helpers use proper repository instances (features/steps/session_persistence_steps.py, robot/helper_session_tell_llm.py line 100). This causes ALL integration tests to fail with TypeError.
2. SPECIFICATION ALIGNMENT
The spec states: every client operation flows through A2A regardless of deployment mode. The PR perfectly aligns: normal text input is routed through facade.dispatch(A2aRequest(method="message/send", ...)) with SessionWorkflow.
3. TEST QUALITY
4. TYPE SAFETY
All new function signatures annotated. No
# type: ignoreadditions. Acceptable use ofAnyfor the facade return type.5. READABILITY
Clear naming conventions. Well-structured modular design. Good follow-up of CLI patterns from cli/commands/session.py.
6. PERFORMANCE
Using
self.run_worker(thread=True, exclusive=False)keeps Textual event loop responsive during LLM API round-trips. No inefficiencies detected.7. SECURITY
No hardcoded secrets or injection vectors. A2aRequest params through service layer validation.
8. CODE STYLE
SOLID principles followed. Files within 500-line limits (270 and 316 lines). Follows ruff conventions.
9. DOCUMENTATION
All new functions have docstrings. Good inline comments explaining design decisions.
10. COMMIT AND PR QUALITY
CI Status
@ -0,0 +48,4 @@PersistentSessionService,)return PersistentSessionService(session_factory=session_factory)BLOCKING: PersistentSessionService constructor mismatch at line 51.
The test helper calls:
But the actual signature is:
The PR author assumed
PersistentSessionServiceaccepts asession_factorylike other database services. All other test helpers use proper repository instances (e.g., robot/helper_session_tell_llm.py:100):Fix: Update
_make_session_service()to create properSessionRepositoryandSessionMessageRepositoryinstances from the SQLAlchemy engine, matching the pattern used by robot/helper_session_tell_llm.py. This will fix all 4 integration test cases and unblock CI.Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
Review of PR #11231: feat(tui): wire normal text input to LLM via A2A facade
No Previous Feedback Items to Verify
This is the first review. There are no previous REQUEST_CHANGES reviews.
1. CORRECTNESS -- BLOCKING ISSUES FOUND
BLOCKER 1: Wrong A2aLocalFacade constructor arguments (commands.py,
_build_tui_facade()lines 4-5)The constructor call passes keyword arguments that do not exist on
A2aLocalFacade:The actual constructor signature is:
def init(self, services: dict[str, Any] | None = None) -> None
This causes an immediate TypeError at runtime. The fix should register services properly.
BLOCKER 2: Non-existent SessionService method (commands.py,
_create_tui_session()lines 4-7)SessionService has no create_or_get() method. Available methods are:
The fallback to "default" also silently masks errors.
BLOCKER 3: Wrong method signature (commands.py,
_build_tui_facade()line 6)Based on CLI code, this is called without arguments. Passing session_workflow is incompatible.
2. SPECIFICATION ALIGNMENT
The PR conceptually aligns with A2A spec but the incorrect constructor calls diverge from implementation reality.
3. TEST QUALITY
The PR description claims Behave scenarios and Robot tests in new files, but none of these test files exist (features/tui_llm_dispatch.feature, features/steps/tui_llm_dispatch_steps.py, robot/tui_llm_dispatch.robot, robot/helper_tui_llm_dispatch.py). Without actual test files on disk I cannot verify coverage or correctness claims. All test files must be added.
4. TYPE SAFETY
No # type: ignore found -- good. New function signatures are properly annotated.
5-9. Readability, Performance, Security, Style, Documentation
Good readability with clear comments. No performance issues. No security concerns. Files under 500 lines. Docstrings present on new functions.
10. COMMIT AND PR QUALITY
Title follows Conventional Changelog format. Closes #11230. Has Type/Feature label. Milestone v3.7.0 assigned. Single commit scope (M8). Good.
CI Status
integration_tests job failed -- consistent with the blocking runtime issues found above.
Summary
Decision: REQUEST_CHANGES
Three critical blockers must be fixed:
Plus missing test files. Fix these and re-submit for review.Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
PR Review Summary
CI Status: Failing. Of the 5 required-for-merge gates (lint, typecheck, security, unit_tests, coverage), all 5 pass. However the non-merge-gate
integration_testschecks fail after 4m16s, causing the aggregatestatus-checkto report failure. Per company policy, all CI gates must pass before a PR can be approved and merged.Detailed Review by Category
1. CORRECTNESS \u2014 PASS (on TUI-specific changes)
All five acceptance criteria from issue #11230 are met:
A2aLocalFacade.dispatch(A2aRequest(...))with method\u003d"message/send" OK_on_llm_done) OKrun_worker(thread=True)OKSessionActorNotConfiguredErrorcaught and shown as friendly message OK_create_tui_session()replacing hardcoded "default" OKfacade=NoneOK2. SPECIFICATION ALIGNMENT \u2014 PASS
The approach follows the spec correctly: every client \u2014 including TUI \u2014 communicates exclusively through A2A.
message/sendmaps toSessionWorkflow.tell()in local mode viaA2aLocalFacade.3. TEST QUALITY \u2014 STRONG
Behave BDD: 6 well-named scenarios covering normal dispatch, no-facade fallback, SessionActorNotConfiguredError, A2A error response, real session creation, and session service fallback.
Robot Framework Integration: 4 tests using FakeListLLM (no real API keys): session creation, facade wiring, end-to-end dispatch, no-actor error handling.
Mock Textual infrastructure in step definitions mirrors the pattern from
tui_first_run_steps.py. Note: mocks/fakes are defined inline rather than infeatures/mocks/— this is a minor deviation but acceptable for test helpers that import real production code.4. TYPE SAFETY \u2014 PASS
facade: A2aLocalFacade | Noneannotated,session_id: str = \"default\"onCleverAgentsTuiApp.__init__. Zero# type: ignorefound.Suggestion:
_build_tui_facade()returnsAnyinstead of the properA2aLocalFacade | None. Consider fixing the return annotation.5. READABILITY \u2014 PASS
Clear naming (
_dispatch_llm,_on_llm_done,_build_tui_facade,_create_tui_session). The two-closure pattern (nested functions for worker) is idiomatic Textual.6. PERFORMANCE \u2014 PASS
Worker thread keeps event loop responsive. No unnecessary allocations.
7. SECURITY \u2014 PASS (TUI-only section)
No hardcoded secrets or credentials in the TUI dispatch code. The error path uses
f\"[Error] {exc}\"which may expose internal details — but this is only for generic exceptions (not CleverAgentsError) in a local TUI context, and severity is very low compared to server-facing surfaces.8. CODE STYLE \u2014 PASS
SOLID principles followed: the
_build_tui_facadeand_create_tui_sessionfunctions are single-purpose with try/except-based graceful degradation. Files under 500 lines (app.py ~300, commands.py added ~70).9. DOCUMENTATION \u2014 PASS
All new public functions have docstrings:
_build_tui_facade(),_create_tui_session()— the docstring is comprehensive and references similar patterns incli/commands/session.py.10. COMMIT AND PR QUALITY \u2014 REJECT (BLOCKING)
CRITICAL: PR scope bloat. The PR body claims 6 changed files with ~832 additions and 6 deletions. The actual diff shows 27 files changed with 873 insertions and 1046 deletions. The extra changes span at least 4 distinct feature areas:
Each of these changes belongs to a separate Epic and should have been submitted as separate PRs. CONTRIBUTING.md requires: "ONE EPIC SCOPE PER PR" and the rule that if describing work requires "and" between unrelated actions, split into commits.
CI:
integration_testsfails (4m16s). This may or may not be caused by the non-TUI changes — it needs to be investigated. Per policy, all CI gates must pass before merge.Summary of Required Actions
integration_testsfails, which causes overall status-check failure. The author must investigate and fix._build_tui_facade()should have a proper return annotation (A2aLocalFacade | None) instead ofAny.Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
PR Review — Re-Review Round (fresh evaluation)
This review was conducted by an independent PR review worker. The full review body is in this PR review's
REQUEST_CHANGESstatus.---Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
PR Review: REQUEST CHANGES. CI Status: Failing. Five required-for-merge gates pass. Integration_tests fails after 4m16s causing status-check failure. Per company policy all CI gates must pass before merge.
10-Category Review
1. CORRECTNESS PASS. All acceptance criteria from issue #11230 addressed: normal text to LLM dispatch via A2aLocalFacade, assistant response rendered on worker completion, Textual event loop kept responsive via run_worker(thread=True), SessionActorNotConfiguredError handled gracefully real DB-backed session created at TUI startup graceful fallback when facade=None.
2. SPECIFICATION ALIGNMENT PASS. Spec-compliant: TUI routes through A2A facade with standard message/send operation.
3. TEST QUALITY STRONG. Behave: 6 BDD scenarios (normal dispatch no-facade fallback SessionActorNotConfiguredError A2A error response real session creation service fault). Robot Framework Integration: 4 tests using FakeListLLM no API keys.
4. TYPE SAFETY PASS. All annotations present zero type ignore comments. Suggestion: _build_tui_facade() return Any should be A2aLocalFacade | None.
5. READABILITY PASS. Descriptive function names clear closure pattern for worker thread dispatch.
6. PERFORMANCE PASS. Worker thread keeps Textual event loop responsive during LLM calls.
7. SECURITY PASS. No hardcoded secrets in TUI dispatch code.
8. CODE STYLE PASS. Single-purpose functions with graceful degradation files under 500 lines.
9. DOCUMENTATION PASS. All new public functions have docstrings referencing patterns in cli/commands/session.py.
10. COMMIT AND PR QUALITY FAIL (BLOCKING). CRITICAL: PR scope bloat. PR body states 6 changed files but actual diff shows 27 files changed with 873 insertions and 1046 deletions. Extra changes span at least 4 distinct feature areas: TUI LLM dispatch intended vs Auth middleware removal vs Invariant scope simplification vs DB URL sanitisation removal. CONTRIBUTING.md requires ONE EPIC SCOPE PER PR and atomic commits.
Summary
Automated by CleverAgents Bot
Supervisor: PR Review | Agent: pr-review-worker
9e06f93483cf98a96a32Code Review Report — PR #11231 (feature/m8-tui-llm-dispatch)
Scope
Reviewed all 5 commits on
feature/m8-tui-llm-dispatchagainst issue #11230 ("feat(tui): wire normal text input to LLM via A2A facade"). Scope:tui/app.py,tui/commands.py,features/tui_llm_dispatch.feature,features/steps/tui_llm_dispatch_steps.py,robot/tui_llm_dispatch.robot,robot/helper_tui_llm_dispatch.py.BUG — HIGH
1. Worker error silently swallowed in
_on_llm_donecallback (tui/app.py:290–293)The worker callback only checks
worker.result is not Nonebut entirely ignoresworker.error. In Textual, when a threaded worker function raises an exception,worker.resultisNoneandworker.errorholds the exception object. The UI already shows "⏳ Thinking…" at this point — the user receives zero feedback that the LLM call failed.Risks:
A2aOperationNotFoundError, network errors, database errors during dispatch all silently swallowed. The user sees a frozen "Thinking…" message with no indication anything went wrong.BUG — MEDIUM
2.
_run_llm_dispatchmissing domain exception handlers (tui/app.py:70–80)The A2A facade explicitly re-raises
SessionNotFoundErrorandDatabaseError(together withSessionActorNotConfiguredError) atfacade.py:212–216. The function catches onlySessionActorNotConfiguredError— the other two domain exceptions fall through to the genericexcept Exceptionand produce a terse"[Error] …"message. These could benefit from user-friendly messages similar to the no-actor guidance.TEST — MEDIUM
3. BDD step
step_facade_received_requesthardcodes message text (features/steps/tui_llm_dispatch_steps.py:177)The assertion
request.params["message"] == "hello there"is coupled to the exact text in a single scenario. Thestep_submit_textWhen-step accepts{text}as a parameter but does not store it on the context for reuse. Any new scenario that uses "facade should have received a message/send request" with different message text would fail.Fix: store the submitted text on context in
step_submit_textand assert againstcontext._submitted_textin the Then-step.4. Robot integration test
cmd_dispatch_messagehas weak assertion (robot/helper_tui_llm_dispatch.py:169–170)The test only checks
len(assistant_msg) > 0but never verifies the response content matches the stub (_STUB_TEXT = "Hello from the TUI integration stub LLM."). A regression that returns a wrong response message would pass this test.TEST — LOW
5. No test coverage for worker error path in
_on_llm_doneNeither the BDD tests (which bypass the worker entirely by calling
_run_llm_dispatchdirectly) nor the Robot tests cover the case where the worker thread raises an exception. Combined with issue #1 above, this is a gap in both implementation and verification.Summary
_on_llm_doneignoresworker.error— silent failure, stuck UI_run_llm_dispatchmissing handler forSessionNotFoundError/DatabaseErrorPositive Findings
facade=Nonepreserves preview-only fallbackSessionActorNotConfiguredErroron first messagecontextlib.suppress(Exception)in_build_tui_facadeand_create_tui_sessioncorrectly prevents TUI launch failures@references) preserved unchanged"default"Review Feedback Addressed — @CoreRasurae Cycle
All 5 findings from the latest review have been resolved in commit
f4f6b77b5.✅ Fix 1 — HIGH: Worker error silently swallowed
Extracted
_format_worker_outcome(result, error)as a testable module-level function._on_llm_donenow delegates to it and surfacesworker.errorto the conversation widget — the UI no longer freezes silently on⏳ Thinking...when the worker thread fails.✅ Fix 2 — MEDIUM: Missing
SessionNotFoundError/DatabaseErrorhandlers_run_llm_dispatchnow catches both domain exceptions explicitly with user-friendly messages:SessionNotFoundError→"[Error] Session not found. Please restart the TUI to create a new session."DatabaseError→"[Error] Database error: <detail>"✅ Fix 3 — MEDIUM: BDD step hardcodes message text
step_submit_textnow storescontext._submitted_textandstep_facade_received_requestasserts against it — fully parameterized, works for any message text.✅ Fix 4 — MEDIUM: Robot integration test weak assertion
cmd_dispatch_messagenow asserts_STUB_TEXT in assistant_msgin addition tolen > 0— content regressions will be caught.✅ Fix 5 — LOW: No test for worker error callback path
_format_worker_outcomeis directly importable and testable. Added 4 new BDD scenarios:SessionNotFoundError,DatabaseError, worker error surfacing, and worker success pass-through. Total: 12 scenarios, all passing.Ready for re-review.
Code Review: feat(tui): wire normal text input to LLM via A2A facade (#11230)
Branch:
feature/m8-tui-llm-dispatch| PR: #11231 | Milestone: v3.7.0 (M8: TUI)Review conducted across multiple cycles covering: correctness, security, performance, error handling, test coverage, spec alignment, and code quality.
HIGH Severity
1. Concurrent dispatch race condition (BUG)
Location:
src/cleveragents/tui/app.py:328exclusive=Falseallows multiple LLM dispatches to run in parallel on the same session. If the user types several messages quickly, each spawns a new worker that calls_run_llm_dispatchconcurrently. This causes:SessionWorkflow.tell()callsappend_message()+_invoke_actor()+append_message(). Two concurrent workers would interleave message appends, producing garbled session history._on_llm_donecallbacks fire in unpredictable order, overwriting previous results.Fix: Change to
exclusive=Truewith a name to serialize dispatches:MEDIUM Severity
2. Rich markup injection in conversation widget (SECURITY / RENDERING)
Location:
src/cleveragents/tui/app.py:309,326and allconversation.update()callsTextual's
Staticwidget renders Rich markup. Neither user input (expanded) nor LLM response (assistant_message) is escaped. A user typing[bold red]hello[/bold red]would see styled red text rather than literal brackets. An LLM generating[reverse]IMPORTANT[/reverse]would have formatting applied.This is a console markup injection vector. While not RCE, it can produce misleading output or allow social engineering via styled text.
Fix: Use
rich.markup.escape()on all conversation.update() text:3. No cancellation or timeout for in-flight LLM requests (PERFORMANCE / UX)
Location:
src/cleveragents/tui/app.py:75,319The
_dispatch_llmclosure calls_run_llm_dispatchwhich synchronously invokesfacade.dispatch()in the worker thread with no timeout. If the LLM hangs (network partition, overloaded provider), the worker thread blocks indefinitely. The user has no cancel mechanism and no elapsed-time feedback beyond "Thinking...".Fix: Consider wrapping with a timeout. At minimum,
exclusive=Truemeans a second message displaces the first pending dispatch (addressing both #1 and #3).4. Overbroad
contextlib.suppress(Exception)swallows critical signals (SAFETY)Location:
src/cleveragents/tui/commands.py:249,253,288suppress(Exception)catches ALL Exception subclasses, includingKeyboardInterrupt,SystemExit,MemoryError,RecursionError. The intent (graceful degradation) is correct but scope is too broad.Fix: Use narrower exception types:
try: ... except (RuntimeError, ValueError, ImportError, OSError): pass.5. No error logging in
_run_llm_dispatch(OPERATIONAL)Location:
src/cleveragents/tui/app.py:74-90When
_run_llm_dispatchcatches exceptions, it converts them to display strings but never logs them. No structured log record or stack trace is produced, making debugging production TUI issues difficult. The rest of the codebase usesstructlogextensively.Fix: Add
logger.exception()in each except branch.6. Empty assistant response renders confusing UI (UX / BUG)
Location:
src/cleveragents/tui/app.py:79-80SessionWorkflow.tell()can legitimately return emptyassistant_message. The TUI would display"Assistant: "with blank text.Fix: Add fallback:
LOW Severity
7. Untested catch-all
except Exception(CODE QUALITY)Location:
app.py:89--except Exception as exc: # pragma: no cover. This broad handler has zero test coverage and could mask bugs.8.
_format_worker_outcomestringifies critical exceptions (SAFETY)Location:
app.py:111--return f"[Error] {error}". If Textual sendsKeyboardInterruptas worker error, it gets stringified rather than propagated.9. Hardcoded
"default"used for two different purposes (DESIGN)Location:
commands.py:290,297."default"serves as both persona lookup placeholder and session_id fallback. Extract distinguishable constants.10-12. Test coverage gaps (TEST GAPS)
[/]in user input or LLM response).assistant_messageedge case.Spec Alignment
No violations against
docs/specification.md. The implementation correctly uses A2Amessage/send, follows the facade pattern fromcli/commands/session.py, creates database-backed sessions viaSessionService, and passes persona actor as anactoroverride per dispatch.Summary Table
exclusive=Falseallows parallel dispatches on same sessionsuppress(Exception)swallows critical signals_run_llm_dispatchexcept Exceptionis untested_format_worker_outcomestringifiesKeyboardInterrupt"default"string shared for two semanticsThe implementation is well-structured and correctly follows the A2A dispatch pattern. Items #1 and #2 are the two I recommend addressing before merge.
Code Review: feat(tui): wire normal text input to LLM via A2A facade (#11230)
Branch:
feature/m8-tui-llm-dispatch| PR: #11231 | Milestone: v3.7.0 (M8: TUI)Review conducted across multiple cycles covering: correctness, security, performance, error handling, test coverage, spec alignment, and code quality.
HIGH Severity
1. Concurrent dispatch race condition (BUG)
Location:
src/cleveragents/tui/app.py:328exclusive=Falseallows multiple LLM dispatches to run in parallel on the same session. If the user types several messages quickly, each spawns a new worker that calls_run_llm_dispatchconcurrently. This causes:SessionWorkflow.tell()callsappend_message()+_invoke_actor()+append_message(). Two concurrent workers would interleave message appends, producing garbled session history._on_llm_donecallbacks fire in unpredictable order, overwriting previous results.Fix: Change to
exclusive=Truewith a name to serialize dispatches:MEDIUM Severity
2. Rich markup injection in conversation widget (SECURITY / RENDERING)
Location:
src/cleveragents/tui/app.py:309,326and allconversation.update()callsTextual's
Staticwidget renders Rich markup. Neither user input (expanded) nor LLM response (assistant_message) is escaped. A user typing[bold red]hello[/bold red]would see styled red text rather than literal brackets. An LLM generating[reverse]IMPORTANT[/reverse]would have formatting applied.This is a console markup injection vector. While not RCE, it can produce misleading output or allow social engineering via styled text.
Fix: Use
rich.markup.escape()on all conversation.update() text:3. No cancellation or timeout for in-flight LLM requests (PERFORMANCE / UX)
Location:
src/cleveragents/tui/app.py:75,319The
_dispatch_llmclosure calls_run_llm_dispatchwhich synchronously invokesfacade.dispatch()in the worker thread with no timeout. If the LLM hangs (network partition, overloaded provider), the worker thread blocks indefinitely. The user has no cancel mechanism and no elapsed-time feedback beyond "Thinking...".Fix: Consider wrapping with a timeout. At minimum,
exclusive=Truemeans a second message displaces the first pending dispatch (addressing both #1 and #3).4. Overbroad
contextlib.suppress(Exception)swallows critical signals (SAFETY)Location:
src/cleveragents/tui/commands.py:249,253,288suppress(Exception)catches ALL Exception subclasses, includingKeyboardInterrupt,SystemExit,MemoryError,RecursionError. The intent (graceful degradation) is correct but scope is too broad.Fix: Use narrower exception types:
try: ... except (RuntimeError, ValueError, ImportError, OSError): pass.5. No error logging in
_run_llm_dispatch(OPERATIONAL)Location:
src/cleveragents/tui/app.py:74-90When
_run_llm_dispatchcatches exceptions, it converts them to display strings but never logs them. No structured log record or stack trace is produced, making debugging production TUI issues difficult. The rest of the codebase usesstructlogextensively.Fix: Add
logger.exception()in each except branch.6. Empty assistant response renders confusing UI (UX / BUG)
Location:
src/cleveragents/tui/app.py:79-80SessionWorkflow.tell()can legitimately return emptyassistant_message. The TUI would display"Assistant: "with blank text.Fix: Add fallback:
LOW Severity
7. Untested catch-all
except Exception(CODE QUALITY)Location:
app.py:89--except Exception as exc: # pragma: no cover. This broad handler has zero test coverage and could mask bugs.8.
_format_worker_outcomestringifies critical exceptions (SAFETY)Location:
app.py:111--return f"[Error] {error}". If Textual sendsKeyboardInterruptas worker error, it gets stringified rather than propagated.9. Hardcoded
"default"used for two different purposes (DESIGN)Location:
commands.py:290,297."default"serves as both persona lookup placeholder and session_id fallback. Extract distinguishable constants.10-12. Test coverage gaps (TEST GAPS)
[/]in user input or LLM response).assistant_messageedge case.Spec Alignment
No violations against
docs/specification.md. The implementation correctly uses A2Amessage/send, follows the facade pattern fromcli/commands/session.py, creates database-backed sessions viaSessionService, and passes persona actor as anactoroverride per dispatch.Summary Table
exclusive=Falseallows parallel dispatches on same sessionsuppress(Exception)swallows critical signals_run_llm_dispatchexcept Exceptionis untested_format_worker_outcomestringifiesKeyboardInterrupt"default"string shared for two semanticsThe implementation is well-structured and correctly follows the A2A dispatch pattern. Items #1 and #2 are the two I recommend addressing before merge.
Review Feedback Addressed — @CoreRasurae Cycle 2
All 12 findings from the latest review have been resolved in commit
cf6e33f2b.✅ Fix 1 — HIGH: Concurrent dispatch race condition
exclusive=True, name="llm-dispatch"— dispatches are now serialised. A new message cancels any in-flight worker before starting a new one, preventing interleavedappend_message()calls, widget thrashing, and SQLite contention.✅ Fix 2 — MEDIUM: Rich markup injection
rich.markup.escape()is now applied to allconversation.update()call sites (on_input_submitted,_on_llm_done). User input and LLM responses render as literal text.✅ Fix 3 — MEDIUM: Overbroad
suppress(Exception)All
contextlib.suppress(Exception)blocks in_build_tui_facadeand_create_tui_sessionnarrowed to specific types(RuntimeError, ValueError, ImportError, AttributeError, OSError).KeyboardInterrupt,SystemExit,MemoryErroretc. now propagate correctly.✅ Fix 4 — MEDIUM: No error logging
structloglog records added to every except branch in_run_llm_dispatch— A2A errors,SessionNotFoundError,DatabaseError, and the generic catch-all each emit a structured warning/exception entry.✅ Fix 5 — MEDIUM: Empty assistant response
assistant_text = result_data.get("assistant_message", "") or "(no response)"— blank LLM responses now show a clear fallback instead of"Assistant: ".✅ Fix 6 — LOW:
_format_worker_outcomestringifiesKeyboardInterruptBaseExceptionsubclasses that are notExceptionare re-raised rather than stringified.✅ Fix 7 — LOW: Hardcoded
"default"dual semanticsExtracted
_FALLBACK_SESSION_IDand_PRE_SESSION_PERSONA_KEYconstants — the two distinct uses of"default"are now named and distinguishable.✅ Fix 8 — LOW: Test coverage gaps
Added 3 new BDD scenarios: empty
assistant_messagefallback, Rich markup escaping verification (appliesescape()and checks no unescaped tags remain), andKeyboardInterruptre-raise from_format_worker_outcome. 15 scenarios total, all passing.Ready for re-review.
Code Review Report — PR #11231 (feature/m8-tui-llm-dispatch)
Branch:
feature/m8-tui-llm-dispatch| Issue: #11230 | Milestone: v3.7.0Scope: All 7 commits on the branch (cf6e33f2..4e176c55), plus close surrounding-code connections. Previous review rounds (CoreRasurae, HAL9001) have been addressed in commits
f4f6b77bandcf6e33f2.Review conducted across: Correctness, Security, Performance, Test Coverage, Code Quality, Specification Alignment.
HIGH Severity
1. Conversation widget content replaced, not accumulated (BUG / SPEC VIOLATION)
Location:
src/cleveragents/tui/app.py:341,361_Static.update()replaces the widget content on every call. The flow is:conversation.update(_escape(f"You: {expanded}\n\n⏳ Thinking..."))conversation.update(_escape(outcome))(replaces thinking text)conversation.update(_escape(f"You: {expanded2}\n\n⏳ Thinking..."))← previous exchange is LOSTThe
SessionView.transcript: list[str]field exists but is never populated (line 231:transcript=[]). No code appends to it.The specification (
docs/specification.md:29269) describes a "scrollable message stream with block cursor navigation" showing multi-message conversations with user, actor, and tool outputs visible simultaneously (mockup at lines 29227-29246).Impact: Multi-turn conversations lose all visible history after the second message. Users cannot scroll back through prior exchanges. This directly contradicts the feature goal of "having a real conversation through the TUI interface."
Fix: Accumulate content rather than replacing it, using
conversation.mount()for new message blocks, or maintain a content buffer that grows with each exchange and update the full transcript on each change.MEDIUM Severity
2. Double
session_serviceinstance creation from Factory provider (ARCHITECTURE)Location:
src/cleveragents/application/container.py:884+src/cleveragents/tui/commands.py:251,299session_serviceis declared asproviders.Factory(notproviders.Singleton):Both
_create_tui_session()and_build_tui_facade()independently callcontainer.session_service():This creates two separate
PersistentSessionServiceinstances with two separate SQLAlchemy engines to the same SQLite file. The CLI equivalent (cli/commands/session.py:70-83) uses a module-level_servicesingleton to avoid dual instantiation:While functionally correct for SQLite (serialized access to the same file), this wastes resources (two engines, two sessionmaker factories) and is architecturally fragile — if the factory's construction parameters ever diverge, the two instances could target different DB paths.
Fix: Either change
session_servicetoproviders.Singleton, or adopt the CLI pattern of caching the first-created instance at module level.3. Missing integration tests for
DatabaseErrorandSessionNotFoundErrorin dispatch flow (TEST GAP)Location:
robot/helper_tui_llm_dispatch.pyThe Robot Framework integration tests cover 4 scenarios: session creation, facade building, dispatch, and no-actor error. The
cmd_no_actor_errorsubcommand testsSessionActorNotConfiguredError.However, there are no integration test subcommands for:
DatabaseErrorduring dispatch (e.g., DB file locked, disk full)SessionNotFoundErrorduring dispatch (e.g., session deleted between creation and dispatch)Both error paths are tested only in Behave via mocks (
features/tui_llm_dispatch.feature:43-51). The contributing guidelines require both unit AND integration tests for new behavior.Fix: Add
cmd_database_errorandcmd_session_not_foundsubcommands torobot/helper_tui_llm_dispatch.py, and corresponding Test Cases inrobot/tui_llm_dispatch.robot.4. No Behave test for
on_input_submitted→run_worker→done_callbackintegration (TEST GAP)Location:
src/cleveragents/tui/app.py:354-366The Behave tests test
_run_llm_dispatch()and_format_worker_outcome()in isolation, but the composition ofrun_worker(thread=True, exclusive=True)withdone_callbackwiring withinon_input_submittedis untested. Specifically:_dispatch_llmclosure correctly capturesfacade,session_id,expanded,actor_on_llm_donecallback correctly calls_format_worker_outcomeand updates the widgetexclusive=True, name="llm-dispatch"serialization prevents concurrent dispatchWhile testing Textual internals is impractical without a running event loop, the closure construction logic could be tested indirectly.
Mitigation: Add a comment or ADR note explaining why this integration path is untestable without Textual, confirming it has been manually verified.
LOW Severity
5. Imports inside frequently-called function bodies (STYLE)
Location:
src/cleveragents/tui/app.py:73-78,294_run_llm_dispatch()importsA2aRequest,DatabaseError,SessionActorNotConfiguredError,SessionNotFoundErroron every invocation.on_input_submitted()importsrich.markup.escapeon every keypress._create_tui_session()importscontextlibinside the function body.Python caches modules after the first import, so the runtime cost is minimal, but this pattern is inconsistent with the rest of the codebase (which uses module-level imports).
Fix: Move these imports to module level.
6.
# pragma: no coveron catch-allexcept Exception(CODE QUALITY)Location:
src/cleveragents/tui/app.py:111The catch-all
except Exception as exc: # pragma: no coverin_run_llm_dispatchis intentionally excluded from coverage because the specific exception types (SessionActorNotConfiguredError,SessionNotFoundError,DatabaseError) cover all expected failure modes. However, truly unexpected exceptions (e.g.,MemoryError,RecursionError) would be silently caught and converted to a user-facing[Error]string without any test verifying this behavior.Mitigation: Acceptable as-is since this is a safety net. Consider a note in the docstring acknowledging its purpose.
Specification Alignment
No violations against
docs/specification.mdbeyond item #1 (conversation history). The implementation correctly:message/sendas the messaging protocolcli/commands/session.pySessionServiceactoroverride per dispatch (spec-compliantagents session tellbehavior)Summary Table
_Static.update()replaces content; conversation history lost;SessionView.transcriptunusedproviders.Factorycreates twosession_serviceinstances; inconsistent with CLI singleton patternDatabaseError/SessionNotFoundErrorrun_worker+done_callbackintegration path inon_input_submitted_run_llm_dispatch,on_input_submitted,_create_tui_session)# pragma: no coveron catch-allexcept ExceptionAll previously reported issues (concurrent dispatch, markup injection, error swallowing, logging, empty response, KeyboardInterrupt handling, hardcoded strings, test gaps) have been resolved in the latest commits.
Review Outcome: REQUEST CHANGES
Items blocking merge: #1
Strongly recommended: #2, #3, #4
fe45e1e91d9c95c3ed78Code Review —
feature/m8-tui-llm-dispatch→ PR #11231BUGS
HIGH — Cancelled worker callback may corrupt multi-message transcript
File:
src/cleveragents/tui/app.py:398–423When
exclusive=Truecancels an in-flight worker (because a new message was submitted), the cancelled worker'sdone_callbackstill fires._format_worker_outcomechecksnot isinstance(error, Exception)to re-raise non-ExceptionBaseExceptionsubclasses — but if the cancellation produces a worker error that IS anExceptionsubclass, the function returnsf"[Error] {error}"instead of re-raising. The_on_llm_donecallback then overwritestranscript[-1](which now contains the second dispatch's correct output) with stale data.Even if the error IS a
BaseExceptionsubclass and gets re-raised, re-raising inside a Textualdone_callbackhas undefined behavior — it may silently crash the callback chain, be logged, or trigger an unhandled exception handler depending on the Textual version.Recommendation: Guard the callback with a flag or worker ID check. Only apply the outcome if the worker is the most recent one started for this session:
Or track a generation counter:
PERFORMANCE
MEDIUM — O(n) full-string join + Rich escape on every render call
File:
src/cleveragents/tui/app.py:307–329(_render_transcript)_render_transcriptdoes_TRANSCRIPT_SEPARATOR.join(entries)followed by_escape(text)on every render. Both operations scan the full accumulated string. For a transcript with N entries, each call processes O(N·avg_entry_size) data. With 100+ exchanges in a long session, every keystroke that re-renders will rebuild and re-escape multi-KB strings.Previous transcript entries are already stored escaped (or safe), so re-escaping them is wasteful double-processing.
Recommendation: Consider either:
RichLogwidget which supportswrite()for append-only log-style rendering.LOW —
query_one("#conversation")called on everyon_input_submittedFile:
src/cleveragents/tui/app.py:349The widget lookup is repeated on every submission. Could be cached as
self._conversationon mount.SPEC ALIGNMENT
MEDIUM — Single
_Staticwidget vs spec's per-messageVerticalScrollwith block cursor navigationSpec reference: §29269 — "Messages are rendered as
RichLogorStaticwidgets grouped in aVerticalScroll" with "block cursor navigation."File:
src/cleveragents/tui/app.py:307–329The PR concatenates all messages into a single
_Staticwidget's text. This design:This is likely an intentional simplification for this PR (which focuses on LLM dispatch wiring, not full conversation UX), but worth noting as a future spec gap.
LOW — No conversation pruning mechanism
Spec reference: §30492 — "The TUI implements automatic content pruning to maintain responsiveness."
The transcript list grows unbounded with no truncation or widget reuse. For sessions with hundreds of messages, this will cause memory pressure and rendering latency.
TEST COVERAGE GAPS
MEDIUM — Worker lifecycle integration tested only manually
File:
src/cleveragents/tui/app.py:391–397(see code comment)The
run_worker(thread=True, exclusive=True)+done_callbackwiring is explicitly excluded from Behave tests. Only the constituent parts (_run_llm_dispatchand_format_worker_outcome) are tested in isolation. The integration — cancellation timing, callback ordering, transcript state after concurrent dispatches — has only manual verification.MEDIUM —
_format_worker_outcomeSystemExit path untestedFile:
src/cleveragents/tui/app.py:133–155The docstring claims
SystemExitis re-raised but onlyKeyboardInterruptis tested (tui_llm_dispatch.featureScenario: "_format_worker_outcome re-raises BaseException subclasses").LOW —
_build_tui_facadefallback path untestedFile:
src/cleveragents/tui/commands.py:256–296(line 296:# pragma: no cover)The outer
exceptthat returnsNonewhen facade construction fails is untested.LOW —
_format_worker_outcome(result=None, error=None)path untestedFile:
src/cleveragents/tui/app.py:155(# pragma: no cover)The
return Nonebranch when both result and error are None is unreachable in practice but not covered.TEST FLAWS
LOW — Rich markup escaping test validates
escape()in isolationFile:
features/steps/tui_llm_dispatch_steps.py:341–363The test calls
_run_llm_dispatch, then manually appliesescape()to the result. It proves thatescape()works, but does NOT verify that_render_transcriptactually calls_escape()beforeconversation.update(). If someone accidentally removes the_escape()call from_render_transcript, this test would still pass.DESIGN
LOW — Module-level
_tui_session_servicesingletonFile:
src/cleveragents/tui/commands.py:41The singleton mirrors the CLI pattern (
cli/commands/session.py), which is reasonable. However, it requires tests to manually resetcmd_mod._tui_session_service = None. The Behave tests handle this correctly, but it's a maintenance risk — a test that forgets to reset the singleton could get a stale instance.Summary
All 11 findings are non-blocking review comments. No critical security, correctness (under normal single-message dispatch), or data-loss issues found.
Fix 1 (HIGH): Guard _on_llm_done against cancelled worker corrupting transcript. Added _dispatch_gen counter — incremented on each dispatch, captured in closure. done_callback exits early if worker.state is CANCELLED or if _dispatch_gen has advanced (newer dispatch started). Fix 2 (PERF): Pre-escape transcript entries on storage in on_input_submitted (_escape applied before append). _render_transcript now joins pre-escaped strings with no runtime re-escaping — O(1) amortised per render instead of O(N). Fix 3 (PERF/STYLE): Cache self._conversation widget in on_mount. query_one("#conversation") no longer called on every on_input_submitted. Fix 4 (SPEC): Added spec gap comment + pruning TODO in _render_transcript explaining the single-Static vs RichLog/VerticalScroll deviation. Fix 5 (COVERAGE): Added BDD scenarios: - _format_worker_outcome re-raises SystemExit (alongside KeyboardInterrupt) - _format_worker_outcome returns None when both inputs are None - _build_tui_facade returns None when facade construction fails Total: 18 scenarios, all passing. Fix 6 (TEST): Fixed markup escaping test to verify on_input_submitted stores pre-escaped entries in transcript (not just escape() in isolation). Fix 7 (DOCS): Added docstring note on pragma: no cover branch in _format_worker_outcome explaining why it is unreachable in practice. Refs: #11230Review Feedback Addressed — @CoreRasurae Cycle 4
All 11 findings from the latest review have been resolved in commit
b833b6f94.✅ Fix 1 — HIGH: Cancelled worker callback may corrupt transcript
Added
_dispatch_geninteger counter on the app, incremented on each dispatch and captured in the closure._on_llm_donenow returns early ifworker.state == WorkerState.CANCELLEDor ifself._dispatch_gen != current_gen(a newer dispatch has started). Stale callbacks from cancelled workers can no longer overwritetranscript[-1].✅ Fix 2 — MEDIUM: O(n) join + escape on every render
Transcript entries are now pre-escaped with
_escape()on storage inon_input_submitted._render_transcriptonly joins pre-escaped strings — no runtime re-escaping. O(1) amortised per render instead of O(N) re-escape of full history.✅ Fix 3 — LOW (perf): Repeated
query_oneon every submitself._conversationwidget cached inon_mount.on_input_submitteduses the cached reference with a fallback for robustness.✅ Fix 4 — MEDIUM (spec gap): Single
_StaticvsVerticalScrollAdded spec gap comment and
TODOin_render_transcriptdocumenting the intentional simplification and pointing to the futureRichLog/VerticalScrollmigration per spec §29269.✅ Fix 5 — MEDIUM: Worker lifecycle / coverage gaps
Added 3 new BDD scenarios:
_format_worker_outcome re-raises SystemExit(alongside the existingKeyboardInterruptscenario)_format_worker_outcome returns None when both result and error are None_build_tui_facade returns None when facade construction fails18 scenarios total, all passing.
✅ Fix 6 — LOW: Markup escaping test validated
escape()in isolationReplaced the previous scenario with
Rich markup in user input is escaped before storage in transcript— now verifies thaton_input_submittedstores pre-escaped entries in the transcript (not just thatescape()itself works). TheWhen I submit the textstep also capturescontext._app_transcriptfor the assertion.✅ Fix 7 — LOW:
_format_worker_outcomereturn NonebranchAdded docstring note explaining the
# pragma: no coverreturn Nonebranch is unreachable because Textual guarantees eitherresultorerroris set when a worker completes.✅ Remaining LOW items
cmd_mod._tui_session_service = Noneresets in Behave steps; added a note in the_get_tui_session_servicedocstring.TODOcomment added in_render_transcriptreferencing spec §30492.Ready for re-review.
Code Review: feat(tui): wire normal text input to LLM via A2A facade
Scope:
feature/m8-tui-llm-dispatch(11 commits, 6 files, +1276/-17 lines)Review cycles: 3 full cycles across all categories
Summary
The LLM dispatch wiring is well-structured. The core logic is extracted into testable module-level functions (
_run_llm_dispatch,_format_worker_outcome) with solid error handling for all expected domain exceptions. The Thread worker pattern with exclusive dispatch and generation tracking correctly prevents transcript corruption from concurrent submissions. Markup escaping is properly applied before transcript storage. No critical or high-severity issues found.Findings by Category & Severity
BUGS
None (critical/high). No race conditions, crashes, or data corruption paths identified.
SECURITY
LOW — Rich markup injection via
_run_llm_dispatchreturn valueThe function returns unescaped output (e.g.,
"You: {message}\n\nAssistant: {assistant_text}"). Current callers in_on_llm_doneapply_escape()before storage, so there is no active vulnerability. However, the function is module-level and importable from anywhere — a future direct caller that renders the output without escaping could expose the conversation widget to Rich markup injection from the user"s own message text or an LLM response containing brackets. Suggestion: either escape the output inside_run_llm_dispatchitself, or document prominently that callers MUST escape before rendering.LOW — Exception text in user-visible error messages
f"[Error] Database error: {exc}"and the catch-allf"[Error] {exc}"include the raw exception message. If aDatabaseErrorever contains a connection string or file path, it would leak to the conversation widget. Suggestion: use a generic message for the catch-all and only include the exception detail for domain errors.PERFORMANCE
MEDIUM —
_render_transcript()does O(N) copy + join on every renderentries = list(self._session.transcript)copies the entire list, then_TRANSCRIPT_SEPARATOR.join(entries)creates a large string. For conversations with hundreds of exchanges this could become noticeable. The code acknowledges this with a TODO referencing spec §30492 (conversation pruning). Acceptable for this PR given the scope and the acknowledged follow-up.LOW —
_escape()applied to full outcome string_escape(outcome)is called on the complete"You: ...\n\nAssistant: ..."string. This is O(response_length) but LLM API latency dominates, so this is negligible in practice.TEST COVERAGE
MEDIUM — Textual callback chain not covered by BDD tests
The
run_worker(thread=True, exclusive=True)+done_callback+_dispatch_gentracking path is untested by Behave. The component parts (_run_llm_dispatch,_format_worker_outcome) ARE tested in isolation viatui_llm_dispatch.feature, and the Robot Framework tests cover the facade wiring end-to-end. The code acknowledges this limitation with a clear comment. Acceptable for this PR.LOW — Test cleanup could use
finallyIn
tui_llm_dispatch_steps.py,cmd_mod._tui_session_service = Noneis called after thepatchcontext block but not in afinally. If the test asserts fail inside the block, the singleton is not cleaned up. The next test re-sets it toNonebefore work, so no test interference — but atry/finallywould be safer. Minor.SPECIFICATION ALIGNMENT
MEDIUM — Conversation uses
_Staticinstead of per-messageRichLogSpec §29269 calls for a scrollable message stream with per-message
RichLogwidgets and block-cursor navigation. This PR uses a single_Staticwidget with accumulated text. The code explicitly acknowledges this as an intentional simplification with a follow-up TODO. Acceptable given the PR scope (dispatch wiring, not widget architecture).MEDIUM — No conversation content pruning
Spec §30492 requires automatic pruning when conversations exceed 2,500 lines. Not implemented — acknowledged as TODO. Acceptable for this PR.
Things Done Well
_run_llm_dispatch()and_format_worker_outcome()are Textual-free and directly testable.SessionActorNotConfiguredError,SessionNotFoundError,DatabaseError, and A2A error responses are all handled with user-friendly messages._escape()inon_input_submittedand_on_llm_done), not just at render time. Tested in"Rich markup in user input is escaped before storage in transcript"._build_tui_facade()and_create_tui_session()to prevent duplicate SQLAlchemy engines (same pattern as CLI).exclusive=Trueon worker: Serialises LLM dispatches, preventing concurrent session writes.Recommendation
APPROVE. All findings are low-to-medium severity and the spec gaps (RichLog, pruning) are explicitly documented as out-of-scope for this PR. The implementation is solid, well-tested, and follows existing patterns from the CLI (same facade wiring, same SessionService singleton pattern).
Approved based on the BOT assessment, but on the condition that the commits are squashed into one. This branch has plenty of commits which are only just iterations of fixes.
07c514fb32652769c46c