UAT: Storage backends stay in-memory (no Tantivy/Qdrant/Neo4j/rdflib wiring) #7003

Open
opened 2026-04-10 06:23:19 +00:00 by HAL9000 · 1 comment
Owner

What was tested

  • Reviewed the storage wiring in application/container.py and attempted to select the production backends declared in the Storage & Persistence spec (Tantivy text index, Qdrant vector DB, Neo4j graph, rdflib triple store).
  • Executed the vector backend factory with index.vector.backend=qdrant to confirm runtime behavior.

Expected behavior

  • Selecting tantivy, qdrant, or neo4j backends should load concrete implementations so the ACMS pipeline can persist text, vector, and graph data according to the spec.
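
A minimal sketch of what config-driven selection could look like, for illustration only. `QdrantVectorBackend`, `VECTOR_BACKENDS`, and `select_vector_backend` are hypothetical names that do not exist in the codebase, and the `InMemoryVectorBackend` class here is a stand-in for the real stub. Raising on an unknown name rather than falling back is an assumption about the desired behavior, not something the spec is quoted as requiring.

```python
from typing import Callable, Dict


class InMemoryVectorBackend:
    """Stand-in for the in-memory stub named in this issue."""


class QdrantVectorBackend:
    """Hypothetical adapter; no such class exists in the codebase today."""


# Hypothetical registry mapping config values to backend factories.
VECTOR_BACKENDS: Dict[str, Callable[[], object]] = {
    "memory": InMemoryVectorBackend,
    "qdrant": QdrantVectorBackend,
}


def select_vector_backend(name: str) -> object:
    """Return an instance of the backend registered under `name`.

    Raises ValueError for an unregistered name instead of silently
    returning the in-memory stub.
    """
    try:
        factory = VECTOR_BACKENDS[name]
    except KeyError:
        raise ValueError(f"unsupported vector backend: {name!r}") from None
    return factory()
```

With wiring of this shape, `select_vector_backend("qdrant")` would return the concrete adapter, and a misspelled backend name would surface as an error at startup.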

Actual behavior

  • The DI container wires only InMemoryTextBackend, InMemoryVectorBackend, and InMemoryGraphBackend regardless of configuration. There are no Tantivy, Qdrant, Neo4j, or rdflib implementations anywhere in the codebase.
  • Running build_vector_backend with index.vector.backend=qdrant logs acms.vector_backend.fallback and returns InMemoryVectorBackend, confirming that the configurable “production” backend is missing.

Steps to reproduce

  1. Inspect src/cleveragents/application/container.py lines 825-846: the providers always return the in-memory stub backends. There is no conditional wiring based on config.
  2. Run:
    import os, tempfile, sys
    from pathlib import Path
    # Replace <repo> with the path to a local checkout.
    sys.path.insert(0, "<repo>/src")
    from cleveragents.config.settings import Settings
    from cleveragents.infrastructure.database.unit_of_work import UnitOfWork
    from cleveragents.application.services.vector_store_service import VectorStoreService
    from cleveragents.application.services.faiss_vector_backend import build_vector_backend
    # Use a throwaway data directory and request the Qdrant backend via the environment.
    work = Path(tempfile.mkdtemp())
    os.environ["CLEVERAGENTS_DATA_DIR"] = str(work)
    os.environ["CLEVERAGENTS_INDEX_VECTOR_BACKEND"] = "qdrant"
    svc = VectorStoreService(Settings(), UnitOfWork(f"sqlite:///{work / 'cleveragents.db'}", require_confirmation=False))
    # Build the backend and print the class that was actually selected.
    backend = build_vector_backend(svc)
    print(type(backend).__name__)
    
    The output shows the warning acms.vector_backend.fallback configured_backend=qdrant fallback_backend=InMemoryVectorBackend, and build_vector_backend returns InMemoryVectorBackend.
  3. Search the repository for Tantivy, Qdrant, Neo4j, or rdflib classes: only config strings exist, with no usable adapters.
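
Step 3 can be approximated with a small script. This is a sketch, not the method used in the report: `find_backend_classes` is a hypothetical helper that lists Python class definitions whose names mention one of the four backends, which separates real adapters from plain config strings.

```python
import re
from pathlib import Path
from typing import List, Tuple

BACKEND_NAMES = ("Tantivy", "Qdrant", "Neo4j", "rdflib")
# Matches the class name in lines such as "class QdrantVectorBackend(Base):".
CLASS_DEF = re.compile(r"^\s*class\s+(\w+)", re.MULTILINE)


def find_backend_classes(src_root: Path) -> List[Tuple[Path, str]]:
    """Return (file, class name) pairs for classes named after a backend."""
    hits: List[Tuple[Path, str]] = []
    for path in sorted(src_root.rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for match in CLASS_DEF.finditer(text):
            name = match.group(1)
            # Case-insensitive substring match against the backend names.
            if any(b.lower() in name.lower() for b in BACKEND_NAMES):
                hits.append((path, name))
    return hits
```

Calling `find_backend_classes(Path("<repo>/src"))` on a checkout would return an empty list if, as reported, only config strings reference these backends.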

Code references

  • src/cleveragents/application/container.py (DI wiring)
  • src/cleveragents/application/services/faiss_vector_backend.py (only FAISS implementation with fallback)
  • src/cleveragents/domain/models/acms/index_stubs.py (in-memory stubs currently in use)

Impact

  • The Storage & Persistence spec explicitly promises Tantivy (text), FAISS/Qdrant (vector), and Neo4j/rdflib (graph). Without concrete adapters, the ACMS indexing pipeline cannot satisfy those requirements, so server mode lacks the documented persistence guarantees.
HAL9000 self-assigned this 2026-04-10 06:48:03 +00:00
HAL9000 added this to the v3.6.0 milestone 2026-04-10 06:48:03 +00:00
Author
Owner

Verified — UAT bug: storage backends stay in-memory — no real backend wiring. MoSCoW: Should-have. Priority: Medium.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Reference
cleveragents/cleveragents-core#7003