UAT: Cost tracking returns None for non-OpenAI providers (Anthropic, Google, Groq, etc.) #5505

Open
opened 2026-04-09 07:07:14 +00:00 by HAL9000 · 1 comment
Owner

Bug Report

Feature Area: Token Counting and Cost Tracking Per Provider
Severity: Medium (cost tracking incomplete for non-OpenAI providers)
Found by: UAT Testing (uat-pool-1, worker: additional-llm-backends)


What Was Tested

The cost tracking integration in LangChainChatProvider for non-OpenAI providers.

Expected Behavior (from spec)

The spec states:

  • Cost Tracking: Costs are calculated based on per-token input/output rates defined in the provider's model registration.
  • Budgeting: Plans and sessions can have cost limits, and the system will halt execution if a cost estimate exceeds the defined budget.

This implies cost tracking should work for ALL providers, not just OpenAI.

Actual Behavior

In langchain_chat_provider.py, the _usage_tracker() method only activates the LangChain OpenAI callback for OpenAI-backed LLMs:

def _usage_tracker(self, llm: BaseLanguageModel) -> AbstractContextManager[Any]:
    if _openai_callback is not None and self._is_openai_llm(llm):
        return _openai_callback()
    return nullcontext()  # ← No tracking for Anthropic, Google, Groq, etc.

And _is_openai_llm() only returns True for langchain_openai-backed models:

def _is_openai_llm(self, llm: BaseLanguageModel) -> bool:
    module = getattr(llm.__class__, "__module__", "")
    return module.startswith("langchain_openai")

As a result:

  • OpenAI: usage_tracker.total_tokens and usage_tracker.total_cost are populated ✓
  • Anthropic: usage_tracker is None → _resolve_token_cost() returns None ✗
  • Google/Gemini: usage_tracker is None → cost is None ✗
  • Groq, Together, Cohere: usage_tracker is None → cost is None ✗

For non-OpenAI providers, _resolve_token_count() falls back to _estimate_token_usage() (which uses llm.get_num_tokens()), so a token count is still reported, but cost is always None.
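The None tracker follows from how contextlib.nullcontext works: called with no argument, it binds None to the `with ... as` target. The sketch below assumes the provider consumes the result of _usage_tracker() in a `with ... as` statement. It uses stand-ins only: `pick_tracker` and `FakeTracker` are hypothetical names, not project code.

```python
from contextlib import nullcontext


class FakeTracker:
    """Hypothetical stand-in for the object the OpenAI callback yields."""

    total_tokens = 0
    total_cost = 0.0

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False


def pick_tracker(is_openai: bool):
    # Mirrors the branching shape of _usage_tracker(), not its real code.
    return FakeTracker() if is_openai else nullcontext()


with pick_tracker(is_openai=True) as tracker:
    openai_tracker = tracker  # a tracker object exposing total_cost

with pick_tracker(is_openai=False) as tracker:
    other_tracker = tracker  # nullcontext() binds None here
```

Any later read of `total_cost` therefore has nothing to read from on the non-OpenAI path, which matches the None cost reported above.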

Code Location

  • src/cleveragents/providers/llm/langchain_chat_provider.py
    • _usage_tracker() — only handles OpenAI
    • _is_openai_llm() — only checks for langchain_openai module
    • _resolve_token_cost() — returns None when tracker is None

Impact

  1. The CostTracker cannot record actual costs for Anthropic, Google, or other providers; it can only estimate them by applying ProviderCostTable rates to estimated token counts.
  2. Budget enforcement via CostTracker.record_usage() requires accurate cost data, which is unavailable for non-OpenAI providers.
  3. The ProviderResponse.token_count is populated via estimation, but cost in the log is always None for non-OpenAI providers.
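To illustrate the budgeting impact, here is a minimal sketch of how a budget check could be silently bypassed when cost is None. `BudgetGuard` and `BudgetExceeded` are hypothetical; the real CostTracker.record_usage() may handle a missing cost differently.

```python
from typing import Optional


class BudgetExceeded(RuntimeError):
    """Raised when accumulated spend passes the limit."""


class BudgetGuard:
    """Hypothetical budget check; not the project's CostTracker."""

    def __init__(self, limit_usd: float) -> None:
        self.limit_usd = limit_usd
        self.spent_usd = 0.0
        self.untracked_calls = 0

    def record(self, cost: Optional[float]) -> None:
        if cost is None:
            # Nothing to accumulate: the call is invisible to the budget.
            self.untracked_calls += 1
            return
        self.spent_usd += cost
        if self.spent_usd > self.limit_usd:
            raise BudgetExceeded(
                f"spent {self.spent_usd:.4f} USD, limit {self.limit_usd:.4f} USD"
            )


guard = BudgetGuard(limit_usd=1.0)
for _ in range(100):
    guard.record(None)  # e.g. 100 calls to a provider with no cost data
```

Under these assumptions the guard ends with zero recorded spend after 100 calls, so the limit can never trigger for those providers.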

Suggested Fix

Implement provider-specific usage tracking:

  1. For Anthropic: Use langchain_anthropic's response metadata (available via AIMessage.usage_metadata in newer LangChain versions) or implement a custom callback.

  2. For Google: Use langchain_google_genai's response metadata similarly.

  3. Alternative: After each LLM call, extract token usage from the response's usage_metadata field (available in LangChain 0.2+) and use ProviderCostTable to compute cost. This would work for all providers.

Example approach using usage_metadata:

# In _resolve_token_count, check response metadata
# LangChain AIMessage.usage_metadata: {"input_tokens": N, "output_tokens": M}
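A slightly fuller sketch of the same idea follows, as one possible shape rather than the project's implementation. It assumes the response object exposes a `usage_metadata` dict with `input_tokens` and `output_tokens`, as described above. `Rates`, `cost_from_usage`, and the rate values are hypothetical and do not reflect the real ProviderCostTable API or real pricing.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any, Optional


@dataclass(frozen=True)
class Rates:
    """Hypothetical per-token rates in USD; not the real ProviderCostTable."""

    input_per_token: float
    output_per_token: float


def cost_from_usage(response: Any, rates: Rates) -> Optional[float]:
    """Return the cost derived from response.usage_metadata, or None if absent."""
    usage = getattr(response, "usage_metadata", None)
    if not usage:
        return None
    input_tokens = usage.get("input_tokens", 0)
    output_tokens = usage.get("output_tokens", 0)
    return (
        input_tokens * rates.input_per_token
        + output_tokens * rates.output_per_token
    )


# Stand-in for an AIMessage; only the usage_metadata attribute is modelled.
fake_response = SimpleNamespace(
    usage_metadata={"input_tokens": 1000, "output_tokens": 500}
)
example_rates = Rates(input_per_token=0.000003, output_per_token=0.000015)  # made-up
example_cost = cost_from_usage(fake_response, example_rates)
```

Because the helper only reads an attribute, it does not depend on which provider package produced the response. Returning None when the metadata is missing leaves room for the existing estimation fallback.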

Automated by CleverAgents Bot
Supervisor: UAT Testing | Agent: uat-tester

HAL9000 added this to the v3.6.0 milestone 2026-04-09 07:13:01 +00:00
Author
Owner

Label compliance fix applied:

  • Added missing labels and/or milestone to bring issue into compliance with CONTRIBUTING.md

Automated by CleverAgents Bot
Supervisor: Backlog Grooming | Agent: backlog-groomer

Reference
cleveragents/cleveragents-core#5505