#10042: Add fallback to Anthropic Haiku when OpenAI quota is exhausted #10043

2026-04-16T18:17:30Z

CoreRasurae commented

2026-04-16 18:17:30 +00:00

Summary

Implements graceful degradation for E2E robot integration tests that hit OpenAI 429 quota limit errors.

Changes

- Add `_is_quota_error()` helper to detect quota-specific API errors (429, insufficient_quota, rate_limit)
- Modify `_execute_with_llm()` in StrategyActor to catch quota errors and attempt fallback to Anthropic Haiku
- Configure fallback provider as 'anthropic/claude-3-5-haiku-20241022'
- Add comprehensive logging for quota error detection and provider fallback
- Add E2E test scenarios for quota fallback verification
- Add 'Skip If No Fallback LLM Key' keyword for quota fallback tests
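For readers without the diff at hand, the detection and fallback steps listed above might look roughly like the sketch below. This is not the PR's code: the `status_code` attribute, the string markers, and the `invoke_with_fallback` wrapper are all assumptions made for illustration, and an LLM is modelled as a plain callable from prompt to text.

```python
import logging
import re
from typing import Callable

logger = logging.getLogger(__name__)

# Hypothetical stand-in for an LLM client: prompt in, completion text out.
LLM = Callable[[str], str]

# Match "429" only as a standalone token, so "1429 ms" is not a hit.
_STATUS_429 = re.compile(r"\b429\b")
_QUOTA_MARKERS = ("insufficient_quota", "rate_limit")


def is_quota_error(exc: Exception) -> bool:
    """Return True if the exception looks like a provider quota error."""
    # Assumption: the client exception may expose a numeric status_code.
    if getattr(exc, "status_code", None) == 429:
        return True
    message = str(exc).lower()
    return bool(_STATUS_429.search(message)) or any(
        marker in message for marker in _QUOTA_MARKERS
    )


def invoke_with_fallback(
    primary: LLM, make_fallback: Callable[[], LLM], prompt: str
) -> str:
    """Call the primary LLM; on a quota error, build and call the fallback."""
    try:
        return primary(prompt)
    except Exception as exc:
        if not is_quota_error(exc):
            raise  # unrelated failures are not masked by the fallback
        logger.warning("Quota error from primary provider, using fallback: %s", exc)
        return make_fallback()(prompt)
```

Matching on message text is fragile, so a real helper would more likely inspect the provider SDK's typed exceptions where they are available.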

Impact

This lets CI/CD pipelines complete E2E tests even when the primary provider (OpenAI) hits quota limits, improving pipeline reliability and reducing spurious failures caused by provider-specific issues.

Testing

- All existing unit tests pass
- All lint and type checks pass
- New E2E test scenarios added for quota fallback verification

Related Issues

Fixes #10042


CoreRasurae added a new dependency 2026-04-16 18:18:26 +00:00

#10042 [CI BLOCKER] Add fallback to Anthropic Haiku when OpenAI E2E tests hit quota limits

CoreRasurae requested review from HAL9001 2026-04-16 18:18:45 +00:00

HAL9000 was assigned by CoreRasurae

2026-04-16 18:18:49 +00:00

CoreRasurae force-pushed feature/10042-openai-quota-fallback from 36428204df to 9f71f69890

2026-04-16 18:19:39 +00:00


CoreRasurae force-pushed feature/10042-openai-quota-fallback from 9f71f69890 to ac65fdb996

2026-04-16 18:22:12 +00:00


HAL9000 commented

2026-04-16 19:06:45 +00:00

@CoreRasurae — Thank you for the rapid turnaround on PR #10043. Filing the issue and submitting a fix within minutes is exactly the kind of responsiveness that keeps CI healthy.

Acknowledgment

This PR directly addresses the CI blocker identified in #10042. The approach is well-targeted:

- `_is_quota_error()` helper for clean error detection
- Fallback to `anthropic/claude-3-5-haiku-20241022` on quota exhaustion
- Comprehensive logging for observability
- New E2E test scenarios for fallback verification

Impact

If this PR merges and CI passes, it will unblock:

- PR #10000 (hamza.khyari, "fix: clean up stale worktree branch")
- PR #10002 (hamza.khyari, "feat: plan diff using git worktree branch")
- Any other PRs currently blocked by E2E quota failures

Next Steps

The PR has been received and is queued for automated review. Please ensure:

1. CI passes on the current HEAD (especially the new E2E quota fallback scenarios)
2. `CHANGELOG.md` includes an entry under `[Unreleased] > Fixed`
3. `CONTRIBUTORS.md` is updated if not already present
4. The commit footer includes `ISSUES CLOSED: #10042`

This will be prioritized for review given its CI blocker status.

Automated by CleverAgents Bot
Supervisor: Human Liaison | Agent: human-liaison-pool-supervisor
Worker: [AUTO-HUMAN-3]


HAL9000 added labels 2026-04-16 19:12:24 +00:00

HAL9000 referenced this pull request

2026-04-16 19:26:58 +00:00

[AUTO-HUMAN] Status: Human Liaison Supervisor (Cycle 1) #10044

CoreRasurae force-pushed feature/10042-openai-quota-fallback from ac65fdb996 to f99fd631ea

2026-04-16 19:58:16 +00:00


HAL9000 referenced this pull request

2026-04-16 20:18:20 +00:00

[AUTO-HUMAN] Status: Human Liaison Supervisor (Cycle 1) #10044

CoreRasurae force-pushed feature/10042-openai-quota-fallback from 6f669efd91 to 361afdfe8d

2026-04-16 20:52:34 +00:00


HAL9000 referenced this pull request

2026-04-16 21:05:23 +00:00

[AUTO-HUMAN] Status: Human Liaison Supervisor (Cycle 1) #10044

brent.edwards reviewed 2026-04-16 22:07:58 +00:00

src/cleveragents/application/services/strategy_actor.py Outdated

						
@@ -498,3 +523,3 @@
        # Retry loop for transient LLM failures
        content = self._invoke_llm_with_retry(llm, messages, plan_id)
        try:

brent.edwards commented

2026-04-16 22:07:58 +00:00

Hi Luis (or the bot) --

This changes the code from only using `llm` to the following:

- Try using `llm`
- If there is a quota error:
  - Create a fallback LLM
  - Try the fallback LLM

There are two obvious ways to improve the code:

1. `fallback_llm` doesn't need to be recreated every time. It's fine to create it once as a global variable.
2. An exhausted quota will not recover quickly. As written, every message sent through this function will first get a quota error and only then try the other LLM. It would be faster in general to:
   2.1 Set the `llm` variable to the fallback provider and fallback model after the first quota error.
   2.2 If it's been, say, 5 minutes since the quota was last checked, set `llm` back to the original LLM and see whether the quota problem has been resolved.
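A minimal sketch of both suggestions, assuming an LLM can be modelled as a callable from prompt to text. The class name `QuotaAwareRouter`, its parameters, and the injected `clock` are hypothetical and do not come from the PR. The fallback is built once and cached, and the primary is skipped until the recheck interval has elapsed.

```python
import time
from typing import Callable, Optional

# Hypothetical stand-in for an LLM client: prompt in, completion text out.
LLM = Callable[[str], str]


class QuotaAwareRouter:
    """Route calls to a fallback LLM while the primary is out of quota."""

    def __init__(
        self,
        primary: LLM,
        fallback_factory: Callable[[], LLM],
        is_quota_error: Callable[[Exception], bool],
        recheck_seconds: float = 300.0,  # "say, 5 minutes"
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._primary = primary
        self._fallback_factory = fallback_factory
        self._is_quota_error = is_quota_error
        self._recheck_seconds = recheck_seconds
        self._clock = clock
        self._fallback: Optional[LLM] = None  # created lazily, then reused
        self._quota_hit_at: Optional[float] = None

    def _get_fallback(self) -> LLM:
        # Suggestion 1: build the fallback once instead of on every call.
        if self._fallback is None:
            self._fallback = self._fallback_factory()
        return self._fallback

    def _primary_suspended(self) -> bool:
        if self._quota_hit_at is None:
            return False
        return self._clock() - self._quota_hit_at < self._recheck_seconds

    def invoke(self, prompt: str) -> str:
        # Suggestion 2: skip the primary until the recheck interval elapses.
        if not self._primary_suspended():
            try:
                result = self._primary(prompt)
                self._quota_hit_at = None  # primary works again
                return result
            except Exception as exc:
                if not self._is_quota_error(exc):
                    raise
                self._quota_hit_at = self._clock()
        return self._get_fallback()(prompt)
```

The sketch keeps the cached fallback on an instance rather than in a module-level global, which makes it easier to test with a fake clock; a global would work the same way. It is not thread-safe as written, so concurrent callers would need a lock around the state updates.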
