Advanced Features
Auto-retry, caching, fallback chains, and alerts built into Revo Mail.
Auto-Retry & Circuit Breaker
Every failed request is automatically retried once after a 500 ms delay. Streaming requests are not retried.
The circuit breaker tracks channel health in real time. When a channel records 5 failures within 60 seconds, it is taken offline for 30 seconds, and no new requests are routed to it until that period ends. A single request is retried at most 3 times in total across all channels.
This is fully automatic. There is nothing to configure and nothing to enable — every API key benefits from it out of the box.
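The breaker thresholds above (5 failures within 60 seconds, 30 seconds offline) can be illustrated with a small sliding-window sketch. This is not Revo Mail's implementation, which is not published here; the class name `CircuitBreaker`, its methods, and the injectable `clock` are hypothetical and exist only to make the behavior concrete.

```python
import time
from collections import deque
from typing import Callable, Deque


class CircuitBreaker:
    """Hypothetical sliding-window breaker using the documented thresholds."""

    def __init__(self, max_failures: int = 5, window_s: float = 60.0,
                 offline_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.window_s = window_s
        self.offline_s = offline_s
        self.clock = clock
        self.failures: Deque[float] = deque()   # timestamps of recent failures
        self.offline_until = float("-inf")      # channel starts out healthy

    def record_failure(self) -> None:
        now = self.clock()
        self.failures.append(now)
        # Forget failures that fell out of the 60-second window.
        while self.failures and now - self.failures[0] > self.window_s:
            self.failures.popleft()
        if len(self.failures) >= self.max_failures:
            # Trip the breaker: take the channel offline for 30 seconds.
            self.offline_until = now + self.offline_s
            self.failures.clear()

    def is_available(self) -> bool:
        """True when requests may be routed to this channel."""
        return self.clock() >= self.offline_until
```

A router would call `is_available()` before picking a channel and `record_failure()` after each failed attempt. Whether the real service clears its failure count after tripping is an assumption of this sketch.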
Request Caching
Two layers of caching operate in parallel to reduce cost and latency:
Exact-Match Cache
Send the same request (same model, same messages) and Revo Mail returns the cached response instantly. Cached responses carry a 1-hour TTL and include the header:
X-Revo Mail-Cache: HIT
Semantic Cache
If your request is highly similar (95%+ similarity) to a previous one, Revo Mail may return the cached response. Similarity is computed using MinHash — no request data ever leaves Revo Mail's servers.
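To show how a MinHash similarity estimate works in general, here is a minimal sketch. The 3-word shingles, the 128 hash functions, and the helper names are all assumptions chosen for illustration; the documentation does not say how Revo Mail tokenizes requests or sizes its signatures.

```python
import hashlib
from typing import List, Set


def shingles(text: str, k: int = 3) -> Set[str]:
    """Return the set of k-word shingles of a lowercased text."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def _seeded_hash(seed: int, item: str) -> int:
    # One 64-bit hash function per seed, derived from BLAKE2b with a salt.
    digest = hashlib.blake2b(item.encode("utf-8"), digest_size=8,
                             salt=seed.to_bytes(8, "little")).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(items: Set[str], num_hashes: int = 128) -> List[int]:
    """For each hash function, keep the minimum hash over the set."""
    return [min(_seeded_hash(seed, item) for item in items)
            for seed in range(num_hashes)]


def estimated_similarity(a: str, b: str, num_hashes: int = 128) -> float:
    """Fraction of matching signature slots, an estimate of Jaccard similarity."""
    sig_a = minhash_signature(shingles(a), num_hashes)
    sig_b = minhash_signature(shingles(b), num_hashes)
    return sum(x == y for x, y in zip(sig_a, sig_b)) / num_hashes
```

Under these assumptions, a request would count as a semantic match when `estimated_similarity(new, cached) >= 0.95`. Because the estimate is built from local hashing only, no external service is needed to compute it.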
When Caching Is Skipped
Caching does not apply to:
- Streaming requests
- Requests with temperature > 0
- Requests with top_p < 1
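The skip rules above, together with the exact-match key (same model, same messages), can be sketched as follows. The request shape (a dict with `model`, `messages`, `stream`, `temperature`, `top_p`) and the function names are assumptions. How omitted parameters are treated is not stated in this document; the sketch treats a missing `temperature` or `top_p` as cache-eligible, which may not match the service.

```python
import hashlib
import json
from typing import Any, Dict, Optional

CACHE_TTL_SECONDS = 3600  # the documented 1-hour TTL


def is_cacheable(request: Dict[str, Any]) -> bool:
    """Apply the three documented skip rules to a hypothetical request dict."""
    if request.get("stream", False):
        return False  # streaming requests are never cached
    if request.get("temperature", 0) > 0:
        return False  # assumption: a missing temperature counts as 0
    if request.get("top_p", 1) < 1:
        return False  # assumption: a missing top_p counts as 1
    return True


def cache_key(request: Dict[str, Any]) -> Optional[str]:
    """Return a stable key over model + messages, or None if caching is skipped."""
    if not is_cacheable(request):
        return None
    payload = json.dumps(
        {"model": request["model"], "messages": request["messages"]},
        sort_keys=True, separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Serializing with sorted keys keeps the key stable when the same messages arrive with fields in a different order.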
View cache hit statistics on the Usage page in your console.
Fallback Chains
Configure a backup model per API key so that if your primary model is unavailable, traffic automatically routes to a fallback of your choice.
Example: If gpt-5.4 is down, route to claude-sonnet-4.5 instead.
Setup:
- Open Console → Key Management
- Click Fallback Rules on the relevant API key
- Add your model → fallback mapping
Fallbacks are tried only after all primary channels for a model are exhausted. Individual rules can be enabled or disabled at any time without deleting them.
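The ordering described above (all primary channels first, then the fallback, and only if its rule is enabled) can be sketched like this. `FallbackRule`, `route`, and the idea of a channel as a zero-argument callable are hypothetical names for illustration, not part of any Revo Mail SDK.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Channel = Callable[[], str]  # hypothetical: a call that returns a response or raises


@dataclass
class FallbackRule:
    fallback_model: str
    enabled: bool = True  # rules can be switched off without being deleted


def _try_channels(channels: List[Channel]) -> Optional[str]:
    """Return the first successful response, or None if every channel fails."""
    for call in channels:
        try:
            return call()
        except Exception:
            continue  # this channel failed; move on to the next one
    return None


def route(model: str, channels: Dict[str, List[Channel]],
          rules: Dict[str, FallbackRule]) -> str:
    """Exhaust the primary model's channels before consulting the fallback rule."""
    response = _try_channels(channels.get(model, []))
    if response is not None:
        return response
    rule = rules.get(model)
    if rule is not None and rule.enabled:
        response = _try_channels(channels.get(rule.fallback_model, []))
        if response is not None:
            return response
    raise RuntimeError(f"no channel available for {model}")
```

With a rule mapping gpt-5.4 to claude-sonnet-4.5, a call to `route("gpt-5.4", ...)` only reaches the fallback after every gpt-5.4 channel has failed. Setting `enabled=False` leaves the rule in place but makes `route` raise instead.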
Alerts & Notifications
Get notified when something needs attention. Configure alerts in Console → Account Settings → Alerts.
Alert Types
| Alert | Trigger |
|---|---|
| Low Balance | Balance drops below your configured threshold |
| Error Rate Spike | Error rate increases significantly |
| Model Unavailable | A model you use goes down |
| Spend Limit | Spending exceeds your configured amount |
Notifications are sent via email. You can optionally provide a webhook URL for custom integrations (e.g., Slack, PagerDuty).
A 6-hour cooldown applies between repeated alerts of the same type to prevent notification fatigue.
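The cooldown rule can be pictured as a per-type timestamp check. The sketch below is illustrative only: `AlertThrottle` and its method are hypothetical names, and whether the real cooldown is tracked per alert type alone (rather than per type and per key, for example) is an assumption.

```python
import time
from typing import Callable, Dict

COOLDOWN_SECONDS = 6 * 60 * 60  # the documented 6-hour cooldown


class AlertThrottle:
    """Hypothetical helper that suppresses repeats of the same alert type."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._last_sent: Dict[str, float] = {}  # alert type -> last send time

    def should_send(self, alert_type: str) -> bool:
        now = self._clock()
        last = self._last_sent.get(alert_type)
        if last is not None and now - last < COOLDOWN_SECONDS:
            return False  # still cooling down for this alert type
        self._last_sent[alert_type] = now
        return True
```

A webhook consumer could apply the same check on its side to deduplicate alerts before forwarding them to Slack or PagerDuty.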