Advanced Features

Auto-Retry & Circuit Breaker

Failed requests are automatically retried once after a 500 ms delay. Streaming requests are never retried.
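The retry rule above can be sketched in a few lines. This is only an illustration of the described behavior, not Revo Mail's implementation: the function name `call_with_single_retry` and its `send`, `streaming`, and `sleep` parameters are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

RETRY_DELAY_SECONDS = 0.5  # 500 ms, as stated in the text


def call_with_single_retry(
    send: Callable[[], T],
    streaming: bool = False,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call send(); on failure, wait 500 ms and try exactly once more.

    Streaming requests are never retried. All names here are hypothetical.
    """
    try:
        return send()
    except Exception:
        if streaming:
            # Streaming requests fail immediately, without a retry.
            raise
        sleep(RETRY_DELAY_SECONDS)
        # Second and final attempt; any error now propagates to the caller.
        return send()
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting.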

The circuit breaker tracks channel health in real time. When a channel records 5 failures within 60 seconds, it is taken offline for 30 seconds, and no new requests are routed to it until the window resets. A request is retried at most 3 times in total across all channels.
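To make the thresholds concrete, here is a minimal sketch of a per-channel breaker using the numbers above (5 failures, 60-second window, 30 seconds offline). The class `ChannelBreaker` and its methods are hypothetical, and the sketch assumes the offline period is a fixed 30-second cooldown that starts at the fifth failure; the service's actual bookkeeping may differ.

```python
from collections import deque
from typing import Deque


class ChannelBreaker:
    """Hypothetical sliding-window circuit breaker for one channel."""

    def __init__(self, threshold: int = 5, window: float = 60.0,
                 cooldown: float = 30.0) -> None:
        self.threshold = threshold
        self.window = window
        self.cooldown = cooldown
        self._failures: Deque[float] = deque()  # timestamps of recent failures
        self._offline_until = 0.0

    def record_failure(self, now: float) -> None:
        self._failures.append(now)
        # Forget failures that have aged out of the 60-second window.
        while self._failures and now - self._failures[0] >= self.window:
            self._failures.popleft()
        if len(self._failures) >= self.threshold:
            # Trip the breaker (assumed: fixed cooldown, counter reset).
            self._offline_until = now + self.cooldown
            self._failures.clear()

    def is_available(self, now: float) -> bool:
        return now >= self._offline_until
```

Timestamps are passed in explicitly (in seconds) so the behavior can be checked deterministically.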

This is fully automatic. There is nothing to configure or enable; every API key benefits from it out of the box.

Request Caching

Two layers of caching operate in parallel to reduce cost and latency:

Exact-Match Cache

Send the same request (same model, same messages) and Revo Mail returns the cached response instantly. Cached responses carry a 1-hour TTL and include the header:

X-Revo Mail-Cache: HIT
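One way to picture exact-match caching is a lookup keyed on a hash of the model and messages, with entries expiring after one hour. The sketch below is an assumption about how such a cache could work, not a description of Revo Mail's internals: `cache_key`, `ExactMatchCache`, and the choice of SHA-256 over canonical JSON are all hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, List, Optional, Tuple

CACHE_TTL_SECONDS = 3600  # 1-hour TTL, as stated in the text


def cache_key(model: str, messages: List[Dict[str, Any]]) -> str:
    # Canonical JSON (sorted keys, fixed separators) so that equal
    # requests always hash to the same key. Assumed scheme.
    payload = json.dumps({"model": model, "messages": messages},
                         sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ExactMatchCache:
    """Hypothetical in-memory cache with a fixed TTL."""

    def __init__(self, ttl: float = CACHE_TTL_SECONDS) -> None:
        self.ttl = ttl
        self._store: Dict[str, Tuple[float, Any]] = {}

    def put(self, model: str, messages: List[Dict[str, Any]],
            response: Any, now: float) -> None:
        self._store[cache_key(model, messages)] = (now, response)

    def get(self, model: str, messages: List[Dict[str, Any]],
            now: float) -> Optional[Any]:
        entry = self._store.get(cache_key(model, messages))
        if entry is None:
            return None
        stored_at, response = entry
        if now - stored_at >= self.ttl:
            return None  # expired: treat as a miss
        return response
```

Because the key depends only on the model and messages, any difference in either produces a cache miss.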

Semantic Cache

If your request is highly similar (95% or greater similarity) to a previous one, Revo Mail may return the cached response. Similarity is computed using MinHash, and no request data ever leaves Revo Mail's servers.
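For readers unfamiliar with MinHash, the sketch below shows the general technique: each text is reduced to a fixed-length signature, and the fraction of matching signature slots estimates the Jaccard similarity of the two texts' shingle sets. The shingle size, the 128 hash functions, and every function name are assumptions made for illustration; the text does not say how Revo Mail tokenizes or parameterizes its comparison.

```python
import hashlib
from typing import List, Set

NUM_HASHES = 128             # assumed signature length
SIMILARITY_THRESHOLD = 0.95  # 95% threshold from the text


def shingles(text: str, k: int = 3) -> Set[str]:
    """Split text into overlapping k-word shingles (assumed tokenization)."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def _seeded_hash(seed: int, item: str) -> int:
    # BLAKE2b with a per-seed salt acts as a family of hash functions.
    digest = hashlib.blake2b(item.encode("utf-8"), digest_size=8,
                             salt=seed.to_bytes(8, "little")).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(text: str) -> List[int]:
    items = shingles(text)
    # For each hash function, keep the minimum hash over all shingles.
    return [min(_seeded_hash(seed, s) for s in items)
            for seed in range(NUM_HASHES)]


def estimated_similarity(sig_a: List[int], sig_b: List[int]) -> float:
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


def is_semantic_hit(text_a: str, text_b: str) -> bool:
    sim = estimated_similarity(minhash_signature(text_a),
                               minhash_signature(text_b))
    return sim >= SIMILARITY_THRESHOLD
```

The estimate is probabilistic, which fits the wording that a cached response "may" be returned for similar requests.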

When Caching Is Skipped

Caching does not apply to:

Streaming requests
Requests with temperature > 0
Requests with top_p < 1
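The three conditions above can be expressed as a simple client-side check. The field names `stream`, `temperature`, and `top_p` are assumed to follow the common chat-completions request shape, and `is_cacheable` is a hypothetical helper. The sketch also assumes that an omitted field does not disqualify a request, which the text does not state.

```python
from typing import Any, Dict


def is_cacheable(request: Dict[str, Any]) -> bool:
    """Return True if none of the documented skip conditions apply.

    Assumption: a field that is absent is treated as not disqualifying.
    """
    if request.get("stream", False):
        return False  # streaming requests are not cached
    if request.get("temperature", 0) > 0:
        return False  # temperature > 0
    if request.get("top_p", 1) < 1:
        return False  # top_p < 1
    return True
```

Under these assumptions, sending `temperature` as 0 and `top_p` as 1 on a non-streaming request keeps it eligible for caching.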

View cache hit statistics on the Usage page in your console.

Fallback Chains

Configure a backup model per API key so that if your primary model is unavailable, traffic automatically routes to a fallback of your choice.

Example: If gpt-5.4 is down, route to claude-sonnet-4.5 instead.

Setup:

Open Console → Key Management
Click Fallback Rules on the relevant API key
Add your model → fallback mapping

Fallbacks are tried only after all primary channels for a model are exhausted. Individual rules can be enabled or disabled at any time without deleting them.
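The ordering described above (all primary channels first, then the fallback, and only if its rule is enabled) can be sketched as follows. `FallbackRule`, `route`, and the `try_channel` callback are hypothetical names, and the channel lists are made-up data for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class FallbackRule:
    fallback_model: str
    enabled: bool = True  # rules can be disabled without deleting them


def route(
    model: str,
    channels: Dict[str, List[str]],
    rules: Dict[str, FallbackRule],
    try_channel: Callable[[str, str], bool],
) -> Optional[str]:
    """Return the model that served the request, or None if all failed."""
    # 1. Exhaust every primary channel for the requested model.
    for channel in channels.get(model, []):
        if try_channel(model, channel):
            return model
    # 2. Only then consult the fallback rule, if one exists and is enabled.
    rule = rules.get(model)
    if rule is None or not rule.enabled:
        return None
    for channel in channels.get(rule.fallback_model, []):
        if try_channel(rule.fallback_model, channel):
            return rule.fallback_model
    return None
```

With a rule mapping gpt-5.4 to claude-sonnet-4.5, a request for gpt-5.4 reaches the fallback model only after every gpt-5.4 channel has failed.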

Alerts & Notifications

Get notified when something needs attention. Configure alerts in Console → Account Settings → Alerts.

Alert Types

Alert	Trigger
Low Balance	Balance drops below your configured threshold
Error Rate Spike	Error rate increases significantly
Model Unavailable	A model you use goes down
Spend Limit	Spending exceeds your configured amount

Notifications are sent via email. You can optionally provide a webhook URL for custom integrations (e.g., Slack, PagerDuty).

A 6-hour cooldown applies between repeated alerts of the same type to prevent notification fatigue.
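The cooldown amounts to remembering when each alert type was last sent. The sketch below illustrates that logic; `AlertThrottle` and the alert type strings are hypothetical, and it assumes the cooldown is tracked per alert type and measured from the last notification actually sent.

```python
from typing import Dict

COOLDOWN_SECONDS = 6 * 60 * 60  # 6 hours, as stated in the text


class AlertThrottle:
    """Hypothetical per-type cooldown tracker."""

    def __init__(self, cooldown: float = COOLDOWN_SECONDS) -> None:
        self.cooldown = cooldown
        self._last_sent: Dict[str, float] = {}

    def should_send(self, alert_type: str, now: float) -> bool:
        last = self._last_sent.get(alert_type)
        if last is not None and now - last < self.cooldown:
            return False  # still inside the cooldown for this type
        self._last_sent[alert_type] = now
        return True
```

Different alert types do not suppress each other in this sketch: a Low Balance alert does not delay a Spend Limit alert.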
