AI Checker Hub

Provider Reliability Comparison

Compare provider reliability with transparent definitions across uptime, latency, error rate, cost per 1M tokens, and effective rate limits. Use the filters to narrow by provider, model family, and region.

Comparison Table

Visualizations

Uptime (%)

Latency p95 (ms)

How To Choose A Provider Using This Page

Match your decision to workload type. Interactive products should prioritize p95 latency and timeout risk in user-facing regions. Batch pipelines should prioritize sustained throughput and cost stability under load. Critical production should maintain a two-provider strategy with staged failover and circuit breakers.

What Uptime Means Here

Uptime is calculated as successful checks divided by total checks over the selected window. It is a baseline health signal, not a guarantee that every model family or region performs equally at all times.
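The definition above can be sketched in a few lines. This is an illustrative helper, not the page's actual implementation; the function name and the five-minute check cadence in the example are assumptions.

```python
def uptime_pct(checks: list[bool]) -> float:
    """Uptime as defined above: successful checks / total checks, as a percent."""
    if not checks:
        return 0.0
    return 100.0 * sum(checks) / len(checks)

# Example: one failed check out of 288 five-minute checks in a day.
window = [True] * 287 + [False]
print(round(uptime_pct(window), 2))  # → 99.65
```

Note how a single failed check in a day already pulls daily uptime below 99.9%, which is why short windows are noisy for this metric.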

Why p95 Latency Matters More Than Averages

Averages can hide user pain during degradation. p95 better reflects tail behavior during incident windows and is generally more useful for routing thresholds and SLO protection.
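A small worked example makes the average-versus-p95 gap concrete. The numbers are invented for illustration (6% of requests hitting a 5-second tail), and the nearest-rank percentile here is one common convention, not necessarily the one this page's charts use.

```python
import math

def p95(samples: list[float]) -> float:
    """Nearest-rank 95th percentile of a list of latency samples."""
    s = sorted(samples)
    return s[max(0, math.ceil(0.95 * len(s)) - 1)]

latencies = [200.0] * 94 + [5000.0] * 6   # 6% of requests hit a 5 s tail
print(sum(latencies) / len(latencies))    # 488.0 — the average looks acceptable
print(p95(latencies))                     # 5000.0 — the tail users actually feel
```

An average of 488 ms would pass most dashboards, while the p95 of 5000 ms is what a routing threshold should react to.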

FAQ

Why can providers look similar globally but differ in EU?

Regional routing, edge capacity, and policy differences can create major local variance despite similar global aggregates.

Does operational status mean performance is good?

Not always. A service can be operational while p95 latency and timeout risk are still elevated.

How often is this data updated?

This comparison view is updated from monitor snapshots and reflects rolling windows, not single-request outcomes.

Do different models from the same provider have different reliability?

Yes. Model family behavior can differ significantly, especially during capacity pressure periods.

How should I set failover thresholds using p95?

Use thresholds tied to user impact and trigger failover after consecutive breaches rather than one-time spikes.
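The consecutive-breach idea can be sketched as a small stateful trigger. The class name, the 1500 ms limit, and the three-breach requirement are all illustrative assumptions to tune against your own SLOs, not recommended values.

```python
class FailoverTrigger:
    """Fire failover only after N consecutive p95 breaches, never on one spike."""

    def __init__(self, p95_limit_ms: float, breaches_required: int = 3):
        self.limit = p95_limit_ms
        self.required = breaches_required
        self.streak = 0  # current run of consecutive breaches

    def observe(self, p95_ms: float) -> bool:
        """Feed one monitoring interval's p95; return True when failover should fire."""
        self.streak = self.streak + 1 if p95_ms > self.limit else 0
        return self.streak >= self.required

trigger = FailoverTrigger(p95_limit_ms=1500, breaches_required=3)
readings = [900, 2100, 2400, 2600]  # one healthy interval, then three breaches
print([trigger.observe(r) for r in readings])  # → [False, False, False, True]
```

Resetting the streak on any healthy interval is what keeps a single noisy reading from flipping traffic.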

What retry policy is safest during incidents?

Use bounded retries with jitter, short retry budgets, and a circuit breaker to avoid retry storms.
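A minimal sketch of bounded retries with full jitter and a retry budget follows; the circuit breaker mentioned above would sit in front of this, skipping the call entirely while tripped. Function name, attempt counts, and delays are hypothetical starting points.

```python
import random
import time

def retry_with_jitter(call, max_attempts=3, base_delay=0.2, budget_s=2.0):
    """Bounded retries with full jitter and an overall time budget.

    Capping both attempts and total retry time prevents retry storms
    from amplifying load on a provider that is already degraded.
    """
    start = time.monotonic()
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # attempts exhausted
            delay = random.uniform(0, base_delay * (2 ** attempt))  # full jitter
            if time.monotonic() - start + delay > budget_s:
                raise  # budget exhausted: fail fast instead of piling on
            time.sleep(delay)
```

Full jitter (a uniform draw up to the exponential cap) spreads retries from many clients apart in time, which is the property that matters most during a shared incident.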

How To Read the Metrics Without Misleading Yourself

A single "best provider" rarely exists. Reliability depends on workload type, geography, and failure tolerance. Use the table to eliminate weak options first, then test finalists with your own traffic patterns.

Metric Priorities by Use Case

When to Trust 30-Day vs 90-Day Views

Use 30-day windows for current routing choices and capacity posture. Use 90-day windows to detect recurring structural risk, seasonality, or regional instability that short windows may hide.
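The way a short window can hide older structural risk is easy to show with toy data. The history below is fabricated for illustration: a three-day regional dip about 60 days ago that a 30-day view no longer sees.

```python
def window_uptime(daily_checks: list[tuple[int, int]], days: int) -> float:
    """Uptime over the most recent `days` entries of (successes, total) pairs."""
    recent = daily_checks[-days:]
    ok = sum(s for s, _ in recent)
    total = sum(t for _, t in recent)
    return 100.0 * ok / total if total else 0.0

# 30 clean days, preceded by a 3-day incident, preceded by 57 clean days.
history = [(288, 288)] * 57 + [(200, 288)] * 3 + [(288, 288)] * 30
print(round(window_uptime(history, 30), 2))  # → 100.0 — current posture looks clean
print(round(window_uptime(history, 90), 2))  # → 98.98 — the older dip resurfaces
```

Neither window is "correct"; the 30-day figure answers "where should traffic go now?" while the 90-day figure answers "does this provider have a recurring problem?"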

Practical Routing Threshold Examples

Thresholds should be tied to user impact and not copied from generic templates. The examples below are starting points you can tune with your own SLO targets.

Combine these with your incident archive and provider-specific pages to avoid overreacting to short-lived spikes.

Common Comparison Mistakes to Avoid

The best teams treat this page as a decision support layer. They combine monitor data, business goals, and customer impact to choose routing strategies that are both resilient and cost-aware.

For best results, review this page together with your own app metrics at least weekly, then update thresholds gradually instead of making abrupt policy shifts after one noisy day.