AI Checker Hub

Is OpenAI Down?


Current state: Operational.

Is OpenAI Down is a rapid-diagnosis page that combines live checks, endpoint signals, and incident correlation so teams can make faster production decisions under uncertainty.

How To Use This "Is OpenAI Down" Page

This page is optimized for urgent incident triage. Start with the top status indicator to classify the event quickly as operational, degraded, or outage-like. Then move to the live checks panel to identify whether the issue is likely DNS/auth-related, endpoint-specific, or broad enough to trigger fallback routing. This approach helps reduce guesswork and prevents unnecessary rollback actions when the issue is local to configuration or quota limits.

We intentionally separate user-report signals from monitor checks. Community reports are useful directional context, but they are not definitive proof of a global provider incident. Always pair user reports with endpoint health data, recent incident history, and your own service telemetry before escalating to outage mode.

Live Checks

If It Is Down, Do This Now

  1. Verify API key validity, project quota, and billing state before escalating.
  2. Differentiate 429 pressure from outage: inspect 5xx and timeout rates alongside throttling.
  3. Apply exponential backoff with jitter and cap retries at 2 to avoid retry storms.
  4. Route traffic to backup region/provider when user-facing latency crosses your SLO.
  5. Cross-check official provider updates and independent monitor trends before rollback decisions.
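Step 3 above (exponential backoff with jitter, capped at 2 retries) can be sketched as follows. The `call` argument stands in for your API request, and the base/cap delays are illustrative assumptions; in production you would also narrow the retry condition to retryable errors such as 429 and 5xx.

```python
import random
import time

# Sketch of step 3: exponential backoff with full jitter, capped at 2 retries
# to avoid retry storms. base_s and cap_s are illustrative assumptions.
def call_with_backoff(call, max_retries: int = 2, base_s: float = 0.5, cap_s: float = 8.0):
    for attempt in range(max_retries + 1):
        try:
            return call()
        except Exception:  # in practice, catch only retryable errors (429/5xx/timeouts)
            if attempt == max_retries:
                raise  # budget exhausted: surface the failure, trigger fallback logic
            # Full jitter: sleep a random amount up to the exponential bound.
            time.sleep(random.uniform(0, min(cap_s, base_s * 2 ** attempt)))
```

Capping retries matters as much as the jitter: three attempts total (one call plus two retries) bounds the extra load your service adds to an already degraded provider.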

Common Symptoms and Meanings

Symptom | Likely Meaning | Action
429 Too Many Requests | Rate-limit or quota pressure | 429 guide
Timeouts increasing | Queue saturation / network instability | Timeout guide
5xx server errors | Provider-side degradation or outage | Enable fallback path + reduce retries
401/403 auth failures | Credential or permission issue | Validate key, org, project, role mapping

Recent Incidents


Decision Criteria For Engineering Teams

If checks show localized degradation with a low hard-failure rate, prioritize latency mitigation: smaller response payloads, tighter timeout budgets, and controlled retries with jitter. If checks show broad component impact or sustained 5xx growth, switch to failover mode with explicit traffic caps and rollback checkpoints.
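That decision rule can be sketched as a function of two signals. The 5% hard-failure cutoff and the p95-vs-SLO comparison are illustrative assumptions; tune both to your own telemetry and SLOs.

```python
# Decision sketch for the criteria above: broad 5xx impact triggers failover,
# latency-only pressure triggers mitigation. Thresholds are assumptions.
def choose_mode(five_xx_rate: float, p95_ms: float, slo_p95_ms: float) -> str:
    if five_xx_rate >= 0.05:
        return "failover"   # broad impact: backup route, traffic caps, rollback checkpoints
    if p95_ms > slo_p95_ms:
        return "mitigate"   # localized degradation: smaller payloads, tighter timeouts
    return "normal"
```

Checking the hard-failure rate before latency keeps the rule conservative: sustained 5xx growth forces failover even when latency still looks acceptable.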

After recovery, keep safeguards in place until error and latency trends remain stable for multiple intervals. Fast rollback to normal operation without verification often causes repeat incidents. This page should support disciplined recovery, not just rapid diagnosis.
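The "stable for multiple intervals" gate above can be expressed as a simple window check. The error-rate bound, latency bound, and three-interval window are assumptions standing in for your own SLO values.

```python
# Recovery gate sketch: keep safeguards active until the last `required`
# monitoring intervals all stay inside error and latency bounds.
# Bounds and window count are illustrative assumptions; align with your SLOs.
def recovery_stable(intervals: list[tuple[float, float]],
                    max_error_rate: float = 0.01,
                    max_p95_ms: float = 1500.0,
                    required: int = 3) -> bool:
    """Each interval is (error_rate, p95_latency_ms), oldest first."""
    recent = intervals[-required:]
    return len(recent) == required and all(
        err <= max_error_rate and p95 <= max_p95_ms for err, p95 in recent
    )
```

Requiring consecutive clean intervals, rather than a single good reading, is what prevents the premature rollback-to-normal that causes repeat incidents.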