Comprehensive analysis of documented AI incidents by category, severity, system type, and time. This data helps organizations understand the landscape of AI risks and prioritize governance investments.
12 Documented Incidents
400+ Est. Global Incidents
4 Critical Severity
8 Categories
75% High/Critical Rate
This database contains a curated subset of publicly documented AI incidents. The estimated global count reflects incidents reported across academic research, industry disclosures, and news sources.
Severity distribution of the 12 documented incidents: Critical 4 (33%), High 5 (42%), other severities 3 (25%).
75% of documented incidents are classified as High or Critical severity, indicating significant potential for financial loss, data exposure, or operational disruption.
AI incident reporting is increasing as more organizations deploy AI systems with real-world capabilities. This database captures a growing but non-exhaustive record of publicly known incidents.
Incidents span diverse AI system types, from customer-facing chatbots to backend infrastructure automation. No category of AI application is immune to operational risks.
Estimated AI incidents reported worldwide:
2025 (YTD through March): 120+
2024 (full year): 300+
2023 (full year): 180+
2022 (full year): 90+
Note: Global estimates are compiled from publicly available sources including the Stanford AI Incident Database, OECD AI incident monitoring reports, security disclosures, academic research, and news investigations. The Runplane AI Incident Database includes only verified incidents with sufficient public documentation.
Actual incident counts may be significantly higher due to unreported or confidential events.
Widely documented incidents that shaped AI governance awareness
These incidents are among the most widely reported AI failures and illustrate common risks when AI systems interact with real-world processes. Each has been documented by multiple credible sources.
A New York attorney used ChatGPT to research legal cases. The AI fabricated six non-existent court cases with fake citations, leading to sanctions against the lawyer.
Air Canada's chatbot promised a bereavement fare discount that didn't exist. A tribunal ruled the airline was liable for its AI's false statements, setting legal precedent.
Microsoft's Twitter chatbot Tay was manipulated by users into posting offensive content within 16 hours of launch, forcing Microsoft to shut it down.
Amazon's AI recruiting tool systematically discriminated against women because it was trained on historical hiring data reflecting existing biases.
During Google Bard's public demo, the AI gave an incorrect answer about the James Webb Space Telescope, and Alphabet's market value fell by roughly $100 billion in the aftermath.
Analysis of the AI Incident Database reveals clear patterns in how AI systems fail. The most frequently documented incident types are autonomous action failures and data exposure incidents, together accounting for a significant portion of all documented cases.
Autonomous action failures occur when AI agents exceed their intended scope, misinterpret instructions, or enter feedback loops that amplify errors. These incidents are particularly dangerous because they often occur at machine speed, accumulating significant damage before human operators can intervene.
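One way to limit machine-speed feedback loops is a simple action budget: a circuit breaker that caps how many actions an agent may execute in a rolling time window. The sketch below is a hypothetical illustration (the class and its parameters are not from any specific product), assuming each agent action passes through a single `allow` check:

```python
import time

class ActionBudget:
    """Hypothetical circuit breaker: caps how many actions an AI agent
    may execute within a rolling time window, halting runaway loops."""

    def __init__(self, max_actions, window_seconds):
        self.max_actions = max_actions
        self.window_seconds = window_seconds
        self.timestamps = []  # monotonic times of recent allowed actions

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Drop actions that have aged out of the window.
        self.timestamps = [t for t in self.timestamps
                           if now - t < self.window_seconds]
        if len(self.timestamps) >= self.max_actions:
            return False  # breaker trips: pause the agent for human review
        self.timestamps.append(now)
        return True

# Five actions attempted one second apart against a 3-per-minute budget:
budget = ActionBudget(max_actions=3, window_seconds=60)
results = [budget.allow(now=float(i)) for i in range(5)]
# the first three are allowed; the loop is cut off after that
```

The point is not the specific limit but that the cap is enforced before each action executes, so a feedback loop stalls after a bounded amount of damage rather than running until an operator notices.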
Data exposure incidents highlight the challenges of managing AI access to sensitive information. Whether through context window contamination, improper session isolation, or recipient autocompletion errors, AI systems frequently expose data to unauthorized parties despite their operators' intended safeguards.
The proportion of critical and high-severity incidents (75% of documented cases) reflects the increasing stakes of AI deployment. As AI systems gain authority over financial transactions, infrastructure operations, and sensitive data, the impact of failures grows correspondingly severe.
Critical incidents in the database include trading system failures with million-dollar exposure, production database deletions, and regulatory-triggering data breaches. These are not hypothetical risks but documented events that have already occurred across industries.
The trend toward more autonomous AI systems suggests that without proper governance, the frequency and severity of critical incidents will continue to increase. Organizations deploying AI agents need proactive controls, not just reactive monitoring.
The database documents multiple categories of automation risk, from runaway cloud resource provisioning to AI security systems that block legitimate traffic. A common thread across these incidents is the absence of runtime governance—the AI makes decisions and executes actions without any checkpoint to verify appropriateness.
Infrastructure-related incidents are particularly costly, with documented cases involving hundreds of thousands of dollars in unexpected cloud charges, hours of production downtime, and emergency recovery operations. The speed of automated systems means that damage accumulates faster than human-paced detection and response can address.
Financial automation risks span trading systems, payment processing, customer service, and resource provisioning. In each category, AI systems have exceeded limits, approved fraudulent transactions, or committed resources beyond authorized thresholds. The common solution is a governance layer that evaluates actions before execution.
Across all incident categories, a consistent pattern emerges: AI systems with real-world capabilities were deployed without adequate runtime controls. Observability and logging captured what happened after the fact, but nothing prevented the harmful actions from executing in the first place.
Runtime governance addresses this gap by inserting a decision point between AI decision-making and action execution. Rather than asking "what happened?" after an incident, runtime governance asks "should this action be allowed?" before any damage occurs. The statistics in this database make the case that such governance is not optional—it is essential for responsible AI deployment.
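The "should this action be allowed?" checkpoint can be made concrete with a small policy evaluator. This is a minimal sketch, not Runplane's actual implementation; the action fields, policy keys, and three-way allow/escalate/deny outcome are illustrative assumptions:

```python
def evaluate(action, policy):
    """Hypothetical runtime-governance checkpoint: decide whether a
    proposed AI action may execute, before any side effects occur."""
    if action["type"] in policy["blocked_types"]:
        return "deny"  # never executed, regardless of amount
    if action.get("amount", 0) > policy["auto_approve_limit"]:
        return "escalate"  # route to a human approver
    return "allow"

policy = {
    "blocked_types": {"delete_database"},
    "auto_approve_limit": 500,  # e.g. dollars
}

small_refund = evaluate({"type": "refund", "amount": 40}, policy)
large_refund = evaluate({"type": "refund", "amount": 5000}, policy)
destructive = evaluate({"type": "delete_database"}, policy)
```

The design choice worth noting is that the evaluator sits between decision and execution: a denied or escalated action never runs, which is exactly what after-the-fact logging cannot provide.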
How we collect and verify AI incidents
The AI Incident Database compiles incidents from multiple authoritative sources to provide broad coverage of AI failures and risks:
Academic Research: peer-reviewed papers on AI safety, security, and failure modes
Security Reports: CVEs, security advisories, and vulnerability disclosures
Public AI Failure Reports: published case studies and post-mortems from organizations
News Investigations: investigative journalism covering AI failures and impacts
Industry Disclosures: official statements, regulatory filings, and incident reports from AI companies and users
Each incident in this database has been verified through at least one credible public source. We document the AI system type, incident category, estimated severity, and known impacts. Incidents are classified by severity based on financial impact, data exposure, operational disruption, and safety implications. This database represents a curated subset of global AI incidents and is intended as a research resource, not a comprehensive registry.
Contribute: If you have documentation of an AI incident not yet in our database, you can submit it for review.
Frequently asked questions about AI failures and incidents
An AI incident is an event where an artificial intelligence system behaves unexpectedly, causes harm, exposes sensitive data, or fails to perform its intended function. This includes AI hallucinations, autonomous action failures, security vulnerabilities, and data breaches caused by AI systems.
AI failures are caused by multiple factors including: training data quality issues, lack of runtime governance, insufficient testing, prompt injection attacks, model hallucinations, misconfigured permissions, and the absence of human oversight for high-risk actions. Many incidents occur when AI systems are given real-world capabilities without appropriate safeguards.
One of the most widely publicized AI failures is the 2023 case where a New York lawyer used ChatGPT to research legal cases, and the AI fabricated six non-existent court cases with fake citations. The lawyer was sanctioned for submitting the AI-generated fake cases to the court. Other notable failures include Microsoft's Tay chatbot (2016) and Amazon's biased AI hiring tool (2018).
AI hallucinations—where models generate false or fabricated information presented as fact—are extremely common. Studies suggest large language models hallucinate in 15-30% of responses depending on the task and domain. Hallucination rates are particularly high for factual questions about specific people, dates, legal cases, and technical specifications.
Organizations can prevent AI incidents through: implementing runtime governance to evaluate AI actions before execution, establishing approval workflows for high-risk operations, limiting AI permissions to the minimum necessary, conducting adversarial testing, monitoring AI outputs for anomalies, and maintaining human oversight for critical systems. Tools like Runplane provide runtime policy enforcement to block dangerous AI actions automatically.
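The approval-workflow control mentioned above can be sketched as a pending-action queue: high-risk operations wait for explicit human sign-off instead of executing immediately. This is an illustrative sketch only; the class, field names, and ticket scheme are assumptions, not a real product API:

```python
from dataclasses import dataclass, field

@dataclass
class ApprovalQueue:
    """Hypothetical approval workflow: high-risk AI actions are parked
    here and only released for execution after human review."""
    pending: dict = field(default_factory=dict)
    _next_id: int = 1

    def submit(self, action):
        ticket = self._next_id
        self._next_id += 1
        self.pending[ticket] = action
        return ticket  # the agent blocks or polls on this ticket

    def approve(self, ticket):
        return self.pending.pop(ticket)  # released for execution

    def reject(self, ticket):
        self.pending.pop(ticket)  # dropped, never executed

queue = ApprovalQueue()
ticket = queue.submit({"type": "wire_transfer", "amount": 25000})
released = queue.approve(ticket)  # a human reviewed and released it
```

In practice the queue would notify an approver and enforce a timeout, but the essential property is the same: the action cannot execute until a person releases it.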