Did an AI Coding Bot Really Take Down AWS? Unpacking the Kiro Controversy

By Jitendra Chaudhary

No, an AI coding bot did not “take down” Amazon Web Services (AWS) in a widespread manner. Reports highlight at least two minor outages linked to Amazon’s AI coding tools, including a 13-hour disruption in December 2025 that involved Kiro and was limited to one region in China. Amazon attributes that disruption to user error rather than AI autonomy.

The Incidents: What Really Happened?

In mid-December 2025, AWS engineers deployed Kiro, an agentic AI coding tool that can take autonomous actions, to resolve an issue. The tool decided to “delete and recreate the environment,” causing a 13-hour outage that primarily affected AWS Cost Explorer in one of two mainland China regions. Compute, storage, database, and AI services remained operational.

In a second incident, engineers allowed Amazon Q Developer to make changes without human intervention. A senior AWS employee described both incidents as “small but entirely foreseeable,” and the second had no impact on customer-facing services.

Amazon’s Response: User Error, Not AI Fault

“This brief event was the result of user error… specifically misconfigured access controls… not AI… It was a coincidence that AI tools were involved.”

Amazon emphasizes that Kiro requires user authorization by default and inherits the engineer’s permissions. The engineer involved in the December incident had “broader permissions than expected,” so the change proceeded without safeguards such as peer review, which AWS implemented afterward. The company denies that AI tools produce higher error rates than manual actions.
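
To make the two safeguards in that paragraph concrete, here is a minimal Python sketch of how inherited permissions and peer review could gate an agent's proposed action. It is an illustration only, not Kiro's or AWS's actual mechanism: the action names, the `ProposedAction` type, and the `is_allowed` function are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical action names; not taken from Kiro or any AWS API.
DESTRUCTIVE_ACTIONS = {"delete_environment", "recreate_environment"}


@dataclass(frozen=True)
class ProposedAction:
    name: str                      # what the agent wants to do
    requested_by: str              # engineer whose permissions the agent inherits
    approvals: frozenset = field(default_factory=frozenset)  # users who approved


def is_allowed(action: ProposedAction, engineer_permissions: set) -> bool:
    """Return True only if the action passes both checks."""
    # Check 1: the agent gets the engineer's permissions and nothing more.
    if action.name not in engineer_permissions:
        return False
    # Check 2: destructive actions need an approver other than the requester.
    if action.name in DESTRUCTIVE_ACTIONS:
        reviewers = action.approvals - {action.requested_by}
        return len(reviewers) >= 1
    return True


perms = {"read_logs", "delete_environment"}
unreviewed = ProposedAction("delete_environment", "alice")
reviewed = ProposedAction("delete_environment", "alice", frozenset({"bob"}))
print(is_allowed(unreviewed, perms))  # False
print(is_allowed(reviewed, perms))    # True
```

The sketch also shows why overly broad permissions matter: the first check passes whenever the engineer holds the permission, so the review step is the only thing standing between the agent and a destructive change.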

Employee Pushback and Broader Context

Multiple AWS staff told the Financial Times that the “warp-speed approach to AI development” risks “staggering damage.” Amazon mandates that 80% of its engineers use Kiro weekly, and it also sells the tool commercially.

These events differ from the major October 2025 outage, which lasted 15 hours, affected services such as Alexa and Snapchat, and was blamed on bugs in automation software.

The trend extends across big tech: Microsoft reports that 30% of its code is AI-written, and Nvidia is pushing AI tools aggressively.

Key Takeaways

Scope was limited: No global downtime; incidents stayed regional or internal.
Root cause: Human oversight failures, like unchecked permissions, not rogue AI.
Implications: Highlights risks of agentic AI in production; AWS added safeguards post-incident.
Industry shift: AI coding tools are proliferating, potentially reducing entry-level jobs by 13%.

This saga underscores the tension between rapid AI adoption and reliability in critical infrastructure. While sensational headlines amplify fears, the evidence points to manageable human factors.

