
AI Agents Uncover $4.6 Million in Smart Contract Vulnerabilities
AI models have autonomously identified smart contract exploits worth $4.6 million in simulated tests, raising new concerns for blockchain security.
Research by AI firm Anthropic and the Machine Learning Alignment & Theory Scholars (MATS) program found that AI agents successfully identified smart contract exploits totaling $4.6 million.
The research was published by Anthropic’s red team, which simulates malicious activity to uncover potential vulnerabilities. It shows that current commercial AI models can exploit smart contracts effectively.
Key Findings
- AI Models Tested: Anthropic’s Claude Opus 4.5 and Claude Sonnet 4.5, along with OpenAI’s GPT-5, collectively produced exploits worth $4.6 million.
- Exploitation Process: In tests on 2,849 contracts with no known vulnerabilities, the AI agents discovered previously unknown (zero-day) vulnerabilities and produced exploits worth $3,694, slightly more than the $3,476 spent on API costs.
- Implications: According to the team, this is a proof of concept showing that profitable autonomous exploitation is feasible, underscoring the urgent need for AI-driven defenses.
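The margin in the zero-day test can be checked with simple arithmetic. The short Python sketch below uses only the two dollar figures reported above; the variable names are illustrative and do not come from the research.

```python
# Figures as reported in the article, in US dollars.
exploit_revenue = 3_694  # simulated value of the zero-day exploits
api_cost = 3_476         # API spend reported for the test

# Net simulated profit and return relative to the API spend.
net_profit = exploit_revenue - api_cost
return_on_cost = net_profit / api_cost

print(f"Net simulated profit: ${net_profit:,}")
print(f"Return on API spend: {return_on_cost:.1%}")
```

On these numbers the simulated run clears its costs by $218, a return of roughly 6%. That is a thin margin, but it supports the team's point that the approach can already pay for itself.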
Chart: simulated exploit revenue generated by AI models. Source: Anthropic
Benchmark Development
Researchers also created the Smart Contracts Exploitation benchmark (SCONE-bench), which covers 405 contracts that were exploited between 2020 and 2025. Across 10 models, the agents generated exploits for 207 of those contracts, representing $550.1 million in simulated losses. The researchers also reported that the number of tokens needed to produce a successful exploit has fallen significantly, making attacks faster and cheaper to carry out.
Rapid Improvement of AI Capabilities
The report stated, “In just one year, AI agents have evolved from exploiting 2% of vulnerabilities to 55.88%, indicating a surge in exploit revenue from $5,000 to $4.6 million.” The researchers also noted that scanning a contract for vulnerabilities cost an average of $1.22. As costs fall and capabilities improve, they suggest, the window for finding and fixing vulnerabilities before they are exploited will keep shrinking.
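The reported figures can be cross-checked against each other. The Python sketch below assumes that the $1.22 average applies to the same 2,849-contract test described under Key Findings; the article does not state this explicitly, so treat the comparison as a plausibility check rather than a confirmed breakdown. Variable names are illustrative.

```python
# Figures as reported in the article.
contracts_scanned = 2_849     # contracts in the zero-day test
avg_cost_per_scan = 1.22      # average cost per scan, in US dollars
reported_api_cost = 3_476     # total API cost reported for that test

# Assumption: the per-scan average refers to the same test run.
estimated_total = contracts_scanned * avg_cost_per_scan
print(f"Estimated total scan cost: ${estimated_total:,.2f}")
print(f"Reported API cost:         ${reported_api_cost:,}")

# Growth in simulated exploit revenue quoted in the report.
revenue_before = 5_000
revenue_after = 4_600_000
growth_factor = revenue_after / revenue_before
print(f"Revenue growth: {growth_factor:,.0f}x")
```

Under that assumption, the estimate comes to about $3,475.78, which rounds to the reported $3,476. The revenue figures quoted in the report work out to a 920-fold increase in one year.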
