SO 2025 · All Respondents: 78.5% use AI tools (any mode)
SO 2025 · Agent Use: 30.9% use AI agents at any frequency
SO 2025 · Resistance: 37.9% don't plan to use agents

Agent Adoption Breakdown

Stack Overflow 2025 — frequency of agent use specifically

Autonomy amplifies risk: test failures, security mistakes, and policy violations. "Verification debt" shifts effort from writing code to reviewing and debugging it.

Emerging Signal

The Long-Horizon Step Change (Early 2026)

METR (Model Evaluation & Threat Research) tracks the "50% time horizon" — the task duration at which frontier AI agents succeed half the time. This metric has been doubling every ~89 days since 2024, creating what some analysts call "a new Moore's Law for AI agents."

Model                  Date        50% Time Horizon
GPT-4 / Claude 3 Opus  Early 2024  ~4 min
Claude 3.5 Sonnet      Jun 2024    ~11 min
Claude 3.7 Sonnet      Feb 2025    ~60 min
Claude Opus 4.5        Nov 2025    ~293 min (~5 hrs)
Claude Opus 4.6        Feb 2026    ~719 min (~12 hrs)
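The doubling law behind the table is a simple exponential. A minimal sketch, assuming pure exponential growth with a fixed doubling time (the ~89-day figure comes from the text above; the function names and the elapsed-day estimates are illustrative, not METR's actual methodology):

```python
import math

def projected_horizon(start_minutes, elapsed_days, doubling_days=89):
    """Project the 50% time horizon after elapsed_days of steady doubling."""
    return start_minutes * 2 ** (elapsed_days / doubling_days)

def implied_doubling_days(h0_minutes, h1_minutes, elapsed_days):
    """Back out the doubling time implied by two observed horizons."""
    return elapsed_days * math.log(2) / math.log(h1_minutes / h0_minutes)

# One 89-day period exactly doubles the horizon:
print(projected_horizon(4, 89))  # 8.0

# Fitting the table's endpoints (~4 min in early 2024 to ~719 min in Feb 2026,
# taken here as roughly 730 days apart) gives a doubling time in the same
# ballpark as the quoted ~89 days:
print(round(implied_doubling_days(4, 719, 730)))  # 97
```

The two endpoints alone put the implied doubling time within about 10% of the quoted figure; intermediate rows in the table scatter around the same trend.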

This enables a qualitative shift: agents that can work for hours, not minutes. OpenAI demonstrated a single Codex session running 25 hours uninterrupted, generating 30K lines of code. Cursor shipped cloud agents running on isolated VMs with up to 8 parallel instances. The architecture shifted from synchronous prompt-response to asynchronous execution loops with verification.
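The asynchronous loop-with-verification pattern can be sketched as follows. This is an illustrative toy, not any vendor's actual API: `propose_patch`, `run_checks`, and `agent_loop` are hypothetical stand-ins, and the verification gate is faked so the example runs deterministically.

```python
import asyncio

async def propose_patch(task: str, attempt: int) -> str:
    # Stand-in for a long-running model call that drafts a change.
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"{task}: draft v{attempt}"

async def run_checks(patch: str) -> bool:
    # Stand-in for the verification gate: tests, linters, policy checks.
    await asyncio.sleep(0)
    return patch.endswith("v2")  # pretend the second draft always passes

async def agent_loop(task: str, max_attempts: int = 3) -> dict:
    """Propose -> verify -> retry until checks pass or the budget runs out."""
    for attempt in range(1, max_attempts + 1):
        patch = await propose_patch(task, attempt)
        if await run_checks(patch):
            return {"task": task, "attempts": attempt, "verified": True}
    return {"task": task, "attempts": max_attempts, "verified": False}

async def main():
    # Several isolated agents running in parallel, in the spirit of the
    # multi-instance cloud agents mentioned above.
    tasks = ["fix flaky test", "bump dependency", "add retry logic"]
    return await asyncio.gather(*(agent_loop(t) for t in tasks))

results = asyncio.run(main())
print([r["attempts"] for r in results])  # [2, 2, 2]
```

The key structural difference from synchronous prompt-response is that the caller awaits a whole propose/verify/retry loop per task rather than a single completion, and many such loops run concurrently.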

The caveat: Anthropic's own research found a "significant deployment overhang" — models can sustain more autonomy than users currently exercise. The median real-world Claude Code turn remains ~45 seconds, even as the 99.9th-percentile turn nearly doubled to 45+ minutes.

Sources: METR Time Horizons (metr.org), Anthropic research blog, OpenAI Codex blog