Mar 2023
Product
ChatGPT plugins + Code Interpreter announced
ChatGPT gains the ability to execute code in a sandbox — the first hint of agentic capability in a consumer product.
Code Interpreter could write and run Python, analyze data, create visualizations, and work with uploaded files. It rolled out broadly in July 2023 to Plus users, and demonstrated that LLMs could move beyond suggestion into execution.
Mar 2023
Product
Codex models deprecated in OpenAI API
OpenAI retires the original Codex models — GPT-3.5/4 prove capable enough for code without a specialized model.
The deprecation signaled that general-purpose LLMs had subsumed code-specialized models. The name itself complicates longitudinal analysis: "Codex" would later be reused as a brand name for OpenAI's 2025 agentic product, referring to a materially different system.
May 2023
Survey
Stack Overflow: 43.8% AI tool adoption
First major benchmark — 89,184 respondents; 25.5% plan to use soon. The baseline for all future comparisons.
This was the first large-scale survey to ask specifically about AI in the development process (narrower than "ever used"), establishing a measurable adoption rate. Combined use-or-plan rate was ~69%.
Oct 2023
Survey
SWE-bench published (Princeton)
The benchmark that would define the agentic coding era — 2,294 real GitHub issues from 12 Python repos.
SWE-bench (arXiv:2310.06770, Jimenez et al.) evaluates whether AI can resolve real-world software issues end-to-end. Accepted as an oral at ICLR 2024, it became the de facto standard for measuring agentic coding capability. By mid-2024, top agents scored ~20% on the full benchmark.
Dec 2023
Product
Sourcegraph Cody GA
AI coding assistant leveraging Sourcegraph's deep code search for codebase-aware context launches generally.
Cody differentiated itself by using Sourcegraph's code intelligence graph for context. Cody Pro launched at $9/month. Sourcegraph later pivoted: Cody Free and Pro were sunset in July 2025 as the company spun out "Amp" as a separate agentic coding company.
Mar 2024
Product
Devin announced — "first AI software engineer"
Cognition Labs demos an autonomous coding agent that can plan, execute, and debug multi-step tasks. Goes viral — and polarizes the industry.
Devin claimed SOTA on SWE-Bench at launch. Backed by Founders Fund. Industry reaction was split: enormous viral excitement at the demo, followed by skepticism when independent reviews showed underwhelming real-world performance. The Register later ran "'First AI software engineer' is bad at its job." GA came in December 2024 at $500/month; Devin 2.0 launched April 2025 at $20/month.
Apr 2024
Product
Google Gemini Code Assist launches
Google rebrands Duet AI for Developers → Gemini Code Assist at Cloud Next 2024. Powered by Gemini 1.5 Pro.
Part of a broader February 2024 consolidation retiring all "Duet AI" branding in favor of "Gemini." Supports VS Code, JetBrains IDEs, and Cloud Shell Editor. An enterprise version with codebase-aware customization followed. Gemini later showed 47.4% usage among agent users in the Stack Overflow 2025 survey.
Apr 2024
Funding
Augment Code emerges from stealth — $252M raised
Enterprise-focused AI coding startup reveals itself with a $977M valuation after two years of stealth development.
Founded in 2022, Augment focused specifically on AI coding for large enterprise codebases. Investors: Sutter Hill Ventures, Index Ventures, Lightspeed, Meritech Capital. Reached ~$20M revenue by October 2025.
Apr 2024
Product
Amazon Q Developer GA
AWS rebrands CodeWhisperer → Amazon Q Developer, expanding from code completion to a full-lifecycle AI assistant.
All CodeWhisperer features were folded into Q Developer, which added broader capabilities: debugging, optimization, migration assistance, and infrastructure management. AWS claimed it could make employees "more than 80% more productive."
Apr 2024
Product
GitHub Copilot Workspace technical preview
GitHub unveils an issue-to-PR workflow: AI that brainstorms, plans, implements, and validates code changes from a GitHub issue.
First teased at GitHub Universe 2023, Workspace entered technical preview on April 29, 2024. It was billed as a "Copilot-native developer environment" in which the AI proposes a full implementation plan and code diff from an issue description. Access expanded to all paying Copilot customers in December 2024.
Jun 2024
Survey
Stack Overflow 2024: adoption jumps to 61.8%
65,437 respondents — AI tool use rose 18 points year-over-year. "No plans" drops from 29.4% to 24.4%.
Same survey framing as 2023, making this year-over-year comparison reliable. The combined "use or plan" rate reached ~76%. The holdout population shrank from ~30% to ~24% in just one year.
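The year-over-year arithmetic behind these figures can be reproduced with a short sketch. It uses only the percentages quoted in the 2023 and 2024 Stack Overflow entries; the dictionary layout and helper names are illustrative, not part of any survey tooling.

```python
# Figures as quoted in the timeline entries (percent of respondents).
survey = {
    2023: {"using": 43.8, "planning": 25.5, "no_plans": 29.4},
    2024: {"using": 61.8, "no_plans": 24.4},
}

def point_change(old: float, new: float) -> float:
    """Absolute change in percentage points."""
    return round(new - old, 1)

def relative_change(old: float, new: float) -> float:
    """Relative change, as a percent of the old value."""
    return round((new - old) / old * 100, 1)

combined_2023 = round(survey[2023]["using"] + survey[2023]["planning"], 1)
adoption_points = point_change(survey[2023]["using"], survey[2024]["using"])
adoption_relative = relative_change(survey[2023]["using"], survey[2024]["using"])
holdout_points = point_change(survey[2023]["no_plans"], survey[2024]["no_plans"])

print(combined_2023)      # 69.3
print(adoption_points)    # 18.0
print(adoption_relative)  # 41.1
print(holdout_points)     # -5.0
```

The distinction matters when comparing sources: an 18-point rise is a roughly 41% relative increase over the 2023 base, and the two numbers are easy to conflate.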
Aug 2024
Policy
EU AI Act enters into force
The world's most comprehensive AI regulation takes effect, beginning a phased enforcement timeline that reaches coding tools via GPAI model obligations.
The Act doesn't single out coding tools, but GPAI model obligations (effective August 2025) require providers to supply technical documentation and comply with EU copyright rules. Models posing "systemic risks" (cumulative training compute above 10²⁵ FLOPs) face additional adversarial testing requirements. Full applicability for most operators arrives August 2026.
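To give a feel for the scale of the 10²⁵ threshold, here is a rough sketch. It assumes the common rule of thumb that training a dense transformer costs about 6 floating-point operations per parameter per training token; that approximation comes from the scaling-law literature, not from the Act, and the model sizes below are hypothetical.

```python
SYSTEMIC_RISK_THRESHOLD = 1e25  # cumulative training compute, as cited in the entry

def estimated_training_flops(parameters: float, training_tokens: float) -> float:
    """Rule-of-thumb estimate (an assumption): ~6 operations per parameter per token."""
    return 6 * parameters * training_tokens

def exceeds_threshold(parameters: float, training_tokens: float) -> bool:
    """True if the estimated training compute is above the cited threshold."""
    return estimated_training_flops(parameters, training_tokens) > SYSTEMIC_RISK_THRESHOLD

# Hypothetical model sizes, chosen only to land on either side of the threshold.
print(exceeds_threshold(70e9, 15e12))   # estimate 6.3e24 -> False
print(exceeds_threshold(400e9, 15e12))  # estimate 3.6e25 -> True
```

A real compliance assessment would rely on measured compute and the regulator's guidance rather than this estimate.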
Aug–Dec 2024
Funding
Cursor raises $165M across Series A & B
Anysphere (Cursor) raises $60M at $400M valuation (August), then $105M at $2.5B valuation (December) — a 6x jump in four months.
Led by a16z and Thrive Capital. Cursor reached ~$100M ARR in 2024. The back-to-back raises signaled explosive growth and positioned Cursor as the leading AI-native IDE challenger to VS Code + Copilot.
Sep 2024
Product
Replit Agent launches
An autonomous agent that builds complete applications from natural language — not just suggestions, but full-stack app generation.
Replit Agent drove explosive growth: Replit went from $10M ARR (end of 2024) to $100M ARR (June 2025), with subscriber base growing 45% monthly post-Agent launch. Targeted a different user segment than Copilot/Cursor — non-professional developers and rapid prototyping.
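As a sanity check on these growth figures, the sketch below computes the constant month-over-month rate implied by a 10x ARR increase. The six-month span is an assumption (end of 2024 to June 2025), and the function name is illustrative.

```python
def implied_monthly_growth(start: float, end: float, months: float) -> float:
    """Constant month-over-month growth rate that turns `start` into `end`."""
    return (end / start) ** (1 / months) - 1

# $10M -> $100M ARR; the six-month span is an assumption, not a reported figure.
rate = implied_monthly_growth(10e6, 100e6, months=6)
print(f"{rate:.1%}")  # 46.8%
```

ARR and subscriber counts are different metrics, but under this assumption the implied revenue growth lands in the same range as the quoted 45% monthly subscriber growth.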
Nov 2024
Product
Windsurf launches agentic IDE
Codeium launches Windsurf Editor, branded as the "first agentic IDE" — combining copilot and agent paradigms in a single editor.
Windsurf combined inline completions with agent-style multi-file editing in one interface. By July 2025, Reuters reported $82M ARR and 350+ enterprise clients. The company fully rebranded from Codeium to Windsurf in April 2025.
Jan 2025
Product
JetBrains Junie announced
JetBrains enters the agent race with a coding agent built into their IDE ecosystem — routine task delegation from familiar tools.
Junie could handle routine coding tasks, tests, and refactoring within JetBrains IDEs. Market-wide penetration has not been publicly quantified, but adoption was likely concentrated within the JetBrains user base.
Feb 2025
Product
Claude Code research preview
Anthropic introduces an agentic CLI tool that reads codebases, edits files, and runs commands — launched alongside Claude 3.7 Sonnet.
Claude Code took a fundamentally different approach: terminal-first rather than IDE-embedded. It could autonomously navigate repos, edit multiple files, run tests, and iterate. The launch was notably low-key — no launch event, no viral demo. It reached GA on May 22, 2025.
Feb 2025
Product
GitHub Copilot agent mode preview
Copilot gains autonomous multi-step capability in VS Code Insiders — planning tasks, editing multiple files, running terminal commands, and self-correcting errors.
Agent mode was distinct from standard Copilot (inline completions) and Copilot Chat (Q&A). It could break down tasks, identify sub-problems, execute solutions across files, and iterate on errors autonomously. GA rolled out to all VS Code users ~April 2025. Extended to JetBrains, Eclipse, and Xcode by May 2025.
Feb 2025
Policy
EU AI Act: prohibited practices + AI literacy enforceable
First enforcement phase — organizations must comply with banned AI practices and demonstrate AI literacy among staff.
While coding tools themselves weren't in the "prohibited" category, the AI literacy requirement meant organizations using AI coding tools needed training and governance programs in place. This accelerated enterprise AI policy formalization.
Apr 2025
Product
Devin 2.0 — price drops from $500 to $20/month
Cognition slashes pricing and shifts to usage-based model, signaling that autonomous coding agents are moving toward mass-market accessibility.
Devin 2.0 introduced ACUs (Agent Compute Units) for usage-based pricing. The 96% price drop in four months reflected both competitive pressure and the reality that early pricing had limited adoption to enterprises willing to experiment.
Apr 2025
Product
Tabnine sunsets free tier
Tabnine discontinues its free "Basic" plan to focus entirely on enterprise — named a Visionary in 2025 Gartner Magic Quadrant for AI Code Assistants.
The free-to-enterprise pivot reflected a market where free-tier competition from Copilot, Gemini Code Assist, and others made individual freemium unsustainable. Tabnine doubled down on enterprise features: private deployment, IP protection, and compliance.
May 2025
Product
OpenAI Codex agent research preview
OpenAI launches a cloud-based software engineering agent — a sandboxed system that can write features, fix bugs, and propose PRs autonomously.
Not to be confused with the deprecated 2021 Codex model. The 2025 "Codex" is a fully agentic platform that runs in sandboxed cloud environments, operates on entire codebases, and submits pull requests. The name reuse complicated longitudinal tracking.
May 2025
Product
GitHub Copilot coding agent public preview
Announced at Microsoft Build — an autonomous agent that takes a GitHub issue, spins up a dev environment, writes code, and opens a draft PR.
Distinct from "agent mode" (interactive, in-IDE). The coding agent is asynchronous: assign it an issue and it works in the background via GitHub Actions. Excels at low-to-medium complexity tasks in well-tested codebases. Reached GA on September 25, 2025 for all paid Copilot subscribers.
May 2025
Product
Claude Code reaches general availability
Three months after research preview, Claude Code goes GA. Becomes one of the fastest-growing software products ever: ~$1B ARR by November 2025.
Claude Code on the web launched October 20, 2025. Pragmatic Engineer's 2026 survey found 71% usage among regular agent users and 75% at the smallest companies — the dominant tool in the agentic-first segment. Estimated near $2B ARR by January 2026.
May–Jul 2025
Funding
OpenAI–Windsurf acquisition collapses
OpenAI agrees to buy Windsurf for $3B (May) — deal falls apart (July). Google signs $2.4B licensing deal and hires Windsurf's CEO.
The deal reportedly collapsed in part because Microsoft, through its OpenAI relationship, would have gained access to Windsurf's IP, a problem given Microsoft's competing Copilot product. Google DeepMind hired Windsurf CEO Varun Mohan and co-founder Douglas Chen and took a non-exclusive license to Windsurf's technology. Days later, Cognition agreed to acquire the remaining Windsurf business.
Jun 2025
Survey
Stack Overflow 2025: 78.5% using, 46% distrust
49,009 respondents — 47.1% daily use, but more developers distrust AI accuracy (46%) than trust it (33%). The paradox crystallizes.
2025 shifted to frequency reporting. Agent-specific data showed 30.9% using agents at any frequency, with 37.9% saying they don't plan to use agents. Only ~3% reported "highly trusting" AI output. Top frustration: "almost right but not quite" outputs (66%).
Jun 2025
Funding
Cursor raises $900M at $9.9B — fastest-growing SaaS ever
Series C co-led by Thrive and a16z. Cursor hit $500M ARR (May) → $1B ARR (October) — a pace unprecedented in SaaS history.
By November 2025, Cursor raised $2.3B Series D at $29.3B valuation. Total funding: ~$3.3B from Google, NVIDIA, a16z, Thrive, Accel, and Coatue. The valuation trajectory — $400M → $2.5B → $9.9B → $29.3B in 15 months — reflected the market's conviction that AI-native IDEs would replace traditional editors.
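The valuation trajectory can be broken into per-round step-ups with a few lines of arithmetic. This sketch uses only the four valuations quoted in the Cursor entries; the variable names are illustrative.

```python
# Valuations in $B, as quoted in the timeline (August 2024 through November 2025).
valuations = [0.4, 2.5, 9.9, 29.3]

# Step-up multiple between each pair of consecutive rounds.
step_ups = [round(b / a, 2) for a, b in zip(valuations, valuations[1:])]
overall = round(valuations[-1] / valuations[0], 2)

print(step_ups)  # [6.25, 3.96, 2.96]
print(overall)   # 73.25
```

The multiples shrink with each round even as the absolute increases grow, which is the usual pattern as a company's base valuation gets larger.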
Aug 2025
Policy
EU AI Act: GPAI model obligations take effect
Providers of general-purpose AI models must supply technical documentation, publish training data summaries, and comply with EU copyright rules.
This milestone directly affected the foundation models powering all major coding tools (GPT-4, Claude, Gemini). Models posing "systemic risks" (cumulative training compute above 10²⁵ FLOPs) faced additional adversarial testing and risk mitigation obligations. The full compliance framework for most operators arrives August 2026.
Sep 2025
Product
Claude Code v2.0 — subagents, hooks, Agent SDK
Major architecture upgrade: checkpoints, subagents, hooks, background tasks, and the Claude Agent SDK. Powered by Sonnet 4.5.
The v2.0 release marked the shift from Claude Code as a single-turn tool to an orchestration platform. The Agent SDK enabled developers to build custom agentic workflows, while hooks and checkpoints allowed programmatic control over long-running autonomous sessions.
Oct 2025
Product
Cursor 2.0 + multi-agent interface
Cursor ships a multi-agent interface where multiple AI agents collaborate on different parts of a codebase simultaneously.
Cursor 2.0 represented the frontier of agentic IDE design — moving from single-agent to multi-agent workflows. This pushed the boundary on what "AI-assisted development" could look like: multiple agents working in parallel, each handling a different file or subsystem.
Nov 2025
Survey
Devin: 67% PR merge rate, 4× faster problem solving
Cognition Labs publishes Devin's 2025 performance review — PR merge rates doubled from 34% to 67%. Goldman Sachs pilots "hybrid workforce" with 20% efficiency gains.
Devin excels at tasks with clear requirements that would take a junior engineer 4-8 hours: migrations, vulnerability fixes, unit tests. One organization saved 5-10% of total developer time on security fixes alone. The review also cited a 20× efficiency gain on vulnerability remediation (1.5 minutes versus 30 minutes per issue).
Jan 2026
Product
Claude Cowork announced — agents beyond coding
Anthropic extends the agentic paradigm from coding to general knowledge work on macOS. Plugin system follows Jan 30.
Cowork signaled that the agentic infrastructure built for Claude Code was generalizable. The same long-horizon execution patterns — subagents, checkpoints, background tasks — could apply to research, analysis, and document creation.
Jan 2026
Survey
METR: AI agent time horizons doubling every ~89 days
METR publishes Time Horizon 1.1 — the "50% time horizon" (task length at which agents succeed half the time) jumped from ~4 min (early 2024) to hours by late 2025.
The Time Horizons benchmark tracked 228 tasks across frontier models. The doubling time shortened from ~188 days (full period) to ~89 days since 2024. This exponential improvement, described as "a new Moore's Law for AI agents," was the strongest quantitative evidence that agent capabilities had crossed a practical threshold for long-horizon autonomous work.
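What a doubling time means in practice is easier to see with a worked calculation. The sketch below assumes pure exponential growth with a constant doubling time, which is a simplification rather than METR's fitting methodology. The ~4 minute starting point and the two doubling times come from the entry; the 8-hour target is illustrative.

```python
import math

def horizon_after(h0_minutes: float, days: float, doubling_days: float) -> float:
    """Time horizon after `days`, assuming a constant doubling time."""
    return h0_minutes * 2 ** (days / doubling_days)

def days_to_reach(h0_minutes: float, target_minutes: float, doubling_days: float) -> float:
    """Days needed to grow from h0 to target under the same assumption."""
    return doubling_days * math.log2(target_minutes / h0_minutes)

# Compare the two doubling times quoted in the entry against an 8-hour target.
for doubling in (188, 89):
    days = days_to_reach(4, 8 * 60, doubling)
    print(doubling, round(days))
```

Under these assumptions, going from 4 minutes to 8 hours takes roughly 1,300 days at a 188-day doubling time but only about 615 days at 89 days, which is why the choice of fitting period changes the forecast so much.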
Jan 2026
Survey
Microsoft: 4.7M paid Copilot subscribers
FY26 Q2 earnings call — the hardest number in AI coding. "20M users" and "90% of the Fortune 100" also cited.
Paid subscriber count is the most auditable metric in the space. The gap between "4.7M paid" and "20M users" suggests significant free/trial usage. Fortune 100 penetration (90%) reflects enterprise presence but not intensity of use within those organizations.
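The size of that gap is simple to quantify. This sketch assumes the two cited figures are directly comparable (same product family, same point in time), which the earnings-call numbers do not guarantee.

```python
paid_subscribers = 4.7e6  # "4.7M paid", as cited
total_users = 20e6        # "20M users", as cited

# Share of users who pay, and the remainder, assuming the figures are comparable.
paid_share = paid_subscribers / total_users
unpaid_users = total_users - paid_subscribers

print(f"{paid_share:.1%}")           # 23.5%
print(f"{unpaid_users / 1e6:.1f}M")  # 15.3M
```

If "20M users" is a cumulative count rather than a count of current users, the true paid share among active users would be higher than this figure.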
Feb 2026
Product
Claude Opus 4.6 — 12-hour time horizon, 1M context
Anthropic releases Opus 4.6 with 1M token context window (beta), 128K output. METR measures 50% time horizon at ~719 minutes (~12 hours). 75.6% SWE-bench.
The ~12-hour time horizon meant agents could theoretically sustain productive work across an entire workday without human intervention. Combined with the 1M context window, this enabled agents to hold entire codebases in context while working on complex, multi-file tasks. The 99.9th percentile Claude Code turn duration nearly doubled to 45+ minutes.
Feb 2026
Product
GPT-5.3-Codex + Codex desktop app
OpenAI releases GPT-5.3-Codex and desktop app for managing long-running agents. Demo: single session runs 25 hours, generates 30K lines of code.
The 25-hour demonstration — building a complete design tool with collaboration, layers, and export — was the most dramatic showcase of long-horizon capability. The agent performed validation (lint, type checking, tests, builds) after each milestone and auto-repaired failures. By March 2026, Codex had 2M+ weekly active users.
Feb 2026
Funding
Cursor hits $2B ARR, doubling in about four months
Revenue doubles from $1B (Oct 2025) to $2B (Feb 2026). Enterprise customers now ~60% of revenue. Reportedly raising at $50B valuation.
Cursor shipped BugBot Autofix (cloud agents auto-fixing PRs, 35% merge rate), Cursor Automations (always-on agents triggered by code changes or Slack), and up to 8 parallel background agents on isolated VMs. Leadership explicitly called these "another step change in that progression."
Mar 2026
Product
Codex Security research preview
OpenAI extends its Codex agent platform into security — automated vulnerability detection and remediation.
Security was a natural extension of the agentic paradigm: if agents can write code, they can also analyze it for vulnerabilities, propose fixes, and verify patches. This addressed one of the key concerns around AI coding: that increased code volume without proportional security review creates risk.
Mar 2026
Product
Claude Computer Use + Dispatch
Anthropic ships phone-to-desktop dispatch: users message Claude to assign tasks that run autonomously on their computer.
This blurred the line between "coding agent" and "computer agent" — the same long-horizon execution architecture that powered Claude Code could now interact with any desktop application. Tasks could be assigned remotely and executed without the user being present at the machine.
Forecast
Survey
Gartner: 75% of enterprise engineers to use AI by 2028
Gartner's forward-looking forecast — up from <10% in early 2023 and 63% of organizations piloting/deploying in Q3 2023.
This trajectory — <10% → 63% piloting → 75% forecast — illustrates the enterprise adoption curve lagging but following the individual developer curve. The constraint is procurement, policy, and compliance, not developer willingness.