Agentic Engineering Weekly for April 24 - May 2, 2026
The honeymoon ended in stereo this week. More hard data on AI velocity landed (10-15%, not the 2-3X your CEO has been quoting), the pricing reset got calendar entries, the techlash got organised into actual policy, and harness engineering picked up more takes.
My top 3 picks this week
- Andrej Karpathy: From Vibe Coding to Agentic Engineering: when Karpathy talks, I listen. Quotes like "Vibe coding raises the floor for everyone, agentic engineering raises the ceiling for experts" and "traditional software can easily automate what you can specify, LLMs can easily automate what you can verify" will stick in my mind for a long time. (video)
- Collaborative AI Engineering: One Dev, Two Dozen Agents, Zero Alignment: Maggie Appleton on what breaks when each engineer runs their own private CLI harness and how GitHub R&D is shaping the future of collaboration. (video)
- Welcome to Gas City: Steve Yegge's Mad Max school of agent orchestration, v1.0.0. And a pretty solid case on why you should install some windows and an Andon cord in your "Dark Factory". (article)
More takes on harness engineering
Karpathy gave the discipline a North Star at Sequoia Ascent: vibe coding raises the floor, agentic engineering raises the ceiling. What this means in practice: build for agents (markdown, CLI, MCP, structured logs, sandboxing, auditability) and treat judgment as the new core skill. Mario Zechner's Pi, the minimalist self-modifying agent that backs Peter Steinberger's OpenClaw, is the proof: a harness small enough to read in an afternoon and powerful enough to run a product. Maggie Appleton at GitHub mapped the team-scale failure mode: one dev with twenty agents in personal CLIs and zero shared context produces duplication, divergence, and burned tokens. The harness has to extend across the team.
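To make "build for agents" concrete, here is a minimal sketch of one of those surfaces, structured logs; the JSON Lines shape and the field names are my own illustrative choices, not from Karpathy's talk or Pi's implementation.

```python
import json
import time

def log_line(event: str, **fields) -> str:
    """Render one event as a single JSON line an agent can parse
    without regexes or guesswork; one record per line (JSON Lines)."""
    record = {"ts": time.time(), "event": event, **fields}
    return json.dumps(record, sort_keys=True)

# A harness can append these lines to stdout or a log file:
line = log_line("test_run", suite="unit", passed=41, failed=1)
```

The point is legibility: an agent can `json.loads` each line instead of scraping free-form output, which is exactly the kind of agent-facing surface the talk argues for.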
The cost angle keeps surfacing inside the same conversation. Token-hygiene moves that used to be optional optimisations are turning into harness primitives, which is why my last video highlights two harness tricks that cut my own token bill in half without changing models. The harness that emits fewer tokens is the harness that survives the next price hike.
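As a hedged illustration of what a token-hygiene primitive can look like (the function name, the head-and-tail strategy, and the four-characters-per-token heuristic are my assumptions, not taken from any of the linked pieces): a harness-side filter that caps tool output before it re-enters the model's context.

```python
# Crude approximation; swap in a real tokenizer if one is available.
CHARS_PER_TOKEN = 4

def clip_tool_output(text: str, max_tokens: int = 500) -> str:
    """Keep the head and tail of oversized tool output, eliding the middle.

    Logs and diffs usually carry their signal at the edges; the middle
    is where token budgets go to die.
    """
    budget = max_tokens * CHARS_PER_TOKEN
    if len(text) <= budget:
        return text
    half = budget // 2
    return text[:half] + "\n... [elided by harness] ...\n" + text[-half:]
```

The design choice, keep head and tail and drop the middle, mirrors how many CLI harnesses already truncate long command output before feeding it back to the model.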
Worth reading:
- Andrej Karpathy: From Vibe Coding to Agentic Engineering: Karpathy at Sequoia Ascent on why he has never felt more behind as a programmer and what to build for. (video)
- Agent Harness Engineering: Addy Osmani's naming piece. Great companion for Boeckeler's framing. (article)
- Anthropic, OpenAI, Google, and Microsoft agree that the harness is the product. They disagree on the price.: The strategic frame in one headline, with the four pricing models laid out side by side. (article)
- Building Pi, and what makes self-modifying software so fascinating: Mario Zechner and Armin Ronacher on a harness you can read in an afternoon. (podcast)
- Collaborative AI Engineering: One Dev, Two Dozen Agents, Zero Alignment: Maggie Appleton on what breaks when each engineer runs their own private CLI harness. (video)
The honeymoon ends: more hard data on AI velocity lands
DX published a longitudinal analysis across 400 companies and the headline number is 10-15% velocity gains, not the 2-3X figure executives have been quoting from vendor case studies. Abi Noda and Brian Houck (Microsoft, SPACE co-author) walked through the methodology on video and the gap between perceived and measured productivity is the actual story. Faros named the same gap Acceleration Whiplash in their 2026 report. BCG's executive perspectives whitepaper lands in the same week with the org-chart implications. The empirical grown-ups have entered the chat, and they are bringing receipts.
Adam Tornhill's Coding Is Dead (But It Still Smells Funny) names the second-order effect: the productivity story collapses fast without judgment, taste, and codebase legibility. Stack Overflow editorialises the moment as the "find out" stage of AI. SimonDev's AI Coding Works. That's the Problem is the practitioner version of the same claim. The convergence matters because it gives engineering leaders a citation that did not exist a month ago. The next time someone waves a vendor case study at you, the DX longitudinal data is the answer.
The Atlantic ran an important counterpoint worth holding alongside the velocity number: revenues are catching up to hype, mainly through Claude Code and other coding agents. The bubble framing does not sit well with that data. The honest read is that AI is producing real value, just less of it and more unevenly than the slides suggested, and that the orgs capturing the value are the ones that already had solid foundations. Same shape as the DORA and CircleCI results: AI amplifies what was already there. The real value gets captured by the elite performers.
Worth reading:
- AI productivity gains: More modest than expected: The 10-15% number from a 400-company longitudinal study. (article)
- The Current Impact of AI on Engineering Velocity: What 400 Companies Are Seeing: Abi Noda and Brian Houck on what the data actually says and why expectations were calibrated wrong. (video)
- Coding Is Dead (But It Still Smells Funny): Tornhill on the post-AI dev role and what falls apart without judgment. (article)
- Welcome to the find out stage of AI: Stack Overflow naming the moment with the right phrase. (article)
- Maybe AI Isn't a Bubble After All: The counterpoint. Revenues are up, barely, but mainly through Claude Code and other coding agents. Software really is the genAI killer app. (article)
The techlash is here, and it's organised
Three years of complaints curdled into policy this week. Zig published the most stringent anti-LLM contribution policy in major open source: no LLMs for issues, no LLMs for pull requests, no LLMs for comments on the bug tracker including translation. The Linux kernel drew its own line: yes to Copilot for assistance, no to AI slop in patches, humans take the fall for mistakes. A Fortune piece reports 80% of white-collar workers quietly refusing AI adoption mandates. Hank Green moved Complexly to nonprofit status precisely so it can resist algorithm and AI pressure. Felienne Hermans and Izaak Dekker made the Dutch pedagogical case for keeping AI out of education.
The Matthew Yglesias quote that ricocheted around Bluesky this week is the mainstream-consumer version of the same shift: "I don't want to vibecode. I want professionally managed software companies to use AI to make better products that they sell to me." That is the position that has been missing from the conversation. The technology press is full of practitioners arguing about how to use AI, and almost empty of users articulating what they actually want. Yglesias gave them a sentence.
The cognitive-science angle showed up in a Fortune piece worth reading carefully: AI promises to free workers from grunt work, but psychologists argue those mindless tasks are exactly what the brain needs to recover. The grunt work was the recovery time. Removing it produces the burnout the productivity numbers were supposed to prevent. Combine that with the argument that effort is what produces durable understanding and the techlash starts to look less like nostalgia and more like an immune system response. The pushback is now standing on real ground.
Worth reading:
- The people do not yearn for automation: "Software brain" is changing the world, but most people still aren't buying. (article)
- The Zig project's rationale for their firm anti-AI contribution policy: The most stringent OSS policy on AI-generated contributions, with the reasoning spelled out. (article)
- White-collar workers are quietly rebelling against AI as 80% outright refuse adoption mandates: The bottom-up version of the techlash, with a hard percentage. (article)
- Er zijn goede wetenschappelijke redenen om AI uit het onderwijs te weren: Felienne Hermans and Izaak Dekker on the pedagogical case for keeping AI out of school (Dutch). (article)
- Quoting Matthew Yglesias: The mainstream-consumer position that was missing from the conversation, in one sentence. (article)
- AI isn't taking jobs. It's taking something worse.: Mo Bitar reframing the labour-market panic as an agency-and-meaning problem. (video)
The pricing reset has dates now
Last week the pricing story was a vibe. This week it has calendar entries. GitHub Copilot officially moves to usage-based billing on June 1. Per-token pricing replaces per-request, rate limits tighten, model access shifts. Ed Zitron broke The Information's leak that OpenAI projects an 80% drop in $20 ChatGPT Plus subscriptions (from 44M to 9M), with the gap supposedly filled by an ad-supported $5-8 ChatGPT Go growing from 3M to 112M. Anthropic's painted-door test (briefly dropping Claude Code from the $20 tier on its pricing page) ran a similar price-shifting experiment. Whether all that creative math will work out is the question of the year.
Pragmatic Engineer's follow-up names fifteen companies whose AI spend has exploded in two to three months and walks through the different coping mechanisms. Stack72 puts a number on the new normal: five people, $3,000 a month, zero lines written manually. The era of flat-rate pricing absorbing your agent's inefficiency is ending. The bottleneck is what you feed the agent, not the agent itself. MindStudio's GPT-5.5 vs Opus 4.7 comparison gives the operational counterweight: GPT-5.5 uses 72% fewer output tokens than Opus 4.7 on the same tasks, and that gap shows up in the monthly invoice immediately.
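As a back-of-the-envelope illustration of why an output-token gap lands directly on the invoice (the workload numbers and the per-million-token price below are placeholder assumptions, not real provider rates):

```python
def monthly_output_cost(tokens_per_task: int, tasks_per_month: int,
                        price_per_m_tokens: float) -> float:
    """Output-token spend for one model over a month of agent runs."""
    return tokens_per_task * tasks_per_month / 1_000_000 * price_per_m_tokens

# Hypothetical workload at an assumed $15 per million output tokens.
# A model that emits 72% fewer output tokens on the same tasks cuts
# the output bill by the same 72% at equal per-token pricing.
baseline = monthly_output_cost(50_000, 2_000, 15.00)            # $1,500
leaner = monthly_output_cost(int(50_000 * 0.28), 2_000, 15.00)  # $420
```

The arithmetic is trivial on purpose: at usage-based billing, token efficiency is no longer an optimisation, it is the bill.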
The geopolitical second act is already starting. Mistral launched Vibe with Medium 3.5 the same week, with explicitly sovereign-EU positioning. The pricing conversation is going to fork along jurisdictional lines before the year is out, and the orgs that can swap providers without rewriting their prompts are the ones who will survive the fork. Local models are becoming a bigger part of the conversation, and this pricing reset will be the first test of how much of the market they can capture.
Worth reading:
- GitHub Copilot is moving to usage-based billing: The official announcement with the June 1 cutover date. (article)
- OpenAI Projects ChatGPT Plus subscriptions to drop by 80%: Ed Zitron on The Information's leak. The math that may or may not work. (article)
- The Pulse: token spend breaks budgets - what next?: Fifteen named companies, the different coping mechanisms, the underlying number. (article)
- Your Agent Is Starving: $3000/month for five people and zero hand-written lines. The new normal in one sentence. (article)
- GPT-5.5 vs Claude Opus 4.7: Real-World Coding Performance Compared: The 72% fewer output tokens number, with the workload context. (article)
Software fundamentals reassert themselves as the differentiator
Russ Miles published two follow-ups this week extending the cognitive-debt argument from a fortnight ago. All the Roads Not Taken names cognitive debt, intent debt, and the brakes that let the car go faster. The Trellis and the Hill extends the gardening metaphor: quiet debts planted in the codebase grow before anyone notices. Emily Bache's Trombone Paradox lands the same point with a different image: a heavy instrument requires good posture, and agentic AI requires solid engineering posture to be effective. The fundamentals are not nostalgia. They are the differentiator the velocity numbers reward.
Daniel Terhorst-North and Gojko Adzic at GOTO put a fresh angle on spec-driven development: specs as agent input, not agent output. The discipline that survives the abstraction shift is the one that treats specs as a first-class artifact you maintain by hand. Trisha Gee and Daniel Terhorst-North dismantled the gatekeeping framing of code reviews on Modern Software Engineering and rebuilt the purpose for an agent-augmented team. Adrian Bolboaca named enabling constraints in product ownership. Codesai diagnosed the design problems behind mock-heavy test suites. Old discipline, restated for the moment when an agent can write the tests for you.
The Anthropic postmortem on Claude Code quality is the same theme from the AI engineering side. Three real harness bugs caused months of complaints, and the model was not the issue. The harness was. If the lesson is not yet obvious, it should be: what you build around the model matters more than which model you pick.
Fundamentals are how you build without taking out a mortgage on your future.
Worth reading:
- All the Roads Not Taken: Russ Miles extends the cognitive-debt argument with the brakes-let-the-car-go-faster metaphor. (article)
- It Doesn't Help To Push AI Into A Crappy Process: Emily Bache's Trombone Paradox in written form, with the engineering-posture argument. (article)
- Spec-Driven Dev Is Back. But Not How You Think: Daniel Terhorst-North and Gojko Adzic on specs as agent input, not output. (video)
- Are Code Reviews Even Necessary?: Trisha Gee and Daniel Terhorst-North on what reviews are for in an agent-augmented team. (video)
- An update on recent Claude Code quality reports: Anthropic's postmortem confirming three harness bugs, not model bugs. The lesson is in the diagnosis. (article)
Quick Hits
- DeepSeek V4: almost on the frontier, a fraction of the price: V4-Pro (1.6T total, 49B active), V4-Flash (284B/13B), 1M-context MoE, MIT license. Now the largest open-weights model. (article)
- GPT-5.5 prompting guide: Official OpenAI guide. Notable trick: send a short user-visible update before tool calls. (article)
- Anthropic Mythos: Hype, reality and the actual security implications: Thoughtworks Tech Podcast as a sober counterweight to the breathless coverage. (podcast)
- A Complete Guide To AGENTS.md: Reference for the file format that is becoming the de facto agent-context entrypoint. (article)
- Structured-Prompt-Driven Development (SPDD): Wei Zhang and Jessie Jie Xia at Thoughtworks on prompts as first-class artifacts in version control. (article)
- How Anthropic's product team moves faster than anyone else: Cat Wu on Lenny's. Inside view of the product velocity behind the harness. (video)
- How to win when software is not a moat: Evan Spiegel on Lenny's. Distribution, brand, and product instincts when code becomes commodity. (video)
- Welcome to Gas City: Steve Yegge's Mad Max school of agent orchestration, v1.0.0. (article)
Curated from articles, podcasts, and videos. Week of April 24 - May 2, 2026.