Agentic Engineering Weekly for June 12–20, 2026
Loops grew up into agents that spawn their own dynamic workflows. Code got cheap enough to throw away by the plateful, which makes the theory in your team's head the part actually worth preserving. Security took its first real structural swings instead of hoping the model behaves if you just prompt harder. Your domain knowledge, not your job title, matters.
My top 3 picks this week
- A Learning System Made of Learning Parts: Kerr and Beck on how AI split the programmer's job in two. There is now an IKEA for the part of the job many of us loved to do by hand. (podcast)
- Information Flow Control: Moving Toward Secure Autonomous Agents: First decent stabs at structural lethal trifecta prevention (article)
- I guess we're writing loops now?: Theo on what separates "loop engineering" from last year's Ralph loop (video)
Loop engineering: more than Ralph
We touched on loop engineering last week, but did the concept a disservice by equating it to the Ralph loop: while(!done()){ do cat PROMPT.md | claude }. One agent, one prompt, re-fed until it stumbles its way into done. What's emerging now is categorically different. The agent stops being the worker inside the loop and becomes the orchestrator that writes the loop at runtime, spawning open-ended trees of subagents that branch, monitor each other, and report back. A single while-not-done condition can't express that shape. We're watching agents program their own control flow.
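To make the difference in shape concrete, here is a rough Python sketch. Everything in it is a hypothetical stand-in: `run_agent` fakes a model call, and `plan` fakes the decision a real orchestrating agent would make at runtime about which subagents to spawn. It is not taken from any of the linked tools.

```python
def run_agent(prompt: str) -> str:
    """Hypothetical stand-in for a model call; returns a canned reply."""
    return f"done: {prompt}"


def ralph_loop(prompt: str, max_iterations: int = 3) -> list[str]:
    """Ralph-style loop: one agent, one prompt, re-fed until it reports done."""
    outputs = []
    for _ in range(max_iterations):
        reply = run_agent(prompt)
        outputs.append(reply)
        if reply.startswith("done"):
            break
    return outputs


def plan(goal: str) -> list[str]:
    """Hypothetical planner: a real orchestrator would ask the model for this."""
    canned = {
        "ship feature": ["write code", "review code"],
        "write code": ["implement", "test"],
    }
    return canned.get(goal, [])


def orchestrate(goal: str, depth: int = 0, max_depth: int = 3) -> dict:
    """Orchestrator shape: split the goal, spawn a subagent per subgoal, report back."""
    subgoals = plan(goal) if depth < max_depth else []
    return {
        "goal": goal,
        # Only leaf goals do the work themselves; inner nodes just coordinate.
        "result": None if subgoals else run_agent(goal),
        "children": [orchestrate(s, depth + 1, max_depth) for s in subgoals],
    }
```

The first function has a fixed control flow that the human wrote. In the second, the tree that comes back depends on what `plan` returns at each level, which is the part a single while-not-done loop has no way to express.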
Pete Steinberger's published orchestration skills are the clearest concrete example I've seen. They're well worth a look!
Worth reading:
- I guess we're writing loops now?: Theo on what separates loop engineering from a Ralph loop (video)
- Pete Steinberger's maintainer-orchestrator skill: A real control-plane skill that delegates each repo to its own scoped worker thread and surfaces only decision-ready work (article)
More takes on the economics of code
Code stopped being an asset you preserve carefully and is starting to resemble paper plates. When generating a line of code costs effectively nothing, the careful curation we built our craft around looks like sentimentality. Charity Majors put the economics plainly: lines of code went from being treasured, reused, and carefully maintained to being disposable and regenerable, practically overnight. One essay clocked the shelf life of 2026 software at around 3.8 months. If that's true, optimizing for code you keep is optimizing for the wrong thing.
So where does the value go? When code can be regenerated faster than it can be understood, preserving it under a glass bell jar for all to admire no longer makes sense. Maintaining the system's behavior, boundaries, and intent absolutely does. I'm increasingly convinced this is the way forward: the theory, the shared mental model that lives in your team's head, not the residue in the repo, is the system worth investing in. We try to capture as much of it as we can in specs, ADRs, what have you, but without the team no one remains to sift through all those tokens in search of value. Generative AI lowers the cost of modification so much that it creates a dangerous illusion: everything can change quickly, therefore everything should. Not every half-assed idea deserves to make it into your product.
Worth reading:
- disposable software: software is now just paper plates: The metaphor that names the shift, with a number attached to it (article)
- The Death and Rebirth of Programming: The cleanest articulation of stewardship over preservation (article)
- The Implementation Remembers: Why the ugly code you're tempted to regenerate is often scar tissue, not junk (article)
- Code Isn't Free — Mario Zechner on the Hard Truths of Coding With AI: A tool-builder who has seen 500k-lines-a-week swarms and knows where they break (video)
- A Learning System Made of Learning Parts: Kerr and Beck on how AI split the programmer's job in two. There is now an IKEA for the part of the job many of us loved to do by hand. (podcast)
AI demands more engineering discipline, not less
If you lived through the shift from handcrafted server pets to immutable infrastructure, this statement should feel familiar. The teams getting real leverage are the ones who tightened their feedback loops fast enough to keep up with their agents. Charity Majors finds the upside in this: testing, review, observability, and operability used to be a hard sell, and suddenly everyone needs them as a prerequisite to unlock AI coding productivity. This is our once-in-a-lifetime chance to bring engineering values to the mainstream, because the people vibe-coding their way into production are about to discover exactly why those values exist.
Worth reading:
- AI demands more engineering discipline. Not less: The immutable-infrastructure analogy that reframes the whole debate (article)
- We Thought AI Transformation Was About Adopting Agents. We Were Wrong.: A VP of Engineering admitting the 2-3x became 20-30%, and what closed the gap (video)
- Why The Best Software Engineers Are Solving Code Review Bottlenecks Now: Concrete experiments on pulling the human out of the review loop (video)
Another week, another ream of lethal trifecta examples
No malware, no zero-day exploits, just an agent doing exactly what it was told by the wrong person. Attackers hijacked twenty thousand Instagram accounts by asking Meta's AI politely. Researchers turned Microsoft 365 Copilot into a one-click exfiltration tool with a single crafted search. The data access was always there; agents just make it reachable through a channel that's nearly impossible to secure.
This is the confused deputy problem wearing a new coat. An agent with your credentials, exposed to untrusted content, and able to act becomes a deputy that can't tell your instructions from an attacker's. When you audit agent setups, it helps to name the failure modes explicitly: prompt-injection susceptibility, data-exfiltration risk, approval-bypass attempts, authority confusion, and runtime trust-boundary violations. Each one is a place where external content can redirect behavior way beyond its intended scope.
The (skill/software) supply chain is still one of the most obvious vectors. One study found that 37 percent of nearly four thousand agent skills could exfiltrate AWS credentials, the same kind of abuse early npm saw. That SKILL.md you installed to supercharge your agent has the same trust profile as an unaudited dependency, except it runs with your agent's full reach. Treat skills and tool definitions as untrusted code, because that's exactly what they are.
The structural fix is starting to take shape, and it isn't "prompt the agent to be more careful." Microsoft's push toward information flow control makes the problem explicit: anything an agent can do in response to your prompt, an attacker can trigger through a prompt injection, so enforcement has to be independent of the model's judgment. The mechanism is old and boring in the best way: label every piece of data as trusted or untrusted, propagate those labels through the agent's work, and block any tool call where untrusted data would drive a consequential action or confidential data would egress somewhere incompatible. Labels an attacker can't forge give you deterministic guarantees instead of probabilistic hope, and they shrink human-in-the-loop approval down to the genuinely ambiguous cases.
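The label-propagate-block mechanism is small enough to sketch in Python. This is my own toy illustration of the idea, not Microsoft's implementation: the `Labeled` type, the tool names, and the two policy sets are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Labeled:
    value: str
    trusted: bool       # integrity: did this come from the user or a vetted source?
    confidential: bool  # secrecy: must this stay inside the trust boundary?


def combine(*parts: Labeled) -> Labeled:
    """Propagate labels: the result is trusted only if every input was,
    and confidential if any input was."""
    return Labeled(
        value=" ".join(p.value for p in parts),
        trusted=all(p.trusted for p in parts),
        confidential=any(p.confidential for p in parts),
    )


# Hypothetical policy; the tool names are illustrative only.
CONSEQUENTIAL_TOOLS = {"send_email", "delete_file"}
EXTERNAL_SINKS = {"send_email", "http_post"}


def allow_tool_call(tool: str, arg: Labeled) -> bool:
    """Deterministic check that runs outside the model's judgment."""
    if tool in CONSEQUENTIAL_TOOLS and not arg.trusted:
        return False  # untrusted data would drive a consequential action
    if tool in EXTERNAL_SINKS and arg.confidential:
        return False  # confidential data would leave the boundary
    return True
```

Once a web page's text has been mixed into the agent's working data, `combine` marks the result untrusted, and `allow_tool_call("send_email", ...)` refuses it no matter how persuasive the injected text is. The guarantee only holds if the labels are attached by the runtime rather than by the model, which is the part a toy like this can't show.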
Lethal-trifecta-prevention-as-a-service: I very much like the direction of this security research.
Worth reading:
- AI agents are a confused deputy with the keys to your kingdom: How 20,000 accounts fell to a politely worded request (article)
- SearchLeak: Turning M365 Copilot Into a One-Click Data Exfiltration Weapon: A concrete, reproducible exploit against a shipping enterprise product (article)
- Information Flow Control: Moving Toward Secure Autonomous Agents: The deterministic alternative to hoping the model behaves, using labels an attacker can't manipulate (article)
- Your AI Agent Installed Malware Because a SKILL.md Told It To: The 37%-of-skills figure that should change how you install them (video)
Your folder structure is the agent's architecture
The most effective agent setups are getting less clever, not more. Instead of bolting on a multi-agent framework to manage context, memory, and step coordination, people are letting the filesystem do the orchestrating. The Model Workspace Protocol paper makes the case directly: numbered folders are stages, plain markdown files carry the prompts and context, and local scripts handle the mechanical work that never needed a model. One agent reading the right files at the right moment does what a framework was supposed to.
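A rough Python sketch of what "the filesystem does the orchestrating" can look like. The naming convention (`01_research`, `02_draft`) and both function names are my assumptions for illustration; the paper may prescribe a different layout.

```python
import re
from pathlib import Path

# Assumed convention: folders named "<number>_<name>" or "<number>-<name>".
STAGE_PATTERN = re.compile(r"^(\d+)[-_](.+)$")


def discover_stages(workspace: Path) -> list[tuple[int, str, Path]]:
    """Return (number, name, path) for each numbered folder, in stage order."""
    stages = []
    for entry in workspace.iterdir():
        match = STAGE_PATTERN.match(entry.name)
        if entry.is_dir() and match:
            stages.append((int(match.group(1)), match.group(2), entry))
    return sorted(stages, key=lambda stage: stage[0])


def load_stage_context(stage_dir: Path) -> str:
    """Concatenate a stage's markdown files (sorted by name) into one context string."""
    files = sorted(stage_dir.glob("*.md"))
    return "\n\n".join(f.read_text(encoding="utf-8") for f in files)
```

A harness built this way walks `discover_stages` in order and hands the agent only `load_stage_context` for the current stage, so context management falls out of the directory layout instead of a framework.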
I'm finding the same thing in my own harness work. The engineering that matters lives in the repo, not the model: ground rules, reference docs, and lazily loaded skills, with many small documents beating one monolith. The structure is the architecture. When Nick Nisi deleted 95 percent of his agent skills and got better results, he wasn't removing capability, he was removing the noise that was crowding the context window and confusing the agent.
This also demotes the importance of MCP. As Sean Lynch argued, the real value MCP offers over skills and CLIs is isolating the auth flow outside the agent's context window. Strip away the rest and MCP as a pure auth gateway is still a win.
The pattern underneath all of it: give the agent a clean, legible habitat to work in, and most of the orchestration problem dissolves.
Worth reading:
- My Biggest AI Unlock — It Does Everything: Matt Maher's take, a.k.a. "The folder process" (video)
- Folder Structure as Agentic Architecture (Model Workspace Protocol): The paper arguing filesystem structure can replace framework orchestration (article)
- How I deleted 95% of my agent skills and got better results: A counterintuitive result with a cryptographic anti-cheating trick attached (video)
- Stop Building AI Agents. Use This Folder System Instead.: The folder-of-markdown pattern, shown end to end (video)
- Quoting Sean Lynch: maybe MCP is just an auth gateway: The most useful one-paragraph reframe of what MCP is actually for (article)
Domain knowledge, not seniority, sets your AI collaboration mode
What determines how you work with AI should not be your job title or years of experience. Instead, your knowledge in the specific domain in front of you should drive the collaboration mode. A staff engineer touching Kubernetes for the first time should prompt like a beginner and ask questions aimed at learning. A bootcamp grad on their fifth familiar React component can delegate specced-out implementation work with confidence. Used deliberately, this skill is a massive unlock for everyone: in Learning mode the agent is the most patient teacher you've ever had, which is how a tool that could deskill you instead speeds up how fast you pick up a new domain.
I've been hammering at a small framework for this that I call LEAP: Learning, Exploring, Applying, Producing. In Learning, AI is a teacher and you're building understanding, not shipping code. In Exploring, it's a thinking partner for weighing options. In Applying, it's an implementation assistant you hand clear specs. In Producing, it's a coordination assistant across systems you already understand. The trick is that you reset your mode every time you enter a new domain, regardless of how senior you are elsewhere. Your X years of experience are not evenly distributed over your skillset.
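If it helps to see the reset rule as data, here is a tiny Python sketch. The `Mode` enum mirrors the four LEAP modes and the AI role in each; `mode_for` and the per-person dictionary are hypothetical names I made up for the example.

```python
from enum import Enum


class Mode(Enum):
    # Each value is the role the AI plays in that mode.
    LEARNING = "teacher"
    EXPLORING = "thinking partner"
    APPLYING = "implementation assistant"
    PRODUCING = "coordination assistant"


def mode_for(domain: str, my_modes: dict[str, Mode]) -> Mode:
    """Look up the mode per domain; an unfamiliar domain resets to Learning."""
    return my_modes.get(domain, Mode.LEARNING)
```

With `{"react": Mode.APPLYING}` as the map, `mode_for("kubernetes", ...)` comes back as `Mode.LEARNING`. The key is the domain, never the person's title.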
Anthropic just put hard numbers on this. In their own Claude Code data, novice sessions reach verified success about 15 percent of the time, while intermediate and expert sessions hit 28 to 33 percent, and most of the jump happens moving from novice to intermediate. Experts don't just succeed more, they extract more: their prompts trigger action chains roughly twice as long, carrying roughly five times the output (12 actions and 3,200 words against a novice's 5 actions and 600 words). The detail that should end the "coding is dead" debate: every one of the ten largest occupations in the dataset lands within seven points of software engineers. A deep understanding of the domain matters more than having a programming background.
This is also what all the "is the CS degree dead" panic misses. The fundamentals that let you evaluate AI output, spot the scalability anti-pattern or the subtle security hole, are exactly what let you delegate safely and at scale. You can only review what you understand. The strategic move isn't more AI fluency in the abstract; it's knowing, honestly, which mode you're in for the thing you're working on right now.
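For anyone who wants to check the per-prompt ratios, the arithmetic on the quoted figures is short. The variable names are mine; the four numbers are the ones cited above.

```python
novice = {"actions": 5, "words": 600}
expert = {"actions": 12, "words": 3200}

# How much longer the expert's action chain is, and how much more output it carries.
action_ratio = expert["actions"] / novice["actions"]  # 2.4
word_ratio = expert["words"] / novice["words"]        # about 5.33
```

So the quoted multiples are, if anything, slightly conservative roundings of the underlying numbers.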
Worth reading:
- Agentic coding and persistent returns to expertise: Anthropic's own data showing domain expertise, not coding skill, drives Claude Code success (article)
- I Asked Microsoft's CEO If Coding Is Still Worth Learning: Nadella's answer goes deeper than yes or no, landing on concepts over syntax (video)
- A wake up call for computer science students: Four blunt truths about building real skills in an AI job market (video)
- Why a Computer Science Degree Still Opens Hidden Doors: The case that fundamentals are what make you safe to delegate (article)
Frontier intelligence as a closely guarded resource
Frontier labs spent years describing their models as dangerous and in need of strict control. Last week the US government took that framing literally and issued an export-control directive forcing Anthropic to disable Fable 5 and Mythos 5 for all customers, including its own foreign-national staff. Armin Ronacher caught the schadenfreude precisely: market your technology as a weapon long enough and someone in power will eventually treat it like one.
Steve Yegge's read is that we've crossed into treacherous waters, where model intelligence itself has become the thing governments reach to control. Next year's frontier models won't be accessible to you or me.
For practitioners, the lesson isn't about geopolitics. It's about dependency. A frontier capability that a directive can revoke after a single jailbreak, four days after going generally available, is not something to wire into your critical path. Treat it as one more reason to own your harness and keep your options open.
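One way to read "keep your options open" in code: put a thin seam between your harness and any single model. The Python sketch below is a hypothetical shape, not a real SDK. A provider here is just a callable from prompt to text, and `ProviderUnavailable` is an exception type I invented for the example.

```python
from typing import Callable

# Assumption: a provider is any callable that takes a prompt and returns text.
Provider = Callable[[str], str]


class ProviderUnavailable(Exception):
    """Raised by a provider that has been disabled or revoked."""


def complete_with_fallback(
    prompt: str, providers: list[tuple[str, Provider]]
) -> tuple[str, str]:
    """Try providers in order; return (name, reply) from the first one that works."""
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderUnavailable:
            continue  # this model is gone; fall through to the next option
    raise ProviderUnavailable("no provider available")
```

The ordering of the list is the policy: put the frontier model first if you like, but make sure something you control sits behind it.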
Worth reading:
- The Flat Curve Society: Yegge on the moment model intelligence became politically dangerous (article)
- The Fable 5 Export Controls Harm US Cyber Defense: The "fix this code" jailbreak detail that makes the ban look absurd (article)
- Dangerous Technology For Americans Only: Ronacher on what happens when your danger marketing is believed (article)
- Claude Fable 5 BANNED: The First Model Agentic Engineers DON'T NEED: The dependency argument (video)
Quick Hits
- What is happening at Meta?: A great take on the absolute shit-show that is Meta engineering anno 2026 (video)
- OpenAI Losses Increased Nearly 8X in 2025, Spending Hitting $34 Billion: Ed Zitron on the audited financials, the denominator under all the agent enthusiasm (article)
- Do I Need a Brain Gym?: Cal Newport on cognitive fitness, a counterweight to a week spent worrying about offloaded thinking (video)
- Can't Focus? Watch This Before You Sit Down to Work: Daniel Pink on why we finish a fraction of our best work, and how flow recovers it (video)
- Whale graveyard discovered 7km under the sea: Deep-sea life on fallen whale bones in the Diamantina Zone, the week's non-tech wonder (video)
Curated from 380+ items across articles, podcasts, and videos. Week of June 12–20, 2026.