Agentic Engineering Weekly for June 19–27, 2026

The spotlight slid off the model and onto everything we wrap around it. We stopped prompting and started writing loops, finance finally found where the tokens are going (on PowerPoint, it turns out), and agents are starting to show up in the team chat with their own logins. The engineering that matters now lives in the harness, not the model. How much of what we ship do we still genuinely understand? Let's dive in.

My top 3 picks this week



More on Loops

"I don't prompt Claude anymore," Boris Cherny said in a line that is still being passed around. "I have loops running that prompt Claude and figuring out what to do. My job is to write loops." Work goes into a queue, a cron job picks it up, works on it until completion, and picks up the next item. The thing you author is no longer just the spec.

"Loop" tooling is becoming more widespread. Most harnesses support cron-like scheduling. OpenAI's Codex threads can spawn new threads themselves, allowing for recursive workflows, and Claude Code generates code on the fly that itself spawns dynamic workflows.

Public loop libraries are starting to show up: reusable, named loops you can drop into your own setup. When a pattern goes from "interesting idea" to "here is a library of them" that fast, it has stopped being speculation. If you are still hand-driving a chat window, the move this quarter is to wrap one repetitive task in a loop and watch where it takes you.
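
If you want a feel for the queue-and-worker shape described above, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `run_loop`, `fake_agent`, and the `max_attempts` cap are made-up names, and `fake_agent` stands in for whatever model call your harness makes. It is not how any particular tool implements loops.

```python
from __future__ import annotations

from collections import deque
from typing import Callable


def run_loop(
    queue: deque[str],
    work_on: Callable[[str], bool],
    max_attempts: int = 3,
) -> dict[str, str]:
    """Drain the queue, retrying each task until work_on reports completion."""
    results: dict[str, str] = {}
    while queue:
        task = queue.popleft()  # pick up the next item
        for attempt in range(1, max_attempts + 1):
            if work_on(task):  # True means the task is complete
                results[task] = f"done after {attempt} attempt(s)"
                break
        else:
            # The cap keeps one stubborn task from running forever.
            results[task] = "gave up"
    return results


def fake_agent(task: str) -> bool:
    """Hypothetical stand-in for a model call; fails on 'flaky' tasks."""
    return "flaky" not in task
```

A scheduler (cron or otherwise) would call `run_loop` on whatever is waiting in the queue. The attempt cap is the part worth keeping even in a toy version.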

Worth reading:


Lots of tokens are being spent on tasks an LLM should not be doing

This week we learned that at Accenture, the big consulting firm, a major source of AI token "chewing" turns out to be office workers converting PDFs into presentation slides. Sit with that. The cost of frontier intelligence, spent at scale, on generating slide decks.

Cheap tokens spent on work that should not exist are still waste. The lever to pull isn't just the price per token. It is which kinds of work you point AI at. Just ask your AI to generate a darn PDF-to-Markdown Python script already next time you want a summary of a large document.
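
For the curious, such a script can be very small. The sketch below assumes the third-party `pypdf` package for text extraction; the function names and the one-heading-per-page layout are my own choices, not a standard. It only handles PDFs with an extractable text layer, so scanned documents would need OCR first.

```python
from __future__ import annotations


def pages_to_markdown(pages: list[str], title: str = "Document") -> str:
    """Turn a list of page texts into Markdown with one heading per page."""
    parts = [f"# {title}"]
    for number, text in enumerate(pages, start=1):
        # Drop blank lines and stray whitespace left over from extraction.
        body = "\n".join(line.strip() for line in text.splitlines() if line.strip())
        parts.append(f"## Page {number}\n\n{body}")
    return "\n\n".join(parts) + "\n"


def pdf_to_markdown(path: str) -> str:
    """Extract text with pypdf (assumed installed: pip install pypdf)."""
    from pypdf import PdfReader  # imported here so the formatter works without it

    reader = PdfReader(path)
    pages = [page.extract_text() or "" for page in reader.pages]
    return pages_to_markdown(pages, title=path)
```

Run it once, then hand the model the Markdown instead of the PDF.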

Sean Goedecke makes the case that serving inference is obviously profitable, and that the "it must all be subsidized" story does not survive contact with the actual unit economics.

Inference is a real business, and most organizations are spending on it with no idea whether the output is worth the cost. Stop metering tokens, start measuring ROI on token spend. The absurd edge case of the week, two competing review agents locking into a $41K disagreement loop over a single dependency, is funny right up until it is your API key burning those tokens.
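
Both halves of that advice fit in a few lines. This is a sketch under stated assumptions: `TokenBudget`, `BudgetExceeded`, and `cost_per_outcome` are hypothetical names, the prices are invented, and "accepted outcome" means whatever your team counts as useful output (a merged PR, a closed ticket).

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a charge would push spend past the cap."""


@dataclass
class TokenBudget:
    limit_usd: float
    spent_usd: float = 0.0

    def charge(self, tokens: int, usd_per_million: float) -> None:
        """Record spend, refusing any charge that would exceed the limit."""
        cost = tokens / 1_000_000 * usd_per_million
        if self.spent_usd + cost > self.limit_usd:
            raise BudgetExceeded(
                f"charge of ${cost:.2f} would exceed ${self.limit_usd:.2f} cap"
            )
        self.spent_usd += cost


def cost_per_outcome(spent_usd: float, accepted_outcomes: int) -> float:
    """Spend divided by useful results; infinite if nothing was accepted."""
    if accepted_outcomes == 0:
        return float("inf")
    return spent_usd / accepted_outcomes
```

Give each loop its own `TokenBudget`, and two agents arguing over a dependency hit the cap long before they hit your finance team.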

Worth reading:


Verum factum: you only really know what you made

An old idea I've been going on and on about resurfaced this week. Verum factum, the maxim that we only really know what we made, is the philosophical twin of a feeling every engineer using AI has had and few have named. You shipped the code but you understand it less than anything you ever hand-built. Jessitron writes about missing that gut-level certainty about her own software now that the model does the coding, and then the strange moments when she still gets it back. The problem defined: the loss of the confidence that comes from having built the thing yourself. It's not that the generated code is bad. You're no longer holding the theory of why it works, and the tests only catch what you already knew to specify upfront. You cannot unit test for taste and understanding.

So the goal is not to type more lines out of pride, and it is not about being pro- or anti-AI. It is to keep the creator's confidence even when you did not write the code, which means deliberately rebuilding the theory that generation skipped: reading the diff like you would review a colleague, reconstructing the why, keeping a mental model that survives the next change.

Worth reading:


Memory was step one, dreaming is the next paradigm

Memory was one of the biggest unlocks of the last year: it gave agents growing context. This week the frontier conversation moved past it. The proposed next step has a deliberately evocative name, dreaming: a second-derivative process that periodically prunes and curates what an agent has learned, the way sleep consolidates a day rather than just recording it. If you've ever messed around with Hermes agent, this might all sound familiar; this harness has been generating its own custom skills and memory banks for a good while now.
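
To make the prune-and-curate idea concrete, here is a toy consolidation pass. It is a sketch, not how any agent or product implements dreaming: the `Memory` fields, the 30-day cutoff, and the ranking rule are all assumptions picked for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    uses: int           # how often the agent recalled this entry
    last_used_day: int  # day number of the most recent recall


def consolidate(
    memories: list[Memory],
    today: int,
    max_age_days: int = 30,
    keep: int = 100,
) -> list[Memory]:
    """Merge near-duplicates, drop stale one-offs, keep the most-used entries."""
    merged: dict[str, Memory] = {}
    for m in memories:
        key = " ".join(m.text.lower().split())  # ignore case and spacing
        if key in merged:
            prev = merged[key]
            merged[key] = Memory(
                prev.text,
                prev.uses + m.uses,
                max(prev.last_used_day, m.last_used_day),
            )
        else:
            merged[key] = Memory(m.text, m.uses, m.last_used_day)
    # Keep an entry if it is recent, or if it was recalled more than once.
    kept = [
        m for m in merged.values()
        if today - m.last_used_day <= max_age_days or m.uses > 1
    ]
    kept.sort(key=lambda m: (m.uses, m.last_used_day), reverse=True)
    return kept[:keep]
```

A real system would likely use a model to judge what is worth keeping. The point of the sketch is only that consolidation is a separate, periodic pass over memory, not part of writing to it.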

It connects to a bet several people are now making out loud. If pretraining is bumping into a data wall, the next leap is not a bigger run but models that keep learning after deployment, closing the gap between a frozen checkpoint and an employee who actually improves on the job. The memory layer is one way to simulate that. A system that keeps learning has a different shape, and it changes the questions that matter: not "how big is the context window" but "what context does the model carry forward, and who decides."

Worth reading:


Single-player to multiplayer: agents are getting their own identity on the team

Anthropic shipped Claude Tag this week and at first glance it looks like a Slack bot. Karpathy disagreed, calling it "a new paradigm for interacting with Claude," and the reason is worth understanding even if you never touch the product. The novelty is the agent joining your team's collaboration surface: picking up a channel's existing context, working alongside people for days, and acting under its own scoped, auditable identity instead of borrowing your credentials. Agentic work, so far a solo activity, is going multiplayer.

The moment an agent operates autonomously across shared spaces, "it runs with your credentials" stops being acceptable. You need it scoped to each channel, governed by admins, and fully auditable, because otherwise every action it takes is laundered through a human's credentials with no way to tell who actually did what. This is the same boundary problem that has dogged service accounts forever, surfacing again now that the service account can reason and act on its own.
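
The scoped-and-auditable requirement is easier to reason about with a toy model. The sketch below is purely illustrative and is not Claude Tag's API or any vendor's: `AgentIdentity`, `AuditLog`, and `act` are hypothetical names, and channel scoping is reduced to a set lookup.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentIdentity:
    agent_id: str                      # the agent's own login, not a human's
    allowed_channels: frozenset[str]   # scope granted by an admin


@dataclass
class AuditLog:
    entries: list[dict] = field(default_factory=list)

    def record(self, actor: str, channel: str, action: str, allowed: bool) -> None:
        self.entries.append(
            {"actor": actor, "channel": channel, "action": action, "allowed": allowed}
        )


def act(identity: AgentIdentity, channel: str, action: str, log: AuditLog) -> bool:
    """Allow the action only inside the agent's scope; log every attempt."""
    allowed = channel in identity.allowed_channels
    log.record(identity.agent_id, channel, action, allowed)
    return allowed
```

Denied attempts are logged too, and every entry names the agent rather than a person. That is what makes "who actually did what" answerable afterwards.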

Maggie Appleton points out that agentic engineering has been a single-player story: one developer, a dozen agents, moving fast in a personal CLI with no shared context. Scale that across a team and you get duplication, drift, and a lot of wasted tokens, because speed without alignment is just expensive divergence. The moat is made of people, and the habitat they hold with (and for) agents.

Worth reading:


Senior engineers are publishing their harnesses like dotfiles

More principal and staff engineers are showing their complete agentic setup, end to end. An L8 principal walks through his whole stack. The value is not any single tool but seeing how a senior wires them into one coherent system.

Google released its own playbook this week: vibe coding does not scale, agentic engineering does. The model accounts for 10%. The other 90% is the harness: the context, tooling, and feedback loops you build around the model. If that ratio is even roughly right, then the setup itself becomes the valuable asset, the copyable artifact, which is exactly why these workflow showcases are suddenly everywhere. The dotfiles repo of the AI age has arrived.

The thing to take from this is not any individual's vibecoded tool list, which will be stale in a quarter. It is the move itself. Treat your harness as a first-class artifact worth versioning, documenting, and sharing, the same way you would your editor config. Watching how a strong engineer composes their loop teaches more than any framework tutorial, because the composition, the judgement about what to automate and what to keep in your hands, is the actual skill.

Worth reading:


Quick Hits


Curated from 30+ sources across articles, podcasts, and videos. Week of June 19–27, 2026.