2026-03-05 01:34:42
Agentic Engineering Patterns >
There are some behaviors that are anti-patterns in our weird new world of agentic engineering.
This anti-pattern is common and deeply frustrating.
Don't file pull requests with code you haven't reviewed yourself.
If you open a PR with hundreds (or thousands) of lines of code that an agent produced for you, and you haven't done the work to ensure that code is functional yourself, you are delegating the actual work to other people.
They could have prompted an agent themselves. What value are you even providing?
If you put code up for review you need to be confident that it's ready for other people to spend their time on it. The initial review pass is your responsibility, not something you should farm out to others.
A good agentic engineering pull request has the following characteristics:
Given how easy it is to dump unreviewed code on other people, I recommend including some form of evidence that you've put that extra work in yourself. Notes on how you manually tested it, comments on specific implementation choices or even screenshots and video of the feature working go a long way to demonstrating that a reviewer's time will not be wasted digging into the details.
Tags: ai, llms, ai-ethics, coding-agents, ai-assisted-programming, generative-ai, agentic-engineering, code-review
2026-03-04 23:50:03
I'm behind on writing about Qwen 3.5, a truly remarkable family of open weight models released by Alibaba's Qwen team over the past few weeks. I'm hoping that the 3.5 family doesn't turn out to be Qwen's swan song, seeing as that team has had some very high profile departures in the past 24 hours.
It all started with this tweet from Junyang Lin (@JustinLin610):
me stepping down. bye my beloved qwen.
Junyang Lin was the lead researcher building Qwen, and was key to releasing their open weight models from 2024 onwards.
As far as I can tell a trigger for this resignation was a re-org within Alibaba where a new researcher hired from Google's Gemini team was put in charge of Qwen, but I've not confirmed that detail.
More information is available in this article from 36kr.com. Here's Wikipedia on 36Kr confirming that it's a credible media source established in 2010 with a good track record reporting on the Chinese technology industry.
The article is in Chinese - here are some quotes translated via Google Translate:
At approximately 1:00 PM Beijing time on March 4th, Tongyi Lab held an emergency All Hands meeting, where Alibaba Group CEO Wu Yongming frankly told Qianwen employees.
Twelve hours ago (at 0:11 AM Beijing time on March 4th), Lin Junyang, the technical lead for Alibaba's Qwen Big Data Model, suddenly announced his resignation on X. Lin Junyang was a key figure in promoting Alibaba's open-source AI models and one of Alibaba's youngest P10 employees. Amidst the industry uproar, many members of Qwen were also unable to accept the sudden departure of their team's key figure.
"Given far fewer resources than competitors, Junyang's leadership is one of the core factors in achieving today's results," multiple Qianwen members told 36Kr. [...]
Regarding Lin Junyang's whereabouts, no new conclusions were reached at the meeting. However, around 2 PM, Lin Junyang posted again on his WeChat Moments, stating, "Brothers of Qwen, continue as originally planned, no problem," without explicitly confirming whether he would return. [...]
That piece also lists several other key members who have apparently resigned:
With Lin Junyang's departure, several other Qwen members also announced their departure, including core leaders responsible for various sub-areas of Qwen models, such as:
Binyuan Hui: Lead Qwen code development, principal of the Qwen-Coder series models, responsible for the entire agent training process from pre-training to post-training, and recently involved in robotics research.
Bowen Yu: Lead Qwen post-training research, graduated from the University of Chinese Academy of Sciences, leading the development of the Qwen-Instruct series models.
Kaixin Li: Core contributor to Qwen 3.5/VL/Coder, PhD from the National University of Singapore.
Besides the aforementioned individuals, many young researchers also resigned on the same day.
Based on the above it looks to me like everything is still very much up in the air. The presence of Alibaba's CEO at the "emergency All Hands meeting" suggests that the company understands the significance of these resignations and may yet retain some of the departing talent.
This story hits particularly hard right now because the Qwen 3.5 models appear to be exceptionally good.
I've not spent enough time with them yet but the scale of the new model family is impressive. They started with Qwen3.5-397B-A17B on February 17th - an 807GB model - and then followed with a flurry of smaller siblings in 122B, 35B, 27B, 9B, 4B, 2B, 0.8B sizes.
I'm hearing positive noises about the 27B and 35B models for coding tasks that still fit on a 32GB/64GB Mac, and I've tried the 9B, 4B and 2B models and found them to be notably effective considering their tiny sizes. That 2B model is just 4.57GB - or as small as 1.27GB quantized - and is a full reasoning and multi-modal (vision) model.
It would be a real tragedy if the Qwen team were to disband now, given their proven track record in continuing to find new ways to get high quality results out of smaller and smaller models.
If those core Qwen team members either start something new or join another research lab I'm excited to see what they do next.
Tags: ai, generative-ai, llms, qwen, ai-in-china
2026-03-04 07:59:04
Shock! Shock! I learned yesterday that an open problem I'd been working on for several weeks had just been solved by Claude Opus 4.6 - Anthropic's hybrid reasoning model that had been released three weeks earlier! It seems that I'll have to revise my opinions about "generative AI" one of these days. What a joy it is to learn not only that my conjecture has a nice solution but also to celebrate this dramatic advance in automatic deduction and creative problem solving.
— Donald Knuth, Claude's Cycles
Tags: november-2025-inflection, claude, generative-ai, ai, llms, donald-knuth, llm-reasoning, anthropic
2026-03-04 05:53:54
Google's latest model is an update to their inexpensive Flash-Lite family. At $0.25/million input tokens and $1.50/million output tokens, this is 1/8th the price of Gemini 3.1 Pro.
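To make those prices concrete, here's a small sketch that works out the cost of a single request. The two prices come from the figures above; the token counts in the example are hypothetical, picked only to show the arithmetic.

```python
# Prices quoted above, in US dollars per million tokens.
INPUT_PRICE_PER_MILLION = 0.25
OUTPUT_PRICE_PER_MILLION = 1.50


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the cost in dollars of one request at the quoted Flash-Lite prices."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


# Hypothetical workload: 10,000 input tokens and 2,000 output tokens.
print(f"${request_cost(10_000, 2_000):.4f}")  # $0.0055
```

If the 1/8th ratio holds for both directions, the same request against Gemini 3.1 Pro would cost eight times as much, so roughly $0.044.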
It supports four different thinking levels, so I had it output four different pelicans:
minimal
low
medium
high
Tags: google, ai, generative-ai, llms, llm, gemini, llm-pricing, pelican-riding-a-bicycle, llm-release
2026-03-03 00:35:10
I like to include animated GIF demos in my online writing, often recorded using LICEcap. There's an example in the Interactive explanations chapter.
These GIFs can be pretty big. I've tried a few tools for optimizing GIF file size and my favorite is Gifsicle by Eddie Kohler. It compresses GIFs by identifying regions of frames that have not changed and storing only the differences, and can optionally reduce the GIF color palette or apply visible lossy compression for greater size reductions.
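The "store only the differences" idea can be illustrated with a tiny sketch. This is not Gifsicle's actual algorithm, which is considerably more sophisticated. It only shows the core trick: a GIF frame can cover a smaller rectangle than the full image, so an optimizer can keep just the region that changed. The Frame and Box types and the changed_region function are hypothetical names invented for this example.

```python
from typing import Optional

Frame = list[list[int]]  # rows of palette indexes, one int per pixel
Box = tuple[int, int, int, int]  # left, top, right, bottom (inclusive)


def changed_region(previous: Frame, current: Frame) -> Optional[Box]:
    """Return the smallest rectangle covering every changed pixel, or None if nothing changed."""
    changed = [
        (x, y)
        for y, (old_row, new_row) in enumerate(zip(previous, current))
        for x, (old, new) in enumerate(zip(old_row, new_row))
        if old != new
    ]
    if not changed:
        return None
    xs = [x for x, _ in changed]
    ys = [y for _, y in changed]
    return (min(xs), min(ys), max(xs), max(ys))


frame_1 = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
frame_2 = [
    [0, 0, 0, 0],
    [0, 1, 2, 0],
    [0, 0, 2, 0],
]
print(changed_region(frame_1, frame_2))  # (2, 1, 2, 2)
```

In this toy example only a 1x2 strip needs to be stored for the second frame instead of all 12 pixels.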
Gifsicle is written in C and the default interface is a command line tool. I wanted a web interface so I could access it in my browser and visually preview and compare the different settings.
I prompted Claude Code for web (from my iPhone using the Claude iPhone app) against my simonw/tools repo with the following:
Here's what it built, plus an animated GIF demo that I optimized using the tool:

Let's address that prompt piece by piece.
gif-optimizer.html
The first line simply tells it the name of the file I want to create. Just a filename is enough here - I know that when Claude runs "ls" on the repo it will understand that every file is a different tool.
My simonw/tools repo currently lacks a CLAUDE.md or AGENTS.md file. I've found that agents pick up enough of the gist of the repo just from scanning the existing file tree and looking at relevant code in existing files.
Compile gifsicle to WASM, then build a web page that lets you open or drag-drop an animated GIF onto it and it then shows you that GIF compressed using gifsicle with a number of different settings, each preview with the size and a download button
I'm making a bunch of assumptions here about Claude's existing knowledge, all of which paid off.
Gifsicle is nearly 30 years old now and is a widely used piece of software - I was confident that referring to it by name would be enough for Claude to find the code.
"Compile gifsicle to WASM" is doing a lot of work here.
WASM is short for WebAssembly, the technology that lets browsers run compiled code safely in a sandbox.
Compiling a project like Gifsicle to WASM is not a trivial operation. It needs a complex toolchain, usually built around the Emscripten project, and it often requires a lot of trial and error to get everything working.
Coding agents are fantastic at trial and error! They can often brute force their way to a solution where I would have given up after the fifth inscrutable compiler error.
I've seen Claude Code figure out WASM builds many times before, so I was quite confident this would work.
"then build a web page that lets you open or drag-drop an animated GIF onto it" describes a pattern I've used in a lot of my other tools.
HTML file uploads work fine for selecting files, but a nicer UI, especially on desktop, is to allow users to drag and drop files into a prominent drop zone on a page.
Setting this up involves a bit of JavaScript to process the events and some CSS for the drop zone. It's not complicated but it's enough extra work that I might not normally add it myself. With a prompt it's almost free.
Here's the resulting UI - which was influenced by Claude taking a peek at my existing image-resize-quality tool:

I didn't ask for the GIF URL input and I'm not keen on it, because it only works against URLs to GIFs that are served with open CORS headers. I'll probably remove that in a future update.
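For anyone unfamiliar with that CORS restriction, here's a deliberately simplified sketch of the rule the browser applies: JavaScript on a page can only read a response from another origin if the Access-Control-Allow-Origin response header is either * or the page's own origin. The function name is hypothetical, and the check ignores credentials, preflight requests and same-origin fetches.

```python
def readable_cross_origin(response_headers: dict[str, str], page_origin: str) -> bool:
    """Simplified check: would a browser let a page at page_origin read this response?"""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value.strip() for name, value in response_headers.items()}
    allowed = normalized.get("access-control-allow-origin")
    return allowed == "*" or allowed == page_origin


origin = "https://tools.simonwillison.net"
print(readable_cross_origin({"Access-Control-Allow-Origin": "*"}, origin))  # True
print(readable_cross_origin({"Content-Type": "image/gif"}, origin))  # False
```

Most image hosts don't send that header, which is why the URL input fails for the majority of GIFs found on the web.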
"then shows you that GIF compressed using gifsicle with a number of different settings, each preview with the size and a download button" describes the key feature of the application.
I didn't bother defining the collection of settings I wanted - in my experience Claude has good enough taste at picking those for me, and we can always change them if its first guesses don't work.
Showing the size is important since this is all about optimizing for size.
I know from past experience that asking for a "download button" gets a button with the right HTML and JavaScript mechanisms set up such that clicking it provides a file save dialog, which is a nice convenience over needing to right-click-save-as.
Also include controls for the gifsicle options for manual use - each preview has a “tweak these settings” link which sets those manual settings to the ones used for that preview so the user can customize them further
This is a pretty clumsy prompt - I was typing it on my phone after all - but it expressed my intention well enough for Claude to build what I wanted.
Here's what that looks like in the resulting tool. This screenshot shows the mobile version. Each image has a "Tweak these settings" button which, when clicked, updates this set of manual settings and sliders:

Run “uvx rodney --help” and use that tool to tray your work - use this GIF for testing https://static.simonwillison.net/static/2026/animated-word-cloud-demo.gif
Coding agents work so much better if you make sure they have the ability to test their code while they are working.
There are many different ways to test a web interface - Playwright and Selenium and agent-browser are three solid options.
Rodney is a browser automation tool I built myself, which is quick to install and has --help output that's designed to teach an agent everything it needs to know to use the tool.
This worked great - in the session transcript you can see Claude using Rodney and fixing some minor bugs that it spotted, for example:
The CSS display: none is winning over the inline style reset. I need to set display: 'block' explicitly.
When I'm working with Claude Code I usually keep an eye on what it's doing so I can redirect it while it's still in flight. I also often come up with new ideas while it's working which I then inject into the queue.
Include the build script and diff against original gifsicle code in the commit in an appropriate subdirectory
The build script should clone the gifsicle repo to /tmp and switch to a known commit before applying the diff - so no copy of gifsicle in the commit but all the scripts needed to build the wqsm
I added this when I noticed it was putting a lot of effort into figuring out how to get Gifsicle working with WebAssembly, including patching the original source code. Here's the patch and the build script it added to the repo.
I knew there was a pattern in that repo already for where supporting files lived but I couldn't remember what that pattern was. Saying "in an appropriate subdirectory" was enough for Claude to figure out where to put it - it found and used the existing lib/ directory.
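The clone, pin and patch approach from that prompt can be sketched like this. It is not the build script Claude wrote: the repository URL, the placeholder commit and the build_commands function are all assumptions made for illustration, and the Emscripten compile step is left out because its exact flags aren't described here. The sketch only assembles the git commands so the order of operations is visible.

```python
import shlex
from pathlib import Path

# Assumption: the upstream Gifsicle repository on GitHub.
GIFSICLE_REPO = "https://github.com/kohler/gifsicle"
# Placeholder only: a real build script would pin a specific commit here.
PINNED_COMMIT = "<known-commit-sha>"


def build_commands(patch_file: Path, checkout_dir: Path = Path("/tmp/gifsicle")) -> list[list[str]]:
    """Return the git commands to run, in order, before compiling."""
    return [
        # 1. Fetch a fresh copy of the source, so nothing is vendored in the repo.
        ["git", "clone", GIFSICLE_REPO, str(checkout_dir)],
        # 2. Pin to a known commit so the patch always applies cleanly.
        ["git", "-C", str(checkout_dir), "checkout", PINNED_COMMIT],
        # 3. Apply the stored diff; use an absolute path because -C changes directory.
        ["git", "-C", str(checkout_dir), "apply", str(patch_file.resolve())],
    ]


for command in build_commands(Path("gifsicle.patch")):
    print(shlex.join(command))
```

Pinning the commit is the important part: the stored diff is only guaranteed to apply against the exact source it was generated from.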
You should include the wasm bundle
This probably wasn't necessary, but I wanted to make absolutely sure that the compiled WASM file (which turned out to be 233KB) was committed to the repo. I serve simonw/tools via GitHub Pages at tools.simonwillison.net and I wanted it to work without needing to be built locally.
Make sure the HTML page credits gifsicle and links to the repo
This is just polite! I often build WebAssembly wrappers around other people's open source projects and I like to make sure they get credit in the resulting page.
Claude added this to the footer of the tool:
Built with gifsicle by Eddie Kohler, compiled to WebAssembly. gifsicle is released under the GNU General Public License, version 2.
Tags: claude, ai, claude-code, llms, prompt-engineering, webassembly, coding-agents, tools, generative-ai, gif, agentic-engineering
2026-03-02 22:53:15
I just sent the February edition of my sponsors-only monthly newsletter. If you are a sponsor (or if you start a sponsorship now) you can access it here. In this month's newsletter:
Here's a copy of the January newsletter as a preview of what you'll get. Pay $10/month to stay a month ahead of the free copy!
I use Claude as a proofreader for spelling and grammar via this prompt which also asks it to "Spot any logical errors or factual mistakes". I'm delighted to report that Claude Opus 4.6 called me out on this one:

Tags: newsletter, kakapo, claude