That’s what the folks at Hugging Face concluded in their recent paper, “Fully Autonomous AI Agents Should Not be Developed”.
Two elements of the paper are surprising:
The first surprise is, obviously, the claim that you shouldn’t build a fully autonomous agent. These are systems that can write and subsequently run their own code in order to achieve their task.
We aren’t talking about coding agents here; those are still fine. A fully autonomous agent isn’t bounded by a fixed set of tools: it can use coding agents to build new tools for itself.
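To make the distinction concrete, here is a toy sketch. It is my own illustration, not code from the paper, and `call_llm` is a hypothetical stand-in for whatever model provider you use:

```python
# Toy sketch of the distinction between a bounded agent and a fully
# autonomous one. `call_llm` is a hypothetical stand-in for any
# chat-completion call; plug in your provider of choice.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("wire up your model provider here")

# A bounded agent: the developer fixes the tool set up front.
TOOLS = {
    "search": lambda query: f"results for {query!r}",
    "shout": lambda text: text.upper(),
}

def bounded_agent(task: str) -> str:
    # The model only picks among pre-approved tools;
    # the worst case is a bad pick.
    tool_name = call_llm(f"Pick one tool from {sorted(TOOLS)} for: {task}")
    return TOOLS[tool_name](task)

# A fully autonomous agent: it writes and then runs its *own* new tools.
def fully_autonomous_agent(task: str) -> str:
    source = call_llm(f"Write a Python function solve(task) for: {task}")
    namespace: dict = {}
    exec(source, namespace)  # runs model-generated code: unbounded failure modes
    return namespace["solve"](task)
```

The one-line difference (`exec` on model-generated code instead of a lookup into a fixed table) is exactly where the risk profile changes.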
The Hugging Face researchers even ran a full grid analysis across multiple dimensions. No matter which dimension you try to remove, the risk-reward ratio of a fully autonomous agent doesn’t look good.
As with all things engineering, the sweet spot seems to be to start with the simplest possible solution and move up in complexity later if need be.
Surprisingly, this sentiment from Hugging Face is also echoed in an interview with Anthropic engineers:
I don’t know if these two engineers were briefed before the interview, but their candor about the fact that agents are not really working in production is hilarious. Peak AI comedy; I highly recommend watching the full thing.
The second surprising thing is more semantic: there is no clear definition of what an agent is.
You might be saying “Yes, it’s because of these pesky AI companies hyping up the field with nonsense”, and you would be partially right.
The issue runs much deeper than that; this quote is the eye-opener:
The question what is an agent? is embarrassing for the agent-based computing community in just the same way that the question what is intelligence? is embarrassing for the mainstream AI community.
The problem is that although the term is widely used, by many people working in closely related areas, it defies attempts to produce a single universally accepted definition.
This quote is from Wooldridge & Jennings, writing back in 1995.
1995, folks.
This term is at the same level of murkiness as AGI, intelligence, or consciousness at this point.
This, I believe, requires engineers working in the field to use much more precise words for what they are building (workflows, prompt chaining, orchestrator-workers, evaluator-optimizers, etc.).
A good place to start learning this vocabulary is this blog post by Anthropic, which is very clear and devoid of hype (it’s related to the interview mentioned earlier):
The team has been gracious enough to provide plenty of examples (text and code) of what each of the patterns they call workflows looks like.
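To give a flavor of the simplest of those patterns, here is a minimal prompt-chaining sketch. Again, this is my own illustration rather than code from the Anthropic post, and `call_llm` is the same hypothetical stand-in as before:

```python
# Minimal prompt-chaining sketch: a fixed sequence of LLM calls where each
# step's output feeds the next, with a programmatic gate in between.
# A workflow, not an agent: the code, not the model, decides the control flow.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("wire up your model provider here")

def summarize_with_chain(document: str) -> str:
    # Step 1: extract the key points.
    outline = call_llm(f"List the key points of this document:\n{document}")

    # Gate: fail fast instead of chaining garbage forward.
    if not outline.strip():
        raise ValueError("empty outline, aborting the chain")

    # Step 2: turn the key points into a draft summary.
    draft = call_llm(f"Write a short summary covering these points:\n{outline}")

    # Step 3: polish the tone.
    return call_llm(f"Rewrite this summary in a neutral, plain tone:\n{draft}")
```

Every step is a plain, inspectable function call, which is exactly what makes this kind of system easy to debug compared to a free-roaming agent.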
btw, if you are new to agents, workflows, or agentic systems, I made a tutorial on them last week using the resources above:
ps: if you know a thumbnail guy, send me a DM; mine suck.
All in all, there isn’t anything too new on the software engineering side in the statement from Hugging Face.
The old adage still holds in the stochastic realm of AI engineering:
Build the least complex solution possible that meets the requirements.
If you want to test new tech, like a swarm of AI agents, do it in a hobby project, not in production.