Essay · June 19, 2026 · 9 min read

The agent that was always five years away

From email schedulers to AI chiefs of staff, a decade of autonomous-agent products died waiting for the models to catch up.

In April 2014, a startup called x.ai introduced two of the first AI agents most people ever delegated a task to. They were named Amy and Andrew Ingram, they lived in your email, and their single job was to schedule meetings. You copied Amy on a thread, she negotiated the time with the other party in plain English, and a calendar invite appeared. It was a clean, narrow, useful idea, and it was a decade early. x.ai spent years quietly employing a back office of human trainers to clean up the natural-language requests the model fumbled, and it never found unit economics that beat the free scheduling links built into everyone's calendar. It sunset Amy and Andrew on October 31, 2021. Its founder, Dennis Mortensen, later wrote that the vertical-agent thesis had arrived a decade too early for the quality of the underlying models. He was describing the entire category.

The AI agent, meaning software that acts on your behalf and completes a task end to end, has been roughly five years from working since at least 2014. The archive is full of the products that placed the bet before the models were ready, and they share a tell. When an agent appears to work before the technology can support it, look for the human standing behind the curtain.

Facebook M is the cleanest example. Launched inside Messenger in August 2015 and pitched as a rival to Siri and Alexa, M promised to do the hard things: book a restaurant, order a gift, plan a trip. The way it did them, in the small private beta it never escaped, was with human contractors fielding the requests the AI could not, on the theory that the model would watch over their shoulders and gradually take over. The model never took over, a human in every loop did not scale, and Facebook closed M in January 2018. The same shape recurs across the category. x.ai had its trainers. Builder.ai, the most expensive agent failure in the archive, ran what it marketed as an AI assistant named Natasha on the backs of roughly 700 human engineers in India before collapsing into insolvency in 2025 with more than $445 million spent. The agent that works before its time usually has a person inside it.

The well-funded, sincere version of the bet was Adept, founded in 2022 by several authors of the transformer paper that made the whole modern wave possible. Adept raised about $415 million to build an enterprise agent that could operate software the way a person does, clicking through the tools a worker already uses. It was as credentialed a team as the field could assemble, and it still could not turn the research into a business before the patience ran out. In June 2024 Amazon hired the founders and key staff, the product was wound down for outside customers, and the FTC opened an inquiry into whether the arrangement was an acquisition wearing a hiring announcement. The transformer's own authors could not make the agent pay on the 2024 timeline.

By 2025 the bet had fragmented into vertical agents, each aimed at one job. theGist built an assistant that read across Slack, email, and CRM to hand a knowledge worker a single prioritized digest a day. Astra, out of Bengaluru, pitched an AI chief of staff that would handle up to 80 percent of a sales rep's grunt work. Hiro, from the founder who had built and sold the savings app Digit, offered a personal AI CFO that could model your income and debts with arithmetic a general chatbot could not be trusted to do. Each was a reasonable, narrow agent. theGist returned its capital to investors and shut down in September 2025. Astra signed two beta customers and folded after four months, its founder citing both long enterprise sales cycles and customer wariness about handing an AI the keys to their CRM. Hiro was acqui-hired into OpenAI five months after launch, the math-verification work it was built on heading into ChatGPT while the app went dark.

There is a particular cruelty in how some of these died. theGist and Astra both pointed to the same culprit: the industry's swing toward autonomous agents, the loudly promoted frontier of 2025, drained demand for the in-between products that did one assistive task well. The category's own hype cycle moved past them. They had built agents that helped a human work, in the season when every deck promised an agent that replaced the human entirely. The promise of the bigger agent made the smaller, shipping one look obsolete before it had a chance to find a market.

Not every agent death was so structural. GameOn, a sports and commerce chatbot, collapsed in a roughly $60 million fraud rather than a failure of the technology. But the exception points at the same underlying difficulty: an agent has to be reliable across an open world where the cost of a wrong action (a double-booked meeting, a bad trade, a misread blood test) is real and immediate. That reliability bar is the thing the demos clear and the products do not.

Why has the agent always been five years away? Because the distance between a demo that completes a task and a product a customer will let act unsupervised is not a straight line. It is the same S-curve that Stefan Seltz-Axmacher described when his self-driving trucking company Starsky shut down: progress on the easy 90 percent is fast and seductive, and progress on the long tail of edge cases that decides whether you can remove the human is slow and expensive. An agent that is right 90 percent of the time is a demo. An agent a business will trust to act on its own needs to be right in the cases that matter, and closing that gap has consistently taken longer than a single funding cycle.

The economics follow from that gap. An agent propped up by humans (x.ai's trainers, M's contractors, Builder.ai's engineers) has the unit economics of a services business wearing a software multiple, and the market eventually prices it as the former. The agents that tried to be fully automated before the models could support them shipped something unreliable, and unreliable is fatal for a product whose entire proposition is that you can stop paying attention. Either way the number that mattered, the cost to serve one reliable action, never fell to where the business needed it.
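
The arithmetic behind that number can be sketched in a few lines. This is an illustration only, assuming a simple model in which the software attempts every action and a person steps in for some fraction of them. The function name, its parameters, and every figure below are hypothetical placeholders, not data from any company named in this essay.

```python
def cost_per_reliable_action(model_cost, human_cost, handoff_rate):
    """Blended cost of one completed action under a human-fallback model.

    Assumption (not from the essay): the model runs on every action, and a
    human is paid only for the share of actions handed off to them.
    All inputs are hypothetical and in the same currency unit.
    """
    if not 0.0 <= handoff_rate <= 1.0:
        raise ValueError("handoff_rate must be between 0 and 1")
    return model_cost + handoff_rate * human_cost


# Made-up inputs: one cent of compute per action, two dollars of human time
# per handoff, at three different handoff rates.
for rate in (0.5, 0.1, 0.01):
    cost = cost_per_reliable_action(
        model_cost=0.01, human_cost=2.00, handoff_rate=rate
    )
    print(f"handoff rate {rate:.0%}: ${cost:.2f} per action")
```

Under these made-up inputs the human term dominates until the handoff rate is very small, which is one way to read the point that a human-backed agent is priced like a services business.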

The agent may finally be arriving. The models of 2025 and 2026 can hold a plan across many steps in a way the models of 2014 could not, and the survivors in this category are mostly features inside the labs' own products rather than standalone companies. But that is the point of the archive. The products filed here are the ones that proved, at their own expense, that the agent was not here yet. Mortensen's line about being a decade too early is the epitaph for more than x.ai. It belongs to the whole category, and the strange thing about being a decade early is that it looks exactly like being wrong, right up until someone else is on time.
