
The president’s pick to sit on an appellate court covering Silicon Valley has represented several blockchain entities in courts.


The president’s pick to sit on an appellate court covering Silicon Valley has represented several blockchain entities in courts.

As Brazil’s Pix system expands and BRICS eyes a reserve currency, Trump responds with a 50% tariff and a sweeping trade investigation.

DOGE gained 18% this week, and multiple data points suggest a 300% rally is possible before the end of 2025.
Imagine AI so sophisticated it could read a customer’s mind? Or identify and close a cybersecurity loophole weeks before hackers strike? How about a team of AI agents equipped to restructure a global supply chain and circumnavigate looming geopolitical disruption? Such disruptive possibilities explain why agentic AI is sending ripples of excitement through corporate boardrooms.

Although still so early in its development that there lacks consensus on a single, shared definition, agentic AI refers loosely to a suite of AI systems capable of connected and autonomous decision-making with zero or limited human intervention. In scenarios where traditional AI typically requires explicit prompts or instructions for each step, agentic AI will independently execute tasks, learning and adapting to its environment to refine decisions over time.
From assuming oversight for complex workflows, such as procurement or recruitment, to carrying out proactive cybersecurity checks or automating support, enterprises are abuzz at the potential use cases for agentic AI.
According to one Capgemini survey, 50% of business executives are set to invest in and implement AI agents in their organizations in 2025, up from just 10% currently. Gartner has also forecast that 33% of enterprise software applications will incorporate agentic AI by 2028. For context, in 2024 that proportion was less than 1%.
“It’s creating such a buzz – software enthusiasts seeing the possibilities unlocked by LLMs, venture capitalists wanting to find the next big thing, companies trying to find the ‘killer app,” says Matt McLarty, chief technology officer at Boomi. But, he adds, “right now organizations are struggling to get out of the starting blocks.”
The challenge is that many organizations are so caught up in the excitement that they risk attempting to run before they can walk when it comes to deployment of agentic AI, believes McLarty. And in so doing they risk turning it from potential business breakthrough into a source of cost, complexity, and confusion.
The heady capabilities of agentic AI have created understandable temptation for senior business leaders to rush in, acting on impulse rather than insight risks turning the technology into a solution in search of a problem, points out McLarty.
It’s a scenario that’s unfolded with previous technologies. The decoupling of Blockchain from Bitcoin in 2014 paved the way for a Blockchain 2.0 boom in which organizations rushed to explore the applications for a digital, decentralized ledger beyond currency. But a decade on, the technology has fallen far short of forecasts at the time, dogged by technology limitations and obfuscated use cases.
“I do see Blockchain as a cautionary tale,” says McLarty. “The hype and ultimate lack of adoption is definitely a path the agentic AI movement should avoid.” He explains, “The problem with Blockchain is that people struggle to find use cases where it applies as a solution, and even when they find the use cases, there is often a simpler and cheaper solution,” he adds. “I think agentic AI can do things no other solution can, in terms of contextual reasoning and dynamic execution. But as technologists, we get so excited about the technology, sometimes we lose sight of the business problem.”
Instead of diving in headfirst, McLarty advocates for an iterative attitude toward applications of agentic AI, targeting “low-hanging fruit” and incremental use cases. This includes focusing investment on the worker agents that are set to make up the components of more sophisticated, multi-agent agentic systems further down the road.
However, with a narrower, more prescribed remit, these AI agents with agentic capabilities can add instant value. Enabled with natural language processing (NLP) they can be used to bridge the linguistic shortfalls in current chat agents for example or adaptively carry out rote tasks via dynamic automation.
“Current rote automation processes generate a lot of value for organizations today, but they can lead to a lot of manual exception processing,” points out McLarty. “Agentic exception handling agents can eliminate a lot of that.”
It’s also essential to avoid use cases for agentic AI that could be addressed with a cheaper and simpler technology. “Configuring a self-manager, ephemeral agent swarm may sound exciting and be exhilarating to build, but maybe you can just solve the problem with a simple reasoning agent that has access to some in-house contextual data and API-based tools,” says McLarty. “Let’s call it the KASS principle: Keep agents simple, stupid.”
The future value of agentic AI will lie in its interoperability and organizations that prioritize this pillar at the earliest phase of their adoption will find themselves ahead of the curve.
As McLarty explains, the usefulness of agentic AI agents in scenarios like customer support chats lies in their combination of four elements: a defined business scope, large language models (LLM), the wider context derived from an organization’s existing data, and capabilities executed through its core applications. These latter two rely on in-built interoperability. For example, an AI agent tasked with onboarding new employees will require access to updated HR policies, asset catalogs and IT. “Organizations can get a massive head start on business value through AI agents by having interoperable data and applications to plug and play with agents,” he says.
Agent-to-agent frameworks like the model context protocol (MCP) – an open and standardized plug-and-play that connects AI models to internal (or external) information sources – can be layered onto an existing API architecture to embed connectedness from the outset. And while it might feel like an additional hurdle now, in the longer-term those organizations that make this investment early will reap the benefits.
“The icing on the cake for interoperability is that all the work you do to connect agents to data and applications now will help you prepare for the multi-agent future where interoperability between agents will be essential,” says McLarty.
In this future, multi-agent systems will work collectively on more intricate, cross-functional tasks. Agentic systems will draw on AI agents across inventory, logistics and production to coordinate and optimize supply chain management for example or perform complex assembly tasks.
Conscious that this is where the technology is headed, third-party developers are already beginning to offer multi-agent capability. In December, Amazon launched such a tool for its Bedrock service, providing users access to specialized agents coordinated by a supervisor agent capable of breaking down requests, delegating tasks and consolidating outputs.
But though such an off-the-rack solution has the advantage of allowing enterprises to bypass both the risk and complexity in leveraging such capabilities, the digital heterogeneity of larger organizations in particular will likely mean – in the longer-term at least – they’ll need to rely on their own API architecture to realize the full potential in multi-agent systems.
McLarty’s advice is simple, “This is definitely a time to ground yourself in the business problem, and only go as far as you need to with the solution.”
This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.
This content was researched, designed, and written entirely by human writers, editors, analysts, and illustrators. This includes the writing of surveys and collection of data for surveys. AI tools that may have been used were limited to secondary production processes that passed thorough human review.
MIT Technology Review’s How To series helps you get things done.
Simon Willison has a plan for the end of the world. It’s a USB stick, onto which he has loaded a couple of his favorite open-weight LLMs—models that have been shared publicly by their creators and that can, in principle, be downloaded and run with local hardware. If human civilization should ever collapse, Willison plans to use all the knowledge encoded in their billions of parameters for help. “It’s like having a weird, condensed, faulty version of Wikipedia, so I can help reboot society with the help of my little USB stick,” he says.
But you don’t need to be planning for the end of the world to want to run an LLM on your own device. Willison, who writes a popular blog about local LLMs and software development, has plenty of compatriots: r/LocalLLaMA, a subreddit devoted to running LLMs on your own hardware, has half a million members.
For people who are concerned about privacy, want to break free from the control of the big LLM companies, or just enjoy tinkering, local models offer a compelling alternative to ChatGPT and its web-based peers.
The local LLM world used to have a high barrier to entry: In the early days, it was impossible to run anything useful without investing in pricey GPUs. But researchers have had so much success in shrinking down and speeding up models that anyone with a laptop, or even a smartphone, can now get in on the action. “A couple of years ago, I’d have said personal computers are not powerful enough to run the good models. You need a $50,000 server rack to run them,” Willison says. “And I kept on being proved wrong time and time again.”
Getting into local models takes a bit more effort than, say, navigating to ChatGPT’s online interface. But the very accessibility of a tool like ChatGPT comes with a cost. “It’s the classic adage: If something’s free, you’re the product,” says Elizabeth Seger, the director of digital policy at Demos, a London-based think tank.
OpenAI, which offers both paid and free tiers, trains its models on users’ chats by default. It’s not too difficult to opt out of this training, and it also used to be possible to remove your chat data from OpenAI’s systems entirely, until a recent legal decision in the New York Times’ ongoing lawsuit against OpenAI required the company to maintain all user conversations with ChatGPT.
Google, which has access to a wealth of data about its users, also trains its models on both free and paid users’ interactions with Gemini, and the only way to opt out of that training is to set your chat history to delete automatically—which means that you also lose access to your previous conversations. In general, Anthropic does not train its models using user conversations, but it will train on conversations that have been “flagged for Trust & Safety review.”
Training may present particular privacy risks because of the ways that models internalize, and often recapitulate, their training data. Many people trust LLMs with deeply personal conversations—but if models are trained on that data, those conversations might not be nearly as private as users think, according to some experts.
“Some of your personal stories may be cooked into some of the models, and eventually be spit out in bits and bytes somewhere to other people,” says Giada Pistilli, principal ethicist at the company Hugging Face, which runs a huge library of freely downloadable LLMs and other AI resources.
For Pistilli, opting for local models as opposed to online chatbots has implications beyond privacy. “Technology means power,” she says. “And so who[ever] owns the technology also owns the power.” States, organizations, and even individuals might be motivated to disrupt the concentration of AI power in the hands of just a few companies by running their own local models.
Breaking away from the big AI companies also means having more control over your LLM experience. Online LLMs are constantly shifting under users’ feet: Back in April, ChatGPT suddenly started sucking up to users far more than it had previously, and just last week Grok started calling itself MechaHitler on X.
Providers tweak their models with little warning, and while those tweaks might sometimes improve model performance, they can also cause undesirable behaviors. Local LLMs may have their quirks, but at least they are consistent. The only person who can change your local model is you.
Of course, any model that can fit on a personal computer is going to be less powerful than the premier online offerings from the major AI companies. But there’s a benefit to working with weaker models—they can inoculate you against the more pernicious limitations of their larger peers. Small models may, for example, hallucinate more frequently and more obviously than Claude, GPT, and Gemini, and seeing those hallucinations can help you build up an awareness of how and when the larger models might also lie.
“Running local models is actually a really good exercise for developing that broader intuition for what these things can do,” Willison says.
Local LLMs aren’t just for proficient coders. If you’re comfortable using your computer’s command-line interface, which allows you to browse files and run apps using text prompts, Ollama is a great option. Once you’ve installed the software, you can download and run any of the hundreds of models they offer with a single command.
If you don’t want to touch anything that even looks like code, you might opt for LM Studio, a user-friendly app that takes a lot of the guesswork out of running local LLMs. You can browse models from Hugging Face from right within the app, which provides plenty of information to help you make the right choice. Some popular and widely used models are tagged as “Staff Picks,” and every model is labeled according to whether it can be run entirely on your machine’s speedy GPU, needs to be shared between your GPU and slower CPU, or is too big to fit onto your device at all. Once you’ve chosen a model, you can download it, load it up, and start interacting with it using the app’s chat interface.
As you experiment with different models, you’ll start to get a feel for what your machine can handle. According to Willison, every billion model parameters require about one GB of RAM to run, and I found that approximation to be accurate: My own 16 GB laptop managed to run Alibaba’s Qwen3 14B as long as I quit almost every other app. If you run into issues with speed or usability, you can always go smaller—I got reasonable responses from Qwen3 8B as well.
And if you go really small, you can even run models on your cell phone. My beat-up iPhone 12 was able to run Meta’s Llama 3.2 1B using an app called LLM Farm. It’s not a particularly good model—it very quickly goes off into bizarre tangents and hallucinates constantly—but trying to coax something so chaotic toward usability can be entertaining. If I’m ever on a plane sans Wi-Fi and desperate for a probably false answer to a trivia question, I now know where to look.
Some of the models that I was able to run on my laptop were effective enough that I can imagine using them in my journalistic work. And while I don’t think I’ll depend on phone-based models for anything anytime soon, I really did enjoy playing around with them. “I think most people probably don’t need to do this, and that’s fine,” Willison says. “But for the people who want to do this, it’s so much fun.”