AI agent sprawl is coming — how can enterprises keep them all in line?
If you caught that viral clip of an AI tool running dozens of X accounts at once, you saw more than a stunt — you saw the future.
Enterprises are now betting big on AI agents — bots that can perceive, reason, and act across digital tasks to drive massive efficiency gains. Gartner predicts these agents will make 15% of everyday work decisions and power 33% of enterprise apps by 2028, up from virtually zero today.
The promise is immense. But so is the chaos.
In a recent experiment, researchers at Carnegie Mellon created a fake software company, The Agent Company, and staffed it entirely with AI agents powered by leading models. Each agent was assigned a real-world role (software engineer, HR, project manager, and so on) and given tasks such as analyzing spreadsheets, writing performance reviews, and navigating file systems.
The result? A complete meltdown.
Even the top performer, Anthropic’s Claude, completed just 24% of the tasks. At the other end of the spectrum, Amazon’s Nova barely hit 1.7%. These agents got lost in folders, failed basic coordination, and in some cases resorted to ‘self-deception,’ like renaming coworkers just to fake task completion. Long story short: they collapsed under the weight of routine work alone.
Another study, focused on more basic tasks, found that multi-agent systems, where each agent was given a designated task (say, research), failed more than 60% of the time. They ended up giving hardcoded answers, confusing their roles, ignoring each other’s messages, and going into endless loops without realizing the task at hand was already done.
For businesses, this uncoordinated proliferation of AI agents, or ‘agent sprawl,’ spells trouble, with risks ranging from data leaks and hallucinations to lawsuits. Having a single agent focused on a lone task is one thing: you can easily keep tabs on it. But when multiple such systems are in play, things can go very wrong very fast, leaving teams scrambling to figure out what’s going on and where.
“The most common thing that happens…is that AI agents provide incorrect information to customers…Without good real-time supervision and alerts that can demonstrate how the company is meeting its duty of care in supervising agents, companies could get sued and found negligent. The Air Canada and Character.ai lawsuits are early signs of what happens when companies do not have active, centralized management and supervision of their AI agents,” Dr. Tatyana Mamut, CEO and co-founder of Wayfound, a startup tackling agent sprawl for companies like Salesforce, told Future Nexus.
So, how to keep agents in line?
At the enterprise level, agent sprawl begins to occur when each business function starts defining and deploying its own AI agent strategy using different tools and point solutions. This siloed approach appears attractive at first, as it helps teams move fast, but it leads to overlapping initiatives — with agents often operating without coordination or regulation.
To fix this, Mamut explained, teams need a central command center, where every AI agent project is connected to one system, allowing business units to move quickly and efficiently. This also gives CIOs, CMOs, and CEOs direct visibility and control over what projects are underway, which ones have been deployed to production, and how they are complying with company guidelines and regulatory requirements.
For this, Wayfound has created a centralized hub built on OpenTelemetry, an open-source framework for collecting and transmitting observability data. It connects to AI agents built on any platform through SDKs, APIs, and MCP servers, and then tracks all agent-to-agent interactions and tool calls to other systems, providing relevant metrics, explanations, alerts, performance reviews, and suggestions for improvement. This ultimately serves as a virtual ‘meeting’ where agents share context and solve problems together, ensuring consistency, compliance, and supervision.
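To make the idea concrete, here is a minimal sketch of what instrumenting an agent’s tool call with the open-source OpenTelemetry API can look like, so a central hub can later reconstruct agent-to-agent interactions and tool calls. The agent and tool names (`support-agent`, `lookup_refund_policy`) are hypothetical, and this is a generic illustration, not Wayfound’s actual SDK.

```python
# Sketch: recording an AI agent's decision and tool call as OpenTelemetry spans,
# which a central observability hub can collect and analyze.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor, ConsoleSpanExporter

# Configure tracing; a real deployment would export to a collector endpoint
# rather than printing spans to the console.
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("support-agent")

def lookup_refund_policy(order_id: str) -> str:
    """Hypothetical tool the agent calls in another system."""
    return f"Order {order_id}: refunds allowed within 30 days."

def handle_customer_request(order_id: str) -> str:
    # Each decision and tool call becomes a span with attributes, so a central
    # monitor can later see what the agent did, with what inputs and outputs.
    with tracer.start_as_current_span("agent.decision") as span:
        span.set_attribute("agent.name", "support-agent")
        span.set_attribute("agent.intent", "refund_inquiry")
        with tracer.start_as_current_span("tool.lookup_refund_policy") as tool_span:
            tool_span.set_attribute("tool.input.order_id", order_id)
            answer = lookup_refund_policy(order_id)
            tool_span.set_attribute("tool.output", answer)
        return answer

if __name__ == "__main__":
    print(handle_customer_request("A-1042"))
```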
“Our AI agent manager can read all the interactions that happen and (when available) all the reasoning that went into an AI agent’s decision to call another agent or tool, and the requests and responses that occurred. By giving the Wayfound Manager the company, regulatory, and agent-specific guidelines that the agent should be following, we are able to assess the performance of AI agents across their entire network of connected agents and tools,” she added.
Managers can then assess the compliance and performance of AI agents using both standard and custom guidelines that can be written and tested for fidelity in plain language. The guidelines can apply to just one agent or all agents across the enterprise, depending on the case.
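As a rough illustration of that scoping idea, plain-language guidelines might be stored with a marker saying whether they apply fleet-wide or to a single agent. The schema and names below are hypothetical, invented for this example rather than taken from Wayfound’s product.

```python
# Hypothetical sketch: plain-language guidelines scoped to one agent or to all agents.
from dataclasses import dataclass

@dataclass
class Guideline:
    text: str                 # plain-language rule the agent should follow
    applies_to: str = "*"     # "*" = every agent, or a specific agent name

GUIDELINES = [
    Guideline("Never quote a refund amount without confirming the order ID."),
    Guideline("Escalate to a human when the customer mentions legal action.",
              applies_to="support-agent"),
]

def guidelines_for(agent_name: str) -> list[Guideline]:
    """Return the rules a given agent should be evaluated against."""
    return [g for g in GUIDELINES if g.applies_to in ("*", agent_name)]

print([g.text for g in guidelines_for("support-agent")])
```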
Monitoring Agentforce
With its centralized, rules-driven approach, Wayfound has also become the official monitoring partner for Salesforce’s Agentforce, which uses a unified platform approach to deploy and manage agents.
Giants such as Disney are using these bots to handle tasks ranging from sales to customer service, hitting accuracy rates as high as 93%. And Wayfound is keeping their behavior in line by complementing the Agentforce Command Center with real-time monitoring and alerting capabilities.
As the space evolves, Mamut expects enterprises to double down on agents, creating virtual teams of experts for different tasks. With interoperable, open standards emerging, these agents are likely to come from many different platforms rather than just one like Salesforce. That will make setting up an underlying, centralized hub for monitoring them more important than ever.
“Managing AI agents is more like managing human employees than monitoring traditional software. Ensuring these AI agents are managed and supervised well is the key to solving the problem of agent sprawl, agent security, and agent performance. While this problem will never be fully solved, it can already be well-managed, giving enterprise leaders the confidence and control they need to win in the Age of AI,” Mamut added.