What does Anthropic's Tool Search for Claude mean for you?

Alex Salazar
DECEMBER 2, 2025
4 MIN READ
THOUGHT LEADERSHIP

I was recently in Amsterdam meeting with some of the largest enterprises, and they all raised the same challenge: how do you give AI agents access to more tools without everything falling apart?

The issue is that as soon as they hit 20-30 tools, token costs become untenable and selection accuracy plummets. The pain has been acute enough that many teams have tried to build their own workarounds with RAG pipelines, only to hit performance walls.

That's why I'm excited about Anthropic's recently announced Tool Search Tool, which represents a major step forward in solving this common challenge for AI agents.

What did Anthropic actually release?

Announced as one of three new beta features, Anthropic’s Tool Search Tool lets Claude models dynamically discover and load tools on demand instead of requiring every single tool definition to be loaded into the context window upfront.

Before, a Claude model had to keep every possible tool definition in its working memory at all times. Now it can offload those definitions and search through them when needed. It's like the difference between keeping everything in your head and referring to a dictionary. Giving Claude models a “dictionary” of tools reduces the taxing load of holding everything in memory while also improving accuracy.

Let's dive more into the two primary constraints it addresses:

Token bloat: In its announcement, Anthropic provides a concrete example of a five-server setup:

  • GitHub: 35 tools (~26K tokens)
  • Slack: 11 tools (~21K tokens)
  • Sentry: 5 tools (~3K tokens)
  • Grafana: 5 tools (~3K tokens)
  • Splunk: 2 tools (~2K tokens)

That's 58 tools consuming approximately 55K tokens before the conversation even begins. Add more servers like Jira (which alone uses ~17K tokens) and you quickly approach 100K+ tokens of overhead. This token consumption directly impacts both response latency and operational costs.
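For a quick sanity check on that math, here's a back-of-the-envelope tally in Python using the approximate figures above (these are Anthropic's published estimates, not measured values):

```python
# Rough tally of the per-server tool counts and token estimates listed above.
# Figures are the approximate numbers from Anthropic's example, not measurements.
servers = {
    "GitHub": (35, 26_000),
    "Slack": (11, 21_000),
    "Sentry": (5, 3_000),
    "Grafana": (5, 3_000),
    "Splunk": (2, 2_000),
}

total_tools = sum(tools for tools, _ in servers.values())
total_tokens = sum(tokens for _, tokens in servers.values())
print(f"{total_tools} tools, ~{total_tokens:,} tokens before the first message")
# -> 58 tools, ~55,000 tokens before the first message
```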

Prior to this release, agents began experiencing reliability issues after approximately 20 tools. To put this in perspective, the GitHub toolkit alone contains 18 tools, and Gmail has 10-13. This created a severe practical constraint: organizations couldn't deploy agents capable of handling multiple systems simultaneously.

Accuracy: Tool selection accuracy was another critical constraint. As the number of tools increased, the model's ability to select the correct tool decreased significantly. This was particularly problematic when tools had similar names or overlapping functionality.

How Anthropic solved this for Claude

The solution is straightforward: mark tools with defer_loading: true. Those tools remain discoverable but don't consume context until Claude actually needs them. Claude searches using either regex or keyword ranking (BM25), then only loads what it needs.
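To make that concrete, here's a minimal sketch of what deferred loading looks like with Anthropic's Python SDK. The defer_loading field is the one described above; treat the beta flag string, the tool-search tool type, and the model ID as assumptions drawn from Anthropic's announcement and check their docs for the exact current values.

```python
# Minimal sketch of deferred tool loading with the Anthropic Python SDK.
# NOTE: the beta flag, tool-search tool type, and model ID are assumptions
# based on Anthropic's announcement; verify the exact strings in their docs.
import anthropic

client = anthropic.Anthropic()

response = client.beta.messages.create(
    model="claude-opus-4-5",                 # placeholder: any Tool Search-capable Claude model
    max_tokens=1024,
    betas=["advanced-tool-use-2025-11-20"],  # assumed beta flag for Tool Search
    tools=[
        # The search tool itself stays in context so Claude can discover the rest.
        {"type": "tool_search_tool_regex_20251119", "name": "tool_search_tool_regex"},
        # Deferred tools are discoverable but consume no context until loaded.
        {
            "name": "create_github_issue",
            "description": "Create an issue in a GitHub repository.",
            "input_schema": {
                "type": "object",
                "properties": {
                    "repo": {"type": "string"},
                    "title": {"type": "string"},
                    "body": {"type": "string"},
                },
                "required": ["repo", "title"],
            },
            "defer_loading": True,
        },
        # ...dozens more deferred tool definitions can follow without token bloat.
    ],
    messages=[
        {"role": "user", "content": "Open an issue about the flaky deploy job."}
    ],
)
print(response.content)
```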

The results are compelling: an 85% reduction in token usage while maintaining access to your full tool library, plus significant accuracy improvements on MCP evaluations, with Opus 4 improving from 49% to 74% once the feature is enabled.

Why are we excited about this at Arcade?

While this capability represents a significant leap forward, it also introduces critical infrastructure challenges that organizations must address when running and scaling agents in production. Once agents can access any number of tools, enterprises must ensure they can connect to those tools securely, that the tools are optimized for agents, and that they can maintain governance and control at scale. That’s where Arcade’s MCP runtime can help.

1. Secure Agent Authorization

Agent authorization is one of the hardest challenges to solve, and it's why most AI projects never go beyond a single-user demo. Arcade ensures agents can take actions on any system with controlled, user-specific permissions. It integrates with existing OAuth, IdP, and user access flows, so you get granular controls for your agents out of the box.

2. Agent-Optimized Tools

Most MCP servers and tools just wrap existing APIs, which leads to poor accuracy and disgruntled users. You can give Claude access to a thousand tools, but if they're poorly built, it doesn't matter: bad tool definitions lead to bad tool selection. Arcade provides the largest catalog of agent-optimized MCP tools out of the box. Our tools outperform because we've done the hard work of making them actually work: not just wrapping APIs, but building tools specifically designed to handle agent intent with better reliability and lower costs.

3. Governance at Scale

More tool access unlocks more use cases, which means more agents and more teams deploying them across your organization. This agent and MCP sprawl makes it hard to know whether teams are rebuilding existing servers or breaking workflows as they push upgrades. The Arcade MCP runtime centralizes the control and governance of all your MCP tools, improves discovery and access across teams, enables safe testing and versioning, and provides visibility into what every agent accesses on behalf of each user across each service, ultimately accelerating trusted production deployments across the board.

Tool search limitations to consider

It’s important to call out some limitations of Anthropic's Tool Search Tool.

First, this tool is exclusively available for Claude. If you’re using Anthropic for your large model but another vendor for your small models (a pretty common pattern), this feature won’t work across both. This will be particularly painful for teams using coding agents or IDE assistants, where the feature will only work on a subset of the available models.

Second, broad framework support will require time. Currently, implementation requires using Anthropic's SDK directly with special beta headers and flags. This capability is not yet supported in LangChain or other popular frameworks.

Time to start building

Anthropic has helped eliminate a major constraint on AI agent capabilities.

However, the critical question isn't whether your agent can access a thousand tools; it's whether it should, and whether you can manage that safely and effectively, particularly when agents have access to critical production systems.

That's where Arcade comes into play. As the runtime for MCP, Arcade is the only platform that delivers secure agent authorization, high-accuracy tools, and centralized governance together. We give you the ability to deploy multi-user AI agents that take actions across any system with granular permissions and complete visibility, no complex infrastructure required.


Building production AI agents? Try Arcade’s MCP runtime for free so you can ship faster and scale with control.
