AgentKit Ships, But Production Agents Still Need Authentication

Shub Argha
OCTOBER 6, 2025

OpenAI just dropped AgentKit at DevDay, and the demos look clean—visual workflow builders, embedded chat interfaces, evaluation frameworks. Ramp went from blank canvas to live buyer agent in hours instead of months. LY Corporation built a multi-agent workflow in under two hours.

But here's what the launch post doesn't tell you: most of those demos will hit a wall before production.

What AgentKit Actually Shipped

AgentKit is three things bundled together:

Agent Builder gives you a visual canvas for composing agent logic. Drag-and-drop nodes, inline eval config, full versioning. You can see the entire workflow instead of debugging orchestration code at 2 AM.

ChatKit handles the annoying parts of chat UI—streaming responses, thread management, thinking indicators. Embed it in your product, customize the theme, ship.

Evals + RFT lets you measure agent performance with datasets, trace grading, and automated prompt optimization. Reinforcement fine-tuning, on o4-mini and on GPT-5 in private beta, trains models to call the right tools at the right time.

It's a legitimate step forward for agent development velocity. Workflows that used to take months to build now take hours.

The Authentication and Authorization Wall

Here's the problem: AgentKit makes it easy to build agents. It doesn't solve how those agents authenticate, or what they're authorized to do, in production.

AgentKit ships with native MCP (Model Context Protocol) support. MCP standardizes how agents connect to tools and data sources. Dropbox, Google Drive, SharePoint, Microsoft Teams—all available through the Connector Registry.

But MCP was designed for local, single-user scenarios. YOUR personal Claude Desktop connecting to YOUR personal filesystem. One user, one machine, no auth complexity.

99% of MCP servers today are still built that way—even the hosted ones. They work great for demos. They break in production when you need:

  • Authentication: Secure OAuth flows so agents can access third-party APIs on behalf of users
  • Authorization: Per-user permissions ensuring agents only access what each user is allowed to see
  • Token management: Production-grade refresh logic, secure storage, and scope validation
  • Audit trails: Logs showing which AI action happened on behalf of which user, with what permissions

This is what kills 70% of AI projects before they ship. Not the agent logic. The auth layer.

What Production-Ready Agent Auth Actually Requires

The gap between "works in a demo" and "ships to production" comes down to a few hard problems:

OAuth flows that don't suck. Your agent needs to access Gmail, Slack, Salesforce—tools that require OAuth. Building and maintaining OAuth integrations for dozens of services is months of work. Then you need to handle token refresh, scope changes, API versioning, and edge cases that only surface at scale.
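To give a sense of what a single OAuth integration starts with, here is a minimal sketch of the first leg of the authorization code flow with PKCE (RFC 6749 and RFC 7636), using only the Python standard library. The endpoint, client ID, redirect URI, and scope names are made-up placeholders, not any real provider's values, and a real integration would still need the callback handler, the code exchange, and error handling on top of this.

```python
import base64
import hashlib
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def make_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) using the S256 method."""
    # 64 random bytes encode to 86 URL-safe characters (allowed range: 43-128).
    verifier = secrets.token_urlsafe(64)
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # The challenge is the base64url-encoded SHA-256 digest, without padding.
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge


def build_authorization_url(
    authorize_endpoint: str,
    client_id: str,
    redirect_uri: str,
    scopes: list[str],
    state: str,
    code_challenge: str,
) -> str:
    """Build the URL the end user visits to grant the agent access."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),  # scopes are space-delimited
        "state": state,  # echoed back on the callback to guard against CSRF
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
    }
    return f"{authorize_endpoint}?{urlencode(params)}"


# Hypothetical values, for illustration only.
verifier, challenge = make_pkce_pair()
url = build_authorization_url(
    "https://auth.example.com/authorize",
    "client-123",
    "https://app.example.com/callback",
    ["email.read", "email.send"],
    secrets.token_urlsafe(16),
    challenge,
)
print(parse_qs(urlparse(url).query)["scope"])
```

That is one request for one provider. Multiply it by every service your agent touches, each with its own scope names and quirks, and the months of work become easier to believe.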

Per-user authorization, not bot tokens. Most agent demos use a single API key or bot token with admin access. That's fine for a prototype. In production, you need every agent action to map to a specific user with appropriate permissions. Your legal team will ask: "Which user authorized this action? What were their permissions? Can we prove it in an audit?"
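A rough sketch of what "every action maps to a specific user" can look like in code: each call is checked against that user's granted scopes, and the decision is written to an audit log whether it was allowed or not. All names here (UserGrant, AuditLog, run_as_user, the scope strings) are hypothetical and exist only for illustration; a real system would persist the log somewhere tamper-resistant rather than in a Python list.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable


@dataclass(frozen=True)
class UserGrant:
    """Scopes a specific end user has granted to the agent."""
    user_id: str
    scopes: frozenset[str]


@dataclass
class AuditLog:
    """In-memory audit trail (illustrative; not durable storage)."""
    entries: list[dict] = field(default_factory=list)

    def record(self, user_id: str, action: str, required_scope: str, allowed: bool) -> None:
        self.entries.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "user_id": user_id,
            "action": action,
            "required_scope": required_scope,
            "allowed": allowed,
        })


class PermissionDenied(Exception):
    pass


def run_as_user(
    grant: UserGrant,
    action: str,
    required_scope: str,
    audit: AuditLog,
    fn: Callable[[], object],
) -> object:
    """Run fn only if this user granted the scope; log the decision either way."""
    allowed = required_scope in grant.scopes
    audit.record(grant.user_id, action, required_scope, allowed)
    if not allowed:
        raise PermissionDenied(f"{grant.user_id} lacks scope {required_scope!r}")
    return fn()


# Usage: alice granted read access only, so sending is refused and logged.
audit = AuditLog()
alice = UserGrant("alice", frozenset({"email.read"}))
run_as_user(alice, "list_inbox", "email.read", audit, lambda: ["msg-1"])
try:
    run_as_user(alice, "send_email", "email.send", audit, lambda: "sent")
except PermissionDenied as exc:
    print(exc)
```

The point of logging denials as well as successes is that the audit question is rarely "what did the agent do?" and more often "what did it try to do, and for whom?"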

Token management that doesn't leak credentials. Storing OAuth tokens securely, refreshing them before expiry, handling revocation—this is all undifferentiated heavy lifting. Get it wrong and you're in breach of compliance requirements. Get it right and you've built something every other AI team also needs to build.
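As a sketch of the "refresh before expiry, handle revocation" part, here is a small token manager that hands out access tokens per user and refreshes them a little early. It is an assumption-heavy toy: the TokenManager class, the refresh_fn callback, and the 60-second skew are all invented for this example, and the in-memory dict stands in for what would need to be encrypted storage in production.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class Token:
    access_token: str
    refresh_token: str
    expires_at: float  # Unix timestamp


class TokenRevoked(Exception):
    pass


class TokenManager:
    """Per-user token store that refreshes shortly before expiry."""

    def __init__(
        self,
        refresh_fn: Callable[[str], Token],  # hypothetical call to the provider's token endpoint
        skew_seconds: float = 60.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._refresh_fn = refresh_fn
        self._skew = skew_seconds
        self._clock = clock
        self._tokens: dict[str, Token] = {}  # illustrative only; not secure storage

    def store(self, user_id: str, token: Token) -> None:
        self._tokens[user_id] = token

    def revoke(self, user_id: str) -> None:
        self._tokens.pop(user_id, None)

    def access_token_for(self, user_id: str) -> str:
        token = self._tokens.get(user_id)
        if token is None:
            raise TokenRevoked(f"no active connection for {user_id}")
        # Refresh when the token is inside the skew window, not after it has failed.
        if token.expires_at - self._clock() <= self._skew:
            token = self._refresh_fn(token.refresh_token)
            self._tokens[user_id] = token
        return token.access_token
```

Injecting the clock and the refresh function keeps the logic testable without real credentials. It also hints at the production problems this toy ignores, such as concurrent refreshes, rotated refresh tokens, and providers that revoke silently.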

Production infrastructure. Monitoring which agent actions succeeded or failed. Rate limiting to avoid hammering APIs. Logging for debugging. Evaluation hooks to measure performance. None of this is glamorous, but all of it is mandatory for production deployment.
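Rate limiting is the easiest of these to show in a few lines. One common approach is a token bucket, sketched here under the assumption that a single process guards each downstream API; the class name and parameters are illustrative and not taken from any particular library.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow short bursts up to `capacity`, refilling at `rate_per_sec`."""

    def __init__(
        self,
        rate_per_sec: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate_per_sec
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def try_acquire(self, cost: float = 1.0) -> bool:
        """Return True if the call may proceed now, False if it should wait."""
        now = self._clock()
        # Refill for the time elapsed since the last check, capped at capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


# Usage: at most 5 calls per second to a given API, with bursts of up to 10.
bucket = TokenBucket(rate_per_sec=5, capacity=10)
if bucket.try_acquire():
    pass  # make the API call here
```

An agent that loops over tool calls can drain a bucket like this quickly, which is exactly the behavior you want to catch on your side before the upstream API starts returning errors.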

How Arcade.dev Solves This

This is exactly what Arcade.dev was built for—giving AI agents secure, authenticated access to real tools.

Pre-built OAuth integrations for Gmail, Slack, Notion, Stripe, Salesforce, and 100+ other services. Built and maintained by the team that shipped auth at scale (Okta, Stormpath). You don't write OAuth code. You configure scopes and let Arcade handle the rest.

Per-user authorization by default. Every agent action happens as the end user, with their permissions, through their authorized connection. No shared bot tokens, no admin access hacks.

Production-grade token management. Refresh logic, secure storage, scope validation, error handling. Your agent code never touches raw credentials.

Observability and evaluation. Monitoring, logging, rate limiting, and eval hooks built in. Everything you need to run agents at scale.

The architecture is straightforward: you build your agent workflows (in AgentKit, LangGraph, CrewAI, whatever), and Arcade provides the authenticated tool layer underneath. Your agent calls something like arcade.send_email() and Arcade handles the OAuth flow, token refresh, and user authorization.
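To make that split concrete, here is a toy version of the pattern in plain Python. To be clear, this is not the Arcade SDK: ToolLayer, ToolResult, fake_send_email, and the example.com URL are all hypothetical stand-ins. The sketch only shows the shape of the idea, where the agent passes a tool name and a user ID, and the layer either runs the tool with that user's credentials or hands back a link for the user to authorize.

```python
from dataclasses import dataclass
from typing import Callable, Optional
from urllib.parse import urlencode


@dataclass
class ToolResult:
    status: str  # "ok" or "authorization_required"
    output: object = None
    authorize_url: Optional[str] = None


class ToolLayer:
    """Hypothetical authenticated tool layer sitting under an agent framework."""

    def __init__(self, authorize_base_url: str) -> None:
        self._authorize_base_url = authorize_base_url
        self._tools: dict[str, Callable[..., object]] = {}
        self._connections: dict[str, str] = {}  # user_id -> access token (toy storage)

    def register(self, name: str, fn: Callable[..., object]) -> None:
        self._tools[name] = fn

    def connect(self, user_id: str, access_token: str) -> None:
        """Called once the user has completed the OAuth flow."""
        self._connections[user_id] = access_token

    def call(self, name: str, user_id: str, **kwargs: object) -> ToolResult:
        token = self._connections.get(user_id)
        if token is None:
            # No connection for this user yet: the agent gets a link, not a credential.
            query = urlencode({"user_id": user_id, "tool": name})
            return ToolResult(
                "authorization_required",
                authorize_url=f"{self._authorize_base_url}?{query}",
            )
        # The token is injected here; agent code never sees it.
        return ToolResult("ok", output=self._tools[name](token, **kwargs))


def fake_send_email(access_token: str, to: str, subject: str) -> dict:
    """Stand-in for a real provider call made with the user's token."""
    return {"to": to, "subject": subject, "delivered": True}


layer = ToolLayer("https://auth.example.com/connect")
layer.register("send_email", fake_send_email)
result = layer.call("send_email", user_id="alice", to="bob@example.com", subject="Hi")
print(result.status)
```

The design choice worth noticing is that the agent only ever handles a tool name, a user ID, and a result. Credentials stay on the other side of the boundary, which is what makes per-user authorization and auditing enforceable in one place.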

Why This Matters Now

AgentKit just made it dramatically easier to build agent workflows. That's going to create a surge of teams hitting the authentication wall in the next few weeks.

You'll see it when you try to connect your agent to a user's Gmail account. Or when your compliance team asks how you're handling OAuth token storage. Or when you realize your demo works great with your own API keys but breaks when you try to deploy it for multiple users.

The good news: this is a solved problem. You don't need to build your own OAuth infrastructure. You don't need to maintain integrations for dozens of services. You don't need to figure out per-user authorization from scratch.

Arcade.dev handles the auth layer so you can focus on building agent workflows.


We're shipping MCP support for Arcade.dev next week—making it even easier to connect agent frameworks like AgentKit to authenticated tools. Join our Discord for updates or sign up for Arcade.dev to get on our mailing list.

If you're building agents today and already hitting auth problems, we're here. Get started with our quickstart guide or reach out directly—we're helping teams ship production agents every day.
