AgentKit Ships, But Production Agents Still Need Authentication

Shub Argha
OCTOBER 6, 2025
4 MIN READ
THOUGHT LEADERSHIP

OpenAI just dropped AgentKit at DevDay, and the demos look clean—visual workflow builders, embedded chat interfaces, evaluation frameworks. Ramp went from blank canvas to live buyer agent in hours instead of months. LY Corporation built a multi-agent workflow in under two hours.

But here's what the launch post doesn't tell you: most of those demos will hit a wall before production.

What AgentKit Actually Shipped

AgentKit is three things bundled together:

Agent Builder gives you a visual canvas for composing agent logic. Drag-and-drop nodes, inline eval config, full versioning. You can see the entire workflow instead of debugging orchestration code at 2 AM.

ChatKit handles the annoying parts of chat UI—streaming responses, thread management, thinking indicators. Embed it in your product, customize the theme, ship.

Evals + RFT lets you measure agent performance with datasets, trace grading, and automated prompt optimization. Reinforcement fine-tuning on o4-mini and GPT-5 (private beta) trains models to call the right tools at the right time.

It's a legitimate step forward for agent development velocity. Workflows that used to take months to build now take hours.

The Authentication and Authorization Wall

Here's the problem: AgentKit makes it easy to build agents. It doesn't solve how those agents authenticate to third-party services in production, or how each of their actions gets authorized.

AgentKit ships with native MCP (Model Context Protocol) support. MCP standardizes how agents connect to tools and data sources. Dropbox, Google Drive, SharePoint, Microsoft Teams—all available through the Connector Registry.

But MCP was designed for local, single-user scenarios. YOUR personal Claude Desktop connecting to YOUR personal filesystem. One user, one machine, no auth complexity.

99% of MCP servers today are still built that way—even the hosted ones. They work great for demos. They break in production when you need:

  • Authentication: Secure OAuth flows so agents can access third-party APIs on behalf of users
  • Authorization: Per-user permissions ensuring agents only access what each user is allowed to see
  • Token management: Production-grade refresh logic, secure storage, and scope validation
  • Audit trails: Logs showing which AI action happened on behalf of which user, with what permissions
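To make the audit-trail requirement concrete, here is a minimal sketch of what one log entry might capture. The schema, field names, and the "gmail.send_email" tool name are all assumptions for illustration, not a format used by AgentKit, MCP, or Arcade.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    """One agent action, tied to the user it ran for (hypothetical schema)."""
    user_id: str            # end user the agent acted on behalf of
    tool: str               # illustrative tool name, e.g. "gmail.send_email"
    granted_scopes: tuple   # scopes the user's token carried at call time
    outcome: str            # "allowed", "denied", or "error"
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_log_line(record: AuditRecord) -> str:
    """Serialize a record as one JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)


record = AuditRecord(
    user_id="user-123",
    tool="gmail.send_email",
    granted_scopes=("gmail.send",),
    outcome="allowed",
)
print(to_log_line(record))
```

The point of the record is that it answers the audit question directly: which user, which action, and which permissions were in force at that moment.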

This is what kills 70% of AI projects before they ship. Not the agent logic. The auth layer.

What Production-Ready Agent Auth Actually Requires

The gap between "works in a demo" and "ships to production" comes down to a few hard problems:

OAuth flows that don't suck. Your agent needs to access Gmail, Slack, Salesforce—tools that require OAuth. Building and maintaining OAuth integrations for dozens of services is months of work. Then you need to handle token refresh, scope changes, API versioning, and edge cases that only surface at scale.
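For a sense of what even the first step of one integration involves, here is a rough sketch of building an authorization-code request with PKCE using only the Python standard library. The endpoint URL, client ID, and scope name are placeholders, and real providers differ in the extra parameters they expect.

```python
import base64
import hashlib
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorization_request(authorize_url, client_id, redirect_uri, scopes):
    """Return (url, state, code_verifier) for an authorization-code + PKCE flow."""
    state = secrets.token_urlsafe(16)          # CSRF guard, verified on the callback
    code_verifier = secrets.token_urlsafe(64)  # kept server-side until token exchange
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    # S256 challenge: base64url(SHA-256(verifier)) without padding
    code_challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),
        "state": state,
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
    })
    return f"{authorize_url}?{query}", state, code_verifier


# Placeholder values; a real integration uses the provider's documented endpoint.
url, state, verifier = build_authorization_request(
    "https://auth.example.com/authorize",
    "my-client-id",
    "https://app.example.com/callback",
    ["email.send"],
)
```

This covers only the redirect. The callback handler, code exchange, refresh, and per-provider quirks still have to be written for each service.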

Per-user authorization, not bot tokens. Most agent demos use a single API key or bot token with admin access. That's fine for a prototype. In production, you need every agent action to map to a specific user with appropriate permissions. Your legal team will ask: "Which user authorized this action? What were their permissions? Can we prove it in an audit?"
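One way to picture the difference from a shared bot token is a check that runs before every tool call and looks up the calling user's own connection. This is a simplified sketch; the tool names, scope strings, and in-memory connection store are hypothetical.

```python
from dataclasses import dataclass


class AuthorizationError(Exception):
    """Raised when a user has not granted what a tool call needs."""


@dataclass(frozen=True)
class UserConnection:
    user_id: str
    provider: str
    scopes: frozenset  # scopes this user actually granted


# Hypothetical mapping from tool name to the scopes it requires.
REQUIRED_SCOPES = {
    "gmail.send_email": frozenset({"gmail.send"}),
    "drive.delete_file": frozenset({"drive.write"}),
}


def authorize(connections, user_id, provider, tool):
    """Return the user's connection if it covers the tool, else raise."""
    conn = connections.get((user_id, provider))
    if conn is None:
        raise AuthorizationError(f"{user_id} has not connected {provider}")
    missing = REQUIRED_SCOPES[tool] - conn.scopes
    if missing:
        raise AuthorizationError(f"{user_id} is missing scopes: {sorted(missing)}")
    return conn


connections = {
    ("alice", "gmail"): UserConnection("alice", "gmail", frozenset({"gmail.send"})),
    ("alice", "drive"): UserConnection("alice", "drive", frozenset({"drive.read"})),
}
```

Here `authorize(connections, "alice", "gmail", "gmail.send_email")` succeeds, while a delete on Drive fails because Alice only granted read access. No single admin credential exists to fall back on.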

Token management that doesn't leak credentials. Storing OAuth tokens securely, refreshing them before expiry, handling revocation—this is all undifferentiated heavy lifting. Get it wrong and you're in breach of compliance requirements. Get it right and you've built something every other AI team also needs to build.
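"Refreshing them before expiry" can be sketched in a few lines. The version below assumes a caller-supplied `refresh` function that performs the provider's refresh grant, and a 60-second safety margin chosen arbitrarily; secure storage and revocation handling are left out.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class Token:
    access_token: str
    refresh_token: str
    expires_at: float  # Unix timestamp, in seconds


REFRESH_MARGIN_SECONDS = 60  # assumed margin; tune per provider


def get_valid_token(
    token: Token,
    refresh: Callable[[Token], Token],
    now: Callable[[], float] = time.time,
) -> Token:
    """Return the token unchanged if it has enough life left, else refresh it."""
    if token.expires_at - now() > REFRESH_MARGIN_SECONDS:
        return token
    return refresh(token)


def fake_refresh(old: Token) -> Token:
    """Stand-in for a real refresh grant; issues a token valid for one hour."""
    return Token("new-access", old.refresh_token, old.expires_at + 3600)


fresh = Token("a1", "r1", expires_at=1_000_000 + 600)
stale = Token("a2", "r2", expires_at=1_000_000 + 30)
clock = lambda: 1_000_000.0  # fixed clock so the example is deterministic

print(get_valid_token(fresh, fake_refresh, clock).access_token)
print(get_valid_token(stale, fake_refresh, clock).access_token)
```

Injecting the clock and the refresh function keeps the logic testable without a network call, which matters once this code sits in front of every tool invocation.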

Production infrastructure. Monitoring which agent actions succeeded or failed. Rate limiting to avoid hammering APIs. Logging for debugging. Evaluation hooks to measure performance. None of this is glamorous, but all of it is mandatory for production deployment.
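Rate limiting is a good example of the unglamorous part. A token bucket is one common approach; the sketch below is a single-process, non-thread-safe version with an injectable clock, and the rate and capacity values are arbitrary.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `rate_per_second`."""

    def __init__(self, rate_per_second, capacity, clock=time.monotonic):
        self.rate = rate_per_second
        self.capacity = capacity
        self.tokens = float(capacity)  # start with a full bucket
        self.clock = clock
        self.last = clock()

    def try_acquire(self):
        """Take one token if available; return whether the call may proceed."""
        now = self.clock()
        elapsed = now - self.last
        # Refill for the time that has passed, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# One bucket per (user, provider) pair would keep one busy user
# from exhausting a shared quota.
bucket = TokenBucket(rate_per_second=1.0, capacity=2)
```

An agent loop would call `bucket.try_acquire()` before each outbound API request and back off when it returns `False`.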

How Arcade.dev Solves This

This is exactly what Arcade.dev was built for—giving AI agents secure, authenticated access to real tools.

Pre-built OAuth integrations for Gmail, Slack, Notion, Stripe, Salesforce, and 100+ other services. Built and maintained by the team that shipped auth at scale (Okta, Stormpath). You don't write OAuth code. You configure scopes and let Arcade handle the rest.

Per-user authorization by default. Every agent action happens as the end user, with their permissions, through their authorized connection. No shared bot tokens, no admin access hacks.

Production-grade token management. Refresh logic, secure storage, scope validation, error handling. Your agent code never touches raw credentials.

Observability and evaluation. Monitoring, logging, rate limiting, and eval hooks built in. Everything you need to run agents at scale.

The architecture is straightforward: you build your agent workflows (in AgentKit, LangGraph, CrewAI, whatever), and Arcade provides the authenticated tool layer underneath. Your agent calls arcade.send_email() and Arcade handles the OAuth flow, token refresh, and user authorization.
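The shape of that separation can be sketched as a thin facade between the agent and the provider. To be clear, this is not Arcade's SDK: the `ToolLayer` class, its method signatures, and the example.com URL are invented here to illustrate the idea that agent code passes a user ID and never sees a credential.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class ToolResult:
    status: str                               # "ok" or "authorization_required"
    authorization_url: Optional[str] = None   # where to send the user to connect
    output: Optional[dict] = None


class ToolLayer:
    """Hypothetical facade: the agent names a user and a tool, not a token."""

    def __init__(self, send_fn: Callable[[str, dict], dict]):
        self._tokens: Dict[str, str] = {}  # user_id -> token (in-memory stand-in)
        self._send_fn = send_fn            # stand-in for the provider API call

    def connect(self, user_id: str, access_token: str) -> None:
        """Record the token obtained after the user completes an OAuth flow."""
        self._tokens[user_id] = access_token

    def send_email(self, user_id: str, to: str, subject: str, body: str) -> ToolResult:
        token = self._tokens.get(user_id)
        if token is None:
            # The agent gets a link to hand to the user, never a credential.
            return ToolResult(
                "authorization_required",
                authorization_url=f"https://auth.example.com/start?user={user_id}",
            )
        message = {"to": to, "subject": subject, "body": body}
        return ToolResult("ok", output=self._send_fn(token, message))


sent = []  # records what the fake provider received


def fake_send(token: str, message: dict) -> dict:
    sent.append((token, message))
    return {"delivered": True}


tools = ToolLayer(fake_send)
first = tools.send_email("alice", "bob@example.com", "Hi", "Hello")
tools.connect("alice", "alice-token")
second = tools.send_email("alice", "bob@example.com", "Hi", "Hello")
```

The first call returns `authorization_required` because Alice has not connected her account; the second succeeds using her token, which stays inside the tool layer the whole time.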

Why This Matters Now

AgentKit just made it dramatically easier to build agent workflows. That's going to create a surge of teams hitting the authentication wall in the next few weeks.

You'll see it when you try to connect your agent to a user's Gmail account. Or when your compliance team asks how you're handling OAuth token storage. Or when you realize your demo works great with your own API keys but breaks when you try to deploy it for multiple users.

The good news: this is a solved problem. You don't need to build your own OAuth infrastructure. You don't need to maintain integrations for dozens of services. You don't need to figure out per-user authorization from scratch.

Arcade.dev handles the auth layer so you can focus on building agent workflows.


We're shipping MCP support for Arcade.dev next week—making it even easier to connect agent frameworks like AgentKit to authenticated tools. Join our Discord for updates or sign up for Arcade.dev to get on our mailing list.

If you're building agents today and already hitting auth problems, we're here. Get started with our quickstart guide or reach out directly—we're helping teams ship production agents every day.
