Medical device manufacturers waste hours copying data between documents when preparing 510(k) submissions. Each submission requires dozens of forms pulling information from test reports, design specs, and prior submissions. This guide shows how to build an AI agent that automates FDA form completion using Arcade's Google Docs and Google Drive toolkits.
Prerequisites
- Active Arcade account with API key (get your API key)
- Python 3.8 or higher
- Google Cloud Console project with OAuth 2.0 credentials
- Google Workspace access with Drive and Docs permissions
- Basic async/await knowledge in Python
Agent Architecture
The autofill agent executes this workflow:
- User requests form completion through chat interface
- Agent searches Google Drive for source documents
- Agent reads document content using Google Docs toolkit
- LLM extracts required information from sources
- Agent creates and populates target form
- User reviews and approves autofilled content
Arcade handles OAuth authentication, token management, and API access throughout this process.
Authentication Setup
Note: The Google Docs toolkit requires a self-hosted Arcade instance. It is not available in Arcade Cloud.
Configure Google OAuth in Arcade
Add the Google OAuth provider to your Arcade instance:
- Access your Arcade Dashboard (default: http://localhost:9099/dashboard for self-hosted)
- Navigate to OAuth → Providers
- Click "Add OAuth Provider"
- Select "Included Providers" tab
- Choose "Google" from the dropdown
- Enter your Client ID and Client Secret
- Copy the generated Redirect URL
- Add this Redirect URL to your Google Cloud Console app's Authorized redirect URIs
Full configuration details: Google auth provider documentation
Install Required Packages
pip install arcadepy
For building custom tools:
pip install arcade-ai
Set Environment Variables
export ARCADE_API_KEY="your_arcade_api_key"
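Before constructing the client, it can help to fail fast when the key is missing rather than discover it on the first API call. A minimal sketch follows; `require_env` is a hypothetical helper introduced here for illustration and is not part of arcadepy.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        # Hypothetical guard: stops startup before any tool call is attempted
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

A typical call would be `api_key = require_env("ARCADE_API_KEY")` at application startup.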
Build Document Search Component
Initialize Arcade Client
import os
from arcadepy import AsyncArcade  # async client, because every call below is awaited
from typing import List, Dict, Any

class FDAFormAutofiller:
    def __init__(self):
        self.client = AsyncArcade(api_key=os.getenv("ARCADE_API_KEY"))
        self.user_sessions = {}

    async def authenticate_user(self, user_id: str) -> Dict[str, Any]:
        """Authenticate user for Google Drive and Docs access"""
        # Define required OAuth scopes
        scopes = [
            "https://www.googleapis.com/auth/drive",
            "https://www.googleapis.com/auth/documents",
        ]
        auth_response = await self.client.auth.start(
            user_id=user_id,
            provider="google",
            scopes=scopes
        )
        if auth_response.status != "completed":
            # Hand the URL back to the caller; call this method again once the user approves
            return {
                "authorization_required": True,
                "url": auth_response.url,
                "message": "Complete authorization to access Google Drive and Docs"
            }
        await self.client.auth.wait_for_completion(auth_response)
        self.user_sessions[user_id] = {"authenticated": True}
        return {"authenticated": True}
Search Drive for Source Documents
Use the GoogleDrive toolkit to locate relevant documents:
async def search_source_documents(
    self,
    user_id: str,
    query_terms: str,
    max_results: int = 20
) -> List[Dict[str, Any]]:
    """Search Google Drive for FDA source documents"""
    # Verify authentication
    if user_id not in self.user_sessions:
        await self.authenticate_user(user_id)
    # Execute search using GoogleDrive.SearchFiles
    result = await self.client.tools.execute(
        tool_name="GoogleDrive.SearchFiles",
        input={
            "query": query_terms,
            "limit": max_results
        },
        user_id=user_id
    )
    return result.output.get("files", [])
Available GoogleDrive toolkit tools: GoogleDrive reference
Extract Information from Documents
Read Document Content
Use the GoogleDocs toolkit to retrieve document content. The toolkit provides these tools:
- GoogleDocs.GetDocument - Get the latest version of a Google Doc
- GoogleDocs.AppendText - Insert text at end of document
- GoogleDocs.CreateBlankDocument - Create blank document with title
- GoogleDocs.CreateDocument - Create document with title and content
async def get_document_content(
    self,
    user_id: str,
    document_id: str
) -> str:
    """Retrieve content from Google Doc in markdown format"""
    result = await self.client.tools.execute(
        tool_name="GoogleDocs.GetDocument",
        input={
            "document_id": document_id,
            "format": "MARKDOWN"  # Options: MARKDOWN, HTML, GOOGLE_API_JSON
        },
        user_id=user_id
    )
    return result.output.get("content", "")
Full tool documentation: Google Docs toolkit
Structure Information Extraction
Define a schema for FDA form data:
from pydantic import BaseModel, Field
from typing import List

class DeviceInformation(BaseModel):
    """Structured device information for FDA 510(k) forms"""
    device_name: str = Field(description="Device trade or common name")
    manufacturer: str = Field(description="Legal manufacturer name")
    classification_name: str = Field(description="Device classification per 21 CFR")
    product_code: str = Field(description="Three-letter FDA product code")
    intended_use: str = Field(description="Intended use statement")
    indications_for_use: str = Field(description="Specific clinical indications")
    predicate_device: str = Field(description="510(k) number of predicate (format: KYYXXXX)")
    technological_characteristics: List[str] = Field(
        description="Key technological features"
    )
async def extract_device_information(
    self,
    user_id: str,
    source_document_ids: List[str]
) -> DeviceInformation:
    """Extract structured device data from source documents"""
    # Retrieve content from all source documents
    documents = []
    for doc_id in source_document_ids:
        content = await self.get_document_content(user_id, doc_id)
        documents.append(content)
    combined_content = "\n\n---\n\n".join(documents)
    # Use LLM with structured output to extract information
    # Implementation depends on your LLM provider
    # Ensure the LLM returns data matching the DeviceInformation schema
    extraction_prompt = f"""
        Extract device information for FDA 510(k) submission from these documents:
        {combined_content}
        Return structured data following the DeviceInformation schema.
        """
    # Your LLM extraction logic here, for example:
    # extracted_data = await your_llm.extract(extraction_prompt, schema=DeviceInformation)
    # return extracted_data
    raise NotImplementedError("Plug in your LLM provider's structured-output call")
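Whatever provider performs the extraction, its reply is worth checking before it reaches a form. The sketch below assumes the model returns a JSON object and uses a stdlib dataclass as a stand-in for a subset of the DeviceInformation fields; `ExtractedDevice` and `parse_extraction` are hypothetical names introduced for illustration.

```python
import json
from dataclasses import dataclass, fields
from typing import List


@dataclass
class ExtractedDevice:
    """Stdlib stand-in for a subset of the DeviceInformation schema."""
    device_name: str
    manufacturer: str
    product_code: str
    indications_for_use: str
    technological_characteristics: List[str]


def parse_extraction(raw_json: str) -> ExtractedDevice:
    """Parse an LLM's JSON reply, rejecting missing or mistyped fields."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object")
    expected = [f.name for f in fields(ExtractedDevice)]
    missing = [name for name in expected if name not in data]
    if missing:
        raise ValueError(f"Missing fields: {', '.join(missing)}")
    traits = data["technological_characteristics"]
    if not isinstance(traits, list) or not all(isinstance(t, str) for t in traits):
        raise ValueError("technological_characteristics must be a list of strings")
    for name in expected:
        if name != "technological_characteristics" and not isinstance(data[name], str):
            raise ValueError(f"{name} must be a string")
    # Extra keys the model may have added are ignored
    return ExtractedDevice(**{name: data[name] for name in expected})
```

With pydantic v2 installed, `DeviceInformation.model_validate_json(raw_json)` performs an equivalent check against the full schema.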
Create and Populate Forms
Generate Form Template
Create FDA form templates in Google Docs:
async def create_form_document(
    self,
    user_id: str,
    form_type: str,
    title: str
) -> str:
    """Create new FDA form document with template"""
    # Get form template
    template = self.get_form_template(form_type)
    # Create document with content using GoogleDocs.CreateDocument
    result = await self.client.tools.execute(
        tool_name="GoogleDocs.CreateDocument",
        input={
            "title": title,
            "content": template
        },
        user_id=user_id
    )
    return result.output.get("document_id")

def get_form_template(self, form_type: str) -> str:
    """Return FDA form template structure"""
    templates = {
        "indications_for_use": """
INDICATIONS FOR USE STATEMENT
FDA Form 3881
Device Name: [DEVICE_NAME]
Manufacturer: [MANUFACTURER]
510(k) Number: [510K_NUMBER]
INDICATIONS FOR USE:
[INDICATIONS]
PRESCRIPTION USE: [ ] Yes [ ] No
(Per 21 CFR 801.109)
OVER-THE-COUNTER USE: [ ] Yes [ ] No
(21 CFR 801 Subpart C)
""",
        "device_description": """
DEVICE DESCRIPTION
Device Name: [DEVICE_NAME]
Product Code: [PRODUCT_CODE]
Classification Name: [CLASSIFICATION]
PHYSICAL DESCRIPTION:
[PHYSICAL_DESC]
TECHNOLOGICAL CHARACTERISTICS:
[TECH_CHARACTERISTICS]
MATERIALS:
[MATERIALS]
""",
        "510k_summary": """
510(k) SUMMARY
Submitter Information:
Name: [MANUFACTURER]
Address: [ADDRESS]
Device Information:
Trade Name: [DEVICE_NAME]
Common Name: [COMMON_NAME]
Classification Name: [CLASSIFICATION]
Product Code: [PRODUCT_CODE]
Predicate Device:
510(k) Number: [PREDICATE_510K]
Trade Name: [PREDICATE_NAME]
Substantial Equivalence:
[SE_COMPARISON]
"""
    }
    return templates.get(form_type, "")
Populate Form with Extracted Data
Replace template placeholders with extracted information:
async def autofill_form(
    self,
    user_id: str,
    document_id: str,
    device_info: DeviceInformation
) -> Dict[str, Any]:
    """Autofill FDA form with extracted device information"""
    # Get current document content
    current_content = await self.get_document_content(user_id, document_id)
    # Create replacement mapping
    # [510K_NUMBER] is the subject device's own number, so it stays unfilled here;
    # the predicate's number belongs in [PREDICATE_510K]
    replacements = {
        "[DEVICE_NAME]": device_info.device_name,
        "[MANUFACTURER]": device_info.manufacturer,
        "[PRODUCT_CODE]": device_info.product_code,
        "[CLASSIFICATION]": device_info.classification_name,
        "[INDICATIONS]": device_info.indications_for_use,
        "[PREDICATE_510K]": device_info.predicate_device,
        "[TECH_CHARACTERISTICS]": "\n".join(
            f"• {char}" for char in device_info.technological_characteristics
        )
    }
    # Replace placeholders
    updated_content = current_content
    for placeholder, value in replacements.items():
        updated_content = updated_content.replace(placeholder, value)
    # Create new populated document
    result = await self.client.tools.execute(
        tool_name="GoogleDocs.CreateDocument",
        input={
            "title": f"Completed - {document_id}",
            "content": updated_content
        },
        user_id=user_id
    )
    return {
        "success": True,
        "document_id": result.output.get("document_id"),
        "document_url": result.output.get("url")
    }
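Some placeholders in the templates (for example [ADDRESS] or [MATERIALS]) have no matching field in the extraction schema, so a populated form can still contain unfilled tokens. A small sketch for surfacing them to the reviewer; `find_unfilled_placeholders` is a hypothetical helper, and the pattern assumes placeholders are uppercase letters, digits, and underscores in square brackets.

```python
import re
from typing import List

# Matches tokens such as [DEVICE_NAME] or [510K_NUMBER]; checkbox markers like "[ ]" do not match
PLACEHOLDER_PATTERN = re.compile(r"\[[A-Z0-9_]+\]")


def find_unfilled_placeholders(content: str) -> List[str]:
    """Return placeholder tokens still present, in order of first appearance."""
    seen: List[str] = []
    for token in PLACEHOLDER_PATTERN.findall(content):
        if token not in seen:
            seen.append(token)
    return seen
```

Running this on `updated_content` before creating the document gives the user a concrete list of fields that still need manual entry.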
Build Complete Agent Workflow
Orchestrate Full Process
Combine components into complete autofill workflow:
async def process_form_autofill_request(
    self,
    user_id: str,
    form_type: str,
    search_query: str
) -> Dict[str, Any]:
    """Execute complete FDA form autofill workflow"""
    try:
        # Step 1: Authenticate user
        auth_result = await self.authenticate_user(user_id)
        if auth_result.get("authorization_required"):
            return auth_result
        # Step 2: Search for source documents
        source_docs = await self.search_source_documents(
            user_id,
            search_query,
            max_results=20
        )
        if not source_docs:
            return {
                "success": False,
                "message": "No source documents found. Refine your search query."
            }
        # Step 3: Extract information from top 5 documents
        doc_ids = [doc["id"] for doc in source_docs[:5]]
        device_info = await self.extract_device_information(user_id, doc_ids)
        # Step 4: Validate extracted information
        validation = self.validate_device_information(device_info)
        if not validation["valid"]:
            return {
                "success": False,
                "errors": validation["errors"],
                "message": "Extracted information failed validation"
            }
        # Step 5: Create form template
        form_title = f"FDA {form_type} - {device_info.device_name}"
        form_doc_id = await self.create_form_document(
            user_id,
            form_type,
            form_title
        )
        # Step 6: Autofill form
        result = await self.autofill_form(user_id, form_doc_id, device_info)
        return {
            "success": True,
            "form_document_id": result["document_id"],
            "form_document_url": result["document_url"],
            "source_documents": [doc["name"] for doc in source_docs[:5]],
            "device_name": device_info.device_name,
            "validation_warnings": validation.get("warnings", [])
        }
    except Exception as e:
        return {
            "success": False,
            "error": str(e)
        }
Validate Extracted Information
Implement validation against FDA requirements:
def validate_device_information(self, device_info: DeviceInformation) -> Dict[str, Any]:
    """Validate extracted device data meets FDA requirements"""
    errors = []
    warnings = []
    # Validate device name
    if not device_info.device_name or len(device_info.device_name) < 3:
        errors.append("Device name required (minimum 3 characters)")
    # Validate product code format
    if not device_info.product_code or len(device_info.product_code) != 3:
        errors.append("Product code must be exactly 3 characters")
    # Validate 510(k) predicate format
    if not device_info.predicate_device.startswith("K"):
        errors.append("Predicate 510(k) number must start with 'K'")
    # Check intended use detail
    if len(device_info.intended_use) < 50:
        warnings.append("Intended use statement should be more detailed")
    # Validate technological characteristics
    if len(device_info.technological_characteristics) < 3:
        warnings.append("Consider adding more technological characteristics")
    return {
        "valid": len(errors) == 0,
        "errors": errors,
        "warnings": warnings
    }
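The prefix and length checks above accept values such as "K" alone or "A1!". A stricter sketch using regular expressions is shown below. It assumes a traditional 510(k) number is "K" followed by six digits (two-digit year plus four-digit sequence, matching the KYYXXXX format in the schema) and that product codes are three uppercase letters; both helper names are hypothetical.

```python
import re

# Assumption: "K" plus six digits, e.g. K123456
K_NUMBER_PATTERN = re.compile(r"K\d{6}")
# Assumption: exactly three uppercase letters
PRODUCT_CODE_PATTERN = re.compile(r"[A-Z]{3}")


def is_valid_k_number(value: str) -> bool:
    """True when the value looks like a traditional 510(k) number."""
    return K_NUMBER_PATTERN.fullmatch(value.strip().upper()) is not None


def is_valid_product_code(value: str) -> bool:
    """True when the value is exactly three uppercase letters."""
    return PRODUCT_CODE_PATTERN.fullmatch(value.strip()) is not None
```

A format match only shows that the value is well formed; whether the predicate number actually exists still has to be verified against FDA records.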
Integrate with Agent Frameworks
LangChain Integration
Use Arcade tools with LangChain:
from arcadepy import AsyncArcade

async def setup_arcade_tools_for_langchain(user_id: str):
    """Get Arcade tools formatted for LangChain"""
    client = AsyncArcade()  # async client, because the calls below are awaited
    # Get Google Docs and Drive tools
    docs_tools = await client.tools.list(toolkit="google_docs", user_id=user_id)
    drive_tools = await client.tools.list(toolkit="google_drive", user_id=user_id)
    # Authorize all tools
    all_tools = docs_tools.items + drive_tools.items
    for tool in all_tools:
        auth_result = await client.tools.authorize(
            tool_name=tool.name,
            user_id=user_id
        )
        if auth_result.status != "completed":
            print(f"Authorize: {auth_result.url}")
            await client.auth.wait_for_completion(auth_result)
    return all_tools
Full LangChain integration guide: Using Arcade tools with LangChain
Google ADK Integration
Use Arcade with Google ADK:
from google.adk import Agent
from google_adk_arcade.tools import get_arcade_tools
from arcadepy import AsyncArcade

async def create_fda_agent(user_id: str):
    """Create FDA form agent with Google ADK"""
    client = AsyncArcade()
    # Get Google toolkit tools
    tools = await get_arcade_tools(
        client,
        toolkits=["google_docs", "google_drive"]
    )
    # Authorize tools
    for tool in tools:
        result = await client.tools.authorize(
            tool_name=tool.name,
            user_id=user_id
        )
        if result.status != "completed":
            await client.auth.wait_for_completion(result)
    # Create agent
    agent = Agent(
        model="gemini-2.0-flash",
        name="fda_form_agent",
        instruction="""
        You are an FDA regulatory assistant for 510(k) submissions.
        Search Google Drive for relevant documents, extract device information,
        and populate FDA form templates. Verify all information with the user
        before finalizing forms.
        """,
        tools=tools
    )
    return agent
Integration documentation: Google ADK with Arcade
Handle Errors and Edge Cases
Implement Retry Logic
import asyncio

async def safe_execute_tool(
    self,
    tool_name: str,
    input_params: Dict[str, Any],
    user_id: str,
    max_retries: int = 3
) -> Any:
    """Execute tool with retry logic"""
    for attempt in range(max_retries):
        try:
            result = await self.client.tools.execute(
                tool_name=tool_name,
                input=input_params,
                user_id=user_id
            )
            return result
        except Exception as e:
            # Final attempt failed: raise instead of falling out of the loop and returning None
            if attempt == max_retries - 1:
                raise Exception(f"Failed after {max_retries} attempts: {str(e)}") from e
            error_type = getattr(e, 'type', 'unknown')
            # Handle authorization errors
            if error_type == "authorization_required":
                await self.authenticate_user(user_id)
                continue
            # Handle rate limits with exponential backoff
            if error_type == "rate_limit_exceeded":
                await asyncio.sleep(2 ** attempt)
                continue
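The backoff above doubles the wait on each attempt with no upper bound and no randomization. One common refinement is to cap the delay and add jitter so that many clients do not retry in lockstep. The sketch below only computes the schedule; `backoff_delays` and its parameters are hypothetical names, not part of arcadepy.

```python
import random
from typing import List, Optional


def backoff_delays(
    max_retries: int,
    base: float = 1.0,
    cap: float = 30.0,
    jitter: float = 0.0,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return the wait in seconds before each retry: base * 2**attempt, capped, plus jitter."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(max_retries):
        delay = min(cap, base * (2 ** attempt))
        if jitter:
            # Add a random amount in [0, jitter] so concurrent clients spread out
            delay += rng.uniform(0, jitter)
        delays.append(delay)
    return delays
```

Inside a retry loop, `await asyncio.sleep(delays[attempt])` would take the place of the fixed `2 ** attempt` wait.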
Handle Authorization Callbacks
Manage user authorization flows:
import os
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from arcadepy import AsyncArcade

app = FastAPI()

@app.post("/api/oauth/callback")
async def oauth_callback(req: Request):
    """Handle OAuth callback completion"""
    body = await req.json()
    user_id = body.get("userId")
    arcade = AsyncArcade(api_key=os.getenv("ARCADE_API_KEY"))
    # Load available tools after authorization
    docs_tools = await arcade.tools.list(toolkit="google_docs", user_id=user_id)
    drive_tools = await arcade.tools.list(toolkit="google_drive", user_id=user_id)
    return JSONResponse({
        "success": True,
        "available_tools": len(docs_tools.items) + len(drive_tools.items)
    })
Optimize Performance
Implement Document Caching
from collections import OrderedDict
import time

class DocumentCache:
    """LRU cache for document content"""
    def __init__(self, max_size: int = 100, ttl_seconds: int = 3600):
        self.cache = OrderedDict()
        self.max_size = max_size
        self.ttl_seconds = ttl_seconds

    def get(self, doc_id: str):
        """Retrieve cached document"""
        if doc_id not in self.cache:
            return None
        content, timestamp = self.cache[doc_id]
        # Check expiration
        if time.time() - timestamp > self.ttl_seconds:
            del self.cache[doc_id]
            return None
        # Move to end (recently used)
        self.cache.move_to_end(doc_id)
        return content

    def set(self, doc_id: str, content: str):
        """Cache document content"""
        if doc_id in self.cache:
            self.cache.move_to_end(doc_id)
        self.cache[doc_id] = (content, time.time())
        # Evict oldest if full
        if len(self.cache) > self.max_size:
            self.cache.popitem(last=False)
Batch Document Processing
async def batch_process_documents(
    self,
    user_id: str,
    document_ids: List[str],
    batch_size: int = 5
) -> List[str]:
    """Process documents in batches"""
    results = []
    for i in range(0, len(document_ids), batch_size):
        batch = document_ids[i:i + batch_size]
        # Process batch concurrently
        tasks = [
            self.get_document_content(user_id, doc_id)
            for doc_id in batch
        ]
        batch_results = await asyncio.gather(*tasks, return_exceptions=True)
        # Filter exceptions
        valid_results = [
            r for r in batch_results
            if not isinstance(r, Exception)
        ]
        results.extend(valid_results)
    return results
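Filtering exceptions out of the results also discards which documents failed, and for regulatory work it is usually worth knowing when a source document could not be read. The variant below is a sketch that keeps results and failures keyed by document ID; `fetch_all` and its `fetch` callback are hypothetical, with `fetch` standing in for a call such as `get_document_content`.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple


async def fetch_all(
    document_ids: List[str],
    fetch: Callable[[str], Awaitable[str]],
    batch_size: int = 5,
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Fetch documents in batches; return (contents by ID, error messages by ID)."""
    contents: Dict[str, str] = {}
    failures: Dict[str, str] = {}
    for start in range(0, len(document_ids), batch_size):
        batch = document_ids[start:start + batch_size]
        # Run one batch concurrently; exceptions come back as values, not raises
        results = await asyncio.gather(
            *(fetch(doc_id) for doc_id in batch), return_exceptions=True
        )
        for doc_id, result in zip(batch, results):
            if isinstance(result, BaseException):
                failures[doc_id] = str(result)
            else:
                contents[doc_id] = result
    return contents, failures
```

The `failures` mapping can then be shown to the user alongside the autofilled form so that missing sources are visible during review.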
Deploy to Production
Security Configuration
Store credentials securely:
import os
from typing import Optional

import keyring
from cryptography.fernet import Fernet

class SecureCredentialManager:
    """Secure credential storage"""
    def __init__(self):
        encryption_key = os.getenv("ENCRYPTION_KEY")
        if not encryption_key:
            raise ValueError("ENCRYPTION_KEY environment variable required")
        self.fernet = Fernet(encryption_key.encode())

    def store_token(self, user_id: str, token: str):
        """Store encrypted token"""
        encrypted = self.fernet.encrypt(token.encode())
        keyring.set_password("fda_agent", user_id, encrypted.decode())

    def retrieve_token(self, user_id: str) -> Optional[str]:
        """Retrieve decrypted token, or None if nothing is stored for this user"""
        encrypted = keyring.get_password("fda_agent", user_id)
        if not encrypted:
            return None
        return self.fernet.decrypt(encrypted.encode()).decode()
Implement Monitoring
import logging
from datetime import datetime

class AgentMonitor:
    """Structured logging for agent operations"""
    def __init__(self):
        self.logger = logging.getLogger("fda_agent")
        self.logger.setLevel(logging.INFO)
        # Attach the handler only once, so repeated instantiation does not duplicate log lines
        if not self.logger.handlers:
            handler = logging.StreamHandler()
            formatter = logging.Formatter(
                '%(asctime)s - %(levelname)s - %(message)s'
            )
            handler.setFormatter(formatter)
            self.logger.addHandler(handler)

    def log_autofill(self, user_id: str, form_type: str, status: str):
        """Log form autofill operation"""
        self.logger.info(
            f"Autofill | User: {user_id} | Form: {form_type} | Status: {status}"
        )

    def log_document_access(self, user_id: str, doc_id: str, action: str):
        """Log document access for audit trail"""
        self.logger.info(
            f"Document | User: {user_id} | Doc: {doc_id} | Action: {action} | "
            f"Time: {datetime.now().isoformat()}"
        )
Configure Rate Limiting
from datetime import datetime, timedelta
from collections import defaultdict

class RateLimiter:
    """Rate limiting for API operations"""
    def __init__(self, max_calls: int = 100, window_minutes: int = 60):
        self.max_calls = max_calls
        self.window = timedelta(minutes=window_minutes)
        self.calls = defaultdict(list)

    def check_limit(self, user_id: str) -> bool:
        """Check if user exceeded rate limit"""
        now = datetime.now()
        # Remove expired calls
        self.calls[user_id] = [
            call_time for call_time in self.calls[user_id]
            if now - call_time < self.window
        ]
        # Check limit
        if len(self.calls[user_id]) >= self.max_calls:
            return False
        # Record call
        self.calls[user_id].append(now)
        return True
Test Your Agent
Unit Tests
import pytest
from unittest.mock import AsyncMock, MagicMock

@pytest.fixture
def mock_client():
    """Mock Arcade client"""
    client = AsyncMock()
    client.auth.start = AsyncMock(return_value=MagicMock(status="completed"))
    client.tools.execute = AsyncMock()
    return client

@pytest.mark.asyncio
async def test_document_search(mock_client):
    """Test document search functionality"""
    autofiller = FDAFormAutofiller()
    autofiller.client = mock_client
    # Mock search results
    mock_client.tools.execute.return_value = MagicMock(
        output={"files": [{"id": "doc1", "name": "Device Spec"}]}
    )
    results = await autofiller.search_source_documents(
        "test_user",
        "device specification"
    )
    assert len(results) == 1
    assert results[0]["name"] == "Device Spec"

@pytest.mark.asyncio
async def test_validation():
    """Test device information validation"""
    autofiller = FDAFormAutofiller()
    # Invalid device info
    invalid_info = DeviceInformation(
        device_name="",
        manufacturer="Test Corp",
        classification_name="Class II",
        product_code="AB",  # Invalid length
        intended_use="Test",
        indications_for_use="Test",
        predicate_device="123",  # Invalid format
        technological_characteristics=[]
    )
    validation = autofiller.validate_device_information(invalid_info)
    assert not validation["valid"]
    assert len(validation["errors"]) > 0
Best Practices
Form Template Management
- Store templates in version-controlled Drive folder
- Use naming convention: FDA_[FormNumber]_[FormName]_[ExpirationDate].gdoc
- Implement template version checking
- Set alerts for form expiration dates
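The naming convention makes expiration checks scriptable. The sketch below parses a template filename and compares its date with today. It assumes the expiration date is written as YYYY-MM-DD, which the convention itself does not specify; the filename in the usage note is a made-up example, and `parse_template_name` and `is_expired` are hypothetical helpers.

```python
import re
from datetime import date
from typing import NamedTuple


class TemplateName(NamedTuple):
    form_number: str
    form_name: str
    expires: date


# Assumes a YYYY-MM-DD date; the form name may itself contain underscores
TEMPLATE_PATTERN = re.compile(
    r"FDA_(?P<number>[^_]+)_(?P<name>.+)_(?P<expires>\d{4}-\d{2}-\d{2})\.gdoc"
)


def parse_template_name(filename: str) -> TemplateName:
    """Split a template filename into form number, form name, and expiration date."""
    match = TEMPLATE_PATTERN.fullmatch(filename)
    if match is None:
        raise ValueError(f"Not a template filename: {filename}")
    return TemplateName(
        match["number"], match["name"], date.fromisoformat(match["expires"])
    )


def is_expired(template: TemplateName, today: date) -> bool:
    """True when the template's expiration date has passed."""
    return today > template.expires
```

For instance, `parse_template_name("FDA_3881_Indications_For_Use_2030-01-31.gdoc")` yields form number "3881" and a form name of "Indications_For_Use"; a scheduled job could call `is_expired` on every template in the folder and alert on any that return True.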
Data Validation
- Cross-reference device names with FDA database
- Verify predicate 510(k) numbers
- Validate date formats
- Check character limits per form specifications
Audit Trail Requirements
- Record all document accesses with timestamps
- Log all form modifications
- Track source documents used for autofill
- Store original and modified versions
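These requirements can be met with one structured record per form modification. The sketch below builds a JSON line containing the timestamp, the source documents used, and a SHA-256 digest of the written content. The field names and the `audit_record` helper are assumptions made for illustration, not a prescribed audit format.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import List, Optional


def audit_record(
    user_id: str,
    action: str,
    document_id: str,
    content: str,
    source_document_ids: List[str],
    when: Optional[datetime] = None,
) -> str:
    """Build one JSON line describing a form modification."""
    when = when or datetime.now(timezone.utc)
    record = {
        "timestamp": when.isoformat(),
        "user_id": user_id,
        "action": action,
        "document_id": document_id,
        "source_document_ids": source_document_ids,
        # The digest identifies exactly which version of the content was written
        "content_sha256": hashlib.sha256(content.encode("utf-8")).hexdigest(),
    }
    return json.dumps(record, sort_keys=True)
```

Appending each returned line to an append-only log gives a chronological trail that links every autofilled form back to its sources.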
User Review Workflow
- Never submit forms without human review
- Highlight autofilled sections
- Provide source document links
- Enable inline commenting
Conclusion
This FDA form autofill agent shows how Arcade's Google Docs and Drive toolkits automate regulatory workflows. Arcade handles OAuth authentication, token management, and secure API access, letting you focus on agent logic.
The agent provides:
- Secure multi-user authentication
- Intelligent document search
- Automated information extraction
- Form population with validation
- Error handling and retry logic
- Production-ready security
Regulatory teams can reduce form preparation time from hours to minutes, provided every autofilled form still goes through human review, while maintaining accuracy and audit trails. The architecture scales across different FDA form types by adjusting extraction schemas and templates.
For production deployment, self-host Arcade for enhanced control over authentication flows and data residency.



