CrewAI API (2026) - Complete Developer Guide to Building, Deploying, and Integrating Multi-Agent Workflows
CrewAI is an agent orchestration framework designed to help you build "teams" of AI agents (a crew) that collaborate on tasks while you keep control over structure, state, tooling, and execution. Developers typically use it in two ways: CrewAI OSS (open source in your codebase) and CrewAI AMP (a hosted agent management platform with an HTTP API).
Overview
This page focuses on "CrewAI API" the way most teams mean it: the AMP HTTP API for running deployed crews, plus the OSS Python APIs you use to build those crews in the first place. You'll get a clear mental model, production-ready integration patterns, and example requests you can paste into Postman, your backend, or a serverless function.
CrewAI OSS (Open Source)
- Design Agents, Tasks, Crews (and Flows)
- Connect to LLMs and tools
- Run locally or on your infrastructure
- Framework/library API inside your app
CrewAI AMP (Hosted)
- Build & deploy crews through the platform
- Each deployed crew has a unique base URL
- HTTP workflow: inputs → kickoff → status → results
- Product integration API for apps
(inputs: dict) → (final_output). Your product calls the crew endpoint, tracks execution, and displays results.
Table of contents
- What "CrewAI API" means (AMP HTTP API vs OSS library APIs)
- Core mental model: Agents, Tools, Tasks, Crews, Flows
- CrewAI AMP API basics: base URL, auth, workflow
- Endpoints you'll use most: GET /inputs, POST /kickoff, GET /{kickoff_id}/status, POST /resume
- Webhooks for streaming progress into your app (task/step/crew callbacks)
- Implementing a production integration (backend pattern + sample code)
- Building a crew in OSS that is "API-ready" (inputs, outputs, structure)
- LLM connectivity and provider choices (LiteLLM + custom LLMs)
- Reliability: retries, idempotency, timeouts, rate control, and failure modes
- Security: token handling, multi-tenant design, and least privilege
- Cost control: cutting token burn and keeping crews predictable
- Observability and evaluation (what to log, what to measure)
- Common use cases and architecture blueprints
- FAQs
1) What "CrewAI API" means
When people search "CrewAI API," they usually want one of these:
A) The CrewAI AMP HTTP API (run a deployed crew)
The AMP API is a clean, "run my workflow" interface. You don't call agents directly; you call the crew endpoint associated with a deployed automation.
- You get a Bearer token from your crew's detail page in the AMP dashboard.
- Each deployed crew has a unique base URL like https://your-crew-name.crewai.com.
- Typical workflow:
  - GET /inputs to discover required parameters
  - POST /kickoff to start execution (returns a kickoff_id)
  - GET /{kickoff_id}/status to poll progress/results
B) The CrewAI OSS library APIs (build and run crews in your code)
This is the Python framework. You create:
- Agents: autonomous units that perform tasks, use tools, collaborate, and keep memory.
- Tools: functions/skills agents can call (web search, file reads, integrations, custom tools).
- Crews: groups of agents + tasks + a process (sequential/hierarchical) that define how work executes.
- Flows: the structured "backbone" that orchestrates steps, state, triggers, and when to run a crew.
Most production teams use both: they build with OSS concepts, then deploy in AMP, then integrate via the AMP HTTP API.
2) Core mental model (the parts you must understand)
Before you integrate anything, align on how CrewAI thinks.
Agents
An agent is like a specialist teammate with a role, goal, tools, and constraints. Agents can perform tasks, make decisions, use tools, communicate/collaborate, maintain memory, and delegate tasks when allowed.
Tools
Tools are callable "skills" an agent can use. Tools expand capability (web search, data analysis, collaboration, delegation) and should support robust error handling, caching, and sync/async patterns.
Crews
A crew is the unit you actually run: tasks + agents + a process strategy. A crew defines execution strategy and collaboration.
Flows
Flows are the backbone/manager of your AI application: state management, event-driven execution, and control flow (branches, loops).
3) CrewAI AMP API basics (base URL, auth, workflow)
Base URL
Each deployed crew has its own unique endpoint like:
https://your-crew-name.crewai.com
This means you don't have one global API host. You have one host per deployed crew.
Authentication
Requests require a Bearer token in the Authorization header.
Retrieve tokens from the crew's detail page (Status tab) in the AMP dashboard.
Keep tokens server-side only.
Typical workflow (how your app should call it)
- Discover required inputs: GET /inputs
- Start execution: POST /kickoff with inputs (optional metadata/webhooks)
- Monitor: GET /{kickoff_id}/status until complete
- Read results from the completed status payload (or receive completion via webhook)
4) The key AMP endpoints (with examples)
4.1 GET /inputs: discover what the crew needs
Returns the list of required input parameter names. Example response:
{
"inputs": ["budget", "interests", "duration", "age"]
}
How to use it
- Confirm your crew contract (backend payload requirements)
- Generate forms in internal admin UIs
- Validate payloads in multi-tenant products before kickoff
Common mistakes
- Assuming inputs are stable forever; version your crew if you change them.
- Skipping validation and paying for failed runs.
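A minimal validation helper for this step, written against the example `/inputs` response above (the function name is illustrative):

```python
def validate_inputs(required: list[str], payload: dict) -> list[str]:
    """Return the names of required inputs that are missing or empty in payload.

    Call this before POST /kickoff so you fail fast instead of paying for a run.
    """
    return [name for name in required
            if name not in payload or payload[name] in (None, "")]


# Example, using the contract from the response above:
required = ["budget", "interests", "duration", "age"]
missing = validate_inputs(required, {"budget": "1000 USD", "interests": "hikes"})
# missing -> ["duration", "age"]
```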
4.2 POST /kickoff: start a crew execution
Your primary "run" call. inputs is required. meta is optional. Webhooks are optional:
taskWebhookUrl, stepWebhookUrl, crewWebhookUrl.
{
"inputs": {
"budget": "1000 USD",
"interests": "games, tech, ai, relaxing hikes, amazing food",
"duration": "7 days",
"age": "35"
},
"meta": {
"requestId": "travel-req-123",
"source": "web-app"
}
}
Treat kickoff_id as your "job ID." Store it with: user/org id, requestId (your idempotency key), created time, status, and final output payload.
4.3 GET /{kickoff_id}/status: poll status + progress
Returns execution state including status, current task, and progress.
{
"status": "running",
"current_task": "research_task",
"progress": { "completed_tasks": 1, "total_tasks": 3 }
}
Polling guidance:
- Start at 1-2 seconds, then back off (2s → 5s → 10s).
- Stop after a maximum time; mark the run as "timeout" and notify the user.
- Prefer webhooks for realtime UX; keep polling as a fallback.
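The backoff-and-deadline pattern above can be sketched as follows. The status values checked here ("running", "complete") come from the example response; adjust the terminal set to whatever your crew actually reports.

```python
import time


def backoff_schedule(steps=(1, 2, 5), cap=10):
    """Yield poll intervals: 1s, 2s, 5s, then 10s forever."""
    for s in steps:
        yield s
    while True:
        yield cap


def poll_until_done(fetch_status, max_seconds=600, sleep=time.sleep):
    """Poll fetch_status() -> dict until a terminal status or the deadline.

    Raises TimeoutError so the caller can mark the run as "timeout"
    and keep checking in the background.
    """
    deadline = time.monotonic() + max_seconds
    for delay in backoff_schedule():
        payload = fetch_status()
        if payload.get("status") in ("complete", "failed"):
            return payload
        if time.monotonic() + delay > deadline:
            raise TimeoutError("still running; notify the user and poll in background")
        sleep(delay)
```

`fetch_status` would wrap your GET /{kickoff_id}/status call; injecting `sleep` makes the loop testable.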
4.4 POST /resume: human-in-the-loop continuation
Continue an execution with human feedback/approval. Common parameters:
execution_id, task_id, human_feedback, is_approve, and optional webhook URLs again.
5) Webhooks: making the API feel realtime
Polling works, but webhooks turn your integration into a product experience. At kickoff, you can pass:
- taskWebhookUrl: called after each task completion
- stepWebhookUrl: called after each agent thought/action
- crewWebhookUrl: called when execution completes
Recommended webhook design
- Sign webhook payloads (HMAC) if supported, or embed a secret token in the URL path.
- Verify kickoff_id belongs to your tenant before accepting.
- Make handlers idempotent (the same payload may be retried).
- Store progress events for debugging and customer support.
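A minimal sketch of the tenant check and idempotency rules above, using an in-memory set where production code would use a database table (the event field names are hypothetical, since webhook payload shapes vary):

```python
processed: set = set()  # in production: a DB table keyed by the dedup tuple


def handle_webhook(event: dict, known_kickoffs: set) -> str:
    """Accept a webhook event exactly once, rejecting unknown kickoff_ids."""
    kid = event.get("kickoff_id")
    if kid not in known_kickoffs:
        return "rejected"            # not a run this tenant started
    # Dedup key: retried deliveries of the same event must be no-ops.
    key = (kid, event.get("task"), event.get("status"))
    if key in processed:
        return "duplicate"
    processed.add(key)
    # ... store the progress event for debugging and customer support ...
    return "accepted"
```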
Minimal product UX with webhooks
- UI shows: "Running... Task 2/5"
- Stream partial outputs to the user
- On completion: show final result + "download" / "copy" / "apply changes"
6) Production integration blueprint (backend-first)
A reliable CrewAI integration usually looks like this:
Step A: Backend receives a request
Example: user clicks "Generate support reply" or "Create travel plan".
Step B: Backend validates and normalizes inputs
- Validate required fields
- Trim huge text
- Attach metadata: requestId, tenantId, userId, source
- Optionally store a sanitized copy for audit
Step C: Backend calls POST /kickoff
- Store the returned kickoff_id
- Respond immediately with kickoff_id (async UX)
Step D: Progress tracking
- Option 1: Backend polls status; client asks backend for updates
- Option 2: Webhooks update your DB; client subscribes (SSE/WebSocket)
Step E: Completion
- On completion (status complete or webhook), store final output
- Return final output to user
- Optional post-processing: schema validation, formatting, citations, etc.
7) Building a crew that's "API-ready" (OSS side)
Even if you integrate via AMP, most of your quality comes from how you build the crew.
Design your crew contract
A deployed crew behaves like a function:
(inputs: dict) -> (final_output)
- Define required inputs (small list, clearly named)
- Define optional inputs (with defaults)
- Decide output format (plain text vs JSON)
If you rename topic to subject, you'll break integrations.
Version your crew or support both during migration.
Keep tools tight
Tools are power, and risk. Best practices:
- Prefer tools that return structured JSON
- Add strict validation to tool outputs
- Fail fast with helpful messages
- Cache expensive calls where possible
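One lightweight way to enforce the "structured JSON, fail fast" rules above is a wrapper around each tool function. This is an illustrative pattern, not a CrewAI API:

```python
import json


def safe_tool(fn):
    """Wrap a tool so it always returns JSON-serializable output or a clear error.

    Agents handle a structured {"error": ...} far better than a raw traceback.
    """
    def wrapper(*args, **kwargs):
        try:
            result = fn(*args, **kwargs)
            json.dumps(result)  # fail fast if the output isn't valid JSON
            return result
        except Exception as exc:
            return {"error": f"{fn.__name__} failed: {exc}"}
    return wrapper


@safe_tool
def lookup_refund_policy(tier: str) -> dict:
    # hypothetical tool body; real tools would hit a KB or API here
    policies = {"gold": {"days": 60}, "standard": {"days": 30}}
    return policies[tier]  # KeyError on unknown tier -> clean error dict
```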
Choose a process style
Sequential
- Simple, predictable, easier to debug
- Often best for API products
Hierarchical
- Better for complex problems
- Requires a manager LLM
- Can increase token use and latency
For most API products: start sequential, move to hierarchical only when necessary.
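A minimal sequential crew, as a sketch against the CrewAI OSS Python API (Agent, Task, Crew, Process). Treat the exact constructor fields as indicative and check them against the version you install; the `{placeholders}` in task descriptions are filled from kickoff inputs:

```python
from crewai import Agent, Task, Crew, Process

researcher = Agent(
    role="Travel Researcher",
    goal="Find options matching the traveler's budget and interests",
    backstory="Turns vague preferences into concrete candidates.",
)
planner = Agent(
    role="Itinerary Planner",
    goal="Turn research into a day-by-day plan",
    backstory="Writes clear, practical itineraries.",
)

research_task = Task(
    description="Research destinations for: {interests}, budget {budget}, duration {duration}.",
    expected_output="A shortlist of destinations with pros and cons.",
    agent=researcher,
)
plan_task = Task(
    description="Create a day-by-day itinerary from the research.",
    expected_output="A {duration} itinerary that stays within {budget}.",
    agent=planner,
)

crew = Crew(
    agents=[researcher, planner],
    tasks=[research_task, plan_task],
    process=Process.sequential,  # predictable ordering; easiest to debug
)

# result = crew.kickoff(inputs={"budget": "1000 USD",
#                               "interests": "hikes, food",
#                               "duration": "7 days"})
```

Note how the kickoff inputs mirror the GET /inputs contract: the same dict shape your API consumers send is what fills the task templates.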
8) LLM connections (LiteLLM + custom LLMs)
CrewAI supports multiple providers via LiteLLM. The default model is controlled by OPENAI_MODEL_NAME
(commonly defaulting to gpt-4o-mini if not set).
When to switch models
- Use cheaper/faster models for routing, extraction, summarization
- Use larger models for final synthesis or complex reasoning
- Consider separate models for function-calling vs writing (e.g., a
function_calling_llm)
Custom LLM integration
You can implement a custom LLM (e.g., via a BaseLLM abstraction) when you need:
- A private model behind your firewall
- Enterprise auth headers
- Routing across multiple endpoints/providers
9) Reliability: retries, timeouts, idempotency, failure modes
Handle HTTP errors intentionally
Common status codes: 400, 401, 404, 422, 500.
- 401: rotate/check token; alert ops; don't retry blindly
- 422: payload missing required inputs; fix validation
- 500: retry with backoff; log requestId/kickoff_id for support
Idempotency (you implement it)
- Generate a requestId (UUID)
- Store it with a unique constraint per tenant
- If the same requestId arrives again, return the existing kickoff_id
This prevents duplicate runs when users double-click or networks retry.
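The three idempotency steps above fit in a small sketch, shown here with SQLite for brevity (a production version would INSERT first and catch the unique-constraint violation to be race-safe):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE runs (
        tenant_id  TEXT NOT NULL,
        request_id TEXT NOT NULL,
        kickoff_id TEXT,
        UNIQUE (tenant_id, request_id)   -- one run per requestId per tenant
    )
""")


def start_run(tenant_id: str, request_id: str, do_kickoff) -> str:
    """Return the existing kickoff_id for a repeated requestId; otherwise kick off once."""
    row = db.execute(
        "SELECT kickoff_id FROM runs WHERE tenant_id = ? AND request_id = ?",
        (tenant_id, request_id),
    ).fetchone()
    if row:
        return row[0]                    # duplicate request: no second (billed) run
    kickoff_id = do_kickoff()            # e.g. the POST /kickoff call
    db.execute(
        "INSERT INTO runs (tenant_id, request_id, kickoff_id) VALUES (?, ?, ?)",
        (tenant_id, request_id, kickoff_id),
    )
    db.commit()
    return kickoff_id
```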
Timeouts
- Short tasks: 30-120 seconds
- Research tasks: 2-10 minutes
- Long workflows: webhooks + background jobs
If timeouts happen: show "still running," keep polling in the background, notify on completion.
Rate limiting
- Assume per-tenant fairness
- Respect LLM provider limits
- Respect tool/provider limits
- Use a queue for high concurrency
10) Security: tokens, tenants, and least privilege
Never expose the Bearer token to clients
Bearer tokens grant access to crew operations. Keep tokens server-side only.
Multi-tenant mapping
Store:
- tenant → crew base URL
- tenant → token (encrypted at rest)
- tenant → allowed crew versions
Then validate:
- User belongs to tenant
- User is allowed to run the crew
- Inputs are within constraints (length, content type, etc.)
Prefer user-scoped tokens when applicable
If available, user-scoped tokens reduce blast radius compared to org-level tokens.
Webhook security
- Use HTTPS only
- Validate signatures or secret tokens
- Reject unknown kickoff_id values
- Enforce strict JSON parsing
- Log and rate limit webhook endpoints
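If your setup supports signed webhooks (the guide hedges this with "if supported"), verification is standard HMAC over the raw body; the header name is whatever you and the sender agree on:

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """HMAC-SHA256 signature of the raw request body, hex-encoded."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_webhook(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Verify the signature sent alongside a webhook payload.

    compare_digest is constant-time, which prevents timing attacks.
    """
    return hmac.compare_digest(sign(secret, body), signature_header)
```

Always sign the raw bytes before JSON parsing, so re-serialization differences can't break verification.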
11) Cost control: reduce token burn and keep outputs predictable
Multi-agent systems can get expensive if you don't constrain them. Practical methods:
A) Minimize context passing between tasks
Summarize "what matters" at each step instead of forwarding entire transcripts.
B) Use smaller models for early steps
Routing/extraction/tool selection often works with cheaper models; save bigger models for final synthesis.
C) Keep tools structured
Clean JSON tool outputs reduce model "interpretation" overhead.
D) Reduce verbosity in production
- Concise internal reasoning
- Clear final output
- Stable formatting
E) Add a "budget guardrail"
- Max input size
- Max run time
- Max tool calls
- Fail-safe response: "Could not complete within limits; here is partial output."
12) Observability and evaluation (what to log, what to measure)
Even without a fancy dashboard, log these:
Per kickoff
- kickoff_id
- requestId
- tenant/user
- start/end timestamps
- status transitions
- tasks completed
- tool calls count (and which tools)
- model used (if you can capture it)
- final output length
- error category (validation/auth/provider/tool)
Why it matters
- Debugging: "Why did this output change yesterday?"
- Cost: "Which tasks burn the most tokens?"
- Reliability: "Which tool fails most often?"
- Quality: "Which prompts produce the best acceptance rate?"
13) Common use cases and architecture patterns
1) Research + report generation
- Inputs: topic, audience, tone, length
- Tools: web search, doc retrieval, citation formatting
- Tasks: gather sources → outline → draft → fact-check → finalize
2) Customer support copilot
- Inputs: ticket_text, policy_snippets, customer_tier
- Tools: CRM lookup, knowledge base search, refund policy tool
- Human-in-the-loop: require approval before sending (use POST /resume)
3) Sales ops automation
- Inputs: account_name, last_call_notes
- Tools: Salesforce/HubSpot integrations, email drafting
- Output: next steps + drafted follow-up
4) Data extraction pipeline
- Inputs: document_text
- Tools: file read, schema validation
- Output: strict JSON for downstream systems
5) Internal "agentic workflows" (multi-step business processes)
Flows shine here: state + triggers + conditional routing. Use them to handle complex business logic around when to run crews and how to postprocess results.
14) FAQs (quick answers)
Is CrewAI an API or a framework?
Both. CrewAI OSS is a framework/library, and AMP exposes deployed crews through an HTTP API.
What endpoints do I need to integrate a deployed crew?
The core ones are GET /inputs, POST /kickoff, GET /{kickoff_id}/status, and POST /resume for human-in-the-loop workflows.
How do I authenticate?
Include a Bearer token in the Authorization header. Keep tokens server-side only.
Does each crew have its own URL?
Yes: each deployed crew has a unique base URL like https://your-crew-name.crewai.com.
Can I get progress updates without polling?
Yes: provide webhook URLs at kickoff for task/step/crew callbacks.
Can CrewAI use models other than OpenAI?
Yes. CrewAI can connect to many providers via LiteLLM and supports custom LLM implementations.
What is a Crew in CrewAI?
A crew is a collaborative group of agents plus tasks and an execution process (e.g., sequential or hierarchical).
Final checklist: a "good" CrewAI API integration
- Tokens stored server-side only: never ship Bearer tokens to browsers or mobile clients.
- requestId metadata for traceability + idempotency: store requestId with a unique constraint per tenant.
- Validate inputs against the GET /inputs contract: fail fast before paying for a run.
- Async UX: kickoff returns quickly; the UI polls or listens for webhook events.
- Webhooks for realtime progress: make webhook handlers idempotent and secure.
- Human-in-the-loop gates via POST /resume: use approvals for risky actions like email sending or publishing.
- Logs/metrics per kickoff: track status transitions, tool calls, and failure categories.
- Model/tool constraints: guardrails prevent runaway token spend and latency spikes.