CrewAI API (2026) - Complete Developer Guide to Building, Deploying, and Integrating Multi-Agent Workflows

CrewAI is an agent orchestration framework designed to help you build “teams” of AI agents (a crew) that collaborate on tasks while you keep control over structure, state, tooling, and execution. Developers typically use it in two forms: CrewAI OSS (an open-source framework you embed in your codebase) and CrewAI AMP (a hosted agent management platform with an HTTP API).


Overview

This page focuses on “CrewAI API” the way most teams mean it: the AMP HTTP API for running deployed crews, plus the OSS Python APIs you use to build those crews in the first place. You’ll get a clear mental model, production-ready integration patterns, and example requests you can paste into Postman, your backend, or a serverless function.

CrewAI OSS (Open Source)
  • Design Agents, Tasks, Crews (and Flows)
  • Connect to LLMs and tools
  • Run locally or on your infrastructure
  • Framework/library API inside your app
CrewAI AMP (Hosted)
  • Build & deploy crews through the platform
  • Each deployed crew has a unique base URL
  • HTTP workflow: inputs → kickoff → status → results
  • Product integration API for apps
Core idea: Your deployed crew behaves like a service, (inputs: dict) → (final_output). Your product calls the crew endpoint, tracks execution, and displays results.

Table of contents

  • What “CrewAI API” means (AMP HTTP API vs OSS library APIs)
  • Core mental model: Agents, Tools, Tasks, Crews, Flows
  • CrewAI AMP API basics: base URL, auth, workflow
  • Endpoints you’ll use most: GET /inputs, POST /kickoff, GET /{kickoff_id}/status, POST /resume
  • Webhooks for streaming progress into your app (task/step/crew callbacks)
  • Implementing a production integration (backend pattern + sample code)
  • Building a crew in OSS that is “API-ready” (inputs, outputs, structure)
  • LLM connectivity and provider choices (LiteLLM + custom LLMs)
  • Reliability: retries, idempotency, timeouts, rate control, and failure modes
  • Security: token handling, multi-tenant design, and least privilege
  • Cost control: cutting token burn and keeping crews predictable
  • Observability and evaluation (what to log, what to measure)
  • Common use cases and architecture blueprints
  • FAQs

1) What “CrewAI API” means

When people search “CrewAI API,” they usually want one of these:

A) The CrewAI AMP HTTP API (run a deployed crew)

The AMP API is a clean, “run my workflow” interface. You don’t call agents directly; you call the crew endpoint associated with a deployed automation.

Key ideas
  • You get a Bearer token from your crew’s detail page in the AMP dashboard.
  • Each deployed crew has a unique base URL like https://your-crew-name.crewai.com.
  • Typical workflow:
    • GET /inputs to discover required parameters
    • POST /kickoff to start execution (returns a kickoff_id)
    • GET /{kickoff_id}/status to poll progress/results

B) The CrewAI OSS library APIs (build and run crews in your code)

This is the Python framework. You create:

  • Agents: autonomous units that perform tasks, use tools, collaborate, and keep memory.
  • Tools: functions/skills agents can call (web search, file reads, integrations, custom tools).
  • Crews: groups of agents + tasks + a process (sequential/hierarchical) that define how work executes.
  • Flows: the structured “backbone” that orchestrates steps, state, triggers, and when to run a crew.

Most production teams use both: they build with OSS concepts, then deploy in AMP, then integrate via the AMP HTTP API.

2) Core mental model (the parts you must understand)

Before you integrate anything, align on how CrewAI thinks.

Agents

An agent is like a specialist teammate with a role, goal, tools, and constraints. Agents can perform tasks, make decisions, use tools, communicate/collaborate, maintain memory, and delegate tasks when allowed.

Practical takeaway: If your “CrewAI API” output is inconsistent, the fix is often in agent configuration: role clarity, tool scope, and expected output format.

Tools

Tools are callable “skills” an agent can use. Tools expand capability (web search, data analysis, collaboration, delegation) and should support robust error handling, caching, and sync/async patterns.

Practical takeaway: Tools are where you connect your real product—databases, CRMs, your own APIs. If you want predictable automation, invest in tools that return structured outputs and handle failures gracefully.

Crews

A crew is the unit you actually run: tasks + agents + a process strategy. A crew defines execution strategy and collaboration.

Practical takeaway: Your AMP deployment is basically “a crew as an API service.”

Flows

Flows are the backbone/manager of your AI application: state management, event-driven execution, and control flow (branches, loops).

Practical takeaway: If your product needs orchestration beyond “run this crew once,” flows are how you build reliable pipelines: triggers → validate → run crew → postprocess → persist.

3) CrewAI AMP API basics (base URL, auth, workflow)

Base URL

Each deployed crew has its own unique endpoint like:

Example
https://your-crew-name.crewai.com

This means you don’t have one global API host. You have one host per deployed crew.

Authentication

Requests require a Bearer token in the Authorization header. Retrieve tokens from the crew’s detail page (Status tab) in the AMP dashboard. Keep tokens server-side only.

Typical workflow (how your app should call it)

  1. Discover required inputs: GET /inputs
  2. Start execution: POST /kickoff with inputs (optional metadata/webhooks)
  3. Monitor: GET /{kickoff_id}/status until complete
  4. Read results from the completed status payload (or receive completion via webhook)
Note: That’s the “happy path.” Production requires retries, idempotency, timeouts, rate limiting, logging, and safe token handling.
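The workflow above can be sketched as a thin request helper. This is a minimal stdlib-only sketch, not an official client: the base URL and token are placeholders for your own deployment, and exact response shapes should be verified against your crew's docs.

```python
import json
import urllib.request

BASE_URL = "https://your-crew-name.crewai.com"  # placeholder: your crew's unique base URL
TOKEN = "YOUR_BEARER_TOKEN"                     # from the crew's Status tab; keep server-side only

def auth_headers(token):
    """Headers every AMP request carries."""
    return {"Authorization": "Bearer " + token, "Content-Type": "application/json"}

def api_call(method, path, body=None):
    """Minimal JSON request helper against the crew's base URL."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(BASE_URL + path, data=data, method=method,
                                 headers=auth_headers(TOKEN))
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())

# Happy path (uncomment with real credentials):
# required = api_call("GET", "/inputs")["inputs"]
# run = api_call("POST", "/kickoff", {"inputs": {"topic": "..."}})
# status = api_call("GET", "/" + run["kickoff_id"] + "/status")
```

Keeping the helper this small makes it easy to wrap with the retries, timeouts, and logging discussed below.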

4) The key AMP endpoints (with examples)

4.1 GET /inputs — discover what the crew needs

Returns the list of required input parameter names. Example response:

Response JSON
{
  "inputs": ["budget", "interests", "duration", "age"]
}
How to use it
  • Confirm your crew contract (backend payload requirements)
  • Generate forms in internal admin UIs
  • Validate payloads in multi-tenant products before kickoff
Common mistakes
  • Assuming inputs are stable forever—version your crew if you change them.
  • Skipping validation and paying for failed runs.
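Validating payloads against the discovered contract is a pure check you can run before every kickoff. A minimal sketch, using the example inputs list above:

```python
def missing_inputs(required, payload):
    """Return the required input names that are absent (or blank) in the payload."""
    return [name for name in required
            if name not in payload or payload[name] in (None, "")]

# Contract as discovered from GET /inputs
required = ["budget", "interests", "duration", "age"]
payload = {"budget": "1000 USD", "interests": "ai", "age": "35"}
print(missing_inputs(required, payload))  # -> ['duration']
```

Rejecting the request when this list is non-empty is what “fail fast before paying for a run” looks like in practice.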

4.2 POST /kickoff — start a crew execution

Your primary “run” call. inputs is required. meta is optional. Webhooks are optional: taskWebhookUrl, stepWebhookUrl, crewWebhookUrl.

Request JSON
{
  "inputs": {
    "budget": "1000 USD",
    "interests": "games, tech, ai, relaxing hikes, amazing food",
    "duration": "7 days",
    "age": "35"
  },
  "meta": {
    "requestId": "travel-req-123",
    "source": "web-app"
  }
}
Best practice: Treat kickoff_id as your “job ID.” Store it with: user/org id, requestId (your idempotency key), created time, status, and final output payload.
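A small builder keeps every kickoff body consistent and guarantees a requestId is always present. The field names under "meta" mirror the example above but are ultimately your own convention, so treat them as illustrative:

```python
import uuid

def build_kickoff_payload(inputs, source, request_id=None):
    """Assemble a kickoff body with a requestId for idempotency and tracing.
    A fresh UUID is generated when the caller doesn't supply one."""
    return {
        "inputs": inputs,
        "meta": {"requestId": request_id or str(uuid.uuid4()), "source": source},
    }

body = build_kickoff_payload({"budget": "1000 USD"}, source="web-app",
                             request_id="travel-req-123")
print(body["meta"]["requestId"])  # -> travel-req-123
```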

4.3 GET /{kickoff_id}/status — poll status + progress

Returns execution state including status, current task, and progress.

Response JSON
{
  "status": "running",
  "current_task": "research_task",
  "progress": { "completed_tasks": 1, "total_tasks": 3 }
}
How to poll safely
  • Start at 1–2 seconds, then back off (2s → 5s → 10s).
  • Stop after a maximum time; mark as “timeout” and notify the user.
  • Prefer webhooks for realtime UX; keep polling as fallback.
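The backoff-and-cap pattern above is easy to get subtly wrong, so it helps to isolate the schedule as pure logic (sleep and the actual GET call stay in your polling loop):

```python
import itertools

def poll_delays(schedule=(1, 2, 5, 10), cap=10):
    """Yield wait times: ramp through the schedule, then repeat the cap forever."""
    yield from schedule
    yield from itertools.repeat(cap)

def max_polls(timeout_s, schedule=(1, 2, 5, 10), cap=10):
    """How many status checks fit inside a total timeout budget."""
    waited, count = 0, 0
    for d in poll_delays(schedule, cap):
        if waited + d > timeout_s:
            break
        waited += d
        count += 1
    return count

print(list(itertools.islice(poll_delays(), 6)))  # -> [1, 2, 5, 10, 10, 10]
```

In your loop, call `time.sleep(d)` for each yielded delay, check GET /{kickoff_id}/status, and stop once `max_polls` is exhausted, marking the run as “timeout.”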

4.4 POST /resume — human-in-the-loop continuation

Continue an execution with human feedback/approval. Common parameters: execution_id, task_id, human_feedback, is_approve, and optional webhook URLs again.

Why it matters: Some tasks should never be fully autonomous: sending emails, deleting/overwriting data, approving payouts/refunds, publishing content. Use review gates: crew drafts → human approves → resume continues.
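As with kickoff, a builder keeps the resume body consistent. The field names follow the parameters listed above, but verify the exact names against your deployment's docs before relying on them:

```python
def build_resume_payload(execution_id, task_id, human_feedback, is_approve):
    """Body for POST /resume; field names assumed from the parameter list above."""
    return {
        "execution_id": execution_id,
        "task_id": task_id,
        "human_feedback": human_feedback,
        "is_approve": is_approve,
    }

approval = build_resume_payload("exec-42", "send_email_task",
                                "Looks good, send it.", True)
```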

5) Webhooks: making the “API feel realtime”

Polling works, but webhooks turn your integration into a product experience. At kickoff, you can pass:

  • taskWebhookUrl — called after each task completion
  • stepWebhookUrl — called after each agent thought/action
  • crewWebhookUrl — called when execution completes

Recommended webhook design

  • Sign webhook payloads (HMAC) if supported, or embed a secret token in the URL path.
  • Verify kickoff_id belongs to your tenant before accepting.
  • Make handlers idempotent (same payload may be retried).
  • Store progress events for debugging and customer support.
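The tenant-check and idempotency rules above can be sketched as one handler. The event field names (`kickoff_id`, `task_id`, `event`) and the in-memory stores are assumptions for illustration; in production the dedupe key lives in a DB table with a unique constraint:

```python
import json

processed = set()                              # assumption: stand-in for a DB unique constraint
known_kickoffs = {"kickoff-abc": "tenant-1"}   # hypothetical kickoff_id -> tenant mapping

def handle_webhook(raw_body, tenant):
    """Accept a progress event exactly once, after verifying tenant ownership."""
    event = json.loads(raw_body)               # strict JSON parsing; raises on garbage
    kid = event["kickoff_id"]
    if known_kickoffs.get(kid) != tenant:
        return "rejected"                      # unknown or foreign kickoff_id
    key = (kid, event.get("task_id"), event.get("event"))
    if key in processed:
        return "duplicate"                     # retried delivery, already handled
    processed.add(key)
    return "accepted"

evt = json.dumps({"kickoff_id": "kickoff-abc", "task_id": "t1",
                  "event": "task_completed"}).encode()
print(handle_webhook(evt, "tenant-1"))  # -> accepted
print(handle_webhook(evt, "tenant-1"))  # -> duplicate
```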

Minimal product UX with webhooks

  • UI shows: “Running… Task 2/5”
  • Stream partial outputs to the user
  • On completion: show final result + “download” / “copy” / “apply changes”

6) Production integration blueprint (backend-first)

A reliable CrewAI integration usually looks like this:

Step A — Backend receives a request

Example: user clicks “Generate support reply” or “Create travel plan”.

Step B — Backend validates and normalizes inputs

  • Validate required fields
  • Trim huge text
  • Attach metadata: requestId, tenantId, userId, source
  • Optionally store a sanitized copy for audit

Step C — Backend calls POST /kickoff

  • Store returned kickoff_id
  • Respond immediately with kickoff_id (async UX)

Step D — Progress tracking

  • Option 1: Backend polls status; client asks backend for updates
  • Option 2: Webhooks update your DB; client subscribes (SSE/WebSocket)

Step E — Completion

  • On completion (status complete or webhook), store final output
  • Return final output to user
  • Optional post-processing: schema validation, formatting, citations, etc.
Why backend-first? Your Bearer token must stay secret. Don’t ship the token to the browser.
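Steps C–E imply your backend owns a run record with controlled status transitions. A minimal sketch of that record (status names are illustrative; map them to whatever the status endpoint actually returns):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Illustrative state machine: which transitions your backend accepts
ALLOWED = {"queued": {"running", "failed"},
           "running": {"complete", "failed", "timeout"}}

@dataclass
class Run:
    kickoff_id: str
    request_id: str
    tenant_id: str
    status: str = "queued"
    history: list = field(default_factory=list)

    def transition(self, new_status):
        """Record a status change, rejecting illegal jumps."""
        if new_status not in ALLOWED.get(self.status, set()):
            raise ValueError(self.status + " -> " + new_status + " not allowed")
        self.history.append((self.status, new_status, datetime.now(timezone.utc)))
        self.status = new_status

run = Run("kick-1", "req-1", "tenant-1")
run.transition("running")
run.transition("complete")
print(run.status)  # -> complete
```

The history list doubles as the status-transition log recommended in the observability section.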

7) Building a crew that’s “API-ready” (OSS side)

Even if you integrate via AMP, most of your quality comes from how you build the crew.

Design your crew contract

A deployed crew behaves like a function:

Contract
(inputs: dict) -> (final_output)
  • Define required inputs (small list, clearly named)
  • Define optional inputs (with defaults)
  • Decide output format (plain text vs JSON)
Pro tip: Keep public API inputs stable. If you rename topic to subject, you’ll break integrations. Version your crew or support both during migration.

Keep tools tight

Tools are power—and risk. Best practices:

  • Prefer tools that return structured JSON
  • Add strict validation to tool outputs
  • Fail fast with helpful messages
  • Cache expensive calls where possible
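One way to enforce these practices is to wrap every tool so it returns validated JSON or fails fast with a helpful message. A sketch, with a hypothetical `crm_lookup` tool standing in for a real integration:

```python
import json

def structured_tool(fn, required_keys):
    """Wrap a tool so it always returns validated structured data or fails fast."""
    def wrapped(*args, **kwargs):
        try:
            raw = fn(*args, **kwargs)
        except Exception as exc:
            raise RuntimeError("tool " + fn.__name__ + " failed: " + str(exc)) from exc
        data = json.loads(raw) if isinstance(raw, str) else raw
        missing = [k for k in required_keys if k not in data]
        if missing:
            raise ValueError("tool " + fn.__name__ + " output missing keys: " + str(missing))
        return data
    return wrapped

def crm_lookup(account):
    # hypothetical tool body; a real one would call your CRM's API
    return {"account": account, "tier": "gold"}

lookup = structured_tool(crm_lookup, ["account", "tier"])
print(lookup("Acme"))  # -> {'account': 'Acme', 'tier': 'gold'}
```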

Choose a process style

Sequential
  • Simple, predictable, easier to debug
  • Often best for API products
Hierarchical
  • Better for complex problems
  • Requires a manager LLM
  • Can increase token use and latency

For most API products: start sequential, move to hierarchical only when necessary.

8) LLM connections (LiteLLM + custom LLMs)

CrewAI supports multiple providers via LiteLLM. The default model is controlled by OPENAI_MODEL_NAME (commonly defaulting to gpt-4o-mini if not set).

When to switch models

  • Use cheaper/faster models for routing, extraction, summarization
  • Use larger models for final synthesis or complex reasoning
  • Consider separate models for function-calling vs writing (e.g., a function_calling_llm)

Custom LLM integration

You can implement a custom LLM (e.g., via a BaseLLM abstraction) when you need:

  • A private model behind your firewall
  • Enterprise auth headers
  • Routing across multiple endpoints/providers

9) Reliability: retries, timeouts, idempotency, failure modes

Handle HTTP errors intentionally

Common status codes: 400, 401, 404, 422, 500.

  • 400: malformed request body; check JSON encoding and content type
  • 401: rotate/check token; alert ops; don’t retry blindly
  • 404: wrong base URL or unknown kickoff_id; check the crew’s endpoint
  • 422: payload missing required inputs; fix validation
  • 500: retry with backoff; log requestId/kickoff_id for support

Idempotency (you implement it)

  • Generate a requestId (UUID)
  • Store it with a unique constraint per tenant
  • If the same requestId arrives again, return the existing kickoff_id

This prevents duplicate runs when users double-click or networks retry.
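The three steps above reduce to one gate in front of your kickoff call. A sketch with an in-memory dict standing in for a DB table with UNIQUE(tenant_id, request_id):

```python
runs_by_request = {}  # assumption: stand-in for a DB table with a unique constraint

def kickoff_once(tenant_id, request_id, start_fn):
    """Start a run only if this (tenant, requestId) pair hasn't run before.
    start_fn is your wrapper around POST /kickoff; it returns a kickoff_id."""
    key = (tenant_id, request_id)
    if key in runs_by_request:
        return runs_by_request[key], False      # existing kickoff_id, nothing started
    kickoff_id = start_fn()
    runs_by_request[key] = kickoff_id
    return kickoff_id, True

first, created = kickoff_once("t1", "req-9", lambda: "kick-123")
again, created2 = kickoff_once("t1", "req-9", lambda: "kick-999")
print(first, again, created, created2)  # -> kick-123 kick-123 True False
```

With a real database, the unique constraint (not the in-memory check) is what makes this safe under concurrent requests.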

Timeouts

  • Short tasks: 30–120 seconds
  • Research tasks: 2–10 minutes
  • Long workflows: webhooks + background jobs

If timeouts happen: show “still running,” keep polling in background, notify on completion.

Rate limiting

  • Assume per-tenant fairness
  • Respect LLM provider limits
  • Respect tool/provider limits
  • Use a queue for high concurrency

10) Security: tokens, tenants, and least privilege

Never expose the Bearer token to clients

Bearer tokens grant access to crew operations. Keep tokens server-side only.

Multi-tenant mapping

Store:

  • tenant → crew base URL
  • tenant → token (encrypted at rest)
  • tenant → allowed crew versions

Then validate:

  • User belongs to tenant
  • User is allowed to run the crew
  • Inputs are within constraints (length, content type, etc.)

Prefer user-scoped tokens when applicable

If available, user-scoped tokens reduce blast radius compared to org-level tokens.

Webhook security

  • Use HTTPS only
  • Validate signatures or secret tokens
  • Reject unknown kickoff_id
  • Enforce strict JSON parsing
  • Log and rate limit webhook endpoints
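Signature validation is the one item above that's easy to implement incorrectly (string comparison leaks timing). A sketch using HMAC-SHA256 over the raw body; the exact signing scheme and header name depend on your setup, so treat the encoding details as assumptions:

```python
import hashlib
import hmac

def verify_signature(secret, raw_body, signature_hex):
    """Constant-time check of an HMAC-SHA256 signature over the raw webhook body."""
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)

secret = b"webhook-secret"
body = b'{"kickoff_id": "kick-1", "event": "crew_completed"}'
sig = hmac.new(secret, body, hashlib.sha256).hexdigest()
print(verify_signature(secret, body, sig))         # -> True
print(verify_signature(secret, body + b" ", sig))  # -> False
```

Always verify against the raw bytes you received, before JSON parsing, since re-serialization can change whitespace and break the signature.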

11) Cost control: reduce token burn and keep outputs predictable

Multi-agent systems can get expensive if you don’t constrain them. Practical methods:

A) Minimize context passing between tasks

Summarize “what matters” at each step instead of forwarding entire transcripts.

B) Use smaller models for early steps

Routing/extraction/tool selection often works with cheaper models; save bigger models for final synthesis.

C) Keep tools structured

Clean JSON tool outputs reduce model “interpretation” overhead.

D) Reduce verbosity in production

  • Concise internal reasoning
  • Clear final output
  • Stable formatting

E) Add a “budget guardrail”

  • Max input size
  • Max run time
  • Max tool calls
  • Fail-safe response: “Could not complete within limits; here is partial output.”
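The guardrail checklist above can be enforced with one function called before and during a run. The limit values here are illustrative defaults, not recommendations:

```python
def check_budget(inputs_chars, elapsed_s, tool_calls,
                 max_chars=20_000, max_s=600, max_tools=25):
    """Return (ok, reason); limits are illustrative defaults, tune per crew."""
    if inputs_chars > max_chars:
        return False, "input too large"
    if elapsed_s > max_s:
        return False, "run time exceeded"
    if tool_calls > max_tools:
        return False, "too many tool calls"
    return True, "within budget"

print(check_budget(1_000, 30.0, 3))   # -> (True, 'within budget')
print(check_budget(1_000, 900.0, 3))  # -> (False, 'run time exceeded')
```

When the check fails mid-run, return the fail-safe response with whatever partial output you have, rather than letting the run continue to burn tokens.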

12) Observability and evaluation (what to log, what to measure)

Even without a fancy dashboard, log these:

Per kickoff

  • kickoff_id
  • requestId
  • tenant/user
  • start/end timestamps
  • status transitions
  • tasks completed
  • tool calls count (and which tools)
  • model used (if you can capture it)
  • final output length
  • error category (validation/auth/provider/tool)

Why it matters

  • Debugging: “Why did this output change yesterday?”
  • Cost: “Which tasks burn the most tokens?”
  • Reliability: “Which tool fails most often?”
  • Quality: “Which prompts produce the best acceptance rate?”

13) Common use cases and architecture patterns

1) Research + report generation

  • Inputs: topic, audience, tone, length
  • Tools: web search, doc retrieval, citation formatting
  • Tasks: gather sources → outline → draft → fact-check → finalize

2) Customer support copilot

  • Inputs: ticket_text, policy_snippets, customer_tier
  • Tools: CRM lookup, knowledge base search, refund policy tool
  • Human-in-the-loop: require approval before sending (use POST /resume)

3) Sales ops automation

  • Inputs: account_name, last_call_notes
  • Tools: Salesforce/HubSpot integrations, email drafting
  • Output: next steps + drafted follow-up

4) Data extraction pipeline

  • Inputs: document_text
  • Tools: file read, schema validation
  • Output: strict JSON for downstream systems

5) Internal “agentic workflows” (multi-step business processes)

Flows shine here: state + triggers + conditional routing. Use them to handle complex business logic around when to run crews and how to postprocess results.

14) FAQs (quick answers)

Is CrewAI an API or a framework?

Both. CrewAI OSS is a framework/library, and AMP exposes deployed crews through an HTTP API.

What endpoints do I need to integrate a deployed crew?

The core ones are GET /inputs, POST /kickoff, GET /{kickoff_id}/status, and POST /resume for human-in-the-loop workflows.

How do I authenticate?

Include a Bearer token in the Authorization header. Keep tokens server-side only.

Does each crew have its own URL?

Yes—each deployed crew has a unique base URL like https://your-crew-name.crewai.com.

Can I get progress updates without polling?

Yes—provide webhook URLs at kickoff for task/step/crew callbacks.

Can CrewAI use models other than OpenAI?

Yes. CrewAI can connect to many providers via LiteLLM and supports custom LLM implementations.

What is a Crew in CrewAI?

A crew is a collaborative group of agents plus tasks and an execution process (e.g., sequential or hierarchical).

Final checklist: a “good” CrewAI API integration

  • Tokens stored server-side only
    Never ship Bearer tokens to browsers or mobile clients.
  • requestId metadata for traceability + idempotency (your own)
    Store requestId with unique constraint per tenant.
  • Validate inputs against GET /inputs contract
    Fail fast before paying for a run.
  • Async UX
    Kickoff returns quickly; UI polls or listens for webhook events.
  • Webhooks for realtime progress
    Make webhook handlers idempotent and secure.
  • Human-in-the-loop gates via POST /resume
    Use approvals for risky actions like email sending or publishing.
  • Logs/metrics per kickoff
    Track status transitions, tool calls, and failure categories.
  • Model/tool constraints
    Guardrails prevent runaway token spend and latency spikes.
Disclaimer
This guide is an educational summary of CrewAI concepts and workflows. Always verify exact endpoint behavior, schemas, and auth requirements in the official CrewAI documentation for your specific deployment/version.