CrewAI API (2026) - Complete Developer Guide to Building, Deploying, and Integrating Multi-Agent Workflows

CrewAI is an agent orchestration framework designed to help you build “teams” of AI agents (a crew) that collaborate on tasks while you keep control over structure, state, tooling, and execution. Developers typically use it in two forms: CrewAI OSS (an open-source framework you embed in your codebase) and CrewAI AMP (a hosted agent management platform with an HTTP API).


Overview

This page focuses on “CrewAI API” the way most teams mean it: the AMP HTTP API for running deployed crews, plus the OSS Python APIs you use to build those crews in the first place. You’ll get a clear mental model, production-ready integration patterns, and example requests you can paste into Postman, your backend, or a serverless function.

CrewAI OSS (Open Source)
  • Design Agents, Tasks, Crews (and Flows)
  • Connect to LLMs and tools
  • Run locally or on your infrastructure
  • Framework/library API inside your app
CrewAI AMP (Hosted)
  • Build & deploy crews through the platform
  • Each deployed crew has a unique base URL
  • HTTP workflow: inputs → kickoff → status → results
  • Product integration API for apps
Core idea: Your deployed crew behaves like a service, (inputs: dict) → (final_output). Your product calls the crew endpoint, tracks execution, and displays results.

Table of contents

  • What “CrewAI API” means (AMP HTTP API vs OSS library APIs)
  • Core mental model: Agents, Tools, Tasks, Crews, Flows
  • CrewAI AMP API basics: base URL, auth, workflow
  • Endpoints you’ll use most: GET /inputs, POST /kickoff, GET /{kickoff_id}/status, POST /resume
  • Webhooks for streaming progress into your app (task/step/crew callbacks)
  • Implementing a production integration (backend pattern + sample code)
  • Building a crew in OSS that is “API-ready” (inputs, outputs, structure)
  • LLM connectivity and provider choices (LiteLLM + custom LLMs)
  • Reliability: retries, idempotency, timeouts, rate control, and failure modes
  • Security: token handling, multi-tenant design, and least privilege
  • Cost control: cutting token burn and keeping crews predictable
  • Observability and evaluation (what to log, what to measure)
  • Common use cases and architecture blueprints
  • FAQs

1) What “CrewAI API” means

When people search “CrewAI API,” they usually want one of these:

A) The CrewAI AMP HTTP API (run a deployed crew)

The AMP API is a clean, “run my workflow” interface. You don’t call agents directly; you call the crew endpoint associated with a deployed automation.

Key ideas
  • You get a Bearer token from your crew’s detail page in the AMP dashboard.
  • Each deployed crew has a unique base URL like https://your-crew-name.crewai.com.
  • Typical workflow:
    • GET /inputs to discover required parameters
    • POST /kickoff to start execution (returns a kickoff_id)
    • GET /{kickoff_id}/status to poll progress/results

B) The CrewAI OSS library APIs (build and run crews in your code)

This is the Python framework. You create:

  • Agents: autonomous units that perform tasks, use tools, collaborate, and keep memory.
  • Tools: functions/skills agents can call (web search, file reads, integrations, custom tools).
  • Crews: groups of agents + tasks + a process (sequential/hierarchical) that define how work executes.
  • Flows: the structured “backbone” that orchestrates steps, state, triggers, and when to run a crew.

Most production teams use both: they build with OSS concepts, then deploy in AMP, then integrate via the AMP HTTP API.

2) Core mental model (the parts you must understand)

Before you integrate anything, align on how CrewAI thinks.

Agents

An agent is like a specialist teammate with a role, goal, tools, and constraints. Agents can perform tasks, make decisions, use tools, communicate/collaborate, maintain memory, and delegate tasks when allowed.

Practical takeaway: If your “CrewAI API” output is inconsistent, the fix is often in agent configuration: role clarity, tool scope, and expected output format.

Tools

Tools are callable “skills” an agent can use. Tools expand capability (web search, data analysis, collaboration, delegation) and should support robust error handling, caching, and sync/async patterns.

Practical takeaway: Tools are where you connect your real product—databases, CRMs, your own APIs. If you want predictable automation, invest in tools that return structured outputs and handle failures gracefully.

Crews

A crew is the unit you actually run: tasks + agents + a process strategy. A crew defines execution strategy and collaboration.

Practical takeaway: Your AMP deployment is basically “a crew as an API service.”

Flows

Flows are the backbone/manager of your AI application: state management, event-driven execution, and control flow (branches, loops).

Practical takeaway: If your product needs orchestration beyond “run this crew once,” flows are how you build reliable pipelines: triggers → validate → run crew → postprocess → persist.

3) CrewAI AMP API basics (base URL, auth, workflow)

Base URL

Each deployed crew has its own unique endpoint like:

Example
https://your-crew-name.crewai.com

This means you don’t have one global API host. You have one host per deployed crew.

Authentication

Requests require a Bearer token in the Authorization header. Retrieve tokens from the crew’s detail page (Status tab) in the AMP dashboard. Keep tokens server-side only.

Typical workflow (how your app should call it)

  1. Discover required inputs: GET /inputs
  2. Start execution: POST /kickoff with inputs (optional metadata/webhooks)
  3. Monitor: GET /{kickoff_id}/status until complete
  4. Read results from the completed status payload (or receive completion via webhook)
Note: That’s the “happy path.” Production requires retries, idempotency, timeouts, rate limiting, logging, and safe token handling.
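The workflow above can be sketched as a thin request helper. This is a minimal stdlib-only sketch, not an official client: the base URL and token are placeholders for your own deployment, and exact response shapes should be verified against your crew's docs.

```python
import json
import urllib.request

BASE_URL = "https://your-crew-name.crewai.com"  # placeholder: your crew's unique base URL
TOKEN = "YOUR_BEARER_TOKEN"                     # from the crew's Status tab; keep server-side only

def auth_headers(token):
    """Headers every AMP request carries."""
    return {"Authorization": "Bearer " + token, "Content-Type": "application/json"}

def api_call(method, path, body=None):
    """Minimal JSON request helper against the crew's base URL."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(BASE_URL + path, data=data, method=method,
                                 headers=auth_headers(TOKEN))
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())

# Happy path (uncomment with real credentials):
# required = api_call("GET", "/inputs")["inputs"]
# run = api_call("POST", "/kickoff", {"inputs": {"topic": "..."}})
# status = api_call("GET", "/" + run["kickoff_id"] + "/status")
```

Keeping the helper this small makes it easy to wrap with the retries, timeouts, and logging discussed below.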

4) The key AMP endpoints (with examples)

4.1 GET /inputs — discover what the crew needs

Returns the list of required input parameter names. Example response:

Response JSON
{
  "inputs": ["budget", "interests", "duration", "age"]
}
How to use it
  • Confirm your crew contract (backend payload requirements)
  • Generate forms in internal admin UIs
  • Validate payloads in multi-tenant products before kickoff
Common mistakes
  • Assuming inputs are stable forever—version your crew if you change them.
  • Skipping validation and paying for failed runs.
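Validating payloads against the discovered contract is a pure check you can run before every kickoff. A minimal sketch, using the example inputs list above:

```python
def missing_inputs(required, payload):
    """Return the required input names that are absent (or blank) in the payload."""
    return [name for name in required
            if name not in payload or payload[name] in (None, "")]

# Contract as discovered from GET /inputs
required = ["budget", "interests", "duration", "age"]
payload = {"budget": "1000 USD", "interests": "ai", "age": "35"}
print(missing_inputs(required, payload))  # -> ['duration']
```

Rejecting the request when this list is non-empty is what “fail fast before paying for a run” looks like in practice.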

4.2 POST /kickoff — start a crew execution

Your primary “run” call. inputs is required. meta is optional. Webhooks are optional: taskWebhookUrl, stepWebhookUrl, crewWebhookUrl.

Request JSON
{
  "inputs": {
    "budget": "1000 USD",
    "interests": "games, tech, ai, relaxing hikes, amazing food",
    "duration": "7 days",
    "age": "35"
  },
  "meta": {
    "requestId": "travel-req-123",
    "source": "web-app"
  }
}
Best practice: Treat kickoff_id as your “job ID.” Store it with: user/org id, requestId (your idempotency key), created time, status, and final output payload.
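A small builder keeps every kickoff body consistent and guarantees a requestId is always present. The field names under "meta" mirror the example above but are ultimately your own convention, so treat them as illustrative:

```python
import uuid

def build_kickoff_payload(inputs, source, request_id=None):
    """Assemble a kickoff body with a requestId for idempotency and tracing.
    A fresh UUID is generated when the caller doesn't supply one."""
    return {
        "inputs": inputs,
        "meta": {"requestId": request_id or str(uuid.uuid4()), "source": source},
    }

body = build_kickoff_payload({"budget": "1000 USD"}, source="web-app",
                             request_id="travel-req-123")
print(body["meta"]["requestId"])  # -> travel-req-123
```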

4.3 GET /{kickoff_id}/status — poll status + progress

Returns execution state including status, current task, and progress.

Response JSON
{
  "status": "running",
  "current_task": "research_task",
  "progress": { "completed_tasks": 1, "total_tasks": 3 }
}
How to poll safely
  • Start at 1–2 seconds, then back off (2s → 5s → 10s).
  • Stop after a maximum time; mark as “timeout” and notify the user.
  • Prefer webhooks for realtime UX; keep polling as fallback.
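The backoff-and-cap pattern above is easy to get subtly wrong, so it helps to isolate the schedule as pure logic (sleep and the actual GET call stay in your polling loop):

```python
import itertools

def poll_delays(schedule=(1, 2, 5, 10), cap=10):
    """Yield wait times: ramp through the schedule, then repeat the cap forever."""
    yield from schedule
    yield from itertools.repeat(cap)

def max_polls(timeout_s, schedule=(1, 2, 5, 10), cap=10):
    """How many status checks fit inside a total timeout budget."""
    waited, count = 0, 0
    for d in poll_delays(schedule, cap):
        if waited + d > timeout_s:
            break
        waited += d
        count += 1
    return count

print(list(itertools.islice(poll_delays(), 6)))  # -> [1, 2, 5, 10, 10, 10]
```

In your loop, call `time.sleep(d)` for each yielded delay, check GET /{kickoff_id}/status, and stop once `max_polls` is exhausted, marking the run as “timeout.”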

4.4 POST /resume — human-in-the-loop continuation

Continue an execution with human feedback/approval. Common parameters: execution_id, task_id, human_feedback, is_approve, and optional webhook URLs again.

Why it matters: Some tasks should never be fully autonomous: sending emails, deleting/overwriting data, approving payouts/refunds, publishing content. Use review gates: crew drafts → human approves → resume continues.
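As with kickoff, a builder keeps the resume body consistent. The field names follow the parameters listed above, but verify the exact names against your deployment's docs before relying on them:

```python
def build_resume_payload(execution_id, task_id, human_feedback, is_approve):
    """Body for POST /resume; field names assumed from the parameter list above."""
    return {
        "execution_id": execution_id,
        "task_id": task_id,
        "human_feedback": human_feedback,
        "is_approve": is_approve,
    }

approval = build_resume_payload("exec-42", "send_email_task",
                                "Looks good, send it.", True)
```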

5) Webhooks: making the “API feel realtime”

Polling works, but webhooks turn your integration into a product experience. At kickoff, you can pass:

  • taskWebhookUrl — called after each task completion
  • stepWebhookUrl — called after each agent thought/action
  • crewWebhookUrl — called when execution completes

Recommended webhook design

  • Sign webhook payloads (HMAC) if supported, or embed a secret token in the URL path.
  • Verify kickoff_id belongs to your tenant before accepting.
  • Make handlers idempotent (same payload may be retried).
  • Store progress events for debugging and customer support.
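The tenant-check and idempotency rules above can be sketched as one handler. The event field names (`kickoff_id`, `task_id`, `event`) and the in-memory stores are assumptions for illustration; in production the dedupe key lives in a DB table with a unique constraint:

```python
import json

processed = set()                              # assumption: stand-in for a DB unique constraint
known_kickoffs = {"kickoff-abc": "tenant-1"}   # hypothetical kickoff_id -> tenant mapping

def handle_webhook(raw_body, tenant):
    """Accept a progress event exactly once, after verifying tenant ownership."""
    event = json.loads(raw_body)               # strict JSON parsing; raises on garbage
    kid = event["kickoff_id"]
    if known_kickoffs.get(kid) != tenant:
        return "rejected"                      # unknown or foreign kickoff_id
    key = (kid, event.get("task_id"), event.get("event"))
    if key in processed:
        return "duplicate"                     # retried delivery, already handled
    processed.add(key)
    return "accepted"

evt = json.dumps({"kickoff_id": "kickoff-abc", "task_id": "t1",
                  "event": "task_completed"}).encode()
print(handle_webhook(evt, "tenant-1"))  # -> accepted
print(handle_webhook(evt, "tenant-1"))  # -> duplicate
```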

Minimal product UX with webhooks

  • UI shows: “Running… Task 2/5”
  • Stream partial outputs to the user
  • On completion: show final result + “download” / “copy” / “apply changes”

6) Production integration blueprint (backend-first)

A reliable CrewAI integration usually looks like this:

Step A — Backend receives a request

Example: user clicks “Generate support reply” or “Create travel plan”.

Step B — Backend validates and normalizes inputs

  • Validate required fields
  • Trim huge text
  • Attach metadata: requestId, tenantId, userId, source
  • Optionally store a sanitized copy for audit

Step C — Backend calls POST /kickoff

  • Store returned kickoff_id
  • Respond immediately with kickoff_id (async UX)

Step D — Progress tracking

  • Option 1: Backend polls status; client asks backend for updates
  • Option 2: Webhooks update your DB; client subscribes (SSE/WebSocket)

Step E — Completion

  • On completion (status complete or webhook), store final output
  • Return final output to user
  • Optional post-processing: schema validation, formatting, citations, etc.
Why backend-first? Your Bearer token must stay secret. Don’t ship the token to the browser.
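Steps C–E imply your backend owns a run record with controlled status transitions. A minimal sketch of that record (status names are illustrative; map them to whatever the status endpoint actually returns):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Illustrative state machine: which transitions your backend accepts
ALLOWED = {"queued": {"running", "failed"},
           "running": {"complete", "failed", "timeout"}}

@dataclass
class Run:
    kickoff_id: str
    request_id: str
    tenant_id: str
    status: str = "queued"
    history: list = field(default_factory=list)

    def transition(self, new_status):
        """Record a status change, rejecting illegal jumps."""
        if new_status not in ALLOWED.get(self.status, set()):
            raise ValueError(self.status + " -> " + new_status + " not allowed")
        self.history.append((self.status, new_status, datetime.now(timezone.utc)))
        self.status = new_status

run = Run("kick-1", "req-1", "tenant-1")
run.transition("running")
run.transition("complete")
print(run.status)  # -> complete
```

The history list doubles as the status-transition log recommended in the observability section.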

7) Building a crew that’s “API-ready” (OSS side)

Even if you integrate via AMP, most of your quality comes from how you build the crew.

Design your crew contract

A deployed crew behaves like a function:

Contract
(inputs: dict) -> (final_output)
  • Define required inputs (small list, clearly named)
  • Define optional inputs (with defaults)
  • Decide output format (plain text vs JSON)
Pro tip: Keep public API inputs stable. If you rename topic to subject, you’ll break integrations. Version your crew or support both during migration.

Keep tools tight

Tools are power—and risk. Best practices:

  • Prefer tools that return structured JSON
  • Add strict validation to tool outputs
  • Fail fast with helpful messages
  • Cache expensive calls where possible
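One way to enforce these practices is to wrap every tool so it returns validated JSON or fails fast with a helpful message. A sketch, with a hypothetical `crm_lookup` tool standing in for a real integration:

```python
import json

def structured_tool(fn, required_keys):
    """Wrap a tool so it always returns validated structured data or fails fast."""
    def wrapped(*args, **kwargs):
        try:
            raw = fn(*args, **kwargs)
        except Exception as exc:
            raise RuntimeError("tool " + fn.__name__ + " failed: " + str(exc)) from exc
        data = json.loads(raw) if isinstance(raw, str) else raw
        missing = [k for k in required_keys if k not in data]
        if missing:
            raise ValueError("tool " + fn.__name__ + " output missing keys: " + str(missing))
        return data
    return wrapped

def crm_lookup(account):
    # hypothetical tool body; a real one would call your CRM's API
    return {"account": account, "tier": "gold"}

lookup = structured_tool(crm_lookup, ["account", "tier"])
print(lookup("Acme"))  # -> {'account': 'Acme', 'tier': 'gold'}
```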

Choose a process style

Sequential
  • Simple, predictable, easier to debug
  • Often best for API products
Hierarchical
  • Better for complex problems
  • Requires a manager LLM
  • Can increase token use and latency

For most API products: start sequential, move to hierarchical only when necessary.

8) LLM connections (LiteLLM + custom LLMs)

CrewAI supports multiple providers via LiteLLM. The default model is controlled by OPENAI_MODEL_NAME (commonly defaulting to gpt-4o-mini if not set).

When to switch models

  • Use cheaper/faster models for routing, extraction, summarization
  • Use larger models for final synthesis or complex reasoning
  • Consider separate models for function-calling vs writing (e.g., a function_calling_llm)

Custom LLM integration

You can implement a custom LLM (e.g., via a BaseLLM abstraction) when you need:

  • A private model behind your firewall
  • Enterprise auth headers
  • Routing across multiple endpoints/providers

9) Reliability: retries, timeouts, idempotency, failure modes

Handle HTTP errors intentionally

Common status codes: 400, 401, 404, 422, 500.

  • 400: malformed request body; check JSON encoding and content type
  • 401: rotate/check token; alert ops; don’t retry blindly
  • 404: wrong base URL or unknown kickoff_id; check the crew’s endpoint
  • 422: payload missing required inputs; fix validation
  • 500: retry with backoff; log requestId/kickoff_id for support

Idempotency (you implement it)

  • Generate a requestId (UUID)
  • Store it with a unique constraint per tenant
  • If the same requestId arrives again, return the existing kickoff_id

This prevents duplicate runs when users double-click or networks retry.
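The three steps above reduce to one gate in front of your kickoff call. A sketch with an in-memory dict standing in for a DB table with UNIQUE(tenant_id, request_id):

```python
runs_by_request = {}  # assumption: stand-in for a DB table with a unique constraint

def kickoff_once(tenant_id, request_id, start_fn):
    """Start a run only if this (tenant, requestId) pair hasn't run before.
    start_fn is your wrapper around POST /kickoff; it returns a kickoff_id."""
    key = (tenant_id, request_id)
    if key in runs_by_request:
        return runs_by_request[key], False      # existing kickoff_id, nothing started
    kickoff_id = start_fn()
    runs_by_request[key] = kickoff_id
    return kickoff_id, True

first, created = kickoff_once("t1", "req-9", lambda: "kick-123")
again, created2 = kickoff_once("t1", "req-9", lambda: "kick-999")
print(first, again, created, created2)  # -> kick-123 kick-123 True False
```

With a real database, the unique constraint (not the in-memory check) is what makes this safe under concurrent requests.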

Timeouts

  • Short tasks: 30–120 seconds
  • Research tasks: 2–10 minutes
  • Long workflows: webhooks + background jobs

If timeouts happen: show “still running,” keep polling in background, notify on completion.

Rate limiting

  • Assume per-tenant fairness
  • Respect LLM provider limits
  • Respect tool/provider limits
  • Use a queue for high concurrency

10) Security: tokens, tenants, and least privilege

Never expose the Bearer token to clients

Bearer tokens grant access to crew operations. Keep tokens server-side only.

Multi-tenant mapping

Store:

  • tenant → crew base URL
  • tenant → token (encrypted at rest)
  • tenant → allowed crew versions

Then validate:

  • User belongs to tenant
  • User is allowed to run the crew
  • Inputs are within constraints (length, content type, etc.)

Prefer user-scoped tokens when applicable

If available, user-scoped tokens reduce blast radius compared to org-level tokens.

Webhook security

  • Use HTTPS only
  • Validate signatures or secret tokens
  • Reject unknown kickoff_id
  • Enforce strict JSON parsing
  • Log and rate limit webhook endpoints
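Signature validation is the one item above that's easy to implement incorrectly (string comparison leaks timing). A sketch using HMAC-SHA256 over the raw body; the exact signing scheme and header name depend on your setup, so treat the encoding details as assumptions:

```python
import hashlib
import hmac

def verify_signature(secret, raw_body, signature_hex):
    """Constant-time check of an HMAC-SHA256 signature over the raw webhook body."""
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)

secret = b"webhook-secret"
body = b'{"kickoff_id": "kick-1", "event": "crew_completed"}'
sig = hmac.new(secret, body, hashlib.sha256).hexdigest()
print(verify_signature(secret, body, sig))         # -> True
print(verify_signature(secret, body + b" ", sig))  # -> False
```

Always verify against the raw bytes you received, before JSON parsing, since re-serialization can change whitespace and break the signature.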

11) Cost control: reduce token burn and keep outputs predictable

Multi-agent systems can get expensive if you don’t constrain them. Practical methods:

A) Minimize context passing between tasks

Summarize “what matters” at each step instead of forwarding entire transcripts.

B) Use smaller models for early steps

Routing/extraction/tool selection often works with cheaper models; save bigger models for final synthesis.

C) Keep tools structured

Clean JSON tool outputs reduce model “interpretation” overhead.

D) Reduce verbosity in production

  • Concise internal reasoning
  • Clear final output
  • Stable formatting

E) Add a “budget guardrail”

  • Max input size
  • Max run time
  • Max tool calls
  • Fail-safe response: “Could not complete within limits; here is partial output.”
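The guardrail checklist above can be enforced with one function called before and during a run. The limit values here are illustrative defaults, not recommendations:

```python
def check_budget(inputs_chars, elapsed_s, tool_calls,
                 max_chars=20_000, max_s=600, max_tools=25):
    """Return (ok, reason); limits are illustrative defaults, tune per crew."""
    if inputs_chars > max_chars:
        return False, "input too large"
    if elapsed_s > max_s:
        return False, "run time exceeded"
    if tool_calls > max_tools:
        return False, "too many tool calls"
    return True, "within budget"

print(check_budget(1_000, 30.0, 3))   # -> (True, 'within budget')
print(check_budget(1_000, 900.0, 3))  # -> (False, 'run time exceeded')
```

When the check fails mid-run, return the fail-safe response with whatever partial output you have, rather than letting the run continue to burn tokens.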

12) Observability and evaluation (what to log, what to measure)

Even without a fancy dashboard, log these:

Per kickoff

  • kickoff_id
  • requestId
  • tenant/user
  • start/end timestamps
  • status transitions
  • tasks completed
  • tool calls count (and which tools)
  • model used (if you can capture it)
  • final output length
  • error category (validation/auth/provider/tool)

Why it matters

  • Debugging: “Why did this output change yesterday?”
  • Cost: “Which tasks burn the most tokens?”
  • Reliability: “Which tool fails most often?”
  • Quality: “Which prompts produce the best acceptance rate?”

13) Common use cases and architecture patterns

1) Research + report generation

  • Inputs: topic, audience, tone, length
  • Tools: web search, doc retrieval, citation formatting
  • Tasks: gather sources → outline → draft → fact-check → finalize

2) Customer support copilot

  • Inputs: ticket_text, policy_snippets, customer_tier
  • Tools: CRM lookup, knowledge base search, refund policy tool
  • Human-in-the-loop: require approval before sending (use POST /resume)

3) Sales ops automation

  • Inputs: account_name, last_call_notes
  • Tools: Salesforce/HubSpot integrations, email drafting
  • Output: next steps + drafted follow-up

4) Data extraction pipeline

  • Inputs: document_text
  • Tools: file read, schema validation
  • Output: strict JSON for downstream systems

5) Internal “agentic workflows” (multi-step business processes)

Flows shine here: state + triggers + conditional routing. Use them to handle complex business logic around when to run crews and how to postprocess results.

14) FAQs (quick answers)

Is CrewAI an API or a framework?

Both. CrewAI OSS is a framework/library, and AMP exposes deployed crews through an HTTP API.

What endpoints do I need to integrate a deployed crew?

The core ones are GET /inputs, POST /kickoff, GET /{kickoff_id}/status, and POST /resume for human-in-the-loop workflows.

How do I authenticate?

Include a Bearer token in the Authorization header. Keep tokens server-side only.

Does each crew have its own URL?

Yes—each deployed crew has a unique base URL like https://your-crew-name.crewai.com.

Can I get progress updates without polling?

Yes—provide webhook URLs at kickoff for task/step/crew callbacks.

Can CrewAI use models other than OpenAI?

Yes. CrewAI can connect to many providers via LiteLLM and supports custom LLM implementations.

What is a Crew in CrewAI?

A crew is a collaborative group of agents plus tasks and an execution process (e.g., sequential or hierarchical).

Final checklist: a “good” CrewAI API integration

  • Tokens stored server-side only
    Never ship Bearer tokens to browsers or mobile clients.
  • requestId metadata for traceability + idempotency (your own)
    Store requestId with unique constraint per tenant.
  • Validate inputs against GET /inputs contract
    Fail fast before paying for a run.
  • Async UX
    Kickoff returns quickly; UI polls or listens for webhook events.
  • Webhooks for realtime progress
    Make webhook handlers idempotent and secure.
  • Human-in-the-loop gates via POST /resume
    Use approvals for risky actions like email sending or publishing.
  • Logs/metrics per kickoff
    Track status transitions, tool calls, and failure categories.
  • Model/tool constraints
    Guardrails prevent runaway token spend and latency spikes.
Disclaimer
This guide is an educational summary of CrewAI concepts and workflows. Always verify exact endpoint behavior, schemas, and auth requirements in the official CrewAI documentation for your specific deployment/version.