Visual-first generation • Production API • Pay-as-you-go

Leonardo API - build image and video generation into your product reliably

The Leonardo API (Leonardo.Ai Production API) gives developers a clean, scalable way to generate media from prompts and images: text-to-image, image-to-image, inpainting, upscaling, realtime canvas (LCM), and text-to-video workflows. It is designed for founders and teams who want to prototype visually in the Leonardo web app, then ship the same configuration via API.

This page is a developer-focused deep dive into “how it really works”: authentication, core endpoints, uploads with presigned URLs, polling vs webhook callbacks, rate limits and concurrency, model discovery, custom model training with datasets, and production architecture patterns that stay stable under real traffic.

Base endpoint: https://cloud.leonardo.ai/api/rest/v1
Auth header: Authorization: Bearer <API_KEY>
Async outputs: poll generations or use webhook callbacks
Limits: rate limit + concurrency + queue controls
What “Leonardo API” usually means
In most projects, “Leonardo API” refers to the Production API endpoints under cloud.leonardo.ai/api/rest/v1 for generating and managing images/video, plus supporting endpoints like model listing, prompt helpers, and upload endpoints that return presigned URLs. You combine these with SDKs and a reliability strategy (webhooks, polling, retries).

1) What the Leonardo API is

Leonardo is a creative generation platform with a visual-first web app and a Production API. Many teams iterate in the web UI (prompt, style, aspect ratio, upscales, canvas edits), then export the same settings into code and use them at scale. That “design visually → export code” workflow is explicitly supported by Leonardo’s developer experience and docs.

What you can build with Leonardo API

Marketing creatives

Generate ad images, social graphics, product lifestyle shots, hero images, and campaign variants. The API makes it possible to run A/B style tests (prompt variants) and produce consistent assets at volume.

Related: Text-to-Image, Variations, Upscale, Background workflows

Productized “image generator” features

Embed generation inside your own app: an “AI cover image” button, brand kit visuals, avatar generator, or templated content that maps user inputs to prompt scaffolds with guardrails.

Related: Templates, Prompt helpers, Model selection, Safety controls

Realtime creative tools

Use Realtime Canvas (LCM) endpoints for interactive creation where latency matters: quick iterations, refinements, and edits that feel “live” in a UI.

Related: Realtime Canvas (LCM), Instant refine, Inpainting, Upscale

Game / app asset pipelines

Generate concept art, textures, icons, item art, and environment variants. Pair with datasets and custom model training to maintain consistent style for a game or brand universe.

Related: Datasets, Custom models, Texture endpoints, Batch generation

The core idea
Leonardo API is best thought of as an asynchronous generation system. You request a generation, then either (a) poll for completion and retrieve images/video, or (b) use webhook callbacks so Leonardo pushes the results to your server when done. This is how most high-volume media generation platforms maintain reliability under load.
Terminology you’ll see in docs

You’ll encounter terms like generation (a job that produces outputs), init image (an uploaded image used for image-to-image or editing workflows), mask (defines the region to inpaint), platform model (a Leonardo-provided model you can select), and custom model (trained on a dataset you upload). You’ll also see “LCM” in Realtime Canvas recipes: Latent Consistency Models optimized for faster generation.

2) Quickstart: from API key to your first generation

The fastest path to a working integration is: (1) create an API key in the Leonardo web app (API Access), (2) call a generation endpoint with your prompt and settings, (3) retrieve the result by polling or receiving a webhook callback.

Step A: Get an API key

In the Leonardo web app, go to API Access, then create a new key. You can name keys by environment (e.g., myapp-dev, myapp-prod) so you can rotate safely.

Optional: configure a webhook callback URL to receive generation results automatically.

Step B: Know the base URL

Leonardo’s Production API uses the REST base path:

https://cloud.leonardo.ai/api/rest/v1

Most endpoints are under this prefix (generations, init-image, prompt tools, canvas tools, models, datasets, etc.).

Step C: Create an image generation (text-to-image)

The exact parameters depend on your chosen model and feature set, but the basic idea is consistent: send a prompt plus a small set of controls (size, number of images, optional negative prompt, and a model ID if needed). The endpoint shown in the official reference for “Create a Generation of Images” is: POST /generations.

curl -X POST "https://cloud.leonardo.ai/api/rest/v1/generations" \
  -H "accept: application/json" \
  -H "content-type: application/json" \
  -H "authorization: Bearer YOUR_LEONARDO_API_KEY" \
  -d '{
    "prompt": "A clean product hero shot of a smart watch on a white desk, soft natural light",
    "num_images": 4
  }'
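The same request can be issued from a backend without the SDK. This sketch builds the request with Python's standard library, mirroring the curl payload above; sending is left as a commented one-liner so you can plug in your own error handling and retries:

```python
import json
import urllib.request

API_BASE = "https://cloud.leonardo.ai/api/rest/v1"

def build_generation_request(api_key: str, prompt: str,
                             num_images: int = 4) -> urllib.request.Request:
    """Build (but do not send) the POST /generations request."""
    body = json.dumps({"prompt": prompt, "num_images": num_images}).encode()
    return urllib.request.Request(
        f"{API_BASE}/generations",
        data=body,
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

# Sending is one line once the request is built:
# with urllib.request.urlopen(build_generation_request(key, prompt)) as resp:
#     generation = json.load(resp)
```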
What happens next
Create endpoints typically return a generation record (an ID and metadata). Your job is to wait for completion. You can poll with GET /generations/{id}, list generations by user, or set up webhook callbacks so your server receives results when ready.
“Get API Code” workflow (UI → exact API config)

Leonardo supports an in-app “Get API Code” feature so you can generate an asset visually, then export the exact request settings as code. This is useful when you want your production requests to match a “known-good” configuration from the UI, rather than manually translating sliders and toggles to JSON.

3) Authentication and key safety

Leonardo’s Production API uses a standard Bearer token pattern: set Authorization: Bearer <YOUR_API_KEY> on requests. Most endpoints in the reference show “Credentials: Bearer” and a base path like https://cloud.leonardo.ai/api/rest/v1/....

Do not call the API directly from browsers

Treat your Leonardo API key like a password. If you expose it in client-side JavaScript, anyone can extract it and spend your credits. Instead, route requests through your backend, where you can enforce authentication, quotas, and guardrails.

Related: Backend proxy, Secret manager, Key rotation

Use separate keys per environment

Create at least two keys: one for development/testing and one for production. If something goes wrong (leak, integration bug, unexpected load), you can revoke or rotate one without impacting the other.

Environments: dev, staging, prod

Recommended security baseline

  • Store keys in a secrets manager (or encrypted env vars) rather than source control.
  • Redact Authorization headers from logs and error tracking.
  • Apply request validation in your backend so users can’t request unlimited images or extreme resolutions.
  • Implement per-user quotas if your product exposes generation to end users.
  • Add abuse protection (rate limiting, bot checks) to any public endpoints that trigger generation.
Operational tip: log request IDs and generation IDs

For production support, you usually don’t need full payloads. Instead, log: timestamp, endpoint, HTTP status, generation ID, and your internal user/account ID. That gives you enough to debug failures and reconcile retries without storing sensitive prompts or user content by default.
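A structured log line covering exactly those fields might look like the sketch below; note that it carries no prompt, no payload, and no Authorization value:

```python
import json
import time

def generation_log_line(endpoint: str, http_status: int,
                        generation_id: str, user_id: str) -> str:
    """Emit a JSON log line with just enough context to debug and reconcile."""
    return json.dumps({
        "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "endpoint": endpoint,
        "http_status": http_status,
        "generation_id": generation_id,
        "user_id": user_id,
    })
```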

4) Core endpoints: generations, retrieval, and user lists

Leonardo’s Production API is built around the concept of a generation. A generation is a job that produces one or more outputs (images or video). The typical workflow is: create a generation → wait → retrieve results and metadata.

Capability | Endpoint (typical) | Purpose | When you use it
Create image generation | POST /generations | Start a text-to-image or config-driven generation job. | Most image generation flows (prompt → outputs).
Get a single generation | GET /generations/{id} | Fetch status, metadata, and outputs of a specific generation. | Polling, UI “status” pages, debugging.
Get generations by user | GET /generations/user/{userId} | List generations for a user. | History pages, export, auditing.
Prompt helpers | POST /prompt/improve (and others) | Improve prompts or generate random prompt ideas. | UX features: “Enhance prompt” button.
Model discovery | GET /platformModels | List platform models available for generation. | Let users choose a model dynamically.

Polling workflow example (recommended baseline)

Even if you plan to use webhook callbacks, implement polling as a fallback. Polling gives you a simple “source of truth” path when your webhook endpoint is down or when you need to re-check status.

// Pseudo-code: poll generation until complete (conceptual)
createGeneration() -> { generationId }

repeat every 2-5 seconds with backoff:
  gen = GET /generations/{generationId}
  if gen.status in ("COMPLETE", "FAILED"):
     break

if COMPLETE:
  store image URLs + metadata
else:
  log error + show message
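The loop above translates directly into a small helper. To keep it transport-agnostic and testable, the HTTP call is injected as a `fetch` callable (a stand-in for `GET /generations/{id}`); the status names follow the pseudo-code and should be verified against the actual response schema:

```python
import time
from typing import Callable

def poll_generation(fetch: Callable[[], dict],
                    base_delay: float = 2.0,
                    max_delay: float = 10.0,
                    timeout: float = 300.0) -> dict:
    """Poll a generation until it reaches a terminal status, with capped backoff."""
    deadline = time.monotonic() + timeout
    delay = base_delay
    while True:
        gen = fetch()  # e.g. GET /generations/{id}, parsed to a dict
        if gen.get("status") in ("COMPLETE", "FAILED"):
            return gen
        if time.monotonic() + delay > deadline:
            raise TimeoutError("generation did not finish before the deadline")
        time.sleep(delay)
        delay = min(delay * 1.5, max_delay)  # back off between polls
```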
Important: treat outputs as content URLs, not permanent storage
In many media APIs, the returned asset URLs are meant for retrieval and display, not long-term guaranteed storage. For production systems, consider downloading outputs into your own object storage (S3/R2/GCS) if you need durable links, consistent caching, or long-term archival for customer projects.
Understanding init_image_id vs init_generation_image_id

Leonardo distinguishes between images you uploaded via the Upload Init Image endpoint and images that were generated within Leonardo. In docs, init_image_id is typically the ID you get from Upload Init Image, while init_generation_image_id refers to an image ID from a prior generation result. This distinction matters when you build “edit this generated image” flows vs “edit a user-uploaded image” flows.

5) Uploads with presigned URLs: init images, masks, and dataset images

Many Leonardo workflows start from an existing image: image-to-image generation, inpainting (edit a region), upscaling, canvas editing, motion from an uploaded image, or custom model training datasets. Instead of uploading raw bytes directly to Leonardo’s API, the platform commonly returns presigned S3 upload details.

How presigned uploads work
You call an endpoint like POST /init-image. Leonardo returns a temporary presigned URL and/or form fields for upload. You then upload the file directly to S3 using that URL. Finally, you use the returned image ID in generation requests.

Upload an init image (for image-to-image and edits)

The official reference includes an “Upload init image” endpoint at: POST /init-image. It returns presigned details for uploading an init image to S3.

curl -X POST "https://cloud.leonardo.ai/api/rest/v1/init-image" \
  -H "accept: application/json" \
  -H "content-type: application/json" \
  -H "authorization: Bearer YOUR_LEONARDO_API_KEY" \
  -d '{
    "extension": "png"
  }'

After you receive presigned details, you upload the file to the provided S3 URL. Then you use the returned image ID as init_image_id in a generation request.

Canvas editor: upload init + mask

For inpainting and canvas edits, you often need both an init image and a mask. Leonardo provides a canvas upload endpoint (e.g., POST /canvas-init-image) that returns presigned details to upload both files.

Inpainting concept

Inpainting means “replace or modify the masked area while keeping the rest of the image consistent.” Your mask indicates which pixels can change. This is ideal for product photo fixes, background swaps, removing objects, or changing a logo placement without regenerating everything.

Mask hygiene tips

Use clean masks with soft edges when you want smooth blending. Use hard masks when you want sharp edits (like replacing a sign). If you see artifacts, adjust the mask boundary and prompt specificity.

Dataset uploads (for training custom models or elements)

Custom model training typically uses dataset creation and dataset image upload endpoints. Dataset image upload endpoints return presigned URLs and may expire quickly, so your client should upload immediately.

Common presigned upload gotcha: remove auth headers on the S3 upload

When uploading the image bytes to the presigned S3 URL, you generally should not include Leonardo auth headers. Presigned URLs already encode permission, and adding unnecessary headers can cause errors (including 403 in some flows). Your process should be: call Leonardo endpoint with Bearer auth → receive presigned upload details → upload to S3 without Leonardo auth → use returned image ID in the next Leonardo call.
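The three-step flow can be sketched as follows. The two transports (`call_leonardo`, `put_to_s3`) are injected placeholders, and the response field names (`url`, `id`) are illustrative, not the exact shape of the real Upload Init Image response; the point is where the Bearer token appears and where it must not:

```python
def upload_init_image(call_leonardo, put_to_s3, api_key: str, file_bytes: bytes) -> str:
    """Presigned upload flow: only the Leonardo call carries the Bearer token;
    the S3 upload must go out without it."""
    # Step 1: ask Leonardo for presigned upload details (authenticated).
    details = call_leonardo(
        "POST", "/init-image",
        headers={"Authorization": f"Bearer {api_key}"},
        body={"extension": "png"},
    )
    # Step 2: upload the bytes straight to S3 -- no Leonardo auth header here.
    put_to_s3(details["url"], file_bytes, headers={})
    # Step 3: return the image ID for use as init_image_id in later requests.
    return details["id"]
```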

6) Realtime Canvas (LCM): fast generation, refine, inpaint, upscale

Leonardo includes a Realtime Canvas capability built around faster generation workflows, referenced in the API docs and recipes as LCM (Latent Consistency Models). The purpose is to make “creative iteration” feel interactive: generate quickly, refine, then inpaint or upscale.

Why LCM / realtime matters

In many products, user experience depends on latency. If an image takes 20–40 seconds, users may abandon. Realtime workflows help you keep the UI “alive”: show a quick preview, then offer a refine/upscale path for quality.

Recommended UX flow

1) fast preview → 2) select best → 3) refine with stronger prompt → 4) inpaint corrections → 5) upscale for final. This matches how creative teams work and reduces wasted compute.

Typical Realtime Canvas operations

  • Create LCM generation: produce an initial image quickly.
  • Instant refine: improve quality or steer details without starting from scratch.
  • LCM inpainting: edit regions while keeping the rest consistent.
  • Alchemy Upscale: upscale and enhance details.
Practical tip
Treat realtime as a “draft mode.” Then offer higher-quality finalization steps (refine/upscale). This often saves cost and improves user satisfaction because users only upscale the image they actually want.
Prompting for canvas edits: be explicit about what stays vs changes

For inpainting/edit workflows, prompts should describe the desired change and the context. Tell the model what to keep (“keep the product shape and lighting consistent”) and what to modify (“replace the background with a soft gradient”). If your mask is small, the prompt should focus on the masked area; if the mask is large, include broader composition guidance.

7) Video generation: text-to-video and motion from images

Leonardo’s API reference includes endpoints for creating video generation from a text prompt (text-to-video) and documentation recipes for generating motion using uploaded images. This enables workflows such as: “turn a product still into a subtle motion clip,” “animate a scene from text,” or “create short promo clips for ads.”

Text-to-video: what to expect

Text-to-video is typically an asynchronous job like image generation. You submit a prompt and settings, then wait for completion. The returned assets may be a video file URL plus metadata. In production, always implement timeouts, polling, and webhook callbacks so you can handle longer jobs.

curl -X POST "https://cloud.leonardo.ai/api/rest/v1/generations-text-to-video" \
  -H "accept: application/json" \
  -H "content-type: application/json" \
  -H "authorization: Bearer YOUR_LEONARDO_API_KEY" \
  -d '{
    "prompt": "A smooth camera pan across a minimalist workspace, soft daylight, cinematic",
    "duration": 4
  }'

Motion from an uploaded image

If you want to animate a user’s image (for example, a product photo or hero illustration), the recommended pattern is: upload an init image via a presigned URL endpoint → receive an image ID → reference that ID in the motion/video request.

Production tip: store both image and video lineage

When you generate video from an image, store the lineage (video generation ID → source init_image_id → original file in your storage). This makes it easy to debug and to reproduce results when a customer asks “how did we get this clip?”
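A minimal lineage record might look like this; the field names are illustrative for your own database, not part of any Leonardo schema:

```python
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass(frozen=True)
class AssetLineage:
    """Traceability record linking a generated video back to its source image."""
    video_generation_id: str
    source_init_image_id: Optional[str]  # None for pure text-to-video
    archived_source_key: str             # the original file in your own storage

lineage = AssetLineage("vidgen-42", "img-123",
                       "s3://my-bucket/sources/img-123.png")
```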

8) Models: platform models and custom models

Leonardo supports using platform models (built-in, hosted) and custom models trained from your own datasets. In API docs, you can list platform models and use a model ID in generation requests. Once a custom model is trained, you can generate images by specifying that model ID as well.

Platform models

Platform models let you start immediately. You typically list them (to display in your UI), then pass the chosen model ID in your generation request. Platform models are best when you want broad capability and quick iteration without training overhead.

Related: List models, Select by ID, Fast start

Custom models

Custom models are trained on your dataset and are best for “style consistency” or brand-specific output. For example, a game studio might train a model for their art style; a company might train a model for a product style.

Related: Datasets, Training, Consistency

How to choose a model (practical guidance)

  • If you need fast experimentation: start with platform models, validate product-market fit.
  • If you need consistent brand style: plan a dataset + custom model training workflow.
  • If you need interactive UX: use Realtime Canvas/LCM for previews, then refine/upscale.
  • If you need “templated” results: use prompt templates, negative prompts, and constrained settings.
Model IDs in practice

Model IDs are typically opaque. Your app should treat them as strings and never assume structure. Store the user’s selected model ID with the generation request so you can reproduce the exact output later. In many products, it helps to store both the model ID and a friendly display name (from the platform model list).

9) Datasets + training: how custom models become production features

Training custom models is the “step up” from a basic API integration to a real creative platform. It adds operational complexity—dataset preparation, uploads, training time, monitoring—but it can unlock an experience that feels proprietary: “your brand, your style, consistently.”

What a training pipeline typically includes

Dataset creation

Create a dataset container, then upload dataset images using a presigned upload endpoint. Keep your dataset organized by concept (one dataset per product line or per style).

Image upload

Upload each image quickly after receiving the presigned details. Presigned URLs can expire, so avoid long pauses in the client. Validate file size/type before you request presigned URLs.

Training and validation

Start training, monitor progress, and test outputs with a standard prompt suite (the same prompts every time) to compare versions and detect regressions.

Dataset best practices (what actually improves results)

  • Consistency beats quantity: a smaller set of high-quality, consistent images can outperform a noisy dataset.
  • Cover variation deliberately: include the variations you want the model to learn (angles, lighting, backgrounds).
  • Avoid mixed concepts: don’t blend unrelated subjects into one dataset unless you want the model to merge them.
  • Use a prompt suite: keep a short list of test prompts so you can evaluate changes objectively.
  • Version everything: dataset version, training settings, resulting model ID, and evaluation notes.
Production mindset
Once you train custom models, your API becomes part of a “model lifecycle” system: you’ll need to manage versions, retire underperforming models, track costs, and support users who want consistent style. Plan simple internal tooling early (a small admin page showing dataset status, model IDs, and test outputs).
Training custom elements and generating images

Leonardo’s docs include recipes for training custom elements and then generating images. The common pattern is: create dataset → upload dataset images via presigned URLs → train a custom model/element → generate using the returned model ID in your production generation requests.

10) Webhook callbacks: receiving results without polling

Polling is simple, but webhook callbacks are usually better at scale. Leonardo supports a webhook callback feature you can configure when creating an API key (or within settings) so that generation results can be delivered to your server. This reduces latency, reduces polling traffic, and lets you run long generation jobs without keeping a client waiting.

Webhook callback auth
Leonardo’s webhook callback configuration includes an optional webhook callback API key. When set, Leonardo will include it in requests to your webhook as: authorization: Bearer <yourWebhookCallbackApiKey>. Your server should validate this header (and ideally also validate additional context such as timestamps or known IDs).

Webhook callback design (recommended)

1) Fast acknowledge

Your webhook endpoint should respond quickly with a 2xx status to avoid retries. Do not perform heavy processing inside the HTTP request. Instead, push the callback payload into a queue/job system for asynchronous processing.

2) Idempotency

Webhooks can be delivered more than once (network issues, retries). Store a dedupe key (generation ID + event type) and ignore duplicates. Your downstream systems should use upserts and stable IDs to prevent double-writes.

// Pseudo-code: webhook callback handler (conceptual)
raw = readRawBody(req)
auth = req.headers["authorization"]

if auth != "Bearer YOUR_WEBHOOK_CALLBACK_API_KEY": return 401

payload = JSON.parse(raw)

// Dedupe by generationId (and/or event id)
if seen(payload.generationId): return 200

enqueue("leonardo_generation_completed", payload)
markSeen(payload.generationId)

return 200
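A framework-agnostic version of the handler above, with a constant-time token comparison and in-memory dedupe (swap the set and list for a durable store and a real queue in production); the `generationId` field name is carried over from the pseudo-code and should be verified against real payloads:

```python
import hmac
import json

SEEN: set = set()    # replace with a durable dedupe store in production
QUEUE: list = []     # replace with a real job queue

def handle_webhook(headers: dict, raw_body: bytes, callback_key: str) -> int:
    """Validate, dedupe, enqueue, and acknowledge fast. Returns an HTTP status."""
    expected = f"Bearer {callback_key}"
    if not hmac.compare_digest(headers.get("authorization", ""), expected):
        return 401
    payload = json.loads(raw_body)
    gen_id = payload.get("generationId")  # field name assumed from the pseudo-code
    if gen_id in SEEN:
        return 200  # duplicate delivery: acknowledge without re-processing
    SEEN.add(gen_id)
    QUEUE.append(payload)  # heavy work happens later, off the request path
    return 200
```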
When polling still makes sense

Polling is a good fallback and is useful for “reconciliation jobs” (for example, a nightly job that checks for any generations that never received a callback due to downtime). Many mature systems do both: webhooks for real-time, polling for completeness.

11) Limits, concurrency, and queue: staying reliable under load

Production media generation has three separate “capacity constraints” that you should design for: rate limits (requests per time window), concurrency (how many jobs run at once), and a queue (what happens when you exceed concurrency and jobs must wait). Leonardo documents these concepts explicitly and provides a dedicated limits reference and guide.

Rate limit

How many API calls you can start in a given time window. When you exceed it, you’ll get rate-limit errors and must slow down. This protects the platform and keeps service predictable.

Concurrency

How many generations you can run simultaneously. If your app starts too many jobs, new ones may queue. Concurrency is a key lever for “how fast can we process a batch?”

Queue

The waiting line for jobs when concurrency is maxed out. Queue behavior impacts latency. Your UI should show “queued” states and avoid user confusion.

Best practices for limits

  • Implement backoff on rate limit errors (exponential + jitter).
  • Cap concurrency in your own worker pool rather than letting the API queue grow unbounded.
  • Batch thoughtfully: spread generation requests across time; don’t spike thousands at once.
  • Use webhooks to avoid tight polling loops that waste rate limit budget.
  • Show user states: “Generating”, “Queued”, “Finalizing”, “Failed” with helpful actions.
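The first bullet, backoff with jitter, is worth making concrete. This sketch yields sleep durations using "full jitter" (a random delay between zero and an exponentially growing cap):

```python
import random

def backoff_delays(max_retries: int = 5, base: float = 1.0, cap: float = 30.0):
    """Yield sleep durations for retrying rate-limited calls:
    exponential growth with full jitter, capped."""
    for attempt in range(max_retries):
        yield random.uniform(0, min(cap, base * (2 ** attempt)))
```

Full jitter spreads retries across workers, so a fleet that hits a rate limit together does not re-synchronize into another burst on the next attempt.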
UX matters
Many “AI image generator” apps fail not because the model is bad, but because the UX is unclear under load. If a user clicks Generate and sees nothing for 30 seconds, they click again—doubling load. Clear status, queue position messaging (if available), and “notify me when ready” patterns reduce retries and costs.
Scaling limits (when your product grows)

If your usage grows beyond default limits, you typically have two levers: (1) optimize your architecture (queue + batch + caching + fewer retries), and (2) move to a plan or agreement that supports higher throughput. A strong product often does both: efficient design first, then a higher-capacity plan when justified.

12) Pricing: pay-as-you-go, credits, and cost planning

Leonardo positions API access as pay-as-you-go and encourages developers to start quickly, pay only for usage, and scale when ready. In practical terms, this means you should design your app with cost visibility and cost controls built in: per-user quotas, predictable default settings, and cost-aware UX.

Cost drivers you can control

  • Number of images per request (e.g., generate 1–4 vs 8+).
  • Resolution / size (bigger often costs more and takes longer).
  • Upscale usage (only upscale the chosen winner).
  • Retries (avoid duplicate requests; dedupe aggressively).
  • Prompt experimentation (support prompt improvement tools, but cap loops).
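Those drivers can feed a rough pre-flight estimator. The coefficients below are placeholders, not Leonardo's actual credit pricing; the point is to surface a number before the user commits to a generation:

```python
def estimate_cost_units(num_images: int, width: int, height: int,
                        upscale: bool = False,
                        per_image: float = 1.0,
                        upscale_extra: float = 2.0) -> float:
    """Rough cost estimate scaled by image count and resolution (placeholder rates)."""
    megapixels = (width * height) / 1_000_000
    units = num_images * per_image * max(1.0, megapixels)
    if upscale:
        units += upscale_extra
    return round(units, 2)
```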

Cost visibility patterns

  • Show an estimated “credit cost” before generating (if your pricing model allows).
  • Provide “draft vs final” toggles (draft uses faster settings; final uses refine/upscale).
  • Offer user budgets (daily/monthly) and safe defaults.
  • Log cost-related metadata per generation for reporting and billing.
Production budgeting mindset
In a real SaaS product, API cost becomes COGS. You should treat generation like a metered resource: you can bundle a certain amount in each plan, then charge overage, or provide a credit wallet. The “right” choice depends on your business model, user behavior, and the cost predictability of your typical workloads.
How to reduce costs without hurting quality

Use a two-stage workflow: realtime preview (LCM) → refine/upscale only on selection. Cache results for repeated prompts (especially templates). Avoid re-generating the same request by hashing parameters and returning the last successful output when users click “generate” repeatedly. Finally, use prompt improvement features to reduce “trial and error” generations.

13) Official SDKs: TypeScript and Python

Leonardo provides official SDKs for TypeScript and Python. SDKs typically wrap the REST endpoints, help with request typing, and standardize auth and errors. If you’re building a Node.js or Python backend, using an official SDK can speed up integration and reduce mistakes.

TypeScript SDK (Node / web servers)

A TypeScript SDK is useful for Next.js backends, serverless functions, or standard Node services. It can also make it easier to keep request shapes aligned with the API reference as it evolves.

Related: TypeScript, Node.js, Typed requests

Python SDK (pipelines / batch)

A Python SDK is ideal for batch generation pipelines, data prep, dataset upload automation, and training workflows. It pairs nicely with worker queues and data processing libraries.

Related: Python, Pipelines, Automation

SDK usage pattern (recommended)

  • Initialize client with API key from your secrets manager.
  • Wrap calls with your own retry/backoff policy for transient errors.
  • Normalize responses into your own internal schema (generationId, status, asset URLs, metadata).
  • Centralize logging and error handling (one place to redact secrets).
When not to use an SDK

If you only need 1–2 endpoints and want minimal dependencies, direct HTTP calls are fine. Just be disciplined: keep a single request wrapper, type responses (even loosely), and implement retries. For many teams, the SDK is mainly a productivity tool rather than a strict requirement.

14) Production architecture: how to ship Leonardo API features that don’t break

The “hard part” of AI media generation isn’t calling an endpoint—it’s delivering a reliable product experience: controlling concurrency, handling queue states, managing costs, and supporting retries and user expectations. Here are architecture patterns that work well for Leonardo-style asynchronous generation APIs.

Reference architecture (battle-tested)

Component | What it does | Why it matters
API Gateway (your backend) | Receives user requests, validates inputs, enforces quotas, starts Leonardo generations. | Protects your key, prevents abuse, keeps costs predictable.
Job Queue / Worker | Runs generation requests, polls status, downloads results, writes to storage/DB. | Decouples user requests from long-running jobs; improves reliability.
Webhook Receiver | Receives callbacks and triggers worker processing without polling. | Lower latency and fewer API calls; robust real-time updates.
Object Storage | Stores final images/videos for durable delivery (CDN-ready). | Stable URLs, caching, and retention control for your customers.
Database | Stores generations, status, user mappings, costs, and metadata. | Enables history, billing, and support/debugging.
Observability | Logs, metrics, alerts, tracing for failures and latency spikes. | Quick debugging and reliable SLAs.

Idempotency: preventing duplicate generations

Duplicate generations are a major hidden cost driver. They happen when users click “Generate” multiple times or when your frontend retries on network timeouts. Solve this by generating a request hash from your parameters and storing a record: if the same user submits the same request within a time window, return the existing generation instead of creating a new one.

// Example: deterministic request hash concept
hash = sha256(userId + prompt + modelId + width + height + numImages + seed + options)

if existingGenerationByHash(hash) and status not FAILED:
  return existingGeneration
else:
  create new generation and store hash
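Made concrete, the hash should be computed over a canonical serialization so that key order and incidental formatting don't defeat the dedupe:

```python
import hashlib
import json

def request_hash(user_id: str, params: dict) -> str:
    """Deterministic fingerprint of a generation request for idempotency checks."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{user_id}:{canonical}".encode()).hexdigest()
```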

Guardrails that keep apps safe and affordable

  • Parameter caps: max images, max resolution, max video duration.
  • Per-user quotas: daily credit budget; plan-based limits.
  • Content policy: block obviously disallowed prompts; handle unsafe outputs per your product policy.
  • Queue-aware UX: show status; do not encourage repeated clicks.
  • Backoff policies: on rate limit errors, slow down rather than hammering the API.
Batch generation patterns (e.g., 10,000 images)

For batch workloads, run a worker pool with strict concurrency limits and checkpointing. Store generation IDs as you create them, and process completion via webhooks when possible. If you must poll, poll at an adaptive interval: faster early, slower later, and jitter requests across workers. Download and store outputs to your own storage as they complete, and record failures for retry with capped attempts.
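One way to sketch that worker pool: cap concurrency with a thread pool, record results as jobs complete, and separate failures for capped retries. `start_generation` is an injected stand-in for your API call plus per-job checkpointing:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_batch(jobs, start_generation, max_concurrency: int = 4) -> dict:
    """Run jobs with a strict concurrency cap, separating successes from failures."""
    succeeded, failed = {}, {}
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        futures = {pool.submit(start_generation, job): job for job in jobs}
        for fut in as_completed(futures):
            job = futures[fut]
            try:
                succeeded[job] = fut.result()  # e.g. a generation ID to checkpoint
            except Exception as exc:
                failed[job] = str(exc)  # retry later, with capped attempts
    return {"ok": succeeded, "failed": failed}
```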

How to build a “creator-friendly” UI on top of Leonardo API

Creator-friendly UX usually includes: prompt templates, “improve prompt” button, preset styles, aspect ratio controls, a “draft mode” (LCM) toggle, a refine/upscale pipeline, and an edit step with inpainting (init image + mask). The best UIs also make it easy to compare variants side-by-side and to keep a history of parameters for reproducibility.

15) FAQ: Leonardo API

What is the base URL for Leonardo Production API?

Leonardo’s API reference commonly uses the base path https://cloud.leonardo.ai/api/rest/v1. Endpoints under this include generations, uploads, models, prompt utilities, canvas endpoints, and more.

How do I authenticate?

Use a Bearer token header: Authorization: Bearer YOUR_API_KEY. Create your API key in the Leonardo web app under API Access, and store it securely on your backend.

How do I generate images with a custom model?

Once your custom model is trained, you can generate images by specifying the custom model ID in your generation request. You can also list platform models for default options.

How do uploads work (init images, dataset images)?

Upload endpoints typically return presigned S3 upload details. You call Leonardo with your API key, receive a presigned URL, upload the file to S3 using that URL, then use the returned image ID (like init_image_id) in subsequent requests (generation, edits, motion, training).

Should I poll or use webhook callbacks?

Use webhook callbacks for real-time results and lower API load, but keep polling as a fallback. Many production systems do both: callbacks for speed, polling for reconciliation and error recovery.

What are rate limits and concurrency limits?

Leonardo documents “Concurrency, Rate Limits, Queue” as separate concepts. Rate limits control request throughput; concurrency controls how many generations run at once; and queue behavior describes what happens when you exceed concurrency. Your app should implement backoff and show queue-aware UX states.

Is there an official SDK?

Yes—Leonardo supports official SDKs for Python and TypeScript. SDKs help standardize auth, requests, and response typing, but you can also call the REST endpoints directly if you prefer.

References (official Leonardo docs)

For the most accurate and current parameter lists, request/response schemas, and feature availability, always confirm directly in the official docs below.

Topic | Official link | Why it matters
Developer API overview | https://leonardo.ai/api/ | High-level API positioning, production notes, entry points
API reference (limits) | https://docs.leonardo.ai/reference/limits | Concurrency, rate limits, queue behavior
Quick start | https://docs.leonardo.ai/docs/getting-started | Get API key, first calls, recommended setup
Create image generation | https://docs.leonardo.ai/reference/creategeneration | Start image generations
Get generation by ID | https://docs.leonardo.ai/reference/getgenerationbyid | Poll and retrieve a generation
Upload init image | https://docs.leonardo.ai/reference/uploadinitimage | Presigned uploads for image-to-image & edits
Webhook callback guide | https://docs.leonardo.ai/docs/guide-to-the-webhook-callback-feature | Receive async results; bearer auth for callback
Pricing FAQ | https://docs.leonardo.ai/docs/pricing-and-plans-faq | Pay-as-you-go model explanation
Official SDKs | https://docs.leonardo.ai/docs/leonardoai-official-sdks | TypeScript + Python SDK resources
Realtime canvas recipe | https://docs.leonardo.ai/docs/generate-images-with-realtime-canvas | LCM generation and fast workflows
Text-to-video endpoint | https://docs.leonardo.ai/reference/createtexttovideogeneration | Start text-to-video jobs