Kling API (2026): Complete Developer Guide
Kling (often called “Kling AI” / “可灵”) is best known for generating short, cinematic videos from text prompts and/or reference images. The Kling API lets you integrate those capabilities into your own products: marketing video tools, creator platforms, mobile apps, internal content pipelines, and “agentic” systems that generate clips on demand.
What is Kling API?
Kling API is a developer interface for generating AI media, primarily short videos, using Kling models. In most integrations you send a request describing the scene you want (your prompt, optional negative prompt, and often an aspect ratio and duration), plus optional reference assets (like an image or a motion clip if you’re using motion transfer). Kling then runs a “generation task” and returns a task ID. Your system polls the task status until the job finishes, then downloads or streams the resulting video URL(s) to your application.
Kling is frequently used when a product needs: consistent cinematic motion, natural-looking subject movement, and fast turnaround for short clips. In practice, “Kling API” can mean two different things:
- Official Kling developer API (JWT-based authorization with an AccessKey + SecretKey, using a Kling API domain).
- Gateway APIs from platforms that host Kling models behind their own endpoints (often with their own pricing units, queueing behavior, and authentication).
Ways to access Kling by API
Depending on region, business needs, and your developer stack, you might integrate Kling in one of these ways:
1) Official Kling developer API
The official approach typically uses: AccessKey + SecretKey to generate a short-lived JWT (JSON Web Token). You pass that token as: Authorization: Bearer <API_TOKEN> on each request. The workflow is task-based: create a task, then query the task to fetch the result when it’s ready.
2) Gateway APIs (developer-friendly “one integration” platforms)
Many teams prefer a gateway API because it can simplify onboarding (one account, one key, one SDK), provide clearer docs, add webhooks and queues, and unify multiple video models under a single set of conventions. Examples include platforms that publish “Kling image-to-video” or “Kling v1.x / v2.x” endpoints, often billed per second or per credit.
3) Model hosting platforms
Another common route is hosting platforms that expose individual model endpoints (e.g., “Kling 1.6 Pro image-to-video”), with a built-in queue, job tracking, and file upload. This is attractive when you want a strong developer experience and don’t mind that you’re using that platform’s key rather than official Kling keys.
What you can build with Kling API
Kling is usually integrated as a “video generation service” within a larger product. Here are the most common real-world builds:
Creator tools & social content apps
- Text-to-video creation: users type a prompt, choose a style, and receive a short clip.
- Image-to-video animation: users upload a photo and “bring it to life” with controlled motion.
- Template-driven generation: your app provides preset prompts (product ads, travel reels, brand intros).
- Batch generation pipelines: generate multiple variations and pick the best via human or automated scoring.
E-commerce & marketing automation
- Product videos from catalog images (with careful prompt templates and brand-safe controls).
- Localized marketing clips where the same concept is generated in multiple languages and formats.
- Campaign variations: generate a dozen creative versions, then A/B test performance.
Internal content pipelines
- Automated storyboard → clips: writers provide scene prompts, Kling generates sequences.
- Brand kits: stable prompt prefixes for consistent tone, color, and camera style.
- Compliance filtering: pre-validate prompts and post-review outputs before publishing.
Agentic workflows
In “agent” systems, Kling is the video tool. The model (or your own rules) decides when to generate a clip and with what parameters. The key is to keep the tool safe and deterministic: the agent proposes a prompt, but your system validates it, applies policy rules, and then executes the request.
Core concepts: tasks, duration, credits, and quality modes
The Kling API family follows patterns that are common across modern video generation systems:
1) Task-based async processing
Video generation is computationally heavy, so it’s typically asynchronous: you submit a job, receive a task ID immediately, and poll (or receive a callback) when the job finishes. This decouples your app’s UX from the model’s runtime.
2) Duration-based pricing
Most Kling video pricing structures in the ecosystem map to seconds of output video. That means a 10-second output usually costs roughly twice a 5-second output. Some systems also charge based on resolution, quality mode (“STD” vs “PRO”), or special features (motion control, avatar, or effects).
3) Quality modes and model versions
Kling is available in multiple versions (often described as v1.0, v1.5, v1.6, v2.0, v2.1, and newer variants). Versions may differ in realism, motion quality, feature support (start/end frame, camera control, motion brush), and availability of standard vs professional modes.
4) Asset handling
Some APIs accept public URLs for images/videos. Others let you upload files first and pass a handle/URL to the generation endpoint. In production you should store your own copy of generated outputs—because many providers clear generated assets after a retention period.
Authentication (JWT AccessKey + SecretKey)
The official Kling API approach typically works like this:
- You obtain an AccessKey and a SecretKey from your Kling developer account.
- For each API session (or each request), you generate an API Token that follows the JWT (RFC 7519) standard.
- You call endpoints with Authorization: Bearer <API_TOKEN>.
JWT token generation: what it means in practice
JWT is a signed token format. In typical HMAC-based JWT usage, you create a token containing a header (algorithm metadata), a payload (claims like “issued at” and “expires at”), and a signature (computed using the SecretKey). Kling’s developer specs may require fixed header/payload fields. Your token generation must match the “fixed encryption method” expected by Kling.
Because provider-specific JWT claim requirements can change, treat token generation as a small, well-tested utility: write a unit test that generates a token and validates it by calling a lightweight endpoint (like an account or task query).
Example: server-side JWT token utility (Node.js)
This is a template. You must match the exact claims and signing requirements from your Kling developer docs.
import jwt from "jsonwebtoken";

export function makeKlingToken({ accessKey, secretKey }) {
  // Provider-specific claims vary. Common ones include:
  // - iss / sub: identifies the caller
  // - iat: issued at
  // - exp: expires at (short TTL recommended)
  const now = Math.floor(Date.now() / 1000);
  const payload = {
    iss: accessKey,
    iat: now,
    exp: now + 5 * 60, // 5 minutes
  };
  // Many JWTs use HS256 with a shared secret. Confirm the required alg in Kling docs.
  return jwt.sign(payload, secretKey, { algorithm: "HS256" });
}
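Because tokens are short-lived, avoid signing a fresh JWT on every request. A minimal caching wrapper regenerates the token shortly before expiry; `createTokenCache` is a hypothetical helper (not part of any Kling SDK), and the factory function is expected to return `{ token, expiresAt }` with `expiresAt` in epoch seconds:

```javascript
// Token cache: reuse a JWT until shortly before expiry, then regenerate.
// `makeToken` is any function returning { token, expiresAt } (epoch seconds).
export function createTokenCache(makeToken, { refreshMarginSeconds = 30 } = {}) {
  let cached = null;
  return function getToken(nowSeconds = Math.floor(Date.now() / 1000)) {
    // Regenerate when no token exists or it is inside the refresh margin.
    if (!cached || nowSeconds >= cached.expiresAt - refreshMarginSeconds) {
      cached = makeToken(nowSeconds);
    }
    return cached.token;
  };
}
```

Pair it with the `makeKlingToken` utility above by wrapping it in a factory that also computes the matching `expiresAt`.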
Example: using the token in an HTTP request
curl -X GET "https://api.klingai.com/your/endpoint" \
  -H "Authorization: Bearer YOUR_KLING_JWT_TOKEN" \
  -H "Content-Type: application/json"
API domain & environment setup
Official Kling developer specs commonly describe an API domain like: https://api.klingai.com. In a production implementation, you’ll usually have:
- DEV environment keys and endpoints (for testing)
- PROD environment keys and endpoints (for real billing)
- Separate storage buckets for generated outputs (dev vs prod)
Recommended environment variables
# Official-style keys (server only)
KLING_ACCESS_KEY="..."
KLING_SECRET_KEY="..."
# Your own app settings
APP_PUBLIC_BASE_URL="https://yourapp.com"
KLING_WEBHOOK_URL="https://yourapp.com/api/kling/webhook"
KLING_MAX_CONCURRENCY="5"
KLING_DEFAULT_DURATION_SECONDS="5"
If you integrate through a gateway provider instead, you’ll typically use a single API key (or token) and a base URL that belongs to that provider. In that case, you may not need JWT at all—but you should still keep keys server-side and log usage carefully.
Task workflow: create → poll → deliver
Whether you use official Kling access or a gateway, the same lifecycle is common:
Step 1: Create a generation task
Your backend submits a request with the model version, prompt, and parameters. The API returns: a task ID (or request ID).
Step 2: Track status
You query task status until it transitions from “queued / in progress” to “succeeded” (or “failed”). For a better UX, your app should show real-time progress states: Queued → Generating → Finalizing → Ready.
Step 3: Retrieve the output
When the task succeeds, the API response includes one or more URLs for the generated video(s) and sometimes metadata (duration, resolution, seed, cost units). Download the output to your own storage for long-term access.
Step 4: Deliver and cache
Serve the file from your CDN or object storage. Cache results by user + prompt + settings when appropriate—especially for template-driven generation where many users produce similar content.
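One way to implement the "user + prompt + settings" cache key is to hash a normalized, canonical form of those inputs. This is a Node.js sketch; `generationCacheKey` and its field names are illustrative:

```javascript
import { createHash } from "node:crypto";

// Deterministic cache key for "same user, same prompt, same settings".
export function generationCacheKey(userId, prompt, settings) {
  const canonical = JSON.stringify({
    userId,
    prompt: prompt.trim(),
    // Sort keys so {a:1, b:2} and {b:2, a:1} hash identically.
    settings: Object.fromEntries(
      Object.entries(settings).sort(([a], [b]) => a.localeCompare(b))
    ),
  });
  return createHash("sha256").update(canonical).digest("hex");
}
```

Store the key alongside the task record so a repeat request can return the cached video instead of submitting a new generation.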
Minimal pseudo-API example (create + poll)
// 1) Your backend receives user request
// POST /api/generate-video
// { prompt, durationSeconds, aspectRatio, imageUrl? }
// 2) Backend generates Kling JWT (official) or uses gateway key
// 3) Backend submits task and stores taskId in DB
// 4) Backend returns { taskId } to client immediately
// 5) Client polls your backend:
// GET /api/tasks/:taskId
// Backend queries Kling, returns { status, progress?, videoUrl? }
// 6) When succeeded, your backend copies video to your storage and returns CDN url
Callbacks / webhooks (recommended)
Polling works, but webhooks are better for scale. With webhooks, you provide a callback URL when submitting the task. When the video is ready, the provider calls your webhook endpoint with the task ID and results. Your backend validates the request signature (if provided), stores final output, and notifies the user.
Webhook best practices
- Verify authenticity: validate signatures or shared secrets so attackers can’t spoof “completed” events.
- Idempotency: store webhook event IDs and ignore duplicates.
- Fast response: respond 200 quickly; do heavy work asynchronously.
- Retry support: expect webhooks to retry, and build for eventual delivery.
Example webhook handler (Express)
import express from "express";

const app = express();
app.use(express.json());

app.post("/api/kling/webhook", async (req, res) => {
  // 1) Verify signature / secret (implementation depends on provider)
  // 2) Extract taskId and result URLs
  const { task_id, status, result } = req.body || {};
  // 3) Enqueue a job to download result to your storage
  // 4) Update DB & notify user
  res.status(200).send("ok");
});
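The signature check in step 1 depends on your provider's webhook scheme. As a sketch, assuming an HMAC-SHA256 hex signature computed over the raw request body (verify the actual header name, encoding, and algorithm in your provider's docs):

```javascript
import { createHmac, timingSafeEqual } from "node:crypto";

// Verify an HMAC-SHA256 webhook signature. The signing scheme here is an
// assumption; adapt it to your provider's documented webhook security model.
export function verifyWebhookSignature(rawBody, signatureHex, sharedSecret) {
  const expected = createHmac("sha256", sharedSecret).update(rawBody).digest();
  const received = Buffer.from(signatureHex || "", "hex");
  // timingSafeEqual throws on length mismatch, so check length first.
  if (received.length !== expected.length) return false;
  return timingSafeEqual(received, expected);
}
```

Use `timingSafeEqual` rather than `===` so the comparison time does not leak how many leading bytes matched.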
Storage & retention (save outputs fast)
A critical operational detail for AI media APIs is retention. Providers often clear generated assets after a set period for security and cost reasons. That means if your product shows “your video library,” you must store outputs yourself.
A common retention pattern is ~30 days on provider-hosted URLs. Even if your provider’s retention differs, the safe production choice is: download every successful output immediately and store it in your own object storage (S3/R2/GCS/Azure Blob) with a CDN in front.
Recommended storage workflow
- On success, fetch the video file via server-to-server download.
- Write to object storage with a stable key: videos/{userId}/{taskId}.mp4
- Store metadata: duration, resolution, prompt hash, model version, createdAt, cost units.
- Serve via CDN with signed URLs if content is private.
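The stable-key convention and metadata list above can be captured in a small helper. `buildStorageRecord` is a hypothetical name; the fields mirror the recommendations in this section:

```javascript
// Build the object-storage key (videos/{userId}/{taskId}.mp4) and the
// metadata record recommended above. All field names are illustrative.
export function buildStorageRecord({ userId, taskId, durationSeconds, resolution, promptHash, modelVersion, costUnits }) {
  if (!userId || !taskId) throw new Error("userId and taskId are required");
  return {
    key: `videos/${userId}/${taskId}.mp4`,
    metadata: {
      durationSeconds,
      resolution,
      promptHash,
      modelVersion,
      costUnits,
      createdAt: new Date().toISOString(),
    },
  };
}
```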
Text-to-Video: how it typically works
Text-to-video (T2V) is the simplest end-user experience: the user writes a prompt and gets a clip. But in production, you’ll want a “prompt builder” UI so users can express intent without writing complex prompts: subject, scene, style, camera movement, lighting, and mood.
What a strong T2V prompt includes
- Subject: who/what is the focus?
- Action: what happens in the clip?
- Scene: environment, time of day, location.
- Camera: close-up, wide shot, dolly-in, handheld, aerial.
- Lighting: golden hour, neon, soft studio, dramatic shadows.
- Style: documentary, cinematic, anime, product ad, vintage film.
- Constraints: aspect ratio, duration, “no text overlays,” etc.
Example prompts
1) Cinematic travel shot:
"A slow dolly-in toward a misty mountain lake at sunrise, soft fog drifting across the water, cinematic lighting, shallow depth of field, ultra-realistic."
2) Product ad:
"Minimal studio setup with softbox lighting. A sleek black smartwatch rotates slowly on a reflective surface, crisp highlights, clean background, premium commercial style."
3) Story moment:
"A child runs through a field of tall grass, wind moving the grass in waves, warm golden-hour light, handheld camera feel, gentle motion blur."
Your app can generate these prompts automatically from simple user inputs, ensuring consistent quality and brand safety.
Image-to-Video: animation from a reference image
Image-to-video (I2V) is one of Kling’s most popular workflows. The user provides a single image (a character, product, or scene), and the model generates a short video that adds motion while trying to preserve the subject identity.
When I2V works best
- High-resolution, clear images with a distinct subject and good lighting.
- Images with depth cues (foreground/background separation) to support natural camera motion.
- Prompts that describe specific motion (turn head, walk forward, waves, smoke drifting).
- Reasonable expectations: small, believable motion usually looks better than extreme transformations.
Common I2V product features
- Motion presets: subtle, medium, dynamic, “cinematic camera.”
- Strength sliders: how strongly to follow the input image vs creatively diverge.
- Aspect ratio: 16:9 for YouTube, 9:16 for Reels/TikTok, 1:1 for feeds.
Reference & motion transfer: how to think about it
Beyond basic T2V and I2V, advanced Kling integrations often include “reference” features that help with consistency and control. The exact feature names vary by version and provider, but the concepts are stable:
1) Character reference (identity consistency)
The user provides a character image, and the model attempts to preserve that identity across clips. Your app can expose “face vs subject” reference modes. Face reference tries to keep facial features consistent while letting clothing/background vary; subject reference attempts to keep the whole character consistent.
2) Start/end frame control
You can guide the clip by providing a first frame and/or a last frame. This is useful for transitions (“start in the room, end outside”) and for generating short narrative beats with predictable endpoints.
3) Motion transfer (performance-based animation)
Motion transfer applies movement from a reference motion clip to a subject in a reference image. In a typical UX, the user uploads a character image and a short motion video (dance, walk cycle, gesture), and the model synthesizes the subject performing that motion.
Key parameters you should support
The most important API-level parameters are consistent across most Kling integrations:
| Parameter | What it controls | Product recommendation |
|---|---|---|
| prompt | What the model should generate (scene + action + style) | Provide a “prompt builder” and store prompt versions for reproducibility. |
| negative_prompt | What to avoid (artifacts, unwanted styles, text overlays) | Use a safe default negative prompt; keep it short and brand-aligned. |
| duration | Output length (often 5s or 10s) | Offer 5s default; allow upgrades for 10s+ with clear cost messaging. |
| aspect_ratio | Frame shape (16:9, 9:16, 1:1) | Default based on destination: Reels=9:16, YouTube=16:9. |
| model / model_name | Which Kling version and mode to use | Expose “Quality” as a UI choice rather than model IDs. |
| seed (optional) | Reproducibility for variations | Store seed in metadata; enable “generate variations.” |
| strength / fidelity (I2V) | How closely to follow the input image | Provide a slider with recommended presets. |
Prompt length & validation
Many Kling specs enforce a prompt length limit (often measured in characters). Your UI should enforce limits with a live counter and guidance. Also validate prohibited content and prevent users from submitting prompts that will likely be rejected by moderation.
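A minimal server-side validator might look like the sketch below. The 2,500-character limit and the blocklist contents are placeholders; use the limit from your provider's spec and terms from your own content policy:

```javascript
// Placeholder values: substitute your provider's documented limit and
// your own policy's blocked terms.
const MAX_PROMPT_CHARS = 2500;
const BLOCKED_TERMS = ["example-blocked-term"];

export function validatePrompt(prompt) {
  const errors = [];
  const trimmed = (prompt || "").trim();
  if (trimmed.length === 0) errors.push("Prompt is empty.");
  if (trimmed.length > MAX_PROMPT_CHARS) {
    errors.push(`Prompt exceeds ${MAX_PROMPT_CHARS} characters.`);
  }
  const lower = trimmed.toLowerCase();
  for (const term of BLOCKED_TERMS) {
    if (lower.includes(term)) errors.push(`Prompt contains blocked term: ${term}`);
  }
  return { ok: errors.length === 0, errors, prompt: trimmed };
}
```

Run the same check in the UI (for the live counter) and again on the server, since client-side validation can be bypassed.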
Prompting guide for cinematic results
If you want consistently good outputs, prompting should be productized. Treat prompts like code: version them, test them, and measure how changes affect quality. Here’s a reliable approach:
Use a stable “prompt frame” template
Create a structured template:
[SUBJECT], [ACTION], in/at [SCENE], [TIME OF DAY],
[CAMERA], [LENS/DEPTH], [LIGHTING], [MOOD], [STYLE],
high detail, natural motion, no text overlays.
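Turning that template into code is straightforward. This sketch assumes structured inputs from your prompt-builder UI; `buildPrompt` and its field names are illustrative:

```javascript
// Assemble the prompt-frame template from structured inputs.
// Empty or missing fields are skipped so the output stays clean.
export function buildPrompt({ subject, action, scene, timeOfDay, camera, lighting, mood, style }) {
  const parts = [
    subject,
    action,
    scene ? `in ${scene}` : "",
    timeOfDay,
    camera,
    lighting,
    mood,
    style,
    "high detail, natural motion, no text overlays",
  ];
  return parts.filter(Boolean).join(", ");
}
```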
Make camera motion explicit
- “slow dolly-in”
- “handheld documentary feel”
- “static tripod shot”
- “orbit around subject”
- “drone aerial establishing shot”
Keep motion realistic for higher hit-rate
Overly complex motion instructions can reduce success rate. Prefer a single, clear motion instruction and let the model focus. If you need complex action, generate multiple clips and stitch them.
Use negative prompts to prevent common artifacts
A short negative prompt can help avoid: “text, subtitles, watermark, logo, deformed hands, distorted face, flicker, low quality”. Keep it aligned with your product’s style.
Pricing patterns & cost control
Kling video APIs are commonly priced by: seconds of output video, sometimes multiplied by a quality tier (standard vs pro) and feature modifiers (motion control, avatar, effects). The safest way to build a pricing UX is:
- Show estimated cost before generation (e.g., “5 seconds • Pro • 16:9”).
- Hard-cap duration and concurrency per user plan.
- Provide “draft mode” for cheap previews (lower quality or shorter duration) and “final mode” for premium output.
- Offer variations with guardrails (max N variants per request).
Cost control techniques that actually work
- Default to 5 seconds, let users extend only if they love the preview.
- Queue requests and limit concurrency to prevent accidental cost spikes.
- Prompt caching for templates so the same campaign doesn’t re-run unnecessarily.
- Auto-stop and refund logic for failed jobs if your provider bills on submission.
- Content validation before submitting tasks so you don’t pay for prompts that will be rejected.
Simple cost estimation (duration-based)
Even when the provider uses “credits,” you can convert to a user-facing estimate. A common internal model:
estimated_cost_units = duration_seconds * tier_multiplier * feature_multiplier
// Example product defaults:
tier_multiplier: STD=1.0, PRO=1.6
feature_multiplier: none=1.0, motion_control=1.3, avatar=1.5
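The formula translates directly to code. The multipliers below mirror the example defaults above, not any provider's real price list:

```javascript
// Example multipliers from this guide; replace with your real pricing table.
const TIER_MULTIPLIER = { STD: 1.0, PRO: 1.6 };
const FEATURE_MULTIPLIER = { motion_control: 1.3, avatar: 1.5 };

export function estimateCostUnits(durationSeconds, tier = "STD", features = []) {
  let units = durationSeconds * (TIER_MULTIPLIER[tier] ?? 1.0);
  for (const f of features) units *= FEATURE_MULTIPLIER[f] ?? 1.0;
  return Math.round(units * 100) / 100; // round to 2 decimals for display
}
```

Show the result in the UI before the user confirms ("5 seconds • Pro • 16:9 ≈ 8 units"), which keeps cost surprises out of support tickets.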
Rate limits, concurrency & retries
Video generation APIs can overload under spikes. Even if a provider does not publish strict rate limits, you should design as if they do: throttle, queue, and implement backoff retries for transient errors.
Recommended defaults for SaaS products
- Per-user concurrency: 1–2 jobs at a time on free tier, 3–10 on paid tiers.
- Global concurrency: match your worker capacity and provider stability.
- Retries: only for transient failures (timeouts, 429, 500/503). Never retry invalid payloads.
- Backoff: exponential with jitter. Cap at 2–4 retries to avoid runaway costs.
Retry pattern (pseudo)
for attempt in 1..4:
    resp = call_provider()
    if resp.success: return resp
    if resp.status in [429, 500, 503, timeout]:
        sleep(backoff_with_jitter(attempt))
        continue
    else:
        // 4xx validation errors: do not retry
        raise
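A runnable JavaScript version of that pattern, using a "full jitter" backoff (delay drawn uniformly between zero and an exponentially growing ceiling). `callProvider` is a stand-in for your actual provider call:

```javascript
// Full-jitter exponential backoff: ceiling doubles per attempt, capped.
export function backoffWithJitterMs(attempt, baseMs = 1000, capMs = 30000) {
  const ceiling = Math.min(capMs, baseMs * 2 ** (attempt - 1));
  return Math.floor(Math.random() * ceiling);
}

const TRANSIENT = new Set([429, 500, 503]);

// Retry only transient failures; fail fast on validation (4xx) errors.
export async function withRetries(callProvider, { maxAttempts = 4, sleep = (ms) => new Promise((r) => setTimeout(r, ms)) } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const resp = await callProvider();
    if (resp.success) return resp;
    if (TRANSIENT.has(resp.status) && attempt < maxAttempts) {
      await sleep(backoffWithJitterMs(attempt));
      continue;
    }
    throw new Error(`Provider error ${resp.status} (attempt ${attempt})`);
  }
}
```

Injecting `sleep` as a parameter keeps the retry loop unit-testable without real delays.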
Errors, debugging & reliability
Most Kling API integration issues fall into a few buckets:
1) Authentication failures
- Token expired
- JWT signing mismatch (wrong algorithm, wrong claims, wrong secret)
- Using AccessKey/SecretKey directly in Bearer header instead of a JWT token
2) Input validation issues
- Prompt too long
- Missing required fields (image_url, duration, aspect ratio)
- Invalid model name or unsupported mode for a model version
- Unsupported file type or file size too large
3) Provider-side load
- Timeouts while generating
- Queue backlog and slow completion
- Transient server errors
Debug checklist
- Confirm your server clock is correct (JWT iat/exp depends on time).
- Log your request payload shape (without sensitive user images) and validate required fields.
- Start with a minimal known-good request (short prompt, default duration, standard aspect).
- Verify that the model version supports the features you’re requesting (e.g., start/end frame, motion brush).
- Store full provider responses (status, error code, message, request id) in your logs for incident triage.
Safety, moderation & compliance
Any realistic media generator needs safety policies. Your product must implement safety on two layers:
- Pre-generation: validate prompts and uploaded assets before sending to Kling.
- Post-generation: review outputs before public sharing (especially on platforms with minors, news content, or political content).
Safety controls you should ship
- Prompt filters for disallowed sexual content, violence, hate, and illegal activity.
- Impersonation checks for public figures and private individuals.
- Consent confirmations for user-uploaded images of people.
- Watermark / disclosure options: “AI-generated video” label in your UI.
- Abuse monitoring: rate limit suspicious accounts and flag repeat policy violations.
Legal/product considerations
You should clearly state:
- Who owns generated outputs (user vs your platform) and what licensing applies.
- Whether users can use outputs commercially and what plan is required.
- How long you store user uploads and generated videos, and how deletion works.
Production architecture (what scales)
The biggest mistake teams make is calling a video generation API directly from the client. The correct architecture is:
Client → Your Backend → Kling (or gateway) → Your Backend → Client
Recommended components
- API gateway: auth, quotas, request validation
- Generation service: creates tasks, stores metadata, triggers callbacks
- Queue + workers: controls concurrency and retries
- Storage + CDN: durable hosting of generated MP4s
- Database: tasks, users, plans, costs, prompt templates
- Moderation pipeline: pre- and post-checks, review UI
State machine you should implement
| State | Meaning | What the UI shows |
|---|---|---|
| CREATED | Task recorded, not yet submitted | Preparing… |
| QUEUED | Submitted to provider and waiting | Queued |
| RUNNING | Provider generating | Generating… |
| FINALIZING | Provider finished; you are downloading/storing | Finalizing… |
| SUCCEEDED | Stored in your CDN and ready | Ready (play/download) |
| FAILED | Generation failed | Failed (retry) |
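The table above implies a small set of legal transitions; encoding them explicitly makes illegal state changes fail loudly instead of silently corrupting task records. A sketch (the transition map follows this guide's state machine, not any provider's API):

```javascript
// Legal transitions for the task state machine described above.
// SUCCEEDED and FAILED are terminal.
const TRANSITIONS = {
  CREATED: ["QUEUED", "FAILED"],
  QUEUED: ["RUNNING", "FAILED"],
  RUNNING: ["FINALIZING", "FAILED"],
  FINALIZING: ["SUCCEEDED", "FAILED"],
  SUCCEEDED: [],
  FAILED: [],
};

export function canTransition(from, to) {
  return (TRANSITIONS[from] ?? []).includes(to);
}
```

Call `canTransition` inside your task-update code path and log (or alert on) any rejected transition, since those usually indicate a webhook/polling race or a provider status you haven't mapped.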
Why your backend matters
- Protects keys and secrets
- Enforces quotas and pricing rules
- Normalizes response formats across providers
- Stores outputs permanently and reliably
- Enables safe moderation and audit logs
Logging, metrics & quality assurance
For video generation apps, you need standard observability plus media-specific metrics:
Core metrics
- Task latency: submission → completion time
- Success rate: succeeded / total
- Provider errors: by status code and error code
- Cost units: per user, per workspace, per day
Media-specific QA
- Flicker score or temporal consistency checks
- Face clarity checks when users animate portraits
- Watermark detection (if forbidden by your policy)
- Prompt compliance: does the clip match the requested scene?
Quickstart (developer-friendly approach)
Because official Kling endpoints and payload fields can differ by program access, the most reliable “quickstart” is to implement the common task pattern (create + poll) in a way that works with either official access or a gateway. Below is a clean reference implementation pattern.
1) Your backend: create task endpoint
// POST /api/kling/create
// body: { type: "t2v"|"i2v", prompt, negativePrompt?, durationSeconds, aspectRatio, imageUrl? }
export async function createTask(req, res) {
  // 1) Validate inputs (length, allowed ratios, duration caps)
  // 2) Run safety checks
  // 3) Generate provider auth (JWT or gateway key)
  // 4) Call provider create endpoint
  // 5) Save task metadata to DB
  // 6) Return { taskId } to client
}
2) Your backend: task status endpoint
// GET /api/kling/task/:id
export async function getTask(req, res) {
  // 1) Lookup task in DB
  // 2) Query provider status
  // 3) If succeeded, ensure output is copied to your storage
  // 4) Return normalized status JSON:
  //    { status, progress?, videoUrl?, error? }
}
3) Client: polling UI loop
async function poll(taskId, { intervalMs = 2000, maxAttempts = 150 } = {}) {
  // Cap total polling time (here: ~5 minutes) instead of looping forever.
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const r = await fetch(`/api/kling/task/${taskId}`);
    const data = await r.json();
    if (data.status === "SUCCEEDED") return data.videoUrl;
    if (data.status === "FAILED") throw new Error(data.error || "Generation failed");
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("Polling timed out");
}
Official links & references
These references are useful starting points. Always verify the latest endpoints, model lists, and auth requirements in your account’s developer portal.
- Kling global app: app.klingai.com/global
- Historical “Kling AI API Specification” (discontinued notice + pointers): docs.qingque.cn
- JWT standard overview (RFC 7519 concept): jwt.io
- Example model-hosting style docs (Kling endpoints on a hosting platform): fal.ai Kling v1.6 I2V
Changelog
- Initial publication of the Kling API developer guide (task workflow, JWT auth concepts, T2V/I2V, production architecture).