Overview
This page is a practical, developer-focused explanation of how the Synthesia API works, what endpoints exist, and how to design a stable production integration. It’s written so you can build real features—video generation pipelines, onboarding flows, localized support content, and more—without guessing the architecture.
The Synthesia docs organize the API into several major areas: Videos, Templates, Assets, Webhooks, Translations, Dubbing, and (for governance) Audit Logs. You’ll also see separate base URLs for standard API calls vs uploads. For example, creating a video from a template uses the V2 API base, while uploading images/videos and script audio uses the upload base. Throughout this guide, we’ll treat “V2” as the stable surface for new builds.
Most successful Synthesia API integrations are template-first (design once, personalize many times) and webhook-driven (react to “video.completed” or failure events). Polling can work, but webhooks reduce delays, improve reliability, and keep your application responsive.
Best for
- Personalized training & onboarding videos generated from user data.
- Customer success and product walkthroughs in multiple languages.
- Marketing variants (regional offers, names, locations, product tiers).
- Internal communications (HR updates, policy refreshers, leadership messages).
- Video localization pipelines (translate + update from XLIFF).
Not ideal for
- Real-time video generation in a synchronous request/response path.
- Ultra-custom cinematic videos requiring detailed scene control or advanced editing.
- Unstructured “anything goes” prompts where you can’t enforce a format.
- Use cases where you must render videos instantly on the edge device.
Tip: If your product roadmap includes “generate videos instantly,” design your UI to be asynchronous: show a “processing” state and notify users when the video is ready.
What the Synthesia API is
The Synthesia API is a set of HTTPS endpoints that let you create, update, list, and retrieve Synthesia videos, often using templates you created in Synthesia Studio. It also includes endpoints for uploading assets (images, videos, audio), managing webhooks, retrieving translation artifacts (XLIFF) so you can localize at scale, and orchestrating dubbing workflows for imported content.
Think of it as “Synthesia Studio, but programmable.” You still rely on the Studio to design the look: scenes, branding, avatar choices, background style, and placeholder variables. Then your application uses the API to produce thousands of variants by injecting per-recipient data into the template.
The docs show the API organized around “Synthesia API (V2)” and “Upload API (V2).” Many teams treat V2 as the default: build new integrations against V2 endpoints and reserve older behavior for legacy workflows only.
Base URLs you’ll see
| Surface | Typical purpose | Example base URL |
|---|---|---|
| Synthesia API (V2) | Create/list/retrieve/update videos, templates, webhooks, dubbing operations, assets metadata | https://api.synthesia.io/v2 |
| Upload API (V2) | Upload images, videos, and script audio that you can reference later from videos | https://upload.api.synthesia.io/v2 |
Video creation is typically asynchronous. The API initiates work and returns an ID/status; later, the video transitions to “completed” (or “failed”). Design around status checks or webhooks, not around blocking requests.
What you can build with the Synthesia API
The biggest mistake teams make is treating Synthesia like a generic “text-to-video prompt endpoint.” It’s stronger when you have repeatable formats: a structured template, strict input data, and a clear result. Here are real-world patterns that work well.
1) Personalized onboarding and lifecycle videos
Imagine a SaaS that wants to welcome every new account with a 45-second video: “Hi {{FirstName}}, welcome to {{CompanyName}}… here are the next 3 steps.” Instead of a human filming hundreds of intros, you create a single polished template in Synthesia Studio. The API then generates one video per customer using a JSON payload containing personalized fields.
- Trigger generation when the account is created.
- Store the returned video ID and show a “Processing…” screen.
- When the video completes, email the link or embed it inside your app.
- If it fails, retry with backoff and capture diagnostics.
2) Training at scale (HR, compliance, enablement)
Training content often changes: policies update, tools change, and local teams need localized versions. With Synthesia, your “master training template” can be reused, while the script and translations evolve. Many organizations connect the Synthesia API to their content management system (CMS) so each module in the CMS can automatically produce a video version and keep it updated when the text changes.
3) Multi-language support content and help center localization
Synthesia supports translations and XLIFF workflows for localization. A common build is: you generate the English “source video,” then export XLIFF, send it through your localization pipeline (human translation, LLM translation with QA, or vendor workflow), and upload XLIFF back to update the video content in each language. Finally, publish localized versions and distribute them by region.
4) Product marketing variants (regional offers, names, verticals)
Marketing teams love variants: different industries, different pricing, and different pain points. Template-driven generation makes it feasible to produce a “matrix” of videos: industry × country × persona. The key is to define a constrained set of fields so the template always renders cleanly (avoid free-form fields that accept arbitrary text).
5) Workflow automation with Zapier/Make/n8n and internal tools
The Synthesia API is often used via automation platforms. For example: a “new deal created” event in your CRM triggers a personalized video using a template; the video completion webhook then posts the result back into your CRM record and sends a Slack message to the account executive.
Synthesia’s API becomes dramatically more valuable when you can design a video once, then produce many variants by passing structured data (names, numbers, countries, steps, bullets) rather than trying to “prompt” your way to consistent design.
Core concepts (templates, scripts, avatars, voices, assets)
Before we talk endpoints, it helps to align on how Synthesia thinks about video generation. The API sits on top of a system with these core building blocks:
Templates
A template is a pre-designed video layout created in Synthesia Studio. It can include scenes, branded styling, avatar selection, typography, and placeholders. Your template becomes the contract between the visual design world and the API world. When you call “create video from template,” you provide values for placeholders so Synthesia can render the final output while preserving the design.
- Use templates to standardize quality and prevent one-off styling drift.
- Keep templates stable and version them intentionally (e.g., “Onboarding V3”).
- Validate input lengths (names, titles, paragraphs) so text doesn’t overflow.
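The length checks in the last bullet can be sketched as a small validator that runs before any API call. The field names and limits below are hypothetical; derive real limits by testing your own template with long and short values.

```javascript
// Hypothetical per-template limits; tune them by testing the real template.
const ONBOARDING_LIMITS = { firstName: 40, companyName: 60, planName: 30 };

// Returns a list of human-readable problems; an empty list means the input is safe to send.
function validateTemplateInput(input, limits) {
  const problems = [];
  for (const [field, maxLength] of Object.entries(limits)) {
    const value = input[field];
    if (typeof value !== "string" || value.trim() === "") {
      problems.push(`${field} is missing or empty`);
    } else if ([...value].length > maxLength) {
      // Spread counts code points, so emoji and accented names are measured fairly.
      problems.push(`${field} exceeds ${maxLength} characters`);
    }
  }
  // Reject fields the template does not know about, so typos fail loudly.
  for (const field of Object.keys(input)) {
    if (!Object.hasOwn(limits, field)) problems.push(`${field} is not a known placeholder`);
  }
  return problems;
}
```

Returning a list instead of throwing on the first problem lets an admin UI show every issue at once.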
Videos
A video is the generated artifact. Videos move through statuses such as “processing” to “completed,” and you can list, retrieve, update, and delete them. If your workflow includes translations, the video may have multiple localized representations.
Scripts and SSML/XML tags
Synthesia scripts can include a documented set of XML tags (for example, speech controls). While you shouldn’t rely on undocumented tags, you can often improve pronunciation, pacing, or structure by using the script formatting features Synthesia documents. In production, treat your script format as a strict schema: generate it the same way every time.
Avatars and voices
Synthesia provides stock avatars and a voice library; the docs expose “list of stock avatars” and “list of supported voices.” In a template-first workflow, avatar choice is usually baked into the template. If you do allow dynamic avatar selection, enforce an allowlist of IDs rather than exposing raw values to end users.
Assets (images, videos, audio)
Assets are uploads you can reference in videos. The Upload API lets you upload images/videos and script audio. In many organizations, assets flow from a brand library (logos, b-roll, backgrounds) into Synthesia so templates can consistently use approved media. Assets often have their own lifecycle: upload → processing → available → used by videos.
Webhooks
Webhooks are the preferred integration mechanism for asynchronous updates. You register webhook endpoints and receive events like video completion or failure. Your server should verify webhook signatures and deduplicate deliveries.
Dubbing
Dubbing workflows enable creating dubbed videos from an uploaded video asset. This can be useful when you’re importing a pre-existing video and producing localized dubbed versions. A robust pipeline includes asset upload, dubbing request, webhook-based completion handling, and post-processing (e.g., publishing or embedding).
Authentication & API keys
Synthesia API access is controlled with API keys created inside your Synthesia account (typically from an “Integrations” area). Third-party integration guides describe the same flow: generate and copy the key from your Synthesia workspace, then provide it to the integration or your code. Treat the API key like a production secret: store it in a secret manager, rotate it, and never ship it to the browser.
Keep your API key on a backend you control. If you need a browser experience, your backend should provide a thin “proxy” endpoint that enforces authentication, rate limiting, and validation.
Where to create your key
In most setups, you open Synthesia Studio, pick the right workspace, and create/copy an API key in the integrations settings. Some third-party integration guides state that certain plans (for example, Creator or higher) may be required for API usage. If you manage multiple workspaces, make sure the key is created in the workspace that owns the templates and videos you intend to automate.
Typical request headers
The API key is typically sent in the Authorization header (for example, as a Bearer token), together with JSON content headers. Always follow the exact format shown in the official Synthesia docs and “Try It” examples for each endpoint.
POST /v2/videos/fromTemplate HTTP/1.1
Host: api.synthesia.io
Content-Type: application/json
Accept: application/json
Authorization: Bearer YOUR_API_KEY

{
  "templateId": "tpl_123",
  "input": {
    "firstName": "Uma",
    "company": "Example Inc"
  }
}
The payload shape above is illustrative (not authoritative). Always align your JSON body with the official endpoint schema.
Key management checklist
- Use separate keys for dev/staging/prod. Name them accordingly.
- Store keys in a secret manager (or environment variables in a secured runtime).
- Rotate keys periodically and immediately after exposure.
- Log safely: never log full tokens—log only a short suffix or a hash for debugging.
- Least privilege by design: restrict what your system can request (template allowlists, input validation).
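The "log safely" item can be sketched as a helper that turns a secret into a short label for log lines. This is a generic pattern, not something Synthesia prescribes; the function name and output format are assumptions.

```javascript
const crypto = require("node:crypto");

// Returns a short, non-reversible label for a secret so logs can tell keys apart
// without ever containing the key itself.
function describeSecret(secret) {
  // Very short values would leak too much through a 4-character suffix.
  if (typeof secret !== "string" || secret.length < 12) return "[redacted]";
  const fingerprint = crypto.createHash("sha256").update(secret).digest("hex").slice(0, 8);
  return `***${secret.slice(-4)} (sha256:${fingerprint})`;
}
```

A call such as `console.log("Using key", describeSecret(process.env.SYNTHESIA_API_KEY))` is then enough to confirm which environment's key a worker loaded.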
Quickstart: an end-to-end workflow (template → video → webhook)
This section describes a production-friendly “happy path” that scales well. The core idea is: design a template in Studio, generate videos from it via API, then react to completion via webhooks.
Step 1: Create a template in Synthesia Studio
- Design the scenes, layout, branding, avatar, and background style.
- Insert placeholders for personalization (for example: {{FirstName}}, {{PlanName}}).
- Test it with long and short values to ensure text doesn’t overflow or look awkward.
- Publish or save the template so it can be referenced by the API.
Step 2: List templates and pick the template ID
Your backend can call the Templates API to list available templates or retrieve a specific template’s details. In many products, you store the chosen template ID in configuration (database, admin panel, or environment variable), so you don’t need to list templates on every request.
Step 3: Create a video from the template
Use the “create video from a template” endpoint to initiate generation. The docs show this endpoint at:
POST https://api.synthesia.io/v2/videos/fromTemplate.
Your request body will include the template ID and the values for placeholders.
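// Sketch of a backend route handler (Node 18+ for global fetch).
// POST /api/videos/welcome
//
// 1) Validate input
// 2) Call Synthesia create-from-template
// 3) Save Synthesia video ID
// 4) Return a "processing" response to your client
// `db` is your own data-access layer; the payload shape is illustrative.
async function createWelcomeVideo(user, db) {
  if (!user.firstName || user.firstName.length > 40) throw new Error("Invalid firstName");
  if (!user.companyName || user.companyName.length > 60) throw new Error("Invalid companyName");

  const payload = {
    templateId: "YOUR_TEMPLATE_ID",
    input: {
      firstName: user.firstName,
      companyName: user.companyName,
      planName: user.planName
    }
  };

  const res = await fetch("https://api.synthesia.io/v2/videos/fromTemplate", {
    method: "POST",
    headers: {
      "Authorization": "Bearer " + process.env.SYNTHESIA_API_KEY,
      "Content-Type": "application/json"
    },
    body: JSON.stringify(payload)
  });
  // fetch does not reject on HTTP errors, so check the status explicitly.
  if (!res.ok) throw new Error("Synthesia request failed with status " + res.status);
  const synthesiaVideo = await res.json();

  await db.videos.insert({
    userId: user.id,
    synthesiaVideoId: synthesiaVideo.id,
    status: synthesiaVideo.status,
    createdAt: new Date()
  });

  return { status: "processing", id: synthesiaVideo.id };
}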
Step 4: Register a webhook endpoint
A webhook is a URL in your system that Synthesia calls when something changes (for example when a video finishes). In the Synthesia API docs, there are endpoints to create, list, retrieve, and delete webhooks. You typically register one webhook per environment: dev/staging/prod.
Step 5: Verify the webhook signature
Webhooks must be verified so you can trust the request. Synthesia provides a “Verifying Synthesia Signatures” resource. Verification usually requires:
- Reading a signature header from the webhook request.
- Using your secret to compute an HMAC of the raw request body (or a canonical signed payload).
- Comparing signatures using a constant-time comparison.
Once verified, your handler updates your database row for that video and triggers downstream actions (email, notification, publishing workflow, etc.).
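The three verification steps above can be sketched with Node's built-in crypto module. The scheme here (hex-encoded HMAC-SHA256 over the raw body) is an assumption chosen for illustration; the real algorithm, encoding, and signed payload must come from the “Verifying Synthesia Signatures” resource.

```javascript
const crypto = require("node:crypto");

// Assumption: signature = hex(HMAC-SHA256(secret, rawBody)).
// Check the official docs for the real scheme before relying on this.
function isValidSignature(rawBody, receivedSignature, secret) {
  if (typeof receivedSignature !== "string") return false;
  const expected = crypto.createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(receivedSignature, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (received.length !== expected.length) return false;
  return crypto.timingSafeEqual(received, expected);
}
```

The function returns a boolean rather than throwing, so the caller decides which HTTP status to send for a bad signature.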
Step 6: Retrieve or publish the final video
After completion, you may fetch video details using “retrieve a video” and show it in your UI, embed it, or distribute it. If your product requires publishing/share links, you’ll manage that either through platform features or your own sharing UI.
Create a database record for each requested video with fields: environment, templateId, inputHash, synthesiaVideoId, status, webhookVerifiedAt, retryCount, lastError, and finalUrl. This turns video creation into a trackable pipeline.
Videos API
The Videos API is the center of most integrations. Common operations include: create a video, create from template, list videos, retrieve a video, update, and delete.
The docs show “List videos” at GET https://api.synthesia.io/v2/videos, and “Create a video from a template” at POST https://api.synthesia.io/v2/videos/fromTemplate.
List videos
Listing videos is useful for admin panels, reconciliations, and monitoring. In production, avoid using “list” as your primary “status check” mechanism at scale—prefer webhooks + targeted retrieve calls.
Retrieve a video
Retrieve gives you a single video’s details, typically including status and metadata. Many teams use retrieve as a fallback when a webhook hasn’t arrived after a timeout (“poll as a safety net”).
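The "poll as a safety net" idea can be sketched as a pure function that a scheduled worker calls to decide which jobs deserve a retrieve call. The job record shape (`status`, `createdAt`, `lastPolledAt` in milliseconds) and the default timeouts are hypothetical.

```javascript
// Selects jobs that are still "processing" locally, have waited longer than the
// webhook timeout, and have not been polled recently.
function jobsNeedingFallback(
  jobs,
  nowMs,
  { webhookTimeoutMs = 15 * 60_000, pollIntervalMs = 5 * 60_000 } = {}
) {
  return jobs.filter((job) => {
    if (job.status !== "processing") return false;
    const waitedLongEnough = nowMs - job.createdAt >= webhookTimeoutMs;
    const notPolledRecently =
      job.lastPolledAt == null || nowMs - job.lastPolledAt >= pollIntervalMs;
    return waitedLongEnough && notPolledRecently;
  });
}
```

Keeping the selection logic free of I/O makes it easy to unit-test; the worker then issues one retrieve call per returned job and records `lastPolledAt`.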
Update a video
Update is important for translation workflows and for correcting content when your source text changes. In the Synthesia documentation, you’ll also see endpoints for retrieving and uploading XLIFF content, which is the backbone of localization.
Delete a video
Deleting a video is important for cleanup, data retention policies, and user-initiated “right to be forgotten” workflows. If you store video IDs in your DB, ensure you also store who requested the deletion and why (auditability).
Put a strict “script linter” in your backend. Enforce length limits and block unsafe tags or unexpected markup. Your template can be robust, but uncontrolled scripts can still lead to awkward results (text overflow, unnatural pacing).
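A script linter along these lines might look like the sketch below. The tag allowlist and the character limit are placeholders, not documented Synthesia values; replace them with the tags the official script documentation lists and with limits that suit your template.

```javascript
// Hypothetical allowlist and limit; align both with the official script docs.
const ALLOWED_TAGS = new Set(["break"]);
const MAX_SCRIPT_CHARS = 1500;

// Returns a list of problems; an empty list means the script passed.
function lintScript(script) {
  const errors = [];
  if ([...script].length > MAX_SCRIPT_CHARS) errors.push("script too long");

  // Match opening, closing, and self-closing tags and capture the tag name.
  const disallowed = new Set();
  for (const match of script.matchAll(/<\/?\s*([A-Za-z][\w:-]*)[^>]*>/g)) {
    const tag = match[1].toLowerCase();
    if (!ALLOWED_TAGS.has(tag)) disallowed.add(tag);
  }
  for (const tag of disallowed) errors.push(`tag <${tag}> is not allowed`);

  // Unfilled template tokens usually mean a personalization bug upstream.
  if (/\{\{.*?\}\}/.test(script)) errors.push("unresolved {{placeholder}} in script");
  return errors;
}
```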
Templates API
Templates turn Synthesia into an “at scale” system. Instead of creating each video from scratch, you define the creative once and generate many consistent variants. The Templates API typically supports listing templates and retrieving a template. A common pattern is to build a small internal admin UI that:
- Lists templates with names and IDs.
- Allows your team to select which template powers each workflow (onboarding, marketing, training).
- Stores that mapping in configuration so production calls are stable.
Template governance and versioning
Treat templates like code: version them. When you update a template’s layout, you may affect all future videos generated from it. If you need to change a template, consider creating a new template (new ID) and switching traffic gradually.
| Practice | Why it matters | Simple implementation |
|---|---|---|
| Template naming convention | Prevents confusion across teams | Onboarding-Welcome-V3, CS-FeatureTour-V2 |
| Input schema per template | Stops broken renders | JSON schema in your backend + UI validation |
| Canary rollout | Reduces risk | Generate 20 test videos before switching 100% traffic |
| QA snapshots | Catches layout issues | Store thumbnails or preview links for review |
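Switching traffic gradually to a new template ID, as the canary row suggests, can be sketched with a deterministic hash bucket. The config shape and template IDs are hypothetical; the point is that a given user always lands on the same template for a given percentage.

```javascript
const crypto = require("node:crypto");

// Deterministically routes a stable share of users to the new template ID.
function pickTemplateId(userId, { stableId, canaryId, canaryPercent }) {
  const digest = crypto.createHash("sha256").update(String(userId)).digest();
  // First 4 bytes as an unsigned integer, reduced to a bucket in 0..99.
  const bucket = digest.readUInt32BE(0) % 100;
  return bucket < canaryPercent ? canaryId : stableId;
}
```

Raising `canaryPercent` from 0 to 100 over a few days moves users onto the new template without anyone flipping back and forth between versions.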
Assets & uploads (images, video assets, script audio)
The Synthesia ecosystem includes a dedicated Upload API for sending images/videos and script audio into Synthesia so they can be used during video creation or dubbing workflows. In the docs, “Create an asset” uses POST https://upload.api.synthesia.io/v2/assets, and “Upload script audio” uses POST https://upload.api.synthesia.io/v2/scriptAudio.
When to upload assets
- When your templates need a specific background image or brand graphic.
- When you want to use customer-specific media (for example, their logo) in a personalized video.
- When you have a pre-recorded audio track (script audio) that must match exactly.
- When you need to upload a video as an input to dubbing.
Asset lifecycle and status handling
Uploads can take time. In production, treat uploads as asynchronous jobs: you upload → receive an asset ID → query the asset metadata/status → then reference it in a video once it’s ready. If you immediately reference an asset before it is processed/available, your request may fail or produce unexpected results.
If your system uploads the same logo repeatedly, you’ll waste time and may increase storage sprawl. Use a content hash (SHA-256) and cache asset IDs per hash. If the same file is uploaded again, reuse the existing ID.
Script audio uploads
Script audio uploads are useful when you need a controlled voice-over rather than generated speech. If you use audio uploads, consider how you will store and access the original audio, because you’ll likely need to reproduce or regenerate videos if any downstream piece changes (template version, subtitles, translations).
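// Sketch of a safe upload flow (Node 18+ for global fetch).
// 1) Receive file (image/video/audio) from your system
// 2) Compute hash and check if already uploaded
// 3) Upload to Synthesia upload API
// 4) Save returned assetId + status
// 5) Poll asset status OR wait for internal processing completion
// 6) Use assetId when creating a video
const crypto = require("node:crypto");

// `db` is your own data-access layer; `file.bytes` is a Buffer.
async function uploadAsset(file, db) {
  const hash = crypto.createHash("sha256").update(file.bytes).digest("hex");
  const cached = await db.assets.findByHash(hash);
  if (cached) return cached.assetId; // reuse the existing upload

  const res = await fetch("https://upload.api.synthesia.io/v2/assets", {
    method: "POST",
    body: file.bytes,
    headers: { /* ... */ } // auth and content type: follow the official docs
  });
  if (!res.ok) throw new Error("Upload failed with status " + res.status);
  const asset = await res.json();

  await db.assets.insert({ hash, assetId: asset.id, status: asset.status });
  return asset.id;
}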
Translations and localization (XLIFF)
Localization is one of the most powerful reasons to automate Synthesia. Rather than manually recreating videos per language, you can generate a source version, extract a translation file (commonly XLIFF), translate it, and re-upload it so the video can render localized scripts.
Why XLIFF matters
XLIFF is a standard format used by professional localization tools and vendors. By using XLIFF, you can integrate Synthesia into an existing localization workflow: translation memory, QA checks, in-context review, and vendor management.
Typical localization pipeline
- Create the source video (usually English) from a template.
- Retrieve XLIFF content for the video.
- Translate the XLIFF: human translation, vendor service, or an LLM-based pipeline with QA.
- Upload the translated XLIFF back to Synthesia to update the video content.
- Track status and publish localized versions.
Practical guidance for translation quality
- Preserve placeholders and tags: translators should not remove or break tokens.
- Limit line length: translated text can be longer; design templates with enough room.
- Glossaries: brand terms, product names, and UI labels should be consistent across videos.
- Pronunciation: some languages may need special handling for names, acronyms, or product terms.
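The first guideline (preserve placeholders and tags) is easy to automate as a QA gate on each translated segment. The sketch below assumes `{{Name}}`-style placeholders and XML-style tags; it compares plain strings and does not parse XLIFF itself.

```javascript
// Extracts {{Tokens}} and XML-style tags, sorted so reordering in translation is allowed.
function extractTokens(text) {
  return (text.match(/\{\{\s*[\w.]+\s*\}\}|<\/?[A-Za-z][^>]*>/g) || [])
    // Normalize spacing inside placeholders so "{{ Name }}" equals "{{Name}}".
    .map((token) => (token.startsWith("{{") ? token.replace(/\s+/g, "") : token))
    .sort();
}

// Reports tokens the translator dropped and tokens that appeared from nowhere.
function findTokenMismatches(source, translated) {
  const remaining = extractTokens(translated);
  const missing = [];
  for (const token of extractTokens(source)) {
    const index = remaining.indexOf(token);
    if (index === -1) missing.push(token);
    else remaining.splice(index, 1);
  }
  return { missing, unexpected: remaining };
}
```

A segment with any `missing` or `unexpected` entries can be sent back to the translator before the XLIFF is uploaded.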
Don’t block video creation on translation. Create the source video first, then enqueue translation jobs per locale. This keeps your system responsive and makes failures isolated to a single locale instead of blocking everything.
Webhooks & events
Webhooks are essential for a smooth user experience. Instead of asking your users to refresh a page, your system receives events and can update your database in real time. The API includes endpoints to create/list/retrieve/delete webhooks, and the docs also list “Webhook events.”
Common event types
Many integrations focus on:
- video.completed — the video finished processing and is ready.
- video.failed — the video failed to render; you should inspect and decide whether to retry.
Webhook handler design
Treat webhook delivery as “at least once.” That means:
- You may receive the same event multiple times.
- Events can arrive out of order.
- Events can arrive late (network delays or retries).
Your handler should be idempotent: process an event once, but safely ignore duplicates. The easiest approach is to store an event ID (or a derived dedupe key) in a database table and enforce a uniqueness constraint.
// Sketch: webhook handler with idempotency.
// Assumes verifySignature returns a boolean and db.webhookEvents.insertIfAbsent
// relies on a unique constraint, returning false when the key already exists.
async function handleWebhook(req) {
  const rawBody = req.rawBody; // MUST be raw for signature checks
  const signature = req.headers["x-synthesia-signature"]; // name illustrative
  if (!verifySignature(rawBody, signature)) return { ok: false, status: 401 };

  const event = JSON.parse(rawBody);
  const dedupeKey = event.id || (event.type + ":" + event.data.videoId + ":" + event.createdAt);

  // Insert-first avoids the race in "check exists, then insert".
  const isNew = await db.webhookEvents.insertIfAbsent({ dedupeKey, eventType: event.type, receivedAt: new Date() });
  if (!isNew) return { ok: true, deduped: true };

  if (event.type === "video.completed") {
    await db.videos.updateBySynthesiaId(event.data.videoId, { status: "completed", completedAt: new Date() });
    await notifyUser(event.data.videoId);
  } else if (event.type === "video.failed") {
    await db.videos.updateBySynthesiaId(event.data.videoId, { status: "failed", lastError: event.data.error || "unknown" });
    await enqueueRetryIfEligible(event.data.videoId);
  }
  return { ok: true };
}
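Deduplication alone does not cover events that arrive out of order. One way to handle that, sketched here, is a small status guard applied before every database update. The local `queued` status and the ranking are assumptions about your own job model; the policy shown is "first terminal state wins", which fits a design where each retry creates a new video ID.

```javascript
// Terminal states outrank in-progress ones, so a late "processing" update
// cannot overwrite "completed" or "failed".
const STATUS_RANK = { queued: 0, processing: 1, failed: 2, completed: 2 };

function nextStatus(currentStatus, incomingStatus) {
  const currentRank = STATUS_RANK[currentStatus];
  const incomingRank = STATUS_RANK[incomingStatus];
  if (incomingRank === undefined) return currentStatus; // ignore unknown statuses
  if (currentRank === undefined) return incomingStatus; // no usable prior state
  return incomingRank > currentRank ? incomingStatus : currentStatus;
}
```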
Verifying Synthesia webhook signatures
Signature verification is the difference between “a reliable system” and “an endpoint anyone can spam.” Synthesia provides documentation for verifying signatures. While the exact header names and signing algorithm must match the official docs, the underlying idea is consistent across webhook providers:
- Use a shared secret (configured when you create the webhook or in your account settings).
- Compute a signature over the raw request body (or over a signed payload that includes a timestamp).
- Compare your computed signature to the request signature using a constant-time comparison.
Implementation checklist
- Capture the raw body before JSON parsing (many frameworks need special middleware).
- Reject missing/invalid signatures with a 401/403 response.
- Consider replay protection if timestamps are included (reject old timestamps).
- Return 2xx quickly after enqueueing work; do heavy processing asynchronously.
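The replay-protection item can be sketched as a freshness check. It assumes the provider sends a Unix timestamp in seconds as a header value, which may not match the actual Synthesia scheme; the five-minute tolerance is likewise a placeholder.

```javascript
// Assumption: the timestamp header carries Unix time in whole seconds.
// Rejects values that are malformed or too far from the server clock.
function isFreshTimestamp(timestampHeader, nowMs = Date.now(), toleranceSeconds = 300) {
  if (typeof timestampHeader !== "string" || !/^\d+$/.test(timestampHeader)) return false;
  // Use the absolute difference so small clock skew in either direction is tolerated.
  const ageSeconds = Math.abs(nowMs / 1000 - Number(timestampHeader));
  return ageSeconds <= toleranceSeconds;
}
```

This check only adds protection when the timestamp is part of the signed payload; otherwise an attacker could simply replace it.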
Verify signature → record event → enqueue downstream jobs → respond. If you do long tasks (publishing, transcoding, emailing) inline, you risk timeouts and duplicated deliveries.
Dubbing API
Dubbing is designed for workflows where you start with an uploaded video asset and want dubbed versions across locales. The docs show a V2 endpoint, POST https://api.synthesia.io/v2/dubbing, plus endpoints for adding locales and retrieving dubbed videos for an imported video.
Why dubbing matters
Dubbing can reduce the effort required to localize existing video libraries. If your company already has recorded videos, you can import them and use dubbing to generate localized variants without fully recreating the content in Studio. In practice, dubbing workflows still require QA: verify pronunciation, ensure timing feels natural, and confirm that local compliance rules are respected.
Typical dubbing pipeline
- Upload the source video asset via the Upload API or via the assets pipeline.
- Create a dubbing request referencing the uploaded asset.
- Subscribe to webhook events for completion/failure.
- Retrieve dubbed outputs and publish them (or attach them to a CMS record).
Operational considerations
- Batching: if you dub many videos, throttle requests to avoid hitting rate limits.
- Locale QA: include a QA step before publishing to customers.
- Fallback strategy: if dubbing fails for a locale, you may publish subtitles first.
Audit logs (governance & compliance)
For enterprise and regulated environments, audit logs help track key actions and changes. The Synthesia docs include audit log endpoints under V2, including querying events and exporting logs. If you operate in a compliance-heavy domain, you should capture audit logs and store them in your central logging/monitoring system.
How teams use audit logs
- Detect suspicious activity (unexpected template changes, unusual API usage spikes).
- Support compliance audits (who created or deleted content, and when).
- Support incident response timelines.
- Confirm operational workflows (did the automation run as expected?).
Even if you’re not “enterprise,” it’s smart to keep internal audit trails: which user triggered a video, what inputs were used, and what output ID was produced. That record is invaluable when customers ask “how was this generated?”
Rate limits & reliability
Rate limits protect the API and your account. They vary by tier/plan and operation type (read vs write). In general, video creation endpoints are “write operations” and are more likely to be limited than simple “read operations.” Always build a throttle layer in your backend.
Practical throttling strategy
- Token bucket per endpoint: keep independent rate limits for “create video” vs “retrieve video.”
- Queue writes: treat video creation as a queued job, not as a direct user request.
- Exponential backoff with jitter: on 429/5xx, retry gracefully.
- Idempotency keys: if supported, use them so retries don’t create duplicates.
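The per-endpoint token bucket from the first bullet can be sketched in a few lines. The capacities and refill rates below are placeholders, not documented Synthesia limits; set them from the limits that apply to your plan.

```javascript
// A minimal token bucket. Time is passed in explicitly so the logic is easy to test.
class TokenBucket {
  constructor({ capacity, refillPerSecond, nowMs = Date.now() }) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity;
    this.lastRefillMs = nowMs;
  }

  // Returns true and consumes a token if a request may be sent now.
  tryTake(nowMs = Date.now()) {
    const elapsedSeconds = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// Independent budgets per endpoint; the numbers are placeholders.
const buckets = {
  createVideo: new TokenBucket({ capacity: 5, refillPerSecond: 0.5 }),
  retrieveVideo: new TokenBucket({ capacity: 20, refillPerSecond: 5 })
};
```

A worker calls `buckets.createVideo.tryTake()` before each create request and leaves the job in the queue when it returns false. With several worker processes, the bucket state would need to live in shared storage rather than in memory.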
Asynchronous design: your reliability superpower
Asynchronous systems handle burst traffic elegantly. If 10,000 users sign up in an hour, your backend can enqueue 10,000 video jobs and process them steadily. Your UI stays responsive and users understand the video will arrive soon.
What to do on 429
- Respect Retry-After if provided.
- Slow down worker concurrency.
- Prioritize important videos (onboarding) over bulk jobs (marketing variants).
- Surface delayed status in your UI (“We’re processing a high volume…”).
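Respecting Retry-After means handling both forms the HTTP specification allows: a number of seconds or an HTTP date. The sketch below converts either into a wait time; the five-second fallback is an arbitrary default for responses that omit the header.

```javascript
// Converts a Retry-After header (delta-seconds or an HTTP date) into milliseconds to wait.
function retryAfterMs(headerValue, nowMs = Date.now(), fallbackMs = 5000) {
  if (headerValue == null) return fallbackMs;
  const text = String(headerValue).trim();
  if (text === "") return fallbackMs;
  if (/^\d+$/.test(text)) return Number(text) * 1000; // delta-seconds form
  const dateMs = Date.parse(text); // HTTP-date form
  if (Number.isNaN(dateMs)) return fallbackMs;
  return Math.max(0, dateMs - nowMs); // never return a negative wait
}
```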
What to do on 5xx
- Retry with backoff and jitter.
- Log response IDs and timestamps for support.
- Fall back to “notify when ready” rather than blocking user flow.
- Escalate after a threshold (pager, Slack alerts).
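Retrying with backoff and jitter can be sketched as two small helpers. This uses the "full jitter" variant, where each delay is drawn at random between zero and an exponentially growing cap; the base and maximum delays are illustrative defaults.

```javascript
// "Full jitter" backoff: a random delay between 0 and min(maxMs, baseMs * 2^attempt).
// `random` is injectable so tests can make the result deterministic.
function backoffDelayMs(attempt, { baseMs = 1000, maxMs = 60_000, random = Math.random } = {}) {
  const cap = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.floor(random() * cap);
}

// Only retry responses that are likely transient.
function isRetryableStatus(status) {
  return status === 429 || (status >= 500 && status <= 599);
}
```

Jitter matters most after an outage: without it, every queued worker retries at the same instant and recreates the spike that caused the failures.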
Production architecture (reference design)
Here is a reference blueprint for integrating the Synthesia API into a real product. It emphasizes security, reliability, and scalability.
Core components
- API service (backend): holds Synthesia API key, validates inputs, calls endpoints.
- Job queue: processes video creation requests asynchronously (e.g., SQS, RabbitMQ, Redis queue).
- Database: stores job records, video IDs, status, template versions, and audit trail.
- Webhook receiver: verifies signatures, dedupes, updates status, triggers downstream tasks.
- Notification system: emails, in-app notifications, or Slack messages when videos complete.
- Admin panel: selects templates, previews sample outputs, monitors failures.
Data model suggestion
| Table | Important fields | Notes |
|---|---|---|
| video_jobs | id, environment, template_id, input_json, input_hash, synthesia_video_id, status, attempts, last_error, created_at, updated_at | One row per requested video generation |
| assets | id, hash, synthesia_asset_id, type, status, created_at | Cache uploaded assets to avoid re-uploading |
| webhook_events | dedupe_key, type, synthesia_video_id, received_at, verified_at | Enforce uniqueness on dedupe_key |
| template_configs | workflow_name, template_id, version, schema_json, enabled | Controls which template powers each workflow |
Workflow diagram (simple)
User action
↓
Your backend API validates + enqueues job
↓
Worker calls Synthesia create-from-template
↓
DB stores synthesiaVideoId + status
↓
Synthesia processes video
↓
Synthesia calls your webhook (video.completed / video.failed)
↓
Webhook verifies signature + updates DB
↓
Notifier sends email / updates UI / triggers publish steps
Reliable webhooks with signature verification, dedupe, and safe retries remove the most common production failure modes: forged requests, duplicate processing, and lost status updates. Polling-only systems become fragile at scale and create unnecessary API traffic.
Pricing & cost control (how to think about spend)
Synthesia’s public pricing pages focus on Studio plans and seats rather than a simple “per-API-call meter.” In practice, API usage is typically associated with having the right plan and the right permissions in your workspace. Some integrations explicitly mention that a paid plan may be required for API access.
Where costs come from
- Seats and plan level: which plan your organization uses, and how many editors/admins.
- Video generation limits: plans often include a certain amount of video generation or credits.
- Localization volume: translating/dubbing across many locales multiplies output volume.
- Asset storage and operational overhead: not always billed directly, but impacts internal cost.
Cost control strategies that actually work
Template discipline
Keep templates tight and reusable. The fewer templates you maintain, the fewer accidental variants you generate. Version intentionally; avoid “generate and see” experimentation in production.
Generate only on intent
Don’t generate videos for every lead automatically if only 10% convert. Gate generation behind meaningful intent (trial activation, meeting booked, onboarding step reached).
Deduplicate requests
If a user resubmits the same form five times, you don’t want five identical videos. Use an input hash and reuse existing videos when inputs match.
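An input hash for this purpose needs a stable serialization, because two objects with the same fields in a different order should match. The sketch below is one way to build it; the function names are hypothetical, and it assumes inputs are plain JSON-compatible values.

```javascript
const crypto = require("node:crypto");

// Serializes with sorted keys so {a, b} and {b, a} produce the same string.
function canonicalJson(value) {
  if (Array.isArray(value)) return `[${value.map(canonicalJson).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const entries = Object.keys(value)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonicalJson(value[key])}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

// The template ID is part of the hash: the same inputs on a new template are a new video.
function inputHash(templateId, input) {
  return crypto.createHash("sha256").update(canonicalJson({ templateId, input })).digest("hex");
}
```

Before enqueueing a job, look up the hash in your job table and return the existing video when a completed or in-progress match is found.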
Batch and schedule bulk runs
Large campaigns (5,000 variants) should run in off-peak windows with controlled concurrency. This reduces rate limit issues and makes monitoring easier.
Transparency for your users
If your product exposes video generation to customers, give them clarity: show a queue position, estimated completion window (not a promise), and a clear explanation of what triggers cost (e.g., “each generated video counts toward your quota”).
Security, privacy, and enterprise considerations
Synthesia is positioned for business use, and their site highlights enterprise-grade security and compliance messaging. From an engineering perspective, your main job is to ensure that your integration does not accidentally leak personal data, secrets, or internal content.
Security checklist
- Secret storage: API keys belong in a secret manager, not in code or CI logs.
- PII minimization: only send what the template needs; avoid extra personal details.
- Access control: restrict which internal users can trigger bulk generation.
- Webhook verification: verify signatures and keep a strict allowlist of IPs if supported.
- Audit trail: store who triggered what, with input hashes and job metadata.
- Data retention: set retention windows for your job logs and intermediate assets.
Compliance and policy
If you operate in regulated industries (finance, healthcare, education), consult legal/compliance teams on how AI-generated video content must be labeled, stored, and audited. Even if Synthesia meets certain compliance standards, your usage patterns determine whether your overall system is compliant.
If you are personalizing videos with names, account details, or region-specific information, keep your template inputs minimal, and avoid placing sensitive content into video scripts unless you have a clear business need and proper consent.
Testing & troubleshooting
Most integration issues are not “API bugs.” They come from input validation, missing assets, webhook misconfiguration, or incorrect assumptions about async behavior. Here’s a practical checklist.
Common issues and fixes
| Problem | Likely cause | Fix |
|---|---|---|
| Videos stuck in “processing” | Webhook not configured, and no polling fallback in place | Register webhooks and add a timed retrieve fallback that fetches the video if no event arrives within a set window |
| Webhook handler gets called but nothing updates | Signature verification failing; handler returns non-2xx | Capture raw body, verify with correct secret, respond quickly |
| Template renders with broken layout | Input text too long; translations expanded | Enforce length limits, shorten copy, redesign template spacing |
| Asset reference fails | Asset not finished processing or wrong ID | Poll asset status, cache IDs, verify MIME types |
| 429 rate limits | Burst writes; no throttle | Add queue, throttle per endpoint, honor Retry-After |
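The timed retrieve fallback from the first row can be sketched as a polling loop with a deadline. The `get_status` callable is a hypothetical wrapper around your retrieve-video call, and the status names in `TERMINAL_STATUSES` are assumptions; use the values the API actually returns.

```python
import time

# Assumed status names; replace with the values your API responses use.
TERMINAL_STATUSES = {"complete", "failed"}


def wait_for_video(get_status, video_id, timeout_s=1800, interval_s=30,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll get_status(video_id) until a terminal status or the deadline.

    Returns the terminal status, or "timed_out" if the deadline passes.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status(video_id)
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            return "timed_out"
        sleep(interval_s)
```

Run this from a background worker only for jobs whose webhook has not arrived within the expected window, so polling stays the exception rather than the main path.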
Test environments
Use separate API keys and separate webhooks for staging. One workable approach: set up a staging workspace (or separate project) with test templates and test assets, then run scripted “smoke tests” that create a video, wait for the completion event, retrieve metadata, and validate output properties.
Monitoring
- Track “time to completion” (p50/p95) for video jobs.
- Track failure rate by template ID and by locale.
- Alert on webhook verification failures (it could be an attack or misconfiguration).
- Log request IDs and timestamps for vendor support.
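The first two metrics above can be computed from your own job records. The sketch below uses the nearest-rank percentile method and assumes each job is a dictionary with hypothetical `template_id`, `locale`, and `status` keys, with `"failed"` as the failure status.

```python
import math
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not values:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in 1..100")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def failure_rates(jobs):
    """Return {(template_id, locale): failed / total} for the given jobs."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for job in jobs:
        key = (job["template_id"], job["locale"])
        totals[key] += 1
        if job["status"] == "failed":  # assumed status name
            failures[key] += 1
    return {key: failures[key] / totals[key] for key in totals}
```

For example, `percentile(durations, 50)` and `percentile(durations, 95)` give the p50 and p95 completion times for a list of job durations in seconds.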
FAQs
Is Synthesia API free to use?
Usually, API access is tied to having the right Synthesia plan and permissions. Many teams use the API as part of a paid plan or workspace setup rather than through a public per-call free tier. Check your account’s plan and the official pricing page, and confirm API availability in your workspace settings.
What is the best way to generate thousands of videos?
Use templates + a job queue + webhooks. Put video generation into a background worker, throttle requests per endpoint, and store a job record for each video with input hashes so retries don’t create duplicates. Avoid doing it synchronously from a user request.
How do I make sure webhooks are safe?
Verify signatures, store a dedupe key, and keep the webhook handler fast. If the signature check fails, reject the request. Record events and enqueue downstream work rather than doing heavy operations inside the webhook request.
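The pattern described above (verify, dedupe, enqueue) might look like the following. This is a generic HMAC-SHA256 sketch, not Synthesia's documented scheme: the exact signed payload, header names, and encoding are assumptions, so follow the official "Verifying Synthesia signatures" page for the real format. `WebhookReceiver`, `enqueue`, and `event_id` are hypothetical names.

```python
import hashlib
import hmac


def verify_signature(raw_body: bytes, received_sig: str, secret: str) -> bool:
    """Check a hex HMAC-SHA256 of the raw body (assumed signature scheme)."""
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time to avoid leaking timing information.
    return hmac.compare_digest(expected.encode("utf-8"), received_sig.encode("utf-8"))


class WebhookReceiver:
    """Verifies, deduplicates, and enqueues events; returns an HTTP status."""

    def __init__(self, secret, enqueue):
        self._secret = secret
        self._enqueue = enqueue  # e.g. pushes the event onto a job queue
        self._seen = set()       # use a persistent store in production

    def handle(self, raw_body: bytes, received_sig: str, event_id: str) -> int:
        if not verify_signature(raw_body, received_sig, self._secret):
            return 401  # reject anything that fails verification
        if event_id in self._seen:
            return 200  # duplicate delivery: acknowledge, do nothing
        self._seen.add(event_id)
        self._enqueue(raw_body)  # heavy work happens in a worker, not here
        return 200
```

Note that verification uses the raw request bytes; re-serializing parsed JSON before hashing will usually change the digest.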
Can I translate videos automatically?
Yes, many teams use translation workflows with XLIFF: generate the source video, export XLIFF, translate, then upload the localized XLIFF back to update the video’s content. For best results, include QA and ensure templates can handle longer text.
Do I need to upload assets for every video?
Not necessarily. If your assets are shared (brand backgrounds, logos), upload once and reuse the asset IDs. If assets are unique per video (customer-specific logos), upload per customer but still deduplicate using hashes to avoid re-uploading identical files.
What’s the fastest way to reduce failures?
Enforce strict input validation (lengths, allowed characters, required fields), keep templates stable and versioned, and implement robust webhook handling with retries. Most “failures” in scaled systems come from inconsistent inputs rather than the API itself.
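A small validator run before every API call covers the three checks named above. The field names, limits, and default character pattern below are illustrative assumptions; derive the real ones from what your template can render.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldRule:
    max_length: int
    pattern: str = r"[^\r\n<>{}]+"  # assumed default: no line breaks or markup
    required: bool = True


def validate_inputs(inputs: dict, rules: dict) -> list:
    """Return a list of human-readable errors; an empty list means valid."""
    errors = []
    for name, rule in rules.items():
        value = inputs.get(name)
        if value is None or value == "":
            if rule.required:
                errors.append(f"{name}: required")
            continue
        if not isinstance(value, str):
            errors.append(f"{name}: must be a string")
            continue
        if len(value) > rule.max_length:
            errors.append(f"{name}: longer than {rule.max_length} characters")
        if not re.fullmatch(rule.pattern, value):
            errors.append(f"{name}: contains disallowed characters")
    for name in inputs:
        if name not in rules:
            errors.append(f"{name}: unexpected field")
    return errors
```

Rejecting unexpected fields catches typos in variable names early, before they turn into a video with an unfilled placeholder.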
How should I handle rate limits?
Throttle requests, especially write endpoints like video creation. Use a queue, backoff on 429 errors, and prioritize interactive user-facing videos over bulk campaign runs. Never let your UI directly trigger unlimited parallel API calls.
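Backing off on 429 responses can be sketched as below. The `send` callable is a hypothetical wrapper that performs one request and returns `(status, headers, body)`; the sketch honors a `Retry-After` header given in seconds and otherwise falls back to exponential backoff with jitter. Whether the API sends `Retry-After`, and in which form, is an assumption to verify against real responses.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay_s=1.0,
                      max_delay_s=60.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        retry_after = headers.get("Retry-After")
        if retry_after is not None and retry_after.isdigit():
            delay = float(retry_after)  # server-specified wait in seconds
        else:
            # Full jitter: random wait up to the exponential ceiling.
            delay = random.uniform(0, min(max_delay_s, base_delay_s * 2 ** attempt))
        sleep(min(delay, max_delay_s))
    raise RuntimeError("still rate limited after retries")
```

Combine this with a queue so that retries delay only the affected job rather than blocking interactive requests.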
Can I build a “video generator” feature inside my app?
Yes, but do it with a server-side integration. Your app UI should send the structured inputs to your backend, your backend calls Synthesia, and the UI displays job status and completion. Do not expose API keys in the browser.
Sources (official docs + helpful references)
Below are useful references you can bookmark. Some URLs are long; copy them directly from this section.
- Synthesia API docs (reference): https://docs.synthesia.io/reference/introduction
- API Quickstart: https://docs.synthesia.io/reference/synthesia-api-quickstart
- Create video from template (V2): https://docs.synthesia.io/reference/create-a-video-from-a-template
- List videos (V2): https://docs.synthesia.io/reference/list-videos
- Create an asset (Upload API V2): https://docs.synthesia.io/reference/create-an-asset
- Upload script audio (Upload API V2): https://docs.synthesia.io/reference/upload-script-audio
- Verifying Synthesia signatures: https://docs.synthesia.io/reference/verifying-synthesia-signatures
- Dubbing guide (uploading large files via temporary AWS credentials): https://docs.synthesia.io/reference/upload-large-files-via-temporary-aws-credentials
- Audit logs endpoint listing: https://docs.synthesia.io/reference/audit-logs
- Synthesia pricing: https://www.synthesia.io/pricing
© Agents API Hub