Overview

Flowise is a visual builder for LLM applications. You can design “chatflows” (flows composed of nodes like prompts, tools, retrievers, vector stores, memory, and model providers) and then deploy them so they can be used as chat assistants, embedded widgets, or API endpoints.

The Flowise API matters because it turns your flow into a programmable service. Instead of manually opening the Flowise UI, you can connect your flow to any product: a website chatbot, a mobile app, a helpdesk, a Slack bot, a CRM workflow, an internal tool, or a scheduled automation that runs nightly.

When people say “Flowise API,” they typically mean one of these:

1) Prediction API (the “call my flow” endpoint)

This is the primary runtime endpoint. You send a question (and optional session details), and Flowise returns an AI response generated through your flow’s nodes. This is the endpoint you use to build chat UIs.

2) Management APIs (CRUD + ops)

These include endpoints for listing chatflows, working with assistants, storing documents, upserting vectors, fetching history, handling attachments, and gathering feedback/leads.

Key idea
In Flowise, a “chatflow” is both your application logic and your runtime contract. The API is how you execute that contract from your own services.

Last updated: February 8, 2026. Always verify details in the official Flowise docs for your installed version and deployment mode.

Base URL & API versioning

Flowise exposes a REST API under an /api/v1 prefix in typical deployments. If you self-host Flowise at http://localhost:3000, your prediction endpoint for a chatflow usually looks like:

http://localhost:3000/api/v1/prediction/<CHATFLOW_ID>

In production, your base URL depends on where you host Flowise (a cloud VM, Kubernetes, Docker, reverse proxy, or managed hosting). Most endpoints follow a consistent style:

  • Runtime calls: send inputs to a flow and receive outputs (Prediction API).
  • List/get resources: chatflows, assistants, tools, document store configs, variables, etc.
  • Operational endpoints: ping/health and metrics for monitoring.

While the URL prefix is commonly /api/v1, the exact available routes can vary by Flowise version and configuration. Some instances may additionally restrict access via authorization features or reverse proxy rules.

Practical advice
Treat the Flowise server as a backend service. Keep its API behind your network perimeter when possible, or enable authentication and rate limiting when exposing it to the public internet.

Auth & security model

Flowise has multiple layers of authorization concepts, which can apply at the application level and the chatflow level. This matters because you may want:

  • Private management APIs (only admins can list or edit chatflows, variables, document stores, etc.)
  • Public runtime endpoints (end-users can talk to the bot, but only through a protected or scoped mechanism)
  • Per-chatflow access controls (one flow can be public, another can be internal-only)

Chatflow-level API keys

Flowise can require API keys at the chatflow level. In many setups, you manage API keys in the dashboard (often a “DefaultKey” is created, and you can add or delete keys). When enabled, clients must include a bearer token or an API key depending on your configuration.

Bearer authentication in the API reference

Flowise’s API reference pages commonly describe bearer auth (an Authorization: Bearer <token> header) for many endpoints. This is especially relevant for management endpoints such as chatflow listing, assistants, tools, and other administrative resources. In other words:

  • Prediction: may be open or may require an API key depending on how your chatflow is configured.
  • Management APIs: generally should be protected and are typically documented with bearer auth.

App-level authorization (protecting the Flowise instance)

Flowise also supports protecting the instance itself (so the UI and endpoints are not publicly accessible). In production, you should assume that any instance exposed without authentication is a risk, especially if your flows include tool nodes that can reach internal systems or if you allow file uploads.

Golden rule
Never expose powerful management endpoints to the public internet without authentication, rate limiting, and network controls. If you need a public chatbot, expose a narrow runtime surface (Prediction API) and keep admin routes private.

Recommended security controls

Network controls

Put Flowise behind a reverse proxy, VPN, firewall rules, or private network. Limit inbound traffic to known systems.

API authentication

Enable bearer or API key auth for endpoints you do not want public. Rotate keys and store them securely.

Rate limiting

Throttle /api/v1/prediction to prevent abuse. Use per-IP or per-user limits based on your app’s identity model.
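One way to implement that throttling in a gateway is a per-client token bucket. The sketch below is gateway-side illustration, not a Flowise feature; the rate and burst values are placeholders you would tune to your traffic.

```python
# Sketch: a minimal per-client token bucket for throttling prediction calls.
import time
from collections import defaultdict

class TokenBucket:
    def __init__(self, rate: float, capacity: int):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.state = defaultdict(lambda: (float(capacity), time.monotonic()))

    def allow(self, client_id: str) -> bool:
        tokens, last = self.state[client_id]
        now = time.monotonic()
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.state[client_id] = (tokens - 1, now)
            return True
        self.state[client_id] = (tokens, now)
        return False

bucket = TokenBucket(rate=1.0, capacity=3)  # ~1 request/sec, bursts of 3
results = [bucket.allow("203.0.113.7") for _ in range(5)]
print(results)  # burst of 3 allowed, then throttled
```

Keying the bucket by authenticated user ID instead of IP gives fairer limits once your gateway knows who the caller is.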

Input validation

Validate request shape, maximum input length, and allowed file types for attachments. Log and reject suspicious payloads.
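A request validator can sit in the same gateway. The field names below (question, chatId) follow the common request shape discussed later in this guide; the length limit is a policy choice, not a Flowise default.

```python
# Sketch: validating a prediction request body before forwarding it to Flowise.
MAX_QUESTION_LEN = 8000  # policy choice, not a Flowise limit

def validate_prediction_body(body: dict) -> tuple[bool, str]:
    question = body.get("question")
    if not isinstance(question, str) or not question.strip():
        return False, "question must be a non-empty string"
    if len(question) > MAX_QUESTION_LEN:
        return False, f"question exceeds {MAX_QUESTION_LEN} characters"
    chat_id = body.get("chatId")
    if chat_id is not None and not isinstance(chat_id, str):
        return False, "chatId must be a string when present"
    return True, "ok"

print(validate_prediction_body({"question": "Hi"}))  # (True, 'ok')
print(validate_prediction_body({"question": ""}))    # rejected
```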

Prediction API

The Prediction API is the “runtime” endpoint for Flowise. You send a request to a specific chatflow ID and Flowise returns the generated output after running the flow’s nodes (LLM calls, retrieval, tools, memory, etc.). Official docs describe it as the primary endpoint for interacting with flows and assistants, and it supports core chat interactions and streaming responses.

Endpoint

POST /api/v1/prediction/{id}

Here, {id} is the chatflow ID (or flow ID) you want to run. In the Flowise UI, you can typically click an “API” button or open the “Use as API” panel for a chatflow to copy the endpoint.

Typical request shape

Flowise requests are commonly JSON. The minimal request often includes a question or message (frequently called question). Many integrations also send a session ID or chat ID so Flowise can apply short-term memory and return consistent multi-turn conversations. Depending on your nodes and configuration, you can also send additional parameters for memory, overrides, files, or context.

{
  "question": "What is Flowise and how does the Prediction API work?",
  "chatId": "chat_01HZZZEXAMPLE",
  "overrideConfig": {
    "vars": { "tone": "concise", "audience": "developers" }
  }
}

Note: exact fields may vary by version and by chatflow settings. The pattern stays consistent: question is your input, chat/session identifiers anchor conversation state, and override config can tweak behavior at runtime.

Streaming responses

Flowise supports streaming in many setups so your UI can render tokens as they arrive. Streaming details (transport, headers, and client implementation) depend on your version and deployment. For production chat UIs, streaming improves perceived speed: users see the assistant “typing,” which reduces abandonment.
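As a client-side illustration, the sketch below parses server-sent-event lines into tokens for incremental rendering. SSE is a common transport for this kind of streaming, but the exact event payload shape shown here ("data: {...}" with an event field) is an assumption; confirm the wire format your Flowise version emits before relying on it.

```python
# Sketch: turning SSE-style lines into text tokens for a "typing" UI.
# The event shape is illustrative, not a guaranteed Flowise format.
import json

def extract_tokens(sse_lines):
    """Yield token strings from lines like: data: {"event": "token", "data": "Hi"}"""
    for line in sse_lines:
        if not line.startswith("data:"):
            continue
        try:
            event = json.loads(line[len("data:"):].strip())
        except json.JSONDecodeError:
            continue  # skip keep-alives or malformed lines
        if event.get("event") == "token":
            yield event.get("data", "")

sample = [
    'data: {"event": "token", "data": "Flow"}',
    'data: {"event": "token", "data": "ise"}',
    'data: {"event": "end"}',
]
print("".join(extract_tokens(sample)))  # Flowise
```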

Override configuration

Override configuration lets you adjust certain runtime settings without creating a new chatflow. This is useful for:

  • Changing a prompt variable (tone, language, persona) per request
  • Injecting user profile data (“Plan: Pro”, “Region: EU”, “Role: Support Agent”) so a single flow can serve many contexts
  • Toggling tool behavior (for example, disabling a tool for unauthenticated users)
  • Routing retrieval behavior (like switching a document store collection or namespace)

Design tip
Use one stable chatflow and pass “who is calling” context via variables or metadata. That’s easier to maintain than cloning flows for every customer, environment, or product tier.
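The "who is calling" idea can be sketched as a small payload builder that derives overrideConfig variables from caller context. The variable names (plan, region, tone) are hypothetical; whether a given variable is actually overridable depends on your chatflow's configuration.

```python
# Sketch: building a per-request overrideConfig from caller context so a
# single chatflow can serve many audiences. Variable names are illustrative.
def build_payload(question: str, user: dict) -> dict:
    return {
        "question": question,
        "chatId": f"chat_{user['id']}",
        "overrideConfig": {
            "vars": {
                "plan": user.get("plan", "free"),
                "region": user.get("region", "US"),
                "tone": "concise" if user.get("plan") == "pro" else "friendly",
            }
        },
    }

payload = build_payload(
    "What does my plan include?",
    {"id": "u42", "plan": "pro", "region": "EU"},
)
print(payload["overrideConfig"]["vars"])
```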

Response shape

Responses usually include the assistant’s text and may include additional fields like sources, intermediate steps, or metadata, depending on your flow. If your flow includes retrieval, you may see citations, documents, or chunk information. If your flow includes tool calls, you may see tool results.

{
  "text": "Flowise exposes a REST Prediction API that runs your chatflow and returns an AI response…",
  "chatId": "chat_01HZZZEXAMPLE",
  "sourceDocuments": [
    { "pageContent": "…", "metadata": { "source": "kb/article-12" } }
  ]
}

Treat the response as a structured payload. Even if your UI only needs text, store the full response when feasible (subject to privacy rules). This helps debugging, evaluation, and quality iteration.
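Since only the text field can be assumed to be present, a defensive parser helps. The sketch below treats everything beyond text as optional, matching the sample response above.

```python
# Sketch: defensively reading a prediction response. Only "text" is assumed
# to be common; sources and other fields depend on the flow's configuration.
def parse_prediction(resp: dict) -> dict:
    sources = [
        d.get("metadata", {}).get("source", "unknown")
        for d in resp.get("sourceDocuments", [])
    ]
    return {"text": resp.get("text", ""), "sources": sources, "raw": resp}

resp = {
    "text": "Flowise exposes a REST Prediction API…",
    "sourceDocuments": [{"pageContent": "…", "metadata": {"source": "kb/article-12"}}],
}
print(parse_prediction(resp)["sources"])  # ['kb/article-12']
```

Keeping the raw payload alongside the extracted fields preserves everything for debugging and evaluation, per the storage advice above.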

Chatflows API

The Chatflows API is used for listing and retrieving chatflows programmatically. In self-hosted environments, this can power internal dashboards, automation scripts, and CI/CD workflows that validate or export flows.

Common endpoints

  • List chatflows: GET /api/v1/chatflows
    Retrieve available chatflows in the instance/workspace.
  • Get chatflow: GET /api/v1/chatflows/{id}
    Fetch one chatflow by ID (details depend on auth).
  • Get by API key: GET /api/v1/chatflows/apikey/{apikey}
    Retrieve a chatflow using an API key (useful when you only have the key).

In the official API reference, these routes are typically documented with bearer authentication, meaning you should assume they are administrative and should not be exposed publicly.

Why the “get by API key” route matters

Some embedding patterns use a chatflow API key as an access token. In that model, a client can present the API key, and Flowise can resolve it back to a chatflow. This can simplify “drop-in widget” integration in controlled environments. However, you still should treat the key as a credential and keep it out of public source code when possible.

Production recommendation
If you’re building a public-facing app, consider placing your own backend in front of Flowise. Your backend can authenticate end users, enforce quotas, and forward requests to Flowise prediction endpoints safely.

Assistants API

Flowise includes “Assistants” concepts in its API reference. Assistants can represent higher-level chat entities, often tied to configuration, knowledge sources, tools, and runtime behavior. Depending on your Flowise version, assistants may overlap with chatflows or act as a separate abstraction layer.

Typical reasons you might use Assistants endpoints:

  • Build an internal admin portal that lists available assistants
  • Provision assistants or fetch their configuration for a deployment pipeline
  • Bind assistants to specific document stores or tools in a controlled environment

Because assistant behavior can vary by version, always confirm the fields returned by your Flowise instance. Use the official API reference pages and your own test calls to lock down the contract you depend on.

Chat Message API

Chat experiences are not just “prompt in, text out.” Real products require session management, message history, conversation metadata, and tools for reviewing or exporting logs. That’s where Chat Message endpoints are useful.

Common use cases include:

  • Show a “conversation history” screen in your app
  • Allow support teams to review transcripts for quality and escalation
  • Export conversation logs for evaluation and safety review
  • Implement “continue conversation” UX across devices

In many architectures, you will store conversations in your own database as the source of truth, while Flowise handles short-term memory inside the flow. If you do rely on Flowise chat message storage, treat it as a service dependency and build a fallback for data retention requirements.

Design choice
If you need strict compliance or custom analytics, store chat transcripts in your system. Use Flowise memory for runtime quality, and your DB for audits, reporting, and user-visible history.
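A minimal version of that design choice looks like the sketch below, which records each turn in your own database. SQLite stands in for your real store; the schema is illustrative only and says nothing about how Flowise stores messages internally.

```python
# Sketch: keeping transcripts in your own database as the source of truth.
# SQLite and this schema are placeholders for your real storage layer.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE messages (
    chat_id TEXT, role TEXT, content TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP)""")

def record_turn(chat_id: str, question: str, answer: str) -> None:
    conn.execute(
        "INSERT INTO messages (chat_id, role, content) VALUES (?, 'user', ?)",
        (chat_id, question),
    )
    conn.execute(
        "INSERT INTO messages (chat_id, role, content) VALUES (?, 'assistant', ?)",
        (chat_id, answer),
    )
    conn.commit()

record_turn("chat_01", "What is Flowise?", "A visual builder for LLM apps.")
rows = conn.execute(
    "SELECT role, content FROM messages WHERE chat_id = ?", ("chat_01",)
).fetchall()
print(rows)
```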

Document Store API

Retrieval-augmented generation (RAG) is a core reason people choose Flowise: you can build chatflows that search documents and ground answers in your knowledge base. Flowise exposes a Document Store API category that supports:

  • Creating or managing document stores/collections
  • Previewing and processing documents via loaders
  • Chunking, embedding, and preparing content for retrieval
  • Integrating file sources (local files, URLs, cloud storage, etc.) depending on loaders

Loader preview/process pattern

Flowise often uses a two-step model for ingestion:

  1. Preview: parse a file or source and show how it will be chunked
  2. Process: actually chunk/embed/store the content for retrieval

This is important for product UX: a preview step lets you show “what will be ingested” before burning compute or storing data. In an admin portal, you can let users adjust chunk sizes, overlap, or filters, then run process after confirmation.

Document ingestion is one of the most version-dependent areas, because available loaders and their parameters can evolve. Always test your instance and pin behavior in your integration layer.
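The preview step can be approximated client-side so users see chunk boundaries before committing. The chunk-size and overlap semantics below mirror common RAG defaults; Flowise's own splitters may behave differently depending on the loader and version.

```python
# Sketch: a preview-style chunker so users can inspect how text will be
# split before running the (costlier) process step.
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap  # step forward, keeping `overlap` chars of context
    return chunks

doc = "Flowise supports retrieval-augmented generation. " * 5
preview = chunk_text(doc, size=80, overlap=10)
print(len(preview), "chunks; first chunk:", repr(preview[0][:40]))
```

Surfacing this preview in an admin portal lets users tune size and overlap before any embedding compute is spent.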

Vector Upsert API

Many RAG systems need a direct “upsert vectors” capability—either to ingest embeddings computed elsewhere or to push records into a vector store quickly. Flowise exposes a Vector Upsert category in its API reference, which is typically used to:

  • Insert or update vectors in a configured vector store
  • Associate metadata with vectors for filtering (source, doc ID, tags)
  • Support incremental ingestion pipelines (new docs daily, updated docs on change)

If your enterprise already has an embedding pipeline, Vector Upsert endpoints can let Flowise “consume” that data without forcing you to rebuild ingestion inside Flowise.

Two common strategies
Strategy A: Use Flowise loaders and let Flowise compute embeddings and store vectors.
Strategy B: Compute embeddings in your own stack and push them via Vector Upsert (more control, more engineering).
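Strategy B can be sketched as shaping externally computed embeddings into an upsert payload. The field names here (vectors, id, embedding, metadata) are illustrative; check your Flowise version's Vector Upsert reference for the exact schema it expects.

```python
# Sketch of Strategy B: packaging embeddings computed elsewhere for upsert.
# The payload schema is an assumption, not a documented Flowise contract.
def build_upsert_payload(records: list[dict]) -> dict:
    return {
        "vectors": [
            {
                "id": r["doc_id"],
                "embedding": r["embedding"],
                "metadata": {"source": r["source"], "tags": r.get("tags", [])},
            }
            for r in records
        ]
    }

records = [{
    "doc_id": "kb-12",
    "embedding": [0.1, 0.2, 0.3],  # produced by your own embedding pipeline
    "source": "kb/article-12",
    "tags": ["faq"],
}]
payload = build_upsert_payload(records)
print(payload["vectors"][0]["id"])  # kb-12
```

Attaching source and tag metadata at upsert time is what later enables filtered retrieval (per document, per customer, per collection).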

Variables API

Variables endpoints allow you to manage configuration values used across flows—think environment variables, global settings, or runtime config that you don’t want hardcoded into each node. This supports:

  • Keeping API provider keys out of flow definitions (store keys securely and reference them)
  • Separating staging and production environments (different variable sets)
  • Updating a value once and having it apply to many chatflows

The exact variable model depends on Flowise version. In some setups, credentials are stored as dedicated “Credentials” objects rather than simple variables. Either way, the goal is the same: do not bake secrets into your flow JSON if you can avoid it.

Why variables matter for scaling

Teams often start with a single flow and one environment. As you scale, you need:

  • Separate environments (dev, staging, prod)
  • Multiple model providers (OpenAI, Azure, Anthropic, local models) across different customers
  • Rotating provider keys without editing flows
  • Feature flags (toggle tool access, enable/disable expensive retrieval)

Variables and credentials are the backbone for doing this without turning Flowise into a manual configuration nightmare.

Tools API

Tool calling is a major capability for modern agentic workflows: your flow can decide to call a “tool” to fetch information or perform actions. Flowise includes a Tools category in its API reference, which can support:

  • Listing available tools configured on the instance
  • Managing tool definitions and metadata
  • Allowing admins to review what tools exist and which flows can call them

In products, tools are high-risk/high-reward. They make assistants dramatically more useful (fetch orders, check inventory, create tickets), but they also expand the blast radius of misuse. Secure tool endpoints with:

  • Strong authentication and authorization
  • Least privilege tool permissions (don’t let a chatbot call admin-only tools)
  • Rate limits and audit logs
  • Validation and safety checks before executing actions
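The least-privilege rule above can be enforced with a simple allowlist check in your gateway before a caller is permitted to reach a tool-enabled flow. The role and tool names below are hypothetical.

```python
# Sketch: a least-privilege tool allowlist applied at the gateway layer.
# Role and tool names are placeholders for your own identity model.
ALLOWED_TOOLS = {
    "public_chatbot": {"search_kb", "check_order_status"},
    "support_agent": {"search_kb", "check_order_status", "create_ticket"},
}

def tool_allowed(role: str, tool: str) -> bool:
    return tool in ALLOWED_TOOLS.get(role, set())

print(tool_allowed("public_chatbot", "create_ticket"))  # False
print(tool_allowed("support_agent", "create_ticket"))   # True
```

Denying by default (unknown roles get an empty set) keeps a misconfigured caller from inheriting any tool access.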

Attachments API

Attachments endpoints enable file uploads and attachment handling for assistants. Common product needs include:

  • Users upload PDFs/images and ask the assistant to summarize or answer questions about them
  • Users attach documents to be ingested into a document store
  • Support agents attach screenshots to get troubleshooting guidance

File handling is one of the most security-sensitive areas of any AI system. Even if Flowise provides a convenient attachments route, you should still implement guardrails:

  • Restrict allowed file types and maximum file size
  • Scan uploads if your environment requires it
  • Store files in a hardened bucket with restricted access
  • Prevent path traversal and suspicious filenames at your perimeter
  • Keep the endpoint private unless you have a strong reason to expose it

Security reminder
Publicly exposed file upload routes are common targets. If you don’t need public attachments, disable or firewall those routes and require authentication for any upload feature.
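Several of those guardrails (type, size, path checks) can live in one perimeter function before a file ever reaches an attachments route. The limits and allowed types below are policy choices, not Flowise defaults.

```python
# Sketch: perimeter checks for uploads before they reach an attachments
# route. Extensions and size limit are illustrative policy choices.
import os

ALLOWED_EXTENSIONS = {".pdf", ".txt", ".png", ".jpg"}
MAX_BYTES = 10 * 1024 * 1024  # 10 MB

def upload_allowed(filename: str, size_bytes: int) -> tuple[bool, str]:
    name = os.path.basename(filename)  # strip any path components
    if name != filename or ".." in filename:
        return False, "suspicious path in filename"
    ext = os.path.splitext(name)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        return False, f"file type {ext or '(none)'} not allowed"
    if size_bytes > MAX_BYTES:
        return False, "file too large"
    return True, "ok"

print(upload_allowed("report.pdf", 1024))      # (True, 'ok')
print(upload_allowed("../../etc/passwd", 10))  # rejected
```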

Feedback & Leads APIs

Flowise includes categories for Feedback and Leads. These can be used to collect:

  • User satisfaction signals (thumbs up/down, ratings, free-form comments)
  • Lead capture information (email, name, company) associated with a chatbot conversation
  • Product analytics for improving flow quality over time

Feedback endpoints become very valuable when you deploy Flowise-based assistants to real users. Without feedback loops, quality improvements are guesswork. With feedback loops, you can:

  • Identify which topics fail most often
  • Find missing documents in your knowledge base
  • Detect hallucination patterns and adjust retrieval or prompts
  • Measure improvements as you iterate

If you already use analytics or CRM systems, you can route Flowise feedback/leads into your existing pipeline. For compliance, only collect what you need and disclose collection clearly in your UI.

Ping & health endpoints

The API reference includes a Ping category. Health checks are essential for production:

  • Load balancers need a quick “is it up?” endpoint
  • Kubernetes readiness/liveness probes need stable checks
  • Monitoring needs an endpoint that can be called frequently without cost

A good health strategy includes:

  • Shallow health: server process is up and can respond (ping endpoint)
  • Deep health: server can reach critical dependencies (DB, vector store, model provider)

If Flowise does not provide deep health out of the box, implement deep checks in your own backend and alert on failures.
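One way to implement those deep checks in your own backend is an aggregator over named check callables, where any raised exception counts as a failure. The dependency names below are placeholders for your real probes (a ping request, a vector store query, a provider call).

```python
# Sketch: aggregating shallow and deep health checks in your own backend.
# Each check is a callable returning truthy/falsy; names are placeholders.
def health_report(checks: dict) -> dict:
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False  # a failing probe marks the dependency down
    results["healthy"] = all(results.values())
    return results

def failing_check():
    raise TimeoutError("provider unreachable")  # simulated outage

checks = {
    "flowise_ping": lambda: True,   # e.g. GET /api/v1/ping succeeded
    "vector_store": lambda: True,
    "model_provider": failing_check,
}
report = health_report(checks)
print(report)
```

Exposing this report on your own /healthz route gives load balancers the shallow signal and your alerting the deep one.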

Monitoring and metrics

Modern deployments need metrics for latency, error rates, and throughput. Flowise supports a metrics endpoint (commonly /api/v1/metrics) intended for Prometheus scraping and observability workflows. In official monitoring docs, the metrics endpoint is described as requiring API key authentication.

What to measure

  • Prediction latency: end-to-end time for /prediction requests
  • Error rate: 4xx vs 5xx breakdown, plus upstream model provider failures
  • Traffic volume: requests per minute by chatflow ID
  • Tool calls: which tools are used, how often, and success rate
  • RAG quality indicators: retrieval hits, chunk counts, top sources

Even if you do not scrape Flowise metrics, you should instrument your own gateway or backend that calls Flowise so you can observe user-facing performance and cost drivers.

Operational tip
Put an API gateway in front of Flowise. Gateways make it easier to enforce auth, rate limits, request logging, and metrics—even if Flowise is running on a single server.

Production architecture patterns

You can integrate Flowise in many ways, from a simple “frontend calls prediction directly” to a robust enterprise architecture. The right design depends on whether your users are internal, external, or multi-tenant.

Pattern A: Direct-to-Flowise (fastest prototype)

Your frontend sends requests directly to /api/v1/prediction/{chatflowId}. This can work for demos or internal tools behind a VPN. But it becomes risky for public apps because you might expose keys, and you have limited control over quotas, abuse prevention, and user-level auth.

Pattern B: Backend gateway (recommended for production)

Your frontend calls your backend (authenticated as your user), and your backend calls Flowise. Benefits:

  • Keep Flowise keys server-side
  • Enforce per-user quotas and rate limits
  • Standardize logging and analytics
  • Mask internal error details and return consistent UX
  • Route to different chatflows based on user role or plan tier

Pattern C: Queue + workers (for heavy jobs)

If some flows are long-running (research, multi-step agents, large document processing), a synchronous request/response may time out. In that case:

  1. Backend receives request and creates a job record
  2. Enqueue a worker job to call Flowise
  3. Worker calls prediction and stores result
  4. UI polls your backend (or uses websockets) for completion

Why queues help
Queues protect your system from traffic spikes, simplify retries, and make throughput controllable. They also improve UX by letting users continue other work while the assistant finishes.
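The four steps above can be sketched with a stdlib in-process queue and a worker thread. In production you would use a durable queue (Redis, SQS, etc.); call_flowise here is a stand-in for the real prediction request.

```python
# Sketch: the queue-and-worker pattern with stdlib threading. call_flowise
# is a placeholder for the real HTTP call to the prediction endpoint.
import queue
import threading

jobs: "queue.Queue[dict]" = queue.Queue()
results: dict = {}

def call_flowise(question: str) -> str:
    return f"(answer to: {question})"  # stand-in for the real HTTP call

def worker():
    while True:
        job = jobs.get()
        if job is None:  # sentinel: shut down
            break
        results[job["job_id"]] = call_flowise(job["question"])  # step 3
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

jobs.put({"job_id": "job-1", "question": "What is Flowise?"})  # steps 1-2
jobs.join()                 # wait until the worker stores the result
print(results["job-1"])     # step 4: the UI would poll your backend for this
jobs.put(None)              # stop the worker
```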

Examples (cURL, Node.js, Python)

cURL: call a chatflow via Prediction API

curl -X POST "http://localhost:3000/api/v1/prediction/<CHATFLOW_ID>" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "Explain Flowise Prediction API in simple terms",
    "chatId": "chat_demo_001"
  }'

cURL: list chatflows (admin / protected)

curl "http://localhost:3000/api/v1/chatflows" \
  -H "Authorization: Bearer <YOUR_TOKEN>"

Node.js: simple server-side gateway to Flowise

// Node 18+ (built-in fetch). Put this in your backend, not in the browser.
// Example: POST /api/chat  -> forwards to Flowise prediction endpoint.

import http from "node:http";

const FLOWISE_BASE = process.env.FLOWISE_BASE_URL || "http://localhost:3000";
const CHATFLOW_ID = process.env.FLOWISE_CHATFLOW_ID;

function readJson(req) {
  return new Promise((resolve, reject) => {
    let buf = "";
    req.on("data", (c) => (buf += c));
    req.on("end", () => {
      try { resolve(JSON.parse(buf || "{}")); }
      catch (e) { reject(e); }
    });
  });
}

const server = http.createServer(async (req, res) => {
  if (req.method === "POST" && req.url === "/api/chat") {
    try {
      const body = await readJson(req);
      const question = String(body.question || "").slice(0, 8000);

      const flowiseRes = await fetch(`${FLOWISE_BASE}/api/v1/prediction/${CHATFLOW_ID}`, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({
          question,
          chatId: body.chatId || `chat_${Date.now()}`
        })
      });

      const text = await flowiseRes.text();
      res.writeHead(flowiseRes.status, { "Content-Type": "application/json" });
      res.end(text);
    } catch (err) {
      res.writeHead(400, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ error: "Invalid request", detail: String(err) }));
    }
    return;
  }

  res.writeHead(404, { "Content-Type": "application/json" });
  res.end(JSON.stringify({ error: "Not found" }));
});

server.listen(8080, () => console.log("Gateway listening on http://localhost:8080"));

Python: call prediction endpoint

import requests

BASE = "http://localhost:3000"
CHATFLOW_ID = "YOUR_CHATFLOW_ID"

payload = {
  "question": "Give me a short checklist for securing a Flowise instance",
  "chatId": "py_demo_001"
}

r = requests.post(f"{BASE}/api/v1/prediction/{CHATFLOW_ID}", json=payload, timeout=60)
r.raise_for_status()
print(r.json())

For protected endpoints, include whatever authentication your instance requires (bearer token, API key, or reverse proxy auth).

Security notes for real deployments

AI workflow servers often start life as internal prototypes and then get exposed publicly to “just make it work.” That’s risky. Flowise can connect to tools, ingest files, and call model providers—so it should be treated like a production backend.

Hardening checklist

  • Keep admin routes private. Restrict Chatflows, Tools, Variables, Document Store management endpoints to trusted networks.
  • Require auth for prediction if your use case is not public and anonymous.
  • Rate limit prediction and upload endpoints.
  • Log requests with privacy-aware redaction (don’t log secrets or full PII if you can avoid it).
  • Patch regularly. Track Flowise security advisories and upgrade on a schedule.
  • Disable unused capabilities. If you don’t need uploads or certain tool routes, block them at the proxy/firewall layer.

Deployment safety

Use HTTPS everywhere. Place Flowise behind a reverse proxy (Nginx, Caddy, Traefik, Cloudflare) and configure:

  • Strict transport security
  • Request size limits (especially for file routes)
  • WAF rules or bot protection for public endpoints
  • Separate staging and production instances

Principle of least privilege
Give each flow only the tools and data it needs. If a public chatbot doesn’t need access to internal tickets or payment systems, don’t connect those tools to that flow.

FAQs

What is the Flowise API in one sentence?

It’s a REST API (commonly under /api/v1) that lets you run Flowise chatflows via the Prediction endpoint and manage related resources like chatflows, assistants, documents, tools, variables, attachments, and feedback.

Which endpoint do I use to “chat” with my Flowise bot?

Use the Prediction API: POST /api/v1/prediction/{chatflowId}. Send a JSON body containing your message (often question) and optional chat/session identifiers.

Do I need authentication to call the Prediction API?

It depends on your chatflow and instance configuration. Many teams keep prediction open for internal networks, and require an API key or bearer auth when exposing it publicly. Always verify the security settings of your deployment.

How do I find the chatflow ID?

Open the chatflow in the Flowise UI and click the “API” or “Use as API” option. The endpoint displayed includes the chatflow ID. In many setups, it looks like /api/v1/prediction/<id>.

What is the Chatflows API used for?

It’s typically used for administrative tasks: listing chatflows, retrieving flow definitions, and in some cases resolving a chatflow via an API key. These endpoints are usually protected and not meant for anonymous public users.

Should my frontend call Flowise directly?

For prototypes behind a private network, it can be fine. For production public apps, it’s better to add a backend gateway. Your gateway authenticates users, enforces quotas, and forwards requests to Flowise securely.

How do I build multi-tenant apps with Flowise?

Common approaches are: (1) one Flowise instance per tenant (strong isolation), or (2) one shared instance where your backend decides which chatflow to call per tenant. Store tenant context in your system and pass it into Flowise via variables or overrides, while enforcing strict access control.

Can Flowise do RAG (document Q&A) through the API?

Yes. Build a chatflow with retrieval nodes and a document store/vector store behind it. Then call the prediction endpoint. The assistant response can include sources depending on your flow configuration.

What’s the difference between Document Store API and Vector Upsert API?

Document Store endpoints are oriented around ingestion workflows (load/preview/process documents). Vector Upsert is oriented around pushing vectors (embeddings + metadata) directly into a configured vector store, often for incremental or external ingestion pipelines.

What are “Variables” in Flowise?

They’re reusable configuration values used across flows—useful for environment separation, provider configuration, and avoiding hardcoding. Some versions also use dedicated credentials objects for secrets.

How do I monitor Flowise in production?

Use health checks (Ping) and metrics (Prometheus scraping, commonly /api/v1/metrics with authentication). Also instrument your own gateway layer so you can track end-user latency, error rates, and usage per chatflow.

What’s the safest way to expose a public chatbot powered by Flowise?

Expose only a narrow runtime endpoint (prediction) through your own backend gateway, require user authentication when appropriate, apply rate limiting, restrict tools, and keep all admin/management routes private.

Official docs (recommended references)

Consult the official Flowise documentation (API reference, security, and deployment guides) to confirm endpoint details for your version.

Build note
This page is a comprehensive, developer-oriented explanation and architecture guide. Your exact request/response fields can vary by Flowise version and node configuration—always test your own instance.

Final “ship it” checklist

  • ✅ Decide whether clients call Flowise directly or via your backend gateway
  • ✅ Enable authentication for any route that should not be public
  • ✅ Add rate limiting and request size limits (especially for uploads)
  • ✅ Instrument logs and metrics (latency, errors, throughput per chatflow)
  • ✅ Store chat transcripts where your compliance requirements demand it
  • ✅ Keep admin routes private; expose only what you must
  • ✅ Patch Flowise regularly and monitor security advisories