What is LangSmith, and why does pricing feel “different” from a typical API?
Many pricing pages in AI look like the classic developer meter: a simple table of per-token rates, a few model tiers,
and maybe a couple of add-ons. LangSmith is different because it is not a model provider. It is an “agent engineering platform”
focused on observing and improving your LLM applications. When you use it well, it becomes part of your reliability loop:
you instrument your application, inspect trace trees, run evaluations, build datasets, compare prompt and tool changes, monitor production drift,
and (for some teams) deploy agents in a managed way.
That means a good pricing calculator must consider usage patterns that do not show up in model bills:
how many “runs” you produce (traces), how long you retain them, whether certain classes of traces are automatically upgraded to longer retention,
and whether you are using hosted deployments that come with their own run and uptime meters.
The purpose of this page is to help you estimate those costs with clear assumptions and to give you enough context to
choose the right plan and architecture. If you are a solo builder experimenting locally, your costs may be near zero and the main question is
“How do I stay within included traces?” If you are a team shipping a customer-facing agent, the questions become:
“How do we avoid sending every dev run to production tracing?”, “Which traces should we keep for 400 days?”, and
“How do we set usage limits so a runaway loop doesn’t blow up our bill?”
Think of LangSmith pricing like observability pricing, not like model pricing.
Your model costs are driven by tokens. Your LangSmith costs are driven by trace volume, retention, and deployment execution.
The “best” configuration is the one that gives you enough visibility to ship reliable agents without paying to store noise.
How LangSmith pricing works
LangSmith pricing typically breaks into a few buckets. You can summarize them as:
Seats
Traces
Retention (base vs extended)
Deployments (runs + uptime)
Optional product features / enterprise terms
The public plan descriptions usually highlight the first three: plan seat cost, included traces per month, and pay-as-you-go overages.
The billing documentation highlights deployment runs and clarifies what constitutes a billable “deployment run.”
1) Seats
Seats are the simplest part. If you are on a plan that charges per user, your base subscription cost is
seat count × price per seat. In the public plan snapshot, the Plus plan is shown at a per-seat monthly rate,
while the Developer plan is shown as $0 per seat per month for a single seat (solo use).
Practically, seats matter because they anchor your “minimum bill.” Even if you send only a small number of traces,
a team plan may still have a baseline monthly seat charge. This is common in developer tools because it funds support,
collaboration features, and the product surface around traces (dashboards, projects, workspaces, and evaluation workflows).
2) Traces (base traces)
A trace is the recorded story of a run. In a simple chain, it might contain a prompt, an LLM call, and the final result.
In an agent, it can contain steps, tool calls, intermediate model calls, retrieval, and more. Traces are a core unit because
they let you debug and evaluate behavior. When something goes wrong, the trace tree tells you where.
Plans include a number of base traces per month. After you exceed that included amount, base trace overages are billed at a per-1,000-trace rate.
The commonly surfaced public overage figure is “starting at $0.50 per 1k base traces” once you exceed the included amount.
Your calculator therefore needs both the included traces and the overage rate, because the billable portion is:
max(0, total base traces − included base traces).
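That formula can be written as a tiny helper. This is a sketch of the estimate used throughout this page, not billing logic from any SDK:

```python
def billable_base_traces(total_traces: int, included_traces: int) -> int:
    """Traces billed at the overage rate: only the portion above the plan's included amount."""
    return max(0, total_traces - included_traces)

# Under the included amount nothing is billable; above it, only the excess is.
print(billable_base_traces(4_000, 5_000))     # 0
print(billable_base_traces(200_000, 10_000))  # 190000
```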
3) Retention: base vs extended
Retention is about how long traces stay available. Many observability systems offer a “default retention” tier and a “long retention” tier.
Base retention covers the short window you need for day-to-day debugging; extended retention covers the longer window you need for audits,
regression investigation, and longitudinal monitoring.
In practice, you rarely want to keep everything for extended retention. Most traces are routine or noisy:
dev experiments, repeated failures, tests, and iterative runs. A smaller subset is genuinely valuable for a long time:
traces from production incidents, traces tied to important dataset examples, traces for compliance workflows, or traces that demonstrate
a key behavior your team cares about. That is why this page models “upgrades” as a percentage or a fixed count of traces.
4) Deployments: runs and uptime
If you deploy agents via the platform, deployment billing can have separate meters. One commonly documented meter is
deployment runs, defined as one end-to-end invocation of a deployed agent. The billing doc clarifies that
nodes and subgraphs inside a single execution are not billed separately; however, calls to other agents can be charged to the hosting deployment,
and resuming after an interrupt can count as a separate run. That matters for teams using human-in-the-loop patterns.
Some teams also see deployment uptime line items on invoices, especially if deployments are continuously hosted.
Uptime can behave like the “always-on service” component of a platform: you pay for the infrastructure to keep the deployment available,
plus you pay per run when it is actually invoked. Since uptime can vary by contract and deployment type, the calculator keeps this optional and editable.
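The two deployment meters combine additively. A minimal sketch, with rates left as inputs because uptime pricing varies by contract:

```python
def deployment_cost(runs: int, run_price: float,
                    uptime_minutes: float = 0, uptime_rate: float = 0.0) -> float:
    """Per-run charge plus an optional per-minute uptime charge."""
    return runs * run_price + uptime_minutes * uptime_rate

# Runs only, using an illustrative per-run price:
print(deployment_cost(25_000, 0.005))  # 125.0
```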
Key idea: most surprising bills come from retention upgrades and “too many traces.” If you treat tracing like logging
and send everything forever, costs rise quickly. If you treat tracing like sampling—keeping the right traces for longer—cost stays manageable.
Plan snapshot (public-facing)
The following table summarizes typical public plan highlights. Always verify in your dashboard or the latest plan page before final decisions.
| Plan | Seats / pricing | Included base traces / month | Base overage (after included) | Best for |
| --- | --- | --- | --- | --- |
| Developer | 1 seat (solo), $0 / month | 5,000 | Starts at ~$0.50 / 1k base traces | Solo builders, early prototyping, small-scale debugging |
| Plus | $39 / seat / month | 10,000 | Starts at ~$0.50 / 1k base traces | Teams shipping agents, collaboration + deployment workflows |
| Enterprise | Custom | Custom | Custom | SSO, governance, large-scale usage, custom terms |
Trace retention and upgrades: how to think about it
Retention is both a technical and a budgeting decision. Technically, retention affects what you can inspect later.
Budget-wise, retention is a storage-and-access decision: keeping traces longer costs more than keeping them briefly.
The simplest mental model is:
Base retention is short-term and cheap.
Extended retention is long-term and costs more.
An upgrade moves a trace from base to extended for the long window.
In the calculator you can choose “upgraded traces” as a percentage of total traces or a fixed count.
Percent is useful when you are forecasting. Fixed count is useful when you have operational clarity:
you know how many traces are “important” each month, or you know how many traces are automatically upgraded by rules.
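Both modes can live in one helper: forecast mode takes a percentage, operational mode takes a known count. Function name and signature are illustrative:

```python
def upgraded_trace_count(total_traces, percent=None, fixed_count=None):
    """Estimate extended-retention upgrades as a share of total traces
    (forecasting) or as a known monthly count (operational clarity)."""
    if fixed_count is not None:
        return min(fixed_count, total_traces)  # can't upgrade more than you have
    if percent is not None:
        return round(total_traces * percent / 100)
    return 0

print(upgraded_trace_count(200_000, percent=10))       # 20000
print(upgraded_trace_count(200_000, fixed_count=5_000))  # 5000
```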
Why upgrades happen
Many teams explicitly upgrade traces tied to:
- Incidents: production errors, outages, or user complaints where you want a record for later postmortems.
- Evaluations: labeled datasets and regression tests where you need trace history as evidence of behavior.
- Key journeys: “golden paths” in your application that define success for your users.
- Compliance: environments where a longer evidence trail is beneficial (subject to policy).
- Long-cycle debugging: intermittent issues that occur weekly or monthly and are hard to reproduce.
Upgrades can also happen automatically depending on platform behavior and settings. The safest approach is to assume
that “some portion” of traces will be upgraded in real deployments unless you actively manage retention settings and upgrade rules.
How to pick an upgrade percentage
There is no universal correct percentage. Here are practical starting points:
- 1–3%: Mature teams with strict sampling, strong limits, and clear retention rules. Mostly production issues only.
- 5–10%: Common for teams shipping and iterating quickly. Some dev experiments plus production issues are kept long-term.
- 15–30%: Early teams discovering patterns, running heavy evaluations, or keeping many traces for analysis. This can be expensive.
If you are unsure, start at 10% in your model, run the calculator, then re-run at 5% and 20%. You will quickly see how sensitive your total is
to retention. This sensitivity test is one of the fastest ways to find the best cost-control lever for your organization.
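The sensitivity sweep looks like this. The base overage rate is the public “starting at” figure; the extended-retention rate here is a placeholder assumption you should replace with the rate from your own billing:

```python
BASE_RATE_PER_1K = 0.50      # public "starting at" overage rate
UPGRADE_RATE_PER_1K = 5.00   # placeholder assumption, not a published price

def monthly_trace_cost(total: int, included: int, upgrade_pct: float) -> float:
    """Base overage plus extended-retention upgrades, per month."""
    overage = max(0, total - included) / 1_000 * BASE_RATE_PER_1K
    upgrades = total * (upgrade_pct / 100) / 1_000 * UPGRADE_RATE_PER_1K
    return overage + upgrades

for pct in (5, 10, 20):
    print(f"{pct}% upgraded -> ${monthly_trace_cost(200_000, 10_000, pct):.2f}")
```

With these rates, 200k traces cost roughly $145, $195, and $295 per month at 5%, 10%, and 20% upgrades: the upgrade dial swings the bill more than the base overage does.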
Retention tip: “Keep everything” is rarely the best plan. Decide what you must keep for 400 days, and let the rest expire
on base retention. You can still be highly reliable without storing noise.
Deployment costs: runs, uptime, and how the math works
Deployments are where people often mix up “trace volume” and “execution volume.” Traces measure observability records.
Deployment runs measure invocations of the hosted agent. These are related but not the same. You can have many traces without deployments
(for example, local runs instrumented during development). And you can have deployments with fewer traces if you limit tracing or sample.
Deployment runs
The documented definition of a deployment run is one end-to-end invocation. Importantly:
- Nodes and subgraphs inside one agent execution are not charged separately as separate runs.
- Calling other agents can incur charges on the deployment hosting the called agent.
- Human-in-the-loop patterns can create additional runs when you resume after an interrupt.
The calculator uses a default run price field and multiplies it by the number of runs you enter. If your workflow includes interrupts and resumes,
you should consider modeling a higher effective run count than “user requests,” because a single user request could produce multiple runs.
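A rough way to model that inflation, assuming each resume after an interrupt bills as one additional run (parameter names are illustrative):

```python
def effective_run_count(user_requests: int, interrupt_rate: float,
                        resumes_per_interrupt: float = 1.0) -> int:
    """Billable runs when a fraction of requests pause for human input
    and each resume counts as an additional run."""
    return round(user_requests * (1 + interrupt_rate * resumes_per_interrupt))

# 25,000 requests where 20% pause for human review and resume once:
print(effective_run_count(25_000, 0.20))  # 30000
```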
Uptime (optional modeling)
If you are hosting deployments continuously, uptime may appear as a separate cost component. It behaves like a capacity reservation:
the platform keeps the deployment available. Uptime is often billed in minutes, which is why the calculator asks for minutes.
A practical way to estimate uptime minutes is:
- Always-on deployment: about 43,200 minutes in a 30-day month (24 × 60 × 30), minus any pauses, maintenance, or scaling behavior.
- Work-hours only: 8 hours/day × 22 workdays ≈ 10,560 minutes/month.
- Staging environment: often a fraction of the above depending on team practice.
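The minute math above is simple enough to encode directly:

```python
def uptime_minutes(hours_per_day: float, days: int) -> int:
    """Active minutes per billing period for a given daily schedule."""
    return round(hours_per_day * 60 * days)

print(uptime_minutes(24, 30))  # 43200 (always-on, 30-day month)
print(uptime_minutes(8, 22))   # 10560 (work-hours only)
```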
Because uptime rates and calculation rules can vary by contract and deployment type, uptime fields are intentionally editable and off by default.
Use them to plan and to compare scenarios: “always-on vs work-hours” or “dev deployment vs prod deployment.”
Deployment planning: For early-stage teams, start by modeling costs without uptime, then add uptime later if you move to continuously hosted agents.
This avoids overestimating on day one while still keeping the tool useful as you scale.
Examples: realistic monthly scenarios and step-by-step math
Examples are the fastest way to build intuition. The goal isn’t to perfectly match an invoice; it’s to understand how each dial changes the outcome
so you can decide what to instrument, what to retain, and what to limit. The examples below use simple math, then highlight practical actions
you can take to reduce cost without losing reliability.
Example A: Solo developer prototyping (Developer plan)
You are a solo builder. You test a small RAG chatbot, run a few evaluations, and keep only a handful of traces for extended retention.
Your numbers:
- Plan: Developer
- Seats: 1
- Total traces: 4,000
- Upgraded traces: 2% (80 traces)
- Deployment runs: 0 (you deploy elsewhere or not yet)
Since total traces are below included traces, billable base traces are 0. Your estimate becomes:
- Seat cost: $0
- Base overage: $0
- Upgrades: tiny (80 / 1,000 × the per-1k upgrade rate)
- Runs: $0
In this scenario the main cost driver is not the platform; it’s your model usage. LangSmith is effectively “free” for observability.
Your best strategy is to keep trace volume low: do not trace every unit test, avoid tracing huge synthetic loops, and keep upgrades near 0–2%.
Example B: Small team shipping an internal assistant (Plus plan)
You have a team of 4. You trace heavily in staging and lightly in production. You keep 10% of traces for long retention because you run
monthly evaluation cycles and you want to compare results across releases.
- Plan: Plus
- Seats: 4
- Total traces: 200,000
- Included traces: 10,000
- Billable base traces: 190,000
- Upgraded traces: 10% (20,000)
- Deployment runs: 25,000
Your cost components:
- Seats: 4 × $39 = $156/month baseline
- Base overage: 190,000 / 1,000 × base rate
- Upgrades: 20,000 / 1,000 × upgrade rate
- Runs: 25,000 × $0.005 = $125/month (if using that default)
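Putting Example B together in one script (the upgrade rate is a placeholder assumption, and the run price is the illustrative default mentioned above):

```python
SEAT_PRICE = 39.00           # Plus plan, per seat per month
BASE_RATE_PER_1K = 0.50      # public "starting at" overage rate
UPGRADE_RATE_PER_1K = 5.00   # placeholder assumption; check your own billing
RUN_PRICE = 0.005            # illustrative default per deployment run

seats, total, included, upgraded, runs = 4, 200_000, 10_000, 20_000, 25_000

seat_cost = seats * SEAT_PRICE                                 # 156.00
overage = max(0, total - included) / 1_000 * BASE_RATE_PER_1K  # 95.00
upgrades = upgraded / 1_000 * UPGRADE_RATE_PER_1K              # 100.00 (placeholder)
run_cost = runs * RUN_PRICE                                    # 125.00
print(f"${seat_cost + overage + upgrades + run_cost:.2f}")     # $476.00
```

Note that at this volume the $156 seat baseline is already the smallest line item; traces and runs dominate.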
The practical takeaway: once trace volume gets large, “included traces” becomes a small part of the story. Your cost is primarily about
how many traces you send and how many you keep long-term. This is where you should introduce:
- Sampling: trace 100% of failures and only a portion of successes.
- Environment rules: avoid sending dev/test loops to the same workspace as production.
- Upgrade discipline: keep upgrades for incidents, eval datasets, and key journeys only.
Example C: Customer-facing agent with always-on deployments
You run a customer-facing agent 24/7. You keep longer retention for compliance and incident review, and you see uptime as a cost component.
Here the best lever is often to reduce uptime minutes by pausing unused environments (like staging) and ensuring your production deployment
uses the right tier for its traffic.
If you enable uptime in the calculator, enter your dev and prod minutes and rates. Then compare:
- Always-on staging + always-on production
- Work-hours staging + always-on production
- Work-hours staging + scaled-down production during low-traffic windows (if supported)
Even if you cannot perfectly control uptime, this scenario planning helps you understand where budget goes and what to optimize first.
Cost optimization: how to reduce LangSmith spend without losing visibility
The best cost strategy is not “trace less.” It is “trace smarter.” You want enough data to diagnose issues and prove improvements, and you want
your data to be the right kind of data. That means:
- Prioritize high-signal traces (errors, unusual tool calls, low-confidence answers, low satisfaction feedback).
- Reduce or exclude low-signal traces (unit tests, repeated dev loops, synthetic sweeps not needed for diagnosis).
- Keep only a curated subset for extended retention.
- Enforce usage limits so runaway behavior cannot create a surprise bill.
1) Sampling strategies that work in practice
Sampling is common in observability because it balances cost and insight. A robust sampling approach might include:
- Trace 100% of failures: all exceptions, timeouts, tool errors, and parsing failures.
- Trace 100% of “high-risk” actions: actions that trigger side effects (emails sent, tickets created, payments initiated).
- Trace a fixed percentage of successful runs: e.g., 5–10% of successes for baseline monitoring.
- Trace more when you ship: temporarily increase sampling for a release window, then decrease.
This gives you a consistent picture while keeping trace counts bounded. A common pattern is “dynamic sampling”: if the system sees high error rate,
it increases sampling for a short period to capture more context.
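The policy above can be condensed into a single decision function. This is a generic sketch of the sampling rules, not an API from any SDK; the "surge" flag models the temporary release-window increase:

```python
import random

def should_trace(outcome: str, high_risk: bool = False,
                 success_rate: float = 0.10, surge: bool = False) -> bool:
    """Keep every failure and high-risk action; sample ordinary successes,
    with a temporarily higher rate during a release window."""
    if outcome != "success" or high_risk:
        return True  # 100% of failures and side-effecting actions
    rate = min(1.0, success_rate * 3) if surge else success_rate
    return random.random() < rate

# Failures and side-effecting actions are always kept:
print(should_trace("error"))                    # True
print(should_trace("success", high_risk=True))  # True
```

In production you would call this once per run before deciding whether to export the trace.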
2) Separate workspaces by environment
One of the easiest ways to accidentally inflate trace volume is to send dev, test, staging, and production runs into the same workspace.
It becomes hard to filter, and you end up retaining data you do not need.
A cleaner setup is:
- Dev workspace: short retention, aggressive sampling, minimal upgrades.
- Staging workspace: medium retention for release cycles, upgrades only for evaluation datasets.
- Production workspace: high signal, upgrades for incidents and key journeys.
This also simplifies budgeting because you can allocate a budget per environment and enforce limits accordingly.
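A per-environment policy can be a small config table. Project names and sample rates here are illustrative, and the `LANGSMITH_PROJECT` variable reflects a common SDK convention; verify the exact variable names against the current LangSmith docs before relying on them:

```python
import os

# Illustrative per-environment tracing policy (all values are assumptions).
POLICY = {
    "dev":     {"project": "myapp-dev",     "success_sample_rate": 0.05},
    "staging": {"project": "myapp-staging", "success_sample_rate": 0.25},
    "prod":    {"project": "myapp-prod",    "success_sample_rate": 1.00},
}

def configure_tracing(env: str) -> dict:
    """Point the SDK at the environment's workspace project and return its policy."""
    settings = POLICY[env]
    os.environ["LANGSMITH_PROJECT"] = settings["project"]
    return settings

print(configure_tracing("dev")["project"])  # myapp-dev
```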
3) Set usage limits and alerting
Cost surprises happen when a loop runs unexpectedly—an agent gets stuck, a retriever returns huge context, or a queue replays messages.
The safest solution is to set trace limits for a workspace and to monitor usage. Even if you later raise limits, having the guardrails prevents
the “overnight bill spike” scenario.
4) Keep extended retention for only what you truly need
Extended retention is extremely valuable for regression detection and long-cycle debugging, but it is also a common cost driver. A good
extended-retention policy usually defines:
- What is automatically upgraded (if anything).
- What must be upgraded during incidents.
- What evaluation traces are always upgraded.
- How long you actually need the data (and who can access it).
Then you can pick a stable upgraded-trace percentage. Over time, many teams push that percentage down as they become more disciplined.
Rule of thumb: Reduce trace volume first, then reduce upgraded traces. If you keep upgraded traces constant but reduce overall traces,
you typically preserve the most valuable long-term evidence while lowering the base overage.
5) Use evaluation datasets to “pay once, learn many times”
If you invest in a curated evaluation dataset, you can run repeated experiments and compare improvements without keeping every production trace forever.
This shifts your long-term knowledge from raw traces to structured evaluations. In many teams, that is the path to both higher reliability
and lower storage cost.
LangSmith Self-Hosted Pricing Calculator
A LangSmith Self-Hosted Pricing Calculator is a planning tool that estimates the total cost of running LangSmith in your own environment: not just the Enterprise license, but also the infrastructure you will operate. Unlike a basic SaaS calculator, a self-hosted calculator combines two major cost layers:
- Enterprise/self-hosted licensing (contract-driven, often based on seats, expected usage, support level, and security requirements), and
- Your infrastructure costs (Kubernetes/compute, Postgres and storage, backups, monitoring, and networking), which grow with trace volume and retention.
A good self-hosted calculator lets you model the real cost drivers that matter most in production: number of seats, monthly trace volume, the split between base and extended retention, the sampling rate for successful requests (while keeping 100% of errors), and whether you will run deployments or high-volume Agent Builder workflows. It then outputs a clear monthly estimate and shows which levers reduce spend, such as defaulting to base retention, promoting only high-signal traces to extended retention, separating dev/staging/prod environments, and trimming large trace payloads by storing big artifacts externally.
In short, a LangSmith Self-Hosted Pricing Calculator helps engineering and finance teams build a realistic budget and rollout plan so you can meet compliance/data residency goals while keeping both licensing and infrastructure costs predictable as usage scales.
FAQ: LangSmith Pricing Calculator
These are practical answers focused on budgeting and estimating.
What does this calculator estimate?
It estimates your monthly spend based on seats, base trace overages (after included traces), extended retention upgrades, deployment runs,
and optional uptime minutes. It is meant for forecasting and scenario planning.
Why do you model upgraded traces separately from base overages?
Retention upgrades are a different behavior from simply exceeding included traces. Even if you are under included traces, keeping data longer can
still create additional cost. Modeling upgrades separately helps you see that lever clearly.
Should “upgraded traces” apply only to billable traces or to all traces?
Different billing implementations can treat retention as an add-on line item. For planning, applying upgrades to total traces is a conservative and
common estimate because retention is a property of stored traces, not only overage traces. If your invoice shows upgrades only after a threshold,
switch to Custom and align the logic to your observed billing.
What is a “deployment run”?
A deployment run is one end-to-end invocation of a deployed agent. It can be higher than user requests if your workflow includes interrupts and resumes,
because resuming after an interrupt can count as another run.
How do I estimate deployment uptime minutes?
For a 30-day month, always-on is roughly 43,200 minutes. If you only keep a deployment active during working hours, estimate 8 hours/day × 22 workdays
× 60 minutes ≈ 10,560 minutes. Use your real operational pattern when possible.
Does tracing affect my model costs?
Tracing itself doesn’t directly add model tokens, but the way you build your system might. For example, if you log or store huge payloads, you might
do additional processing. The primary model cost driver remains token usage; the primary LangSmith cost driver is trace volume and retention.
What’s the best way to lower costs quickly?
Start with sampling and environment separation. Trace 100% of failures, sample successes, and keep dev/test noise out of production workspaces.
Then reduce extended retention upgrades by defining clear upgrade rules.
Can I rely on this calculator for accounting?
It’s a planning tool. For accounting, rely on invoices and your billing dashboard. Use the Advanced section to match the calculator’s rates to
what you observe in your own billing.
Glossary
A short glossary helps align vocabulary across engineering, product, and finance stakeholders.
| Term | Meaning in plain language | Why it matters for cost |
| --- | --- | --- |
| Trace | A recorded run (often a tree) of your app or agent execution. | Trace volume drives base overage and storage. |
| Base retention | Default “short window” trace storage for debugging. | Cheaper than extended; most traces should stay here. |
| Extended retention | Longer storage window (useful for audits, regression, long-cycle debugging). | Usually more expensive; upgrade only high-signal traces. |
| Upgrade | Moving a trace from base retention to extended retention. | A major lever: reducing upgrades can lower bills fast. |
| Deployment run | One full invocation of a deployed agent. | Billed per run; human-in-the-loop resumes can increase counts. |
| Deployment uptime | Minutes the deployment is kept active/available. | Can be a baseline infra cost for hosted deployments. |
Shareable takeaway: If you want predictable bills, track three numbers monthly:
total traces, upgraded traces, and deployment runs. Everything else is a multiplier.
Disclaimer & maintenance notes
This page is an educational calculator. Actual billing depends on your LangSmith account settings, plan details, and contract terms.
If pricing changes, update the values in the “Advanced pricing settings” section (seat price, included traces, overage rates, upgrade rates,
deployment run price, and uptime rates).
Recommended maintenance:
- Update the “Last updated” date when you change assumptions.
- Keep a short changelog for transparency (what changed, why).
- Cross-check one real invoice monthly until you trust your model.