The cost control layer for AI agents.
Meter every LLM call per user per model. Enforce spending limits before they hit your margin. Drop in, ship, done.
pip install paygent The Problem
The hidden cost problem in AI agents
Every agent invocation is a black box of unpredictable spend. Without metering and guardrails, you're flying blind.
Invisible cost multiplication
A single user prompt triggers chains of LLM calls, sub-agents, and tool executions. One prompt. Five model calls. No way to see it happening.
Retries you never see
Your agent retries failed calls and recovers from hallucinations automatically. Great for reliability. Terrible for your bill — because none of it shows up.
Tools blow your budget
Search APIs, code execution, file parsing — every tool call has a cost that varies per user and per session. Forecasting per-user spend is guesswork.
Revenue minus reality
You charge $49/month. Some users cost you $12. Others cost you $140. Same plan, wildly different margins — and you can't tell which is which.
No per-user guardrails
Your heaviest users burn 10x more than they pay for. There's no automated way to throttle, gate, or nudge them toward a higher tier.
Billing flies blind
Your billing system charges a flat monthly fee. Your actual costs vary per user, per session, per prompt. These two worlds never talk to each other.
What Paygent Does
A runtime layer for controlling AI agent economics.
Three capabilities. One SDK. Full spend control.
Metering
A single user prompt can trigger a chain of model calls, retries, and tool executions. You have no idea what any individual user actually costs you.
Track token consumption, tool costs, and agent activity per user, per session, in real time. Know exactly what every invocation costs before it becomes a surprise.
Guardrails
You charge every user the same flat fee, but some cost you 10x more than others. Without per-user limits, your best plan subsidizes your worst margins.
Set per-user spend limits and per-model token caps. When a user hits their threshold, Paygent automatically triggers a soft warning or hard block — your rules, not manual intervention.
Sync & Connect
Wiring usage data to your backend takes weeks of custom code — aggregation, API endpoints, database queries. It's fragile and it breaks every time you change pricing.
Usage events sync to the Paygent backend automatically. Query per-user costs, model-level breakdowns, and billing period summaries via API. Connect the data to your existing billing however you want.
Unit Economics Impact
See the difference
Without Paygent
Costs exceed revenue. Margin is negative.
With Paygent
RecommendedGuardrails keep costs within budget.
Illustrative example based on typical AI SaaS usage patterns.
Quick Start
Get started in 5 minutes
from paygent import Paygent, paygent_context
# 1. Initialize (connects to Paygent backend)
pg = Paygent.init(api_key="pg_live_...")
# 2. Wrap your request handler
with paygent_context(user_id="user_123"):
response = openai.chat.completions.create(
model="gpt-4o-mini",
messages=[{"role": "user", "content": query}],
)
# That's it. Every LLM call is now:
# ✓ Metered (tokens, cost, per model)
# ✓ Guarded (soft warnings + hard blocks)
# ✓ Synced (to Paygent backend, automatically) Ready?
Start metering your AI agent costs today
pip install paygent