Budgets
AgentKavach supports three budget periods (daily, monthly, and total), all enforced in-memory with sub-millisecond checks. Budgets can be applied across six dimensions: cost, total tokens, input tokens, output tokens, call count, and duration.
Budget Types
Each budget can enforce limits on one or more of these dimensions. Alerts are evaluated independently per budget type, so you can set different thresholds for cost vs. token usage.
| Type | Unit | Description |
|---|---|---|
| cost | USD | Total spend in dollars. The primary budget dimension. |
| tokens_total | tokens | Combined input + output token count. |
| tokens_input | tokens | Input/prompt token count only. |
| tokens_output | tokens | Output/completion token count only. |
| calls | count | Number of LLM API calls made. |
| duration | milliseconds | Cumulative LLM call duration. Tracks how long your agents spend waiting for model responses. |
ℹ️ Per-type alert evaluation
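To make the idea of independent per-type evaluation concrete, here is a minimal sketch. It does not use the AgentKavach API: the `Usage` class, the `triggered_alerts` function, and the `limits`/`thresholds` dictionaries are hypothetical names chosen for illustration, and only the dimension names come from the table above.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Running totals for the budget dimensions (illustrative only)."""
    cost: float = 0.0        # USD
    tokens_input: int = 0
    tokens_output: int = 0
    calls: int = 0
    duration: float = 0.0    # milliseconds

    @property
    def tokens_total(self) -> int:
        # tokens_total is derived: input + output
        return self.tokens_input + self.tokens_output

    def record(self, cost, tokens_input, tokens_output, duration_ms):
        """Add one LLM call to the running totals."""
        self.cost += cost
        self.tokens_input += tokens_input
        self.tokens_output += tokens_output
        self.calls += 1
        self.duration += duration_ms


def triggered_alerts(usage, limits, thresholds):
    """Return the budget types whose utilization reached their own threshold."""
    fired = []
    for budget_type, limit in limits.items():
        utilization = getattr(usage, budget_type) / limit
        if utilization >= thresholds[budget_type]:
            fired.append(budget_type)
    return fired


usage = Usage()
usage.record(cost=4.50, tokens_input=60_000, tokens_output=30_000, duration_ms=8_200)

limits = {"cost": 10.0, "tokens_total": 100_000}
thresholds = {"cost": 0.8, "tokens_total": 0.9}  # alert at 80% of cost, 90% of tokens

print(triggered_alerts(usage, limits, thresholds))  # ['tokens_total']
```

Cost sits at 45% of its limit and stays quiet, while token usage has reached its own 90% threshold, so only the token alert fires.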
Budget.daily(limit)
A daily budget resets at midnight UTC every day. Ideal for capping per-day spend on production agents.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| limit | float | Yes | — | USD limit per day. Must be > 0. |
from agentkavach import AgentKavach, Budget

guard = AgentKavach(
    provider="openai",
    api_key="cg_...",
    llm_key="sk-...",
    budget=Budget.daily(50),  # $50/day
)

response = guard.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(f"Spent: ${guard.spent:.4f}")
print(f"Remaining: ${guard.remaining:.4f}")
print(f"Utilization: {guard.engine.utilization:.1%}")

Budget.monthly(limit)
A monthly budget resets on the 1st of each month at midnight UTC. Best for teams with monthly cloud billing cycles.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| limit | float | Yes | — | USD limit per month. Must be > 0. |
from agentkavach import AgentKavach, Budget

guard = AgentKavach(
    provider="anthropic",
    api_key="cg_...",
    llm_key="sk-ant-...",
    budget=Budget.monthly(500),  # $500/month
)

Budget.total(limit)
A total budget is a lifetime cap that never resets. Once the limit is reached, the agent stays blocked until the budget is manually increased.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| limit | float | Yes | — | USD lifetime limit. Must be > 0. Never resets. |
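The reset rules for the three periods (daily at midnight UTC, monthly on the 1st at midnight UTC, total never) can be sketched with the standard library alone. `next_reset` is a hypothetical helper written for this illustration and is not part of the AgentKavach SDK.

```python
from datetime import datetime, timedelta, timezone


def next_reset(period, now):
    """Return the next reset instant in UTC, or None for a 'total' budget.

    `now` must be a timezone-aware datetime.
    """
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if period == "daily":
        # Next midnight UTC
        return midnight + timedelta(days=1)
    if period == "monthly":
        # Midnight UTC on the 1st of the following month (handles December)
        if now.month == 12:
            year, month = now.year + 1, 1
        else:
            year, month = now.year, now.month + 1
        return midnight.replace(year=year, month=month, day=1)
    if period == "total":
        # Lifetime cap: there is no reset
        return None
    raise ValueError(f"unknown period: {period!r}")


now = datetime(2024, 12, 31, 23, 59, tzinfo=timezone.utc)
print(next_reset("daily", now))    # 2025-01-01 00:00:00+00:00
print(next_reset("monthly", now))  # 2025-01-01 00:00:00+00:00
print(next_reset("total", now))    # None
```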
from agentkavach import AgentKavach, Budget

guard = AgentKavach(
    provider="google",
    api_key="cg_...",
    llm_key="AIza...",
    budget=Budget.total(1000),  # $1,000 lifetime cap
)

Budget.org_budget(limit, period)
An organization-level budget that applies across all agents in your org. When any agent makes an LLM call, spend is counted toward the org budget pool. If the org-wide limit is hit, all agents are blocked — regardless of their individual budgets.
Org budgets coexist with per-agent budgets. Each agent can have its own individual budget and share the org pool. The most restrictive limit wins — if either budget is exceeded, the call is rejected.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| limit | float | Yes | — | USD limit for the entire organization. Must be > 0. |
| period | str \| Period | No | "daily" | Budget period: "daily", "monthly", or "total". Defaults to "daily". |
from agentkavach import AgentKavach, Budget

# Individual agent budget: $10/day per agent
# Org budget: $50/day across ALL agents
org_pool = Budget.org_budget(limit=50, period="daily")

guard = AgentKavach(
    provider="openai",
    api_key="cg_...",
    llm_key="sk-...",
    agent_name="research-bot",
    budget=Budget.daily(10),  # per-agent: $10/day
    org_budget=org_pool,      # org-wide: $50/day
)

response = guard.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

# Spend is tracked against BOTH budgets
print(f"Agent spent: ${guard.spent:.4f}")
print(f"Agent remaining: ${guard.remaining:.4f}")

ℹ️ Most restrictive wins
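One way to picture the "most restrictive wins" rule is a shared counter that every agent charges in addition to its own. The sketch below is a toy model under that assumption; `Pool` and `Agent` are hypothetical classes and say nothing about how AgentKavach stores or synchronizes spend internally.

```python
class Pool:
    """A spend counter with a limit; one instance can be shared by many agents."""

    def __init__(self, limit):
        self.limit = limit
        self.spent = 0.0

    @property
    def remaining(self):
        return max(self.limit - self.spent, 0.0)


class Agent:
    def __init__(self, name, own, org):
        self.name = name
        self.own = own  # individual budget
        self.org = org  # shared org-wide pool

    @property
    def effective_remaining(self):
        # The tighter of the two budgets decides how much is left
        return min(self.own.remaining, self.org.remaining)

    def spend(self, amount):
        """Charge both budgets; refuse once either one is exhausted."""
        if self.effective_remaining <= 0:
            return False
        self.own.spent += amount
        self.org.spent += amount
        return True


org = Pool(50.0)
a = Agent("agent-a", Pool(20.0), org)
b = Agent("agent-b", Pool(20.0), org)
c = Agent("agent-c", Pool(20.0), org)

a.spend(20.0)
b.spend(20.0)
c.spend(10.0)  # the org pool is now at its $50 limit

print(c.own.remaining)        # 10.0 left on the individual budget
print(c.effective_remaining)  # 0.0 because the org pool is exhausted
print(c.spend(1.0))           # False
```

Agent `c` still has headroom on its own budget, yet it is blocked because the shared pool is spent.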
Individual vs. Org Budgets
AgentKavach supports two levels of budget enforcement that work independently or together:
| Feature | Individual Budget | Org Budget |
|---|---|---|
| Scope | Single agent | All agents in the org |
| Set via SDK | budget=Budget.daily(10) | org_budget=Budget.org_budget(50) |
| Set via YAML | agents.<name>.budget.daily | org_budget.limit / org_budget.period |
| Set via Dashboard | Settings → Agent Budgets | Settings → Organization Budget |
| Set via API | POST /v1/agents/<name>/budgets | POST /v1/org/budgets |
| Enforcement | Blocks only this agent | Blocks ALL agents in the org |
| Ingest rejection | 429 budget_exceeded | 429 org_budget_exceeded |
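The two rejection codes in the last row can be read as the outcome of a simple decision. The function below is an illustrative sketch, not the server's logic: `ingest_rejection` is a hypothetical name, and checking the org budget first when both are exceeded is an assumption the table does not confirm.

```python
def ingest_rejection(agent_spent, agent_limit, org_spent, org_limit):
    """Return (status, code) for a rejected call, or None if it may proceed."""
    # Assumption: the org-wide check takes precedence when both budgets are exceeded
    if org_spent >= org_limit:
        return 429, "org_budget_exceeded"
    if agent_spent >= agent_limit:
        return 429, "budget_exceeded"
    return None


print(ingest_rejection(5.0, 20.0, 30.0, 50.0))   # None
print(ingest_rejection(20.0, 20.0, 30.0, 50.0))  # (429, 'budget_exceeded')
print(ingest_rejection(5.0, 20.0, 50.0, 50.0))   # (429, 'org_budget_exceeded')
```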
Example: Both budgets together
from agentkavach import AgentKavach, Budget

# Org-wide cap: $50/day across all agents
org = Budget.org_budget(limit=50, period="daily")

# Each agent gets its own limit PLUS the shared org cap
research = AgentKavach(
    provider="openai",
    api_key="cg_...",
    llm_key="sk-...",
    agent_name="research-bot",
    budget=Budget.daily(20),  # individual: $20/day
    org_budget=org,           # shared: $50/day
)

support = AgentKavach(
    provider="anthropic",
    api_key="cg_...",
    llm_key="sk-ant-...",
    agent_name="support-bot",
    budget=Budget.daily(15),  # individual: $15/day
    org_budget=org,           # shared: $50/day
)

# research-bot stops at $20 (individual limit)
# support-bot stops at $15 (individual limit)
# Both stop if combined org spend hits $50

Checking Budget State
You can inspect the current budget state at any time:
# Current spend in the active period
guard.spent  # e.g. 12.34

# Remaining budget
guard.remaining  # e.g. 37.66

# Utilization as a fraction (0.0 to 1.0)
guard.engine.utilization  # e.g. 0.2468

What Happens When Budget Runs Out
When the budget is exhausted, AgentKavach follows a deterministic sequence:
- BudgetExceededError is raised before the LLM call is made.
- You never pay for a call that exceeds your budget.
- The on_kill callback fires if configured.
- All subsequent guard.create() calls raise immediately.
- Fail-open: internal AgentKavach errors never block LLM calls; only BudgetExceededError propagates.
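The sequence above can be sketched as a small wrapper. This is a toy model, not AgentKavach's implementation: `MiniGuard`, `BudgetExceeded`, the `estimated_cost` argument, and the `on_kill(spent, limit)` callback signature are all assumptions made for illustration.

```python
class BudgetExceeded(Exception):
    """Hypothetical stand-in for the library's BudgetExceededError."""


class MiniGuard:
    def __init__(self, limit, on_kill=None):
        self.limit = limit
        self.spent = 0.0
        self.on_kill = on_kill
        self.killed = False

    def _check(self, estimated_cost):
        if self.killed:
            # Every call after the first rejection raises immediately
            raise BudgetExceeded("budget already exhausted")
        if self.spent + estimated_cost > self.limit:
            self.killed = True
            if self.on_kill is not None:
                try:
                    self.on_kill(self.spent, self.limit)
                except Exception:
                    pass  # a failing callback must not mask the budget error
            raise BudgetExceeded(f"spent {self.spent} of {self.limit}")

    def create(self, llm_call, estimated_cost):
        try:
            self._check(estimated_cost)  # runs BEFORE the LLM call
        except BudgetExceeded:
            raise  # the only error allowed to propagate
        except Exception:
            pass   # fail-open: a tracking bug never blocks the call
        result = llm_call()
        self.spent += estimated_cost
        return result


events, calls = [], []


def fake_llm():
    calls.append("called")
    return "ok"


guard = MiniGuard(limit=1.0, on_kill=lambda spent, limit: events.append((spent, limit)))
guard.create(fake_llm, 0.6)          # allowed: 0.6 <= 1.0

for cost in (0.6, 0.1):              # both rejected; the second without re-firing on_kill
    try:
        guard.create(fake_llm, cost)
    except BudgetExceeded as exc:
        print(f"blocked: {exc}")

print(len(calls), events)            # 1 [(0.6, 1.0)]
```

The fake LLM function runs only once, which mirrors the point that a rejected call is never sent and therefore never billed.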
from agentkavach.exceptions import BudgetExceededError

try:
    response = guard.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}],
    )
except BudgetExceededError as e:
    print(f"Budget exhausted: {e}")
    # e.spent, e.limit, e.period available

ℹ️ Near-zero latency
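A check that only compares counters held in process memory is cheap, which is what makes a sub-millisecond budget check plausible. The snippet below is a rough way to measure such a check on your own machine; `within_budget` and the `state` dictionary are hypothetical stand-ins, and the result says nothing about AgentKavach's actual overhead.

```python
import timeit

# In-memory state, using the example figures from "Checking Budget State"
state = {"spent": 12.34, "limit": 50.0}


def within_budget(estimated_cost):
    """Pure in-memory comparison: no I/O, no network round trip."""
    return state["spent"] + estimated_cost <= state["limit"]


runs = 100_000
seconds_per_check = timeit.timeit(lambda: within_budget(0.01), number=runs) / runs
print(f"{seconds_per_check * 1e6:.3f} microseconds per check")
```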