AgentKavach wraps your LLM client with hard budget enforcement, real-time alerts, and a kill switch that fires before you overspend. One SDK for OpenAI, Anthropic, Google, and Mistral.
- **< 0.1ms** budget check overhead
- **4** LLM providers
- **4** alert channels
- **99.9%** uptime SLA
```python
from agentkavach import AgentKavach, Budget
import sys

def emergency_stop():
    agent.save_checkpoint()  # persist agent state before halting
    sys.exit(1)

guard = AgentKavach(
    api_key="cg_...",
    llm_key="sk-...",
    budget=Budget.daily(50),
    on_kill=emergency_stop,
)

response = guard.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
```

Three steps to hard budget enforcement. No infrastructure changes required.
One pip install. No infrastructure changes, no sidecar processes.

```shell
pip install agentkavach
```

Replace your provider client with AgentKavach. Same API, same parameters, with budget enforcement built in.
```python
guard = AgentKavach(
    api_key="cg_...",
    llm_key="sk-...",
    budget=Budget.daily(50),
    on_kill=emergency_stop,
)
```

Every call is tracked. Alerts fire at your thresholds. The kill switch engages at 100%. No surprises on your bill.
```python
response = guard.create(
    model="gpt-4o",
    messages=[...],
)
```

From a single developer to a fleet of 10,000 agents. AgentKavach scales with your team.
Set daily, monthly, or total budgets per agent. The kill callback fires at 100% to halt runaway spend before it escalates.
OpenAI, Anthropic, Google, Mistral. Track costs across all providers with a single SDK and a unified guard.create() API.
Slack, email, PagerDuty, or custom webhooks. Configure escalating thresholds at 50%, 80%, 95%, and 100%.
See every agent's spend in real time. Drill into model usage, cost trends, and alert history from a single pane.
Token limits, call count caps, runtime limits, and runaway loop detection. Stop agents before they spiral out of control.
Pool a single budget across multiple agents and providers. Enforce team-wide spend limits with thread-safe accounting.
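Under the hood, a shared pool amounts to atomic accounting against a single limit. Here is a minimal sketch of the idea in plain Python — illustrative only, not the AgentKavach API; `SharedBudgetPool` and `try_spend` are hypothetical names:

```python
import threading

class SharedBudgetPool:
    """Illustrative sketch: one spend counter shared by many agents,
    guarded by a lock so concurrent agents cannot both slip past the limit."""

    def __init__(self, limit_usd):
        self._limit = limit_usd
        self._spent = 0.0
        self._lock = threading.Lock()

    def try_spend(self, cost_usd):
        # Atomically reserve cost against the shared limit.
        with self._lock:
            if self._spent + cost_usd > self._limit:
                return False  # over budget: caller should halt the agent
            self._spent += cost_usd
            return True

pool = SharedBudgetPool(limit_usd=1.00)
assert pool.try_spend(0.60) is True   # within the shared $1.00 limit
assert pool.try_spend(0.60) is False  # second agent would exceed the pool
```

The check-and-increment happens under one lock, so two agents racing to spend the last dollar cannot both succeed.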
The same guard.create() call works across OpenAI, Anthropic, Google, and Mistral. Switch providers without changing your budget logic.
```python
from agentkavach import AgentKavach, Budget

guard = AgentKavach(
    provider="openai",        # "openai", "anthropic", "google", or "mistral"
    api_key="cg_...",         # your AgentKavach API key
    llm_key="sk-...",         # your LLM provider API key
    agent_name="my-agent",    # identifies this agent in the dashboard
    budget=Budget.daily(50),  # $50/day hard limit
    on_kill=emergency_stop,   # callback at 100% utilization
    save_prompts=False,       # opt-in: log prompts to dashboard
)

response = guard.create(
    model="gpt-4o",  # any model from any supported provider
    messages=[{"role": "user", "content": "Hello!"}],
)
```

Configure alert channels at different budget thresholds. Start with an email at 50%, escalate to Slack at 80%, page your on-call at 95%, and trigger the kill switch at 100%. Each channel validates its credentials at construction time.
```python
from agentkavach import AgentKavach, Budget, ChannelType

guard = AgentKavach(
    provider="openai",
    api_key="cg_...",
    llm_key="sk-...",
    budget=Budget.daily(100),
    on_kill=emergency_stop,
    channels=[
        AgentKavach.channel(ChannelType.EMAIL,
            threshold=0.50,
            to="ops@acme.com",
        ),
        AgentKavach.channel(ChannelType.SLACK,
            threshold=0.80,
            webhook_url="https://hooks...",
            template="{agent_name} at {pct}% budget",
        ),
        AgentKavach.channel(ChannelType.PAGERDUTY,
            threshold=0.95,
            routing_key="R0abc...",
        ),
        AgentKavach.channel(ChannelType.KILL,
            threshold=1.0,
        ),
    ],
)
```

```yaml
channels:
  slack:
    type: slack
    webhook_url: https://hooks.slack.com/...
    channel: "#cost-alerts"
  pagerduty:
    type: pagerduty
    routing_key: R0abc123
    service_url: https://events.pagerduty.com/v2/enqueue
    team: AI Ops Team

shared_budgets:
  engineering-pool:
    limit: 500
    period: daily
    agents: [research-bot, code-review]

agents:
  research-bot:
    provider: openai
    budget: { type: daily, limit: 50 }
    alerts:
      - { threshold: 0.80, channels: [slack] }
      - { threshold: 1.0, channels: [kill] }
  support-agent:
    provider: anthropic
    budget: default
```

Define budgets, alerts, and shared pools for all your agents in a single configuration file. AgentKavach validates every channel reference, budget type, and shared pool name at load time. Misconfigured alerts fail fast, not at 3 AM.
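The fail-fast idea can be sketched in a few lines of plain Python — illustrative only; `validate_config` is a hypothetical helper, not the SDK's actual validator:

```python
def validate_config(cfg):
    """Fail-fast sketch: every channel an alert references must be
    defined under `channels` (here we assume "kill" is built in)."""
    known = set(cfg.get("channels", {})) | {"kill"}
    for name, agent in cfg.get("agents", {}).items():
        for alert in agent.get("alerts", []):
            for ch in alert.get("channels", []):
                if ch not in known:
                    raise ValueError(
                        f"agent {name!r} references unknown channel {ch!r}"
                    )

cfg = {
    "channels": {"slack": {"type": "slack"}},
    "agents": {
        "research-bot": {"alerts": [{"threshold": 0.80, "channels": ["slack"]}]},
    },
}
validate_config(cfg)  # valid config: passes silently at load time

cfg["agents"]["research-bot"]["alerts"].append(
    {"threshold": 1.0, "channels": ["pagerduty"]}  # never defined above
)
# validate_config(cfg) would now raise ValueError at load time, not at 3 AM
```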
AgentKavach is built with security at every layer. Your API keys, spend data, and agent communications are protected by industry-standard practices from day one. We never use your LLM provider API key for any purpose other than proxying your requests. We only read the response metadata (token counts, model used) to track costs — we never store, log, or access your prompts or completions.
Start free. Scale when you need to.
For solo developers and prototypes.
For teams running production agents.
For fleets of AI agents at scale.
Sourced from provider pricing pages: OpenAI, Anthropic, Google, Mistral
| Provider | Model | Input / 1K tokens | Output / 1K tokens |
|---|---|---|---|
| Anthropic | claude-3-5-haiku-20241022 | $0.000800 | $0.004000 |
| Anthropic | claude-3-5-sonnet-20241022 | $0.003000 | $0.015000 |
| Anthropic | claude-3-haiku-20240307 | $0.000250 | $0.001250 |
| Anthropic | claude-3-opus-20240229 | $0.015000 | $0.075000 |
| Anthropic | claude-haiku-4-5 | $0.000800 | $0.004000 |
| Anthropic | claude-opus-4-0 | $0.015000 | $0.075000 |
| Anthropic | claude-opus-4-6 | $0.015000 | $0.075000 |
| Anthropic | claude-sonnet-4-0 | $0.003000 | $0.015000 |
| Anthropic | claude-sonnet-4-6 | $0.003000 | $0.015000 |
| Google | gemini-1.5-flash | $0.000075 | $0.000300 |
| Google | gemini-1.5-pro | $0.001250 | $0.005000 |
| Google | gemini-2.0-flash | $0.000100 | $0.000400 |
| Google | gemini-2.5-flash | $0.000150 | $0.003500 |
| Google | gemini-2.5-pro | $0.001250 | $0.010000 |
| Mistral | codestral-latest | $0.000300 | $0.000900 |
| Mistral | ministral-8b-latest | $0.000100 | $0.000100 |
| Mistral | mistral-embed | $0.000100 | $0.000100 |
| Mistral | mistral-large-latest | $0.002000 | $0.006000 |
| Mistral | mistral-small-latest | $0.000100 | $0.000300 |
| Mistral | pixtral-large-latest | $0.002000 | $0.006000 |
| OpenAI | codex-mini | $0.001500 | $0.006000 |
| OpenAI | gpt-3.5-turbo | $0.000500 | $0.001500 |
| OpenAI | gpt-4 | $0.030000 | $0.060000 |
| OpenAI | gpt-4-turbo | $0.010000 | $0.030000 |
| OpenAI | gpt-4.1 | $0.002000 | $0.008000 |
| OpenAI | gpt-4.1-mini | $0.000400 | $0.001600 |
| OpenAI | gpt-4.1-nano | $0.000100 | $0.000400 |
| OpenAI | gpt-4.5-preview | $0.075000 | $0.150000 |
| OpenAI | gpt-4o | $0.002500 | $0.010000 |
| OpenAI | gpt-4o-mini | $0.000150 | $0.000600 |
| OpenAI | o1 | $0.015000 | $0.060000 |
| OpenAI | o1-mini | $0.003000 | $0.012000 |
| OpenAI | o3 | $0.010000 | $0.040000 |
| OpenAI | o3-mini | $0.001100 | $0.004400 |
| OpenAI | o4-mini | $0.001100 | $0.004400 |
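As a quick worked example from the table above, per-call cost is tokens divided by 1,000 times the per-1K rate for each direction:

```python
def call_cost(input_tokens, output_tokens, in_per_1k, out_per_1k):
    # Cost of one call from per-1K-token rates (see pricing table above).
    return (input_tokens / 1000) * in_per_1k + (output_tokens / 1000) * out_per_1k

# gpt-4o: $0.0025 input, $0.010 output per 1K tokens
print(round(call_cost(1000, 500, 0.0025, 0.010), 4))  # → 0.0075
```

So a 1,000-token prompt with a 500-token completion on gpt-4o costs $0.0075, and a $50 daily budget covers roughly 6,600 such calls.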
Yes. Upgrade or downgrade at any time. Changes take effect immediately. Downgrades are prorated.
Each LLM API call tracked through the SDK counts as one event. Batch calls count as one event per item in the batch.
Pro comes with a 14-day free trial. No credit card required to start. Max does not include a trial.
Once you exceed your event limit, monitoring, alerts, and the kill switch stop functioning until your limit resets at midnight. New agents cannot be onboarded past the limit either. To avoid disruption, upgrade to a higher plan before reaching your cap.
Max customers can request custom invoicing and payment terms. Contact us for details.