AI Token Cost Calculator (Claude, GPT, Gemini)
Estimate the cost of an LLM workload. Enter input tokens, output tokens, and monthly call volume to get a per-call and monthly cost across major providers.
Inputs include a cache hit ratio (%), applied to input tokens.

Example workload:
- Per call: $0.0180 (In: $0.00600 · Out: $0.01200)
- Monthly, at 10,000 calls: $180
- Annualized: $2,160
Cost comparison
All models · same workload · sorted cheapest first

| Model | Provider | Monthly cost @ 10,000 calls |
| --- | --- | --- |
| Llama 4 Maverick (hosted) | Meta | $12 |
| DeepSeek-V3 | DeepSeek | $14 |
| Gemini 2.5 Flash | Google | $26 |
| Claude Haiku 4.5 | Anthropic | $60 |
| GPT-5 mini | OpenAI | $65 |
| GPT-4o | OpenAI | $130 |
| Gemini 2.5 Pro | Google | $170 |
| Claude Sonnet 4.6 | Anthropic | $180 |
| GPT-5 | OpenAI | $780 |
| Claude Opus 4.7 | Anthropic | $900 |
How this is calculated
For every model, with prices quoted in dollars per million tokens, we calculate:
cost = (input_tokens × input_price + output_tokens × output_price) / 1,000,000
If you set a cache hit ratio, that fraction of input tokens is billed at the lower cache-read rate (Anthropic, OpenAI, and Google all expose one).
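The formula, including the cache hit ratio, can be sketched in a few lines of Python. The function name and parameters here are illustrative, not any provider's API; the $3/$15-per-million example prices are an assumption chosen because they reproduce the headline $0.0180-per-call example above.

```python
def call_cost(input_tokens, output_tokens, input_price, output_price,
              cache_hit_ratio=0.0, cache_read_price=None):
    """Cost of one call in dollars. Prices are per million tokens.

    cache_hit_ratio is the fraction of input tokens billed at the
    lower cache-read rate (illustrative names, not a provider API).
    """
    if cache_read_price is None:
        cache_read_price = input_price  # no caching: bill everything fresh
    fresh = input_tokens * (1 - cache_hit_ratio)
    cached = input_tokens * cache_hit_ratio
    return (fresh * input_price
            + cached * cache_read_price
            + output_tokens * output_price) / 1_000_000

# Assumed example: 2,000 input tokens at $3/M, 800 output tokens at $15/M
per_call = call_cost(2_000, 800, 3.00, 15.00)  # $0.0180
monthly = per_call * 10_000                    # $180
```

Multiplying the per-call cost by monthly volume gives the monthly and annualized figures shown above.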
Token estimation rule of thumb
- ~4 characters per token for English
- ~0.75 words per token (roughly 1.3 tokens per word)
- 1,000 words ≈ 1,300–1,500 tokens (input)
- Code is heavier — closer to 3 chars/token
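The rules of thumb above can be folded into a quick heuristic. This is a sketch only (the function name is made up), and exact counts require the provider's own tokenizer:

```python
def estimate_tokens(text: str, is_code: bool = False) -> int:
    """Rough token estimate: ~4 chars/token for English prose, ~3 for code."""
    chars_per_token = 3 if is_code else 4
    return round(len(text) / chars_per_token)

# 1,000 English words at ~5 chars each (including spaces) is ~5,000
# characters, i.e. roughly 1,250 tokens, near the range quoted above.
```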
Why caching matters
System prompts, tool definitions, and long shared context can be read from cache at 10–25% of the normal input-token price. For chat workloads with stable system prompts, caching can cut bills by 60–80%.
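A hedged worked example of that range, using assumed numbers (roughly mid-tier list pricing, cache reads at 10% of the input price, and a chat turn dominated by a stable system-plus-tools prompt):

```python
# Assumed prices in $ per million tokens (illustrative, not authoritative).
input_price, output_price = 3.00, 15.00
cache_read_price = 0.30  # 10% of the input price

# A chat turn dominated by a stable, cacheable system + tools prompt.
system_tokens, fresh_tokens, output_tokens = 5_000, 200, 100

uncached = ((system_tokens + fresh_tokens) * input_price
            + output_tokens * output_price) / 1e6
cached = (system_tokens * cache_read_price
          + fresh_tokens * input_price
          + output_tokens * output_price) / 1e6

savings = 1 - cached / uncached  # about 0.79, i.e. ~79% for this workload
```

The larger the cacheable share of each call's input, the closer savings get to the cache discount itself.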
Frequently asked
- Are these prices current? Prices reflect public list pricing as of the page's last-updated date. Always confirm with your provider — usage tiers and discounts vary.
- Does the calculator model prompt caching? Yes for Anthropic and OpenAI. Cached input tokens are billed at the lower cache-read rate when applicable.