An API that tells you which AI model to actually use, so you can stop hardcoding gpt-4-turbo-2024-04-09-final-FINAL in eighteen different repos.
OpenAI ships a new model. Anthropic ships a new model. Google ships a new model. Your config.ts goes stale by lunchtime. You go and update it in fourteen places. You miss two. Production breaks at 3am.
This API is the cure.
curl "https://thelatestmodel.com/v1/models/recommend?tier=fast"
const res = await fetch('https://thelatestmodel.com/v1/models/recommend?tier=fast');
const { recommendation } = await res.json();
console.log(recommendation.id); // e.g. "gemini-2.5-flash"
import requests
r = requests.get('https://thelatestmodel.com/v1/models/recommend', params={'tier': 'fast'})
print(r.json()['recommendation']['id']) # e.g. "gemini-2.5-flash"
Returns the cheapest fast model alive today. Plug it in, ship it, sleep well.
Tell us what you need — we'll build the URL and hand you the code. No Googling required.
https://thelatestmodel.com/v1/models/recommend?tier=best&provider=anthropic
All endpoints are available at /v1/... (recommended) or unprefixed (mirrors v1, may change with future versions).
GET /v1/health
Are we breathing? Find out.
GET /v1/models
Every curated model with prices, context windows, modalities, capabilities. Filter freely.
Try it: all models · anthropic · openai · google · xai · reasoning · vision
Available filters (see /v1/models/providers for the full provider list; they combine freely, as in the sketch after this list):
?provider=anthropic|openai|google|xai|deepseek|mistral|...
?tier=fast|balanced|best
?capability=vision|pdf|reasoning|toolCalling|structuredOutput
?minContextWindow=128000
?includeDeprecated=true
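Filters are ordinary query parameters, so stacking them is just string-building. A minimal TypeScript sketch, assuming the response wraps results in a models array (the exact envelope isn't shown above, so treat that field name as an assumption):

// Sketch: long-context vision models, scoped to one provider.
// Assumes the body looks like { models: [...] } (not confirmed above).
const params = new URLSearchParams({
  provider: 'google',
  capability: 'vision',
  minContextWindow: '128000',
});
const res = await fetch(`https://thelatestmodel.com/v1/models?${params}`);
const { models } = await res.json();
console.log(models.map((m: { id: string }) => m.id));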
GET /v1/models/recommend
The opinionated bit. Tell us your vibe, get one model back.
Try it: fast · balanced · best · best anthropic · cheapest reasoner
Available filters (stack them; see the sketch after this list):
?tier=fast # cheap and quick
?tier=balanced # the daily driver
?tier=best # flagship, money-no-object
?provider=anthropic # restrict to one provider
?capability=reasoning # require a capability
?maxCostPerMTok=2 # under $2/M input tokens
?minContextWindow=200000
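A hedged sketch combining filters, assuming the recommendation carries the same model object shown under /v1/models/:provider/latest below (that shape isn't explicitly promised for this endpoint):

// Sketch: cheapest reasoning-capable model under $2/M input tokens.
const params = new URLSearchParams({ capability: 'reasoning', maxCostPerMTok: '2' });
const res = await fetch(`https://thelatestmodel.com/v1/models/recommend?${params}`);
const { recommendation } = await res.json();
// Assumes recommendation is a full model object (id, pricing, ...).
console.log(recommendation.id, recommendation.pricing.inputPerMTok);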
GET /v1/models/providers
List of providers we track and how many models each has. Have a look.
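Handy for building a provider dropdown. A minimal sketch (the response shape isn't documented above, so we just log whatever comes back):

const res = await fetch('https://thelatestmodel.com/v1/models/providers');
console.log(await res.json()); // provider names with model counts, per the description above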
GET /v1/models/:provider/latest
The latest non-deprecated model from a given provider.
{
  "id": "claude-opus-4-7",
  "provider": "anthropic",
  "name": "Claude Opus 4.7",
  "family": "claude-opus",
  "contextWindow": 1000000,
  "outputLimit": 128000,
  "pricing": {
    "inputPerMTok": 5,        // USD, per million input tokens
    "outputPerMTok": 25,
    "cacheReadPerMTok": 0.5,
    "cacheWritePerMTok": 6.25
  },
  "capabilities": {
    "vision": true,
    "pdf": true,
    "reasoning": true,
    "toolCalling": true,
    "structuredOutput": false
  },
  "modalities": {
    "input": ["text", "image", "pdf"],
    "output": ["text"]
  },
  "releaseDate": "2026-04-16",
  "knowledgeCutoff": "2026-01-31",
  "openWeights": false,
  "tier": "best",
  "deprecated": false,
  "updatedAt": "2026-05-02T06:00:00Z"
}
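Every pricing field is USD per million tokens, so a back-of-envelope cost estimate is just token counts divided by a million. A sketch using the example pricing above:

// Sketch: rough request cost from the pricing block.
// Prices are USD per million tokens, hence the / 1_000_000.
function estimateCostUSD(
  pricing: { inputPerMTok: number; outputPerMTok: number },
  inputTokens: number,
  outputTokens: number,
): number {
  return (
    (inputTokens / 1_000_000) * pricing.inputPerMTok +
    (outputTokens / 1_000_000) * pricing.outputPerMTok
  );
}

// e.g. 12k input / 800 output tokens on the example model:
// 0.012 * 5 + 0.0008 * 25 = $0.08
estimateCostUSD({ inputPerMTok: 5, outputPerMTok: 25 }, 12_000, 800);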
import OpenAI from 'openai';

const openai = new OpenAI();

const res = await fetch('https://thelatestmodel.com/v1/models/recommend?tier=fast');
const { recommendation } = await res.json();

// recommendation.id is your model string — pass it straight through
await openai.chat.completions.create({
  model: recommendation.id,
  messages: [{ role: 'user', content: 'Hello there.' }],
});
import requests
import openai

r = requests.get('https://thelatestmodel.com/v1/models/recommend', params={'tier': 'fast'})
model_id = r.json()['recommendation']['id']

client = openai.OpenAI()
client.chat.completions.create(
    model=model_id,
    messages=[{'role': 'user', 'content': 'Hello there.'}]
)
# Get the recommended model ID
curl "https://thelatestmodel.com/v1/models/recommend?tier=fast" | jq -r '.recommendation.id'
This is a free service. Cache the answer locally, fall back to a hardcoded model if we have a bad day.
const FALLBACK = 'claude-haiku-4-5';

async function pickModel() {
  try {
    const ctrl = new AbortController();
    const timer = setTimeout(() => ctrl.abort(), 1500); // give up after 1.5s
    const r = await fetch(
      'https://thelatestmodel.com/v1/models/recommend?tier=fast',
      { signal: ctrl.signal }
    );
    clearTimeout(timer); // don't leave the timer dangling on success
    if (!r.ok) throw new Error(`status ${r.status}`);
    const { recommendation } = await r.json();
    return recommendation.id;
  } catch {
    return FALLBACK;
  }
}
import requests
from requests.exceptions import RequestException

FALLBACK = 'claude-haiku-4-5'

def pick_model():
    try:
        r = requests.get(
            'https://thelatestmodel.com/v1/models/recommend',
            params={'tier': 'fast'},
            timeout=1.5
        )
        r.raise_for_status()
        return r.json()['recommendation']['id']
    except (RequestException, KeyError):
        return FALLBACK
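To cache the answer locally, as suggested above, wrap pickModel() in a small TTL memo. A sketch (the one-hour TTL is an arbitrary choice; the data only refreshes daily anyway):

// Sketch: in-memory cache so most calls never touch the network.
// Pairs with pickModel() from the snippet above.
let cached: { id: string; fetchedAt: number } | null = null;
const TTL_MS = 60 * 60 * 1000; // re-check hourly

async function cachedModel(): Promise<string> {
  const now = Date.now();
  if (cached && now - cached.fetchedAt < TTL_MS) return cached.id;
  const id = await pickModel(); // returns FALLBACK on any error
  cached = { id, fetchedAt: now };
  return id;
}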
Every morning at 06:00 UTC, a robot wakes up, asks models.dev what's new, bakes the answer straight into a fresh deploy, and goes back to bed. No drift, no stale config files, no awkward staging environments. The robot does not get tipped.
The current API is v1. Endpoints under /v1/... are stable. Unprefixed endpoints (e.g. /models) currently mirror v1 but may change in future.
Production callers should use /v1/... to avoid breakage when v2 ships.
$0.00 — runs on Cloudflare Workers' free tier with aggressive edge caching. If thousands of you start hammering it, the cache serves nearly all of you without ever waking the worker. Beautiful, really.
© 2026 New Moonshot Limited. All rights reserved.
Disclaimer: This API aggregates publicly available pricing data from models.dev. While we make reasonable efforts to keep data current, pricing and model availability change frequently. Always verify current pricing and model status directly with the provider's official API before making purchasing decisions or deploying to production. We are not responsible for pricing inaccuracies, deprecated models, or financial losses resulting from reliance on this data. Model deprecation, availability, and pricing are subject to each provider's terms of service. This is a free, best-effort service provided "as is" with no guarantees or warranties of any kind.