theend.my  /  v2.0

Production-grade inference endpoints built for developers who ship.
Fast, reliable, and brutally simple to integrate.

Requests
0 Models
0 Endpoints
0ms Latency
Scroll
// API Surface

Six endpoints.
Infinite possibilities.

POST
/api/chat
Stream chat completions with any model in real time.
POST
/api/search
Live web search with AI synthesis over fresh results.
POST
/api/merge
Multi-model synthesis — blend outputs from multiple LLMs.
GET
/api/models
Full model catalog. No API key required to browse.
POST
/api/attachments/extract
Extract structured text from PDF, DOCX, and XLSX files.
POST
/v1/messages
Anthropic SDK compatible — drop-in replacement endpoint.
// Why theend.my

Built different.

Real-time Streaming

<100ms first byte. SSE delta tokens delivered the instant the model generates them. No buffering, no waiting.

Your Domain. Your Brand.

Clients hit your endpoint. Upstream providers stay invisible. Full white-label API with zero traces of the source.

Anthropic-Compatible

Drop in ANTHROPIC_BASE_URL and you're done. Every existing SDK, tool, and workflow works without a single code change.

Merge Intelligence

2–3 models draft, critique, and synthesize into one refined answer. Consensus beats any single model every time.

Live Web Search

Search the web and synthesize results in a single API call. Fresh data, grounded answers, one request.

File Extraction

PDF, DOCX, XLSX — upload and chat. Clean text extracted server-side, fed directly into the model context.

// Documentation

Start in 60 seconds.

Claude Code Quickstart
Point Claude Code at ai.theend.my — set these in PowerShell:
# Set your API gateway endpoint
$env:ANTHROPIC_BASE_URL = "https://ai.theend.my"
$env:ANTHROPIC_API_KEY = "YOUR_KEY"
# Launch Claude Code
claude

Three steps.

00
01 — Get key

Create an account.

Sign up and generate your API key in seconds. No credit card required to start.

00
02 — Pick model

Choose your model.

Select from models tuned for speed, reasoning, or multimodal tasks.

00
03 — Start streaming

Ship in one call.

Send your first request and get streaming responses with sub-100ms time-to-first-token.


Paiement par token.
Jusqu'à 60× moins cher.

Tous les modèles — Claude Opus 4, GPT-5, Gemini 2.5 Pro. Prix bien en dessous du marché.

Starter
€20
/ mois
Input tokens50M
Output tokens10M
Prix/1M input€0.40
Prix/1M output€2.00
≈ 12× moins cher qu'Anthropic direct
  • GPT-5, Claude Opus 4, Gemini
  • Streaming SSE
  • API /v1/messages compatible
  • Support communauté
Scale
€250
/ mois
Input tokens1 Milliard
Output tokens200M
Prix/1M input€0.25
Prix/1M output€1.25
≈ 60× moins cher qu'Anthropic direct
  • Tout du Growth
  • SLA 99.9% garanti
  • Clés API multiples
  • Support dédié prioritaire