Quickstart
One call to get a routing decision. No auth required in v1 — the demo router is public.
curl -X POST https://slng.polsia.app/api/v1/route \
  -H "Content-Type: application/json" \
  -d '{"use_case":"transcription","region":"us","optimize_for":"latency"}'
Response:
{
  "request_id": "3f8a2b1c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
  "provider": "Deepgram",
  "model": "Nova-3",
  "region_endpoint": "us-east-1",
  "estimated_latency_ms": 180,
  "estimated_cost_per_minute_usd": 0.0043,
  "fallback": ["AssemblyAI Universal-2", "Google STT us-central"],
  "routing": {
    "use_case": "transcription",
    "region": "us",
    "optimize_for": "latency",
    "input_format": null,
    "reason": "Lowest p95 latency in us-east (avg 180ms). Streaming-first, phone-grade noise handling."
  }
}
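The same call can be made from code. The following is a minimal Python sketch using only the standard library. The helper names `build_route_request` and `get_route` are illustrative and not part of any slng SDK, and the 10-second timeout is an arbitrary choice.

```python
import json
import urllib.request

ROUTE_URL = "https://slng.polsia.app/api/v1/route"


def build_route_request(use_case, region, optimize_for, input_format=None):
    """Build (but do not send) the POST request for /api/v1/route."""
    payload = {
        "use_case": use_case,
        "region": region,
        "optimize_for": optimize_for,
    }
    # input_format is optional, so include it only when the caller sets it.
    if input_format is not None:
        payload["input_format"] = input_format
    return urllib.request.Request(
        ROUTE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def get_route(use_case, region, optimize_for, input_format=None, timeout=10):
    """Send the request and return the decoded JSON routing decision."""
    request = build_route_request(use_case, region, optimize_for, input_format)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `get_route("transcription", "us", "latency")` should return a dictionary shaped like the response above, so `decision["provider"]` and `decision["fallback"]` are the fields most clients would read first.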
POST /api/v1/route
Returns the provider slng would route to for the given workload parameters. Each response is logged (provider, params, request_id) for usage analytics.
Request body
| Parameter | Type | Required | Description |
|---|---|---|---|
| use_case | string | required | Type of voice workload. One of: transcription · tts · conversational |
| region | string | required | Geographic region for data residency and latency routing. One of: us · eu · apac |
| optimize_for | string | required | Primary optimization target. One of: latency · cost |
| input_format | string | optional | Audio input format. Affects provider selection when transcribing. One of: wav · mp3 · pcm16 |
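Because an invalid body is rejected with a 400 and still counts as a request, it can help to check parameters locally first. The sketch below encodes the allowed values listed above. The function name `validate_route_params` is hypothetical, and flagging unknown parameters is a choice made here, since the server's handling of extra fields is not documented.

```python
ALLOWED_VALUES = {
    "use_case": {"transcription", "tts", "conversational"},
    "region": {"us", "eu", "apac"},
    "optimize_for": {"latency", "cost"},
    "input_format": {"wav", "mp3", "pcm16"},
}
REQUIRED = ("use_case", "region", "optimize_for")


def validate_route_params(params):
    """Return a list of problems; an empty list means the body looks valid."""
    problems = []
    # Required parameters are checked first, in documented order.
    for name in REQUIRED:
        if name not in params:
            problems.append(f"missing required parameter: {name}")
    # Then every supplied parameter is checked against its allowed values.
    for name, value in params.items():
        allowed = ALLOWED_VALUES.get(name)
        if allowed is None:
            problems.append(f"unknown parameter: {name}")
        elif not isinstance(value, str) or value not in allowed:
            problems.append(f"invalid value for {name}: {value!r}")
    return problems
```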
Example — transcription, EU, optimize latency
curl -X POST https://slng.polsia.app/api/v1/route \
  -H "Content-Type: application/json" \
  -d '{"use_case":"transcription","region":"eu","optimize_for":"latency","input_format":"wav"}'
Example — TTS, US, optimize cost
curl -X POST https://slng.polsia.app/api/v1/route \
  -H "Content-Type: application/json" \
  -d '{"use_case":"tts","region":"us","optimize_for":"cost"}'
Example — conversational agent, APAC
curl -X POST https://slng.polsia.app/api/v1/route \
  -H "Content-Type: application/json" \
  -d '{"use_case":"conversational","region":"apac","optimize_for":"latency"}'
GET /api/v1/providers
Returns the full provider catalog used by the router — names, capabilities, regions, latency benchmarks, and pricing. Useful for building your own tooling on top of the routing data.
curl https://slng.polsia.app/api/v1/providers
Each provider entry:
{
  "id": "deepgram",
  "name": "Deepgram",
  "capability": "stt",
  "regions": ["us-east-1", "eu-west-1", "ap-east-1"],
  "latency_p50_ms": 170,
  "cost_per_minute_usd": 0.0043,
  "notes": "Nova-3 pricing. Source: https://deepgram.com/pricing (May 2026)"
}
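For tooling built on the catalog, it may be convenient to load entries into typed records. The sketch below mirrors the fields of the entry above. The `Provider` class and `index_by_id` helper are hypothetical names, and the code assumes you have already extracted an iterable of entry dictionaries, since the outer shape of the response is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    id: str
    name: str
    capability: str
    regions: tuple
    latency_p50_ms: int
    cost_per_minute_usd: float
    notes: str = ""

    @classmethod
    def from_dict(cls, entry):
        """Build a Provider from one catalog entry dictionary."""
        return cls(
            id=entry["id"],
            name=entry["name"],
            capability=entry["capability"],
            regions=tuple(entry["regions"]),
            latency_p50_ms=entry["latency_p50_ms"],
            cost_per_minute_usd=entry["cost_per_minute_usd"],
            # notes is treated as optional here; that is an assumption.
            notes=entry.get("notes", ""),
        )


def index_by_id(entries):
    """Map provider id to Provider for quick lookups."""
    return {entry["id"]: Provider.from_dict(entry) for entry in entries}
```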
Response shape
All responses are JSON. Success responses from /api/v1/route have the shape shown in the Quickstart response; the routing object echoes the request's use_case, region, and optimize_for.
Error responses
400 Bad Request — invalid or missing required parameter. Body: { "error": "..." }
429 Too Many Requests — rate limit exceeded. Body: { "error": "...", "retry_after_seconds": N }. Header: Retry-After: N
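One way a client might surface these two error cases is to map them to exceptions. This is a sketch, not an official client: the exception classes and `error_from_response` are hypothetical names, and it assumes the status code, decoded JSON body, and headers have already been read from the HTTP response.

```python
class RouteError(Exception):
    """Base class for errors returned by the routing API."""


class BadRequestError(RouteError):
    """400: an invalid or missing required parameter."""


class RateLimitedError(RouteError):
    """429: rate limit exceeded; retry_after_seconds may be None."""

    def __init__(self, message, retry_after_seconds):
        super().__init__(message)
        self.retry_after_seconds = retry_after_seconds


def error_from_response(status, body, headers):
    """Translate an error status, JSON body, and headers into an exception."""
    message = body.get("error", "")
    if status == 400:
        return BadRequestError(message)
    if status == 429:
        # Prefer the body field; fall back to the Retry-After header.
        retry_after = body.get("retry_after_seconds")
        if retry_after is None:
            lowered = {key.lower(): value for key, value in headers.items()}
            retry_after = lowered.get("retry-after")
        seconds = int(retry_after) if retry_after is not None else None
        return RateLimitedError(message, seconds)
    # Any other status is outside what is documented here.
    return RouteError(f"unexpected status {status}: {message}")
```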
Rate limits
30 requests per minute per IP. Tracked in-memory, resets on server restart.
Rate limit headers are included on every response:
X-RateLimit-Limit: 30
X-RateLimit-Remaining: 29
When the limit is exceeded, the response is 429 with a Retry-After header indicating seconds until the window resets.
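A client that sends requests in a loop can avoid most 429s by pacing itself. The sketch below spaces calls evenly (one every 2 seconds for 30 per minute), which keeps the long-run rate at or below the limit. `RequestPacer` is a hypothetical helper, and even spacing is a conservative choice made here because the server's exact window behavior is not specified.

```python
import time


class RequestPacer:
    """Space out calls so the rate stays at or below limit per window."""

    def __init__(self, limit=30, window_seconds=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = window_seconds / limit
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no request has been sent yet

    def wait(self):
        """Block until the next request may be sent; return seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        # Reserve the slot for the request that is about to be sent.
        self._next_allowed = now + self.min_interval
        return delay
```

Call `pacer.wait()` immediately before each request. The `clock` and `sleep` parameters exist so the pacing logic can be exercised without real delays.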
v1 is a demo router for evaluation: rate limits are intentionally generous. Production access (higher limits, SLA, dedicated routing) requires an API key.
How routing decisions are made
Every call to POST /api/v1/route runs through the same decision engine slng uses internally. The router evaluates three dimensions:
Use case
STT, TTS, and LLM providers have different latency/cost profiles. The router only considers providers in the correct capability bucket — Deepgram is never a TTS fallback.
Region
Data residency is a hard constraint. EU traffic routes exclusively to EU endpoints — providers without EU infrastructure are excluded, regardless of latency or cost. Same for APAC. Region selection determines which providers are in scope before any optimization is applied.
Optimization target
Within the eligible provider set for a use case + region, the router selects the best option for your stated priority. Latency picks the provider with the lowest p95 response time. Cost picks the cheapest option that still meets quality requirements for the workload.
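The three dimensions above amount to two hard filters followed by a sort. The following sketch illustrates that order of operations over catalog-style entries and is not slng's actual engine. Several parts are assumptions: the mapping from use_case to capability (only "stt" appears in the catalog example), the mapping from region to endpoint prefix, and the use of latency_p50_ms as a stand-in for p95, which the catalog entry does not expose. The quality requirement applied to cost routing is not modeled.

```python
# Assumed mappings; only "stt" and the us/eu/ap endpoint names appear in the docs.
CAPABILITY_FOR_USE_CASE = {
    "transcription": "stt",
    "tts": "tts",
    "conversational": "llm",
}
ENDPOINT_PREFIX_FOR_REGION = {"us": "us-", "eu": "eu-", "apac": "ap-"}
METRIC_FOR_TARGET = {
    "latency": "latency_p50_ms",  # stand-in for p95
    "cost": "cost_per_minute_usd",
}


def rank_providers(catalog, use_case, region, optimize_for):
    """Filter by capability and region, then sort by the chosen metric.

    The first element would be the primary choice; the rest form an
    ordered list of fallback candidates.
    """
    capability = CAPABILITY_FOR_USE_CASE[use_case]
    prefix = ENDPOINT_PREFIX_FOR_REGION[region]
    metric = METRIC_FOR_TARGET[optimize_for]
    eligible = [
        provider
        for provider in catalog
        # Hard constraints: correct capability and an endpoint in the region.
        if provider["capability"] == capability
        and any(name.startswith(prefix) for name in provider["regions"])
    ]
    # Optimization applies only within the eligible set.
    return sorted(eligible, key=lambda provider: provider[metric])
```

An empty result means no provider satisfies the hard constraints, which matches the rule that region is applied before any optimization.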
Failover chain
The fallback array is an ordered list of alternatives that slng activates if the primary provider rate-limits, returns errors, or exceeds latency thresholds.
Failover is automatic — no code change, no incident.
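slng handles failover on its side, so clients do not need to implement it. Purely to illustrate the ordering semantics, here is a sketch of walking a chain of the primary followed by its fallbacks. The function `call_with_failover`, the `invoke` callback contract, and the `max_latency_ms` threshold are all hypothetical.

```python
def call_with_failover(candidates, invoke, max_latency_ms=None):
    """Try candidates in order; return (candidate, result) for the first success.

    invoke(candidate) must return (result, latency_ms) or raise on failure.
    """
    failures = []
    for candidate in candidates:
        try:
            result, latency_ms = invoke(candidate)
        except Exception as exc:
            # Rate limits and provider errors both move on to the next candidate.
            failures.append((candidate, repr(exc)))
            continue
        if max_latency_ms is not None and latency_ms > max_latency_ms:
            # Too slow counts as a failure for this candidate.
            failures.append((candidate, f"latency {latency_ms}ms over threshold"))
            continue
        return candidate, result
    raise RuntimeError(f"all candidates failed: {failures}")
```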
The routing table used by this API is exactly the same one powering the playground. A /compare endpoint (coming soon) will let you see how routing decisions differ across optimization targets and regions side-by-side.