Your API endpoint returns a response in 340ms. Is that good? Bad? Acceptable?
It depends. A search endpoint aggregating millions of records at 340ms is excellent. A simple GET /api/user/me at 340ms is slow — something is wrong.
This guide covers real-world API response time benchmarks, what causes slow APIs, and how to measure and improve yours.
API Response Time Benchmarks (2026)
| Endpoint type | Excellent | Good | Acceptable | Needs work |
|---|---|---|---|---|
| Simple CRUD (GET/POST) | < 50ms | < 100ms | < 200ms | > 500ms |
| DB joins / aggregations | < 100ms | < 250ms | < 500ms | > 1s |
| Search / filtering | < 200ms | < 500ms | < 1s | > 2s |
| Report generation | < 500ms | < 1s | < 3s | > 5s |
| Third-party API proxy | < 300ms | < 500ms | < 1s | > 2s |
| File upload processing | < 1s | < 3s | < 5s | > 10s |
These are server-side processing times (latency), not total response times including network. Add 20-150ms for network depending on geography.
Response Time vs Latency: What's the Difference?
Total response time = Network latency + Server processing time
Example:
DNS lookup: 15ms
TCP connect: 25ms
TLS handshake: 30ms
Server processing: 120ms ← This is "API latency"
Data transfer: 10ms
──────────────────────────
Total response: 200ms ← This is "response time"

API latency is what your code controls. Response time is what users experience. Monitor both — optimize latency, track response time.
Why Averages Lie — Use Percentiles
If 99 requests take 50ms and 1 takes 5 seconds, the average is 99.5ms. Looks fine. But 1 in 100 users waits 5 seconds — and if a user makes 100 requests per session, they have a 63% chance of hitting that slow request.
Use percentiles instead:
- P50 (median): The typical experience. Half your requests are faster than this.
- P95: What your slowest 5% of users experience. This is where cold starts, slow queries, and external API delays show up.
- P99: The worst 1%. Connection pool exhaustion, GC pauses, timeout cascades. Your power users will hit this regularly.
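Computed from raw latency samples, percentiles are a few lines of code. A minimal TypeScript sketch (nearest-rank style; real monitoring systems use streaming estimators such as HDR histograms rather than sorting every sample):

```typescript
// Percentile by rank over sorted samples (ms).
// Index convention: floor(p/100 * n), clamped (one common definition among several).
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b)
  const idx = Math.min(Math.floor((p / 100) * sorted.length), sorted.length - 1)
  return sorted[idx]
}

// The example above: 99 requests at 50ms, one at 5 seconds.
const samples = [...Array(99).fill(50), 5000]
const avg = samples.reduce((a, b) => a + b, 0) / samples.length

console.log(`avg=${avg}ms`)                     // avg=99.5ms, looks healthy
console.log(`p50=${percentile(samples, 50)}ms`) // p50=50ms
console.log(`p99=${percentile(samples, 99)}ms`) // p99=5000ms, the outlier is visible
```

The average hides the outlier entirely; P99 surfaces it.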
// Good API performance profile:
// P50: 45ms — typical request, healthy
// P95: 180ms — some slower queries, acceptable
// P99: 450ms — occasional spikes, still under 500ms
//
// Bad API performance profile:
// P50: 80ms — looks fine...
// P95: 2.4s — 5% of users waiting 2+ seconds
// P99: 8.1s — 1% getting timeouts
Top 6 Causes of Slow API Response Times
1. Missing database indexes
The #1 cause. A query on a 1M-row table without an index does a full table scan. Adding the right index can cut a 3-second query down to 5ms.
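The typical diagnose-and-fix loop, sketched here as SQL strings (the table and column names are hypothetical; adapt them to your schema):

```typescript
// Step 1: ask the database how it executes the slow query.
// "Seq Scan on orders" in the output means a full table scan.
const diagnose = 'EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;'

// Step 2: index the column the WHERE clause filters on.
const fix = 'CREATE INDEX idx_orders_customer_id ON orders (customer_id);'

// Step 3: re-run the EXPLAIN ANALYZE; the plan should now show
// "Index Scan using idx_orders_customer_id" with a far lower cost.
console.log(diagnose)
console.log(fix)
```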
2. N+1 queries
Fetching a list of 50 orders, then making 50 separate queries to get each order's items. Solution: use JOINs or eager loading.
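The difference is easiest to see as query counts. A self-contained sketch with an in-memory stand-in for the database (the tables and the counter are illustrative, not a real client):

```typescript
const orders = [{ id: 1 }, { id: 2 }, { id: 3 }]
const items = [
  { orderId: 1, sku: 'a-1' },
  { orderId: 2, sku: 'b-1' },
  { orderId: 3, sku: 'c-1' },
]
let queryCount = 0

// N+1: one query for the order list, then one more per order.
function nPlusOne() {
  queryCount++ // SELECT * FROM orders
  return orders.map((o) => {
    queryCount++ // SELECT * FROM items WHERE order_id = $1 (runs 50x for 50 orders)
    return { ...o, items: items.filter((i) => i.orderId === o.id) }
  })
}

// Eager loading: one query with a JOIN (or WHERE order_id IN (...)).
function eagerLoad() {
  queryCount++ // SELECT ... FROM orders LEFT JOIN items ON items.order_id = orders.id
  const byOrder = new Map<number, { orderId: number; sku: string }[]>()
  for (const i of items) byOrder.set(i.orderId, [...(byOrder.get(i.orderId) ?? []), i])
  return orders.map((o) => ({ ...o, items: byOrder.get(o.id) ?? [] }))
}

queryCount = 0; nPlusOne()
const naiveQueries = queryCount // 4 queries for 3 orders; grows with the list
queryCount = 0; eagerLoad()
const eagerQueries = queryCount // always 1
console.log({ naiveQueries, eagerQueries })
```

The naive version scales linearly with list size; the eager version stays at one query no matter how many orders are returned.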
3. No connection pooling
Opening a new database connection for every request adds 20-50ms. A connection pool reuses connections — the first request pays the setup cost; subsequent requests skip it.
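A toy pool makes the reuse visible. This is a sketch of the mechanism only; in practice, use your driver's built-in pool (e.g. `pg.Pool` in node-postgres) rather than rolling your own:

```typescript
let connectionsOpened = 0

type Conn = { id: number }

// Stand-in for a real connect(): in production this is the 20-50ms
// TCP + auth handshake that the pool exists to avoid repeating.
function connect(): Conn {
  connectionsOpened++
  return { id: connectionsOpened }
}

class Pool {
  private idle: Conn[] = []
  acquire(): Conn {
    return this.idle.pop() ?? connect() // reuse an idle conn; open one only if none exist
  }
  release(conn: Conn): void {
    this.idle.push(conn)
  }
}

const pool = new Pool()
for (let i = 0; i < 100; i++) {
  const conn = pool.acquire()
  // ...run this request's queries on conn...
  pool.release(conn)
}
console.log(`100 requests, ${connectionsOpened} connection(s) opened`)
// 100 requests, 1 connection(s) opened
```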
4. External API calls without timeouts
Your endpoint calls Stripe or SendGrid. That API is slow today. Without a timeout, your endpoint waits indefinitely. Set timeouts on every external call (10s max).
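A deadline can be enforced generically with Promise.race. A sketch (with fetch, prefer passing `AbortSignal.timeout(ms)` as the request signal so the underlying socket is actually torn down, not just abandoned):

```typescript
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer!: ReturnType<typeof setTimeout>
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`deadline of ${ms}ms exceeded`)), ms)
  })
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([work, deadline]).finally(() => clearTimeout(timer))
}

// Simulated slow third-party call (300ms) against a 100ms budget:
const slowUpstream = new Promise<string>((resolve) => setTimeout(() => resolve('ok'), 300))
withTimeout(slowUpstream, 100)
  .then((v) => console.log('upstream said', v))
  .catch((err) => console.log('failing fast:', err.message))
// failing fast: deadline of 100ms exceeded
```

Failing fast after 100ms lets your endpoint return a degraded response instead of stalling every caller behind the slow upstream.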
5. Serverless cold starts
On Vercel or Lambda, the first request after inactivity initializes the function: 200-2000ms of overhead. Provisioned concurrency or keep-warm pings help.
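One keep-warm approach on Vercel is a cron job that pings a cheap endpoint so an instance stays initialized. A sketch of the vercel.json config (the path is hypothetical; allowed cron frequency depends on your plan, and a single ping only warms one instance):

```json
{
  "crons": [
    { "path": "/api/health", "schedule": "*/5 * * * *" }
  ]
}
```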
6. Response payload too large
Returning 10MB of JSON when the client only needs 10 fields. Use pagination, field selection (?fields=id,name,email), and compression.
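Field selection can be a small helper that filters each row to the requested keys. A sketch (the `?fields=` parameter name is a common convention, not a standard):

```typescript
type Row = Record<string, unknown>

// Keep only the requested keys of each row.
function pickFields(row: Row, fields: string[]): Row {
  const out: Row = {}
  for (const f of fields) if (f in row) out[f] = row[f]
  return out
}

// fieldsParam is the raw value of ?fields= (e.g. "id,name,email");
// absent or empty means "return everything".
function selectFields(rows: Row[], fieldsParam?: string): Row[] {
  const fields = (fieldsParam ?? '').split(',').map((f) => f.trim()).filter(Boolean)
  if (fields.length === 0) return rows
  return rows.map((row) => pickFields(row, fields))
}

const users = [{ id: 1, name: 'Ada', email: 'ada@example.com', bio: 'long profile text…' }]
console.log(JSON.stringify(selectFields(users, 'id,name')))
// [{"id":1,"name":"Ada"}]
```

Combined with pagination and gzip/brotli compression, this keeps payloads proportional to what the client actually renders.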
How to Measure API Response Time
Quick test: curl
curl -w "DNS: %{time_namelookup}s\nConnect: %{time_connect}s\nTLS: %{time_appconnect}s\nTTFB: %{time_starttransfer}s\nTotal: %{time_total}s\n" \
-o /dev/null -s https://yourapi.com/api/users
Ongoing monitoring: Nurbak Watch
For continuous P50/P95/P99 tracking across every API route, Nurbak Watch runs inside your Next.js server and measures real server-side latency — not synthetic pings:
// instrumentation.ts
import { initWatch } from '@nurbak/watch'

export function register() {
  initWatch({
    apiKey: process.env.NURBAK_WATCH_KEY,
  })
}

Every API route auto-discovered. P50/P95/P99 from real traffic. Alerts via Slack/WhatsApp when latency spikes. Free during beta.

