Your API endpoint returns a response in 340ms. Is that good? Bad? Acceptable?
It depends. A search endpoint aggregating millions of records at 340ms is excellent. A simple GET /api/user/me at 340ms is slow — something is wrong.
This guide covers real-world API response time benchmarks, what causes slow APIs, and how to measure and improve yours.
API Response Time Benchmarks (2026)
| Endpoint type | Excellent | Good | Acceptable | Needs work |
|---|---|---|---|---|
| Simple CRUD (GET/POST) | < 50ms | < 100ms | < 200ms | > 500ms |
| DB joins / aggregations | < 100ms | < 250ms | < 500ms | > 1s |
| Search / filtering | < 200ms | < 500ms | < 1s | > 2s |
| Report generation | < 500ms | < 1s | < 3s | > 5s |
| Third-party API proxy | < 300ms | < 500ms | < 1s | > 2s |
| File upload processing | < 1s | < 3s | < 5s | > 10s |
These are server-side processing times (latency), not total response times including network. Add 20-150ms for network depending on geography.
Response Time vs Latency: What's the Difference?
Total response time = Network latency + Server processing time
Example:
DNS lookup: 15ms
TCP connect: 25ms
TLS handshake: 30ms
Server processing: 120ms ← This is "API latency"
Data transfer: 10ms
──────────────────────────
Total response: 200ms ← This is "response time"

API latency is what your code controls. Response time is what users experience. Monitor both — optimize latency, track response time.
Why Averages Lie — Use Percentiles
If 99 requests take 50ms and 1 takes 5 seconds, the average is 99.5ms. Looks fine. But 1 in 100 users waits 5 seconds — and if a user makes 100 requests per session, they have a 63% chance of hitting that slow request.
Use percentiles instead:
- P50 (median): The typical experience. Half your requests are faster than this.
- P95: What your slowest 5% of users experience. This is where cold starts, slow queries, and external API delays show up.
- P99: The worst 1%. Connection pool exhaustion, GC pauses, timeout cascades. Your power users will hit this regularly.
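Computed from raw latency samples, percentiles are a few lines of code. A minimal TypeScript sketch (nearest-rank style; real monitoring systems use streaming estimators such as HDR histograms rather than sorting every sample):

```typescript
// Percentile by rank over sorted samples (ms).
// Index convention: floor(p/100 * n), clamped (one common definition among several).
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b)
  const idx = Math.min(Math.floor((p / 100) * sorted.length), sorted.length - 1)
  return sorted[idx]
}

// The example above: 99 requests at 50ms, one at 5 seconds.
const samples = [...Array(99).fill(50), 5000]
const avg = samples.reduce((a, b) => a + b, 0) / samples.length

console.log(`avg=${avg}ms`)                     // avg=99.5ms, looks healthy
console.log(`p50=${percentile(samples, 50)}ms`) // p50=50ms
console.log(`p99=${percentile(samples, 99)}ms`) // p99=5000ms, the outlier is visible
```

The average hides the outlier entirely; P99 surfaces it.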
// Good API performance profile:
// P50: 45ms — typical request, healthy
// P95: 180ms — some slower queries, acceptable
// P99: 450ms — occasional spikes, still under 500ms
//
// Bad API performance profile:
// P50: 80ms — looks fine...
// P95: 2.4s — 5% of users waiting 2+ seconds
// P99: 8.1s — 1% getting timeouts
Top 6 Causes of Slow API Response Times
1. Missing database indexes
The #1 cause. A query on a 1M-row table without an index does a full table scan. Adding the right index can cut a 3-second query down to 5ms.
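The typical diagnose-and-fix loop, sketched here as SQL strings (the table and column names are hypothetical; adapt them to your schema):

```typescript
// Step 1: ask the database how it executes the slow query.
// "Seq Scan on orders" in the output means a full table scan.
const diagnose = 'EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;'

// Step 2: index the column the WHERE clause filters on.
const fix = 'CREATE INDEX idx_orders_customer_id ON orders (customer_id);'

// Step 3: re-run the EXPLAIN ANALYZE; the plan should now show
// "Index Scan using idx_orders_customer_id" with a far lower cost.
console.log(diagnose)
console.log(fix)
```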
2. N+1 queries
Fetching a list of 50 orders, then making 50 separate queries to get each order's items. Solution: use JOINs or eager loading.
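The difference is easiest to see as query counts. A self-contained sketch with an in-memory stand-in for the database (the tables and the counter are illustrative, not a real client):

```typescript
const orders = [{ id: 1 }, { id: 2 }, { id: 3 }]
const items = [
  { orderId: 1, sku: 'a-1' },
  { orderId: 2, sku: 'b-1' },
  { orderId: 3, sku: 'c-1' },
]
let queryCount = 0

// N+1: one query for the order list, then one more per order.
function nPlusOne() {
  queryCount++ // SELECT * FROM orders
  return orders.map((o) => {
    queryCount++ // SELECT * FROM items WHERE order_id = $1 (runs 50x for 50 orders)
    return { ...o, items: items.filter((i) => i.orderId === o.id) }
  })
}

// Eager loading: one query with a JOIN (or WHERE order_id IN (...)).
function eagerLoad() {
  queryCount++ // SELECT ... FROM orders LEFT JOIN items ON items.order_id = orders.id
  const byOrder = new Map<number, { orderId: number; sku: string }[]>()
  for (const i of items) byOrder.set(i.orderId, [...(byOrder.get(i.orderId) ?? []), i])
  return orders.map((o) => ({ ...o, items: byOrder.get(o.id) ?? [] }))
}

queryCount = 0; nPlusOne()
const naiveQueries = queryCount // 4 queries for 3 orders; grows with the list
queryCount = 0; eagerLoad()
const eagerQueries = queryCount // always 1
console.log({ naiveQueries, eagerQueries })
```

The naive version scales linearly with list size; the eager version stays at one query no matter how many orders are returned.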
3. No connection pooling
Opening a new database connection for every request adds 20-50ms. A connection pool reuses connections — the first request pays the setup cost; subsequent requests skip it.
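A toy pool makes the reuse visible. This is a sketch of the mechanism only; in practice, use your driver's built-in pool (e.g. `pg.Pool` in node-postgres) rather than rolling your own:

```typescript
let connectionsOpened = 0

type Conn = { id: number }

// Stand-in for a real connect(): in production this is the 20-50ms
// TCP + auth handshake that the pool exists to avoid repeating.
function connect(): Conn {
  connectionsOpened++
  return { id: connectionsOpened }
}

class Pool {
  private idle: Conn[] = []
  acquire(): Conn {
    return this.idle.pop() ?? connect() // reuse an idle conn; open one only if none exist
  }
  release(conn: Conn): void {
    this.idle.push(conn)
  }
}

const pool = new Pool()
for (let i = 0; i < 100; i++) {
  const conn = pool.acquire()
  // ...run this request's queries on conn...
  pool.release(conn)
}
console.log(`100 requests, ${connectionsOpened} connection(s) opened`)
// 100 requests, 1 connection(s) opened
```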
4. External API calls without timeouts
Your endpoint calls Stripe or SendGrid. That API is slow today. Without a timeout, your endpoint waits indefinitely. Set timeouts on every external call (10s max).
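A deadline can be enforced generically with Promise.race. A sketch (with fetch, prefer passing `AbortSignal.timeout(ms)` as the request signal so the underlying socket is actually torn down, not just abandoned):

```typescript
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer!: ReturnType<typeof setTimeout>
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`deadline of ${ms}ms exceeded`)), ms)
  })
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([work, deadline]).finally(() => clearTimeout(timer))
}

// Simulated slow third-party call (300ms) against a 100ms budget:
const slowUpstream = new Promise<string>((resolve) => setTimeout(() => resolve('ok'), 300))
withTimeout(slowUpstream, 100)
  .then((v) => console.log('upstream said', v))
  .catch((err) => console.log('failing fast:', err.message))
// failing fast: deadline of 100ms exceeded
```

Failing fast after 100ms lets your endpoint return a degraded response instead of stalling every caller behind the slow upstream.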
5. Serverless cold starts
On Vercel or Lambda, the first request after inactivity initializes the function: 200-2000ms of overhead. Provisioned concurrency or keep-warm pings help.
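One keep-warm approach on Vercel is a cron job that pings a cheap endpoint so an instance stays initialized. A sketch of the vercel.json config (the path is hypothetical; allowed cron frequency depends on your plan, and a single ping only warms one instance):

```json
{
  "crons": [
    { "path": "/api/health", "schedule": "*/5 * * * *" }
  ]
}
```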
6. Response payload too large
Returning 10MB of JSON when the client only needs 10 fields. Use pagination, field selection (?fields=id,name,email), and compression.
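Field selection can be a small helper that filters each row to the requested keys. A sketch (the `?fields=` parameter name is a common convention, not a standard):

```typescript
type Row = Record<string, unknown>

// Keep only the requested keys of each row.
function pickFields(row: Row, fields: string[]): Row {
  const out: Row = {}
  for (const f of fields) if (f in row) out[f] = row[f]
  return out
}

// fieldsParam is the raw value of ?fields= (e.g. "id,name,email");
// absent or empty means "return everything".
function selectFields(rows: Row[], fieldsParam?: string): Row[] {
  const fields = (fieldsParam ?? '').split(',').map((f) => f.trim()).filter(Boolean)
  if (fields.length === 0) return rows
  return rows.map((row) => pickFields(row, fields))
}

const users = [{ id: 1, name: 'Ada', email: 'ada@example.com', bio: 'long profile text…' }]
console.log(JSON.stringify(selectFields(users, 'id,name')))
// [{"id":1,"name":"Ada"}]
```

Combined with pagination and gzip/brotli compression, this keeps payloads proportional to what the client actually renders.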
How to Measure API Response Time
Quick test: curl
curl -w "DNS: %{time_namelookup}s\nConnect: %{time_connect}s\nTLS: %{time_appconnect}s\nTTFB: %{time_starttransfer}s\nTotal: %{time_total}s\n" \
-o /dev/null -s https://yourapi.com/api/users
Ongoing monitoring: Nurbak Watch
For continuous P50/P95/P99 tracking across every API route, Nurbak Watch runs inside your Next.js server and measures real server-side latency — not synthetic pings:
// instrumentation.ts
import { initWatch } from '@nurbak/watch'

export function register() {
  initWatch({
    apiKey: process.env.NURBAK_WATCH_KEY,
  })
}

Every API route auto-discovered. P50/P95/P99 from real traffic. Alerts via Slack/WhatsApp when latency spikes. Free during beta.

