The three patterns
Offset pagination
Client asks for "page 5, 50 items per page." Server runs a query like SELECT ... ORDER BY id LIMIT 50 OFFSET 200. Easy to implement, easy to expose ("page 5 of 47"), familiar to anyone who has built a paginated table.
Where it breaks down. Two places. First, the database has to scan and discard the first 200 rows before returning the next 50. The cost is linear in the offset; page 1,000 takes a thousand times longer than page 1. Second, if rows are inserted between requests, the contents of "page 5" shift. A user paging through gets either duplicates or skipped rows depending on which end the inserts land. For an admin dashboard over a small dataset, neither problem matters; for a public API over an active dataset, both do.
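As a concrete sketch of the offset math (the orders table name and psycopg-style %s placeholders are assumptions, not from any particular codebase):

```python
def offset_page_sql(page: int, per_page: int = 50) -> tuple[str, tuple]:
    """Build an offset-pagination query for a 1-based page number.

    The database still scans and discards `offset` rows before
    returning the page, so cost grows linearly with `page`.
    """
    offset = (page - 1) * per_page
    sql = "SELECT * FROM orders ORDER BY id LIMIT %s OFFSET %s"
    return sql, (per_page, offset)
```

For "page 5, 50 items per page", this produces the LIMIT 50 OFFSET 200 query above; the offset is pure arithmetic, which is why the pattern is so easy to implement.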
Keyset pagination
Client asks for "items after the row with id 12345." Server runs a query like SELECT ... WHERE id > 12345 ORDER BY id LIMIT 50. The query uses the primary key index directly — performance is independent of how deep into the dataset you are.
Inserts and deletes don't shift results in problematic ways. A row inserted ahead of the cursor is invisible until the client asks for it; a row inserted behind the cursor doesn't displace anything. Sorting by a single, unique, monotonic column makes this trivial. Sorting by a non-unique column (a timestamp shared by many rows, a name) requires a tie-breaker — typically the primary key as a secondary sort key — and the cursor has to encode both columns.
What you give up. No "go to page 27." No "47 pages total" without a separate count query. The interface is "next" and "previous" and that's it. For most APIs that's a feature, not a bug; for APIs whose UX needs deep linking to a specific page, it's a constraint.
Cursor pagination (opaque tokens)
The client gets an opaque token from the server with each page and passes it back to fetch the next. The token wraps the same information that keyset pagination passes explicitly — a position in some sort order — but the format is the server's business. Clients don't construct cursors; they receive and replay them.
This is the right shape for a public API. The opacity buys you the ability to change the underlying scheme later (switch sort columns, add a stable secondary key, switch from a single-column to a composite cursor) without breaking clients. Internally it is usually keyset pagination with the cursor base64-encoding the relevant column values.
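A minimal encode/decode sketch for such a token. JSON wrapped in URL-safe base64 is one common choice, not a requirement; the payload keys here are assumptions:

```python
import base64
import json

def encode_cursor(created_at: str, id_: int) -> str:
    # The payload format is private to the server; clients replay
    # the token verbatim and never parse it.
    payload = json.dumps({"created_at": created_at, "id": id_})
    return base64.urlsafe_b64encode(payload.encode()).decode()

def decode_cursor(token: str) -> tuple[str, int]:
    data = json.loads(base64.urlsafe_b64decode(token.encode()))
    return data["created_at"], data["id"]
```

Because clients never construct tokens, the server can later change the payload (add a column, switch serialization) without a breaking API change.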
A worked comparison
Suppose you're paginating a list of orders for a single customer, sorted newest-first by created_at. The dataset has 50,000 rows and is actively growing.
- Offset. Page 1 is fast. Page 100 (offset 5,000) is noticeably slower. Page 1,000 (offset 50,000) is a full table scan for that customer. Meanwhile a new order arrives between the client's request for page 1 and page 2; the new order pushes everything down by one, so the last item on page 1 reappears as the first item on page 2.
- Keyset by created_at. Fast at any depth, but two orders created in the same millisecond end up at the same cursor position; one of them gets skipped on the boundary. Easy fix: cursor of (created_at, id) with the query WHERE (created_at, id) < (?, ?).
- Cursor token. Same query as keyset internally, but the API response is "next_cursor": "eyJjcmVhdGVkX2F0IjogIjIwMjYtMDUtMDRUMTI6MDA6MDBaIiwgImlkIjogMTIzNDV9". The client passes the string back as ?cursor=.... The server decodes it, runs the keyset query, and returns the next page plus the next cursor.
The cursor approach is roughly the same code as keyset on the server, with an extra encode/decode step. The protocol is cleaner because the client never has to know what column the cursor is sorting by.
The response shape
Two parts. First, the data itself — an array of items. Second, the pagination metadata. A workable shape:
{
  "data": [ /* items */ ],
  "pagination": {
    "next_cursor": "eyJjcmVhdGVkX2F0Ijog...",
    "has_more": true
  }
}
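One common way to fill has_more without a COUNT query is to ask the database for limit + 1 rows and check for the extra one. A sketch (row shape and cursor encoding are assumptions):

```python
import base64
import json

def build_page(rows: list[dict], limit: int) -> dict:
    """`rows` is the result of a query that asked for limit + 1 rows."""
    has_more = len(rows) > limit
    page = rows[:limit]
    next_cursor = None
    if has_more:
        last = page[-1]
        payload = json.dumps({"created_at": last["created_at"], "id": last["id"]})
        next_cursor = base64.urlsafe_b64encode(payload.encode()).decode()
    return {"data": page,
            "pagination": {"next_cursor": next_cursor, "has_more": has_more}}
```

The extra row is discarded; its mere existence proves there is a next page, which is exactly what has_more promises.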
Things worth deciding once and applying consistently across endpoints:
- Where the cursor lives. A query parameter (?cursor=...) or a Link header are both acceptable. The query parameter is friendlier for cURL and ad-hoc inspection; the Link header is more RESTful and matches the pattern in RFC 8288 (formerly RFC 5988). Pick one.
- Whether to return a count. Counts on large datasets are expensive. If you don't need them, don't promise them. If you do, label them clearly as estimates when they are.
- What "next" means at the end. Either omit next_cursor or set it to null. Don't return the same cursor that the client sent; clients that loop on "next" will spin forever.
- Page-size limits. Cap the maximum page size and document it. Without a cap, a client can ask for a million items, and the request will either time out or try to return them all.
When offset is still fine
Offset is fine — preferable, even — when:
- The dataset is bounded (the customer has fewer than a few thousand records of this type) and you control the upper bound.
- The UI genuinely needs random access into pages — for example, an admin grid with a "go to page 47" control.
- The data is read-heavy and the small risk of duplicate or skipped items between page loads is not material.
"Internal admin tool" is the canonical fit. "Public API" almost never is.
Decision criteria
- If the dataset can grow without bound, you need cursor or keyset.
- If clients are external and you might change the underlying sort, you need opaque cursors rather than exposed keyset values.
- If clients need to deep-link to a specific position, you need offset (or stable, human-readable cursors that include the sort key).
- If sort order is not a unique column, the cursor needs a tie-breaker.
- If you need a total count, decide whether it's worth the query cost; many APIs ship without one and are fine.
Common mistakes
- Cursor that encodes the offset. If your "cursor" is just a base64-encoded offset, you have offset pagination with extra steps. The point of the cursor is to avoid scanning skipped rows.
- Forgetting the secondary sort key. Sort by a non-unique column without a tie-breaker and the cursor will skip or duplicate rows on collisions. Always cursor on a tuple that ends in a unique column.
- Inconsistent direction. If pages go newest-first, the cursor has to specify "older than", not "after". Off-by-one cursor direction shows up as missing or duplicated boundary items.
- No reverse navigation. Some APIs need it (a UI that lets you page backward). Keyset pagination supports it but only if the cursor and the query are both written symmetrically.
- Mixing pagination with filters that change. If the cursor is based on the sort and the client changes the sort or filter mid-iteration, the cursor becomes meaningless. Either invalidate cursors on filter change (return 400) or document explicitly that they're scoped to the original query.
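The direction and reverse-navigation points above can be sketched together: forward and backward pages flip both the comparison operator and the sort direction, and flipping only one of the two is exactly the off-by-one bug described. Names are assumptions carried over from the worked example:

```python
def directional_page_sql(created_at: str, id_: int, limit: int,
                         direction: str = "next") -> tuple[str, tuple]:
    """Newest-first listing: "next" pages toward older rows, "prev"
    toward newer ones. Operator and ORDER BY must flip together."""
    if direction == "next":
        op, order = "<", "DESC"
    else:
        op, order = ">", "ASC"   # query ascending, then reverse the rows
    sql = (f"SELECT * FROM orders WHERE (created_at, id) {op} (%s, %s) "
           f"ORDER BY created_at {order}, id {order} LIMIT %s")
    return sql, (created_at, id_, limit)
```

For "prev", the fetched rows must be reversed before building the response, so the client always sees newest-first regardless of which direction it paged.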
Where to go next
For where pagination sits in the broader design picture, see API Design Best Practices. For how rate limits interact with pagination — clients that page through a large dataset can saturate a quota fast — see API Rate Limiting Strategies. For the REST conventions this platform uses, see the REST reference.