# Pagy::Keyset

In 

Implement wicked-fast keyset pagination for big data.

Try it now!

# Concept

The "Keyset" pagination, also known as "SQL Seek Method" (and often, improperly called "Cursor" pagination) is a technique that avoids the inevitable slowness of querying pages deep into a collection (i.e. when offset is a big number, you're going to get slower queries).

This technique comes with that huge advantage and a set of limitations that makes it particularly useful for APIs and less convenient for UIs.

# Glossary

Term Description
offset pagination Technique to fetch each page by changing the offset from the collection start.
It requires two queries per page (or one if countless): it's slow toward the end of big tables.
It can be used for a rich frontend: it's the regular pagy pagination.
keyset pagination Technique to fetch the next page starting from the latest fetched record in an uniquely ordered collection.
It requires only one query per page: it's very fast regardless the table size and position (if properly indexed). It has a very limited usage in frontend.
uniquely ordered When the concatenation of the values of the ordered columns is unique for each record. It is similar to a composite primary key for the ordered table, but dynamically based on the keyset columns.
set The uniquely ordered ActiveRecord::Relation or Sequel::Dataset collection to paginate.
keyset The hash of column/direction pairs. Pagy extracts it from the order of the set.
latest The hash of keyset attributes of the latest fetched record (from the latest page). Pagy decodes it from the :page variable and uses it to filter the newest records.
next The next page, i.e. the encoded reference to the last record of the current page.
page The current page, i.e. the encoded reference to the latest record of the latest page.

# Keyset or Offset pagination?

# Usage

# Constraints

# Setup

Pagy Keyset pagination does not waste resources and code complexity checking your set and your table config at every request.

That means that you have to be sure that your set is uniquely ordered and that your tables have the right indexes.

You do it once during development, and pagy will be fast at each request.

Depending on your order requirements, here is how you set it up:

# How Pagy::Keyset works

  • You pass an uniquely ordered set and Pagy::Keyset queries the page of records.
  • It keeps track of the latest fetched record by encoding its keyset attributes into the page query string param of the next URL.
  • At each request, the :page is decoded and used to prepare a when clause that filters the newest records, and the :limit of records is pulled.
  • You know that you reached the end of the collection when pagy.next.nil?.

# ORMs

Pagy::Keyset implements the subclasses for ActiveRecord::Relation and Sequel::Dataset sets and instantiate them internally:

Pagy::Keyset.new(active_record_set)
#=> #<Pagy::Keyset::ActiveRecord:0x00000001066215e0>
 
Pagy::Keyset.new(sequel_set) 
#=> #<Pagy::Keyset::Sequel:0x00000001066545e0>

# Methods

The constructor takes the set, and an optional hash of variables. It returns a Pagy::Keyset::ActiveRecord or Pagy::Keyset::Sequel object (depending by the set class).

The next page, i.e. the encoded reference to the last record of the current page. It is nil for the last page.

The Array of fetched records for the current page.

# Variables

The current page, i.e. the encoded reference to the latest record of the latest page. Default nil for the first page.

The :limit per page. Default Pagy::DEFAULT[:limit]. You can use the limit extra to have it automatically assigned from the limit request param.

Boolean variable that enables the tuple comparison e.g. (brand, id) > (:brand, :id). It works only with the same direction order, hence it's ignored for mixed order. Check how your DB supports it (your keyset should include only NOT NULL columns). Default nil.

Use this for DB-specific extra optimizations, if you know what you are doing.

If the :filter_newest variable is set to a lambda, pagy will call it with the set, latest and keyset arguments instead of using its auto-generated query to filter the newest records (after the latest). It must return the filtered set. For example:

filter_newest = lambda do |set, latest, keyset|
  set.where(my_optimized_query(keyset), **latest)
end

Pagy::Keyset(set, filter_newest:)

A lambda to override the generic json encoding of the keyset attributes. Use it when the generic to_json method would lose some information when decoded.

For example: Time objects may lose or round the fractional seconds through the encoding/decoding cycle, causing the ordering to fail and thus creating all sort of unexpected behaviors (e.g. skipping or repeating the same page, missing or duplicated records, etc.). Here is what you can do:

# Match the microsecods with the strings stored into the time columns of SQLite
jsonify_keyset_attributes = lambda do |attributes|
  # Convert it to a string matching the stored value/format in SQLite DB
  attributes[:created_at] = attributes[:created_at].strftime('%F %T.%6N')
  attributes.to_json
end

Pagy::Keyset(set, jsonify_keyset_attributes:)

(Notice that it doesn't work with Sequel::Dataset sets)

# Attribute Readers

limit, latest, page, vars

# Troubleshooting