Skip to content

Approximate percentiles

Construct and run approximate percentage queries using uddsketch or tdigest

This walkthrough uses a response_times table to show approximate percentiles end to end. TimescaleDB implements them with bounded-memory sketches instead of sorting every value, which keeps latency predictable on large datasets and continuous aggregates.

By default, TimescaleDB Toolkit uses uddsketch, but you can also choose tdigest. For algorithm trade-offs, see advanced aggregation methods.

In this procedure, we use an example table called response_times that contains information about how long a server takes to respond to API calls.

  1. Create a continuous aggregate for daily aggregates
    CREATE MATERIALIZED VIEW response_times_daily
    WITH (timescaledb.continuous)
    AS SELECT
    time_bucket('1 day'::interval, ts) as bucket,
    percentile_agg(response_time_ms)
    FROM response_times
    GROUP BY 1;
  2. Query the ninety-fifth percentile over the last 30 days

    Re-aggregate the aggregate to get the last 30 days:

    SELECT approx_percentile(0.95, percentile_agg) as threshold
    FROM response_times_daily
    WHERE bucket >= time_bucket('1 day'::interval, now() - '30 days'::interval);
  3. Create an alert
    WITH t as (SELECT approx_percentile(0.95, percentile_agg(percentile_agg)) as threshold
    FROM response_times_daily
    WHERE bucket >= time_bucket('1 day'::interval, now() - '30 days'::interval))
    SELECT count(*)
    FROM response_times
    WHERE ts > now()- '1 minute'::interval
    AND response_time_ms > (SELECT threshold FROM t);

For more information about percentile approximation API calls, see the hyperfunction API documentation.