UddSketch overview
Percentile approximation with guaranteed relative error using the UddSketch algorithm
Estimate the value at a given percentile, or the percentile rank of a given value, using the UddSketch algorithm. This
estimation is more memory- and CPU-efficient than an exact calculation using PostgreSQL‘s percentile_cont and
percentile_disc functions.
uddsketch is one of two advanced percentile approximation aggregates provided in TimescaleDB Toolkit. It produces
stable estimates within a guaranteed relative error.
The other advanced percentile approximation aggregate is tdigest, which is more accurate at extreme
quantiles, but is somewhat dependent on input order.
If you aren’t sure which aggregate to use, try the default percentile estimation method,
percentile_agg. It uses the uddsketch algorithm with some sensible defaults.
For more information about percentile approximation algorithms, see the algorithms overview.
Two-step aggregation
Section titled “Two-step aggregation”This group of functions uses the two-step aggregation pattern.
Rather than calculating the final result in one step, you first create an intermediate aggregate by using the aggregate function.
Then, use any of the accessors on the intermediate aggregate to calculate a final result. You can also roll up multiple intermediate aggregates with the rollup functions.
The two-step aggregation pattern has several advantages:
- More efficient because multiple accessors can reuse the same aggregate
- Easier to reason about performance, because aggregation is separate from final computation
- Easier to understand when calculations can be rolled up into larger intervals, especially in window functions and continuous aggregates
- Perform retrospective analysis even when underlying data is dropped, because the intermediate aggregate stores extra information not available in the final result
To learn more, see the blog post on two-step aggregates.
Samples
Section titled “Samples”Aggregate and roll up percentile data using percentile_agg
Section titled “Aggregate and roll up percentile data using percentile_agg”Create an hourly continuous aggregate that contains a percentile aggregate:
CREATE MATERIALIZED VIEW foo_hourlyWITH (timescaledb.continuous)AS SELECT time_bucket('1 h'::interval, ts) AS bucket, percentile_agg(value) AS pct_aggFROM fooGROUP BY 1;Use accessors to query directly from the continuous aggregate for hourly data. You can also roll the hourly data up into daily buckets, then calculate approximate percentiles:
SELECT time_bucket('1 day'::interval, bucket) AS bucket, approx_percentile(0.95, rollup(pct_agg)) AS p95, approx_percentile(0.99, rollup(pct_agg)) AS p99FROM foo_hourlyGROUP BY 1;Aggregate and roll up percentile data using uddsketch
Section titled “Aggregate and roll up percentile data using uddsketch”Create an hourly continuous aggregate that contains a percentile aggregate:
CREATE MATERIALIZED VIEW foo_hourlyWITH (timescaledb.continuous)AS SELECT time_bucket('1 h'::interval, ts) AS bucket, uddsketch(100, 0.01, value) AS uddsketchFROM fooGROUP BY 1;Use accessors to query directly from the continuous aggregate for hourly data. You can also roll the hourly data up into daily buckets, then calculate approximate percentiles:
SELECT time_bucket('1 day'::interval, bucket) AS bucket, approx_percentile(0.95, rollup(uddsketch)) AS p95, approx_percentile(0.99, rollup(uddsketch)) AS p99FROM foo_hourlyGROUP BY 1;Available functions
Section titled “Available functions”Aggregate
Section titled “Aggregate”uddsketch(): aggregate data in a uddsketch for percentile calculation
Alternate aggregate
Section titled “Alternate aggregate”percentile_agg(): aggregate data using sensible defaults for percentile calculation
Accessors
Section titled “Accessors”approx_percentile(): estimate the value at a given percentile from a uddsketchapprox_percentile_array(): estimate values at multiple percentiles from a uddsketchapprox_percentile_rank(): estimate the percentile rank of a given value from a uddsketcherror(): get the maximum relative error of a uddsketchmean(): calculate the exact mean from values in a uddsketchnum_vals(): get the number of values in a uddsketch
Rollup
Section titled “Rollup”rollup(): combine multiple uddsketch aggregates