hyperloglog()
Aggregate data into a hyperloglog for approximate counting
This is the first step for estimating the approximate number of distinct
values using the hyperloglog algorithm. Use hyperloglog to create an
intermediate aggregate from your raw data. This intermediate form can then
be used by one or more accessors in this group to compute final results.
Optionally, multiple such intermediate aggregate objects can be combined
using rollup() before an accessor is applied.
If you’re not sure what value to set for buckets, try using the alternate
aggregate function, approx_count_distinct().
approx_count_distinct also creates a hyperloglog, but it sets a
default bucket value that should work for many use cases.
Samples
Section titled “Samples”Given a table called samples, with a column called weights, return
a hyperloglog over the weights column.
SELECT hyperloglog(32768, weights) FROM samples;Using the same data, build a view from the aggregate that you can pass
to other hyperloglog functions.
CREATE VIEW hll AS SELECT hyperloglog(32768, data) FROM samples;Arguments
Section titled “Arguments”The syntax is:
hyperloglog( buckets INTEGER, value AnyElement) RETURNS Hyperloglog| Name | Type | Default | Required | Description |
|---|---|---|---|---|
buckets | INTEGER | - | ✔ | Number of buckets in the hyperloglog. Increasing the number of buckets improves accuracy but increases memory use. Value is rounded up to the next power of 2, and must be between 2^4 (16) and 2^18. Setting a value less than 2^10 (1,024) may result in poor accuracy if the true cardinality is high and is not recommended. If unsure, start experimenting with 8,192 (2^13) which has an approximate error rate of 1.15%. |
value | AnyElement | - | ✔ | The column containing the elements to count. The type must have an extended, 64-bit, hash function. |
Returns
Section titled “Returns”| Column | Type | Description |
|---|---|---|
| hyperloglog | Hyperloglog | A hyperloglog object which can be passed to other hyperloglog APIs for rollups and final calculation |