approx_count_distinct()
Aggregate data into a hyperloglog for approximate counting without specifying the number of buckets
This is an alternative first step for approximating the number of distinct
values. It creates a hyperloglog using sensible default parameters, so you
do not need to choose the number of buckets yourself.
Use approx_count_distinct to create an intermediate aggregate from your raw data.
This intermediate form can then be used by one or more accessors in this
group to compute final results.
Optionally, multiple such intermediate aggregate objects can be combined
using rollup() before an accessor is applied.
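As a rough sketch of this two-step pattern, the query below builds one hyperloglog per group, combines them with rollup(), and then applies an accessor. It assumes the timescaledb_toolkit extension is installed, that distinct_count() is the accessor you want, and that the samples table has a grouping column named batch. The batch column is hypothetical and is used only for illustration.

```sql
-- Sketch only: `batch` is a hypothetical grouping column, not part of the documented schema.
SELECT distinct_count(rollup(hll)) AS approx_distinct_weights
FROM (
    -- Step 1: one intermediate hyperloglog aggregate per batch
    SELECT batch, approx_count_distinct(weights) AS hll
    FROM samples
    GROUP BY batch
) AS per_batch;
-- Step 2 (outer query): rollup() merges the per-batch aggregates,
-- then distinct_count() computes the final estimate.
```

Keeping the per-group aggregates separate lets you reuse them for both per-group and overall estimates without rescanning the raw data.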
Samples
Given a table called samples, with a column called weights, return
a hyperloglog over the weights column:
SELECT approx_count_distinct(weights) FROM samples;
Using the same data, build a view from the aggregate that you can pass
to other hyperloglog functions.
CREATE VIEW hll AS SELECT approx_count_distinct(weights) FROM samples;
Arguments
The syntax is:
approx_count_distinct(value AnyElement) RETURNS Hyperloglog
| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| value | AnyElement | - | ✔ | The column containing the elements to count. The type must have an extended, 64-bit, hash function. |
Returns
| Column | Type | Description |
|---|---|---|
| hyperloglog | Hyperloglog | A hyperloglog object which can be passed to other hyperloglog APIs for rollups and final calculation |
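For instance, the returned hyperloglog stored in the hll view from the Samples section could be passed to accessors as sketched below. This assumes the timescaledb_toolkit extension is installed and that the view's only column kept PostgreSQL's default name for an unaliased function call, approx_count_distinct. If you aliased the column when creating the view, use that name instead.

```sql
-- Sketch only: assumes the `hll` view exists and its column is named
-- `approx_count_distinct` (the default name for an unaliased function call).
SELECT
    distinct_count(approx_count_distinct) AS approx_distinct, -- estimated number of distinct values
    stderror(approx_count_distinct)       AS relative_error   -- approximate relative error of the estimate
FROM hll;
```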