mcv_agg()
Aggregate data into a space-saving aggregate for further calculation of most-frequent values
Since 1.16.0
Aggregate data into a space-saving aggregate, which stores frequency information in an intermediate form. You can then use any of the accessors in this group to return estimated frequencies or the most common elements.
This differs from freq_agg in that you can specify a target number of values to keep, rather than a frequency cutoff.
Samples
Section titled “Samples”Create a topN aggregate over the country column of the users table. Targets the top 10 most-frequent values:
SELECT mcv_agg(10, country) FROM users;Create a topN aggregate over the type column of the devices table. Estimates the skew of the data to be 1.05, and
targets the 5 most-frequent values:
SELECT mcv_agg(5, 1.05, type) FROM devices;Arguments
Section titled “Arguments”The syntax is:
mcv_agg ( n INTEGER, [skew DOUBLE PRECISION,] value AnyElement) RETURNS SpaceSavingAggregate| Name | Type | Default | Required | Description |
|---|---|---|---|---|
n | INTEGER | - | ✔ | The target number of most-frequent values |
value | AnyElement | - | ✔ | The column to store frequencies for |
skew | DOUBLE PRECISION | 1.1 | The estimated skew of the data, defined as the s parameter of a zeta distribution. Must be greater than 1.0. Defaults to 1.1. For more information, see the section on skew. |
Returns
Section titled “Returns”| Column | Type | Description |
|---|---|---|
| agg | SpaceSavingAggregate | An object storing the most common elements of the given table and their estimated frequency. You can pass this object to any of the accessor functions to get a final result. |