histogram()
Partition the dataset into buckets and get the number of counts in each bucket
The histogram() function represents the distribution of a set of
values as an array of equal-width buckets. It partitions the dataset
into a specified number of buckets (nbuckets) ranging from the
inputted min and max values.
The return value is an array containing nbuckets+2 buckets, with the
middle nbuckets bins for values in the stated range, the first
bucket at the head of the array for values under the lower min bound,
and the last bucket for values greater than or equal to the max bound.
Each bucket is inclusive on its lower bound, and exclusive on its upper
bound. Therefore, values equal to the min are included in the bucket
starting with min, but values equal to the max are in the last bucket.
Samples
Section titled “Samples”A simple bucketing of device’s battery levels from the readings dataset:
SELECT device_id, histogram(battery_level, 20, 60, 5)FROM readingsGROUP BY device_idLIMIT 10;The expected output:
device_id | histogram------------+------------------------------ demo000000 | {0,0,0,7,215,206,572} demo000001 | {0,12,173,112,99,145,459} demo000002 | {0,0,187,167,68,229,349} demo000003 | {197,209,127,221,106,112,28} demo000004 | {0,0,0,0,0,39,961} demo000005 | {12,225,171,122,233,80,157} demo000006 | {0,78,176,170,8,40,528} demo000007 | {0,0,0,126,239,245,390} demo000008 | {0,0,311,345,116,228,0} demo000009 | {295,92,105,50,8,8,442}Arguments
Section titled “Arguments”The syntax is:
SELECT histogram( value = <anyelement>, min = <anyelement>, max = <anyelement>, nbuckets = <integer>);| Name | Type | Default | Required | Description |
|---|---|---|---|---|
value | DOUBLE PRECISION | - | ✔ | A set of values to partition into a histogram |
min | DOUBLE PRECISION | - | ✔ | The histogram’s lower bound used in bucketing (inclusive) |
max | DOUBLE PRECISION | - | ✔ | The histogram’s upper bound used in bucketing (exclusive) |
nbuckets | INTEGER | - | ✔ | The integer value for the number of histogram buckets (partitions) |
Returns
Section titled “Returns”| Column | Type | Description |
|---|---|---|
histogram | INTEGER[ ] | An array of counts, with nbuckets + 2 elements. The first element is the count of values less than min, the last element is the count of values greater than or equal to max, and the middle elements are counts for each bucket in the range. |