---
title: stats_agg (two variables) overview | Tiger Data Docs
description: Statistical analysis and linear regression functions for two-dimensional data
---

Perform linear regression analysis on two-dimensional data, for example to calculate the correlation coefficient and covariance. You can also calculate common statistics, such as average and standard deviation, on each dimension separately. These functions are similar to the [PostgreSQL statistical aggregates](https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-AGGREGATE-STATISTICS-TABLE), but they include more features and are easier to use in continuous aggregates and window functions. The linear regressions use the standard least-squares fitting method.

These functions work on two-dimensional data. To work with one-dimensional data, for example to calculate the average and standard deviation of a single variable, see [the one-dimensional `stats_agg` functions](/reference/toolkit/statistical-and-regression-analysis/stats_agg-one-variable/index.md).

## Two-step aggregation

This group of functions uses the two-step aggregation pattern.

Rather than calculating the final result in one step, you first create an intermediate aggregate by using the aggregate function.

Then, use any of the accessors on the intermediate aggregate to calculate a final result. You can also roll up multiple intermediate aggregates with the rollup functions.

The two-step aggregation pattern has several advantages:

1. More efficient because multiple accessors can reuse the same aggregate
2. Easier to reason about performance, because aggregation is separate from final computation
3. Easier to understand when calculations can be rolled up into larger intervals, especially in window functions and continuous aggregates
4. Able to support retrospective analysis even after the underlying data is dropped, because the intermediate aggregate stores extra information not available in the final result

To learn more, see the [blog post on two-step aggregates](https://www.timescale.com/blog/how-postgresql-aggregation-works-and-how-it-inspired-our-hyperfunctions-design).
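
As a rough sketch of how the pattern fits together with continuous aggregates, the following stores one intermediate aggregate per day and later rolls the daily aggregates up into weekly results. It assumes `foo` is a hypertable with columns `ts`, `id`, `val1`, and `val2`, as in the sample on this page. The view name `foo_daily` and the output column names are hypothetical.

```sql
-- Step 1: store one intermediate aggregate per id and day.
-- Assumes foo is a hypertable; foo_daily is a hypothetical view name.
CREATE MATERIALIZED VIEW foo_daily
WITH (timescaledb.continuous) AS
SELECT
    id,
    time_bucket('1 day'::interval, ts) AS dt,
    stats_agg(val2, val1) AS stats2d  -- (dependent, independent)
FROM foo
GROUP BY id, time_bucket('1 day'::interval, ts);

-- Step 2: combine the daily aggregates into weekly ones with rollup(),
-- then apply accessors to the combined aggregate.
SELECT
    time_bucket('7 days'::interval, dt) AS week,
    average_y(rollup(stats2d)) AS avg_val2,
    slope(rollup(stats2d)) AS fit_slope
FROM foo_daily
WHERE id = 'bar'
GROUP BY time_bucket('7 days'::interval, dt)
ORDER BY week;
```

For buckets that are already materialized, the weekly query reads only the stored daily aggregates, so it does not need to rescan the raw rows in `foo`.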

## Samples

### Calculate regression and statistical properties

Create a statistical aggregate that summarizes daily statistical data about two variables, `val2` and `val1`, where `val2` is the dependent variable and `val1` is the independent variable. Use the statistical aggregate to calculate the average of the dependent variable and the slope of the linear-regression fit:

```sql
WITH t AS (
    SELECT
        time_bucket('1 day'::interval, ts) AS dt,
        stats_agg(val2, val1) AS stats2D
    FROM foo
    WHERE id = 'bar'
    GROUP BY time_bucket('1 day'::interval, ts)
)
SELECT
    average_y(stats2D),
    slope(stats2D)
FROM t;
```

## Available functions

### Aggregate

- [`stats_agg()`](/reference/toolkit/statistical-and-regression-analysis/stats_agg-two-variables/stats_agg/index.md): aggregate data into an intermediate statistical aggregate form for further calculation

### Accessors for y variable statistics

- [`average_y()`](/reference/toolkit/statistical-and-regression-analysis/stats_agg-two-variables/average_y_x/index.md): calculate the average of the dependent variable from a statistical aggregate
- [`stddev_y()`](/reference/toolkit/statistical-and-regression-analysis/stats_agg-two-variables/stddev_y_x/index.md): calculate the standard deviation of the dependent variable from a statistical aggregate
- [`variance_y()`](/reference/toolkit/statistical-and-regression-analysis/stats_agg-two-variables/variance_y_x/index.md): calculate the variance of the dependent variable from a statistical aggregate
- [`skewness_y()`](/reference/toolkit/statistical-and-regression-analysis/stats_agg-two-variables/skewness_y_x/index.md): calculate the skewness of the dependent variable from a statistical aggregate
- [`kurtosis_y()`](/reference/toolkit/statistical-and-regression-analysis/stats_agg-two-variables/kurtosis_y_x/index.md): calculate the kurtosis of the dependent variable from a statistical aggregate
- [`sum_y()`](/reference/toolkit/statistical-and-regression-analysis/stats_agg-two-variables/sum_y_x/index.md): calculate the sum of the dependent variable from a statistical aggregate

### Accessors for regression analysis

- [`corr()`](/reference/toolkit/statistical-and-regression-analysis/stats_agg-two-variables/corr/index.md): calculate the correlation coefficient from a statistical aggregate
- [`covariance()`](/reference/toolkit/statistical-and-regression-analysis/stats_agg-two-variables/covariance/index.md): calculate the covariance from a statistical aggregate
- [`determination_coeff()`](/reference/toolkit/statistical-and-regression-analysis/stats_agg-two-variables/determination_coeff/index.md): calculate the coefficient of determination (R²) from a statistical aggregate
- [`slope()`](/reference/toolkit/statistical-and-regression-analysis/stats_agg-two-variables/slope/index.md): calculate the slope of the linear regression line from a statistical aggregate
- [`intercept()`](/reference/toolkit/statistical-and-regression-analysis/stats_agg-two-variables/intercept/index.md): calculate the y-intercept of the linear regression line from a statistical aggregate
- [`x_intercept()`](/reference/toolkit/statistical-and-regression-analysis/stats_agg-two-variables/x_intercept/index.md): calculate the x-intercept of the linear regression line from a statistical aggregate
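
The regression accessors listed above can all be applied to the same aggregate. The following is a minimal sketch that describes the daily fit line `val2 = fit_slope * val1 + fit_intercept` and how well it fits. It reuses the table `foo` and the columns from the sample on this page; the output column names are hypothetical.

```sql
-- Build one two-dimensional aggregate per day, then read the fit from it.
WITH t AS (
    SELECT
        time_bucket('1 day'::interval, ts) AS dt,
        stats_agg(val2, val1) AS stats2d  -- (dependent, independent)
    FROM foo
    WHERE id = 'bar'
    GROUP BY time_bucket('1 day'::interval, ts)
)
SELECT
    dt,
    slope(stats2d) AS fit_slope,               -- slope of the regression line
    intercept(stats2d) AS fit_intercept,       -- y-intercept of the line
    x_intercept(stats2d) AS fit_x_intercept,   -- x-intercept of the line
    corr(stats2d) AS correlation,              -- correlation coefficient
    determination_coeff(stats2d) AS r_squared  -- coefficient of determination
FROM t
ORDER BY dt;
```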

### Accessors for aggregate information

- [`num_vals()`](/reference/toolkit/statistical-and-regression-analysis/stats_agg-two-variables/num_vals/index.md): get the number of values contained in a statistical aggregate

### Rollup

- [`rollup()`](/reference/toolkit/statistical-and-regression-analysis/stats_agg-two-variables/rollup/index.md): combine multiple two-dimensional statistical aggregates

### Mutator

- [`rolling()`](/reference/toolkit/statistical-and-regression-analysis/stats_agg-two-variables/rolling/index.md): create a rolling window aggregate for use in window functions
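
As an illustration of `rolling()` in a window function, the following sketch builds per-minute aggregates and then evaluates accessors over a trailing window. It assumes the table `foo` and the columns from the sample on this page. The window name `fifteen_min`, the subquery alias, and the output column names are hypothetical.

```sql
-- Inner query: one intermediate aggregate per minute.
-- Outer query: combine the aggregates in each row's window with rolling().
SELECT
    bucket,
    average_y(rolling(stats2d) OVER fifteen_min) AS avg_val2,
    slope(rolling(stats2d) OVER fifteen_min) AS fit_slope
FROM (
    SELECT
        time_bucket('1 minute'::interval, ts) AS bucket,
        stats_agg(val2, val1) AS stats2d  -- (dependent, independent)
    FROM foo
    WHERE id = 'bar'
    GROUP BY time_bucket('1 minute'::interval, ts)
) AS per_minute
-- Each row's window covers the current bucket and the buckets that start
-- up to 15 minutes before it.
WINDOW fifteen_min AS (ORDER BY bucket ASC RANGE '15 minutes' PRECEDING);
```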
