Accelerate queries using indexes
Add indexes to speed up queries on your hypertables and learn best practices for creating them
Indexes shrink scan work for point lookups and selective filters on recent, non-compressed chunks in the rowstore (compressed chunks rely on columnar indexing instead).
You can create an index on any combination of columns. To define an index as a UNIQUE or PRIMARY KEY index, it must include the partitioning column (this is usually the time column).
Which column you choose to create your
index on depends on what kind of data you have stored.
When you create a hypertable, set the datatype for the time column as
timestamptz and not timestamp.
For more information, see PostgreSQL timestamp best practices.
While it is possible to add an index that does not include the time column,
doing so results in very slow ingest speeds. For time-series data, indexing
on the time column allows one index to be created per chunk.
Consider a simple example with temperatures collected from two locations named
office and garage:
An index on (location, time DESC) is organized like this:
garage-0940garage-0930garage-0920garage-0910office-0930office-0920office-0910An index on (time DESC, location) is organized like this:
0940-garage0930-garage0930-office0920-garage0920-office0910-garage0910-officeA good rule of thumb with indexes is to think in layers. Start by choosing the
columns that you typically want to run equality operators on, such as
location = garage. Then finish by choosing columns you want to use range
operators on, such as time > 0930.
As a more complex example, imagine you have a number of devices tracking
1,000 different retail stores. You have 100 devices per store, and 5 different
types of devices. All of these devices report metrics as float values, and you
decide to store all the metrics in the same table, like this:
CREATE TABLE devices ( time timestamptz, device_id int, device_type int, store_id int, value float);When you create this table, an index is automatically generated on the time column, making it faster to query your data based on time.
If you want to query your data on something other than time, you can create
different indexes. For example, you might want to query data from the last month
for just a given device_id. Or you could query all data for a single
store_id for the last three months.
You want to keep the index on time so that you can quickly filter for a given
time range, and add another index on device_id and store_id. This creates a
composite index. A composite index on (store_id, device_id, time) orders by
store_id first. Each unique store_id, will then be sorted by device_id in
order. And each entry with the same store_id and device_id are then ordered
by time. To create this index, use this command:
CREATE INDEX ON devices (store_id, device_id, time DESC);When you have this composite index on your hypertable, you can run a range of different queries. Here are some examples:
SELECT * FROM devices WHERE store_id = xThis queries the portion of the list with a specific store_id. The index is
effective for this query, but could be a bit bloated; an index on just
store_id would probably be more efficient.
SELECT * FROM devices WHERE store_id = x AND time > 10This query is not effective, because it would need to scan multiple sections of
the list. This is because the part of the list that contains data for
time > 10 for one device would be located in a different section than for a
different device. In this case, consider building an index on (store_id, time)
instead.
SELECT * FROM devices WHERE device_id = M AND time > 10The index in the example is useless for this query, because the data for
device M is located in a completely different section of the list for each
store_id.
SELECT * FROM devices WHERE store_id = M AND device_id = M AND time > 10This is an accurate query for this index. It narrows down the list to a very specific portion.
Creating indexes
Section titled “Creating indexes”You can create an index using the CREATE INDEX command. For example, to create
an index that sorts first by location, then by time, in descending order:
CREATE INDEX ON conditions (location, time DESC);You can run this command before or after you convert a regular PostgreSQL table to a hypertable.
Default indexes
Section titled “Default indexes”Some indexes are created by default when you perform certain actions on your database.
When you create a hypertable with a call to CREATE TABLE, a time index
is created on your data. If you want to manually create a time index, you can use this command:
CREATE INDEX ON conditions (time DESC);You can also create an additional index on another column and time. For example:
CREATE INDEX ON conditions (location, time DESC);TimescaleDB also creates sparse indexes per columnstore chunk for optimization, including bloom indexes for equality filters and min/max indexes for range filters. You can manually configure these indexes when you call CREATE TABLE or ALTER TABLE.
If you do not want to create default indexes, you can set
create_default_indexes to false when you create a hypertable. For example:
CREATE TABLE conditions ( time TIMESTAMPTZ NOT NULL, location TEXT NOT NULL, device TEXT NOT NULL, temperature DOUBLE PRECISION NULL, humidity DOUBLE PRECISION NULL) WITH ( tsdb.hypertable, tsdb.create_default_indexes=false);When you create a hypertable using CREATE TABLE … WITH …, the default partitioning
column is automatically the first column with a timestamp data type. Also, TimescaleDB creates a
columnstore policy that automatically converts your data to the columnstore, after an interval equal to the value of the chunk_interval, defined through after in the policy. This columnar format enables fast scanning and
aggregation, optimizing performance for analytical workloads while also saving significant storage space. In the
columnstore conversion, hypertable chunks are compressed by up to 98%, and organized for efficient, large-scale queries.
You can customize this policy later using alter_job. However, to change after or
created_before, the compression settings, or the hypertable the policy is acting on, you must
remove the columnstore policy and add a new one.
You can also manually convert chunks in a hypertable to the columnstore.
Best practices for indexing
Section titled “Best practices for indexing”If you have sparse data, with columns that are often NULL, you can add a clause
to the index, saying WHERE column IS NOT NULL. This prevents the index from
indexing NULL data, which can lead to a more compact and efficient index. For
example:
CREATE INDEX ON conditions (time DESC, humidity) WHERE humidity IS NOT NULL;To define an index as a UNIQUE or PRIMARY KEY index, the index must include
the time column and the partitioning column, if you are using one. For example,
a unique index must include at least the (time, location) columns, in addition
to any other columns you want to use. Generally,
time-series data uses UNIQUE indexes more rarely than relational data.
If you do not want to create an index in a single transaction, you can use the
CREATE_INDEX
function. This uses a separate function to create an index on each chunk,
instead of a single transaction for the entire hypertable. This means that you
can perform other actions on the table while the index is being created, rather
than having to wait until index creation is complete.
You can also use the
PostgreSQL WITH clause
to perform indexing transactions on an individual chunk.