Ingest data

Ingest data into your services running in Managed Service for TimescaleDB

There are several different ways of ingesting your data into Managed Service for TimescaleDB. This section contains instructions to:

Bulk upload from a .csv file
Insert data directly using a client driver, such as JDBC, ODBC, or Node.js
Insert data directly using a message queue, such as Kafka

Before you begin, make sure you have created your Managed Service for TimescaleDB service, and can connect to it using psql.

Preparing your new database

Connect to your MST service
Use psql to connect to your MST service.
psql -h <HOSTNAME> -p <PORT> -U <USERNAME> -W -d <DATABASE_NAME>
You retrieve the MST service URL, port, and login credentials from the MST service overview in the MST dashboard.
Create a new database
Create a new database for your data. In this example, the new database is called new_db:
CREATE DATABASE new_db; \c new_db;
Create a table
Create a new SQL table in your database. The columns you create for the table must match the columns in your source data. In this example, the table is storing weather condition data, and has columns for the timestamp, location, and temperature:
CREATE TABLE conditions ( time TIMESTAMPTZ NOT NULL, location text NOT NULL, temperature DOUBLE PRECISION NULL );
Load the TimescaleDB extension
Load the timescaledb PostgreSQL extension:
CREATE EXTENSION timescaledb; \dx
Convert to a hypertable
Convert the SQL table into a hypertable:
SELECT create_hypertable('conditions', by_range('time'));
Note
The by_range dimension builder is an addition to TimescaleDB 2.13.

When you have successfully set up your new database, you can ingest data using one of these methods.

Bulk upload from CSV files

If you have a dataset stored in a .csv file, you can import it into an empty hypertable. You need to begin by creating the new table, before you import the data.

Before you begin, make sure you have prepared your new database.

Insert data using timescaledb-parallel-copy
Insert data into the new hypertable using the timescaledb-parallel-copy tool. You should already have the tool installed, but you can install it manually from our GitHub repository if you need to. In this example, we are inserting the data using four workers:
timescaledb-parallel-copy \ --connection '<service_url>' \ --table conditions \ --file ~/Downloads/example.csv \ --workers 4 \ --copy-options "CSV" \ --skip-header
We recommend that you set the number of workers lower than the number of available CPU cores on your client machine or server, to prevent the workers having to compete for resources. This helps your ingest go faster.
(Optional) Use PostgreSQL COPY instead
If you don’t want to use the timescaledb-parallel-copy tool, or if you have a very small dataset, you can use the PostgreSQL COPY command instead:
psql '<service_url>/new_db?sslmode=require' -c "\copy conditions FROM <example.csv> WITH (FORMAT CSV, HEADER)"

Insert data directly using a client driver

You can use a client driver such as JDBC, Python, or Node.js, to insert data directly into your new database.

See the PostgreSQL instructions for using the ODBC driver.

See the Code Quick Starts for using various languages, including Python and node.js.

Insert data directly using a message queue

If you have data stored in a message queue, you can import it into your MST service. This section provides instructions on using the Kafka Connect PostgreSQL connector.

This connector deploys PostgreSQL change events from Kafka Connect to a runtime service. It monitors one or more schemas in a Managed Service for TimescaleDB service, and writes all change events to Kafka topics, which can then be independently consumed by one or more clients. Kafka Connect can be distributed to provide fault tolerance, which ensures the connectors are running and continually keeping up with changes in the database.

You can also use the PostgreSQL connector as a library without Kafka or Kafka Connect. This allows applications and services to directly connect to MST and obtain the ordered change events. In this environment, the application must record the progress of the connector so that when it is restarted, the connect can continue where it left off. This approach can be useful for less critical use cases. However, for production use cases, we recommend that you use the connector with Kafka and Kafka Connect.

See these instructions for using the Kafka connector.

Ingest data

Preparing your new database

Bulk upload from CSV files

Insert data directly using a client driver

Insert data directly using a message queue

What can I help you with?

Suggestions