Simulate an IoT sensor dataset
Simulate an IoT sensor dataset for testing and development with Tiger Cloud
The Internet of Things (IoT) describes a trend where computing capabilities are embedded into IoT devices. That is, physical objects, ranging from light bulbs to oil wells. Many IoT devices collect sensor data about their environment and generate time-series datasets with relational metadata.
It is often necessary to simulate IoT datasets. For example, when you are testing a new system. This tutorial shows how to simulate a basic dataset in your Tiger Cloud service, and then run simple queries on it.
To simulate a more advanced dataset, see Time-series Benchmarking Suite (TSBS).
Prerequisites for this tutorial
To follow the steps on this page:
-
Create a target Tiger Cloud service with the Real-time analytics capability.
You need your connection details. This procedure also works for self-hosted TimescaleDB.
Simulate a dataset
Section titled “Simulate a dataset”To simulate a dataset, run the following queries:
- Create the sensors tableCREATE TABLE sensors(id SERIAL PRIMARY KEY,type VARCHAR(50),location VARCHAR(50));
- Create the sensor_data hypertableCREATE TABLE sensor_data (time TIMESTAMPTZ NOT NULL,sensor_id INTEGER,temperature DOUBLE PRECISION,cpu DOUBLE PRECISION,FOREIGN KEY (sensor_id) REFERENCES sensors (id)) WITH (tsdb.hypertable);
When you create a hypertable using CREATE TABLE … WITH …, the default partitioning column is automatically the first column with a timestamp data type. Also, TimescaleDB creates a columnstore policy that automatically converts your data to the columnstore, after an interval equal to the value of the chunk_interval, defined through
afterin the policy. This columnar format enables fast scanning and aggregation, optimizing performance for analytical workloads while also saving significant storage space. In the columnstore conversion, hypertable chunks are compressed by up to 98%, and organized for efficient, large-scale queries.You can customize this policy later using alter_job. However, to change
afterorcreated_before, the compression settings, or the hypertable the policy is acting on, you must remove the columnstore policy and add a new one.You can also manually convert chunks in a hypertable to the columnstore.
- Populate the sensors tableINSERT INTO sensors (type, location) VALUES('a','floor'),('a', 'ceiling'),('b','floor'),('b', 'ceiling');
- Verify that the sensors have been added correctlySELECT * FROM sensors;
Sample output:
id | type | location----+------+----------1 | a | floor2 | a | ceiling3 | b | floor4 | b | ceiling(4 rows) - Generate and insert a dataset for all sensorsINSERT INTO sensor_data (time, sensor_id, cpu, temperature)SELECTtime,sensor_id,random() AS cpu,random()*100 AS temperatureFROM generate_series(now() - interval '24 hour', now(), interval '5 minute') AS g1(time), generate_series(1,4,1) AS g2(sensor_id);
- Verify the simulated datasetSELECT * FROM sensor_data ORDER BY time;
Sample output:
time | sensor_id | temperature | cpu-------------------------------+-----------+--------------------+---------------------2020-03-31 15:56:25.843575+00 | 1 | 6.86688972637057 | 0.6820705672726042020-03-31 15:56:40.244287+00 | 2 | 26.589260622859 | 0.2295834696851672030-03-31 15:56:45.653115+00 | 3 | 79.9925176426768 | 0.4577798903919762020-03-31 15:56:53.560205+00 | 4 | 24.3201029952615 | 0.6418856489472092020-03-31 16:01:25.843575+00 | 1 | 33.3203678019345 | 0.01591639174148442020-03-31 16:01:40.244287+00 | 2 | 31.2673618085682 | 0.7011859565973282020-03-31 16:01:45.653115+00 | 3 | 85.2960689924657 | 0.6934138899669052020-03-31 16:01:53.560205+00 | 4 | 79.4769988860935 | 0.360561791341752...
Run basic queries
Section titled “Run basic queries”After you simulate a dataset, you can run some basic queries on it. For example:
-
Average temperature and CPU by 30-minute windows:
SELECTtime_bucket('30 minutes', time) AS period,AVG(temperature) AS avg_temp,AVG(cpu) AS avg_cpuFROM sensor_dataGROUP BY period;Sample output:
period | avg_temp | avg_cpu------------------------+------------------+-------------------2020-03-31 19:00:00+00 | 49.6615830013373 | 0.4773444299741342020-03-31 22:00:00+00 | 58.8521540844037 | 0.5036377705012762020-03-31 16:00:00+00 | 50.4250325243144 | 0.5110755912998382020-03-31 17:30:00+00 | 49.0742547437549 | 0.5272672538024682020-04-01 14:30:00+00 | 49.3416377226822 | 0.438027751864865... -
Average and last temperature, average CPU by 30-minute windows:
SELECTtime_bucket('30 minutes', time) AS period,AVG(temperature) AS avg_temp,last(temperature, time) AS last_temp,AVG(cpu) AS avg_cpuFROM sensor_dataGROUP BY period;Sample output:
period | avg_temp | last_temp | avg_cpu------------------------+------------------+------------------+-------------------2020-03-31 19:00:00+00 | 49.6615830013373 | 84.3963081017137 | 0.4773444299741342020-03-31 22:00:00+00 | 58.8521540844037 | 76.5528806950897 | 0.5036377705012762020-03-31 16:00:00+00 | 50.4250325243144 | 43.5192013625056 | 0.5110755912998382020-03-31 17:30:00+00 | 49.0742547437549 | 22.740753274411 | 0.5272672538024682020-04-01 14:30:00+00 | 49.3416377226822 | 59.1331578791142 | 0.438027751864865... -
Query the metadata:
SELECTsensors.location,time_bucket('30 minutes', time) AS period,AVG(temperature) AS avg_temp,last(temperature, time) AS last_temp,AVG(cpu) AS avg_cpuFROM sensor_data JOIN sensors on sensor_data.sensor_id = sensors.idGROUP BY period, sensors.location;Sample output:
location | period | avg_temp | last_temp | avg_cpu----------+------------------------+------------------+-------------------+-------------------ceiling | 2020-03-31 15:30:00+00 | 25.4546818090603 | 24.3201029952615 | 0.435734559316188floor | 2020-03-31 15:30:00+00 | 43.4297036845237 | 79.9925176426768 | 0.56992522883229ceiling | 2020-03-31 16:00:00+00 | 53.8454438598516 | 43.5192013625056 | 0.490728285357666floor | 2020-03-31 16:00:00+00 | 47.0046211887772 | 23.0230117216706 | 0.53142289724201ceiling | 2020-03-31 16:30:00+00 | 58.7817596504465 | 63.6621567420661 | 0.488188337767497floor | 2020-03-31 16:30:00+00 | 44.611586847653 | 2.21919436007738 | 0.434762630766879ceiling | 2020-03-31 17:00:00+00 | 35.7026890735142 | 42.9420990403742 | 0.550129583687522floor | 2020-03-31 17:00:00+00 | 62.2794370166957 | 52.6636955793947 | 0.454323202022351...
You have now successfully simulated and run queries on an IoT dataset.