Technical terms and concepts for Tiger Data products, TimescaleDB, and real-time analytics
This glossary defines technical terms, concepts, and terminology used in Tiger Data documentation, the database industry, and real-time analytics.
a table that tells a computer operating system which access rights each user has to a particular system object, such as a file directory or individual file.
a set of properties (atomicity, consistency, isolation, durability) that guarantee database transactions are processed reliably.
a set of database properties (Atomicity, Consistency, Isolation, Durability) ensuring reliable and consistent transactions. Inherited from PostgreSQL.
dynamic query plan adjustment based on actual execution statistics and data distribution patterns, improving performance over time.
a materialized, precomputed summary of query results over time-series data, providing faster access to analytics.
the process of automatically notifying administrators when predefined conditions or thresholds are met in system monitoring.
a system optimized for large-scale analytical queries, supporting complex aggregations, time-based queries, and data exploration.
the identification of abnormal patterns or outliers within time-series datasets, common in observability, IoT, and finance.
a storage pattern where data is only added, never modified in place. Ideal for time-series workloads and audit trails.
the process of moving old or infrequently accessed data to long-term, cost-effective storage solutions.
automatic division of a hypertable into chunks based on partitioning dimensions to optimize scalability and performance.
an isolated location within a cloud region that provides redundant power, networking, and connectivity.
a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time.
an automated task that runs in the background without user intervention, typically for maintenance operations like compression or data retention.
a PostgreSQL process that runs background tasks independently of client sessions.
handling data in grouped batches rather than as individual real-time events, often used for historical data processing.
the process of filling in historical data that was missing or needs to be recalculated, often used during migrations or after schema changes.
a copy of data stored separately from the original data to protect against data loss, corruption, or system failure.
a probabilistic data structure that tests set membership with possible false positives but no false negatives. TimescaleDB uses blocked bloom filters to speed up point lookups by eliminating chunks that don’t contain queried values.
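For illustration, a minimal Python sketch of a plain Bloom filter (not TimescaleDB's blocked variant): a fixed-size bit array and a few hash functions, so membership tests can return false positives but never false negatives.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: possible false positives, no false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive k positions by salting a cryptographic hash with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # True means "possibly present"; False is definitive.
        return all(self.bits >> pos & 1 for pos in self._positions(item))
```

A query engine can use such a filter to skip chunks whose filter definitively rules out the queried value.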
memory area where frequently accessed data pages are cached to reduce disk I/O operations.
a PostgreSQL index type that stores summaries about ranges of table blocks, useful for large tables with naturally ordered data.
a PostgreSQL data type for storing binary data as a sequence of bytes.
the percentage of data requests served from memory cache rather than disk, indicating query performance efficiency.
the number of unique values in a dataset or database column.
a database constraint that limits the values that can be stored in a column by checking them against a specified condition.
a horizontal partition of a hypertable that contains data for a specific time interval and space partition.
the time period covered by each chunk in a hypertable, which affects query performance and storage efficiency.
a query optimization technique that skips chunks not relevant to the query’s time range, dramatically improving performance.
a method for allocating IP addresses and routing IP packets.
authentication tokens used by applications to access services programmatically without user interaction.
in financial data, the closing price of a security at the end of a trading period.
computing services delivered over the internet, including servers, storage, databases, networking, software, analytics, and intelligence.
the use of public, private, or hybrid cloud infrastructure to host TimescaleDB, enabling elastic scalability and managed services.
an approach to building applications that leverage cloud infrastructure, scalability, and services like Kubernetes.
a tier of data storage for infrequently accessed data that offers lower costs but higher access times.
a data storage format that stores data column by column rather than row by row, optimizing for analytical queries.
TimescaleDB's columnar storage engine optimized for analytical workloads and compression.
the process of reducing data size by encoding information using fewer bits, improving storage efficiency and query performance.
a technique for managing multiple database connections efficiently, reducing overhead for high-concurrency environments.
protocols ensuring distributed systems agree on data state, critical for multi-node database deployments.
an automated rule that compresses hypertable chunks after they reach a specified age or size threshold.
the ratio between the original data size and the compressed data size, indicating compression effectiveness.
a rule enforced by the database to maintain data integrity and consistency.
a materialized view that incrementally updates with new data, providing fast access to pre-computed aggregations.
aggregating monotonic counter data, handling counter resets and extrapolation.
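A hedged Python sketch of the reset-handling part of counter aggregation (extrapolation omitted): when a reading drops below its predecessor, the counter is assumed to have restarted from zero.

```python
def counter_delta(readings):
    """Total increase of a monotonic counter, treating any drop as a reset."""
    total = 0.0
    prev = None
    for value in readings:
        if prev is not None:
            if value >= prev:
                total += value - prev
            else:
                # Counter reset: assume it restarted from zero before this reading.
                total += value
        prev = value
    return total
```

For example, the series 10, 20, 35, 5, 15 yields a total increase of 40, not the naive 15 - 10 = 5.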
a time-based job scheduler in Unix-like computer operating systems.
a backup stored in a different geographical region from the primary data for disaster recovery.
a centralized repository storing structured and unstructured data at scale, often integrated with time-series databases for analytics.
the tracking of data flow from source to destination, including transformations, essential for compliance and debugging.
automated workflows for moving, transforming, and loading data between systems, often using tools like Apache Kafka or Apache Airflow.
the process of moving data from one system, storage type, or format to another.
the practice of storing data for a specified period before deletion, often governed by compliance requirements or storage optimization.
the process of summarizing detailed historical data into higher-level aggregates, balancing storage needs with query efficiency.
uneven distribution of data across partitions or nodes, potentially causing performance bottlenecks.
a storage management strategy that places data on different storage tiers based on access patterns and performance requirements.
a classification that specifies which type of value a variable can hold, such as integer, string, or boolean.
the process of restoring compressed data to its original, uncompressed state.
the difference between two values, commonly used in counter aggregations to calculate the change over time.
a network management protocol used to automatically assign IP addresses and other network configuration parameters.
a partitioning key in a hypertable that determines how data is distributed across chunks.
the process and procedures for recovering and protecting a business’s IT infrastructure in the event of a disaster.
a floating-point data type that provides more precision than the standard float type.
the process of reducing the temporal resolution of time-series data by aggregating data points over longer time intervals.
the period during which a system, service, or application is unavailable or not operational.
a migration approach where new data is written to both the source and target databases simultaneously, followed by backfilling historical data to ensure completeness.
a migration pattern where applications write data to both the source and target systems simultaneously.
processing data at or near the data source, such as IoT devices, rather than solely in centralized servers, reducing latency.
a device that aggregates data from sensors and performs preprocessing before sending data to cloud or centralized databases.
a data pipeline pattern where raw data is loaded first, then transformed within the target system, leveraging database processing power.
a vector representation of data, such as text or images, that captures semantic meaning in a high-dimensional space.
the percentage of requests or operations that result in errors over a given time period.
a measure of the straight-line distance between two points in multidimensional space.
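As a quick illustration, the distance can be computed in a few lines of Python for points of any dimension:

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two points of equal dimension."""
    if len(a) != len(b):
        raise ValueError("points must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```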
a delivery guarantee where a message is delivered and processed exactly once, with no loss and no duplicates.
a PostgreSQL command that shows the execution plan for a query, useful for performance analysis.
an architectural pattern storing all changes as a sequence of events, naturally fitting time-series database capabilities.
a design pattern where components react to events such as sensor readings, requiring real-time data pipelines and storage.
a PostgreSQL add-on that extends the database’s functionality beyond the core features.
the central table in a star schema containing quantitative measures, often time-series data with foreign keys to dimension tables.
the automatic switching to a backup system, server, or network upon the failure or abnormal termination of the primary system.
high-volume, timestamped datasets like stock market feeds or trade logs, requiring low-latency, scalable databases like TimescaleDB.
a database constraint that establishes a link between data in two tables by referencing the primary key of another table.
a copy of a database service that shares the same data but can diverge independently through separate writes.
a free instance of Tiger Cloud with limited resources. You can create up to two free services under any pricing plan. When a free service reaches the resource limit, it converts to the read-only state. You can convert a free service to a standard one under paid pricing plans.
a standard network protocol used for transferring files between a client and server on a computer network.
a technique for handling missing data points in time-series by interpolation or other methods, often implemented with hyperfunctions.
a PostgreSQL index type designed for indexing composite values and supporting fast searches.
a PostgreSQL index type that provides a framework for implementing custom index types.
an advanced downsampling algorithm that extends Largest-Triangle-Three-Buckets with Gaussian Process modeling.
PostgreSQL's configuration parameter system that controls various aspects of database behavior.
a unique identifier used in software applications, typically represented as a 128-bit value.
an index type that provides constant-time lookups for equality comparisons but doesn’t support range queries.
datasets with a large number of unique values, which can strain storage and indexing in time-series applications.
a predefined range of metrics organized for statistical analysis, commonly visualized in monitoring tools.
a replication configuration where the standby server can serve read-only queries while staying synchronized with the primary.
a system design that ensures an agreed level of operational performance, usually uptime, for a higher-than-normal period.
in financial data, the highest price of a security during a specific time period.
a graphical representation of the distribution of numerical data, showing the frequency of data points in different ranges.
previously recorded data that provides context and trends for analysis and decision-making.
a tier of data storage for frequently accessed data that provides the fastest access times but at higher cost.
TimescaleDB's hybrid storage engine that seamlessly combines row and column storage for optimal performance.
an SQL function in TimescaleDB designed for time-series analysis, statistics, and specialized computations.
a probabilistic data structure used for estimating the cardinality of large datasets with minimal memory usage.
a migration tool and strategy for moving data to TimescaleDB with minimal downtime.
TimescaleDB's core abstraction that automatically partitions time-series data for scalability.
the property where repeated operations produce the same result, crucial for reliable data ingestion and processing.
the speed at which new data is written to the system, measured in rows per second. Critical for IoT and observability.
a mathematical operation that combines two vectors to produce a scalar, used in similarity calculations.
an SQL operation that adds new rows of data to a database table.
a data type that represents whole numbers without decimal points.
a statistical measure representing the y-intercept in linear regression analysis.
an AWS VPC component that enables communication between instances in a VPC and the internet.
a method of estimating unknown values that fall between known data points.
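A sketch of the simplest variant, linear interpolation between two known samples (the names here are illustrative, not a library API):

```python
def linear_interpolate(t, t0, v0, t1, v1):
    """Estimate the value at time t between known samples (t0, v0) and (t1, v1)."""
    if not t0 <= t <= t1:
        raise ValueError("t must lie between t0 and t1")
    fraction = (t - t0) / (t1 - t0)
    return v0 + fraction * (v1 - v0)
```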
a security feature that restricts access to specified IP addresses or ranges.
a database transaction property that defines the degree to which operations in one transaction are isolated from those in other concurrent transactions.
an automated task scheduled to run at specific intervals or triggered by certain conditions.
the process of running scheduled background tasks or automated procedures.
PostgreSQL feature that compiles frequently executed query parts for improved performance, available in TimescaleDB.
a record of past job executions, including their status, duration, and any errors encountered.
a lightweight data interchange format that is easy for humans to read and write.
a compact, URL-safe means of representing claims to be transferred between two parties.
the time delay between a request being made and the response being received.
a set of rules that automatically manage data throughout its lifecycle, including retention and deletion.
a data migration technique that moves data with minimal or zero downtime.
a service distributing traffic across servers or database nodes to optimize resource use and avoid single points of failure.
a data structure optimized for write-heavy workloads, though TimescaleDB primarily uses B-tree indexes for balanced read/write performance.
a framework for building applications with large language models, providing tools for data ingestion and querying.
a method for handling missing data by using the most recent known value.
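A minimal Python sketch of the idea, using `None` to mark gaps:

```python
def locf(values, initial=None):
    """Fill None gaps with the most recent known value
    (last observation carried forward)."""
    filled, last = [], initial
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled
```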
a backup method that exports data in a human-readable format, allowing for selective restoration.
a PostgreSQL feature that replicates data changes at the logical level rather than the physical level.
the process of recording events, errors, and system activities for monitoring and troubleshooting purposes.
in financial data, the lowest price of a security during a specific time period.
a downsampling algorithm that preserves the visual characteristics of time-series data.
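A compact Python sketch of Largest-Triangle-Three-Buckets, assuming points sorted by x: the first and last points are kept, and from each interior bucket the algorithm keeps the point forming the largest triangle with the previously kept point and the average of the next bucket.

```python
def lttb(points, threshold):
    """Downsample a sorted list of (x, y) tuples to about `threshold` points."""
    n = len(points)
    if threshold >= n or threshold < 3:
        return list(points)
    sampled = [points[0]]
    bucket_size = (n - 2) / (threshold - 2)
    a = 0  # index of the last point kept
    for i in range(threshold - 2):
        start = int(i * bucket_size) + 1
        end = int((i + 1) * bucket_size) + 1
        next_end = min(int((i + 2) * bucket_size) + 1, n)
        # Average of the next bucket (falls back to the final point).
        nxt = points[end:next_end] or [points[-1]]
        avg_x = sum(p[0] for p in nxt) / len(nxt)
        avg_y = sum(p[1] for p in nxt) / len(nxt)
        ax, ay = points[a]
        best, best_area = start, -1.0
        for j in range(start, end):
            bx, by = points[j]
            # Twice the triangle area via the cross product; abs() suffices
            # since only relative size matters.
            area = abs((ax - avg_x) * (by - ay) - (ax - bx) * (avg_y - ay))
            if area > best_area:
                best_area, best = area, j
        sampled.append(points[best])
        a = best
    sampled.append(points[-1])
    return sampled
```

Because the kept point maximizes triangle area, visually significant features such as spikes survive downsampling.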
a distance metric between two points, calculated as the sum of the absolute differences of their coordinates.
the process of compressing chunks manually rather than through automated policies.
the process of computing and storing the results of a query or view for faster access.
a database object that stores the result of a query and can be refreshed periodically.
a query pattern designed to minimize disk I/O by leveraging available RAM and efficient data structures.
a quantitative measurement used to assess system performance, business outcomes, or operational efficiency.
a security method that requires two or more verification factors to grant access.
the process of moving data, applications, or systems from one environment to another.
the continuous observation and measurement of system performance and health.
an architecture pattern supporting multiple customers or applications within a single database instance, with proper isolation.
a lightweight messaging protocol designed for small sensors and mobile devices.
a fully managed TimescaleDB service that handles infrastructure and maintenance tasks.
a network address translation service that enables instances in a private subnet to connect to the internet.
an individual server within a distributed system, contributing to storage, compute, or replication tasks.
a database design technique that organizes data to reduce redundancy, though time-series data often benefits from denormalized structures.
a database constraint that ensures a column cannot contain empty values.
a PostgreSQL data type for storing exact numeric values with user-defined precision.
an open standard for access delegation commonly used for token-based authentication and authorization.
the ability to measure the internal states of a system by examining its outputs.
systems or workloads focused on large-scale, multidimensional, and complex analytical queries.
high-speed transactional systems optimized for data inserts, updates, and short queries.
an acronym for Open, High, Low, Close prices, commonly used in financial data analysis.
an extension of OHLC that includes Volume data for complete candlestick analysis.
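For illustration, a single OHLCV bar can be derived from a period's trades, represented here as (price, volume) pairs in time order:

```python
def ohlcv(trades):
    """Aggregate (price, volume) trades, in time order, into one OHLCV bar."""
    if not trades:
        raise ValueError("no trades in period")
    prices = [price for price, _ in trades]
    return {
        "open": prices[0],     # first trade of the period
        "high": max(prices),
        "low": min(prices),
        "close": prices[-1],   # last trade of the period
        "volume": sum(volume for _, volume in trades),
    }
```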
in financial data, the opening price of a security at the beginning of a trading period.
open standard for collecting, processing, and exporting telemetry data, often stored in time-series databases.
the process of making systems, queries, or operations more efficient and performant.
a technique for copying large amounts of data using multiple concurrent processes to improve performance.
a PostgreSQL feature that uses multiple CPU cores to execute single queries faster, inherited by TimescaleDB.
the practice of dividing large tables into smaller, more manageable pieces based on certain criteria.
a statistical measure that indicates the value below which a certain percentage of observations fall.
a measure of how efficiently a system operates, often quantified by metrics like throughput, latency, and resource utilization.
a PostgreSQL utility for taking base backups of a running PostgreSQL cluster.
a PostgreSQL utility for backing up database objects and data in various formats.
a PostgreSQL utility for restoring databases from backup files created by `pg_dump`.
a PostgreSQL extension that adds vector similarity search capabilities for AI and machine learning applications.
a cloud solution for building search, RAG, and AI agents with PostgreSQL. Enables calling AI embedding and generation models directly from the database using SQL.
a performance enhancement for pgvector featuring StreamingDiskANN indexing, binary quantization compression, and label-based filtering.
a TimescaleDB tool for automatically vectorizing and indexing data for similarity search.
a backup method that copies the actual database files at the storage level.
the ability to restore a database to a specific moment in time.
an automated rule or procedure that performs maintenance tasks like compression, retention, or refresh operations.
the use of time-series data to forecast equipment failure, common in IoT and industrial applications.
an open-source object-relational database system known for its reliability, robustness, and performance.
a PostgreSQL extension that adds support for geographic objects and spatial queries.
a database constraint that uniquely identifies each row in a table.
an interactive terminal-based front-end to PostgreSQL that allows users to type queries interactively.
a measure of database performance indicating how many queries a database can process per second.
a request for data or information from a database, typically written in SQL.
a measure of how efficiently database queries execute, including factors like execution time and resource usage.
a component determining the most efficient strategy for executing SQL queries based on database structure and indexes.
the database process of determining the most efficient way to execute a query.
a security model that assigns permissions to users based on their roles within an organization.
an isolation level where transactions can read committed changes made by other transactions.
a technique for improving database performance by distributing read queries across multiple database replicas.
the lowest isolation level where transactions can read uncommitted changes from other transactions.
a database role with permissions limited to reading data without modification capabilities.
a copy of the primary database that serves read-only queries, improving read scalability and geographic distribution.
the immediate analysis of incoming data streams, crucial for observability, trading platforms, and IoT monitoring.
a PostgreSQL data type for storing single-precision floating-point numbers.
a continuous aggregate that includes both materialized historical data and real-time calculations on recent data.
a technique for combining ranked results from multiple search methods into a single list. Instead of comparing raw scores across different systems, RRF uses rank position: each result receives a score of 1 / (k + rank), where k is a smoothing constant (typically 60). Scores are summed across all search methods and sorted by total. Commonly used in hybrid search to fuse keyword (BM25) and vector similarity results.
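The scoring described above can be sketched in a few lines of Python:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked result lists: each item scores 1 / (k + rank) per list,
    summed across all lists, then sorted by total score."""
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

In a hybrid search, `rankings` might hold one list from a BM25 keyword search and one from a vector similarity search over the same corpus.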
an automated rule that determines when and how continuous aggregates are updated with new data.
a geographical area containing multiple data centers, used in cloud computing for data locality and compliance.
an isolation level that ensures a transaction sees a consistent snapshot of data throughout its execution.
a copy of a database that can be used for read scaling, backup, or disaster recovery purposes.
the process of copying and maintaining data across multiple database instances to ensure availability and durability.
the time it takes for a system to respond to a request, measured from request initiation to response completion.
a web service architecture that uses HTTP methods to enable communication between applications.
the process of recovering data from backups to restore a database to a previous state.
a snapshot of database state that can be used as a reference point for recovery operations.
an automated rule that determines how long data is kept before being deleted from the system.
a set of rules that determine where network traffic is directed within a cloud network.
the maximum acceptable time that systems can be down after a failure or disaster.
the maximum acceptable amount of data loss measured in time after a failure or disaster.
traditional row-oriented data storage where data is stored row by row, optimized for transactional workloads.
an XML-based standard for exchanging authentication and authorization data between security domains.
an automated task that runs at predetermined times or intervals.
the process of modifying database structure over time while maintaining compatibility with existing applications.
the structure of a database, including tables, columns, relationships, and constraints.
a virtual firewall that controls inbound and outbound traffic for cloud resources.
mechanisms allowing applications to dynamically locate services like database endpoints, often used in distributed environments.
a TimescaleDB compression technique that recompresses data segments to improve compression ratios.
the highest isolation level that ensures transactions appear to run serially even when executed concurrently.
a secure version of FTP that encrypts both commands and data during transmission.
query optimization for DISTINCT operations that incrementally jumps between ordered values without reading intermediate rows. Uses a Custom Scan node to efficiently traverse ordered indexes, dramatically improving performance over traditional DISTINCT queries.
a technique for finding items that are similar to a given query item, often used with vector embeddings.
a contract that defines the expected level of service between a provider and customer.
a quantitative measure of some aspect of service quality.
a target value or range for service quality measured by an SLI.
a statistical measure representing the rate of change in linear regression analysis.
an internet standard for email transmission across networks.
a point-in-time copy of data that can be used for backup and recovery purposes.
a PostgreSQL index type for data structures that naturally partition search spaces.
techniques for reducing storage costs and improving performance through compression, tiering, and efficient data organization.
continuous flows of data generated by devices, logs, or sensors, requiring high-ingest, real-time storage solutions.
a programming language designed for managing and querying relational databases.
a cryptographic network protocol for secure communication over an unsecured network.
a security protocol that establishes encrypted links between networked computers.
a regular Tiger Cloud service that includes the resources and features according to the pricing plan. You can create standard services under any of the paid plans.
a PostgreSQL replication method that continuously sends write-ahead log records to standby servers.
simulated transactions or probes used to test system health, generating time-series metrics for performance analysis.
an optimized PostgreSQL instance extended with database engine innovations such as TimescaleDB, in a cloud infrastructure that delivers speed without sacrifice.
a database object that stores data in rows and columns, similar to a spreadsheet.
a PostgreSQL storage structure that defines where database objects are physically stored on disk.
a connection-oriented protocol that ensures reliable data transmission between applications.
a probabilistic data structure for accurate estimation of percentiles in distributed systems.
the collection of real-time data from systems or devices for monitoring and analysis.
a PostgreSQL data type for storing variable-length character strings.
a measure of system performance indicating the amount of work performed or data processed per unit of time.
a storage strategy that automatically moves data between different storage classes based on access patterns and age.
Tiger Data's managed cloud platform that provides TimescaleDB as a fully managed solution with additional features.
Tiger Data's service for integrating operational databases with data lake architectures.
an optimized PostgreSQL instance extended with database engine innovations such as TimescaleDB, in a cloud infrastructure that delivers speed without sacrifice.
data points indexed and ordered by time, typically representing how values change over time.
a statistical calculation that gives more weight to values based on the duration they were held.
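A hedged Python sketch of the calculation over (timestamp, value) samples, assuming each value is held constant until the next sample (the last-observation-carried-forward convention; linear weighting is another common choice):

```python
def time_weighted_average(samples):
    """Average of (timestamp, value) samples, weighting each value
    by how long it was held before the next sample arrived."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    weighted, total = 0.0, 0.0
    for (t0, v0), (t1, _) in zip(samples, samples[1:]):
        weighted += v0 * (t1 - t0)
        total += t1 - t0
    return weighted / total
```

For example, a value of 10 held for 4 seconds followed by 20 held for 1 second averages 12, not the naive midpoint of 15.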
grouping timestamps into uniform intervals for analysis, commonly used with hyperfunctions.
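The core idea can be sketched on Unix timestamps (TimescaleDB's `time_bucket` hyperfunction does the equivalent on SQL timestamps):

```python
from collections import defaultdict

def time_bucket(width_seconds, timestamp):
    """Floor a Unix timestamp to the start of its fixed-width interval."""
    return timestamp - (timestamp % width_seconds)

def bucket_values(width_seconds, samples):
    """Group (timestamp, value) samples by bucket start for per-bucket aggregation."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[time_bucket(width_seconds, ts)].append(value)
    return dict(buckets)
```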
the application of statistical models to time-series data to predict future trends or events.
an open-source PostgreSQL extension for real-time analytics that provides scalability and performance optimizations.
a data type that stores date and time information without timezone data.
a PostgreSQL data type that stores timestamp with timezone information.
a cryptographic protocol that provides security for communication over networks.
a marker indicating deleted data in append-only systems, requiring periodic cleanup processes.
the database property controlling the visibility of uncommitted changes between concurrent transactions.
a measure of database performance indicating transaction processing capacity.
a unit of work performed against a database that must be completed entirely or not at all.
a database procedure that automatically executes in response to certain events on a table or view.
a connectionless communication protocol that provides fast but unreliable data transmission.
a database constraint that ensures all values in a column or combination of columns are distinct.
the amount of time that a system has been operational and available for use.
a billing model where storage costs are based on actual data stored rather than provisioned capacity.
a 128-bit identifier used to uniquely identify information without central coordination.
a PostgreSQL maintenance operation that reclaims storage and updates database statistics.
a variable-length character data type that can store strings up to a specified maximum length.
SIMD (Single Instruction, Multiple Data) optimizations for processing arrays of data, improving analytical query performance.
increasing system capacity by adding more power (CPU, RAM) to existing machines, as opposed to horizontal scaling.
a platform or dashboard used to display time-series data in charts, graphs, and alerts for easier monitoring and analysis.
a mathematical object with magnitude and direction, used in machine learning for representing data as numerical arrays.
a virtual network dedicated to your cloud account that provides network isolation.
a financial indicator that shows the average price weighted by volume over a specific time period.
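As a quick illustration over (price, volume) trades:

```python
def vwap(trades):
    """Volume-weighted average price over (price, volume) trades."""
    total_volume = sum(volume for _, volume in trades)
    if total_volume == 0:
        raise ValueError("no volume traded")
    return sum(price * volume for price, volume in trades) / total_volume
```

A trade of 1 unit at 100 and 3 units at 110 gives a VWAP of 107.5, weighted toward the higher-volume price.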
PostgreSQL's method for ensuring data integrity by writing changes to a log before applying them to data files.
a storage tier that balances access speed and cost, suitable for data accessed occasionally.
a timestamp that tracks the progress of continuous aggregate materialization.
a communication protocol that provides full-duplex communication channels over a single TCP connection.
an SQL function that performs calculations across related rows, particularly useful for time-series analytics and trend analysis.
techniques for prioritizing and scheduling different types of database operations to optimize overall system performance.
a markup language that defines rules for encoding documents in a format that is both human-readable and machine-readable.
a human-readable data serialization standard commonly used for configuration files.
a system design goal where services remain available during maintenance, upgrades, or migrations without interruption.
migration strategies that maintain service availability throughout the transition process, often using techniques like dual-write and gradual cutover.