pgai documentation

A Python library that turns PostgreSQL into the retrieval engine behind robust, production-ready RAG and Agentic applications.

  • 🔄 Automatically create vector embeddings from data in PostgreSQL tables as well as documents in S3. The embeddings are automatically updated as the data changes.

  • 🔍 Powerful vector and semantic search with pgvector and pgvectorscale.

  • 🛡️ Production-ready out-of-the-box: Supports batch processing for efficient embedding generation, with built-in handling for model failures, rate limits, and latency spikes.

Works with any PostgreSQL database, including Timescale Cloud, Amazon RDS, Supabase and more.

The pgai Python library can be installed using pip:

pip install pgai

To set up the necessary database functions and tables in your PostgreSQL database, run the following Python code:

import pgai
# DB_URL is the connection string for your PostgreSQL database
pgai.install(DB_URL)

All of the pgai objects are installed into the ai schema.
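The DB_URL value passed to pgai.install(DB_URL) is not defined in the snippet itself. Below is a minimal sketch of one way to assemble it, assuming a standard PostgreSQL connection URI. The helper name build_db_url and the example credentials are hypothetical and not part of pgai.

```python
from urllib.parse import quote


def build_db_url(user: str, password: str, host: str, port: int, dbname: str) -> str:
    """Assemble a PostgreSQL connection URI (hypothetical helper, not part of pgai)."""
    # Percent-encode the credentials so characters such as "@" or "/"
    # in a password cannot be mistaken for URI delimiters.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{dbname}"
    )


# Placeholder values; substitute your own connection details.
DB_URL = build_db_url("postgres", "p@ss/word", "localhost", 5432, "postgres")
print(DB_URL)
```

In practice the credentials would usually come from environment variables or a secrets manager rather than being written into the source.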

Vectorizer automates the embedding process within your database by treating embeddings as a declarative, DDL-like feature, similar to an index.

  • Overview: Automate AI embedding with pgai Vectorizer. A comprehensive overview of Vectorizer features, showing how it streamlines working with vector embeddings in your database.

  • Chunking: chunking algorithms you can use from within SQL.
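To illustrate what a chunking step does before text is embedded, here is a minimal sketch of fixed-size chunking with overlap in plain Python. It is not pgai's implementation or its SQL interface; the function name chunk_text and its parameters are assumptions made for illustration only.

```python
def chunk_text(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into fixed-size character chunks that share `overlap` characters.

    Illustrative only: a hypothetical helper, not a pgai function.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each iteration
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk is emitted that is fully covered already.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


pieces = chunk_text("abcdefghij", chunk_size=4, overlap=1)
print(pieces)
```

The overlap repeats the tail of each chunk at the head of the next one, so a sentence that straddles a boundary still appears intact in at least one chunk. Real chunkers often split on separators such as paragraphs or sentences instead of raw character counts.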

The pgai extension is a PostgreSQL extension that calls models from inside PostgreSQL. More information about the extension is available in the pgai extension documentation.