Python binding for pgvecto.rs

Python bindings for pgvecto.rs

Currently supports SQLAlchemy.

Usage

Install from PyPI:

pip install pgvecto_rs

See the usage examples:

SQLAlchemy

Install SQLAlchemy and psycopg3

pip install "psycopg[binary]" sqlalchemy

Then write your code. For example:

import numpy as np
from sqlalchemy import create_engine, select, insert
from sqlalchemy import Integer, String
from pgvecto_rs.sqlalchemy import Vector
from sqlalchemy.orm import Session, DeclarativeBase, mapped_column, Mapped

URL = "postgresql+psycopg://<...>"

# Define the ORM model


class Base(DeclarativeBase):
    pass


class Document(Base):
    __tablename__ = "documents"

    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
    text: Mapped[str] = mapped_column(String)
    embedding: Mapped[np.ndarray] = mapped_column(Vector(3))

    def __repr__(self) -> str:
        return f"{self.text}: {self.embedding}"


# Connect to the DB and create the table
engine = create_engine(URL)
Document.metadata.create_all(engine)

with Session(engine) as session:
    # Insert 3 rows into the table
    t1 = insert(Document).values(text="hello world", embedding=[1, 2, 3])
    t2 = insert(Document).values(text="hello postgres", embedding=[1, 2, 4])
    t3 = insert(Document).values(text="hello pgvecto.rs", embedding=[1, 3, 4])
    for t in [t1, t2, t3]:
        session.execute(t)
    session.commit()

    # Select the row "hello pgvecto.rs"
    stmt = select(Document).where(Document.text == "hello pgvecto.rs")
    target = session.scalar(stmt)

    # Select all the rows and sort them
    # by the squared_euclidean_distance to "hello pgvecto.rs"
    stmt = select(
        Document.text,
        Document.embedding.squared_euclidean_distance(target.embedding).label(
            "distance"
        ),
    ).order_by("distance")
    for doc in session.execute(stmt):
        print(doc)

# Drop the table
Document.metadata.drop_all(engine)

The output will be:

('hello pgvecto.rs', 0.0)
('hello postgres', 1.0)
('hello world', 2.0)
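
The distances above can be checked by hand. The following is a minimal plain-Python sketch, not part of pgvecto_rs and not using the database. It assumes that squared_euclidean_distance is the sum of squared component differences; the `rows` dict and the `squared_l2` helper are hypothetical names used only for illustration.

```python
# Hypothetical stand-in for the three rows inserted in the example above.
rows = {
    "hello world": [1, 2, 3],
    "hello postgres": [1, 2, 4],
    "hello pgvecto.rs": [1, 3, 4],
}
target = rows["hello pgvecto.rs"]


def squared_l2(a, b):
    # Assumed definition: sum of squared component differences.
    return float(sum((x - y) ** 2 for x, y in zip(a, b)))


# Rank every row by its distance to the target, nearest first.
ranked = sorted(
    ((text, squared_l2(vec, target)) for text, vec in rows.items()),
    key=lambda pair: pair[1],
)
for pair in ranked:
    print(pair)
```

Under that assumption the printed ranking matches the output shown above.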

The supported operators are:

  • squared_euclidean_distance
  • negative_dot_product_distance
  • negative_cosine_distance
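
As a rough guide to what these three names suggest, here is a plain-Python sketch of the corresponding formulas. The definitions are inferred from the operator names alone and are an assumption; the extension's exact definitions may differ, so treat this as an illustration and not as the library's implementation.

```python
import math


def squared_euclidean_distance(a, b):
    # Assumed: sum of squared component differences.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def negative_dot_product_distance(a, b):
    # Assumed: the dot product with its sign flipped,
    # so that more similar vectors sort first.
    return -sum(x * y for x, y in zip(a, b))


def negative_cosine_distance(a, b):
    # Assumed: cosine similarity with its sign flipped.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return -dot / (norm_a * norm_b)


a, b = [1.0, 2.0, 3.0], [1.0, 3.0, 4.0]
print(squared_euclidean_distance(a, b))
print(negative_dot_product_distance(a, b))
print(negative_cosine_distance(a, b))
```

With all three, a smaller value means a closer match, which is why an ascending order_by returns the nearest rows first.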

Development

This package is managed by PDM.

Set up the development environment:

pdm venv create
pdm use # select the venv inside the project path
pdm sync

Run lint:

pdm run format
pdm run check

Run the tests in the current environment:

pdm run test

Test

Tox is used to test the package locally.

Run the tests in all environments:

tox run

