PGVectorStore#

class langchain_postgres.v2.vectorstores.PGVectorStore(
key: object,
engine: PGEngine,
vs: AsyncPGVectorStore,
)[source]#

Postgres Vector Store class

PGVectorStore constructor.

Parameters:
  • key (object) – Prevent direct constructor usage.

  • engine (PGEngine) – Connection pool engine for managing connections to Postgres database.

  • vs (AsyncPGVectorStore) – The async-only VectorStore implementation.

Raises:

Exception – If called directly by user.
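The `key` argument acts as a guard: instances are meant to be obtained through the factory classmethods (`create`, `create_sync`, and the `from_*` helpers) rather than by calling the constructor. A minimal sketch of that pattern follows, assuming a private sentinel object that `__init__` compares against. The class `GuardedStore` and its attributes are hypothetical and only illustrate the idea; this is not the library's code.

```python
class GuardedStore:
    """Toy class whose constructor is only usable through a factory method."""

    # Private sentinel; code outside the class is not expected to hold it.
    _create_key = object()

    def __init__(self, key: object, table_name: str) -> None:
        # Reject any caller that does not present the class's own sentinel.
        if key is not GuardedStore._create_key:
            raise Exception("Use GuardedStore.create() instead of the constructor.")
        self.table_name = table_name

    @classmethod
    def create(cls, table_name: str) -> "GuardedStore":
        # The factory passes the sentinel, so construction succeeds.
        return cls(cls._create_key, table_name)


store = GuardedStore.create("documents")

try:
    GuardedStore(object(), "documents")  # direct call with a foreign key
    direct_call_failed = False
except Exception:
    direct_call_failed = True
```

The factory can perform setup work (for example, awaiting database checks) before the object exists, which a plain `__init__` cannot do.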

Attributes

embeddings

Access the query embedding object if available.

Methods

__init__(key, engine, vs)

PGVectorStore constructor.

aadd_documents(documents[, ids])

Embed documents and add to the table.

aadd_embeddings(texts, embeddings[, ...])

Add data along with embeddings to the table.

aadd_texts(texts[, metadatas, ids])

Embed texts and add to the table.

aapply_vector_index(index[, name, concurrently])

Create an index on the vector store table.

add_documents(documents[, ids])

Embed documents and add to the table.

add_embeddings(texts, embeddings[, ...])

Add data along with embeddings to the table.

add_texts(texts[, metadatas, ids])

Embed texts and add to the table.

adelete([ids])

Delete records from the table.

adrop_vector_index([index_name])

Drop the vector index.

afrom_documents(documents, embedding, ...[, ...])

Create a PGVectorStore instance from documents.

afrom_texts(texts, embedding, engine, table_name)

Create a PGVectorStore instance from texts.

aget_by_ids(ids)

Get documents by ids.

ais_valid_index([index_name])

Check if index exists in the table.

amax_marginal_relevance_search(query[, k, ...])

Return docs selected using the maximal marginal relevance.

amax_marginal_relevance_search_by_vector(...)

Return docs selected using the maximal marginal relevance.

amax_marginal_relevance_search_with_score_by_vector(...)

Return docs and distance scores selected using the maximal marginal relevance.

apply_vector_index(index[, name, concurrently])

Create an index on the vector store table.

areindex([index_name])

Re-index the vector store table.

as_retriever(**kwargs)

Return VectorStoreRetriever initialized from this VectorStore.

asearch(query, search_type, **kwargs)

Async return docs most similar to query using a specified search type.

asimilarity_search(query[, k, filter])

Return docs selected by similarity search on query.

asimilarity_search_by_vector(embedding[, k, ...])

Return docs selected by vector similarity search.

asimilarity_search_with_relevance_scores(query)

Async return docs and relevance scores in the range [0, 1].

asimilarity_search_with_score(query[, k, filter])

Return docs and distance scores selected by similarity search on query.

asimilarity_search_with_score_by_vector(...)

Return docs and distance scores selected by vector similarity search.

create(engine, embedding_service, table_name)

Create a PGVectorStore instance.

create_sync(engine, embedding_service, ...)

Create a PGVectorStore instance.

delete([ids])

Delete records from the table.

drop_vector_index([index_name])

Drop the vector index.

from_documents(documents, embedding, engine, ...)

Create a PGVectorStore instance from documents.

from_texts(texts, embedding, engine, table_name)

Create a PGVectorStore instance from texts.

get_by_ids(ids)

Get documents by ids.

get_table_name()

is_valid_index([index_name])

Check if index exists in the table.

max_marginal_relevance_search(query[, k, ...])

Return docs selected using the maximal marginal relevance.

max_marginal_relevance_search_by_vector(...)

Return docs selected using the maximal marginal relevance.

max_marginal_relevance_search_with_score_by_vector(...)

Return docs and distance scores selected using the maximal marginal relevance.

reindex([index_name])

Re-index the vector store table.

search(query, search_type, **kwargs)

Return docs most similar to query using a specified search type.

similarity_search(query[, k, filter])

Return docs selected by similarity search on query.

similarity_search_by_vector(embedding[, k, ...])

Return docs selected by vector similarity search.

similarity_search_with_relevance_scores(query)

Return docs and relevance scores in the range [0, 1].

similarity_search_with_score(query[, k, filter])

Return docs and distance scores selected by similarity search on query.

similarity_search_with_score_by_vector(embedding)

Return docs and distance scores selected by similarity search on vector.

__init__(
key: object,
engine: PGEngine,
vs: AsyncPGVectorStore,
)[source]#

PGVectorStore constructor.

Parameters:
  • key (object) – Prevent direct constructor usage.

  • engine (PGEngine) – Connection pool engine for managing connections to Postgres database.

  • vs (AsyncPGVectorStore) – The async-only VectorStore implementation.

Raises:

Exception – If called directly by user.

async aadd_documents(
documents: list[Document],
ids: list | None = None,
**kwargs: Any,
) → list[str][source]#

Embed documents and add to the table.

Raises:

InvalidTextRepresentationError <asyncpg.exceptions.InvalidTextRepresentationError> – if the ids data type does not match that of the id_column.

Parameters:
  • documents (list[Document])

  • ids (list | None)

  • kwargs (Any)

Return type:

list[str]
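
Because mismatched `ids` are only rejected once the insert reaches Postgres, it can help to validate them before calling `aadd_documents`. The sketch below assumes the table's `id_column` is of type UUID; that is an assumption about how the table was created, so adjust the check for other column types. The helper names `new_ids` and `check_uuid_ids` are hypothetical.

```python
import uuid


def new_ids(count: int) -> list[str]:
    """Generate `count` random UUID strings suitable for the `ids` argument."""
    return [str(uuid.uuid4()) for _ in range(count)]


def check_uuid_ids(ids: list[str]) -> None:
    """Raise ValueError if any entry is not a well-formed UUID string."""
    for value in ids:
        try:
            uuid.UUID(value)
        except (ValueError, AttributeError, TypeError) as exc:
            raise ValueError(f"not a UUID: {value!r}") from exc


ids = new_ids(3)
check_uuid_ids(ids)  # well-formed ids pass silently

try:
    check_uuid_ids(["doc-1"])  # a plain string that is not a UUID
    rejected = False
except ValueError:
    rejected = True
```

The validated list can then be passed as `ids=` to `aadd_documents` or `add_documents`.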

async aadd_embeddings(
texts: Iterable[str],
embeddings: list[list[float]],
metadatas: list[dict] | None = None,
ids: list[str] | None = None,
**kwargs: Any,
) → list[str][source]#

Add data along with embeddings to the table.

Parameters:
  • texts (Iterable[str])

  • embeddings (list[list[float]])

  • metadatas (list[dict] | None)

  • ids (list[str] | None)

  • kwargs (Any)

Return type:

list[str]

async aadd_texts(
texts: Iterable[str],
metadatas: list[dict] | None = None,
ids: list | None = None,
**kwargs: Any,
) → list[str][source]#

Embed texts and add to the table.

Raises:

InvalidTextRepresentationError <asyncpg.exceptions.InvalidTextRepresentationError> – if the ids data type does not match that of the id_column.

Parameters:
  • texts (Iterable[str])

  • metadatas (list[dict] | None)

  • ids (list | None)

  • kwargs (Any)

Return type:

list[str]

async aapply_vector_index(
index: BaseIndex,
name: str | None = None,
concurrently: bool = False,
) → None[source]#

Create an index on the vector store table.

Parameters:
  • index (BaseIndex)

  • name (str | None)

  • concurrently (bool)

Return type:

None

add_documents(
documents: list[Document],
ids: list | None = None,
**kwargs: Any,
) → list[str][source]#

Embed documents and add to the table.

Raises:

InvalidTextRepresentationError <asyncpg.exceptions.InvalidTextRepresentationError> – if the ids data type does not match that of the id_column.

Parameters:
  • documents (list[Document])

  • ids (list | None)

  • kwargs (Any)

Return type:

list[str]

add_embeddings(
texts: Iterable[str],
embeddings: list[list[float]],
metadatas: list[dict] | None = None,
ids: list[str] | None = None,
**kwargs: Any,
) → list[str][source]#

Add data along with embeddings to the table.

Parameters:
  • texts (Iterable[str])

  • embeddings (list[list[float]])

  • metadatas (list[dict] | None)

  • ids (list[str] | None)

  • kwargs (Any)

Return type:

list[str]

add_texts(
texts: Iterable[str],
metadatas: list[dict] | None = None,
ids: list | None = None,
**kwargs: Any,
) → list[str][source]#

Embed texts and add to the table.

Raises:

InvalidTextRepresentationError <asyncpg.exceptions.InvalidTextRepresentationError> – if the ids data type does not match that of the id_column.

Parameters:
  • texts (Iterable[str])

  • metadatas (list[dict] | None)

  • ids (list | None)

  • kwargs (Any)

Return type:

list[str]

async adelete(
ids: list | None = None,
**kwargs: Any,
) → bool | None[source]#

Delete records from the table.

Raises:

InvalidTextRepresentationError <asyncpg.exceptions.InvalidTextRepresentationError> – if the ids data type does not match that of the id_column.

Parameters:
  • ids (list | None)

  • kwargs (Any)

Return type:

bool | None

async adrop_vector_index(
index_name: str | None = None,
) → None[source]#

Drop the vector index.

Parameters:

index_name (str | None)

Return type:

None

async classmethod afrom_documents(
documents: list[Document],
embedding: Embeddings,
engine: PGEngine,
table_name: str,
schema_name: str = 'public',
ids: list | None = None,
content_column: str = 'content',
embedding_column: str = 'embedding',
metadata_columns: list[str] | None = None,
ignore_metadata_columns: list[str] | None = None,
id_column: str = 'langchain_id',
metadata_json_column: str = 'langchain_metadata',
distance_strategy: DistanceStrategy = DistanceStrategy.COSINE_DISTANCE,
k: int = 4,
fetch_k: int = 20,
lambda_mult: float = 0.5,
index_query_options: QueryOptions | None = None,
**kwargs: Any,
) → PGVectorStore[source]#

Create a PGVectorStore instance from documents.

Parameters:
  • documents (list[Document]) – Documents to add to the vector store.

  • embedding (Embeddings) – Text embedding model to use.

  • engine (PGEngine) – Connection pool engine for managing connections to postgres database.

  • table_name (str) – Name of an existing table.

  • schema_name (str, optional) – Name of the database schema. Defaults to “public”.

  • ids (list | None) – List of IDs to add to table records. Defaults to None.

  • content_column (str, optional) – Column that represents a Document’s page_content. Defaults to “content”.

  • embedding_column (str, optional) – Column for embedding vectors. The embedding is generated from the document value. Defaults to “embedding”.

  • metadata_columns (list[str], optional) – Column(s) that represent a document’s metadata. Defaults to an empty list.

  • ignore_metadata_columns (Optional[list[str]], optional) – Column(s) to ignore in pre-existing tables for a document’s metadata. Cannot be used with metadata_columns. Defaults to None.

  • id_column (str, optional) – Column that represents the Document’s id. Defaults to “langchain_id”.

  • metadata_json_column (str, optional) – Column to store metadata as JSON. Defaults to “langchain_metadata”.

  • distance_strategy (DistanceStrategy) – Distance strategy to use for vector similarity search. Defaults to COSINE_DISTANCE.

  • k (int) – Number of Documents to return from search. Defaults to 4.

  • fetch_k (int) – Number of Documents to fetch to pass to MMR algorithm. Defaults to 20.

  • lambda_mult (float) – Number between 0 and 1 that determines the degree of diversity among the results with 0 corresponding to maximum diversity and 1 to minimum diversity. Defaults to 0.5.

  • index_query_options (QueryOptions) – Index query option.

  • kwargs (Any)

Raises:

InvalidTextRepresentationError <asyncpg.exceptions.InvalidTextRepresentationError> – if the ids data type does not match that of the id_column.

Returns:

PGVectorStore

Return type:

PGVectorStore

async classmethod afrom_texts(
texts: list[str],
embedding: Embeddings,
engine: PGEngine,
table_name: str,
schema_name: str = 'public',
metadatas: list[dict] | None = None,
ids: list | None = None,
content_column: str = 'content',
embedding_column: str = 'embedding',
metadata_columns: list[str] | None = None,
ignore_metadata_columns: list[str] | None = None,
id_column: str = 'langchain_id',
metadata_json_column: str = 'langchain_metadata',
distance_strategy: DistanceStrategy = DistanceStrategy.COSINE_DISTANCE,
k: int = 4,
fetch_k: int = 20,
lambda_mult: float = 0.5,
index_query_options: QueryOptions | None = None,
**kwargs: Any,
) → PGVectorStore[source]#

Create a PGVectorStore instance from texts.

Parameters:
  • texts (list[str]) – Texts to add to the vector store.

  • embedding (Embeddings) – Text embedding model to use.

  • engine (PGEngine) – Connection pool engine for managing connections to postgres database.

  • table_name (str) – Name of an existing table.

  • schema_name (str, optional) – Name of the database schema. Defaults to “public”.

  • metadatas (Optional[list[dict]], optional) – List of metadatas to add to table records. Defaults to None.

  • ids (list | None) – List of IDs to add to table records. Defaults to None.

  • content_column (str, optional) – Column that represents a Document’s page_content. Defaults to “content”.

  • embedding_column (str, optional) – Column for embedding vectors. The embedding is generated from the document value. Defaults to “embedding”.

  • metadata_columns (list[str], optional) – Column(s) that represent a document’s metadata. Defaults to an empty list.

  • ignore_metadata_columns (Optional[list[str]], optional) – Column(s) to ignore in pre-existing tables for a document’s metadata. Cannot be used with metadata_columns. Defaults to None.

  • id_column (str, optional) – Column that represents the Document’s id. Defaults to “langchain_id”.

  • metadata_json_column (str, optional) – Column to store metadata as JSON. Defaults to “langchain_metadata”.

  • distance_strategy (DistanceStrategy) – Distance strategy to use for vector similarity search. Defaults to COSINE_DISTANCE.

  • k (int) – Number of Documents to return from search. Defaults to 4.

  • fetch_k (int) – Number of Documents to fetch to pass to MMR algorithm. Defaults to 20.

  • lambda_mult (float) – Number between 0 and 1 that determines the degree of diversity among the results with 0 corresponding to maximum diversity and 1 to minimum diversity. Defaults to 0.5.

  • index_query_options (QueryOptions) – Index query option.

  • kwargs (Any)

Raises:

InvalidTextRepresentationError <asyncpg.exceptions.InvalidTextRepresentationError> – if the ids data type does not match that of the id_column.

Returns:

PGVectorStore

Return type:

PGVectorStore

async aget_by_ids(
ids: Sequence[str],
) → list[Document][source]#

Get documents by ids.

Parameters:

ids (Sequence[str])

Return type:

list[Document]

async ais_valid_index(
index_name: str | None = None,
) → bool[source]#

Check if index exists in the table.

Parameters:

index_name (str | None)

Return type:

bool

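async amax_marginal_relevance_search(
query: str,
k: int | None = None,
fetch_k: int | None = None,
lambda_mult: float | None = None,
filter: dict | None = None,
**kwargs: Any,
) → list[Document][source]#

Return docs selected using the maximal marginal relevance.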

Parameters:
  • query (str)

  • k (int | None)

  • fetch_k (int | None)

  • lambda_mult (float | None)

  • filter (dict | None)

  • kwargs (Any)

Return type:

list[Document]
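
To make the roles of `k`, `fetch_k`, and `lambda_mult` concrete, here is a minimal pure-Python sketch of maximal marginal relevance selection. It assumes cosine similarity as the relevance measure and treats `candidates` as the `fetch_k` vectors already fetched from the table. The function names are hypothetical, and this is an illustration of the technique, not the library's implementation.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def mmr_select(
    query: list[float],
    candidates: list[list[float]],
    k: int = 4,
    lambda_mult: float = 0.5,
) -> list[int]:
    """Return indices of up to `k` candidates chosen greedily by MMR."""
    selected: list[int] = []
    remaining = list(range(len(candidates)))

    def score(i: int) -> float:
        relevance = cosine_similarity(query, candidates[i])
        # Redundancy is the highest similarity to anything already chosen.
        redundancy = max(
            (cosine_similarity(candidates[i], candidates[j]) for j in selected),
            default=0.0,
        )
        return lambda_mult * relevance - (1.0 - lambda_mult) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
candidates = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]

relevance_only = mmr_select(query, candidates, k=2, lambda_mult=1.0)
more_diverse = mmr_select(query, candidates, k=2, lambda_mult=0.25)
```

With `lambda_mult=1.0` the two vectors closest to the query are chosen; lowering it to 0.25 swaps the near-duplicate second candidate for the dissimilar third one.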

async amax_marginal_relevance_search_by_vector(
embedding: list[float],
k: int | None = None,
fetch_k: int | None = None,
lambda_mult: float | None = None,
filter: dict | None = None,
**kwargs: Any,
) → list[Document][source]#

Return docs selected using the maximal marginal relevance.

Parameters:
  • embedding (list[float])

  • k (int | None)

  • fetch_k (int | None)

  • lambda_mult (float | None)

  • filter (dict | None)

  • kwargs (Any)

Return type:

list[Document]

async amax_marginal_relevance_search_with_score_by_vector(
embedding: list[float],
k: int | None = None,
fetch_k: int | None = None,
lambda_mult: float | None = None,
filter: dict | None = None,
**kwargs: Any,
) → list[tuple[Document, float]][source]#

Return docs and distance scores selected using the maximal marginal relevance.

Parameters:
  • embedding (list[float])

  • k (int | None)

  • fetch_k (int | None)

  • lambda_mult (float | None)

  • filter (dict | None)

  • kwargs (Any)

Return type:

list[tuple[Document, float]]

apply_vector_index(
index: BaseIndex,
name: str | None = None,
concurrently: bool = False,
) → None[source]#

Create an index on the vector store table.

Parameters:
  • index (BaseIndex)

  • name (str | None)

  • concurrently (bool)

Return type:

None

async areindex(
index_name: str | None = None,
) → None[source]#

Re-index the vector store table.

Parameters:

index_name (str | None)

Return type:

None

as_retriever(
**kwargs: Any,
) → VectorStoreRetriever#

Return VectorStoreRetriever initialized from this VectorStore.

Parameters:

**kwargs (Any) – Keyword arguments to pass to the search function. Can include:

  • search_type (Optional[str]) – Defines the type of search that the Retriever should perform. Can be “similarity” (default), “mmr”, or “similarity_score_threshold”.

  • search_kwargs (Optional[Dict]) – Keyword arguments to pass to the search function. Can include things like:

      • k – Number of documents to return (Default: 4).

      • score_threshold – Minimum relevance threshold for similarity_score_threshold.

      • fetch_k – Number of documents to pass to the MMR algorithm (Default: 20).

      • lambda_mult – Diversity of results returned by MMR; 1 for minimum diversity and 0 for maximum (Default: 0.5).

      • filter – Filter by document metadata.

Returns:

Retriever class for VectorStore.

Return type:

VectorStoreRetriever

Examples:

# Retrieve more documents with higher diversity
# Useful if your dataset has many similar documents
docsearch.as_retriever(
    search_type="mmr",
    search_kwargs={'k': 6, 'lambda_mult': 0.25}
)

# Fetch more documents for the MMR algorithm to consider
# But only return the top 5
docsearch.as_retriever(
    search_type="mmr",
    search_kwargs={'k': 5, 'fetch_k': 50}
)

# Only retrieve documents that have a relevance score
# Above a certain threshold
docsearch.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={'score_threshold': 0.8}
)

# Only get the single most similar document from the dataset
docsearch.as_retriever(search_kwargs={'k': 1})

# Use a filter to only retrieve documents from a specific paper
docsearch.as_retriever(
    search_kwargs={'filter': {'paper_title':'GPT-4 Technical Report'}}
)

async asearch(
query: str,
search_type: str,
**kwargs: Any,
) → list[Document]#

Async return docs most similar to query using a specified search type.

Parameters:
  • query (str) – Input text.

  • search_type (str) – Type of search to perform. Can be “similarity”, “mmr”, or “similarity_score_threshold”.

  • **kwargs (Any) – Arguments to pass to the search method.

Returns:

List of Documents most similar to the query.

Raises:

ValueError – If search_type is not one of “similarity”, “mmr”, or “similarity_score_threshold”.

Return type:

list[Document]

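async asimilarity_search(
query: str,
k: int | None = None,
filter: dict | None = None,
**kwargs: Any,
) → list[Document][source]#

Return docs selected by similarity search on query.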

Parameters:
  • query (str)

  • k (int | None)

  • filter (dict | None)

  • kwargs (Any)

Return type:

list[Document]

async asimilarity_search_by_vector(
embedding: list[float],
k: int | None = None,
filter: dict | None = None,
**kwargs: Any,
) → list[Document][source]#

Return docs selected by vector similarity search.

Parameters:
  • embedding (list[float])

  • k (int | None)

  • filter (dict | None)

  • kwargs (Any)

Return type:

list[Document]

async asimilarity_search_with_relevance_scores(
query: str,
k: int = 4,
**kwargs: Any,
) → list[tuple[Document, float]]#

Async return docs and relevance scores in the range [0, 1].

0 is dissimilar, 1 is most similar.

Parameters:
  • query (str) – Input text.

  • k (int) – Number of Documents to return. Defaults to 4.

  • **kwargs (Any) – Keyword arguments passed to the similarity search. Can include score_threshold (optional): a floating-point value between 0 and 1 used to filter the resulting set of retrieved docs.

Returns:

List of Tuples of (doc, similarity_score)

Return type:

list[tuple[Document, float]]
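
The relationship between a distance and a relevance score in [0, 1], and the effect of `score_threshold`, can be sketched as follows. This assumes the common convention `relevance = 1 - cosine distance`; other distance strategies need a different mapping, and the exact function the store applies is not shown on this page. Both helper names are hypothetical.

```python
def cosine_relevance(distance: float) -> float:
    """Map a cosine distance to a relevance score where 1 is most similar."""
    return 1.0 - distance


def apply_score_threshold(
    scored_by_distance: list[tuple[str, float]],
    score_threshold: float,
) -> list[tuple[str, float]]:
    """Convert distances to relevance scores and keep those at or above the threshold."""
    results = []
    for doc, distance in scored_by_distance:
        score = cosine_relevance(distance)
        if score >= score_threshold:
            results.append((doc, score))
    return results


# Plain strings stand in for Document objects.
hits = apply_score_threshold([("close", 0.25), ("far", 0.5)], score_threshold=0.6)
```

Only the first entry survives, because its relevance of 0.75 clears the 0.6 threshold while 0.5 does not.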

async asimilarity_search_with_score(
query: str,
k: int | None = None,
filter: dict | None = None,
**kwargs: Any,
) → list[tuple[Document, float]][source]#

Return docs and distance scores selected by similarity search on query.

Parameters:
  • query (str)

  • k (int | None)

  • filter (dict | None)

  • kwargs (Any)

Return type:

list[tuple[Document, float]]
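
Unlike relevance scores, the distance scores returned here get smaller as documents get more similar. A small sketch of cosine distance, assuming the default `COSINE_DISTANCE` strategy and the usual definition of one minus cosine similarity; the vectors and labels are made up for illustration.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance of two non-zero vectors: 0 is identical direction, 2 is opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))


query = [1.0, 0.0]
stored = {
    "same direction": [2.0, 0.0],
    "orthogonal": [0.0, 3.0],
    "opposite": [-1.0, 0.0],
}

# Sort ascending so the nearest vector comes first, as a similarity search would.
ranked = sorted((cosine_distance(query, vec), name) for name, vec in stored.items())
```

Vector length does not matter for this measure, only direction, which is why `[2.0, 0.0]` has distance 0 from the query.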

async asimilarity_search_with_score_by_vector(
embedding: list[float],
k: int | None = None,
filter: dict | None = None,
**kwargs: Any,
) → list[tuple[Document, float]][source]#

Return docs and distance scores selected by vector similarity search.

Parameters:
  • embedding (list[float])

  • k (int | None)

  • filter (dict | None)

  • kwargs (Any)

Return type:

list[tuple[Document, float]]

async classmethod create(
engine: PGEngine,
embedding_service: Embeddings,
table_name: str,
schema_name: str = 'public',
content_column: str = 'content',
embedding_column: str = 'embedding',
metadata_columns: list[str] | None = None,
ignore_metadata_columns: list[str] | None = None,
id_column: str = 'langchain_id',
metadata_json_column: str | None = 'langchain_metadata',
distance_strategy: DistanceStrategy = DistanceStrategy.COSINE_DISTANCE,
k: int = 4,
fetch_k: int = 20,
lambda_mult: float = 0.5,
index_query_options: QueryOptions | None = None,
) → PGVectorStore[source]#

Create a PGVectorStore instance.

Parameters:
  • engine (PGEngine) – Connection pool engine for managing connections to postgres database.

  • embedding_service (Embeddings) – Text embedding model to use.

  • table_name (str) – Name of an existing table.

  • schema_name (str, optional) – Name of the database schema. Defaults to “public”.

  • content_column (str) – Column that represents a Document’s page_content. Defaults to “content”.

  • embedding_column (str) – Column for embedding vectors. The embedding is generated from the document value. Defaults to “embedding”.

  • metadata_columns (list[str]) – Column(s) that represent a document’s metadata.

  • ignore_metadata_columns (list[str]) – Column(s) to ignore in pre-existing tables for a document’s metadata. Cannot be used with metadata_columns. Defaults to None.

  • id_column (str) – Column that represents the Document’s id. Defaults to “langchain_id”.

  • metadata_json_column (str) – Column to store metadata as JSON. Defaults to “langchain_metadata”.

  • distance_strategy (DistanceStrategy) – Distance strategy to use for vector similarity search. Defaults to COSINE_DISTANCE.

  • k (int) – Number of Documents to return from search. Defaults to 4.

  • fetch_k (int) – Number of Documents to fetch to pass to MMR algorithm. Defaults to 20.

  • lambda_mult (float) – Number between 0 and 1 that determines the degree of diversity among the results with 0 corresponding to maximum diversity and 1 to minimum diversity. Defaults to 0.5.

  • index_query_options (QueryOptions) – Index query option.

Returns:

PGVectorStore

Return type:

PGVectorStore
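
Since `table_name` must refer to an existing table, a typical setup initialises the table first and then calls `create`. The sketch below assumes that `PGEngine.from_connection_string` and `PGEngine.ainit_vectorstore_table` are available (neither is documented on this page) and uses `DeterministicFakeEmbedding` from `langchain_core` purely as a placeholder embedding model. The function name, table name, and vector size are hypothetical.

```python
from typing import Any


async def build_store(connection_url: str, table_name: str = "documents") -> Any:
    """Initialise a table, then return a PGVectorStore bound to it."""
    # Lazy imports keep this module importable without the packages installed.
    from langchain_core.embeddings import DeterministicFakeEmbedding
    from langchain_postgres import PGEngine, PGVectorStore

    vector_size = 8  # tiny dimension, for demonstration only

    # Assumption: the engine is built from an asyncpg-style connection URL.
    engine = PGEngine.from_connection_string(url=connection_url)

    # Assumption: the engine exposes this helper to create the table.
    await engine.ainit_vectorstore_table(
        table_name=table_name,
        vector_size=vector_size,
    )

    return await PGVectorStore.create(
        engine=engine,
        embedding_service=DeterministicFakeEmbedding(size=vector_size),
        table_name=table_name,
    )
```

It would be run with something like `asyncio.run(build_store("postgresql+asyncpg://user:password@localhost:5432/db"))`, where the URL is a placeholder.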

classmethod create_sync(
engine: PGEngine,
embedding_service: Embeddings,
table_name: str,
schema_name: str = 'public',
content_column: str = 'content',
embedding_column: str = 'embedding',
metadata_columns: list[str] | None = None,
ignore_metadata_columns: list[str] | None = None,
id_column: str = 'langchain_id',
metadata_json_column: str = 'langchain_metadata',
distance_strategy: DistanceStrategy = DistanceStrategy.COSINE_DISTANCE,
k: int = 4,
fetch_k: int = 20,
lambda_mult: float = 0.5,
index_query_options: QueryOptions | None = None,
) → PGVectorStore[source]#

Create a PGVectorStore instance.

Parameters:
  • engine (PGEngine) – Connection pool engine for managing connections to postgres database.

  • embedding_service (Embeddings) – Text embedding model to use.

  • table_name (str) – Name of an existing table.

  • schema_name (str, optional) – Name of the database schema. Defaults to “public”.

  • content_column (str, optional) – Column that represents a Document’s page_content. Defaults to “content”.

  • embedding_column (str, optional) – Column for embedding vectors. The embedding is generated from the document value. Defaults to “embedding”.

  • metadata_columns (list[str], optional) – Column(s) that represent a document’s metadata. Defaults to None.

  • ignore_metadata_columns (Optional[list[str]]) – Column(s) to ignore in pre-existing tables for a document’s metadata. Cannot be used with metadata_columns. Defaults to None.

  • id_column (str, optional) – Column that represents the Document’s id. Defaults to “langchain_id”.

  • metadata_json_column (str, optional) – Column to store metadata as JSON. Defaults to “langchain_metadata”.

  • distance_strategy (DistanceStrategy, optional) – Distance strategy to use for vector similarity search. Defaults to COSINE_DISTANCE.

  • k (int, optional) – Number of Documents to return from search. Defaults to 4.

  • fetch_k (int, optional) – Number of Documents to fetch to pass to MMR algorithm. Defaults to 20.

  • lambda_mult (float, optional) – Number between 0 and 1 that determines the degree of diversity among the results with 0 corresponding to maximum diversity and 1 to minimum diversity. Defaults to 0.5.

  • index_query_options (Optional[QueryOptions], optional) – Index query option. Defaults to None.

Returns:

PGVectorStore

Return type:

PGVectorStore

delete(
ids: list | None = None,
**kwargs: Any,
) → bool | None[source]#

Delete records from the table.

Raises:

InvalidTextRepresentationError <asyncpg.exceptions.InvalidTextRepresentationError> – if the ids data type does not match that of the id_column.

Parameters:
  • ids (list | None)

  • kwargs (Any)

Return type:

bool | None

drop_vector_index(
index_name: str | None = None,
) → None[source]#

Drop the vector index.

Parameters:

index_name (str | None)

Return type:

None

classmethod from_documents(
documents: list[Document],
embedding: Embeddings,
engine: PGEngine,
table_name: str,
schema_name: str = 'public',
ids: list | None = None,
content_column: str = 'content',
embedding_column: str = 'embedding',
metadata_columns: list[str] | None = None,
ignore_metadata_columns: list[str] | None = None,
id_column: str = 'langchain_id',
metadata_json_column: str = 'langchain_metadata',
distance_strategy: DistanceStrategy = DistanceStrategy.COSINE_DISTANCE,
k: int = 4,
fetch_k: int = 20,
lambda_mult: float = 0.5,
index_query_options: QueryOptions | None = None,
**kwargs: Any,
) → PGVectorStore[source]#

Create a PGVectorStore instance from documents.

Parameters:
  • documents (list[Document]) – Documents to add to the vector store.

  • embedding (Embeddings) – Text embedding model to use.

  • engine (PGEngine) – Connection pool engine for managing connections to postgres database.

  • table_name (str) – Name of an existing table.

  • schema_name (str, optional) – Name of the database schema. Defaults to “public”.

  • ids (list | None) – List of IDs to add to table records. Defaults to None.

  • content_column (str, optional) – Column that represents a Document’s page_content. Defaults to “content”.

  • embedding_column (str, optional) – Column for embedding vectors. The embedding is generated from the document value. Defaults to “embedding”.

  • metadata_columns (list[str], optional) – Column(s) that represent a document’s metadata. Defaults to an empty list.

  • ignore_metadata_columns (Optional[list[str]], optional) – Column(s) to ignore in pre-existing tables for a document’s metadata. Cannot be used with metadata_columns. Defaults to None.

  • id_column (str, optional) – Column that represents the Document’s id. Defaults to “langchain_id”.

  • metadata_json_column (str, optional) – Column to store metadata as JSON. Defaults to “langchain_metadata”.

  • distance_strategy (DistanceStrategy) – Distance strategy to use for vector similarity search. Defaults to COSINE_DISTANCE.

  • k (int) – Number of Documents to return from search. Defaults to 4.

  • fetch_k (int) – Number of Documents to fetch to pass to MMR algorithm. Defaults to 20.

  • lambda_mult (float) – Number between 0 and 1 that determines the degree of diversity among the results with 0 corresponding to maximum diversity and 1 to minimum diversity. Defaults to 0.5.

  • index_query_options (QueryOptions) – Index query option.

  • kwargs (Any)

Raises:

InvalidTextRepresentationError <asyncpg.exceptions.InvalidTextRepresentationError> – if the ids data type does not match that of the id_column.

Returns:

PGVectorStore

Return type:

PGVectorStore

classmethod from_texts(
texts: list[str],
embedding: Embeddings,
engine: PGEngine,
table_name: str,
schema_name: str = 'public',
metadatas: list[dict] | None = None,
ids: list | None = None,
content_column: str = 'content',
embedding_column: str = 'embedding',
metadata_columns: list[str] | None = None,
ignore_metadata_columns: list[str] | None = None,
id_column: str = 'langchain_id',
metadata_json_column: str = 'langchain_metadata',
distance_strategy: DistanceStrategy = DistanceStrategy.COSINE_DISTANCE,
k: int = 4,
fetch_k: int = 20,
lambda_mult: float = 0.5,
index_query_options: QueryOptions | None = None,
**kwargs: Any,
) → PGVectorStore[source]#

Create a PGVectorStore instance from texts.

Parameters:
  • texts (list[str]) – Texts to add to the vector store.

  • embedding (Embeddings) – Text embedding model to use.

  • engine (PGEngine) – Connection pool engine for managing connections to postgres database.

  • table_name (str) – Name of an existing table.

  • schema_name (str, optional) – Name of the database schema. Defaults to “public”.

  • metadatas (Optional[list[dict]], optional) – List of metadatas to add to table records. Defaults to None.

  • ids (list | None) – List of IDs to add to table records. Defaults to None.

  • content_column (str, optional) – Column that represents a Document’s page_content. Defaults to “content”.

  • embedding_column (str, optional) – Column for embedding vectors. The embedding is generated from the document value. Defaults to “embedding”.

  • metadata_columns (list[str], optional) – Column(s) that represent a document’s metadata. Defaults to an empty list.

  • ignore_metadata_columns (Optional[list[str]], optional) – Column(s) to ignore in pre-existing tables for a document’s metadata. Cannot be used with metadata_columns. Defaults to None.

  • id_column (str, optional) – Column that represents the Document’s id. Defaults to “langchain_id”.

  • metadata_json_column (str, optional) – Column to store metadata as JSON. Defaults to “langchain_metadata”.

  • distance_strategy (DistanceStrategy) – Distance strategy to use for vector similarity search. Defaults to COSINE_DISTANCE.

  • k (int) – Number of Documents to return from search. Defaults to 4.

  • fetch_k (int) – Number of Documents to fetch to pass to MMR algorithm. Defaults to 20.

  • lambda_mult (float) – Number between 0 and 1 that determines the degree of diversity among the results with 0 corresponding to maximum diversity and 1 to minimum diversity. Defaults to 0.5.

  • index_query_options (QueryOptions) – Index query option.

  • kwargs (Any)

Raises:

InvalidTextRepresentationError <asyncpg.exceptions.InvalidTextRepresentationError> – if the ids data type does not match that of the id_column.

Returns:

PGVectorStore

Return type:

PGVectorStore

get_by_ids(
ids: Sequence[str],
) → list[Document][source]#

Get documents by ids.

Parameters:

ids (Sequence[str])

Return type:

list[Document]

get_table_name() → str[source]#

Return type:

str

is_valid_index(
index_name: str | None = None,
) → bool[source]#

Check if index exists in the table.

Parameters:

index_name (str | None)

Return type:

bool

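max_marginal_relevance_search(
query: str,
k: int | None = None,
fetch_k: int | None = None,
lambda_mult: float | None = None,
filter: dict | None = None,
**kwargs: Any,
) → list[Document][source]#

Return docs selected using the maximal marginal relevance.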

Parameters:
  • query (str)

  • k (int | None)

  • fetch_k (int | None)

  • lambda_mult (float | None)

  • filter (dict | None)

  • kwargs (Any)

Return type:

list[Document]

max_marginal_relevance_search_by_vector(
embedding: list[float],
k: int | None = None,
fetch_k: int | None = None,
lambda_mult: float | None = None,
filter: dict | None = None,
**kwargs: Any,
) → list[Document][source]#

Return docs selected using the maximal marginal relevance.

Parameters:
  • embedding (list[float])

  • k (int | None)

  • fetch_k (int | None)

  • lambda_mult (float | None)

  • filter (dict | None)

  • kwargs (Any)

Return type:

list[Document]

max_marginal_relevance_search_with_score_by_vector(
embedding: list[float],
k: int | None = None,
fetch_k: int | None = None,
lambda_mult: float | None = None,
filter: dict | None = None,
**kwargs: Any,
) → list[tuple[Document, float]][source]#

Return docs and distance scores selected using the maximal marginal relevance.

Parameters:
  • embedding (list[float])

  • k (int | None)

  • fetch_k (int | None)

  • lambda_mult (float | None)

  • filter (dict | None)

  • kwargs (Any)

Return type:

list[tuple[Document, float]]

reindex(index_name: str | None = None) → None[source]#

Re-index the vector store table.

Parameters:

index_name (str | None)

Return type:

None

search(
query: str,
search_type: str,
**kwargs: Any,
) → list[Document]#

Return docs most similar to query using a specified search type.

Parameters:
  • query (str) – Input text.

  • search_type (str) – Type of search to perform. Can be “similarity”, “mmr”, or “similarity_score_threshold”.

  • **kwargs (Any) – Arguments to pass to the search method.

Returns:

List of Documents most similar to the query.

Raises:

ValueError – If search_type is not one of “similarity”, “mmr”, or “similarity_score_threshold”.

Return type:

list[Document]
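
The three accepted `search_type` values and the `ValueError` for anything else can be pictured as a simple dispatch. The sketch below mirrors that documented behaviour with a hypothetical `dispatch_search` helper and a `StubStore` that returns canned results; it illustrates the routing only and is not the library's source.

```python
from typing import Any


def dispatch_search(store: Any, query: str, search_type: str, **kwargs: Any) -> list:
    """Route a query to the store method that matches `search_type`."""
    if search_type == "similarity":
        return store.similarity_search(query, **kwargs)
    if search_type == "mmr":
        return store.max_marginal_relevance_search(query, **kwargs)
    if search_type == "similarity_score_threshold":
        scored = store.similarity_search_with_relevance_scores(query, **kwargs)
        # Drop the scores so every branch returns a plain list of documents.
        return [doc for doc, _score in scored]
    raise ValueError(f"unsupported search_type: {search_type!r}")


class StubStore:
    """Hypothetical stand-in returning labelled results instead of Documents."""

    def similarity_search(self, query: str, **kwargs: Any) -> list:
        return ["similarity result"]

    def max_marginal_relevance_search(self, query: str, **kwargs: Any) -> list:
        return ["mmr result"]

    def similarity_search_with_relevance_scores(self, query: str, **kwargs: Any) -> list:
        return [("threshold result", 0.9)]


stub = StubStore()
by_similarity = dispatch_search(stub, "cats", "similarity")
by_threshold = dispatch_search(stub, "cats", "similarity_score_threshold")

try:
    dispatch_search(stub, "cats", "keyword")
    bad_type_rejected = False
except ValueError:
    bad_type_rejected = True
```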

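similarity_search(
query: str,
k: int | None = None,
filter: dict | None = None,
**kwargs: Any,
) → list[Document][source]#

Return docs selected by similarity search on query.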

Parameters:
  • query (str)

  • k (int | None)

  • filter (dict | None)

  • kwargs (Any)

Return type:

list[Document]

similarity_search_by_vector(
embedding: list[float],
k: int | None = None,
filter: dict | None = None,
**kwargs: Any,
) → list[Document][source]#

Return docs selected by vector similarity search.

Parameters:
  • embedding (list[float])

  • k (int | None)

  • filter (dict | None)

  • kwargs (Any)

Return type:

list[Document]

similarity_search_with_relevance_scores(
query: str,
k: int = 4,
**kwargs: Any,
) → list[tuple[Document, float]]#

Return docs and relevance scores in the range [0, 1].

0 is dissimilar, 1 is most similar.

Parameters:
  • query (str) – Input text.

  • k (int) – Number of Documents to return. Defaults to 4.

  • **kwargs (Any) – Keyword arguments passed to the similarity search. Can include score_threshold (optional): a floating-point value between 0 and 1 used to filter the resulting set of retrieved docs.

Returns:

List of Tuples of (doc, similarity_score).

Return type:

list[tuple[Document, float]]

similarity_search_with_score(
query: str,
k: int | None = None,
filter: dict | None = None,
**kwargs: Any,
) → list[tuple[Document, float]][source]#

Return docs and distance scores selected by similarity search on query.

Parameters:
  • query (str)

  • k (int | None)

  • filter (dict | None)

  • kwargs (Any)

Return type:

list[tuple[Document, float]]

similarity_search_with_score_by_vector(
embedding: list[float],
k: int | None = None,
filter: dict | None = None,
**kwargs: Any,
) → list[tuple[Document, float]][source]#

Return docs and distance scores selected by similarity search on vector.

Parameters:
  • embedding (list[float])

  • k (int | None)

  • filter (dict | None)

  • kwargs (Any)

Return type:

list[tuple[Document, float]]