ElasticsearchEmbeddingsCache

class langchain_elasticsearch.cache.ElasticsearchEmbeddingsCache(index_name: str, store_input: bool = True, metadata: Dict[str, Any] | None = None, namespace: str | None = None, maximum_duplicates_allowed: int = 1, *, es_url: str | None = None, es_cloud_id: str | None = None, es_user: str | None = None, es_api_key: str | None = None, es_password: str | None = None, es_params: Dict[str, Any] | None = None)

An Elasticsearch store for caching embeddings.

Initialize the Elasticsearch cache store by specifying the index/alias to use and determining which additional information (like input, input parameters, and any other metadata) should be stored in the cache. Provide a namespace to organize the cache.

Parameters:
  • index_name (str) – The name of the index or alias to use for the cache. If it does not exist, an index is created using the default mapping defined by the mapping property.

  • store_input (bool) – Whether to store the input in the cache. Defaults to True.

  • metadata (Optional[dict]) – Additional metadata to store in the cache, for filtering purposes. It must be JSON serializable so that it can be stored in an Elasticsearch document. Defaults to None.

  • namespace (Optional[str]) – A namespace to use for the cache.

  • maximum_duplicates_allowed (int) – The maximum number of duplicate keys permitted. Set this when the same key can appear across multiple indices that share the same alias. Defaults to 1.

  • es_url (str | None) – URL of the Elasticsearch instance to connect to.

  • es_cloud_id (str | None) – Cloud ID of the Elasticsearch instance to connect to.

  • es_user (str | None) – Username to use when connecting to Elasticsearch.

  • es_password (str | None) – Password to use when connecting to Elasticsearch.

  • es_api_key (str | None) – API key to use when connecting to Elasticsearch.

  • es_params (Dict[str, Any] | None) – Other parameters for the Elasticsearch client.
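
As a hedged sketch of how these parameters fit together, the snippet below collects constructor arguments in a dict and defers the actual construction to a function, since instantiating the class needs a reachable Elasticsearch cluster. The index name, namespace, URL, and credentials are placeholder values, and `create_cache` is a hypothetical helper, not part of the library.

```python
from typing import Any, Dict

# Placeholder values; replace them with your own cluster details.
CACHE_KWARGS: Dict[str, Any] = {
    "index_name": "embeddings-cache",    # index or alias name (made up)
    "store_input": True,                 # keep the input text next to the vector
    "metadata": {"project": "demo"},     # must be JSON serializable
    "namespace": "my-embedding-model",   # made-up namespace
    "es_url": "http://localhost:9200",   # assumed local instance
    "es_user": "elastic",                # placeholder credentials
    "es_password": "changeme",
}


def create_cache() -> Any:
    """Hypothetical helper: build the cache (requires a running cluster)."""
    # Imported lazily so this module loads even without the package installed.
    from langchain_elasticsearch.cache import ElasticsearchEmbeddingsCache

    return ElasticsearchEmbeddingsCache(**CACHE_KWARGS)
```

Only keyword names that appear in the signature above are used; `es_cloud_id` or `es_api_key` could replace the URL and user/password pair.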

Attributes

mapping

Get the default mapping for the index.

Methods

__init__(index_name[, store_input, ...])

Initialize the Elasticsearch cache store by specifying the index/alias to use and determining which additional information (like input, input parameters, and any other metadata) should be stored in the cache.

amdelete(keys)

Async delete the given keys and their associated values.

amget(keys)

Async get the values associated with the given keys.

amset(key_value_pairs)

Async set the values for the given keys.

ayield_keys(*[, prefix])

Async get an iterator over keys that match the given prefix.

build_document(text_input, vector)

Build the Elasticsearch document for storing a single embedding.

decode_vector(data)

Decode the base64 string to vector data as bytes.

encode_vector(data)

Encode the vector data, given as bytes, as a base64 string.

mdelete(keys)

Delete the given keys and their associated values.

mget(keys)

Get the values associated with the given keys.

mset(key_value_pairs)

Set the values for the given keys.

yield_keys(*[, prefix])

Get an iterator over keys that match the given prefix.

__init__(index_name: str, store_input: bool = True, metadata: Dict[str, Any] | None = None, namespace: str | None = None, maximum_duplicates_allowed: int = 1, *, es_url: str | None = None, es_cloud_id: str | None = None, es_user: str | None = None, es_api_key: str | None = None, es_password: str | None = None, es_params: Dict[str, Any] | None = None)

Initialize the Elasticsearch cache store by specifying the index/alias to use and determining which additional information (like input, input parameters, and any other metadata) should be stored in the cache. Provide a namespace to organize the cache.

Parameters:
  • index_name (str) – The name of the index or alias to use for the cache. If it does not exist, an index is created using the default mapping defined by the mapping property.

  • store_input (bool) – Whether to store the input in the cache. Defaults to True.

  • metadata (Optional[dict]) – Additional metadata to store in the cache, for filtering purposes. It must be JSON serializable so that it can be stored in an Elasticsearch document. Defaults to None.

  • namespace (Optional[str]) – A namespace to use for the cache.

  • maximum_duplicates_allowed (int) – The maximum number of duplicate keys permitted. Set this when the same key can appear across multiple indices that share the same alias. Defaults to 1.

  • es_url (str | None) – URL of the Elasticsearch instance to connect to.

  • es_cloud_id (str | None) – Cloud ID of the Elasticsearch instance to connect to.

  • es_user (str | None) – Username to use when connecting to Elasticsearch.

  • es_password (str | None) – Password to use when connecting to Elasticsearch.

  • es_api_key (str | None) – API key to use when connecting to Elasticsearch.

  • es_params (Dict[str, Any] | None) – Other parameters for the Elasticsearch client.

async amdelete(keys: Sequence[K]) → None

Async delete the given keys and their associated values.

Parameters:

keys (Sequence[K]) – A sequence of keys to delete.

Return type:

None

async amget(keys: Sequence[K]) → list[V | None]

Async get the values associated with the given keys.

Parameters:

keys (Sequence[K]) – A sequence of keys.

Returns:

A sequence of optional values associated with the keys. If a key is not found, the corresponding value will be None.

Return type:

list[V | None]
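
The return contract (one result per requested key, with None for misses) can be illustrated without a cluster. The sketch below uses `AsyncStoreSketch`, a hypothetical in-memory class that only mimics the documented `amget` behaviour; it is not the library's implementation, and the keys and values are made up.

```python
import asyncio
from typing import Dict, List, Optional, Sequence


class AsyncStoreSketch:
    """Hypothetical in-memory store exposing an async amget."""

    def __init__(self, data: Dict[str, bytes]) -> None:
        self._data = dict(data)

    async def amget(self, keys: Sequence[str]) -> List[Optional[bytes]]:
        # Results line up with the requested keys; misses become None.
        return [self._data.get(key) for key in keys]


async def main() -> List[Optional[bytes]]:
    store = AsyncStoreSketch({"a": b"[1.0]"})
    # "b" was never stored, so its slot in the result is None.
    return await store.amget(["a", "b"])


results = asyncio.run(main())
```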

async amset(key_value_pairs: Sequence[tuple[K, V]]) → None

Async set the values for the given keys.

Parameters:

key_value_pairs (Sequence[Tuple[K, V]]) – A sequence of key-value pairs.

Return type:

None

async ayield_keys(*, prefix: str | None = None) → AsyncIterator[K] | AsyncIterator[str]

Async get an iterator over keys that match the given prefix.

Parameters:

prefix (str | None) – The prefix to match.

Yields:

Iterator[K | str] – An iterator over keys that match the given prefix. This method is allowed to return an iterator over either K or str depending on what makes more sense for the given store.

Return type:

AsyncIterator[K] | AsyncIterator[str]

build_document(text_input: str, vector: bytes) → Dict[str, Any]

Build the Elasticsearch document for storing a single embedding.

Parameters:
  • text_input (str)

  • vector (bytes)

Return type:

Dict[str, Any]

static decode_vector(data: str) → bytes

Decode the base64 string to vector data as bytes.

Parameters:

data (str)

Return type:

bytes

static encode_vector(data: bytes) → str

Encode the vector data, given as bytes, as a base64 string.

Parameters:

data (bytes)

Return type:

str
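
The two static helpers describe a plain base64 round trip between bytes and text. The sketch below reproduces that round trip with the standard library; it assumes the standard (not URL-safe) base64 alphabet and UTF-8 text, which this page does not spell out, and the `_sketch` functions are illustrative stand-ins rather than the library's code.

```python
import base64
import json


def encode_vector_sketch(data: bytes) -> str:
    """Encode raw vector bytes as a base64 string (assumed standard alphabet)."""
    return base64.b64encode(data).decode("utf-8")


def decode_vector_sketch(data: str) -> bytes:
    """Decode a base64 string back into the original vector bytes."""
    return base64.b64decode(data)


# Example payload: an embedding serialized to bytes as JSON. This format is an
# assumption; any bytes payload round-trips the same way.
vector_bytes = json.dumps([0.1, 0.2, 0.3]).encode("utf-8")
encoded = encode_vector_sketch(vector_bytes)
decoded = decode_vector_sketch(encoded)
```

The text form is what makes a bytes payload safe to place in a JSON document field.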

mdelete(keys: Sequence[str]) → None

Delete the given keys and their associated values.

Parameters:

keys (Sequence[str])

Return type:

None

mget(keys: Sequence[str]) → List[bytes | None]

Get the values associated with the given keys.

Parameters:

keys (Sequence[str])

Return type:

List[bytes | None]

mset(key_value_pairs: Sequence[Tuple[str, bytes]]) → None

Set the values for the given keys.

Parameters:

key_value_pairs (Sequence[Tuple[str, bytes]])

Return type:

None
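
To illustrate how `mset`, `mget`, and `mdelete` relate to each other without a cluster, here is a minimal in-memory stand-in. `InMemoryByteStore` is hypothetical and only mimics the documented signatures (string keys, bytes values, None for missing keys); it says nothing about how the Elasticsearch-backed class stores documents.

```python
from typing import Dict, List, Optional, Sequence, Tuple


class InMemoryByteStore:
    """Hypothetical dict-backed stand-in with the same method signatures."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def mset(self, key_value_pairs: Sequence[Tuple[str, bytes]]) -> None:
        # Later pairs overwrite earlier ones that share a key.
        for key, value in key_value_pairs:
            self._data[key] = value

    def mget(self, keys: Sequence[str]) -> List[Optional[bytes]]:
        # One result per requested key, in order; None marks a miss.
        return [self._data.get(key) for key in keys]

    def mdelete(self, keys: Sequence[str]) -> None:
        # Deleting an absent key is a no-op in this sketch.
        for key in keys:
            self._data.pop(key, None)


store = InMemoryByteStore()
store.mset([("hello", b"[0.1, 0.2]"), ("world", b"[0.3, 0.4]")])
hits = store.mget(["hello", "missing", "world"])
store.mdelete(["hello"])
after_delete = store.mget(["hello", "world"])
```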

yield_keys(*, prefix: str | None = None) → Iterator[str]

Get an iterator over keys that match the given prefix.

Parameters:

prefix (str | None)

Return type:

Iterator[str]
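
Prefix matching can be pictured as a lazy filter over stored keys. The generator below is an illustrative sketch over a plain list of made-up keys; treating a None prefix as "match everything" is an assumption, and how the Elasticsearch-backed store performs the match internally is not described on this page.

```python
from typing import Iterable, Iterator, Optional


def yield_keys_sketch(
    keys: Iterable[str], *, prefix: Optional[str] = None
) -> Iterator[str]:
    """Lazily yield keys that start with prefix (all keys if prefix is None)."""
    for key in keys:
        if prefix is None or key.startswith(prefix):
            yield key


sample_keys = ["doc:1", "doc:2", "query:1"]  # made-up keys
doc_keys = list(yield_keys_sketch(sample_keys, prefix="doc:"))
all_keys = list(yield_keys_sketch(sample_keys))
```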