BM25Strategy

class langchain_elasticsearch.vectorstores.BM25Strategy(k1: float | None = None, b: float | None = None)

Methods

`__init__`([k1, b])
`before_index_creation`(*, client, text_field, ...)	Executes before the index is created.
`es_mappings_settings`(*, text_field, ...)	Create the required index and do necessary preliminary work, like creating inference pipelines or checking if a required model was deployed.
`es_query`(*, query, query_vector, text_field, ...)	Returns the Elasticsearch query body for the given parameters.
`needs_inference`()	Some retrieval strategies index embedding vectors and allow search by embedding vector, for example the DenseVectorStrategy strategy.

Parameters:

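k1 (float | None) – BM25 term-frequency saturation parameter. If None, the Elasticsearch default applies.
b (float | None) – BM25 document-length normalization parameter. If None, the Elasticsearch default applies.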
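To make the roles of k1 and b concrete, the following is a minimal sketch of the textbook BM25 score for a single query term. It is not the code Elasticsearch or this class runs; the function name and argument names are illustrative assumptions, and the defaults of 1.2 and 0.75 mirror the Elasticsearch defaults for BM25 similarity.

```python
import math


def bm25_term_score(
    tf: float,
    doc_len: float,
    avg_doc_len: float,
    n_docs: int,
    doc_freq: int,
    k1: float = 1.2,
    b: float = 0.75,
) -> float:
    """Textbook BM25 contribution of one query term to one document's score.

    Hypothetical helper for illustration only.
    """
    # Inverse document frequency: rarer terms weigh more.
    idf = math.log(1.0 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # b controls how strongly document length is normalized (0 disables it).
    length_norm = 1.0 - b + b * (doc_len / avg_doc_len)
    # k1 controls how quickly repeated occurrences of the term saturate.
    return idf * (tf * (k1 + 1.0)) / (tf + k1 * length_norm)


short_doc = bm25_term_score(tf=2, doc_len=50, avg_doc_len=100, n_docs=1000, doc_freq=10)
long_doc = bm25_term_score(tf=2, doc_len=200, avg_doc_len=100, n_docs=1000, doc_freq=10)
```

With b greater than 0, the same term frequency scores higher in the shorter document. Setting b to 0 removes the length effect entirely, and a smaller k1 makes additional occurrences of a term matter less.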

__init__(k1: float | None = None, b: float | None = None)

Parameters:

k1 (float | None)
b (float | None)

before_index_creation(*, client: Elasticsearch, text_field: str, vector_field: str) → None

Executes before the index is created. Used for setting up any required Elasticsearch resources like a pipeline. Defaults to a no-op.

Parameters:

client (Elasticsearch) – The Elasticsearch client.
text_field (str) – The field containing the text data in the index.
vector_field (str) – The field containing the vector representations in the index.

Return type:

None

es_mappings_settings(*, text_field: str, vector_field: str, num_dimensions: int | None) → Tuple[Dict[str, Any], Dict[str, Any]]

Create the required index and do necessary preliminary work, like creating inference pipelines or checking if a required model was deployed.

Parameters:

text_field (str) – The field containing the text data in the index.
vector_field (str) – The field containing the vector representations in the index.
num_dimensions (int | None) – The number of dimensions of the vectors, if vectors are indexed.

Returns:

A tuple of two dictionaries: the index mappings (field and field type pairs that describe the schema) and the index settings.

Return type:

Tuple[Dict[str, Any], Dict[str, Any]]
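As an illustration of what a (mappings, settings) pair for a BM25 text field could look like, here is a minimal sketch. It is not this method's actual implementation: the function name `bm25_mappings_settings` and the similarity name `custom_bm25` are assumptions made for the example. The dictionary layout follows the Elasticsearch convention of declaring a custom similarity in the index settings and referencing it from the field mapping.

```python
from typing import Any, Dict, Optional, Tuple


def bm25_mappings_settings(
    text_field: str,
    k1: Optional[float] = None,
    b: Optional[float] = None,
    similarity_name: str = "custom_bm25",  # hypothetical name
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Build index mappings and settings for a BM25-scored text field."""
    bm25: Dict[str, Any] = {"type": "BM25"}
    # Only set the parameters that were given, so that omitted ones
    # fall back to the Elasticsearch defaults.
    if k1 is not None:
        bm25["k1"] = k1
    if b is not None:
        bm25["b"] = b

    mappings = {
        "properties": {
            text_field: {"type": "text", "similarity": similarity_name},
        }
    }
    settings = {"index": {"similarity": {similarity_name: bm25}}}
    return mappings, settings


mappings, settings = bm25_mappings_settings("text", k1=1.5)
```

In this sketch no vector field appears in the mappings, since a purely lexical strategy has no vectors to index.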

es_query(*, query: str | None, query_vector: List[float] | None, text_field: str, vector_field: str, k: int, num_candidates: int, filter: List[Dict[str, Any]] = []) → Dict[str, Any]

Returns the Elasticsearch query body for the given parameters. The store will execute the query.

Parameters:

query (str | None) – The text query. Can be None if query_vector is given.
k (int) – The total number of results to retrieve.
num_candidates (int) – The number of results to fetch initially in kNN search.
filter (List[Dict[str, Any]]) – List of filter clauses to apply to the query.
query_vector (List[float] | None) – The query vector. Can be None if a query string is given.
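text_field (str) – The field containing the text data in the index.
vector_field (str) – The field containing the vector representations in the index.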

Returns:

The Elasticsearch query body.

Return type:

Dict[str, Any]
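For orientation, a lexical query body of the kind a BM25 strategy might return can be sketched as a bool query with a match clause plus optional filters. The helper below is hypothetical and not the library's implementation; it ignores query_vector, vector_field, and num_candidates on the assumption that they are not needed for a text-only search, and it leaves applying k to the caller.

```python
from typing import Any, Dict, List, Optional


def bm25_query_body(
    query: str,
    text_field: str,
    filter: Optional[List[Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Build a match query on text_field, restricted by optional filter clauses."""
    return {
        "query": {
            "bool": {
                # The match clause is scored with the field's similarity.
                "must": [{"match": {text_field: {"query": query}}}],
                # Filter clauses restrict results without affecting the score.
                "filter": list(filter or []),
            }
        }
    }


body = bm25_query_body(
    "quick brown fox",
    text_field="text",
    filter=[{"term": {"metadata.lang": "en"}}],
)
```

The sketch takes None as the default for filter and copies the list, which avoids sharing one mutable default list between calls.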

needs_inference() → bool

Some retrieval strategies index embedding vectors and allow search by embedding vector, for example DenseVectorStrategy. Mapping a user input query string to an embedding vector is called inference. Inference can be applied in Elasticsearch (using a model_id) or outside of Elasticsearch (using an EmbeddingService defined on the VectorStore). In the latter case, this method has to return True.

Return type:

bool
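The contract described above can be sketched as follows: a store asks the strategy whether a query vector must be computed outside Elasticsearch before it builds the query. All class and function names here are hypothetical stand-ins, not the library's classes, and the sketch assumes that a purely lexical strategy such as BM25 returns False.

```python
from typing import Callable, List, Optional


class LexicalStrategy:
    """Stand-in for a text-only strategy: no embedding is required."""

    def needs_inference(self) -> bool:
        return False


class ClientSideVectorStrategy:
    """Stand-in for a vector strategy whose embeddings are computed by the caller."""

    def needs_inference(self) -> bool:
        return True


def prepare_query_vector(
    strategy,
    query: str,
    embed: Optional[Callable[[str], List[float]]],
) -> Optional[List[float]]:
    """Return a query vector only when the strategy asks for client-side inference."""
    if not strategy.needs_inference():
        return None
    if embed is None:
        raise ValueError("This strategy needs an embedding function.")
    return embed(query)


def fake_embed(text: str) -> List[float]:
    # Toy embedding for the example: a single dimension holding the text length.
    return [float(len(text))]


lexical_vector = prepare_query_vector(LexicalStrategy(), "hello", None)
dense_vector = prepare_query_vector(ClientSideVectorStrategy(), "hello", fake_embed)
```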