PineconeRerank

class langchain_pinecone.rerank.PineconeRerank

Bases: BaseDocumentCompressor

Document compressor that uses Pinecone Rerank API.

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError (pydantic_core.ValidationError) if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

param client: Any = None

Pinecone client to use for compressing documents.

param model: str | None = None

Model to use for reranking. The model name must be specified.

param pinecone_api_key: SecretStr | None [Optional]

Pinecone API key. Must be specified directly or via environment variable PINECONE_API_KEY.
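The lookup order described above (explicit value first, then the environment) can be sketched in plain Python. This is only an illustration: `resolve_api_key` is a hypothetical helper, not part of langchain_pinecone, and the class's actual validation logic may differ.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Hypothetical helper: prefer an explicit key, else PINECONE_API_KEY."""
    # Fall back to the real process environment when no mapping is supplied.
    env = os.environ if env is None else env
    key = explicit or env.get("PINECONE_API_KEY")
    if not key:
        raise ValueError(
            "Pass pinecone_api_key or set the PINECONE_API_KEY environment variable."
        )
    return key


# An explicit value wins over the environment variable.
chosen = resolve_api_key("explicit-key", {"PINECONE_API_KEY": "env-key"})
```

Passing the environment as a mapping keeps the sketch easy to exercise without touching real credentials.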

param rank_fields: Sequence[str] | None = None

Fields to use for reranking when documents are dictionaries.

param return_documents: bool = True

Whether to return the documents in the reranking results.

param top_n: int | None = 3

Number of documents to return.

async acompress_documents(
    documents: Sequence[Document],
    query: str,
    callbacks: Callbacks | None = None,
) → Sequence[Document]

Async compress retrieved documents given the query context.

Parameters:
  • documents (Sequence[Document]) – The retrieved documents.

  • query (str) – The query context.

  • callbacks (Optional[Callbacks]) – Optional callbacks to run during compression.

Returns:

The compressed documents.

Return type:

Sequence[Document]

compress_documents(
    documents: Sequence[Document],
    query: str,
    callbacks: list[BaseCallbackHandler] | BaseCallbackManager | None = None,
) → Sequence[Document]

Compress documents using Pinecone’s rerank API.

Parameters:
  • documents (Sequence[Document]) – A sequence of documents to compress.

  • query (str) – The query to use for compressing the documents.

  • callbacks (list[BaseCallbackHandler] | BaseCallbackManager | None) – Callbacks to run during the compression process.

Returns:

A sequence of compressed documents.

Return type:

Sequence[Document]
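One plausible way a compressor can turn rerank results back into documents is sketched below. It assumes each result carries the `index` and `score` fields returned by `rerank`, and that the score is stored under a `relevance_score` metadata key. That key name, the `Doc` stand-in class, and the `attach_scores` helper are assumptions for illustration, not the library's confirmed behaviour.

```python
from copy import deepcopy
from dataclasses import dataclass, field
from typing import Any, Dict, List, Sequence


@dataclass
class Doc:
    """Stand-in for langchain_core.documents.Document (illustration only)."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def attach_scores(
    documents: Sequence[Doc], results: Sequence[Dict[str, Any]]
) -> List[Doc]:
    """Hypothetical helper: reorder documents by rerank results and tag scores."""
    compressed = []
    for res in results:
        original = documents[res["index"]]
        # Copy so the caller's documents are left unmodified.
        doc = Doc(original.page_content, deepcopy(original.metadata))
        doc.metadata["relevance_score"] = res["score"]  # assumed key name
        compressed.append(doc)
    return compressed


docs = [Doc("Apple is a fruit."), Doc("Apple Inc. makes phones.")]
# Hand-written results in the shape documented for rerank(); no API call is made.
fake_results = [
    {"id": "doc1", "index": 1, "score": 0.92},
    {"id": "doc0", "index": 0, "score": 0.11},
]
compressed = attach_scores(docs, fake_results)
```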

rerank(
    documents: Sequence[str | Document | dict],
    query: str,
    *,
    rank_fields: Sequence[str] | None = None,
    model: str | None = None,
    top_n: int | None = None,
    truncate: str = 'END',
) → List[Dict[str, Any]]

Returns a list of documents ordered by their relevance to the provided query.

This method reranks documents using Pinecone’s reranking API as part of a two-stage vector retrieval process to improve result quality. It first converts documents to the appropriate format, then sends them along with the query to the reranking model. The reranking model scores the results based on their semantic relevance to the query and returns a new, more accurate ranking.
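The conversion step mentioned above can be pictured with a small, self-contained sketch. It assumes that strings and Document-like objects are wrapped into dictionaries under a single text field, and that missing IDs are generated from the position in the input. The `normalize_documents` helper, the default field name `text`, and the `doc{i}` ID scheme are all hypothetical and may not match what the library does internally.

```python
from typing import Any, Dict, List, Sequence


def normalize_documents(
    documents: Sequence[Any], rank_field: str = "text"
) -> List[Dict[str, Any]]:
    """Hypothetical helper: coerce str / Document-like / dict inputs to dicts."""
    normalized: List[Dict[str, Any]] = []
    for i, doc in enumerate(documents):
        if isinstance(doc, str):
            normalized.append({"id": f"doc{i}", rank_field: doc})
        elif isinstance(doc, dict):
            item = dict(doc)  # shallow copy; leave the caller's dict untouched
            item.setdefault("id", f"doc{i}")  # 'id' is optional on input
            normalized.append(item)
        elif hasattr(doc, "page_content"):
            # Duck-typed match for Document-like objects.
            normalized.append({"id": f"doc{i}", rank_field: doc.page_content})
        else:
            raise TypeError(f"Unsupported document type: {type(doc).__name__}")
    return normalized


mixed = ["plain string", {"content": "no id given"}, {"id": "x", "text": "has id"}]
normalized = normalize_documents(mixed)
```

A dictionary that uses a custom key such as `content` passes through unchanged, which is why `rank_fields` is needed to tell the reranker which key to score.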

Parameters:
  • query (str) – The query to use for reranking.

  • documents (Sequence[str | Document | dict]) – A sequence of documents to rerank. Can be strings, Document objects, or dictionaries with an optional ‘id’ field and text content.

  • rank_fields (Sequence[str] | None) – A sequence of keys to use for reranking when documents are dictionaries. Only the first field is used for models that support a single rank field.

  • model (str | None) – The model to use for reranking. Defaults to self.model. Supported models include ‘bge-reranker-v2-m3’, ‘pinecone-rerank-v0’, and ‘cohere-rerank-3.5’.

  • top_n (int | None) – The number of results to return. If None returns all results. Defaults to self.top_n.

  • truncate (str) – How to truncate documents if they exceed token limits. Options: “END”, “MIDDLE”. Defaults to “END”.

Returns:

A list of dictionaries, each containing:

  • id: The document ID

  • index: The original index in the input documents sequence

  • score: The relevance score (0-1, with 1 being most relevant)

  • document: The document content (if return_documents=True)

Return type:

List[Dict[str, Any]]
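As a rough sketch of how a caller might consume this return value, the snippet below filters a hand-written result list by score and keeps the best entries. The `select_results` helper and its `min_score` argument are hypothetical (the library exposes no such parameter), and the sample data is made up to match the documented shape rather than taken from a real API response.

```python
from typing import Any, Dict, List, Optional


def select_results(
    results: List[Dict[str, Any]],
    min_score: float = 0.0,
    top_n: Optional[int] = None,
) -> List[Dict[str, Any]]:
    """Hypothetical helper: drop low-scoring results, then keep the best top_n."""
    kept = [r for r in results if r["score"] >= min_score]
    kept.sort(key=lambda r: r["score"], reverse=True)
    # Mirror the documented top_n semantics: None means "return everything".
    return kept if top_n is None else kept[:top_n]


# Made-up results in the documented shape (id, index, score, document).
sample = [
    {"id": "doc0", "index": 0, "score": 0.12, "document": {"text": "fruit"}},
    {"id": "doc1", "index": 1, "score": 0.93, "document": {"text": "tech company"}},
    {"id": "doc2", "index": 2, "score": 0.40, "document": {"text": "proverb"}},
]
best = select_results(sample, min_score=0.3, top_n=1)
```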

Examples

```python
from langchain_pinecone import PineconeRerank
from langchain_core.documents import Document
from pinecone import Pinecone

# Initialize Pinecone client
pc = Pinecone(api_key="your-api-key")

# Create the reranker
reranker = PineconeRerank(
    client=pc,
    model="bge-reranker-v2-m3",
    top_n=2,
)

# Create sample documents
documents = [
    Document(page_content="Apple is a popular fruit known for its sweetness."),
    Document(page_content="Apple Inc. has revolutionized the tech industry."),
    Document(page_content="An apple a day keeps the doctor away."),
]

# Rerank documents
rerank_results = reranker.rerank(
    documents=documents,
    query="Tell me about the tech company Apple",
)

# Display results
for result in rerank_results:
    print(f"Score: {result['score']}, Document: {result['document']}")
```

Using dictionaries with custom fields:

```python
# Create documents as dictionaries with custom fields
docs = [
    {"id": "doc1", "content": "Apple is a fruit known for its sweetness."},
    {"id": "doc2", "content": "Apple Inc. creates innovative tech products."},
]

# Rerank using a custom field
results = reranker.rerank(
    documents=docs,
    query="tech companies",
    rank_fields=["content"],
    top_n=1,
)
```