VertexStringEvaluator
- class langchain_google_vertexai.evaluators.evaluation.VertexStringEvaluator(metric: str, **kwargs)
Evaluate the perplexity of a predicted string.
Attributes
evaluation_name
The name of the evaluation.
requires_input
Whether this evaluator requires an input string.
requires_reference
Whether this evaluator requires a reference label.
Methods
__init__(metric, **kwargs)
aevaluate_strings(*, prediction[, ...])
Asynchronously evaluate Chain or LLM output, based on optional input and label.
evaluate(examples, predictions, *[, ...])
evaluate_strings(*, prediction[, reference, ...])
Evaluate Chain or LLM output, based on optional input and label.
- Parameters:
metric (str)
- async aevaluate_strings(
- *,
- prediction: str,
- reference: str | None = None,
- input: str | None = None,
- **kwargs: Any,
- ) → dict
Asynchronously evaluate Chain or LLM output, based on optional input and label.
- Parameters:
prediction (str) – The LLM or chain prediction to evaluate.
reference (Optional[str], optional) – The reference label to evaluate against.
input (Optional[str], optional) – The input to consider during evaluation.
**kwargs – Additional keyword arguments, including callbacks, tags, etc.
- Returns:
The evaluation results containing the score or value.
- Return type:
dict
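Because aevaluate_strings is a coroutine, several predictions can be scored concurrently. The following is a minimal sketch of that calling pattern only: StubEvaluator and score_all are hypothetical stand-ins written for this example, the exact-match scoring rule is an assumption, and nothing here calls Vertex AI. Only the keyword-only signature and the dict return value are taken from the reference above.

```python
import asyncio
from typing import Any, List, Optional, Tuple


class StubEvaluator:
    """Hypothetical stand-in that mirrors the aevaluate_strings signature."""

    async def aevaluate_strings(
        self,
        *,
        prediction: str,
        reference: Optional[str] = None,
        input: Optional[str] = None,
        **kwargs: Any,
    ) -> dict:
        # Yield control once, as a real evaluator would while awaiting a service.
        await asyncio.sleep(0)
        # Assumed scoring rule, for illustration only: exact match with the reference.
        return {"score": 1.0 if prediction == reference else 0.0}


async def score_all(pairs: List[Tuple[str, str]]) -> List[dict]:
    evaluator = StubEvaluator()
    # Schedule every evaluation at once; gather returns results in input order.
    return await asyncio.gather(
        *(evaluator.aevaluate_strings(prediction=p, reference=r) for p, r in pairs)
    )


results = asyncio.run(score_all([("Paris", "Paris"), ("Lyon", "Paris")]))
print(results)
```

With a real evaluator instance in place of the stub, the same gather pattern should apply, subject to whatever rate limits the backing service imposes.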
- evaluate(
- examples: Sequence[Dict[str, str]],
- predictions: Sequence[Dict[str, str]],
- *,
- question_key: str = 'context',
- answer_key: str = 'reference',
- prediction_key: str = 'prediction',
- instruction_key: str = 'instruction',
- **kwargs: Any,
- ) → List[dict]
- Parameters:
examples (Sequence[Dict[str, str]])
predictions (Sequence[Dict[str, str]])
question_key (str)
answer_key (str)
prediction_key (str)
instruction_key (str)
kwargs (Any)
- Return type:
List[dict]
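The reference does not say how evaluate uses its four key arguments. One plausible reading, shown here purely as an assumption, is that each key names the field to read from the corresponding dict, with examples and predictions paired by position. build_rows is a hypothetical helper written for this sketch and is not part of the library; only the parameter names and their defaults come from the signature above.

```python
from typing import Any, Dict, List, Sequence


def build_rows(
    examples: Sequence[Dict[str, str]],
    predictions: Sequence[Dict[str, str]],
    *,
    question_key: str = "context",
    answer_key: str = "reference",
    prediction_key: str = "prediction",
    instruction_key: str = "instruction",
) -> List[Dict[str, Any]]:
    """Hypothetical helper: pair each example with its prediction by position."""
    if len(examples) != len(predictions):
        raise ValueError("examples and predictions must have the same length")
    rows = []
    for example, predicted in zip(examples, predictions):
        rows.append(
            {
                # .get() yields None for fields an example does not provide.
                "context": example.get(question_key),
                "reference": example.get(answer_key),
                "instruction": example.get(instruction_key),
                "prediction": predicted[prediction_key],
            }
        )
    return rows


examples = [
    {
        "context": "France is in Europe.",
        "reference": "Paris",
        "instruction": "Name the capital.",
    }
]
predictions = [{"prediction": "Paris"}]
rows = build_rows(examples, predictions)
print(rows)
```

If the input dicts use different field names, passing the matching key arguments would let the same data be read without renaming fields, assuming the reading above is correct.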
- evaluate_strings(
- *,
- prediction: str,
- reference: str | None = None,
- input: str | None = None,
- **kwargs: Any,
- ) → dict
Evaluate Chain or LLM output, based on optional input and label.
- Parameters:
prediction (str) – The LLM or chain prediction to evaluate.
reference (Optional[str], optional) – The reference label to evaluate against.
input (Optional[str], optional) – The input to consider during evaluation.
**kwargs – Additional keyword arguments, including callbacks, tags, etc.
- Returns:
The evaluation results containing the score or value.
- Return type:
dict
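Every argument of evaluate_strings after the bare * must be passed by keyword. The sketch below illustrates that calling convention and the dict result with a hypothetical stand-in class, StubStringEvaluator, which does not contact Vertex AI. Its case-insensitive match rule, its "score" key, and its behavior when a required reference is missing are assumptions made for the example, not documented behavior of VertexStringEvaluator.

```python
from typing import Any, Optional


class StubStringEvaluator:
    """Hypothetical stand-in that mirrors the evaluate_strings signature."""

    requires_input = False
    requires_reference = True

    def evaluate_strings(
        self,
        *,
        prediction: str,
        reference: Optional[str] = None,
        input: Optional[str] = None,
        **kwargs: Any,
    ) -> dict:
        # Assumed behavior: reject calls that omit a required reference.
        if self.requires_reference and reference is None:
            raise ValueError("this evaluator requires a reference")
        # Assumed scoring rule: case-insensitive match after trimming whitespace.
        matched = prediction.strip().lower() == reference.strip().lower()
        return {"score": float(matched)}


evaluator = StubStringEvaluator()
result = evaluator.evaluate_strings(prediction="Paris ", reference="paris")
print(result)

# Positional arguments are rejected because the parameters are keyword-only.
try:
    evaluator.evaluate_strings("Paris", "paris")
    positional_ok = True
except TypeError:
    positional_ok = False
print(positional_ok)
```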