VertexStringEvaluator

class langchain_google_vertexai.evaluators.evaluation.VertexStringEvaluator(metric: str, **kwargs)

Evaluate a predicted string using the evaluation metric named by metric.

Attributes

evaluation_name

The name of the evaluation.

requires_input

Whether this evaluator requires an input string.

requires_reference

Whether this evaluator requires a reference label.

Methods

__init__(metric, **kwargs)

aevaluate_strings(*, prediction[, ...])

Asynchronously evaluate Chain or LLM output, based on optional input and label.

evaluate(examples, predictions, *[, ...])

evaluate_strings(*, prediction[, reference, ...])

Evaluate Chain or LLM output, based on optional input and label.

Parameters:

metric (str)

__init__(metric: str, **kwargs)
Parameters:

metric (str)

async aevaluate_strings(
*,
prediction: str,
reference: str | None = None,
input: str | None = None,
**kwargs: Any,
) → dict

Asynchronously evaluate Chain or LLM output, based on optional input and label.

Parameters:
  • prediction (str) – The LLM or chain prediction to evaluate.

  • reference (Optional[str], optional) – The reference label to evaluate against.

  • input (Optional[str], optional) – The input to consider during evaluation.

  • **kwargs – Additional keyword arguments, including callbacks, tags, etc.

Returns:

The evaluation results containing the score or value.

Return type:

dict
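The call shape of aevaluate_strings can be sketched offline as follows. This is a minimal illustration, not the library's implementation: StandInEvaluator is a hypothetical class that only mirrors the documented signature, its exact-match scoring is a toy substitute for the real metric, and the "score" key in the returned dict is an assumption based on the "score or value" wording above.

```python
import asyncio
from typing import Any, Optional


class StandInEvaluator:
    """Hypothetical offline stand-in that mirrors the documented signature."""

    async def aevaluate_strings(
        self,
        *,
        prediction: str,
        reference: Optional[str] = None,
        input: Optional[str] = None,
        **kwargs: Any,
    ) -> dict:
        # Toy exact-match score (assumption); the real class computes
        # whichever metric it was constructed with.
        return {"score": float(prediction == reference)}


async def main() -> list:
    evaluator = StandInEvaluator()
    # All arguments are keyword-only, so each one must be named.
    return await asyncio.gather(
        evaluator.aevaluate_strings(prediction="Paris", reference="Paris"),
        evaluator.aevaluate_strings(prediction="Lyon", reference="Paris"),
    )


results = asyncio.run(main())
print(results)
```

Because the method is a coroutine, several evaluations can be awaited together with asyncio.gather, which returns the result dicts in the order the calls were passed.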

evaluate(
examples: Sequence[Dict[str, str]],
predictions: Sequence[Dict[str, str]],
*,
question_key: str = 'context',
answer_key: str = 'reference',
prediction_key: str = 'prediction',
instruction_key: str = 'instruction',
**kwargs: Any,
) → List[dict]
Parameters:
  • examples (Sequence[Dict[str, str]])

  • predictions (Sequence[Dict[str, str]])

  • question_key (str)

  • answer_key (str)

  • prediction_key (str)

  • instruction_key (str)

  • kwargs (Any)

Return type:

List[dict]
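The reference above does not describe how evaluate uses its key arguments. The sketch below shows one plausible reading, under two labeled assumptions: examples and predictions are aligned by position, and each *_key argument names the dict field to read. pair_rows is a hypothetical helper written for illustration and is not part of the library.

```python
from typing import Dict, List, Sequence


def pair_rows(
    examples: Sequence[Dict[str, str]],
    predictions: Sequence[Dict[str, str]],
    *,
    question_key: str = "context",
    answer_key: str = "reference",
    prediction_key: str = "prediction",
    instruction_key: str = "instruction",
) -> List[Dict[str, str]]:
    """Hypothetical helper: merge each example with its prediction."""
    rows: List[Dict[str, str]] = []
    # Assumption: the i-th prediction belongs to the i-th example.
    for example, predicted in zip(examples, predictions):
        row = {"prediction": predicted[prediction_key]}
        # Copy only the optional fields that this example provides.
        for key in (question_key, answer_key, instruction_key):
            if key in example:
                row[key] = example[key]
        rows.append(row)
    return rows


examples = [
    {"context": "The capital of France is Paris.", "reference": "Paris"},
    {"reference": "4"},
]
predictions = [{"prediction": "Paris"}, {"prediction": "5"}]

rows = pair_rows(examples, predictions)
print(rows)
```

With the default key names, the two sequences shown here have the shape that the documented signature accepts: one dict per example and one dict per prediction, all with string values.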

evaluate_strings(
*,
prediction: str,
reference: str | None = None,
input: str | None = None,
**kwargs: Any,
) → dict

Evaluate Chain or LLM output, based on optional input and label.

Parameters:
  • prediction (str) – The LLM or chain prediction to evaluate.

  • reference (Optional[str], optional) – The reference label to evaluate against.

  • input (Optional[str], optional) – The input to consider during evaluation.

  • **kwargs – Additional keyword arguments, including callbacks, tags, etc.

Returns:

The evaluation results containing the score or value.

Return type:

dict
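A synchronous call can be sketched the same way. The function below is a hypothetical stand-in with the documented signature so that the snippet runs without Vertex AI credentials; its scoring rule and the "score" key are assumptions. The commented lines show how the real class would presumably be substituted, using the import path given at the top of this page; the metric name "bleu" is an assumption, not a value listed in this reference.

```python
from typing import Any, Optional

# With the real library (not run here; "bleu" is an assumed metric name):
# from langchain_google_vertexai.evaluators.evaluation import VertexStringEvaluator
# evaluator = VertexStringEvaluator(metric="bleu")
# result = evaluator.evaluate_strings(prediction="4", reference="4")


def evaluate_strings(
    *,
    prediction: str,
    reference: Optional[str] = None,
    input: Optional[str] = None,
    **kwargs: Any,
) -> dict:
    """Hypothetical stand-in mirroring the documented signature."""
    # Toy exact-match score after trimming whitespace (assumption).
    return {"score": float(prediction.strip() == (reference or "").strip())}


result = evaluate_strings(prediction="4", reference="4", input="What is 2 + 2?")
print(result["score"])

# The leading "*" in the signature makes every argument keyword-only,
# so a positional call fails before any evaluation happens.
try:
    evaluate_strings("4", "4")
except TypeError as exc:
    positional_error = str(exc)
    print(positional_error)
```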