
Evaluator Types
Character Count
Analyze response length and verbosity to ensure outputs meet specific length requirements.
Character Count Ratio
Measure the ratio of output characters to input characters to assess response proportionality and expansion.
Word Count
Ensure appropriate response detail level by tracking the total number of words in outputs.
Word Count Ratio
Measure the ratio of words to the input to compare input/output verbosity and expansion patterns.
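The four count-based evaluators above reduce to simple arithmetic over the prompt and response strings. A minimal sketch (function and key names are illustrative, not any particular library's API):

```python
def count_metrics(prompt: str, response: str) -> dict:
    """Compute the four length-based evaluator scores for a prompt/response pair."""
    prompt_words = prompt.split()
    response_words = response.split()
    return {
        "character_count": len(response),
        "character_count_ratio": len(response) / len(prompt) if prompt else 0.0,
        "word_count": len(response_words),
        "word_count_ratio": len(response_words) / len(prompt_words) if prompt_words else 0.0,
    }
```

A ratio above 1.0 indicates the response expanded on the input; below 1.0, that it compressed it.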
Answer Relevancy
Verify responses address the query to ensure AI outputs stay on topic and remain relevant.
Faithfulness
Detect hallucinations and verify facts to maintain accuracy and truthfulness in AI responses.
PII Detection
Identify personal information exposure to protect user privacy and ensure data security compliance.
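A common baseline for PII detection is pattern matching over known formats. The patterns below are deliberately simplified assumptions for illustration; production detectors use far more robust rules and often NER models as well:

```python
import re

# Simplified, illustrative patterns only -- real PII detectors are much stricter.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def detect_pii(text: str) -> dict:
    """Return the PII categories that matched, with the offending substrings."""
    return {
        name: pattern.findall(text)
        for name, pattern in PII_PATTERNS.items()
        if pattern.search(text)
    }
```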
Profanity Detection
Flag inappropriate language use to maintain content quality standards and professional communication.
Secrets Detection
Monitor for credential and key leaks to prevent accidental exposure of sensitive information.
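Secrets detection works the same way, scanning for known credential signatures. The rules below are a small illustrative sample; real secret scanners ship hundreds of provider-specific patterns plus entropy checks:

```python
import re

# Illustrative signatures only; not an exhaustive or authoritative rule set.
SECRET_PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    "generic_api_key": re.compile(r"(?i)api[_-]?key\s*[:=]\s*['\"]?[A-Za-z0-9]{16,}"),
}

def detect_secrets(text: str) -> list[str]:
    """Return the names of the secret categories found in the text."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)]
```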
SQL Validation
Validate SQL queries to ensure proper syntax and structure in database-related AI outputs.
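One lightweight way to check SQL syntax is to ask a real engine to plan the query. This sketch uses Python's built-in SQLite, so it validates against SQLite's dialect only (other engines accept different syntax), and since referenced tables must exist, it mainly catches pure syntax errors:

```python
import sqlite3

def is_valid_sql(query: str) -> bool:
    """Check SQL syntax by asking an in-memory SQLite engine to plan the query.

    Validates SQLite's dialect only; queries referencing nonexistent
    tables also fail, so this is a conservative syntax check.
    """
    try:
        sqlite3.connect(":memory:").execute("EXPLAIN " + query)
        return True
    except sqlite3.Error:
        return False
```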
JSON Validation
Validate JSON responses to ensure proper formatting and structure in API-related outputs.
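JSON validation is the simplest of the structural checks: attempt to parse and report success. A minimal sketch using the standard library:

```python
import json

def is_valid_json(text: str) -> bool:
    """Return True if the text parses as well-formed JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False
```

Note that this checks well-formedness only; validating that the JSON matches an expected schema is a separate step.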
Regex Validation
Validate regex patterns to ensure correct regular expression syntax and functionality.
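Regex validation can likewise lean on the host language's compiler. This sketch checks Python's regex flavor; patterns valid in one flavor (e.g. PCRE) may not compile in another:

```python
import re

def is_valid_regex(pattern: str) -> bool:
    """Return True if the pattern compiles as a Python regular expression."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False
```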
Placeholder Regex
Validate placeholder regex patterns to ensure proper template and variable replacement structures.
Semantic Similarity
Validate semantic similarity between expected and actual responses to measure content alignment.
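Semantic similarity is typically scored as the cosine similarity between sentence embeddings of the expected and actual responses. The cosine computation is shown in full below; the `embed` parameter is a stand-in for a real sentence-embedding model, which this sketch does not provide:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def semantic_similarity(expected: str, actual: str, embed) -> float:
    """Score content alignment; `embed` is assumed to map text to a vector."""
    return cosine_similarity(embed(expected), embed(actual))
```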
Agent Goal Accuracy
Validate agent goal accuracy to ensure AI systems achieve their intended objectives effectively.
Topic Adherence
Validate topic adherence to ensure responses stay focused on the specified subject matter.
Measure Perplexity
Measure text perplexity from token log-probabilities to assess the predictability and coherence of generated text.
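Given per-token log-probabilities, perplexity is the exponential of the negative mean log-probability. A sketch assuming natural-log probabilities, as most LLM APIs return:

```python
import math

def perplexity(logprobs: list[float]) -> float:
    """Perplexity = exp(-mean token log-probability).

    Lower values mean the model found the text more predictable.
    Assumes natural-log probabilities.
    """
    if not logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(logprobs) / len(logprobs))
```

For example, a text whose tokens each had probability 0.5 has perplexity exactly 2.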