
What is a Monitor?
A monitor is an evaluator that runs on a group of defined spans with specific characteristics in real time. For every span that matches the group filter, it will run the evaluator and log the monitor result. This allows you to continuously assess the quality and performance of your LLM outputs as they are generated in production. Monitors can use two types of evaluators:- LLM-as-a-Judge: uses a large language model to evaluate outputs based on semantic qualities. You can create custom evaluators with this method by writing prompts that capture your own criteria.
- Traceloop built in evaluators: deterministic evaluations for structural validation, safety checks, and syntactic analysis.