Datasets Overview

The Dataset acts as the Ground Truth used to challenge your AI Targets.

Evaluating Generative AI is rarely black and white. To truly understand how your system behaves, you need to challenge it with real-world scenarios, diverse contexts, and expected outcomes. Datasets in CertOps make this structured, repeatable, and scalable.

The CSV Schema

In CertOps, an uploaded Dataset is a structured, tabular collection of Inputs (Queries/Context) and Expected Outputs.

When you upload a dataset CSV (e.g. qa_dataset.csv), CertOps automatically parses the file and infers a schema based on your columns.
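For illustration, a minimal QA dataset might look like the following (the column names here are hypothetical, not required by CertOps):

```csv
query,context,expected_output
"What is the refund window?","Refunds are accepted within 30 days of purchase.","30 days"
"Who approves large discounts?","Discounts above 10% require regional manager approval.","A regional manager"
```

Each column header becomes a field in the inferred schema, which metrics can later be mapped onto.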

Reusability and Metric Mapping

Datasets are highly reusable. You do not need to create a unique dataset for every Target: a single, comprehensive dataset can be used across many different components in your Suite.

Instead of strict dataset-to-target coupling, CertOps uses dynamic column mapping that is entirely driven by your chosen Metrics.

Because every Metric (whether a pre-built System Metric or your own Custom Metric) defines its own Jinja2 variables inside its prompt template, the variables you need to provide depend only on the metrics you choose to run.

For instance, the built-in hallucination metric specifically asks for {{context}} and {{output}}, while a custom metric might require arbitrary variables like {{corporate_policy}} and {{user_persona}}.
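To make that concrete, a metric's prompt template embeds its variables as Jinja2 placeholders. The wording below is an illustrative sketch, not the actual text of the built-in hallucination prompt:

```jinja
Given the following retrieved context:
{{context}}

And the model's answer:
{{output}}

Determine whether the answer makes any claim that is not supported by the context.
```

Whatever variable names appear in the template are exactly the names you must supply when mapping the dataset.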

When configuring your tests in the certops.yaml manifest, you map each metric variable directly to a physical column in your CSV dataset. This decoupled architecture lets you maintain one centralized ground-truth dataset without ever renaming your CSV columns to match metric inputs.
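As a sketch of this mapping, a manifest entry might look like the following (the field names are illustrative assumptions, not the documented certops.yaml schema):

```yaml
tests:
  - target: support-bot            # hypothetical Target name
    dataset: qa_dataset.csv
    metrics:
      - name: hallucination        # built-in System Metric
        mapping:
          context: retrieved_docs  # metric variable -> CSV column
          output: model_answer
```

Here the hallucination metric's {{context}} and {{output}} variables are satisfied by the retrieved_docs and model_answer columns, with no renaming of the CSV itself.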