Scripts · May 12, 2026 · 2 min read

Giskard Checks — Evals and Safety Tests for LLM Agents

Giskard Checks gives Python teams a modular eval layer for agent regressions, groundedness, and policy conformance with scenario-based tests.

Introduction

  • Best for: Python teams that need reproducible evals for agent regressions and grounding checks
  • Works with: Python 3.12+, OpenAI-compatible clients, async test runs, scenario-based evaluation suites
  • Setup time: 10-25 minutes
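
To make "async test runs" over a scenario suite concrete, here is a minimal sketch using only the Python standard library. It does not use the giskard-checks API: `Scenario`, `fake_agent`, `run_scenario`, and `run_suite` are hypothetical names invented for illustration, and the substring check stands in for whatever check a real suite would apply.

```python
import asyncio
from collections.abc import Awaitable, Callable
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Hypothetical scenario: a prompt plus a phrase the answer must contain."""
    name: str
    prompt: str
    must_contain: str


async def fake_agent(prompt: str) -> str:
    """Stand-in for a real agent call (for example an OpenAI-compatible client)."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return f"Echo: {prompt}"


async def run_scenario(
    agent: Callable[[str], Awaitable[str]], scenario: Scenario
) -> tuple[str, bool]:
    """Run one scenario and report whether the required phrase appears."""
    output = await agent(scenario.prompt)
    return scenario.name, scenario.must_contain.lower() in output.lower()


async def run_suite(
    agent: Callable[[str], Awaitable[str]], scenarios: list[Scenario]
) -> dict[str, bool]:
    """Run all scenarios concurrently and collect pass/fail by scenario name."""
    results = await asyncio.gather(*(run_scenario(agent, s) for s in scenarios))
    return dict(results)
```

Calling `asyncio.run(run_suite(fake_agent, scenarios))` returns a dictionary of scenario names to booleans. Swapping `fake_agent` for a real async client call is the only change needed to point the sketch at a live agent.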

Practical Notes

  • The current README requires Python 3.12+ and splits the project into modular packages, giskard-checks among them.
  • Built-in checks include Groundedness, Conformity, regex matching, semantic similarity, and LLM-as-judge patterns.
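
To give a feel for what two of these check types do, the sketch below implements a regex check and a deliberately crude groundedness proxy in plain Python. This is not the giskard-checks implementation or API: `CheckResult`, `regex_check`, and `naive_groundedness_check` are hypothetical names, and the word-overlap heuristic is an assumption chosen for simplicity, far weaker than an LLM-based groundedness check.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    """Hypothetical result record for a single check."""
    name: str
    passed: bool
    detail: str


def regex_check(output: str, pattern: str) -> CheckResult:
    """Pass when the pattern is found anywhere in the output."""
    found = re.search(pattern, output) is not None
    return CheckResult("regex", found, f"pattern={pattern!r}")


def naive_groundedness_check(
    output: str, context: str, threshold: float = 0.8
) -> CheckResult:
    """Crude lexical proxy: share of output words that also occur in the context."""
    out_words = set(re.findall(r"[a-z0-9]+", output.lower()))
    ctx_words = set(re.findall(r"[a-z0-9]+", context.lower()))
    if not out_words:
        return CheckResult("groundedness", False, "empty output")
    overlap = len(out_words & ctx_words) / len(out_words)
    return CheckResult("groundedness", overlap >= threshold, f"overlap={overlap:.2f}")
```

With the context "Refunds are available within 30 days of purchase.", the answer "Refunds are available within 30 days." passes both checks, while an answer that says "90 days forever" falls below the overlap threshold. A lexical proxy like this misses paraphrases and negations, which is one reason to prefer semantic or LLM-as-judge checks for real suites.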

Rollout pattern

  • Start with one regression scenario and one groundedness scenario around a user-facing workflow.
  • Add pass/fail gates only after you understand variance across repeated runs and model versions.
  • Keep old v2-only capabilities separate if you still rely on Scan or RAGET; the README is explicit that those are legacy paths.
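
The second step, measuring variance before adding gates, can be sketched without any library support. The code below is an illustration under stated assumptions, not part of giskard-checks: `pass_rate` and `ready_to_gate` are hypothetical helpers, and the default thresholds (0.9 mean pass rate, 0.1 spread) are placeholders to tune per workflow.

```python
import statistics
from collections.abc import Callable


def pass_rate(run_once: Callable[[], bool], repeats: int = 20) -> float:
    """Run one scenario `repeats` times and return the fraction of passes."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    outcomes = [run_once() for _ in range(repeats)]
    return sum(outcomes) / repeats


def ready_to_gate(
    rates_by_version: dict[str, float],
    min_mean: float = 0.9,
    max_spread: float = 0.1,
) -> bool:
    """Hypothetical rule: gate only when pass rates are high and stable across versions."""
    rates = list(rates_by_version.values())
    spread = max(rates) - min(rates)
    return statistics.mean(rates) >= min_mean and spread <= max_spread
```

Here `run_once` would wrap a single scenario execution against one model version. Collecting one pass rate per version and feeding them to `ready_to_gate` gives a simple signal for when a scenario is stable enough to become a hard pass/fail gate in CI.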

Watchouts

Do not assume every historical Giskard feature still exists in the same package line; v3 is a rewrite and the README explicitly separates planned versus available modules.

FAQ

Q: Is this the old all-in-one Giskard package?
A: No. The README frames v3 as a modular rewrite and points to v2 only for legacy Scan and RAGET use cases.

Q: Why is it useful for agents?
A: It gives scenario-based checks for outputs that can vary while still needing quality gates.

Q: What should I test first?
A: Groundedness and one regression path tied to a real business workflow, not synthetic toy prompts.

Source and acknowledgments

  • Source: https://github.com/Giskard-AI/giskard-oss
  • License: Apache-2.0
  • GitHub stars: 5,344 · forks: 453
