Open Source · Apache 2.0 · v0.3.0

Notarize, govern, and audit AI agents

Notarize → Enforce → Certify.
Three commands turn agentnotary.yaml into a sealed, runtime-enforced, audit-ready agent. Open source. Framework-agnostic.

$ pip install agentnotary

The Problem

Your agents are ungoverned. You know it.

Software has Dockerfiles, CI/CD, semantic versioning, crash logs, and rollbacks. AI agents have a Python file in a repo, maybe. No manifest. No tests. No audit trail. No way to roll back a bad prompt.

97% of companies have deployed AI agents
82% can't even track what agents they're running
0 standardized formats for agent configuration before AgentNotary
Problem                                         AgentNotary Solution
──────────────────────────────────────────────────────────────────────────────────────────────────
"What agent is running in prod?"                agentnotary.yaml + agent.lock: sealed snapshot
"Why did the agent burn $4,000 overnight?"      agentnotary guard run: would have blocked it at $1.00
"Did the prompt or the model drift?"            agentnotary seal --verify: fails CI on drift
"We need EU AI Act docs by August 2026"         agentnotary compliance --standard eu-ai-act
"Why did the agent make that decision?"         agentnotary replay <session-id>: flight recorder
"How many shadow agents are in this codebase?"  agentnotary scan ./src: finds all frameworks

Features

Everything your agents are missing

Built on the same mental model that made Docker successful — but for the AI agent era.

📋
Agent Manifest
agentnotary.yaml is the Dockerfile for agents. Declares model, framework, tools, guardrails, memory, and evals in a single file.
🧪
Standardized Testing
Write YAML eval cases with expected behavior in natural language. AgentNotary uses an LLM-as-judge to grade each case pass/fail and reports its latency and cost.
🏷️
Version Control
agentnotary tag v1.2.0 snapshots your manifest, prompts, and evals. agentnotary rollback v1.1.0 restores any prior state instantly.
🛩️
Flight Recorder
Record every LLM call, tool invocation, and decision to disk. Replay any session to understand exactly what happened and why.
🔍
Shadow Agent Scanner
Scan any codebase and find every AI agent across LangChain, CrewAI, AutoGen, OpenAI SDK, Anthropic SDK, LlamaIndex, and DSPy.
🛡️
Guardrail Declaration
Declare safety constraints in the manifest — cost caps, content filters, human approval requirements — alongside the agent configuration.
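
To make the cost-cap guardrail concrete, here is a minimal Python sketch of the idea. It is not AgentNotary's implementation: `CostGuard`, `CostCapExceeded`, and `charge` are hypothetical names, and the $1.00 cap simply mirrors the figure used earlier on this page.

```python
class CostCapExceeded(RuntimeError):
    """Raised when a call would push spending past the declared cap."""


class CostGuard:
    """Tracks cumulative spend and refuses calls that would exceed a cap.

    Hypothetical helper for illustration; not part of the AgentNotary API.
    """

    def __init__(self, cap_usd: float) -> None:
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def charge(self, cost_usd: float) -> None:
        # Check before recording, so a blocked call never counts as spend.
        if self.spent_usd + cost_usd > self.cap_usd:
            raise CostCapExceeded(
                f"cap ${self.cap_usd:.2f} reached (spent ${self.spent_usd:.2f})"
            )
        self.spent_usd += cost_usd


guard = CostGuard(cap_usd=1.00)
guard.charge(0.40)
guard.charge(0.40)
try:
    guard.charge(0.40)  # would bring the total to $1.20
    blocked = False
except CostCapExceeded:
    blocked = True
```

Checking before recording means a rejected call leaves the running total untouched, so the agent can still make cheaper calls that fit under the cap.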

Demo

Two minutes to a governed agent

Terminal
$ pip install agentnotary
Successfully installed agentnotary-0.3.0

$ agentnotary init support-agent
⬡ AgentNotary — Initializing agent: support-agent

✓ Created agentnotary.yaml
✓ Created evals/test_suite.yaml
✓ Created .agentnotary/.gitignore

$ agentnotary validate
⬡ AgentNotary — Validating agent manifest

Agent: support-agent v0.1.0
Model: claude-sonnet-4-20250514
! WARNING: No system prompt defined

$ agentnotary test
⬡ AgentNotary — Running eval suite

STATUS  TEST                     LATENCY  COST
──────────────────────────────────────────
PASS    basic-greeting           312ms    $0.0014
PASS    escalate-angry-customer  445ms    $0.0021
FAIL    no-pii-leak              289ms    $0.0012

66.7% pass rate 2 passed · 1 failed

$ agentnotary tag v0.1.0
✓ Tagged v0.1.0
Hash: a3f7c912b441

$ agentnotary scan ./backend
⬡ AgentNotary — Scanning for agents in: ./backend

Files scanned: 847
Agents found: 12

FILE                        FW          GUARDRAILS
──────────────────────────────────────────────────────
services/support/agent.py   anthropic
scripts/bulk_email.py       openai_sdk
workflows/pipeline.py       langchain

⚠ 9 ungoverned agents found (no guardrails detected)
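
One plausible way a scanner like this could work for Python files is to parse each file and inspect its imports. The sketch below is an assumption about the approach, not AgentNotary's actual detection logic. `FRAMEWORK_IMPORTS` and `detect_frameworks` are hypothetical, and only the labels `anthropic`, `openai_sdk`, and `langchain` come from the output above; the rest are guesses.

```python
import ast

# Assumed mapping from a top-level import name to a framework label.
FRAMEWORK_IMPORTS = {
    "langchain": "langchain",
    "crewai": "crewai",
    "autogen": "autogen",
    "openai": "openai_sdk",
    "anthropic": "anthropic",
    "llama_index": "llamaindex",
    "dspy": "dspy",
}


def detect_frameworks(source: str) -> set[str]:
    """Return the framework labels whose packages a Python source imports."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports such as `from .openai import x`.
            modules = [node.module]
        else:
            continue
        for module in modules:
            label = FRAMEWORK_IMPORTS.get(module.split(".")[0])
            if label:
                found.add(label)
    return found


sample = "import os\nimport openai\nfrom anthropic import Anthropic\n"
frameworks = detect_frameworks(sample)
```

Import matching only shows that a framework is present; deciding whether a file actually defines an agent, or has guardrails, would need more than this.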

Commands

10 commands. That's the whole CLI.

agentnotary init [name]
Scaffold a new agent project with manifest, evals, and directories
agentnotary validate
Check the current manifest for errors, warnings, and missing fields
agentnotary test
Run the eval suite. Uses LLM-as-judge to grade expected behavior
agentnotary tag <version>
Snapshot manifest + prompts + evals as a named version
agentnotary versions
List all tagged versions with hash and timestamp
agentnotary rollback <version>
Restore a prior version. Auto-saves current state before rollback
agentnotary sessions
List recorded sessions with cost, token, and status summary
agentnotary replay <id>
Replay a session: every LLM call, tool use, and decision in order
agentnotary scan [dir]
Find every AI agent in a codebase across 7 frameworks
agentnotary info
Show the current agent's configuration and status
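
The record-and-replay idea behind `sessions` and `replay` can be sketched with an append-only log of JSON lines. This is an illustrative assumption, not AgentNotary's on-disk format: the `record` and `replay` functions, the event fields, and the `lookup_order` tool name are all hypothetical.

```python
import io
import json


def record(stream, kind: str, **details) -> None:
    """Append one event to the stream as a single JSON line."""
    stream.write(json.dumps({"kind": kind, **details}) + "\n")


def replay(stream) -> list[dict]:
    """Read events back in the order they were recorded."""
    return [json.loads(line) for line in stream if line.strip()]


log = io.StringIO()  # stands in for a session file on disk
record(log, "llm_call", model="claude-sonnet-4-20250514", cost_usd=0.0014)
record(log, "tool_call", tool="lookup_order")  # hypothetical tool name

log.seek(0)
events = replay(log)
total_cost = sum(event.get("cost_usd", 0.0) for event in events)
```

One event per line keeps writes cheap and lets a replay stream through a long session without loading it all at once.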

Open Source

Govern your agents.
Before someone else does.

Apache 2.0 licensed. Contributions welcome. Follows the Docker monetization playbook — open-source CLI, cloud platform coming.
