selfradiance/agent-reality-check

agent-reality-check

AI agents can recommend APIs, models, or services based on factual claims. Some of those claims are machine-checkable, such as a provider, service, pricing unit, and price. This project is a narrow v0.1 proof harness for checking one kind of claim before trusting the recommendation.

Core thesis: do not optimize agents. Verify the claims agents rely on.

What v0.1 Proves

v0.1 verifies agent-made pricing claims against a local canonical pricing fixture. It does not prove live commerce verification. It does not scrape the web, call paid APIs, call LLMs, or decide whether a recommendation is good. It only asks: does this claimed price match the local source-of-truth fixture?

Non-Goals

  • No live scraping
  • No network calls
  • No paid APIs
  • No LLM calls
  • No dashboards
  • No SaaS framing
  • No AgentGate integration in v0.1
  • No product checkout or delivery verification
  • No generalized truth engine

60-Second Demo

npm install
npm run demo
npm test
npm run typecheck

To verify one claim file against one source fixture:

npm run verify -- examples/passing-claim.json examples/source-pricing.json

Verification results are appended to:

.agent-reality-check/log.jsonl

Each JSONL row includes previous_hash and row_hash, so edits to prior rows break the local hash chain.
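The chaining idea can be sketched as follows. This is a minimal illustration, not the project's actual implementation: only the `previous_hash` and `row_hash` field names come from this README. The SHA-256 choice, the all-zero genesis value, the `payload` wrapper, and the helper names `hashRow`, `appendRow`, and `firstBrokenRow` are assumptions made for the example.

```typescript
import { createHash } from "node:crypto";

// Hypothetical row shape: only previous_hash and row_hash are named in the README.
interface LogRow {
  payload: Record<string, unknown>;
  previous_hash: string;
  row_hash: string;
}

// Assumed placeholder used as previous_hash for the very first row.
const GENESIS_HASH = "0".repeat(64);

// Assumption: row_hash is SHA-256 over previous_hash followed by the JSON-serialized payload.
function hashRow(previousHash: string, payload: Record<string, unknown>): string {
  return createHash("sha256")
    .update(previousHash)
    .update(JSON.stringify(payload))
    .digest("hex");
}

// Returns a new list of JSONL lines with one row chained onto the last existing row.
function appendRow(lines: string[], payload: Record<string, unknown>): string[] {
  const lastLine = lines[lines.length - 1];
  const previous_hash =
    lastLine !== undefined ? (JSON.parse(lastLine) as LogRow).row_hash : GENESIS_HASH;
  const row: LogRow = { payload, previous_hash, row_hash: hashRow(previous_hash, payload) };
  return [...lines, JSON.stringify(row)];
}

// Recomputes every hash; returns the index of the first broken row, or -1 if the chain is intact.
function firstBrokenRow(lines: string[]): number {
  let expectedPrevious = GENESIS_HASH;
  for (const [index, line] of lines.entries()) {
    const row = JSON.parse(line) as LogRow;
    const linkOk = row.previous_hash === expectedPrevious;
    const hashOk = row.row_hash === hashRow(row.previous_hash, row.payload);
    if (!linkOk || !hashOk) return index;
    expectedPrevious = row.row_hash;
  }
  return -1;
}
```

In this sketch, editing an earlier row's payload makes its own `row_hash` check fail. If the editor also recomputes that row's hash, the next row's `previous_hash` no longer links, so the break is still detectable.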

Expected Output

Passing claim:

MATCH claim-pass-openai-gpt-4-1-mini-input
Claim matches source: OpenAI/gpt-4.1-mini 1M input tokens is 0.4 USD.
claimed_price: 0.4
source_price: 0.4
delta: 0

Failing claim:

MISMATCH claim-fail-openai-gpt-4-1-mini-input
Claim mismatch: claimed 2 USD for OpenAI/gpt-4.1-mini 1M input tokens, source is 0.4 USD.
claimed_price: 2
source_price: 0.4
delta: 1.6

Data Model

Agent claim JSON includes:

  • claim_id
  • agent_name
  • intent
  • provider
  • service
  • pricing_unit
  • claimed_price
  • claimed_currency
  • evidence_url (optional string)
  • captured_at (ISO timestamp)
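As a rough illustration, the claim fields above could be typed and checked at runtime like this. The field names come from the list. The concrete TypeScript types, the `isAgentClaim` guard, and the example values for `agent_name`, `intent`, `pricing_unit`, and `captured_at` are assumptions for the sketch, not the contents of `examples/passing-claim.json`.

```typescript
// Field names follow the README list; the concrete types are assumptions.
interface AgentClaim {
  claim_id: string;
  agent_name: string;
  intent: string;
  provider: string;
  service: string;
  pricing_unit: string;
  claimed_price: number;
  claimed_currency: string;
  evidence_url?: string; // optional
  captured_at: string; // ISO timestamp
}

const REQUIRED_STRING_FIELDS = [
  "claim_id", "agent_name", "intent", "provider",
  "service", "pricing_unit", "claimed_currency", "captured_at",
];

// Hypothetical guard: narrows parsed JSON to AgentClaim before it reaches a verifier.
function isAgentClaim(value: unknown): value is AgentClaim {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const evidence = v["evidence_url"];
  const capturedAt = v["captured_at"];
  return (
    REQUIRED_STRING_FIELDS.every((key) => typeof v[key] === "string") &&
    Number.isFinite(v["claimed_price"]) &&
    (evidence === undefined || typeof evidence === "string") &&
    typeof capturedAt === "string" &&
    !Number.isNaN(Date.parse(capturedAt))
  );
}

// Illustrative claim; agent_name, intent, evidence_url, and captured_at are placeholders.
const exampleClaim: AgentClaim = {
  claim_id: "claim-pass-openai-gpt-4-1-mini-input",
  agent_name: "example-agent",
  intent: "recommend a low-cost model",
  provider: "OpenAI",
  service: "gpt-4.1-mini",
  pricing_unit: "1M input tokens",
  claimed_price: 0.4,
  claimed_currency: "USD",
  captured_at: "2025-01-01T00:00:00.000Z",
};
```

Rejecting malformed claims before verification keeps the comparison step simple: it only ever sees a numeric price and string keys.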

Source pricing fixture JSON includes:

  • source_id
  • source_name
  • retrieved_at (ISO timestamp)
  • prices (array of rows with provider, service, pricing_unit, price, and currency)
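A fixture with this shape, plus a lookup over its `prices` array, might look like the sketch below. The field names are taken from the list above. The types, the `findPriceRow` helper, the exact-match rule, and the placeholder `source_id`, `source_name`, and `retrieved_at` values are assumptions and do not reproduce `examples/source-pricing.json`.

```typescript
// Field names follow the README; the concrete types are assumptions.
interface PriceRow {
  provider: string;
  service: string;
  pricing_unit: string;
  price: number;
  currency: string;
}

interface SourcePricingFixture {
  source_id: string;
  source_name: string;
  retrieved_at: string; // ISO timestamp
  prices: PriceRow[];
}

// Hypothetical lookup: exact, case-sensitive match on provider, service, pricing_unit, and currency.
function findPriceRow(
  fixture: SourcePricingFixture,
  key: { provider: string; service: string; pricing_unit: string; currency: string },
): PriceRow | undefined {
  return fixture.prices.find(
    (row) =>
      row.provider === key.provider &&
      row.service === key.service &&
      row.pricing_unit === key.pricing_unit &&
      row.currency === key.currency,
  );
}

// Illustrative fixture; source_id, source_name, and retrieved_at are placeholders.
const exampleFixture: SourcePricingFixture = {
  source_id: "example-source",
  source_name: "Example local pricing fixture",
  retrieved_at: "2025-01-01T00:00:00.000Z",
  prices: [
    {
      provider: "OpenAI",
      service: "gpt-4.1-mini",
      pricing_unit: "1M input tokens",
      price: 0.4,
      currency: "USD",
    },
  ],
};
```

Including the currency in the lookup key is a design choice of this sketch: a claim in a different currency finds no row rather than being compared against a price it cannot match.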

Verifier result JSON includes:

  • result_id
  • claim_id
  • status: match, mismatch, or missing_source
  • claimed_price
  • source_price (present when a source row exists)
  • delta (present when a source row exists)
  • human_summary
  • verified_at (ISO timestamp)
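Putting the three shapes together, a deterministic verifier could be sketched as below. This is an assumption-laden illustration rather than the project's code. The `verifyClaim` name, the `result-` prefix for `result_id`, the signed `claimed - source` delta rounded to six decimals, the exact-equality match rule, and the `missing_source` summary wording are all hypothetical. The match and mismatch summaries mirror the Expected Output section.

```typescript
type Status = "match" | "mismatch" | "missing_source";

// Minimal input shapes for this sketch; field names follow the README.
interface ClaimInput {
  claim_id: string;
  provider: string;
  service: string;
  pricing_unit: string;
  claimed_price: number;
  claimed_currency: string;
}

interface SourceRow {
  provider: string;
  service: string;
  pricing_unit: string;
  price: number;
  currency: string;
}

interface VerifierResult {
  result_id: string;
  claim_id: string;
  status: Status;
  claimed_price: number;
  source_price?: number; // present only when a source row exists
  delta?: number; // present only when a source row exists
  human_summary: string;
  verified_at: string; // ISO timestamp
}

// verifiedAt is passed in so the function stays deterministic for a given input.
function verifyClaim(claim: ClaimInput, prices: SourceRow[], verifiedAt: string): VerifierResult {
  const base = {
    result_id: `result-${claim.claim_id}`, // assumed id scheme
    claim_id: claim.claim_id,
    claimed_price: claim.claimed_price,
    verified_at: verifiedAt,
  };
  const label = `${claim.provider}/${claim.service} ${claim.pricing_unit}`;
  const row = prices.find(
    (r) =>
      r.provider === claim.provider &&
      r.service === claim.service &&
      r.pricing_unit === claim.pricing_unit &&
      r.currency === claim.claimed_currency,
  );
  if (row === undefined) {
    return {
      ...base,
      status: "missing_source",
      human_summary: `No source price found for ${label} in ${claim.claimed_currency}.`,
    };
  }
  // Assumption: signed delta, rounded to six decimals to hide floating-point noise.
  const delta = Number((claim.claimed_price - row.price).toFixed(6));
  const status: Status = claim.claimed_price === row.price ? "match" : "mismatch";
  const human_summary =
    status === "match"
      ? `Claim matches source: ${label} is ${row.price} ${row.currency}.`
      : `Claim mismatch: claimed ${claim.claimed_price} ${claim.claimed_currency} for ${label}, source is ${row.price} ${row.currency}.`;
  return { ...base, status, source_price: row.price, delta, human_summary };
}
```

Taking the timestamp as a parameter, instead of reading the clock inside the function, is one way to keep results reproducible in tests.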

Why This Matters For Agentic Commerce

Agents may make recommendations based on factual claims. Before trusting a recommendation, a buyer, merchant, auditor, or marketplace can verify those claims against a source of truth. This v0.1 keeps the proof small: one local pricing claim, one local source fixture, one deterministic verifier, and an append-only log.

Limitations

  • Local fixture only
  • No live scraping
  • No product checkout
  • No delivery verification
  • No legal or financial advice
  • No generalized truth engine

Future Extensions

These are possible later extensions, not part of v0.1:

  • Live pricing adapter
  • Docs/spec claim adapter
  • AgentGate bond/slash adapter

Development

npm test
npm run typecheck
npm run demo

About

Deterministic local proof harness for verifying AI agent recommendation claims against source-of-truth fixtures, with append-only outcome logs.
