agent-reality-check

AI agents can recommend APIs, models, or services based on factual claims. Some of those claims are machine-checkable, such as a provider, service, pricing unit, and price. This project is a narrow v0.1 proof harness for checking one kind of claim before trusting the recommendation.

Core thesis: do not optimize agents. Verify the claims agents rely on.

What v0.1 Proves

v0.1 verifies agent-made pricing claims against a local canonical pricing fixture. It does not prove live commerce verification. It does not scrape the web, call paid APIs, call LLMs, or decide whether a recommendation is good. It only asks: does this claimed price match the local source-of-truth fixture?

Non-Goals

No live scraping
No network calls
No paid APIs
No LLM calls
No dashboards
No SaaS framing
No AgentGate integration in v0.1
No product checkout or delivery verification
No generalized truth engine

60-Second Demo

npm install
npm run demo
npm test
npm run typecheck

To verify one claim file against one source fixture:

npm run verify -- examples/passing-claim.json examples/source-pricing.json

Verification results are appended to:

.agent-reality-check/log.jsonl

Each JSONL row includes previous_hash and row_hash, so edits to prior rows break the local hash chain.

Expected Output

Passing claim:

MATCH claim-pass-openai-gpt-4-1-mini-input
Claim matches source: OpenAI/gpt-4.1-mini 1M input tokens is 0.4 USD.
claimed_price: 0.4
source_price: 0.4
delta: 0

Failing claim:

MISMATCH claim-fail-openai-gpt-4-1-mini-input
Claim mismatch: claimed 2 USD for OpenAI/gpt-4.1-mini 1M input tokens, source is 0.4 USD.
claimed_price: 2
source_price: 0.4
delta: 1.6

Data Model

Agent claim JSON includes:

claim_id
agent_name
intent
provider
service
pricing_unit
claimed_price
claimed_currency
evidence_url optional string
captured_at ISO timestamp

Source pricing fixture JSON includes:

source_id
source_name
retrieved_at ISO timestamp
prices array with provider, service, pricing_unit, price, and currency

Verifier result JSON includes:

result_id
claim_id
status: match, mismatch, or missing_source
claimed_price
source_price when a source row exists
delta when a source row exists
human_summary
verified_at ISO timestamp

Why This Matters For Agentic Commerce

Agents may make recommendations based on claims. Before trusting a recommendation, a buyer, merchant, auditor, or marketplace can verify the claims against source truth. This v0.1 keeps the proof small: one local pricing claim, one local source fixture, one deterministic verifier, and an append-only log.

Limitations

Local fixture only
No live scraping
No product checkout
No delivery verification
No legal or financial advice
No generalized truth engine

Future Extensions

These are possible later extensions, not part of v0.1:

Live pricing adapter
Docs/spec claim adapter
AgentGate bond/slash adapter

Development

npm test
npm run typecheck
npm run demo

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
examples		examples
src		src
test		test
.gitignore		.gitignore
README.md		README.md
package-lock.json		package-lock.json
package.json		package.json
tsconfig.json		tsconfig.json
vitest.config.ts		vitest.config.ts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

agent-reality-check

What v0.1 Proves

Non-Goals

60-Second Demo

Expected Output

Data Model

Why This Matters For Agentic Commerce

Limitations

Future Extensions

Development

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

agent-reality-check

What v0.1 Proves

Non-Goals

60-Second Demo

Expected Output

Data Model

Why This Matters For Agentic Commerce

Limitations

Future Extensions

Development

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages