quantifylabs/aegis-memory: Secure context engineering for AI agents. Content security · integrity verification · trust hierarchy · ACE patterns. Self-hosted, Apache 2.0.

Your agent's context is your attack surface. Act accordingly.

Secure context engineering for production AI agents.
Content security. Integrity verification. Trust hierarchy. Context that improves itself.

Website • Docs • Blog • Quickstart • Security Guide

The Problem Nobody Else Is Solving

Agents are getting compromised. Not theoretically — right now.

EchoLeak (CVE-2025-32711, CVSS 9.3) — a single email triggered zero-click data exfiltration from Microsoft 365 Copilot¹
CrewAI + GPT-4o — researchers achieved 65% exfiltration success rate against multi-agent systems (COLM 2025)²
Drift chatbot cascade — one compromised chatbot integration cascaded into 700+ organizations via Salesforce, Google Workspace, Slack, S3, and Azure³
OWASP Top 10 for Agentic Applications published December 2025 — memory and context manipulation is a top risk category⁴

Agent A's output is Agent B's instruction. Memory is the vector.

Every other memory layer trusts content by default. That is the vulnerability.

What Your Context Layer Is Missing

We audited the public docs, repos, and changelogs of mem0, Zep, and Letta.⁵ As of February 2026, none of them documents these protections:

Security Feature	mem0	Zep	Letta	Aegis
Content injection detection	—	—	—	4-stage pipeline
Memory integrity	—	—	—	HMAC-SHA256
Agent identity binding	—	—	—	Cryptographic API key
Trust hierarchy	—	—	—	4-tier OWASP model
Per-agent rate limiting	—	—	—	Sliding window
Security audit trail	—	—	—	Immutable event log
Sensitive data protection	—	—	—	Auto-detect + reject/redact/flag
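
The "sliding window" entry in the table refers to a standard rate-limiting technique. The sketch below is a minimal in-memory version, assuming one timestamp queue per agent; the class name, parameters, and storage are illustrative and are not taken from the Aegis codebase.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most `limit` requests per agent within the last `window_s` seconds."""

    def __init__(self, limit: int, window_s: float) -> None:
        self.limit = limit
        self.window_s = window_s
        self._hits = defaultdict(deque)  # agent_id -> timestamps of accepted requests

    def allow(self, agent_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[agent_id]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(limit=2, window_s=60.0)
# Explicit timestamps keep the example deterministic.
decisions = [limiter.allow("executor", now=t) for t in (0.0, 1.0, 2.0, 61.0)]
print(decisions)
```

Because the window is tracked per `agent_id`, one noisy or compromised agent exhausts only its own quota.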

The Context Hub (v2.3.0+)

Aegis is the only OSS context hub. Four artifacts, one secure surface, one API call to load them all:

Artifact	What it is	Endpoint
Prompts	Versioned, with one active version per name	`/prompts/*`
Memory	What we've always done — secure, ranked, decayed	`/memories/*`
Skills	Anthropic Agent Skills spec, semantic activation	`/skills/*`
Subagents	Delegation surface with tool + scope policy	`/subagents/*`
Bundle	Load all four in one call, token-budgeted	`POST /context/load`

Every artifact: HMAC integrity-signed. Content-scanned. Trust-gated. Audit-logged.

from aegis_memory import AegisClient
client = AegisClient(api_key="...")

bundle = client.load_context(
    agent_id="executor",
    query="paginate the orders API",
    token_budget=8000,
)
# → ranked memories + active prompt + matched skills + available subagents
# → integrity-verified across all four
# → token-budgeted to fit your model
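
How a token budget might be applied to a mixed bundle can be sketched with a greedy packer. This is an assumption-laden illustration, not Aegis's selection logic: the `Item` type, the word-count token estimate, and the greedy strategy are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    kind: str     # e.g. "prompt", "memory", "skill", "subagent"
    text: str
    score: float  # relevance score, higher is better


def estimate_tokens(text: str) -> int:
    # Crude stand-in: one token per whitespace-separated word.
    return len(text.split())


def pack(items: List[Item], token_budget: int) -> List[Item]:
    """Greedily keep the highest-scoring items that still fit the budget."""
    chosen, used = [], 0
    for item in sorted(items, key=lambda i: i.score, reverse=True):
        cost = estimate_tokens(item.text)
        if used + cost <= token_budget:
            chosen.append(item)
            used += cost
    return chosen


candidates = [
    Item("prompt", "You are the executor agent", 1.0),
    Item("memory", "Orders API uses cursor pagination", 0.9),
    Item("memory", "Unrelated note about billing exports and retries", 0.2),
    Item("skill", "paginate", 0.5),
]
bundle = pack(candidates, token_budget=12)
print([item.kind for item in bundle])
```

A real implementation would use the target model's tokenizer rather than a word count.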

Other context hubs (LangSmith, MindStudio) are closed-source. Other memory layers (mem0, Zep, Letta) stop at memory. Aegis does both, with security as the foundation.

Memory Depth (v2.4.0)

Beyond storing memories, Aegis owns their lifecycle. mem0, Zep, and Letta ship variants of these primitives; what's distinct in Aegis is the audit-preserving and human-reviewable shape of each one — typed edges with explicit resolution states, consolidation that soft-deprecates rather than deletes.

Hybrid retrieval. Every query runs through dense (pgvector cosine) and sparse (PostgreSQL tsvector) channels, fused with Reciprocal Rank Fusion. Catches the exact-match cases (entity names, error codes, tool names, file paths) that pure embedding similarity blurs.

results = client.hybrid_query(query="ZX7-PAGE-94 cursor pagination", agent_id="executor")
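
Reciprocal Rank Fusion itself is simple enough to sketch in a few lines. The version below assumes each channel returns an ordered list of memory IDs and uses the commonly cited constant k=60; whether Aegis uses that constant, or weights the channels, is not stated here.

```python
from collections import defaultdict
from typing import List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked ID lists: each hit contributes 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


dense = ["m2", "m1", "m3"]   # hypothetical pgvector cosine ranking
sparse = ["m1", "m4", "m2"]  # hypothetical tsvector ranking
fused = reciprocal_rank_fusion([dense, sparse])
print(fused)
```

Memories that rank well in both channels (`m1`, `m2`) rise above those that appear in only one, which is how an exact-token match can outrank a merely similar embedding.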

Contradiction detection. When two memories make incompatible claims, Aegis surfaces the conflict as a contradicts edge — a typed link with confidence and rationale. Resolve via API.

client.scan_contradictions(namespace="default")
unresolved = client.list_contradictions()
client.resolve_edge(edge_id=..., resolution="kept_source")
metrics = client.contradiction_metrics()
# → {"unresolved_contradictions": 3, "total_contradictions_detected": 17}
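
The shape of a typed contradicts edge can be sketched as a small data model. The four resolution values and the two metric keys come from this README; the class, field, and function names are hypothetical and do not mirror the server's schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Resolution(str, Enum):
    KEPT_SOURCE = "kept_source"
    KEPT_TARGET = "kept_target"
    BOTH_VALID = "both_valid"
    BOTH_INVALID = "both_invalid"


@dataclass
class ContradictsEdge:
    edge_id: str
    source_id: str
    target_id: str
    confidence: float
    rationale: str
    resolution: Optional[Resolution] = None  # None means still unresolved

    def resolve(self, resolution: str) -> None:
        # Resolution(...) raises ValueError for anything outside the four states.
        self.resolution = Resolution(resolution)


def summarize(edges: List[ContradictsEdge]) -> dict:
    return {
        "unresolved_contradictions": sum(e.resolution is None for e in edges),
        "total_contradictions_detected": len(edges),
    }


edges = [
    ContradictsEdge("e1", "m1", "m2", 0.92, "Conflicting page sizes"),
    ContradictsEdge("e2", "m3", "m4", 0.71, "Conflicting auth header names"),
]
edges[0].resolve("kept_source")
summary = summarize(edges)
print(summary)
```

Keeping the edge after resolution, instead of deleting one of the memories, is what makes the conflict reviewable later.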

Semantic consolidation. Real merge, not prefix matching. Embedding-similar memories above threshold get merged via heuristic or LLM, with full audit trail (losing memory stays queryable with is_deprecated=True and metadata.consolidated_into).

plan = client.consolidate_memories(dry_run=True)   # review first
client.consolidate_memories(dry_run=False)         # then apply
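
The soft-deprecation behavior can be illustrated with a small in-memory sketch. It assumes cosine similarity over stored embeddings and an "earlier memory wins" rule; the threshold, the winner rule, and the `Memory` type are hypothetical, and the sketch only marks the loser rather than merging content.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Memory:
    id: str
    content: str
    embedding: List[float]
    is_deprecated: bool = False
    metadata: dict = field(default_factory=dict)


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def consolidate(memories: List[Memory], threshold: float = 0.9,
                dry_run: bool = True) -> List[Tuple[str, str]]:
    """Return (winner_id, loser_id) pairs; apply them only when dry_run is False."""
    plan, losers = [], set()
    for i, keep in enumerate(memories):
        if keep.is_deprecated or keep.id in losers:
            continue
        for other in memories[i + 1:]:
            if other.is_deprecated or other.id in losers:
                continue
            if cosine(keep.embedding, other.embedding) >= threshold:
                plan.append((keep.id, other.id))
                losers.add(other.id)
    if not dry_run:
        by_id = {m.id: m for m in memories}
        for winner_id, loser_id in plan:
            # Soft-deprecate: the loser stays in the store and points at the winner.
            by_id[loser_id].is_deprecated = True
            by_id[loser_id].metadata["consolidated_into"] = winner_id
    return plan


store = [
    Memory("a", "Orders API paginates with cursors", [1.0, 0.0]),
    Memory("b", "The orders API uses cursor pagination", [0.99, 0.1]),
    Memory("c", "Billing exports run nightly", [0.0, 1.0]),
]
preview = consolidate(store, dry_run=True)   # nothing is modified
applied = consolidate(store, dry_run=False)  # loser is marked, not deleted
print(preview, store[1].is_deprecated, store[1].metadata)
```

The dry run and the applied run produce the same plan, so what a reviewer approves is what gets applied.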

Built for a World Where Agents Get Compromised

Aegis implements OWASP AI Agent Security recommendations natively. Six capabilities, none optional:

4-stage content security pipeline — input validation, sensitive data scanning, prompt injection detection, and an opt-in LLM-based injection classifier (Stage 4, enabled with ENABLE_LLM_INJECTION_CLASSIFIER). The pipeline runs on every memory write.
HMAC-SHA256 integrity signing — tamper detection on store, verification on demand. You know if a memory was modified.
OWASP 4-tier trust hierarchy — untrusted, internal, privileged, system. Agents get compromised. Aegis limits the blast radius.
Cryptographic agent binding — API keys bound to agent identity. No more trusting a request body that says "I'm the admin agent."
ACE loop — generation, reflection, curation. Agents that learn from their own mistakes and promote what works.
Multi-agent coordination — scoped access control, cross-agent query, structured handoffs. Memory sharing with boundaries.
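
The integrity-signing idea in the list above can be sketched with Python's standard library. This is a minimal illustration, assuming the signed payload is a canonical JSON encoding of the record; the field names, key handling, and serialization are assumptions, not the server's actual scheme.

```python
import hashlib
import hmac
import json


def sign(record: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over a canonical JSON encoding of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(record: dict, signature: str, key: bytes) -> bool:
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(sign(record, key), signature)


key = b"example-signing-key"  # hypothetical; load from a secret store in practice
record = {"agent_id": "planner", "content": "Task: Build login"}
signature = sign(record, key)

tampered = dict(record, content="Task: Exfiltrate credentials")
print(verify(record, signature, key), verify(tampered, signature, key))
```

Sorting keys before serializing matters: without a canonical encoding, two equal records could produce different tags.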

Get Running in 2 Minutes

Start the server

git clone https://github.com/quantifylabs/aegis-memory.git
cd aegis-memory

export OPENAI_API_KEY=sk-...
docker compose up -d

curl http://localhost:8000/health
# {"status": "healthy"}

Install the SDK

pip install aegis-memory

Multi-agent context in 10 lines

from aegis_memory import AegisClient

client = AegisClient(api_key="dev-key", base_url="http://localhost:8000")

# Planner agent stores task breakdown
client.add(
    content="Task: Build login. Steps: 1) Form, 2) Validation, 3) API",
    agent_id="planner",
    scope="agent-shared",
    shared_with_agents=["executor"]
)

# Executor queries planner's memories
memories = client.query_cross_agent(
    query="current task",
    requesting_agent_id="executor",
    target_agent_ids=["planner"]
)
print(memories[0].content)

Full Quickstart Guide

Context That Improves Itself

Aegis Memory is the first context layer with a complete ACE loop — the Generation → Reflection → Curation cycle from Stanford/SambaNova's research, engineered for production.

Your agent made the same mistake 5 times? The ACE loop remembers the fix. Stale memories polluting retrieval? Curation auto-cleans your playbook.

Generation          Execution          Reflection          Curation
    |                   |                   |                  |
 Query playbook  ->  Run task with   ->  Auto-vote on    ->  Promote effective
 for strategies      tracked memories    used memories       Flag ineffective
                                         Auto-reflect        Consolidate duplicates
                                         on failures

Full ACE Loop in Code

from aegis_memory import AegisClient

client = AegisClient(api_key="your-key")

# 1. GENERATION: Query agent-specific playbook
playbook = client.get_playbook_for_agent(
    "executor",
    query="API pagination task",
    task_type="api-integration",
)
memory_ids = [e.id for e in playbook.entries]

# 2. EXECUTION: Track which memories the agent uses
run = client.start_run(
    "task-42", "executor",
    task_type="api-integration",
    memory_ids_used=memory_ids,
)

# ... agent does its work ...

# 3. REFLECTION: Complete with outcome (auto-feedback!)
client.complete_run("task-42", success=True, evaluation={"score": 0.95})
# -> Auto-votes 'helpful' on every memory used
# -> On failure: auto-votes 'harmful' AND creates a reflection memory

# 4. CURATION: Periodically clean up
curation = client.curate(namespace="production")
# -> Promotes high-effectiveness entries
# -> Flags low-effectiveness for deprecation
# -> Identifies duplicate entries to consolidate
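
What auto-voting amounts to can be sketched without the server. The snippet assumes each completed run casts one vote per memory it used and that effectiveness is (helpful - harmful) / total; both the function names and the formula are hypothetical, not the scoring Aegis applies.

```python
from collections import Counter
from typing import Dict, List

votes: Dict[str, Counter] = {}


def record_outcome(memory_ids_used: List[str], success: bool) -> None:
    """Cast one 'helpful' or 'harmful' vote on every memory used in the run."""
    label = "helpful" if success else "harmful"
    for memory_id in memory_ids_used:
        votes.setdefault(memory_id, Counter())[label] += 1


def effectiveness(memory_id: str) -> float:
    counts = votes.get(memory_id, Counter())
    total = counts["helpful"] + counts["harmful"]
    return (counts["helpful"] - counts["harmful"]) / total if total else 0.0


record_outcome(["m1", "m2"], success=True)
record_outcome(["m2"], success=False)
print(effectiveness("m1"), effectiveness("m2"))
```

A curation pass could then promote entries with high scores and flag those with low scores, as described in step 4.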

What "Engineered" Means vs "Inspired"

Feature	ACE-Inspired	Aegis ACE-Engineered
Voting	Manual vote endpoints	Auto-voting tied to run outcomes
Reflection	Manual reflection creation	Auto-reflection on failure with error context
Curation	Not implemented	Full curation cycle with promote/flag/consolidate
Run tracking	Not tracked	First-class `ace_runs` table linking memories to outcomes
Agent-specific playbook	Generic query	Filtered by agent_id + task_type

ACE Patterns Guide

Choosing the Right Tool

Different tools solve different problems. This comparison stays focused on capabilities clearly documented in public repos and docs.⁵

If you need...	Usually pick	Reason
Personalized assistant memory (user/profile facts)	mem0	Designed around persistent user/agent memory for assistants
Personal/team "second brain" with ingestion	Supermemory	Knowledge-base style memory with connectors
Graph-native episodic memory over agent events	Graphiti / Zep	Focused on temporal + knowledge graph memory models
Stateful agent runtime + built-in memory blocks	Letta	Agent framework centered on durable state
Secure context engineering with built-in security, trust, and compliance	Aegis Memory	Only context layer with content security, integrity verification, and trust hierarchy
Multi-agent coordination with access boundaries	Aegis Memory	Scope-aware ACLs + cross-agent query APIs
Self-improving context loops (what worked / failed)	Aegis Memory	ACE patterns: vote, reflection, playbook

Quick Feature Comparison

Memory-depth primitives (hybrid retrieval, contradiction handling, consolidation) are now table stakes — mem0, Zep, Letta, and Aegis all ship variants in 2026.⁶ The differences are in how, not whether.

Capability	mem0	Graphiti / Zep	Letta	Aegis Memory
Primary focus	Assistant personalization	Graph-based episodic memory	Stateful agents	Secure context engineering
Open source	Yes	Yes	Yes	Yes
Self-host posture	Available	Available	Available	Self-host-first
Content security pipeline	—	—	—	4-stage (validation, PII, injection, LLM)
Memory integrity	—	—	—	HMAC-SHA256
Trust hierarchy	—	—	—	4-tier OWASP model
Multi-agent ACL/scopes	—	—	—	Yes
Cross-agent query	—	—	—	Yes
Handoff baton	—	—	—	Yes
ACE loop	—	—	—	Yes
Typed memory model	—	—	—	Yes
Temporal decay	—	Partial	—	Yes
Hybrid retrieval (dense + sparse + RRF)	Semantic + BM25 + entity	Semantic + keyword + graph	Yes (RRF)	Yes (pgvector + tsvector + RRF)
Contradiction detection	Mem0g (graph variant, LLM)	LLM + temporal invalidation	—	Typed `contradicts` edge, cheap + optional LLM, explicit resolution workflow
Semantic consolidation	LLM-merge + DELETE losers	Temporal supersession	—	LLM/heuristic merge + audit-preserving (`is_deprecated=True` + `consolidated_into`)
Unified context hub (prompts + memory + skills + subagents)	—	—	—	Yes

When to Pick Aegis

Pick Aegis Memory when most of these are true:

You need content security — injection detection, integrity verification, sensitive data protection.
You need multiple agents to share memory safely with explicit ACL/scopes.
You need handoffs where one agent passes a reliable state bundle to another.
You want ACE patterns (vote/reflection/playbook) to continuously improve memory quality.
You want hybrid retrieval that catches exact-token cases (entity names, error codes, file paths) without giving up semantic similarity.
You need contradiction tracking that's reviewable, not just auto-deleted — typed edges with explicit kept_source / kept_target / both_valid / both_invalid resolutions, plus a /metrics endpoint for measuring epistemic conflict over time.
You need consolidation with an audit trail — losing memories stay queryable (is_deprecated=True, metadata.consolidated_into) rather than being deleted.
You prefer a self-host posture with operational control over storage and deployment.
You need temporal decay so stale memories don't pollute retrieval over time.
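
Temporal decay, the last item above, can be pictured as a weight applied to each retrieval score. The sketch assumes exponential decay with a 30-day half-life; the decay function and its parameters are illustrative assumptions, since the README does not state which curve Aegis uses.

```python
def decayed_score(similarity: float, age_days: float, half_life_days: float = 30.0) -> float:
    """Halve the weight of a memory's similarity score every `half_life_days`."""
    return similarity * 0.5 ** (age_days / half_life_days)


# (id, raw similarity, age in days): hypothetical retrieval candidates
candidates = [("fresh", 0.80, 0.0), ("stale", 0.90, 60.0)]
ranked = sorted(candidates, key=lambda c: decayed_score(c[1], c[2]), reverse=True)
print([c[0] for c in ranked])
```

The older memory has the higher raw similarity, but after two half-lives its score drops to a quarter, so the fresh one ranks first.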

Performance

Benchmarked on 8 vCPU / 7.6 GB RAM (Intel 13th Gen), 1000 memories, Docker Compose (PostgreSQL 16 + pgvector), concurrency=10. Queries include OpenAI embedding latency. Reproduce with cd benchmarks && bash run_benchmark.sh.

Operation	p50	p95	p99	Throughput
Sequential add	72ms	89ms	97ms	14.1 ops/s
Batch add (5x20)	216ms	292ms	292ms	4.6 ops/s
Concurrent add (c=10)	100ms	193ms	511ms	85.1 ops/s
Sequential query	282ms	411ms	1502ms	3.8 ops/s
Concurrent query (c=10)	413ms	1832ms	1897ms	18.6 ops/s
Cross-agent query	304ms	380ms	380ms	3.3 ops/s
Vote	64ms	176ms	176ms	14.1 ops/s
Deduplication	75ms	112ms	112ms	13.6 ops/s

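Query tail latency (p95/p99) is dominated by the external OpenAI embedding call, not Aegis or PostgreSQL. Write and vote operations that skip embedding stay under 100ms at p50 for sequential add (72ms), vote (64ms), and deduplication (75ms); batch add (216ms) and concurrent add (100ms) fall outside that bound.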
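
For readers reproducing the numbers, the percentile columns can be computed from raw latency samples as sketched below. The nearest-rank method and the sample values are assumptions for illustration; the benchmark script may compute percentiles differently.

```python
import math
from typing import List


def percentile(samples: List[float], p: float) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of values at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds, not taken from the benchmark run.
latencies_ms = [60, 62, 64, 65, 70, 72, 75, 80, 90, 176]
summary = {f"p{p}": percentile(latencies_ms, p) for p in (50, 95, 99)}
print(summary)
```

With only a handful of samples, p95 and p99 can land on the same observation, which is one possible reason some rows in the table show identical values in both columns.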

Deployment

Docker Compose

docker compose up -d

Kubernetes

kubectl apply -f k8s/

Configuration

Variable	Default	Description
`DATABASE_URL`	`postgresql+asyncpg://...`	PostgreSQL connection
`OPENAI_API_KEY`	—	For embeddings
`AEGIS_API_KEY`	`dev-key`	API authentication
`CONTENT_POLICY_INJECTION`	`flag`	`reject` / `redact` / `flag` / `allow`
`CONTENT_POLICY_SECRETS`	`reject`	`reject` / `redact` / `flag` / `allow`
`ENABLE_LLM_INJECTION_CLASSIFIER`	`false`	Enable Stage 4 LLM classifier
`INJECTION_CLASSIFIER_MODEL`	`gpt-4o-mini`	Model for injection classification

Full Configuration

Documentation

docs.aegismemory.com — Full documentation

Quickstart — Get running in 5 minutes
Security Guide — Content security, integrity, trust hierarchy
ACE Patterns — Self-improving agent patterns
Smart Memory — Zero-config memory extraction
Integrations — CrewAI, LangChain guides
CLI Reference — Command-line tools

Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

# Run tests
pytest tests/ -v

# Run linting
ruff check server/

License

Apache 2.0 — Use it however you want. See LICENSE.

Links

Built by engineers who read the OWASP reports and acted on them.

Footnotes

¹ EchoLeak: Zero-click exfiltration from M365 Copilot. arxiv.org/html/2509.10540v1
² Multi-agent exfiltration study (COLM 2025). openreview.net/pdf?id=DAozI4etUp
³ CVE-2025-32711 zero-click AI vulnerability analysis. socprime.com/blog/cve-2025-32711-zero-click-ai-vulnerability/
⁴ OWASP Top 10 for Agentic Applications (2026). genai.owasp.org
⁵ Security comparison based on public documentation and open-source repositories as of February 2026. Sources: mem0 docs | Zep docs | Letta repo | Aegis docs
⁶ Memory-depth feature claims verified May 2026 against vendor blogs and docs. Sources: mem0 State of AI Agent Memory 2026 (hybrid: semantic + BM25 + entity), mem0 architecture (consolidation, Mem0g contradiction resolver), Graphiti / Zep paper and Neo4j writeup (LLM-based edge contradiction with temporal invalidation), Letta archival search docs (RRF hybrid). Aegis design choices documented in server/contradiction_detector.py and server/consolidation.py.
