The first CI/CD gate for AI Memory. Detect regressions, user-leakage, and rank-collapse before your users do.
Don't trust "Semantic Similarity". We run 7 stress-tests (interference, noise, capacity) to prove your memory actually works.
Automated "User Separation" tests. Guarantee that User A's context never bleeds into User B's recall.
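How the separation score is computed is not spelled out here. As a rough sketch, one plausible reading is "the fraction of recalled items that belong to the user who asked". Everything below (`ToyMemory`, `separation_score`, the sample data) is hypothetical and only illustrates the idea. It is not the harness's actual API.

```python
from collections import defaultdict


class ToyMemory:
    """Hypothetical per-user store standing in for a real memory provider."""

    def __init__(self):
        self._items = defaultdict(list)

    def write(self, user_id, text):
        self._items[user_id].append(text)

    def recall(self, user_id, query, k=3):
        # Naive keyword-overlap ranking, scoped to the querying user's items.
        words = set(query.lower().split())
        ranked = sorted(
            self._items[user_id],
            key=lambda t: len(words & set(t.lower().split())),
            reverse=True,
        )
        return ranked[:k]


def separation_score(memory, owned, queries, k=3):
    """Fraction of recalled items that belong to the user who asked."""
    total = clean = 0
    for user_id, query in queries:
        for item in memory.recall(user_id, query, k=k):
            total += 1
            clean += item in owned[user_id]
    return clean / total if total else 1.0


# Assumed sample data: two tenants with disjoint memories.
owned = {
    "alice": {"alice likes green tea", "alice lives in Oslo"},
    "bob": {"bob likes black coffee", "bob lives in Lima"},
}
memory = ToyMemory()
for user, items in owned.items():
    for text in sorted(items):
        memory.write(user, text)

queries = [("alice", "what does alice like"), ("bob", "where does bob live")]
score = separation_score(memory, owned, queries)
print(f"user separation: {score:.2f}")
```

A store that scopes recall by user scores 1.00 on this check. A store that ranks over everyone's items would return another tenant's text and drop below the 0.90 pass threshold.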
We differentiate between "True Recall" and "Model Guessing" by calculating the signal-to-noise ratio in your retrievals.
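The exact signal-to-noise formula is not given here. A minimal sketch, assuming the retriever exposes a similarity score per item, is to compare scores for items known to be relevant against scores for distractors. The function name `retrieval_snr` and the sample scores are illustrative assumptions.

```python
from statistics import mean, pstdev


def retrieval_snr(relevant_scores, distractor_scores):
    """Gap between relevant and distractor means, in units of distractor spread."""
    gap = mean(relevant_scores) - mean(distractor_scores)
    noise = pstdev(distractor_scores)
    if noise == 0:
        # No spread in the distractors: any positive gap is pure signal.
        return float("inf") if gap > 0 else 0.0
    return gap / noise


# Relevant items clearly outscore distractors: high ratio (about 5.5).
true_recall = retrieval_snr([0.9, 0.8], [0.2, 0.4])

# Relevant items score no better than distractors: ratio near zero.
guessing = retrieval_snr([0.3, 0.3], [0.2, 0.4])

print(f"true recall: {true_recall:.2f}, guessing: {guessing:.2f}")
```

A ratio near zero means the retrieved items are indistinguishable from noise, so any correct answer is more likely the model guessing than the memory recalling.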
Every failed test comes with actionable fix suggestions, code examples, and documentation links. Not just diagnosis, but direction.
Drop a memory-audit.yaml in your repo. Block deploys that break memory. Track regressions over time.
Compare your memory against RAG, KNN, and Hopfield baselines. Know exactly where you stand and what to improve.
Each test targets a specific failure mode that breaks production memory systems.
| Test | What It Catches | PASS Threshold |
|---|---|---|
| Sensitivity Ratio | Cue dominance: memory is decorative | > 0.30 |
| Marginal Permutation | Structure-blind: only sees density | < 0.30 |
| Spectral Entropy | Rank collapse: one pattern dominates | > 0.40 |
| Bimodal Switch | Interpolation: can't pick a winner | > 0.30 |
| Orthogonal Capacity | Catastrophic forgetting | ≥ 3 patterns |
| Attractor Pull | No denoising: noise propagates | > 0.10 |
| User Separation | Multi-tenant leakage | > 0.90 |
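To make the Spectral Entropy row concrete, here is a minimal sketch of one common way to measure rank collapse: the normalized Shannon entropy of a memory matrix's singular-value spectrum. Whether the harness uses this exact definition is an assumption. The function takes singular values as input, which could come from, for example, `numpy.linalg.svd(M, compute_uv=False)`.

```python
import math


def spectral_entropy(singular_values):
    """Normalized entropy of the squared singular values, in [0, 1].

    Close to 1: energy is spread evenly across directions.
    Close to 0: a single direction dominates (rank collapse).
    """
    energy = [s * s for s in singular_values]
    total = sum(energy)
    if len(energy) < 2 or total == 0:
        return 0.0
    probs = [e / total for e in energy if e > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(energy))


healthy = spectral_entropy([1.0, 1.0, 1.0, 1.0])     # evenly spread spectrum
collapsed = spectral_entropy([10.0, 0.1, 0.1, 0.1])  # one dominant pattern

print(f"healthy: {healthy:.2f}, collapsed: {collapsed:.2f}")
```

Under this definition the evenly spread spectrum clears the 0.40 threshold, while the spectrum dominated by one pattern falls far below it.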
Drop a config file in your repo. Block deploys that break memory.

    name: Memory Audit
    on: [push, pull_request]
    jobs:
      audit:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - name: Run Memory Harness
            uses: memory-harness/action@v1
            with:
              provider: ./my_memory.py
              config: memory-audit.yaml
          - name: Upload Report
            uses: actions/upload-artifact@v4
            with:
              name: memory-report
              path: report.html
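A CI job blocks a deploy by exiting with a non-zero status. The sketch below shows what such a gate could look like using the PASS thresholds from the table. The score keys, the `failed_tests` and `gate` helpers, and the sample scores are hypothetical. The harness's real report format is not described here.

```python
import operator

# Thresholds copied from the PASS Threshold column; key names are assumed.
THRESHOLDS = {
    "sensitivity_ratio": (operator.gt, 0.30),
    "marginal_permutation": (operator.lt, 0.30),
    "spectral_entropy": (operator.gt, 0.40),
    "bimodal_switch": (operator.gt, 0.30),
    "orthogonal_capacity": (operator.ge, 3),
    "attractor_pull": (operator.gt, 0.10),
    "user_separation": (operator.gt, 0.90),
}


def failed_tests(scores):
    """Names of tests whose score is missing or misses its threshold."""
    failures = []
    for name, (passes, limit) in THRESHOLDS.items():
        value = scores.get(name)
        if value is None or not passes(value, limit):
            failures.append(name)
    return failures


def gate(scores):
    """Exit code for CI: 0 when every test passes, 1 otherwise."""
    return 1 if failed_tests(scores) else 0


# Made-up scores for illustration: user separation is below 0.90.
example_scores = {
    "sensitivity_ratio": 0.45,
    "marginal_permutation": 0.12,
    "spectral_entropy": 0.62,
    "bimodal_switch": 0.51,
    "orthogonal_capacity": 5,
    "attractor_pull": 0.22,
    "user_separation": 0.84,
}

print("failed:", failed_tests(example_scores))
print("exit code:", gate(example_scores))
```

In a real pipeline the script would end with `sys.exit(gate(scores))`, so the failing exit code stops the workflow before the deploy step runs.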
Join 500+ teams who ship AI agents with confidence.