Building a cross-site pattern pool for production AI agent failures. The model: connected sites push agent reports (warnings, learned_patterns, runtime events); each site gets back patterns filtered to its specific stack and scored by cross-site relevance. The moat is personalized delivery, not a public knowledge dump.
Pool is at 1,164 patterns. Sharing the engineering gotchas because none of this was in any blog post I could find:
1. Fingerprint dedup is harder than it looks. Started with raw text SHA-256. Got ~40% duplicate rate from formatting noise alone. Switched to NFKC-normalize + strip timestamps/UUIDs/hex strings before hashing. Dup rate dropped to ~3%, but now we miss near-duplicates that differ in one numeric param. Semantic embedding dedup would catch them; cost vs. benefit at this scale is unclear.
2. The deploy filesystem will eat your data directory. Render’s slug builder silently dropped a large data/ subtree even though git tracked it. We lost two days before noticing that the production AgentMinds API was serving a stale 207-pattern pool while local was at 1,164. Fix: bundle promoted patterns into a single JSON file in the source tree (1.5 MB). The loader checks the bundle first and falls back to the per-file directory for local dev.
3. Scoring under thin data. Success rate is the Beta-Bernoulli posterior mean under a uniform prior, (k+1)/(n+2), so a pattern with 1/1 successes (0.67) doesn’t outrank one with 8/10 (0.75). On top of that, a +6 score boost for patterns flagged production_observed (real fixes confirmed in prod) over documented (best practice, not yet validated). Effect: the personalized top-7 for our own dogfood site is now all production fixes, not docs.
4. Cross-site PII boundary. Patterns are anonymized at the source. But fingerprint matches across sites are signal: cluster size = how widely a failure is seen. So site_count is published; site identifiers never are. Caller auth gates personalized delivery; the public pool only exposes the shape, not the contents.
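For gotcha 1, here is a minimal sketch of the normalize-then-hash step. The regexes, the placeholder tokens, and the whitespace collapsing are illustrative assumptions, not the exact rules AgentMinds uses; only the NFKC step, the stripped categories (timestamps, UUIDs, hex strings), and SHA-256 come from the description above.

```python
import hashlib
import re
import unicodedata

# Hypothetical patterns: the post names the categories, not the regexes.
_UUID = re.compile(
    r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
)
_TIMESTAMP = re.compile(
    r"\b\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:?\d{2})?"
)
_HEX = re.compile(r"\b(?:0x[0-9a-fA-F]+|[0-9a-fA-F]{8,})\b")
_WHITESPACE = re.compile(r"\s+")


def fingerprint(text: str) -> str:
    """Return a SHA-256 hex digest of the text with volatile tokens masked."""
    norm = unicodedata.normalize("NFKC", text)
    # UUIDs go first, otherwise the generic hex rule would eat their segments.
    norm = _UUID.sub("<uuid>", norm)
    norm = _TIMESTAMP.sub("<ts>", norm)
    norm = _HEX.sub("<hex>", norm)
    # Assumption: runs of whitespace count as formatting noise too.
    norm = _WHITESPACE.sub(" ", norm).strip()
    return hashlib.sha256(norm.encode("utf-8")).hexdigest()
```

Two reports that differ only in timestamp, run ID, or address hash to the same value. "retry after 3 attempts" and "retry after 5 attempts" still hash differently, which is the near-duplicate gap mentioned in gotcha 1.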
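For gotcha 2, the bundle-first loader could look roughly like this. The function name, the paths, and the file layouts (bundle is a JSON array, per-file directory holds one JSON object per file) are assumptions for illustration.

```python
import json
from pathlib import Path


def load_patterns(bundle_path: Path, patterns_dir: Path) -> list[dict]:
    """Load patterns from the bundle if present, else from per-file JSON."""
    # Deployed case: one bundled JSON file that ships with the source tree.
    if bundle_path.is_file():
        with bundle_path.open(encoding="utf-8") as fh:
            return json.load(fh)

    # Local dev fallback: one JSON file per pattern, read in a stable order.
    patterns = []
    for path in sorted(patterns_dir.glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            patterns.append(json.load(fh))
    return patterns
```

Logging the loaded pattern count at startup would also have surfaced the 207 vs. 1,164 mismatch much sooner.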
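For gotcha 3, a small sketch of the smoothed rate plus the evidence boost. The (k+1)/(n+2) formula and the +6 boost are from the post; how the two are combined is not stated, so RATE_WEIGHT and the additive form below are purely hypothetical.

```python
PRODUCTION_BOOST = 6.0  # from the post: boost for production_observed
RATE_WEIGHT = 10.0      # hypothetical: scale of the rate term is not stated


def smoothed_success_rate(successes: int, trials: int) -> float:
    """Posterior mean of a Bernoulli rate under a uniform Beta(1, 1) prior."""
    if not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    return (successes + 1) / (trials + 2)


def score(successes: int, trials: int, evidence: str) -> float:
    """Assumed combination: weighted smoothed rate plus a flat evidence boost."""
    boost = PRODUCTION_BOOST if evidence == "production_observed" else 0.0
    return RATE_WEIGHT * smoothed_success_rate(successes, trials) + boost
```

With these numbers, 1/1 smooths to 2/3 and 8/10 to 0.75, so the better-evidenced pattern ranks higher. A production_observed pattern at 1/1 still outranks a documented one at 8/10, which matches the top-7 effect described in gotcha 3.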
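For gotcha 4, one way to publish cluster size without leaking who reported what is to aggregate down to a count before anything leaves the private side. The (site_id, fingerprint) input shape and the function name are assumptions.

```python
from collections import defaultdict
from typing import Iterable


def public_site_counts(reports: Iterable[tuple[str, str]]) -> dict[str, int]:
    """Map each fingerprint to the number of distinct sites that reported it."""
    sites_by_fingerprint: dict[str, set[str]] = defaultdict(set)
    for site_id, fingerprint in reports:
        sites_by_fingerprint[fingerprint].add(site_id)
    # Only the counts are returned; site identifiers stay in this function.
    return {fp: len(sites) for fp, sites in sites_by_fingerprint.items()}
```

Repeat reports from the same site count once, so site_count reflects how widely a failure is seen rather than how noisy one site is.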
Looking for ~5 pilot teams. If you’re running a LangChain / LangGraph / MCP project in production and want to try it (pip install agentminds or npx agentminds-mcp), the trade is simple: you get personalized cross-site patterns filtered to your stack, we get real-world feedback on what’s noise vs. signal. No fee, no contract — just a couple of weeks of actual use.
Spec for the response format (CC-BY-4.0): agentmindsdev/profile on GitHub, the AgentMinds Reporting Profile (ARP): public spec, JSON Schema, conformance levels.
Lineage: Sentry / OTel / MCP / anthropic-skills / OASF.
AgentMinds is hosted at https://agentminds.dev — public pool stats and stack-filtered patterns are browsable without registering.
If anyone’s solved cross-site agent-failure dedup differently — especially the semantic vs. exact tradeoff — I’d love to hear that too.