We’ve been running into a specific failure mode in
production RAG agents that I haven’t seen discussed
much here: temporal staleness at the retrieval layer.
The problem: standard vector search returns documents
by semantic similarity with no concept of time. A
compliance guideline from 2023 that has since been
superseded scores the same cosine similarity as the
current version. The agent ingests both with equal
confidence.
In regulated environments (clinical NLP, financial
compliance, legal), this isn’t a quality issue, it’s
a liability.
What we built to address it:
A post-retrieval middleware layer that intercepts
vector payloads before they reach the LLM and applies
deterministic temporal decay scoring:
decay_score(0.0 = fresh, 1.0 = fully stale)domain_velocitylabel: hypersonic / fluid / frozendays_until_staleexact integer- Conflict detection for contradictory sources
No LLM in the scoring layer. Pure math. Auditable.
The domain_velocity classification is the part that
surprised us most, the same 90-day-old document
should be treated completely differently depending on
whether it’s from a fast-moving regulatory domain
(hypersonic, 30-day half-life) vs fundamental
mathematics (frozen, 50-year half-life).
Questions for the community:
-
Are you handling temporal relevance in your RAG
pipelines at all, or relying on the LLM to reason
about document dates? -
For those in regulated industries : how are you
handling auditability of what context the agent
was allowed to see? -
Is anyone doing post-retrieval filtering beyond
basic metadata filters?
Live API for testing if anyone wants to benchmark:
api.knowledgeuniverse.tech
Happy to discuss the math behind the decay functions
if useful.