Post-retrieval temporal decay : how are you handling stale context in production RAG pipelines?

We’ve been running into a specific failure mode in
production RAG agents that I haven’t seen discussed
much here: temporal staleness at the retrieval layer.

The problem: standard vector search returns documents
by semantic similarity with no concept of time. A
compliance guideline from 2023 that has since been
superseded scores the same cosine similarity as the
current version. The agent ingests both with equal
confidence.

In regulated environments (clinical NLP, financial
compliance, legal), this isn’t a quality issue, it’s
a liability.

What we built to address it:

A post-retrieval middleware layer that intercepts
vector payloads before they reach the LLM and applies
deterministic temporal decay scoring:

  • decay_score (0.0 = fresh, 1.0 = fully stale)
  • domain_velocity label: hypersonic / fluid / frozen
  • days_until_stale exact integer
  • Conflict detection for contradictory sources

No LLM in the scoring layer. Pure math. Auditable.

The domain_velocity classification is the part that
surprised us most, the same 90-day-old document
should be treated completely differently depending on
whether it’s from a fast-moving regulatory domain
(hypersonic, 30-day half-life) vs fundamental
mathematics (frozen, 50-year half-life).

Questions for the community:

  1. Are you handling temporal relevance in your RAG
    pipelines at all, or relying on the LLM to reason
    about document dates?

  2. For those in regulated industries : how are you
    handling auditability of what context the agent
    was allowed to see?

  3. Is anyone doing post-retrieval filtering beyond
    basic metadata filters?

Live API for testing if anyone wants to benchmark:
api.knowledgeuniverse.tech

Happy to discuss the math behind the decay functions
if useful.