How are you validating LangChain agent output before it executes shell commands?

Been running LangChain agents in production for a few months and kept
hitting the same anxiety: the agent generates a shell command or installs
a package, and I have no systematic way to validate it before it runs.

Tried prompt engineering to “be safe” — doesn’t hold under load. Tried
manual review — doesn’t scale. So I built a rule engine that sits between
generation and execution.

It checks for:

  • Destructive shell patterns (rm -rf, dd, mkfs, format)
  • eval/exec abuse and command injection
  • Secret and credential leakage in env vars or logs
  • Risky package installs in package.json
  • Unsafe CI/CD permissions in GitHub Actions

Returns a structured ALLOW / WARN / BLOCK verdict with rule ID and risk
score — so the agent itself can decide whether to proceed, flag for review,
or abort.

Works via Telegram bot (/scan, /deps, /ci) or API. No SDK needed, no
changes to your existing LangChain setup — just add a validation step
before tool execution.

Called it TEOS Sentinel Shield. 10 free scans/day if anyone wants to test
it against their own agent outputs:

Bot: Telegram: Launch @teoslinker_bot

Curious how others are handling this — are you doing any pre-execution
validation, or relying entirely on prompt constraints?

@Elmahrosa-Steward ,
What have worked best for me is a combination of Middleware + HITL (Human in the loop). I have been consistently using default HumanInTheLoopMiddleWare like this for write/execute tools from FileSystemMiddleWare

from langchain.agents import create_agent
from langchain.agents.middleware import HumanInTheLoopMiddleware
from deepagents.middleware import FilesystemMiddleware

fs_middleware = FilesystemMiddleware()


_WRITE_OPS = {"write_file", "edit_file"}
_EXEC_OPS = {"execute"}

interrupt_on = {
    tool.name: True  # approve / edit / reject / respond
    for tool in fs_middleware.tools
    if tool.name in _EXEC_OPS
} | {
    tool.name: {"allowed_decisions": ["approve", "edit", "reject"]}
    for tool in fs_middleware.tools
    if tool.name in _WRITE_OPS
}

hitl_middleware = HumanInTheLoopMiddleware(interrupt_on=interrupt_on)

I know this might slow down agent execution, but given the sensitive nature of these commands, I had to rely on human intervention.