Support “less than” conditions for Run Count alerts

Currently, Run Count alerts only support conditions of the form:

“Run count greater than or equal to X runs in the last Y minutes.”

This makes it difficult to detect inactivity or drops in traffic, such as:

  • No traces received in a given time window

  • Sudden drop in usage

  • Missing expected periodic events (e.g., heartbeat checks)

For my use case, I want to ensure that traces are continuously being logged during certain time windows of the week. If no traces are received within an X-hour window, this likely indicates:

  • A system issue

  • A broken integration

  • Or a drop in usage that requires investigation

Currently, I do not see a way to detect this directly using LangSmith alerts.


Is there a way to change the comparison operators in alert conditions for Run Count:

  • < (less than)

  • ≤ (less than or equal to)

If not, what is the workaround? Or where can I send a feature request to the LangSmith team?

Hello @j.cho,
No, Run Count alert conditions currently only support “greater than or equal to” thresholds; < and ≤ comparisons are not available.

The alerting UI only evaluates conditions of the form “Run count ≥ X in the last Y minutes”, so you cannot directly create a rule that fires when the run count is less than a threshold.

For a workaround (I am assuming you are using FastAPI for your backend):

Use a webhook + background monitor: have LangSmith send a webhook on activity (e.g., Run count ≥ 1), and let your FastAPI service alert if no webhook arrives within your X‑hour window.

This inverts the existing run-count rule (which only supports ≥) by pushing activity events out and detecting the absence of events externally. Below is a minimal pattern: a FastAPI endpoint that accepts LangSmith rule webhooks and updates a last_seen timestamp, plus an async background monitor that posts to your alert endpoint (PagerDuty/Slack/email webhook) if no events arrive within ALERT_AFTER_SECONDS.

# webhook_monitor.py
# Run: uvicorn webhook_monitor:app --host 0.0.0.0 --port 8080

import os
import time
import asyncio

import httpx
from fastapi import FastAPI, Request

ALERT_AFTER_SECONDS = int(os.getenv("ALERT_AFTER_SECONDS", 60 * 60))  # alert after 1 hour of silence
CHECK_INTERVAL_SECONDS = int(os.getenv("CHECK_INTERVAL_SECONDS", 60))  # check once per minute
ALERT_WEBHOOK_URL = os.getenv("ALERT_WEBHOOK_URL")  # where alerts are posted; if unset, alerts are printed
ALERT_REPEAT_COOLDOWN = int(os.getenv("ALERT_REPEAT_COOLDOWN", 60 * 60 * 4))  # 4 hours

app = FastAPI()

# For actual production, use Redis or DB instead.
state = {
    "last_seen": time.time(),
    "last_alerted": 0,
}


@app.post("/langsmith-webhook") # just an example endpoint
async def langsmith_webhook(req: Request):
    """
    Endpoint for LangSmith automation rule webhook.
    The LangSmith rule should fire on activity (e.g., Run count ≥ 1 within a short window).
    """
    payload = await req.json()    # you may inspect payload for filtering/verification
    # Update last seen timestamp whenever a valid webhook arrives
    state["last_seen"] = time.time()
    return {"status": "ok"}


async def send_alert(message: str):
    """
    Post to your alerting integration (PagerDuty/Slack/email) using a generic webhook URL.
    Replace with direct PagerDuty/Slack API calls if desired.
    """
    if not ALERT_WEBHOOK_URL:
        # fallback: print or integrate with other notifier
        print("ALERT:", message)
        return

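    async with httpx.AsyncClient(timeout=10.0) as client:
        try:
            # ALERT_WEBHOOK_URL is the destination where you want to send the alert
            response = await client.post(ALERT_WEBHOOK_URL, json={"text": message})
            response.raise_for_status()  # treat non-2xx responses as failures too
        except Exception as e:
            print("Failed to send alert:", e)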


async def monitor_loop():
    while True:
        now = time.time()
        last_seen = state.get("last_seen", 0)
        last_alerted = state.get("last_alerted", 0)
        if now - last_seen > ALERT_AFTER_SECONDS:
            # If we haven't alerted recently, send an alert
            if now - last_alerted > ALERT_REPEAT_COOLDOWN:
                message = f"No traces received in the last {ALERT_AFTER_SECONDS // 60} minutes."
                await send_alert(message)
                state["last_alerted"] = time.time()
        await asyncio.sleep(CHECK_INTERVAL_SECONDS)


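# Note: on_event is deprecated in recent FastAPI versions in favor of lifespan handlers.
@app.on_event("startup")
async def startup_event():
    # Start the monitor loop in the background; keep a reference so the task is not garbage-collected
    app.state.monitor_task = asyncio.create_task(monitor_loop())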
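
Since the original question only cares about certain time windows of the week, the monitor could also ignore silence outside those windows. The following is a minimal sketch under stated assumptions, not part of LangSmith or FastAPI: ACTIVE_WINDOWS and silence_in_window are hypothetical names, the weekday 09:00 to 17:00 schedule is made up for illustration, and all datetimes are assumed to be naive local times.

```python
from datetime import datetime, time as dtime

# Hypothetical schedule: weekday index (Monday=0) -> list of (start, end) local times.
# Illustrative assumption only: Monday to Friday, 09:00 to 17:00.
ACTIVE_WINDOWS = {day: [(dtime(9, 0), dtime(17, 0))] for day in range(5)}


def silence_in_window(now: datetime, last_seen: datetime, windows=ACTIVE_WINDOWS) -> float:
    """Return the seconds of silence counted only since the current window opened.

    Returns 0.0 when `now` is outside every configured window, so no alert fires then.
    """
    for start, end in windows.get(now.weekday(), []):
        if start <= now.time() < end:
            window_open = datetime.combine(now.date(), start)
            # Activity before the window opened does not count against this window.
            return (now - max(last_seen, window_open)).total_seconds()
    return 0.0
```

In monitor_loop, the check `now - last_seen > ALERT_AFTER_SECONDS` could then become `silence_in_window(datetime.now(), datetime.fromtimestamp(state["last_seen"])) > ALERT_AFTER_SECONDS`, so a quiet weekend does not trigger an alert first thing on Monday.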

Finally, please see the following doc in order to understand how to open a ticket for a feature request.

Thank you very much @keenborder786 for the clarification!


Thank you, @j.cho :blush: I’d really appreciate it if you could mark the above as the solution when convenient.

