Hi all,
I’m working on evaluating a RAG-based agent. I can run experiments locally using an LLM-as-judge evaluator, but when I run the same experiments against the deployed app, with the same dataset and evaluator, they don’t run (the run count stays at 0).
Has anyone faced this issue, or does anyone know what might be causing it? Any guidance would be appreciated.
