Hi everyone,
I’m running a large number of LangWatch scenarios in parallel (scenario.langwatch.ai) and I’m consistently hitting OpenAI rate limits during execution.
I’m trying to understand the recommended best practices for handling this at scale. For example:
How do you typically manage concurrency when running many LLM tests?
Is using multiple OpenAI API keys ever considered a valid approach, or is it generally discouraged?
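For concreteness, this is the general pattern I’ve been considering so far: cap the number of in-flight calls with a semaphore and retry rate-limited calls with exponential backoff plus jitter. Just a rough sketch, not actual LangWatch or OpenAI code — `run_scenario` and `RateLimitError` here are hypothetical stand-ins that simulate a throttled API:

```python
import asyncio
import random

random.seed(0)  # reproducible simulation


class RateLimitError(Exception):
    """Stand-in for an API rate-limit error (e.g. HTTP 429)."""


async def run_scenario(i: int) -> str:
    # Hypothetical stand-in for one scenario run; randomly throttled.
    if random.random() < 0.3:
        raise RateLimitError(f"scenario {i} throttled")
    await asyncio.sleep(0.01)
    return f"scenario {i} ok"


async def run_with_limits(i: int, sem: asyncio.Semaphore,
                          max_retries: int = 8,
                          base_delay: float = 0.05) -> str:
    async with sem:  # cap concurrent in-flight calls
        for attempt in range(max_retries):
            try:
                return await run_scenario(i)
            except RateLimitError:
                # exponential backoff with jitter before retrying
                delay = base_delay * (2 ** attempt) * (1 + random.random())
                await asyncio.sleep(delay)
        raise RuntimeError(f"scenario {i} failed after {max_retries} retries")


async def main(n: int = 20, concurrency: int = 5) -> list[str]:
    sem = asyncio.Semaphore(concurrency)
    return await asyncio.gather(*(run_with_limits(i, sem) for i in range(n)))


results = asyncio.run(main())
print(f"{len(results)} scenarios completed")
```

This works in a toy setting, but I’m not sure whether a client-side semaphore like this is the right layer to solve it at, or whether LangWatch itself exposes a concurrency setting for scenario runs.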
I’d love to hear how others have solved this in production or CI environments, and whether there are LangWatch-specific patterns that work well.
Thanks in advance!