Google has provisioned throughput.
Currently it’s impossible to use it in the JS SDK.
Please add ability to add custom headers to vertex AI requests.
curl -X POST
-H “Authorization: Bearer $(gcloud auth print-access-token)”
-H “Content-Type: application/json”
-H “X-Vertex-AI-LLM-Request-Type: dedicated” \ # Options: dedicated, shared
$URL
-d ‘{“contents”: [{“role”: “user”, “parts”: [{“text”: “Hello.”}]}]}’