OpenRouter is designed with performance as a top priority. OpenRouter is heavily optimized to add as little latency as possible to your requests.
OpenRouter is designed to add minimal latency to your requests. This is achieved through:
When OpenRouter’s edge caches are cold (typically during the first 1-2 minutes of operation in a new region), you may experience slightly higher latency as the caches warm up. This normalizes once the caches are populated.
To maintain accurate billing and prevent overages, OpenRouter performs additional database checks when:
OpenRouter expires caches more aggressively under these conditions to ensure proper billing, which increases latency until additional credits are added.
When using model routing or provider routing, if the primary model or provider fails, OpenRouter will automatically try the next option. A failed initial completion unsurprisingly adds latency to the specific request. OpenRouter tracks provider failures, and will attempt to intelligently route around unavailable providers so that this latency is not incurred on every request.
To achieve optimal performance with OpenRouter:
Maintain Healthy Credit Balance
Use Provider Preferences