Red Teaming

Policy for red teaming and adversarial testing on OpenRouter

Approval Required

Because most model and provider terms of service do not support red-teaming, you cannot red-team models on OpenRouter without prior approval. This includes prompt injection, jailbreaking, or any attempt to get models to behave in ways that violate their terms of service.

Customers who engage in red teaming without approval may be flagged for model or provider terms of service violations. Unapproved red teaming may result in:

Provider bans

Model access restrictions

Platform-wide account bans

Compatibility with Zero Data Retention

Note that certain types of safety classifiers are run online (while the prompt is in-flight and in-memory). Prompts may therefore be flagged by classifiers even with full ZDR configured. These classifiers operate independently of data retention policies and are fully compatible with Zero Data Retention (ZDR).

Request Approval

To request red teaming approval, email safety@openrouter.ai with the following details:

A description of your research or use case

The models & providers you intend to test

The types of adversarial techniques you plan to use

Your expected timeline

Approval generally takes 5 business days. Approval is not guaranteed and is granted at OpenRouter’s discretion based on the details of your use case.

Red Teaming

Red Teaming

Approval Required

Legitimate Red Teaming

Compatibility with Zero Data Retention

Request Approval

Approval Required

Legitimate Red Teaming

Compatibility with Zero Data Retention

Request Approval