Guide a Video with Reference Images

Use reference images to influence video subject, style, or identity without exact frame control
Use this guide to add reference-to-video generation, where images influence the output without fixing the exact first or last frame.

By the end, your implementation should submit a reference-to-video job with input_references.

For reusable agent knowledge across projects, install the openrouter-video skill.

Before you start

You need:

  • An OpenRouter API key available as OPENROUTER_API_KEY
  • Node.js 20 or newer
  • One or more public HTTPS image URLs, starting with REFERENCE_IMAGE_URL
  • A model that supports reference-to-video, confirmed from the current OpenRouter video docs or model description

If you have not chosen a model yet, read Choose a Video Generation Model so you can select one based on your clip duration, output shape, input type, audio, provider controls, and cost requirements.

Use the API reference pages as the source of truth for exact fields:

  • Create video generation request
  • List video generation models
  • TypeScript SDK video generation reference

Use input_references for visual guidance. Use frame_images only when you need exact frame control.

Use stable, directly downloadable image URLs. Some providers cannot fetch image URLs that require cookies, redirects through HTML pages, bot checks, or unusual headers.
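
A quick pre-flight check can catch the most obvious URL problems before a job is submitted. The sketch below is a minimal example, not part of the OpenRouter API: checkReferenceUrl is a hypothetical helper that only verifies that the string parses as an absolute HTTPS URL without embedded credentials. It cannot tell whether a provider will actually be able to fetch the image.

```javascript
// Hypothetical pre-flight helper (not an OpenRouter API).
// Returns a list of problems; an empty list means the URL passed these basic checks.
function checkReferenceUrl(value) {
  let parsed;

  try {
    // The URL constructor throws on relative or malformed input.
    parsed = new URL(value);
  } catch {
    return ["not a valid absolute URL"];
  }

  const problems = [];

  if (parsed.protocol !== "https:") {
    problems.push("URL must use https");
  }

  if (parsed.username || parsed.password) {
    problems.push("URL must not embed credentials");
  }

  return problems;
}
```

Calling checkReferenceUrl(process.env.REFERENCE_IMAGE_URL) before the submit step and refusing to continue when the list is non-empty avoids starting a job that is likely to fail on image fetch.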

Submitting POST /api/v1/videos starts a real video generation job and may spend OpenRouter credits.

The video models endpoint does not expose a dedicated structured reference-image field for every provider. Confirm reference support from the model description or current docs before you submit:

$ curl https://openrouter.ai/api/v1/videos/models

Example model output excerpt:

{
  "id": "bytedance/seedance-2.0-fast",
  "supported_durations": [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
  "supported_resolutions": ["480p", "720p"],
  "supported_aspect_ratios": ["1:1", "3:4", "9:16", "4:3", "16:9", "21:9", "9:21"]
}

For bytedance/seedance-2.0-fast, the model list can confirm the example duration, resolution, and aspect_ratio; reference-image support may still need confirmation from the model description or docs.
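
The same check can be done in code before submitting. This is a minimal sketch, assuming you already hold a single model entry shaped like the excerpt above; findUnsupportedOptions is a hypothetical helper, and treating a missing list as "not supported" is a conservative assumption rather than documented API behavior. It says nothing about reference-image support, which still has to be confirmed separately.

```javascript
// Hypothetical helper: compares a planned request with one model entry
// from the video models list. Field names follow the example excerpt.
// Assumption: a missing list is treated as "not supported".
function findUnsupportedOptions(model, request) {
  const problems = [];

  if (!model.supported_durations?.includes(request.duration)) {
    problems.push(`duration ${request.duration} is not listed`);
  }

  if (!model.supported_resolutions?.includes(request.resolution)) {
    problems.push(`resolution ${request.resolution} is not listed`);
  }

  if (!model.supported_aspect_ratios?.includes(request.aspect_ratio)) {
    problems.push(`aspect_ratio ${request.aspect_ratio} is not listed`);
  }

  return problems;
}

// Model entry copied from the example excerpt.
const exampleModel = {
  id: "bytedance/seedance-2.0-fast",
  supported_durations: [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
  supported_resolutions: ["480p", "720p"],
  supported_aspect_ratios: ["1:1", "3:4", "9:16", "4:3", "16:9", "21:9", "9:21"],
};

const plannedRequest = { duration: 4, resolution: "720p", aspect_ratio: "16:9" };
console.log(findUnsupportedOptions(exampleModel, plannedRequest));
```

An empty array means the planned duration, resolution, and aspect_ratio all appear in the model entry.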

Step 1: Write a prompt that tells the model how to use the references

Reference images work best when the prompt explains what should stay consistent.

Create a 4-second product video of the same backpack from the reference image.
Keep the shape, color, and logo placement consistent.
Place it on a wet city sidewalk at night with neon reflections.
Use a slow orbiting camera move and realistic lighting.
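
If prompts are assembled in code, it can help to keep the consistency instruction as its own part so it is never dropped. The sketch below is one illustrative way to do that; buildReferencePrompt and its part names (subject, keepConsistent, scene, camera) are hypothetical and are not OpenRouter fields.

```javascript
// Joins items as "a", "a and b", or "a, b, and c".
function joinWithAnd(items) {
  if (items.length <= 1) return items.join("");
  if (items.length === 2) return items.join(" and ");
  return `${items.slice(0, -1).join(", ")}, and ${items[items.length - 1]}`;
}

// Hypothetical prompt builder; the part names are illustrative only.
function buildReferencePrompt({ subject, keepConsistent = [], scene, camera }) {
  const consistency =
    keepConsistent.length > 0
      ? `Keep the ${joinWithAnd(keepConsistent)} consistent.`
      : "";

  // Drop empty parts, then join the sentences with single spaces.
  return [subject, consistency, scene, camera].filter(Boolean).join(" ");
}

const prompt = buildReferencePrompt({
  subject:
    "Create a 4-second product video of the same backpack from the reference image.",
  keepConsistent: ["shape", "color", "logo placement"],
  scene: "Place it on a wet city sidewalk at night with neon reflections.",
  camera: "Use a slow orbiting camera move and realistic lighting.",
});
console.log(prompt);
```

With these parts, the helper reproduces the example prompt above as a single line.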

Step 2: Submit the reference-to-video job

Build the video request with input_references when the images should guide subject, identity, or style. Unlike frame_images, reference images are not exact frame anchors.

Example request shape:

const apiKey = process.env.OPENROUTER_API_KEY;
const referenceImageUrl = process.env.REFERENCE_IMAGE_URL;

if (!apiKey) {
  throw new Error("Set OPENROUTER_API_KEY first.");
}

if (!referenceImageUrl) {
  throw new Error(
    "Set REFERENCE_IMAGE_URL to a directly downloadable image URL.",
  );
}

const response = await fetch("https://openrouter.ai/api/v1/videos", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "bytedance/seedance-2.0-fast",
    prompt:
      "Create a 4-second product video of the same backpack from the reference image. Keep the shape, color, and logo placement consistent. Place it on a wet city sidewalk at night with neon reflections. Use a slow orbiting camera move and realistic lighting.",
    duration: 4,
    resolution: "720p",
    aspect_ratio: "16:9",
    generate_audio: false,
    input_references: [
      {
        type: "image_url",
        image_url: {
          url: referenceImageUrl,
        },
      },
    ],
  }),
});

if (!response.ok) {
  throw new Error(await response.text());
}

const job = await response.json();
console.log(job);

The submit call returns the job fields immediately. In the QA run, the submitted job later completed and its video was downloaded, with this final summary:

{
  "id": "Mu1opxXnpIIpwMhwFl8v",
  "status": "completed",
  "polls": ["pending", "pending", "pending", "pending", "completed"],
  "output_path": "reference-video.mp4",
  "bytes": 2020562
}

Step 3: Add more references when consistency matters

Some models can use multiple reference images. Before doing this in production, check the current docs or model description for the selected model, then start with the smallest number of references that gives you enough consistency.

const characterSideUrl = process.env.CHARACTER_SIDE_URL;
const styleReferenceUrl = process.env.STYLE_REFERENCE_URL;

if (!characterSideUrl || !styleReferenceUrl) {
  throw new Error("Set CHARACTER_SIDE_URL and STYLE_REFERENCE_URL first.");
}

const inputReferences = [
  {
    type: "image_url",
    image_url: { url: referenceImageUrl },
  },
  {
    type: "image_url",
    image_url: { url: characterSideUrl },
  },
  {
    type: "image_url",
    image_url: { url: styleReferenceUrl },
  },
];

Then set input_references in the request body to inputReferences.

Request shape for the optional multi-reference path:

[
  {
    "type": "image_url",
    "image_url": { "url": "https://your-domain.example/character-front.jpg" }
  },
  {
    "type": "image_url",
    "image_url": { "url": "https://your-domain.example/character-side.jpg" }
  },
  {
    "type": "image_url",
    "image_url": { "url": "https://your-domain.example/style-reference.jpg" }
  }
]
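
When the reference URLs come from configuration or user input, a small builder keeps the array shape consistent. The following is a minimal sketch under that assumption; buildInputReferences is a hypothetical helper that trims each URL, skips empty or missing values, and removes duplicates while preserving order. It does not enforce any per-model limit on the number of references.

```javascript
// Hypothetical helper: turns a list of URL strings into the
// input_references shape used in this guide.
function buildInputReferences(urls) {
  const unique = new Set(); // a Set keeps first-seen insertion order

  for (const url of urls) {
    const trimmed = typeof url === "string" ? url.trim() : "";
    if (trimmed) {
      unique.add(trimmed);
    }
  }

  return [...unique].map((url) => ({
    type: "image_url",
    image_url: { url },
  }));
}

const references = buildInputReferences([
  "https://your-domain.example/character-front.jpg",
  "https://your-domain.example/character-side.jpg",
  undefined, // e.g. an unset environment variable
  "https://your-domain.example/character-front.jpg", // duplicate, dropped
]);
console.log(JSON.stringify(references, null, 2));
```

Skipping unset values lets the same code serve both the single-reference and multi-reference paths.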

Step 4: Poll and download

After submission, poll from a server route, worker, or job runner instead of the browser. Keep the flow explicit: poll with a limit, stop on terminal failure, then download the completed video.

Example polling and download helper:

import { writeFile } from "node:fs/promises";

// Polls every 30 seconds, up to 60 times (about 30 minutes in total).
async function waitForVideo(job) {
  let current = job;

  for (let attempt = 1; attempt <= 60; attempt += 1) {
    if (current.status === "completed") {
      return current;
    }

    if (current.status === "failed") {
      throw new Error(current.error ?? "Video generation failed.");
    }

    if (["cancelled", "expired"].includes(current.status)) {
      throw new Error(current.error ?? `Video generation ${current.status}.`);
    }

    await new Promise((resolve) => setTimeout(resolve, 30_000));

    if (!current.polling_url) {
      throw new Error("Video job did not include a polling_url.");
    }

    const pollingUrl = new URL(current.polling_url, "https://openrouter.ai");
    const response = await fetch(pollingUrl, {
      headers: {
        Authorization: `Bearer ${apiKey}`,
      },
    });

    if (!response.ok) {
      throw new Error(await response.text());
    }

    current = await response.json();
  }

  // The result of the final poll has not been checked by the loop yet.
  if (current.status === "completed") {
    return current;
  }

  throw new Error("Video generation did not complete after 60 attempts.");
}

async function downloadVideo(job) {
  const videoUrl =
    job.unsigned_urls?.[0] ??
    `https://openrouter.ai/api/v1/videos/${job.id}/content?index=0`;

  // Send the API key only to OpenRouter API URLs.
  const response = await fetch(videoUrl, {
    headers: videoUrl.startsWith("https://openrouter.ai/api/")
      ? { Authorization: `Bearer ${apiKey}` }
      : undefined,
  });

  if (!response.ok) {
    throw new Error(await response.text());
  }

  return Buffer.from(await response.arrayBuffer());
}

const completedJob = await waitForVideo(job);
const videoBuffer = await downloadVideo(completedJob);
await writeFile("reference-video.mp4", videoBuffer);
console.log("Saved reference-video.mp4");

The QA run saved the finished video after polling completed:

Saved reference-video.mp4

Check your work

The output should borrow subject, style, or identity cues from the reference images while still following the generated scene described in the prompt. The implementation should produce a playable MP4 from the completed job.
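
A cheap automated check can complement watching the clip. The sketch below is a minimal example, assuming the download is a standard MP4 container: such files normally begin with a box whose four-byte type, at byte offset 4, is "ftyp". looksLikeMp4 is a hypothetical helper. It only catches obvious failures such as a saved error page, and it does not prove the video is playable.

```javascript
// Hypothetical sanity check: MP4 is an ISO base media file, which normally
// starts with a box whose 4-byte type at offset 4 is "ftyp".
function looksLikeMp4(buffer) {
  return buffer.length >= 8 && buffer.toString("latin1", 4, 8) === "ftyp";
}

// Minimal fake header for illustration: a 4-byte box size, then "ftyp".
const fakeHeader = Buffer.concat([
  Buffer.from([0x00, 0x00, 0x00, 0x18]),
  Buffer.from("ftypisom", "latin1"),
]);

console.log(looksLikeMp4(fakeHeader));
console.log(looksLikeMp4(Buffer.from("not a video file")));
```

In the download step, the same function could be applied to the downloaded buffer before writing reference-video.mp4.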