MCP App Store

Overview

Turn ideas into high-quality videos with Runway in ChatGPT. Create polished product marketing spots, cinematic social clips, or creative short stories from a simple prompt, an image, or a product URL. Generate fresh visuals from scratch, animate stills into motion, and transform existing footage with AI-powered edits—so you can go from concept to final-ready content in minutes, all inside ChatGPT.

Tools

complete_upload

ChatGPT
Finalize a file upload and get the asset URL. Call this AFTER successfully uploading the file bytes to the presigned URL(s) from init_upload. parts is REQUIRED, even for single-part uploads. Each entry corresponds to one PUT response and carries the ETag from that response's etag header (strip surrounding quotes if present). Returns the asset URL to use as startFrame / endFrame / referenceImages[].url in the generation tools, or as referenceVideo.url in generate_video.
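
The ETag handling described above can be sketched as follows. This is a minimal illustration, not the tool's documented schema: the key names `partNumber` and `etag` inside each parts entry are assumptions, as is numbering parts from 1 in upload order.

```python
from typing import Dict, List


def clean_etag(raw: str) -> str:
    """Strip whitespace and one pair of surrounding double quotes, if present."""
    value = raw.strip()
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        value = value[1:-1]
    return value


def build_parts(etag_headers: List[str]) -> List[Dict[str, object]]:
    """Build one parts entry per PUT response, in upload order."""
    if not etag_headers:
        # The listing says parts is required even for single-part uploads.
        raise ValueError("parts is required, even for single-part uploads")
    return [
        # Key names are hypothetical; check the tool's input schema.
        {"partNumber": index, "etag": clean_etag(raw)}
        for index, raw in enumerate(etag_headers, start=1)
    ]


print(build_parts(['"abc123"', "def456"]))
```

A single-part upload still goes through `build_parts`, producing a one-element list.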

feedback

ChatGPT
Call this when you (the AI agent) get stuck using Runway tools. Examples of when to call:
- A generation failed or produced unexpected results
- You couldn't figure out which tool to use
- The user said the result wasn't what they wanted
- The prompting guide didn't have the info you needed

This helps the Runway team improve the tools. Be specific about what went wrong.

generate_image

ChatGPT
Generate OR edit an image using a Runway-hosted image model. This is the only image tool; there is no separate "edit_image" tool. Pass the source image as referenceImages[0] to edit it.

CREDIT MODE: Runway account availability is checked against the currently connected workspace.

For creative product ad videos from a product URL or product image, use generate_product_marketing_video instead. It handles product-image extraction, storyboard generation, and final video creation. For video generation/editing, use generate_video (single shot) or generate_multishot_video (3-5 connected scenes).

MODES:
1. Text-to-image: pass promptText only. The model creates a new image from scratch. Example: "a corgi astronaut floating in a nebula, cinematic lighting".
2. Image edit / transform: pass the source image as referenceImages[0] (give it a tag like "input" or "cat") and describe the change in promptText. Example: referenceImages: [{url: ..., tag: "cat"}], promptText: "@cat on a beach at sunset". Use for: background swaps, style transfer, object additions/removals, restyling, photo retouching.
3. Composite from multiple references: pass 2+ referenceImages each with a distinct tag, then reference them by tag in the prompt. Example: referenceImages: [{url: ..., tag: "person"}, {url: ..., tag: "room"}], promptText: "@person standing in @room".

HANDLING USER ATTACHMENTS: If the user attached an image, run init_upload -> curl -> complete_upload first to get a Runway-hosted asset URL, then pass it as referenceImages[].url.

REUSING ASSETS: URLs returned by other Runway tools (image outputs from prior generate_image calls, complete_upload outputs) are stable, hosted asset URLs. Pass them directly into referenceImages[].url; never re-upload.

STYLE SAFETY: Do not include names of artists, directors, photographers, studios, or other living creators as style anchors. If the user asks for a named style, translate it into neutral visual descriptors such as camera language, palette, lighting, genre, era, composition, and material texture before calling the tool.

USER-FACING REPLIES: Pick the right model internally, but DO NOT mention the model name to the user (e.g. "using Nano Banana Pro", "with GPT Image 2") unless they explicitly ask which model was used. Talk about the image content (subject, style, composition), not the engine.

MODELS:

| model           | tier | best for                                                                                |
|-----------------|------|-----------------------------------------------------------------------------------------|
| nano-banana-pro | any  | DEFAULT. Photo-real images, character consistency, edits/composites with references.    |
| gpt-image-2     | paid | Readable text in images, charts/infographics, strict adherence to complex instructions. |
| gen-4           | any  | Fast, low-cost general-purpose Runway model.                                            |

Picking heuristic:
- nano-banana-pro (DEFAULT): photo-realistic images, edits, multi-image composites.
- gpt-image-2: pick when the image needs readable text, charts, infographics, or strict adherence to a long instruction list.
- gen-4: pick for speed, or when the user explicitly wants to continue with the current account settings.

ASPECT RATIOS:
- nano-banana-pro: auto, 21:9, 16:9, 4:3, 1:1, 3:4, 9:16
- gpt-image-2: 21:9, 16:9, 4:3, 1:1, 3:4, 9:16, plus 5:3, 3:5, 7:6, 6:7, 5:4, 4:5
- gen-4: 16:9, 9:16, 1:1, 4:3, 3:4, 21:9

Parameters:
- model: One of nano-banana-pro, gpt-image-2, gen-4. Default: nano-banana-pro.
- promptText: Required. Descriptive prompt. For edits, describe the change ("make the background neon"). For composites, reference tagged images ("@cat sitting on @couch").
- ratio: Aspect ratio. See per-model list above.
- count: Number of images to generate in one go. Options: 1, 2, 3, 4. Default: 1.
- referenceImages: Optional array of {url, tag}, for edits and composites. tag is the alias used in the prompt (e.g. tag: "cat" lets you write "@cat" in promptText). URLs come from prior Runway tool outputs, complete_upload, or any public https URL.
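
As a rough illustration of the rules above, the following sketch checks a generate_image request before it is sent. The helper `check_image_request` is hypothetical and not part of the tool; the ratio lists are copied from the listing, and treating every `@word` in the prompt as a tag mention is a simplifying assumption.

```python
import re
from typing import Dict, List, Optional

# Per-model aspect ratios as listed in the tool description.
RATIOS = {
    "nano-banana-pro": {"auto", "21:9", "16:9", "4:3", "1:1", "3:4", "9:16"},
    "gpt-image-2": {"21:9", "16:9", "4:3", "1:1", "3:4", "9:16",
                    "5:3", "3:5", "7:6", "6:7", "5:4", "4:5"},
    "gen-4": {"16:9", "9:16", "1:1", "4:3", "3:4", "21:9"},
}


def check_image_request(
    prompt_text: str,
    model: str = "nano-banana-pro",
    ratio: Optional[str] = None,
    count: int = 1,
    reference_images: Optional[List[Dict[str, str]]] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the request looks valid."""
    problems: List[str] = []
    if not prompt_text.strip():
        problems.append("promptText is required")
    if model not in RATIOS:
        problems.append(f"unknown model {model}")
    elif ratio is not None and ratio not in RATIOS[model]:
        problems.append(f"ratio {ratio} is not listed for {model}")
    if count not in (1, 2, 3, 4):
        problems.append("count must be 1, 2, 3, or 4")
    tags = [ref["tag"] for ref in reference_images or []]
    if len(set(tags)) != len(tags):
        problems.append("reference tags must be distinct")
    # Assumption: every @word in the prompt is meant to name a reference tag.
    for mention in re.findall(r"@(\w+)", prompt_text):
        if mention not in tags:
            problems.append(f"@{mention} has no matching reference tag")
    return problems


refs = [
    {"url": "https://example.com/person.png", "tag": "person"},
    {"url": "https://example.com/room.png", "tag": "room"},
]
print(check_image_request("@person standing in @room", reference_images=refs))
print(check_image_request("a sales chart", model="gen-4", ratio="5:3"))
```

The first call mirrors the composite example and reports no problems; the second fails because 5:3 is only listed for gpt-image-2.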

generate_multishot_video

ChatGPT
Generate a multi-shot video: 3 to 5 connected scenes from a single story or per-shot prompts. Powered by Kling 3.0 (standard at 720p, pro at 1080p). There is no model parameter; resolution selects the engine.

CREDIT MODE: Runway account availability is checked against the currently connected workspace.

Use this when the user wants a short narrative or sequence:
- "a 10-second mini-story about a cat finding a friend"
- "a multi-scene product ad: closeup, lifestyle, hero shot"
- "transition through 3 environments: forest -> desert -> beach"

For a single continuous shot, use generate_video instead. For editing/restyling an existing video, also use generate_video (with referenceVideo).

MODES:
1. auto (DEFAULT): Pass a single storyPrompt and the workflow plans the shots for you. Best when the user describes the story, not the individual scenes. Example: mode: "auto", storyPrompt: "a barista discovers their cafe has been frozen in time".
2. custom: Pass per-shot prompts as shots: [{prompt}, ...] (3-5 shots). Total duration is divided evenly across them. Best when the user has specific ideas for each scene. Example: mode: "custom", shots: [{prompt: "wide shot of cafe"}, {prompt: "barista looks up, surprised"}, {prompt: "close on a frozen droplet of coffee"}].

Optional firstSceneImage anchors the opening frame for either mode. Availability is checked against the current Runway workspace. Multi-shot tasks routinely take 5-10 minutes.

USER-FACING REPLIES: DO NOT mention the underlying engine (Kling 3.0 / Pro) to the user unless they explicitly ask which model was used. Talk about the multi-shot story (the scenes, transitions, and overall narrative), not the engine.

Parameters:
- mode: 'auto' (default) or 'custom'.
- storyPrompt: Required for mode='auto'. Full story / overall scene description.
- shots: Required for mode='custom'. Array of {prompt}, 3-5 entries.
- duration: Total seconds as a STRING (not a number): '5' | '10' (default) | '15'.
- aspectRatio: '16:9' (default) | '1:1' | '9:16'.
- resolution: '720p' (default, Kling 3.0 Standard) | '1080p' (Kling 3.0 Pro).
- sound: Boolean, generate audio (default true).
- firstSceneImage: Optional {url} to anchor the first frame. URL from generate_image, complete_upload, or any public https URL.
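
The mode and duration rules above can be sketched as a small argument builder. The helper names `build_multishot_args` and `seconds_per_shot` are hypothetical; the parameter names and allowed values are taken from the listing. Note that duration stays a string in the arguments and is only converted to a number for the per-shot arithmetic.

```python
from typing import Dict, List, Optional


def build_multishot_args(
    story_prompt: Optional[str] = None,
    shots: Optional[List[str]] = None,
    duration: str = "10",
    resolution: str = "720p",
) -> Dict[str, object]:
    """Assemble generate_multishot_video arguments for auto or custom mode."""
    if duration not in ("5", "10", "15"):
        raise ValueError("duration must be the string '5', '10', or '15'")
    if resolution not in ("720p", "1080p"):
        raise ValueError("resolution must be '720p' or '1080p'")
    common = {"duration": duration, "resolution": resolution}
    if shots is None:
        if not story_prompt:
            raise ValueError("storyPrompt is required for mode='auto'")
        return {"mode": "auto", "storyPrompt": story_prompt, **common}
    if not 3 <= len(shots) <= 5:
        raise ValueError("custom mode needs 3-5 shots")
    return {"mode": "custom", "shots": [{"prompt": p} for p in shots], **common}


def seconds_per_shot(args: Dict[str, object]) -> float:
    """Total duration divided evenly across the shots (custom mode only)."""
    return int(args["duration"]) / len(args["shots"])


custom = build_multishot_args(
    shots=["wide shot of cafe", "barista looks up, surprised",
           "close on a frozen droplet of coffee"],
    duration="15",
)
print(custom["mode"], seconds_per_shot(custom))
```

With three shots and a 15-second total, each shot gets 5 seconds.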

generate_product_marketing_video

ChatGPT
Generate a polished creative product ad video from a product URL or product image plus a campaign idea.

CREDIT MODE: Runway account availability is checked against the currently connected workspace.

Use this for product marketing / creative ad requests like:
- "Make a jewelry ad featuring a chameleon in a jewelry store"
- "Turn this product page into a cinematic ad"
- "Use this product image and this character reference to make a funny product commercial"

v1 supports only format: creative_ad_video.

Workflow:
1. Get the exact product image from productImages[0] or extract the best high-resolution product image from productUrl.
2. Use optional referenceImages as character, scene, mood, or world references.
3. Internally generate a 3x3 storyboard image, preferring GPT Image 2 for paid workspaces and falling back to Nano Banana Pro otherwise.
4. Use that storyboard as sequential shot guidance for Seedance 2, along with the product image and references, to create the final video.

Inputs:
- productUrl: Product landing page URL. The tool will scrape metadata and rank image candidates, preferring high-resolution product/gallery images.
- productImages: Product image URLs from uploads, recent assets, or previous generations. Preferred over productUrl extraction.
- productImageFile: ChatGPT-only uploaded product image file. Use this when ChatGPT provides a user-uploaded product image.
- referenceImages: Optional character/scene/mood/world references.
- promptText: Required creative campaign idea. If the user requests a named artist/director/photographer/studio style, describe the visual qualities instead and do not include the name.
- duration: 10 by default; use 15 only when the user asks for a longer ad.

STYLE SAFETY: Do not include names of artists, directors, photographers, studios, or other creators anywhere in prompts submitted to Runway tasks. When style direction is needed, translate it into camera language, palette, lighting, genre, era, composition, and material texture.

USER-FACING REPLIES: DO NOT mention internal model names unless asked. Do not mention the intermediate storyboard unless useful. Tell the user the creative ad video is generating and the viewer will update when ready.

generate_video

ChatGPT
Generate OR edit a video using a Runway-hosted video model. This is the only single-shot video tool; there is no separate "edit_video" tool. Pass the source video as referenceVideo to edit/restyle it.

CREDIT MODE: Runway account availability is checked against the currently connected workspace.

For creative product ad videos from a product URL or product image, use generate_product_marketing_video instead. It handles product-image extraction, cinematic storyboard generation, and the final ad video. For multi-shot videos (one prompt drives 3-5 connected scenes), use generate_multishot_video instead.

MODES:
1. Text-to-video: pass promptText only. Most models support this directly. gen-4-turbo auto-generates a starting frame first, then animates it. Example: "aerial drone shot, slow push forward over misty mountains at sunrise".
2. Image-to-video: pass startFrame.url (or ChatGPT's startImageFile) plus promptText to animate a still image. Use this when you want to bring a specific image to life. Example: a product photo + "slow 360 orbit around the bottle".
3. Video-to-video (edit/restyle): pass referenceVideo.url plus a promptText describing the desired change. Use this to modify an existing video while preserving motion and composition: remove or replace backgrounds, remove unwanted objects/people, add new objects, swap backgrounds, change lighting, restyle scene, alter weather/time-of-day. Only seedance-2 (DEFAULT for v2v) and kling-o3-pro support this mode. Example: an existing dance video + "remove the background and place the dancer on a clean white studio backdrop".

VIDEO EDITING REQUESTS:
- If the user asks to remove a background, remove an object/person, clean up a scene, add something to an existing video, or otherwise edit footage, DO NOT say Runway MCP cannot do it. Use video-to-video: pass the source clip as referenceVideo / referenceVideoFile, default to seedance-2, and describe the edit in promptText.
- For background removal, ask for or use the input video as referenceVideo and prompt for the desired replacement/background treatment, e.g. "remove the background and isolate the subject on a plain neutral studio backdrop" or "replace the background with a clean white seamless studio".

DURATION DEFAULTS (the tool sets these automatically when you OMIT duration):
- Text-to-video & image-to-video → 10s (or the model's largest supported value <= 10s; veo-3.1 → 8s).
- Video-to-video WITH referenceVideo.durationSeconds → largest model-supported value <= source (e.g. 20s source with kling-o3-pro {5,10,15} → 15s; 8s source with seedance-2 {4..15} → 8s).
- Video-to-video WITHOUT referenceVideo.durationSeconds → model's MAX supported duration (15s for seedance/kling) so the edit doesn't get accidentally truncated.

DURATION FOR V2V (IMPORTANT): Always pass referenceVideo.durationSeconds when you know it (read it from the metadata of the file you uploaded). It's the only way the tool can know how much of the source to edit.
- DO NOT pass duration: 5 for a 20-second source video; that will silently clip the edit to the first 5 seconds. Just OMIT duration and pass referenceVideo.durationSeconds instead, and the tool will pick the right output length.
- Only set duration explicitly when the user asks for a shorter clip ("just make me a 5-second teaser of this 20-second video").

DO NOT use this tool to edit a still image; use generate_image and pass the image as referenceImages[0].

HANDLING USER ATTACHMENTS:
- ChatGPT: if it supplies an uploaded image as startImageFile, pass it directly. This tool will download the temporary ChatGPT file server-side, upload it to Runway, and use it as startFrame.url.
- ChatGPT: if it supplies an uploaded video as referenceVideoFile, pass it directly for video-to-video edits. This tool will download the temporary ChatGPT file server-side, upload it to Runway, and use it as referenceVideo.url.
- ChatGPT: NEVER pass local paths such as /mnt/data/file.png or /mnt/data/file.mp4 as startFrame.url or referenceVideo.url; the remote MCP server cannot read them. Use startImageFile / referenceVideoFile instead.
- Claude/Cursor/local agents: if the user attached a local image or video, run init_upload -> curl -> complete_upload first to get a Runway-hosted asset URL, then pass it as startFrame.url, endFrame.url, referenceImages[].url, or referenceVideo.url.

REUSING ASSETS: URLs returned by other Runway tools (image outputs from generate_image, the asset URL from complete_upload) are stable, hosted asset URLs. Pass them directly into the relevant field; never re-upload. Note: outputs of generate_video itself are NOT yet usable as referenceVideo (re-upload via init_upload if needed).

USER-FACING REPLIES: Pick the right model internally, but DO NOT mention the model name to the user (e.g. "using Veo 3.1", "with Seedance 2") unless they explicitly ask which model was used. Talk about the video content (subject, motion, mood), not the engine.

MODELS (* = auto-generates starting frame from text):

| model        | tier | t2v | i2v | v2v | end | refs | audio | durations (s) |
|--------------|------|-----|-----|-----|-----|------|-------|---------------|
| seedance-2   | paid | Y   | Y   | Y   | Y   | Y    | Y     | 4-15          |
| kling-o3-pro | paid | Y   | Y   | Y   | Y   | Y    | Y     | 5, 10, 15     |
| kling-3-pro  | paid | Y   | Y   | -   | Y   | -    | Y     | 5, 10, 15     |
| gen-4.5      | paid | Y   | Y   | -   | -   | -    | Y*    | 2-10          |
| veo-3.1      | paid | Y   | Y   | -   | Y   | -    | Y     | 4, 6, 8       |
| gen-4-turbo  | any  | *   | Y   | -   | -   | -    | -     | 5, 10         |

Legend: t2v = text-to-video, i2v = image-to-video (startFrame), v2v = video-to-video (referenceVideo, edit/restyle existing video), end = end-frame target, refs = reference images. * for gen-4-turbo means it auto-generates a starting frame from text. * for gen-4.5 means audio is t2v-only.

Picking heuristic:
- seedance-2 (DEFAULT for t2v/i2v/v2v): best general-purpose, including existing-footage edits like removing/replacing backgrounds, removing objects/people, and adding new elements. Pick this unless you have a reason not to.
- kling-o3-pro: pick when consistency matters: character/product must look identical across shots, brand or serialized content, or when v2v needs to preserve identity. Best for ads, episodic content, and product storytelling. Optimized for 1-2 main subjects.
- kling-3-pro: pick when prompt fidelity matters more than reference-driven consistency: complex multi-character scenes (3+ people), crowded environments, experimental scripts, rapid ideation without reference assets. (No v2v support.)
- veo-3.1: pick when photorealism and natively-generated audio matter (Veo invents dialogue spoken by characters in-frame, plus ambient sound matched to the scene). (No v2v support.)
- gen-4.5: pick for keyframe-driven storytelling (provide a startFrame and the model anchors at timestamp 0). (No v2v support.)
- gen-4-turbo: free-plan accessible option. Use it when the user explicitly wants to continue with the current account settings. (No v2v support.)

Parameters:
- model: One of seedance-2, kling-o3-pro, kling-3-pro, gen-4.5, veo-3.1, gen-4-turbo. Default for t2v/i2v: seedance-2. Default for v2v (when referenceVideo is set): seedance-2. Free-plan accessible option: gen-4-turbo.
- promptText: Required. For t2v/i2v, describe the motion/style. For v2v, describe the edit ("remove the background", "remove the person in the left background", "add a floating logo above the table", "make it nighttime", "snow falling", "cyberpunk neon").
- ratio: Aspect ratio (16:9, 9:16, 1:1, plus 4:3, 3:4, 21:9 on some models). Ignored for v2v; the model preserves the source video's ratio.
- duration: Seconds (model-dependent; see table). USUALLY OMIT: defaults to 10s for t2v/i2v, and to the largest model-supported value <= source for v2v. See DURATION DEFAULTS above. Only set explicitly when the user asks for a non-default length.
- startFrame: {url}, i2v only. Animate this image. URL must be public HTTPS or a Runway-hosted asset URL. Do NOT use ChatGPT local paths like /mnt/data/...; use startImageFile for ChatGPT uploads.
- startImageFile: ChatGPT-only uploaded image file param {download_url, file_id, mime_type, file_name}. Use this instead of startFrame only when ChatGPT provides a user-uploaded file.
- endFrame: {url}, i2v only. Final-frame target. Requires startFrame. (seedance-2, kling-o3-pro, kling-3-pro, veo-3.1.)
- referenceImages: Array of {url, tag}, i2v style/subject anchors. (seedance-2, kling-o3-pro only.)
- referenceVideo: {url, durationSeconds?}, the v2v source video for editing/restyling existing footage, including removing/replacing backgrounds, removing objects/people, and adding elements. URL must be a Runway-hosted /datasets/<UUID>.<ext> URL from complete_upload. Do NOT use ChatGPT local paths like /mnt/data/...; use referenceVideoFile for ChatGPT uploads. Pass durationSeconds from your file metadata when known so the model picks an appropriate output length.
- referenceVideoFile: ChatGPT-only uploaded video file param {download_url, file_id, mime_type, file_name}. Use this instead of referenceVideo only when ChatGPT provides a user-uploaded video.
- generateAudio: Whether to generate native audio (model-dependent; default true for audio-capable models). For v2v with kling-o3-pro, controls whether to keep the original audio.
- resolution: Override default resolution (seedance: 480p/720p/1080p; veo: 720p/1080p; 1080p requires 8s).
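
The duration defaults described above can be reproduced with a short lookup, which is handy for predicting what the tool will pick when duration is omitted. This is a sketch of the stated rules, not the server's code: `default_duration` is a hypothetical helper, the ranges 4-15 and 2-10 are assumed to mean whole seconds, and the fallback for a source shorter than the model's minimum is an assumption because the listing does not cover that case.

```python
from typing import Optional

# Supported durations per model, in seconds, from the MODELS table.
# Assumption: ranges such as 4-15 mean every whole second in the range.
DURATIONS = {
    "seedance-2": list(range(4, 16)),
    "kling-o3-pro": [5, 10, 15],
    "kling-3-pro": [5, 10, 15],
    "gen-4.5": list(range(2, 11)),
    "veo-3.1": [4, 6, 8],
    "gen-4-turbo": [5, 10],
}
V2V_MODELS = {"seedance-2", "kling-o3-pro"}


def default_duration(
    model: str,
    is_v2v: bool = False,
    source_seconds: Optional[float] = None,
) -> int:
    """Predict the output length chosen when duration is omitted."""
    supported = DURATIONS[model]
    if not is_v2v:
        # t2v / i2v: 10s, or the largest supported value <= 10s.
        return max(d for d in supported if d <= 10)
    if model not in V2V_MODELS:
        raise ValueError(f"{model} does not support video-to-video")
    if source_seconds is None:
        # Unknown source length: use the maximum so the edit is not truncated.
        return max(supported)
    fitting = [d for d in supported if d <= source_seconds]
    # Assumption: if the source is shorter than every supported value,
    # fall back to the shortest one. The listing does not specify this.
    return max(fitting) if fitting else min(supported)


print(default_duration("kling-o3-pro", is_v2v=True, source_seconds=20))
print(default_duration("seedance-2", is_v2v=True, source_seconds=8))
print(default_duration("veo-3.1"))
```

The three calls match the worked examples in the listing: 15, 8, and 8 seconds respectively.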

get_task

ChatGPT
Gets details for a Runway task by ID. Use it to check status and retrieve the result of a generation/edit task once it completes. Generation tools (generate_image, generate_video, generate_multishot_video) return immediately with a task_pending status and a task ID; call this tool to poll for completion. Suggested cadence: wait 30-60s for image tasks, 60-120s for video tasks, then call once. If status is still PENDING/RUNNING/THROTTLED, wait another 30-60s and try again. When status is SUCCEEDED the response includes the asset URL and renders the result inline.
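
The suggested polling cadence can be sketched as a loop around whatever function actually calls get_task. Everything client-side here is an assumption: `wait_for_task` is a hypothetical helper, the response is assumed to be a dict with a `status` key, and the `max_polls` cap is an added safeguard the listing does not mention. The sleep function is injectable so the demo runs instantly.

```python
import time
from typing import Callable, Dict

# Statuses the listing names as "still in progress".
IN_PROGRESS = {"PENDING", "RUNNING", "THROTTLED"}


def wait_for_task(
    get_task: Callable[[str], Dict[str, str]],
    task_id: str,
    first_wait: float = 60.0,   # 30-60s for images, 60-120s for video
    retry_wait: float = 30.0,   # 30-60s between later checks
    max_polls: int = 20,        # assumption: the listing sets no upper bound
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Wait, poll once, and keep polling while the task is in progress."""
    sleep(first_wait)
    for _ in range(max_polls):
        task = get_task(task_id)
        if task["status"] not in IN_PROGRESS:
            return task  # SUCCEEDED or any other terminal status
        sleep(retry_wait)
    raise TimeoutError(f"task {task_id} still in progress after {max_polls} polls")


# Demo with a fake get_task and a no-op sleep, so nothing actually waits.
fake_responses = iter([
    {"status": "RUNNING"},
    {"status": "SUCCEEDED", "url": "https://example.com/result.mp4"},
])
result = wait_for_task(lambda _id: next(fake_responses), "task-1", sleep=lambda s: None)
print(result["status"])
```

Returning on any status outside the in-progress set means a failed task ends the loop as well, leaving the caller to inspect the status.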

init_upload

ChatGPT
Initialize a file upload to Runway. Returns presigned URLs for direct upload.

USE THIS ONLY WHEN: The user has attached a NEW local file (image or video) that isn't already on Runway.

CHATGPT: If ChatGPT supplies the uploaded file as file, pass it directly. This tool will download the temporary ChatGPT file server-side, upload it to Runway, complete the upload, and return the final asset URL. Do not ask the user to run curl.

CHATGPT: Do NOT call this tool with only filename, fileSize, and mimeType; that only returns curl instructions and cannot upload bytes from ChatGPT. For image-to-video, call generate_video with startImageFile. For video-to-video, call generate_video with referenceVideoFile.

DO NOT USE THIS FOR:
- URLs returned by generate_image, generate_video, or generate_multishot_video. Those URLs are already stable Runway-hosted assets; pass them directly as startFrame.url / referenceImages[].url / referenceVideo.url. Re-uploading wastes a turn and creates a duplicate asset.
- Any https:// URL the user pasted that is publicly fetchable. Pass it through directly.

CLAUDE/CURSOR/LOCAL WORKFLOW:
1. Get the file size and MIME type using bash:

```bash
wc -c < "/absolute/path/to/your/file" | tr -d ' '
file --mime-type -b "/absolute/path/to/your/file"
```

2. Call this tool with the filename, fileSize, and mimeType.
3. Use bash/curl to upload the file to the presigned URL(s). The curl commands this tool returns include -D - so the response headers (including ETag) are written to stdout.
4. Capture the ETag header from each PUT response (it looks like etag: "abc123..."; strip the surrounding quotes).
5. Call complete_upload with the uploadId AND the parts array. parts is required even for single-part uploads.
6. Use the returned assetUrl as startFrame.url / referenceImages[].url / referenceVideo.url in the generation tools.

Parameters:
- file: ChatGPT-only uploaded file param {download_url, file_id, mime_type, file_name}. If present, the server completes the upload and returns an asset URL directly.
- filename: Name for the uploaded file (e.g., "video.mp4")
- fileSize: Size in bytes (from wc -c)
- mimeType: MIME type (from file --mime-type). Supported: image/jpeg, image/png, image/webp, image/gif, video/mp4, video/quicktime, video/webm
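
For agents that script step 1 rather than shelling out, the same three values can be gathered in Python. This is a sketch under assumptions: `describe_upload` is a hypothetical helper, and `mimetypes.guess_type` infers the type from the file extension, whereas `file --mime-type` inspects the file's contents, so the two can disagree for misnamed files.

```python
import mimetypes
import os
import tempfile
from typing import Dict, Union

# MIME types the listing names as supported.
SUPPORTED = {
    "image/jpeg", "image/png", "image/webp", "image/gif",
    "video/mp4", "video/quicktime", "video/webm",
}


def describe_upload(path: str) -> Dict[str, Union[str, int]]:
    """Collect filename, fileSize, and mimeType for an init_upload call."""
    mime_type, _encoding = mimetypes.guess_type(path)  # extension-based guess
    if mime_type not in SUPPORTED:
        raise ValueError(f"unsupported or unknown MIME type: {mime_type}")
    return {
        "filename": os.path.basename(path),
        "fileSize": os.path.getsize(path),  # size in bytes, like wc -c
        "mimeType": mime_type,
    }


# Demo on a throwaway file; the bytes are placeholders, not a real PNG.
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "frame.png")
    with open(sample, "wb") as handle:
        handle.write(b"12345")
    print(describe_upload(sample))
```

Rejecting unsupported types locally avoids a wasted init_upload call for files such as PDFs or audio.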

list_recent

ChatGPT
Lists recent uploaded and generated assets for the authenticated workspace. Returns asset IDs, media types, task IDs when available, and reusable asset URLs. Use this to help the user pick recent images or videos as references.

list_workspaces

ChatGPT
Lists every Runway workspace the authenticated user belongs to, with role and disabled-model list per workspace. The MCP connection is pinned to ONE workspace (chosen at sign-in); use this tool to confirm which workspace you are currently posting to and to compare options before suggesting a switch. To switch workspaces, the user must disconnect and reconnect the Runway MCP from their client (this is intentional — it keeps account context and audit logs aligned to a single workspace per session).

whoami

ChatGPT
Returns the authenticated Runway user profile, the workspace this MCP connection is pinned to (chosen at sign-in), and the list of image/video models available in that workspace. Call this first to confirm auth and to pick a model the user can actually run. Runway account availability is checked against the currently connected workspace. If a model is unavailable in the connected workspace, the tool returns a neutral account-limitation message. If multipleWorkspacesAvailable is true, the user belongs to other Runway workspaces — call list_workspaces to see them. To switch the active workspace, the user must disconnect and reconnect the MCP from their client.

Capabilities

Writes, Interactive

App Stats

Tools: 11

Platforms: ChatGPT

Works with

ChatGPT

Data refreshed daily