Runway

Also known as: RunwayML, Runway AI, Runway Gen models

Runway: Runway is an AI video generation and editing platform that turns text or image prompts into video, and lets users edit existing footage — adding, removing, or transforming objects, changing camera angles, and transferring lip sync and body motion onto a different character.

What It Is

A marketer with a thirty-second product clip used to need a video editor and an hour of work just to swap a background or remove a stray object from one shot. Runway turns that into a text instruction: describe the change, and the platform edits the actual footage instead of generating something new from scratch. That is what separates Runway from most of the video generators that show up in an AI video editing pitch: it doesn’t just make video, it edits the video already on someone’s hard drive.

Runway is built around three connected capabilities rather than one model. According to Runway Research, the platform currently runs Gen-4.5 for generating new clips from a text or image prompt — the part most people picture when they hear “AI video generator” — and Aleph 2.0 for in-context editing: feed it an existing clip plus a prompt, and it can add, remove, or transform an object, change the camera angle, or restyle the lighting. Act-Two is a separate performance-capture tool — it reads motion, facial expression, and lip movement from a driving video and transfers that performance onto a different character, for lip sync and full-body animation.

Think of in-context editing as a photo editor’s selection tool extended across time: instead of selecting a region in one image, the model tracks an object or attribute through every frame and applies the edit consistently, without the flicker that plagued earlier attempts at video editing. According to Runway Research, Aleph treats editing as a conditioning problem — it generates new frames while staying anchored to the original footage’s structure, rather than starting from random noise the way a pure text-to-video generator does. That anchoring keeps the shot temporally consistent: parts nobody asked to change look the same in the last frame as the first.
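As a toy illustration of the anchoring idea only, the sketch below treats a clip as a list of tiny grayscale frames and writes new content only where an edit mask is set. It is not Aleph's architecture, which the source does not detail: the names `composite_edit`, `mask`, and `fill` are hypothetical, and a fixed hand-written mask stands in for whatever learned conditioning the real model uses.

```python
from typing import List

Frame = List[List[int]]  # a tiny grayscale frame: rows of pixel values


def composite_edit(clip: List[Frame], mask: Frame, fill: int) -> List[Frame]:
    """Return a new clip where masked pixels take `fill` and all others are copied.

    Toy stand-in for conditioning on source footage: unmasked pixels stay
    anchored to the original frames instead of being regenerated.
    """
    edited = []
    for frame in clip:
        new_frame = [
            [fill if mask[r][c] else pixel for c, pixel in enumerate(row)]
            for r, row in enumerate(frame)
        ]
        edited.append(new_frame)
    return edited


# Three 2x3 frames; the mask marks the top-right pixel as the region to change.
clip = [
    [[10, 10, 99], [20, 20, 20]],
    [[11, 11, 98], [21, 21, 21]],
    [[12, 12, 97], [22, 22, 22]],
]
mask = [[0, 0, 1], [0, 0, 0]]

edited = composite_edit(clip, mask, fill=0)

# Check that every unmasked pixel is identical to the source in every frame.
unchanged = all(
    edited[t][r][c] == clip[t][r][c]
    for t in range(len(clip))
    for r in range(2)
    for c in range(3)
    if not mask[r][c]
)
```

A real editing model works from a text prompt rather than a hand-drawn mask, so this only shows why staying anchored to the source keeps the untouched regions stable from the first frame to the last.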

How It’s Used in Practice

Most people who run into Runway aren’t generating video from nothing — they’re editing a clip that already exists. A marketer touching up a product video opens an existing cut, points to the few seconds with the issue, and types a prompt: remove the cardboard box in the corner of frame, or change the studio backdrop from white to gradient blue. Aleph applies that change across every affected frame and leaves the rest of the shot untouched — the scenario that drives most day-to-day use.

A second, more specialized use case is localization: a company with a talking-head video in one language feeds it through Act-Two alongside a translated audio track, and the tool adjusts lip movement to match without a reshoot. Generating a brand-new clip with Gen-4.5 — the use case most people associate with “AI video” — is actually the less common entry point here.

Pro Tip: Keep one edit per prompt. “Remove the parked car” produces a far more reliable result than “remove the car, add rain, and make it night” in a single pass — each added instruction increases the chance the model drifts from the original footage instead of preserving it.
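One way to follow the tip in a scripted workflow is to run each instruction as its own pass and feed the result into the next one. The sketch below is a minimal illustration under stated assumptions: `apply_sequentially` and `fake_edit` are hypothetical names, `fake_edit` is a placeholder that only records the instruction rather than calling any real Runway endpoint, and the compound-prompt check is a crude heuristic.

```python
from typing import Callable, List

Clip = List[str]  # stand-in for footage: here, just a log of applied edits


def apply_sequentially(
    clip: Clip,
    instructions: List[str],
    edit_fn: Callable[[Clip, str], Clip],
) -> Clip:
    """Apply one instruction per pass, feeding each result into the next pass."""
    for instruction in instructions:
        if "," in instruction or " and " in instruction:
            # Crude heuristic: flag prompts that look like several edits at once.
            raise ValueError(f"Split this into separate passes: {instruction!r}")
        clip = edit_fn(clip, instruction)
    return clip


def fake_edit(clip: Clip, instruction: str) -> Clip:
    """Placeholder for a real editing call; it only records the instruction."""
    return clip + [instruction]


result = apply_sequentially(
    [],
    ["remove the parked car", "add rain", "make it night"],
    fake_edit,
)
```

Whether chained passes beat a single compound prompt in practice depends on the tool and the footage, so it is worth reviewing the output after each pass rather than only at the end.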

When to Use / When Not

Scenario	Use	Avoid
Editing a few seconds of existing footage (remove or add an object, change lighting)	✅
Need pixel-perfect, frame-by-frame manual control over an edit		❌
Generating a short concept clip from a text description for a pitch deck	✅
Producing long-form video with a continuous multi-minute narrative		❌
Localizing a talking-head video into another language via lip sync	✅
Replacing a professional colorist’s full grading pass on a feature film		❌

Common Misconception

Myth: Runway only generates new video clips from a prompt, the same as other text-to-video tools.

Reality: Its more distinctive capability is editing footage that already exists. Aleph applies a change to real footage and keeps everything else as it was, which is a different problem from generating a clip out of nothing — and one most text-to-video-only tools don’t solve.

One Sentence to Remember

Runway is less a single video generator and more a small toolkit — one model for creating clips from scratch, one for editing footage that already exists, and one for transferring a performance onto a different character — so the first question is which of those three problems needs solving.

FAQ

Q: Is Runway free to use?
A: Runway offers a free tier with a limited one-time credit allowance, plus paid subscription plans with larger monthly credit allowances for heavier generation and editing use.

Q: What is Runway Aleph used for?
A: Aleph is Runway’s in-context video editing model: it adds, removes, or transforms objects, changes camera angles, and adjusts style or lighting directly on existing footage instead of generating a new clip.

Q: Can Runway do lip sync?
A: Yes. Act-Two, Runway’s performance-capture tool, transfers facial expression, body motion, and lip sync from a driving video onto a different character or reference image.

Sources

Runway Research: Introducing Runway Aleph - Runway’s announcement explaining Aleph’s in-context editing approach.
Runway Help Center: Performance Capture with Act-Two - Official documentation on Act-Two’s performance capture.

Expert Takes

MONA

In-context editing is a conditioning problem, not a generation problem. A text-to-video model predicts frames from noise; an editing model like Aleph predicts frames while anchored to the structure of footage that already exists. That constraint is what keeps the unedited parts of a clip stable across frames — the technical challenge isn’t making something look real, it’s making the edit agree with everything around it that wasn’t supposed to change.

MAX

Treat the edit prompt like a specification, not a wish. Vague instructions — “make it better,” “fix the background” — produce vague edits, because the model has nothing precise to condition on. Specify the object, the change, and what should stay untouched, and the result is something you can actually ship. Teams that get consistent output from tools like this write prompts the way they’d write a ticket: one clear change, one clear scope.
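The "prompt as a ticket" idea can be sketched as a small data structure that forces the three parts to be written down before a prompt is produced. This is a hypothetical helper, not part of any Runway tooling: `EditSpec`, `to_prompt`, and `VAGUE_PHRASES` are illustrative names, and the output wording is one possible phrasing, not a format the platform requires.

```python
from dataclasses import dataclass, field
from typing import List

# The two vague examples quoted in the paragraph above; extend as needed.
VAGUE_PHRASES = ("make it better", "fix the background")


@dataclass(frozen=True)
class EditSpec:
    target: str  # the object or region to change
    change: str  # what should happen to it
    preserve: List[str] = field(default_factory=list)  # what must stay untouched

    def to_prompt(self) -> str:
        """Render the spec as a single-change prompt string."""
        if not self.target.strip() or not self.change.strip():
            raise ValueError("A spec needs both a target and a change.")
        prompt = f"{self.change} {self.target}".strip()
        if prompt.lower() in VAGUE_PHRASES:
            raise ValueError(f"Too vague to condition on: {prompt!r}")
        if self.preserve:
            prompt += ". Keep unchanged: " + ", ".join(self.preserve)
        return prompt + "."


spec = EditSpec(
    target="the cardboard box in the corner of frame",
    change="Remove",
    preserve=["the product", "the lighting"],
)
prompt = spec.to_prompt()
```

Keeping one `EditSpec` per change also makes it easy to log which instruction produced which version of a clip.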

DAN

Performance capture and in-context editing are quietly replacing entire steps of the traditional video production pipeline — the reshoot, the dubbing booth, the rotoscoping pass. That’s not a feature update. That’s a shift in who needs a video production budget at all. The platforms that own both the generation step and the editing step are the ones positioned to own the workflow end to end, not just one stage of it.

ALAN

Transferring someone’s expression and lip movement onto another character is a capability, not a constraint, and capabilities get used however the market wants once they exist. Who consented to a performance being lifted off them and applied to a face that isn’t theirs? The tool doesn’t ask, and most policy work on AI video editing is still drafting an answer to a question the technology launched without settling.
