Motion Forcing: A Decoupled Framework for Robust Video Generation in Motion Dynamics

Tianshuo Xu1, Zhifei Chen1, Leyi Wu1, Hao Lu1, Ying-cong Chen1,2*

1HKUST (GZ)    2HKUST    *Corresponding Author


Given a single image and user-drawn motion trajectories, Motion Forcing generates physically consistent videos through a hierarchical Point → Shape → Appearance pipeline — decoupling 3D geometry reasoning from texture synthesis to handle complex dynamics such as collisions, cut-ins, and multi-object interactions.

Pipeline Illustration

(a) Preparation of Motion Representations

Input Image → Depth & Seg Models → Static Depth + Ego Motion → Depth Warping Video
Input Image → Depth & Seg Models → Point Image + Object Motion → Point Video

Static depth = depth × ~seg (dynamic objects removed); point image = minimum inscribed circles of the object masks.
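The two representations above can be sketched in a few lines. This is a minimal NumPy illustration, not the paper's released code: the function names and the brute-force inscribed-circle search are our assumptions (a distance transform, e.g. `cv2.distanceTransform`, would be the efficient equivalent).

```python
import numpy as np

def static_depth(depth, seg):
    """Static depth = depth x ~seg: zero out pixels of dynamic objects."""
    return depth * ~seg.astype(bool)

def inscribed_circle(mask):
    """Largest inscribed circle of a binary object mask.

    Brute force for illustration: the center is the inside pixel
    farthest from any outside pixel; the radius is that distance.
    """
    inside = np.argwhere(mask)
    outside = np.argwhere(~mask)
    # Distance from every inside pixel to its nearest outside pixel.
    d = np.sqrt(((inside[:, None, :] - outside[None, :, :]) ** 2).sum(-1)).min(1)
    i = d.argmax()
    return tuple(inside[i]), float(d[i])
```

For a 5×5 square mask centered in a 7×7 grid, this recovers the square's center and a radius of 3 pixels; the circles for all object masks are then rasterized into the point image.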

(b) Two-Stage Generation

Input Image + Depth Warping Video + Point Video → Stage 1: Motion Forcing (Point → Shape) → Depth Video
Depth Video → Stage 2: Motion Forcing (Shape → Appearance) → RGB Video
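The two stages compose as a simple sequential pipeline. A hedged skeleton of the inference flow, with placeholder model callables (the real conditioning interface, including whether Stage 2 also sees the input image, is our assumption):

```python
def generate(image, depth_warping_video, point_video,
             stage1_model, stage2_model):
    """Two-stage Point -> Shape -> Appearance inference sketch.

    `stage1_model` and `stage2_model` are placeholders for the two
    Motion Forcing stages; their signatures are illustrative only.
    """
    # Stage 1 (Point -> Shape): reason about 3D geometry, predicting a
    # depth video from sparse point trajectories and ego-motion warping.
    depth_video = stage1_model(image, depth_warping_video, point_video)
    # Stage 2 (Shape -> Appearance): synthesize RGB frames conditioned
    # on the predicted geometry (image conditioning assumed here).
    rgb_video = stage2_model(image, depth_video)
    return rgb_video
```

Decoupling the stages this way lets geometry errors be corrected before texture synthesis, which is the core claim of the framework.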

Comparisons — Driving

Case 1

Condition
Ours
Wan 2.6
Seed Dance 2.0
MOFA-Video

Case 2

Condition
Ours
Wan 2.6
Seed Dance 2.0
MOFA-Video

More Driving Scenes

Scene 1

Dangerous Cut-in
Double Cut-in
Left Cut-in & Brake
Right Cut-in

Scene 2 & 3

Front Car Braking
Dangerous Right Cut-out
Reverse Car Left Cut-in
Right Cut-in

Ego-Motion Control

The same scene with different ego-vehicle trajectories (speed up / slow down / turn left / turn right).

Speed Up
Slow Down
Turn Left
Turn Right

Comparisons — Physics (Physion)

Case 1

Condition
Ours
Wan 2.6
Seed Dance 2.0
MOFA-Video

Case 2

Condition
Ours
Wan 2.6
Seed Dance 2.0
MOFA-Video

More Physics Actions

Action 2 — Condition
Action 2 — Generated
Action 4 — Condition
Action 4 — Generated

Embodied AI (Jaco Play)

Case 1 · Action 1
Case 1 · Action 2
Case 2 · Action 1
Case 2 · Action 2
Case 3 · Action 1
Case 3 · Action 2
Case 4 · Action 1
Case 4 · Action 2

Failure Cases

When the control signals deviate significantly from realistic scenarios, the model can still produce physically implausible results.

Case 1 — Condition
Case 1 — Result
Case 2 — Condition
Case 2 — Result

Acknowledgements

We thank the authors of CogVideoX, Video-Depth-Anything, VGGT, and Ultralytics YOLO for their outstanding open-source contributions.

Citation

@misc{xu2026motion,
      title={Motion Forcing: A Decoupled Framework for Robust Video Generation in Motion Dynamics}, 
      author={Tianshuo Xu and Zhifei Chen and Leyi Wu and Hao Lu and Ying-cong Chen},
      year={2026},
      eprint={2603.10408},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2603.10408}, 
}