Matrix Dataset Document
The Matrix Dataset
The Matrix dataset was first introduced by the Matrix team in the paper “The Matrix: Infinite-Horizon World Generation with Real-Time Moving Control.” This dataset is specifically designed for training world models and comprises millions of video sequences accompanied by corresponding control signals.
Visual Content:
- Forza Horizon 5: 937,900 video-control signal pairs.
  Data format: 60 FPS, 2560×1600, 4-6 s with control signals
  - Stage 1: 675,193 clips (60 FPS), avg. duration 12.8 s, no control signals
  - Stage 2: 208,933 clips (60 FPS), avg. duration 6.09 s, one control signal per second
    Avg. control-label distribution: D 50.94%, DR 24.67%, DL 24.38%
    Avg. signal-change events per clip: 3.04
  - Stage 3: 53,774 clips (60 FPS), avg. duration 6.07 s, one control signal per second
    Avg. control-label distribution: D 51.25%, DR 24.38%, DL 24.37%
    Avg. signal-change events per clip: 3.02
  Scene: Driving across woods, grass, sea, field, river, others (ratio: 12%:15%:18%:16%:15%:9%:15%)
- Cyberpunk 2077: 300k video-control signal pairs (coming soon).
  Data format: 60 FPS, 2560×1600, 4-6 s with control signals
  Scene: Dense urban environments with skyscrapers, including day-night cycles and indoor-outdoor scenes (ratio: 1:3).
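
The Forza Horizon 5 figures listed above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of any official dataset tooling: the `Stage` class and all function names are hypothetical, and the total video duration is only an estimate derived from the reported per-stage averages. The `count_signal_changes` helper assumes that a "signal-change event" means a transition between two consecutive per-second control labels (such as D, DR, DL); the source does not spell out that definition.

```python
from dataclasses import dataclass
from typing import Sequence

FPS = 60                  # frame rate stated for all clips
REPORTED_TOTAL = 937_900  # total reported for Forza Horizon 5


@dataclass(frozen=True)
class Stage:
    """Hypothetical container for the per-stage figures listed above."""
    name: str
    clips: int
    avg_duration_s: float
    has_control: bool


STAGES = [
    Stage("Stage 1", 675_193, 12.8, False),
    Stage("Stage 2", 208_933, 6.09, True),
    Stage("Stage 3", 53_774, 6.07, True),
]


def total_clips(stages: Sequence[Stage]) -> int:
    """Sum the clip counts over all stages."""
    return sum(s.clips for s in stages)


def total_hours(stages: Sequence[Stage]) -> float:
    """Estimate total footage in hours from clip counts and average durations."""
    seconds = sum(s.clips * s.avg_duration_s for s in stages)
    return seconds / 3600


def avg_frames_per_clip(stage: Stage, fps: int = FPS) -> float:
    """Average number of frames in a clip of the given stage."""
    return stage.avg_duration_s * fps


def count_signal_changes(labels: Sequence[str]) -> int:
    """Count transitions between consecutive labels (assumed definition)."""
    return sum(1 for prev, cur in zip(labels, labels[1:]) if prev != cur)


if __name__ == "__main__":
    print("clips match reported total:", total_clips(STAGES) == REPORTED_TOTAL)
    print(f"estimated footage: {total_hours(STAGES):.1f} h")
    for stage in STAGES:
        print(f"{stage.name}: ~{avg_frames_per_clip(stage):.0f} frames per clip")
    # A made-up 6-second label sequence, one label per second.
    print("changes:", count_signal_changes(["D", "D", "DR", "DR", "DL", "D"]))
```

Under these assumptions, the three stage counts add up exactly to the reported 937,900, and a 6-second Stage 2 or Stage 3 clip with about three signal changes switches control label roughly every two seconds.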

Citations and publications
@misc{feng2024matrixinfinitehorizonworldgeneration,
  title={The Matrix: Infinite-Horizon World Generation with Real-Time Moving Control},
  author={Ruili Feng and Han Zhang and Zhantao Yang and Jie Xiao and Zhilei Shu and Zhiheng Liu and Andy Zheng and Yukun Huang and Yu Liu and Hongyang Zhang},
  year={2024},
  eprint={2412.03568},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2412.03568},
}