BadWorld: Adversarial Attacks on World Models
We introduce BadWorld, a label-free adversarial attack on visual world models (VWMs).
Starting from a single perturbed context image, BadWorld reliably causes future rollouts to break down across unseen user controls, revealing severe robustness risks in current VWMs.

BadWorld attacks the world model without requiring any ground-truth future video or predefined correct rollout.
We introduce four self-supervised objectives: (1) Velocity-Max, amplifying denoising updates; (2) Velocity-Min, suppressing denoising updates; (3) Drift-Max, encouraging semantic drift; and (4) Drift-Min, inducing motion collapse. For all objectives, we approximate the history using repeated clean contexts and constrain the attack to early denoising timesteps.
Among them, we choose Velocity-Min as the final objective, since it consistently achieves strong attack effectiveness on both Matrix-Game-2.0 and Astra.
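
To make the Velocity-Min idea concrete, here is a minimal sketch under loud assumptions. It is not the authors' implementation: the real denoiser is replaced by a hypothetical linear map `velocity(W, b, x, t)`, the "early" timesteps are simply whichever values the caller passes in, the repeated-clean-context history is omitted, and the step sizes `eps` and `alpha` are illustrative. The sketch runs projected sign-gradient descent on the context image to shrink the squared norm of the predicted update.

```python
import numpy as np

def velocity(W, b, x, t):
    """Hypothetical linear stand-in for the denoiser's predicted update at timestep t."""
    return W @ x + b * t

def velocity_min_loss_and_grad(W, b, x_adv, timesteps):
    """Mean squared velocity norm over the timesteps, and its gradient w.r.t. x_adv."""
    loss = 0.0
    grad = np.zeros_like(x_adv)
    for t in timesteps:
        v = velocity(W, b, x_adv, t)
        loss += float(v @ v)
        grad += 2.0 * (W.T @ v)  # analytic gradient of ||W x + b t||^2
    return loss / len(timesteps), grad / len(timesteps)

def pgd_velocity_min(W, b, x_clean, timesteps, eps=8 / 255, alpha=1 / 255, steps=50):
    """Search for an L-infinity bounded perturbation that suppresses the toy velocity."""
    delta = np.zeros_like(x_clean)
    best_delta = delta.copy()
    best_loss, _ = velocity_min_loss_and_grad(W, b, x_clean, timesteps)
    for _ in range(steps):
        _, grad = velocity_min_loss_and_grad(W, b, x_clean + delta, timesteps)
        # Signed descent step, then projection onto the eps-ball.
        delta = np.clip(delta - alpha * np.sign(grad), -eps, eps)
        # Keep the perturbed image inside the valid pixel range [0, 1].
        delta = np.clip(x_clean + delta, 0.0, 1.0) - x_clean
        loss, _ = velocity_min_loss_and_grad(W, b, x_clean + delta, timesteps)
        if loss < best_loss:  # remember the most effective perturbation seen so far
            best_loss, best_delta = loss, delta.copy()
    return best_delta
```

With a real model, the analytic gradient would be replaced by automatic differentiation through the denoiser. Flipping the sign of the update step would give a Velocity-Max variant of the same sketch.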
| Clean | Velocity-Max | Velocity-Min ✓ | Drift-Max | Drift-Min |
|---|---|---|---|---|
BadWorld further strengthens the perturbation with a bi-level attack that accounts for future user controls being unknown and variable.
It actively searches for hard trajectories where the current perturbation is less effective, then updates the perturbation against these challenging controls. This makes the adversarial image more robust across different camera paths or action sequences, leading to stronger degradation and better generalization.
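
The alternation described above can be sketched as a small min-max loop. This is an illustration under stated assumptions, not the authors' method: the world model is a hypothetical linear map of the image and a control vector (`W`, `U`), the attack loss is the toy squared-velocity objective, and the search for "hard trajectories" is just an argmax over a fixed list of candidate controls supplied by the caller. The page does not say how the actual search is performed.

```python
import numpy as np

def attack_loss(W, U, x_adv, control):
    """Toy Velocity-Min loss: squared norm of a control-conditioned update."""
    v = W @ x_adv + U @ control
    return float(v @ v)

def attack_grad(W, U, x_adv, control):
    """Analytic gradient of attack_loss with respect to the image."""
    v = W @ x_adv + U @ control
    return 2.0 * (W.T @ v)

def hardest_control(W, U, x_adv, candidates):
    """Inner level: the control where the current perturbation is least effective."""
    losses = [attack_loss(W, U, x_adv, c) for c in candidates]
    return candidates[int(np.argmax(losses))]

def worst_case_loss(W, U, x_adv, candidates):
    """Attack loss on the hardest candidate control."""
    return max(attack_loss(W, U, x_adv, c) for c in candidates)

def bilevel_attack(W, U, x_clean, candidates, eps=8 / 255, alpha=1 / 255, steps=40):
    """Outer level: update the perturbation against the currently hardest control."""
    delta = np.zeros_like(x_clean)
    best_delta = delta.copy()
    best_worst = worst_case_loss(W, U, x_clean, candidates)
    for _ in range(steps):
        x_adv = x_clean + delta
        c_hard = hardest_control(W, U, x_adv, candidates)
        grad = attack_grad(W, U, x_adv, c_hard)
        # Signed descent step on the hard control, projected onto the eps-ball.
        delta = np.clip(delta - alpha * np.sign(grad), -eps, eps)
        worst = worst_case_loss(W, U, x_clean + delta, candidates)
        if worst < best_worst:  # keep the perturbation with the best worst case
            best_worst, best_delta = worst, delta.copy()
    return best_delta
```

Optimizing against the hardest control, rather than an average over controls, is one simple way to push the perturbation to work for every candidate trajectory. A real attack would presumably search over sampled camera paths or action sequences instead of a fixed list.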
| Clean | Velocity-Min | + Bi-Level Attack |
|---|---|---|

Visual world models are becoming interactive simulators, but their fragile dynamics suggest they have not truly learned stable physical knowledge.
BadWorld shows that tiny, imperceptible changes to a single input image can severely corrupt future rollouts. This exposes a serious robustness gap for safety-critical deployment and suggests a practical path to privacy protection.
@misc{shen2026badworldadversarialattacksworld,
title={BadWorld: Adversarial Attacks on World Models},
author={Linghui Shen and Mingyue Cui and Xingyi Yang},
year={2026},
eprint={2606.16519},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2606.16519},
}