UniFlow: Towards Zero-Shot LiDAR Scene Flow for Autonomous Vehicles via Cross-Domain Generalization

1University of Pennsylvania    2KTH Royal Institute of Technology    3Carnegie Mellon University
UniFlow teaser
RGB views, LiDAR setups, and BEV point clouds from Argoverse 2, Waymo, nuScenes, and TruckScenes illustrate the cross-domain challenge of LiDAR scene flow.

Abstract

LiDAR scene flow is the task of estimating per-point 3D motion between two point clouds. Recent methods achieve centimeter-level accuracy on popular autonomous vehicle (AV) datasets, but are typically trained and evaluated on only a single sensor. In this paper, we aim to learn general motion priors that transfer to diverse and unseen LiDAR sensors. However, prior work in LiDAR semantic segmentation and 3D object detection demonstrates that naively training on multiple datasets yields worse performance than single-dataset models.

Interestingly, we find that this conventional wisdom does not hold for motion estimation: state-of-the-art scene flow methods benefit greatly from cross-dataset training. We posit that low-level tasks such as motion estimation may be less sensitive to sensor configuration; indeed, our analysis shows that models trained on datasets rich in fast-moving objects (e.g., highway datasets) estimate fast motion well, even on datasets they were never trained on.

Informed by our analysis, we propose UniFlow, a family of feedforward models that unifies and trains on multiple large-scale LiDAR scene flow datasets with diverse sensor placements and point cloud densities. Our frustratingly simple solution establishes a new state-of-the-art on Waymo and nuScenes, improving over prior work by 5.1% and 35.2% respectively. Moreover, UniFlow achieves state-of-the-art accuracy on unseen datasets like TruckScenes, outperforming prior TruckScenes-specific models by 30.1%.


Cross-Dataset Generalization Correlates with Velocity Distribution

Velocity distribution and bucketed performance
Velocity distributions of the AV2, Waymo, nuScenes, and TruckScenes train sets (top), and Dynamic Mean EPE per velocity bin for Flow4D trained on AV2, Waymo, nuScenes, and TruckScenes, and for UniFlow (bottom). Notably, Flow4D trained on TruckScenes outperforms Flow4D trained on any other dataset for fast-moving objects (the (2.0 m/s, ∞) bin) across all datasets, as TruckScenes contains the largest number of fast-moving objects.
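The per-bin analysis in the figure can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's evaluation code: the bin edges and the 0.1 s sweep interval used to convert flow displacements to speeds are illustrative assumptions.

```python
import numpy as np

def dynamic_mean_epe_per_bin(pred_flow, gt_flow, dt=0.1,
                             bin_edges=(0.4, 1.0, 2.0, np.inf)):
    """Mean end-point error (EPE) of dynamic points, grouped by GT speed.

    pred_flow, gt_flow: (N, 3) per-point displacements in meters over dt seconds.
    Bin edges are illustrative; 0.4 m/s is a common static/dynamic cutoff.
    Returns {(lo, hi): mean EPE} for each speed bin, NaN for empty bins.
    """
    epe = np.linalg.norm(pred_flow - gt_flow, axis=1)   # per-point error (m)
    speed = np.linalg.norm(gt_flow, axis=1) / dt        # GT speed (m/s)
    results = {}
    lo = bin_edges[0]  # points slower than this are treated as static
    for hi in bin_edges[1:]:
        mask = (speed >= lo) & (speed < hi)
        results[(lo, hi)] = epe[mask].mean() if mask.any() else float("nan")
        lo = hi
    return results
```

Binning by ground-truth speed (rather than predicted speed) keeps bucket membership fixed across the methods being compared.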

Quantitative Results

Method            AV2     Waymo   nuScenes  TruckScenes
NSFP              0.422   0.574   0.602     0.658
FastNSF           0.383   0.560   0.588     –
SeFlow            0.309   0.328   0.554     0.681
ICP Flow          0.331   0.472   –         –
DeFlow            0.276   0.314   0.570     –
SSF               0.181   0.264   0.220     0.453
Flow4D            0.145   0.215   0.230     0.456
ΔFlow             0.113   0.198   0.216     0.402
UniFlow (Ours)
UniFlow-SSF       0.156   0.234   0.144     0.435
UniFlow-Flow4D    0.132   0.191   0.196     0.281
UniFlow-ΔFlow     0.118   0.188   0.140     0.101

Comparison to State-of-the-Art Methods. We compare UniFlow against recent scene flow methods using Dynamic Bucket-Normalized Mean EPE across multiple autonomous driving datasets. UniFlow consistently improves performance over single-dataset baselines, with especially strong gains on TruckScenes and nuScenes.
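The metric in the table normalizes each speed bucket's error by that bucket's average speed, so fast and slow objects contribute comparably. The following is a simplified, single-class sketch of this idea; the actual benchmark metric additionally buckets by semantic class before averaging, and the bin edges here are illustrative.

```python
import numpy as np

def bucket_normalized_epe(epe, speed, bin_edges=(0.4, 1.0, 2.0, np.inf)):
    """Average speed-normalized EPE over dynamic speed buckets (single class).

    epe, speed: (N,) arrays of per-point EPE (m) and GT speed (m/s).
    Empty buckets are skipped rather than counted as zero error.
    """
    normalized = []
    lo = bin_edges[0]
    for hi in bin_edges[1:]:
        mask = (speed >= lo) & (speed < hi)
        if mask.any():
            # Dividing the bucket's mean error by its mean GT speed expresses
            # error as a fraction of motion, so a 0.1 m error on a fast object
            # is penalized less than the same error on a slow one.
            normalized.append(epe[mask].mean() / speed[mask].mean())
        lo = hi
    return float(np.mean(normalized))
```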


Qualitative Results

Zero-shot generalization on TruckScenes: ΔFlow vs ΔFlow (UniFlow)
Compared to dataset-specific ΔFlow, ΔFlow (UniFlow) produces more accurate motion estimates for rare objects and long-range vehicles under challenging conditions.

Scene flow predictions on a challenging rainy sequence from TruckScenes. As shown above and in the video, ΔFlow frequently produces artifacts on rain streaks and background points, which become especially pronounced during occlusions, and predicts inconsistent flow vectors on dynamic objects. In contrast, ΔFlow (UniFlow) yields significantly more stable and coherent motion fields.


CVPR 2026 Challenge

We are hosting a challenge at CVPR 2026 to encourage broad community involvement in addressing long-range (up to 75 m) LiDAR scene flow across diverse AV datasets. Participants may train their models on any publicly available datasets (including TruckScenes) and will be evaluated on Argoverse 2, nuScenes, Waymo, and TruckScenes. We report supervised and unsupervised baselines below (Dynamic Bucket-Normalized Mean EPE; lower is better).

Methods                    Mean    AV2     Waymo   nuScenes  TruckScenes
Unsupervised
NSFP                       0.5642  0.4219  0.5740  0.6024    0.6583
FastNSF                    0.4971  0.3826  0.4576  0.5597    0.5884
SeFlow                     0.4710  0.3085  0.3509  0.5440    0.6808
TeFlow                     0.3465  0.2051  0.2747  0.3954    0.5108
Supervised
SSF                        0.2764  0.2651  0.2451  0.1955    0.4000
Flow4D                     0.2511  0.2136  0.2047  0.2384    0.3476
ΔFlow                      0.2391  0.1901  0.1959  0.2196    0.3500
SSF (UniFlow, Ours)        0.2469  0.2193  0.2274  0.2498    0.2911
Flow4D-XL (UniFlow, Ours)  0.2416  0.2050  0.1890  0.2852    0.2872
ΔFlow (UniFlow, Ours)      0.2222  0.1831  0.1879  0.2526    0.2651
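The 75 m evaluation range can be applied as a simple crop before scoring. This is a sketch assuming ego-frame XYZ points; whether the cutoff uses planar (XY) or full 3D distance is an assumption here, with the planar convention shown being common for AV range cutoffs.

```python
import numpy as np

def crop_to_eval_range(points, flow, max_range=75.0):
    """Keep only points within max_range meters of the ego sensor.

    points: (N, 3) XYZ in the ego frame; flow: (N, 3) per-point flow vectors.
    Uses planar (XY) distance from the origin.
    """
    mask = np.linalg.norm(points[:, :2], axis=1) <= max_range
    return points[mask], flow[mask]
```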

Citation

@misc{li2025uniflowzeroshotlidarscene,
      title={UniFlow: Towards Zero-Shot LiDAR Scene Flow for Autonomous Vehicles via Cross-Domain Generalization},
      author={Siyi Li and Qingwen Zhang and Ishan Khatri and Kyle Vedder and Deva Ramanan and Neehar Peri},
      year={2025},
      eprint={2511.18254},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2511.18254}
}