UniFlow: Towards Zero-Shot LiDAR Scene Flow for Autonomous Vehicles via Cross-Domain Generalization

¹University of Pennsylvania    ²KTH Royal Institute of Technology    ³Carnegie Mellon University
Front-center RGB images (top), LiDAR sensor positions (middle), and BEV LiDAR point clouds (bottom) for Argoverse 2, Waymo, nuScenes, and TruckScenes. The four datasets use different sensors and were collected in different environments.

Abstract

LiDAR scene flow is the task of estimating per-point 3D motion between consecutive point clouds. Recent methods achieve centimeter-level accuracy on popular autonomous vehicle (AV) datasets, but are typically trained and evaluated on only a single sensor. In this paper, we aim to learn general motion priors that transfer to diverse and unseen LiDAR sensors.

However, prior work in LiDAR semantic segmentation and 3D object detection demonstrates that naively training on multiple datasets yields worse performance than single-dataset models. Interestingly, we find that this conventional wisdom does not hold for motion estimation, and that state-of-the-art scene flow methods greatly benefit from cross-dataset training without architectural modification. We posit that low-level tasks such as motion estimation may be less sensitive to sensor configuration; indeed, our analysis shows that models trained on fast-moving objects (e.g., from highway datasets) perform well on fast-moving objects, even across different datasets.

Informed by our analysis, we propose UniFlow, a feedforward model that unifies and trains on multiple large-scale LiDAR scene flow datasets with diverse sensor placements and point cloud densities. Our frustratingly simple solution establishes a new state of the art on Waymo and nuScenes, improving over prior work by 5.1% and 35.2%, respectively. Moreover, UniFlow achieves state-of-the-art accuracy on unseen datasets like TruckScenes and AEVAScenes, outperforming prior dataset-specific models by 30.1% and 22.5%, respectively.


Cross-Dataset Generalization Correlates with Velocity Distribution

Cross-Dataset Generalization Correlates with Velocity Distribution. The velocity distributions for the AV2, Waymo, nuScenes, and TruckScenes train sets (top). The Dynamic Mean EPE per velocity bin for Flow4D trained on each of AV2, Waymo, nuScenes, and TruckScenes, and for UniFlow (bottom). Notably, across all datasets, Flow4D trained on TruckScenes outperforms Flow4D trained on any other dataset for fast-moving objects in the (2.0, ∞) bin, as TruckScenes contains the largest number of fast-moving objects.
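
To make the per-velocity-bin analysis concrete, here is a minimal sketch of how mean endpoint error (EPE) could be grouped by ground-truth speed. It is an illustration under stated assumptions, not the evaluation code behind the figure: the function names, the 0.1 s sweep interval, the m/s units, and every bin edge other than the 2.0 boundary of the (2.0, ∞) bin are hypothetical. It also computes a plain mean EPE per bin rather than the Dynamic Bucket-Normalized metric used in the tables.

```python
import math
from collections import defaultdict

# Hypothetical speed-bin edges (assumed m/s). Only the open-ended (2.0, inf)
# bin is named in the text; the lower edges are placeholders.
BIN_EDGES = [0.0, 0.5, 1.0, 2.0, math.inf]


def norm(vec):
    """Euclidean length of a 3D vector given as a tuple or list."""
    return math.sqrt(sum(c * c for c in vec))


def bin_label(speed):
    """Return the (low, high) edges of the bin containing `speed`."""
    for i, high in enumerate(BIN_EDGES[1:]):
        if speed <= high:
            return (BIN_EDGES[i], high)
    raise ValueError("speed must be a non-negative number")


def epe_per_speed_bin(pred_flows, gt_flows, dt=0.1):
    """Mean endpoint error per ground-truth speed bin.

    pred_flows, gt_flows: per-point (dx, dy, dz) displacements in meters
    between two consecutive sweeps.
    dt: assumed time between sweeps in seconds (hypothetical default).
    """
    errors = defaultdict(list)
    for pred, gt in zip(pred_flows, gt_flows):
        epe = norm([p - g for p, g in zip(pred, gt)])  # endpoint error
        speed = norm(gt) / dt                          # ground-truth speed
        errors[bin_label(speed)].append(epe)
    return {label: sum(v) / len(v) for label, v in errors.items()}
```

Calling `epe_per_speed_bin` on the predictions of models trained on different datasets, then comparing the values in the fastest bin, mirrors the comparison the caption describes.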

Quantitative Results

Method AV2 Waymo nuScenes
NSFP 0.422 0.574 0.602
FastNSF 0.383 0.560
SeFlow 0.309 0.328 0.554
ICP Flow 0.331
DeFlow 0.276 0.314
SSF 0.181 0.264 0.220
Flow4D 0.145 0.215 0.230
ΔFlow 0.113 0.198 0.216
UniFlow
UniFlow-SSF 0.156 0.234 0.144
UniFlow-Flow4D 0.132 0.191 0.196
UniFlow-ΔFlow 0.118 0.188 0.140

In-domain performance. We compare UniFlow against recent scene flow methods on AV2, Waymo, and nuScenes using Dynamic Bucket-Normalized Mean EPE (lower is better). UniFlow establishes a new state of the art on Waymo and nuScenes, improving over prior work by 5.1% and 35.2%, respectively.
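
The quoted improvements can be reproduced from the table as relative error reductions. The sketch below assumes the comparison is between the best prior method (ΔFlow) and the best UniFlow variant (UniFlow-ΔFlow) on each dataset; the function name is hypothetical.

```python
def relative_improvement(prior_error, new_error):
    """Percent reduction in error relative to the prior result (lower is better)."""
    return 100.0 * (prior_error - new_error) / prior_error


# Values from the in-domain table: ΔFlow vs. UniFlow-ΔFlow.
waymo = relative_improvement(0.198, 0.188)
nuscenes = relative_improvement(0.216, 0.140)
print(f"Waymo: {waymo:.1f}%, nuScenes: {nuscenes:.1f}%")
```

Under that assumption the script prints 5.1% for Waymo and 35.2% for nuScenes, matching the figures in the caption.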

Method TruckScenes AEVAScenes
NSFP 0.658
FastNSF 0.588
SeFlow 0.681
ICP Flow 0.472
DeFlow 0.570
SSF 0.453 0.759ZS
Flow4D 0.456 0.433ZS
ΔFlow 0.402 0.402ZS
UniFlow
UniFlow-SSF 0.435ZS 0.639ZS
UniFlow-Flow4D 0.281ZS 0.448ZS
UniFlow-ΔFlow 0.101ZS 0.344ZS

Generalization to unseen datasets. We compare UniFlow against recent scene flow methods on TruckScenes and AEVAScenes using Dynamic Bucket-Normalized Mean EPE. UniFlow outperforms prior dataset-specific models by 30.1% on TruckScenes and 22.5% on AEVAScenes. We mark zero-shot results with ZS.



Qualitative Results

Zero-Shot Generalization on TruckScenes. Compared with the dataset-specific ΔFlow model, ΔFlow (UniFlow) produces more accurate motion estimates, with better robustness to rain artifacts (top left) and stronger generalization to rare vehicles (middle row) and long-range vehicles (bottom row).

Video

Challenging rainy sequence from TruckScenes. As shown above and in the video, ΔFlow (left) frequently produces artifacts on rain streaks and background points, which become especially pronounced during occlusions, and predicts inconsistent flow vectors on dynamic objects. In contrast, ΔFlow (UniFlow) (right) yields significantly more stable and coherent motion fields.


CVPR 2026 Challenge

We are hosting a challenge at CVPR 2026 to encourage broad community involvement in addressing long-range (up to 75 m) LiDAR scene flow across diverse AV datasets. Participants may train their models on any publicly available dataset (including TruckScenes) and will be evaluated on Argoverse 2, nuScenes, Waymo, and TruckScenes. We include supervised and unsupervised baselines below (Dynamic Bucket-Normalized; lower is better).

Methods Mean AV2 Waymo nuScenes TruckScenes
Unsupervised
NSFP 0.5642 0.4219 0.5740 0.6024 0.6583
FastNSF 0.4971 0.3826 0.4576 0.5597 0.5884
SeFlow 0.4710 0.3085 0.3509 0.5440 0.6808
TeFlow 0.3465 0.2051 0.2747 0.3954 0.5108
Supervised
SSF 0.2764 0.2651 0.2451 0.1955 0.4000
Flow4D 0.2511 0.2136 0.2047 0.2384 0.3476
ΔFlow 0.2391 0.1901 0.1959 0.2196 0.3500
SSF (UniFlow, Ours) 0.2469 0.2193 0.2274 0.2498 0.2911
Flow4D-XL (UniFlow, Ours) 0.2416 0.2050 0.1890 0.2852 0.2872
ΔFlow (UniFlow, Ours) 0.2222 0.1831 0.1879 0.2526 0.2651
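
The Mean column appears to be the unweighted average of the four per-dataset scores. This is an inference from the numbers rather than something the page states, so treat the sketch below as a sanity check under that assumption. The rows are copied from the table; the variable and function names are hypothetical.

```python
# (reported mean, [AV2, Waymo, nuScenes, TruckScenes]) copied from the table.
ROWS = {
    "NSFP": (0.5642, [0.4219, 0.5740, 0.6024, 0.6583]),
    "TeFlow": (0.3465, [0.2051, 0.2747, 0.3954, 0.5108]),
    "Flow4D": (0.2511, [0.2136, 0.2047, 0.2384, 0.3476]),
    "ΔFlow (UniFlow, Ours)": (0.2222, [0.1831, 0.1879, 0.2526, 0.2651]),
}


def mean_score(per_dataset):
    """Unweighted average over datasets (assumed definition of the Mean column)."""
    return sum(per_dataset) / len(per_dataset)


for method, (reported, scores) in ROWS.items():
    print(f"{method}: reported {reported:.4f}, recomputed {mean_score(scores):.4f}")
```

For the rows shown, the recomputed averages agree with the reported Mean column to within rounding at the fourth decimal place.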

Citation

@misc{li2025uniflowzeroshotlidarscene,
  title={UniFlow: Towards Zero-Shot LiDAR Scene Flow for Autonomous Vehicles via Cross-Domain Generalization},
  author={Siyi Li and Qingwen Zhang and Ishan Khatri and Kyle Vedder and Deva Ramanan and Neehar Peri},
  year={2025},
  eprint={2511.18254},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2511.18254}
}