UniFlow: Towards Zero-Shot LiDAR Scene Flow for Autonomous Vehicles via Cross-Domain Generalization

1University of Pennsylvania    2KTH Royal Institute of Technology    3Carnegie Mellon University
UniFlow teaser
RGB views, LiDAR setups, and BEV point clouds from Argoverse 2, Waymo, nuScenes, and TruckScenes illustrate the cross-domain challenge of LiDAR scene flow.

Abstract

LiDAR scene flow is the task of estimating per-point 3D motion between two point clouds. Recent methods achieve centimeter-level accuracy on popular autonomous vehicle (AV) datasets, but are typically trained and evaluated on only a single sensor. In this paper, we aim to learn general motion priors that transfer to diverse and unseen LiDAR sensors. However, prior work in LiDAR semantic segmentation and 3D object detection demonstrates that naively training on multiple datasets yields worse performance than single-dataset models.

Interestingly, we find that this conventional wisdom does not hold for motion estimation: state-of-the-art scene flow methods benefit greatly from cross-dataset training. We posit that low-level tasks such as motion estimation may be less sensitive to sensor configuration; indeed, our analysis shows that models trained on datasets rich in fast-moving objects (e.g., highway datasets) estimate fast motion well, even on datasets they were never trained on.

Informed by our analysis, we propose UniFlow, a family of feedforward models that unifies and trains on multiple large-scale LiDAR scene flow datasets with diverse sensor placements and point cloud densities. Our frustratingly simple solution establishes a new state-of-the-art on Waymo and nuScenes, improving over prior work by 5.1% and 35.2% respectively. Moreover, UniFlow achieves state-of-the-art accuracy on unseen datasets like TruckScenes, outperforming prior TruckScenes-specific models by 30.1%.


Cross-Dataset Generalization Correlates with Velocity Distribution

Velocity distribution and bucketed performance
Velocity distributions of the AV2, Waymo, nuScenes, and TruckScenes train sets (top), and Dynamic Mean EPE per velocity bin for Flow4D trained on AV2, Waymo, nuScenes, and TruckScenes, and for UniFlow (bottom). Notably, Flow4D trained on TruckScenes outperforms Flow4D trained on any other dataset for fast-moving objects (the (2.0 m/s, ∞) bin) across all datasets, as TruckScenes contains the largest number of fast-moving objects.
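The per-bin analysis in the figure can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's evaluation code: the bin edges and the 0.1 s sweep interval used to convert flow displacements to speeds are illustrative assumptions.

```python
import numpy as np

def dynamic_mean_epe_per_bin(pred_flow, gt_flow, dt=0.1,
                             bin_edges=(0.4, 1.0, 2.0, np.inf)):
    """Mean end-point error (EPE) of dynamic points, grouped by GT speed.

    pred_flow, gt_flow: (N, 3) per-point displacements in meters over dt seconds.
    Bin edges are illustrative; 0.4 m/s is a common static/dynamic cutoff.
    Returns {(lo, hi): mean EPE} for each speed bin, NaN for empty bins.
    """
    epe = np.linalg.norm(pred_flow - gt_flow, axis=1)   # per-point error (m)
    speed = np.linalg.norm(gt_flow, axis=1) / dt        # GT speed (m/s)
    results = {}
    lo = bin_edges[0]  # points slower than this are treated as static
    for hi in bin_edges[1:]:
        mask = (speed >= lo) & (speed < hi)
        results[(lo, hi)] = epe[mask].mean() if mask.any() else float("nan")
        lo = hi
    return results
```

Binning by ground-truth speed (rather than predicted speed) keeps bucket membership fixed across the methods being compared.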

Quantitative Results

Method            AV2     Waymo   nuScenes  TruckScenes
NSFP              0.422   0.574   0.602     0.658
FastNSF           0.383   0.560   0.588     –
SeFlow            0.309   0.328   0.554     0.681
ICP Flow          0.331   0.472   –         –
DeFlow            0.276   0.314   0.570     –
SSF               0.181   0.264   0.220     0.453
Flow4D            0.145   0.215   0.230     0.456
ΔFlow             0.113   0.198   0.216     0.402
UniFlow (Ours)
UniFlow-SSF       0.156   0.234   0.144     0.435
UniFlow-Flow4D    0.132   0.191   0.196     0.281
UniFlow-ΔFlow     0.118   0.188   0.140     0.101

Comparison to State-of-the-Art Methods. We compare UniFlow against recent scene flow methods using Dynamic Bucket-Normalized Mean EPE across multiple autonomous driving datasets. UniFlow consistently improves performance over single-dataset baselines, with especially strong gains on TruckScenes and nuScenes.
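The metric in the table normalizes each speed bucket's error by that bucket's average speed, so fast and slow objects contribute comparably. The following is a simplified, single-class sketch of this idea; the actual benchmark metric additionally buckets by semantic class before averaging, and the bin edges here are illustrative.

```python
import numpy as np

def bucket_normalized_epe(epe, speed, bin_edges=(0.4, 1.0, 2.0, np.inf)):
    """Average speed-normalized EPE over dynamic speed buckets (single class).

    epe, speed: (N,) arrays of per-point EPE (m) and GT speed (m/s).
    Empty buckets are skipped rather than counted as zero error.
    """
    normalized = []
    lo = bin_edges[0]
    for hi in bin_edges[1:]:
        mask = (speed >= lo) & (speed < hi)
        if mask.any():
            # Dividing the bucket's mean error by its mean GT speed expresses
            # error as a fraction of motion, so a 0.1 m error on a fast object
            # is penalized less than the same error on a slow one.
            normalized.append(epe[mask].mean() / speed[mask].mean())
        lo = hi
    return float(np.mean(normalized))
```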


Qualitative Results

Zero-shot generalization on TruckScenes: ΔFlow vs ΔFlow (UniFlow)
Compared to dataset-specific ΔFlow, ΔFlow (UniFlow) produces more accurate motion estimates for rare objects and long-range vehicles under challenging conditions.

Scene flow predictions on a challenging rainy sequence from TruckScenes. As shown above and in the video, ΔFlow frequently produces artifacts on rain streaks and background points, which become especially pronounced during occlusions, and predicts inconsistent flow vectors on dynamic objects. In contrast, ΔFlow (UniFlow) yields significantly more stable and coherent motion fields.


CVPR 2026 Challenge

We are hosting a challenge at CVPR 2026 to encourage broad community involvement in addressing long-range (up to 75 m) LiDAR scene flow across diverse AV datasets. Participants may train their models on any publicly available datasets (including TruckScenes) and will be evaluated on Argoverse 2, nuScenes, Waymo, and TruckScenes. We report supervised and unsupervised baselines below (Dynamic Bucket-Normalized Mean EPE; lower is better).

Methods                    Mean    AV2     Waymo   nuScenes  TruckScenes
Unsupervised
NSFP                       0.5642  0.4219  0.5740  0.6024    0.6583
FastNSF                    0.4971  0.3826  0.4576  0.5597    0.5884
SeFlow                     0.4710  0.3085  0.3509  0.5440    0.6808
TeFlow                     0.3465  0.2051  0.2747  0.3954    0.5108
Supervised
SSF                        0.2764  0.2651  0.2451  0.1955    0.4000
Flow4D                     0.2511  0.2136  0.2047  0.2384    0.3476
ΔFlow                      0.2391  0.1901  0.1959  0.2196    0.3500
SSF (UniFlow, Ours)        0.2469  0.2193  0.2274  0.2498    0.2911
Flow4D-XL (UniFlow, Ours)  0.2416  0.2050  0.1890  0.2852    0.2872
ΔFlow (UniFlow, Ours)      0.2222  0.1831  0.1879  0.2526    0.2651
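The 75 m evaluation range can be applied as a simple crop before scoring. This is a sketch assuming ego-frame XYZ points; whether the cutoff uses planar (XY) or full 3D distance is an assumption here, with the planar convention shown being common for AV range cutoffs.

```python
import numpy as np

def crop_to_eval_range(points, flow, max_range=75.0):
    """Keep only points within max_range meters of the ego sensor.

    points: (N, 3) XYZ in the ego frame; flow: (N, 3) per-point flow vectors.
    Uses planar (XY) distance from the origin.
    """
    mask = np.linalg.norm(points[:, :2], axis=1) <= max_range
    return points[mask], flow[mask]
```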

Citation

@misc{li2025uniflowzeroshotlidarscene,
      title={UniFlow: Towards Zero-Shot LiDAR Scene Flow for Autonomous Vehicles via Cross-Domain Generalization},
      author={Siyi Li and Qingwen Zhang and Ishan Khatri and Kyle Vedder and Deva Ramanan and Neehar Peri},
      year={2025},
      eprint={2511.18254},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2511.18254}
}