Deep-Flow uses OT-CFM for unsupervised anomaly detection in autonomous driving by modeling the density of expert human behavior, advancing AI safety validation through scalable rare-scenario detection in autonomous vehicles.
Deep-Flow is an unsupervised anomaly detection framework for autonomous driving that utilizes Optimal Transport Conditional Flow Matching (OT-CFM) to model the continuous probability density of expert human driving behavior, enabling the detection of rare, safety-critical "long-tail" scenarios that traditional rule-based systems often miss. By operating on a low-rank spectral manifold via a Principal Component Analysis (PCA) bottleneck, Deep-Flow ensures kinematic smoothness and enables numerically stable, deterministic log-likelihood estimation through exact Jacobian trace computation.
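The deterministic log-likelihood claim rests on the continuous change-of-variables formula for flow models, d log p(x(t))/dt = -tr(∂v/∂x). As a minimal illustration (not the paper's implementation: a toy vector field stands in for the learned flow, with plain Euler integration and finite-difference traces):

```python
import numpy as np

def divergence(v_fn, x, eps=1e-5):
    """Trace of the Jacobian dv/dx by central differences (cheap in low dims)."""
    d = x.shape[0]
    tr = 0.0
    for i in range(d):
        e = np.zeros(d)
        e[i] = eps
        tr += (v_fn(x + e)[i] - v_fn(x - e)[i]) / (2 * eps)
    return tr

def log_likelihood(x1, v_fn, n_steps=1000):
    """log p(x1) via d log p(x(t))/dt = -tr(dv/dx), integrated from t=1
    (data) back to t=0, with a standard-normal base density at t=0."""
    x = np.array(x1, dtype=float)
    trace_int = 0.0
    dt = 1.0 / n_steps
    for _ in range(n_steps):
        trace_int += divergence(v_fn, x) * dt  # accumulate the trace integral
        x = x - v_fn(x) * dt                   # reverse-time Euler step
    d = x.shape[0]
    log_p0 = -0.5 * (x @ x) - 0.5 * d * np.log(2.0 * np.pi)
    return log_p0 - trace_int
```

For a linear field v(x) = a·x the result matches the closed-form pushforward density, which makes the sign conventions easy to verify.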
The framework addresses limitations of conventional generative models—such as exposure bias in autoregressive models and intractable likelihoods in diffusion models—by treating the entire trajectory as a single primitive in a spectral space, thereby ensuring global kinematic consistency and avoiding vanishing likelihoods over long horizons. To resolve multi-modal ambiguity at complex junctions, Deep-Flow employs an Early Fusion Transformer encoder with lane-aware goal conditioning and a direct skip-connection to the flow head, preserving intent integrity throughout the network.
A key innovation is the kinematic complexity weighting scheme, which prioritizes high-energy maneuvers—quantified via path tortuosity and jerk—during simulation-free training, thus focusing on safety-relevant behaviors. Evaluated on the Waymo Open Motion Dataset (WOMD), Deep-Flow achieves an AUC-ROC of 0.766 against a heuristic golden set of safety-critical events, despite being trained solely on nominal expert data without labeled anomalies.
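A tortuosity-and-jerk weighting of this kind could be sketched as follows; the combination rule and the `alpha`/`beta` scales are illustrative assumptions, not the paper's formula:

```python
import numpy as np

def complexity_weight(xy, dt=0.1, alpha=1.0, beta=1.0):
    """Hypothetical per-trajectory training weight from tortuosity and jerk.

    xy: (T, 2) array of positions sampled every dt seconds.
    alpha, beta: illustrative scale hyperparameters (not from the paper).
    """
    seg = np.diff(xy, axis=0)                     # (T-1, 2) displacements
    path_len = np.linalg.norm(seg, axis=1).sum()  # arc length of the path
    chord = np.linalg.norm(xy[-1] - xy[0])        # straight-line distance
    tortuosity = path_len / max(chord, 1e-6)      # >= 1; equals 1 when straight
    jerk = np.diff(xy, n=3, axis=0) / dt**3       # third finite difference
    mean_jerk = np.linalg.norm(jerk, axis=1).mean() if len(jerk) else 0.0
    return 1.0 + alpha * (tortuosity - 1.0) + beta * mean_jerk
```

A straight constant-velocity trajectory gets the baseline weight 1.0, while winding or jerky maneuvers are up-weighted.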
Crucially, Deep-Flow distinguishes between kinematic danger and semantic non-compliance, identifying out-of-distribution behaviors such as lane-boundary violations and non-normative junction maneuvers that traditional safety filters overlook. The learned likelihood distribution reveals a "Safety Ceiling": while safe driving can exhibit low likelihood (e.g., complex urban interactions), safety-critical events are mathematically excluded from high-likelihood regions, enabling the definition of rigorous statistical safety gates for autonomous deployment.
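A statistical safety gate in this spirit can be sketched as a low-quantile threshold on log-likelihoods of nominal driving; the quantile value here is illustrative, not the paper's calibration:

```python
import numpy as np

def fit_safety_gate(nominal_loglik, quantile=0.001):
    """Place the gate at a low quantile of nominal (expert) log-likelihoods."""
    return np.quantile(nominal_loglik, quantile)

def flag_anomalies(loglik, threshold):
    """Trajectories scoring below the learned-likelihood threshold are flagged."""
    return loglik < threshold
```

Because safety-critical events never reach the high-likelihood region, anything below the gate is either rare-but-safe or genuinely anomalous, and the gate's false-positive rate on nominal data equals the chosen quantile by construction.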
This approach provides a data-driven, mathematically principled foundation for scalable safety validation in Level 4 autonomous vehicles, moving beyond brittle heuristic rules toward objective, continuous risk assessment. Code and pre-trained checkpoints are available at https://github.com/AntonioAlgaida/FlowMatchingTrajectoryAnomaly.
This research introduces "Deep-Flow," a novel framework for unsupervised anomaly detection tailored to the safety-critical domain of autonomous vehicles (AVs). The core innovation lies in the integration of Optimal Transport Conditional Flow Matching (OT-CFM) within a manifold-aware spectral space. Instead of relying on raw sensor data or Euclidean representations, the method first projects driving scenarios into a spectral space that respects the underlying manifold structure of the data, effectively capturing the complex topological relationships between vehicle states and environmental factors. By training a flow-based model on the probability densities of expert human driving behaviors, Deep-Flow learns to generate the "normal" distribution of driving maneuvers, establishing a robust baseline for safety validation.
The key contribution of this work is the application of continuous flow matching to the problem of rare scenario detection, offering distinct advantages over traditional generative models like GANs or VAEs in terms of training stability and sample quality. The OT-CFM approach allows for precise modeling of the temporal dynamics and continuous nature of driving trajectories, enabling the system to distinguish between acceptable variations in driving behavior and true anomalies that pose safety risks. This capability is vital for AI safety validation, as it provides a scalable mechanism to identify out-of-distribution events without requiring exhaustive labeled datasets of accidents. By automating the detection of these edge cases, Deep-Flow addresses a major bottleneck in scaling AV deployment, ensuring that systems can be rigorously tested against the vast "long tail" of rare but dangerous real-world scenarios.
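The OT-CFM training targets behind this can be sketched as follows: minibatch samples from the base and data distributions are re-paired by an exact assignment under squared Euclidean cost, and the model would then regress its velocity field onto the straight-line velocity between paired points. This is a generic OT-CFM sketch, not the paper's trainer:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def ot_cfm_targets(x0, x1, rng):
    """Minibatch OT-CFM regression targets (a sketch).

    x0: (B, d) base (noise) samples; x1: (B, d) data samples.
    Returns interpolants x_t, times t, and target velocities u = x1 - x0
    after optimal-transport re-pairing of the batch.
    """
    cost = ((x0[:, None, :] - x1[None, :, :]) ** 2).sum(-1)
    rows, cols = linear_sum_assignment(cost)  # exact OT plan on the minibatch
    x0, x1 = x0[rows], x1[cols]
    t = rng.random((len(x0), 1))              # one time per pair
    x_t = (1.0 - t) * x0 + t * x1             # straight-line interpolant
    u = x1 - x0                               # conditional target velocity
    return x_t, t, u
```

The OT re-pairing straightens the marginal flow, which is what makes training simulation-free and sampling paths close to straight lines.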
This paper introduces Deep-Flow, a novel framework for unsupervised anomaly detection in autonomous vehicle (AV) driving using Optimal Transport Conditional Flow Matching (OT-CFM). The approach models expert human driving behavior as a conditional density distribution, enabling the detection of rare or anomalous scenarios—such as unexpected pedestrian movements or sensor failures—that may elude traditional supervised learning methods. By leveraging manifold-aware spectral representations, Deep-Flow operates in a lower-dimensional space that preserves critical driving dynamics, improving robustness and computational efficiency. The method avoids the need for explicit adversarial training (common in GAN-based anomaly detection) while maintaining high sensitivity to deviations from learned behavior.
The key contributions include:

1. OT-CFM for Unsupervised Anomaly Detection: The use of optimal transport theory to align and compare distributions of expert behavior with AV-generated trajectories, improving generalization to rare scenarios.
2. Manifold-Aware Spectral Space: A spectral embedding that captures the intrinsic geometry of driving trajectories, enhancing the discrimination between normal and anomalous behaviors.
3. Scalability for AI Safety Validation: The framework is designed for continuous, real-time anomaly detection, addressing a critical gap in AV safety validation where rare, high-risk scenarios are difficult to capture via traditional testing.
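The low-rank spectral bottleneck described above can be sketched with plain PCA over flattened trajectories; the rank `k` and the flattening layout are illustrative choices, not the paper's exact configuration:

```python
import numpy as np

def fit_pca_basis(trajs, k):
    """Fit a rank-k PCA basis over flattened (B, T, 2) trajectories.

    Projecting onto the top-k components gives the low-rank spectral code
    on which the flow model would operate.
    """
    X = trajs.reshape(len(trajs), -1)
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def encode(trajs, mean, basis):
    """Trajectories -> low-rank spectral codes."""
    return (trajs.reshape(len(trajs), -1) - mean) @ basis.T

def decode(codes, mean, basis, T):
    """Spectral codes -> reconstructed (B, T, 2) trajectories."""
    return (codes @ basis + mean).reshape(len(codes), T, -1)
```

Because smooth driving paths concentrate on a few principal modes, truncating to k components both compresses the trajectory and enforces kinematic smoothness in anything the flow generates or scores.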
Why it matters: As autonomous vehicles scale toward widespread deployment, ensuring safety in unseen or adversarial conditions remains a major challenge. Deep-Flow provides a data-efficient, theoretically grounded approach to detect anomalies without relying on exhaustive labeled datasets. By improving the robustness of AV perception and decision-making systems, this work contributes to scalable AI safety validation, a key enabler for real-world AV adoption.
Source: [arXiv:2602.17586](https://arxiv.org/abs/2602.17586)