Optimizing Diffusion Models for Joint Trajectory Prediction and Controllable Generation

Abstract

Diffusion models are promising for joint trajectory prediction and controllable generation in autonomous driving, but they need many inference steps and carry a high computational cost. To address both problems, we introduce Optimal Gaussian Diffusion (OGD) and Estimated Clean Manifold (ECM) Guidance. OGD optimizes the prior distribution for a small diffusion time T and starts the reverse diffusion process from that prior. ECM Guidance injects guidance gradients directly into the estimated clean manifold, which avoids extensive gradient backpropagation through the network. Together, these changes streamline the generative process and reduce its computational overhead. Experiments on the large-scale Argoverse 2 dataset demonstrate our approach's superior performance, offering a viable solution for computationally efficient, high-quality joint trajectory prediction and controllable generation in autonomous driving.

Key Contributions

Optimal Gaussian Diffusion (OGD): We propose a method that optimizes the prior distribution for diffusion models, enabling efficient reverse diffusion with significantly fewer steps. Instead of starting from a standard Gaussian distribution, OGD starts from an optimal Gaussian distribution that minimizes the distance to the target data distribution.
Estimated Clean Manifold (ECM) Guidance: We introduce a guided sampling approach that injects guidance gradients directly into the estimated clean manifold, avoiding computationally expensive gradient backpropagation through the entire network.
Significant Efficiency Improvements: Our approach achieves the same level of performance with only 1/12 of the diffusion steps used by vanilla diffusion models, dramatically reducing computational overhead.
Real-world Validation: Extensive experiments on the Argoverse 2 dataset demonstrate superior performance for both joint trajectory prediction and controllable generation tasks in autonomous driving scenarios.

Methodology Overview

Optimal Gaussian Diffusion (OGD)

Traditional diffusion models start the reverse process from a non-informative prior (e.g., standard Gaussian), which requires many denoising steps to reach the target distribution. Our OGD method analytically determines an optimal Gaussian prior that is closer to the target data distribution, enabling:

Flexible tuning of diffusion steps at inference time
No additional training costs
Better performance with fewer reverse diffusion steps
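
As a rough illustration of what an analytically determined Gaussian prior can look like, the sketch below assumes a standard variance-preserving forward process, x_T = sqrt(a) * x_0 + sqrt(1 - a) * eps, and picks the Gaussian that matches the mean and covariance of the noised data at time T (the moment-matched Gaussian is the one that minimizes the KL divergence from the noised data distribution). The function name `optimal_gaussian_prior`, the symbol `alpha_bar_T`, and the toy data are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def optimal_gaussian_prior(x0, alpha_bar_T):
    """Moment-matched Gaussian for x_T = sqrt(a) * x_0 + sqrt(1 - a) * eps.

    x0: (num_samples, dim) array of clean training samples.
    alpha_bar_T: cumulative signal level a at the truncated diffusion time T.
    """
    mu0 = x0.mean(axis=0)
    cov0 = np.cov(x0, rowvar=False)
    # Noising scales the data mean and covariance and adds isotropic noise.
    mean_T = np.sqrt(alpha_bar_T) * mu0
    cov_T = alpha_bar_T * cov0 + (1.0 - alpha_bar_T) * np.eye(x0.shape[1])
    return mean_T, cov_T

# Toy 2-D "trajectory features" standing in for real data (assumption).
rng = np.random.default_rng(0)
x0 = rng.normal(loc=[5.0, -2.0], scale=[3.0, 0.5], size=(10_000, 2))

mean_T, cov_T = optimal_gaussian_prior(x0, alpha_bar_T=0.5)
# Reverse diffusion would start from these samples instead of N(0, I).
x_T = rng.multivariate_normal(mean_T, cov_T, size=4)
```

Under this assumption the prior depends only on data statistics and the noise schedule, which is consistent with the points above: the diffusion time T can be changed at inference without retraining the denoiser. As `alpha_bar_T` approaches 0 the prior collapses to the standard Gaussian used by vanilla diffusion.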

Estimated Clean Manifold (ECM) Guidance

Conventional guided sampling requires expensive gradient backpropagation through the entire diffusion network. Our ECM approach reformulates controllable generation as a multi-objective optimization problem:

Primary objective: Maximize likelihood (ensure samples lie on the clean manifold)
Secondary objective: Minimize guidance cost (satisfy user preferences)

This hierarchical approach enables efficient guided sampling without backpropagating gradients through the entire network.
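
The following is a minimal sketch of the general idea, not the paper's algorithm: the denoiser is run forward once to estimate the clean sample, a few gradient steps on the guidance cost are applied to that estimate only, and the result is re-noised to the next diffusion level. The names `toy_denoiser`, `guidance_cost_grad`, and `ecm_style_step`, the goal-reaching cost, and the step sizes are all hypothetical choices for illustration. In this sketch the likelihood objective is not optimized explicitly; it is left to the subsequent denoising steps.

```python
import numpy as np

def toy_denoiser(x_t, alpha_bar_t):
    # Hypothetical stand-in for the trained network's clean-sample estimate.
    return x_t / np.sqrt(alpha_bar_t)

def guidance_cost_grad(x0_hat, goal):
    # Gradient of 0.5 * ||x0_hat[-1] - goal||^2 with respect to x0_hat:
    # only the final waypoint is pulled toward the goal.
    grad = np.zeros_like(x0_hat)
    grad[-1] = x0_hat[-1] - goal
    return grad

def ecm_style_step(x_t, alpha_bar_t, alpha_bar_prev, goal,
                   step_size=0.5, n_inner=5, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    # One forward pass; no gradient flows back through the denoiser.
    x0_hat = toy_denoiser(x_t, alpha_bar_t)
    # Guidance acts directly on the estimated clean sample.
    for _ in range(n_inner):
        x0_hat = x0_hat - step_size * guidance_cost_grad(x0_hat, goal)
    # Re-noise the guided estimate to the next (less noisy) level.
    noise = rng.standard_normal(x_t.shape)
    return np.sqrt(alpha_bar_prev) * x0_hat + np.sqrt(1.0 - alpha_bar_prev) * noise

# A 6-waypoint 2-D trajectory, guided so that it ends near `goal`.
rng = np.random.default_rng(0)
x_t = rng.standard_normal((6, 2))
goal = np.array([10.0, 0.0])
x_prev = ecm_style_step(x_t, alpha_bar_t=0.5, alpha_bar_prev=0.8, goal=goal, rng=rng)
```

Because the guidance gradient is taken with respect to the clean estimate rather than the noisy input, its cost is independent of the size of the denoising network.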

Experimental Results

Performance Comparison Across Different Tasks

Our experiments on the Argoverse 2 dataset show:

Joint Trajectory Prediction: OGD ranks 4th on the Argoverse 2 Multi-world leaderboard
Computational Efficiency: Matches the performance of vanilla diffusion with only 1/12 of the diffusion steps
Controllable Generation: Superior performance in guidance effectiveness and realism
Inference Speed: ~5x faster inference with significantly reduced GPU memory usage

The results demonstrate that our approach enables practical deployment of diffusion models for real-time trajectory prediction in autonomous driving applications.