CrossModalityDiffusion: Multi-Modal Novel View Synthesis with Unified Intermediate Representation

Published in WACV GeoCV Workshop, 2025

Recommended citation: Berian, A., Brignac, D., Wu, J., Daba, N., & Mahalanobis, A. CrossModalityDiffusion: Multi-Modal Novel View Synthesis with Unified Intermediate Representation. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 2025.

CrossModalityDiffusion is a modular framework for novel view synthesis across diverse geospatial imaging modalities, such as electro-optical (EO), synthetic aperture radar (SAR), and LiDAR, where scene geometry is difficult to interpret consistently from one sensor to another. Modality-specific encoders, combined with volumetric rendering, produce geometry-aware feature volumes that serve as a unified intermediate representation for inputs from different modalities and viewpoints.
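To make the idea of a shared, volume-rendered feature representation more concrete, the sketch below alpha-composites per-sample features along a single camera ray, in the style commonly used for neural volume rendering. It is an illustration under stated assumptions, not the paper's implementation: the function names `encode` and `composite_along_ray`, the random linear "encoders", the channel counts, and the feature dimension are all hypothetical, and the compositing rule is the standard one rather than something confirmed by the summary above.

```python
import numpy as np

FEATURE_DIM = 8  # assumed size of the shared feature space (hypothetical)
rng = np.random.default_rng(0)

# Hypothetical stand-ins for modality-specific encoders: each maps its own
# input channel count (3 for EO, 1 for SAR) into the same feature space.
ENCODERS = {
    "EO": rng.normal(size=(3, FEATURE_DIM)),
    "SAR": rng.normal(size=(1, FEATURE_DIM)),
}


def encode(modality, samples):
    """Project (S, channels) samples into the shared (S, FEATURE_DIM) space."""
    return samples @ ENCODERS[modality]


def composite_along_ray(features, densities, deltas):
    """Alpha-composite per-sample features along one ray.

    features:  (S, C) feature vector at each of S samples on the ray
    densities: (S,)   non-negative volume density at each sample
    deltas:    (S,)   length of the ray segment covered by each sample
    Returns the rendered (C,) feature and the (S,) compositing weights.
    """
    # Opacity contributed by each segment.
    alpha = 1.0 - np.exp(-densities * deltas)
    # Transmittance: fraction of the ray that reaches sample i unoccluded.
    trans = np.concatenate(([1.0], np.cumprod(1.0 - alpha)[:-1]))
    weights = trans * alpha
    return weights @ features, weights


# Usage: render one feature vector per modality for the same ray geometry.
densities = np.array([0.1, 0.5, 2.0, 0.3])
deltas = np.full(4, 0.25)
eo_feature, _ = composite_along_ray(
    encode("EO", rng.random((4, 3))), densities, deltas
)
sar_feature, _ = composite_along_ray(
    encode("SAR", rng.random((4, 1))), densities, deltas
)
```

Because both modalities are projected into the same feature space before compositing, the rendered vectors have identical shape regardless of the sensor, which is the property a downstream model shared across modalities would rely on.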

Download paper here
Download code here