CrossModalityDiffusion: Multi-Modal Novel View Synthesis with Unified Intermediate Representation
Published in WACV GeoCV Workshop, 2025
Recommended citation: Berian, A., Brignac, D., Wu, J., Daba, N., & Mahalanobis, A. CrossModalityDiffusion: Multi-Modal Novel View Synthesis with Unified Intermediate Representation. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 2025.
CrossModalityDiffusion is a modular framework for interpreting scene geometry across diverse geospatial imaging modalities, such as electro-optical (EO), synthetic aperture radar (SAR), and LiDAR. Using modality-specific encoders and volumetric rendering, it generates geometry-aware feature volumes that unify inputs captured from varying viewpoints.
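As a rough illustration of the volumetric rendering step mentioned above, the sketch below alpha-composites per-sample feature vectors along a single ray using the standard emission-absorption quadrature. It is not the authors' implementation: the function name `composite_features`, its arguments, and the plain-list data layout are all hypothetical, and the paper's actual encoders and feature-volume format are not reproduced here.

```python
import math
from typing import List, Sequence


def composite_features(
    densities: Sequence[float],
    features: Sequence[Sequence[float]],
    deltas: Sequence[float],
) -> List[float]:
    """Alpha-composite feature vectors sampled along one ray.

    Hypothetical helper for illustration only.
    densities: non-negative volume density at each sample, ordered near to far.
    features:  one feature vector per sample (all the same length).
    deltas:    distance between consecutive samples along the ray.
    """
    if not (len(densities) == len(features) == len(deltas)):
        raise ValueError("densities, features, and deltas must have equal length")

    dim = len(features[0]) if features else 0
    rendered = [0.0] * dim
    transmittance = 1.0  # fraction of the ray not yet absorbed

    for sigma, feat, delta in zip(densities, features, deltas):
        # Opacity contributed by this sample over its segment of the ray.
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        for k in range(dim):
            rendered[k] += weight * feat[k]
        # Attenuate what remains visible to samples farther along the ray.
        transmittance *= 1.0 - alpha

    return rendered
```

In this sketch, samples in empty space (zero density) contribute nothing, while a fully opaque sample hides everything behind it. That occlusion behavior is what makes a rendered feature depend on scene geometry rather than on the input modality alone.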
