Abstract
We introduce OceanSplat, a novel method for real-time 3D underwater scene representation that captures geometric structure in scattering media with high fidelity. In underwater environments, attenuation and scattering degrade 3D scene fidelity by introducing visual artifacts. To address this, we propose a trinocular view consistency constraint that enforces geometric alignment across views translated along two orthogonal axes, thereby constraining the spatial placement of 3D Gaussians and preserving object geometry under complex scattering conditions. Additionally, we design a self-supervised geometric regularization module that reuses the synthetic camera translations to generate an epipolar depth prior, which suppresses medium-induced misplacement of 3D Gaussians. Moreover, we propose a depth-aware alpha adjustment module that uses directional and depth cues to guide visibility learning in the early stages of training, preventing erroneous placements and entanglement with the medium. By preventing foreground 3D Gaussians from erroneously contributing to the medium in novel views, our method represents 3D scenes in scattering media without external geometric cues while preserving overall scene quality. Experiments on real-world underwater and simulated scenes demonstrate that our method outperforms prior approaches in 3D scene representation under scattering media.
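The trinocular consistency constraint can be pictured as a photometric reprojection check between the reference view and two synthetically translated views. The PyTorch sketch below is a minimal illustration of that idea, not the authors' implementation; the rasterizer stand-in `render_fn`, the baseline length, and the pure-translation camera model are all our assumptions.

```python
# Minimal sketch of trinocular view consistency (assumed mechanism, not the
# released OceanSplat code): render two views whose cameras are translated
# along two orthogonal axes, inverse-warp each back into the reference view
# using the rasterized reference depth, and penalize photometric mismatch.

import torch
import torch.nn.functional as F

def inverse_warp(src_img, ref_depth, K, T_src_ref):
    """Resample src_img at the pixels where reference-view points project.

    src_img:   (3, H, W) image rendered from the translated camera
    ref_depth: (H, W) depth rasterized from the reference camera
    K:         (3, 3) pinhole intrinsics shared by all three views
    T_src_ref: (4, 4) rigid transform from reference to source camera frame
    """
    H, W = ref_depth.shape
    dev = ref_depth.device
    v, u = torch.meshgrid(
        torch.arange(H, device=dev, dtype=torch.float32),
        torch.arange(W, device=dev, dtype=torch.float32),
        indexing="ij",
    )
    pix = torch.stack([u, v, torch.ones_like(u)]).reshape(3, -1)   # (3, H*W)
    pts = torch.linalg.inv(K) @ pix * ref_depth.reshape(1, -1)     # back-project
    pts_h = torch.cat([pts, torch.ones_like(pts[:1])], dim=0)
    proj = K @ (T_src_ref @ pts_h)[:3]                             # to src pixels
    uv = proj[:2] / proj[2:].clamp(min=1e-6)
    grid = torch.stack([uv[0] / (W - 1) * 2 - 1,                   # normalize for
                        uv[1] / (H - 1) * 2 - 1], dim=-1)          # grid_sample
    return F.grid_sample(src_img[None], grid.view(1, H, W, 2),
                         align_corners=True, padding_mode="border")[0]

def trinocular_loss(render_fn, ref_rgb, ref_depth, K, baseline=0.05):
    """Photometric consistency against two orthogonally translated views.

    render_fn(t) stands in for the Gaussian rasterizer: it returns the RGB
    image rendered after shifting the camera by t in its own frame.
    """
    loss = 0.0
    for axis in (0, 1):                       # x- and y-axis baselines
        t = torch.zeros(3)
        t[axis] = baseline
        src_rgb = render_fn(t)                # hypothetical rasterizer call
        T = torch.eye(4)
        T[:3, 3] = -t                         # ref-frame point -> src frame
        warped = inverse_warp(src_rgb, ref_depth, K, T)
        loss = loss + (warped - ref_rgb).abs().mean()
    return loss
```

Using two orthogonal baselines rather than one disambiguates depth along both image axes, which is also what lets the same synthetic views double as an epipolar depth prior.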
Methodology
Figure 1: Overview of OceanSplat. We enforce trinocular consistency by inverse-warping rasterized outputs from two orthogonally translated views to guide 3D Gaussian placement. From the same views, we derive a synthetic epipolar depth prior via triangulation, providing self-supervised geometric constraints. Additionally, depth-aware alpha adjustment suppresses erroneous 3D Gaussians early in training and aligns the rendered depth with the Gaussian z-component to prevent floaters in novel views.
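Because the two auxiliary cameras are pure translations of the reference camera, correspondences lie along axis-aligned epipolar lines, and triangulation reduces to the classic stereo relation Z = f·b / disparity. The sketch below shows one plausible form of the depth prior and of the early-training alpha gating; the fusion rule, the matching step (assumed to come from photometric search along the epipolar line), and all thresholds are our assumptions, not details taken from the paper.

```python
# Sketch of the synthetic epipolar depth prior and depth-aware alpha
# adjustment (assumed mechanisms; correspondence search is out of scope).

import torch

def epipolar_depth_prior(u_ref, u_src_x, v_ref, v_src_y, fx, fy, baseline):
    """Triangulate depth from the two orthogonal translation baselines.

    u_ref, u_src_x: matched horizontal pixel coords (reference vs. x-shifted view)
    v_ref, v_src_y: matched vertical pixel coords (reference vs. y-shifted view)
    """
    z_x = fx * baseline / (u_ref - u_src_x).clamp(min=1e-6)  # Z = f*b/disparity
    z_y = fy * baseline / (v_ref - v_src_y).clamp(min=1e-6)
    return 0.5 * (z_x + z_y)          # simple average as the fused prior

def depth_prior_loss(rendered_depth, prior_depth, valid):
    # Self-supervised geometric regularization: pull the rasterized depth
    # toward the triangulated prior on pixels where matching succeeded.
    return ((rendered_depth - prior_depth).abs() * valid).sum() / valid.sum().clamp(min=1)

def depth_aware_alpha(opacity, z_cam, depth_at_px, step, warmup=3000, tau=0.1):
    # Early-training visibility guidance: softly suppress the opacity of
    # Gaussians whose camera-space z disagrees with the rasterized depth at
    # their projected pixel, i.e. likely medium-entangled floaters.
    if step >= warmup:
        return opacity
    mismatch = (z_cam - depth_at_px).abs() / depth_at_px.clamp(min=1e-6)
    return opacity * torch.exp(-mismatch / tau)   # ~1 aligned, ->0 misplaced
```

A soft gate, rather than hard pruning, keeps gradients flowing, so genuinely foreground Gaussians can recover as their placement improves; this is one natural reading of "guiding visibility learning" in the early training stages.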
Qualitative Results
Figure 2: Qualitative results of novel view synthesis on diverse real-world underwater 3D scenes.
Quantitative Results
Table 1: Quantitative evaluation of novel view synthesis on real-world underwater scenes, compared against prior methods whose results we could reproduce.