AdaGaR: Adaptive Gabor Representation for Dynamic Scene Reconstruction
Abstract
Reconstructing dynamic 3D scenes from monocular videos requires simultaneously capturing high-frequency appearance details and temporally continuous motion. Existing methods built on single Gaussian primitives are limited by the low-pass filtering nature of the Gaussian kernel, while standard Gabor functions introduce energy instability. Moreover, the lack of temporal continuity constraints often leads to motion artifacts during interpolation.
We propose AdaGaR, a unified framework addressing both frequency adaptivity and temporal continuity in explicit dynamic scene modeling. We introduce Adaptive Gabor Representation, extending Gaussians through learnable frequency weights and adaptive energy compensation to balance detail capture and stability. For temporal continuity, we employ Cubic Hermite Splines with Temporal Curvature Regularization to ensure smooth motion evolution. An Adaptive Initialization mechanism combining depth estimation, point tracking, and foreground masks establishes stable point cloud distributions in early training.
Experiments on Tap-Vid DAVIS demonstrate state-of-the-art performance (PSNR 35.49, SSIM 0.9433, LPIPS 0.0723) and strong generalization across frame interpolation, depth consistency, video editing, and stereo view synthesis.
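To make the temporal-continuity component concrete, here is a minimal sketch of cubic Hermite interpolation for a per-primitive trajectory, together with a second-difference curvature penalty in the spirit of Temporal Curvature Regularization. This is an illustrative assumption, not the paper's implementation; the function names and the finite-difference form of the penalty are ours.

```python
import numpy as np

def hermite(p0, p1, m0, m1, t):
    """Cubic Hermite interpolation between keyframe values p0, p1
    with tangents m0, m1, evaluated at normalized time t in [0, 1]."""
    t2, t3 = t * t, t * t * t
    h00 = 2 * t3 - 3 * t2 + 1
    h10 = t3 - 2 * t2 + t
    h01 = -2 * t3 + 3 * t2
    h11 = t3 - t2
    return h00 * p0 + h10 * m0 + h01 * p1 + h11 * m1

def temporal_curvature_penalty(positions, dt):
    """Mean squared second finite difference of sampled positions:
    penalizes abrupt acceleration along a trajectory (illustrative
    stand-in for the paper's temporal curvature regularizer)."""
    accel = (positions[2:] - 2 * positions[1:-1] + positions[:-2]) / dt**2
    return np.mean(np.sum(accel**2, axis=-1))
```

With zero tangents the spline interpolates the endpoints and passes through their midpoint at t = 0.5; a perfectly linear trajectory incurs zero curvature penalty.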
Pipeline
Adaptive Gaussian → Gabor Transition
The slider below controls the aggregated wave coefficients that modulate our Gaussian primitives. Increasing the coefficient strengthens the sinusoidal carrier with frequency ω, moving the kernel from pure Gaussian support to a detailed Adaptive Gabor representation.
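The transition the slider illustrates can be sketched in 1-D: a Gaussian envelope multiplied by a blended cosine carrier. This is a minimal sketch under our own assumptions; the blending parameter `alpha` and the peak-preserving carrier form are illustrative stand-ins for the paper's learnable frequency weights and energy compensation.

```python
import numpy as np

def gabor_1d(x, sigma, omega, alpha):
    """Gaussian envelope modulated by a cosine carrier of frequency omega.
    alpha in [0, 1] blends from pure Gaussian (alpha = 0) to full Gabor
    (alpha = 1). Blending the carrier with a constant term keeps the peak
    at x = 0 equal to 1, a simple form of energy compensation."""
    envelope = np.exp(-x**2 / (2 * sigma**2))
    carrier = (1 - alpha) + alpha * np.cos(omega * x)
    return envelope * carrier
```

At alpha = 0 this reduces exactly to the Gaussian envelope; increasing alpha introduces oscillatory lobes while the peak amplitude stays fixed.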
[Interactive panels: Gaussian Envelope (1-D) · Gabor (1-D) · 2-D Gaussian · 2-D Gabor]
The CUDA-style accumulation multiplies the Gaussian support by sinusoidal wave weights to update the per-pixel alpha. The visualization shows how this modulation sculpts opacity.
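The accumulation described above can be sketched as a front-to-back compositing loop in Python. This is an illustrative sketch of the idea, not the actual CUDA kernel; the argument names and the clipping of the modulated opacity are our assumptions.

```python
import numpy as np

def accumulate_alpha(gauss_vals, wave_weights, base_opacities):
    """Front-to-back alpha compositing for one pixel, where each
    primitive's contribution is its Gaussian support scaled by a
    sinusoidal wave weight (the Gabor modulation)."""
    transmittance = 1.0
    accumulated = 0.0
    for g, w, o in zip(gauss_vals, wave_weights, base_opacities):
        a = float(np.clip(o * g * w, 0.0, 1.0))  # modulated, clamped opacity
        accumulated += transmittance * a          # add visible contribution
        transmittance *= (1.0 - a)                # attenuate what lies behind
    return accumulated
```

A single primitive with unit support and weight contributes exactly its base opacity; a second identical primitive behind it is attenuated by the remaining transmittance.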