4D Gaussian Splatting (4DGS) has recently emerged as a promising technique for capturing complex dynamic 3D scenes with high fidelity. It combines a 4D Gaussian representation with a GPU-friendly rasterizer, enabling fast rendering. Despite these advantages, 4DGS faces a significant challenge: it requires millions of 4D Gaussians, each with extensive associated attributes, which leads to substantial memory and storage costs.
This paper introduces MEGA, a memory-efficient framework for 4DGS. We streamline the color attribute by decomposing it into a per-Gaussian direct color component with only 3 parameters and a shared lightweight alternating current (AC) color predictor. This removes the need for spherical harmonics coefficients, which typically involve up to 144 parameters in classic 4DGS, and yields a memory-efficient 4D Gaussian representation. Furthermore, we introduce an entropy-constrained Gaussian deformation technique that uses a deformation field to expand the action range of each Gaussian and integrates an opacity-based entropy loss to limit the number of Gaussians, forcing the model to fit a dynamic scene well with as few Gaussians as possible. With simple half-precision storage and zip compression, our framework reduces storage by approximately 190\(\times\) and 125\(\times\) on the Technicolor and Neural 3D Video datasets, respectively, compared to the original 4DGS, while maintaining comparable rendering speed and scene representation quality.
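To make the opacity-based entropy loss more concrete, the following is a minimal sketch of how such a penalty could behave. The text above does not give the exact form of the loss, so the binary-entropy formula, the function names `opacity_entropy_loss` and `prune_by_opacity`, and the pruning threshold are all assumptions made for illustration, not the authors' implementation.

```python
import math


def opacity_entropy_loss(opacities, eps=1e-8):
    """Mean binary entropy of per-Gaussian opacities in (0, 1).

    Assumed form (not taken from the source): the penalty is highest when an
    opacity sits near 0.5 and approaches zero as it moves toward 0 or 1.
    """
    total = 0.0
    for o in opacities:
        o = min(max(o, eps), 1.0 - eps)  # clamp to keep log() finite
        total += -(o * math.log(o) + (1.0 - o) * math.log(1.0 - o))
    return total / len(opacities)


def prune_by_opacity(opacities, threshold=0.005):
    """Return indices of Gaussians to keep (threshold is hypothetical)."""
    return [i for i, o in enumerate(opacities) if o > threshold]


# Illustrative values only: undecided opacities versus near-binary ones.
undecided = [0.5, 0.5, 0.5, 0.5]
decided = [0.999, 0.001, 0.999, 0.001]

loss_undecided = opacity_entropy_loss(undecided)
loss_decided = opacity_entropy_loss(decided)
kept = prune_by_opacity(decided)
```

Under this assumed form, minimizing the loss drives opacities toward 0 or 1. Gaussians that end up nearly transparent contribute little to the rendered image and can then be removed, which is one plausible way a loss of this kind limits the Gaussian count.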
Framework. Overview of the proposed MEGA framework. (a) The original 4D Gaussian uses 4D spherical harmonics \(\boldsymbol{h}\) to represent color, which is highly redundant and consumes substantial memory. (b) Our memory-efficient 4D Gaussian replaces \(\boldsymbol{h}\) with a compact, view-independent, and time-independent color component \(\boldsymbol{c}_{dc}\), achieving roughly an 8\(\times\) reduction in storage overhead. (c) In the per-Gaussian transformation, a lightweight AC color predictor compensates for the viewpoint and temporal information missing from \(\boldsymbol{c}_{dc}\), and a deformation predictor expands the action range of each Gaussian. (d) Our rendering process consists of four steps: per-Gaussian transformation, temporal slicing, projection, and differentiable rasterization.
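The roughly 8\(\times\) per-Gaussian figure can be checked with simple parameter counting. The sketch below takes the 144 spherical harmonics parameters and the 3-parameter color component from the text above. The counts assumed for the remaining attributes (4D position, 4D scale, a rotation stored as two quaternions, and one opacity value) are assumptions about a typical 4DGS layout rather than values stated here, and the dictionary and function names are hypothetical.

```python
# Per-Gaussian attribute sizes, in number of scalar parameters.
# Only "color_sh" (144) and "color_dc" (3) come from the text; the geometry
# and opacity counts are assumptions about a typical 4DGS layout.
ORIGINAL_4D_GAUSSIAN = {
    "position_xyzt": 4,
    "scale": 4,
    "rotation": 8,   # assumed: two quaternions for a 4D rotation
    "opacity": 1,
    "color_sh": 144,
}

COMPACT_4D_GAUSSIAN = {
    "position_xyzt": 4,
    "scale": 4,
    "rotation": 8,
    "opacity": 1,
    "color_dc": 3,
}


def params_per_gaussian(layout):
    """Total number of scalar parameters stored for one Gaussian."""
    return sum(layout.values())


def reduction_ratio(before, after):
    """How many times smaller the compact layout is per Gaussian."""
    return params_per_gaussian(before) / params_per_gaussian(after)


original_params = params_per_gaussian(ORIGINAL_4D_GAUSSIAN)
compact_params = params_per_gaussian(COMPACT_4D_GAUSSIAN)
ratio = reduction_ratio(ORIGINAL_4D_GAUSSIAN, COMPACT_4D_GAUSSIAN)
```

Under these assumptions the count drops from 161 to 20 parameters per Gaussian, a ratio of about 8. This covers only the per-Gaussian representation. The much larger overall storage reductions reported in the abstract also depend on using fewer Gaussians, half-precision storage, and zip compression.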
Subjective comparison of various methods.
@article{zhang2024mega,
title={MEGA: Memory-Efficient 4D Gaussian Splatting for Dynamic Scenes},
author={Zhang, Xinjie and Liu, Zhening and Zhang, Yifan and Ge, Xingtong and He, Dailan and Xu, Tongda and Wang, Yan and Lin, Zehong and Yan, Shuicheng and Zhang, Jun},
journal={arXiv preprint arXiv:2410.13613},
year={2024}
}