SqueezeMe: Mobile-Ready Distillation of Gaussian Full-Body Avatars

SqueezeMe animates and renders 3 Gaussian Splatting avatars simultaneously at 72 frames per second, running locally on a Meta Quest 3 VR headset.

Abstract

Gaussian-based human avatars have achieved an unprecedented level of visual fidelity. However, existing approaches based on high-capacity neural networks typically require a desktop GPU to achieve real-time performance for a single avatar, and it remains non-trivial to animate and render such avatars on mobile devices, including standalone VR headsets, due to their substantially limited memory and computational bandwidth. In this paper, we present SqueezeMe, a simple and highly effective framework that converts high-fidelity 3D Gaussian full-body avatars into a lightweight representation supporting both animation and rendering with mobile-grade compute. Our key observation is that decoding pose-dependent Gaussian attributes with a neural network incurs non-negligible memory and computational overhead. Inspired by blendshapes and linear pose correctives widely used in computer graphics, we address this by distilling the pose correctives learned by neural networks into linear layers. Moreover, we further reduce the parameter count by sharing the correctives among nearby Gaussians. Combining these techniques with a custom Vulkan-based splatting pipeline, we achieve, for the first time, simultaneous animation and rendering of 3 Gaussian avatars in real time (72 FPS) on a Meta Quest 3 VR headset.
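To make the core idea concrete, below is a minimal NumPy sketch of pose correctives expressed as a single linear layer, with one corrective shared by each group of nearby Gaussians. This is not the authors' implementation: the shapes, the attribute layout, the pose parameterization, and the random grouping are all illustrative assumptions; in practice the linear weights would be distilled from a trained neural decoder and the grouping would follow spatial proximity.

    # Minimal sketch (assumptions throughout, not the paper's code):
    # pose-dependent Gaussian correctives as one linear map, evaluated
    # once per group of nearby Gaussians and then gathered per Gaussian.
    import numpy as np

    rng = np.random.default_rng(0)

    num_gaussians = 100_000  # total Gaussians in the avatar (assumed)
    num_groups = 10_000      # nearby Gaussians share one corrective (assumed)
    pose_dim = 72            # e.g. axis-angle body pose vector (assumed)
    attr_dim = 11            # 3 position + 4 rotation + 3 scale + 1 opacity (assumed)

    # Distillation replaces the neural decoder with a single linear layer:
    # corrective = W @ pose + b, computed per group rather than per Gaussian.
    W = rng.normal(0.0, 0.01, size=(num_groups * attr_dim, pose_dim)).astype(np.float32)
    b = np.zeros(num_groups * attr_dim, dtype=np.float32)

    # Each Gaussian indexes the group whose corrective it reuses
    # (random here; spatial clustering in a real system).
    group_of = rng.integers(0, num_groups, size=num_gaussians)

    def decode_correctives(pose: np.ndarray) -> np.ndarray:
        """Return per-Gaussian attribute deltas for one pose vector."""
        per_group = (W @ pose + b).reshape(num_groups, attr_dim)
        return per_group[group_of]  # gather: shared corrective per Gaussian

    pose = rng.normal(size=pose_dim).astype(np.float32)
    deltas = decode_correctives(pose)  # shape: (num_gaussians, attr_dim)
    rest_attrs = rng.normal(size=(num_gaussians, attr_dim)).astype(np.float32)
    posed_attrs = rest_attrs + deltas  # apply correctives to rest-pose attributes

Under these assumed sizes, the linear layer needs only one matrix-vector product per frame, and sharing correctives shrinks both the weight matrix and the per-frame output by the grouping factor (10x here), which is the kind of saving that makes mobile-grade animation feasible.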

BibTeX

@inproceedings{SqueezeMe,
  author    = {Iandola, Forrest and Pidhorskyi, Stanislav and Santesteban, Igor and Gupta, Divam and Pahuja, Anuj and Bartolovic, Nemanja and Yu, Frank and Garbin, Emanuel and Simon, Tomas and Saito, Shunsuke},
  title     = {{SqueezeMe}: Mobile-Ready Distillation of Gaussian Full-Body Avatars},
  booktitle = {SIGGRAPH},
  year      = {2025},
  publisher = {ACM},
  url       = {https://arxiv.org/abs/2412.15171},
}