Real-time Facial Surface Geometry from Monocular Video on Mobile GPUs
We present an end-to-end neural network-based model for inferring an approximate 3D mesh representation of a human face from single camera input for AR applications. The relatively dense mesh model of 468 vertices is well-suited for face-based AR effects. The proposed model demonstrates super-realtime inference speed on mobile GPUs (100-1000+ FPS, depending on the device and model variant) and a high prediction quality that is comparable to the variance in manual annotations of the same image.
Forward citations
Cited by 2 Pith papers
- Markerless Head Tracking for Accurate and Accessible Neuronavigation
  Markerless multi-camera head tracking achieves 2.32 mm and 2.01° median accuracy versus marker-based systems in 50 subjects, sufficient for transcranial magnetic stimulation.
- Emotion-Conditioned Short-Horizon Human Pose Forecasting with a Lightweight Predictive World Model
  Facial emotion embeddings improve short-term pose forecasting accuracy for emotion-driven motions when fused via normalized gating in a lightweight LSTM world model, but not with simple multimodal fusion.