Luyuan Wang, Yiqian Wu, Yongliang Yang, Chen Liu, Xiaogang Jin
Computer Animation and Virtual Worlds (Special Issue of CASA'2024), Wiley, 2024, 35(3): e2278
Given digital apparel sample display images as input (a, c), our method can effectively improve the realism of avatars’ faces (b, d). The gray boxes highlight the faces. We also apply our method to rendering-style portraits (e), producing photo-realistic results (f).
The rapid development
of the online apparel shopping industry demands innovative solutions for
high-quality digital apparel sample displays with virtual avatars. However,
developing such displays is prohibitively expensive and prone to the well-known “uncanny valley” effect, in which a nearly human-looking character evokes eeriness and revulsion, degrading the user experience. To
effectively mitigate the “uncanny valley” effect and improve the overall
authenticity of digital apparel sample displays, we present a novel
photo-realistic portrait generation framework. Our key idea is to employ
transfer learning to learn an identity-consistent mapping from the latent
space of rendered portraits to that of real portraits. During inference, an input avatar portrait is translated directly into a realistic portrait by changing its appearance style while preserving the facial identity. To this end, we collect a new dataset,
Daz-Rendered-Faces-HQ (DRFHQ), specifically designed for rendering-style
portraits. We use this dataset to fine-tune the StyleGAN2 generator pre-trained on FFHQ, within a carefully crafted training framework that preserves the geometric and color features relevant to facial identity. We evaluate our framework on portraits spanning diverse genders, ages, and races.
Qualitative and quantitative evaluations, along with ablation studies,
highlight our method’s advantages over state-of-the-art approaches.
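To make the transfer-learning idea above concrete, the following is a minimal PyTorch sketch of one plausible instantiation: a copy of a photo-realistic StyleGAN2 generator is fine-tuned on rendered portraits while an identity term ties its outputs to the frozen source generator, so that a shared latent code renders the same identity in both domains. Every name here (load_pretrained_generator, load_discriminator, load_face_embedder, invert, the checkpoint paths, and the loss weight) is a hypothetical placeholder, not the authors' released code or their exact losses.

import copy
import torch
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical loaders: any StyleGAN2 implementation exposing
# G.mapping(z) -> w and G.synthesis(w) -> image would fit this sketch.
G_photo = load_pretrained_generator("stylegan2_ffhq.pt").to(device).eval()
G_rend = copy.deepcopy(G_photo).train()            # trainable rendered-domain copy
D_rend = load_discriminator("stylegan2_ffhq.pt").to(device)  # updated on DRFHQ images (omitted)
id_net = load_face_embedder("face_id.pt").to(device).eval()  # e.g. an ArcFace-style network

for p in G_photo.parameters():
    p.requires_grad_(False)                        # source generator stays frozen

opt = torch.optim.Adam(G_rend.synthesis.parameters(), lr=2e-4, betas=(0.0, 0.99))
lambda_id = 1.0                                    # identity-regularization weight (placeholder)

def identity_distance(a, b):
    # Cosine distance between face embeddings of two image batches.
    ea = F.normalize(id_net(a), dim=1)
    eb = F.normalize(id_net(b), dim=1)
    return (1.0 - (ea * eb).sum(dim=1)).mean()

for step in range(10_000):
    z = torch.randn(8, 512, device=device)
    w = G_photo.mapping(z)                         # latent code shared by both generators
    img_rend = G_rend.synthesis(w)                 # rendered-style output (trainable)
    with torch.no_grad():
        img_photo = G_photo.synthesis(w)           # photo-realistic reference (frozen)

    # Non-saturating adversarial loss pulls outputs toward the rendered
    # domain; the identity term keeps the two domains identity-aligned.
    g_loss = F.softplus(-D_rend(img_rend)).mean() \
             + lambda_id * identity_distance(img_rend, img_photo)

    opt.zero_grad()
    g_loss.backward()
    opt.step()

# Inference: invert a rendered avatar portrait into G_rend's latent space
# (GAN inversion via an encoder or per-image optimization), then decode
# the same code with the frozen photo-realistic generator.
w_avatar = invert(G_rend, avatar_image)            # avatar_image: input rendered portrait tensor
realistic = G_photo.synthesis(w_avatar)

Freezing the photo-realistic generator and reusing its mapping network is one way to keep the two latent spaces aligned, which is what permits the inference-time swap in the final two lines.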