Weiwei Xu (许威威)

 

State Key Lab of CAD&CG, College of Computer Science, Zhejiang University

 

Room 516, Mong Man Wai Building, Zijingang Campus, Zhejiang University, 388 Yuhangtang Road, Xihu District, Hangzhou, Zhejiang, China

 

Email: xww@cad.zju.edu.cn

 

 

Biography

Weiwei Xu is a tenured professor at the State Key Lab of CAD&CG, Zhejiang University, and a Changjiang Scholar of the Ministry of Education. He previously worked as a postdoctoral researcher at Ritsumeikan University in Japan, a researcher in the Internet Graphics Group at Microsoft Research Asia, and a Zhejiang Qianjiang Scholar Distinguished Professor at Hangzhou Normal University. His research focuses on intelligent graphics and image processing, covering 3D vision, physics-based simulation, digital twins, and virtual reality. He has published more than 100 papers in leading conferences and journals, including more than 50 CCF-A papers in venues such as ACM Transactions on Graphics, IEEE TVCG, IEEE CVPR, NIPS, and AAAI, and holds 15 granted Chinese and US patents. The 3D registration and reconstruction techniques developed by his group have been applied in Shining 3D scanners, dynamic human-body 3D reconstruction, Baidu Apollo autonomous-driving simulation, and Huawei Cyberverse (河图) large-scale scene reconstruction. He received the NSFC Excellent Young Scientists Fund in 2014, has led a key project of the National Natural Science Foundation of China, and won a Second Prize of the Zhejiang Provincial Natural Science Award.

 

Academic Service

Co-chair of international conferences:

ACM VRST 2013, 2014

Program committee member of international conferences:

Pacific Graphics, ACM Symposium on Geometry Processing, CASA, IEEE Virtual Reality, GMP, SPM

Reviewer for:

ACM SIGGRAPH, ACM SIGGRAPH Asia, IEEE Visualization, IEEE TVCG, SGP, Computers & Graphics

 

Admissions:

Positions for master's students, PhD students, and postdocs in computer science are available year-round. Self-motivated students committed to high-quality research and software development are welcome to apply.

 

Publications:

 

2024

 


Automatic Digital Garment Initialization from Sewing Patterns

Chen Liu, Weiwei Xu, Yin Yang, Huamin Wang

The rapid advancement of digital fashion and generative AI technology calls for an automated approach to transform digital sewing patterns into well-fitted garments on human avatars. When given a sewing pattern with its associated sewing relationships, the primary challenge is to establish an initial arrangement of sewing pieces that is free from folding and intersections. This setup enables a physics-based simulator to seamlessly stitch them into a digital garment, avoiding undesirable local minima. To achieve this, we harness AI classification, heuristics, and numerical optimization. This has led to the development of an innovative hybrid system that minimizes the need for user intervention in the initialization of garment pieces. The seeding process of our system involves the training of a classification network for selecting seed pieces, followed by solving an optimization problem to determine their positions and shapes. Subsequently, an iterative selection-arrangement procedure automates the selection of pattern pieces and employs a phased initialization approach to mitigate local minima associated with numerical optimization. Our experiments confirm the reliability, efficiency, and scalability of our system when handling intricate garments with multiple layers and numerous pieces. According to our findings, 68 percent of garments can be initialized with zero user intervention, while the remaining garments can be easily corrected through user operations during post-processing.

 

ACM Transactions on Graphics 2024 (SIGGRAPH Journal track)  [Paper] [Code] [Video] 


 

 


High-quality Surface Reconstruction using Gaussian Surfels

Pinxuan Dai, Jiamin Xu, Wenxiang Xie, Xinguo Liu, Huamin Wang, Weiwei Xu

We propose a novel point-based representation, Gaussian surfels, to combine the advantages of the flexible optimization procedure in 3D Gaussian points and the surface alignment property of surfels. This is achieved by directly setting the z-scale of 3D Gaussian points to 0, effectively flattening the original 3D ellipsoid into a 2D ellipse. Such a design provides clear guidance to the optimizer. By treating the local z-axis as the normal direction, it greatly improves optimization stability and surface alignment. While the derivatives to the local z-axis computed from the covariance matrix are zero in this setting, we design a self-supervised normal-depth consistency loss to remedy this issue. Monocular normal priors and foreground masks are incorporated to enhance the reconstruction quality, mitigating issues related to highlights and background. We propose a volumetric cutting method to aggregate the information of Gaussian surfels so as to remove erroneous points in depth maps generated by alpha blending. Finally, we apply screened Poisson reconstruction method to the fused depth maps to extract the surface mesh. Experimental results show that our method demonstrates superior performance in surface reconstruction compared to state-of-the-art neural volume rendering and point-based rendering methods.

 

ACM SIGGRAPH 2024  [Paper] [Code] [Video]
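
A minimal numpy sketch of the flattened-covariance construction described above (the quaternion convention and variable names are illustrative, not taken from the released code): pinning the z-scale to zero degenerates the 3D Gaussian into a planar ellipse whose normal is the local z-axis.

    import numpy as np

    def quat_to_rot(q):
        # Unit quaternion (w, x, y, z) -> 3x3 rotation matrix.
        w, x, y, z = q / np.linalg.norm(q)
        return np.array([
            [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
            [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
            [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
        ])

    def surfel_covariance(q, sx, sy):
        # Covariance of a Gaussian surfel: the z-scale is fixed to 0,
        # flattening the 3D ellipsoid into a 2D ellipse; the local
        # z-axis then serves as the surfel normal.
        R = quat_to_rot(q)
        S = np.diag([sx, sy, 0.0])
        return R @ S @ S.T @ R.T, R[:, 2]

    cov, normal = surfel_covariance(np.array([1.0, 0.2, 0.1, 0.0]), 0.05, 0.03)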


 

 


Neural Homogenization of Yarn-Level Cloth

Xudong Feng, Huamin Wang, Yin Yang, Weiwei Xu

Real-world fabrics, composed of threads and yarns, often display complex stress-strain relationships, making their homogenization a challenging task for fast simulation by continuum-based models. Consequently, existing homogenized yarn-level models frequently struggle with numerical stability at large time steps, forcing a trade-off between model accuracy and stability. In this paper, we propose a neural homogenized constitutive model for simulating yarn-level cloth. Unlike analytic models, a neural model is advantageous in adapting to complex dynamic behaviors, and its inherent smoothness naturally mitigates stability issues. We also introduce a sector-based warm-start strategy to accelerate the data collection process in homogenization. This model is trained using collected strain energy datasets, and its accuracy is validated through both qualitative and quantitative experiments. Thanks to our model's stability, our simulator can now achieve two-orders-of-magnitude speedups with large time steps compared to previous models.

 

ACM SIGGRAPH 2024  [Paper] [Code] [Video] 


 

 


3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation

Songchun Zhang, Yibo Zhang, Quan Zheng, Rui Ma, Wei Hua, Hujun Bao, Weiwei Xu, Changqing Zou

Text-driven 3D scene generation techniques have made rapid progress in recent years. Their success is mainly attributed to using existing generative models to iteratively perform image warping and inpainting to generate 3D scenes. However, these methods heavily rely on the outputs of existing models, leading to error accumulation in geometry and appearance that prevents the models from being used in various scenarios (e.g., outdoor and unreal scenarios). To address this limitation, we generatively refine the newly generated local views by querying and aggregating global 3D information, and then progressively generate the 3D scene. Specifically, we employ a tri-plane feature-based NeRF as a unified representation of the 3D scene to constrain global 3D consistency, and propose a generative refinement network to synthesize new contents with higher quality by exploiting the natural image prior from a 2D diffusion model as well as the global 3D information of the current scene. Our extensive experiments demonstrate that, in comparison to previous methods, our approach supports a wide variety of scene generation and arbitrary camera trajectories with improved visual quality and 3D consistency.

 

IEEE CVPR 2024  [Paper] [Code] [Video]


 

 


Text-Guided 3D Face Synthesis - From Generation to Editing

Yunjie Wu, Yapeng Meng, Zhipeng Hu, Lincheng Li, Haoqian Wu, Kun Zhou, Weiwei Xu, Xin Yu

Text-guided 3D face synthesis has achieved remarkable results by leveraging text-to-image (T2I) diffusion models. However, most existing works focus solely on direct generation, ignoring editing, which restricts them from synthesizing customized 3D faces through iterative adjustments. In this paper, we propose a unified text-guided framework from face generation to editing. In the generation stage, we propose a geometry-texture decoupled generation to mitigate the loss of geometric details caused by coupling. Besides, decoupling enables us to utilize the generated geometry as a condition for texture generation, yielding highly geometry-texture aligned results. We further employ a fine-tuned texture diffusion model to enhance texture quality in both RGB and YUV space. In the editing stage, we first employ a pre-trained diffusion model to update facial geometry or texture based on the texts. To enable sequential editing, we introduce a UV domain consistency preservation regularization, preventing unintentional changes to irrelevant facial attributes. Besides, we propose a self-guided consistency weight strategy to improve editing efficacy while preserving consistency. Through comprehensive experiments, we showcase our method's superiority in face synthesis.

 

IEEE CVPR 2024  [Paper] [Code] [Video] 


 

 

2023

 

ScaNeRF: Scalable Bundle-Adjusting Neural Radiance Fields for Large-Scale Scene Rendering

Xiuchao Wu, Jiamin Xu, Xin Zhang, Hujun Bao, Qixing Huang, Yujun Shen, James Tompkin, Weiwei Xu

High-quality large-scale scene rendering requires a scalable representation and accurate camera poses. This research combines tile-based hybrid neural fields with parallel distributive optimization to improve bundle-adjusting neural radiance fields. The proposed method scales with a divide-and-conquer strategy. We partition scenes into tiles, each with a multi-resolution hash feature grid and shallow chained diffuse and specular multi-layer perceptrons (MLPs). Tiles unify foreground and background via a spatial contraction function that allows both distant objects in outdoor scenes and planar reflections as virtual images outside the tile. Decomposing appearance with the specular MLP allows a specular-aware warping loss to provide a second optimization path for camera poses. We apply the alternating direction method of multipliers (ADMM) to achieve consensus among camera poses while maintaining parallel tile optimization. Experimental results show that our method outperforms state-of-the-art neural scene rendering methods by 5%–10% in PSNR, maintaining sharp distant objects and view-dependent reflections across six indoor and outdoor scenes.

 

ACM Transactions on Graphics 2023 (SIGGRAPH Asia Journal track)  [Paper] [Code] [Video] 
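
For readers unfamiliar with consensus ADMM, the toy sketch below shows the pattern used for the pose consensus above: each "tile" optimizes its own copy of a shared parameter against a made-up local quadratic objective, and the ADMM iterations drive all copies to agreement. The numbers and the scalar parametrization are hypothetical; the paper applies the scheme to camera poses across scene tiles.

    import numpy as np

    # Per-tile objectives f_i(x) = 0.5 * a[i] * (x - b[i])**2 over local
    # copies x[i] of one shared parameter; z is the consensus value.
    a = np.array([1.0, 2.0, 0.5])     # made-up per-tile curvatures
    b = np.array([0.3, 0.1, 0.7])     # made-up per-tile optima
    rho = 1.0                         # ADMM penalty weight

    x = np.zeros(3)                   # local copies (updated in parallel)
    u = np.zeros(3)                   # scaled dual variables
    z = 0.0
    for _ in range(50):
        x = (a * b + rho * (z - u)) / (a + rho)   # local closed-form updates
        z = np.mean(x + u)                        # consensus (averaging) step
        u = u + x - z                             # dual ascent
    # z converges to (a @ b) / a.sum(), the global minimizer of sum_i f_i.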


 

 

SAILOR: Synergizing Radiance and Occupancy Fields for Live Human Performance Capture

Zheng Dong, Ke Xu, Yaoan Gao, Qilin Sun, Hujun Bao, Weiwei Xu, Rynson W.H. Lau

Immersive user experiences in live VR/AR performances require a fast and accurate free-view rendering of the performers. Existing methods are mainly based on Pixel-aligned Implicit Functions (PIFu) or Neural Radiance Fields (NeRF). However, while PIFu-based methods usually fail to produce photorealistic view-dependent textures, NeRF-based methods typically lack local geometry accuracy and are computationally heavy (e.g., dense sampling of 3D points, additional fine-tuning, or pose estimation). In this work, we propose a novel generalizable method, named SAILOR, to create photorealistic human free-view videos from very sparse RGBD streams with low latency. To produce photorealistic view-dependent textures while preserving locally accurate geometry, we integrate PIFu and NeRF such that they work synergistically by conditioning the PIFu on depth and then rendering view-dependent textures through NeRF. Specifically, we propose a novel network, named SRONet, for this hybrid representation to reconstruct and render live free-view videos. SRONet can handle unseen performers without fine-tuning. Both geometric and colorimetric supervision signals are exploited to enhance SRONet's capability of capturing high-quality details. Besides, a neural blending-based ray interpolation scheme, a tree-based data structure, and a parallel computing pipeline are incorporated for fast upsampling, efficient points sampling, and acceleration. To evaluate the rendering performance, we construct a real-captured RGBD benchmark from 40 performers. Experimental results show that SAILOR outperforms existing human reconstruction and performance capture methods.

 

ACM Transactions on Graphics 2023 (SIGGRAPH Asia Journal track)  [Paper] [Code] [Video] 


 

 

Neural Motion Graph

Hongyu Tao, Shuaiying Hou, Changqing Zou, Hujun Bao, Weiwei Xu

Deep learning techniques have been employed to design a controllable human motion synthesizer. Despite their potential, however, designing a neural network-based motion synthesis that enables flexible user interaction, fine-grained controllability, and the support of new types of motions at reduced time and space consumption costs remains a challenge. In this paper, we propose a novel approach, a neural motion graph, that addresses the challenge by enabling scalability to new motions while using compact neural networks. Our approach represents each type of motion with a separate neural node to reduce the cost of adding new motion types. In addition, designing a separate neural node for each motion type enables task-specific control strategies and has greater potential to achieve a high-quality synthesis of complex motions, such as the Mongolian dance. Furthermore, a single transition network, which acts as neural edges, is used to model the transition between two motion nodes. The transition network is designed with a lightweight control module to achieve a fine-grained response to user control signals. Overall, the design choice makes the neural motion graph highly controllable and scalable. In addition to being fully flexible to user interaction through high-level and fine-grained user-control signals, our experimental and subjective evaluation results demonstrate that our proposed approach, neural motion graph, outperforms state-of-the-art human motion synthesis methods in terms of the quality of controlled motion generation.

 

ACM Transactions on Graphics 2023 (SIGGRAPH Asia Conference track)  [Paper] [Code] [Video] 


 

 

Fast and Robust Non-Rigid Registration Using Accelerated Majorization-Minimization

Yuxin Yao, Bailin Deng, Weiwei Xu, Juyong Zhang

Non-rigid 3D registration, which deforms a source 3D shape in a non-rigid way to align with a target 3D shape, is a classical problem in computer vision. Such problems can be challenging because of imperfect data (noise, outliers and partial overlap) and high degrees of freedom. Existing methods typically adopt the ℓp-type robust norm to measure the alignment error and regularize the smoothness of deformation, and use a proximal algorithm to solve the resulting non-smooth optimization problem. However, the slow convergence of such algorithms limits their wide applications. In this paper, we propose a formulation for robust non-rigid registration based on a globally smooth robust norm for alignment and regularization, which can effectively handle outliers and partial overlaps. The problem is solved using the majorization-minimization algorithm, which reduces each iteration to a convex quadratic problem with a closed-form solution. We further apply Anderson acceleration to speed up the convergence of the solver, enabling the solver to run efficiently on devices with limited compute capability. Extensive experiments demonstrate the effectiveness of our method for non-rigid alignment between two shapes with outliers and partial overlaps, with quantitative evaluation showing that it outperforms state-of-the-art methods in terms of registration accuracy and computational speed.

 

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2023  [Paper] [Code] [Video]
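
The Anderson acceleration step mentioned above applies to any fixed-point iteration; a self-contained numpy sketch (with a toy map standing in for the paper's MM iteration) is:

    import numpy as np

    def anderson_fixed_point(g, x0, m=5, iters=50):
        # Accelerate x <- g(x) by mixing the last m+1 iterates so that the
        # combined residual g(x) - x is minimized in a least-squares sense.
        x, X, F = x0.astype(float), [], []
        for _ in range(iters):
            fx = g(x)
            X.append(x); F.append(fx)
            X, F = X[-(m + 1):], F[-(m + 1):]
            R = np.stack([f - xk for f, xk in zip(F, X)], axis=1)  # residuals
            if R.shape[1] == 1:
                x = fx                      # plain fixed-point step
            else:
                # min ||R @ alpha|| s.t. alpha.sum() == 1, eliminated via
                # alpha = (1 - gamma.sum(), gamma).
                dR = R[:, 1:] - R[:, :1]
                gamma, *_ = np.linalg.lstsq(dR, -R[:, 0], rcond=None)
                alpha = np.concatenate(([1.0 - gamma.sum()], gamma))
                x = np.stack(F, axis=1) @ alpha
        return x

    # Example: the slowly converging map x <- cos(x) reaches its fixed
    # point (~0.739085) in a handful of accelerated iterations.
    print(anderson_fixed_point(np.cos, np.array([0.0])))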


 

 

CF-Font: Content Fusion for Few-shot Font Generation

Chi Wang, Min Zhou, Tiezheng Ge, Yuning Jiang, Hujun Bao, Weiwei Xu

Content and style disentanglement is an effective way to achieve few-shot font generation. It allows transferring the style of the font image in a source domain to the style defined with a few reference images in a target domain. However, the content feature extracted using a representative font might not be optimal. In light of this, we propose a content fusion module (CFM) to project the content feature into a linear space defined by the content features of basis fonts, which can take the variation of content features caused by different fonts into consideration. Our method also allows optimizing the style representation vector of reference images through a lightweight iterative style-vector refinement (ISR) strategy. Moreover, we treat the 1D projection of a character image as a probability distribution and leverage the distance between two distributions as the reconstruction loss (namely projected character loss, PCL). Compared to L2 or L1 reconstruction loss, the distribution distance pays more attention to the global shape of characters. We have evaluated our method on a dataset of 300 fonts with 6.5k characters each. Experimental results verify that our method outperforms existing state-of-the-art few-shot font generation methods by a large margin. The source code can be found at https://github.com/wangchi95/CF-Font.

 

IEEE CVPR 2023  [Paper] [Code] [Video]
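
One plausible reading of the projected character loss (PCL) described above, sketched in numpy: project a glyph image onto each axis, normalize the projections into probability distributions, and compare them with the 1D Wasserstein distance (the L1 distance between CDFs). This is an illustration of the idea under that assumption, not the authors' implementation.

    import numpy as np

    def projected_profile(img, axis):
        # Sum the image along one axis and normalize into a distribution.
        p = img.sum(axis=axis).astype(np.float64)
        return p / (p.sum() + 1e-8)

    def projected_character_loss(img_a, img_b):
        # Unlike pixel-wise L1/L2, a distribution distance on the 1D
        # projections reacts to global shifts of mass, i.e. to the
        # overall shape of the character.
        loss = 0.0
        for axis in (0, 1):
            ca = np.cumsum(projected_profile(img_a, axis))
            cb = np.cumsum(projected_profile(img_b, axis))
            loss += np.abs(ca - cb).sum()   # 1D Wasserstein-1 distance
        return loss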


 

 

Unsupervised Domain Adaption with Pixel-level Discriminator for Image-aware Layout Generation

Chenchen Xu, Min Zhou, Tiezheng Ge, Yuning Jiang, Weiwei Xu

Layout is essential for graphic design and poster generation. Recently, applying deep learning models to generate layouts has attracted increasing attention. This paper focuses on using a GAN-based model conditioned on image contents to generate advertising poster graphic layouts, which requires an advertising poster layout dataset with paired product images and graphic layouts. However, the paired images and layouts in the existing dataset are collected by inpainting and annotating posters, respectively. There exists a domain gap between inpainted posters (source domain data) and clean product images (target domain data). Therefore, this paper combines unsupervised domain adaptation techniques to design a GAN with a novel pixel-level discriminator (PD), called PDA-GAN, to generate graphic layouts according to image contents. The PD is connected to the shallow-level feature map and computes the GAN loss for each input-image pixel. Both quantitative and qualitative evaluations demonstrate that PDA-GAN can achieve state-of-the-art performance and generate high-quality image-aware graphic layouts for advertising posters.

 

IEEE CVPR 2023  [Paper] [Code] [Video] 


 

 

A Two-part Transformer Network for Controllable Motion Synthesis

Shuaiying Hou, Hongyu Tao, Hujun Bao, Weiwei Xu

Although part-based motion synthesis networks have been investigated to reduce the complexity of modeling heterogeneous human motions, their computational cost remains prohibitive in interactive applications. To this end, we propose a novel two-part transformer network that aims to achieve high-quality, controllable motion synthesis results in real-time. Our network separates the skeleton into the upper and lower body parts, reducing the expensive cross-part fusion operations, and models the motions of each part separately through two streams of auto-regressive modules formed by multi-head attention layers. However, such a design might not sufficiently capture the correlations between the parts. We thus intentionally let the two parts share the features of the root joint and design a consistency loss to penalize the difference in the estimated root features and motions by these two auto-regressive modules, significantly improving the quality of synthesized motions. After training on our motion dataset, our network can synthesize a wide range of heterogeneous motions, like cartwheels and twists. Experimental and user study results demonstrate that our network is superior to state-of-the-art human motion synthesis networks in the quality of generated motions.

 

IEEE Transactions on Visualization and Computer Graphics (TVCG) 2023  [Paper] [Code] [Video]


 

 

Hybrid Mesh-neural Representation for 3D Transparent Object Reconstruction

Jiamin Xu, Zihan Zhu, Hujun Bao, Weiwei Xu

In this study, we propose a novel method to reconstruct the 3D shapes of transparent objects using images captured by handheld cameras under natural lighting conditions. It combines the advantages of an explicit mesh and multi-layer perceptron (MLP) network as a hybrid representation to simplify the capture settings used in recent studies. After obtaining an initial shape through multi-view silhouettes, we introduced surface-based local MLPs to encode the vertex displacement field (VDF) for reconstructing surface details. The design of local MLPs allowed representation of the VDF in a piecewise manner using two-layer MLP networks to support the optimization algorithm. Defining local MLPs on the surface instead of on the volume also reduced the search space. Such a hybrid representation enabled us to relax the ray-pixel correspondences that represent the light path constraint to our designed ray-cell correspondences, which significantly simplified the implementation of a single-image-based environment-matting algorithm. We evaluated our representation and reconstruction algorithm on several transparent objects based on ground truth models. The experimental results show that our method produces high-quality reconstructions that are superior to those of state-of-the-art methods using a simplified data-acquisition setup.

 

Computational Visual Media Journal (CVMJ) 2023  [Paper] [Code] [Video]


 

 

2022

 

Learning-Based Bending Stiffness Parameter Estimation by a Drape Tester

Xudong Feng, Wenchao Huang, Weiwei Xu, Huamin Wang

Real-world fabrics often possess complicated nonlinear, anisotropic bending stiffness properties. Measuring the physical parameters of such properties for physics-based simulation is difficult yet unnecessary, due to the persistent existence of numerical errors in simulation technology. In this work, we propose to adopt a simulation-in-the-loop strategy: instead of measuring the physical parameters, we estimate the simulation parameters to minimize the discrepancy between reality and simulation. This strategy offers good flexibility in test setups, but the associated optimization problem is computationally expensive to solve by numerical methods. Our solution is to train a regression-based neural network for inferring bending stiffness parameters, directly from drape features captured in the real world. Specifically, we choose the Cusick drape test method and treat multiple-view depth images as the feature vector. To effectively and efficiently train our network, we develop a highly expressive and physically validated bending stiffness model, and we use the traditional cantilever test to collect the parameters of this model for 618 real-world fabrics. Given the whole parameter data set, we then construct a parameter subspace, generate new samples within the subspace, and finally simulate and augment synthetic data for training purposes. The experiment shows that our trained system can replace cantilever tests for quick, reliable and effective estimation of simulation-ready parameters. Thanks to the use of the system, our simulator can now faithfully simulate bending effects comparable to those in the real world.

 

ACM Transactions on Graphics 2022 (SIGGRAPH Asia Journal track)  [Paper] [Code] [Video]


 

 

Geometry-aware Two-scale PIFu Representation for Human Reconstruction

Zheng Dong, Ke Xu, Ziheng Duan, Hujun Bao, Weiwei Xu, Rynson W.H. Lau

Although PIFu-based 3D human reconstruction methods are popular, the quality of recovered details is still unsatisfactory. In a sparse (e.g., 3 RGBD sensors) capture setting, the depth noise is typically amplified in the PIFu representation, resulting in flat facial surfaces and geometry-fallible bodies. In this paper, we propose a novel geometry-aware two-scale PIFu for 3D human reconstruction from sparse, noisy inputs. Our key idea is to exploit the complementary properties of depth denoising and 3D reconstruction, for learning a two-scale PIFu representation to reconstruct high-frequency facial details and consistent bodies separately. To this end, we first formulate depth denoising and 3D reconstruction as a multi-task learning problem. The depth denoising process enriches the local geometry information of the reconstruction features, while the reconstruction process enhances depth denoising with global topology information. We then propose to learn the two-scale PIFu representation using two MLPs based on the denoised depth and geometry-aware features. Extensive experiments demonstrate the effectiveness of our approach in reconstructing facial details and bodies of different poses and its superiority over state-of-the-art methods.

 

NIPS 2022 (Spotlight)  [Paper] [Supplementary] [Video]


 

 

NICE-SLAM: Neural Implicit Scalable Encoding for SLAM

Zihan Zhu, Songyou Peng, Viktor Larsson, Weiwei Xu, Hujun Bao, Zhaopeng Cui, Martin R. Oswald, Marc Pollefeys

Neural implicit representations have recently shown encouraging results in various domains, including promising progress in simultaneous localization and mapping (SLAM). Nevertheless, existing methods produce over-smoothed scene reconstructions and have difficulty scaling up to large scenes. These limitations are mainly due to their simple fully-connected network architecture that does not incorporate local information in the observations. In this paper, we present NICE-SLAM, a dense SLAM system that incorporates multi-level local information by introducing a hierarchical scene representation. Optimizing this representation with pre-trained geometric priors enables detailed reconstruction on large indoor scenes. Compared to recent neural implicit SLAM systems, our approach is more scalable, efficient, and robust. Experiments on five challenging datasets demonstrate competitive results of NICE-SLAM in both mapping and tracking quality.

 

IEEE CVPR 2022  [Paper] [Project Page] [Video]
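
A schematic of the multi-level feature-grid query underlying the hierarchical scene representation described above. The grid resolutions and channel counts below are placeholders, and the decoders that turn features into occupancy and color are omitted; this is not NICE-SLAM's actual configuration.

    import numpy as np

    def trilinear(grid, p):
        # Interpolate a feature grid of shape (R, R, R, C) at a point
        # p with coordinates in [0, 1]^3.
        r = grid.shape[0] - 1
        x = np.clip(p * r, 0, r - 1e-6)
        i = x.astype(int)
        f = x - i
        out = np.zeros(grid.shape[-1])
        for dx in (0, 1):
            for dy in (0, 1):
                for dz in (0, 1):
                    w = (f[0] if dx else 1 - f[0]) * \
                        (f[1] if dy else 1 - f[1]) * \
                        (f[2] if dz else 1 - f[2])
                    out += w * grid[i[0] + dx, i[1] + dy, i[2] + dz]
        return out

    # Coarse-to-fine hierarchy: query every level at one 3D point and
    # concatenate the features before decoding.
    grids = [np.random.randn(r, r, r, 8) for r in (16, 32, 64)]
    point = np.array([0.4, 0.7, 0.2])
    feature = np.concatenate([trilinear(g, point) for g in grids])  # (24,)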


 

 

Scalable neural indoor scene rendering

Xiuchao Wu, Jiamin Xu, Zihan Zhu, Hujun Bao, Qixing Huang, James Tompkin, Weiwei Xu

We propose a scalable neural scene reconstruction and rendering method to support distributed training and interactive rendering of large indoor scenes. Our representation is based on tiles and a separation of view-independent appearance (diffuse color and shading) and view-dependent appearance (specular highlights, reflections), each of which is predicted by lower-capacity MLPs. After assigning MLPs per tile, our scheme allows tile MLPs to be trained in parallel and still represent complex reflections through a two-pass training strategy. This is enabled by a background sampling strategy that can augment tile information from a proxy global mesh geometry and tolerate typical errors from reconstructed proxy geometry. Further, we design a two-MLP based representation at each tile to leverage the phenomenon that view-dependent surface effects can be attributed to a reflected virtual light at the total ray distance to the source. This lets us handle sparse samplings of the input scene where reflection highlights do not always appear consistently in input images. We show interactive free-viewpoint rendering results from five scenes. One of them covers an area of more than 100 m². Experimental results show that our method produces higher-quality renderings than a single large-capacity MLP and other recent baseline methods.

 

ACM Transactions on Graphics 2022 (SIGGRAPH Journal track)  [Paper] [Project Page] [Video]


 

 

Automatic quantization for physics-based simulation

Jiafeng Liu, Haoyang Shi, Siyuan Zhang, Yin Yang, Chongyang Ma, Weiwei Xu

Quantization has proven effective in high-resolution and large-scale simulations, which benefit from bit-level memory saving. However, identifying a quantization scheme that meets the requirement of both precision and memory efficiency requires trial and error. In this paper, we propose a novel framework to allow users to obtain a quantization scheme by simply specifying either an error bound or a memory compression rate. Based on the error propagation theory, our method takes advantage of auto-diff to estimate the contributions of each quantization operation to the total error. We formulate the task as a constrained optimization problem, which can be efficiently solved with analytical formulas derived for the linearized objective function. Our workflow extends the Taichi compiler and introduces dithering to improve the precision of quantized simulations. We demonstrate the generality and efficiency of our method via several challenging examples of physics-based simulation, which achieves up to 2.5x memory compression without noticeable degradation of visual quality in the results. Our code and data are available at https://github.com/Hanke98/AutoQuantizer.

 

ACM Transactions on Graphics 2022 (SIGGRAPH Journal track)  [Paper] [Supplementary] [Video]


 

 

Composition-aware Graphic Layout GAN for Visual-Textual Presentation Designs

Min Zhou, Chenchen Xu, Ye Ma, Tiezheng Ge, Yuning Jiang, Weiwei Xu

In this paper, we study the graphic layout generation problem of producing high-quality visual-textual presentation designs for given images. We note that image compositions, which contain not only global semantics but also spatial information, would largely affect layout results. Hence, we propose a deep generative model, dubbed composition-aware graphic layout GAN (CGL-GAN), to synthesize layouts based on the global and spatial visual contents of input images. To obtain training images from images that already contain manually designed graphic layout data, previous work suggests masking design elements (e.g., texts and embellishments) as model inputs, which inevitably leaves hints of the ground truth. We study the misalignment between the training inputs (with hint masks) and test inputs (without masks), and design a novel domain alignment module (DAM) to narrow this gap. For training, we built a large-scale layout dataset which consists of 60,548 advertising posters with annotated layout information. To evaluate the generated layouts, we propose three novel metrics according to aesthetic intuitions. Through both quantitative and qualitative evaluations, we demonstrate that the proposed model can synthesize high-quality graphic layouts according to image compositions.

 

IJCAI 2022  [Paper] [Supplementary] [Video]


 

 

Active Boundary Loss for Semantic Segmentation

Chi Wang, Yunke Zhang, Miaomiao Cui, Peiran Ren, Yin Yang, Xuansong Xie, Xian-Sheng Hua, Hujun Bao, Weiwei Xu

This paper proposes a novel active boundary loss for semantic segmentation. It can progressively encourage the alignment between predicted boundaries and ground-truth boundaries during end-to-end training, which is not explicitly enforced in commonly used cross-entropy loss. Based on the predicted boundaries detected from the segmentation results using current network parameters, we formulate the boundary alignment problem as a differentiable direction vector prediction problem to guide the movement of predicted boundaries in each iteration. Our loss is model-agnostic and can be plugged in to the training of segmentation networks to improve the boundary details. Experimental results show that training with the active boundary loss can effectively improve the boundary F-score and mean Intersection-over-Union on challenging image and video object segmentation datasets. Our code can be found at https://github.com/wangchi95/active-boundary-loss.

 

AAAI 2022 (Oral)  [Paper] [Supplementary] [Video]


 

 

Erroneous pixel prediction for semantic image segmentation

Lixue Gong, Yiqun Zhang, Yunke Zhang, Yin Yang, Weiwei Xu

Our method is inspired by Bayesian deep learning, which improves image segmentation accuracy by modeling the uncertainty of the network output. In contrast to uncertainty, our method directly learns to predict the erroneous pixels of a segmentation network, which is modeled as a binary classification problem. It can speed up training compared with the Monte Carlo integration often used in Bayesian deep learning. It also allows us to train a branch to correct the labels of erroneous pixels. Our method consists of three stages: 1) predict pixel-wise error probability of the initial result, 2) re-estimate new labels for the pixels with high error probability, 3) fuse the initial result and the re-estimated result with respect to the error probability. We formulate the error-pixel prediction problem as a classification task and employ an error-prediction branch in the network to predict the pixel-wise error probabilities. We also introduce another network branch called the detail branch. This branch is designed such that the training process is focused on the erroneous pixels. We experimentally validate our method on the Cityscapes and ADE20K datasets. Our model can be easily attached to various advanced segmentation networks to improve performance. Taking the segmentation results from DeepLabv3+ as the initial segmentation result, our network can achieve 82.88% mIoU on the Cityscapes testing dataset and 45.73% on the ADE20K validation dataset, which is 0.74% and 0.13% higher than DeepLabv3+.

 

Computational Visual Media (CVM) 2022  [Paper] [Supplementary] [Video]
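
Stage 3 above (fusing the initial and re-estimated results by error probability) admits a compact formulation; the blending rule below is an assumed instantiation for illustration, not necessarily the paper's exact one.

    import numpy as np

    def fuse_predictions(initial_logits, refined_logits, error_prob):
        # initial_logits, refined_logits: (C, H, W); error_prob: (H, W).
        def softmax(z):
            e = np.exp(z - z.max(axis=0, keepdims=True))
            return e / e.sum(axis=0, keepdims=True)
        p = error_prob[None]   # broadcast the per-pixel error probability
        fused = (1.0 - p) * softmax(initial_logits) + p * softmax(refined_logits)
        return fused.argmax(axis=0)   # final per-pixel labels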


 

 

2021

 

Location-aware Single Image Reflection Removal

Zheng Dong, Ke Xu, Yin Yang, Hujun Bao, Weiwei Xu, Rynson W.H. Lau

This paper proposes a novel location-aware deep learning-based single image reflection removal method. Our network has a reflection detection module to regress a probabilistic reflection confidence map, taking multi-scale Laplacian features as inputs. This probabilistic map tells whether a region is reflection-dominated or transmission-dominated, and it is used as a cue for the network to control the feature flow when predicting reflection and transmission layers. We design our network as a recurrent network to progressively refine reflection removal results at each iteration. The novelty is that we leverage Laplacian kernel parameters to emphasize the boundaries of strong reflections. It is beneficial to strong reflection detection and substantially improves the quality of reflection removal results. Extensive experiments verify the superior performance of the proposed method over state-of-the-art approaches.

 

IEEE ICCV 2021  [Paper] [Supplementary] [Video]


 

Attention-guided Temporally Coherent Video Object Matting

Yunke Zhang, Chi Wang, Miaomiao Cui, Peiran Ren, Xuansong Xie, Xian-sheng Hua, Hujun Bao, Qixing Huang, Weiwei Xu

This paper proposes a novel deep learning-based video object matting method that can achieve temporally coherent matting results. Its key component is an attention-based temporal aggregation module that maximizes image matting networks' strength for video matting networks. This module computes temporal correlations for pixels adjacent to each other along the time axis in feature space, which is robust against motion noises. We also design a novel loss term to train the attention weights, which drastically boosts the video matting performance. Besides, we show how to effectively solve the trimap generation problem by fine-tuning a state-of-the-art video object segmentation network with a sparse set of user-annotated keyframes. To facilitate video matting and trimap generation networks' training, we construct a large-scale video matting dataset with 80 training and 28 validation foreground video clips with ground-truth alpha mattes. Experimental results show that our method can generate high-quality alpha mattes for various videos featuring appearance change, occlusion, and fast motion. Our code and dataset can be found at https://github.com/yunkezhang/TCVOM.

 

ACM Multimedia (MM) 2021  [Paper] [Supplementary] [Video]


 

 

Image Re-composition via Regional Content-Style Decoupling

Rong Zhang, Wei Li, Yiqun Zhang, Hong Zhang, Jinhui Yu, Ruigang Yang, Weiwei Xu

Typical image composition harmonizes regions from different images to a single plausible image. We extend the idea of image composition by introducing the content-style decomposition and combination to form the concept of image re-composition. In other words, our image re-composition could arbitrarily combine those contents and styles decomposed from different images to generate more diverse images in a unified framework. In the decomposition stage, we incorporate the whitening normalization to obtain a more thorough content-style decoupling, which substantially improves the re-composition results. Moreover, to handle the variation of structure and texture of different objects in an image, we design the network to support regional feature representation and achieve region-aware content-style decomposition. Regarding the composition stage, we propose a cycle consistency loss to constrain the network preserving the content and style information during the composition. Our method can produce diverse re-composition results, including content-content, content-style and style-style. Our experimental results demonstrate a large improvement over the current state-of-the-art methods.

 

ACM Multimedia (MM) 2021  [Paper] [Supplementary] [Video]


 

 

QuanTaichi: A Compiler for Quantized Simulations

Yuanming Hu, Jiafeng Liu, Xuanda Yang, Mingkuan Xu, Ye Kuang, Weiwei Xu, Qiang Dai, William T. Freeman, Frédo Durand

High-resolution simulations can deliver great visual quality, but they are often limited by available memory, especially on GPUs. We present a compiler for physical simulation that can achieve both high performance and significantly reduced memory costs, by enabling flexible and aggressive quantization. Low-precision (“quantized”) numerical data types are used and packed to represent simulation states, leading to reduced memory space and bandwidth consumption. Quantized simulation allows higher resolution simulation with less memory, which is especially attractive on GPUs. Implementing a quantized simulator that has high performance and packs the data tightly for aggressive storage reduction would be extremely labor-intensive and error-prone using a traditional programming language. To make the creation of quantized simulation practical, we have developed a new set of language abstractions and a compilation system. A suite of tailored domain-specific optimizations ensures that quantized simulators often run as fast as the full-precision simulators, despite the overhead of encoding-decoding the packed quantized data types. Our programming language and compiler, based on Taichi, allow developers to effortlessly switch between different full-precision and quantized simulators, to explore the full design space of quantization schemes, and ultimately to achieve a good balance between space and precision. The creation of quantized simulation with our system has large benefits in terms of memory consumption and performance, on a variety of hardware, from mobile devices to workstations with high-end GPUs. We can simulate with levels of resolution that were previously only achievable on systems with much more memory, such as multiple GPUs. For example, on a single GPU, we can simulate a Game of Life with 20 billion cells (8× compression per pixel), an Eulerian fluid system with 421 million active voxels (1.6× compression per voxel), and a hybrid Eulerian-Lagrangian elastic object simulation with 235 million particles (1.7× compression per particle). At the same time, quantized simulations create physically plausible results. Our quantization techniques are complementary to existing acceleration approaches of physical simulation: they can be used in combination with these existing approaches, such as sparse data structures, for even higher scalability and performance.

 

ACM SIGGRAPH 2021  [Paper] [Supplementary] [Video]
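
The arithmetic behind fixed-point quantized simulation state is simple to demonstrate (the sketch shows only the encode/decode math; QuanTaichi's contribution is the compiler support for packing, optimizing, and switching such types):

    import numpy as np

    def quantize(x, bits, lo, hi):
        # Encode floats in [lo, hi] as unsigned fixed-point integers.
        levels = (1 << bits) - 1
        q = np.round((x - lo) / (hi - lo) * levels)
        return np.clip(q, 0, levels).astype(np.uint32)

    def dequantize(q, bits, lo, hi):
        levels = (1 << bits) - 1
        return lo + q.astype(np.float64) / levels * (hi - lo)

    # 10 bits per value instead of 32 gives ~3.2x storage compression,
    # at the cost of a worst-case rounding error of half a step.
    x = np.random.uniform(-1.0, 1.0, 1_000_000)
    q = quantize(x, 10, -1.0, 1.0)
    step = 2.0 / ((1 << 10) - 1)
    assert np.abs(dequantize(q, 10, -1.0, 1.0) - x).max() <= 0.5 * step + 1e-12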


 

 

Scalable Image-based Indoor Scene Rendering with Reflections

Jiamin Xu, Xiuchao Wu, Zihan Zhu, Qixing Huang, Yin Yang, Hujun Bao, Weiwei Xu

This paper proposes a novel scalable image-based rendering (IBR) pipeline for indoor scenes with reflections. We make substantial progress towards three sub-problems in IBR, namely, depth and reflection reconstruction, view selection for temporally coherent view-warping, and smooth rendering refinements. First, we introduce a global-mesh-guided alternating optimization algorithm that robustly extracts a two-layer geometric representation. The front and back layers encode the RGB-D reconstruction and the reflection reconstruction, respectively. This representation minimizes the image composition error under novel views, enabling accurate renderings of reflections. Second, we introduce a novel approach to select adjacent views and compute blending weights for smooth and temporal coherent renderings. The third contribution is a supersampling network with a motion vector rectification module that refines the rendering results to improve the final output's temporal coherence. These three contributions together lead to a novel system that produces highly realistic rendering results with various reflections. The rendering quality outperforms state-of-the-art IBR or neural rendering algorithms considerably.

 

ACM SIGGRAPH 2021  [Paper] [Supplementary] [Video]


 

 

2020

 

Medial Elastics: Efficient and Collision-ready Deformation via Medial Axis Transform

Lei Lan, Ran Luo, Marco Fratarcangeli, Weiwei Xu, Huamin Wang, Xiaohu Guo, Junfeng Yao, Yin Yang

We propose a framework for the interactive simulation of nonlinear deformable objects. The primary feature of our system is the seamless integration of deformable simulation and collision culling, which are often independently handled in existing animation systems. The bridge connecting them is the medial axis transform or MAT, a high-fidelity volumetric approximation of complex 3D shapes. From the physics simulation perspective, MAT leads to an expressive and compact reduced nonlinear model. We employ a semi-reduced projective dynamics formulation, which well captures high-frequency local deformations of high-resolution models while retaining a low computation cost. Our key observation is that the most compelling (nonlinear) deformable effects are enabled by the local constraints projection, which should not be aggressively reduced, and only apply model reduction at the global stage. From the collision detection/culling perspective, MAT is geometrically versatile using linear-interpolated spheres (i.e. the so-called medial primitives) to approximate the boundary of the input model. The intersection test between two medial primitives is formulated as a quadratically constrained quadratic program problem. We give an algorithm to solve this problem exactly, which returns the deepest penetration between a pair of intersecting medial primitives. When coupled with spatial hashing, collision (including self-collision) can be efficiently identified on the GPU within a few milliseconds even for massive simulations. We have tested our system on a variety of geometrically complex and high-resolution deformable objects, and our system produces convincing animations with all the collisions/self-collisions well handled at an interactive rate.

 

ACM Transactions on Graphics (TOG)  [Paper] [Supplementary] [Video]


 

 

Quasi-Newton Solver for Robust Non-Rigid Registration

Yuxin Yao, Bailin Deng, Weiwei Xu, Juyong Zhang

Imperfect data (noise, outliers and partial overlap) and high degrees of freedom make non-rigid registration a classical challenging problem in computer vision. Existing methods typically adopt the ℓp-type robust estimator to regularize the fitting and smoothness, and the proximal operator is used to solve the resulting non-smooth problem. However, the slow convergence of these algorithms limits their wide application. In this paper, we propose a formulation for robust non-rigid registration based on a globally smooth robust estimator for data fitting and regularization, which can handle outliers and partial overlaps. We apply the majorization-minimization algorithm to the problem, which reduces each iteration to solving a simple least-squares problem with L-BFGS. Extensive experiments demonstrate the effectiveness of our method for non-rigid alignment between two shapes with outliers and partial overlap, with quantitative evaluation showing that it outperforms state-of-the-art methods in terms of registration accuracy and computational speed. The source code is available at this URL

 

IEEE CVPR 2020  [Paper] [Supplementary] [Video]


 

 

AutoRemover: Automatic Object Removal for Autonomous Driving Videos

Rong Zhang, Wei Li, Peng Wang, Chenye Guan, Jin Fang, Yuhang Song, Jinhui Yu, Baoquan Chen, Weiwei Xu, Ruigang Yang

Motivated by the need for photo-realistic simulation in autonomous driving, in this paper we present a video inpainting algorithm AutoRemover, designed specifically for generating street-view videos without any moving objects. In our setup we face two challenges: the first is shadows, which are usually unlabeled but tightly coupled with the moving objects; the second is the large ego-motion in the videos. To deal with shadows, we build up an autonomous driving shadow dataset and design a deep neural network to detect shadows automatically. To deal with large ego-motion, we take advantage of the multi-source data, in particular the 3D data, in autonomous driving. More specifically, the geometric relationship between frames is incorporated into an inpainting deep neural network to produce high-quality structurally consistent video output. Experiments show that our method outperforms other state-of-the-art (SOTA) object removal algorithms, reducing the RMSE by over 19%.

 

AAAI 2020  [Paper] [Supplementary] [Video]


 

 

2019

 

Accelerated Complex-step Finite Difference for Expedient Deformable Simulation

Ran Luo, Weiwei Xu, Tianjia Shao, Hongyi Xu, Yin Yang

In deformable simulation, an important computing task is to calculate the gradient and derivative of the strain energy function in order to infer the corresponding internal force and tangent stiffness matrix. The standard numerical routine is the finite difference method, which evaluates the target function multiple times under a small real-valued perturbation. Unfortunately, the subtractive cancellation prevents us from setting this perturbation sufficiently small, and the regular finite difference is doomed for computing problems requiring a high-accuracy derivative evaluation. In this paper, we graft a new finite difference scheme, namely the complex finite difference (CFD), with physics-based animation. CFD is based on the complex Taylor series expansion, which avoids the subtraction for the first-order derivative approximation. As a result, one can use a very small perturbation to calculate the numerical derivative that is as accurate as its analytic counterpart. We significantly accelerate the original CFD method so that it is also as efficient as the analytic derivative. This is achieved by discarding high-order error terms, decoupling real and imaginary calculations, replacing costly functions based on the theory of equivalent infinitesimal, and isolating the propagation of the perturbation in composite/nesting functions. CFD can be further augmented with the multicomplex Taylor expansion and Cauchy-Riemann formula to handle higher-order derivatives and tensor-valued functions. We demonstrate the accuracy, convenience, and efficiency of this new numerical routine in the context of deformable simulation – one can easily deploy a robust simulator for general hyperelastic materials, including user-crafted ones to cater to specific needs in different applications. Higher-order derivatives of the energy can be readily computed to construct modal derivative bases for reduced real-time simulation. Inverse simulation problems can also be conveniently solved using gradient/Hessian based optimization procedures.

 

ACM Transactions on Graphics (TOG)  [Paper] [Supplementary] [Video]
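
The identity at the heart of the complex-step finite difference is worth stating: expanding f(x + ih) in a complex Taylor series gives Im f(x + ih) / h = f'(x) + O(h^2) with no subtraction, so h can be taken extremely small. A minimal numpy check (the test function is a standard example from the complex-step literature, not from this paper):

    import numpy as np

    def complex_step_derivative(f, x, h=1e-30):
        # Im f(x + ih) / h approximates f'(x) without subtractive
        # cancellation, so h may be far smaller than any real-valued
        # finite-difference step could be.
        return np.imag(f(x + 1j * h)) / h

    f = lambda x: np.exp(x) / np.sqrt(np.sin(x)**3 + np.cos(x)**3)
    x = 1.5
    cs = complex_step_derivative(f, x)          # accurate to machine precision
    fd = (f(x + 1e-8) - f(x - 1e-8)) / 2e-8     # central difference loses digits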


 

 

Parametric 3D modeling of a symmetric human body

Yin Chen, Zhan Song, Weiwei Xu, Ralph R. Martin, Zhiquan Cheng

To realistically represent 3D human body shape in a mathematical way, the parametric model used should incorporate symmetry as displayed by real people. This paper proposes a symmetric parametric model called symmetricSCAPE. It successfully incorporates symmetry into a parametric model of the 3D body, formulating body geometric variations of both shape and pose using a triangular mesh representation. The symmetry constraint is imposed on each symmetrically-related triangle pair of the body mesh. Mathematically, symmetry-related constraint matrices are derived, and applied during shape and pose deformation of triangle pairs. By accurately registering a pre-designed symmetrization template mesh to the training dataset, we learn how the symmetricSCAPE model causes the body mesh to deform relative to the symmetry. Our experiments demonstrate that the symmetricSCAPE model results in a better, more parsimonious, and more accurate parametric model of the 3D human body than traditional non-symmetry-aware representations.

 

Computers & Graphics  [Paper] [Supplementary] [Video]


 

 

AADS: Augmented autonomous driving simulation using data-driven algorithms

Wei Li, Chengwei Pan, Rong Zhang, Jiaping Ren, Yuexin Ma, Jin Fang, Feilong Yan, Qichuan Geng, Xinyu Huang, Huajun Gong, Weiwei Xu, Guoping Wang, Dinesh Manocha, Ruigang Yang

Simulation systems have become essential to the development and validation of autonomous driving (AD) technologies. The prevailing state-of-the-art approach for simulation uses game engines or high-fidelity computer graphics (CG) models to create driving scenarios. However, creating CG models and vehicle movements (the assets for simulation) remain manual tasks that can be costly and time-consuming. In addition, CG images still lack the richness and authenticity of real-world images, and using CG images for training leads to degraded performance. Here, we present our augmented autonomous driving simulation (AADS). Our formulation augmented real-world pictures with a simulated traffic flow to create photorealistic simulation images and renderings. More specifically, we used LiDAR and cameras to scan street scenes. From the acquired trajectory data, we generated plausible traffic flows for cars and pedestrians and composed them into the background. The composite images could be resynthesized with different viewpoints and sensor models (camera or LiDAR). The resulting images are photorealistic, fully annotated, and ready for training and testing of AD systems from perception to planning. We explain our system design and validate our algorithms with a number of AD tasks from detection to segmentation and predictions. Compared with traditional approaches, our method offers scalability and realism. Scalability is particularly important for AD simulations, and we believe that real-world complexity and diversity cannot be realistically captured in a virtual environment. Our augmented approach combines the flexibility of a virtual environment (e.g., vehicle movements) with the richness of the real world to allow effective simulation.

 

Science Robotics  [Paper] [Supplementary] [Video]


 

 

Computational Design of Skinned Quad-Robots

Xudong Feng, Jiafeng Liu, Huamin Wang, Yin Yang, Hujun Bao, Bernd Bickel, and Weiwei Xu

We present a computational design system that assists users to model, optimize, and fabricate quad-robots with soft skins. Our system addresses the challenging task of predicting their physical behavior by fully integrating the multibody dynamics of the mechanical skeleton and the elastic behavior of the soft skin. The developed motion control strategy uses an alternating optimization scheme to avoid expensive full space-time optimization, interleaving space-time optimization for the skeleton and frame-by-frame optimization for the full dynamics. The outputs are motor torques to drive the robot to achieve a user-prescribed motion trajectory. We also provide a collection of convenient engineering tools and empirical manufacturing guidance to support the fabrication of the designed quad-robot. We validate the feasibility of designs generated with our system through physics simulations and with a physically-fabricated prototype.

 

IEEE TVCG  [Paper] [Supplementary] [Video]


 

 

A Late Fusion CNN for Digital Matting

Yunke Zhang, Lixue Gong, Lubin Fan, Peiran Ren, Qixing Huang, Hujun Bao, Weiwei Xu*

This paper studies the structure of a deep convolutional neural network to predict the foreground alpha matte by taking a single RGB image as input. Our network is fully convolutional with two decoder branches for the foreground and background classification respectively. Then a fusion branch is used to integrate the two classification results which gives rise to alpha values as the soft segmentation result. This design provides more degrees of freedom than a single decoder branch for the network to obtain better alpha values during training. The network can implicitly produce trimaps without user interaction, which is easy to use for novices without expertise in digital matting. Experimental results demonstrate that our network can achieve high-quality alpha mattes for various types of objects and outperform the state-of-the-art CNN-based image matting methods on the human image matting task.

 

IEEE CVPR 2019  [Paper] [Supplementary] [Video]


 

 

2018

 

Stress-aware large-scale mesh editing using a domain-decomposed multigrid solver

Weiwei Xu, Haifeng Yang, Yin Yang, Yiduo Wang, Kun Zhou

In this paper, we develop a domain-decomposed subspace and multigrid solver to analyze the stress distribution for large-scale finite element meshes with millions of degrees of freedom. Through the domain decomposition technique, the shape editing directly updates the data structure of local finite element matrices. Doing so avoids the expensive factorization step in a direct solver and provides users with a progressive feedback of the stress distribution corresponding to the mesh operations: a fast preview is achieved through the subspace solver, and the multigrid solver refines the preview result if the user needs to examine the stress distribution carefully at certain design stages. Our system constructs the subspace for stress analysis using reduced constrained modes and builds a three-level multigrid solver through the algebraic multigrid method. We remove mid-edge nodes and lump unknowns with the Schur complement method. The updating and solving of the large global stiffness matrix are implemented in parallel after the domain decomposition. Experimental results show that our solver outperforms the parallel Intel MKL solver. Speedups of 50%-100% can be achieved for large-scale meshes with reasonable pre-computation costs when setting the stopping criterion of the multigrid solver to be 1e-3 relative error.

 

Computer Aided Geometric Design  [Paper] [Supplementary] [Video]
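
To make the multigrid component concrete, here is a minimal two-grid V-cycle on a 1D Poisson matrix, with damped-Jacobi smoothing and a Galerkin coarse operator. The paper's solver is a three-level algebraic multigrid over FEM stiffness matrices with Schur-complement lumping, so treat this purely as an illustration of the cycle structure.

    import numpy as np

    def two_grid_cycle(A, b, x, P, pre=3, post=3, omega=0.6):
        # One V-cycle: smooth, correct on the coarse grid A_c = P^T A P
        # (solved directly here; a real multigrid recurses), smooth again.
        D = np.diag(A)
        for _ in range(pre):
            x = x + omega * (b - A @ x) / D        # pre-smoothing
        r = b - A @ x
        Ac = P.T @ A @ P                           # Galerkin coarse operator
        x = x + P @ np.linalg.solve(Ac, P.T @ r)   # coarse-grid correction
        for _ in range(post):
            x = x + omega * (b - A @ x) / D        # post-smoothing
        return x

    # 1D Poisson test problem with linear-interpolation prolongation P.
    n, nc = 63, 31
    A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    b = np.ones(n)
    P = np.zeros((n, nc))
    for j in range(nc):
        i = 2 * j + 1
        P[i - 1, j], P[i, j], P[i + 1, j] = 0.5, 1.0, 0.5
    x = np.zeros(n)
    for _ in range(10):
        x = two_grid_cycle(A, b, x, P)   # converges rapidly to A^-1 b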


 

 

Physics-Based Quadratic Deformation Using Elastic Weighting

Ran Luo, Weiwei Xu, Huamin Wang, Kun Zhou, Yin Yang

This paper presents a spatial reduction framework for simulating nonlinear deformable objects interactively. This reduced model is built using a small number of overlapping quadratic domains, as we notice that incorporating high-order degrees of freedom (DOFs) is important for the simulation quality. Departing from existing multi-domain methods in graphics, our method interprets deformed shapes as blended quadratic transformations from nearby domains. Doing so avoids expensive safeguards against the domain coupling and improves the numerical robustness under large deformations. We present an algorithm that efficiently computes weight functions for reduced DOFs in a physics-aware manner. Inspired by the well-known multi-weight enveloping technique, our framework also allows subspace tweaking based on a few representative deformation poses. Such an elastic weighting mechanism significantly extends the expressivity of the reduced model with light-weight computational efforts. Our simulator is versatile and can be well interfaced with many existing techniques. It also supports local DOF adaption to incorporate novel deformations (i.e., induced by the collision). The proposed algorithm complements state-of-the-art model reduction and domain decomposition methods by seeking good trade-offs among animation quality, numerical robustness, pre-computation complexity and simulation efficiency from an alternative perspective.

 

IEEE Transactions on Visualization and Computer Graphics (TVCG)  [Paper] [Supplementary] [Video]


 

 

Online Global Non-rigid Registration for 3D Object Reconstruction Using Consumer-level Depth Cameras

Jiamin Xu, Weiwei Xu, Yin Yang, Zhigang Deng, Hujun Bao

We investigate how to obtain high-quality 360-degree 3D reconstructions of small objects using consumer-level depth cameras. For many homeware objects such as shoes and toys with dimensions around 0.06–0.4 meters, their whole projections, in the hand-held scanning process, occupy fewer than 20% of the pixels in the camera's image. We observe that existing 3D reconstruction algorithms like KinectFusion and other similar methods often fail in such cases even under the close-range depth setting. To achieve high-quality 3D object reconstruction results at this scale, our algorithm relies on an online global non-rigid registration, where an embedded deformation graph is employed to handle the drifting of camera tracking and the possible nonlinear distortion in the captured depth data. We perform an automatic target object extraction from RGBD frames to remove the unrelated depth data so that the registration algorithm can focus on minimizing the geometric and photogrammetric distances of the RGBD data of target objects. Our algorithm is implemented using CUDA for fast non-rigid registration. The experimental results show that the proposed method can reconstruct high-quality 3D shapes of various small objects with textures.

 

Computer Graphics Forum Journal  [Paper] [Supplementary] [Video]


 

 

Automatic Unpaired Shape Deformation Transfer

Lin Gao, Jie Yang, Yi-Ling Qiao, Yu-Kun Lai, Paul L Rosin, Weiwei Xu, Shihong Xia

Transferring deformation from a source shape to a target shape is a very useful technique in computer graphics. State-of-the-art deformation transfer methods require either point-wise correspondences between source and target shapes, or pairs of deformed source and target shapes with corresponding deformations. However, in most cases, such correspondences are not available and cannot be reliably established using an automatic algorithm. Therefore, substantial user effort is needed to label the correspondences or to obtain and specify such shape sets. In this work, we propose a novel approach to automatic deformation transfer between two unpaired shape sets without correspondences. 3D deformation is represented in a high-dimensional space. To obtain a more compact and effective representation, two convolutional variational autoencoders are learned to encode source and target shapes into their latent spaces. We exploit a Generative Adversarial Network (GAN) to map deformed source shapes to deformed target shapes, both in the latent spaces, which ensures the shapes obtained from the mapping are indistinguishable from the target shapes. This is still an under-constrained problem, so we further utilize a reverse mapping from target shapes to source shapes and incorporate a cycle consistency loss, i.e., applying both mappings in sequence should recover the input shape. This VAE-Cycle GAN (VC-GAN) architecture is used to build a reliable mapping between shape spaces. Finally, a similarity constraint is employed to ensure the mapping is consistent with visual similarity, achieved by learning a similarity neural network that takes the embedding vectors from the source and target latent spaces and predicts the light field distance between the corresponding shapes. Experimental results show that our fully automatic method is able to obtain high-quality deformation transfer results with unpaired data sets, comparable to or better than existing methods where strict correspondences are required.
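For readers unfamiliar with cycle consistency, the loss described above can be sketched in PyTorch as below; G_st and G_ts stand for the learned latent-space mappings between source and target, and the snippet is an illustrative reading of the loss, not the authors' code.

    import torch
    import torch.nn.functional as F

    def cycle_consistency_loss(G_st, G_ts, z_source, z_target):
        # Applying both mappings in sequence should return the input latent code.
        loss_s = F.l1_loss(G_ts(G_st(z_source)), z_source)  # source -> target -> source
        loss_t = F.l1_loss(G_st(G_ts(z_target)), z_target)  # target -> source -> target
        return loss_s + loss_t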

 

SIGGRAPH Asia Conference  [Paper] [Supplementary] [Video]


 

 

DeepWarp: DNN-based Nonlinear Deformation

Ran Luo, Tianjia Shao, Huamin Wang, Weiwei Xu, Kun Zhou, Yin Yang

DeepWarp is an efficient and highly re-usable deep neural network (DNN) based nonlinear deformable simulation framework. Unlike other deep learning applications such as image recognition, where different inputs have a uniform and consistent format (e.g., an array of all the pixels in an image), the input for deformable simulation is quite variable, high-dimensional, and parametrization-unfriendly. Consequently, even though DNNs are known for their rich expressivity of nonlinear functions, directly using a DNN to reconstruct the force-displacement relation for general deformable simulation is nearly impossible. DeepWarp obviates this difficulty by partially restoring the force-displacement relation via warping the nodal displacement simulated using a simplistic constitutive model, linear elasticity. In other words, DeepWarp yields an incremental displacement fix based on a simplified (therefore incorrect) simulation result rather than returning the unknown displacement directly. We contrive a compact yet effective feature vector, including geodesic, potential and digression features, to sort training pairs of per-node linear and nonlinear displacements. DeepWarp is robust under different model shapes and tessellations. With the assistance of deformation substructuring, one DNN training is able to handle a wide range of 3D models of various geometries, including most examples shown in the paper. Thanks to the linear elasticity and its constant system matrix, the underlying simulator only needs to perform one pre-factorized matrix solve at each time step, and DeepWarp is able to simulate large models in real time.
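Schematically, one DeepWarp time step pairs a cheap linear solve with a learned correction. A hedged pure-Python sketch, where solve_linear (the pre-factorized linear-elasticity solve), features (the geodesic/potential/digression descriptor) and net (the trained DNN) are hypothetical placeholders:

    def deepwarp_step(f_ext, solve_linear, features, net):
        # One pre-factorized solve per time step: cheap, but only linearly elastic.
        u_linear = solve_linear(f_ext)
        # The DNN returns an incremental displacement fix, not the displacement itself.
        return u_linear + net(features(u_linear))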

 

IEEE Transactions on Visualization and Computer Graphics Journal  [Paper] [Supplementary] [Video]


 

 

2017

 

Layout Style Modeling for Automating Banner Design

Yunke Zhang, Kangkang Hu, Peiran Ren, Changyuan Yang, Weiwei Xu, Xian-Sheng Hua

Banner design is challenging: a banner must clearly convey information while also satisfying aesthetic goals and complying with the banner owner's or advertiser's visual identity system. In online advertising, banners often come in tens of different display sizes and rapidly changing design styles to chase fashion in many distinct market areas, so designers have to make huge efforts to adjust their designs for each display size and target style. Therefore, automating multi-size and multi-style banner design can greatly free designers' creativity. Different from previous work relying on a single unified omnipotent optimization to accomplish such a complex problem, we tackle it with a combination of layout style learning, interpolation and transfer. We optimize a banner layout given the style parameters learned from a set of training banners for a particular display size and layout style. Such an optimization is faster and much more controllable than optimizing for all sizes and diverse styles. To achieve multi-size banner design, we collect style parameters for a small collection of sizes and interpolate them to support arbitrary target sizes. To reduce the difficulty of style parameter training, we introduce a novel style transfer technique so that creating a multi-size style becomes as easy as designing a single banner. With these three techniques, a robust and easy-to-use layout style model is built, upon which we automate banner design. We test our method on a data set containing thousands of real banners for online advertising and evaluate our generated banners in various sizes and styles by comparing them with professional designs.

 

Proceedings of the Thematic Workshops of ACM Multimedia 2017 Conference  [Paper] [Supplementary] [Video]


 

 

Modeling, Evaluation and Optimization of Interlocking Shell Pieces

Miaojun Yao, Zhili Chen, Weiwei Xu, Huamin Wang

While 3D printing technology has become increasingly popular in recent years, it suffers from two critical limitations: expensive printing material and long printing time. An effective solution is to hollow the 3D model into a shell and print the shell by parts. Unfortunately, making shell pieces tightly assembled yet easy to disassemble appears to impose two contradictory conditions, and there is no easy way to satisfy both at the same time. In this paper, we present a computational system to design an interlocking structure for a partitioned shell model, which uses only male and female connectors to lock shell pieces in the assembled configuration. Given a mesh segmentation input, our system automatically finds an optimal installation plan specifying both the installation order and the installation directions of the pieces, and then builds the models of the shell pieces using optimized shell thickness and connector sizes. To find the optimal installation plan, we develop simulation-based and data-driven metrics, and we incorporate them into an optimal plan search algorithm with fast pruning and local optimization strategies. The whole system is automatic, except for the shape design of the key piece. The interlocking structure does not introduce new gaps on the outer surface, which would otherwise inevitably become noticeable due to limited printer precision. Our experiments show that the assembled object is strong against separation, yet still easy to disassemble.

 

Computer Graphics Forum  [Paper] [Supplementary] [Video]


 

 

Acoustic VR in the mouth: A real-time speech-driven visual tongue system

Ran Luo, Qiang Fang, Jianguo Wei, Wenhuan Lu, Weiwei Xu, Yin Yang

We propose an acoustic-VR system that converts acoustic signals of human language (Chinese) to realistic 3D tongue animation sequences in real time. Directly capturing the 3D geometry of the tongue at a frame rate that matches the tongue's swift movement during language production is known to be challenging. This difficulty is handled by utilizing electromagnetic articulography (EMA) sensors as the intermediate medium linking the acoustic data to the simulated virtual reality. We leverage deep neural networks to train a model that maps the input acoustic signals to the positional information of pre-defined EMA sensors, based on 1,108 utterances. Afterwards, we develop a novel reduced physics-based dynamics model for simulating the tongue's motion. Unlike existing methods, our deformable model is nonlinear, volume-preserving, and accommodates collision between the tongue and the oral cavity (mostly with the jaw). The tongue's deformation can be highly localized, which imposes extra difficulties for existing spectral model reduction methods. Alternatively, we adopt a spatial reduction method that allows an expressive subspace representation of the tongue's deformation. We systematically evaluate the simulated tongue shapes against real-world shapes acquired by MRI/CT. Our experiments demonstrate that the proposed system is able to deliver a realistic visual tongue animation corresponding to a user's speech signal.

 

IEEE Virtual Reality 2017 Conference  [Paper] [Supplementary] [Video]


 

 

2016

 

Interactive mechanism modeling from multi-view images

Mingliang Xu, Mingyuan Li, Weiwei Xu, Zhigang Deng, Yin Yang, Kun Zhou

In this paper, we present an interactive system for mechanism modeling from multi-view images. Its key feature is that the generated 3D mechanism models contain not only geometric shapes but also internal motion structures: they can be directly animated through kinematic simulation. Our system consists of two steps: interactive 3D modeling and stochastic motion parameter estimation. In the 3D modeling step, our system integrates the sparse 3D points reconstructed from multi-view images with a sketching interface to achieve accurate 3D modeling of a mechanism. To recover the motion parameters, we record a video clip of the mechanism in motion and adopt stochastic optimization to recover its motion parameters through edge matching. Experimental results show that our system can achieve 3D modeling of a range of mechanisms, from simple mechanical toys to complex mechanical objects.

 

ACM Transactions on Graphics (TOG)  [Paper] [Supplementary] [Video]


 

 

Stress Constrained Thickness Optimization for Shell Object Fabrication

Haiming Zhao, Weiwei Xu, Kun Zhou, Yin Yang, Xiaogang Jin, Hongzhi Wu

We present an approach to fabricate shell objects with thickness parameters, which are computed to maintain user-specified structural stability. Given a boundary surface and user-specified external forces, we optimize the thickness parameters according to stress constraints to extrude the surface. Our approach mainly consists of two technical components: first, we develop a patch-based shell simulation technique to efficiently support the static simulation of extruded shell objects using finite element methods; second, we analytically compute the derivative of stress required by the sensitivity analysis technique to turn the optimization into a sequential linear programming problem. Experimental results demonstrate that our approach can optimize the thickness parameters for arbitrary surfaces in a few minutes and accurately predict the physical properties, such as the deformation and stress, of the fabricated object.
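A single sequential-linear-programming step of the kind described above can be sketched with SciPy as follows; simulate_stress and dstress_dthickness are hypothetical placeholders standing in for the paper's shell simulation and analytic sensitivity analysis, and the material-minimizing objective is an assumption for illustration.

    import numpy as np
    from scipy.optimize import linprog

    def slp_thickness_step(t, simulate_stress, dstress_dthickness, s_max, step=0.1):
        # One SLP iteration: linearize the stress constraints around t, solve an LP.
        s = simulate_stress(t)              # current per-element stress
        J = dstress_dthickness(t)           # analytic sensitivity ds/dt
        # Minimize added material subject to linearized stress limits,
        # within a trust region: s + J dt <= s_max, |dt_i| <= step.
        res = linprog(c=np.ones_like(t),
                      A_ub=J, b_ub=s_max - s,
                      bounds=[(-step, step)] * len(t))
        return t + res.x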

 

Computer Graphics Forum  [Paper] [Supplementary] [Video]


 

 

All-hex Meshing using Closed-form Induced Polycube

Xianzhong Fang, Weiwei Xu, Hujun Bao, Jin Huang

Polycube-based hexahedralization methods robustly generate all-hex meshes free of internal singularities. They avoid the difficulty, faced by frame-field based methods, of controlling the global singularity structure to obtain a valid hexahedralization. To thoroughly exploit this advantage, we propose to use a frame field without internal singularities to guide the polycube construction. Theoretically, our method extends the vector fields associated with the polycube from exact forms to closed forms, which are curl-free everywhere but may not be globally integrable. The closed forms give additional degrees of freedom to deal with the topological structure of high-genus models, and also provide better initial axis alignment for subsequent polycube generation. We demonstrate the advantages of our method on various models, ranging from genus-zero models to high-genus ones, and from single-boundary models to multiple-boundary ones.
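In the language of differential forms, the extension above is the standard distinction between exact and closed forms (textbook background, not the paper's construction):

    \[ \omega = d\phi \ \text{(exact)} \quad\Rightarrow\quad d\omega = 0 \ \text{(closed)}, \]

but the converse fails on domains with nontrivial topology: on a high-genus solid, a closed (curl-free) field need not be globally integrable to a potential \(\phi\), and precisely these closed-but-not-exact degrees of freedom are what the method exploits.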

 

ACM SIGGRAPH 2016  [Paper] [Supplementary] [Video]


 

 

Make it swing: Fabricating personalized roly-poly toys

Haiming Zhao, Chengkuan Hong, Juncong Lin, Xiaogang Jin, Weiwei Xu

A roly-poly toy is considered as one of the oldest toys in history. People, both young and old, are fascinated by its unique ability to right itself when pushed over. There exist different kinds of roly-poly toys with various shapes. Most of them share a similar bottom which is a hollow hemisphere with a weight inside. However, it is not an easy task to make an arbitrary model to swing like a roly-poly due to the delicate equilibrium condition between the center of mass of the roly-poly toy and the shape of the hemisphere. In this paper, we present a computer-aided method to help casual users design a personalized roly-poly toy and fabricate it through 3D printing with reduced material usage and sufficient stability. The effectiveness of our method is validated on various models. Our method provides a novel easy-to-use means to design an arbitrary roly-poly toy with an ordinary 3D printing machine, extricating amateurs from the dilemma of finding extra weight to balance the shape.
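In textbook statics terms, the "delicate equilibrium condition" is the requirement that the center of mass x_com lie below the center c of the bottom hemisphere on the symmetry axis, so that tilting by an angle \(\theta\) produces a restoring torque (a standard statement, not the paper's exact objective):

    \[ \tau(\theta) = m\,g\,d\,\sin\theta, \qquad d = \lVert c - x_{\mathrm{com}} \rVert, \]

with the torque restoring only while x_com stays below c; reducing material while keeping a positive margin d is what the design optimization must balance.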

 

Computer Aided Geometric Design 43: 226-236 (2016)  [Paper] [Supplementary] [Video]


 

 

View-Aware Image Object Compositing and Synthesis from Multiple Sources

Xiang Chen, Weiwei Xu, Sai-Kit Yeung, Kun Zhou

Image compositing is widely used to combine visual elements from separate source images into a single image. Although recent image compositing techniques are capable of achieving smooth blending of the visual elements from different sources, most of them implicitly assume the source images are taken in the same viewpoint. In this paper, we present an approach to compositing novel image objects from multiple source images which have different viewpoints. Our key idea is to construct 3D proxies for meaningful components of the source image objects, and use these 3D component proxies to warp and seamlessly merge components together in the same viewpoint. To realize this idea, we introduce a coordinate-frame based single-view camera calibration algorithm to handle general types of image objects, a structure-aware cuboid optimization algorithm to get the cuboid proxies for image object components with correct structure relationship, and finally a 3D-proxy transformation guided image warping algorithm to stitch object components. We further describe a novel application based on this compositing approach to automatically synthesize a large number of image objects from a set of exemplars. Experimental results show that our compositing approach can be applied to a variety of image objects, such as chairs, cups, lamps, and robots, and the synthesis application can create novel image objects with significant shape and style variations from a small set of exemplars.

 

J. Comput. Sci. Technol. 31(3): 463-478 (2016)  [Paper] [Supplementary] [Video]


 

 

Fast Nearest Neighbor Search in the Hamming Space

Zhansheng Jiang, Lingxi Xie, Xiaotie Deng, Weiwei Xu, Jingdong Wang

Recent years have witnessed growing interest in computing compact binary codes and binary visual descriptors to alleviate the heavy computational costs in large-scale visual research. However, it is still computationally expensive to linearly scan large-scale databases for nearest neighbor (NN) search. In [15], a new approximate NN search algorithm is presented. With the concept of bridge vectors, which correspond to the cluster centers in Product Quantization [10], and the augmented neighborhood graph, it is possible to adopt an extract-on-demand strategy at the online querying stage to search with priority. This paper generalizes the algorithm to the Hamming space with an alternative version of k-means clustering. Despite its simplicity, our approach achieves competitive performance compared to state-of-the-art methods, i.e., MIH and FLANN, in terms of search precision, accessed data volume and average querying time.
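Hamming distance between binary codes is a bitwise XOR followed by a population count; the exhaustive linear scan that such methods aim to avoid looks like this minimal Python sketch:

    def hamming(a: int, b: int) -> int:
        # Number of differing bits between two binary codes.
        return bin(a ^ b).count("1")

    def linear_scan(query: int, database: list[int]) -> int:
        # Exhaustive NN search: the O(N) baseline the paper accelerates.
        return min(database, key=lambda code: hamming(query, code))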

 

International Conference on Multimedia Modeling (MMM) 2016: 325-336  [Paper] [Supplementary] [Video]


 

 

2015

 

Lightweight wrinkle synthesis for 3D facial modeling and animation

Jun Li, Weiwei Xu, Zhi-Quan Cheng, Kai Xu, Reinhard Klein

We present a lightweight non-parametric method to generate wrinkles for 3D facial modeling and animation. The key lightweight feature of the method is that it can generate plausible wrinkles using a single low-cost Kinect camera and one high-quality 3D face model with details as the example. Our method works in two stages: (1) offline personalized wrinkled blendshape construction, where user-specific expressions are recorded using the RGB-Depth camera and the wrinkles are generated through example-based synthesis of geometric details; and (2) online 3D facial performance capture, where the reconstructed expressions are used as blendshapes to capture facial animations in real time. Experiments on a variety of facial performance videos show that our method produces plausible results, approximating the wrinkles accurately. Furthermore, our technique is low-cost and convenient for common users.
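The online stage uses the standard delta blendshape formulation; a minimal NumPy sketch under that assumption (the weights w_i would come from the real-time face tracker, which is outside this snippet):

    import numpy as np

    def blend(base, blendshapes, weights):
        # V = V0 + sum_i w_i * (B_i - V0); all arrays have shape (n_vertices, 3).
        V = base.copy()
        for B, w in zip(blendshapes, weights):
            V += w * (B - base)
        return V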

 

Computer-Aided Design 58: 117-122 (2015)  [Paper] [Supplementary] [Video]


 

 

Agile structural analysis for fabrication-aware shape editing

Yue Xie, Weiwei Xu, Yin Yang, Xiaohu Guo, Kun Zhou

This paper presents an agile simulation-aided shape editing system for personal fabrication applications. Finite element structural analysis and geometric design are seamlessly integrated within our system to provide users with interactive structural analysis feedback during mesh editing. Observing that most editing operations are actually local, a domain decomposition framework is employed to provide a unified interface for shape editing, FEM system updating and shape optimization. We parameterize entries of the stiffness matrix as polynomial-like functions of the geometry editing parameters, so the underlying FEM system can be rapidly synchronized once edits are made. A local update scheme is devised to re-use the untouched parts of the FEM system, avoiding many repetitive calculations. Our system can also perform shape optimization to reduce high stresses in the model while preserving its appearance as much as possible. Experiments show that our system provides users with a smooth editing experience and accurate feedback.

 

Computer Aided Geometric Design 35-36: 163-179 (2015)  [Paper] [Supplementary] [Video]


 

 

A Suggestive Interface for Sketch-based Character Posing

Pei Lv, Pengjie Wang, Weiwei Xu, Jinxiang Chai, Mingmin Zhang, Zhigeng Pan, Mingliang Xu

We present a user-friendly suggestive interface for sketch-based character posing. Our interface provides suggestive information on the sketching canvas in succession by combining image retrieval techniques with 3D character posing while the user is drawing. The system highlights the canvas region where the user should draw and constrains the user's sketches to a reasonable solution space. This is based on an efficient image descriptor, which is used to measure the distance between the user's sketch and 2D views of 3D poses. To achieve faster query responses, locality-sensitive hashing is employed in our system. In addition, a sampling-based optimization algorithm is adopted to synthesize and optimize the retrieved 3D pose to best match the user's sketches. Experiments show that our interface can provide smooth suggestive information to improve the realism of sketched poses and shorten the time required for 3D posing.

 

Comput. Graph. Forum 34(7): 111-121 (2015)  [Paper] [Supplementary] [Video]


 

 

Interactive design and simulation of tubular supporting structure

Ran Luo, Lifeng Zhu, Weiwei Xu, Patrick Gage Kelley, Vanessa Svihla, Yin Yang

This paper presents a system for the design and simulation of tubular supporting structures. We model each freeform tube component as a swept surface, and employ boundary control and skeletal control to manipulate its cross-sections and its embedding, respectively. With the parametrization of the swept surface, a quadrilateral mesh consisting of nine-node general shell elements is automatically generated, and the stress distribution of the structure is simulated using the finite element method. To accelerate the complex finite element simulation, we adopt a two-level subspace simulation strategy, which constructs a secondary complementary subspace to improve subspace simulation accuracy. Together with the domain decomposition method, our system is able to provide interactive feedback for parametric freeform tube editing. Experiments show that our system can predict the structural behavior of the tube structure efficiently and accurately.

 

Graphical Models 80: 16-30 (2015)  [Paper] [Supplementary] [Video]


 

 

Integrating 3D structure into traffic scene understanding with RGB-D data

Yingjie Xia, Weiwei Xu, Luming Zhang, Xingmin Shi, Kuang Mao

RGB video is now one of the major data sources for traffic surveillance applications. To detect possible traffic events in a video, traffic-related objects, such as vehicles and pedestrians, should first be detected and recognized. However, due to the 2D nature of RGB videos, there are technical difficulties in efficiently detecting and recognizing traffic-related objects from them. For instance, traffic-related objects cannot be efficiently detected separately when parts of them overlap, and complex backgrounds influence the accuracy of object detection. In this paper, we propose a robust RGB-D data based traffic scene understanding algorithm. By integrating depth information, we can compute more discriminative object features, and spatial information can be used to separate the objects in the scene efficiently. Experimental results show that integrating depth data improves the accuracy of object detection and recognition. We also show that the analyzed object information plus depth data facilitate two important traffic event detection applications: overtaking warning and collision avoidance.

 

Neurocomputing 151: 700-709 (2015)  [Paper] [Supplementary] [Video]


 

 

Recognizing multi-view objects with occlusions using a deep architecture

Yingjie Xia, Luming Zhang, Weiwei Xu, Zhenyu Shan, Yuncai Liu

Image-based object recognition is employed widely in many computer vision applications such as image semantic annotation and object location. However, traditional object recognition algorithms based on the 2D features of RGB data have difficulty when objects overlap and image occlusion occurs. At present, RGB-D cameras are being used more widely, and RGB-D depth data can provide auxiliary information to address these challenges. In this study, we propose a deep learning approach for the efficient recognition of 3D objects with occlusion. First, this approach constructs a multi-view shape model of 3D objects, using an encoder-decoder deep learning network to represent the features. Next, 3D object recognition in indoor scenes is performed using random forests. The application of deep learning to RGB-D data is beneficial for recovering information that is missing due to image occlusion. Our experimental results demonstrate that this approach can significantly improve the efficiency of feature representation and the performance of object recognition with occlusion.

 

Inf. Sci. 320: 333-345 (2015)  [Paper] [Supplementary] [Video]


 

 

Boundary-dominant flower blooming simulation

Jianfang Li, Min Liu, Weiwei Xu, Haiyi Liang, Ligang Liu

This paper presents a new physics-based simulation method for flower blossoming, based on the biological observation that flower opening is usually driven by a boundary-dominant morphological transition in a curved petal. We use an elastic triangular mesh to represent a flower petal and adopt in-plane expansion to induce global bending. Out-of-plane curl plays an auxiliary role in reducing the curvatures of cross-sections. We also adapt a semi-implicit Euler time integrator for fast simulation, which has intrinsic damping and at least first-order accuracy. Our system allows users to control the blossoming process by simply specifying a growth curve, which is easy to design because of the boundary-dominant property. Experimental results show that our physics-based system runs faster and generates more realistic and convincing blossom results than existing simulation methods.
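The semi-implicit (symplectic) Euler integrator mentioned above updates velocity before position, which is what distinguishes it from the unstable explicit variant; a generic NumPy sketch of one step (standard integrator form, not the paper's full petal model):

    import numpy as np

    def semi_implicit_euler_step(x, v, inv_mass, force, h):
        # Velocity update first, using forces at the current positions...
        v_new = v + h * inv_mass * force(x)
        # ...then the position update uses the already-updated velocities.
        x_new = x + h * v_new
        return x_new, v_new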

 

Journal of Visualization and Computer Animation 26(3-4): 433-443 (2015)  [Paper] [Supplementary] [Video]


 

 

Online Structure Analysis for Real-Time Indoor Scene Reconstruction

Yizhong Zhang, Weiwei Xu, Yiying Tong, Kun Zhou

We propose a real-time approach for indoor scene reconstruction. It is capable of producing a ready-to-use 3D geometric model even while the user is still scanning the environment with a consumer depth camera. Our approach features explicit representations of planar regions and nonplanar objects extracted from the noisy feed of the depth camera, via an online structure analysis on the dynamic, incomplete data. The structural information is incorporated into the volumetric representation of the scene, resulting in a seamless integration with KinectFusion's global data structure and an efficient implementation of the whole reconstruction process. Moreover, heuristics based on rectilinear shapes in typical indoor scenes effectively eliminate camera tracking drift and further improve reconstruction accuracy. The instantaneous feedback enabled by our on-the-fly structure analysis, including repeated object recognition, allows the user to selectively scan the scene and produce high-fidelity large-scale models efficiently. We demonstrate the capability of our system with real-life examples.

 

ACM Trans. Graph. 34(5): 159 (2015)  [Paper] [Supplementary] [Video]


 

 

Expediting precomputation for reduced deformable simulation

Yin Yang, Dingzeyu Li, Weiwei Xu, Yuan Tian, Changxi Zheng

Model reduction has become popular for simulating elastic deformation in graphics applications. While these techniques enjoy orders-of-magnitude speedups at runtime, the efficiency of precomputing the reduced subspaces remains largely overlooked. We present a complete precomputation pipeline as a faster alternative to classic linear and nonlinear modal analysis. We identify three bottlenecks in traditional model reduction precomputation, namely modal matrix construction, cubature training, and training dataset generation, and accelerate each of them. Even with complex deformable models, our method achieves orders-of-magnitude speedups over the traditional precomputation steps, while retaining comparable runtime simulation quality.

 

ACM Trans. Graph. 34(6): 243 (2015)  [Paper] [Supplementary] [Video]


 

 

2014

 

Transductive 3D Shape Segmentation using Sparse Reconstruction

Weiwei Xu, Zhouxu Shi, Mingliang Xu, Kun Zhou, Jingdong Wang, Bin Zhou, Jinrong Wang, Zhenming Yuan

We propose a transductive shape segmentation algorithm that can transfer prior segmentation results in a database to new shapes without explicit specification of prior category information. Our method first partitions an input shape into a set of candidate segmentations as data preparation, and then a linear integer programming algorithm is used to select segments from them to form the final optimal segmentation. The key idea is to maximize the segment similarity between the segments of the input shape and the segments in the database, where segment similarity is computed through sparse reconstruction error. The segment-level similarity makes it possible to handle a large number of shapes with significant topology or shape variations using a small set of segmented example shapes. Experimental results show that our algorithm generates high-quality segmentation and semantic labeling results on the Princeton segmentation benchmark.

 

Comput. Graph. Forum 33(5): 107-115 (2014)  [Paper] [Supplementary] [Video]


 

 

Automatic 3D Indoor Scene Updating with RGBD Cameras

Zhenbao Liu, Sicong Tang, Weiwei Xu, Shuhui Bu, Junwei Han, Kun Zhou

Since indoor scenes change frequently in daily life, for example through re-layout of furniture, their 3D reconstructions should be flexible and easy to update. We present an automatic 3D scene update algorithm for indoor scenes that captures scene variation with RGBD cameras. We assume an initial scene has been reconstructed in advance, manually or semi-automatically, before the change, and automatically update the reconstruction according to newly captured RGBD images of the real scene. The algorithm starts with an automatic segmentation process without manual interaction, which benefits from accurate training labels derived from the initial 3D scene. After the segmentation, objects captured by the RGBD camera are extracted to form a locally updated scene. We formulate an optimization problem that compares this local scene to the initial scene to locate moved objects. The moved objects are then integrated with the static objects in the initial scene to generate a new 3D scene. We demonstrate the efficiency and robustness of our approach by updating the 3D scenes of several real-world scenes.

 

Comput. Graph. Forum 33(7): 269-278 (2014)  [Paper] [Supplementary] [Video]


 

 

An asymptotic numerical method for inverse elastic shape design

Xiang Chen, Changxi Zheng, Weiwei Xu, Kun Zhou

Inverse shape design for elastic objects greatly eases design effort by letting users focus on desired target shapes without thinking about elastic deformations. Solving this problem using classic iterative methods (e.g., Newton-Raphson methods), however, often suffers from slow convergence toward a desired solution. In this paper, we propose an asymptotic numerical method that exploits the underlying mathematical structure of specific nonlinear material models, and thus runs orders of magnitude faster than traditional Newton-type methods. We apply this method to compute rest shapes for elastic fabrication, where the rest shape of an elastic object is computed such that after physical fabrication the real object deforms into a desired shape. We illustrate the performance and robustness of our method through a series of elastic fabrication experiments.
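For readers unfamiliar with asymptotic numerical methods, the core idea is a high-order power-series continuation in a load parameter a (a generic statement of the technique, not the paper's specific derivation):

    \[ u(a) = u_0 + a\,u_1 + a^2 u_2 + \dots + a^N u_N, \]

where all coefficient vectors u_i are obtained from linear systems that share a single matrix factorization, so one factorization advances the solution along a large portion of the path; this reuse is the source of the speedup over Newton-type iterations.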

 

ACM Trans. Graph. 33(4): 95:1-95:11 (2014)  [Paper] [Supplementary] [Video]


 

 

Imagining the unseen: stability-based cuboid arrangements for scene understanding

Tianjia Shao, Aron Monszpart, Youyi Zheng, Bongjin Koo, Weiwei Xu, Kun Zhou, Niloy J. Mitra

Missing data due to occlusion is a key challenge in 3D acquisition, particularly in cluttered man-made scenes. Such partial information about the scenes limits our ability to analyze and understand them. In this work we abstract such environments as collections of cuboids and hallucinate geometry in the occluded regions by globally analyzing the physical stability of the resultant arrangements of the cuboids. Our algorithm extrapolates the cuboids into the unseen regions to infer both their corresponding geometric attributes (e.g., size, orientation) and how the cuboids topologically interact with each other (e.g., touch or fixed). The resultant arrangement provides an abstraction for the underlying structure of the scene that can then be used for a range of common geometry processing tasks. We evaluate our algorithm on a large number of test scenes with varying complexity, validate the results on existing benchmark datasets, and demonstrate the use of the recovered cuboid-based structures towards object retrieval, scene completion, etc.

 

ACM Trans. Graph. 33(6): 209:1-209:11 (2014)  [Paper] [Supplementary] [Video]


 

 

Sensitivity-optimized rigging for example-based real-time clothing synthesis

Weiwei Xu, Nobuyuki Umetani, Qianwen Chao, Jie Mao, Xiaogang Jin, Xin Tong

We present a real-time solution for generating detailed clothing deformations from pre-computed clothing shape examples. Given an input pose, it synthesizes a clothing deformation by blending skinned clothing deformations of nearby examples controlled by the body skeleton. Observing that cloth deformation can be well modeled with sensitivity analysis driven by the underlying skeleton, we introduce a sensitivity based method to construct a pose-dependent rigging solution from sparse examples. We also develop a sensitivity based blending scheme to find nearby examples for the input pose and evaluate their contributions to the result. Finally, we propose a stochastic optimization based greedy scheme for sampling the pose space and generating example clothing shapes. Our solution is fast, compact and can generate realistic clothing animation results for various kinds of clothes in real time.

 

ACM Trans. Graph. 33(4): 107:1-107:11 (2014)  [Paper] [Supplementary] [Video]


 

 

2013

 

As-Rigid-As-Possible Distance Field Metamorphosis

Yanlin Weng, Menglei Chai, Weiwei Xu, Yiying Tong, Kun Zhou

Widely used for morphing between objects with arbitrary topology, distance field interpolation (DFI) handles topological transition naturally without the need for correspondence or remeshing, unlike surface-based interpolation approaches. However, lack of correspondence in DFI also leads to ineffective control over the morphing process. In particular, unless the user specifies a dense set of landmarks, it is not even possible to measure the distortion of intermediate shapes during interpolation, let alone control it. To remedy such issues, we introduce an approach for establishing correspondence between the interior of two arbitrary objects, formulated as an optimal mass transport problem with a sparse set of landmarks. This correspondence enables us to compute non-rigid warping functions that better align the source and target objects as well as to incorporate local rigidity constraints to perform as-rigid-as-possible DFI. We demonstrate how our approach helps achieve flexible morphing results with a small number of landmarks.

 

Comput. Graph. Forum 32(7): 381-389 (2013)  [Paper] [Supplementary] [Video]


 

 

Interpreting concept sketches

Tianjia Shao, Wilmot Li, Kun Zhou, Weiwei Xu, Baining Guo, Niloy J. Mitra

Concept sketches are popularly used by designers to convey pose and function of products. Understanding such sketches, however, requires special skills to form a mental 3D representation of the product geometry by linking parts across the different sketches and imagining the intermediate object configurations. Hence, the sketches can remain inaccessible to many, especially non-designers. We present a system to facilitate easy interpretation and exploration of concept sketches. Starting from crudely specified incomplete geometry, often inconsistent across the different views, we propose a globally-coupled analysis to extract part correspondence and inter-part junction information that best explain the different sketch views. The user can then interactively explore the abstracted object to gain better understanding of the product functions. Our key technical contribution is performing shape analysis without access to any coherent 3D geometric model by reasoning in the space of inter-part relations. We evaluate our system on various concept sketches obtained from popular product design books and websites.

 

ACM Trans. Graph. 32(4): 56:1-56:10 (2013)  [Paper] [Supplementary] [Video]


 

 

Boundary-Aware Multidomain Subspace Deformation

Yin Yang, Weiwei Xu, Xiaohu Guo, Kun Zhou, Baining Guo

In this paper, we propose a novel framework for multidomain subspace deformation using node-wise corotational elasticity. With the proper construction of subspaces based on knowledge of the boundary deformation, we can use the Lagrange multiplier technique to impose coupling constraints at the boundary without overconstraining. In our deformation algorithm, the number of constraint equations needed to couple two neighboring domains is not related to the number of nodes on the boundary but is equal to the number of selected boundary deformation modes. The crack artifact is not present in our simulation results, and domain decompositions with loops can be easily handled. Experimental results show that a single-core implementation of our algorithm achieves real-time performance when simulating deformable objects with around a quarter million tetrahedral elements.
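Coupling two domains with Lagrange multipliers yields the usual saddle-point (KKT) system; schematically (standard form, stated here as background):

    \[ \begin{pmatrix} K & C^{\mathsf{T}} \\ C & 0 \end{pmatrix} \begin{pmatrix} u \\ \lambda \end{pmatrix} = \begin{pmatrix} f \\ 0 \end{pmatrix}, \]

where each row of the constraint matrix C corresponds to one selected boundary deformation mode, so the multiplier count scales with the number of modes rather than the number of boundary nodes.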

 

IEEE Trans. Vis. Comput. Graph. 19(10): 1633-1645 (2013)  [Paper] [Supplementary] [Video]


 

 

2012

 

Motion-Guided Mechanical Toy Modeling

Lifeng Zhu, Weiwei Xu, John Snyder, Yang Liu, Guoping Wang, Baining Guo

We introduce a new method to synthesize mechanical toys solely from the motion of their features. The designer specifies the geometry and a time-varying rotation and translation of each rigid feature component. Our algorithm automatically generates a mechanism assembly located in a box below the feature base that produces the specified motion. Parts in the assembly are selected from a parameterized set including belt-pulleys, gears, crank-sliders, quick-returns, and various cams (snail, ellipse, and double-ellipse). Positions and parameters for these parts are optimized to generate the specified motion, minimize a simple measure of complexity, and yield a well-distributed layout of parts over the driving axes. Our solution uses a special initialization procedure followed by simulated annealing to efficiently search the complex configuration space for an optimal assembly.
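Generic simulated annealing over a parameterized assembly can be sketched as follows; the energy, propose, and init_state arguments are hypothetical placeholders standing in for the paper's motion-matching objective, part mutations, and special initialization procedure.

    import math
    import random

    def simulated_annealing(init_state, energy, propose,
                            T0=1.0, cooling=0.99, iters=10000):
        # Standard Metropolis acceptance with a geometric cooling schedule.
        state, E = init_state, energy(init_state)
        best, best_E = state, E
        T = T0
        for _ in range(iters):
            cand = propose(state)          # e.g., mutate a part type or parameter
            dE = energy(cand) - E
            if dE < 0 or random.random() < math.exp(-dE / T):
                state, E = cand, E + dE    # accept the candidate
                if E < best_E:
                    best, best_E = state, E
            T *= cooling
        return best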

 

ACM SIGGRAPH Asia 2012  [Paper] [Supplementary] [Video]


 

 

An Interactive Approach to Semantic Modeling of Indoor Scenes with an RGBD Camera

Tianjia Shao, Weiwei Xu, Kun Zhou, Jingdong Wang, Dongping Li, Baining Guo

We present an interactive approach to semantic modeling of indoor scenes with a consumer-level RGBD camera. Using our approach, the user first takes an RGBD image of an indoor scene, which is automatically segmented into a set of regions with semantic labels. If the segmentation is not satisfactory, the user can draw some strokes to guide the algorithm to achieve better results. After the segmentation is finished, the depth data of each semantic region is used to retrieve a matching 3D model from a database. Each model is then transformed according to the image depth to yield the scene. For large scenes where a single image can only cover one part of the scene, the user can take multiple images to construct the other parts. The 3D models built for all images are then transformed and unified into a complete scene. We demonstrate the efficiency and robustness of our approach by modeling several real-world scenes.

 

ACM SIGGRAPH Asia 2012  [Paper] [Video] [Data]


 

 

All-Hex Meshing using Singularity-Restricted Field

Yufei Li, Yang Liu, Weiwei Xu, Wenping Wang, Baining Guo

Decomposing a volume into high-quality hexahedral cells is a challenging task in geometric modeling and computational geometry. Inspired by the use of cross fields in quad meshing and the CubeCover approach in hex meshing, we present a complete all-hex meshing framework based on a singularity-restricted field, which is essential for inducing a valid all-hex structure. Given a volume represented by a tetrahedral mesh, we first compute a boundary-aligned 3D frame field inside it, then convert the frame field to be singularity-restricted using effective topological operations. Our all-hex meshing framework also presents an enhanced CubeCover approach that reduces degenerate elements appearing in the volume parameterization via tetrahedral split operations and handles flipped elements effectively in hex-mesh extraction. Experimental results show that our algorithm generates high-quality all-hex meshes from a variety of 3D volumes robustly and efficiently.

 

ACM SIGGRAPH Asia 2012  [Paper] [Video]


 

 

Diffusion Curve Textures for Resolution Independent Texture Mapping

Xin Sun, Guofu Xie, Yue Dong, Stephen Lin, Weiwei Xu, Wencheng Wang, Xin Tong, Baining Guo

We introduce a vector representation called diffusion curve textures for mapping diffusion curve images (DCI) onto arbitrary surfaces. In contrast to the original implicit representation of DCIs [Orzan et al. 2008], where determining a single texture value requires iterative computation of the entire DCI via the Poisson equation, diffusion curve textures provide an explicit representation from which the texture value at any point can be solved directly, while preserving the compactness and resolution independence of diffusion curves. This is achieved through a formulation of the DCI diffusion process in terms of Green's functions. This formulation furthermore allows the texture value of any rectangular region (e.g. pixel area) to be solved in closed form, which facilitates anti-aliasing. We develop a GPU algorithm that renders anti-aliased diffusion curve textures in real time, and demonstrate the effectiveness of this method through high quality renderings with detailed control curves and color variations.
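As background on the Green's-function formulation, a Poisson solution admits the generic integral representation (standard potential theory, not the paper's specific kernel):

    \[ \nabla^2 u = f \;\Rightarrow\; u(x) = \int_\Omega G(x, y)\, f(y)\, \mathrm{d}y \; + \; \text{boundary terms}, \]

so once G is available for the domain, the value at a single point, or its integral over a pixel footprint, can be evaluated directly without diffusing the entire image.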

 

ACM Transactions on Graphics (SIGGRAPH 2012)  [Paper] [Video] [Bibtex]


 

 

2011

 

General Planar Quadrilateral Mesh Design Using Conjugate Direction Field

Yang Liu, Weiwei Xu, Lifeng Zhu, Jun Wang, Baining Guo, Falai Chen, Guoping Wang

We present a novel method to approximate a freeform shape with a planar quadrilateral (PQ) mesh for modeling architectural glass structures. Our method is based on the study of conjugate direction fields (CDF) which allow the presence of k/4 singularities. Starting with a triangle discretization of a freeform shape, we first compute an as smooth as possible conjugate direction field satisfying the user’s directional and angular constraints, then apply mixed-integer quadrangulation and planarization techniques to generate a PQ mesh which approximates the input shape faithfully. We demonstrate that our method is effective and robust on various 3D models.

 

ACM SIGGRAPH Asia 2011  [Paper] [Video] [Bibtex]


 

 

Discriminative Sketch-based 3D Model Retrieval via Robust Shape Matching

Tianjia Shao, Weiwei Xu, Kangkang Yin, Jingdong Wang, Kun Zhou, Baining Guo

We propose a sketch-based 3D shape retrieval system that is substantially more discriminative and robust than existing systems, especially for complex models. The power of our system comes from a combination of a contour-based 2D shape representation and a robust sampling-based shape matching scheme. They are defined over discriminative local features and are applicable to partial sketches; robust to noise and distortions in hand drawings; and consistent when strokes are added progressively. However, our robust shape matching algorithm requires dense sampling and registration, which incurs a high computational cost. We thus devise critical acceleration methods to achieve interactive performance: precomputing kNN graphs that record transformations between neighboring contour images and enable fast online shape alignment; pruning sampling and shape registration strategically and hierarchically; and parallelizing shape matching on multi-core platforms or GPUs. We demonstrate the effectiveness of our system through various experiments, comparisons, and a user study.

 

Computer Graphics Forum (PG 2011)  [Paper] [Video] [Bibtex]


 

 

2010

 

Sampling-based Contact-Rich Motion Control

Libin Liu, KangKang Yin, Michiel van de Panne, Tianjia Shao, Weiwei Xu

Human motions are the product of internal and external forces, but these forces are very difficult to measure in a general setting. Given a motion capture trajectory, we propose a method to reconstruct its open-loop control and the implicit contact forces. The method employs a strategy based on randomized sampling of the control within user-specified bounds, coupled with forward dynamics simulation. Sampling-based techniques are well suited to this task because of their lack of dependence on derivatives, which are difficult to estimate in contact-rich scenarios. They are also easy to parallelize, which we exploit in our implementation on a compute cluster. We demonstrate reconstruction of a diverse set of captured motions, including walking, running, and contact-rich tasks such as rolls and kip-up jumps. We further show how the method can be applied to physically based motion transformation and retargeting, physically plausible motion variations, and reference trajectory-free idling motions. Alongside the successes, we point out a number of limitations and directions for future work.
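The reconstruction strategy described above can be sketched as a generic sample-and-simulate loop; simulate and tracking_cost are hypothetical placeholders for the forward dynamics and the distance to the capture trajectory, and the greedy per-step search shown here is a simplification of the paper's full sampling scheme.

    import random

    def reconstruct_controls(x0, reference, simulate, tracking_cost,
                             n_samples=200, bound=0.1, ctrl_dim=30):
        # At each frame, sample control perturbations within user-specified bounds,
        # forward-simulate each, and keep the sample that best tracks the capture.
        state, controls = x0, []
        for ref_frame in reference:
            candidates = [[random.uniform(-bound, bound) for _ in range(ctrl_dim)]
                          for _ in range(n_samples)]
            results = [simulate(state, u) for u in candidates]
            best = min(range(n_samples),
                       key=lambda i: tracking_cost(results[i], ref_frame))
            state = results[best]
            controls.append(candidates[best])
        return controls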

 

ACM SIGGRAPH 2010  [Paper] [Video 46MB] [Bibtex]


 

 

Deformation Transfer to Multi-component Objects

Kun Zhou, Weiwei Xu, Yiying Tong, Mathieu Desbrun

We present a simple and effective algorithm to transfer deformation between surface meshes with multiple components. The algorithm automatically computes spatial relationships between components of the target object, builds correspondences between source and target, and finally transfers deformation of the source onto the target while preserving cohesion between the target’s components. We demonstrate the versatility of our approach on various complex models.

Computer Graphics Forum (Eurographics 2010) [Paper] [Video 22MB] [Bibtex]

 

 

 

2009

 

Joint-aware Manipulation of Deformable Model

Weiwei Xu, Jun Wang, KangKang Yin, Kun Zhou, Michiel van de Panne, Falai Chen, Baining Guo

 

Complex mesh models of man-made objects often consist of multiple components connected by various types of joints. We propose a joint-aware deformation framework that supports the direct manipulation of an arbitrary mix of rigid and deformable components. We apply slippable motion analysis to automatically detect multiple types of joint constraints that are implicit in model geometry. For single-component geometry or models with disconnected components, we support user-defined virtual joints. We integrate manipulation handle constraints, multiple components, joint constraints, joint limits, and deformation energies into a single volumetric-cell based space deformation problem. An iterative, parallelized Gauss-Newton solver is used to solve the resulting non-linear optimization. Interactive deformable manipulation is demonstrated on a variety of geometric models while automatically respecting their multi-component nature and the natural behavior of their joints.
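The non-linear optimization described above is solved by repeated linearization; each Gauss-Newton iteration has the standard form (a textbook statement of the method; the parallelization details are in the paper):

    \[ (J^{\mathsf{T}} J)\,\Delta p = -J^{\mathsf{T}} r(p), \qquad p \leftarrow p + \Delta p, \]

where r(p) stacks the handle, joint-constraint and deformation-energy residuals and J is its Jacobian.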

ACM SIGGRAPH 2009  [Paper] [Video 66MB] [Bibtex]

 

 

 

2008

 

Keyframe-based Video Object Deformation

Yanlin Weng, Weiwei Xu, Shichao Hu, Jun Zhang, Baining Guo

 

This paper presents a novel deformation system for video objects. The system is designed to minimize the amount of user interaction while providing flexible and precise user control. It has a keyframe-based user interface: the user only needs to manipulate the video object at some keyframes, and our algorithm smoothly propagates the editing result from the keyframes to the remaining frames to automatically generate the new video object. The algorithm preserves the temporal coherence as well as the shape features of the video objects in the original video clips.

IEEE Cyber World 2008  [Paper] [Video 32MB] [Bibtex]

 

2007

 

Gradient Domain Editing of Deforming Mesh Sequence

Weiwei Xu, Kun Zhou, Yizhou Yu, Qifeng Tan, Qunsheng Peng, Baining Guo.

 

Many graphics applications, including computer games and 3D animated films, make heavy use of deforming mesh sequences. In this paper, we generalize gradient domain editing to deforming mesh sequences. Our framework is keyframe based. Given sparse and irregularly distributed constraints at unevenly spaced keyframes, our solution first adjusts the meshes at the keyframes to satisfy these constraints, and then smoothly propagates the constraints and deformations at the keyframes to the whole sequence to generate the new deforming mesh sequence. To achieve convenient keyframe editing, we have developed an efficient alternating least-squares method. It harnesses the power of subspace deformation and two-pass linear methods to achieve high-quality deformations. We have also developed an effective algorithm to define boundary conditions for all frames using handle trajectory editing. Our deforming mesh editing framework has been successfully applied to a number of editing scenarios with increasing complexity, including footprint editing, path editing, temporal filtering, handle-based deformation mixing, and spacetime morphing.
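Gradient-domain editing in this setting solves for coordinates whose gradients match an edited target field in the least-squares sense; generically (background for readers, not the paper's exact energy):

    \[ \min_x \lVert G x - g \rVert^2 \;\Rightarrow\; G^{\mathsf{T}} G\, x = G^{\mathsf{T}} g, \]

where G is the discrete gradient operator of the mesh, \(G^{\mathsf{T}} G\) its Laplacian, and g the edited gradient field; the keyframe handle constraints enter as additional linear equations.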

ACM Transactions on Graphics (SIGGRAPH 2007)  [Paper] [Video 80MB] [Bibtex]

Direct Manipulation of Subdivision Mesh

Kun Zhou, Xin Huang, Weiwei Xu, Baining Guo.

 

We present an algorithm for interactive deformation of subdivision surfaces, including displaced subdivision surfaces and subdivision surfaces with geometric textures. Our system lets the user directly manipulate the surface using freely-selected surface points as handles. During deformation the control mesh vertices are automatically adjusted such that the deforming surface satisfies the handle position constraints while preserving the original surface shape and details. To best preserve surface details, we develop a gradient domain technique that incorporates the handle position constraints and detail preserving objectives into the deformation energy. For displaced subdivision surfaces and surfaces with geometric textures, the deformation energy is highly nonlinear and cannot be handled with existing iterative solvers. To address this issue, we introduce a shell deformation solver, which replaces each numerically unstable iteration step with two stable mesh deformation operations. Our deformation algorithm only uses local operations and is thus suitable for GPU implementation. The result is a real-time deformation system running orders of magnitude faster than the state-of-the-art multigrid mesh deformation solver. We demonstrate our technique with a variety of examples, including examples of creating visually pleasing character animations in real-time by driving a subdivision surface with motion capture data.

 

ACM Transactions on Graphics (SIGGRAPH 2007)  [Paper] [Video 80MB] [Bibtex]

 

 

 

 

 

 

2003~2010

 

Gradient Domain Mesh Deformation – A Survey

Weiwei Xu, Kun Zhou

Journal of Computer Science and Technology, 24(1):6-18, 2009

 

 

 

2D Shape Deformation using Nonlinear Least Squares Optimization

Yanlin Weng, Weiwei Xu, Kun Zhou, Baining Guo

The visual Computer, 22(9-11):653-660, 2006, [Paper 2MB]

 

 

Footprint Sampling based Motion Editing

Weiwei Xu, Zhigeng Pan, Mingmin Zhang

Int. J. Image Graphics 3(2): 311-324 (2003)

 

 

Easybowling: a small bowling machine based on virtual simulation

Zhigeng Pan, Weiwei Xu, Jin Huang, Mingmin Zhang, Jiaoying Shi

Computers & Graphics 27(2): 231-238 (2003)

 

 

 

ACM Copyright Notice

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM’s Digital Library --http://www.acm.org/dl/.

IEEE Copyright Notice

This material is posted here with permission of the IEEE. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to pubs-permissions@ieee.org.