

To overcome such limitations, learning-based methods have become the mainstream approach. The Context Encoder proposed by Pathak first demonstrated the effectiveness of GANs for image inpainting. Building on GANs, many learning-based methods have explored semantic and structural priors to assist the inpainting process, commonly within a two-stage framework. EdgeConnect used the edge map obtained with the Canny operator as structural information, while StructureFlow took an edge-preserved smooth image as its prior, which retains sharp edges and low-frequency structure. In addition, the gradient map adopted by Yang preserves edge information and high-frequency texture. Considering the distinctive properties of human faces, symmetry, facial UV maps, landmarks, and depth information have been introduced to make inpainting results more authentic. However, these prior-based methods use incompletely separated frequency information only as priors in the RGB domain, and rarely operate in the frequency domain itself. Because the underlying distributions of different frequency components differ, processing the RGB image directly leads to artificial boundaries and blurred textures, especially for large masks. We therefore propose to conduct face inpainting in the frequency domain.

Image inpainting aims to fill in missing pixels so that the inpainted image looks realistic. It has wide applications in repairing damaged images, removing objects, and editing content. Face inpainting, a branch of image inpainting, is challenging because no similar patches can be copied from known areas when facial features are occluded. Because the underlying distributions differ across frequencies, existing face inpainting methods that operate in the RGB domain can produce artificial boundaries and blurred details when the masks are large. To reduce this interference, we propose to inpaint face images in the wavelet domain with two branches that capture global structure topology and local detailed texture separately. Meanwhile, we adopt a channel relation attention module to learn different weights for different feature channels, with the aim of ensuring consistency between structure and texture features. In addition, a new wavelet loss is designed to constrain the generated wavelet coefficients to be closer to the ground truth. Experimental results on the CelebA-HQ and Helen datasets demonstrate that our Wavelet Prediction Network outperforms current state-of-the-art face inpainting techniques both qualitatively and quantitatively, especially on face images with large masks.
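
To make the idea of separating structure from texture and penalizing errors on wavelet coefficients more concrete, the following is a minimal sketch and not the network or loss used in this work. It assumes a single-level orthonormal Haar transform on a grayscale image stored as a list of rows, and an L1 distance per sub-band. The function names `haar_dwt2` and `wavelet_l1_loss`, the sub-band labels, and the `weights` parameter are all hypothetical choices made for illustration.

```python
def haar_dwt2(img):
    """Single-level 2D orthonormal Haar transform (illustrative sketch).

    img: list of rows of floats with even height and width.
    Returns four sub-bands, each half the size of img:
      ll: local averages (low-frequency structure)
      lh: differences between rows (responds to horizontal edges)
      hl: differences between columns (responds to vertical edges)
      hh: diagonal differences
    Sub-band naming conventions vary between libraries; this labeling
    is an assumption of the sketch.
    """
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, h, 2):
        rows = ([], [], [], [])
        for j in range(0, w, 2):
            # The 2x2 block: a b / c d
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            rows[0].append((a + b + c + d) / 2)
            rows[1].append((a + b - c - d) / 2)
            rows[2].append((a - b + c - d) / 2)
            rows[3].append((a - b - c + d) / 2)
        for band, row in zip((ll, lh, hl, hh), rows):
            band.append(row)
    return ll, lh, hl, hh


def wavelet_l1_loss(pred, target, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of mean absolute errors over the four Haar sub-bands.

    The L1 form and the per-band weights are assumptions; the source only
    states that the loss pulls generated coefficients toward the ground truth.
    """
    loss = 0.0
    for wt, p_band, t_band in zip(weights, haar_dwt2(pred), haar_dwt2(target)):
        diffs = [abs(p - t)
                 for p_row, t_row in zip(p_band, t_band)
                 for p, t in zip(p_row, t_row)]
        loss += wt * sum(diffs) / len(diffs)
    return loss


# A 2x2 patch containing a vertical edge: only ll and hl are non-zero.
edge = [[0.0, 1.0], [0.0, 1.0]]
ll, lh, hl, hh = haar_dwt2(edge)
```

In this toy example the smooth content ends up in `ll` and the edge in `hl`, which hints at why a low-frequency branch and a high-frequency branch could be trained on different statistics. In practice a library implementation of the discrete wavelet transform would replace the hand-written loop.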
