Abstract: Humans are consumers of visual content. Every day, people watch videos, play digital games, and share photos on social media. But there is still an asymmetry: not many of us are creators. We aim to build machines capable of creating and manipulating photographs, and to use them as training wheels for visual content creation, with the goal of making people more visually literate. We propose to learn natural image statistics directly from large-scale data. We then define a class of image generation and editing operations and constrain their output to look realistic according to the learned image statistics.
I will discuss a few recent projects. First, we propose to directly model the natural image manifold via generative adversarial networks (GANs) and constrain the output of an image editing tool to lie on this manifold. Then, we present a general image-to-image translation framework, “pix2pix”, in which a network is trained to map input images (such as user sketches) directly to natural-looking results. Finally, we introduce CycleGAN, which learns image-to-image translation models even in the absence of paired training data, and demonstrate its application to style transfer, object transfiguration, season transfer, and photo enhancement.
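The key idea that lets CycleGAN train without paired data is a cycle-consistency loss: a forward mapping G: X→Y and a backward mapping F: Y→X are trained so that the round trip F(G(x)) reconstructs the input. The sketch below illustrates only that loss term with toy invertible functions standing in for the learned generator networks; the function names G, F, and cycle_consistency_loss are illustrative, not from any released implementation.

```python
import numpy as np

# Toy stand-ins for the two generator networks. In CycleGAN these are
# learned CNNs; here they are simple invertible maps so the cycle idea
# is easy to check by hand.
def G(x):
    """Hypothetical forward mapping, domain X -> domain Y."""
    return 2.0 * x + 1.0

def F(y):
    """Hypothetical backward mapping, domain Y -> domain X."""
    return (y - 1.0) / 2.0

def cycle_consistency_loss(x_batch):
    """L1 reconstruction error of the round trip ||F(G(x)) - x||_1.

    Because no paired target in Y is needed, this term provides a
    training signal even with unpaired data; the full objective adds
    adversarial losses in both domains.
    """
    reconstructed = F(G(x_batch))
    return float(np.mean(np.abs(reconstructed - x_batch)))

x = np.array([0.0, 1.0, -2.0])
print(cycle_consistency_loss(x))  # 0.0: this toy pair is exactly invertible
```

With learned networks the round trip is only approximately invertible, so this loss stays positive and is minimized jointly with the adversarial terms.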
Bio: Jun-Yan Zhu is a Ph.D. student at the Berkeley AI Research (BAIR) Lab, working on computer vision, graphics, and machine learning with Professor Alexei A. Efros. He received his B.E. from Tsinghua University in 2012 and was a Ph.D. student at CMU from 2012 to 2013. His research goal is to build machines capable of recreating the visual world. Jun-Yan is currently supported by a Facebook Graduate Fellowship.