At a glance, the photo up top looks like an ordinary photo of an ordinary street, taken either from a dash cam or from someone foolish enough to wander into the road to snap a picture of such a mundane scene.
But look a little closer. Notice how the traffic signal is slightly warped, or how some of the cars seem fuzzy? There’s something wrong here. This isn’t a photograph at all. It’s an image that was created entirely by an A.I.
Computer scientists from the tech company Nvidia and the University of California, Berkeley have written a research paper, available as a preprint on arXiv, detailing how they got a neural network to generate realistic street images and human portraits. They even included a user interface that lets you tweak the pictures however you’d like by adding extra foliage or even changing the weather.
“Gaming is growing fast, because people love interacting with each other in virtual environments,” Ming-Yu Liu, a senior scientist at Nvidia, tells Inverse in an email. “However, building virtual worlds is expensive with today’s technology, because it requires artists to explicitly model and simulate texture and lighting for the world they’re building. With image-to-image translation, we can instead sample the real world to create virtual worlds.”
Neural networks are computing systems loosely modeled on the human brain: they take in information, apply it, and learn from the results. This research used a special type of neural net introduced by Ian Goodfellow in 2014, called a generative adversarial network, or GAN, which generally consists of two networks, a generator and a discriminator.
The generator's job is to create synthetic images that resemble a set of real photos. The discriminator is shown a mix of the real photos and the fakes, and its job is to tell them apart. As this process goes on, the generator becomes better at mimicking the original images and the discriminator becomes better at spotting the fakes. The results are some pretty convincing, and totally fake, pictures.
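To make that back-and-forth concrete, here is a minimal toy sketch in Python. It is not the network from the paper: instead of images, the "real photos" are just numbers clustered around 4, the generator is a single line (`a * noise + b`), and the discriminator is a simple logistic classifier. All names, sizes, and learning rates here are illustrative assumptions.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps=200, batch=32, lr=0.05, seed=0):
    """Train a 1-D toy GAN; returns generator (a, b) and discriminator (w, c)."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator: fake = a * noise + b
    w, c = 0.0, 0.0  # discriminator: P(real | x) = sigmoid(w * x + c)

    for _ in range(steps):
        # Stand-in for real photos: numbers clustered around 4.
        real = [rng.gauss(4.0, 0.5) for _ in range(batch)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [a * z + b for z in noise]

        # Discriminator step: raise P(real) on real samples, lower it on fakes.
        gw = gc = 0.0
        for x in real:
            p = sigmoid(w * x + c)
            gw += -(1.0 - p) * x  # gradient of -log D(x)
            gc += -(1.0 - p)
        for x in fake:
            p = sigmoid(w * x + c)
            gw += p * x           # gradient of -log(1 - D(x))
            gc += p
        w -= lr * gw / batch
        c -= lr * gc / batch

        # Generator step: shift the fakes so the discriminator rates them as real.
        ga = gb = 0.0
        for z, x in zip(noise, fake):
            p = sigmoid(w * x + c)
            dx = -(1.0 - p) * w   # gradient of -log D(x) with respect to x
            ga += dx * z
            gb += dx
        a -= lr * ga / batch
        b -= lr * gb / batch

    return a, b, w, c


if __name__ == "__main__":
    a, b, w, c = train_toy_gan()
    print("generator offset b:", b, "| real data is centered on 4.0")
```

After even a single round, the discriminator has learned that bigger numbers look more "real," and the generator has nudged its output upward in response. Toy setups like this one tend to circle the target rather than settle on it, which hints at why training real GANs is notoriously finicky.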
This research builds on the traditional GAN model by splitting the generator and discriminator networks into a few sub-networks, allowing for the output of higher-resolution images. The neural networks are also able to take in a semantic map, a blueprint of how the photo is supposed to look, and fill in the textures autonomously. Users can even go into the blueprint and change things if they want to add buildings instead of trees in a street view or make the eyes wider in a portrait.
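One way to picture a semantic map is as a grid in which every cell holds a label ("road," "tree," "sky") rather than a color. The short Python sketch below uses a made-up 4-by-4 grid and hypothetical label names to show what swapping trees for buildings amounts to at the blueprint level. Turning the edited grid into a photo-realistic image is the neural network's job and is not shown here.

```python
# Hypothetical class labels; real datasets define their own label sets.
ROAD, TREE, BUILDING, SKY = "road", "tree", "building", "sky"

# A tiny made-up blueprint of a street scene: labels only, no textures.
semantic_map = [
    [SKY,  SKY,  SKY,      SKY],
    [TREE, TREE, BUILDING, SKY],
    [TREE, TREE, BUILDING, BUILDING],
    [ROAD, ROAD, ROAD,     ROAD],
]


def replace_label(label_map, old, new):
    """Return a copy of the map with every `old` cell relabeled as `new`."""
    return [[new if cell == old else cell for cell in row] for row in label_map]


def count_labels(label_map):
    """Count how many cells carry each label."""
    counts = {}
    for row in label_map:
        for cell in row:
            counts[cell] = counts.get(cell, 0) + 1
    return counts


# Swap every tree for a building; the original blueprint is left untouched.
edited = replace_label(semantic_map, TREE, BUILDING)

if __name__ == "__main__":
    print(count_labels(semantic_map))
    print(count_labels(edited))
```

The edit changes only the labels. In the system the article describes, the network would then fill in what those new buildings actually look like.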
The paper compares its results to similar experiments done using this method, the most notable one being pix2pix. The Nvidia and UC Berkeley model can generate images with details as tiny and precise as legible license plates, while pix2pix outputs images that almost look like watercolor paintings.
While this tool could be used to earn some free Reddit karma with a couple of outlandish photos, the authors see huge potential in using this approach to generate realistic graphics from just a simple blueprint.
Hundreds of hours of painstaking labor go into generating virtual worlds for use in Google Maps, films, and video games. Liu says this model could serve as a way to painlessly get most of the designing done and then go in and tweak the details later.
“Instead of rendering the world by explicitly modeling it, we can build the world implicitly by using image-to-image translation to translate between a simple model of the world that doesn’t contain any texture or lighting, and a photo-realistic output. This capability should make it much cheaper to build virtual worlds,” he tells Inverse.
For the next step in this research, the team hopes to explore video-to-video translation, which would use neural nets to create realistic videos, a goal that Liu says has challenged researchers in the field.
Now you know how easily fake images can be created. Don’t trust everything you see on Google Images.