MIT researchers claim that augmentation techniques can train GANs with less data

Researchers at MIT, Adobe Research and Tsinghua University have reportedly developed a method – Differentiable Augmentation (DiffAugment) – that improves the data efficiency of generative adversarial networks (GANs) by augmenting both real and fake data samples. In a preprint paper, they claim that it effectively stabilizes the networks during training, enables them to generate high-fidelity images from only 100 training images without pre-training, and achieves state-of-the-art performance on common benchmarks.

GANs – two-part AI models consisting of a generator that creates samples and a discriminator that attempts to distinguish the generated samples from real ones – have shown impressive achievements in media synthesis. The best-performing GANs can create realistic portraits of people who don't exist, for example, or snapshots of fictional apartment buildings. However, their success to date has come at the cost of considerable compute and data: GANs rely heavily on large quantities (tens of thousands) of diverse, high-quality training examples. In some cases, collecting such large data sets can take months or years, with annotation costs on top – if it is feasible at all.
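For context, here is a minimal sketch of that two-part adversarial setup in PyTorch. The tiny fully connected networks, learning rates, and image size are illustrative assumptions, not the architectures used in the paper.

```python
import torch
import torch.nn as nn

latent_dim = 128  # size of the generator's random input (assumption)

# Toy generator and discriminator for flattened 28x28 images.
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real):
    # real: (batch, 784) tensor of real training images
    batch = real.size(0)
    fake = G(torch.randn(batch, latent_dim))

    # Discriminator: push real samples toward 1, generated samples toward 0.
    loss_d = bce(D(real), torch.ones(batch, 1)) + \
             bce(D(fake.detach()), torch.zeros(batch, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator: try to fool the discriminator into outputting 1 on fakes.
    loss_g = bce(D(fake), torch.ones(batch, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```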

As mentioned, the researchers' technique augments both the real images from the training data and the fake images created by the generator. (If the method augmented only the real images, the GAN would learn a distorted data distribution.) DiffAugment randomly shifts the images, masks them with a random square half the size of the image, and adjusts their brightness, saturation, and contrast.
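Crucially, every augmentation must be differentiable so that gradients can flow back to the generator through the augmented fake samples. Below is a minimal PyTorch sketch of this idea; the function names and parameter ranges are illustrative assumptions and only loosely follow the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def rand_translation(x, ratio=0.125):
    # Shift the batch by up to `ratio` of the image size, zero-padded.
    _, _, h, w = x.shape
    pad_h, pad_w = int(h * ratio), int(w * ratio)
    x = F.pad(x, [pad_w, pad_w, pad_h, pad_h])
    dx = torch.randint(0, 2 * pad_w + 1, (1,)).item()
    dy = torch.randint(0, 2 * pad_h + 1, (1,)).item()
    return x[:, :, dy:dy + h, dx:dx + w]

def rand_cutout(x, ratio=0.5):
    # Mask a random square half the image size with zeros.
    _, _, h, w = x.shape
    ch, cw = int(h * ratio), int(w * ratio)
    cy = torch.randint(0, h - ch + 1, (1,)).item()
    cx = torch.randint(0, w - cw + 1, (1,)).item()
    mask = torch.ones_like(x)
    mask[:, :, cy:cy + ch, cx:cx + cw] = 0
    return x * mask

def rand_brightness(x):
    # Additive per-image brightness jitter in [-0.5, 0.5].
    return x + (torch.rand(x.size(0), 1, 1, 1, device=x.device) - 0.5)

def diff_augment(x):
    # All ops are differentiable in x, so generator gradients pass through.
    for t in (rand_translation, rand_cutout, rand_brightness):
        x = t(x)
    return x

# Both real and fake batches receive the same kind of augmentation
# before the discriminator sees them:
#   d_real = D(diff_augment(real))
#   d_fake = D(diff_augment(G(z)))
```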

Above: With DiffAugment, GANs can generate high-fidelity images from a data set of just 100 Obama portraits, grumpy cats, or pandas. The cats and dogs were generated from 160 and 389 images, respectively.

In experiments with the open source ImageNet and CIFAR-100 data sets, the researchers applied DiffAugment to two top-class GANs: DeepMind's BigGAN and Nvidia's StyleGAN2. With pre-training, they report that their method improved on all baselines on CIFAR-100 by a “considerable margin,” regardless of architecture, as measured by Fréchet Inception Distance (FID) – a metric that feeds images from the target distribution and from the model under evaluation through an object-recognition system, extracts high-level features, and measures how similar the two feature distributions are. More impressive still, without pre-training and with only 100 images, the GANs achieved results comparable to existing transfer-learning algorithms in several image categories (namely “Panda” and “Grumpy Cat”).
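Concretely, FID fits a Gaussian to the Inception features of each image set and takes the Fréchet distance between the two Gaussians. A minimal sketch, assuming the feature arrays have already been extracted with an Inception-v3 network:

```python
import numpy as np
from scipy import linalg

def frechet_inception_distance(feats_real, feats_fake):
    # feats_*: (n_samples, n_features) arrays of Inception activations.
    mu1, mu2 = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    sigma1 = np.cov(feats_real, rowvar=False)
    sigma2 = np.cov(feats_fake, rowvar=False)

    # FID = ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 * sqrt(sigma1 @ sigma2))
    covmean = linalg.sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # discard tiny numerical imaginary parts
    diff = mu1 - mu2
    return diff @ diff + np.trace(sigma1 + sigma2 - 2 * covmean)
```

Lower FID means the generated images' feature statistics sit closer to the real data's, which is why it is the standard benchmark metric here.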

“The performance of StyleGAN2 deteriorates drastically when there is less training data. With DiffAugment, we can roughly match its FID and exceed its Inception Score (IS) with only 20% of the training data,” wrote the co-authors. “Extensive experiments consistently demonstrate its advantages with different network architectures, supervision settings, and objective functions. Our method is particularly effective when only limited data is available.”

The code and models are freely available on GitHub.

