Turbo Learning for CaptionBot and DrawingBot

Authors:
Qiuyuan Huang, Microsoft Research AI
Pengchuan Zhang, Microsoft Research
Dapeng Wu, University of Florida
Lei Zhang, Microsoft Research

Introduction:

The authors study in this paper the problems of both image captioning and text-to-image generation, and present a novel turbo learning approach to jointly training an image-to-text generator (a.k.a. CaptionBot) and a text-to-image generator (a.k.a. DrawingBot).

Abstract:

We study in this paper the problems of both image captioning and text-to-image generation, and present a novel turbo learning approach to jointly training an image-to-text generator (a.k.a. CaptionBot) and a text-to-image generator (a.k.a. DrawingBot). The key idea behind the joint training is that image-to-text generation and text-to-image generation, as dual problems, can form a closed loop to provide informative feedback to each other. Based on such feedback, we introduce a new loss metric by comparing the original input with the output produced by the closed loop. In addition to the original loss metrics used in CaptionBot and DrawingBot, this extra loss metric makes the jointly trained CaptionBot and DrawingBot better than the separately trained CaptionBot and DrawingBot. Furthermore, the turbo-learning approach enables semi-supervised learning, since the closed loop can provide pseudo-labels for unlabeled samples. Experimental results on the COCO dataset demonstrate that the proposed turbo learning can improve the performance of both CaptionBot and DrawingBot by a large margin.
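
The closed loop described in the abstract can be read as a cycle-consistency style objective: each generator's output is fed back through the other generator and compared with the original input. Below is a minimal, hypothetical PyTorch sketch of such a joint objective, not the authors' implementation: CaptionBot and DrawingBot are replaced by toy MLPs over fixed-size embeddings, mean-squared error stands in for the real captioning and image-generation losses, and lambda_turbo is an assumed weight on the closed-loop terms.

    import torch
    import torch.nn as nn

    # Hypothetical stand-ins: the real CaptionBot/DrawingBot are an image-to-text
    # and a text-to-image generator; here both are tiny MLPs over fixed-size
    # embeddings, purely to illustrate the closed-loop (turbo) loss.
    IMG_DIM, TXT_DIM = 64, 32
    caption_bot = nn.Sequential(nn.Linear(IMG_DIM, 128), nn.ReLU(), nn.Linear(128, TXT_DIM))
    drawing_bot = nn.Sequential(nn.Linear(TXT_DIM, 128), nn.ReLU(), nn.Linear(128, IMG_DIM))

    params = list(caption_bot.parameters()) + list(drawing_bot.parameters())
    optimizer = torch.optim.Adam(params, lr=1e-3)
    mse = nn.MSELoss()
    lambda_turbo = 0.5  # assumed weight on the closed-loop loss terms

    def train_step(image, text):
        """One joint update on a paired (image, text) batch."""
        optimizer.zero_grad()

        # Standard per-model losses against the paired ground truth.
        loss_caption = mse(caption_bot(image), text)
        loss_drawing = mse(drawing_bot(text), image)

        # Turbo (closed-loop) losses: run each output back through the other
        # model and compare with the original input.
        loss_img_loop = mse(drawing_bot(caption_bot(image)), image)  # image -> text -> image
        loss_txt_loop = mse(caption_bot(drawing_bot(text)), text)    # text -> image -> text

        loss = loss_caption + loss_drawing + lambda_turbo * (loss_img_loop + loss_txt_loop)
        loss.backward()
        optimizer.step()
        return loss.item()

    # Dummy paired batch; in the paper the pairs come from COCO images and captions.
    image = torch.randn(8, IMG_DIM)
    text = torch.randn(8, TXT_DIM)
    print(train_step(image, text))

For unpaired data, the same loop structure suggests the semi-supervised use mentioned in the abstract: only the closed-loop terms are computed, with the intermediate output acting as a pseudo-label.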
