ECE Colloquium: Suha Kwak (DGIST) “Learning with minimum supervision for semantic segmentation”
Speaker: Suha Kwak
Semantic segmentation is a visual recognition task that aims to estimate pixel-level class labels in images. This problem has recently been handled by Deep Convolutional Neural Networks (DCNNs), and state-of-the-art DCNN-based methods achieve impressive results on public benchmarks. However, training a DCNN demands a large amount of annotated training data, while segmentation annotations in existing datasets are significantly limited in both quantity and diversity due to the heavy annotation cost. Weakly supervised approaches tackle this issue by leveraging weak annotations such as bounding boxes and scribbles, which are either readily available in existing large-scale datasets or easily obtained thanks to their low annotation costs. In this talk, I will introduce our recent approaches to weakly supervised semantic segmentation based on image-level class labels, the minimum form of supervision, indicating only the presence or absence of a semantic entity in an image. Learning semantic segmentation with image-level class labels is a significantly ill-posed problem since the label conveys neither object location nor shape. We tackled this challenging problem by employing (1) unsupervised techniques revealing low-level image structures, (2) web-crawled videos as additional data sources, and (3) DCNN architectures appropriate for learning segmentation with incomplete pixel-level annotations. I will conclude this talk with a few suggestions for future research directions worth investigating for further improvement.
Suha Kwak is an assistant professor in the Department of Information and Communication Engineering at Daegu Gyeongbuk Institute of Science and Technology (DGIST), Korea. He received his B.S. and Ph.D. degrees in computer science and engineering from Pohang University of Science and Technology (POSTECH), Korea, in 2007 and 2014, respectively. Before joining DGIST, he was a postdoctoral researcher at Inria / Ecole Normale Superieure in Paris, France, and a member of the WILLOW project team. He has worked on various topics in computer vision and machine learning. He is primarily interested in problems related to video understanding, such as object detection, tracking, and human behavior analysis. He is also interested in deep learning, structured prediction, and weakly supervised learning.