Self-Supervised Panoptic Segmentation

Semantic and object-level scene understanding is crucial for navigating and acting in natural environments. While recent panoptic image segmentation pipelines perform well in structured environments, they still do not generalize to more unstructured, dynamic environments that contain unknown objects. This work investigates a supervised approach based on the Mask2Former framework in such environments when only limited labeled data is available. In addition, the effects of self-supervised pre-training are examined. To this end, a new pre-training strategy for hierarchical Swin Transformer backbones is established based on the Masked Autoencoder framework. Furthermore, a combined approach of self-supervised pre-training and supervised fine-tuning is presented. The first results show that the pre-training-based networks can match the scores achieved by their supervised counterparts, while the additional knowledge does not yet lead to increased generalization performance.
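
As a rough illustration of the masked-autoencoder idea mentioned above, the following minimal sketch hides a random subset of image patches and computes a reconstruction error on the hidden patches only. It is not the pre-training strategy developed in this work, whose adaptation to hierarchical Swin Transformer backbones is not described here. The function names `random_patch_mask` and `masked_mse`, the default mask ratio of 0.75, and the representation of patches as plain lists of floats are all assumptions made for illustration.

```python
import random


def random_patch_mask(num_patches, mask_ratio=0.75, seed=None):
    """Split patch indices into visible and masked sets (hypothetical helper).

    mask_ratio=0.75 is an assumed default, not a value taken from this work.
    """
    rng = random.Random(seed)
    ids = list(range(num_patches))
    rng.shuffle(ids)
    num_visible = int(num_patches * (1 - mask_ratio))
    visible = sorted(ids[:num_visible])   # patches the encoder would see
    masked = sorted(ids[num_visible:])    # patches the decoder must reconstruct
    return visible, masked


def masked_mse(pred, target, masked_ids):
    """Mean squared error averaged over the masked patches only."""
    total, count = 0.0, 0
    for i in masked_ids:
        for p, t in zip(pred[i], target[i]):
            total += (p - t) ** 2
            count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    visible, masked = random_patch_mask(num_patches=16, seed=0)
    print(len(visible), "visible patches,", len(masked), "masked patches")
```

In a real pipeline the encoder would process only the visible patches and a lightweight decoder would predict the masked ones; restricting the loss to masked patches is what makes the reconstruction task non-trivial.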

Pascal Roth
Ph.D. in Robot Learning & Software Engineer

My research interests include deep learning, reinforcement learning, and computer vision to enhance robotic autonomy.