Many open-world applications require the detection of novel objects, but state-of-the-art object detection and instance segmentation models are unable to do so.
• This is because these models learn to suppress unannotated objects by treating them as background
• To address that issue, the authors propose a simple yet surprisingly powerful data augmentation and training scheme they call Learning to Detect Every Thing (LDET)
• To avoid suppressing hidden objects (background objects that are visible but unannotated), they paste the annotated objects onto a background image synthesized from a small region of the original image, which is unlikely to contain hidden objects (see figure and the sketch after this list)
• Since training on such synthetically augmented images suffers from domain shift, they decouple the training into two parts:
1- training the region classification and regression head on augmented images, and
2- training the mask heads on original images
• As a result, the model does not learn to classify hidden objects as background, while still generalizing well to real images
• LDET leads to significant improvements in open-world instance segmentation across many datasets
• It outperforms baselines on cross-category generalization on COCO, as well as cross-dataset evaluation on UVO and Cityscapes
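Here is a minimal sketch of the pasting idea in Python, assuming a NumPy image and per-instance boolean masks; the function name backerase_augment, the patch_size parameter, and the bilinear upsampling are my own illustrative choices, not the authors' implementation:

```python
import numpy as np
import cv2  # only used to resize the sampled patch


def backerase_augment(image, instance_masks, patch_size=16, rng=None):
    """Rough sketch of an LDET-style background-erasing augmentation.

    image          : H x W x 3 uint8 array
    instance_masks : list of H x W boolean masks for the *annotated* objects
    The name and parameters are illustrative, not the authors' code.
    """
    rng = rng if rng is not None else np.random.default_rng()
    h, w = image.shape[:2]

    # 1. Sample a tiny patch from the original image; a region this small
    #    is unlikely to contain a hidden (unannotated) object.
    y = int(rng.integers(0, h - patch_size))
    x = int(rng.integers(0, w - patch_size))
    patch = image[y:y + patch_size, x:x + patch_size]

    # 2. Upsample the patch to full resolution to form the new background,
    #    erasing everything that was not annotated.
    background = cv2.resize(patch, (w, h), interpolation=cv2.INTER_LINEAR)

    # 3. Paste the annotated objects back at their original locations,
    #    keeping their boxes and masks unchanged.
    synthetic = background.copy()
    for m in instance_masks:
        synthetic[m] = image[m]

    return synthetic
```

In the decoupled scheme described above, an image produced this way would feed only the region classification and box regression heads; the mask head would still be trained on the original image, so the model does not overfit to the synthetic backgrounds.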
⭐ If you are interested in mastering Object Detection (OD), follow me to receive highly curated content right in your feed.
⭐ I started a newsletter; join it to master object detection and get a competitive edge in this field
• Trained a VFNet model using IceVision and @fastdotai
• Reached 73% 🚀 on the COCO metric
Blog posts: 👇
📝 The main takeaway of this story: you can learn object detection very quickly if you:
• Are determined
• Follow the optimal learning path
• Embrace the 80/20 and KISS principles
• Have access to highly curated content and libraries
• Know how to avoid roadblocks
• Stay focused and avoid distractions
✨ Like many things in life, object detection is:
• neither too hard
• nor too easy
• right in between ... when you have the right ingredients