I’m at Day 23 of my 30-posts-in-30-days challenge on Object Detection

I gathered 12 visual summaries on OD Modeling 🎁

A lot of people find these posts helpful. Follow @ai_fast_track to catch the upcoming posts, and give this tweet a quick retweet 🙏

Summary of summaries 👇
1- Common Object Detector Architecture you should be familiar with:

2- Four Feature Pyramid Network (FPN) Designs you should know:

3- Seven things you should know about the Focal Loss

4- FCOS is the first anchor-free object detector that beat two-stage detectors

6- How easy it is to create YOLOv5 and YOLOX models in IceVision

7- VFNet: A very interesting model that isn’t under the radar

8- YOLO Real-Time (YOLO-ReT) architecture targets edge devices.

It achieves 68.75 mAP on Pascal VOC and 34.91 mAP on COCO

9- Similarities and differences between some popular Object Detection models.

10- FCOS3D won 1st place among all the vision-only methods in the nuScenes 3D Detection Challenge of NeurIPS 2020.

11- The Generalized Intersection over Union (GIoU) can be used as a metric as well as a loss function

12- What is the Average Precision (AP), mean AP (mAP), and COCO Metric?
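To make item 11 concrete, here is a minimal pure-Python sketch of GIoU used both as a metric and as a loss. It assumes axis-aligned boxes in (x1, y1, x2, y2) format with positive area; the function names `giou` and `giou_loss` are illustrative and not taken from any particular library.

```python
def giou(box_a, box_b):
    """Generalized IoU for two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumes both boxes have positive area.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle (zero area when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box that encloses both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))

    # Penalize the empty space inside the enclosing box.
    return iou - (enclose - union) / enclose


def giou_loss(box_a, box_b):
    """Loss form: 0 for a perfect match, up to 2 for distant boxes."""
    return 1.0 - giou(box_a, box_b)
```

Unlike plain IoU, the value stays informative for non-overlapping boxes: it goes negative as the boxes move apart, so the loss still has something to minimize.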

⭐️ If you find those summaries helpful, feel free to follow @ai_fast_track for more OD / CV demystified content in your feed

⭐️ If you could give the thread a quick retweet, it would help others discover this content. Thanks!

More from @ai_fast_track

24 Nov
Day 24/30: 🥇 EfficientDet is a very popular object detection model for a good reason!

Let’s see why

📌 When it was released, EfficientDet achieved State-Of-The-Art (SOTA) accuracy while reducing both the number of parameters and the FLOPs. It’s still a very good contender.
📌 Before EfficientDet was introduced, models were getting impressively big to achieve SOTA results

❓ The authors asked the following question:
Is it possible to build a scalable detection architecture with both higher accuracy and better efficiency across a wide spectrum of resource constraints?
So, they systematically studied neural network architecture design choices for object detection, and proposed several key optimizations to improve efficiency:

1- A weighted bi-directional feature pyramid network (BiFPN), which allows easy and fast multiscale feature fusion
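A minimal sketch of the weighted fusion idea behind BiFPN, assuming the "fast normalized fusion" variant described in the EfficientDet paper: each input feature gets a learnable non-negative weight, and the weights are normalized by their sum. Features are plain Python lists here, and the name `fast_normalized_fusion` and the `eps` default are illustrative assumptions; a real model would do this on tensors that were already resized to the same shape.

```python
def fast_normalized_fusion(features, raw_weights, eps=1e-4):
    """Fuse same-length feature vectors with non-negative, normalized weights.

    features: list of equally sized lists of floats (already resized).
    raw_weights: one learnable scalar per input feature.
    """
    # ReLU keeps every weight non-negative.
    weights = [max(0.0, w) for w in raw_weights]
    # eps avoids division by zero when all weights are clipped to 0.
    total = sum(weights) + eps

    fused = [0.0] * len(features[0])
    for w, feature in zip(weights, features):
        for i, value in enumerate(feature):
            fused[i] += (w / total) * value
    return fused
```

For example, `fast_normalized_fusion([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0])` gives roughly the average of the two inputs, while a negative raw weight simply switches that input off.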
20 Nov
FCOS is an anchor-free object detector.

It was one of the first competitors of anchor-based single/two-stage object detectors.

Understanding FCOS will help you understand other models inspired by FCOS.

Summary ... 👇
📌 FCOS reformulates object detection in a per-pixel prediction fashion

📌 It uses multi-level prediction to improve recall and resolve the ambiguity resulting from overlapping bounding boxes
📌 It proposes a “center-ness” branch, which helps suppress low-quality detected bounding boxes and improves the overall performance by a large margin

📌 It avoids the complex computation tied to anchor boxes, such as the intersection-over-union (IoU) between anchors and ground-truth boxes
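The per-pixel targets and the center-ness score can be sketched in a few lines. This is a simplified illustration based on the formulas in the FCOS paper, assuming the location lies strictly inside the box; the helper names `ltrb_targets` and `centerness` are hypothetical.

```python
import math


def ltrb_targets(px, py, box):
    """Distances from location (px, py) to the left, top, right, bottom box sides.

    box is (x1, y1, x2, y2); the location is assumed to be strictly inside it.
    """
    x1, y1, x2, y2 = box
    return px - x1, py - y1, x2 - px, y2 - py


def centerness(left, top, right, bottom):
    """1.0 at the box center, decaying toward 0 near the box borders."""
    horizontal = min(left, right) / max(left, right)
    vertical = min(top, bottom) / max(top, bottom)
    return math.sqrt(horizontal * vertical)
```

At inference time the center-ness is multiplied with the classification score, so boxes predicted from locations far from the object center get ranked lower before NMS.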
18 Nov
How to create a robustness evaluation dataset?

The "Natural Adversarial Objects" (NAO) dataset is a challenging robustness evaluation dataset for models trained on MSCOCO

📌 Models generally perform well on large-scale training sets
📌 They generalize well to test sets coming from the same distribution

📌 On the NAO dataset, EfficientDet-D7 mAP drops by 74.5% compared to MSCOCO

📌 Faster RCNN mAP drops by 36.3% compared to MSCOCO
📌 The authors evaluated 7 SOTA models and showed that they consistently fail to perform accurately on NAO, compared to MSCOCO

📌 The drop is present on both in-distribution and out-of-distribution objects
17 Nov
❇VFNet: A very interesting model that isn’t under the radar. You should give it a try :)

VariFocalNet: An IoU-aware Dense Object Detector

🧊 Background:
📌 Accurately ranking candidate detections is crucial for dense object detectors to achieve high performance
...
📌 Prior work uses the classification score, or a combination of the classification score and a predicted localization score (centerness), to rank candidates.

📌 Those 2 scores are still not optimal

🧊 Novelty:
📌 VFNet proposes to learn an IoU-Aware Classification Score (IACS)
📌 IACS is used as a joint representation of object presence confidence and localization accuracy using IoU

📌 VFNet introduces the VariFocal Loss

📌 The VariFocal Loss down-weights only negative examples to address the class imbalance problem during training
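Here is a hedged, scalar sketch of that asymmetric weighting, following the loss definition in the VarifocalNet paper: `p` is the predicted IACS and `q` is the target (the IoU with the ground-truth box for a positive, 0 for a negative). The defaults `alpha=0.75` and `gamma=2.0` and the clamping constant are assumptions for illustration, not values taken from this thread.

```python
import math


def varifocal_loss(p, q, alpha=0.75, gamma=2.0, eps=1e-12):
    """Varifocal loss for a single prediction.

    p: predicted IoU-aware classification score in [0, 1].
    q: target score (IoU with the ground truth if positive, 0 if negative).
    """
    # Clamp to keep the logarithms finite.
    p = min(max(p, eps), 1.0 - eps)

    if q > 0:
        # Positives: binary cross-entropy weighted by the target q,
        # so high-IoU positives contribute more.
        return -q * (q * math.log(p) + (1.0 - q) * math.log(1.0 - p))

    # Negatives: down-weighted by p**gamma, so easy negatives fade out.
    return -alpha * (p ** gamma) * math.log(1.0 - p)
```

An easy negative (small `p`) contributes almost nothing, while a confident false positive is penalized heavily; positives are never down-weighted by the focusing term.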
16 Nov
📢 The amazing @OpenMMLab just released a new project:

MMFlow: an open-source optical flow toolbox written in PyTorch

OpenMMLab hosts several impressive open-source projects for both academic research and industrial applications.
OpenMMLab covers a wide range of computer vision research topics, e.g., classification, detection, segmentation, and super-resolution.

📌 MMCV: Foundational library for computer vision.

📌 MIM: MIM Installs OpenMMLab Packages.
📌 MMClassification: Image classification toolbox and benchmark.

📌 MMDetection: Detection toolbox and benchmark.

📌 MMDetection3D: Next-generation platform for general 3D object detection.

📌 MMSegmentation: Semantic segmentation toolbox and benchmark.
15 Nov
4 Feature Pyramid Network (FPN) Designs you should know:

FPN, PANet, NAS-FPN, and BiFPN

📌 (a) FPN uses a top-down pathway to fuse multi-scale features from level 3 to 7 (P3 - P7);

📌 (b) PANet adds an additional bottom-up pathway on top of FPN;
📌 (c) NAS-FPN uses neural architecture search to find an irregular feature network topology and then repeatedly applies the same block;

📌 (d) BiFPN is somewhat similar to PANet: it adds shortcut fusing and then repeatedly applies the same block
📝 Some other observations:

📌 The model diagram corresponds to the One-Stage Object Detection Architecture

📌 The FPN illustration is extracted from the EfficientDet paper

📌 The (P3-P5) layers are referred to as the Convolutional (C3-C5) Layers in other papers
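The top-down pathway in (a) can be illustrated with a toy example. This is a deliberately simplified sketch: feature maps are 1-D Python lists, each level is half the length of the previous one, and the lateral 1x1 and output 3x3 convolutions of a real FPN are left out. The names `upsample2x` and `top_down_fuse` are hypothetical.

```python
def upsample2x(feature):
    """Nearest-neighbor upsampling of a 1-D feature map by a factor of 2."""
    return [value for value in feature for _ in range(2)]


def top_down_fuse(levels):
    """Fuse feature maps from the coarsest level down to the finest.

    levels: list of 1-D maps ordered finest to coarsest, each half the
    length of the one before it (think C3, C4, C5).
    Returns the fused maps in the same order (think P3, P4, P5).
    """
    # The coarsest level is passed through unchanged.
    outputs = [levels[-1]]
    for lateral in reversed(levels[:-1]):
        # Upsample the coarser output and add it to the finer lateral map.
        upsampled = upsample2x(outputs[0])
        outputs.insert(0, [a + b for a, b in zip(lateral, upsampled)])
    return outputs
```

PANet would follow this with a second pass in the opposite (bottom-up) direction, and BiFPN would repeat both passes with weighted sums instead of plain additions.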