🔸 Scale imbalance
It occurs when objects of different sizes are unevenly represented in the dataset: e.g. many small objects vs. few large ones.
✅ Potential Solution
• Oversample small objects using the Copy&Paste data augmentation
• Use higher resolution images
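A minimal NumPy sketch of the Copy&Paste idea: crop each small object with its bounding box, paste a copy at a random location, and duplicate its annotation. The function name and the 1%-of-image-area "small" threshold are illustrative, not from the thread; real implementations also blend masks and handle occlusion.

```python
import numpy as np

def copy_paste(image, boxes, rng=np.random.default_rng(0)):
    """Duplicate each small object at a random location (toy sketch).
    boxes: list of (x1, y1, x2, y2) in pixel coordinates."""
    h, w = image.shape[:2]
    out = image.copy()
    new_boxes = list(boxes)
    for (x1, y1, x2, y2) in boxes:
        bw, bh = x2 - x1, y2 - y1
        if bw * bh > 0.01 * h * w:       # only oversample small objects
            continue
        nx = int(rng.integers(0, w - bw))  # random top-left for the copy
        ny = int(rng.integers(0, h - bh))
        out[ny:ny + bh, nx:nx + bw] = image[y1:y2, x1:x2]
        new_boxes.append((nx, ny, nx + bw, ny + bh))
    return out, new_boxes
```

Each pasted copy adds one more training instance of a small object, which is the oversampling effect the bullet describes.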
🔸 Objective imbalance
It occurs when the total loss combines several terms (e.g. classification and regression losses) and one term dominates the others.
✅ Potential Solution
• Use a weighted loss to rebalance the terms
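The weighted loss is just a weighted sum of the individual terms; the weight values below are illustrative hyperparameters, not values from the thread:

```python
def total_loss(cls_loss, reg_loss, w_cls=1.0, w_reg=2.0):
    """Rebalance the objective so neither term drowns out the other.
    w_cls and w_reg are tuned per task (illustrative values here)."""
    return w_cls * cls_loss + w_reg * reg_loss
```

If the regression loss is typically an order of magnitude smaller than the classification loss, a larger w_reg keeps it from being ignored during training.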
🔸 Class imbalance
It can take two forms: 1- Foreground-Background imbalance (far more background regions than object regions), or 2- Foreground-Foreground imbalance between positive classes: e.g. the Person class vs. the Parking Meter class in the COCO dataset
✅ Potential Solution
• For Case 1: Use the Focal Loss
• For Case 2: Oversample under-represented classes
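For Case 1, Focal Loss down-weights easy, well-classified examples so the abundant background stops dominating training. A minimal NumPy version for binary classification (gamma=2 and alpha=0.25 are the defaults from the RetinaNet paper that introduced it):

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: FL = -alpha_t * (1 - p_t)**gamma * log(p_t).
    p: predicted probability of the positive class, y: label (0 or 1)."""
    p_t = np.where(y == 1, p, 1 - p)           # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    return -alpha_t * (1 - p_t) ** gamma * np.log(p_t)
```

An easy background example (p=0.01, y=0) gets a near-zero loss, while a hard one (p=0.9, y=0) keeps a large loss, so the millions of easy background anchors contribute almost nothing to the gradient.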
🔸 Spatial imbalance
It refers to a set of factors related to the spatial properties of the bounding boxes, such as the regression penalty, location, and IoU.
For example, the L2 loss penalizes heavily shifted predicted boxes much more severely than the L1 or Smooth L1 losses do.
✅ Potential Solution
• Use L1 or Smooth L1 Loss
• Use anchor-free models
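The penalty difference is easy to see numerically: Smooth L1 (a Huber-style loss) behaves like L2 near zero and like L1 for large shifts, capping the penalty on badly localized boxes:

```python
import numpy as np

def l2(x):
    return x ** 2

def l1(x):
    return np.abs(x)

def smooth_l1(x, beta=1.0):
    """Quadratic for |x| < beta, linear beyond (Huber-style)."""
    x = np.abs(x)
    return np.where(x < beta, 0.5 * x ** 2 / beta, x - 0.5 * beta)
```

For a 5-pixel shift, l2 gives 25 while smooth_l1 gives 4.5, so a few badly shifted boxes cannot dominate the regression loss.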
If you found this thread valuable,
Follow me for more threads on advanced object detection techniques used in real-world applications → @ai_fast_track
• • •
How do you use transfer learning with images that have 1 channel, or more than 3 channels?
The timm library, developed by @wightmanr, has an elegant way to handle that:
You can specify any number of input channels (e.g. in_chans=1 or in_chans=8) using the timm.create_model() function like this:
m = timm.create_model('resnet34', pretrained=True, in_chans=8)
How does it work?
• Case 1: number of input channels is 1
timm simply sums the 3-channel weights of the first conv layer into a single channel
• Case 2: number of input channels is 8 (more than 3)
timm repeats the 3-channel weights as many times as required, then selects the required number of input-channel weights
In the 8-channel example: repeat 3 times (9 channels generated), then keep the first 8
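The two cases can be mimicked with NumPy on a conv weight tensor of shape (out_ch, 3, k, k). This is a sketch of the idea described above, not timm's actual implementation (timm also rescales the adapted weights to preserve activation magnitudes):

```python
import numpy as np

def adapt_input_weights(w, in_chans):
    """Adapt pretrained RGB conv weights (out_ch, 3, k, k) to `in_chans`.
    Sketch of the behavior described above, not timm's exact code."""
    if in_chans == 1:
        # Case 1: sum the 3 RGB filters into a single channel
        return w.sum(axis=1, keepdims=True)
    # Case 2: repeat the RGB filters, then keep the first `in_chans`
    repeats = int(np.ceil(in_chans / 3))       # e.g. 8 -> repeat 3x (9 channels)
    return np.tile(w, (1, repeats, 1, 1))[:, :in_chans]
```

This lets a model pretrained on RGB ImageNet start from sensible first-layer weights for grayscale or multispectral inputs instead of random initialization.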
🔥 ZSD-YOLO: Zero-Shot YOLO Detection using Vision-Language Knowledge Distillation
Heads up: I’m preparing a visual summary on ZSD-YOLO.
So, what is Zero-Shot Detection?
• Zero-shot detection allows a model to detect something in an image even if the model has never seen that thing before
• So, if you have an image of a Chimpanzee and the model has never seen a Chimpanzee before, you can use your zero-shot detector to locate it in the image
• ZSD-YOLO leverages 2 models:
- CLIP: a pretrained Vision-Language model
- YOLOv5: a modified version whose classification branch is replaced so that detections are matched against CLIP text embeddings
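The core zero-shot classification step can be sketched as a cosine-similarity match between a detection's visual embedding and the text embeddings of candidate class names. This is an illustrative sketch of the matching idea, not ZSD-YOLO's actual code (the paper additionally distills CLIP image embeddings into the detector during training):

```python
import numpy as np

def classify_detection(vis_emb, text_embs, class_names):
    """Assign the class whose text embedding has the highest
    cosine similarity with the detection's visual embedding."""
    v = vis_emb / np.linalg.norm(vis_emb)
    t = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    sims = t @ v                       # cosine similarity per class
    return class_names[int(np.argmax(sims))]
```

Because the class names are only needed at inference time as text, the detector can localize a chimpanzee even if no chimpanzee was annotated in its training set.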
Many open-world applications require the detection of novel objects, but state-of-the-art object detection and instance segmentation models are unable to do so.
• It’s because models learn to suppress any unannotated objects by treating them as background
• To address that issue, the authors propose a simple yet surprisingly powerful data augmentation and training scheme they call Learning to Detect Every Thing (LDET)
• To avoid suppressing hidden objects, i.e. background objects that are visible but unannotated, they paste the annotated objects onto a background image sampled from a small region of the original image (see figure)
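A rough sketch of that pasting step, assuming a single annotated box and a nearest-neighbor upscale of the sampled region (the real LDET pipeline is more involved):

```python
import numpy as np

def ldet_paste(image, box, rng=np.random.default_rng(0), bg_size=16):
    """Paste the annotated object onto a background synthesized from a
    small region of the same image, so unannotated objects disappear
    from the background instead of being learned as negatives."""
    h, w = image.shape[:2]
    # sample a small patch and upscale it (nearest neighbor) to full size
    py = int(rng.integers(0, h - bg_size))
    px = int(rng.integers(0, w - bg_size))
    patch = image[py:py + bg_size, px:px + bg_size]
    ys = np.arange(h) * bg_size // h
    xs = np.arange(w) * bg_size // w
    background = patch[np.ix_(ys, xs)]
    # paste the annotated object back at its original location
    x1, y1, x2, y2 = box
    background[y1:y2, x1:x2] = image[y1:y2, x1:x2]
    return background
```

The synthesized background contains no hidden objects by construction, so the model is never trained to treat an unlabeled object as background.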