DETR (DEtection TRansformer) is an object detection model introduced by Facebook AI. Compared with a Faster R-CNN baseline, it performs significantly better on large objects but worse on small objects.
architecture
DETR has three main components:
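- a CNN backbone that extracts a feature map from the input image
- an encoder-decoder transformer that turns the flattened feature map and a fixed set of object queries into one embedding per query
- a simple feed-forward network that predicts a class and a bounding box from each embedding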
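To make the data flow through the three components concrete, here is a minimal shape trace in plain Python. It is only a sketch, not the actual model: `trace_detr_shapes` and `DetrShapes` are hypothetical names, and the defaults (stride 32, 2048 backbone channels, a 256-dimensional transformer, 100 object queries, 91 classes) are assumptions based on the commonly used ResNet-50 configuration, not values stated in these notes.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class DetrShapes:
    backbone_features: tuple  # (channels, height, width)
    encoder_tokens: tuple     # (sequence_length, d_model)
    decoder_outputs: tuple    # (num_queries, d_model)
    class_logits: tuple       # (num_queries, num_classes + 1)
    boxes: tuple              # (num_queries, 4)


def trace_detr_shapes(height, width, num_classes=91, stride=32,
                      backbone_channels=2048, d_model=256, num_queries=100):
    """Trace tensor shapes (batch dimension omitted) through a DETR-style model.

    All default values are assumptions for illustration.
    """
    # 1. CNN backbone: downsamples the image by `stride` in each direction.
    feat_h, feat_w = ceil(height / stride), ceil(width / stride)
    backbone_features = (backbone_channels, feat_h, feat_w)

    # 2. Transformer: the feature map is projected to d_model channels and
    #    flattened into a sequence of feat_h * feat_w tokens for the encoder.
    #    The decoder outputs one embedding per object query.
    encoder_tokens = (feat_h * feat_w, d_model)
    decoder_outputs = (num_queries, d_model)

    # 3. Feed-forward heads: each query embedding gets class scores
    #    (with one extra "no object" class) and 4 box coordinates.
    class_logits = (num_queries, num_classes + 1)
    boxes = (num_queries, 4)

    return DetrShapes(backbone_features, encoder_tokens,
                      decoder_outputs, class_logits, boxes)


if __name__ == "__main__":
    print(trace_detr_shapes(800, 1216))
```

Note that the number of predictions depends only on the number of object queries, not on the image size; only the encoder sequence length grows with the image.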
usage