Researchers from Shanghai AI Lab and SenseTime Propose MM-Grounding-DINO: An Open and Comprehensive Pipeline for Unified Object Grounding and Detection
Marktechpost
JANUARY 16, 2024
In open-vocabulary detection (OVD), a model is trained only on base categories yet, in a zero-shot setting, must predict both base and novel categories drawn from a broad vocabulary. Phrase grounding (PG) takes a phrase that describes the candidate categories and outputs the corresponding boxes, while referring expression comprehension (REC) identifies the single target described by a text expression and marks its position with a bounding box.
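The difference between the three tasks is easiest to see in what goes in and what comes out. The following is a minimal sketch of those input/output shapes only, not the MM-Grounding-DINO implementation; the `Detection` type and the helper names `ovd_eval_vocabulary`, `pg_output`, and `rec_output` are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical box format for illustration: (x1, y1, x2, y2)
Box = tuple[float, float, float, float]


@dataclass
class Detection:
    label: str  # category name, phrase, or referring expression
    box: Box


def ovd_eval_vocabulary(base: list[str], novel: list[str]) -> list[str]:
    """OVD: training sees only `base`; evaluation covers base and novel categories."""
    return sorted(set(base) | set(novel))


def pg_output(phrases: list[str], boxes_per_phrase: dict[str, list[Box]]) -> list[Detection]:
    """PG: every phrase may ground to zero, one, or several boxes."""
    return [Detection(p, b) for p in phrases for b in boxes_per_phrase.get(p, [])]


def rec_output(expression: str, box: Box) -> Detection:
    """REC: one expression resolves to exactly one target box."""
    return Detection(expression, box)
```

In this sketch, OVD is characterized by its label set, PG by a one-to-many mapping from phrases to boxes, and REC by a one-to-one mapping from an expression to a box.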