Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models
Unite.AI
APRIL 26, 2024
To keep training efficient, the Mini-Gemini framework keeps the two vision encoders frozen, optimizes the projectors of patch info mining in all stages, and optimizes the large language model only during the instruction tuning stage.
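The training schedule described above can be sketched as a small lookup of which components receive gradient updates in each stage. This is a minimal illustration, not the framework's actual code: the component names, the function `trainable_components`, and the stage name `"alignment"` for the stage before instruction tuning are all assumptions made for the example.

```python
from typing import Dict

# Hypothetical stage names; only "instruction_tuning" is named in the text,
# "alignment" is an assumed label for the earlier stage.
STAGES = ("alignment", "instruction_tuning")


def trainable_components(stage: str) -> Dict[str, bool]:
    """Return a map of component name -> whether it is optimized in `stage`.

    Component names are illustrative placeholders, not identifiers from
    the Mini-Gemini codebase.
    """
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage!r}")
    return {
        # Both vision encoders stay fixed in every stage.
        "vision_encoder_a": False,
        "vision_encoder_b": False,
        # The patch info mining projectors are optimized in every stage.
        "patch_info_mining_projectors": True,
        # The language model is optimized only during instruction tuning.
        "llm": stage == "instruction_tuning",
    }


if __name__ == "__main__":
    for stage in STAGES:
        print(stage, trainable_components(stage))
```

In a framework such as PyTorch, flags like these would typically be applied by setting `requires_grad` on each component's parameters before building the optimizer.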