Artificial Intelligence Zone

Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models

Unite.AI

APRIL 26, 2024

To keep training efficient, the Mini-Gemini framework freezes both vision encoders, optimizes the patch info mining projectors in all stages, and optimizes the large language model only during the instruction tuning stage.
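
The training schedule described above can be sketched as a small lookup of which components receive gradient updates in each stage. This is only an illustrative sketch, not the Mini-Gemini code: the component names and the `alignment` stage name are hypothetical labels introduced here, and only the freeze/train pattern comes from the description.

```python
from __future__ import annotations

# Hypothetical stage names. The text only names the instruction tuning
# stage; "alignment" stands in for any earlier stage.
STAGES = ("alignment", "instruction_tuning")


def trainable_components(stage: str) -> dict[str, bool]:
    """Return which (hypothetically named) components are optimized in a stage."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage!r}")
    return {
        # Both vision encoders stay fixed in every stage.
        "vision_encoder_a": False,
        "vision_encoder_b": False,
        # The patch info mining projectors are optimized in all stages.
        "patch_info_mining_projectors": True,
        # The language model is optimized only during instruction tuning.
        "llm": stage == "instruction_tuning",
    }


if __name__ == "__main__":
    for stage in STAGES:
        flags = trainable_components(stage)
        trained = [name for name, on in flags.items() if on]
        print(f"{stage}: training {', '.join(trained)}")
```

In a PyTorch-style setup, each flag would typically be applied by setting `requires_grad` on the corresponding module's parameters before building the optimizer.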

Visual Instruction Tuning for Pixel-Level Understanding with Osprey

Unite.AI

JANUARY 25, 2024

Owing to its design, the Osprey framework achieves fine-grained semantic understanding of object-level and part-level regions, and provides detailed object attributes along with the primary object category and enhanced descriptions of complex scenes.
