Gemini AI multimodal articles
-
Meta Is Developing New Multimodal AI Model Chameleon to Rival OpenAI's GPT-4o
Meta enters the multimodal AI arena with Chameleon, an early-fusion model excelling in image captioning and generation, rivaling Google and OpenAI.
-
Microsoft reveals Phi-3-vision, a new multimodal AI small language model
During Build 2024, Microsoft announced a preview version of Phi-3-vision. This is a multimodal addition to its Phi-3 family of AI small language models, which was first revealed in April.
-
Microsoft announces Phi-3-vision, a new multimodal SLM for on-device AI scenarios
At Build 2024, Microsoft today expanded its Phi-3 family of AI small language models with the new Phi-3-vision. Phi-3-vision is a 4.2B parameter model that supports general visual reasoning tasks and chart/graph/table reasoning. The model can take both images and text as input, and output text responses. Microsoft today also announced the general availability of […]