On December 5, 2024, Google announced “PaliGemma 2”, a vision-language model that adds image-understanding capabilities to its open, lightweight language model “Gemma 2”.
Introducing PaliGemma 2: Powerful Vision-Language Models, Simple Fine-Tuning – Google Developers Blog
https://developers.googleblog.com/en/introducing-paligemma-2-powerful-vision-language-models-simple-fine-tuning/
Welcome PaliGemma 2 – New vision language models by Google
https://huggingface.co/blog/paligemma2
PaliGemma, published on GitHub and Hugging Face, was the first vision-language model in the Gemma family. It can recognize images, describe their content in words, and read text that appears within an image.
The article below shows what it is like to actually use PaliGemma.
Google releases open source visual language model “PaliGemma” & announces large-scale language model “Gemma 2” with performance equivalent to Llama 3 – GIGAZINE
Its successor, PaliGemma 2, is now available in multiple model sizes (3B, 10B, and 28B parameters) and input resolutions (224×224, 448×448, and 896×896 pixels), so performance can be tuned to the task at hand.
Another selling point is long captioning: rather than simply naming objects, the model generates detailed, contextual captions that can describe actions, emotions, and the narrative of an entire scene. Google also says it performs well at recognizing chemical formulas and musical scores, at spatial reasoning, and at generating chest X-ray reports.
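For reference, the Hugging Face announcement linked above shows that the models can be run with the transformers library. The following is a minimal captioning sketch, assuming the “google/paligemma2-3b-pt-224” checkpoint name and the “caption en” task prompt from the PaliGemma conventions; check the blog post above for the exact details.

```python
# Minimal sketch: image captioning with a PaliGemma 2 checkpoint via
# Hugging Face transformers. The model ID and prompt format are assumptions
# based on the announcement; verify them against the linked blog post.
import torch
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

model_id = "google/paligemma2-3b-pt-224"  # 3B weights, 224x224 input resolution
model = PaliGemmaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("example.jpg")  # any local image
prompt = "<image>caption en"       # PaliGemma-style captioning task prompt

inputs = (
    processor(text=prompt, images=image, return_tensors="pt")
    .to(torch.bfloat16)
    .to(model.device)
)
input_len = inputs["input_ids"].shape[-1]

with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=64, do_sample=False)

# Decode only the newly generated tokens, i.e. the caption itself.
caption = processor.decode(output[0][input_len:], skip_special_tokens=True)
print(caption)
```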
A demo site is also available.
Paligemma2 Vqav2 – a Hugging Face Space by merve
https://huggingface.co/spaces/merve/paligemma2-vqav2
As a test, I clicked the sample that asks what type of graph is shown.
The model then answered, “Accuracy after fine tuning.”
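The demo Space wraps this kind of visual question answering. Reusing the `model`, `processor`, and `image` from the sketch above, the same sort of question can be asked locally; the “answer en” task prefix is an assumption based on PaliGemma’s prompt conventions.

```python
# Minimal VQA sketch reusing `model`, `processor`, and `image` from the
# previous block. The "answer en" prefix follows PaliGemma conventions;
# treat it as an assumption and check the demo Space for the exact format.
question = "What type of graph is this?"
prompt = f"<image>answer en {question}"

inputs = (
    processor(text=prompt, images=image, return_tensors="pt")
    .to(torch.bfloat16)
    .to(model.device)
)
input_len = inputs["input_ids"].shape[-1]

with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=20, do_sample=False)

answer = processor.decode(output[0][input_len:], skip_special_tokens=True)
print(answer)  # a short free-form answer describing the chart
```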
Google said, “We can’t wait to see what you create with PaliGemma 2. Join the vibrant Gemma community, share your projects on Gemmaverse, and let’s continue exploring the endless possibilities of AI together.”