Google has launched MedGemma 1.5 and MedASR to advance medical image analysis and speech transcription.
Google’s new MedGemma and MedASR models available on Hugging Face and Vertex AI
Google has introduced two new healthcare-focused artificial intelligence (AI) models, MedGemma 1.5 and MedASR, aimed at improving how medical images and clinical speech data are processed. The release of the open-source AI models marks the next step in the Mountain View-based tech giant's push into the healthcare space. Interestingly, unlike OpenAI, which offers ChatGPT for Healthcare as a commercial enterprise product, the Gemini maker has taken a community-focused approach by making MedGemma 1.5 and MedASR publicly available.
In a blog post, Google Research detailed the new AI models and their capabilities. MedGemma 1.5 is the latest version of Google's open medical vision-language model. It is designed to analyse medical images alongside text, allowing it to interpret scans, answer questions about visual medical data, and support downstream research tasks. The updated version improves on earlier iterations by offering stronger multimodal reasoning, better handling of complex medical imagery, and increased flexibility for fine-tuning on specialised datasets.
Google said MedGemma 1.5 can work with different types of medical images, including radiology scans and other clinically relevant visuals. The model is intended to support research use cases such as image-based question answering, report generation, and structured data extraction. The company maintained that the model is not designed to provide diagnoses or treatment recommendations and should be used as a supporting tool in research and development environments.
Alongside MedGemma 1.5, Google also introduced MedASR, a medical automatic speech recognition model tailored for healthcare settings. MedASR is designed to convert spoken clinical conversations into text, with a focus on handling medical terminology, accents, and real-world clinical audio conditions. Google said the model aims to reduce errors commonly seen in general-purpose speech recognition systems when applied to healthcare use cases.
The company noted that MedASR can be used for tasks such as transcribing doctor-patient interactions, clinical notes, and dictated reports. It is designed to be adaptable across different healthcare environments and can be fine-tuned for specific clinical workflows or documentation standards.
Google said that all variants of MedGemma and the MedASR model can be accessed via the company's Hugging Face listing or the Vertex AI platform. Developers who want to explore the tutorials can also find them in the tech giant's MedGemma GitHub repository. Both models come with a permissive licence that allows research as well as commercial use cases.