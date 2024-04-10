Technology News

Apple Researchers Are Building AI Model Called ‘Ferret UI’ That Can Navigate Through iOS

Researchers claim that Ferret UI is capable of complex tasks such as widget classification and icon recognition.

Written by Akash Dutta, Edited by Siddharth Suvarna | Updated: 10 April 2024 17:27 IST

Photo Credit: Pexels/Mateusz Taciak

The LLM is designed to automate the perception and interaction within smartphone user interfaces

Highlights
  • Apple researchers said that Ferret UI is a vision-language model
  • The paper claims most MLLMs cannot understand beyond natural images
  • The AI model was trained using data generated by GPT-4

Apple researchers have published yet another paper on artificial intelligence (AI) models, and this time the focus is on understanding and navigating smartphone user interfaces (UI). The yet-to-be-peer-reviewed paper describes a large language model (LLM) dubbed Ferret UI, which the researchers say can go beyond traditional computer vision and understand complex smartphone screens. Notably, this is not the first AI paper published by the tech giant's research division. It has already published a paper on multimodal LLMs (MLLMs) and another on on-device AI models.

The pre-print version of the research paper has been published on arXiv, an open-access online repository of scholarly papers. Titled “Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs”, the paper focuses on expanding the use cases of MLLMs. It argues that most language models with multimodal capabilities cannot understand anything beyond natural images and are functionally “restricted”. It also states the need for AI models to understand complex and dynamic interfaces such as those on a smartphone.

As per the paper, Ferret UI is “designed to execute precise referring and grounding tasks specific to UI screens, while adeptly interpreting and acting upon open-ended language instructions.” In simple terms, the vision-language model can not only process a smartphone screen containing multiple elements that represent different information, but can also tell a user about them when prompted with a query.

How Ferret UI processes information on a screen
Photo Credit: Apple

 

Based on an image shared in the paper, the model can understand and classify widgets and recognise icons. It can also answer questions such as “Where is the launch icon” and “How do I open the Reminders app”. This suggests that the AI can not only explain the screen it sees, but also help navigate to different parts of an iPhone based on a prompt.
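To make the difference between referring and grounding concrete, here is a minimal Python sketch. It is not Ferret UI and does not reflect how the model works internally: the screen is a hand-written list of elements, and the names `UIElement`, `SCREEN`, `refer` and `ground` are hypothetical, chosen only for illustration. Referring goes from a location on the screen to what is there, while grounding goes from a text query to a location.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass(frozen=True)
class UIElement:
    kind: str   # hypothetical widget type, e.g. "icon" or "text"
    label: str
    box: Box


# Hand-written toy annotation of a screen; not data from the paper.
SCREEN: List[UIElement] = [
    UIElement("icon", "Reminders", (20, 100, 80, 160)),
    UIElement("icon", "Settings", (100, 100, 160, 160)),
    UIElement("text", "Search", (20, 400, 160, 430)),
]


def refer(screen: List[UIElement], point: Tuple[int, int]) -> Optional[UIElement]:
    """Referring: given a location on the screen, report what is there."""
    x, y = point
    for element in screen:
        left, top, right, bottom = element.box
        if left <= x <= right and top <= y <= bottom:
            return element
    return None


def ground(screen: List[UIElement], query: str) -> Optional[Box]:
    """Grounding: given a text query, report where the matching element is."""
    wanted = query.lower()
    for element in screen:
        # Naive keyword match stands in for the model's language understanding.
        if element.label.lower() in wanted:
            return element.box
    return None


print(refer(SCREEN, (50, 130)))
print(ground(SCREEN, "How do I open the Reminders app"))
```

A real vision-language model would work from the screenshot pixels rather than a prepared list, so the sketch only illustrates the shape of the two tasks, not the method.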

To train Ferret UI, the Apple researchers created data of varying complexities themselves. This helped the model in learning basic tasks and understanding single-step processes. “For advanced tasks, we use GPT-4 [40] to generate data, including detailed description, conversation perception, conversation interaction, and function inference. These advanced tasks prepare the model to engage in more nuanced discussions about visual components, formulate action plans with specific goals in mind, and interpret the general purpose of a screen,” the paper explained.
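The split between basic and advanced training data can be pictured with a small Python sketch. The four advanced task names come from the quote above; everything else is an assumption made for illustration, including the `Sample` record, the `tier` helper, the example prompts, and the choice to treat icon recognition and widget classification as basic tasks. It does not reproduce the paper's actual data format.

```python
from collections import Counter
from dataclasses import dataclass

# Advanced task names as listed in the paper's quote.
ADVANCED_TASKS = {
    "detailed description",
    "conversation perception",
    "conversation interaction",
    "function inference",
}


@dataclass(frozen=True)
class Sample:
    task: str     # task category of this training example
    prompt: str   # hypothetical instruction shown to the model
    answer: str   # hypothetical target response


def tier(sample: Sample) -> str:
    """Label a sample as 'advanced' if its task is in the quoted list."""
    return "advanced" if sample.task in ADVANCED_TASKS else "elementary"


# Invented examples, for illustration only.
SAMPLES = [
    Sample("icon recognition", "What is the icon in this region?", "Reminders"),
    Sample("widget classification", "What type of widget is this?", "Button"),
    Sample("function inference", "What is this screen for?", "Managing reminders"),
]

counts = Counter(tier(sample) for sample in SAMPLES)
print(counts)
```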

The paper is promising, and if it passes the peer-review stage, Apple might be able to utilise this capability to add powerful tools to the iPhone that can perform complex UI navigation tasks with simple text or verbal prompts. This capability appears to be ideal for Siri.

