OpenAI Releases Operator AI Agent in Preview, Can Independently Perform Tasks on the Web

Operator is powered by Computer-Using Agent (CUA), an agentic model with GPT-4o’s vision capabilities.

Written by Akash Dutta, Edited by Siddharth Suvarna | Updated: 24 January 2025 14:19 IST
Photo Credit: Unsplash/Levart_Photographer

OpenAI’s Operator comes with its own dedicated web browser

  • Operator is currently available to ChatGPT Pro users in the US
  • OpenAI plans to launch the AI agent to more subscription tiers eventually
  • CUA can interact with graphical user interfaces (GUIs)

OpenAI released its first artificial intelligence (AI) agent, Operator, on Thursday. Available as a research preview, it is a general-purpose AI agent that comes with a dedicated web browser and can autonomously perform tasks online based on prompts from the user. The AI firm said the tool can be used to book tickets online, reserve a table at a restaurant, or buy a product online. For now, Operator is only available in the US to ChatGPT Pro subscribers, but the company plans to expand it to other subscription tiers in the future.

OpenAI Introduces Operator AI Agent

In a live stream, OpenAI CEO Sam Altman introduced the company's first AI agent. Explaining what agents are, Altman said, “AI agents are AI systems that do work for you independently. You give them a task, and they go off and do it. We think it will be a big trend in AI.”

The Operator AI agent interface
Photo Credit: OpenAI

Operator is powered by the Computer-Using Agent (CUA), an AI model that combines vision capabilities from GPT-4o with advanced reasoning, an OpenAI blog post explained. The AI agent was post-trained using reinforcement learning. It can interact with graphical user interfaces (GUIs) including buttons, menus, and text fields on the screen. With its dedicated browser, the agent can perform tasks behind the scenes while freeing up the screen for the user.

The AI agent accepts both text and images as input. To complete tasks, the CUA processes raw pixel data of the screen and uses a virtual keyboard and mouse to execute actions. OpenAI claims it can navigate multi-step tasks, handle errors, and adapt to unexpected changes.
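
The see-then-act cycle described above can be pictured as a simple loop. The Python sketch below is purely illustrative and is not OpenAI's implementation: the `Action` type, `run_agent`, and the scripted stand-in for the model are all hypothetical names, and the screenshot, model, and input functions are passed in as stubs so the loop runs without a browser.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    # Hypothetical action type; not part of any OpenAI API.
    kind: str       # "click", "type", "ask_user", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


def run_agent(
    task: str,
    take_screenshot: Callable[[], bytes],
    choose_action: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 20,
) -> List[Action]:
    """Look at the screen, pick one action, perform it, and repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        pixels = take_screenshot()                     # raw screen contents
        action = choose_action(task, pixels, history)  # stand-in for the model
        history.append(action)
        if action.kind in ("done", "ask_user"):
            break                                      # finished, or hand control to the user
        execute(action)                                # virtual mouse or keyboard input
    return history


def scripted_policy(task: str, pixels: bytes, history: List[Action]) -> Action:
    # Toy stand-in for the model: replays a fixed script, one step per call.
    script = [
        Action("click", x=120, y=48),
        Action("type", text="table for two"),
        Action("done"),
    ]
    return script[len(history)]


performed: List[Action] = []
steps = run_agent("reserve a table", lambda: b"", scripted_policy, performed.append)
print([a.kind for a in steps])
```

The `max_steps` cap is an assumption added so that the loop always terminates. A real agent would also need the error handling and recovery that OpenAI says the CUA provides.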

Use Cases of the Operator AI Agent

Rowan Cheung, founder of the AI newsletter The Rundown AI, had early access to Operator and highlighted some of its use cases in a series of posts on X (formerly known as Twitter). The AI agent was able to plan a weekend trip based on advice from Reddit, a specific budget, and interests. Interestingly, when the agent was blocked from accessing Reddit, it completed the task by running a Bing search with Reddit as a keyword.

In another instance, Cheung asked Operator to find cryptocurrency tokens worth looking into. During its research, the agent got stuck on an “Are you human” CAPTCHA and immediately pinged the user to take control and confirm. Once Cheung confirmed, the AI agent resumed control and continued with the task.

The user can jump in and take control at any time to edit or change the task, and can hand control back to the agent once done. This ensures that the user has control over the AI agent at all times.

OpenAI also stated that it is collaborating with companies such as DoorDash, eBay, Instacart, and Uber to ensure that Operator respects the terms of service agreements of these businesses while accessing the platforms.

Operator's Safety Risks and Mitigation

On safety, the AI firm claimed that it has run extensive testing and has implemented mitigations against three classes of risk: misuse, model mistakes, and frontier risks.

To reduce the risk of misuse, OpenAI has trained the CUA model to refuse harmful tasks and illegal or regulated activities. The company has also blocked gambling and adult entertainment websites, as well as drug and gun retailers. In addition, it has implemented automated and human reviews of user interactions.

To guard against model mistakes or hallucinations, the AI agent is trained to ask for user confirmation before finalising tasks with external side effects. The CUA also declines to help with tasks such as banking transactions, and it requires active user supervision while accessing sensitive websites.
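
The safeguards described above amount to a check made before the agent acts. The Python sketch below is one illustrative way to express such a check and is not OpenAI's implementation. The `Decision` enum, the `gate` function, the `banking_transaction` label, and the order in which the rules are applied are all assumptions; the article does not say how tasks or websites are classified.

```python
from enum import Enum


class Decision(Enum):
    PROCEED = "proceed"
    CONFIRM = "ask the user to confirm"
    SUPERVISE = "require active user supervision"
    DECLINE = "decline"


# Hypothetical task label, used only for this illustration.
DECLINED_TASKS = {"banking_transaction"}


def gate(task_type: str, site_is_sensitive: bool, has_side_effects: bool) -> Decision:
    """Decide how much user involvement an action needs before it runs."""
    if task_type in DECLINED_TASKS:
        return Decision.DECLINE        # the agent does not help with these tasks
    if site_is_sensitive:
        return Decision.SUPERVISE      # the user must actively watch
    if has_side_effects:
        return Decision.CONFIRM        # e.g. placing an order
    return Decision.PROCEED            # no user involvement needed


print(gate("shopping", site_is_sensitive=False, has_side_effects=True).value)
```

Checking the most restrictive rule first is a design choice made here, so that a declined task is never downgraded to a mere confirmation prompt.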

Frontier risks are unexpected actions taken by a state-of-the-art AI model, which arise because such models are generally not tested exhaustively. OpenAI said the CUA model has been evaluated against its Preparedness Framework, and that the Operator System Card provides full details of its safety approach and ongoing improvements.

Currently, Operator is only available via the operator.chatgpt.com URL to ChatGPT Pro subscribers in the US. The company has stated that it plans to integrate the AI agent with all ChatGPT clients in the future. Notably, a ChatGPT Pro subscription is priced at $200 (roughly Rs. 17,200) a month.

Akash Dutta
Akash Dutta is a Senior Sub Editor at Gadgets 360.
