OpenAI launches Operator, an AI agent that carries out web tasks on its own

AI Agent: Operator can "see" the browser through screenshots and "interact" with it through mouse and keyboard actions, so it does not require custom API integrations. If Operator runs into a problem or makes a mistake, it can use its reasoning abilities to correct itself.

Fri, 24 Jan 2025 03:14 PM (IST)

OpenAI has introduced Operator, a new AI agent that can perform various tasks on the web for users. According to the company, it views webpages in its own browser and interacts with them by typing, clicking, and scrolling. It is among OpenAI's first AI agents that can act independently: the user describes a task, and Operator carries it out without step-by-step commands. For now, it has been released as a research preview, which means it still has limitations and OpenAI is collecting user feedback. The preview is currently available only to ChatGPT Pro users in the US.

Operator can handle many routine browser tasks, such as filling out forms, ordering groceries, and even making memes. It does this through the same interfaces and tools that people use every day. That saves users time and gives businesses a new way to connect with their customers.

OpenAI plans to make it available to Plus, Team, and Enterprise users soon, and integrate it into ChatGPT in the future. "Operator is powered by a new model, the Computer-Using Agent (CUA). It combines GPT-4o's vision capabilities and advanced reasoning through reinforcement learning. It learns to interact with graphical user interfaces (GUIs), such as buttons, menus, and text fields," the company blog explains.

Operator can "see" the browser via screenshots and "interact" with it through mouse and keyboard actions. A key advantage is that it does not require custom API integrations. If Operator encounters a problem or makes an error, it can use its reasoning abilities to correct itself.
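
For readers curious how such a screenshot-and-act loop might look, here is a minimal sketch in Python. It is a toy illustration only, not OpenAI's implementation: the FakeBrowser class, the Action type, and the choose_action and run_agent functions are hypothetical names made up for this example, and the "screenshot" is a simple description of the page rather than real pixels. The agent looks at the screen, picks one action, applies it, and looks again, which is how it notices and fixes a wrong value in the form.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str          # "type", "click", or "done"
    target: str = ""   # which on-screen element the action is aimed at
    text: str = ""     # text to enter for a "type" action


@dataclass
class FakeBrowser:
    """Hypothetical stand-in for a browser: one text field and a submit button."""
    field_value: str = ""
    submitted: bool = False
    log: list = field(default_factory=list)

    def screenshot(self) -> dict:
        # A real agent would receive pixels; this toy returns a description.
        return {"field_value": self.field_value, "submitted": self.submitted}

    def apply(self, action: Action) -> None:
        self.log.append(action.kind)
        if action.kind == "type" and action.target == "name":
            self.field_value = action.text
        elif action.kind == "click" and action.target == "submit":
            # Submitting an empty form fails; the next screenshot reveals that.
            self.submitted = bool(self.field_value)


def choose_action(screen: dict, goal_text: str) -> Action:
    """Hypothetical policy: decide the next step from what is on screen."""
    if screen["submitted"]:
        return Action("done")
    if screen["field_value"] != goal_text:
        # Self-correction: the field holds the wrong text, so retype it.
        return Action("type", "name", goal_text)
    return Action("click", "submit")


def run_agent(browser: FakeBrowser, goal_text: str, max_steps: int = 10) -> bool:
    """Observe, decide, act, and repeat until the task is done or steps run out."""
    for _ in range(max_steps):
        screen = browser.screenshot()
        action = choose_action(screen, goal_text)
        if action.kind == "done":
            return True
        browser.apply(action)
    return False


if __name__ == "__main__":
    browser = FakeBrowser(field_value="wrong name")  # start from a mistake
    print(run_agent(browser, "Alice"), browser.log)
```

Because the agent re-reads the screen after every action, it needs no special connection to the website, only the ability to look and to type or click.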

Muskan Kumawat Journalist & Content Writer