
OpenAI launches new tools to help businesses build AI agents

Hype around AI agents has surged in recent years, but the tech industry still struggles to demonstrate, or even define, what “AI agents” really are. The most recent example of agent hype running ahead of utility came from the Chinese startup Butterfly Effect, which recently went viral for a new AI agent platform called Manus. Users quickly discovered that it did not deliver on many of the company’s promises. That means OpenAI has a lot at stake in getting agents right.
“It’s pretty easy to show off demos of the agent,” Olivier Godement, OpenAI’s Product Head for APIs, told TechCrunch in an interview. “But scaling an agent is pretty hard, and getting people to use it on a recurring basis is pretty hard.” Earlier this year, OpenAI introduced two AI agents in ChatGPT: Operator, which navigates websites on your behalf, and deep research, which compiles research reports for you. Both offered a preview of what agent technology can do, but left a lot to be desired in terms of autonomy.
With the Responses API, OpenAI wants to sell access to the components that power AI agents, letting developers build their own Operator- and deep research-style applications. OpenAI hopes developers can use its agent technology to build applications that feel more autonomous than what’s available today.
The web search tool taps into two search models: GPT-4o search, the larger of the two, and GPT-4o mini search, a smaller model suited to lighter question answering. On SimpleQA, a benchmark created by OpenAI to measure factual accuracy, GPT-4o search scores 90%, and GPT-4o mini search comes close at 88%. For comparison, OpenAI’s GPT-4.5 scores only 63%. That the web search models do better is not surprising, since GPT-4o search can simply look up the exact answer. Even so, web search is by no means the end of hallucinations. AI search tools also tend to struggle with short, navigational queries, such as “Lakers score today,” and recent reports suggest that their source citations are not always reliable.
OpenAI’s Agents SDK promises to give developers free tools to integrate models with their internal systems, implement safeguards, and monitor AI agent activity for debugging and optimization. The Agents SDK carries on the spirit of OpenAI’s Swarm, the multi-agent orchestration framework the company released last year. Godement said he hopes OpenAI can bridge the gap between AI agent demos and AI products this year, since agents, in his opinion, will be the most impactful application of AI. This echoes a proclamation OpenAI CEO Sam Altman made in January that 2025 would be the year AI agents enter the workforce. Whether or not 2025 truly becomes the year of the AI agent, OpenAI’s latest releases show that the company wants to shift from flashy agent demos to impactful tools.
