You can now use tools and Structured Outputs when completing eval runs, and evaluate tool calls based on the arguments passed and responses returned. This works with OpenAI-hosted tools, MCP tools, and non-hosted tools. Read more in our guides below.
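To make "evaluate tool calls based on the arguments passed and responses returned" concrete, here is a minimal sketch of such a check in plain Python. It does not use the Evals API. `ToolCall`, `grade_tool_call`, and the `get_weather` tool are hypothetical names made up for illustration, and the subset-match rule for arguments is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToolCall:
    """A recorded tool call from a model run (hypothetical shape)."""
    name: str
    arguments: dict[str, Any] = field(default_factory=dict)
    response: Any = None


def grade_tool_call(actual: ToolCall, expected_name: str,
                    expected_args: dict[str, Any],
                    expected_response: Any = None) -> bool:
    """Pass if the expected tool was called with at least the expected
    arguments and, when one is given, returned the expected response."""
    if actual.name != expected_name:
        return False
    for key, value in expected_args.items():
        # Extra arguments are tolerated; missing or different ones fail.
        if key not in actual.arguments or actual.arguments[key] != value:
            return False
    if expected_response is not None and actual.response != expected_response:
        return False
    return True


# Hypothetical recorded call from an eval run.
call = ToolCall(
    name="get_weather",
    arguments={"city": "Paris", "unit": "celsius"},
    response={"temp_c": 18},
)
print(grade_tool_call(call, "get_weather", {"city": "Paris"}))
print(grade_tool_call(call, "get_weather", {"city": "Berlin"}))
```

A stricter grader could require an exact argument match instead; which rule fits depends on the tool being evaluated.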
1. Codex is rolling out to ChatGPT Plus users today. It includes generous usage limits for a limited time, but during periods of high demand, we might set rate limits for Plus users so that Codex remains widely available.
2. Next, our top requested feature: You can now give Codex access to the internet during task execution to install base dependencies, run tests that need external resources, upgrade or install packages needed to build new features, and more.
3. Internet access is off by default, and can be enabled when creating a new environment or by editing an existing one. You have full control over the domains and HTTP methods Codex can use during task execution. Learn more about usage and risks in the docs: platform.openai.com/docs/codex/age…
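As a rough illustration of what controlling "the domains and HTTP methods" could mean in practice, here is a small allowlist check in Python. This is a sketch, not how Codex implements it. The function name, the example domains, and the choice to let subdomains of an allowed domain through are all assumptions.

```python
from urllib.parse import urlsplit

# Hypothetical allowlists for an environment that only installs Python packages.
ALLOWED_DOMAINS = {"pypi.org", "files.pythonhosted.org"}
ALLOWED_METHODS = {"GET", "HEAD", "OPTIONS"}


def is_request_allowed(method: str, url: str,
                       allowed_domains=ALLOWED_DOMAINS,
                       allowed_methods=ALLOWED_METHODS) -> bool:
    """Return True only if both the HTTP method and the host are allowlisted."""
    if method.upper() not in allowed_methods:
        return False
    host = (urlsplit(url).hostname or "").lower()
    # Assumption: a subdomain of an allowed domain is also allowed.
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


print(is_request_allowed("GET", "https://pypi.org/simple/requests/"))
print(is_request_allowed("POST", "https://pypi.org/simple/requests/"))
print(is_request_allowed("GET", "https://example.com/install.sh"))
```

Restricting methods to read-only verbs like `GET` is one way to limit what a task can send out, even to an allowed domain.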
🆕 Four updates to building agents with OpenAI: Agents SDK in TypeScript, a new RealtimeAgent feature for voice agents, Traces support for the Realtime API, and improvements to our speech-to-speech model.
The Agents SDK is now available in TypeScript and supports handoffs, guardrails, tracing, MCP, and other core agent primitives, just like the Python version.
It includes new support for human-in-the-loop approvals, allowing you to pause tool execution, serialize and store the agent state, approve or reject specific calls, and resume the agent run.
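The pause, serialize, approve or reject, and resume flow can be sketched in a few lines of plain Python. This is not the Agents SDK API. `PendingCall`, `RunState`, `resume`, and the `delete_file` tool are hypothetical names, and JSON as the storage format is an assumption.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PendingCall:
    call_id: str
    tool: str
    arguments: dict
    decision: str = "pending"  # "pending", "approved", or "rejected"


@dataclass
class RunState:
    pending: list = field(default_factory=list)
    results: dict = field(default_factory=dict)

    def to_json(self) -> str:
        """Serialize the paused run so it can be stored anywhere."""
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "RunState":
        data = json.loads(raw)
        return cls(pending=[PendingCall(**p) for p in data["pending"]],
                   results=data["results"])


# Hypothetical tool registry; a real tool would have side effects.
TOOLS = {"delete_file": lambda path: f"deleted {path}"}


def resume(state: RunState) -> RunState:
    """Run approved calls, record rejections, keep undecided calls pending."""
    still_pending = []
    for call in state.pending:
        if call.decision == "approved":
            state.results[call.call_id] = TOOLS[call.tool](**call.arguments)
        elif call.decision == "rejected":
            state.results[call.call_id] = "rejected by reviewer"
        else:
            still_pending.append(call)
    state.pending = still_pending
    return state


# 1. The run pauses with two tool calls awaiting approval, and is stored.
paused = RunState(pending=[
    PendingCall("call_1", "delete_file", {"path": "/tmp/a.txt"}),
    PendingCall("call_2", "delete_file", {"path": "/etc/hosts"}),
])
stored = paused.to_json()

# 2. Later, a human reviews each call, then the run resumes.
restored = RunState.from_json(stored)
restored.pending[0].decision = "approved"
restored.pending[1].decision = "rejected"
final = resume(restored)
print(final.results)
```

Because the state is a plain string between steps 1 and 2, the approval can happen minutes or days later, in a different process.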
We’ve been collaborating closely with developers to understand where image gen can be most useful in the real world. Below are some examples from early adopters across domains like creative tools, consumer apps, and enterprise software. 👇
We're launching new tools to help developers build reliable and powerful AI agents. 🤖🔧
Timestamps:
01:54 Web search
02:41 File search
03:22 Computer use
04:07 Responses API
10:17 Agents SDK
Meet our new API primitive: the Responses API. Combining the simplicity of Chat Completions with the tool use of Assistants, this new foundation provides more flexibility in building agents. Web search, file search, or computer use takes just a couple of lines of code!
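For a sense of what "a couple of lines" looks like, here is a sketch that builds a Responses API request with web search enabled. It only builds the request and never sends it. The helper name and default model are assumptions, and the tool type string `web_search_preview` is the name used around launch, so check the current API reference before relying on it.

```python
def build_web_search_request(question: str, model: str = "gpt-4o") -> dict:
    """Build keyword arguments for a Responses API call with web search on.

    Hypothetical helper; the tool type string is assumed from launch-era docs.
    """
    return {
        "model": model,
        "tools": [{"type": "web_search_preview"}],
        "input": question,
    }


request = build_web_search_request("What was a positive news story from today?")
print(request)

# With the official `openai` package installed and OPENAI_API_KEY set,
# sending the request would look roughly like this:
#   from openai import OpenAI
#   response = OpenAI().responses.create(**request)
#   print(response.output_text)
```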
o-series models are “the planners”: they excel at handling ambiguous, multi-step tasks in domains such as math, engineering, law, and finance. 🧠
Use o-series models to process unstructured data, find a needle in a haystack, improve code, or handle other complex tasks. For example, o1’s vision capabilities can analyze detailed architectural drawings. In this image, o1 recognized that “PT” wood posts were pressure-treated.