On Thursday, the Financial Times reported that OpenAI has dramatically shortened its safety testing timeline.
Eight people who are either staff at the company or third-party testers told FT that they had "just days" to complete evaluations on new models -- a process they say they would normally be given "several months" for.
Evaluations are meant to surface model risks and other harms, such as whether a user could jailbreak a model into providing instructions for creating a bioweapon. For comparison, sources told FT that OpenAI gave them six months to review GPT-4 before it was released -- and that they found concerning capabilities only after two months.
Sources added that OpenAI's tests are not as thorough as they used to be, and that testers lack the time and resources needed to properly catch and mitigate risks. "We had more thorough safety testing when [the technology] was less important," one person currently testing o3, the full version of o3-mini, told FT, describing the shift as "reckless" and "a recipe for disaster."
The sources attributed the rush to OpenAI's desire to maintain a competitive edge, especially as open-weight models from competitors such as Chinese AI startup DeepSeek gain ground. OpenAI is rumored to be releasing o3 as soon as next week, a deadline that FT's sources say has compressed testing to under a week.
The shift underscores the fact that there is still no government regulation of AI models, including any requirement to disclose model harms. Companies including OpenAI signed voluntary agreements with the Biden administration to conduct routine testing with the US AI Safety Institute, but those agreements have quietly fallen away as the Trump administration has reversed or dismantled Biden-era AI policy infrastructure.
However, during the open comment period for the Trump administration's forthcoming AI Action Plan, OpenAI advocated for a similar arrangement to avoid navigating a patchwork of state-by-state legislation.
Outside the US, the EU AI Act will require companies to risk-test their models and document the results.
"We have a good balance of how fast we move and how thorough we are," Johannes Heidecke, head of safety systems at OpenAI, told FT. Testers themselves seemed alarmed, though, especially considering other holes in the process, including evaluating the less-advanced versions of the models that are then released to the public or referencing an earlier model's capabilities rather than testing the new one itself.
Other experts in the field share the sources' anxiety.
As Shayne Longpre, an AI researcher at MIT, said, evolving AI systems are gaining access to more data streams and, with the ongoing explosion of AI agents, to more software tools. This means "the surface area for flaws in AI systems is growing larger and larger," he explained. Longpre recently co-authored a call from researchers at MIT and Stanford asking AI companies to "invest in the needs of third-party, independent researchers" to better support AI testing.
"As [AI systems] become more capable, they are being used in new, often dangerous, and unexpected ways, from AI therapists dispensing medical advice, acting as human companions and romantic partners, or writing critical software security code. De-risking these systems can take significant time, and require subject matter expertise from dozens of disciplines," Longpre noted.
With more people using AI tools every day, Longpre notes, internal testing teams aren't sufficient. "More time to investigate these systems for AI safety and security issues is important. But even more important is the need to prioritize truly third-party access and testing: only the broader community of users, academics, journalists, and white-hat hackers can scale to cover the surface area of flaws, expertise, and diverse languages these systems now serve."
To support this, Longpre suggests companies create bug bounty and disclosure programs for multiple types of AI flaws, open red-teaming to a wider range of testers, and extend legal protections to those testers' findings.