OpenAI is fortifying its internal safety protocols in response to growing concerns about the potential risks of artificial intelligence. The company has introduced a “safety advisory group” that will sit above its technical teams and make recommendations to leadership, and the board has been granted veto power, though how readily it would actually use that power remains uncertain.
While discussions about policy intricacies often escape public attention, recent leadership changes and an evolving discourse on AI risks prompt a closer examination of how the leading AI development company is addressing safety considerations.
OpenAI has unveiled an updated “Preparedness Framework” in a document and accompanying blog post. The framework is meant to give the company a clear process for identifying, analyzing, and addressing “catastrophic” risks posed by the models it is developing. Catastrophic risk is defined as anything that could cause significant economic damage or serious harm or death to many people, a definition broad enough to cover existential risks like the “rise of the machines.”
The framework assigns models to different teams depending on their stage of development. Models already in production fall under a “safety systems” team, which handles practical issues such as API restrictions. Frontier models still in development are overseen by the “preparedness” team, which identifies and quantifies risks before a model is released. The “superalignment” team is tasked with establishing theoretical guardrails for models that may one day be “superintelligent.”
Each model is evaluated across four risk categories: cybersecurity, persuasion (e.g., disinformation), model autonomy, and CBRN (chemical, biological, radiological, and nuclear) threats. Mitigations are then applied; a model that still rates “high” risk after mitigation cannot be deployed, and one that rates “critical” cannot be developed further.
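As a rough illustration of that gating rule, here is a minimal sketch in Python. It is not OpenAI’s actual tooling: the category names, scorecard values, and the gate function are invented for this example, under the assumption that each category receives a post-mitigation score of low, medium, high, or critical.

```python
from enum import IntEnum

class Risk(IntEnum):
    # Hypothetical ordering of post-mitigation risk scores.
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3

# Invented post-mitigation scorecard for a frontier model.
scorecard = {
    "cybersecurity": Risk.MEDIUM,
    "persuasion": Risk.HIGH,
    "model_autonomy": Risk.LOW,
    "cbrn": Risk.MEDIUM,
}

def gate(scores: dict[str, Risk]) -> str:
    """Apply the deploy/develop thresholds to a post-mitigation scorecard."""
    worst = max(scores.values())
    if worst >= Risk.CRITICAL:
        return "halt further development"
    if worst >= Risk.HIGH:
        return "develop internally only; do not deploy"
    return "eligible for deployment"

print(gate(scorecard))  # -> develop internally only; do not deploy
```

The sketch shows only the thresholding step; in the framework itself, the scores come from the preparedness team’s technical evaluations rather than a hard-coded table.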
To strengthen the evaluation process, OpenAI is establishing a “cross-functional Safety Advisory Group” that will review the technical reports and make recommendations from a broader perspective. Those recommendations will go to the board and to leadership simultaneously, so that leadership cannot greenlight a high-risk product without the board seeing the same findings and having the chance to veto.
While OpenAI emphasizes transparency, concerns remain about how willing the board would be to act against expert recommendations and halt a release. The company plans to have independent third parties audit its work, but it has not said what, beyond its own commitments, guarantees that a model carrying critical risks will never be released. As AI development accelerates and the consequences of unchecked risks grow, OpenAI’s willingness to keep refining and enforcing these safety measures will matter all the more.